(Because the Weaponized Word lexicon's license does not allow redistribution, query results have been deleted and the corresponding notebook outputs are not shown.)

In [77]:
import requests
import json
import pickle
from itertools import chain
from collections import defaultdict

## The weaponized word lexicon
Weaponized Word (https://weaponizedword.org) provides lexicons of hate terms for different use cases. Although all of them seem interesting, we only look at the discriminatory terms here. First, all relevant terms need to be fetched from the API.

In [95]:
def auth(key):
    response = requests.post("https://api.weaponizedword.org/lexicons/1-0/authenticate", {"api_key": key})
    return response.json()

In [58]:
def get_discriminatory(token):
    last_status = 200
    num = 1
    result = []
    while last_status == 200 and num <= 16:
        response = requests.post("https://api.weaponizedword.org/lexicons/1-0/get_discriminatory", {"token": token, "page": num, "language_id": "eng"})
        last_status = response.status_code
        if last_status == 200:
            result.append(response.json())
        else:
            break
        num += 1
    return result
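
The hard-coded limit of 16 pages ties the function to the current size of the lexicon. Below is a minimal sketch of a more general paging loop. `fetch_all_pages`, `fetch_page` and `fake_fetch` are hypothetical names, and stopping on an empty `"result"` list is an assumption about the API that would need to be checked against its documentation.

```python
from typing import Any, Callable, Dict, List, Tuple

Page = Dict[str, Any]


def fetch_all_pages(
    fetch_page: Callable[[int], Tuple[int, Page]],
    max_pages: int = 100,
) -> List[Page]:
    """Collect pages 1, 2, ... until a request fails, a page is empty, or max_pages is hit."""
    pages: List[Page] = []
    for num in range(1, max_pages + 1):
        status, payload = fetch_page(num)
        if status != 200:
            break  # same stop condition as in get_discriminatory
        if not payload.get("result"):
            break  # ASSUMPTION: an empty "result" marks the end of the lexicon
        pages.append(payload)
    return pages


# Stand-in for the real HTTP call, so the sketch runs offline.
def fake_fetch(page: int) -> Tuple[int, Page]:
    data = {1: ["a", "b"], 2: ["c"]}
    return 200, {"result": data.get(page, [])}


pages = fetch_all_pages(fake_fetch)
print(len(pages))
```

With the real API, `fetch_page` would wrap the `requests.post` call from `get_discriminatory` and return the status code together with the parsed JSON body.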

In [None]:
token = auth("MY API KEY")["token"]
# Pass the token variable, not the string literal "token"
discriminatory = get_discriminatory(token)
with open("data/hate_words/discriminatory.pkl", "wb") as file:
    pickle.dump(discriminatory, file)
discriminatory

In [None]:
items = list(chain.from_iterable([item["result"] for item in discriminatory[:16]]))
items

In [159]:
len(items)

1576

Now we have over 1000 terms. However, because the goal is to extract a hatefulness component from these terms, we should disregard any that are not strongly hateful. The idea is that the embeddings probably cannot pick up some of the subtleties, so including too many weak terms would add noise. Weaponized Word annotates each term's offensiveness, so we can filter out any terms that are not strongly offensive. The items may also include archaic terms, which are unwanted as well.

Or rather, this is what we would want ideally. However, after applying these heavy filters, there are only about 30 terms left. We need more for a strong categorization, so every term and term variation will be accepted into the hate term set. For now.

In [191]:
extremes = [item for item in items]
# extremes = [item for item in items if item["offensiveness"] == "Extremely offensive" or item["offensiveness"] == "Very offensive" or item["offensiveness"] == "Significantly offensive" or item["offensiveness"] == "Moderately offensive"]
# extremes = [item for item in extremes if item["is_archaic"] == "N"]
# extremes = [item for item in extremes if item["variant_of_id"] == None]
# extremes = [item for item in extremes if item["plural_of_id"] == None]
len(extremes)

1576
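
To see how quickly the strict filters shrink the set, the surviving terms can be counted after each step. The sketch below does this on made-up records: only the field names and the four offensiveness levels come from the commented-out lines above, while the records, the `"other"` label and the names `STRONG`, `toy_items`, `filters` and `counts` are hypothetical.

```python
# Offensiveness levels accepted by the commented-out filter above.
STRONG = {
    "Extremely offensive",
    "Very offensive",
    "Significantly offensive",
    "Moderately offensive",
}


def record(term, offensiveness, is_archaic="N", variant_of_id=None, plural_of_id=None):
    """Build a made-up record that only mimics the field names used above."""
    return {
        "term": term,
        "offensiveness": offensiveness,
        "is_archaic": is_archaic,
        "variant_of_id": variant_of_id,
        "plural_of_id": plural_of_id,
    }


toy_items = [
    record("t1", "Extremely offensive"),
    record("t2", "other"),  # placeholder label, not taken from the lexicon
    record("t3", "Very offensive", is_archaic="Y"),
    record("t4", "Moderately offensive", variant_of_id=1),
    record("t5", "Significantly offensive", plural_of_id=1),
]

# Each filter is applied on top of the previous ones.
filters = [
    ("offensive enough", lambda item: item["offensiveness"] in STRONG),
    ("not archaic", lambda item: item["is_archaic"] == "N"),
    ("not a variant", lambda item: item["variant_of_id"] is None),
    ("not a plural", lambda item: item["plural_of_id"] is None),
]

remaining = toy_items
counts = [("all", len(remaining))]
for name, keep in filters:
    remaining = [item for item in remaining if keep(item)]
    counts.append((name, len(remaining)))

for name, count in counts:
    print(f"{name:>16}: {count}")
```

Running the same loop on `items` would show which filter removes the most terms, which helps decide which ones to relax first.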

Now, the terms can be grouped by target group. There are 7 supercategories:
- nationality
- ethnicity
- religion
- gender
- orientation
- disability
- class

All but the last two contain further subcategories. We can thus group the terms by target.

In [192]:
nationality = defaultdict(list)
ethnicity = defaultdict(list)
religion = defaultdict(list)
gender = defaultdict(list)
orientation = defaultdict(list)
disability = []
class_group = []

In [193]:
for term in extremes:
    if term["is_about_nationality"] == "Y" and term["nationalities"] is not None:
        for national in term["nationalities"]:
            nationality[national].append(term["term"])
    if term["is_about_ethnicity"] == "Y" and term["athnicities"] is not None:
        for ethnic in term["athnicities"]:
            ethnicity[ethnic].append(term["term"])
    if term["is_about_religion"] == "Y" and term["religions"] is not None:
        for relig in term["religions"]:
            religion[relig].append(term["term"])
    if term["is_about_gender"] == "Y" and term["genders"] is not None:
        for gen in term["genders"]:
            gender[gen].append(term["term"])
    if term["is_about_orientation"] == "Y" and term["orientations"] is not None:
        for orient in term["orientations"]:
            orientation[orient].append(term["term"])
    if term["is_about_disability"] == "Y":
        disability.append(term["term"])
    if term["is_about_class"] == "Y":
        class_group.append(term["term"])
        

In [194]:
print("Nationalities:", list(nationality.keys()))
print("Ethnicities:", list(ethnicity.keys()))
print("Genders:", list(gender.keys()))
print("Orientations:", list(orientation.keys()))

Nationalities: ['TR', 'PH', 'IN', 'DE', 'IE', 'GB', 'PS', 'PF', 'ZA', 'JP', 'AU', 'LT', 'LB', 'AL', 'US', 'NL', 'MX', 'NZ', 'IT', 'CN', 'CA', 'PK', 'CU', 'VN', 'AR', 'MK', 'CZ', 'UA', 'SO', 'HU', 'PL', 'WS', 'EG', 'DO']
Ethnicities: ['African', 'African American', 'Arabs', 'European', 'German', 'Jews', 'Irish', 'Italian', 'Bihari', 'Japanese', 'Aboriginal', 'Albanian', 'Asian', 'English', 'Hispanic', 'Chinese', 'Dinka', 'Korean', 'Pacific Islander', 'Romani', 'Egyptian', 'Portuguese', 'Sardinian', 'Spaniard', 'Czech', 'Ukrainian', 'French', 'Hungarian', 'Inuit', 'Vietnamese', 'Hawaiian', 'Samoan', 'Pakistani', 'India', 'Tamil']
Genders: ['female', 'male']
Orientations: ['homosexual']


There are a couple of groups now. However, these terms still need to be hand-filtered. Because semantic embeddings don't capture context very well, we should disregard terms that consist of multiple words and those that have a very common non-hate meaning. This should make the component filtering more robust. We will also disregard categories with very few hate words associated with them. This makes the result more robust as well, although as a consequence not all target groups will be captured.

In [None]:
all_terms = []
for key in nationality.keys():
    nats = set([item.lower() for item in nationality[key] if not " " in item])
    if (len(nats) >= 3):
        print(key, nats)
        all_terms += nats
for key in ethnicity.keys():
    ets = set([item.lower() for item in ethnicity[key] if not " " in item])
    if (len(ets) >= 3):
        print(key, ets)
        all_terms += ets
for key in gender.keys():
    gen = set([item.lower() for item in gender[key] if not " " in item])
    if (len(gen) >= 3):
        print(key, gen)
        all_terms += gen
for key in orientation.keys():
    ors = set([item.lower() for item in orientation[key] if not " " in item])
    if (len(ors) >= 3):
        print(key, ors)
        all_terms += ors
for key in religion.keys():
    rel = set([item.lower() for item in religion[key] if not " " in item])
    if (len(rel) >= 3):
        print(key, rel)
        all_terms += rel
dis = set([item.lower() for item in disability if not " " in item])
if (len(dis) >= 3):
    print("disability", dis)
    all_terms += dis
cla = set([item.lower() for item in class_group if not " " in item])
if (len(cla) >= 3):
    print("class", cla)
    all_terms += cla
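
The cell above repeats the same three steps for every category: lower-case the terms, drop multi-word entries, and keep a group only if at least three terms remain. A minimal sketch of how this could be folded into one helper follows. `single_word_terms`, `collect_groups` and the placeholder data are hypothetical and not part of the original notebook.

```python
from typing import Dict, Iterable, List, Set


def single_word_terms(terms: Iterable[str]) -> Set[str]:
    """Lower-case the terms and drop entries that contain a space."""
    return {term.lower() for term in terms if " " not in term}


def collect_groups(groups: Dict[str, List[str]], min_terms: int = 3) -> Dict[str, Set[str]]:
    """Keep only the groups that still have at least min_terms single-word terms."""
    kept: Dict[str, Set[str]] = {}
    for key, terms in groups.items():
        cleaned = single_word_terms(terms)
        if len(cleaned) >= min_terms:
            kept[key] = cleaned
    return kept


# Neutral placeholder data standing in for one of the dictionaries built above.
toy_groups = {
    "A": ["Alpha", "beta", "Gamma", "two words"],
    "B": ["delta", "Delta"],
}

kept = collect_groups(toy_groups)
# Flatten the per-group sets into one sorted, de-duplicated list.
flat_terms = sorted(set().union(*kept.values()))
print(kept)
print(flat_terms)
```

In the notebook, `collect_groups` would be called once per dictionary (`nationality`, `ethnicity`, and so on), and the flat lists could be wrapped as `{"disability": disability}`. Returning a dictionary also keeps the terms separated by group.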

To compile the final term set, we add the single-word terms from Davidson's n-gram dictionary (https://github.com/t-davidson/hate-speech-and-offensive-language/blob/master/lexicons/refined_ngram_dict.csv) to the list. Because of the metric we are using here, we can simply combine all the terms into one big set. However, a more refined approach might include separate hate metrics per group identity. Since we might come back to that later, it is useful to have the terms separated by group as well.

In [200]:
hate_terms = list(set([term for term in all_terms if not "-" in term]))
hate_terms += ["chink", "dyke", "faggot", "nigger", "spic"]
print(list(set(hate_terms)))

[A BIG LIST OF BAD WORDS]


Having compiled a set of terms, we can now train the similarity classifier used in the Relative Negative Sentiment Bias (RNSB) metric. Using this classifier, we will be able to assign a score to each word or phrase that measures its propensity for hatefulness (or, more specifically, its propensity to be similar to a derogatory term from the generated list). RNSB scoring is performed by comparing against two sets, usually a set of positive and a set of negative sentiment words. We will swap the negative sentiment word list for the hate term set and work under the assumption that *hate vs. positive* instead of *negative vs. positive* will still work.
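
To make the idea of a per-word score concrete, here is a minimal sketch on made-up 2-D vectors. It uses a hand-written logistic regression so that it runs without dependencies; in the notebook the classifier is trained by WEFE on the real embeddings. All names, the toy vectors, and the 1/0 labels are hypothetical (the WEFE report further down uses -1.0 and 1.0 as labels).

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(
    samples: List[Tuple[Vector, int]], epochs: int = 200, lr: float = 0.5
) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent (label 1 = hate term, 0 = positive word)."""
    dim = len(samples[0][0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for vector, label in samples:
            error = sigmoid(sum(w * x for w, x in zip(weights, vector)) + bias) - label
            for i, x in enumerate(vector):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


def hate_score(vector: Vector, weights: List[float], bias: float) -> float:
    """Probability that the vector belongs to the hate-term class."""
    return sigmoid(sum(w * x for w, x in zip(weights, vector)) + bias)


# Made-up "embeddings": positive words on the right, hate terms on the left.
positive_vectors = [(1.0, 0.2), (0.9, -0.1), (1.1, 0.0)]
hate_vectors = [(-1.0, 0.1), (-0.9, -0.2), (-1.1, 0.0)]
samples = [(v, 0) for v in positive_vectors] + [(v, 1) for v in hate_vectors]

weights, bias = train_logistic(samples)
score_near_hate = hate_score((-1.0, 0.0), weights, bias)
score_near_positive = hate_score((1.0, 0.0), weights, bias)
print(score_near_hate, score_near_positive)
```

A vector close to the hate terms receives a score above 0.5 and a vector close to the positive words a score below 0.5. This class probability is what will later serve as a feature.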

In [181]:
from wefe.datasets import load_bingliu
from wefe.metrics import RNSB
from gensim.models import fasttext
from gensim.test.utils import datapath

In [198]:
positive_terms = load_bingliu()['positive_words']

Since we want to use the basic as well as the finetuned word embeddings, two classifiers will be trained.

In [183]:
basic_embeddings = fasttext.load_facebook_vectors(datapath("C:/Users/flohk/OneDrive/Uni/Projektseminar/data/crawl-300d-2M-subword/crawl-300d-2M-subword.bin"))
finetuned_embeddings = fasttext.FastText.load(datapath("C:/Users/flohk/OneDrive/Uni/Projektseminar/data/fasttext-finetune/model150.model")).wv

In [206]:
basic_vectors = [{word: basic_embeddings[word] for word in positive_terms}, {word: basic_embeddings[word] for word in hate_terms}]
finetuned_vectors = [{word: finetuned_embeddings[word] for word in positive_terms}, {word: finetuned_embeddings[word] for word in hate_terms}]

In [213]:
print("Basic embeddings ", end="")
basic_classifier = RNSB()._train_classifier(attribute_embeddings_dict=basic_vectors, print_model_evaluation=True)
print("Finetuned embeddings ", end="")
finetuned_classifier = RNSB()._train_classifier(attribute_embeddings_dict=finetuned_vectors, print_model_evaluation=True)

Basic embeddings Classification Report:
              precision    recall  f1-score   support

        -1.0       1.00      0.89      0.94       109
         1.0       0.97      1.00      0.99       402

    accuracy                           0.98       511
   macro avg       0.99      0.94      0.96       511
weighted avg       0.98      0.98      0.98       511

Finetuned embeddings Classification Report:
              precision    recall  f1-score   support

        -1.0       0.82      0.85      0.84       109
         1.0       0.96      0.95      0.96       402

    accuracy                           0.93       511
   macro avg       0.89      0.90      0.90       511
weighted avg       0.93      0.93      0.93       511



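Ouch! The classifier trained on the finetuned embeddings scores noticeably worse than the one trained on the basic embeddings (accuracy 0.93 vs. 0.98, F1 for the hate class, labeled -1.0, 0.84 vs. 0.94), which suggests that the finetuned model overfitted quite a bit. However, we can still continue with it. For now, our work here is done. The classifier scores can now be used as features!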

In [214]:
with open("data/RNSB_classifiers/basic_classifier.pkl", "wb") as file:
    pickle.dump(basic_classifier, file)
with open("data/RNSB_classifiers/finetuned_classifier.pkl", "wb") as file:
    pickle.dump(finetuned_classifier, file)