# Assignment 10: Data Bias (Coding)
By Zoe Toy

## Step 1: Set up a Perspective API key

Project ID: data-bias-coding

API key: [redacted]

## Step 2: Exploration

Loading the sample data into a pandas DataFrame and displaying it.

In [111]:
import pandas as pd
import numpy as np

df = pd.read_csv('sample_labeled_data.csv')
df

Unnamed: 0.1,Unnamed: 0,id,comment_text,toxic
0,5,0001ea8717f6de06,Thank you for understanding I think very highl...,no
1,7,000247e83dcc1211,Dear god this site is horrible,no
2,11,0002f87b16116a7f,Somebody will invariably try to add Religion ...,no
3,13,0003e1cccfd5a40a,It says it right there that it IS a type The...,no
4,14,00059ace3e3e9a53,Before adding a new product to the list mak...,no
...,...,...,...,...
55247,153147,fff83b80284d8440,Consensus for ruining Wikipedia I think that c...,no
55248,153149,fff8f521a7dbcd47,shut down the mexican border withought looking...,no
55249,153150,fff8f64043129fa2,Jerome I see you never got around to this… I’m...,no
55250,153151,fff9d70fe0722906,Lucky bastard httpwikimediafoundationorgwikiP...,no


In [77]:
df.info()

number_of_duplicates = df.duplicated().sum()
print (f" Number of duplicates before : {number_of_duplicates}")

print(len(df))

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 55252 entries, 0 to 55251
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   Unnamed: 0    55252 non-null  int64 
 1   id            55252 non-null  object
 2   comment_text  55246 non-null  object
 3   toxic         55252 non-null  object
dtypes: int64(1), object(3)
memory usage: 1.7+ MB
 Number of duplicates before : 0
55252


Filtering out the nontoxic comments and taking only the first 200 data points to calculate a threshold.

In [78]:
toxic_df = df[df['toxic'] == 'yes'] 

toxic_df = toxic_df.iloc[:200]

toxic_df

Unnamed: 0.1,Unnamed: 0,id,comment_text,toxic
8,21,00091c35fa9d0465,Arabs are committing genocide in Iraq but no ...,yes
34,76,001d739c97bc2ae4,How dare you vandalize that page about the HMS...,yes
36,81,001eff4007dbb65b,No he is an arrogant self serving immature idi...,yes
81,219,005f47397e07e12f,Eek but shes cute in an earthy kind of way Can...,yes
97,258,0071940212267fea,Well it sucks to have a university to be nickn...,yes
...,...,...,...,...
2901,8056,0d88c800f3e9a913,31 hours only for that you guys got to kidding...,yes
2909,8083,0d971c3c731da0cc,Written like an advertisement It sounds like ...,yes
2947,8201,0dc178393342e270,Comments Well what do you think I made this p...,yes
2955,8219,0dc579561eca125d,It also needs to be added that nigger or nigga...,yes


Using the Perspective API to calculate the average toxicity score for the first 200 data points.

In [79]:
#!pip install google-api-python-client

from googleapiclient import discovery
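import json
import os

# Read the key from an environment variable (the name PERSPECTIVE_API_KEY is an
# assumption) instead of hard-coding the secret in the notebook.
API_KEY = os.environ['PERSPECTIVE_API_KEY']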

client = discovery.build(
  "commentanalyzer",
  "v1alpha1",
  developerKey=API_KEY,
  discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
  static_discovery=False)

toxic_scores = []

for comment in toxic_df['comment_text']:
    try:
        analyze_request = {'comment': {'text': comment}, 'requestedAttributes': {'TOXICITY': {}}}
        response = client.comments().analyze(body=analyze_request).execute()
        toxic = response['attributeScores']['TOXICITY']['spanScores'][0]['score']['value']
        toxic_scores.append(toxic)
    except Exception:
        # Any comment whose request fails is skipped, so fewer than 200 scores may be collected.
        continue
print(str(len(toxic_scores)) + " toxic scores calculated")


# Named summarize_scores rather than range so the built-in range() is not shadowed.
def summarize_scores(lst):
    average = sum(lst) / len(lst)
    min_score = min(lst)

    print("The average score for toxic comments was: " + str(average) + ", and the minimum score was " + str(min_score))

summarize_scores(toxic_scores)

42 toxic scores calculated
The average score for toxic comments was: 0.7044839116666667, and the minimum score was 0.36095104
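
Only 42 of the 200 requests above produced a score, and because failed requests are skipped silently, the cause is not visible. Below is a minimal sketch of a loop that records why each request failed and pauses between calls. The names `score_comments`, `score_fn`, and `delay_seconds` are hypothetical, and the pause assumes the failures come from a request-rate quota, which this notebook does not confirm. The scoring call is passed in as a function so the loop can be tried with a stand-in before using the real client.

```python
import time


def score_comments(comments, score_fn, delay_seconds=1.0):
    """Score each comment with score_fn, keeping failures instead of hiding them.

    score_fn is a hypothetical callable that takes a comment string and
    returns a float score; it may raise an exception on failure.
    """
    scores = []
    failures = []  # (comment, error description) pairs
    for i, comment in enumerate(comments):
        if i > 0 and delay_seconds > 0:
            time.sleep(delay_seconds)  # assumed pacing; adjust to the actual quota
        try:
            scores.append(score_fn(comment))
        except Exception as exc:
            failures.append((comment, repr(exc)))
    return scores, failures


# Stand-in scorer for demonstration only: it fails on empty text.
def fake_score(text):
    if not text:
        raise ValueError("empty comment")
    return 0.5


scores, failures = score_comments(["a", "", "b"], fake_score, delay_seconds=0)
print(len(scores), "scored;", len(failures), "failed")
```

With the real client, `score_fn` would wrap the `client.comments().analyze(...)` call from the cell above, and inspecting `failures` would show whether the missing scores come from quota errors, empty comments, or something else.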


## Step 3: Hypothesis & Testing

Because the average score for the toxic comments that returned a score (42 of the 200 sampled) was about 0.70, I chose 0.70 as the threshold for the model. According to the Perspective API documentation, a score of 0.7 indicates that roughly 7 out of 10 people would perceive the comment as toxic. This choice also matches the documentation's recommendation: "For social science researchers using Perspective to study harassment, we recommend experimenting with thresholds of 0.7 or 0.9, similar to typical moderation use cases."

Hypothesis: the Perspective API will fail if we incorrectly spell common swear words.

In the code blocks below, I created a function that calculates the Perspective API's toxicity score for an input comment. I then created several samples to test my hypothesis.

In [113]:
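import os

# Read the key from an environment variable (the name PERSPECTIVE_API_KEY is an
# assumption) instead of hard-coding the secret in the notebook.
API_KEY = os.environ['PERSPECTIVE_API_KEY']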

client = discovery.build(
  "commentanalyzer",
  "v1alpha1",
  developerKey=API_KEY,
  discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
  static_discovery=False)


def test_comments(comment):
    analyze_request = {'comment': { 'text': comment },'requestedAttributes': {'TOXICITY': {}}}
    response = client.comments().analyze(body=analyze_request).execute()
    test_score = response['attributeScores']['TOXICITY']['spanScores'][0]['score']['value']
    print("The toxicity score is: " + str(test_score))

In [114]:
test_comments("you suck")
test_comments("you scuk")

The toxicity score is: 0.88599813
The toxicity score is: 0.7308154


In [115]:
test_comments("i hate you")
test_comments("i haet you")

The toxicity score is: 0.68408644
The toxicity score is: 0.6491204


In [116]:
test_comments("i hope you die. you are a bitch.")
test_comments("i hope you dye. you are a biatch.")

The toxicity score is: 0.9749944
The toxicity score is: 0.944597


In [117]:
test_comments("fuck you")
test_comments("duck you")
test_comments("fck you")

The toxicity score is: 0.944597
The toxicity score is: 0.34328604
The toxicity score is: 0.8988238


In [118]:
test_comments("you are ugly")
test_comments("you are ulgy")

The toxicity score is: 0.7510937
The toxicity score is: 0.6827122


In [119]:
test_comments("you're a fatty")
test_comments("you're a fattie")

The toxicity score is: 0.64447093
The toxicity score is: 0.60311776


In [120]:
test_comments("cunt")
test_comments("cuntt")

The toxicity score is: 0.6744087
The toxicity score is: 0.5779532


In [121]:
test_comments("she looks so slutty")
test_comments("she looks so sluttie")

The toxicity score is: 0.85333383
The toxicity score is: 0.82048255


In [122]:
test_comments("you're a hoe")
test_comments("you're a ho")

The toxicity score is: 0.743089
The toxicity score is: 0.3308965


In [123]:
test_comments("stupid")
test_comments("stoopid")

The toxicity score is: 0.7761081
The toxicity score is: 0.65996873


In [124]:
test_comments("she's such a bitch")
test_comments("she's such a bish")

The toxicity score is: 0.9391453
The toxicity score is: 0.584095


In [125]:
test_comments("i hope you get murdered")
test_comments("i hope you get murddered")

The toxicity score is: 0.8988238
The toxicity score is: 0.88599813


For the purpose of this assignment, I tested my hypothesis with a sample size of n = 12 tests. In total, I wrote 25 comments and submitted them to the Perspective API to obtain toxicity scores for comparison.

Because the sample is this small, there is a high risk that the findings are due to chance. The conclusions drawn from them should therefore be treated as a starting point for determining whether the hypothesis is correct, not as proof.

## Step 4: Results

For the hypothesis testing step, I wrote a function that takes a sample comment and prints its toxicity score. For each test, I first submitted a mean comment containing a common toxic word and recorded its toxicity score. Then I submitted the same comment with the toxic word slightly misspelled.

The results supported my hypothesis to some extent in all 12 tests: every misspelled comment scored lower than its correctly spelled counterpart. For example, the word "stupid" received a toxicity score of 0.78 while "stoopid" received a score of 0.66. In this case, the Perspective API scored "stupid" above the 0.70 threshold, while its counterpart, "stoopid", fell below the threshold and would be deemed not toxic.

Not every test produced a difference across the threshold: in some tests the misspelled comment still scored above 0.70, and in three tests even the correctly spelled comment scored below it. Still, the misspelled comments scored lower than the originals in every test. These findings suggest that the Perspective API may be biased toward standard spellings of curse words (i.e. non-slang versions) and becomes less effective at spotting and scoring toxic comments when toxic words are slightly misspelled, shortened, or written as slang.
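
The pattern described above can be tallied directly from the scores printed in Step 3. The sketch below copies those scores by cell number (cell 117 contributes two misspellings of the same comment) and counts how often the misspelling lowered the score and how often it moved a comment from above the 0.70 threshold to below it. The names `THRESHOLD` and `results` are illustrative only.

```python
THRESHOLD = 0.70  # threshold chosen in Step 3

# (notebook cell, original score, misspelled score), copied from the outputs above.
results = [
    (114, 0.88599813, 0.7308154),
    (115, 0.68408644, 0.6491204),
    (116, 0.9749944, 0.944597),
    (117, 0.944597, 0.34328604),
    (117, 0.944597, 0.8988238),
    (118, 0.7510937, 0.6827122),
    (119, 0.64447093, 0.60311776),
    (120, 0.6744087, 0.5779532),
    (121, 0.85333383, 0.82048255),
    (122, 0.743089, 0.3308965),
    (123, 0.7761081, 0.65996873),
    (124, 0.9391453, 0.584095),
    (125, 0.8988238, 0.88599813),
]

drops = [original - misspelled for _, original, misspelled in results]

# Misspellings that scored lower than the original comment.
lowered = sum(1 for d in drops if d > 0)

# Misspellings that moved a comment from "toxic" to "not toxic" at the threshold.
crossed = [
    cell for cell, original, misspelled in results
    if original >= THRESHOLD and misspelled < THRESHOLD
]

mean_drop = sum(drops) / len(drops)

print(f"{lowered} of {len(results)} misspellings scored lower")
print(f"{len(crossed)} crossed below the threshold (cells {crossed})")
print(f"mean drop in score: {mean_drop:.3f}")
```

Counting this way separates two claims that are easy to blur: the score dropped for every misspelling, but only some of those drops were large enough to change the toxic/not-toxic decision at 0.70.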

As for why I got these results, one possible reason is that slang versions of curse words are harder to identify because new slang is always circulating and changing. One way to address this would be to detect when a word sounds similar to a curse word (e.g. "hoe" vs "ho"). I also think it is difficult for Perspective to identify a curse word that is slightly misspelled; however, if something like spell-check were used to catch common misspellings of curse words, then the toxic words could be detected more effectively.
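
As a rough illustration of the spell-check idea, the sketch below uses Python's standard `difflib` module to map each word to its closest match in a small word list before the comment would be scored. The `LEXICON` contents, the function names, and the 0.7 similarity cutoff are all assumptions made for this example; this is not how Perspective works internally, and a real lexicon would need careful curation.

```python
import difflib

# Hypothetical lexicon for illustration; a real one would be much larger.
LEXICON = ["stupid", "ugly", "hate", "suck"]


def normalize_word(word, lexicon=LEXICON, cutoff=0.7):
    """Return the closest lexicon entry if it is similar enough, else the word itself."""
    matches = difflib.get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def normalize_comment(comment):
    """Normalize each whitespace-separated word in the comment."""
    return " ".join(normalize_word(w) for w in comment.split())


print(normalize_comment("you are stoopid"))
```

The normalized text could then be passed to `test_comments` in place of the raw comment. This approach compares spelling, not sound, so a sound-alike pair such as "hoe" and "ho" would likely need a phonetic method instead, and a loose cutoff risks rewriting innocent words into toxic ones.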