### Perspective API Exploration

First, we have a dataset of Wikipedia comments made available by Jigsaw, a subsidiary of Google that created the Perspective tool. The dataset includes a unique comment id, the text of the comment, and a series of binary labels applied by human raters: "toxic," "severe_toxic," "obscene," "threat," "insult," and "identity_hate." I have appended the "score" column, which represents the toxicity score assigned to the comment text by the live version of the Perspective API. The data is available under a CC0 license.

In [1]:
import pandas as pd
import seaborn as sns
import time

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [3]:
df1 = pd.read_csv('/content/drive/MyDrive/labeled_and_scored_comments.csv')

In [4]:
df1.sort_values(['score'])

Unnamed: 0,id,comment_text,score,toxic,severe_toxic,obscene,threat,insult,identity_hate
1158,61a85266de70cca0,Можешь говорить по русски.,0.000004,0,0,0,0,0,0
5246,030a9ba0b32b4456,Rodzina Krusenstern Potomkowie Admirała Adama ...,0.000009,0,0,0,0,0,0
11040,a6814193ac6b4705,The opening paragraph states that the game is ...,0.000091,0,0,0,0,0,0
24192,f0d6dee96e057a5b,The book is used as a source in a number of ar...,0.000116,0,0,0,0,0,0
2332,cbb9268d3c7429c1,Vals Vienés \n |Rumba \n |Cha Cha Chá(ciclo 7),0.000131,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...
19994,1429a2c930688317,you fucking piece of shit,0.997278,1,1,1,0,1,0
28728,d82e45eb103ad07b,fuck off you stupid aspy asshole,0.997736,1,1,1,0,1,0
25535,e9b92c6c6b01aeef,FUCK YOU YOU PIECE OF SHIT \n FUCK YOU YOU PIE...,0.997982,1,1,1,0,1,0
25945,e783fd267f3a9d3b,FUCK WIKIPEDIA ON WHEELS! \n\nFuck off wikiped...,0.998136,1,1,1,0,1,0


In [5]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 41338 entries, 0 to 41337
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   id             41338 non-null  object 
 1   comment_text   41338 non-null  object 
 2   score          41338 non-null  float64
 3   toxic          41338 non-null  int64  
 4   severe_toxic   41338 non-null  int64  
 5   obscene        41338 non-null  int64  
 6   threat         41338 non-null  int64  
 7   insult         41338 non-null  int64  
 8   identity_hate  41338 non-null  int64  
dtypes: float64(1), int64(6), object(2)
memory usage: 2.8+ MB


In [6]:
df2 = df1.drop(['id', 'comment_text'], axis = 1)

In [7]:
df2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 41338 entries, 0 to 41337
Data columns (total 7 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   score          41338 non-null  float64
 1   toxic          41338 non-null  int64  
 2   severe_toxic   41338 non-null  int64  
 3   obscene        41338 non-null  int64  
 4   threat         41338 non-null  int64  
 5   insult         41338 non-null  int64  
 6   identity_hate  41338 non-null  int64  
dtypes: float64(1), int64(6)
memory usage: 2.2 MB


In [8]:
df2.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
score,41338.0,0.244467,0.257221,4e-06,0.074772,0.128969,0.310894,0.998329
toxic,41338.0,0.095384,0.293749,0.0,0.0,0.0,0.0,1.0
severe_toxic,41338.0,0.009168,0.095313,0.0,0.0,0.0,0.0,1.0
obscene,41338.0,0.05305,0.224137,0.0,0.0,0.0,0.0,1.0
threat,41338.0,0.003024,0.054907,0.0,0.0,0.0,0.0,1.0
insult,41338.0,0.049809,0.217553,0.0,0.0,0.0,0.0,1.0
identity_hate,41338.0,0.009725,0.098134,0.0,0.0,0.0,0.0,1.0


In [9]:
perc =[0.9, 0.925, 0.95, 0.975, 0.99]
df2.describe(percentiles = perc).T

Unnamed: 0,count,mean,std,min,50%,90%,92.5%,95%,97.5%,99%,max
score,41338.0,0.244467,0.257221,4e-06,0.128969,0.704435,0.806061,0.890144,0.950366,0.981209,0.998329
toxic,41338.0,0.095384,0.293749,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0
severe_toxic,41338.0,0.009168,0.095313,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
obscene,41338.0,0.05305,0.224137,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0
threat,41338.0,0.003024,0.054907,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
insult,41338.0,0.049809,0.217553,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0
identity_hate,41338.0,0.009725,0.098134,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0


In [10]:
# count of values equal to 1 per label {percent of total records}

for col in df2.columns:
    x = (df2[col].values == 1).sum()
    p = round(x/len(df2)*100,2)
    print(col, ": ", x, "{", p, "%}")

score :  0 { 0.0 %}
toxic :  3943 { 9.54 %}
severe_toxic :  379 { 0.92 %}
obscene :  2193 { 5.31 %}
threat :  125 { 0.3 %}
insult :  2059 { 4.98 %}
identity_hate :  402 { 0.97 %}


In [11]:
df3 = df1.drop(df1[df1['toxic'] == 0].index)

In [12]:
df3 = df3.drop(['id'], axis=1)

In [13]:
df3.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3943 entries, 8 to 41333
Data columns (total 8 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   comment_text   3943 non-null   object 
 1   score          3943 non-null   float64
 2   toxic          3943 non-null   int64  
 3   severe_toxic   3943 non-null   int64  
 4   obscene        3943 non-null   int64  
 5   threat         3943 non-null   int64  
 6   insult         3943 non-null   int64  
 7   identity_hate  3943 non-null   int64  
dtypes: float64(1), int64(6), object(1)
memory usage: 277.2+ KB


In [14]:
# count of values equal to 1 per label {percent of records labeled toxic}

for col in df3.columns[2:]:
    x = (df3[col].values == 1).sum()
    p = round(x/len(df3)*100,2)
    print(col, ": ", x, "{", p, "%}")

toxic :  3943 { 100.0 %}
severe_toxic :  379 { 9.61 %}
obscene :  2061 { 52.27 %}
threat :  117 { 2.97 %}
insult :  1923 { 48.77 %}
identity_hate :  377 { 9.56 %}
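The drop-by-index pattern used to build `df3` can also be expressed as a boolean mask, which some readers find easier to follow when several label conditions are combined. The sketch below is only an illustration: `select_by_labels` is a hypothetical helper, and the small `toy` frame is made-up data that merely reuses the notebook's column names.

```python
import pandas as pd

def select_by_labels(df, **label_values):
    """Return the rows where every named label column equals the given value."""
    mask = pd.Series(True, index=df.index)
    for column, value in label_values.items():
        mask &= df[column] == value
    return df[mask]

# Made-up rows for illustration only (not taken from the real dataset).
toy = pd.DataFrame({
    "score":         [0.10, 0.80, 0.60, 0.95],
    "toxic":         [0,    0,    1,    1],
    "identity_hate": [0,    1,    1,    0],
})

# Equivalent in spirit to dropping toxic == 1, then dropping identity_hate == 0.
not_toxic_but_hate = select_by_labels(toy, toxic=0, identity_hate=1)
print(not_toxic_but_hate)
```

On the real data, a call such as `select_by_labels(df1, toxic=0, identity_hate=1)` should select the same rows as the two-step `drop` approach.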


In [44]:
# find toxic = 0 and identity_hate=1
df4 = df1.drop(df1[df1['toxic'] == 1].index)
df4 = df4.drop(df4[df4['identity_hate'] == 0].index)
df4.head(25).sort_values(['score'])

Unnamed: 0,id,comment_text,score,toxic,severe_toxic,obscene,threat,insult,identity_hate
15076,63c44599fce8f0c2,"Basically, homosexuality lowers productivity, ...",0.444456,0,0,0,0,0,1
40496,9903cd41f8b37abb,REDIRECT Talk:Bolivian chinchilla rat,0.534903,0,0,0,0,1,1
23915,f24a0fcc1f403334,yeah after few days some jews will say nasser ...,0.534903,0,0,0,0,0,1
3839,d0c8b8c428c3af1e,I Know that Jewish Americans are the perpetrat...,0.598252,0,0,0,0,0,1
37967,a6611676c4d32317,Whats the idea f blocking me editing the David...,0.60654,0,0,0,0,1,1
35224,b550a8f4fe248cfe,holocaust was needed by the whole humanity,0.658223,0,0,0,0,0,1
32669,c3589a6658f8f0f5,"Stay out of this, fat boy. We're not in Japan,...",0.66438,0,0,0,0,1,1
10639,ad4079b703b3d6a8,"""Behinderte Russen: Meaning """"Retarded Russian...",0.695427,0,0,1,0,1,1
10498,af479dae8f48eb56,The Jews who keep trolling this article are me...,0.700894,0,0,0,0,0,1
33125,c0d2ac2a3bb7a344,"LWC, there is more to it than you think. You ...",0.702262,0,0,0,0,0,1


In [55]:
for i in range(25):
  print('Line', i)
  print('score', df4.iat[i,2])
  print('toxic', df4.iat[i,3])
  print('severe_toxic', df4.iat[i,4])
  print('obscene', df4.iat[i,5])
  print('threat', df4.iat[i,6])
  print('insult', df4.iat[i,7])
  print('identity_hate', df4.iat[i,8])
  print(df4.iat[i,1])
  print()

Line 0
score 0.80332977
toxic 0
severe_toxic 0
obscene 0
threat 0
insult 0
identity_hate 1
your beat as hell stop getting rid of my cool junk aight homie i know we can get along why must we fight ;0

Line 1
score 0.7870861
toxic 0
severe_toxic 0
obscene 0
threat 0
insult 0
identity_hate 1
She is a upper class white racist. She hides it by attacking Black Conservatives.

Line 2
score 0.59825194
toxic 0
severe_toxic 0
obscene 0
threat 0
insult 0
identity_hate 1
I Know that Jewish Americans are the perpetrators of 9/11!

Line 3
score 0.78038925
toxic 0
severe_toxic 0
obscene 0
threat 0
insult 1
identity_hate 1
Check this out dumb Mexican:
http://www.ripoffreport.com/reports/ripoff193659.htm

Line 4
score 0.700894
toxic 0
severe_toxic 0
obscene 0
threat 0
insult 0
identity_hate 1
The Jews who keep trolling this article are merely trying to cause a problem.  If they weren't then why are they always together, Jewish and all on topics even just slightly related to Jews?  Maybe they should lea

In [56]:
# find toxic = 0 and threat=1
df5 = df1.drop(df1[df1['toxic'] == 1].index)
df5 = df5.drop(df5[df5['threat'] == 0].index)
df5.head(8).sort_values(['score'])

Unnamed: 0,id,comment_text,score,toxic,severe_toxic,obscene,threat,insult,identity_hate
30686,cd86e1ac9998d147,hello. i know where you live. i know what you ...,0.25043,0,0,0,1,0,0
19733,180b31f596d885b8,@ Good! Just tell me how delete my account so ...,0.495113,0,0,0,1,0,0
21237,00a3157f7c3a24eb,"""\n\n Vandalize?! \n\nI would have ignored you...",0.512203,0,0,0,1,0,0
37409,a96dbeef4bb6b1be,Contd \n\nscoundrel the block on padmalskhmi w...,0.587106,0,0,0,1,1,0
28394,da225210fb20b417,Homey \n\nYo I beat u up,0.72138,0,0,0,1,0,0
16166,52ce17deb400d628,limited time \n\ni know who you are heathen. i...,0.72624,0,0,0,1,0,0
17454,3c8eb4dc87366409,"I'm going to punt your cat about, then it will...",0.776018,0,0,0,1,0,0
17177,40d6431e631e36ac,"That may be the case, but there is still no am...",0.803698,0,0,0,1,0,0


In [57]:
for i in range(8):
  print('Line', i)
  print('score', df5.iat[i,2])
  print('toxic', df5.iat[i,3])
  print('severe_toxic', df5.iat[i,4])
  print('obscene', df5.iat[i,5])
  print('threat', df5.iat[i,6])
  print('insult', df5.iat[i,7])
  print('identity_hate', df5.iat[i,8])
  print(df5.iat[i,1])
  print()

Line 0
score 0.7262401
toxic 0
severe_toxic 0
obscene 0
threat 1
insult 0
identity_hate 0
limited time 

i know who you are heathen. i know what you do.  i will finish you.

Line 1
score 0.8036984
toxic 0
severe_toxic 0
obscene 0
threat 1
insult 0
identity_hate 0
That may be the case, but there is still no ambiguity over the repulsiveness of irish nationalists, who deserved to be shot dead. 79.75.141.101

Line 2
score 0.77601826
toxic 0
severe_toxic 0
obscene 0
threat 1
insult 0
identity_hate 0
I'm going to punt your cat about, then it will die. Then you can invent a cyber-cat you geek. However, I will proceed to punt that.

Line 3
score 0.49511254
toxic 0
severe_toxic 0
obscene 0
threat 1
insult 0
identity_hate 0
@ Good! Just tell me how delete my account so I can get away from you and 's harassment and abuse. You two have made a powerful and unstoppable enemy once I find who your identities are and if you even come to Washington state. I will destroy you guys' life once I find your i

In [58]:
# find toxic = 0 and obscene=1
df6 = df1.drop(df1[df1['toxic'] == 1].index)
df6 = df6.drop(df6[df6['obscene'] == 0].index)
df6.head(132).sort_values(['score'])

Unnamed: 0,id,comment_text,score,toxic,severe_toxic,obscene,threat,insult,identity_hate
25109,ebfbac4939cefc26,Award !\nFor proposing that the Rosetta stone ...,0.240716,0,0,1,0,0,0
20009,13e5e151727db704,"""noobs...\nI have contested the speedy deletio...",0.271009,0,0,1,0,0,0
14443,6e84ad5af3ca4586,get p!$$ed off when someone expresses hate for...,0.321188,0,0,1,0,0,0
23170,f6341a610fd95c3b,How the Hell does fighting him at Memory's Sky...,0.390454,0,0,1,0,0,0
12250,926a65c24209d14b,SimCopter Shenanigans\nHiya. I originally foll...,0.431721,0,0,1,0,0,0
...,...,...,...,...,...,...,...,...,...
39074,a0a6c48d7c1e2746,Everybody knows Thomas Robb likes big black di...,0.950290,0,0,1,0,1,0
41208,95147bcf9dbef5bc,"""\nNo your wrong, EVula said it was about pers...",0.959081,0,0,1,0,1,0
9496,bf82c8b8473745d4,Message\n\nHey Dx927 you're a fat ugly cunt wi...,0.966388,0,0,1,0,1,0
15372,5eef195f8b3e1c6c,"As I said, you're an asshole and that's all th...",0.968973,0,0,1,0,1,0


In [59]:
df6.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
score,132.0,0.766224,0.14987,0.240716,0.695427,0.796767,0.869687,0.982097
toxic,132.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
severe_toxic,132.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
obscene,132.0,1.0,0.0,1.0,1.0,1.0,1.0,1.0
threat,132.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
insult,132.0,0.371212,0.48497,0.0,0.0,0.0,1.0,1.0
identity_hate,132.0,0.068182,0.253018,0.0,0.0,0.0,0.0,1.0


In [60]:
for i in range(132):
  print('Line', i)
  print('score', df6.iat[i,2])
  print('toxic', df6.iat[i,3])
  print('severe_toxic', df6.iat[i,4])
  print('obscene', df6.iat[i,5])
  print('threat', df6.iat[i,6])
  print('insult', df6.iat[i,7])
  print('identity_hate', df6.iat[i,8])
  print(df6.iat[i,1])
  print()

Line 0
score 0.9820972
toxic 0
severe_toxic 0
obscene 1
threat 0
insult 1
identity_hate 0
== O Lucilio é um filho da puta == 

 LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA

In [68]:
# find toxic = 0 and insult=1
df7 = df1.drop(df1[df1['toxic'] == 1].index)
df7 = df7.drop(df7[df7['insult'] == 0].index)
df7.head().sort_values(['score'])

Unnamed: 0,id,comment_text,score,toxic,severe_toxic,obscene,threat,insult,identity_hate
441,20e49c9eefd120bd,*Oppose - A grossly understated euphemism for ...,0.695427,0,0,0,0,1,0
2219,c0ff2add840d3cfa,""" \n\n :Uh, excuse me, Chueyjoo, but I'm a """"m...",0.809085,0,0,0,0,1,0
254,1388593eb53ca72b,:You are so obviously trolling us that you can...,0.854771,0,0,0,0,1,0
170,0be8c8c9289c43a6,""" \n\n == Warning vandals == \n\n You recently...",0.915653,0,0,0,0,1,0
1014,52e2f15ab54528da,== O Lucilio é um filho da puta == \n\n LUCILI...,0.982097,0,0,1,0,1,0


In [69]:
df7.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
score,136.0,0.741096,0.161778,0.242594,0.636407,0.771711,0.860626,0.982097
toxic,136.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
severe_toxic,136.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
obscene,136.0,0.360294,0.481861,0.0,0.0,0.0,1.0,1.0
threat,136.0,0.007353,0.085749,0.0,0.0,0.0,0.0,1.0
insult,136.0,1.0,0.0,1.0,1.0,1.0,1.0,1.0
identity_hate,136.0,0.102941,0.305005,0.0,0.0,0.0,0.0,1.0


In [70]:
for i in range(136):
  print('Line', i)
  print('score', df7.iat[i,2])
  print('toxic', df7.iat[i,3])
  print('severe_toxic', df7.iat[i,4])
  print('obscene', df7.iat[i,5])
  print('threat', df7.iat[i,6])
  print('insult', df7.iat[i,7])
  print('identity_hate', df7.iat[i,8])
  print(df7.iat[i,1])
  print()

Line 0
score 0.9156528
toxic 0
severe_toxic 0
obscene 0
threat 0
insult 1
identity_hate 0
" 



Line 1
score 0.8547711
toxic 0
severe_toxic 0
obscene 0
threat 0
insult 1
identity_hate 0
:You are so obviously trolling us that you can't even keep yourself from hiding that. It's really pathetic. —…

Line 2
score 0.69542736
toxic 0
severe_toxic 0
obscene 0
threat 0
insult 1
identity_hate 0
*Oppose - A grossly understated euphemism for what everyone undertands is a defamatory attack of epic proportion. There's no putting lipstick on this particular pig.

Line 3
score 0.9820972
toxic 0
severe_toxic 0
obscene 1
threat 0
insult 1
identity_hate 0
== O Lucilio é um filho da puta == 

 LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA PUTA! LUCILIO FILHO DA 

In [92]:
# find toxic=1 and severe_toxic=1
df8 = df1.drop(df1[df1['toxic'] == 0].index)
df8 = df8.drop(df8[df8['severe_toxic'] == 0].index)
df8['score'].min()

0.49612892

In [93]:
# find toxic=1 and obscene=1
df8 = df1.drop(df1[df1['toxic'] == 0].index)
df8 = df8.drop(df8[df8['obscene'] == 0].index)
df8['score'].min()

0.29922405

In [94]:
# find toxic=1 and insult=1
df8 = df1.drop(df1[df1['toxic'] == 0].index)
df8 = df8.drop(df8[df8['insult'] == 0].index)
df8['score'].min()

0.307727

In [95]:
# find toxic=1 and threat=1
df8 = df1.drop(df1[df1['toxic'] == 0].index)
df8 = df8.drop(df8[df8['threat'] == 0].index)
df8['score'].min()

0.31089434

In [96]:
# find toxic=1 and identity_hate=1
df8 = df1.drop(df1[df1['toxic'] == 0].index)
df8 = df8.drop(df8[df8['identity_hate'] == 0].index)
df8['score'].min()

0.31089434

In [97]:
# find toxic=1
df8 = df1.drop(df1[df1['toxic'] == 0].index)
df8['score'].min()

0.05439934

In [98]:
# find severe_toxic=1
df8 = df1.drop(df1[df1['severe_toxic'] == 0].index)
df8['score'].min()

0.49612892

In [88]:
# find obscene=1
df8 = df1.drop(df1[df1['obscene'] == 0].index)
df8['score'].min()

0.24071598

In [89]:
# find insult=1
df8 = df1.drop(df1[df1['insult'] == 0].index)
df8['score'].min()

0.24259439

In [90]:
# find threat=1
df8 = df1.drop(df1[df1['threat'] == 0].index)
df8['score'].min()

0.2504299

In [91]:
# find identity_hate=1
df8 = df1.drop(df1[df1['identity_hate'] == 0].index)
df8['score'].min()

0.31089434
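The repeated "filter, then take the minimum score" cells above could be collapsed into one summary table. The following is a minimal sketch under stated assumptions: `min_score_by_label` is a hypothetical helper, and the `toy` frame contains made-up values that only mimic the notebook's column layout.

```python
import pandas as pd

def min_score_by_label(df, labels):
    """For each label, report the minimum score over all flagged rows
    and over flagged rows that are also labeled toxic."""
    rows = {}
    for label in labels:
        flagged = df[df[label] == 1]
        rows[label] = {
            "min_score_any": flagged["score"].min(),
            "min_score_toxic_only": flagged.loc[flagged["toxic"] == 1, "score"].min(),
        }
    return pd.DataFrame.from_dict(rows, orient="index")

# Made-up rows for illustration only (not taken from the real dataset).
toy = pd.DataFrame({
    "score":   [0.24, 0.30, 0.25, 0.31, 0.90],
    "toxic":   [0,    1,    0,    1,    1],
    "obscene": [1,    1,    0,    0,    1],
    "threat":  [0,    0,    1,    1,    0],
})

summary = min_score_by_label(toy, labels=["obscene", "threat"])
print(summary)
```

Applied to `df1` with the five non-`toxic` label columns, this should reproduce the individual minimums computed cell by cell above.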

**EXPLORATORY DATA ANALYSIS:**

First, I filtered the data to toxic comments to narrow the analysis. Then, after further exploratory data analysis, I looked at the records that carry another label but are not labeled toxic, and observed the following:

A. 25 records labeled as identity_hate but not labeled as toxic or severe_toxic

- range of toxicity score: .4444 to .9188

- well-composed sentences tend to score lower than comments with poor grammar

- scores vary for comments that mention certain topics or ethnic groups

- derogatory language can make a comment score higher even if the meaning of the sentence is not as ‘toxic’ or hateful as other comments


B. 8 records labeled as threat but not labeled as toxic or severe_toxic

- range of toxicity score: .2504 to .8036

- longer comments once again tend to score lower, even when the message is more violent

- an opinion without a direct or indirect threat received the highest toxicity score in this group, without being labeled as toxic

- the most direct threats (death wishes) score lower than the other types of comments, especially the ones that claim to know users' identities and locations; it is surprising that these score lower than comments that mention no direct threat
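The observation that longer comments tend to score lower was made by reading examples. One way to check it numerically is a rank correlation between comment length and score. This is a hedged sketch, not part of the original analysis: `length_score_rank_corr` is a hypothetical helper, it computes a Spearman-style correlation by ranking both series first, and the `toy` frame is made-up data.

```python
import pandas as pd

def length_score_rank_corr(df, text_col="comment_text", score_col="score"):
    """Rank correlation between comment length (in characters) and score.
    Negative values mean longer comments tend to receive lower scores."""
    lengths = df[text_col].str.len()
    # Pearson correlation of the ranks is the Spearman rank correlation.
    return lengths.rank().corr(df[score_col].rank())

# Made-up rows for illustration only (not taken from the real dataset).
toy = pd.DataFrame({
    "comment_text": ["a" * 3, "b" * 10, "c" * 20, "d" * 40],
    "score":        [0.9,     0.7,      0.5,      0.2],
})

rho = length_score_rank_corr(toy)
print(round(rho, 3))
```

With only 8 records in the threat group, any such coefficient on the real data would be a rough indication at best.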


C. 132 records labeled as obscene but not labeled as toxic or severe_toxic

- range of toxicity score: .2407 to .9821

- most obscene comments with high scores have repetitive language and derogatory insults

- for these comments, replacing characters does seem to affect the toxicity score; for other labels, scores are usually higher with character substitution and poor grammar, but here the opposite is true for a good number of the comments I managed to look at

- unlike the threat and identity_hate groups, which had no overlap with either the toxic or severe_toxic labels, some of these records overlap with severe_toxic


D. 136 records labeled as insult but not labeled as toxic or severe_toxic

- range of toxicity score: .2425 to .9821

- records for the obscene and insult labels overlap more than other labels, and even their distributions of toxicity scores are very similar

- comments don’t necessarily need derogatory language to score higher

- there seems to be a lot more variance in the scores than for other labels
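The overlap between labels noted in the lists above can be counted directly with a co-occurrence matrix. The sketch below is an illustration under assumptions: `label_cooccurrence` is a hypothetical helper, and the `toy` frame is made-up data with the notebook's column names.

```python
import pandas as pd

def label_cooccurrence(df, labels):
    """Count how often each pair of binary labels is set on the same record.
    The diagonal holds the number of records carrying each label."""
    flags = df[labels].astype(int)
    return flags.T.dot(flags)

# Made-up rows for illustration only (not taken from the real dataset).
toy = pd.DataFrame({
    "toxic":   [0, 0, 0, 1],
    "obscene": [1, 1, 0, 1],
    "insult":  [1, 0, 1, 1],
})

# Restrict to records not labeled toxic, as in groups A to D.
non_toxic = toy[toy["toxic"] == 0]
cooc = label_cooccurrence(non_toxic, ["obscene", "insult"])
print(cooc)
```

Dividing each row by its diagonal entry would turn the counts into the share of one label's records that also carry another label.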





Among comments labeled toxic, the minimum score for each additional label is around 0.3 and above. Without the toxic filter, the minimum drops to around 0.25 for obscene, insult, and threat, while identity_hate stays at about 0.31. The severe_toxic label has the highest minimum of all labels (around 0.49) in both cases.


In [18]:
from googleapiclient.discovery import build
import json

def get_toxicity_score(comment):
    """Return the Perspective TOXICITY summary score for one comment."""
    API_KEY = 'xxxxxxxxxxxxx'  # Put your API key here

    # Build a client for the Comment Analyzer (Perspective) API.
    client = build(
        "commentanalyzer",
        "v1alpha1",
        developerKey=API_KEY,
        discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1"
    )

    # Request only the TOXICITY attribute for this comment.
    analyze_request = {
        'comment': {'text': comment},
        'requestedAttributes': {'TOXICITY': {}}
    }

    response = client.comments().analyze(body=analyze_request).execute()
    toxicity_score = response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

    return toxicity_score

We can call this function with original comments like this:

In [99]:
get_toxicity_score("I love you")

0.05566647

Due to the constraint of one request per second, we should insert a time.sleep() statement between consecutive API calls, which will pause execution for one second.

In [100]:
comment_list = ['hello', 'how are you', 'fine thanks']

for comment in comment_list:
    score = get_toxicity_score(comment)
    print(comment, score)
    time.sleep(1)

hello 0.054463096
how are you 0.073404066
fine thanks 0.031898104
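The fixed `time.sleep(1)` above always waits a full second, even when the API call itself already took some time. A possible refinement is to wait only for the remainder of the interval. This is a sketch under assumptions, not the notebook's method: `score_with_rate_limit` is a hypothetical helper, and the scoring function, sleep function, and clock are passed in so the example runs without network access.

```python
import time

def score_with_rate_limit(comments, score_fn, min_interval=1.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Call score_fn on each comment, leaving at least min_interval
    seconds between the starts of consecutive calls."""
    results = []
    last_call = None
    for comment in comments:
        if last_call is not None:
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results.append((comment, score_fn(comment)))
    return results

# Offline demo: a stub scorer, a frozen clock, and a recorder instead of real sleeping.
def stub_score(text):
    return 0.0  # placeholder value, not a real Perspective score

pauses = []
demo = score_with_rate_limit(
    ["hello", "how are you"],
    score_fn=stub_score,
    sleep=pauses.append,
    clock=lambda: 0.0,
)
print(demo, pauses)
```

In the notebook, the call would presumably look like `score_with_rate_limit(comment_list, get_toxicity_score)`, using the default sleep and clock.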


Hypothesis for testing the Perspective API: The API will assign higher scores to comments that mention minorities (women, gays, Latinos, other foreigners, religion, etc.).

In [102]:
#Test 1
get_toxicity_score('why mom!')

0.08609277

In [107]:
#Test 2
get_toxicity_score('why Dad!')

0.05821534

In [105]:
#Test 3
get_toxicity_score('Asian mom')

0.19426279

In [104]:
#Test 4
get_toxicity_score('American mom')

0.08565118

In [103]:
#Test 5
get_toxicity_score('Jewish mom')

0.24159904

**RESULTS OF TEST:**

The scores do reflect some bias against comments that mention minorities. Even when a comment does not seem to carry any toxic traits, mentioning a minority group makes it more likely to score higher than comments on more generic topics. Derogatory language and grammar also play a role in how the API scores comments. The model does not correct for the bias of the human raters and does not seem to consider the words within the text the way other machine learning models do. The bias comes from the human raters' interpretation of the comments.
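The comparison in Tests 3 to 5 can be organized as a template test: the same phrase is scored with different group terms, and each score is compared with a reference term. The sketch below is illustrative only. `score_gaps` is a hypothetical helper, treating "American" as the reference term is an assumption, and the scores are the ones recorded in the cells above, replayed from a dictionary so the example runs without calling the API.

```python
def score_gaps(template, baseline_term, comparison_terms, score_fn):
    """Score one template per term and return each term's gap to the baseline."""
    baseline = score_fn(template.format(term=baseline_term))
    return {
        term: score_fn(template.format(term=term)) - baseline
        for term in comparison_terms
    }

# Scores copied from the outputs of Tests 3-5 in this notebook.
recorded_scores = {
    "American mom": 0.08565118,
    "Asian mom": 0.19426279,
    "Jewish mom": 0.24159904,
}

gaps = score_gaps(
    template="{term} mom",
    baseline_term="American",  # assumption: used as the reference phrase
    comparison_terms=["Asian", "Jewish"],
    score_fn=recorded_scores.__getitem__,
)
for term, gap in gaps.items():
    print(f"{term}: {gap:+.3f}")
```

Passing `get_toxicity_score` as `score_fn` and trying several templates would give a broader, though still informal, picture than a handful of single phrases.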