## Perspective API Exploration

Below, I analyze a dataset of Wikipedia comments made available by Jigsaw, a subsidiary of Google that created the Perspective tool. The dataset includes a unique comment ID, the text of the comment, and a series of binary labels applied by human raters: "toxic," "severe_toxic," "obscene," "threat," "insult," and "identity_hate." It also includes an appended "score" column, which holds the toxicity score assigned to the comment text by the live version of the Perspective API. The data is available under a CC0 license.

My hypothesis is that the Perspective API makes more mistakes in classifying comments as toxic when they contain Internet slang acronyms, such as 'lol', 'lmao', 'lmfao', 'wth', 'wtf', 'jk', 'idk', 'smh', 'ikr', and 'tbh'.

In [74]:
import pandas as pd
import time

df = pd.read_csv('/Users/emilydo/Downloads/labeled_and_scored_comments.csv')

Below is the dataset, sorted in ascending order of toxicity score.

In [75]:
df.sort_values(['score'])

Unnamed: 0,id,comment_text,score,toxic,severe_toxic,obscene,threat,insult,identity_hate
1158,61a85266de70cca0,Можешь говорить по русски.,0.000004,0,0,0,0,0,0
5246,030a9ba0b32b4456,Rodzina Krusenstern Potomkowie Admirała Adama ...,0.000009,0,0,0,0,0,0
11040,a6814193ac6b4705,The opening paragraph states that the game is ...,0.000091,0,0,0,0,0,0
24192,f0d6dee96e057a5b,The book is used as a source in a number of ar...,0.000116,0,0,0,0,0,0
2332,cbb9268d3c7429c1,Vals Vienés \n |Rumba \n |Cha Cha Chá(ciclo 7),0.000131,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...
19994,1429a2c930688317,you fucking piece of shit,0.997278,1,1,1,0,1,0
28728,d82e45eb103ad07b,fuck off you stupid aspy asshole,0.997736,1,1,1,0,1,0
25535,e9b92c6c6b01aeef,FUCK YOU YOU PIECE OF SHIT \n FUCK YOU YOU PIE...,0.997982,1,1,1,0,1,0
25945,e783fd267f3a9d3b,FUCK WIKIPEDIA ON WHEELS! \n\nFuck off wikiped...,0.998136,1,1,1,0,1,0


In [76]:
df.head()

Unnamed: 0,id,comment_text,score,toxic,severe_toxic,obscene,threat,insult,identity_hate
0,0001ea8717f6de06,Thank you for understanding. I think very high...,0.075638,0,0,0,0,0,0
1,000247e83dcc1211,:Dear god this site is horrible.,0.450459,0,0,0,0,0,0
2,0002f87b16116a7f,"""::: Somebody will invariably try to add Relig...",0.667964,0,0,0,0,0,0
3,0003e1cccfd5a40a,""" \n\n It says it right there that it IS a typ...",0.068434,0,0,0,0,0,0
4,00059ace3e3e9a53,""" \n\n == Before adding a new product to the l...",0.151724,0,0,0,0,0,0


Then, I pulled the comments labeled as toxic or severely toxic by human raters.

In [77]:
df[(df['toxic']==1)|(df['severe_toxic']==1)].head()

Unnamed: 0,id,comment_text,score,toxic,severe_toxic,obscene,threat,insult,identity_hate
8,00091c35fa9d0465,"== Arabs are committing genocide in Iraq, but ...",0.635929,1,0,0,0,0,0
24,0013fed3aeae76b7,DJ Robinson is gay as hell! he sucks his dick ...,0.989706,1,0,1,0,1,1
29,0017d4d47894af05,":Fuck off, you anti-semitic cunt. |",0.959459,1,0,1,0,1,0
37,001d739c97bc2ae4,How dare you vandalize that page about the HMS...,0.564152,1,0,0,0,0,0
38,001eff4007dbb65b,"::No, he is an arrogant, self serving, immatur...",0.956893,1,0,1,0,1,0


Below is a function that calls the Perspective API to get the toxicity score of a comment.

In [79]:
from googleapiclient.discovery import build
import json

def get_toxicity_score(comment):
    """Return the Perspective API TOXICITY summary score for a comment."""
    API_KEY = 'XXXX'  # Put your API key here

    # Build a client for the Perspective (Comment Analyzer) API.
    client = build(
        "commentanalyzer",
        "v1alpha1",
        developerKey=API_KEY,
        discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
    )

    # Request only the TOXICITY attribute for the given text.
    analyze_request = {
        'comment': {'text': comment},
        'requestedAttributes': {'TOXICITY': {}},
    }

    response = client.comments().analyze(body=analyze_request).execute()
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
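To score many comments rather than one at a time, a wrapper along these lines could pace the requests. This is only a sketch: `score_comments`, `pause_seconds`, and `max_retries` are hypothetical names that are not part of the notebook or of the API client, and the pause length is an assumption that should be adjusted to whatever quota applies to your key. The scoring function is passed in as an argument so the wrapper can be tried with a stand-in before calling the real API.

```python
import time


def score_comments(comments, score_fn, pause_seconds=1.0, max_retries=3):
    """Score each comment with score_fn, pausing between calls.

    Returns a list with one score per comment, or None where every
    attempt failed.
    """
    scores = []
    for text in comments:
        score = None
        for attempt in range(max_retries):
            try:
                score = score_fn(text)
                break
            except Exception:
                # Back off a little longer after each failed attempt.
                time.sleep(pause_seconds * (attempt + 1))
        scores.append(score)
        # Pause so consecutive requests stay under the assumed rate limit.
        time.sleep(pause_seconds)
    return scores
```

With the function defined above, a call would look like `score_comments(["I love you", "thank you"], get_toxicity_score)`.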

Here, I gauge where the toxicity threshold should be, i.e., the score at which a comment is considered toxic. To do so, I tested different phrases, including positive, neutral, and explicit ones.

In [80]:
get_toxicity_score("I love you")

0.040628307

In [81]:
get_toxicity_score("thank you")

0.01561861

In [82]:
get_toxicity_score("how are you")

0.018187003

In [83]:
get_toxicity_score("ugly")

0.6357597

In [84]:
get_toxicity_score("fat")

0.44670263

In [85]:
get_toxicity_score("fuck")

0.9017833

In [86]:
get_toxicity_score("hate")

0.2712817

I set the threshold at 0.4, which sits between the score for a word like "hate" (0.271) and the score for "ugly" (0.636). Next, I test how well the Perspective API performs at flagging toxic comments across the complete dataset using this threshold.
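Because the threshold is a judgment call, it can help to see how precision and recall move as it changes. The helper below is a minimal sketch, not part of the original analysis: `sweep_thresholds` is a hypothetical name, and it uses the same rule as this notebook (a comment is predicted toxic when its score is strictly greater than the threshold).

```python
def sweep_thresholds(scores, labels, thresholds):
    """Return (threshold, precision, recall) for each candidate threshold."""
    scores = list(scores)
    labels = list(labels)
    results = []
    for t in thresholds:
        # Same decision rule as the notebook: toxic if score > threshold.
        predicted = [int(s > t) for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
        fn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 1)
        # Guard against division by zero when nothing is flagged or nothing is toxic.
        precision = tp / (tp + fp) if (tp + fp) else float("nan")
        recall = tp / (tp + fn) if (tp + fn) else float("nan")
        results.append((t, precision, recall))
    return results
```

It accepts any iterables, so it could be called as `sweep_thresholds(df['score'], df['toxic'], [0.3, 0.4, 0.5, 0.6, 0.7])`.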

In [15]:
threshold = 0.4

df['prediction'] = (df['score'] > threshold).astype(int)
df['prediction'].value_counts()

0    33115
1     8223
Name: prediction, dtype: int64

At this threshold, the Perspective API marks far fewer comments as toxic (8,223) than not toxic (33,115). I am interested in the ratio of true positives to false positives, as well as true negatives to false negatives.

In [17]:
from sklearn.metrics import confusion_matrix

In [18]:
confusion_matrix(df['toxic'], df['prediction'])

array([[32978,  4417],
       [  137,  3806]])

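In general, the Perspective API has a much lower ratio of true positives to false positives (3,806 to 4,417) than of true negatives to false negatives (32,978 to 137). In other words, more than half of the comments it flags as toxic were not labeled toxic by human raters, while it misses very few of the comments that were.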
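The four cells of the confusion matrix can be turned into rates that are easier to compare across samples. This is a small sketch with a hypothetical helper name, `summarize_confusion`; it relies on scikit-learn's layout for binary labels, where the matrix reads `[[TN, FP], [FN, TP]]`.

```python
def summarize_confusion(tn, fp, fn, tp):
    """Derive common rates from the four cells of a binary confusion matrix."""
    return {
        "precision": tp / (tp + fp),            # flagged comments that are truly toxic
        "recall": tp / (tp + fn),               # toxic comments that get flagged
        "false_positive_rate": fp / (fp + tn),  # non-toxic comments that get flagged
        "accuracy": (tp + tn) / (tn + fp + fn + tp),
    }


# Cell values copied from the confusion matrix for the full dataset.
overall = summarize_confusion(tn=32978, fp=4417, fn=137, tp=3806)
for name, value in overall.items():
    print(f"{name}: {value:.3f}")
```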

Now, I pull the comments from the dataset that contain Internet slang acronyms, specifically 'lol', 'lmao', 'lmfao', 'wth', 'wtf', 'jk', 'idk', 'smh', 'ikr', and 'tbh'.

In [101]:
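# .copy() makes slang_df an independent DataFrame, so adding columns to it later does not trigger a SettingWithCopyWarning.
slang_df = df.loc[df.comment_text.str.contains(r'\b(?:lol|lmao|lmfao|jk|wth|wtf|idk|smh|ikr|tbh)\b')].copy()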
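It is worth checking what this pattern does and does not match. The sketch below applies the same regular expression with Python's `re` module to a few strings; the first three come from comments shown in this notebook (the URL is shortened), and the last two are made up for illustration. Two behaviors stand out: matching is case-sensitive, and `\b` treats punctuation as a word boundary, so a URL containing `smh` between dots also counts as a match.

```python
import re

# Same pattern as in the filtering cell.
SLANG = re.compile(r'\b(?:lol|lmao|lmfao|jk|wth|wtf|idk|smh|ikr|tbh)\b')

samples = [
    "Yes, I can tell. lol",
    "i hate u !! lol jkinn",
    "http://www.smh.com.au/business/",
    "LOL that is funny",   # made-up example: uppercase is not matched
    "lollipop",            # made-up example: no word boundary after "lol"
]

# Map each string to the list of acronyms the pattern finds in it.
matches = {text: SLANG.findall(text) for text in samples}
for text, found in matches.items():
    print(f"{text!r}: {found}")
```

If uppercase variants should count as well, passing `case=False` to `str.contains` is one option, though that would change the size of the sample analyzed here.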

I have created a new sample from the dataset containing comments with the above Internet slang acronyms. Notably, it has an average API toxicity score of 0.440, with a standard deviation of 0.302.

In [68]:
slang_df.describe()

Unnamed: 0,score,toxic,severe_toxic,obscene,threat,insult,identity_hate,prediction
count,145.0,145.0,145.0,145.0,145.0,145.0,145.0,145.0
mean,0.439647,0.268966,0.006897,0.137931,0.0,0.082759,0.02069,0.489655
std,0.301553,0.444959,0.083045,0.346023,0.0,0.276472,0.142837,0.501626
min,0.051259,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,0.144994,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,0.377297,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,0.713667,1.0,0.0,0.0,0.0,0.0,0.0,1.0
max,0.994802,1.0,1.0,1.0,0.0,1.0,1.0,1.0


In [69]:
slang_df.sort_values(['score'])

Unnamed: 0,id,comment_text,score,toxic,severe_toxic,obscene,threat,insult,identity_hate,prediction
2939,0b742fa92b4169b8,== I do wonder lol == \n\n I was going to give...,0.051259,0,0,0,0,0,0,0
21685,fdd5f92d0d639e7f,Just look for things like match reports. It's...,0.059447,0,0,0,0,0,0,0
41219,950c2210de11f3b3,"Yes, I can tell. lol",0.061462,0,0,0,0,0,0,0
12111,94848791a3cd9be6,"lol, that's all I wanted to hear ) Thankyou!",0.072715,0,0,0,0,0,0,0
19339,1e762d0e8f59f364,http://www.smh.com.au/business/skilled-group-c...,0.073950,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...
9394,c14b1ca18a037a35,Hey JustAnother \nYou suck balls lol,0.949581,1,0,0,0,0,0,1
1448,79940986763428e2,brezzo sucks balls hahahahaha lol,0.950997,1,0,1,0,1,0,1
37988,a6457ca0175c88fa,fucking die \n\nlol rofl \njoke\n\nhahahahahah...,0.962246,1,1,1,0,0,0,1
1756,95943cf07c616ce2,nigga bitch holez lol,0.982484,1,0,1,0,1,1,1


Below, I pulled only the comments that were labeled as toxic or severely toxic by human raters.

In [70]:
slang_df[(slang_df['toxic']==1)|(slang_df['severe_toxic']==1)].head()

Unnamed: 0,id,comment_text,score,toxic,severe_toxic,obscene,threat,insult,identity_hate,prediction
640,33b5ce98646a0dc9,lol \n\n \n\n == rules == \n\n rules! \n\n no...,0.901749,1,0,1,0,1,1,1
1448,79940986763428e2,brezzo sucks balls hahahahaha lol,0.950997,1,0,1,0,1,0,1
1717,91b08eb97add3b24,*You haven't been uncivil? BS! Your ignoranc...,0.796997,1,0,0,0,0,0,1
1756,95943cf07c616ce2,nigga bitch holez lol,0.982484,1,0,1,0,1,1,1
1809,997c318147caa4dd,i hate u !! lol jkinn,0.912972,1,0,0,0,0,0,1


Using the same threshold of 0.4, I test how well the Perspective API performs at flagging toxic comments that contain the Internet slang acronyms.

In [99]:
threshold = 0.4

slang_df['prediction'] = (slang_df['score'] > threshold).astype(int)
slang_df['prediction'].value_counts()

0    74
1    71
Name: prediction, dtype: int64

It seems that the Perspective API marks almost as many comments toxic (71) as not toxic (74). I am interested in the ratio of true positives to false positives, as well as true negatives to false negatives.

In [100]:
confusion_matrix(slang_df['toxic'], slang_df['prediction'])

array([[74, 32],
       [ 0, 39]])

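Here, the Perspective API did worse at separating toxic from non-toxic content: 32 of the 106 comments that human raters labeled non-toxic were flagged as toxic (a false positive rate of about 30%, compared with about 12% for the full dataset), although it missed none of the 39 toxic comments. However, I am working with a much smaller sample, which could be influencing these numbers. Within this sample, the result supports my hypothesis: the Perspective API made more mistakes on comments that contained Internet slang acronyms.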
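One rough way to check whether the small sample alone could explain the gap is a two-proportion z-test on the false positive rates. The sketch below is an addition, not part of the original analysis: `two_proportion_z_test` is a hypothetical helper, the counts are copied from the two confusion matrices, and the slang counts are subtracted from the full-dataset counts so the two groups do not overlap. It assumes comments are independent of one another, which may not hold for threads on the same page.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value


# False positives and human-labeled non-toxic totals from the two confusion matrices.
slang_fp, slang_neg = 32, 74 + 32
all_fp, all_neg = 4417, 32978 + 4417

# Remove the slang sample from the full dataset so the groups are disjoint.
rest_fp, rest_neg = all_fp - slang_fp, all_neg - slang_neg

z, p_value = two_proportion_z_test(slang_fp, slang_neg, rest_fp, rest_neg)
print(f"slang FPR = {slang_fp / slang_neg:.3f}, other FPR = {rest_fp / rest_neg:.3f}")
print(f"z = {z:.2f}, p = {p_value:.2g}")
```

A small p-value would suggest the difference in false positive rates is unlikely to be a sampling artifact, but it says nothing about why the slang comments are harder to score.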