### Perspective API Exploration

First, we have a dataset of Wikipedia comments made available by Jigsaw, a subsidiary of Google that created the Perspective tool. The dataset includes a unique comment id, the text of the comment, and a series of binary labels applied by human raters: "toxic," "severe_toxic," "obscene," "threat," "insult," and "identity_hate." I have appended the "score" column, which represents the toxicity score assigned to the comment text by the live version of the Perspective API. The data is available under a CC0 license.

In [66]:
import pandas as pd
import time

df = pd.read_csv('labeled_and_scored_comments.csv')

In [67]:
df.sort_values(['score'])

Unnamed: 0,id,comment_text,score,toxic,severe_toxic,obscene,threat,insult,identity_hate
1158,61a85266de70cca0,Можешь говорить по русски.,0.000004,0,0,0,0,0,0
5246,030a9ba0b32b4456,Rodzina Krusenstern Potomkowie Admirała Adama ...,0.000009,0,0,0,0,0,0
11040,a6814193ac6b4705,The opening paragraph states that the game is ...,0.000091,0,0,0,0,0,0
24192,f0d6dee96e057a5b,The book is used as a source in a number of ar...,0.000116,0,0,0,0,0,0
2332,cbb9268d3c7429c1,Vals Vienés \n |Rumba \n |Cha Cha Chá(ciclo 7),0.000131,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...
19994,1429a2c930688317,you fucking piece of shit,0.997278,1,1,1,0,1,0
28728,d82e45eb103ad07b,fuck off you stupid aspy asshole,0.997736,1,1,1,0,1,0
25535,e9b92c6c6b01aeef,FUCK YOU YOU PIECE OF SHIT \n FUCK YOU YOU PIE...,0.997982,1,1,1,0,1,0
25945,e783fd267f3a9d3b,FUCK WIKIPEDIA ON WHEELS! \n\nFuck off wikiped...,0.998136,1,1,1,0,1,0


## This is a function to make calls to the Perspective API for my own testing. I have inserted my own API key and will use this function to test different comments

In [68]:
from googleapiclient.discovery import build
import json

def get_toxicity_score(comment):
    # API key redacted; substitute your own Perspective API key.
    API_KEY = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'

    # Build a client for the Comment Analyzer (Perspective) API.
    client = build(
        "commentanalyzer",
        "v1alpha1",
        developerKey=API_KEY,
        discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
        static_discovery=False,
    )

    # Request only the TOXICITY attribute for this comment.
    analyze_request = {
        'comment': {'text': comment},
        'requestedAttributes': {'TOXICITY': {}}
    }

    response = client.comments().analyze(body=analyze_request).execute()
    # Extract the overall (summary) toxicity score from the response.
    toxicity_score = response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

    return toxicity_score
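
The function above rebuilds the API client on every call. A minimal sketch of one possible refactor follows, assuming the same request and response shapes used above. The helper names `build_request`, `extract_score`, and `score_with_client` are hypothetical and not part of the Perspective client library; the idea is only to separate request construction and response parsing so the client can be built once and reused.

```python
def build_request(comment, attribute="TOXICITY"):
    """Build the analyze request body for a single comment."""
    return {
        "comment": {"text": comment},
        "requestedAttributes": {attribute: {}},
    }


def extract_score(response, attribute="TOXICITY"):
    """Pull the summary score for one attribute out of an analyze response."""
    return response["attributeScores"][attribute]["summaryScore"]["value"]


def score_with_client(client, comment):
    """Score a comment with a client that was built once, outside this function."""
    response = client.comments().analyze(body=build_request(comment)).execute()
    return extract_score(response)
```

To use it, the `client = build(...)` call from `get_toxicity_score` would run once at the top of the notebook, and each comment would then be scored with `score_with_client(client, comment)`.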

## As I begin parsing the data, I take words that appear in the comments with higher toxicity scores and pass them to the toxicity score function to see what is commonly found in a highly toxic comment

In [69]:
comment_list = ['FUCK', 'fuck', 'Fuck', 
                'piece of shit', 'PIECE OF SHIT', 'shit', 'SHIT', 'Shit', 'asshole', 'ASSHOLE', 'MOTHERFUCKER', 
                'motherfucker','motherfucking','fuck you', 'FUCK YOU', 'Fuck off', 
               'bitch', 'BITCH', 'Bitch', 'WHORE', 'whore', 'Whore', 'SLUT', 'Slut', 'slut', 'DICK', 'Dick', 'dick',
               
               'love', 'lover', 'loving',
               'motherfucker', 'motherfucking',
               'Bullshit', 'Bullshitter', 'Bullshiting',
               
                'HATE', 'hate', 'Hate',
               'Hoe', 'hoe', 'Hoe',
               
               'Hypocritical']

for comment in comment_list:
    score = get_toxicity_score(comment)
    print(comment, score)
    time.sleep(1)

FUCK 0.9320454
fuck 0.9017833
Fuck 0.9189384
piece of shit 0.95068264
PIECE OF SHIT 0.9583344
shit 0.7007861
SHIT 0.8546526
Shit 0.8129004
asshole 0.89158845
ASSHOLE 0.91406596
MOTHERFUCKER 0.96176314
motherfucker 0.9254919
motherfucking 0.89158845
fuck you 0.95473075
FUCK YOU 0.9646382
Fuck off 0.9646382
bitch 0.9254919
BITCH 0.95068264
Bitch 0.9448569
WHORE 0.7569718
whore 0.78524166
Whore 0.8129004
SLUT 0.6020386
Slut 0.6020386
slut 0.60152835
DICK 0.5958905
Dick 0.59863794
dick 0.59863794
love 0.024147147
lover 0.059728492
loving 0.033246122
motherfucker 0.9254919
motherfucking 0.89158845
Bullshit 0.8763571
Bullshitter 0.6556601
Bullshiting 0.61826205
HATE 0.31714454
hate 0.2712817
Hate 0.23891698
Hoe 0.4826145
hoe 0.47119883
Hoe 0.4826145
Hypocritical 0.5026305


##### I noticed that the API tends to give uppercase text a higher toxicity score than the same word or phrase in lowercase, although not in every case (for example, "WHORE" and "DICK" scored slightly lower than "whore" and "dick"). I also noticed that words and phrases, specifically profanities used as present-tense verbs, tended to receive higher toxicity scores.

## From the testing step I developed the hypothesis that words and phrases in all uppercase letters tend to receive a higher toxicity score than the same text in lowercase

In [70]:
comment_list_2 = ['I CANT BELIEVE YOU ASSHOLE', 'i cant believe you asshole',
                 'FUCK YOU', 'fuck you',
                 'YOURE A FUCKING MOTHERFUCKER', 'youre a fucking motherfucker',
                  'i hate you', 'I HATE YOU',
                  'you piece of shit', 'YOU PIECE OF SHIT',
                 'I LOVE YOU', 'i love you',
                 'YOU ARE BEAUTIFUL', 'you are beautiful']

for comment in comment_list_2:
    score = get_toxicity_score(comment)
    print(comment, score)
    time.sleep(1)

I CANT BELIEVE YOU ASSHOLE 0.95068264
i cant believe you asshole 0.9448569
FUCK YOU 0.9646382
fuck you 0.95473075
YOURE A FUCKING MOTHERFUCKER 0.9863082
youre a fucking motherfucker 0.9850823
i hate you 0.7627802
I HATE YOU 0.7969615
you piece of shit 0.9646382
YOU PIECE OF SHIT 0.96751314
I LOVE YOU 0.04268845
i love you 0.045091953
YOU ARE BEAUTIFUL 0.045778666
you are beautiful 0.045778666


##### In my small sample of test phrases, the results support the hypothesis: the all-uppercase version of a phrase received a higher toxicity score than the lowercase version in five of the seven pairs, even though the wording was identical. The trend appears mainly in comments that are already negative, for example comments that contain profanities. "I LOVE YOU" scored slightly lower than "i love you", and both versions of "you are beautiful" received the same score.
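
The uppercase/lowercase comparison can also be tallied in code rather than by eye. The sketch below hard-codes the scores printed in the output above, so it runs without calling the API; the function name `summarize_case_effect` is a hypothetical helper introduced only for illustration.

```python
# (uppercase score, lowercase score) copied from the printed output above.
pairs = {
    "i cant believe you asshole": (0.95068264, 0.9448569),
    "fuck you": (0.9646382, 0.95473075),
    "youre a fucking motherfucker": (0.9863082, 0.9850823),
    "i hate you": (0.7969615, 0.7627802),
    "you piece of shit": (0.96751314, 0.9646382),
    "i love you": (0.04268845, 0.045091953),
    "you are beautiful": (0.045778666, 0.045778666),
}


def summarize_case_effect(pairs):
    """Group phrases by whether the uppercase version scored higher, lower, or equal."""
    summary = {"higher": [], "lower": [], "equal": []}
    for phrase, (upper_score, lower_score) in pairs.items():
        if upper_score > lower_score:
            summary["higher"].append(phrase)
        elif upper_score < lower_score:
            summary["lower"].append(phrase)
        else:
            summary["equal"].append(phrase)
    return summary


summary = summarize_case_effect(pairs)
for phrase, (upper_score, lower_score) in pairs.items():
    # Positive values mean the uppercase version scored higher.
    print(f"{phrase!r}: uppercase - lowercase = {upper_score - lower_score:+.4f}")
print({key: len(value) for key, value in summary.items()})
```

Seven pairs is a very small sample, so this tally is only a rough check and not a statistical test.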

## Here I filter the comments by what I believe the toxicity threshold to be, after parsing the comments and scores in the CSV.

For this project I'm only interested in the 'toxic' label and toxicity scores at or above the threshold of 0.68.

In [71]:
abovethreshold = df[(df['score']>= 0.68)]

abovethreshold

Unnamed: 0,id,comment_text,score,toxic,severe_toxic,obscene,threat,insult,identity_hate
17,001068b809feee6b,""" \n\n ==balance== \n This page has one senten...",0.729625,0,0,0,0,0,0
24,0013fed3aeae76b7,DJ Robinson is gay as hell! he sucks his dick ...,0.989706,1,0,1,0,1,1
27,0016b94c8b20ffa6,I WILL BURN YOU TO HELL IF YOU REVOKE MY TALK ...,0.902320,0,0,0,0,0,0
29,0017d4d47894af05,":Fuck off, you anti-semitic cunt. |",0.959459,1,0,1,0,1,0
38,001eff4007dbb65b,"::No, he is an arrogant, self serving, immatur...",0.956893,1,0,1,0,1,0
...,...,...,...,...,...,...,...,...,...
41290,94aca26c29f7a0bd,"Hypocritical, double standard admins, as typic...",0.833779,0,0,0,0,1,0
41297,94a6beeb87ab68c0,"Damn, those are some rancid sources.",0.821990,1,0,1,0,0,0
41299,94a5024323152cd1,"==Why does it bother you, fuckface?89.123.100....",0.989706,1,0,1,0,1,0
41332,9481cd7393b583c9,"RE: \n\nIt's a fucking album cover, how the fu...",0.932649,1,0,1,0,0,0


##### I determined my threshold by passing sample words and phrases that appear in toxic comments into the function. I noticed that most words and phrases that I would consider toxic scored 0.68 or above.
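
As a rough illustration of how the 0.68 cutoff divides the words tested earlier, the sketch below splits a handful of the lowercase scores printed above into those at or above the threshold and those below it. The scores are copied from the earlier output; `split_by_threshold` is a hypothetical helper name.

```python
THRESHOLD = 0.68  # the cutoff chosen in this notebook

# A subset of the single-word scores printed earlier in the notebook.
word_scores = {
    "fuck": 0.9017833,
    "shit": 0.7007861,
    "asshole": 0.89158845,
    "bitch": 0.9254919,
    "whore": 0.78524166,
    "slut": 0.60152835,
    "dick": 0.59863794,
    "hate": 0.2712817,
    "hoe": 0.47119883,
    "love": 0.024147147,
    "Hypocritical": 0.5026305,
}


def split_by_threshold(scores, threshold=THRESHOLD):
    """Return (words at or above the threshold, words below it)."""
    above = [word for word, score in scores.items() if score >= threshold]
    below = [word for word, score in scores.items() if score < threshold]
    return above, below


above, below = split_by_threshold(word_scores)
print("at or above threshold:", above)
print("below threshold:", below)
```

A few words that many readers would consider offensive, such as "slut" and "dick", fall below the cutoff, which is consistent with the observation that most, but not all, toxic-seeming words score 0.68 or higher.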

## I saved the new dataset, which contains the comments at or above my threshold of 0.68, as a CSV file so I could parse it for false positives/negatives and determine why some comments with high toxicity scores weren't rated as toxic

In [72]:
#I uploaded this data frame to the repository 

abovethreshold.to_csv('abovethreshold_comments.csv')

## False positives/negatives in the top 5 and bottom 5 of the abovethreshold_comments.csv

##### I inferred that comments with high toxicity scores but no toxic label tended to be longer pieces of text, such as comment id "001068b809feee6b". This comment was a long entry discussing the etymology of the word "bitch". Because the word appears so many times in the entry, the API likely did not account for the context and gave the comment a high toxicity score, while the human rater was able to tell the difference and did not label the comment toxic.

##### Comment id "0016b94c8b20ffa6" was not labeled toxic by the human rater, but the API scored it above my toxicity threshold. I believe the rater did not label it toxic because there wasn't much toxicity in the comment; it appeared to be someone getting angry that their talk page access was being revoked.

##### Comment id "94aca26c29f7a0bd" appears to be more of an insult towards Wikipedia admins, calling them "Hypocritical". That word by itself receives a toxicity score of about 0.5. I believe the API made an error on this comment, because the comment itself wasn't toxic.
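
The disagreements discussed above were found by reading the table, but they can also be listed programmatically. The sketch below uses a few (id, score, toxic) rows copied from the tables shown earlier, in plain Python so it runs without the CSV file; `find_disagreements` is a hypothetical helper name, and a "false positive" here means a score at or above 0.68 with a human `toxic` label of 0.

```python
THRESHOLD = 0.68  # the cutoff chosen in this notebook

# (id, score, toxic) rows copied from the tables shown earlier.
rows = [
    ("61a85266de70cca0", 0.000004, 0),
    ("1429a2c930688317", 0.997278, 1),
    ("001068b809feee6b", 0.729625, 0),
    ("0013fed3aeae76b7", 0.989706, 1),
    ("0016b94c8b20ffa6", 0.902320, 0),
    ("0017d4d47894af05", 0.959459, 1),
    ("94aca26c29f7a0bd", 0.833779, 0),
    ("94a6beeb87ab68c0", 0.821990, 1),
]


def find_disagreements(rows, threshold=THRESHOLD):
    """Return ids where the API score and the human 'toxic' label disagree."""
    # High score, but the human rater did not label the comment toxic.
    false_positives = [cid for cid, score, toxic in rows
                       if score >= threshold and toxic == 0]
    # Low score, but the human rater labeled the comment toxic.
    false_negatives = [cid for cid, score, toxic in rows
                       if score < threshold and toxic == 1]
    return false_positives, false_negatives


false_positives, false_negatives = find_disagreements(rows)
print("false positives:", false_positives)
print("false negatives:", false_negatives)
```

On the full DataFrame, the same two filters would be `df[(df['score'] >= 0.68) & (df['toxic'] == 0)]` and `df[(df['score'] < 0.68) & (df['toxic'] == 1)]`. Because `abovethreshold` only keeps rows at or above the threshold, false negatives can only be found in the original `df`.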