In [4]:
import numpy as np
import pandas as pd

test_df = pd.read_csv("Sample_labaled_data.csv")
test_df.head()

Unnamed: 0.1,Unnamed: 0,id,comment_text,toxic
0,5,0001ea8717f6de06,Thank you for understanding I think very highl...,no
1,7,000247e83dcc1211,Dear god this site is horrible,no
2,11,0002f87b16116a7f,Somebody will invariably try to add Religion ...,no
3,13,0003e1cccfd5a40a,It says it right there that it IS a type The...,no
4,14,00059ace3e3e9a53,Before adding a new product to the list mak...,no


Hypothesis: If a comment contains threats, Perspective API is more likely to classify it as toxic.

In [5]:
from googleapiclient import discovery

API_KEY = 'API'

client = discovery.build(
    "commentanalyzer",
    "v1alpha1",
    developerKey=API_KEY,
    discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
    static_discovery=False,
)

TOXICITY_THRESHOLD = 0.8
THREAT_THRESHOLD = 0.5
comments = test_df['comment_text']

comment_count = 1

for comment in comments:
    try:
        analyze_request = {
            'comment': {'text': comment},
            'requestedAttributes': {'TOXICITY': {}, 'THREAT': {}},
        }
        response = client.comments().analyze(body=analyze_request).execute()
        toxicity_score = response['attributeScores']['TOXICITY']['summaryScore']['value']
        threat_score = response['attributeScores']['THREAT']['summaryScore']['value']
    except Exception:
        # Skip comments the API could not score; they are not counted.
        continue

    # Label by the toxicity score alone, then note separately whether the comment is a threat.
    label = "toxic" if toxicity_score > TOXICITY_THRESHOLD else "non-toxic"
    if threat_score > THREAT_THRESHOLD:
        print(f"Comment {comment_count} '{comment}' contains threats and is classified as {label} with a score of {toxicity_score}.\n")
    else:
        print(f"Comment {comment_count} '{comment}' is classified as {label} with a score of {toxicity_score}.\n")

    comment_count += 1

    # Stop once 99 comments have been scored and printed.
    if comment_count == 100:
        break
    
  



Comment 1 'Thank you for understanding I think very highly of you and would not revert without discussion' is classified as non-toxic with a score of 0.016210219.

Comment 2 'Dear god this site is horrible' is classified as non-toxic with a score of 0.4014846.

Comment 3 '  It says it right there that it IS a type The Type of institution is needed in this case because there are three levels of SUNY schools  University Centers and Doctoral Granting Institutions  State Colleges  Community Colleges  It is needed in this case to clarify that UB is a SUNY Center It says it even in Binghamton University University at Albany State University of New York and Stony Brook University Stop trying to say its not because I am totally right in this case' is classified as non-toxic with a score of 0.050821137.

Comment 4 'this other one from 1897' is classified as non-toxic with a score of 0.025910228.

Comment 5 'blocked from editing Wikipedia   ' is classified as non-toxic with a score of 0.10089093
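
Printed lines like these are hard to compare against the hypothesis. The sketch below shows one way to tabulate the scores instead and compare the share of comments above the toxicity threshold for threatening and non-threatening comments. It assumes a hypothetical `score_comment` callable that wraps the API request above and returns a `(toxicity, threat)` pair; the helper names and the stand-in scorer with made-up numbers are illustrative and not part of the original notebook.

```python
TOXICITY_THRESHOLD = 0.8
THREAT_THRESHOLD = 0.5


def collect_scores(comments, score_comment):
    """Return one dict per comment that could be scored.

    `score_comment` is a hypothetical callable returning (toxicity, threat);
    in the notebook it would wrap client.comments().analyze(...).execute().
    """
    rows = []
    for comment in comments:
        try:
            toxicity, threat = score_comment(comment)
        except Exception:
            continue  # skip comments that could not be scored
        rows.append({"comment": comment, "toxicity": toxicity, "threat": threat})
    return rows


def toxic_rate_by_threat(rows):
    """Share of comments above the toxicity threshold, keyed by threat flag."""
    groups = {True: [], False: []}
    for row in rows:
        is_threat = row["threat"] > THREAT_THRESHOLD
        groups[is_threat].append(row["toxicity"] > TOXICITY_THRESHOLD)
    # None marks a group with no comments, where a rate is undefined.
    return {
        key: (sum(flags) / len(flags) if flags else None)
        for key, flags in groups.items()
    }


# Stand-in scorer with made-up numbers, used only to show the call pattern.
def fake_score(comment):
    if not comment:
        raise ValueError("empty comment")
    return (0.9, 0.85) if "stab" in comment else (0.1, 0.01)


demo_rows = collect_scores(["hello", "", "I'm going to stab you"], fake_score)
demo_rates = toxic_rate_by_threat(demo_rows)
print(demo_rates)
```

If the rate for the `True` group is clearly higher than for the `False` group on the real data, that would support the hypothesis; the two rates are only comparable when both groups contain a reasonable number of comments.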

Based on the results of this test, I found that the model tends to give higher toxicity scores to comments that contain profanity than to comments that are threatening. One possible explanation is that Perspective API weights profanity more heavily than threats when it computes toxicity. The data that Perspective API was trained on is likely biased in this direction, since it probably contained more profane comments than threatening ones. In one respect this makes sense, because profanity is more common online than threats. To human readers, however, a threat is usually more toxic than profanity, so Perspective API may want to take threatening comments more seriously. Below is a second test to check this explanation.

In [22]:
from googleapiclient import discovery

API_KEY = 'API'

client = discovery.build(
    "commentanalyzer",
    "v1alpha1",
    developerKey=API_KEY,
    discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
    static_discovery=False,
)

comments = ["I going to kidnap your entire family", "I going to kidnap your entire family fuck","I'm going to stab you", "I'm going to stab you shitty","You are shitty","You are so stupid"]

for comment in comments:
    analyze_request = {
        "comment": {"text": comment},
        "requestedAttributes": {
            "TOXICITY": {},
            "PROFANITY": {},
            "THREAT": {},
        },
    }
    response = client.comments().analyze(body=analyze_request).execute()
    toxicity_score = response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
    profanity_score = response["attributeScores"]["PROFANITY"]["summaryScore"]["value"]
    threat_score = response["attributeScores"]["THREAT"]["summaryScore"]["value"]

    print(
        f"Comment: {comment}\nToxicity score: {toxicity_score}\n"
        f"Profanity score: {profanity_score}\nThreat score: {threat_score}\n"
    )


Comment: I going to kidnap your entire family
Toxicity score: 0.90451443
Profanity score: 0.43900812
Threat score: 0.8535384

Comment: I going to kidnap your entire family fuck
Toxicity score: 0.9817629
Profanity score: 0.90862066
Threat score: 0.85759705

Comment: I'm going to stab you
Toxicity score: 0.8696708
Profanity score: 0.33453682
Threat score: 0.85759705

Comment: I'm going to stab you shitty
Toxicity score: 0.9563754
Profanity score: 0.7809412
Threat score: 0.85759705

Comment: You are shitty
Toxicity score: 0.9061063
Profanity score: 0.8255558
Threat score: 0.009048914

Comment: You are so stupid
Toxicity score: 0.91625386
Profanity score: 0.5381406
Threat score: 0.008764107



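These results support my analysis, although six comments are too few to prove it. Appending a profane word to a threat raised the toxicity score from 0.90 to 0.98 and from 0.87 to 0.96, while the threat score stayed at about 0.85. The two insults with threat scores below 0.01 ("You are shitty" and "You are so stupid") scored 0.91 and 0.92, as high as or higher than the threats without profanity. In this sample, profanity and insults are regarded as at least as toxic in the Perspective API model as threatening comments.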
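
The comparison can be made explicit with a little arithmetic on the scores printed above. The sketch below copies those scores by hand and computes how much the toxicity score rises when a profane word is appended to a threat, and how the threat-only comments compare with the insult-only ones. The variable and function names are illustrative and not part of the original notebook, and no new API calls are made.

```python
# Scores copied from the output above: comment -> (toxicity, profanity, threat).
recorded = {
    "I going to kidnap your entire family": (0.90451443, 0.43900812, 0.8535384),
    "I going to kidnap your entire family fuck": (0.9817629, 0.90862066, 0.85759705),
    "I'm going to stab you": (0.8696708, 0.33453682, 0.85759705),
    "I'm going to stab you shitty": (0.9563754, 0.7809412, 0.85759705),
    "You are shitty": (0.9061063, 0.8255558, 0.009048914),
    "You are so stupid": (0.91625386, 0.5381406, 0.008764107),
}

# Each pair is (threat only, same threat with a profane word appended).
pairs = [
    ("I going to kidnap your entire family", "I going to kidnap your entire family fuck"),
    ("I'm going to stab you", "I'm going to stab you shitty"),
]
insults = ["You are shitty", "You are so stupid"]


def toxicity(comment):
    """Toxicity score recorded for a comment."""
    return recorded[comment][0]


def average(values):
    values = list(values)
    return sum(values) / len(values)


# Change in toxicity when profanity is added to an otherwise identical threat.
lifts = [toxicity(variant) - toxicity(base) for base, variant in pairs]
for (base, _), lift in zip(pairs, lifts):
    print(f"{base!r}: toxicity changes by {lift:+.3f} when profanity is added")

mean_threat_only = average(toxicity(base) for base, _ in pairs)
mean_insult_only = average(toxicity(comment) for comment in insults)
print(f"Mean toxicity, threats without profanity: {mean_threat_only:.3f}")
print(f"Mean toxicity, insults without threats:   {mean_insult_only:.3f}")
```

With only two pairs and two insults, these numbers describe this sample rather than the model in general; a larger set of paired comments would be needed before drawing a firm conclusion.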