# Coding Assignment: Data Bias

### Hypothesis: The Perspective API will make more mistakes on shorter, more informal pieces of content (like tweets and social media comments) than on longer ones (like book and movie reviews). To test this hypothesis, I gathered 16 short comments and 16 long comments from various social media platforms, customer reviews, and blog posts, and scored them with the Perspective API client to check whether it correctly classifies them as toxic or non toxic.

In [42]:
import numpy as np
import pandas as pd
from googleapiclient import discovery
import json

# Load the dataset, which has three columns: type, human_label, comment
df = pd.read_csv("data_bias.csv")
df

Unnamed: 0,type,human_label,comment
0,short,non toxic,follow me on instagram @33abhiram
1,short,non toxic,Cmon what are you doing?
2,short,non toxic,Did you vote for Obama?
3,short,non toxic,You're just plain wrong.
4,short,non toxic,I dont have any muslim friends
5,short,non toxic,I dont have any african american friends
6,short,non toxic,he passed away a couple days ago :(
7,short,non toxic,Americans have no general knowledge
8,short,toxic,You're an absolute disgrace.
9,short,toxic,wtf are you doing!


In [43]:
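# Function that uses the Perspective API client to score each comment
import os

def get_perspective_score(comment):
    # Read the key from the environment instead of hard-coding a secret.
    # PERSPECTIVE_API_KEY is an assumed variable name; use whatever you set.
    API_KEY = os.environ["PERSPECTIVE_API_KEY"]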

    client = discovery.build(
        "commentanalyzer",
        "v1alpha1",
        developerKey=API_KEY,
        discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
        static_discovery=False,
    )

    analyze_request = {
        "comment": {"text": comment},
        "requestedAttributes": {"TOXICITY": {}},
    }

    response = client.comments().analyze(body=analyze_request).execute()
    perspective_score = response['attributeScores']['TOXICITY']['spanScores'][0]['score']['value']

    return perspective_score

# Test the function
get_perspective_score("This is a test comment")



0.017341165
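The function above reads the first span score. The analyze response also carries a `summaryScore` for the whole comment, which is usually the value wanted for comment-level labels. Below is a minimal sketch of a more defensive extraction, assuming the documented response shape. `extract_toxicity` and `fake_response` are hypothetical names, and the numbers in the fake response are made up so the example runs without calling the API.

```python
def extract_toxicity(response, attribute="TOXICITY"):
    """Return the comment-level score for `attribute` from an analyze response.

    Prefers summaryScore (the score for the whole comment) and falls back
    to the first span score, which is what get_perspective_score reads.
    """
    scores = response["attributeScores"][attribute]
    summary = scores.get("summaryScore")
    if summary is not None and "value" in summary:
        return summary["value"]
    return scores["spanScores"][0]["score"]["value"]


# Hand-written stand-in for an API response; the values are invented.
fake_response = {
    "attributeScores": {
        "TOXICITY": {
            "spanScores": [
                {"begin": 0, "end": 22, "score": {"value": 0.02, "type": "PROBABILITY"}}
            ],
            "summaryScore": {"value": 0.03, "type": "PROBABILITY"},
        }
    },
    "languages": ["en"],
}

print(extract_toxicity(fake_response))
```

When no span annotations are requested, the single span typically covers the whole comment, so the two values are often the same; the fallback only matters if `summaryScore` is missing.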

In [44]:
# Use the get_perspective_score function to score each comment and create a new column perspective_score
df["perspective_score"] = df["comment"].apply(get_perspective_score)
df

Unnamed: 0,type,human_label,comment,perspective_score
0,short,non toxic,follow me on instagram @33abhiram,0.044636
1,short,non toxic,Cmon what are you doing?,0.07292
2,short,non toxic,Did you vote for Obama?,0.024967
3,short,non toxic,You're just plain wrong.,0.157667
4,short,non toxic,I dont have any muslim friends,0.377512
5,short,non toxic,I dont have any african american friends,0.502472
6,short,non toxic,he passed away a couple days ago :(,0.019854
7,short,non toxic,Americans have no general knowledge,0.397729
8,short,toxic,You're an absolute disgrace.,0.749544
9,short,toxic,wtf are you doing!,0.571686
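Calling the API once per row through `apply` sends requests back to back, and the API enforces a per-key request quota. The sketch below spaces out calls and retries on failure. `score_all` is a hypothetical helper, and the one-second default delay and three retries are assumptions rather than documented limits, so adjust them to the quota on your own project. The scoring function is passed in, which lets the example run with a stand-in instead of the real client.

```python
import time


def score_all(comments, score_fn, delay_s=1.0, max_retries=3):
    """Score each comment with score_fn, pausing between calls.

    Retries a failed call with exponential backoff and re-raises the
    error if the last attempt also fails.
    """
    scores = []
    for comment in comments:
        for attempt in range(max_retries):
            try:
                scores.append(score_fn(comment))
                break
            except Exception:
                if attempt == max_retries - 1:
                    raise
                time.sleep(delay_s * (2 ** attempt))
        time.sleep(delay_s)  # stay under the request quota
    return scores


# Stand-in scorer so the example runs offline; not a toxicity model.
demo_scores = score_all(["a", "bb"], lambda c: len(c) / 10, delay_s=0)
print(demo_scores)
```

In the notebook this could be used as `df["perspective_score"] = score_all(df["comment"], get_perspective_score)`.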


In [45]:
# Create a new column api_label: 1 (toxic) if perspective_score > 0.5, else 0
df["api_label"] = (df["perspective_score"] > 0.5).astype(int)
df

Unnamed: 0,type,human_label,comment,perspective_score,api_label
0,short,non toxic,follow me on instagram @33abhiram,0.044636,0
1,short,non toxic,Cmon what are you doing?,0.07292,0
2,short,non toxic,Did you vote for Obama?,0.024967,0
3,short,non toxic,You're just plain wrong.,0.157667,0
4,short,non toxic,I dont have any muslim friends,0.377512,0
5,short,non toxic,I dont have any african american friends,0.502472,1
6,short,non toxic,he passed away a couple days ago :(,0.019854,0
7,short,non toxic,Americans have no general knowledge,0.397729,0
8,short,toxic,You're an absolute disgrace.,0.749544,1
9,short,toxic,wtf are you doing!,0.571686,1
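The 0.5 cutoff decides the label, and row 5 (score 0.502472) sits just above it. As a rough illustration of how sensitive the labels are to that choice, the sketch below counts disagreements with the human labels at a few thresholds. It uses only the ten rows displayed above, not the full 32-comment dataset, so it is not a tuning recommendation. `mismatches` is a hypothetical helper.

```python
# Scores and human labels for the ten rows displayed above (1 = toxic).
scores = [0.044636, 0.07292, 0.024967, 0.157667, 0.377512,
          0.502472, 0.019854, 0.397729, 0.749544, 0.571686]
human = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]


def mismatches(scores, human, threshold):
    """Count rows where the thresholded score disagrees with the human label."""
    predicted = [1 if s > threshold else 0 for s in scores]
    return sum(p != h for p, h in zip(predicted, human))


for threshold in (0.4, 0.5, 0.55, 0.6, 0.7):
    print(threshold, mismatches(scores, human, threshold))
```

On these ten rows a cutoff of 0.55 removes the one false positive, while 0.6 introduces a false negative, which shows how a single borderline score can move the result in a sample this small.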


In [46]:
# Convert human_label to binary (1 = toxic, 0 = non toxic)
df["human_label"] = [1 if x=='toxic' else 0 for x in df['human_label']]
df

Unnamed: 0,type,human_label,comment,perspective_score,api_label
0,short,0,follow me on instagram @33abhiram,0.044636,0
1,short,0,Cmon what are you doing?,0.07292,0
2,short,0,Did you vote for Obama?,0.024967,0
3,short,0,You're just plain wrong.,0.157667,0
4,short,0,I dont have any muslim friends,0.377512,0
5,short,0,I dont have any african american friends,0.502472,1
6,short,0,he passed away a couple days ago :(,0.019854,0
7,short,0,Americans have no general knowledge,0.397729,0
8,short,1,You're an absolute disgrace.,0.749544,1
9,short,1,wtf are you doing!,0.571686,1


In [47]:
# Select relevant columns for testing
df = df[["type", "human_label", "api_label"]]
df

Unnamed: 0,type,human_label,api_label
0,short,0,0
1,short,0,0
2,short,0,0
3,short,0,0
4,short,0,0
5,short,0,1
6,short,0,0
7,short,0,0
8,short,1,1
9,short,1,1


In [48]:
# Extract indices for short and long comments (first step of finding accuracy)
type_column = df["type"]
y_actual = df["human_label"].tolist()
y_predicted = df["api_label"].tolist()

# Any row whose type is not "long" is treated as short
long_indices = [i for i, t in enumerate(type_column) if t == "long"]
short_indices = [i for i, t in enumerate(type_column) if t != "long"]

y_actual_short = [y_actual[i] for i in short_indices]
y_predicted_short = [y_predicted[i] for i in short_indices]

y_actual_long = [y_actual[i] for i in long_indices]
y_predicted_long = [y_predicted[i] for i in long_indices]

print(len(short_indices))
print(len(long_indices))

16
16


In [50]:
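# Define function to find class-wise accuracy of comments:
# the share of toxic comments labelled toxic (recall) and
# the share of non-toxic comments labelled non-toxic (specificity)
def class_wise_acc(y_actual, y_predicted):
    total_p = 0
    total_n = 0
    TP = 0
    TN = 0
    for actual, predicted in zip(y_actual, y_predicted):
        if actual == 1:
            total_p += 1
            if predicted == 1:
                TP += 1
        elif actual == 0:
            total_n += 1
            if predicted == 0:
                TN += 1
    # Return NaN instead of raising ZeroDivisionError when a class is absent
    class_1_acc = TP / total_p if total_p else float("nan")
    class_0_acc = TN / total_n if total_n else float("nan")
    return class_1_acc, class_0_acc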

class_1_acc_short, class_0_acc_short = class_wise_acc(y_actual_short, y_predicted_short)
class_1_acc_long, class_0_acc_long = class_wise_acc(y_actual_long, y_predicted_long)

print(f"Accuracy for short, non toxic comments = {class_0_acc_short}")
print(f"Accuracy for short, toxic comments = {class_1_acc_short}")
print(f"Accuracy for long, non toxic comments = {class_0_acc_long}")
print(f"Accuracy for long, toxic comments = {class_1_acc_long}")

Accuracy for short, non toxic comments = 0.875
Accuracy for short, toxic comments = 0.75
Accuracy for long, non toxic comments = 0.875
Accuracy for long, toxic comments = 0.875
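With groups this small, the raw counts behind each ratio are as informative as the ratios themselves. The sketch below tallies true/false positives and negatives for one group. `confusion_counts` is a hypothetical helper, and the demo input is the ten short rows displayed earlier in the notebook, not the full 16-comment short group, so its figures are illustrative only.

```python
def confusion_counts(y_actual, y_predicted):
    """Return (TP, FP, TN, FN) for binary labels where 1 = toxic."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_actual, y_predicted):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Human and API labels for the ten short rows displayed earlier.
human = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
api = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1]

tp, fp, tn, fn = confusion_counts(human, api)
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"toxic accuracy (recall) = {tp / (tp + fn)}")
print(f"non toxic accuracy (specificity) = {tn / (tn + fp)}")
```

Calling it on `y_actual_short`/`y_predicted_short` and on the long lists would show how many comments each reported accuracy rests on.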


### Insights: Based on my results, the model's accuracy on toxic comments is 12.5 percentage points lower for short comments (75%) than for long ones (87.5%), while its accuracy on non-toxic comments is the same for both lengths (87.5%). This difference suggests a potential bias in its performance: the model may struggle with the shorter, more informal content commonly found in social media comments and tweets.
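To put the gap between 0.75 and 0.875 in context, the sketch below works through the arithmetic and attaches a 95% Wilson score interval to each accuracy. The counts 6/8 and 7/8 are an inference, not values read from the dataset: they are the only split of 16 comments per group that reproduces the printed accuracies. `wilson_interval` is a hypothetical helper written here with the standard formula.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Inferred counts of correctly labelled toxic comments per group (assumption).
short_correct, long_correct, n_toxic = 6, 7, 8

gap_points = (long_correct / n_toxic - short_correct / n_toxic) * 100
relative_drop = (long_correct - short_correct) / long_correct * 100

short_low, short_high = wilson_interval(short_correct, n_toxic)
long_low, long_high = wilson_interval(long_correct, n_toxic)

print(f"gap = {gap_points} percentage points ({relative_drop:.1f}% relative)")
print(f"short toxic: 95% interval {short_low:.2f} to {short_high:.2f}")
print(f"long toxic:  95% interval {long_low:.2f} to {long_high:.2f}")
```

Under that assumption the gap corresponds to one comment out of eight, and the two intervals overlap widely, so a larger sample would be needed to tell a real length effect from chance.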

### The informal nature of shorter comments, especially in the form of slang and expressions found in social media comments and tweets, may introduce bias by making it more challenging for the Perspective API to accurately interpret and classify toxicity. Longer comments like book and movie reviews offer a more comprehensive language context, which potentially reduces any bias in the API model's evaluation.

### These findings suggest that users should be mindful of the model's potential limitations and biases in scenarios where content length and informality are involved.