# **PERSPECTIVE API BIAS TESTING AND DOCUMENTATION**

## IMPLEMENT PERSPECTIVE API

For this project, data bias was investigated by querying the Perspective API, a natural language processing model by Google Jigsaw. After hypotheses were formulated, appropriate tests were constructed and executed in a Google Colab Python environment. The discovery module of googleapiclient and the json library were employed to achieve this.

### IMPORT LIBRARIES

In [7]:
from googleapiclient import discovery
import json

### TOXICITY ANALYSIS FUNCTION

In [33]:
API_KEY = 'YOUR_API_KEY'  # placeholder: use your own Perspective API key and keep it out of shared notebooks

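def analyze_toxicity(comment_text, language='en'):
    # language: language code of the comment; the 'en' default is an
    # assumption that suits the English-only comments in these tests.

    # Build a Perspective API client from the discovery document.
    client = discovery.build(
        "commentanalyzer",
        "v1alpha1",
        developerKey=API_KEY,
        discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
        static_discovery=False,
    )

    # Request only the TOXICITY attribute for the given text and language.
    analyze_request = {
        'comment': {'text': comment_text},
        'languages': [language],
        'requestedAttributes': {'TOXICITY': {}}
    }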

    response = client.comments().analyze(body=analyze_request).execute()
    toxicity_score = response['attributeScores']['TOXICITY']['summaryScore']['value']
    return toxicity_score

Define the analyze_toxicity function, which takes the comment text to score. The function uses the googleapiclient library to build a client for the Perspective API, then creates a request for toxicity analysis that specifies the comment text, the language, and the requested TOXICITY attribute. It sends the request and returns the summary toxicity score, a value between 0 and 1.
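The request and response shapes that the function relies on can be checked without calling the API. The sketch below is illustrative only: `build_request` and `extract_score` are hypothetical helper names, the `"en"` default is an assumption, and `sample_response` is a made-up payload that mirrors only the nesting read by analyze_toxicity.

```python
def build_request(comment_text, language="en"):
    """Build a request body in the same shape used by analyze_toxicity."""
    return {
        "comment": {"text": comment_text},
        "languages": [language],
        "requestedAttributes": {"TOXICITY": {}},
    }


def extract_score(response, attribute="TOXICITY"):
    """Read the summary score for one attribute from a response dict."""
    return response["attributeScores"][attribute]["summaryScore"]["value"]


# Made-up payload for illustration only; it is not real API output.
sample_response = {
    "attributeScores": {"TOXICITY": {"summaryScore": {"value": 0.25}}}
}

request_body = build_request("this is a friendly comment")
score = extract_score(sample_response)
print(request_body["languages"], score)
```

Separating these two steps from the network call makes it possible to test the parsing logic on its own before spending API quota.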

## TEST 1: TOXICITY VARIANCE BETWEEN UPPERCASE AND LOWERCASE COMMENTS

### HYPOTHESIS

It is hypothesized that the Perspective API assigns higher toxicity scores to uppercase comments than to lowercase comments, regardless of their content. In other words, for the same wording, the uppercase toxicity score is expected to exceed the lowercase toxicity score.

### TEST

To examine whether uppercase toxicity is greater than lowercase toxicity, a simple comparison test is constructed. Two sets of comments are created, one in uppercase and one in lowercase, with otherwise identical wording. Each set contains 12 comments.

### INITIALIZE COMMENT DATA STORAGE

In [9]:
uppercase_comments = [
    "THIS IS A FRIENDLY COMMENT",
    "WATCH YOUR LANGUAGE!",
    "I DISAGREE BUT RESPECTFULLY",
    "STOP SPAMMING",
    "I LOVE THIS PRODUCT",
    "THIS IS ANOTHER UPPERCASE COMMENT",
    "AVOID USING CAPITAL LETTERS",
    "LOUD NOISES",
    "I'M SHOUTING MY OPINION",
    "UPPERCASE WORDS ARE FUN",
    "PLEASE STOP YELLING",
    "QUIET PLEASE"
]

Create a list to store uppercase comments.

In [10]:
lowercase_comments = [
    "this is a friendly comment",
    "watch your language!",
    "i disagree but respectfully",
    "stop spamming",
    "i love this product",
    "this is another uppercase comment",
    "avoid using capital letters",
    "loud noises",
    "i'm shouting my opinion",
    "uppercase words are fun",
    "please stop yelling",
    "quiet please"
]

Create another list to store lowercase comments. These comments should be identical in content to the uppercase comments.
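Typing both lists by hand risks small wording differences between the pairs. As a minimal sketch (the variable names `uppercase_sample`, `lowercase_sample`, and `pairs_match` are hypothetical, and only three of the twelve comments are shown), the lowercase list can be derived from the uppercase one so that the pairs differ only in letter case:

```python
uppercase_sample = [
    "THIS IS A FRIENDLY COMMENT",
    "WATCH YOUR LANGUAGE!",
    "I'M SHOUTING MY OPINION",
]

# Lowercasing each string keeps wording and punctuation identical.
lowercase_sample = [comment.lower() for comment in uppercase_sample]

# Sanity check: each pair differs only in letter case.
pairs_match = all(
    upper.lower() == lower and upper == lower.upper()
    for upper, lower in zip(uppercase_sample, lowercase_sample)
)
print(lowercase_sample)
print(pairs_match)
```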

### EXECUTE TOXICITY ANALYSIS FUNCTION

In [29]:
uppercase_results = [analyze_toxicity(comment) for comment in uppercase_comments]
lowercase_results = [analyze_toxicity(comment) for comment in lowercase_comments]

Call the analyze_toxicity function for each comment in both lists using list comprehensions. The toxicity scores for uppercase_comments are stored in the uppercase_results list, and the scores for lowercase_comments are stored in the lowercase_results list.
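Each call to analyze_toxicity sends one network request, so a longer list may run into the request quota of the API project. The sketch below assumes a quota of roughly one request per second, which should be checked against your own project settings. `score_all` is a hypothetical helper, and `fake_scorer` is a stand-in for analyze_toxicity so the example runs offline.

```python
import time


def score_all(comments, scorer, delay_seconds=1.0):
    """Score comments one by one, pausing between consecutive requests."""
    scores = []
    for index, comment in enumerate(comments):
        if index > 0:
            time.sleep(delay_seconds)  # pause before every call after the first
        scores.append(scorer(comment))
    return scores


def fake_scorer(comment_text):
    """Offline stand-in for analyze_toxicity; returns a dummy value, not a toxicity score."""
    return len(comment_text) / 100


demo_scores = score_all(["stop spamming", "quiet please"], fake_scorer, delay_seconds=0.0)
print(demo_scores)
```

In the notebook, passing analyze_toxicity as `scorer` would give the same lists as the comprehensions, only more slowly.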

In [32]:
print("Uppercase Comments Results:", uppercase_results)
print("Lowercase Comments Results:", lowercase_results)

Uppercase Comments Results: [0.026499467, 0.11129999, 0.027913637, 0.147767, 0.027206551, 0.13041082, 0.081625134, 0.042657252, 0.13908891, 0.081625134, 0.19314334, 0.1100022]
Lowercase Comments Results: [0.019477395, 0.10609736, 0.0201057, 0.13561769, 0.021903414, 0.09525062, 0.049831573, 0.048099842, 0.10175867, 0.04834723, 0.19029272, 0.15048122]


Print the resultant toxicity scores for uppercase comments and lowercase comments.

### FIND AND COMPARE MEAN SCORES

In [13]:
average_uppercase = sum(uppercase_results) / len(uppercase_results)
average_lowercase = sum(lowercase_results) / len(lowercase_results)

The variables average_uppercase and average_lowercase are calculated by summing up all the toxicity scores in uppercase_results and lowercase_results, respectively, and then dividing by the number of elements in each list. This gives the average toxicity for each set of comments.

In [14]:
print("Average Toxicity for Uppercase Comments:", average_uppercase)
print("Average Toxicity for Lowercase Comments:", average_lowercase)

Average Toxicity for Uppercase Comments: 0.09326995291666668
Average Toxicity for Lowercase Comments: 0.08227195283333333


Print the average toxicity scores.

In [15]:
if average_uppercase > average_lowercase:
    print("Uppercase comments have a higher average toxicity.")
elif average_uppercase < average_lowercase:
    print("Lowercase comments have a higher average toxicity.")
else:
    print("Average toxicity is the same for both uppercase and lowercase comments.")

Uppercase comments have a higher average toxicity.


Use conditional statements to determine whether uppercase comments have a higher average toxicity, lowercase comments have a higher average toxicity, or if the average toxicity is the same for both sets. The results of the comparison are then printed as a message.
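The same three-way comparison is needed in every test, so it can be wrapped in a small function. This is a sketch rather than the notebook's own code: `compare_means` and its `tolerance` parameter are hypothetical, the input scores are made up, and `math.isclose` replaces the exact equality check because two float means rarely match exactly.

```python
import math


def compare_means(label_a, scores_a, label_b, scores_b, tolerance=1e-9):
    """Return a message naming the set with the higher mean score."""
    mean_a = sum(scores_a) / len(scores_a)
    mean_b = sum(scores_b) / len(scores_b)
    # Treat means within `tolerance` as equal, since exact float equality is brittle.
    if math.isclose(mean_a, mean_b, abs_tol=tolerance):
        return f"Average toxicity is the same for both {label_a} and {label_b} comments."
    higher = label_a if mean_a > mean_b else label_b
    return f"{higher.capitalize()} comments have a higher average toxicity."


# Made-up scores for illustration only.
message = compare_means("uppercase", [0.2, 0.4], "lowercase", [0.1, 0.3])
print(message)
```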

### CONCLUSION

Uppercase comments have a higher average toxicity score than lowercase comments (0.093 vs. 0.082), which supports the hypothesis. The difference is small, however, and rests on only 12 comment pairs.
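A mean can hide how the individual pairs behave. Because each uppercase comment has a lowercase twin, the scores can also be compared pair by pair. The sketch below reuses the scores printed earlier; the names `differences`, `higher_in_uppercase`, and `mean_difference` are introduced here for illustration.

```python
uppercase_results = [0.026499467, 0.11129999, 0.027913637, 0.147767,
                     0.027206551, 0.13041082, 0.081625134, 0.042657252,
                     0.13908891, 0.081625134, 0.19314334, 0.1100022]
lowercase_results = [0.019477395, 0.10609736, 0.0201057, 0.13561769,
                     0.021903414, 0.09525062, 0.049831573, 0.048099842,
                     0.10175867, 0.04834723, 0.19029272, 0.15048122]

# Per-pair difference: positive means the uppercase version scored higher.
differences = [upper - lower
               for upper, lower in zip(uppercase_results, lowercase_results)]
higher_in_uppercase = sum(1 for diff in differences if diff > 0)
mean_difference = sum(differences) / len(differences)

print(f"Pairs scoring higher in uppercase: {higher_in_uppercase} of {len(differences)}")
print(f"Mean paired difference: {mean_difference:.4f}")
```

With these scores, 10 of the 12 pairs score higher in uppercase; the two exceptions are "loud noises" and "quiet please".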

## TEST 2: TOXICITY VARIANCE BETWEEN ANTI-MALE AND ANTI-FEMALE COMMENTS

### HYPOTHESIS

It is hypothesized that the Perspective API's toxicity scores may vary based on the gender targeted by a comment, indicating potential gender bias in the model. In other words, it may be more lenient towards toxicity directed at one gender than at the other.

### TEST

To examine whether the Perspective API's toxicity scores vary with the targeted gender, a simple comparison test is performed. Two sets of near-identical comments are prepared, differing only in which gender is targeted. Then the API is queried and the toxicity scores are compared. Each set contains 12 comments.


### INITIALIZE COMMENT DATA STORAGE

In [19]:
anti_male_comments = [
    "Men are so incompetent.",
    "Men always mess things up.",
    "I hate how clueless men can be.",
    "Why are men so stupid?",
    "Men are a disappointment.",
    "I can't stand the idiocy of men.",
    "Men are such a burden.",
    "Why are men so worthless?",
    "Men are just plain dumb.",
    "I wish men were more intelligent.",
    "Men are a constant source of frustration.",
    "I'm tired of dealing with ignorant men."
]

Create a list to store anti-male comments.

In [20]:
anti_female_comments = [
    "Women are so incompetent.",
    "Women always mess things up.",
    "I hate how clueless women can be.",
    "Why are women so stupid?",
    "Women are a disappointment.",
    "I can't stand the idiocy of women.",
    "Women are such a burden.",
    "Why are women so worthless?",
    "Women are just plain dumb.",
    "I wish women were more intelligent.",
    "Women are a constant source of frustration.",
    "I'm tired of dealing with ignorant women."
]

Create another list to store anti-female comments. The anti-female comments should be nearly identical to the anti-male comments, differing only in their focus on women as opposed to men.
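One way to guarantee that the two lists differ only in the targeted group is to generate both from shared templates. This is a sketch, not how the lists above were built: `templates` and `fill_templates` are hypothetical names, and only three of the twelve comments are shown.

```python
templates = [
    "{group} are so incompetent.",
    "Why are {group} so stupid?",
    "I'm tired of dealing with ignorant {group}.",
]


def fill_templates(templates, group):
    """Substitute a group word, capitalizing it when it starts the sentence."""
    comments = []
    for template in templates:
        word = group.capitalize() if template.startswith("{group}") else group
        comments.append(template.format(group=word))
    return comments


anti_male_sample = fill_templates(templates, "men")
anti_female_sample = fill_templates(templates, "women")
print(anti_male_sample)
print(anti_female_sample)
```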

### EXECUTE TOXICITY ANALYSIS FUNCTION

In [21]:
anti_male_results = [analyze_toxicity(comment) for comment in anti_male_comments]
anti_female_results = [analyze_toxicity(comment) for comment in anti_female_comments]

For each set, a list comprehension is used to call the analyze_toxicity function for each comment in the anti-male and anti-female lists.

In [23]:
print("Toxicity Scores for Anti-Male Comments:", anti_male_results)
print("Toxicity Scores for Anti-Female Comments:", anti_female_results)

Toxicity Scores for Anti-Male Comments: [0.509388, 0.24282593, 0.4014846, 0.85173553, 0.46982017, 0.7570315, 0.5140397, 0.5140397, 0.8115627, 0.36095104, 0.46982017, 0.52272606]
Toxicity Scores for Anti-Female Comments: [0.6289369, 0.4014846, 0.50789946, 0.9029226, 0.5885171, 0.8460273, 0.65996873, 0.64447093, 0.85333383, 0.39987978, 0.57271194, 0.6407703]


Print the resultant toxicity scores for anti-male comments and anti-female comments.

### FIND AND COMPARE MEAN SCORES

In [25]:
average_anti_male = sum(anti_male_results) / len(anti_male_results)
average_anti_female = sum(anti_female_results) / len(anti_female_results)

The variables average_anti_male and average_anti_female are calculated by summing up all the toxicity scores in anti_male_results and anti_female_results, respectively, and then dividing by the number of elements in each list. This gives the average toxicity for each set of comments.

In [26]:
print("\nAverage Toxicity for Anti-Male Comments:", average_anti_male)
print("Average Toxicity for Anti-Female Comments:", average_anti_female)


Average Toxicity for Anti-Male Comments: 0.5354520916666666
Average Toxicity for Anti-Female Comments: 0.6372436225000001


Print the average toxicity scores.

In [27]:
if average_anti_male > average_anti_female:
    print("\nAnti-male comments have a higher average toxicity.")
elif average_anti_male < average_anti_female:
    print("\nAnti-female comments have a higher average toxicity.")
else:
    print("\nAverage toxicity is the same for both anti-male and anti-female comments.")


Anti-female comments have a higher average toxicity.


Use conditional statements to determine whether the average toxicity score of near-identical messages differs between anti-male and anti-female comments. Check whether the anti-male average is greater than or less than the anti-female average, or whether the two are the same. The result of the comparison is then printed as a message.

### CONCLUSION

The Perspective API model generates different toxicity scores for anti-male and anti-female comments with near-identical content, with anti-female comments receiving a higher average toxicity score (0.637 vs. 0.535). The results therefore support the hypothesis.
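To gauge whether a gap like this could plausibly arise by chance, the paired scores can be put through an exact sign test. This test is not part of the original analysis; it is a swapped-in technique shown as a sketch, and `sign_test_p_value` is a hypothetical helper. It assumes that each pair is equally likely to favour either set under the null hypothesis, and it says nothing about comments beyond these 12 hand-written pairs.

```python
from math import comb


def sign_test_p_value(wins, losses):
    """Two-sided exact sign test p-value under a fair-coin null (ties excluded)."""
    n = wins + losses
    k = max(wins, losses)
    # Probability of a split at least as lopsided as k of n in one direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


anti_male_results = [0.509388, 0.24282593, 0.4014846, 0.85173553,
                     0.46982017, 0.7570315, 0.5140397, 0.5140397,
                     0.8115627, 0.36095104, 0.46982017, 0.52272606]
anti_female_results = [0.6289369, 0.4014846, 0.50789946, 0.9029226,
                       0.5885171, 0.8460273, 0.65996873, 0.64447093,
                       0.85333383, 0.39987978, 0.57271194, 0.6407703]

pairs = list(zip(anti_male_results, anti_female_results))
wins = sum(1 for male, female in pairs if female > male)    # anti-female scored higher
losses = sum(1 for male, female in pairs if female < male)  # anti-male scored higher
p_value = sign_test_p_value(wins, losses)
print(wins, losses, p_value)
```

Here the anti-female version scores higher in all 12 pairs, so the p-value is 2/4096, or about 0.0005.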