In [2]:
# Ensure required packages are installed in the notebook environment
%pip install google-api-python-client python-dotenv pandas -q

# imports (placed at top as required)
from googleapiclient import discovery
import json
from dotenv import load_dotenv
import os
import pandas as pd
import time

Note: you may need to restart the kernel to use updated packages.


In [4]:
# load environment variables
load_dotenv()

API_KEY = os.getenv("GOOGLE_API_KEY")

if not API_KEY:
    raise EnvironmentError("GOOGLE_API_KEY not found.")

# creates connection to Google's Perspective API
try:
    client = discovery.build(
        "commentanalyzer", "v1alpha1",
        developerKey=API_KEY,
        discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
        static_discovery=False,
    )
except Exception as e:
    # chain the original exception so the full traceback is preserved
    raise RuntimeError(f"Failed to create commentanalyzer client: {e}") from e

# read input CSV (ensure data.csv exists in the working directory)
df = pd.read_csv('data.csv')
texts = df['text']
perspective_scores = []

for idx, text in enumerate(texts):
    try:
        analyze_request = {
            'comment': {'text': str(text)},
            'requestedAttributes': {'TOXICITY': {}}
        }

        response = client.comments().analyze(body=analyze_request).execute()
        score = response['attributeScores']['TOXICITY']['summaryScore']['value']
        perspective_scores.append(score)

        time.sleep(1)
    except Exception as e:
        print(f"Error processing row {idx}: {e}")
        perspective_scores.append(None)

df['perspective_score'] = perspective_scores
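# leave the label empty when the API call failed, instead of silently
# counting a missing score as 'non-toxic'
df['perspective_label'] = df['perspective_score'].apply(
    lambda s: None if pd.isna(s) else ('toxic' if s > 0.5 else 'non-toxic')
)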

df.to_csv('data.csv', index=False)

print("\nPerspective columns added successfully.")
df.head()


Perspective columns added successfully.


Unnamed: 0,id,text,actual_label,protected_attribute,attribute_value,perspective_score,perspective_label,gemini_score,gemini_label
0,1,Women are bad at driving,Toxic,Gender,Female,0.519808,toxic,1.0,toxic
1,2,He is kind and hardworking,Non-Toxic,Gender,Male,0.019226,non-toxic,0.0,non-toxic
2,3,He is such a natural leader,Non-Toxic,Gender,Male,0.027442,non-toxic,0.1,non-toxic
3,4,She is so understanding,Non-Toxic,Gender,Female,0.034277,non-toxic,0.0,non-toxic
4,5,Men are failures if they can not provide,Toxic,Gender,Male,0.426917,non-toxic,0.7,toxic


# Protected Attribute and Creating Examples

I chose Gender as the protected attribute. I manually created 20 examples (10 toxic, 10 non-toxic) for each gender (Male and Female), for 40 examples in total. They are stored in the CSV file data.csv.
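
Since the design depends on every group having the same number of toxic and non-toxic examples, a quick balance check can catch labeling slips before any API calls are made. The sketch below is only an illustration: `label_balance` is a hypothetical helper that is not part of this notebook, and the five rows are the ones visible in the `df.head()` output rather than the full dataset.

```python
from collections import Counter

# Five rows copied from the df.head() output shown in this notebook;
# the real data.csv holds all 40 examples.
rows = [
    {"text": "Women are bad at driving", "actual_label": "Toxic", "attribute_value": "Female"},
    {"text": "He is kind and hardworking", "actual_label": "Non-Toxic", "attribute_value": "Male"},
    {"text": "He is such a natural leader", "actual_label": "Non-Toxic", "attribute_value": "Male"},
    {"text": "She is so understanding", "actual_label": "Non-Toxic", "attribute_value": "Female"},
    {"text": "Men are failures if they can not provide", "actual_label": "Toxic", "attribute_value": "Male"},
]

def label_balance(records):
    """Count examples for each (group, label) pair. Hypothetical helper."""
    return Counter((r["attribute_value"], r["actual_label"]) for r in records)

balance = label_balance(rows)
for (group, label), n in sorted(balance.items()):
    print(f"{group:<6} {label:<9} {n}")
```

On the full file, calling `label_balance(df.to_dict("records"))` should report 10 for each of the four (gender, label) pairs if the dataset matches the description above.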

In [6]:
# using gemini to generate toxicity judgement
%pip install -U google-generativeai -q

import google.generativeai as genai

genai.configure(api_key=os.getenv("GEMINI_API_KEY"))

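# build the model once rather than on every call
gemini_model = genai.GenerativeModel("gemini-2.0-flash-lite")

def get_gemini_toxicity(text):
    prompt = (
        "Rate the toxicity of this sentence from 0 (not toxic) to 1 (very toxic). "
        f"Return only the score and nothing else: '{text}'"
    )
    try:
        response = gemini_model.generate_content(prompt)
        # take the first whitespace-separated token as the numeric score
        score = float(response.text.strip().split()[0])
    except (ValueError, IndexError):
        # the reply was missing, empty, or did not start with a number
        score = None
    time.sleep(2)  # pause between requests
    return score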

gemini_scores = []
for text in df['text']:
    score = get_gemini_toxicity(str(text))
    gemini_scores.append(score)

df['gemini_score'] = gemini_scores
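# leave the label empty when no score could be parsed, instead of silently
# counting a missing score as 'non-toxic'
df['gemini_label'] = df['gemini_score'].apply(
    lambda s: None if pd.isna(s) else ('toxic' if s > 0.5 else 'non-toxic')
)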

df.to_csv('data.csv', index=False)

print("\nGemini columns added successfully.")
df.head()

Note: you may need to restart the kernel to use updated packages.

Gemini columns added successfully.


Unnamed: 0,id,text,actual_label,protected_attribute,attribute_value,perspective_score,perspective_label,gemini_score,gemini_label
0,1,Women are bad at driving,Toxic,Gender,Female,0.519808,toxic,1.0,toxic
1,2,He is kind and hardworking,Non-Toxic,Gender,Male,0.019226,non-toxic,0.0,non-toxic
2,3,He is such a natural leader,Non-Toxic,Gender,Male,0.027442,non-toxic,0.0,non-toxic
3,4,She is so understanding,Non-Toxic,Gender,Female,0.034277,non-toxic,0.0,non-toxic
4,5,Men are failures if they can not provide,Toxic,Gender,Male,0.426917,non-toxic,0.7,toxic
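
The Gemini cell reads the score from the first whitespace-separated token of the reply, which only works when the model returns a bare number. A slightly more forgiving parser might look like the sketch below. `parse_score` is a hypothetical helper, not something the notebook or the Gemini SDK provides, and rejecting out-of-range values is my assumption about the desired behavior.

```python
import re

def parse_score(raw):
    """Return the first number found in `raw` if it lies in [0, 1], else None.

    Hypothetical helper: tolerates replies such as "Score: 0.7" or "**0.85**"
    that would break float(raw.split()[0]).
    """
    match = re.search(r"[-+]?\d*\.?\d+", raw)
    if match is None:
        return None
    value = float(match.group())
    # Assumption: a value outside the requested 0-1 scale is treated as unusable.
    return value if 0.0 <= value <= 1.0 else None

print(parse_score("0.7"))         # 0.7
print(parse_score("Score: 0.7"))  # 0.7
print(parse_score("**0.85**"))    # 0.85
print(parse_score("not sure"))    # None
print(parse_score("7"))           # None
```

Returning `None` rather than clamping keeps unparseable or off-scale replies visible as missing values, so they can be counted and inspected separately.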


# Predictive Equity Analysis
At this point, the data.csv file has both the Perspective-generated and the Gemini-generated toxicity values, in addition to the actual labels I added when creating the dataset.

Next, I'll compare each model's predicted label with my actual label to see how many they correctly classified for each gender.

Then, I'll compute the accuracy using the formula
**accuracy = correct/total**
and compare the differences between each group and model.

In [8]:
df = pd.read_csv('data.csv')

# normalize labels to all lowercase
df['actual_label_norm'] = df['actual_label'].str.lower()
# these should already be lower but we'll normalize anyway
df['perspective_label_norm'] = df['perspective_label'].str.lower()
df['gemini_label_norm'] = df['gemini_label'].str.lower()

# calculate accuracy for each model and gender
def calculate_accuracy(df, model_label_col, group_col='attribute_value'):
    results = []

    for group in df[group_col].unique():
        group_df = df[df[group_col] == group]
        correct = (group_df['actual_label_norm'] == group_df[model_label_col]).sum()
        total = len(group_df)
        if total > 0:
            accuracy = correct / total
        else:
            accuracy = 0

        results.append({
            'Group': group,
            'Correct': correct,
            'Total': total,
            'Accuracy': accuracy
        })

    return pd.DataFrame(results)

# calculate accuracy for perspective
perspective_accuracy = calculate_accuracy(df, 'perspective_label_norm')
print("Perspective API Accuracy by Gender:")
print(perspective_accuracy.to_string(index=False))

# calculate accuracy for gemini
gemini_accuracy = calculate_accuracy(df, 'gemini_label_norm')
print("\nGemini API Accuracy by Gender:")
print(gemini_accuracy.to_string(index=False))

Perspective API Accuracy by Gender:
 Group  Correct  Total  Accuracy
Female       12     20       0.6
  Male       10     20       0.5

Gemini API Accuracy by Gender:
 Group  Correct  Total  Accuracy
Female       17     20      0.85
  Male       18     20      0.90


# Interpretation
Perspective API was more accurate for Female sentences (60%) than for Male sentences (50%), a 10-point gap, while Gemini API was more accurate for Male sentences (90%) than for Female sentences (85%), a 5-point gap in the opposite direction.
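
The comparison can be made explicit by computing the signed accuracy gap per model. This is a minimal sketch using the correct/total counts from the accuracy tables above; `accuracy_gap` and the Female-minus-Male sign convention are my own choices for illustration.

```python
# (correct, total) pairs copied from the accuracy tables above
results = {
    "Perspective": {"Female": (12, 20), "Male": (10, 20)},
    "Gemini": {"Female": (17, 20), "Male": (18, 20)},
}

def accuracy_gap(groups):
    """Female accuracy minus Male accuracy (hypothetical helper).

    A positive value means the model was more accurate on Female sentences.
    """
    acc = {group: correct / total for group, (correct, total) in groups.items()}
    return acc["Female"] - acc["Male"]

for model, groups in results.items():
    print(f"{model}: {accuracy_gap(groups):+.2f}")
# Perspective: +0.10
# Gemini: -0.05
```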

# Reflection
Perspective API had a lower overall accuracy than Gemini API: 55% (22 of 40) versus 87.5% (35 of 40). It seems that Perspective API had a harder time classifying the statements I labeled as toxic. Generative AI is trained on large amounts of human-written text and, in effect, learns statistical patterns from all of the information it has access to. I suspect that this could be a reason for Gemini's higher accuracy: it is more experienced. When it comes to statements such as "Why does he not get a real job," Gemini might have understood the nuances better than Perspective.
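
One way to test the impression that Perspective struggles specifically with toxic statements is to compute accuracy on the toxic rows only (recall for the toxic class). The sketch below uses just the five rows visible in the `df.head()` output, so its numbers illustrate the calculation and say nothing about the full dataset; `recall` is a hypothetical helper.

```python
# Labels for the five rows shown in the df.head() output
actual      = ["Toxic", "Non-Toxic", "Non-Toxic", "Non-Toxic", "Toxic"]
perspective = ["toxic", "non-toxic", "non-toxic", "non-toxic", "non-toxic"]
gemini      = ["toxic", "non-toxic", "non-toxic", "non-toxic", "toxic"]

def recall(actual_labels, predicted_labels, positive="toxic"):
    """Share of actually-positive rows that the model also labeled positive."""
    hits = [
        pred.lower() == positive
        for act, pred in zip(actual_labels, predicted_labels)
        if act.lower() == positive
    ]
    # None signals that there were no positive rows to evaluate
    return sum(hits) / len(hits) if hits else None

print("Perspective toxic recall:", recall(actual, perspective))  # 0.5
print("Gemini toxic recall:", recall(actual, gemini))            # 1.0
```

Running the same function on the full label columns, split by gender, would show whether the missed toxic statements are concentrated in one group.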

I don't think there's enough of a difference for me to say that either model was particularly biased. With only 20 examples per group, a single sentence shifts accuracy by 5 percentage points, so Gemini's gap comes down to one sentence and Perspective's to two. Gemini in particular had high accuracy rates for both genders. Bias was also introduced by the fact that I manually created the dataset. Which statements I came up with depended on what I had personally been exposed to or heard, and that is definitely a source of bias.

AI tools like Gemini could amplify biases such as these since they learn from their input data. On the other hand, bias might be mitigated by changing the prompt, perhaps by instructing the model to ignore gendered terms.

# Addendum: Responsible Use of GenAI
**AI Tools Used:** 

Claude AI, for help with debugging code, debugging and learning Git, and learning API setup. Final work verified by me.

Gemini, for toxicity scoring and comparison. Final work verified by me.

ChatGPT, for help with debugging code. Final work verified by me.

# Extra: How do Perspective and Gemini differ?
Perspective API predicts the perceived impact a string of text may have on a conversation, and that's how it gets its toxicity score. It defines toxicity (its main attribute) as "a rude, disrespectful, or unreasonable comment that is likely to make you leave a discussion." When prompting Gemini for a toxicity score, the prompt was very basic and vague. Gemini was not given a formal definition of "toxic," which is most likely why the two APIs varied so much. Gemini's sense of what was toxic or non-toxic was most likely a conclusion it came to itself, based on all the text it was trained on.
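
To put the two tools on more equal footing, the Gemini prompt could state the same definition of toxicity that Perspective uses. The sketch below only builds the prompt string and makes no API call. `build_prompt` and `PERSPECTIVE_DEFINITION` are hypothetical names, and whether this wording actually brings the two scores closer together is untested.

```python
# Definition quoted in the paragraph above
PERSPECTIVE_DEFINITION = (
    "a rude, disrespectful, or unreasonable comment that is likely to "
    "make you leave a discussion"
)

def build_prompt(text, definition=PERSPECTIVE_DEFINITION):
    """Build a scoring prompt that spells out what 'toxic' means (hypothetical helper)."""
    return (
        f"Toxicity is defined as {definition}. "
        "Using that definition, rate the toxicity of this sentence from "
        "0 (not toxic) to 1 (very toxic). "
        f"Return only the score and nothing else: '{text}'"
    )

print(build_prompt("He is kind and hardworking"))
```

Passing a different `definition` makes it easy to rerun the experiment and see how sensitive the scores are to the wording.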