# Checking Comments That Contain a Claim

## Google Fact Check API

- Filtered all comments marked as check-worthy and kept the claim text, video ID, and category.
- Added unique IDs and saved the file as `claims_comments.csv`.
- Sampled roughly 10,000 claims (10,011 in total) across categories, capping each category at the number of claims available so nothing was oversampled.
- Sent each claim to the Google Fact Check API to see if it matched anything.
- If a match was found, I saved the matched text, publisher, rating, review date, and URL.
- Saved everything to `fact_check_results_10000.csv`.


### Importing Libraries

In [None]:
import pandas as pd
import requests
import time

In [4]:
# Load dataset and filter for check-worthy claims
df = pd.read_csv("Final_Thesis_Merged.csv")
checkworthy_df = df[df["Claim_Detection"] == 1].copy()

# Create cleaned DataFrame with necessary columns
checkworthy_df = checkworthy_df[["Video_ID", "Rewritten Comment", "Category"]]
checkworthy_df.columns = ["Video_ID", "Claim_Text", "Category"]

# Add unique Claim_ID
checkworthy_df.insert(0, "Claim_ID", [f"CMT_{i:04d}" for i in range(1, len(checkworthy_df) + 1)])

# Save to CSV
checkworthy_df.to_csv("claims_comments.csv", index=False)
print("Saved: 'claims_comments.csv'")


Saved: 'claims_comments.csv'


### Sampling 10,000 Claims Across Categories


In [19]:
df = pd.read_csv("claims_comments.csv")

category_sample_sizes = {
    "Human Rights": 3560,
    "Geopolitics": 2630,
    "Financial Ethics": 1245,
    "Other": 1015,
    "Corruption": 700,
    "Media Criticism": 480,
    "Sportswashing": 380,
    "Environmental Concerns": 1 
}

# Sample from each category 
sampled_df = pd.concat([
    df[df["Category"] == cat].sample(n=min(size, len(df[df["Category"] == cat])), random_state=42)
    for cat, size in category_sample_sizes.items()
]).reset_index(drop=True)

sampled_df.to_csv("sampled_10000_claims.csv", index=False)
print(f"Saved: sampled_10000_claims.csv with {len(sampled_df)} claims")


Saved: sampled_10000_claims.csv with 10011 claims
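
The capped per-category sampling above can also be wrapped in a small helper. This is a minimal sketch rather than the notebook's own code: `sample_by_category` and the toy data are hypothetical names introduced for illustration, and it only assumes a `Category` column like the one in `claims_comments.csv`.

```python
import pandas as pd


def sample_by_category(df, sizes, seed=42):
    """Draw up to sizes[cat] rows from each category, never more than exist."""
    parts = []
    for cat, size in sizes.items():
        group = df[df["Category"] == cat]
        # Cap the request at the group size so sampling never needs replacement
        parts.append(group.sample(n=min(size, len(group)), random_state=seed))
    return pd.concat(parts).reset_index(drop=True)


# Toy data (hypothetical): 5 claims in category "A" and 2 in category "B"
toy = pd.DataFrame({
    "Claim_ID": [f"CMT_{i:04d}" for i in range(1, 8)],
    "Category": ["A"] * 5 + ["B"] * 2,
})
sampled = sample_by_category(toy, {"A": 3, "B": 10})
counts = sampled["Category"].value_counts().to_dict()
print(counts)
```

Requesting 10 rows from a category that only has 2 returns those 2, which is the same capping behaviour as the `min(size, len(...))` expression in the cell above.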


### Querying Sampled Claims Using the Google Fact Check API


In [22]:
# Load sampled claims
df = pd.read_csv("sampled_10000_claims.csv")

API_KEY = "****************"
FACT_CHECK_URL = "https://factchecktools.googleapis.com/v1alpha1/claims:search"

# Store results
results = []

# Loop through claims and submit to API
for idx, row in df.iterrows():
    claim_id = row["Claim_ID"]
    claim_text = row["Claim_Text"]
    category = row["Category"]

    params = {
        "query": claim_text,
        "key": API_KEY,
        "languageCode": "en"
    }

    try:
        # A timeout keeps one stalled request from blocking the whole run
        response = requests.get(FACT_CHECK_URL, params=params, timeout=30)
        if response.status_code == 200:
            data = response.json()
            claim_reviews = data.get("claims", [])

            # If there are results, save them
            for review in claim_reviews:
                # Use the first review; `or [{}]` also covers an empty or missing list
                first_review = (review.get("claimReview") or [{}])[0]
                entry = {
                    "Claim_ID": claim_id,
                    "Category": category,
                    "Query_Claim": claim_text,
                    "Matched_Text": review.get("text", ""),
                    "Claim_Publisher": first_review.get("publisher", {}).get("name", ""),
                    "Claim_Review_URL": first_review.get("url", ""),
                    "Review_Rating": first_review.get("textualRating", ""),
                    "Review_Date": first_review.get("reviewDate", "")
                }
                results.append(entry)
        else:
            print(f"Error: {response.status_code} for Claim_ID {claim_id}")
        time.sleep(0.5)  # Throttle to stay within rate limits
    except Exception as e:
        print(f"Error at Claim_ID {claim_id}: {e}")
        time.sleep(1)

# Save results to CSV
result_df = pd.DataFrame(results)
result_df.to_csv("fact_check_results_10000.csv", index=False)
print(f"Saved: fact_check_results_10000.csv with {len(result_df)} matched results.")


Saved: fact_check_results_10000.csv with 1 matched results.
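
The loop above keeps only the first review attached to each returned claim. A variant that keeps every review, and skips claims that carry none, could look like the sketch below. The function name `flatten_fact_check_response` and the sample response are hypothetical; the field names are the ones the loop above already reads.

```python
def flatten_fact_check_response(data, claim_id, category, query):
    """Turn one claims:search JSON response into a list of flat rows."""
    rows = []
    for claim in data.get("claims", []):
        # A claim may carry several reviews; `or []` guards a missing or empty list
        for review in claim.get("claimReview") or []:
            rows.append({
                "Claim_ID": claim_id,
                "Category": category,
                "Query_Claim": query,
                "Matched_Text": claim.get("text", ""),
                "Claim_Publisher": review.get("publisher", {}).get("name", ""),
                "Claim_Review_URL": review.get("url", ""),
                "Review_Rating": review.get("textualRating", ""),
                "Review_Date": review.get("reviewDate", ""),
            })
    return rows


# Hypothetical response, shaped like the fields read in the loop above
sample = {
    "claims": [
        {"text": "Example claim", "claimReview": [
            {"publisher": {"name": "Example Publisher"}, "url": "https://example.org/a",
             "textualRating": "False", "reviewDate": "2020-01-01T00:00:00Z"},
            {"publisher": {"name": "Another Publisher"}, "url": "https://example.org/b",
             "textualRating": "Misleading"},
        ]},
        {"text": "Claim without reviews", "claimReview": []},
    ]
}
rows = flatten_fact_check_response(sample, "CMT_0001", "Other", "example query")
print(len(rows))  # one row per review, none for the review-less claim
```

Whether one row per review or one row per claim is preferable depends on how the matches are counted afterwards, so this is a design choice rather than a correction.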


Of the 10,011 check-worthy YouTube claims I sampled across categories such as Human Rights, Corruption, and Geopolitics, only one produced a match in Google’s Fact Check API. This suggests that most of what people claim in YouTube comments is not covered by formal fact-checking sources. It points to a substantial gap: the kinds of claims that come up in everyday online discussion are often not verified anywhere.


## Azure Bing Search & GPT-4o

- Used Azure OpenAI’s GPT-4o with Bing grounding to fact-check claims from YouTube comments.
- Sent one claim at a time in a structured prompt via the API.
- The model searched the web in real time and returned:
  - A result code:
    - `1 = Likely True`
    - `0 = Unverifiable`
    - `-1 = Likely False`
    - `-2 = Opinion or Speculation`
  - A short explanation (1–3 sentences).
  - 1–3 sources (titles or URLs).
- Parsed responses and saved them incrementally to a CSV file.
- This gave me a labelled dataset for factuality based on live web content.
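
The response format listed above (result code, explanation, sources) lends itself to regular-expression parsing. The following is a minimal sketch, assuming the model sticks to the `Result:` / `Explanation:` / `Sources:` layout; `parse_reply`, `VALID_CODES`, and the sample reply are hypothetical and only serve as illustration.

```python
import re

VALID_CODES = {1, 0, -1, -2}


def parse_reply(text):
    """Pull the result code, explanation, and sources out of one model reply."""
    result = re.search(r"Result:\s*\[?\s*(-?\d+)", text)
    explanation = re.search(r"Explanation:\s*(.+?)\s*Sources:", text, re.DOTALL)
    sources = re.search(r"Sources:\s*(.+)", text, re.DOTALL)

    code = int(result.group(1)) if result else None
    if code not in VALID_CODES:
        code = None  # treat anything outside the four codes as unparsed
    return {
        "Result": code,
        "Explanation": explanation.group(1).strip() if explanation else "",
        "Sources": sources.group(1).strip() if sources else "",
    }


# Hypothetical reply text, written to match the requested format
reply = (
    "Claim: CMT_0001\n"
    "Result: [-1]\n"
    "Explanation: Available reporting contradicts this statement.\n"
    "Sources: [Example Article Title]"
)
parsed = parse_reply(reply)
print(parsed["Result"], "|", parsed["Explanation"])
```

Returning `None` for an unexpected code makes unparsed replies easy to find and re-run later, instead of letting a stray value flow into the analysis.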


### Importing Libraries

In [None]:
import pandas as pd
import time
import uuid
import requests
from tqdm import tqdm
import os
import re

### Setting Up Azure OpenAI Credentials and Configuration


In [4]:
API_KEY = "****************"
ENDPOINT = "https://************.openai.azure.com/"
DEPLOYMENT_NAME = "gpt-4o"
API_VERSION = "2024-02-15-preview"

HEADERS = {
    "Content-Type": "application/json",
    "api-key": API_KEY
}

### Claim Verification Pipeline Using Azure OpenAI and Bing Grounding


In [None]:
# File Paths
INPUT_CSV = "claims_comments.csv"
OUTPUT_CSV = "claims_comments_Results.csv"  # same file name the analysis cells read
CLAIM_BATCH_SIZE = 1  

# Load and Filter Data
df = pd.read_csv(INPUT_CSV)
df = df.dropna(subset=["Claim_Text"]).reset_index(drop=True)

# Batch utility
def create_batched_claims(df, batch_size):
    for i in range(0, len(df), batch_size):
        yield df.iloc[i:i + batch_size]

# Prompt Builder
def build_prompt(claims):
    system_msg = (
        "You are a fact-checking assistant. You will be given 1 claim at a time.\n"
        "For each claim:\n"
        "- Search the web using Bing grounding.\n"
        "- Return a result using the following code:\n"
        "  1 = Likely True\n"
        "  0 = Unverifiable\n"
        "  -1 = Likely False\n"
        "  -2 = Opinion or Speculation\n"
        "- Provide a 1–3 sentence explanation.\n"
        "- List 1–3 sources: preferably article titles, and URLs if available.\n\n"
        "Format your response exactly like this:\n"
        "Claim:\nResult: [1 / 0 / -1 / -2]\nExplanation: [short paragraph]\nSources: [source 1, source 2, source 3]"
    )
    messages = [{"role": "system", "content": system_msg}]
    for _, row in claims.iterrows():
        messages.append({"role": "user", "content": f"Claim {row['Claim_ID']}: {row['Claim_Text']}"})
    return messages

# Extract structured fields
def extract_fields(claim_id, claim_text, response_text):
    # Accept only the four valid codes (1, 0, -1, -2); values such as 2 are rejected
    result_match = re.search(r"Result:\s*\[?(-[12]|[01])(?!\d)", response_text)
    explanation_match = re.search(r"Explanation:\s*(.+?)Sources:", response_text, re.DOTALL)
    sources_match = re.search(r"Sources:\s*(.+)", response_text, re.DOTALL)

    return {
        "Claim_ID": claim_id,
        "Claim_Text": claim_text,
        "Result": result_match.group(1).strip() if result_match else "",
        "Explanation": explanation_match.group(1).strip() if explanation_match else "",
        "Sources": sources_match.group(1).strip() if sources_match else ""
    }

# Main Loop
records = []
url = f"{ENDPOINT}openai/deployments/{DEPLOYMENT_NAME}/chat/completions?api-version={API_VERSION}"

# Ceiling division gives the exact number of batches for the progress bar
for batch in tqdm(create_batched_claims(df, CLAIM_BATCH_SIZE), total=(len(df) + CLAIM_BATCH_SIZE - 1) // CLAIM_BATCH_SIZE):
    messages = build_prompt(batch)

    body = {
        "messages": messages,
        "temperature": 0.2,
        "max_tokens": 1500,
        "top_p": 1,
        "frequency_penalty": 0,
        "presence_penalty": 0
    }

    try:
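        # A timeout keeps one stalled request from blocking the whole run
        response = requests.post(url, headers=HEADERS, json=body, timeout=120)

        if response.status_code == 200:
            reply = response.json()["choices"][0]["message"]["content"]
            claim_id = batch.iloc[0]["Claim_ID"]
            claim_text = batch.iloc[0]["Claim_Text"]

            record = extract_fields(claim_id, claim_text, reply)
            # Keep the category so the per-category breakdown can group on it
            record["Category"] = batch.iloc[0]["Category"]
            records.append(record)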

            # Save to output CSV incrementally
            pd.DataFrame([record]).to_csv(
                OUTPUT_CSV,
                mode='a',
                index=False,
                header=not os.path.exists(OUTPUT_CSV)
            )
        else:
            print("Error:", response.status_code)
            print(response.text)

    except Exception as e:
        print("Exception:", str(e))

    time.sleep(1)

print("Processing complete. Output saved to:", OUTPUT_CSV)

100%|█████████████████████████████████▊| 26434/26537 [22:06:05<01:22,  1.25it/s]
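
In the loop above, a request that fails is logged and skipped, so that claim never gets a row in the output file. One common way to reduce such gaps is to retry transient failures with exponential backoff. The sketch below is an assumption about how this could be added, not part of the original run; `call_with_retry`, the set of retryable status codes, and `FakeResponse` are hypothetical.

```python
import time


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    `send` is any zero-argument callable returning an object with a
    `status_code` attribute, e.g. lambda: requests.post(url, headers=HEADERS, json=body).
    """
    retryable = {429, 500, 502, 503, 504}
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code not in retryable:
            return response
        if attempt < max_attempts - 1:
            # Exponential backoff: 1s, 2s, 4s, ... between attempts
            sleep(base_delay * 2 ** attempt)
    return response  # last response, still an error after all attempts


# Usage with a stand-in for the HTTP call (no network needed)
class FakeResponse:
    def __init__(self, status_code):
        self.status_code = status_code


statuses = iter([429, 503, 200])
delays = []
result = call_with_retry(lambda: FakeResponse(next(statuses)), sleep=delays.append)
print(result.status_code, delays)
```

Passing the request as a callable keeps the helper independent of `requests`, which also makes it testable without network access.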

### Generating Category-Wise Breakdown of Claim Verification Results


In [19]:
# Load and map results
df = pd.read_csv("claims_comments_Results.csv")
df['Result'] = df['Result'].map({
    1: 'Likely True',
    0: 'Unverifiable',
    -1: 'Likely False',
    -2: 'Opinion or Speculation'
})

# Group and sort
by_cat = df.groupby(['Category', 'Result']).size().unstack(fill_value=0)
by_cat = by_cat.loc[by_cat.sum(axis=1).sort_values(ascending=False).index]

print(by_cat)


              Category  Likely False  Likely True  Opinion or Speculation  Unverifiable
          Human Rights           893         4181                    3131           955
           Geopolitics           877         1884                    3327           799
      Financial Ethics           264         1079                    1511           449
                 Other           271          861                    1101           412
            Corruption           219          606                     743           289
       Media Criticism           120          287                     595           243
         Sportswashing            77          339                     508            91
Environmental Concerns             0            1                       0             0
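
Because the categories differ a lot in size, raw counts are hard to compare directly. A minimal sketch of converting the counts printed above into per-category percentages follows; the numbers are copied from that output, while the variable names `counts` and `shares` are introduced here for illustration.

```python
import pandas as pd

# Counts copied from the breakdown printed above
counts = pd.DataFrame(
    {
        "Likely False": [893, 877, 264, 271, 219, 120, 77, 0],
        "Likely True": [4181, 1884, 1079, 861, 606, 287, 339, 1],
        "Opinion or Speculation": [3131, 3327, 1511, 1101, 743, 595, 508, 0],
        "Unverifiable": [955, 799, 449, 412, 289, 243, 91, 0],
    },
    index=[
        "Human Rights", "Geopolitics", "Financial Ethics", "Other",
        "Corruption", "Media Criticism", "Sportswashing", "Environmental Concerns",
    ],
)

# Divide each row by its total so every category sums to roughly 100%
shares = counts.div(counts.sum(axis=1), axis=0).mul(100).round(1)
print(shares)
```

The single claim in Environmental Concerns yields a 100% share, so percentages for very small categories should be read with care.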


### Merging Transcript Agreement Labels into Claim-Level Results


In [7]:
# Loading the CSV files
merged_df = pd.read_csv("Final_Thesis_Merged.csv")
claims_df = pd.read_csv("claims_comments_Results.csv")

# Merge Agreed_with_Transcript into the claims DataFrame
claims_df = claims_df.merge(
    merged_df[['Rewritten Comment', 'Agreed_with_Transcript']],
    left_on='Claim_Text',
    right_on='Rewritten Comment',
    how='left'
)

# Drop redundant 'Rewritten Comment' column
claims_df.drop(columns=['Rewritten Comment'], inplace=True)

# Overwrite the original CSV with the updated data
claims_df.to_csv("claims_comments_Results.csv", index=False)
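
Because this merge joins on comment text rather than on an ID, any comment text that occurs more than once in `Final_Thesis_Merged.csv` would repeat the matching claim rows. The sketch below illustrates the effect on toy data (hypothetical values) and one possible guard, using the `validate` argument of `DataFrame.merge`; it is an illustration under these assumptions, not a description of what happened in the actual files.

```python
import pandas as pd

# Toy frames (hypothetical data) mirroring the column names used above
claims = pd.DataFrame({"Claim_Text": ["a", "b"], "Result": [1, -1]})
labels = pd.DataFrame({
    "Rewritten Comment": ["a", "a", "b"],   # "a" appears twice
    "Agreed_with_Transcript": [1, 1, 0],
})

# A plain left merge silently repeats the claim whose text matches two rows
naive = claims.merge(labels, left_on="Claim_Text", right_on="Rewritten Comment", how="left")
print(len(claims), "->", len(naive))

# Keeping one label row per comment text restores one row per claim;
# validate="many_to_one" makes pandas raise MergeError if duplicates remain
unique_labels = labels.drop_duplicates(subset="Rewritten Comment")
safe = claims.merge(
    unique_labels,
    left_on="Claim_Text",
    right_on="Rewritten Comment",
    how="left",
    validate="many_to_one",
).drop(columns=["Rewritten Comment"])
print(len(safe))
```

Note that `drop_duplicates` keeps the first label for a repeated text, which is only safe when the repeated rows agree on `Agreed_with_Transcript`.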




### Comparing Claim Verdicts with How Much They Agree with the Transcript


In [21]:
# Load the verification results
df = pd.read_csv("claims_comments_Results.csv")

# Keep only rows with Agreed_with_Transcript values of -1, 0, or 1
df = df[df['Agreed_with_Transcript'].astype(str).isin(['-1', '0', '1'])]

# Drop rows with missing values in Result column
df = df.dropna(subset=['Result'])

# Convert both columns to integers
df['Agreed_with_Transcript'] = df['Agreed_with_Transcript'].astype(int)
df['Result'] = df['Result'].astype(int)

# Group by both columns
grouped = df.groupby(['Result', 'Agreed_with_Transcript']).size().reset_index(name='Count')

# Mapping dictionaries
result_map = {
    1: 'Likely True',
   -1: 'Likely False',
   -2: 'Opinion',
    0: 'Unverifiable'
}

agreement_map = {
   -1: 'Disagrees',
    0: 'Neutral',
    1: 'Agrees'
}

# Map to readable labels (codes missing from the maps become NaN)
grouped['Verdict'] = grouped['Result'].map(result_map)
grouped['Agreement'] = grouped['Agreed_with_Transcript'].map(agreement_map)

# Final output
cleaned = grouped[['Verdict', 'Agreement', 'Count']].sort_values(by=['Verdict', 'Agreement'])

# Print result
print(cleaned)


         Verdict  Agreement  Count
8   Likely False     Agrees    240
6   Likely False  Disagrees   1082
7   Likely False    Neutral   1404
14   Likely True     Agrees   2307
12   Likely True  Disagrees   1942
13   Likely True    Neutral   5029
5        Opinion     Agrees   1375
3        Opinion  Disagrees   4577
4        Opinion    Neutral   5368
11  Unverifiable     Agrees    342
9   Unverifiable  Disagrees    936
10  Unverifiable    Neutral   1974
2            NaN     Agrees     56
17           NaN     Agrees      5
0            NaN  Disagrees    133
15           NaN  Disagrees     20
1            NaN    Neutral    228
16           NaN    Neutral     22
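
The `NaN` verdict rows in the output above come from `Result` values that are not keys of `result_map`, since `Series.map` returns `NaN` for unmapped values. A minimal sketch of filtering to the known codes before grouping is shown below; the toy data is hypothetical and the value `2` merely stands in for any unexpected code.

```python
import pandas as pd

result_map = {1: "Likely True", -1: "Likely False", -2: "Opinion", 0: "Unverifiable"}

# Toy data (hypothetical): the value 2 is not one of the four result codes
toy = pd.DataFrame({
    "Result": [1, -2, 2, 0, 1],
    "Agreed_with_Transcript": [1, 0, 1, -1, 1],
})

# Series.map returns NaN for keys missing from the dictionary
unmapped = int(toy["Result"].map(result_map).isna().sum())
print(unmapped)

# Filtering to known codes before grouping keeps NaN verdicts out of the table
valid = toy[toy["Result"].isin(list(result_map))]
grouped = (
    valid.groupby(["Result", "Agreed_with_Transcript"])
    .size()
    .reset_index(name="Count")
)
grouped["Verdict"] = grouped["Result"].map(result_map)
print(grouped[["Verdict", "Agreed_with_Transcript", "Count"]])
```

Counting the unmapped rows before dropping them keeps a record of how many responses fell outside the expected format.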
