# Politics of Emotions or Propaganda? (P3)

This project explores how **emotional language** is used strategically in political texts—such as speeches, social media posts, or debates—to **influence perception and manipulate audience response**.

The objective is to move beyond simple emotion classification and toward an **interpretation of emotion’s rhetorical function** within political discourse.

Data source: https://www.presidency.ucsb.edu/


In [1]:
# Install the required packages listed in requirements.txt, if not already installed
!pip install -r requirements.txt



## Dataset Development
Data source: https://www.presidency.ucsb.edu/documents/app-categories/elections-and-transitions/debates

See `fun.py` for how the speeches are cleaned.

In [2]:
from fun import process_debate_transcripts
process_debate_transcripts("transcripts", "data")

Processed: TRUMP_BIDEN_ATLANTA_2024.txt → TRUMP_BIDEN_ATLANTA_2024.csv
Processed: TRUMP_BIDEN_CLEVELAND_2020.txt → TRUMP_BIDEN_CLEVELAND_2020.csv
Processed: TRUMP_BIDEN_NASHVILLE_2020.txt → TRUMP_BIDEN_NASHVILLE_2020.csv
Processed: TRUMP_CLINTON_HEMPSTEAD_2016.txt → TRUMP_CLINTON_HEMPSTEAD_2016.csv
Processed: TRUMP_CLINTON_LOUIS_2016.txt → TRUMP_CLINTON_LOUIS_2016.csv
Processed: TRUMP_CLINTON_NEVADA_2016.txt → TRUMP_CLINTON_NEVADA_2016.csv
Processed: TRUMP_HARRIS_PHILADELPHIA_2024.txt → TRUMP_HARRIS_PHILADELPHIA_2024.csv
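
The `Location` and `Year` values in the resulting CSVs line up with the last two parts of each transcript filename (for example, `TRUMP_BIDEN_ATLANTA_2024` corresponds to `Atlanta` and `2024`). Whether `process_debate_transcripts` actually derives them this way is an assumption, since `fun.py` is not reproduced here, and the helper name `parse_debate_filename` is hypothetical. A minimal sketch of such a mapping could look like this:

```python
from pathlib import Path


def parse_debate_filename(filename):
    """Split a name like 'TRUMP_BIDEN_ATLANTA_2024.txt' into metadata.

    Hypothetical helper: assumes the pattern CANDIDATE_CANDIDATE_LOCATION_YEAR.
    """
    parts = Path(filename).stem.split("_")
    if len(parts) != 4 or not parts[3].isdigit():
        raise ValueError(f"Unexpected transcript filename: {filename}")
    first, second, location, year = parts
    return {
        "candidates": (first.title(), second.title()),
        "Location": location.title(),
        "Year": int(year),
    }


print(parse_debate_filename("TRUMP_BIDEN_ATLANTA_2024.txt"))
```

Rejecting names that do not match the pattern keeps a stray file in the transcripts folder from silently producing wrong metadata.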


In [3]:
import pandas as pd
atlanta = pd.read_csv("data/TRUMP_BIDEN_ATLANTA_2024.csv")
atlanta

Unnamed: 0,SpeechID,Speech,Speaker,Location,Year
0,1,"We're live from Georgia, a key battleground st...",Moderator,Atlanta,2024
1,2,"In just moments, the current U.S. president wi...",Moderator,Atlanta,2024
2,3,We want to welcome our viewers in the United S...,Moderator,Atlanta,2024
3,4,This debate is being produced by CNN and it's ...,Moderator,Atlanta,2024
4,5,This is a pivotal moment between President Joe...,Moderator,Atlanta,2024
...,...,...,...,...,...
713,714,"Now, you can go and you can get something. You...",Trump,Atlanta,2024
714,715,"Choice for our soldiers, where our soldiers, i...",Trump,Atlanta,2024
715,716,care of themselves and they're living. And tha...,Trump,Atlanta,2024
716,717,"So, all of these things – we're in a failing n...",Trump,Atlanta,2024


In [4]:
nevada = pd.read_csv("data/TRUMP_CLINTON_NEVADA_2016.csv")
nevada

Unnamed: 0,SpeechID,Speech,Speaker,Location,Year
0,1,Good evening from the Thomas and Mack Center a...,Moderator,Nevada,2016
1,2,"I'm Chris Wallace of Fox News, and I welcome y...",Moderator,Nevada,2016
2,3,This debate is sponsored by the Commission on ...,Moderator,Nevada,2016
3,4,The commission has designed the format: Six ro...,Moderator,Nevada,2016
4,5,"For the record, I decided the topics and the q...",Moderator,Nevada,2016
...,...,...,...,...,...
797,798,We cannot take four more years of Barack Obama...,Trump,Nevada,2016
798,799,Thank you both. Secretary Clinton—hold on just...,Moderator,Nevada,2016
799,800,That brings to an end this year's debates spon...,Moderator,Nevada,2016
800,801,Now the decision is up to you. While millions ...,Moderator,Nevada,2016


In [5]:
import os

# List to store dataframes
dataframes = []

if os.path.exists("data/combined_speeches.csv"):
    os.remove("data/combined_speeches.csv")

# Iterate through all files in the folder
for file in os.listdir("data"):
    if file.endswith(".csv"):
        file_path = os.path.join("data", file)
        df = pd.read_csv(file_path)
        dataframes.append(df)
# Concatenate all dataframes
combined_df = pd.concat(dataframes, ignore_index=True)

# Save the combined dataframe to a new .csv file
combined_df.to_csv("data/combined_speeches.csv", index=False)
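
The cell above deletes any earlier `combined_speeches.csv` so that it is not read back in as an input. As an alternative sketch, the same step can skip the combined file by name and sort the inputs so the row order does not depend on how the operating system lists the folder. The function name `combine_csvs` is hypothetical and not part of `fun.py`:

```python
from pathlib import Path

import pandas as pd


def combine_csvs(folder, output_name="combined_speeches.csv"):
    """Concatenate every CSV in `folder` except the combined file itself."""
    folder = Path(folder)
    # Sorting makes the row order independent of the file system's listing order
    csv_paths = sorted(p for p in folder.glob("*.csv") if p.name != output_name)
    if not csv_paths:
        raise FileNotFoundError(f"No CSV files found in {folder}")
    combined = pd.concat([pd.read_csv(p) for p in csv_paths], ignore_index=True)
    combined.to_csv(folder / output_name, index=False)
    return combined
```

Calling `combine_csvs("data")` twice in a row should give the same result both times, because the output file is never treated as an input.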

In [6]:
# Check the combined dataframe
speeches = pd.read_csv("data/combined_speeches.csv")
speeches.count()

SpeechID    5840
Speech      5840
Speaker     5840
Location    5840
Year        5840
dtype: int64

In [7]:
speeches

Unnamed: 0,SpeechID,Speech,Speaker,Location,Year
0,1,"We're live from Georgia, a key battleground st...",Moderator,Atlanta,2024
1,2,"In just moments, the current U.S. president wi...",Moderator,Atlanta,2024
2,3,We want to welcome our viewers in the United S...,Moderator,Atlanta,2024
3,4,This debate is being produced by CNN and it's ...,Moderator,Atlanta,2024
4,5,This is a pivotal moment between President Joe...,Moderator,Atlanta,2024
...,...,...,...,...,...
5835,723,She gave it to Afghanistan.,Trump,Philadelphia,2024
5836,724,"What these people have done to our country, an...",Trump,Philadelphia,2024
5837,725,"The worst President, the worst Vice President ...",Trump,Philadelphia,2024
5838,726,President Trump thank you. And that is our ABC...,Moderator,Philadelphia,2024


## Feeding the model

In [8]:
import torch
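print(torch.cuda.is_available())
# Only query the device name when a GPU is present; get_device_name raises otherwise
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))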

True
NVIDIA GeForce GTX 1070 Ti


In [9]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("SamLowe/roberta-base-go_emotions")
model = AutoModelForSequenceClassification.from_pretrained("SamLowe/roberta-base-go_emotions")

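# Use the first GPU (device 0) if available, otherwise fall back to the CPU (-1)
device = 0 if torch.cuda.is_available() else -1

# Reuse the model and tokenizer loaded above; top_k=None returns scores for all labels
classifier = pipeline(task="text-classification", model=model, tokenizer=tokenizer, top_k=None, device=device)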

Device set to use cuda:0


In [10]:
sentences = ["Hello, I am Joe Biden"]

model_outputs = classifier(sentences)
model_outputs[0]
# produces a list of dicts for each of the labels


[{'label': 'neutral', 'score': 0.9357642531394958},
 {'label': 'approval', 'score': 0.024361375719308853},
 {'label': 'excitement', 'score': 0.010637336410582066},
 {'label': 'realization', 'score': 0.01003023236989975},
 {'label': 'joy', 'score': 0.005965998861938715},
 {'label': 'annoyance', 'score': 0.004213426727801561},
 {'label': 'admiration', 'score': 0.0030739184003323317},
 {'label': 'amusement', 'score': 0.0028852401301264763},
 {'label': 'surprise', 'score': 0.00273154117166996},
 {'label': 'fear', 'score': 0.0024596164003014565},
 {'label': 'optimism', 'score': 0.00232110102660954},
 {'label': 'sadness', 'score': 0.002088801935315132},
 {'label': 'disgust', 'score': 0.0019926358945667744},
 {'label': 'gratitude', 'score': 0.0019211502512916923},
 {'label': 'curiosity', 'score': 0.001899787806905806},
 {'label': 'anger', 'score': 0.0018068865174427629},
 {'label': 'confusion', 'score': 0.001738501014187932},
 {'label': 'disappointment', 'score': 0.0016148401191458106},
 {'la

The `get_top_emotion` function (defined in `fun.py`) classifies a text and returns the label and score of its most probable emotion.
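
Because the source of `fun.py` is not shown here, the following is only a sketch of how a function with this behavior might be written. The name `get_top_emotion_sketch` is hypothetical, and two details are assumptions: returning a `pandas.Series` so the result can be assigned to two columns at once, and passing `truncation=True` so that speeches longer than the model's input limit are cut rather than raising an error.

```python
import pandas as pd


def get_top_emotion_sketch(text, classifier):
    """Return the highest-scoring label and its score as a two-item Series.

    Hypothetical stand-in for fun.get_top_emotion, whose source is not shown.
    """
    # Passing a one-item list yields one list of {'label', 'score'} dicts per text
    scores = classifier([text], truncation=True)[0]
    best = max(scores, key=lambda item: item["score"])
    # A Series lets DataFrame column assignment unpack into two columns
    return pd.Series([best["label"], best["score"]], index=["emotion", "score"])
```

Taking the maximum explicitly avoids relying on the order in which the pipeline returns its labels.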

In [11]:
from fun import get_top_emotion
# Use a lambda to pass both the text and classifier to the function
speeches[['emotion', 'score']] = speeches['Speech'].apply(lambda x: get_top_emotion(x, classifier))

You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset
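
The warning above appears because the pipeline is called once per row. One way to reduce the overhead, sketched here under assumptions, is to pass all texts in a single call and let the pipeline batch them through its `batch_size` argument. This is a simpler alternative to the dataset-based approach the warning mentions, not a reproduction of it. The function name `classify_in_batches` and the batch size of 32 are illustrative choices, not taken from `fun.py`.

```python
def classify_in_batches(texts, classifier, batch_size=32):
    """Classify all texts in one pipeline call and return (label, score) pairs."""
    # One call with a list lets the pipeline group inputs into batches internally
    outputs = classifier(list(texts), batch_size=batch_size, truncation=True)
    top = []
    for scores in outputs:
        # Each item in `outputs` is the full list of label scores for one text
        best = max(scores, key=lambda item: item["score"])
        top.append((best["label"], best["score"]))
    return top
```

With the pipeline loaded as above, the result could be attached to the dataframe with `speeches["emotion"], speeches["score"] = zip(*classify_in_batches(speeches["Speech"], classifier))`. The best batch size depends on the GPU memory available and on how long the speeches are, so it is worth trying a few values.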


In [12]:
speeches

Unnamed: 0,SpeechID,Speech,Speaker,Location,Year,emotion,score
0,1,"We're live from Georgia, a key battleground st...",Moderator,Atlanta,2024,neutral,0.800911
1,2,"In just moments, the current U.S. president wi...",Moderator,Atlanta,2024,neutral,0.881252
2,3,We want to welcome our viewers in the United S...,Moderator,Atlanta,2024,desire,0.744509
3,4,This debate is being produced by CNN and it's ...,Moderator,Atlanta,2024,neutral,0.937532
4,5,This is a pivotal moment between President Joe...,Moderator,Atlanta,2024,neutral,0.742817
...,...,...,...,...,...,...,...
5835,723,She gave it to Afghanistan.,Trump,Philadelphia,2024,neutral,0.961735
5836,724,"What these people have done to our country, an...",Trump,Philadelphia,2024,annoyance,0.352410
5837,725,"The worst President, the worst Vice President ...",Trump,Philadelphia,2024,disgust,0.481651
5838,726,President Trump thank you. And that is our ABC...,Moderator,Philadelphia,2024,gratitude,0.987653
