# Politics of Emotions or Propaganda? (P3)

This project explores how **emotional language** is used strategically in political texts—such as speeches, social media posts, or debates—to **influence perception and manipulate audience response**.

The objective is to move beyond simple emotion classification and toward an **interpretation of emotion’s rhetorical function** within political discourse.

Data source: https://www.presidency.ucsb.edu/


In [1]:
# Install required packages from the requirement.txt file if not already installed
!pip install -r requirements.txt



## Dataset Development
Data source: https://www.presidency.ucsb.edu/documents/app-categories/elections-and-transitions/debates

Check the file fun.py to see how the speaches are cleaned.

In [2]:
from fun import process_debate_transcripts
process_debate_transcripts("transcripts", "data")

Processed: TRUMP_BIDEN_ATLANTA_2024.txt → TRUMP_BIDEN_ATLANTA_2024.csv
Processed: TRUMP_BIDEN_CLEVELAND_2020.txt → TRUMP_BIDEN_CLEVELAND_2020.csv
Processed: TRUMP_BIDEN_NASHVILLE_2020.txt → TRUMP_BIDEN_NASHVILLE_2020.csv
Processed: TRUMP_CLINTON_HEMPSTEAD_2016.txt → TRUMP_CLINTON_HEMPSTEAD_2016.csv
Processed: TRUMP_CLINTON_LOUIS_2016.txt → TRUMP_CLINTON_LOUIS_2016.csv
Processed: TRUMP_CLINTON_NEVADA_2016.txt → TRUMP_CLINTON_NEVADA_2016.csv
Processed: TRUMP_HARRIS_PHILADELPHIA_2024.txt → TRUMP_HARRIS_PHILADELPHIA_2024.csv


In [3]:
import pandas as pd
atlanta = pd.read_csv("data\TRUMP_BIDEN_ATLANTA_2024.csv")
atlanta

Unnamed: 0,SpeechID,Speech,Speaker,Location,Year
0,1,"We're live from Georgia, a key battleground st...",Moderator,Atlanta,2024
1,2,This debate is being produced by CNN and it's ...,Moderator,Atlanta,2024
2,3,"I'm Jake Tapper, anchor of CNN's ""THE LEAD"" an...",Moderator,Atlanta,2024
3,4,"When it's time for our candidate to speak, his...",Moderator,Atlanta,2024
4,5,"Now, please welcome the 46th of the United Sta...",Moderator,Atlanta,2024
...,...,...,...,...,...
173,176,It is now time for the candidates to deliver t...,Moderator,Atlanta,2024
174,177,We've made significant progress from the debac...,Biden,Atlanta,2024
175,178,"Thank you, President Biden. President Trump, y...",Moderator,Atlanta,2024
176,179,"Like so many politicians, this man is just a c...",Trump,Atlanta,2024


In [4]:
nevada = pd.read_csv("data\TRUMP_CLINTON_NEVADA_2016.csv")
nevada

Unnamed: 0,SpeechID,Speech,Speaker,Location,Year
0,1,Good evening from the Thomas and Mack Center a...,Moderator,Nevada,2016
1,2,"Thank you very much, Chris. And thanks to UNLV...",Clinton,Nevada,2016
2,3,"Secretary Clinton, thank you. Mr. Trump, same ...",Moderator,Nevada,2016
3,4,"Well, first of all, it's great to be with you,...",Trump,Nevada,2016
4,5,"Mr. Trump, thank you. We now have about 10 min...",Moderator,Nevada,2016
...,...,...,...,...,...
329,330,"This is—this is the final time, probably to bo...",Moderator,Nevada,2016
330,331,"Well, I would like to say to everyone watching...",Clinton,Nevada,2016
331,332,"Secretary Clinton, thank you. Mr. Trump?",Moderator,Nevada,2016
332,333,She's raising the money from the people she wa...,Trump,Nevada,2016


In [5]:
import os

# List to store dataframes
dataframes = []

# Iterate through all files in the folder
for file in os.listdir("data"):
    if file.endswith(".csv"):
        file_path = os.path.join("data", file)
        df = pd.read_csv(file_path)
        dataframes.append(df)
# Concatenate all dataframes
combined_df = pd.concat(dataframes, ignore_index=True)

# Save the combined dataframe to a new .csv file
combined_df.to_csv("data/combined_speeches.csv", index=False)

In [6]:
# Check the combined dataframe
speeches = pd.read_csv("data/combined_speeches.csv")
speeches.count()

SpeechID    17815
Speech      17815
Speaker     17815
Location    17815
Year        17815
dtype: int64

In [7]:
speeches.head()

Unnamed: 0,SpeechID,Speech,Speaker,Location,Year
0,1,"We're live from Georgia, a key battleground st...",Moderator,Atlanta,2024
1,2,This debate is being produced by CNN and it's ...,Moderator,Atlanta,2024
2,3,"I'm Jake Tapper, anchor of CNN's ""THE LEAD"" an...",Moderator,Atlanta,2024
3,4,"When it's time for our candidate to speak, his...",Moderator,Atlanta,2024
4,5,"Now, please welcome the 46th of the United Sta...",Moderator,Atlanta,2024


## Feeding the model

In [8]:
import torch
print(torch.cuda.is_available())
print(torch.cuda.get_device_name(0))

True
NVIDIA GeForce GTX 1070 Ti


In [9]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("SamLowe/roberta-base-go_emotions")
model = AutoModelForSequenceClassification.from_pretrained("SamLowe/roberta-base-go_emotions")

# Move model to GPU if available
device = 0 if torch.cuda.is_available() else -1

classifier = pipeline(task="text-classification", model="SamLowe/roberta-base-go_emotions", top_k=None, device=device)

Device set to use cuda:0


In [10]:
sentences = ["Hello, I am Joe Biden"]

model_outputs = classifier(sentences)
model_outputs[0]
# produces a list of dicts for each of the labels


[{'label': 'neutral', 'score': 0.9357642531394958},
 {'label': 'approval', 'score': 0.024361375719308853},
 {'label': 'excitement', 'score': 0.010637336410582066},
 {'label': 'realization', 'score': 0.01003023236989975},
 {'label': 'joy', 'score': 0.005965998861938715},
 {'label': 'annoyance', 'score': 0.004213426727801561},
 {'label': 'admiration', 'score': 0.0030739184003323317},
 {'label': 'amusement', 'score': 0.0028852401301264763},
 {'label': 'surprise', 'score': 0.00273154117166996},
 {'label': 'fear', 'score': 0.0024596164003014565},
 {'label': 'optimism', 'score': 0.00232110102660954},
 {'label': 'sadness', 'score': 0.002088801935315132},
 {'label': 'disgust', 'score': 0.0019926358945667744},
 {'label': 'gratitude', 'score': 0.0019211502512916923},
 {'label': 'curiosity', 'score': 0.001899787806905806},
 {'label': 'anger', 'score': 0.0018068865174427629},
 {'label': 'confusion', 'score': 0.001738501014187932},
 {'label': 'disappointment', 'score': 0.0016148401191458106},
 {'la

This function will classify the text and return the label and score of the most probable emotion.

In [12]:
from fun import get_top_emotion
speeches[['emotion', 'score']] = speeches['Speech'].apply(lambda x: get_top_emotion(x, classifier))

You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset


In [13]:
speeches

Unnamed: 0,SpeechID,Speech,Speaker,Location,Year,emotion,score
0,1,"We're live from Georgia, a key battleground st...",Moderator,Atlanta,2024,desire,0.608718
1,2,This debate is being produced by CNN and it's ...,Moderator,Atlanta,2024,neutral,0.579422
2,3,"I'm Jake Tapper, anchor of CNN's ""THE LEAD"" an...",Moderator,Atlanta,2024,neutral,0.921747
3,4,"When it's time for our candidate to speak, his...",Moderator,Atlanta,2024,neutral,0.899099
4,5,"Now, please welcome the 46th of the United Sta...",Moderator,Atlanta,2024,gratitude,0.778229
...,...,...,...,...,...,...,...
17810,224,So I think you've heard tonight two very diffe...,Harris,Philadelphia,2024,optimism,0.489765
17811,225,"Vice President Harris, thank you. President Tr...",Moderator,Philadelphia,2024,gratitude,0.960313
17812,226,"So, she just started by saying she's gonna to ...",Trump,Philadelphia,2024,neutral,0.422355
17813,227,President Trump thank you. And that is our ABC...,Moderator,Philadelphia,2024,gratitude,0.987653
