## 2016 Election Project 

This notebook documents my data processing throughout this project. I'll be exploring and modifying my data in this file. The data I am starting out with are transcripts of the presidential debates from the 2016 US election between Hillary Clinton and Donald Trump. The transcripts were taken from UCSB's American Presidency Project, and the citation for each transcript is included both below and as part of the data I process.

To begin, I have three transcripts from the three presidential debates.

Presidential Candidates Debates: "Presidential Debate at the University of Nevada in Las Vegas," October 19, 2016. Online by Gerhard Peters and John T. Woolley, The American Presidency Project. http://www.presidency.ucsb.edu/ws/?pid=119039.

Presidential Candidates Debates: "Presidential Debate at Washington University in St. Louis, Missouri," October 9, 2016. Online by Gerhard Peters and John T. Woolley, The American Presidency Project. http://www.presidency.ucsb.edu/ws/?pid=119038.

Presidential Candidates Debates: "Presidential Debate at Hofstra University in Hempstead, New York," September 26, 2016. Online by Gerhard Peters and John T. Woolley, The American Presidency Project. http://www.presidency.ucsb.edu/ws/?pid=118971.


**I might use other speeches as well, but I think I will only use individual speeches given AFTER both candidates had been chosen as their parties' nominees. Since I'll be adding manual referring expression (RE) annotation, I worry about using too many files.**

In [1]:
import nltk
from nltk.corpus import PlaintextCorpusReader
import pandas as pd
import glob
import os


In [2]:
os.chdir('/Users/Paige/Documents/Data_Science/2016-Election-Project/data/Debates')
files = glob.glob("*.txt")
files

['10-19-16.txt', '10-9-16.txt', '9-26-16.txt']
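
The later cells rely on `files[0]` being the third debate, but `glob.glob` does not promise any particular order. Here is a minimal sketch of one way to pin the order down, assuming the file names keep the month-day-year pattern shown above. The helper name `debate_date` is my own for illustration and is not part of the project.

```python
from datetime import datetime

def debate_date(filename):
    """Parse a name like '10-9-16.txt' into a datetime (assumes M-D-YY naming)."""
    stem = filename.rsplit(".", 1)[0]
    return datetime.strptime(stem, "%m-%d-%y")

files = ['10-19-16.txt', '10-9-16.txt', '9-26-16.txt']

# Sort newest first so the order is always debate 3, debate 2, debate 1
files_sorted = sorted(files, key=debate_date, reverse=True)
print(files_sorted)
```

Sorting on the parsed date rather than on the raw string matters here, because a plain string sort would compare `'10-9-16.txt'` and `'9-26-16.txt'` character by character.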

In [3]:
#I'm creating a list where each entry in the list is a transcript
transcripts = []
for f in files:
    #The with block closes each file automatically
    with open(f, 'r') as fi:
        transcripts.append(fi.read())

In [4]:
print(transcripts[0][:200])

PARTICIPANTS:
Former Secretary of State Hillary Clinton (D) and
Businessman Donald Trump (R)
MODERATOR:
Chris Wallace (Fox News)

WALLACE: Good evening from the Thomas and Mack Center at the Universit


In [5]:
print(transcripts[1][:200])

PARTICIPANTS:
Former Secretary of State Hillary Clinton (D) and
Businessman Donald Trump (R)
MODERATORS:
Anderson Cooper (CNN) and
Martha Raddatz (ABC News)

RADDATZ: Ladies and gentlemen the Republic


In [6]:
print(transcripts[2][:500])

PARTICIPANTS:
Former Secretary of State Hillary Clinton (D) and
Businessman Donald Trump (R)
MODERATOR:
Lester Holt (NBC News)

HOLT: Good evening from Hofstra University in Hempstead, New York. I'm Lester Holt, anchor of "NBC Nightly News." I want to welcome you to the first presidential debate.

The participants tonight are Donald Trump and Hillary Clinton. This debate is sponsored by the Commission on Presidential Debates, a nonpartisan, nonprofit organization. The commission drafted tonight'


**We can see that we need to do some cleanup. What I would eventually like to end up with is a dataframe with the columns Debate, Date, Source, Speaker, and Sents, where the Sents are in the order in which they were spoken. For now, I will keep the moderators' speech and questions, because it might be interesting to compare the referring expressions *they* use for the candidates with what the candidates use for each other.**

In [7]:
#I want to split large chunks of the transcript based on who is speaking.
#Since the transcript data has a pretty standardized format (the speaker is in all caps followed by a colon),
#I can add a marker to each of these sections and split the data on that marker

#Every label that starts a new section of a transcript
speaker_labels = ["CLINTON", "TRUMP", "WALLACE", "COOPER", "RADDATZ", "HOLT",
                  "PARTICIPANTS", "MODERATOR", "MODERATORS"]

speaker_split = []

for txt in transcripts:
    #Mark each label, e.g. "CLINTON:" becomes "#$&CLINTON*:"
    for label in speaker_labels:
        txt = txt.replace(label + ":", "#$&" + label + "*:")
    speaker_split.append(txt.replace("\n", " "))

speaker_split = [txt.strip().split("#$&") for txt in speaker_split]
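
The marker approach works because the speaker names are known ahead of time. As an alternative sketch (not what this notebook does), a regular expression can pick up any all-caps label followed by a colon. The pattern and the `split_by_speaker` name are my own assumptions, and the pattern would need checking against the full transcripts, since any all-caps word followed by a colon inside a speech would also match.

```python
import re

# Two or more capital letters directly followed by a colon, e.g. "WALLACE:"
SPEAKER_LABEL = re.compile(r"\b([A-Z]{2,}):")

def split_by_speaker(text):
    """Return [speaker, speech] pairs in the order they occur in text."""
    flat = text.replace("\n", " ")
    parts = SPEAKER_LABEL.split(flat)
    # parts looks like [preamble, label1, speech1, label2, speech2, ...]
    labels, speeches = parts[1::2], parts[2::2]
    return [[label, speech.strip()] for label, speech in zip(labels, speeches)]

sample = (
    "MODERATOR:\nChris Wallace (Fox News)\n\n"
    "WALLACE: Good evening.\n\n"
    "CLINTON: Thank you very much, Chris."
)
for speaker, speech in split_by_speaker(sample):
    print(speaker, "->", speech)
```

Because the pattern has a capturing group, `re.split` keeps the labels in its result, which is what lets the labels and speeches be paired back up by position.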

In [8]:
speaker_split[0][:4]

['',
 'PARTICIPANTS*: Former Secretary of State Hillary Clinton (D) and Businessman Donald Trump (R) ',
 'MODERATOR*: Chris Wallace (Fox News)  ',
 "WALLACE*: Good evening from the Thomas and Mack Center at the University of Nevada, Las Vegas. I'm Chris Wallace of Fox News, and I welcome you to the third and final of the 2016 presidential debates between Secretary of State Hillary Clinton and Donald J. Trump.  This debate is sponsored by the Commission on Presidential Debates. The commission has designed the format: Six roughly 15-minute segments with two-minute answers to the first question, then open discussion for the rest of each segment. Both campaigns have agreed to those rules.  For the record, I decided the topics and the questions in each topic. None of those questions has been shared with the commission or the two candidates. The audience here in the hall has promised to remain silent. No cheers, boos, or other interruptions so we and you can focus on what the candidates have

In [9]:
#Creating three separate lists of split speech by speaker for each debate
debate3 = speaker_split[0]
debate2 = speaker_split[1]
debate1 = speaker_split[2]

In [10]:
#Splitting the SPEAKER: from the speech
debate3 = [txt.split("*:") for txt in debate3]
debate2 = [txt.split("*:") for txt in debate2]
debate1 = [txt.split("*:") for txt in debate1]

In [11]:
debate3 #We can see that we need to remove the empty list at the beginning, and strip all of the entries
debate3.remove([''])
debate3[:4]
#We'll strip the entries when they're in the data frame

[['PARTICIPANTS',
  ' Former Secretary of State Hillary Clinton (D) and Businessman Donald Trump (R) '],
 ['MODERATOR', ' Chris Wallace (Fox News)  '],
 ['WALLACE',
  " Good evening from the Thomas and Mack Center at the University of Nevada, Las Vegas. I'm Chris Wallace of Fox News, and I welcome you to the third and final of the 2016 presidential debates between Secretary of State Hillary Clinton and Donald J. Trump.  This debate is sponsored by the Commission on Presidential Debates. The commission has designed the format: Six roughly 15-minute segments with two-minute answers to the first question, then open discussion for the rest of each segment. Both campaigns have agreed to those rules.  For the record, I decided the topics and the questions in each topic. None of those questions has been shared with the commission or the two candidates. The audience here in the hall has promised to remain silent. No cheers, boos, or other interruptions so we and you can focus on what the candi

In [12]:
debate2.remove([''])
debate1.remove([''])

In [13]:
debate3df = pd.DataFrame(debate3)
#I want to add a column of the source of the transcript for each dataframe
#I'm adding these columns with all the same value because I will eventually combine the 
#dataframes from all three debates, and this information will be important then

debate3df['Source'] = 'http://www.presidency.ucsb.edu/ws/?pid=119039'
debate3df['Debate'] = '3' 
debate3df['Location'] = 'University of Nevada in Las Vegas'
debate3df['Date'] = '10/19/16'

In [14]:
debate3df.head(10)

Unnamed: 0,0,1,Source,Debate,Location,Date
0,PARTICIPANTS,Former Secretary of State Hillary Clinton (D)...,http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
1,MODERATOR,Chris Wallace (Fox News),http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
2,WALLACE,Good evening from the Thomas and Mack Center ...,http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
3,CLINTON,"Thank you very much, Chris. And thanks to UNL...",http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
4,WALLACE,"Secretary Clinton, thank you. Mr. Trump, sam...",http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
5,TRUMP,"Well, first of all, it's great to be with you...",http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
6,WALLACE,"Mr. Trump, thank you. We now have about 10 m...",http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
7,CLINTON,"Well, first of all, I support the Second Amen...",http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
8,WALLACE,Let me bring Mr. Trump in here. The bipartisa...,http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
9,TRUMP,"Well, the D.C. vs. Heller decision was very s...",http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16


In [15]:
#Renaming the first two columns
debate3df.columns = ['Speaker', 'Speech', 'Source', 'Debate', 'Location', 'Date']

#Stripping the text in the Speaker and Speech columns
debate3df['Speaker'] = debate3df['Speaker'].apply(lambda x: x.strip())
debate3df['Speech'] = debate3df['Speech'].apply(lambda x: x.strip())

In [16]:
debate3df.head(10)

Unnamed: 0,Speaker,Speech,Source,Debate,Location,Date
0,PARTICIPANTS,Former Secretary of State Hillary Clinton (D) ...,http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
1,MODERATOR,Chris Wallace (Fox News),http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
2,WALLACE,Good evening from the Thomas and Mack Center a...,http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
3,CLINTON,"Thank you very much, Chris. And thanks to UNLV...",http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
4,WALLACE,"Secretary Clinton, thank you. Mr. Trump, same...",http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
5,TRUMP,"Well, first of all, it's great to be with you,...",http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
6,WALLACE,"Mr. Trump, thank you. We now have about 10 mi...",http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
7,CLINTON,"Well, first of all, I support the Second Amend...",http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
8,WALLACE,Let me bring Mr. Trump in here. The bipartisan...,http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16
9,TRUMP,"Well, the D.C. vs. Heller decision was very st...",http://www.presidency.ucsb.edu/ws/?pid=119039,3,University of Nevada in Las Vegas,10/19/16


In [17]:
#Reorganize the order of the columns
debate3df = debate3df[['Location', 'Date', 'Debate', 'Source', 'Speaker', 'Speech']]

#Drop the first two rows, because they are not speech information
#(reassigning instead of inplace=True avoids modifying a copy of the column selection above)
debate3df = debate3df.drop([0, 1])

debate3df.head()

Unnamed: 0,Location,Date,Debate,Source,Speaker,Speech
2,University of Nevada in Las Vegas,10/19/16,3,http://www.presidency.ucsb.edu/ws/?pid=119039,WALLACE,Good evening from the Thomas and Mack Center a...
3,University of Nevada in Las Vegas,10/19/16,3,http://www.presidency.ucsb.edu/ws/?pid=119039,CLINTON,"Thank you very much, Chris. And thanks to UNLV..."
4,University of Nevada in Las Vegas,10/19/16,3,http://www.presidency.ucsb.edu/ws/?pid=119039,WALLACE,"Secretary Clinton, thank you. Mr. Trump, same..."
5,University of Nevada in Las Vegas,10/19/16,3,http://www.presidency.ucsb.edu/ws/?pid=119039,TRUMP,"Well, first of all, it's great to be with you,..."
6,University of Nevada in Las Vegas,10/19/16,3,http://www.presidency.ucsb.edu/ws/?pid=119039,WALLACE,"Mr. Trump, thank you. We now have about 10 mi..."


In [18]:
#This is the same processing for debate2df
debate2df = pd.DataFrame(debate2)
debate2df['Source'] = 'http://www.presidency.ucsb.edu/ws/?pid=119038'
debate2df['Debate'] = '2' 
debate2df['Location'] = 'Washington University in St. Louis, Missouri'
debate2df['Date'] = '10/9/16'

#Renaming the first two columns
debate2df.columns = ['Speaker', 'Speech', 'Source', 'Debate', 'Location', 'Date']

#Stripping the text in the Speaker and Speech columns
debate2df['Speaker'] = debate2df['Speaker'].apply(lambda x: x.strip())
debate2df['Speech'] = debate2df['Speech'].apply(lambda x: x.strip())

#Reorganize the order of the columns
debate2df = debate2df[['Location', 'Date', 'Debate', 'Source', 'Speaker', 'Speech']]

#Drop the first two rows, because they are not speech information
debate2df = debate2df.drop([0, 1])

In [19]:
#This is the same processing for debate1df
debate1df = pd.DataFrame(debate1)
debate1df['Source'] = 'http://www.presidency.ucsb.edu/ws/?pid=118971'
debate1df['Debate'] = '1' 
debate1df['Location'] = 'Hofstra University in Hempstead, New York'
debate1df['Date'] = '9/26/16'

#Renaming the first two columns
debate1df.columns = ['Speaker', 'Speech', 'Source', 'Debate', 'Location', 'Date']

#Stripping the text in the Speaker and Speech columns
debate1df['Speaker'] = debate1df['Speaker'].apply(lambda x: x.strip())
debate1df['Speech'] = debate1df['Speech'].apply(lambda x: x.strip())

#Reorganize the order of the columns
debate1df = debate1df[['Location', 'Date', 'Debate', 'Source', 'Speaker', 'Speech']]

#Drop the first two rows, because they are not speech information
debate1df = debate1df.drop([0, 1])
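
The three cells above repeat the same steps with different metadata. Below is a sketch of how they could be folded into one function, assuming the `[speaker, speech]` pair structure built earlier. `build_debate_df` and its parameters are names introduced here for illustration, and the sample rows are abbreviated from the outputs in this notebook.

```python
import pandas as pd

def build_debate_df(chunks, source, debate, location, date):
    """Turn [speaker, speech] pairs into a dataframe with debate metadata."""
    df = pd.DataFrame(chunks, columns=["Speaker", "Speech"])
    df["Speaker"] = df["Speaker"].str.strip()
    df["Speech"] = df["Speech"].str.strip()
    df["Source"] = source
    df["Debate"] = debate
    df["Location"] = location
    df["Date"] = date
    df = df[["Location", "Date", "Debate", "Source", "Speaker", "Speech"]]
    # Drop the PARTICIPANTS and MODERATOR(S) header rows (index 0 and 1)
    return df.drop([0, 1])

sample_chunks = [
    ["PARTICIPANTS", " Former Secretary of State Hillary Clinton (D) and Businessman Donald Trump (R) "],
    ["MODERATOR", " Lester Holt (NBC News)  "],
    ["HOLT", " Good evening from Hofstra University in Hempstead, New York. "],
    ["CLINTON", " How are you, Donald? [applause] "],
]
sample_df = build_debate_df(
    sample_chunks,
    source="http://www.presidency.ucsb.edu/ws/?pid=118971",
    debate="1",
    location="Hofstra University in Hempstead, New York",
    date="9/26/16",
)
print(sample_df[["Speaker", "Speech"]])
```

With a helper like this, each debate would need only one call with its own citation URL, debate number, location, and date.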

In [20]:
debate1df.head()

Unnamed: 0,Location,Date,Debate,Source,Speaker,Speech
2,"Hofstra University in Hempstead, New York",9/26/16,1,http://www.presidency.ucsb.edu/ws/?pid=118971,HOLT,Good evening from Hofstra University in Hempst...
3,"Hofstra University in Hempstead, New York",9/26/16,1,http://www.presidency.ucsb.edu/ws/?pid=118971,CLINTON,"How are you, Donald? [applause]"
4,"Hofstra University in Hempstead, New York",9/26/16,1,http://www.presidency.ucsb.edu/ws/?pid=118971,HOLT,"Good luck to you. [applause] Well, I don't ex..."
5,"Hofstra University in Hempstead, New York",9/26/16,1,http://www.presidency.ucsb.edu/ws/?pid=118971,CLINTON,"Well, thank you, Lester, and thanks to Hofstra..."
6,"Hofstra University in Hempstead, New York",9/26/16,1,http://www.presidency.ucsb.edu/ws/?pid=118971,HOLT,"Secretary Clinton, thank you. Mr. Trump, the ..."


In [21]:
debate2df.head()

Unnamed: 0,Location,Date,Debate,Source,Speaker,Speech
2,"Washington University in St. Louis, Missouri",10/9/16,2,http://www.presidency.ucsb.edu/ws/?pid=119038,RADDATZ,Ladies and gentlemen the Republican nominee fo...
3,"Washington University in St. Louis, Missouri",10/9/16,2,http://www.presidency.ucsb.edu/ws/?pid=119038,COOPER,Thank you very much for being here. We're goin...
4,"Washington University in St. Louis, Missouri",10/9/16,2,http://www.presidency.ucsb.edu/ws/?pid=119038,CLINTON,"Well, thank you. Are you a teacher? Yes, I thi..."
5,"Washington University in St. Louis, Missouri",10/9/16,2,http://www.presidency.ucsb.edu/ws/?pid=119038,COOPER,"Secretary Clinton, thank you. Mr. Trump, you h..."
6,"Washington University in St. Louis, Missouri",10/9/16,2,http://www.presidency.ucsb.edu/ws/?pid=119038,TRUMP,"Well, I actually agree with that. I agree with..."


**Currently, the entries in this data frame are split into chunks of who is speaking. I think I might want each entry in the data frame to be a sentence instead. I'm going to make another dataframe where each row is information about one sentence. I'm going to keep both dataframes in case I decide one would be more helpful than the other later.**
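
The idea is that every sentence inherits the speaker of the chunk it came from. Here is a small sketch of that step. `chunks_to_sentences` and `naive_split` are illustrative names of my own, and the regex splitter is only a stand-in so the example runs without NLTK's tokenizer data (`nltk.sent_tokenize` could be passed as `tokenize` instead).

```python
import re

def naive_split(text):
    """Rough stand-in for a sentence tokenizer: split after ., ! or ? plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def chunks_to_sentences(chunks, tokenize=naive_split):
    """Expand [speaker, speech] pairs into one [speaker, sentence] pair per sentence."""
    rows = []
    for speaker, speech in chunks:
        for sent in tokenize(speech):
            rows.append([speaker, sent])
    return rows

chunks = [
    ["WALLACE", " Both campaigns have agreed to those rules.  For the record, I decided the topics and the questions in each topic. "],
    ["CLINTON", " Well, thank you. Are you a teacher? "],
]
for row in chunks_to_sentences(chunks):
    print(row)
```

The naive splitter would break on abbreviations such as "Mr.", which is a good reason to use a trained tokenizer like NLTK's on the real transcripts.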

In [22]:
debate3[:3]

[['PARTICIPANTS',
  ' Former Secretary of State Hillary Clinton (D) and Businessman Donald Trump (R) '],
 ['MODERATOR', ' Chris Wallace (Fox News)  '],
 ['WALLACE',
  " Good evening from the Thomas and Mack Center at the University of Nevada, Las Vegas. I'm Chris Wallace of Fox News, and I welcome you to the third and final of the 2016 presidential debates between Secretary of State Hillary Clinton and Donald J. Trump.  This debate is sponsored by the Commission on Presidential Debates. The commission has designed the format: Six roughly 15-minute segments with two-minute answers to the first question, then open discussion for the rest of each segment. Both campaigns have agreed to those rules.  For the record, I decided the topics and the questions in each topic. None of those questions has been shared with the commission or the two candidates. The audience here in the hall has promised to remain silent. No cheers, boos, or other interruptions so we and you can focus on what the candi

In [23]:
debate3_sent = []
for chunk in debate3:
    sents = nltk.sent_tokenize(chunk[1])
    for sent in sents:
        debate3_sent.append([chunk[0], sent])

In [24]:
df3_sents = pd.DataFrame(debate3_sent)
df3_sents['Source'] = 'http://www.presidency.ucsb.edu/ws/?pid=119039'
df3_sents['Debate'] = '3' 
df3_sents['Location'] = 'University of Nevada in Las Vegas'
df3_sents['Date'] = '10/19/16'

#Renaming the first two columns
df3_sents.columns = ['Speaker', 'Speech', 'Source', 'Debate', 'Location', 'Date']

#Stripping the text in the Speaker and Speech columns
df3_sents['Speaker'] = df3_sents['Speaker'].apply(lambda x: x.strip())
df3_sents['Speech'] = df3_sents['Speech'].apply(lambda x: x.strip())

#Reorganize the order of the columns
df3_sents = df3_sents[['Location', 'Date', 'Debate', 'Source', 'Speaker', 'Speech']]

#Drop the first two rows, because they are not speech information
df3_sents = df3_sents.drop([0, 1])
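
Dropping rows 0 and 1 works here because each header entry (`PARTICIPANTS`, `MODERATOR`) happens to tokenize to a single sentence. A less position-dependent alternative is to filter on the speaker label before building the dataframe. The sketch below assumes the `[speaker, sentence]` row structure from the previous cell; `HEADER_LABELS` and `drop_header_rows` are hypothetical names.

```python
# Labels that mark transcript front matter rather than speech
HEADER_LABELS = {"PARTICIPANTS", "MODERATOR", "MODERATORS"}

def drop_header_rows(rows):
    """Keep only [speaker, text] rows whose speaker is not a header label."""
    return [row for row in rows if row[0].strip() not in HEADER_LABELS]

rows = [
    ["PARTICIPANTS", "Former Secretary of State Hillary Clinton (D) and Businessman Donald Trump (R)"],
    ["MODERATORS", "Anderson Cooper (CNN) and Martha Raddatz (ABC News)"],
    ["RADDATZ", "[applause]"],
    ["COOPER", "Thank you very much for being here."],
]
print(drop_header_rows(rows))
```

One difference to keep in mind: a dataframe built from the filtered rows would start at index 0 rather than 2.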

In [25]:
df3_sents.head()

Unnamed: 0,Location,Date,Debate,Source,Speaker,Speech
2,University of Nevada in Las Vegas,10/19/16,3,http://www.presidency.ucsb.edu/ws/?pid=119039,WALLACE,Good evening from the Thomas and Mack Center a...
3,University of Nevada in Las Vegas,10/19/16,3,http://www.presidency.ucsb.edu/ws/?pid=119039,WALLACE,"I'm Chris Wallace of Fox News, and I welcome y..."
4,University of Nevada in Las Vegas,10/19/16,3,http://www.presidency.ucsb.edu/ws/?pid=119039,WALLACE,This debate is sponsored by the Commission on ...
5,University of Nevada in Las Vegas,10/19/16,3,http://www.presidency.ucsb.edu/ws/?pid=119039,WALLACE,The commission has designed the format: Six ro...
6,University of Nevada in Las Vegas,10/19/16,3,http://www.presidency.ucsb.edu/ws/?pid=119039,WALLACE,Both campaigns have agreed to those rules.


In [26]:
#The same for debate 1
debate1_sent = []
for chunk in debate1:
    sents = nltk.sent_tokenize(chunk[1])
    for sent in sents:
        debate1_sent.append([chunk[0], sent])

In [27]:
df1_sents = pd.DataFrame(debate1_sent)
df1_sents['Source'] = 'http://www.presidency.ucsb.edu/ws/?pid=118971'
df1_sents['Debate'] = '1' 
df1_sents['Location'] = 'Hofstra University in Hempstead, New York'
df1_sents['Date'] = '9/26/16'

#Renaming the first two columns
df1_sents.columns = ['Speaker', 'Speech', 'Source', 'Debate', 'Location', 'Date']

#Stripping the text in the Speaker and Speech columns
df1_sents['Speaker'] = df1_sents['Speaker'].apply(lambda x: x.strip())
df1_sents['Speech'] = df1_sents['Speech'].apply(lambda x: x.strip())

#Reorganize the order of the columns
df1_sents = df1_sents[['Location', 'Date', 'Debate', 'Source', 'Speaker', 'Speech']]

#Drop the first two rows, because they are not speech information
df1_sents = df1_sents.drop([0, 1])
df1_sents.head()

Unnamed: 0,Location,Date,Debate,Source,Speaker,Speech
2,"Hofstra University in Hempstead, New York",9/26/16,1,http://www.presidency.ucsb.edu/ws/?pid=118971,HOLT,Good evening from Hofstra University in Hempst...
3,"Hofstra University in Hempstead, New York",9/26/16,1,http://www.presidency.ucsb.edu/ws/?pid=118971,HOLT,"I'm Lester Holt, anchor of ""NBC Nightly News."""
4,"Hofstra University in Hempstead, New York",9/26/16,1,http://www.presidency.ucsb.edu/ws/?pid=118971,HOLT,I want to welcome you to the first presidentia...
5,"Hofstra University in Hempstead, New York",9/26/16,1,http://www.presidency.ucsb.edu/ws/?pid=118971,HOLT,The participants tonight are Donald Trump and ...
6,"Hofstra University in Hempstead, New York",9/26/16,1,http://www.presidency.ucsb.edu/ws/?pid=118971,HOLT,This debate is sponsored by the Commission on ...


In [28]:
#The same for debate 2
debate2_sent = []
for chunk in debate2:
    sents = nltk.sent_tokenize(chunk[1])
    for sent in sents:
        debate2_sent.append([chunk[0], sent])

In [29]:
df2_sents = pd.DataFrame(debate2_sent)
df2_sents['Source'] = 'http://www.presidency.ucsb.edu/ws/?pid=119038'
df2_sents['Debate'] = '2' 
df2_sents['Location'] = 'Washington University in St. Louis, Missouri'
df2_sents['Date'] = '10/9/16'

#Renaming the first two columns
df2_sents.columns = ['Speaker', 'Speech', 'Source', 'Debate', 'Location', 'Date']

#Stripping the text in the Speaker and Speech columns
df2_sents['Speaker'] = df2_sents['Speaker'].apply(lambda x: x.strip())
df2_sents['Speech'] = df2_sents['Speech'].apply(lambda x: x.strip())

#Reorganize the order of the columns
df2_sents = df2_sents[['Location', 'Date', 'Debate', 'Source', 'Speaker', 'Speech']]

#Drop the first two rows, because they are not speech information
df2_sents = df2_sents.drop([0, 1])
df2_sents.head()

Unnamed: 0,Location,Date,Debate,Source,Speaker,Speech
2,"Washington University in St. Louis, Missouri",10/9/16,2,http://www.presidency.ucsb.edu/ws/?pid=119038,RADDATZ,Ladies and gentlemen the Republican nominee fo...
3,"Washington University in St. Louis, Missouri",10/9/16,2,http://www.presidency.ucsb.edu/ws/?pid=119038,RADDATZ,[applause]
4,"Washington University in St. Louis, Missouri",10/9/16,2,http://www.presidency.ucsb.edu/ws/?pid=119038,COOPER,Thank you very much for being here.
5,"Washington University in St. Louis, Missouri",10/9/16,2,http://www.presidency.ucsb.edu/ws/?pid=119038,COOPER,We're going to begin with a question from one ...
6,"Washington University in St. Louis, Missouri",10/9/16,2,http://www.presidency.ucsb.edu/ws/?pid=119038,COOPER,Each of you will have two minutes to respond t...


**Now I have two dataframes for each debate. In one, each row holds a chunk of speech by a single speaker; in the other, each row holds a single sentence. Both the chunks and the sentences are in the order in which they were spoken. Now I'm going to export these dataframes to CSV files and annotate them for referring expressions manually.**

In [30]:
df1_sents.to_csv('debate1_sents.csv')
df2_sents.to_csv('debate2_sents.csv')
df3_sents.to_csv('debate3_sents.csv')
debate1df.to_csv('debate1.csv')
debate2df.to_csv('debate2.csv')
debate3df.to_csv('debate3.csv')
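
The comments in cell 13 mention eventually combining the dataframes from all three debates. Here is a minimal sketch of that step with `pd.concat`. The two one-row frames stand in for the real sentence-level dataframes, and the output name `all_debates_sents.csv` is hypothetical.

```python
import pandas as pd

cols = ["Location", "Date", "Debate", "Source", "Speaker", "Speech"]

# One-row stand-ins for df1_sents and df2_sents, copied from the outputs above
d1 = pd.DataFrame(
    [["Hofstra University in Hempstead, New York", "9/26/16", "1",
      "http://www.presidency.ucsb.edu/ws/?pid=118971", "HOLT",
      "I'm Lester Holt, anchor of \"NBC Nightly News.\""]],
    columns=cols, index=[3],
)
d2 = pd.DataFrame(
    [["Washington University in St. Louis, Missouri", "10/9/16", "2",
      "http://www.presidency.ucsb.edu/ws/?pid=119038", "COOPER",
      "Thank you very much for being here."]],
    columns=cols, index=[4],
)

# ignore_index=True renumbers the rows 0..n-1 so row labels do not collide
combined = pd.concat([d1, d2], ignore_index=True)
print(combined[["Debate", "Speaker", "Speech"]])

# Hypothetical export; index=False would leave the row index out of the CSV
# combined.to_csv('all_debates_sents.csv', index=False)
```

Keeping the Debate, Date, and Source columns on every row is what makes this kind of stacking safe, since each sentence still carries its own provenance.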