<h1>U.S. Presidents' Inaugural Address Analysis</h1>

U.S. presidents' inaugural address content file: https://guanyipei.github.io/assets/portfolio/nlp/us_presidents_inaugural_address/presidential_speech.csv
<p>Examine the content of the inaugural addresses to disentangle the presidents' value priorities and ideological orientations.</p>
<p>Previous research (Jones et al., 2017) has identified ten clusters of values that are relevant to political ideology, along with ten sets of keywords used to gauge how strongly each value is expressed:</p>

|No.|Value|Example keywords|
|:-----:|:-----:|:-----:|
|1|Self-Direction|ability, accountability, freedom|
|2|Stimulation|action, active, confront|
|3|Hedonism|amusement, delight, gratification|
|4|Achievement|accomplish, attain, improve|
|5|Power|authorize, dominance, reign|
|6|Security|asylum, refuge, border, prison|
|7|Tradition|catholic, christian, jewish, patriot|
|8|Conformity|abide, mandate, rule, restraint|
|9|Benevolence|allegiance, altruism, sympathy|
|10|Universalism|equality, affordability, balances|

Political ideology file: https://guanyipei.github.io/assets/portfolio/nlp/us_presidents_inaugural_address/political_ideology.txt
<p>File format: each line describes one value. The value name is separated from its keywords by a colon, and the keywords are separated by commas.</p>
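As a minimal sketch of this format, the snippet below parses one line into a value name and its keyword list. The sample line is hypothetical, assembled from the example keywords in the table above, and `parse_value_line` is an illustrative helper rather than part of the notebook.

```python
def parse_value_line(line):
    """Split a 'Value:keyword1,keyword2,...' line into (value, keywords)."""
    # hypothetical helper for illustration; assumes exactly one colon per line
    value, keywords = line.strip().split(':')
    return value, keywords.lower().split(',')

# hypothetical sample line built from the example keywords listed earlier
sample = 'Hedonism:amusement,delight,gratification\n'
value, keywords = parse_value_line(sample)
print(value, keywords)
```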
<p>Analyze the relative importance of each value in the presidential speeches. Relative importance is defined as the number of times a value's keywords are mentioned per thousand words of a speech.</p>
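To make the definition concrete, here is a small sketch of the calculation on a made-up token list. The tokens, the keyword list, and the function name `mentions_per_thousand` are assumptions for illustration only; the notebook's own computation on the real speeches appears in the cells that follow.

```python
def mentions_per_thousand(tokens, keywords):
    """Count keyword occurrences in tokens, scaled to a rate per 1,000 words."""
    mentions = sum(tokens.count(keyword) for keyword in keywords)
    return mentions / len(tokens) * 1000

# made-up example: 8 tokens, 2 of which match a keyword
tokens = ['we', 'seek', 'freedom', 'and', 'the', 'ability', 'to', 'choose']
keywords = ['freedom', 'ability', 'accountability']
rate = mentions_per_thousand(tokens, keywords)
print(rate)  # 250.0
```

Scaling by speech length keeps long and short addresses comparable: two matches in eight words and 250 matches in a thousand words give the same rate.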

In [1]:
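import nltk
from nltk.stem import WordNetLemmatizer
# The lemmatizer and nltk.pos_tag need the WordNet corpus and POS-tagger data,
# which can be fetched once with nltk.download
Lemmatizer = WordNetLemmatizer()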

In [2]:
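def lemmatization(word_list):
    """Lemmatize each word according to its part-of-speech tag."""
    words_lemma = []
    for word, pos in nltk.pos_tag(word_list):
        if pos[0] == 'N':    # noun (the lemmatizer's default part of speech)
            words_lemma.append(Lemmatizer.lemmatize(word))
        elif pos[0] == 'J':  # adjective
            words_lemma.append(Lemmatizer.lemmatize(word, 'a'))
        elif pos[0] == 'V':  # verb
            words_lemma.append(Lemmatizer.lemmatize(word, 'v'))
        else:                # leave other parts of speech unchanged
            words_lemma.append(word)
    return words_lemma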

In [3]:
value_dic = {}
# the with-block closes the file automatically
with open('political_ideology.txt', 'r') as full_file:
    for line in full_file:
        cleaned_line = line.strip()
        # each line looks like "Value:keyword1,keyword2,..."
        value_key = cleaned_line.split(':')[0]
        keywords_list = cleaned_line.split(':')[1].lower().split(',')
        keywords_lemma = lemmatization(keywords_list)
        # remove duplicate lemmatized keywords while keeping their order
        keywords_lemma_without_duplicate = list(dict.fromkeys(keywords_lemma))
        value_dic[value_key] = keywords_lemma_without_duplicate
print('Done')
print(value_dic)

Done
{'Self-Direction': ['ability', 'abide', 'accountability', 'accountable', 'actualize', 'actualized', 'actualizes', 'advocacy', 'advocate', 'analytical', 'analyze', 'analyzes', 'ask', 'autonomy', 'build', 'choice', 'choose', 'chose', 'compose', 'composes', 'consider', 'considers', 'construct', 'conversation', 'converse', 'create', 'creates', 'creation', 'creative', 'curious', 'debate', 'decide', 'decides', 'decision', 'democracy', 'design', 'devise', 'discover', 'discus', 'discuss', 'discussion', 'elect', 'election', 'establish', 'establishes', 'examine', 'examines', 'explain', 'explains', 'explore', 'explores', 'formulate', 'formulates', 'freedom', 'idea', 'independence', 'independent', 'individualistic', 'initiate', 'initiated', 'innovate', 'innovates', 'innovative', 'inquire', 'inquires', 'inquisitive', 'inspect', 'inspecting', 'inspects', 'invent', 'invents', 'investigate', 'investigates', 'investigative', 'liberty', 'make', 'negotiation', 'negotiate', 'negotiates', 'opportunity

In [4]:
import pandas as pd
speech_file = pd.read_csv('presidential_speech.csv')
speech_file

Unnamed: 0.1,Unnamed: 0,president,speech
0,0,Washington,Fellow Citizens of the Senate and the House of...
1,1,Jefferson,"FRIENDS AND FELLOW-CITIZENS,\nCalled upon to u..."
2,2,Lincoln,Fellow-Countrymen:\nAt this second appearing t...
3,3,Roosevelt,I am certain that my fellow Americans expect t...
4,4,Kennedy,"Vice President Johnson, Mr. Speaker, Mr. Chief..."
5,5,Nixon,"Senator Dirksen, Mr. Chief Justice, Mr. Vice P..."
6,6,Reagan,"Senator Hatfield, Mr. Chief Justice, Mr. Presi..."
7,7,Bush,"Mr. Chief Justice, Mr. President, Vice Preside..."
8,8,Clinton,My fellow citizens :\nToday we celebrate the m...
9,9,W Bush,"President Clinton, distinguished guests and my..."


In [5]:
import re

def clean_and_spilt(text):
    # lowercase, collapse runs of non-word characters into one space, then split on spaces
    text = text.lower()
    text = re.sub(r'\W+', ' ', text)
    text = text.strip()
    return text.split(' ')
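
A quick way to see what this cleaning step does is to run the same operations on a short string. The sketch below is standalone: `tokenize` is a hypothetical stand-in that mirrors the function above, and the sample text is adapted from the opening of the Lincoln row shown earlier.

```python
import re

def tokenize(text):
    # hypothetical stand-in mirroring the notebook's cleaning step
    text = text.lower()
    text = re.sub(r'\W+', ' ', text)  # runs of non-word characters become one space
    return text.strip().split(' ')

tokens = tokenize('Fellow-Countrymen:\nAt this second appearing')
print(tokens)
# ['fellow', 'countrymen', 'at', 'this', 'second', 'appearing']

# edge case: an empty string yields [''] rather than [], because of split(' ')
print(tokenize(''))  # ['']
```

Note that hyphenated words are split into separate tokens, so "fellow-countrymen" contributes two words to the word count.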

In [6]:
import re
processed_speech_dict = {}
# iterate over every row instead of hard-coding the number of speeches
for i in range(len(speech_file)):
    row = speech_file.iloc[i]
    cleaned_speech = clean_and_spilt(row['speech'])
    lemmatized_speech = lemmatization(cleaned_speech)
    processed_speech_dict[row['president']] = lemmatized_speech

In [7]:
import csv
value_names = ['Self-Direction','Stimulation','Hedonism','Achievement','Power',
               'Security','Tradition','Conformity','Benevolence','Universalism']
# newline='' is the setting the csv module recommends for files it writes
with open('president_value_priority.csv','w',newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['president'] + value_names)
    for president, words in processed_speech_dict.items():
        WC = len(words)  # word count of the speech
        calculate_result_dict = {}
        for value in value_dic.keys():
            AVP = 0  # total mentions of this value's keywords
            for keyword in value_dic[value]:
                AVP += words.count(keyword)
            # mentions per thousand words
            calculate_result_dict[value] = AVP/WC*1000
        # same column order as the header row
        writer.writerow([president] + [calculate_result_dict[v] for v in value_names])

In [8]:
#Self-check
table=pd.read_csv('president_value_priority.csv')
table

Unnamed: 0,president,Self-Direction,Stimulation,Hedonism,Achievement,Power,Security,Tradition,Conformity,Benevolence,Universalism
0,Washington,13.249651,3.48675,1.3947,13.249651,6.276151,4.1841,4.88145,8.368201,7.670851,8.368201
1,Jefferson,12.702079,4.04157,3.464203,12.702079,12.702079,8.660508,8.083141,12.124711,8.660508,13.279446
2,Lincoln,9.929078,4.255319,0.0,9.929078,4.255319,21.276596,14.184397,0.0,7.092199,18.439716
3,Roosevelt,10.621349,11.152416,1.593202,11.152416,7.966012,9.028147,9.028147,10.621349,12.745619,5.841742
4,Kennedy,19.832189,5.339436,2.28833,9.153318,16.018307,14.492754,5.339436,6.102212,15.25553,12.204424
5,Nixon,14.547161,7.038949,4.223369,13.608634,9.385265,7.038949,4.223369,2.81558,10.323792,10.323792
6,Reagan,14.669927,4.889976,1.222494,15.077425,11.817441,6.519967,8.149959,4.482478,10.187449,6.519967
7,Bush,19.683355,4.706889,2.567394,10.697475,10.269576,5.562687,5.134788,2.995293,18.399658,10.269576
8,Clinton,18.170426,19.423559,4.385965,8.77193,11.904762,6.892231,5.639098,3.759398,14.411028,7.518797
9,W Bush,17.621145,10.698553,1.258653,10.698553,12.586532,11.957206,8.810573,8.181246,16.991819,13.845186


Across all inaugural addresses, which value is the most important, i.e. has the highest average mention rate?

In [9]:
result_holder = {}
for i in table.columns:
    if i == 'president':
        continue
    else:
        result_holder[i] = table[i].mean()
print(max(result_holder, key=result_holder.get))

Self-Direction


In [10]:
# self-check
result_holder

{'Self-Direction': 14.614293759442562,
 'Stimulation': 7.636102356436537,
 'Hedonism': 2.74265256197229,
 'Achievement': 11.431200536596075,
 'Power': 10.604493787693634,
 'Security': 10.331280440536073,
 'Tradition': 7.425404321767654,
 'Conformity': 5.415375318115129,
 'Benevolence': 11.816347809876005,
 'Universalism': 10.671245752317596}
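
The averages above can also be ranked in full, which shows how the remaining values compare with the top one. This is a sketch that simply re-enters the averages printed above (rounded to three decimals) rather than reading them from `table`.

```python
# averages copied from the output above, rounded to three decimals
result_holder = {
    'Self-Direction': 14.614, 'Stimulation': 7.636, 'Hedonism': 2.743,
    'Achievement': 11.431, 'Power': 10.604, 'Security': 10.331,
    'Tradition': 7.425, 'Conformity': 5.415, 'Benevolence': 11.816,
    'Universalism': 10.671,
}

# sort values from most to least mentioned on average
ranking = sorted(result_holder.items(), key=lambda item: item[1], reverse=True)
for value, mean_rate in ranking:
    print(f'{value}: {mean_rate}')
```

On these figures Self-Direction leads, followed by Benevolence and Achievement, with Hedonism last.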

Among all studied presidents, who emphasizes the value of Stimulation the most?

In [11]:
print(table[table['Stimulation']==max(table['Stimulation'])]['president'])

8    Clinton
Name: president, dtype: object


Among all studied values, which is the most important for Trump?

In [12]:
trump_data = table[table['president']=='Trump'].to_dict(orient='list')
del trump_data['president']
print(max(trump_data, key=trump_data.get))

Security


Among all studied values, which is the least important for Trump?

In [13]:
print(min(trump_data, key=trump_data.get))

Conformity
