**Analysis of Presidential speech and election data**

This notebook scrapes [The American Presidency Project](http://www.presidency.ucsb.edu) and downloads the campaign speeches of all 2016 presidential candidates. It then builds a Markov chain from each candidate's speeches that can generate sentences in the style of those speeches.
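
To make the idea of a Markov chain text generator concrete, here is a minimal pure-Python sketch. It is not how `markovify` is implemented (the library does more, such as splitting text into sentences); the function names `build_chain` and `generate` and the sample text are made up for illustration.

```python
import random
from collections import defaultdict

def build_chain(text, state_size=2):
    """Map each run of `state_size` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - state_size):
        state = tuple(words[i:i + state_size])
        chain[state].append(words[i + state_size])
    return dict(chain)

def generate(chain, max_words=20, seed=None):
    """Random-walk the chain, starting from a randomly chosen state."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))
    output = list(state)
    # Stop at the word limit or when the current state has no known follower.
    while len(output) < max_words and state in chain:
        output.append(rng.choice(chain[state]))
        state = tuple(output[-len(state):])
    return ' '.join(output)

# Toy input, not taken from any real speech.
sample = "we will win and we will build and we will grow"
chain = build_chain(sample)
print(chain[('we', 'will')])
print(generate(chain, seed=0))
```

Each generated word depends only on the previous two words, which is why the output can sound locally plausible while drifting in meaning over a whole sentence.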

In [1]:
import pandas as pd
import numpy as np
import requests
from lxml import html
from bs4 import BeautifulSoup
import markovify

In [2]:

def getCandidateSpeechLinks(url):
    allCandidatePage = requests.get(url)
    allCandidatePageSoup = BeautifulSoup(allCandidatePage.text, 'lxml')
    links={}
    table = allCandidatePageSoup.find('table', width=680)
    for area in table.findAll('td', class_='doctext'):
        for a in area.findAll('a'):
            if ('campaign' in a.text.lower()):
                links[area.find('span', class_='roman').text] = a['href']
    return links

def scrapeCampaignSpeechesToFile(url, path):
    allSpeechPages = requests.get(url)
    allSpeechSoup=BeautifulSoup(allSpeechPages.text, 'lxml')
    root = 'http://www.presidency.ucsb.edu/'
    table = allSpeechSoup.find('table', width=700)
    links = []
    for link in table.findAll('a'):
        if('interview' not in link.text.lower()):
            links.append(root+(link['href'])[3:])

    # requests.get takes query parameters as its second argument, so the parser name does not belong here.
    speechPages = [requests.get(link) for link in links]
    speechesSoup = [BeautifulSoup(speechPage.text, 'lxml') for speechPage in speechPages]
    
    # Write every speech to one file, one speech per line block.
    with open(path, "w", encoding='utf-8') as outFile:
        for speech in speechesSoup:
            outFile.write(speech.find('span', class_='displaytext').text + '\n')

def trainMarkov(path):

    # Get raw text as string.
    with open(path, encoding='utf-8') as f:
        text = f.read()

    # Build the model.
    text_model = markovify.Text(text)
    return text_model
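
`scrapeCampaignSpeechesToFile` builds absolute URLs by dropping the first three characters of each `href` (presumably a leading `../`) and prepending the site root. A less position-dependent alternative is `urllib.parse.urljoin` from the standard library. The sketch below assumes relative links of that shape; the page URL, the `href` values, and the helper name `resolve_links` are hypothetical.

```python
from urllib.parse import urljoin

def resolve_links(page_url, hrefs):
    """Resolve each href against the URL of the page it was found on."""
    return [urljoin(page_url, href) for href in hrefs]

# Hypothetical page URL and hrefs, for illustration only.
page_url = 'http://www.presidency.ucsb.edu/section/list.php'
hrefs = ['../ws/index.php?pid=1', '/ws/index.php?pid=2']

resolved = resolve_links(page_url, hrefs)
print(resolved)
```

Because `urljoin` follows the usual relative-reference rules, it keeps working if the site switches between `../`-style and root-relative links.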

Create a dictionary that maps each candidate's name to the link for their campaign speech page.

In [3]:
campaignSpeechLinkDict = getCandidateSpeechLinks('http://www.presidency.ucsb.edu/2016_election.php')
print(campaignSpeechLinkDict.keys())

dict_keys(['Bobby Jindal', 'Rick Santorum', 'Ben Carson', 'Chris Christie', 'John Kasich', 'Scott Walker', 'Bernie Sanders', 'George Pataki', 'Lincoln Chafee', 'Hillary Clinton ', 'Carly Fiorina', 'Marco Rubio', 'Jim Webb', 'Donald Trump', 'Rand Paul', 'Ted Cruz', "Martin O'Malley", 'Lindsey Graham', 'Mike Huckabee', 'Jeb Bush', 'Rick Perry'])


Loop through the campaign speech links and write each candidate's campaign speeches to a separate file.

In [4]:
root = 'http://www.presidency.ucsb.edu/'

for name, url in campaignSpeechLinkDict.items():
    path = './Campaign Speeches/' + name.replace(' ', '-') + '.txt'
    scrapeCampaignSpeechesToFile(root + url, path)
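
The loop above assumes that the `./Campaign Speeches/` directory already exists, and a name scraped with trailing whitespace (such as `'Hillary Clinton '`) yields a file name ending in `-.txt`. A small helper could handle both. This is a sketch under those assumptions; `speech_path` is a hypothetical name, not part of the notebook.

```python
import tempfile
from pathlib import Path

def speech_path(name, directory='./Campaign Speeches'):
    """Return the output path for a candidate, creating the directory if needed."""
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)
    # Strip stray whitespace before turning spaces into hyphens.
    return folder / (name.strip().replace(' ', '-') + '.txt')

# Demonstrate in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    print(speech_path('Hillary Clinton ', directory=tmp).name)
```

If a helper like this were adopted, the scraping loop and the training loop would both need to use it, since stripping the name changes the file name.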
    

Train one Markov model ("bot") per candidate and store the models in a dictionary keyed by candidate name.

In [5]:
bots = {}
for pres in campaignSpeechLinkDict.keys():
    bots[pres] = trainMarkov('./Campaign Speeches/' + pres.replace(' ', '-') + '.txt')

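Print 10 short 'tweet' sentences, each at most 140 characters, for every candidate in the dictionary. Note that `make_short_sentence` returns `None` when it cannot build a sentence within the limit.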

In [6]:
for name,bot in bots.items():
    print('\n' + name + ': ')
    for i in range(10):
        print(bot.make_short_sentence(max_chars=140))


Bobby Jindal: 
It is time for a land where the public is headed.
You've heard Jeb Bush is saying is that we have statewide school choice — because every child deserves an equal opportunity for a doer.
We are the light of freedom in a house without electricity, without running water.
I'm not going to take care of you, to make big changes.
He was the aftermath of Katrina, our economy was locked in a house without electricity, without running water.
No, quite to the contrary, we are again, we will repeal Obamacare.
What Jeb Bush say that we are naive to believe in.
But they say term limits is a mess.
Our enemies need to trust us.
There was a place in this world where people were free, and the opportunities are real, I am running for President already.

Rick Santorum: 
My mom and dad worked.
A lot of greatness in this race, you stood with his back against the big guys will take over.
And, in fact, the highest-quality oil in the steel industry smaller than it is great to be a very unusual 