This repo contains political speech from four candidates for U.S. president in 2016:
- Hillary Clinton - 97,457 words
- Ted Cruz - 72,243 words
- Bernie Sanders - 166,292 words
- Donald Trump - 318,134 words
Their speeches were scraped from the web and from videos with closed captions. All of the speech dates from their presidential candidacies and relates to their campaigns. This is NOT exhaustive: we compiled political speech from roughly 20-25 appearances per candidate.
The folder for each candidate contains the raw data from each appearance as well as high-level summaries of their most frequently used words, bigrams (two-word phrases) and trigrams (three-word phrases).
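For readers who want to reproduce this kind of summary on their own, here is a minimal sketch of counting frequent words, bigrams and trigrams in Python. It is not the script used to build this repo. The function names `tokenize` and `top_ngrams`, the simple regex tokenizer, and the sample sentence are all assumptions made for illustration, and the summaries in this repo (or DataBasic.io's output) may tokenize text or filter common words differently.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def top_ngrams(tokens, n, k=10):
    """Return the k most common n-word phrases as (phrase, count) pairs."""
    # zip over n shifted copies of the token list yields every run of n consecutive tokens
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams).most_common(k)


# Made-up sample text, not taken from any candidate's speeches
sample = "We will win. We will win big. We will build."
tokens = tokenize(sample)

print(top_ngrams(tokens, 1, 3))  # most frequent words
print(top_ngrams(tokens, 2, 3))  # most frequent bigrams
print(top_ngrams(tokens, 3, 3))  # most frequent trigrams
```

To run it on real data, replace `sample` with the contents of one of the raw text files, read with `open(...).read()`. Note that this sketch counts phrases across sentence boundaries, which a more careful analysis might want to avoid.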
Data scraped and compiled by myself, Yuval Shapira and the Engagement Lab. Python script to assemble the files by Guy Shapira.
Use this data with DataBasic.io to start doing some basic text analysis on the candidates, like the examples below:
Images created with DataBasic.io.