# Political Social Media Analysis

In this notebook, I will try to compare the tweets of Donald Trump, Barrack Obama, and Hillary Clinton to come up with meaningful insights

Data:
There are 3 CSV files which will be used:
1. DonaldTrump
2. BarackObama
3. HillaryClinton

All 3 have the same structure
sl no,date,id,link,retweet,text,author

Import libraries

In [26]:
import pandas as pd
import numpy as np
import re

Read the data

In [14]:
trump = pd.read_csv("data/DonaldTrump.csv")
obama = pd.read_csv("data/BarackObama.csv")
clinton = pd.read_csv("data/HillaryClinton.csv")

Check out the data

In [15]:
trump.head()

Unnamed: 0.1,Unnamed: 0,date,id,link,retweet,text,author
0,0,Oct 7,784609194234306560,/realDonaldTrump/status/784609194234306560,False,Here is my statement.pic.twitter.com/WAZiGoQqMQ,DonaldTrump
1,1,Oct 10,785608815962099712,/realDonaldTrump/status/785608815962099712,False,Is this really America? Terrible!pic.twitter.c...,DonaldTrump
2,2,Oct 8,784840992734064640,/realDonaldTrump/status/784840992734064641,False,The media and establishment want me out of the...,DonaldTrump
3,3,Oct 11,785979396620324864,/realDonaldTrump/status/785979396620324865,False,"Wow, @CNN Town Hall questions were given to Cr...",DonaldTrump
4,4,Oct 10,785561269571026944,/realDonaldTrump/status/785561269571026946,False,Debate polls look great - thank you!\n#MAGA #A...,DonaldTrump


In [16]:
obama.head()

Unnamed: 0.1,Unnamed: 0,date,id,link,retweet,text,author
0,0,20h20 hours ago,786982739517943808,/BarackObama/status/786982739517943808,False,Denying climate change is dangerous. Join @OFA...,BarackObama
1,1,18h18 hours ago,787010142378332160,/BarackObama/status/787010142378332160,False,The American Bar Association gave Judge Garlan...,BarackObama
2,2,16h16 hours ago,787039774330748928,/BarackObama/status/787039774330748928,False,We need a fully functional Supreme Court. Edit...,BarackObama
3,3,21h21 hours ago,786964419905523712,/BarackObama/status/786964419905523712,False,"Cynics, take note: When we #ActOnClimate, we b...",BarackObama
4,4,Oct 13,786680553617629184,/BarackObama/status/786680553617629185,False,"""That’s how we will overcome the challenges we...",BarackObama


In [17]:
clinton.head()

Unnamed: 0.1,Unnamed: 0,date,id,link,retweet,text,author
0,0,Oct 9,785272428905791488,/HillaryClinton/status/785272428905791489,False,Remember. #Debatepic.twitter.com/rlMbTt5WwY,HillaryClinton
1,1,Oct 9,785325012152713216,/HillaryClinton/status/785325012152713216,False,She won. http://hrc.io/2dQkjip #Debatepic.twi...,HillaryClinton
2,2,Oct 9,785282982261190656,/HillaryClinton/status/785282982261190656,False,Let's go. #Debatepic.twitter.com/HD3ZVJ9xl8,HillaryClinton
3,3,7m7 minutes ago,786963642080227328,/HillaryClinton/status/786963642080227328,False,"""Everyone knows how bright she is and how resi...",HillaryClinton
4,4,29m29 minutes ago,786958117531742208,/HillaryClinton/status/786958117531742208,False,"""All the progress we've made these last 8 year...",HillaryClinton


## Data Cleaning

A couple of issues have been identified by eye-balling.
1. There is an unnamed column which is unnecessary and must be dropped
2. ID, and link are not required as well
3. Date format is not standard
4. Since this is a tweet, it contains many non-alphanumeric characters as well

In [20]:
#Dropping the unnecessary columns
drop_cols = ["Unnamed: 0", "id", "link"]
trump.drop(drop_cols, axis=1, inplace=True)
obama.drop(drop_cols, axis=1, inplace=True)
clinton.drop(drop_cols, axis=1, inplace=True)

trump.head()

Unnamed: 0,date,retweet,text,author
0,Oct 7,False,Here is my statement.pic.twitter.com/WAZiGoQqMQ,DonaldTrump
1,Oct 10,False,Is this really America? Terrible!pic.twitter.c...,DonaldTrump
2,Oct 8,False,The media and establishment want me out of the...,DonaldTrump
3,Oct 11,False,"Wow, @CNN Town Hall questions were given to Cr...",DonaldTrump
4,Oct 10,False,Debate polls look great - thank you!\n#MAGA #A...,DonaldTrump


In [25]:
#For our analysis, we do not need non-alphanumeric characters like "%^&*() etc"  
text = "jeanf34%#$@asdfass svdaomve;l34 34 23; 2 l4 reldcxzf; "
i=0
text.replace()

Remember. #Debatepic.twitter.com/rlMbTt5WwY
She won. http://hrc.io/2dQkjip  #Debatepic.twitter.com/C4MP8cPONF
Let's go. #Debatepic.twitter.com/HD3ZVJ9xl8
"Everyone knows how bright she is and how resilient she is, but...they're going to see how much she cares." —@JoeBidenpic.twitter.com/QYF2Ewy55Y
"All the progress we've made these last 8 years is on the ballot." —@POTUS

Make sure you're ready to vote:http://IWillVote.com 
"You have a chance to reject the politics of fear. You can lift up again the politics of hope.” —@POTUS http://IWillVote.com 
"Change is possible, but it doesn’t just depend on one person—it depends on all of us.” —@POTUS http://hillaryclinton.com/volunteer  #StrongerTogether
"She knows that in a democracy as big and diverse as this, we can't demonize each other.” —@POTUS on Hillary
"He may be up at 3am, but it’s because he’s tweeting insults at somebody who got under his skin.” —@POTUS on Trump
“She's got the knowledge and experience and the temperament to be the n