# A Song of Vice and Higher: Characterizing Presidential Nominees through Game of Thrones

## Prepare to access Reddit's API 

We got the hang of using Reddit's API by following [Shropshire's article](https://towardsdatascience.com/exploring-reddits-ask-me-anything-using-the-praw-api-wrapper-129cf64c5d65).

1. Install or update PRAW from your terminal (for example, `pip install --upgrade praw`).

2. Create a Reddit account, or log in to an existing one, to begin authenticating via OAuth.

### Import Necessary Libraries

In [2]:
import os             # file system stuff
import json           # digest json
import praw           # reddit API
import pandas as pd   # Dataframes
import pymongo        # MongoDB
import helper         # custom helper functions

### Load the API keys

3. Create your first authorized Reddit instance.

In [3]:
# Define path to secret
secret_path = os.path.join(os.environ['HOME'], 'mia/.secret', 'reddit_api.json')

In [4]:
keys = helper.get_keys(secret_path)
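The `helper` module is not shown in this notebook. As a rough sketch, assuming the secret file is a flat JSON object with the four keys used in the next cell, a `get_keys` function could look like the following. The body is an assumption for illustration, not the actual helper, and the credentials are placeholders.

```python
import json
import os
import tempfile


def get_keys(path):
    """Load API credentials from a JSON file and return them as a dict."""
    with open(path) as f:
        return json.load(f)


# Demo with a throwaway file holding placeholder (fake) credentials
demo_path = os.path.join(tempfile.mkdtemp(), 'reddit_api.json')
with open(demo_path, 'w') as f:
    json.dump({'client_id': 'abc', 'api_key': 'xyz',
               'username': 'someone', 'password': 'not-a-real-password'}, f)

demo_keys = get_keys(demo_path)
print(sorted(demo_keys))  # ['api_key', 'client_id', 'password', 'username']
```

Keeping the file outside the repository, as the `secret_path` above does, keeps the credentials out of version control.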

In [5]:
reddit = praw.Reddit(client_id=keys['client_id'] 
                     ,client_secret=keys['api_key']
                     ,username=keys['username']
                     ,password=keys['password']
                     ,user_agent='reddit_research accessAPI:v0.0.1 (by /u/FlatDubs)')

4. Obtain a Subreddit Instance from your Reddit Instance. Ours will come from two different subreddits.

In [6]:
politics = reddit.subreddit('politics')

In [7]:
got = reddit.subreddit('gameofthrones')

5. Obtain submission instances from your Subreddit instance and compile the submission stats into lists.
6. Create a pandas DataFrame from the submission stats.

In [8]:
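# Step 5: obtain submissions through search.
# Python evaluates 'bran' or 'brandon stark' or ... to just 'bran', so the
# terms are joined into one query string instead; the output shown below
# reflects a search on 'bran' alone.
# Assumes Reddit search treats an uppercase OR as a boolean operator.
got_terms = ['bran', '"brandon stark"',
             '"jon snow"', 'jon',  # will Reddit authors be included in results?
             'khaleesi', 'dany', 'daenerys', 'danyris']
got_search = got.search(' OR '.join(got_terms),
                        sort='comments',
                        limit=5)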

#step 5 compile submission into list
title = [] 
num_comments = []
upvote_ratio = []
sub_id = []
i=0

for submission in got_search:
    i+=1
    title.append(submission.title)
    num_comments.append(submission.num_comments)
    upvote_ratio.append(submission.upvote_ratio)
    sub_id.append(submission.id)
    if i%100 == 0:
        print(f'{i} submissions completed')

#step 6 create dataframe
df_got = pd.DataFrame(
    {'title': title,
     'num_comments': num_comments,
     'upvote_ratio': upvote_ratio,
     'id':sub_id
    })
df_got

| | title | num_comments | upvote_ratio | id |
|---|---|---|---|---|
| 0 | [S7E5] Post-Premiere Discussion - S7E5 'Eastwa... | 26054 | 0.98 | 6tjeos |
| 1 | [S6E5] Post-Premiere Discussion - S6E5 'The Door' | 17604 | 0.97 | 4klpws |
| 2 | [S6E3] Post-Premiere Discussion - S6E3 'Oathbr... | 11830 | 0.97 | 4ihick |
| 3 | [S6E2] Post-Premiere Discussion - S6E2 'Home' | 11359 | 0.98 | 4hdflw |
| 4 | [S7E5] Live Premiere Discussion - S7E5 'Eastwa... | 9374 | 0.97 | 6tj3lx |
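When searching for several names at once, note that Python's `or` applied to non-empty strings returns only the first string, so the terms have to be combined into a single query string before they reach `search`. A minimal sketch, assuming Reddit's search accepts an uppercase `OR` between terms; `build_or_query` is a hypothetical helper, not part of PRAW:

```python
def build_or_query(terms):
    """Join search terms with OR, quoting any term that contains a space."""
    quoted = [f'"{t}"' if ' ' in t else t for t in terms]
    return ' OR '.join(quoted)


# `or` between non-empty strings short-circuits to the first operand
print('bran' or 'brandon stark')  # bran

query = build_or_query(['bran', 'brandon stark', 'jon snow'])
print(query)  # bran OR "brandon stark" OR "jon snow"
```

The resulting string could then be passed as the first argument to `got.search(...)`.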


In [9]:
# Repeat steps 5 and 6 for r/politics
dem_search = politics.search('kamala', 
                              sort='comments',
                             limit=5)

title = [] 
num_comments = []
upvote_ratio = []
sub_id = []
i=0

for submission in dem_search:
    i+=1
    title.append(submission.title)
    num_comments.append(submission.num_comments)
    upvote_ratio.append(submission.upvote_ratio)
    sub_id.append(submission.id) 
    if i%100 == 0:
        print(f'{i} submissions completed')

df_dem = pd.DataFrame(
    {'title': title,
     'num_comments': num_comments,
     'upvote_ratio': upvote_ratio,
     'id':sub_id
    })
df_dem

| | title | num_comments | upvote_ratio | id |
|---|---|---|---|---|
| 0 | Megathread: AG Willam Barr releases his top li... | 45579 | 0.88 | b50gkr |
| 1 | Megathread: President Trump delivers remarks o... | 32332 | 0.82 | 6tx8h7 |
| 2 | Megathread: Likely Explosive Devices Addressed... | 21359 | 0.9 | 9rlm9p |
| 3 | Megathread: President Trump announces a deal t... | 12928 | 0.88 | ajsubi |
| 4 | [Megathread] President Trump’s Address on Bord... | 9081 | 0.91 | ae2e7b |
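The two cells above repeat the same collection loop, so it may be worth factoring it out. A possible sketch follows; `submissions_to_rows` is a hypothetical helper, and the `SimpleNamespace` object with made-up values stands in for a live PRAW submission so the example runs without API access.

```python
from types import SimpleNamespace

FIELDS = ('title', 'num_comments', 'upvote_ratio', 'id')


def submissions_to_rows(submissions, fields=FIELDS):
    """Collect the chosen attributes of each submission into a list of dicts."""
    return [{name: getattr(s, name) for name in fields} for s in submissions]


# Stand-in object with made-up values, used instead of live API results
fake_results = [
    SimpleNamespace(title='Example thread', num_comments=12,
                    upvote_ratio=0.9, id='abc123'),
]
rows = submissions_to_rows(fake_results)
print(rows[0]['title'], rows[0]['num_comments'])  # Example thread 12
```

Passing the result to `pd.DataFrame(rows)` should give the same four columns as `df_got` and `df_dem`.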


## Retrieve Comments

In [10]:
submission = reddit.submission(id=df_dem['id'][0])

In [11]:
# Instantiate lists to hold comments
test_comments = []
comments_dicts = []

# Resolve up to 5 "MoreComments" placeholders (one API request each)
submission.comments.replace_more(limit=5)

# Flatten the comment tree and keep the first 100 comments
for comment in submission.comments.list()[:100]:
    # List of comments, as strings
    test_comments.append(comment.body)

    # List of comments, as dicts
    comments_dicts.append({
        'comment': comment.body
    })

In [12]:
# Check 
test_comments[4]

'https://twitter.com/neal_katyal/status/1109904171178704897?s=21\n\n>The Barr Letter says Mueller did not make a determination of whether Trump obstructed justice. It sets out both sides. But Barr concludes— on his determination — that there was no obstruction. How did he do that without trying to interview Trump? Full report now needs to come out\n\n- Neal Katyal\n\n'

In [13]:
# Put them in a DataFrame, as a proof of concept
pol_df = pd.DataFrame(test_comments, columns=['comment'])

len(pol_df)

100

In [14]:
# Test how we'll search comment strings later, when we use VADER
pol_df['comment'].str.contains('joyann')
# Notes:
# - str.contains is case sensitive by default, so 'joyann' misses 'JoyAnnReid'
# - we should write a function that attributes each comment to a person
# - forward slashes in links seem to act as spaces
# - lowercasing all comments would simplify the attribution phase

0     False
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8     False
9     False
10    False
11    False
12    False
13    False
14    False
15    False
16    False
17    False
18    False
19    False
20    False
21    False
22    False
23    False
24    False
25    False
26    False
27    False
28    False
29    False
      ...  
70    False
71    False
72    False
73    False
74    False
75    False
76    False
77    False
78    False
79    False
80    False
81    False
82    False
83    False
84    False
85    False
86    False
87    False
88    False
89    False
90    False
91    False
92    False
93    False
94    False
95    False
96    False
97    False
98    False
99    False
Name: comment, Length: 100, dtype: bool
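The notes in the cell above call for a function that attributes a comment to a person after lowercasing it. One way this might look is sketched below. The alias table is hypothetical (only some of the names come from the searches earlier in this notebook), and matching whole words rather than substrings is a design choice made here so that a short alias such as `jon` does not match inside unrelated words.

```python
import re

# Hypothetical alias table: person -> lowercase aliases to look for
ALIASES = {
    'kamala harris': ['kamala', 'harris'],
    'daenerys': ['khaleesi', 'dany', 'daenerys'],
    'jon snow': ['jon'],
}


def attribute_comment(text, aliases=ALIASES):
    """Return the sorted names whose aliases appear as whole words in text."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return sorted(name for name, keys in aliases.items()
                  if any(key in words for key in keys))


print(attribute_comment('I think Kamala did well; Dany did not.'))
# ['daenerys', 'kamala harris']
print(attribute_comment('Jones was there.'))
# []
```

With whole-word matching, a handle such as `JoyAnnReid` inside a URL becomes the single token `joyannreid`, so handles would need their own aliases if they should count.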

## How about some Vader Sentiment Action?

In [15]:
#pip install --upgrade vaderSentiment

In [17]:
from vaderSentiment import vaderSentiment

In [22]:
analyzer = vaderSentiment.SentimentIntensityAnalyzer()
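`polarity_scores` returns a dict with `neg`, `neu`, `pos`, and `compound` keys. The VADER README suggests treating a compound score of 0.05 or higher as positive, -0.05 or lower as negative, and anything in between as neutral. A small sketch of that rule follows; `label_compound` is a hypothetical helper, not part of the library, and the example dict copies one of the score outputs from this notebook.

```python
def label_compound(compound, threshold=0.05):
    """Map a VADER compound score to 'positive', 'negative', or 'neutral'."""
    if compound >= threshold:
        return 'positive'
    if compound <= -threshold:
        return 'negative'
    return 'neutral'


# A score dict of the shape polarity_scores returns
scores = {'neg': 0.035, 'neu': 0.818, 'pos': 0.147, 'compound': 0.7839}
print(label_compound(scores['compound']))  # positive
```

Keeping the threshold as a parameter makes it easy to widen or narrow the neutral band later.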

In [23]:
for comment in test_comments:
    print(comment)
    print(analyzer.polarity_scores(comment))
# Scoring reference: https://github.com/cjhutto/vaderSentiment#about-the-scoring

>https://twitter.com/JoyAnnReid/status/1109932247858245632

>Joy Reid: So presumably everyone will now be in bipartisan agreement that there’s **no reason why the full Mueller report should not be released, like the Starr report was.** If it’s as good for the president as AG Barr says it is, it can only be good for everyone to see it published.
{'neg': 0.035, 'neu': 0.818, 'pos': 0.147, 'compound': 0.7839}
[https://twitter.com/RepJerryNadler/status/1109913142933573632](https://twitter.com/RepJerryNadler/status/1109913142933573632)

"In light of the very concerning discrepancies and final decision making at the Justice Department following the Special Counsel report, where Mueller did not exonerate the President, we will be calling Attorney General Barr in to testify before [~~@~~**HouseJudiciary**](https://twitter.com/HouseJudiciary) in the near future."
{'neg': 0.047, 'neu': 0.829, 'pos': 0.123, 'compound': 0.5815}
Cool. Seems like it would be pretty harmless to the President to relea