# Pulling From APIs


In [13]:
# Import these libraries 
import requests
import json
import pandas as pd
import time
import warnings

We will need the requests library. To pull from APIs in Python you can use either the third-party Requests library or the standard library's urllib (called urllib2 in Python 2).
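For comparison, here is a minimal sketch of how the same kind of request could be prepared with the standard library's `urllib.request` instead of Requests. It only builds the request object and sends nothing; the URL and User-agent string are the ones used in the cells below, and the variable names are just for illustration.

```python
import urllib.request

url = "http://www.reddit.com/r/weather.json"

# Build (but do not send) a GET request with a custom User-agent header
req = urllib.request.Request(url, headers={"User-agent": "Marc Bot 0.1"})

print(req.get_method())              # GET
print(req.full_url)
print(req.get_header("User-agent"))  # Marc Bot 0.1

# Sending it would need network access and would look roughly like this:
# with urllib.request.urlopen(req) as resp:
#     body = resp.read()
```

Requests wraps these steps in a single `requests.get(...)` call, which is why the rest of this notebook uses it.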

In [5]:
# First step is to find the url
url = "http://www.reddit.com/r/weather.json"

In [8]:
# next we need to create a header
# Headers vary between APIs; many also require a credential such as an API key, but the concept is similar.
res = requests.get(url, headers={'User-agent': 'Marc Bot 0.1'})

In [9]:
# make sure the request succeeded (a status code of 200 means OK)
res.status_code

200
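A status code of 200 means the request succeeded; anything else deserves a closer look. As a small sketch, the standard library's `http.HTTPStatus` can turn a numeric code into a readable phrase. The helper name `describe_status` is made up for this example.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short human-readable description of an HTTP status code."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Codes that are not in the standard registry end up here
        phrase = "Unknown status"
    outcome = "success" if 200 <= code < 300 else "not a success"
    return f"{code} {phrase} ({outcome})"

for code in (200, 404, 429):
    print(describe_status(code))
```

A 429 ("Too Many Requests") is the one to watch for when pulling many pages in a row, which is why the loop further down pauses between requests.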

In [11]:
url

'http://www.reddit.com/r/weather.json'

In [15]:
# next step will be to build a for loop to get posts from reddit
posts = []
url = 'https://www.reddit.com/r/weather/top.json?t=all'
after = None

#Step 1: build the loop and print URL
for page in range(20):
    if after is None:
        current_url = url
    else:
        current_url = url + '&after=' + after

    print(f'Page {page}:', current_url)

    #Step 2a: make the request and handle the status code
    res = requests.get(current_url, headers={'User-agent': 'Marc Bot 0.1'})

    if res.status_code != 200:
        print('Status error', res.status_code)
        break
    else:
        print(res)

    #Step 3: Actually deal with the data
    current_dict = res.json()
    current_posts = [p['data'] for p in current_dict['data']['children']]
    posts.extend(current_posts)

    # 'after' is the cursor for the next page; None means there are no more pages
    after = current_dict['data']['after']
    if after is None:
        break

    #Step 2b: pause between requests so we don't hammer the API
    time.sleep(1)

Page 0: https://www.reddit.com/r/weather/top.json?t=all
<Response [200]>
Page 1: https://www.reddit.com/r/weather/top.json?t=all&after=t3_bulkjc
<Response [200]>
Page 2: https://www.reddit.com/r/weather/top.json?t=all&after=t3_bgj27g
<Response [200]>
Page 3: https://www.reddit.com/r/weather/top.json?t=all&after=t3_963sed
<Response [200]>
Page 4: https://www.reddit.com/r/weather/top.json?t=all&after=t3_c0ebv9
<Response [200]>
Page 5: https://www.reddit.com/r/weather/top.json?t=all&after=t3_8fb2aa
<Response [200]>
Page 6: https://www.reddit.com/r/weather/top.json?t=all&after=t3_9dtap9
<Response [200]>
Page 7: https://www.reddit.com/r/weather/top.json?t=all&after=t3_b4uohq
<Response [200]>
Page 8: https://www.reddit.com/r/weather/top.json?t=all&after=t3_6ycido
<Response [200]>
Page 9: https://www.reddit.com/r/weather/top.json?t=all&after=t3_89eu42
<Response [200]>
Page 10: https://www.reddit.com/r/weather/top.json?t=all&after=t3_aoidku
<Response [200]>
Page 11: https://www.reddit.com/r/we
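
The paging pattern above (request a page, read its `after` cursor, pass it on to the next request) can be sketched without touching the network. In this sketch `collect_pages` and `fake_fetch` are hypothetical names, and the fake pages only mimic the shape of Reddit's response; the two cursor values are copied from the output above.

```python
import time

def collect_pages(fetch, max_pages=20, pause=0.0):
    """Follow 'after' cursors until they run out or max_pages is reached."""
    posts = []
    after = None
    for _ in range(max_pages):
        page = fetch(after)
        posts.extend(child["data"] for child in page["data"]["children"])
        after = page["data"]["after"]
        if after is None:   # no cursor means this was the last page
            break
        time.sleep(pause)   # be polite between requests
    return posts

# Stand-in for requests.get(...).json(): three tiny fake pages keyed by cursor
FAKE_PAGES = {
    None: {"data": {"children": [{"data": {"title": "post 1"}}], "after": "t3_bulkjc"}},
    "t3_bulkjc": {"data": {"children": [{"data": {"title": "post 2"}}], "after": "t3_bgj27g"}},
    "t3_bgj27g": {"data": {"children": [{"data": {"title": "post 3"}}], "after": None}},
}

def fake_fetch(after):
    return FAKE_PAGES[after]

titles = [p["title"] for p in collect_pages(fake_fetch)]
print(titles)
```

With the real API, `fetch` would wrap the `requests.get` call from the loop, and `pause` would be set to a second or so.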

Let's take a step back!
In order to make it all work we need to work through that messy JSON. The key here is to find the right dictionary keys. Not all JSON responses are structured the same way, but they share the same building blocks: nested dictionaries and lists. The goal is to narrow down that incredible mess!

Copy and paste this link into a web browser: http://www.reddit.com/r/weather.json
Look how messy that is!

In [39]:
current_dict.keys()

dict_keys(['kind', 'data'])

In [45]:
current_dict['data'].keys()

dict_keys(['modhash', 'dist', 'children', 'after', 'before'])

In [48]:
current_dict['data']['children'][0].keys()

dict_keys(['kind', 'data'])

In [51]:
len(current_dict['data']['children'])

25
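
The drill-down above follows one path through the nesting: the top-level dict, then `data`, then the `children` list, then each child's own `data`. Here is a small sketch of that path on a hand-built dictionary with the same shape. It is cut down to a single child with three fields, whose values are taken from the first post printed below; `sample` is an illustration, not a real API response.

```python
# A cut-down stand-in for res.json(); the real response has many more keys
sample = {
    "kind": "Listing",
    "data": {
        "after": None,
        "children": [
            {
                "kind": "t3",
                "data": {
                    "title": "Land-icane over us in North Dakota back in November 2018",
                    "ups": 194,
                    "name": "t3_awb0vh",
                },
            }
        ],
    },
}

print(list(sample.keys()))            # ['kind', 'data']
children = sample["data"]["children"]
print(len(children))                  # 1
first_post = children[0]["data"]
print(first_post["title"], first_post["ups"])

# Same list comprehension as in the loop: keep only each child's 'data' dict
posts = [child["data"] for child in children]
```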

In [50]:
current_dict['data']['children'][0]

{'kind': 't3',
 'data': {'approved_at_utc': None,
  'subreddit': 'weather',
  'selftext': '',
  'author_fullname': 't2_2luw301h',
  'saved': False,
  'mod_reason_title': None,
  'gilded': 0,
  'clicked': False,
  'title': 'Land-icane over us in North Dakota back in November 2018',
  'link_flair_richtext': [],
  'subreddit_name_prefixed': 'r/weather',
  'hidden': False,
  'pwls': 6,
  'link_flair_css_class': None,
  'downs': 0,
  'thumbnail_height': 140,
  'hide_score': False,
  'name': 't3_awb0vh',
  'quarantine': False,
  'link_flair_text_color': 'dark',
  'author_flair_background_color': None,
  'subreddit_type': 'public',
  'ups': 194,
  'total_awards_received': 0,
  'media_embed': {},
  'thumbnail_width': 140,
  'author_flair_template_id': None,
  'is_original_content': False,
  'user_reports': [],
  'secure_media': {'reddit_video': {'fallback_url': 'https://v.redd.it/7c4357xqalj21/DASH_1080?source=fallback',
    'height': 1080,
    'width': 608,
    'scrubber_media_url': 'https://

## Now we can finally use Pandas to put all that data together!

In [52]:
df = pd.DataFrame(posts)
# keep only the columns we care about
df = df[['title', 'ups', 'num_comments', 'subreddit_name_prefixed']]
df.head()

,title,ups,num_comments,subreddit_name_prefixed
0,Storm chasers are paying tribute to Bill Paxto...,1830,70,r/weather
1,I'm from Colima. I'm gonna try to post whateve...,1089,724,r/weather
2,The Weather Channel has the perfect response f...,918,78,r/weather
3,From someone in an area that never has tornado...,827,651,r/weather
4,Heard the sirens. Stepped out of my office. Th...,730,83,r/weather
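
Each post dictionary carries dozens of keys, and only four of them end up in the table. As an optional sketch, the posts can be trimmed to those fields before they reach pandas. `keep_fields` is a hypothetical helper, and the two sample posts reuse the (truncated) values from the table above plus one extra key, `saved`, to show it being dropped.

```python
WANTED = ["title", "ups", "num_comments", "subreddit_name_prefixed"]

def keep_fields(post, fields=WANTED):
    """Return a copy of post with only the wanted keys (missing ones become None)."""
    return {field: post.get(field) for field in fields}

# Illustrative posts; titles are truncated exactly as they appear in the table
sample_posts = [
    {"title": "Storm chasers are paying tribute to Bill Paxto...", "ups": 1830,
     "num_comments": 70, "subreddit_name_prefixed": "r/weather", "saved": False},
    {"title": "I'm from Colima. I'm gonna try to post whateve...", "ups": 1089,
     "num_comments": 724, "subreddit_name_prefixed": "r/weather", "saved": False},
]

trimmed = [keep_fields(p) for p in sample_posts]
print(trimmed[0])
```

Passing `trimmed` to `pd.DataFrame` would then give the same four columns without the extra column-selection step.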


# APIs with Wrappers

Sometimes it gets even easier! Some APIs come with wrappers: libraries that handle the requests for you. In this next example I am pulling data from the Quandl API. Quandl is a financial data company. https://www.quandl.com/tools/api


It's super easy to use! Just sign up for an API key and the website will walk you through exactly what you need to do.

In [16]:
# You might first need to pip install quandl
import quandl

I am hiding my API key, but it really is that easy! Just follow the site's instructions and you can pull data just like that!
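
One common way to keep a key out of the notebook is to read it from an environment variable. A minimal sketch, assuming the key was stored under the name `QUANDL_API_KEY` (that variable name and the `load_api_key` helper are my own choices, not something Quandl prescribes):

```python
import os

def load_api_key(name="QUANDL_API_KEY"):
    """Read an API key from the environment, failing loudly if it is missing."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key

# Demo only: put a placeholder in the environment so the sketch runs end to end
os.environ.setdefault("QUANDL_API_KEY", "not-a-real-key")
api_key = load_api_key()
print(len(api_key) > 0)
```

The result can then be assigned to `quandl.ApiConfig.api_key` instead of a hard-coded string.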

# Note: The cell below won't execute on your computer without your own API key

In [22]:
quandl.ApiConfig.api_key = 'Hidden API KEY'
SPY = quandl.get('AAII/AAII_SENTIMENT', start_date='2010-01-01', end_date='2018-10-13')
SPY.tail(10)

Date,Bullish,Neutral,Bearish,Total,Bullish 8-Week Mov Avg,Bull-Bear Spread,Bullish Average,Bullish Average + St. Dev,Bullish Average - St. Dev,S&P 500 Weekly High,S&P 500 Weekly Low,S&P 500 Weekly Close
2018-08-09,0.363636,0.326019,0.310345,1.0,0.337165,0.053291,0.381696,0.482914,0.280477,2863.43,2796.34,2857.7
2018-08-16,0.361702,0.347518,0.29078,1.0,0.333979,0.070922,0.381696,0.482914,0.280477,2862.48,2802.49,2818.37
2018-08-23,0.384615,0.344729,0.270655,0.999999,0.346496,0.11396,0.381696,0.482914,0.280477,2873.23,2802.49,2861.82
2018-08-30,0.434959,0.321138,0.243902,0.999999,0.366047,0.191057,0.381696,0.482914,0.280477,2916.5,2854.03,2914.04
2018-09-06,0.422222,0.314815,0.262963,1.0,0.365011,0.159259,0.381696,0.482914,0.280477,2916.5,2876.92,2888.6
2018-09-13,0.320896,0.350746,0.328358,1.0,0.361794,-0.007462,0.381696,0.482914,0.280477,2894.65,2864.12,2888.92
2018-09-20,0.320423,0.359155,0.320423,1.000001,0.362445,0.0,0.381696,0.482914,0.280477,2912.36,2879.2,2907.95
2018-09-27,0.362205,0.326772,0.311024,1.000001,0.371332,0.051181,0.381696,0.482914,0.280477,2940.91,2903.28,2905.97
2018-10-04,0.456621,0.292237,0.251142,1.0,0.382955,0.205479,0.381696,0.482914,0.280477,2939.86,2903.28,2925.51
2018-10-11,0.306061,0.339394,0.354545,1.0,0.376,-0.048484,0.381696,0.482914,0.280477,2939.86,2784.86,2785.68
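
Judging by the numbers in the table, the `Bull-Bear Spread` column looks like `Bullish` minus `Bearish`. That is my reading of the output, not something taken from the API documentation, so here is a quick check using three rows copied from above:

```python
# (date, bullish, bearish, reported spread) copied from the table above
rows = [
    ("2018-08-09", 0.363636, 0.310345, 0.053291),
    ("2018-09-20", 0.320423, 0.320423, 0.0),
    ("2018-10-11", 0.306061, 0.354545, -0.048484),
]

for date, bullish, bearish, reported in rows:
    spread = bullish - bearish
    # Allow for rounding in the six-decimal display
    matches = abs(spread - reported) < 1e-5
    print(date, round(spread, 6), matches)
```

A negative spread, as in the last row, means more respondents were bearish than bullish that week.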
