<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">


# Introduction to APIs
*Authors: dorkydragon and GA*

![](imgs/data_is_around.png)

----

## The `requests` Library
`requests` is a third-party library for submitting HTTP requests from Python. Despite its frequent use, it isn't part of the Python standard library, so you'll need to `pip install requests` yourself.
![](imgs/pokeapi.png)

API documentation can be found at https://pokeapi.co/

In [1]:
import pandas as pd
import numpy as np
import requests
import time

In [2]:
# Create url for API call.
url = 'https://pokeapi.co/api/v2/pokemon/ditto'

In [3]:
# Submit request
res = requests.get(url)

In [4]:
# Request response code
res.status_code

200
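
A status code of 200 means the request succeeded. As a small aside, here is a minimal sketch that turns a status code into a readable summary using the standard library's `http.HTTPStatus`. The helper name `describe_status` is hypothetical and not part of `requests`.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short, human-readable summary of an HTTP status code."""
    phrase = HTTPStatus(code).phrase  # standard phrase, e.g. 200 -> 'OK'
    # Codes in the 2xx range indicate success.
    outcome = 'success' if 200 <= code < 300 else 'not a success'
    return f'{code} {phrase} ({outcome})'

print(describe_status(200))  # 200 OK (success)
print(describe_status(404))  # 404 Not Found (not a success)
```

With a `requests` response, calling `res.raise_for_status()` is another option: it raises an exception for 4xx and 5xx codes instead of letting a failed request pass silently.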

In [7]:
# Text of the response
res.text

'{"abilities":[{"ability":{"name":"limber","url":"https://pokeapi.co/api/v2/ability/7/"},"is_hidden":false,"slot":1},{"ability":{"name":"imposter","url":"https://pokeapi.co/api/v2/ability/150/"},"is_hidden":true,"slot":3}],"base_experience":101,"forms":[{"name":"ditto","url":"https://pokeapi.co/api/v2/pokemon-form/132/"}],"game_indices":[{"game_index":76,"version":{"name":"red","url":"https://pokeapi.co/api/v2/version/1/"}},{"game_index":76,"version":{"name":"blue","url":"https://pokeapi.co/api/v2/version/2/"}},{"game_index":76,"version":{"name":"yellow","url":"https://pokeapi.co/api/v2/version/3/"}},{"game_index":132,"version":{"name":"gold","url":"https://pokeapi.co/api/v2/version/4/"}},{"game_index":132,"version":{"name":"silver","url":"https://pokeapi.co/api/v2/version/5/"}},{"game_index":132,"version":{"name":"crystal","url":"https://pokeapi.co/api/v2/version/6/"}},{"game_index":132,"version":{"name":"ruby","url":"https://pokeapi.co/api/v2/version/7/"}},{"game_index":132,"version"

In [8]:
# Bring in the JSON!
ditto = res.json()
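
To see roughly what `res.json()` does, the sketch below parses a shortened fragment of the response text with the standard library's `json` module. The fragment is cut down from the output above, so it is not the full payload.

```python
import json

# A shortened fragment of the response text shown above (not the full payload).
text = (
    '{"abilities":[{"ability":{"name":"limber",'
    '"url":"https://pokeapi.co/api/v2/ability/7/"},'
    '"is_hidden":false,"slot":1}],"base_experience":101}'
)

# json.loads turns a JSON string into plain Python objects.
fragment = json.loads(text)

print(type(fragment).__name__)                      # dict
print(fragment['abilities'][0]['ability']['name'])  # limber
print(fragment['abilities'][0]['is_hidden'])        # False (JSON false -> Python False)
```

JSON objects become dictionaries, arrays become lists, and `true`/`false` become `True`/`False`, which is why the usual indexing tools work on the result.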

In [12]:
# Since we've converted the JSON -> dict, we know how to work with this!
sorted(list(ditto.keys()))

['abilities',
 'base_experience',
 'forms',
 'game_indices',
 'height',
 'held_items',
 'id',
 'is_default',
 'location_area_encounters',
 'moves',
 'name',
 'order',
 'past_types',
 'species',
 'sprites',
 'stats',
 'types',
 'weight']

In [13]:
# Height, Weight
ditto['height']

3

In [14]:
ditto['weight']

40

## Challenge
---
What moves can Pikachu learn? In the cells below, do the following:
1. Use the `requests` library to get Pikachu's data.
1. Convert the JSON into a Python dictionary.
1. Use a list comprehension to get just the names of each move. Your result should look like this:

```python
['mega-punch',
 'pay-day',
 'thunder-punch',
 'slam',
 'mega-kick',
 'headbutt',
 'body-slam',
 'take-down',
 ...
]
```

In [19]:
# Use the requests library to get Pikachu's data
url = 'https://pokeapi.co/api/v2/pokemon/pikachu'
res = requests.get(url)

In [20]:
res.status_code

200

In [21]:
# Convert the JSON into a Python dictionary.
pikachu = res.json()

In [31]:
# Use a list comprehension to get just the names of each move.
# First, inspect a single move to see where the name lives.
move = pikachu['moves'][0]
move['move']['name']

'mega-punch'

In [32]:
[move['move']['name'] for move in pikachu['moves']]

['mega-punch',
 'pay-day',
 'thunder-punch',
 'slam',
 'double-kick',
 'mega-kick',
 'headbutt',
 'body-slam',
 'take-down',
 'double-edge',
 'tail-whip',
 'growl',
 'surf',
 'submission',
 'counter',
 'seismic-toss',
 'strength',
 'thunder-shock',
 'thunderbolt',
 'thunder-wave',
 'thunder',
 'dig',
 'toxic',
 'agility',
 'quick-attack',
 'rage',
 'mimic',
 'double-team',
 'defense-curl',
 'light-screen',
 'reflect',
 'bide',
 'swift',
 'skull-bash',
 'flash',
 'rest',
 'substitute',
 'thief',
 'snore',
 'curse',
 'reversal',
 'protect',
 'sweet-kiss',
 'mud-slap',
 'zap-cannon',
 'detect',
 'endure',
 'charm',
 'rollout',
 'swagger',
 'spark',
 'attract',
 'sleep-talk',
 'return',
 'frustration',
 'dynamic-punch',
 'encore',
 'iron-tail',
 'hidden-power',
 'rain-dance',
 'rock-smash',
 'uproar',
 'facade',
 'focus-punch',
 'helping-hand',
 'brick-break',
 'knock-off',
 'secret-power',
 'signal-beam',
 'covet',
 'volt-tackle',
 'calm-mind',
 'shock-wave',
 'natural-gift',
 'feint',
 '

In [33]:
move

{'move': {'name': 'mega-punch', 'url': 'https://pokeapi.co/api/v2/move/5/'},
 'version_group_details': [{'level_learned_at': 0,
   'move_learn_method': {'name': 'machine',
    'url': 'https://pokeapi.co/api/v2/move-learn-method/4/'},
   'version_group': {'name': 'red-blue',
    'url': 'https://pokeapi.co/api/v2/version-group/1/'}},
  {'level_learned_at': 0,
   'move_learn_method': {'name': 'machine',
    'url': 'https://pokeapi.co/api/v2/move-learn-method/4/'},
   'version_group': {'name': 'yellow',
    'url': 'https://pokeapi.co/api/v2/version-group/2/'}},
  {'level_learned_at': 0,
   'move_learn_method': {'name': 'tutor',
    'url': 'https://pokeapi.co/api/v2/move-learn-method/3/'},
   'version_group': {'name': 'emerald',
    'url': 'https://pokeapi.co/api/v2/version-group/6/'}},
  {'level_learned_at': 0,
   'move_learn_method': {'name': 'tutor',
    'url': 'https://pokeapi.co/api/v2/move-learn-method/3/'},
   'version_group': {'name': 'firered-leafgreen',
    'url': 'https://pokeapi
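
The same list-comprehension pattern works one level deeper. As a sketch, the sample below copies the first three `version_group_details` entries from the output above (the rest is truncated there) and filters them by learn method. The function name `version_groups_by_method` is hypothetical.

```python
# Sample copied from the first three entries of the `move` output above.
move = {
    'move': {'name': 'mega-punch', 'url': 'https://pokeapi.co/api/v2/move/5/'},
    'version_group_details': [
        {'level_learned_at': 0,
         'move_learn_method': {'name': 'machine',
                               'url': 'https://pokeapi.co/api/v2/move-learn-method/4/'},
         'version_group': {'name': 'red-blue',
                           'url': 'https://pokeapi.co/api/v2/version-group/1/'}},
        {'level_learned_at': 0,
         'move_learn_method': {'name': 'machine',
                               'url': 'https://pokeapi.co/api/v2/move-learn-method/4/'},
         'version_group': {'name': 'yellow',
                           'url': 'https://pokeapi.co/api/v2/version-group/2/'}},
        {'level_learned_at': 0,
         'move_learn_method': {'name': 'tutor',
                               'url': 'https://pokeapi.co/api/v2/move-learn-method/3/'},
         'version_group': {'name': 'emerald',
                           'url': 'https://pokeapi.co/api/v2/version-group/6/'}},
    ],
}

def version_groups_by_method(move, method):
    """List the version groups in which the move is learned via `method`."""
    return [
        detail['version_group']['name']
        for detail in move['version_group_details']
        if detail['move_learn_method']['name'] == method
    ]

print(version_groups_by_method(move, 'machine'))  # ['red-blue', 'yellow']
print(version_groups_by_method(move, 'tutor'))    # ['emerald']
```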

## Creating a `pandas` DataFrame from JSON
---
To create a DataFrame, we simply need a list of dictionaries. We'll start with a toy example, then create a DataFrame from Pikachu's abilities.

In [35]:
instructors = [
    {'name': 'Riley'},
    {'name': 'Katie'},
    {'name': 'Alanna', 'location': 'SEA'}
]
pd.DataFrame(instructors)

Unnamed: 0,name,location
0,Riley,
1,Katie,
2,Alanna,SEA


That's all fine and good. But notice each `ability` is a dictionary. In the cell below, let's extract the `name` from each ability and set it as its own column.

In [37]:
pikachu_df = pd.DataFrame(pikachu['abilities'])

In [41]:
def extract_name(ability_dict):
    return ability_dict['name']

pikachu_df['name'] = pikachu_df['ability'].map(extract_name)
pikachu_df

Unnamed: 0,ability,is_hidden,slot,name
0,"{'name': 'static', 'url': 'https://pokeapi.co/...",False,1,static
1,"{'name': 'lightning-rod', 'url': 'https://poke...",True,3,lightning-rod
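
An alternative to adding columns afterwards is to flatten each record before building the DataFrame. The sketch below assumes the two ability records shown above, with the `url` field left out because the output truncates it. The helper name `flatten_ability` and the `ability_` column prefix are hypothetical choices.

```python
# The two ability records from the table above; `url` is omitted because
# the displayed output truncates it.
abilities = [
    {'ability': {'name': 'static'}, 'is_hidden': False, 'slot': 1},
    {'ability': {'name': 'lightning-rod'}, 'is_hidden': True, 'slot': 3},
]

def flatten_ability(entry):
    """Lift the nested `ability` dict into top-level `ability_*` keys."""
    # Keep every top-level field except the nested dictionary itself.
    flat = {key: value for key, value in entry.items() if key != 'ability'}
    # Add one flat key per field of the nested dictionary.
    for key, value in entry['ability'].items():
        flat[f'ability_{key}'] = value
    return flat

rows = [flatten_ability(entry) for entry in abilities]
print(rows[0])  # {'is_hidden': False, 'slot': 1, 'ability_name': 'static'}
```

Passing `rows` to `pd.DataFrame` would then give one plain column per field, with no dictionaries left inside cells.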


## Challenge: get a description for each ability.
---

If you're a Pokemon neophyte (like me), you might not know what the **static** ability does. Fortunately, we can use the `url` from our `ability` dictionary to make an API request. Because we're hitting several URLs in a loop, we want to throttle our requests so as not to overwhelm the API.

In the cell below, create a function that retrieves the `effect`. NOTE: The effect is available in multiple languages; choose one.

Now use the `.map()` method to get the effect for each ability.
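
One possible shape for such a function is sketched below. It assumes the ability endpoint returns an `effect_entries` list whose items carry an `effect` string and a `language` dictionary with a `name`; check the API documentation before relying on that. The names `extract_effect`, `get_effect`, and `fetch` are hypothetical, and `fetch` is passed in so the sketch runs without a network connection.

```python
import time

def extract_effect(ability_json, language='en'):
    """Return the effect text for one language, or None if it is missing.

    Assumes `effect_entries` items look like
    {'effect': '...', 'language': {'name': 'en', ...}}.
    """
    for entry in ability_json.get('effect_entries', []):
        if entry['language']['name'] == language:
            return entry['effect']
    return None

def get_effect(ability_dict, fetch, pause=1.0):
    """Fetch one ability's JSON via `fetch(url)` and return its effect text."""
    ability_json = fetch(ability_dict['url'])
    time.sleep(pause)  # throttle so a loop of calls does not hammer the API
    return extract_effect(ability_json)

# Offline demonstration with made-up placeholder text (not real API output).
sample = {'effect_entries': [
    {'effect': 'Beispieltext.', 'language': {'name': 'de'}},
    {'effect': 'Example text.', 'language': {'name': 'en'}},
]}
print(extract_effect(sample))  # Example text.

# With the notebook's objects (live requests, so left commented out):
# fetch = lambda url: requests.get(url).json()
# pikachu_df['effect'] = pikachu_df['ability'].map(lambda a: get_effect(a, fetch))
```

Keeping the parsing separate from the request makes the language choice easy to test and change.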


## On to Reddit!

----

<img src="imgs/reddit_logo.png" style="width: 250px; margin: 0 0 20px 0;" alt="Reddit Logo" />

We'll use the PushShift API. 

[Click for documentation](https://github.com/pushshift/api)

In [42]:
url = 'https://api.pushshift.io/reddit/search/submission'

## Parameters
---

If we want the most recent posts from /r/Science, we have to use the `subreddit` parameter. The URL would be as follows:

https://api.pushshift.io/reddit/search/submission/?subreddit=science

Parameters are everything after the `?` in a URL. They're sort of like a dictionary, in that each parameter is a key/value pair (e.g., `subreddit=science`).

Multiple parameters are separated with a `&`.

https://api.pushshift.io/reddit/search/submission/?subreddit=science&foo=bar
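
The dictionary analogy can be made concrete. As a sketch, the standard library's `urllib.parse.urlencode` turns a dictionary into the text that follows the `?`. The `foo=bar` pair is the same placeholder used in the URL above.

```python
from urllib.parse import urlencode

base = 'https://api.pushshift.io/reddit/search/submission/'
params = {'subreddit': 'science', 'foo': 'bar'}

# urlencode joins key=value pairs with '&'.
query = urlencode(params)
full_url = base + '?' + query

print(query)     # subreddit=science&foo=bar
print(full_url)  # https://api.pushshift.io/reddit/search/submission/?subreddit=science&foo=bar

# Spaces in a value are encoded as '+'.
print(urlencode({'q': 'two words'}))  # q=two+words
```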

## Challenge
---
Use the API documentation to answer the following question:

How would we search for the word "election" in [/r/news](https://reddit.com/r/news)? Your answer should be a URL with two parameters.

In [48]:
# NOTE: The base URL is already stored in `url`
url + '?subreddit=news&q=election+2022'

'https://api.pushshift.io/reddit/search/submission?subreddit=news&q=election+2022'

## Route
https://api.pushshift.io/reddit/search/submission

## Params
?

subreddit=news
&
q=election+2022
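
The same breakdown can be done in code. This sketch uses the standard library's `urllib.parse` to split the URL built above into its route and its parameters.

```python
from urllib.parse import urlsplit, parse_qs

full_url = 'https://api.pushshift.io/reddit/search/submission?subreddit=news&q=election+2022'

parts = urlsplit(full_url)

# The route is everything before the '?'.
route = f'{parts.scheme}://{parts.netloc}{parts.path}'

# parse_qs maps each parameter name to a list of values and decodes '+' to a space.
params = parse_qs(parts.query)

print(route)   # https://api.pushshift.io/reddit/search/submission
print(params)  # {'subreddit': ['news'], 'q': ['election 2022']}
```

Values come back as lists because a parameter name may appear more than once in a query string.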

With the `requests` library, we can use the full URL (including params):

```python
requests.get('https://api.pushshift.io/reddit/search/submission/?subreddit=news&q=election')
```

Or (even better), `requests` will accept the parameters as a dictionary:

```python
params = {
    'subreddit': 'news',
    'q': 'election'
}
requests.get(url, params=params)
```

In [52]:
# Get the most recent *50* posts from */r/news* containing the word *"election"*
params = {
    'subreddit': 'news',
    'q': 'election',
    'size': 50
}

res = requests.get(url, params=params)

In [53]:
res.status_code

200

In [54]:
data = res.json()

In [56]:
data.keys()

dict_keys(['data'])

In [63]:
posts = pd.DataFrame(data['data'])[['title', 'selftext', 'subreddit', 'created_utc']]
posts

Unnamed: 0,title,selftext,subreddit,created_utc
0,Election round up: PM Narendra Modi meets seve...,,news,1645209330
1,Amazon agrees nyc union election terms setting...,,news,1645067708
2,Latest plan by the election crybabies at the A...,,news,1645026613
3,Clerk who espoused election fraud conspiracies...,,news,1645012775
4,UP Phase 2 Election 2022 Highlights,,news,1644936927
5,Manchin clarifies: He'd oppose second high cou...,,news,1644925638
6,BREAKING：P&amp;G taking advantage of forced la...,,news,1644910081
7,"Election round up: Heavy turnout in UP, Uttara...",,news,1644863726
8,Turkmenistan to hold early presidential electi...,,news,1644857087
9,Chinese spies attempted to install Labor candi...,,news,1644626395


In [None]:
# Convert the posts into a DataFrame

## Challenge
---
The posts are ordered by `created_utc`. In the cell below, prove that this is true.
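
One way to check the ordering, sketched here on the ten `created_utc` values displayed in the table above: compare each timestamp with the one that follows it. The helper name `is_newest_first` is hypothetical.

```python
# The ten created_utc values shown in the `posts` table above.
created_utc = [
    1645209330, 1645067708, 1645026613, 1645012775, 1644936927,
    1644925638, 1644910081, 1644863726, 1644857087, 1644626395,
]

def is_newest_first(timestamps):
    """True if every timestamp is >= the one after it (newest first)."""
    return all(a >= b for a, b in zip(timestamps, timestamps[1:]))

print(is_newest_first(created_utc))        # True
print(is_newest_first(created_utc[::-1]))  # False (reversed, so oldest first)
```

On the full DataFrame, `posts['created_utc'].is_monotonic_decreasing` performs the same kind of check in pandas.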

In [69]:
before = posts['created_utc'].values[-1]

Create a variable that holds the earliest (i.e., most distant) timestamp.

Use the last date from the previous step to get the **next** 50 posts that occur **before** your variable. Save it as another DataFrame.

In [71]:
# Get the *next 50* posts from */r/news* containing the word *"election"*, created before `before`
params = {
    'subreddit': 'news',
    'q': 'election',
    'size': 50,
    'before': before
}

res = requests.get(url, params=params)

In [72]:
data = res.json()

In [75]:
posts2 = pd.DataFrame(data['data'])[['title', 'selftext', 'subreddit', 'created_utc']]

Concatenate the two DataFrames together.

In [77]:
pd.concat([posts, posts2], ignore_index=True)

Unnamed: 0,title,selftext,subreddit,created_utc
0,Election round up: PM Narendra Modi meets seve...,,news,1645209330
1,Amazon agrees nyc union election terms setting...,,news,1645067708
2,Latest plan by the election crybabies at the A...,,news,1645026613
3,Clerk who espoused election fraud conspiracies...,,news,1645012775
4,UP Phase 2 Election 2022 Highlights,,news,1644936927
...,...,...,...,...
95,"Biden all but concedes defeat on voting, elect...",,news,1642109434
96,2023 Election: All I'm seeing are geriatrics p...,,news,1642103717
97,MyPillow CEO Mike Lindell says he has evidence...,,news,1642081367
98,"INEC to deploy 12,000 ad hoc personnel for FCT...",,news,1642071154


In [82]:
int(time.time())

1645214456
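
The values in `created_utc` and the result of `time.time()` are both Unix timestamps: seconds since 1970-01-01 UTC. A small sketch for making them readable with the standard library follows; the helper name `to_utc` is hypothetical.

```python
from datetime import datetime, timezone

def to_utc(timestamp):
    """Convert a Unix timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)

# The time.time() value shown above.
print(to_utc(1645214456).isoformat())  # 2022-02-18T20:00:56+00:00

# The created_utc of the most recent post in the table above.
print(to_utc(1645209330).isoformat())  # 2022-02-18T18:35:30+00:00
```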

In [85]:
dfs = []

subreddits = ['boardgames', 'philosophy']

for subreddit in subreddits:
    
    # set before to be current time
    
    for i in range(1):
        print(subreddit, i)
        # create params: before, subreddit, size
        
        # use the requests to get the response
        
        # turn the response into JSON
        
        # turn the JSON into a DataFrame
        
        # add posts DataFrame to dfs
        
        # set before to be the timestamp of the last post
        
        time.sleep(3)
# concat all dfs

boardgames 0
boardgames 1
boardgames 2
boardgames 3
boardgames 4
boardgames 5
boardgames 6
boardgames 7
boardgames 8
boardgames 9
philosophy 0
philosophy 1


KeyboardInterrupt: 
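
One way the commented steps in the loop could be filled in is sketched below. It is not the only solution: the function name `collect_posts` and the `fetch_page` argument are hypothetical, and `fetch_page(params)` is assumed to return a list of post dictionaries that each contain `created_utc`, newest first. Passing the fetcher in keeps the sketch runnable without a network connection.

```python
import time

def collect_posts(subreddits, fetch_page, pages=2, size=50, pause=3.0, start=None):
    """Page backwards through each subreddit and return all posts as one list."""
    all_posts = []
    for subreddit in subreddits:
        # Start from the current time (or a supplied timestamp) for each subreddit.
        before = int(time.time()) if start is None else start
        for _ in range(pages):
            params = {'subreddit': subreddit, 'size': size, 'before': before}
            posts = fetch_page(params)
            if not posts:
                break  # nothing older is available
            all_posts.extend(posts)
            # The last post is the oldest, so the next page starts before it.
            before = posts[-1]['created_utc']
            time.sleep(pause)  # throttle between requests
    return all_posts

# With the notebook's objects (live requests, so left commented out):
# fetch_page = lambda params: requests.get(url, params=params).json()['data']
# all_posts = collect_posts(['boardgames', 'philosophy'], fetch_page)
# df = pd.DataFrame(all_posts)
```

Collecting plain dictionaries and building a single DataFrame at the end avoids a separate concatenation step, though appending one DataFrame per page to `dfs` and calling `pd.concat` works just as well.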