# Accessing data from an API

This notebook has two simple exercises demonstrating how to extract data from an [Application Programming Interface](https://en.wikipedia.org/wiki/API). An API is an interface that lets computers or applications interact with one another. In our case, we'll ask for data, and the API will return it. These systems can be complicated, but most of those we might use in data journalism are relatively simple.

#### Import our data tools

In [1]:
%load_ext lab_black

In [2]:
import pandas as pd
import requests

In [3]:
pd.options.display.max_columns = 100
pd.options.display.max_rows = 1000
pd.options.display.max_colwidth = None

---

## Cat facts!

[Read the documentation](https://alexwohlbruck.github.io/cat-facts/docs/)

#### Get random facts

In [4]:
cat_df = pd.read_json(
    "https://cat-fact.herokuapp.com/facts/random?animal_type=cat&amount=500"
)
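The URL in the cell above carries its options in the query string (`animal_type` and `amount`). As a minimal sketch, the same URL can be assembled from a dictionary with the standard library, which makes the options easier to change. The names `base_url` and `params` are illustrative and not part of the original notebook.

```python
from urllib.parse import urlencode

# Endpoint and option values copied from the cell above.
base_url = "https://cat-fact.herokuapp.com/facts/random"
params = {"animal_type": "cat", "amount": 500}

# urlencode joins the dictionary into "key=value" pairs separated by "&".
url = f"{base_url}?{urlencode(params)}"
print(url)
```

The resulting string could be passed to `pd.read_json` in place of the hard-coded URL.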

#### First five rows

In [5]:
cat_df.head()

| | status | _id | user | text | type | deleted | createdAt | updatedAt | __v | source | used | sendDate |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | {'verified': None, 'sentCount': 0} | 6239fefb6f13e62a722fcfdd | 61b8a19266b26cede617e5a2 | Ороролом. | cat | False | 2022-03-22T16:53:15.424Z | 2022-03-22T16:53:15.424Z | 0.0 | | | |
| 1 | {'verified': True, 'sentCount': 1} | 5b1b3f48841d9700146158cb | 5a9ac18c7478810ea6c06381 | At night, Disneyland is overrun by cats. The theme park feeds them and takes care of them though, because they keep the rodent population in check. | cat | False | 2018-06-21T20:20:04.352Z | 2020-08-23T20:20:01.611Z | 0.0 | user | 0.0 | |
| 2 | {'verified': None, 'sentCount': 0} | 61d350b7403b4002d3795dbc | 61b8566766b26cede617b4ef | 222222222222222. | cat | False | 2022-01-03T19:38:31.660Z | 2022-01-03T19:38:31.660Z | 0.0 | | | |
| 3 | {'verified': True, 'sentCount': 1} | 591f98703b90f7150a19c179 | 5a9ac18c7478810ea6c06381 | A cat's field of vision is about 200 degrees. | cat | False | 2018-01-04T01:10:54.673Z | 2020-08-23T20:20:01.611Z | 0.0 | api | 0.0 | |
| 4 | {'verified': None, 'sentCount': 0} | 61b8308866b26cede617a45e | 61b82e5766b26cede617a314 | While us humans have 206 bones, cats on average have 244. It ranges between 230-250 depending on how long a cat’s tail is and how many toes the cat has. | cat | False | 2021-12-14T05:50:00.012Z | 2021-12-14T05:50:00.012Z | 0.0 | | | |


#### How many records? 

In [6]:
len(cat_df)

500

#### What's the first fact?

In [7]:
cat_df["text"][0]

'Ороролом.'

#### Extract the nested JSON inside the `status` column

In [8]:
# Select the wanted keys by name; "feedback" may be missing from a given sample.
status_df = pd.json_normalize(cat_df["status"].tolist())
cat_df[["verified", "sentCount"]] = status_df[["verified", "sentCount"]]
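To see what `pd.json_normalize` returns for a column of dictionaries, here is a minimal sketch on three made-up rows shaped like the `status` field. Assigning the result to a list of column names only works when the list length matches the number of columns produced, so the sketch selects the wanted keys by name instead. The sample values, including the `feedback` key on the third row, are assumptions for illustration.

```python
import pandas as pd

# Three made-up rows shaped like the API's `status` field; only the third has "feedback".
sample = pd.DataFrame(
    {
        "text": ["fact a", "fact b", "fact c"],
        "status": [
            {"verified": True, "sentCount": 1},
            {"verified": None, "sentCount": 0},
            {"verified": True, "sentCount": 1, "feedback": ""},
        ],
    }
)

# One column per key; rows that lack a key get a missing value in that column.
status_df = pd.json_normalize(sample["status"].tolist())
print(status_df.columns.tolist())

# Picking keys by name keeps the assignment independent of how many columns came back.
sample[["verified", "sentCount"]] = status_df[["verified", "sentCount"]]
print(sample[["text", "verified", "sentCount"]])
```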

In [None]:
cat_df.head()

#### Slim the dataframe

In [None]:
cat_df_slim = cat_df[["_id", "sentCount", "text", "createdAt", "verified"]].copy()

In [None]:
cat_df_slim

#### Just the verified facts, pls

In [None]:
verified_df = cat_df_slim[cat_df_slim["verified"] == True].copy()

#### Find facts that mention specific words?

In [None]:
len(verified_df)

In [None]:
verified_df[verified_df["text"].str.lower().str.contains("dog|food|toys")]
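The filter above lowercases the text and then searches for any of three words, with `|` acting as "or" in the pattern. As a minimal sketch on made-up sentences, `str.contains` can do the same in one step with `case=False`, and `na=False` treats missing text as a non-match. The `facts` frame and its contents are illustrative assumptions.

```python
import pandas as pd

# Made-up sentences; the last row has no text at all.
facts = pd.DataFrame(
    {
        "text": [
            "Dogs and cats both dream.",
            "Cat FOOD can be expensive.",
            "Cats sleep most of the day.",
            None,
        ]
    }
)

# "|" means "or"; case=False ignores capitalization; na=False skips missing text.
mask = facts["text"].str.contains("dog|food|toys", case=False, na=False)
matches = facts[mask]
print(matches)
```

Note that the pattern matches substrings, so "dog" also matches "Dogs".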

In [None]:
verified_df.head()

#### Find the oldest fact? 

In [None]:
verified_df.sort_values("createdAt", ascending=True).head()

In [None]:
verified_df.dtypes

In [None]:
verified_df["date"] = pd.to_datetime(verified_df["createdAt"]).dt.strftime("%Y-%m-%d")

#### Most recent verified fact?

In [None]:
verified_df.sort_values("date", ascending=False).head()
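The `createdAt` values arrive as text. A minimal sketch, using three timestamps copied from the sample output earlier in this notebook, shows how converting them to real datetimes lets `idxmin` and `idxmax` pick out the oldest and newest rows directly. The `facts` frame, the `created` column, and the placeholder fact labels are illustrative assumptions.

```python
import pandas as pd

# Timestamps copied from the sample output; the text labels are placeholders.
facts = pd.DataFrame(
    {
        "text": ["fact a", "fact b", "fact c"],
        "createdAt": [
            "2018-06-21T20:20:04.352Z",
            "2018-01-04T01:10:54.673Z",
            "2021-12-14T05:50:00.012Z",
        ],
    }
)

# Parse the ISO 8601 strings into timezone-aware datetimes.
facts["created"] = pd.to_datetime(facts["createdAt"])

# idxmin / idxmax return the row label of the earliest / latest timestamp.
oldest = facts.loc[facts["created"].idxmin()]
newest = facts.loc[facts["created"].idxmax()]
print(oldest["text"], newest["text"])
```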

---

## Dad jokes!

[Read the documentation](https://icanhazdadjoke.com/api#fetch-a-random-dad-joke)

#### Give the request headers so the API knows how to answer it

In [11]:
headers = {
    "Accept": "application/json",
}

#### Get a response from the API in the format we requested

In [12]:
response = requests.get("https://icanhazdadjoke.com/search?page=1", headers=headers)

#### What comes back?

In [13]:
response.json()

{'current_page': 1,
 'limit': 20,
 'next_page': 2,
 'previous_page': 1,
 'results': [{'id': '0189hNRf2g',
   'joke': "I'm tired of following my dreams. I'm just going to ask them where they are going and meet up with them later."},
  {'id': '08EQZ8EQukb',
   'joke': "Did you hear about the guy whose whole left side was cut off? He's all right now."},
  {'id': '08xHQCdx5Ed',
   'joke': 'Why didn’t the skeleton cross the road? Because he had no guts.'},
  {'id': '0DQKB51oGlb',
   'joke': "What did one nut say as he chased another nut?  I'm a cashew!"},
  {'id': '0DtrrOZDlyd',
   'joke': "Chances are if you' ve seen one shopping center, you've seen a mall."},
  {'id': '0LuXvkq4Muc',
   'joke': "I knew I shouldn't steal a mixer from work, but it was a whisk I was willing to take."},
  {'id': '0ga2EdN7prc',
   'joke': 'How come the stadium got hot after the game? Because all of the fans left.'},
  {'id': '0oO71TSv4Ed',
   'joke': 'Why was it called the dark ages? Because of all the knights.

#### What's the limit per API call? 

In [14]:
response.json()["results"]

[{'id': '0189hNRf2g',
  'joke': "I'm tired of following my dreams. I'm just going to ask them where they are going and meet up with them later."},
 {'id': '08EQZ8EQukb',
  'joke': "Did you hear about the guy whose whole left side was cut off? He's all right now."},
 {'id': '08xHQCdx5Ed',
  'joke': 'Why didn’t the skeleton cross the road? Because he had no guts.'},
 {'id': '0DQKB51oGlb',
  'joke': "What did one nut say as he chased another nut?  I'm a cashew!"},
 {'id': '0DtrrOZDlyd',
  'joke': "Chances are if you' ve seen one shopping center, you've seen a mall."},
 {'id': '0LuXvkq4Muc',
  'joke': "I knew I shouldn't steal a mixer from work, but it was a whisk I was willing to take."},
 {'id': '0ga2EdN7prc',
  'joke': 'How come the stadium got hot after the game? Because all of the fans left.'},
 {'id': '0oO71TSv4Ed',
  'joke': 'Why was it called the dark ages? Because of all the knights. '},
 {'id': '0oz51ozk3ob', 'joke': 'A steak pun is a rare medium well done.'},
 {'id': '0ozAXv4Mmj

#### How many total jokes? 

In [18]:
response.json()["total_jokes"]

649

#### How many pages of 20 jokes? 
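A minimal sketch of the arithmetic, using the two numbers this notebook has already seen: 649 total jokes and a limit of 20 per call. Rounding up accounts for the final, partly filled page. The API response may also report a page count directly, which is worth checking with `response.json().keys()`.

```python
import math

total_jokes = 649  # from response.json()["total_jokes"]
limit = 20  # jokes returned per call, from the "limit" key

# Round up so the last, partly filled page is counted.
total_pages = math.ceil(total_jokes / limit)
print(total_pages)  # 33
```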

#### Ok, just the jokes

In [16]:
jokes_df = pd.DataFrame(response.json()["results"])

#### How many records?

In [17]:
len(jokes_df)

20

#### Get all the jokes with a loop
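One possible way to collect every page is sketched below. It assumes the API repeats the current page number in `next_page` once the last page is reached, and it also stops on an empty page or after `max_pages` requests as a safety net. The function names, the `pause` between calls, and the `max_pages` cap are illustrative choices, not part of the original notebook.

```python
import time


def fetch_page(page):
    """Request one page of search results and return the parsed JSON."""
    import requests  # imported here so the looping logic can be tried without it

    response = requests.get(
        "https://icanhazdadjoke.com/search",
        params={"page": page},
        headers={"Accept": "application/json"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


def fetch_all_jokes(get_page=fetch_page, pause=0.5, max_pages=100):
    """Collect the `results` of each page, starting at page 1."""
    jokes = []
    page = 1
    while page <= max_pages:
        payload = get_page(page)
        jokes.extend(payload["results"])
        # Assumption: on the last page, next_page does not move past the current page.
        next_page = payload.get("next_page", page)
        if not payload["results"] or next_page <= page:
            break
        page = next_page
        time.sleep(pause)  # be polite to the API between calls
    return jokes
```

With network access, `jokes_df = pd.DataFrame(fetch_all_jokes())` would build one dataframe from all pages. Passing a different `get_page` function makes it possible to try the loop on canned data first.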

#### How many? 
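A quick sanity check after combining pages is to compare the row count with the number of distinct `id` values, since a page fetched twice would inflate the total. The sketch below uses made-up records in place of the real combined results.

```python
import pandas as pd

# Made-up records standing in for the combined results of every page.
records = [
    {"id": "a1", "joke": "first"},
    {"id": "b2", "joke": "second"},
    {"id": "a1", "joke": "first"},  # a repeat, as if one page were fetched twice
]
all_jokes_df = pd.DataFrame(records)

total_rows = len(all_jokes_df)
unique_jokes = all_jokes_df["id"].nunique()
print(total_rows, unique_jokes)

# Keep one row per id before comparing against the API's reported total.
deduped_df = all_jokes_df.drop_duplicates(subset="id")
```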

#### Export 
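A minimal sketch of writing a jokes dataframe to CSV and reading it back. The two records are copied from the API output shown earlier; the file name `dad_jokes.csv` and the temporary folder are hypothetical choices, so swap in whatever path the project uses.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Two records copied from the API output shown earlier in this notebook.
jokes_df = pd.DataFrame(
    [
        {"id": "0oz51ozk3ob", "joke": "A steak pun is a rare medium well done."},
        {
            "id": "0ga2EdN7prc",
            "joke": "How come the stadium got hot after the game? Because all of the fans left.",
        },
    ]
)

# Hypothetical output location; a temporary folder avoids cluttering the project.
out_path = Path(tempfile.gettempdir()) / "dad_jokes.csv"

# index=False leaves the 0, 1, 2... row labels out of the file.
jokes_df.to_csv(out_path, index=False)

# Read the file back to confirm the round trip.
check_df = pd.read_csv(out_path)
print(check_df)
```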