# Accessing data from an API

This notebook has two simple exercises demonstrating how to extract data from an [Application Programming Interface](https://en.wikipedia.org/wiki/API). An API is a way for computers or applications to interact with one another. In our case, we'll ask for data, and the API will return it. These systems can be complicated, but most of the ones we might use in data journalism are relatively simple.

#### Import our data tools

In [8]:
%load_ext lab_black

The lab_black extension is already loaded. To reload it, use:
  %reload_ext lab_black


In [13]:
import pandas as pd
import requests

In [14]:
pd.options.display.max_columns = 100
pd.options.display.max_rows = 1000
pd.options.display.max_colwidth = None

---

## Cat facts!

[Read the documentation](https://alexwohlbruck.github.io/cat-facts/docs/)

#### Get random facts

In [20]:
cat_df = pd.read_json(
    "https://cat-fact.herokuapp.com/facts/random?animal_type=cat&amount=500"
)
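
The part of the URL after the `?` carries the request's parameters. As a minimal sketch, the same URL can be assembled from a dictionary with the standard library's `urlencode`, which makes the parameters easier to change. The variable names here are illustrative and not part of the original notebook.

```python
from urllib.parse import urlencode

# Base endpoint and parameters taken from the URL used above
base_url = "https://cat-fact.herokuapp.com/facts/random"
params = {"animal_type": "cat", "amount": 500}

# urlencode turns the dictionary into "key=value" pairs joined by "&"
url = f"{base_url}?{urlencode(params)}"
print(url)
```

The resulting string can be passed to `pd.read_json` in place of the hand-written URL.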

#### First five rows

In [22]:
cat_df.head()

Unnamed: 0,status,_id,user,text,source,__v,updatedAt,type,createdAt,deleted,used,sendDate
0,"{'verified': True, 'sentCount': 1}",596f4f50a8d3440020e2d77d,596ea14ed4d9720020401f7b,"Due to the controversy, though loved by most, the Kashmir is overlooked by many cat fanciers.",user,0.0,2020-08-23T20:20:01.611Z,cat,2018-04-04T20:20:01.991Z,False,0.0,
1,"{'verified': True, 'sentCount': 1}",5c3551d48e0b8d00148d45e4,5a9ac18c7478810ea6c06381,"The Bengal is the result of crossbreeding between domestic cats and Asian leopard cats, and its name is derived from the scientific name for the Asian leopard cat (Felis bengalensis).",user,0.0,2020-08-23T20:20:01.611Z,cat,2019-01-09T01:43:48.303Z,False,0.0,
2,"{'verified': None, 'sentCount': 0}",61c748a00c7a44ab650af810,61c7471b0c7a44ab650af755,Cats are lasy.,,0.0,2021-12-25T16:36:48.144Z,cat,2021-12-25T16:36:48.144Z,False,,
3,"{'verified': True, 'sentCount': 1}",591f98b44c120c1529b375f2,5a9ac18c7478810ea6c06381,"When well treated, a cat can live twenty or more years but the average life span of a domestic cat is 14 years.",api,0.0,2020-08-23T20:20:01.611Z,cat,2018-01-04T01:10:54.673Z,False,0.0,
4,"{'verified': True, 'sentCount': 1}",591f98c5d1f17a153828aa0b,5a9ac18c7478810ea6c06381,Domestic cats purr both when inhaling and when exhaling.,api,0.0,2020-08-23T20:20:01.611Z,cat,2018-01-04T01:10:54.673Z,False,0.0,


#### How many records? 

In [None]:
len(cat_df)

#### What's the first fact?

In [77]:
cat_df["text"][0]

'Due to the controversy, though loved by most, the Kashmir is overlooked by many cat fanciers.'

#### Extract the nested JSON inside the `status` column

In [78]:
cat_df[["verified", "sentCount", "feedback"]] = pd.json_normalize(cat_df["status"])
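
To see what `json_normalize` does on its own, here is a small sketch on two rows shaped like the API's records, with values copied from the output above. The names `sample_df`, `status_df` and `flat_df` are made up for illustration.

```python
import pandas as pd

# Two sample rows shaped like the API's records
sample_df = pd.DataFrame(
    {
        "text": [
            "Domestic cats purr both when inhaling and when exhaling.",
            "Cats are lasy.",
        ],
        "status": [
            {"verified": True, "sentCount": 1},
            {"verified": None, "sentCount": 0},
        ],
    }
)

# Expand each dictionary into its own columns, one column per key
status_df = pd.json_normalize(sample_df["status"].tolist())

# Both frames share the default 0..n-1 index, so join lines the rows up
flat_df = sample_df.drop(columns="status").join(status_df)
print(flat_df)
```

Joining on the index keeps the original column names from the nested data, so there is no need to list them by hand.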

#### Slim the dataframe

In [80]:
cat_df_slim = cat_df[["_id", "text", "createdAt", "verified"]].copy()

In [81]:
cat_df_slim.head()

Unnamed: 0,_id,text,createdAt,verified
0,596f4f50a8d3440020e2d77d,"Due to the controversy, though loved by most, the Kashmir is overlooked by many cat fanciers.",2018-04-04T20:20:01.991Z,True
1,5c3551d48e0b8d00148d45e4,"The Bengal is the result of crossbreeding between domestic cats and Asian leopard cats, and its name is derived from the scientific name for the Asian leopard cat (Felis bengalensis).",2019-01-09T01:43:48.303Z,True
2,61c748a00c7a44ab650af810,Cats are lasy.,2021-12-25T16:36:48.144Z,
3,591f98b44c120c1529b375f2,"When well treated, a cat can live twenty or more years but the average life span of a domestic cat is 14 years.",2018-01-04T01:10:54.673Z,True
4,591f98c5d1f17a153828aa0b,Domestic cats purr both when inhaling and when exhaling.,2018-01-04T01:10:54.673Z,True


#### Just the verified facts, pls

In [85]:
verified_df = cat_df_slim[cat_df_slim["verified"] == True]

In [86]:
len(verified_df)

287
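
Unverified facts can carry a missing value rather than `False`, as row 2 of the slimmed table shows. A small sketch with made-up sample values suggests how the `== True` comparison treats them: missing entries compare as `False`, so they drop out of the filtered frame.

```python
import pandas as pd

# Illustrative values only: one verified, one missing, one explicitly unverified
sample = pd.DataFrame(
    {"text": ["fact a", "fact b", "fact c"], "verified": [True, None, False]}
)

# The comparison yields a boolean mask; None and False both become False
mask = sample["verified"] == True
print(mask.tolist())

# Indexing with the mask keeps only the rows where it is True
print(sample[mask])
```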

#### Find facts that mention specific words

In [56]:
verified_df[verified_df["text"].str.lower().str.contains("dog|food|toys")]

Unnamed: 0,_id,text,createdAt,verified
10,5c35531e8e0b8d00148d45e8,"If you grow your own catnip, here's how to prepare it for kitty's enjoyment: Cut several stalks of the plant from the base. Hang them upside down in a dark and dry room for several weeks. Then cut the catnip into small pieces, rub some on your cat's favorite toys or scratching post, and let the games begin!",2019-01-09T01:49:18.342Z,True
40,591f98108dec2e14e3c20b0f,Cats have been domesticated for half as long as dogs have been.,2018-01-04T01:10:54.673Z,True
71,591f98783b90f7150a19c1cc,British cat owners spend roughly 550 million pounds yearly on cat food.,2018-04-15T20:20:02.691Z,True
95,591f98703b90f7150a19c16e,"On February 28, 1 980 a female cat climbed 70 feet up the sheer pebble-dash outside wall of a block of flats in Bradford, Yorkshire and took refuge in the roof space. She had been frightened by a dog.",2018-01-04T01:10:54.673Z,True
139,5887e1d85c873e0011036889,Cats make about 100 different sounds. Dogs make only about 10.,2018-01-15T21:20:00.003Z,True
162,591f98803b90f7150a19c238,In 1987 cats overtook dogs as the number one pet in America.,2018-01-04T01:10:54.673Z,True
169,58e00a850aac31001185ed1a,"Cats have a longer-term memory than dogs, especially when they learn by actually doing rather than simply seeing.",2018-02-18T21:20:03.044Z,True
195,58e00a090aac31001185ed16,Cats make more than 100 different sounds whereas dogs make around 10.,2018-02-11T21:20:03.745Z,True
199,591f98883b90f7150a19c281,Cats' hearing is much more sensitive than humans and dogs.,2018-04-23T20:20:02.517Z,True
209,591f98803b90f7150a19c229,"In an average year, cat owners in the United States spend over $2 billion on cat food.",2018-01-04T01:10:54.673Z,True
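
`str.contains` treats its pattern as a regular expression by default, so `|` means "or", and it matches substrings (`dog` also catches `dogs`). As a hedged alternative to lowercasing the text first, the `case=False` argument does the same job. The sample sentences below are made up for illustration.

```python
import pandas as pd

texts = pd.Series(
    ["Cats and DOGS get along", "Cat food is pricey", "Cats sleep a lot"]
)

# case=False ignores capitalization; na=False treats missing text as "no match"
mask = texts.str.contains("dog|food|toys", case=False, na=False)
print(texts[mask].tolist())
```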


#### Sort the verified facts, newest first

In [87]:
verified_df.sort_values("createdAt", ascending=False).head()

Unnamed: 0,_id,text,createdAt,verified
381,5d9d4ae168a764001553b388,Cats conserve energy by sleeping for an average of 13 to 14 hours a day.,2019-10-09T02:50:09.633Z,True
147,5d9c556168a764001553b382,"A cat has 244 bones in its entire body—even more than a human, who only has 206 bones.",2019-10-08T09:22:41.032Z,True
6,5d38be200f1c57001592f157,"The Turkish Van is often called the ""swimming cat"" because they are naturals in the water, thanks in part to their uniquely textured, water-resistant coat.",2019-07-24T20:22:56.148Z,True
446,5d38bd750f1c57001592f155,"Legend holds that a goddess rewarded a temple cat's piety by turning the cat's eyes blue and his coat golden, thus creating the first Birman cat.",2019-07-24T20:20:05.522Z,True
34,5d38bcc00f1c57001592f153,"The irresistable and cuddly Ragamuffin is the result of crossbreeding Ragdoll cats with Persians, Himalayans, and other larger longhaired breeds.",2019-07-24T20:17:04.878Z,True


In [None]:
# Oldest first: sort ascending on the creation timestamp
verified_df.sort_values("createdAt", ascending=True).head()

#### Most recent verified fact?

In [None]:
verified_df.sort_values("createdAt", ascending=False).head()
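
The `createdAt` values are ISO 8601 strings, which happen to sort correctly as text. Converting them to real timestamps is a sturdier option if the format ever varies. A minimal sketch, using three rows copied from the output above and a hypothetical `created` column name:

```python
import pandas as pd

sample = pd.DataFrame(
    {
        "_id": [
            "5c3551d48e0b8d00148d45e4",
            "61c748a00c7a44ab650af810",
            "596f4f50a8d3440020e2d77d",
        ],
        "createdAt": [
            "2019-01-09T01:43:48.303Z",
            "2021-12-25T16:36:48.144Z",
            "2018-04-04T20:20:01.991Z",
        ],
    }
)

# Parse the strings into timezone-aware timestamps ("created" is a made-up name)
sample["created"] = pd.to_datetime(sample["createdAt"], utc=True)

# idxmin/idxmax return the index labels of the earliest and latest timestamps
oldest = sample.loc[sample["created"].idxmin()]
newest = sample.loc[sample["created"].idxmax()]
print(oldest["_id"], newest["_id"])
```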

---

## Dad jokes!

[Read the documentation](https://icanhazdadjoke.com/api#fetch-a-random-dad-joke)

#### Set the request headers so the API knows which format to return

In [92]:
headers = {
    "Accept": "application/json",
}

#### Get a response from the API in the format we requested

In [93]:
response = requests.get("https://icanhazdadjoke.com/search?page=1", headers=headers)
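
To see what is actually sent, `requests` can build a request without sending it. This sketch makes no network call. It passes the page number through the `params` argument, an alternative to writing it into the URL by hand, and shows that the `Accept` header travels with the request.

```python
import requests

headers = {"Accept": "application/json"}

# Build the request without sending it, so nothing goes over the network
prepared = requests.Request(
    "GET",
    "https://icanhazdadjoke.com/search",
    params={"page": 1},
    headers=headers,
).prepare()

print(prepared.url)  # the full URL, query string included
print(prepared.headers["Accept"])  # the format we are asking for
```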

#### What comes back?

In [94]:
response.json()

{'current_page': 1,
 'limit': 20,
 'next_page': 2,
 'previous_page': 1,
 'results': [{'id': '0189hNRf2g',
   'joke': "I'm tired of following my dreams. I'm just going to ask them where they are going and meet up with them later."},
  {'id': '08EQZ8EQukb',
   'joke': "Did you hear about the guy whose whole left side was cut off? He's all right now."},
  {'id': '08xHQCdx5Ed',
   'joke': 'Why didn’t the skeleton cross the road? Because he had no guts.'},
  {'id': '0DQKB51oGlb',
   'joke': "What did one nut say as he chased another nut?  I'm a cashew!"},
  {'id': '0DtrrOZDlyd',
   'joke': "Chances are if you' ve seen one shopping center, you've seen a mall."},
  {'id': '0LuXvkq4Muc',
   'joke': "I knew I shouldn't steal a mixer from work, but it was a whisk I was willing to take."},
  {'id': '0ga2EdN7prc',
   'joke': 'How come the stadium got hot after the game? Because all of the fans left.'},
  {'id': '0oO71TSv4Ed',
   'joke': 'Why was it called the dark ages? Because of all the knights.

#### What's the limit per API call? 

In [98]:
response.json()["limit"]

20

#### How many total jokes? 

In [100]:
response.json()["total_jokes"]

649

#### How many pages of 20 jokes? 
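
A quick way to answer this is to divide the total by the page size and round up. The sketch below hard-codes the two numbers reported by the API response in this notebook; the variable names are illustrative.

```python
import math

total_jokes = 649  # the "total_jokes" value reported by the API
limit = 20  # the "limit" value in the response

# Round up, because a partly filled last page still counts as a page
total_pages = math.ceil(total_jokes / limit)
print(total_pages)  # 33
```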

#### Ok, just the jokes

In [99]:
pd.DataFrame(response.json()["results"])

Unnamed: 0,id,joke
0,0189hNRf2g,I'm tired of following my dreams. I'm just going to ask them where they are going and meet up with them later.
1,08EQZ8EQukb,Did you hear about the guy whose whole left side was cut off? He's all right now.
2,08xHQCdx5Ed,Why didn’t the skeleton cross the road? Because he had no guts.
3,0DQKB51oGlb,What did one nut say as he chased another nut? I'm a cashew!
4,0DtrrOZDlyd,"Chances are if you' ve seen one shopping center, you've seen a mall."
5,0LuXvkq4Muc,"I knew I shouldn't steal a mixer from work, but it was a whisk I was willing to take."
6,0ga2EdN7prc,How come the stadium got hot after the game? Because all of the fans left.
7,0oO71TSv4Ed,Why was it called the dark ages? Because of all the knights.
8,0oz51ozk3ob,A steak pun is a rare medium well done.
9,0ozAXv4Mmjb,Why did the tomato blush? Because it saw the salad dressing.


#### How many records?

In [None]:
len(response.json()["results"])

#### Get all the jokes with a loop

In [None]:
data_pages = []

# Pages are numbered from 1; 649 jokes at 20 per page is 33 pages
for r in range(1, 34):
    data_pages.append(
        pd.DataFrame(
            requests.get(
                f"https://icanhazdadjoke.com/search?page={r}", headers=headers
            ).json()["results"]
        )
    )

jokes_df = pd.concat(data_pages).reset_index(drop=True)
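
A hard-coded page count will go stale as jokes are added. One hedged alternative is to read `total_jokes` and `limit` from the first response and work out the page count from them. In the sketch below, `fetch_page` is a hypothetical stand-in that fakes the API with 45 numbered jokes so the logic can run offline; in the notebook it would wrap the `requests.get(...).json()` call.

```python
import math

FAKE_JOKES = [{"id": str(i), "joke": f"joke {i}"} for i in range(45)]


def fetch_page(page, limit=20):
    """Hypothetical stand-in for the API: returns one page of fake jokes."""
    start = (page - 1) * limit
    return {
        "current_page": page,
        "limit": limit,
        "total_jokes": len(FAKE_JOKES),
        "results": FAKE_JOKES[start : start + limit],
    }


def fetch_all(fetch):
    """Collect every result, using the first page to work out the page count."""
    first = fetch(1)
    pages = math.ceil(first["total_jokes"] / first["limit"])
    results = list(first["results"])
    for page in range(2, pages + 1):
        results.extend(fetch(page)["results"])
    return results


all_jokes = fetch_all(fetch_page)
print(len(all_jokes))  # 45
```

Passing the fetching function in as an argument keeps the paging logic separate from the network call, which also makes it easy to test.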

#### How many? 

In [None]:
len(jokes_df)

#### Export 

In [None]:
jokes_df.to_csv("../../data/processed/dad-jokes.csv", index=False)
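
The export will fail if the target folder does not exist yet. As a minimal sketch, `pathlib` can create the folder first. A temporary directory stands in for the project's data folder here, and the one-row frame reuses a joke from the output above.

```python
import tempfile
from pathlib import Path

import pandas as pd

sample_df = pd.DataFrame(
    {"id": ["0oz51ozk3ob"], "joke": ["A steak pun is a rare medium well done."]}
)

# A temporary folder stands in for the project's data directory
out_path = Path(tempfile.mkdtemp()) / "processed" / "dad-jokes.csv"

# Create any missing parent folders; exist_ok avoids an error if they exist
out_path.parent.mkdir(parents=True, exist_ok=True)

sample_df.to_csv(out_path, index=False)
print(pd.read_csv(out_path))
```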