# Consuming REST APIs

We'll use the [`requests`](http://www.python-requests.org/) library to interact with the [Recipe Puppy](http://www.recipepuppy.com/) search engine.
[This page](http://www.recipepuppy.com/about/api/) provides some information on their API.

In [1]:
import json
import pandas as pd
import requests

In [2]:
BASE_URL = 'http://www.recipepuppy.com/api/'

Let's use `requests.get()` to retrieve the first page of results for pizza recipes.

In [23]:
response = requests.get(BASE_URL, params={'q': 'pizza'})

The `params` dictionary is encoded into the URL's query string, so this request asks the API for recipes matching `pizza`.
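To make the role of `params` concrete, here is a minimal sketch that builds the same kind of URL with the standard library. `requests` does this encoding internally; the `build_url` helper is hypothetical and exists only for illustration, and the `p` parameter is the page number used later in this notebook.

```python
from urllib.parse import urlencode

BASE_URL = 'http://www.recipepuppy.com/api/'


def build_url(base_url, params):
    """Append URL-encoded query parameters to base_url (illustrative helper)."""
    return base_url + '?' + urlencode(params)


# A single search term, as in the request above.
first_page_url = build_url(BASE_URL, {'q': 'pizza'})

# Spaces are encoded as '+', and several parameters are joined with '&'.
second_page_url = build_url(BASE_URL, {'q': 'deep dish pizza', 'p': 2})

print(first_page_url)
print(second_page_url)
```

With `requests`, the final URL of a real request can be inspected through `response.url`.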

If everything worked correctly, the HTTP status code of the `response` should be `200`.
You can find a list of status codes [on Wikipedia](https://en.wikipedia.org/wiki/List_of_HTTP_status_codes).

In [24]:
response.status_code

200

A status code of `200` means the request succeeded.
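Status codes fall into broad classes by their first digit. As a rough sketch, the standard library's `http.HTTPStatus` can translate a code into its reason phrase; the `describe_status` helper below is a hypothetical name introduced for this example.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short human-readable description of an HTTP status code."""
    phrase = HTTPStatus(code).phrase  # raises ValueError for unknown codes
    if 200 <= code < 300:
        category = 'success'
    elif 300 <= code < 400:
        category = 'redirection'
    elif 400 <= code < 500:
        category = 'client error'
    elif 500 <= code < 600:
        category = 'server error'
    else:
        category = 'informational'
    return '{} {} ({})'.format(code, phrase, category)


for code in (200, 404, 503):
    print(describe_status(code))
```

In practice, `response.raise_for_status()` is usually enough: it raises an exception for 4xx and 5xx responses and does nothing otherwise.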

The response body is available in the `content` attribute of `response` as a `bytes` object.

In [25]:
response.content

b'{"title":"Recipe Puppy","version":0.1,"href":"http:\\/\\/www.recipepuppy.com\\/","results":[{"title":"BBQ Chicken Pizza","href":"http:\\/\\/www.recipezaar.com\\/BBQ-Chicken-Pizza-144689","ingredients":"chicken, brown sugar, cayenne, garlic salt, green pepper, honey, italian cheese blend, salad dressing, margarine, molasses, onions, barbecue sauce, black pepper, prepared pizza crust, provolone cheese, ranch dressing, salt","thumbnail":""},{"title":"Basic Chicago-style Pizza Recipe","href":"http:\\/\\/www.grouprecipes.com\\/65487\\/basic-chicago-style-pizza.html","ingredients":"pizza, vegetable oil, cornmeal, water, flour, sausage, provolone cheese, olive oil, tomato, yeast, pepperoni, salt, salt, sugar, basil, oregano","thumbnail":""},{"title":"BBQ\'d Cheeseburger Pizza","href":"http:\\/\\/www.recipezaar.com\\/BBQd-Cheeseburger-Pizza-299376","ingredients":"barbecue sauce, cheddar cheese, onions, tomato, dill pickle, dill relish, parsley, french dressing, garlic powder, ground beef, le

We can use the `json.loads()` function to parse this JSON blob into the corresponding Python representation.

In [26]:
json.loads(response.content)

{'title': 'Recipe Puppy',
 'version': 0.1,
 'href': 'http://www.recipepuppy.com/',
 'results': [{'title': 'BBQ Chicken Pizza',
   'href': 'http://www.recipezaar.com/BBQ-Chicken-Pizza-144689',
   'ingredients': 'chicken, brown sugar, cayenne, garlic salt, green pepper, honey, italian cheese blend, salad dressing, margarine, molasses, onions, barbecue sauce, black pepper, prepared pizza crust, provolone cheese, ranch dressing, salt',
   'thumbnail': ''},
  {'title': 'Basic Chicago-style Pizza Recipe',
   'href': 'http://www.grouprecipes.com/65487/basic-chicago-style-pizza.html',
   'ingredients': 'pizza, vegetable oil, cornmeal, water, flour, sausage, provolone cheese, olive oil, tomato, yeast, pepperoni, salt, salt, sugar, basil, oregano',
   'thumbnail': ''},
  {'title': "BBQ'd Cheeseburger Pizza",
   'href': 'http://www.recipezaar.com/BBQd-Cheeseburger-Pizza-299376',
   'ingredients': 'barbecue sauce, cheddar cheese, onions, tomato, dill pickle, dill relish, parsley, french dressing, 
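The same parsing step can be tried without a network connection. The sketch below uses a hand-abridged copy of the payload shown above (with an empty `results` list) to show that `json.loads()` accepts `bytes` directly and that the escaped `\/` sequences become plain slashes.

```python
import json

# Abridged, hand-written stand-in for response.content.
raw = b'{"title":"Recipe Puppy","version":0.1,"href":"http:\\/\\/www.recipepuppy.com\\/","results":[]}'

parsed = json.loads(raw)

print(type(parsed).__name__)  # JSON objects become Python dicts
print(parsed['href'])         # the escaped slashes are decoded
print(parsed['results'])      # JSON arrays become Python lists
```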

Alternatively, we can use the `json()` method of `response` to do the same.

In [27]:
json_response = response.json()

In [28]:
json_response

{'title': 'Recipe Puppy',
 'version': 0.1,
 'href': 'http://www.recipepuppy.com/',
 'results': [{'title': 'BBQ Chicken Pizza',
   'href': 'http://www.recipezaar.com/BBQ-Chicken-Pizza-144689',
   'ingredients': 'chicken, brown sugar, cayenne, garlic salt, green pepper, honey, italian cheese blend, salad dressing, margarine, molasses, onions, barbecue sauce, black pepper, prepared pizza crust, provolone cheese, ranch dressing, salt',
   'thumbnail': ''},
  {'title': 'Basic Chicago-style Pizza Recipe',
   'href': 'http://www.grouprecipes.com/65487/basic-chicago-style-pizza.html',
   'ingredients': 'pizza, vegetable oil, cornmeal, water, flour, sausage, provolone cheese, olive oil, tomato, yeast, pepperoni, salt, salt, sugar, basil, oregano',
   'thumbnail': ''},
  {'title': "BBQ'd Cheeseburger Pizza",
   'href': 'http://www.recipezaar.com/BBQd-Cheeseburger-Pizza-299376',
   'ingredients': 'barbecue sauce, cheddar cheese, onions, tomato, dill pickle, dill relish, parsley, french dressing, 

Let's now extract `title` and `ingredients` for each recipe and transform the latter into a list.

In [29]:
recipes = [{'title': x['title'],
            'ingredients': x['ingredients'].split(', ')}
           for x in json_response['results']]

In [30]:
recipes

[{'title': 'BBQ Chicken Pizza',
  'ingredients': ['chicken',
   'brown sugar',
   'cayenne',
   'garlic salt',
   'green pepper',
   'honey',
   'italian cheese blend',
   'salad dressing',
   'margarine',
   'molasses',
   'onions',
   'barbecue sauce',
   'black pepper',
   'prepared pizza crust',
   'provolone cheese',
   'ranch dressing',
   'salt']},
 {'title': 'Basic Chicago-style Pizza Recipe',
  'ingredients': ['pizza',
   'vegetable oil',
   'cornmeal',
   'water',
   'flour',
   'sausage',
   'provolone cheese',
   'olive oil',
   'tomato',
   'yeast',
   'pepperoni',
   'salt',
   'salt',
   'sugar',
   'basil',
   'oregano']},
 {'title': "BBQ'd Cheeseburger Pizza",
  'ingredients': ['barbecue sauce',
   'cheddar cheese',
   'onions',
   'tomato',
   'dill pickle',
   'dill relish',
   'parsley',
   'french dressing',
   'garlic powder',
   'ground beef',
   'lettuce',
   'mayonnaise',
   'mozzarella cheese',
   'pizza dough',
   'mustard']},
 {'title': 'Healthy Italian Brea
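Splitting on `', '` works for the responses shown here, but it relies on every separator being exactly a comma followed by one space, and it turns an empty string into `['']`. A slightly more defensive variant might look like the following sketch; `split_ingredients` is a hypothetical helper, not part of the original notebook.

```python
def split_ingredients(text):
    """Split a comma-separated ingredient string into a clean list.

    Surrounding whitespace is stripped and empty entries are dropped.
    """
    return [part.strip() for part in text.split(',') if part.strip()]


print(split_ingredients('pizza, vegetable oil,cornmeal ,'))
print(split_ingredients(''))
print(''.split(', '))  # the plain split keeps one empty string
```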

Finally, let's put everything together in a function.

In [31]:
# Combine the steps above: fetch one page of results and parse it.
def get_recipes(query, page=1):
    params = {'q': query, 'p': page}
    response = requests.get(BASE_URL, params=params)
    response.raise_for_status()
    return [{'title': x['title'],
             'ingredients': x['ingredients'].split(', ')}
            for x in response.json()['results']]

In [32]:
get_recipes('pizza')

[{'title': 'BBQ Chicken Pizza',
  'ingredients': ['chicken',
   'brown sugar',
   'cayenne',
   'garlic salt',
   'green pepper',
   'honey',
   'italian cheese blend',
   'salad dressing',
   'margarine',
   'molasses',
   'onions',
   'barbecue sauce',
   'black pepper',
   'prepared pizza crust',
   'provolone cheese',
   'ranch dressing',
   'salt']},
 {'title': 'Basic Chicago-style Pizza Recipe',
  'ingredients': ['pizza',
   'vegetable oil',
   'cornmeal',
   'water',
   'flour',
   'sausage',
   'provolone cheese',
   'olive oil',
   'tomato',
   'yeast',
   'pepperoni',
   'salt',
   'salt',
   'sugar',
   'basil',
   'oregano']},
 {'title': "BBQ'd Cheeseburger Pizza",
  'ingredients': ['barbecue sauce',
   'cheddar cheese',
   'onions',
   'tomato',
   'dill pickle',
   'dill relish',
   'parsley',
   'french dressing',
   'garlic powder',
   'ground beef',
   'lettuce',
   'mayonnaise',
   'mozzarella cheese',
   'pizza dough',
   'mustard']},
 {'title': 'Healthy Italian Brea
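One possible refinement is to keep the parsing logic separate from the HTTP call, so that it can be tested on a plain dictionary without touching the network. The sketch below assumes the payload shape shown earlier; `parse_recipes` and the sample payload are made up for illustration.

```python
def parse_recipes(payload):
    """Extract title and ingredient list from a decoded API payload."""
    return [{'title': item['title'],
             'ingredients': item['ingredients'].split(', ')}
            # .get() tolerates a payload that has no 'results' key.
            for item in payload.get('results', [])]


# Hand-written sample in the same shape as the API response.
sample_payload = {
    'title': 'Recipe Puppy',
    'results': [
        {'title': 'Recipe A', 'ingredients': 'flour, water, salt', 'thumbnail': ''},
    ],
}

print(parse_recipes(sample_payload))
print(parse_recipes({}))
```

Inside `get_recipes()`, this would be called as `parse_recipes(response.json())`.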

We can call `get_recipes()` multiple times to retrieve consecutive pages of results.

In [33]:
query = 'pizza'
recipes = []
for page in range(1, 11):
    print('Retrieving page {}...'.format(page))
    recipes.extend(get_recipes(query, page))

Retrieving page 1...
Retrieving page 2...
Retrieving page 3...
Retrieving page 4...
Retrieving page 5...
Retrieving page 6...
Retrieving page 7...
Retrieving page 8...
Retrieving page 9...
Retrieving page 10...


In [34]:
len(recipes)

100
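The loop above always requests ten pages. A possible variation stops as soon as a page comes back empty and pauses between requests. This is only a sketch: it assumes the API returns an empty result list past the last page, and `collect_pages` and `fake_fetch` are hypothetical names, with `fake_fetch` standing in for `get_recipes()` so the example runs offline.

```python
import time


def collect_pages(fetch_page, query, max_pages=10, delay=0.0):
    """Call fetch_page(query, page) for consecutive pages until one is empty."""
    results = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(query, page)
        if not batch:
            break  # no more results, stop early
        results.extend(batch)
        time.sleep(delay)  # be polite to the server between requests
    return results


def fake_fetch(query, page):
    """Stand-in for get_recipes(): three pages of two recipes, then nothing."""
    if page > 3:
        return []
    return [{'title': '{} {}-{}'.format(query, page, i), 'ingredients': []}
            for i in range(2)]


demo = collect_pages(fake_fetch, 'pizza')
print(len(demo))
```

With the real function, the call would be `collect_pages(get_recipes, 'pizza', delay=1.0)`.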

We can also easily turn the resulting list of dictionaries into a `DataFrame`.

In [37]:
# Convert the list of dictionaries to a DataFrame.
recipes_df = pd.DataFrame(recipes)

In [40]:
recipes_df.head(5)

| | ingredients | title | id |
|---|---|---|---|
| 0 | [chicken, brown sugar, cayenne, garlic salt, g... | BBQ Chicken Pizza | 0 |
| 1 | [pizza, vegetable oil, cornmeal, water, flour,... | Basic Chicago-style Pizza Recipe | 1 |
| 2 | [barbecue sauce, cheddar cheese, onions, tomat... | BBQ'd Cheeseburger Pizza | 2 |
| 3 | [brown sugar, garlic powder, italian seasoning... | Healthy Italian Bread Sticks or Pizza Crust | 3 |
| 4 | [bacon, black pepper, cheddar cheese, garlic, ... | Bacon Cheeseburger Pizza | 4 |


Before running some analyses, let's also add an `id` column and flatten out `ingredients`.

In [41]:
recipes_df['id'] = range(len(recipes_df))

In [42]:
flat_ingredients = recipes_df['ingredients']\
                   .apply(pd.Series)\
                   .stack()\
                   .reset_index(level=1, drop=True)\
                   .rename('ingredient')

recipes_df = recipes_df.join(flat_ingredients).reset_index(drop=True).drop(columns=['ingredients'])

In [43]:
recipes_df.head()

| | title | id | ingredient |
|---|---|---|---|
| 0 | BBQ Chicken Pizza | 0 | chicken |
| 1 | BBQ Chicken Pizza | 0 | brown sugar |
| 2 | BBQ Chicken Pizza | 0 | cayenne |
| 3 | BBQ Chicken Pizza | 0 | garlic salt |
| 4 | BBQ Chicken Pizza | 0 | green pepper |
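For comparison, pandas 0.25 and later provide `DataFrame.explode()`, which performs the same flattening in a single step. The sketch below uses a small made-up frame rather than the API data, and the column order of the result differs slightly from the one above.

```python
import pandas as pd

# Toy data in the same shape as recipes_df before flattening.
toy_df = pd.DataFrame({
    'title': ['Recipe A', 'Recipe B'],
    'ingredients': [['chicken', 'honey'], ['garlic', 'chicken', 'salt']],
})
toy_df['id'] = range(len(toy_df))

# One output row per list element; the other columns are repeated.
flat_df = (toy_df.explode('ingredients')
                 .rename(columns={'ingredients': 'ingredient'})
                 .reset_index(drop=True))

print(flat_df)
```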


Now that our recipes are in this familiar format, we can start running our analyses.
For example, which are the most common ingredients?

In [44]:
recipes_df['ingredient'].value_counts().head(5)

olive oil            56
mozzarella cheese    46
salt                 36
garlic               34
water                33
Name: ingredient, dtype: int64
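The same tally can be computed without pandas. As a minimal sketch, `collections.Counter` counts ingredients directly from a list of recipe dictionaries; the toy data below is invented for the example.

```python
from collections import Counter

# Toy recipes in the same shape as the output of get_recipes().
toy_recipes = [
    {'title': 'Recipe A', 'ingredients': ['salt', 'flour']},
    {'title': 'Recipe B', 'ingredients': ['salt', 'olive oil']},
    {'title': 'Recipe C', 'ingredients': ['salt', 'flour', 'water']},
]

# Count every ingredient occurrence across all recipes.
counts = Counter(ingredient
                 for recipe in toy_recipes
                 for ingredient in recipe['ingredients'])

top_two = counts.most_common(2)
print(top_two)
```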

Which are the recipes with the most ingredients?

In [45]:
recipes_df.groupby(['id', 'title'])\
          ['ingredient'].count()\
          .sort_values(ascending=False)\
          .head(5)

id  title                                                       
56  White Pizza With Fennel Goat Cheese Rosemary And More Recipe    20
61  Healthy Pizzeria Style Pizza Recipe                             20
0   BBQ Chicken Pizza                                               17
80  Garlic Chicken Pizza                                            16
1   Basic Chicago-style Pizza Recipe                                16
Name: ingredient, dtype: int64
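Note that `count()` counts rows, so an ingredient listed twice in one recipe (such as `salt` in the Chicago-style recipe above) is counted twice. A sketch of how `nunique()` could be used to count distinct ingredients instead, again on made-up data:

```python
import pandas as pd

# Toy flattened frame: Recipe A lists 'salt' twice.
toy_flat = pd.DataFrame({
    'id': [0, 0, 0, 1, 1],
    'title': ['Recipe A'] * 3 + ['Recipe B'] * 2,
    'ingredient': ['salt', 'salt', 'flour', 'salt', 'water'],
})

# Compare row counts with distinct-ingredient counts per recipe.
per_recipe = (toy_flat.groupby(['id', 'title'])['ingredient']
                      .agg(['count', 'nunique']))

print(per_recipe)
```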