<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Using-the-code-from-the-lesson-as-a-guide,-create-a-dataframe-named-items-that-has-all-of-the-data-for-items." data-toc-modified-id="Using-the-code-from-the-lesson-as-a-guide,-create-a-dataframe-named-items-that-has-all-of-the-data-for-items.-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Using the code from the lesson as a guide, create a dataframe named items that has all of the data for items.</a></span><ul class="toc-item"><li><span><a href="#Let's-automate-this-with-a-function" data-toc-modified-id="Let's-automate-this-with-a-function-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Let's automate this with a function</a></span></li></ul></li><li><span><a href="#Do-the-same-thing,-but-for-stores." data-toc-modified-id="Do-the-same-thing,-but-for-stores.-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Do the same thing, but for stores.</a></span></li></ul></div>

In [2]:
import pandas as pd
import numpy as np

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

plt.rc('figure', figsize=(11, 9))
plt.rc('font', size=13)

# requests handles the HTTP calls to the API
import requests

import warnings
warnings.filterwarnings("ignore")

## Using the code from the lesson as a guide, create a dataframe named items that has all of the data for items.

In [3]:
# let's check out the documentation for this API

base_url = 'https://python.zach.lol'
response = requests.get(base_url + '/documentation')
data = response.json()
print(data['payload'])


The API accepts GET requests for all endpoints, where endpoints are prefixed
with

    /api/{version}

Where version is "v1"

Valid endpoints:

- /stores[/{store_id}]
- /items[/{item_id}]
- /sales[/{sale_id}]

All endpoints accept a `page` parameter that can be used to navigate through
the results.
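As a small illustration of the endpoint layout and the `page` parameter described in the documentation above, here is a sketch of how a page URL could be assembled. The helper name `page_url` is hypothetical (it is not part of the lesson code); the base URL, the `v1` version, and the `?page=` query format come from the notebook itself.

```python
from urllib.parse import urlencode

BASE_URL = 'https://python.zach.lol'


def page_url(endpoint, page=None, version='v1'):
    """Build the URL for an endpoint ('items', 'stores', 'sales').

    page_url is a hypothetical helper for illustration only.
    """
    url = f'{BASE_URL}/api/{version}/{endpoint}'
    if page is not None:
        # the API documentation says every endpoint accepts a `page` parameter
        url += '?' + urlencode({'page': page})
    return url


print(page_url('items', 2))
```

With `requests`, the same effect can be had by passing `params={'page': 2}` to `requests.get`.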



In [4]:
api_url = base_url + '/api/v1/'
response = requests.get(api_url + 'items')
response.ok

True

In [5]:
# Use .json() method on our response, and we have a dictionary object

data = response.json()
print(type(data))
data

<class 'dict'>


{'payload': {'items': [{'item_brand': 'Riceland',
    'item_id': 1,
    'item_name': 'Riceland American Jazmine Rice',
    'item_price': 0.84,
    'item_upc12': '35200264013',
    'item_upc14': '35200264013'},
   {'item_brand': 'Caress',
    'item_id': 2,
    'item_name': 'Caress Velvet Bliss Ultra Silkening Beauty Bar - 6 Ct',
    'item_price': 6.44,
    'item_upc12': '11111065925',
    'item_upc14': '11111065925'},
   {'item_brand': 'Earths Best',
    'item_id': 3,
    'item_name': 'Earths Best Organic Fruit Yogurt Smoothie Mixed Berry',
    'item_price': 2.43,
    'item_upc12': '23923330139',
    'item_upc14': '23923330139'},
   {'item_brand': 'Boars Head',
    'item_id': 4,
    'item_name': 'Boars Head Sliced White American Cheese - 120 Ct',
    'item_price': 3.14,
    'item_upc12': '208528800007',
    'item_upc14': '208528800007'},
   {'item_brand': 'Back To Nature',
    'item_id': 5,
    'item_name': 'Back To Nature Gluten Free White Cheddar Rice Thin Crackers',
    'item_price':

In [6]:
# the keys in our dictionary object are payload and status

data.keys()

dict_keys(['payload', 'status'])

In [7]:
# let's look at the keys in payload
# payload is itself a dictionary: it holds the items plus paging information
# its 'items' key holds a list of dictionaries, one per item

data['payload'].keys()

dict_keys(['items', 'max_page', 'next_page', 'page', 'previous_page'])

In [8]:
data['payload']['items'][:2]

[{'item_brand': 'Riceland',
  'item_id': 1,
  'item_name': 'Riceland American Jazmine Rice',
  'item_price': 0.84,
  'item_upc12': '35200264013',
  'item_upc14': '35200264013'},
 {'item_brand': 'Caress',
  'item_id': 2,
  'item_name': 'Caress Velvet Bliss Ultra Silkening Beauty Bar - 6 Ct',
  'item_price': 6.44,
  'item_upc12': '11111065925',
  'item_upc14': '11111065925'}]

In [9]:
# We can house our items in a DataFrame

items_df = pd.DataFrame(data['payload']['items'])
items_df.head(3)

| | item_brand | item_id | item_name | item_price | item_upc12 | item_upc14 |
|---|---|---|---|---|---|---|
| 0 | Riceland | 1 | Riceland American Jazmine Rice | 0.84 | 35200264013 | 35200264013 |
| 1 | Caress | 2 | Caress Velvet Bliss Ultra Silkening Beauty Bar... | 6.44 | 11111065925 | 11111065925 |
| 2 | Earths Best | 3 | Earths Best Organic Fruit Yogurt Smoothie Mixe... | 2.43 | 23923330139 | 23923330139 |


In [10]:
# Grab the next page. We just need to add this to the base_url

data['payload']['next_page']

'/api/v1/items?page=2'

In [11]:
response = requests.get(base_url + data['payload']['next_page'])
data = response.json()
items_2 = pd.DataFrame(data['payload']['items'])
items_2

# concat the two DataFrames
items_df = pd.concat([items_df, items_2]).reset_index(drop=True)
items_df

| | item_brand | item_id | item_name | item_price | item_upc12 | item_upc14 |
|---|---|---|---|---|---|---|
| 0 | Riceland | 1 | Riceland American Jazmine Rice | 0.84 | 35200264013 | 35200264013 |
| 1 | Caress | 2 | Caress Velvet Bliss Ultra Silkening Beauty Bar... | 6.44 | 11111065925 | 11111065925 |
| 2 | Earths Best | 3 | Earths Best Organic Fruit Yogurt Smoothie Mixe... | 2.43 | 23923330139 | 23923330139 |
| 3 | Boars Head | 4 | Boars Head Sliced White American Cheese - 120 Ct | 3.14 | 208528800007 | 208528800007 |
| 4 | Back To Nature | 5 | Back To Nature Gluten Free White Cheddar Rice ... | 2.61 | 759283100036 | 759283100036 |
| 5 | Sally Hansen | 6 | Sally Hansen Nail Color Magnetic 903 Silver El... | 6.93 | 74170388732 | 74170388732 |
| 6 | Twinings Of London | 7 | Twinings Of London Classics Lady Grey Tea - 20 Ct | 9.64 | 70177154004 | 70177154004 |
| 7 | Lea & Perrins | 8 | Lea & Perrins Marinade In-a-bag Cracked Pepper... | 1.68 | 51600080015 | 51600080015 |
| 8 | Van De Kamps | 9 | Van De Kamps Fillets Beer Battered - 10 Ct | 1.79 | 19600923015 | 19600923015 |
| 9 | Ahold | 10 | Ahold Cocoa Almonds | 3.17 | 688267141676 | 688267141676 |


### Let's automate this with a function that works on all of the pages we need.

In [12]:
def get_df(name):
    """
    This function takes in the endpoint name
    'items', 'stores', or 'sales' (no leading slash),
    returns a df containing all pages,
    and saves that df to '<name>.csv'.
    """
    base_url = 'https://python.zach.lol'
    api_url = base_url + '/api/v1/'
    response = requests.get(api_url + name)
    data = response.json()

    # create first DataFrame
    df = pd.DataFrame(data['payload'][name])

    # keep requesting pages until the API reports no next page
    while data['payload']['next_page'] is not None:
        response = requests.get(base_url + data['payload']['next_page'])
        data = response.json()

        # create next DataFrame
        df2 = pd.DataFrame(data['payload'][name])

        # concat new DataFrame to old DataFrame
        df = pd.concat([df, df2]).reset_index(drop=True)

    # save once, after every page has been collected
    df.to_csv(name + '.csv')
    return df

In [15]:
items = get_df('items')
items

| | item_brand | item_id | item_name | item_price | item_upc12 | item_upc14 |
|---|---|---|---|---|---|---|
| 0 | Riceland | 1 | Riceland American Jazmine Rice | 0.84 | 35200264013 | 35200264013 |
| 1 | Caress | 2 | Caress Velvet Bliss Ultra Silkening Beauty Bar... | 6.44 | 11111065925 | 11111065925 |
| 2 | Earths Best | 3 | Earths Best Organic Fruit Yogurt Smoothie Mixe... | 2.43 | 23923330139 | 23923330139 |
| 3 | Boars Head | 4 | Boars Head Sliced White American Cheese - 120 Ct | 3.14 | 208528800007 | 208528800007 |
| 4 | Back To Nature | 5 | Back To Nature Gluten Free White Cheddar Rice ... | 2.61 | 759283100036 | 759283100036 |
| 5 | Sally Hansen | 6 | Sally Hansen Nail Color Magnetic 903 Silver El... | 6.93 | 74170388732 | 74170388732 |
| 6 | Twinings Of London | 7 | Twinings Of London Classics Lady Grey Tea - 20 Ct | 9.64 | 70177154004 | 70177154004 |
| 7 | Lea & Perrins | 8 | Lea & Perrins Marinade In-a-bag Cracked Pepper... | 1.68 | 51600080015 | 51600080015 |
| 8 | Van De Kamps | 9 | Van De Kamps Fillets Beer Battered - 10 Ct | 1.79 | 19600923015 | 19600923015 |
| 9 | Ahold | 10 | Ahold Cocoa Almonds | 3.17 | 688267141676 | 688267141676 |


## Do the same thing, but for stores.

3. Extract the data for sales. There are a lot of pages of data here, so your code will need to be a little more complex. Your code should continue fetching data from the next page until all of the data is extracted.
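One way to keep fetching until the API reports no further page is sketched below. It assumes, as the items payload earlier in this notebook suggests, that every payload carries a `next_page` key that is `None` on the last page. `fetch_all_pages` and its `get_json` parameter are hypothetical names introduced for this sketch; `get_json` is any callable that takes a URL and returns the parsed JSON dictionary.

```python
import pandas as pd

BASE_URL = 'https://python.zach.lol'


def fetch_all_pages(name, get_json, base_url=BASE_URL):
    """Collect every page of an endpoint into one DataFrame.

    name     -- endpoint name: 'items', 'stores', or 'sales'
    get_json -- hypothetical helper: takes a URL, returns the parsed JSON dict
    """
    url = base_url + '/api/v1/' + name
    frames = []
    while url is not None:
        payload = get_json(url)['payload']
        frames.append(pd.DataFrame(payload[name]))
        # ASSUMPTION: next_page is a relative path, or None on the last page
        next_page = payload['next_page']
        url = (base_url + next_page) if next_page is not None else None
    # concatenate once at the end instead of once per page
    return pd.concat(frames, ignore_index=True)
```

In this notebook it could be called as `sales = fetch_all_pages('sales', lambda url: requests.get(url).json())`. Collecting the pages in a list and concatenating once avoids copying the growing DataFrame on every page, which matters when an endpoint has many pages.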


4. Save each dataframe to a local CSV file so that it will be faster to access in the future.
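A minimal sketch of this caching step: read the CSV if it already exists, otherwise fetch the data and write the file. `load_or_fetch` and its `fetch` parameter are hypothetical names for this sketch; `fetch` stands for any function that takes an endpoint name and returns a DataFrame, such as the `get_df` function defined earlier.

```python
import os

import pandas as pd


def load_or_fetch(name, fetch, folder='.'):
    """Return the cached CSV for `name` if present; otherwise fetch and cache it.

    fetch -- hypothetical callable: takes the endpoint name, returns a DataFrame
    """
    path = os.path.join(folder, name + '.csv')
    if os.path.isfile(path):
        # cached copy exists, so skip the API entirely
        return pd.read_csv(path)
    df = fetch(name)
    # index=False keeps the row index out of the file, so reloading
    # does not produce an extra 'Unnamed: 0' column
    df.to_csv(path, index=False)
    return df
```

Note that `pd.read_csv` re-infers column types, so string columns that look numeric (such as `item_upc12`) come back as integers unless a `dtype` is passed.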


5. Combine the data from your three separate dataframes into one large dataframe.
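Combining the three dataframes could look roughly like the sketch below, which left-joins item and store details onto each sale. The key names on the items and stores side (`item_id`, `store_id`) follow the API shown above, but the names of the matching columns in the sales data are not shown in this notebook: the defaults `'item'` and `'store'` are assumptions, and `combine` itself is a hypothetical helper. Check `sales.columns` before relying on them.

```python
import pandas as pd


def combine(sales, items, stores, sales_item_col='item', sales_store_col='store'):
    """Left-join item and store details onto each sale.

    sales_item_col / sales_store_col are ASSUMED names for the sales columns
    that hold the item and store ids; adjust them to match the real data.
    """
    df = sales.merge(items, how='left', left_on=sales_item_col, right_on='item_id')
    df = df.merge(stores, how='left', left_on=sales_store_col, right_on='store_id')
    return df
```

A left join keeps every sale even if an item or store id has no match, so the combined dataframe has one row per sale as long as `item_id` and `store_id` are unique in their own tables.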