# Part 1 - Extracting and Saving Data from Yelp API

## Obective

- For this CodeAlong, we will be working with the Yelp API. 
- You will use the the Yelp API to search your home town for a cuisine type of your choice.
- Next class, we will then use Plotly Express to create a map with the Mapbox API to visualize the results.
    
    

## Tools You Will Use
- Part 1:
    - Yelp API:
        - Getting Started: 
            - https://www.yelp.com/developers/documentation/v3/get_started

    - `YelpAPI` python package
        -  "YelpAPI": https://github.com/gfairchild/yelpapi
- Part 2:

    - Plotly Express: https://plotly.com/python/getting-started/
        - With Mapbox API: https://www.mapbox.com/
        - `px.scatter_mapbox` [Documentation](https://plotly.com/python/scattermapbox/): 




### Applying Code From
- Efficient API Calls Lesson Link: https://login.codingdojo.com/m/376/12529/88078

In [1]:
# Standard Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Additional Imports
import os, json, math, time
from yelpapi import YelpAPI
from tqdm.notebook import tqdm_notebook

## 1. Registering for Required APIs


- Yelp: https://www.yelp.com/developers/documentation/v3/get_started


> Check the official API documentation to know what arguments we can search for: https://www.yelp.com/developers/documentation/v3/business_search

### Load Credentials and Create Yelp API Object

In [2]:
# Load API Credentials
with open('/Users/jhugh/.secret/yelp_api.json','r') as f:
    login = json.load(f)
login.keys()

dict_keys(['client_id', 'api_key'])

In [3]:
# Instantiate YelpAPI Variable
yelp = YelpAPI(login['api_key'],timeout_s=5.0)

### Define Search Terms and File Paths

In [4]:
# set our API call parameters and filename before the first call
location = "Logan,OH"
term = "pizza"

In [16]:
## Specify fodler for saving data
FOLDER = 'Data/'
os.makedirs(FOLDER,exist_ok=True)

# Specifying JSON_FILE filename (can include a folder)
JSON_FILE = FOLDER+('Logan,pizza')

In [17]:
JSON_FILE

'Data/Logan,pizza'

### Check if Json File exists and Create it if it doesn't

In [27]:
## Check if JSON_FILE exists

## If it does not exist: 
    
    ## CREATE ANY NEEDED FOLDERS
    # Get the Folder Name only
if os.path.isfile(JSON_FILE)==False:
    print(" The file does not exist. Creating empty file")
    with open(JSON_FILE,'w') as f:
        json.dump(results['businesses'],f)
        
else:
    print('File exists')
    
    ## If JSON_FILE included a folder:

        # create the folder

        
        
    ## INFORM USER AND SAVE EMPTY LIST

    
    
    ## save the first page of results

        
## If it exists, inform user


 The file does not exist. Creating empty file


### Load JSON FIle and account for previous results

In [None]:
## Load previous results and use len of results for offset

## set offset based on previous results


### Make the first API call to get the first page of data

- We will use this first result to check:
    - how many total results there are?
    - Where is the actual data we want to save?
    - how many results do we get at a time?


In [13]:
# use our yelp_api variable's search_query method to perform our API call
results = yelp.search_query(term = term, location=location)

In [14]:
## How many results total?
results

{'businesses': [{'id': '3Xg7NgcOJKYPtUjaGaZDwA',
   'alias': 'pizza-crossing-logan',
   'name': 'Pizza Crossing',
   'image_url': 'https://s3-media3.fl.yelpcdn.com/bphoto/lkMsbk4QlA6luQ8NNdb-gw/o.jpg',
   'is_closed': False,
   'url': 'https://www.yelp.com/biz/pizza-crossing-logan?adjust_creative=l_1zZIAnKO6FPzv5VJxtuQ&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=l_1zZIAnKO6FPzv5VJxtuQ',
   'review_count': 150,
   'categories': [{'alias': 'pizza', 'title': 'Pizza'}],
   'rating': 4.0,
   'coordinates': {'latitude': 39.54067, 'longitude': -82.40677},
   'transactions': [],
   'price': '$$',
   'location': {'address1': '58 N Mulberry St',
    'address2': '',
    'address3': '',
    'city': 'Logan',
    'zip_code': '43138',
    'country': 'US',
    'state': 'OH',
    'display_address': ['58 N Mulberry St', 'Logan, OH 43138']},
   'phone': '+17403858558',
   'display_phone': '(740) 385-8558',
   'distance': 681.3090799267927},
  {'id': 'Dp0RkgafimTLjX1KRMBx1A',
   

In [18]:
results.keys()

dict_keys(['businesses', 'total', 'region'])

In [19]:
results['businesses'][0]

{'id': '3Xg7NgcOJKYPtUjaGaZDwA',
 'alias': 'pizza-crossing-logan',
 'name': 'Pizza Crossing',
 'image_url': 'https://s3-media3.fl.yelpcdn.com/bphoto/lkMsbk4QlA6luQ8NNdb-gw/o.jpg',
 'is_closed': False,
 'url': 'https://www.yelp.com/biz/pizza-crossing-logan?adjust_creative=l_1zZIAnKO6FPzv5VJxtuQ&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=l_1zZIAnKO6FPzv5VJxtuQ',
 'review_count': 150,
 'categories': [{'alias': 'pizza', 'title': 'Pizza'}],
 'rating': 4.0,
 'coordinates': {'latitude': 39.54067, 'longitude': -82.40677},
 'transactions': [],
 'price': '$$',
 'location': {'address1': '58 N Mulberry St',
  'address2': '',
  'address3': '',
  'city': 'Logan',
  'zip_code': '43138',
  'country': 'US',
  'state': 'OH',
  'display_address': ['58 N Mulberry St', 'Logan, OH 43138']},
 'phone': '+17403858558',
 'display_phone': '(740) 385-8558',
 'distance': 681.3090799267927}

In [20]:
pd.DataFrame(results['businesses'])

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,coordinates,transactions,price,location,phone,display_phone,distance
0,3Xg7NgcOJKYPtUjaGaZDwA,pizza-crossing-logan,Pizza Crossing,https://s3-media3.fl.yelpcdn.com/bphoto/lkMsbk...,False,https://www.yelp.com/biz/pizza-crossing-logan?...,150,"[{'alias': 'pizza', 'title': 'Pizza'}]",4.0,"{'latitude': 39.54067, 'longitude': -82.40677}",[],$$,"{'address1': '58 N Mulberry St', 'address2': '...",17403858558,(740) 385-8558,681.30908
1,Dp0RkgafimTLjX1KRMBx1A,cristys-pizza-logan,Cristy's Pizza,https://s3-media1.fl.yelpcdn.com/bphoto/jNRynU...,False,https://www.yelp.com/biz/cristys-pizza-logan?a...,14,"[{'alias': 'pizza', 'title': 'Pizza'}, {'alias...",4.5,"{'latitude': 39.5437316894531, 'longitude': -8...",[delivery],$,"{'address1': '1320 W Hunter St', 'address2': '...",17403801901,(740) 380-1901,1640.761168
2,Kf0a0Do04qFO0MVslyzScA,bobs-pizza-house-logan,Bob's Pizza House,https://s3-media4.fl.yelpcdn.com/bphoto/pL3CT1...,False,https://www.yelp.com/biz/bobs-pizza-house-loga...,18,"[{'alias': 'pizza', 'title': 'Pizza'}]",4.5,"{'latitude': 39.5414886, 'longitude': -82.4186...",[delivery],$,"{'address1': '758 W Front St', 'address2': '',...",17403856839,(740) 385-6839,893.640193
3,DQi86dFU8Z-oA9UMabGCTg,captain-rons-pirate-pizza-logan,Captain Ron's Pirate Pizza,https://s3-media4.fl.yelpcdn.com/bphoto/TqvR1Q...,False,https://www.yelp.com/biz/captain-rons-pirate-p...,27,"[{'alias': 'pizza', 'title': 'Pizza'}]",4.0,"{'latitude': 39.4865347692034, 'longitude': -8...",[pickup],,"{'address1': '16757 State Rte 664 S', 'address...",17403852221,(740) 385-2221,8041.299496
4,YkqRMQCQ9GJFgVnChxUtDg,pizza-hut-logan-3,Pizza Hut,https://s3-media1.fl.yelpcdn.com/bphoto/zR4pm9...,False,https://www.yelp.com/biz/pizza-hut-logan-3?adj...,3,"[{'alias': 'pizza', 'title': 'Pizza'}, {'alias...",3.5,"{'latitude': 39.5398204, 'longitude': -82.4366...",[pickup],$,"{'address1': '12876 State Route 664', 'address...",17403800030,(740) 380-0030,2231.10217
5,C9Hva9ka0MfBsxpOfXJDrA,hocking-hills-winery-logan,Hocking Hills Winery,https://s3-media3.fl.yelpcdn.com/bphoto/Af9fep...,False,https://www.yelp.com/biz/hocking-hills-winery-...,123,"[{'alias': 'wineries', 'title': 'Wineries'}]",4.0,"{'latitude': 39.5470095064139, 'longitude': -8...",[],$$,"{'address1': '30402 Freeman Rd', 'address2': '...",17403857117,(740) 385-7117,3785.708482
6,VVwvlnTgHE9X0knJYlLAnA,hocking-hills-inn-and-coffee-emporium-logan,Hocking Hills Inn and Coffee Emporium,https://s3-media2.fl.yelpcdn.com/bphoto/KMywUH...,False,https://www.yelp.com/biz/hocking-hills-inn-and...,53,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",4.5,"{'latitude': 39.525834, 'longitude': -82.460762}",[],$$,"{'address1': '13984 OH-664 Scenic', 'address2'...",17403000020,(740) 300-0020,4393.442143
7,nSIkF2KGzm_Vny454oj9zg,dominos-pizza-logan-5,Domino's Pizza,,False,https://www.yelp.com/biz/dominos-pizza-logan-5...,3,"[{'alias': 'pizza', 'title': 'Pizza'}, {'alias...",2.5,"{'latitude': 39.5424995, 'longitude': -82.4230...",[],$,"{'address1': '1027 W Hunter St', 'address2': '...",17403859655,(740) 385-9655,1256.678871
8,NgCTLa01c8pd01VHUcfFYQ,hungry-buffalo-logan,Hungry Buffalo,https://s3-media2.fl.yelpcdn.com/bphoto/P0bHr9...,False,https://www.yelp.com/biz/hungry-buffalo-logan?...,151,"[{'alias': 'newamerican', 'title': 'American (...",3.0,"{'latitude': 39.5437735348388, 'longitude': -8...","[pickup, delivery]",$$,"{'address1': '12762 Grey St', 'address2': '', ...",17403800088,(740) 380-0088,3346.394836
9,dSVoVJyYX3frxiGsgld_Xw,urban-grille-logan-4,Urban Grille,https://s3-media2.fl.yelpcdn.com/bphoto/5uLQGX...,False,https://www.yelp.com/biz/urban-grille-logan-4?...,33,"[{'alias': 'pubs', 'title': 'Pubs'}, {'alias':...",4.0,"{'latitude': 39.518314, 'longitude': -82.3788031}","[pickup, delivery]",$$,"{'address1': '14405 Country Club Ln', 'address...",17403858966,(740) 385-8966,3391.541097


In [21]:
## How many results total?
results['total']

14

In [22]:
results['region']

{'center': {'longitude': -82.41119384765625, 'latitude': 39.53581017789831}}

In [23]:
results_per_page = len(results['businesses'])

In [24]:
results_per_page

14

- Where is the actual data we want to save?

In [15]:
## How many did we get the details for?


- Calculate how many pages of results needed to cover the total_results

In [25]:
# Use math.ceil to round up for the total number of pages of results.
import math
n_pages = math.ceil(results['total']/results_per_page)
n_pages

1

In [28]:
for i in tqdm_notebook( range(1,n_pages+1)):
    with open(JSON_FILE) as f:
        prev_results = json.load(f)
    n_results = len(prev_results)
    results = yelp.search_query(term=term, location=location)
    prev_results.exten(results['businesses'])
    with open(JSON_FILE, 'w') as f:
        json.dump(prev_results,f)
    
    ## The block of code we want to TRY to run
        
        
        ## Read in results in progress file and check the length
        
        ## save number of results for to use as offset
        
        
        
        ## use n_results as the OFFSET 
        

        ## append new results and save to file
        

            
    ## What to do if we get an error/exception.
        


  0%|          | 0/1 [00:00<?, ?it/s]

ConnectionError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

## Open the Final JSON File with Pandas

In [30]:
df = pd.read_json(JSON_FILE)
df

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,coordinates,transactions,price,location,phone,display_phone,distance
0,3Xg7NgcOJKYPtUjaGaZDwA,pizza-crossing-logan,Pizza Crossing,https://s3-media3.fl.yelpcdn.com/bphoto/lkMsbk...,False,https://www.yelp.com/biz/pizza-crossing-logan?...,150,"[{'alias': 'pizza', 'title': 'Pizza'}]",4.0,"{'latitude': 39.54067, 'longitude': -82.40677}",[],$$,"{'address1': '58 N Mulberry St', 'address2': '...",17403858558,(740) 385-8558,681.30908
1,Dp0RkgafimTLjX1KRMBx1A,cristys-pizza-logan,Cristy's Pizza,https://s3-media1.fl.yelpcdn.com/bphoto/jNRynU...,False,https://www.yelp.com/biz/cristys-pizza-logan?a...,14,"[{'alias': 'pizza', 'title': 'Pizza'}, {'alias...",4.5,"{'latitude': 39.5437316894531, 'longitude': -8...",[delivery],$,"{'address1': '1320 W Hunter St', 'address2': '...",17403801901,(740) 380-1901,1640.761168
2,Kf0a0Do04qFO0MVslyzScA,bobs-pizza-house-logan,Bob's Pizza House,https://s3-media4.fl.yelpcdn.com/bphoto/pL3CT1...,False,https://www.yelp.com/biz/bobs-pizza-house-loga...,18,"[{'alias': 'pizza', 'title': 'Pizza'}]",4.5,"{'latitude': 39.5414886, 'longitude': -82.4186...",[delivery],$,"{'address1': '758 W Front St', 'address2': '',...",17403856839,(740) 385-6839,893.640193
3,DQi86dFU8Z-oA9UMabGCTg,captain-rons-pirate-pizza-logan,Captain Ron's Pirate Pizza,https://s3-media4.fl.yelpcdn.com/bphoto/TqvR1Q...,False,https://www.yelp.com/biz/captain-rons-pirate-p...,27,"[{'alias': 'pizza', 'title': 'Pizza'}]",4.0,"{'latitude': 39.4865347692034, 'longitude': -8...",[pickup],,"{'address1': '16757 State Rte 664 S', 'address...",17403852221,(740) 385-2221,8041.299496
4,YkqRMQCQ9GJFgVnChxUtDg,pizza-hut-logan-3,Pizza Hut,https://s3-media1.fl.yelpcdn.com/bphoto/zR4pm9...,False,https://www.yelp.com/biz/pizza-hut-logan-3?adj...,3,"[{'alias': 'pizza', 'title': 'Pizza'}, {'alias...",3.5,"{'latitude': 39.539820399999996, 'longitude': ...",[pickup],$,"{'address1': '12876 State Route 664', 'address...",17403800030,(740) 380-0030,2231.10217
5,C9Hva9ka0MfBsxpOfXJDrA,hocking-hills-winery-logan,Hocking Hills Winery,https://s3-media3.fl.yelpcdn.com/bphoto/Af9fep...,False,https://www.yelp.com/biz/hocking-hills-winery-...,123,"[{'alias': 'wineries', 'title': 'Wineries'}]",4.0,"{'latitude': 39.5470095064139, 'longitude': -8...",[],$$,"{'address1': '30402 Freeman Rd', 'address2': '...",17403857117,(740) 385-7117,3785.708482
6,VVwvlnTgHE9X0knJYlLAnA,hocking-hills-inn-and-coffee-emporium-logan,Hocking Hills Inn and Coffee Emporium,https://s3-media2.fl.yelpcdn.com/bphoto/KMywUH...,False,https://www.yelp.com/biz/hocking-hills-inn-and...,53,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",4.5,"{'latitude': 39.525834, 'longitude': -82.460762}",[],$$,"{'address1': '13984 OH-664 Scenic', 'address2'...",17403000020,(740) 300-0020,4393.442143
7,nSIkF2KGzm_Vny454oj9zg,dominos-pizza-logan-5,Domino's Pizza,,False,https://www.yelp.com/biz/dominos-pizza-logan-5...,3,"[{'alias': 'pizza', 'title': 'Pizza'}, {'alias...",2.5,"{'latitude': 39.5424995, 'longitude': -82.4230...",[],$,"{'address1': '1027 W Hunter St', 'address2': '...",17403859655,(740) 385-9655,1256.678871
8,NgCTLa01c8pd01VHUcfFYQ,hungry-buffalo-logan,Hungry Buffalo,https://s3-media2.fl.yelpcdn.com/bphoto/P0bHr9...,False,https://www.yelp.com/biz/hungry-buffalo-logan?...,151,"[{'alias': 'newamerican', 'title': 'American (...",3.0,"{'latitude': 39.5437735348388, 'longitude': -8...","[pickup, delivery]",$$,"{'address1': '12762 Grey St', 'address2': '', ...",17403800088,(740) 380-0088,3346.394836
9,dSVoVJyYX3frxiGsgld_Xw,urban-grille-logan-4,Urban Grille,https://s3-media2.fl.yelpcdn.com/bphoto/5uLQGX...,False,https://www.yelp.com/biz/urban-grille-logan-4?...,33,"[{'alias': 'pubs', 'title': 'Pubs'}, {'alias':...",4.0,"{'latitude': 39.518314, 'longitude': -82.3788031}","[pickup, delivery]",$$,"{'address1': '14405 Country Club Ln', 'address...",17403858966,(740) 385-8966,3391.541097


In [31]:
## convert the filename to a .csv.gz
csv_file = JSON_FILE.replace('.json','.csv.gz')
csv_file

'Data/Logan,pizza'

In [32]:
## Save it as a compressed csv (to save space)
df.to_csv(csv_file,compression='gzip',index=False)

## Bonus: compare filesize with os module's `os.path.getsize`

In [33]:
size_json = os.path.getsize(JSON_FILE)
size_csv_gz = os.path.getsize(JSON_FILE.replace('.json','.csv.gz'))

print(f'JSON FILE: {size_json:,} Bytes')
print(f'CSV.GZ FILE: {size_csv_gz:,} Bytes')

print(f'the csv.gz is {size_json/size_csv_gz} times smaller!')

JSON FILE: 2,513 Bytes
CSV.GZ FILE: 2,513 Bytes
the csv.gz is 1.0 times smaller!


## Next Class: Processing the Results and Mapping 