# Efficient Yelp API Calls (Core)

For this assignment, you will be working with the Yelp API.

As before, you will use the Yelp API to search your favorite city for a cuisine type of your choice.

Extract all of the results from your search and compile them into one dataframe using a for loop as shown in the lesson "Code for Efficient API Extraction"

Save your notebook, commit the change to your repository and submit the repository URL for this assignment.

In [1]:
from yelpapi import YelpAPI
from tqdm.notebook import tqdm_notebook
import json, math, time, os
# json - for json files
# math - round up results
# time - shourt pause to not overwhelm server
# os - saving and loading files
import pandas as pd

In [2]:
with open('/Users/j29ma/.secret/yelp_api.json') as f:
    login = json.load(f)
login.keys()

dict_keys(['client-id', 'api-key'])

In [3]:
# instantiate YelpAPI Variable
yelp_api = YelpAPI(login['api-key'], timeout_s = 5.0)
yelp_api

<yelpapi.yelpapi.YelpAPI at 0x1a9b8065c40>

## Search Terms and File Paths

In [4]:
# define location and term
location = 'Los Angeles, California'
term = 'Shabu Shabu'

In [5]:
split = location.split(',')[0]
split

'Los Angeles'

In [6]:
# specify folder to save data
FOLDER = 'Data/'

# make folder
os.makedirs(FOLDER, exist_ok = True)

# specify JSON_FILE filename to save results
JSON_FILE = FOLDER+f'{split}_{term}.json'
JSON_FILE

'Data/Los Angeles_Shabu Shabu.json'

## Check if JSON file exists and Create if it does not

In [7]:
# check if JSON_FILE exist
file_exists = os.path.isfile(JSON_FILE)

# if it does not exists
if file_exists == False:
    # create folder
    folder = os.path.dirname(JSON_FILE)
    
    # if JSON_FILE included a folder:
    if len(folder) > 0:
        # create folder
        os.makedirs(folder, exist_ok = True)
    # inform user and save empty list
    print(f'[i] {JSON_FILE} not found. Saving empty list to file')
    #save the first page of results
    with open(JSON_FILE, 'w') as f:
        json.dump([], f)

# if it dies exists
else:
    print(f'[i] {JSON_FILE} already exist')

[i] Data/Los Angeles_Shabu Shabu.json already exist


## Make the first API call to get the first page to data

In [8]:
results = yelp_api.search_query(term = term, location = location)

In [9]:
type(results)

dict

In [10]:
results.keys()

dict_keys(['businesses', 'total', 'region'])

In [11]:
results['region']

{'center': {'longitude': -118.41064453125, 'latitude': 34.02010806957986}}

In [12]:
results['total']

576

In [13]:
results['businesses']

[{'id': 'tGusz6xqBBABc1awGP1aLA',
  'alias': 'shabuya-los-angeles-2',
  'name': 'Shabuya',
  'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/Iry_2AEmQf3VCQSFY7PqXw/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/shabuya-los-angeles-2?adjust_creative=u67t0z_UFGQjtCvI1hRZUg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=u67t0z_UFGQjtCvI1hRZUg',
  'review_count': 925,
  'categories': [{'alias': 'seafood', 'title': 'Seafood'},
   {'alias': 'hotpot', 'title': 'Hot Pot'},
   {'alias': 'korean', 'title': 'Korean'}],
  'rating': 4.0,
  'coordinates': {'latitude': 34.05156, 'longitude': -118.278207},
  'transactions': ['delivery'],
  'price': '$$',
  'location': {'address1': '1925 W Olympic Blvd',
   'address2': '',
   'address3': None,
   'city': 'Los Angeles',
   'zip_code': '90006',
   'country': 'US',
   'state': 'CA',
   'display_address': ['1925 W Olympic Blvd', 'Los Angeles, CA 90006']},
  'phone': '+12133185635',
  'display_phone': '(213) 31

### How many results in total?

In [14]:
pd.DataFrame(results['businesses'])

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,coordinates,transactions,price,location,phone,display_phone,distance
0,tGusz6xqBBABc1awGP1aLA,shabuya-los-angeles-2,Shabuya,https://s3-media2.fl.yelpcdn.com/bphoto/Iry_2A...,False,https://www.yelp.com/biz/shabuya-los-angeles-2...,925,"[{'alias': 'seafood', 'title': 'Seafood'}, {'a...",4.0,"{'latitude': 34.05156, 'longitude': -118.278207}",[delivery],$$,"{'address1': '1925 W Olympic Blvd', 'address2'...",12133185635,(213) 318-5635,12694.795397
1,A0pgVbh53BPoqJUPRwx4ZA,bon-shabu-los-angeles-5,Bon Shabu,https://s3-media4.fl.yelpcdn.com/bphoto/s-tAz2...,False,https://www.yelp.com/biz/bon-shabu-los-angeles...,133,"[{'alias': 'hotpot', 'title': 'Hot Pot'}, {'al...",4.0,"{'latitude': 34.0613422, 'longitude': -118.299...",[restaurant_reservation],,"{'address1': '3454 Wilshire Blvd', 'address2':...",12133185004,(213) 318-5004,11201.847844
2,32Wd-O6flvq5vzyUaerX9A,aki-shabu-ktown-los-angeles-2,Aki Shabu Ktown,https://s3-media2.fl.yelpcdn.com/bphoto/JJBt3X...,False,https://www.yelp.com/biz/aki-shabu-ktown-los-a...,91,"[{'alias': 'hotpot', 'title': 'Hot Pot'}]",4.5,"{'latitude': 34.062775332045454, 'longitude': ...","[pickup, delivery]",$$$,"{'address1': '621 S Western Ave 301', 'address...",12135294011,(213) 529-4011,10459.559699
3,v9RZZLxE39Zx9L0mXB8kww,shabu-shabu-house-los-angeles,Shabu Shabu House,https://s3-media1.fl.yelpcdn.com/bphoto/rlXR7Z...,False,https://www.yelp.com/biz/shabu-shabu-house-los...,1462,"[{'alias': 'japanese', 'title': 'Japanese'}, {...",4.0,"{'latitude': 34.049034, 'longitude': -118.240347}",[delivery],$$,{'address1': '127 Japanese Village Plaza Mall'...,12136803890,(213) 680-3890,16017.032628
4,7BZjZLN-YDmFb4Y3N-3aJw,shabushi-los-angeles,ShaBuShi,https://s3-media2.fl.yelpcdn.com/bphoto/YI0_NI...,False,https://www.yelp.com/biz/shabushi-los-angeles?...,538,"[{'alias': 'japanese', 'title': 'Japanese'}, {...",4.0,"{'latitude': 34.0982549, 'longitude': -118.302...","[pickup, delivery]",$$,"{'address1': '5185 Sunset Blvd', 'address2': '...",13235226457,(323) 522-6457,13209.536437
5,rfVUBEy2i9aHsKjsJbqZwA,haidilao-hot-pot-century-city-los-angeles,Haidilao Hot Pot Century City,https://s3-media3.fl.yelpcdn.com/bphoto/mLdfJr...,False,https://www.yelp.com/biz/haidilao-hot-pot-cent...,731,"[{'alias': 'hotpot', 'title': 'Hot Pot'}]",4.5,"{'latitude': 34.058799, 'longitude': -118.41932}","[restaurant_reservation, delivery, pickup]",,"{'address1': '10250 Santa Monica Blvd', 'addre...",14243821234,(424) 382-1234,4390.768236
6,KyppDi1Hdg2G1F_No30Bpw,mizu-212-los-angeles,Mizu 212,https://s3-media1.fl.yelpcdn.com/bphoto/EbvMKt...,False,https://www.yelp.com/biz/mizu-212-los-angeles?...,640,"[{'alias': 'japanese', 'title': 'Japanese'}, {...",3.5,"{'latitude': 34.0409879964362, 'longitude': -1...","[pickup, delivery]",$$$,"{'address1': '2000 Sawtelle Blvd', 'address2':...",13102352120,(310) 235-2120,3813.509708
7,BPgbSNw9tPUSaZpuhjVgfA,joon-shabu-shabu-glendale-3,Joon Shabu Shabu,https://s3-media2.fl.yelpcdn.com/bphoto/f0YIH8...,False,https://www.yelp.com/biz/joon-shabu-shabu-glen...,1274,"[{'alias': 'hotpot', 'title': 'Hot Pot'}, {'al...",4.5,"{'latitude': 34.14611, 'longitude': -118.2528}",[delivery],$$,"{'address1': '220 E Broadway', 'address2': '',...",18184845552,(818) 484-5552,20189.445774
8,91-q3tw6-zBrIH2Tvgf6ow,momo-paradise-torrance,Momo Paradise,https://s3-media2.fl.yelpcdn.com/bphoto/QtXWPP...,False,https://www.yelp.com/biz/momo-paradise-torranc...,1084,"[{'alias': 'japanese', 'title': 'Japanese'}, {...",4.0,"{'latitude': 33.832087, 'longitude': -118.309605}","[pickup, delivery]",$$,"{'address1': '21641 S Western Ave', 'address2'...",13107813052,(310) 781-3052,22889.228836
9,_8Tvjpgu56ioJf0ohHzUUg,seoul-garden-restaurant-los-angeles,Seoul Garden Restaurant,https://s3-media1.fl.yelpcdn.com/bphoto/w9xDfm...,False,https://www.yelp.com/biz/seoul-garden-restaura...,586,"[{'alias': 'korean', 'title': 'Korean'}, {'ali...",4.0,"{'latitude': 34.0507689, 'longitude': -118.277...","[pickup, delivery]",$$,"{'address1': '1833 W Olympic Blvd', 'address2'...",12133868477,(213) 386-8477,12756.447344


In [15]:
# results per page
results_per_page = len(results['businesses'])
results_per_page

20

In [16]:
# number of pages
# math.ceil to round up for the total number of results
n_pages = math.ceil(results['total'] / results_per_page)
n_pages

29

In [17]:
!pip install tqdm



In [21]:
# get the rest of the data ( because we only have 20 results)
for i in tqdm_notebook(range(1, n_pages+1)):
    # block of code we want to try
    try:
        time.sleep(.2) # [!] need to know this
        # read results in progress file and check the length
        with open(JSON_FILE, 'r') as f:
            previous_results = json.load(f)
        
        # save number of results for the use as offset
        n_results = len(previous_results)
        
        # use n_results as the OFFSET
        results = yelp_api.search_query(term = term, location = location,
                                       offset = n_results+1)
        
        #append new results and save to file
        previous_results.extend(results['businesses'])
        
        with open(JSON_FILE, 'w') as f:
            json.dump(previous_results, f)
    # what to do on errors
    except Exception as e:
        print('[!] Error', e)

  0%|          | 0/29 [00:00<?, ?it/s]

## Open the Final JSON file with Pandas

In [22]:
df = pd.read_json(JSON_FILE)

In [23]:
df.head()

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,coordinates,transactions,location,phone,display_phone,distance,price
0,A0pgVbh53BPoqJUPRwx4ZA,bon-shabu-los-angeles-5,Bon Shabu,https://s3-media4.fl.yelpcdn.com/bphoto/s-tAz2...,False,https://www.yelp.com/biz/bon-shabu-los-angeles...,133,"[{'alias': 'hotpot', 'title': 'Hot Pot'}, {'al...",4.0,"{'latitude': 34.0613422, 'longitude': -118.299...",[restaurant_reservation],"{'address1': '3454 Wilshire Blvd', 'address2':...",12133185004,(213) 318-5004,11201.847844,
1,32Wd-O6flvq5vzyUaerX9A,aki-shabu-ktown-los-angeles-2,Aki Shabu Ktown,https://s3-media2.fl.yelpcdn.com/bphoto/JJBt3X...,False,https://www.yelp.com/biz/aki-shabu-ktown-los-a...,91,"[{'alias': 'hotpot', 'title': 'Hot Pot'}]",4.5,"{'latitude': 34.062775332045454, 'longitude': ...","[pickup, delivery]","{'address1': '621 S Western Ave 301', 'address...",12135294011,(213) 529-4011,10459.559699,$$$
2,v9RZZLxE39Zx9L0mXB8kww,shabu-shabu-house-los-angeles,Shabu Shabu House,https://s3-media1.fl.yelpcdn.com/bphoto/rlXR7Z...,False,https://www.yelp.com/biz/shabu-shabu-house-los...,1462,"[{'alias': 'japanese', 'title': 'Japanese'}, {...",4.0,"{'latitude': 34.049034, 'longitude': -118.240347}",[delivery],{'address1': '127 Japanese Village Plaza Mall'...,12136803890,(213) 680-3890,16017.032628,$$
3,7BZjZLN-YDmFb4Y3N-3aJw,shabushi-los-angeles,ShaBuShi,https://s3-media2.fl.yelpcdn.com/bphoto/YI0_NI...,False,https://www.yelp.com/biz/shabushi-los-angeles?...,538,"[{'alias': 'japanese', 'title': 'Japanese'}, {...",4.0,"{'latitude': 34.0982549, 'longitude': -118.302...","[pickup, delivery]","{'address1': '5185 Sunset Blvd', 'address2': '...",13235226457,(323) 522-6457,13209.536437,$$
4,rfVUBEy2i9aHsKjsJbqZwA,haidilao-hot-pot-century-city-los-angeles,Haidilao Hot Pot Century City,https://s3-media3.fl.yelpcdn.com/bphoto/mLdfJr...,False,https://www.yelp.com/biz/haidilao-hot-pot-cent...,731,"[{'alias': 'hotpot', 'title': 'Hot Pot'}]",4.5,"{'latitude': 34.058799, 'longitude': -118.41932}","[pickup, restaurant_reservation, delivery]","{'address1': '10250 Santa Monica Blvd', 'addre...",14243821234,(424) 382-1234,4390.768236,


In [24]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 575 entries, 0 to 574
Data columns (total 16 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   id             575 non-null    object 
 1   alias          575 non-null    object 
 2   name           575 non-null    object 
 3   image_url      575 non-null    object 
 4   is_closed      575 non-null    bool   
 5   url            575 non-null    object 
 6   review_count   575 non-null    int64  
 7   categories     575 non-null    object 
 8   rating         575 non-null    float64
 9   coordinates    575 non-null    object 
 10  transactions   575 non-null    object 
 11  location       575 non-null    object 
 12  phone          575 non-null    object 
 13  display_phone  575 non-null    object 
 14  distance       575 non-null    float64
 15  price          546 non-null    object 
dtypes: bool(1), float64(2), int64(1), object(12)
memory usage: 68.1+ KB


In [25]:
# convert filename to a .csv.gz
csv_file = JSON_FILE.replace('.json', '.csv.gz')
csv_file

'Data/Los Angeles_Shabu Shabu.csv.gz'

In [26]:
# save as a compressed csv (to save space)
df.to_csv(csv_file, compression = 'gzip', index = False)