# Yelp API Restaurant Calls By City

For each city listed in cities.csv, retrieve all restaurants found via the Yelp API and
extract the following data:

- name
- address & zip
- coordinates
- rating
- review count
- price level
- category
- yelp id


From the above data, also determine the following for each city:

- total count of restaurants
- distribution of categories (e.g. 10% Pizza, 20% Sushi, etc.)
- ratio of price levels (i.e. count of restaurants at higher price levels divided by count at lower price levels)
- total count of restaurants above a given rating
- concentration of restaurants (i.e. total count divided by the city's area in sq. mi)
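
As a rough illustration of the per-city metrics listed above, here is a minimal pandas sketch. The restaurant rows are made up for the example; only the two area values come from the cities table. The split between "pricier" (`$$$`, `$$$$`) and "lower" (`$`, `$$`) price levels, the `MIN_RATING` threshold, and treating that threshold as inclusive are all assumptions, not choices confirmed by this notebook.

```python
import pandas as pd

# Made-up sample rows; real data would come from restaurant_data.csv
restaurants = pd.DataFrame({
    "City": ["Albany", "Albany", "Albany", "Albany", "Belmont", "Belmont"],
    "Category": ["Pizza", "Pizza", "Sushi Bars", "Thai", "Pizza", "Mexican"],
    "Price Level": ["$", "$$", "$$$", "$$", "$", "$$$$"],
    "Rating": [4.5, 3.5, 4.0, 4.5, 3.0, 4.0],
})

# City areas in sq. mi, as listed in the cities table
area = pd.Series({"Albany": 1.79, "Belmont": 4.62})

MIN_RATING = 4.0  # assumed rating threshold (inclusive)

city = restaurants["City"]

# Total count of restaurants per city
total = restaurants.groupby("City").size()

# Category distribution as a share of each city's restaurants
category_share = restaurants.groupby("City")["Category"].value_counts(normalize=True)

# Price ratio: count of "$$$"/"$$$$" divided by count of "$"/"$$" (assumed split)
pricey = restaurants["Price Level"].isin(["$$$", "$$$$"]).groupby(city).sum()
cheap = restaurants["Price Level"].isin(["$", "$$"]).groupby(city).sum()
price_ratio = pricey / cheap

# Count of restaurants at or above the rating threshold
well_rated = (restaurants["Rating"] >= MIN_RATING).groupby(city).sum()

# Concentration: restaurants per square mile
concentration = total / area

summary = pd.DataFrame({
    "Total": total,
    "Price Ratio": price_ratio,
    "Well Rated": well_rated,
    "Per Sq. Mi": concentration,
})
print(summary)
print(category_share)
```

Grouping boolean masks by city keeps each metric a one-liner and makes the assumptions (price split, rating cutoff) easy to change in one place.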


In [131]:
# Dependencies
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import requests
import time
import json
import csv

# Import API key
from api_keys import api_key

### Perform API calls

- Create dataframe from cities.csv
- Build function that calls and writes restaurant data to new csv, given a city
- Run list of cities through function

In [132]:
# Create dataframe from csv file
cities_df = pd.read_csv("cities.csv", names=["City", "County", "Population", "Area (sq. mi)"])
cities_df

| | City | County | Population | Area (sq. mi) |
|---|---|---|---|---|
| 0 | Alameda | Alameda | 73812 | 10.61 |
| 1 | Albany | Alameda | 18539 | 1.79 |
| 2 | American Canyon | Napa | 19454 | 4.84 |
| 3 | Antioch | Contra Costa | 102372 | 28.35 |
| 4 | Atherton | San Mateo | 6914 | 5.02 |
| 5 | Belmont | San Mateo | 25835 | 4.62 |
| 6 | Belvedere | Marin | 2068 | 0.52 |
| 7 | Benicia | Solano | 26997 | 12.93 |
| 8 | Berkeley | Alameda | 112580 | 10.47 |
| 9 | Brentwood | Contra Costa | 51481 | 14.79 |


In [146]:
# Function writes up to 1000 restaurant listings for input city to csv
# and returns the total number of matches reported by the API
def get_restaurants(city, api_key):
    
    url = "https://api.yelp.com/v3/businesses/search"
    headers = {"Authorization": "Bearer %s" % api_key}
    restaurant_data = []
    count = 0
    total = 0
    
    # Page through results 50 at a time (offsets 0, 50, ..., 950)
    for offset in range(0, 1000, 50):
        
        # Set parameters and pass into API call
        params = {"term": "restaurants", "location": city + ", CA", "limit": 50, "offset": offset}
        req = requests.get(url, params=params, headers=headers)
        
        # Stop paging if the request was not successful
        if req.status_code != 200:
            print(f"Request for {city} returned status {req.status_code}, stopping")
            break
        
        # Convert to json
        response = req.json()
        total = response["total"]
        
        # Log history
        count += 1
        print(f"Now processing query set {count} of approx 20 for {city}")
        
        # Breaks if no further entries in query
        if not response["businesses"]:
            break
        
        # Iterate through business results and extract data
        for biz in response["businesses"]:
            
            # Replace missing price level or category data with empty string
            price = biz.get("price", "")
            category = biz["categories"][0]["title"] if biz["categories"] else ""
            
            restaurant_data.append([city, biz["name"], biz["coordinates"]["latitude"],
                                    biz["coordinates"]["longitude"], biz["location"]["address1"],
                                    biz["location"]["zip_code"], biz["rating"],
                                    biz["review_count"], price, category, biz["id"]])

    # Append to csv (the with block closes the file; newline="" avoids blank rows on Windows)
    with open('restaurant_data.csv', 'a', encoding="utf-8", newline="") as csvFile:
        writer = csv.writer(csvFile)
        writer.writerows(restaurant_data)
    
    # Returns total count of restaurants in city (0 if no request succeeded)
    return total
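
To make the row layout concrete, here is a small sketch that flattens a single business record into the eleven-column row written to the csv by the function above. The helper name `biz_to_row` and the sample record are hypothetical. The record is made up and contains only the fields the function reads, with no `price` key and an empty `categories` list so that both fallbacks are exercised.

```python
def biz_to_row(city, biz):
    """Flatten one business record into the csv row layout (hypothetical helper)."""
    categories = biz.get("categories") or []
    return [
        city,
        biz["name"],
        biz["coordinates"]["latitude"],
        biz["coordinates"]["longitude"],
        biz["location"]["address1"],
        biz["location"]["zip_code"],
        biz["rating"],
        biz["review_count"],
        biz.get("price", ""),                          # missing price -> ""
        categories[0]["title"] if categories else "",  # missing category -> ""
        biz["id"],
    ]


# Made-up record for illustration only
sample = {
    "id": "example-id",
    "name": "Example Diner",
    "coordinates": {"latitude": 37.89, "longitude": -122.3},
    "location": {"address1": "1 Example Ave", "zip_code": "94706"},
    "rating": 4.0,
    "review_count": 12,
    "categories": [],
}

row = biz_to_row("Albany", sample)
print(row)
```

Handling the two optional fields independently means a record that lacks both a price and a category still yields a complete row.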


In [143]:
# Track # of cities processed
count = 0

# List to track total restaurants found in city
totals_count = []

print("LOG HISTORY OF API CALLS:")
print("---------------------------")

# Loop through list of cities in cities_df
for city in cities_df["City"]:
    
    # Call get_restaurants fn and append result to totals_count list
    totals_count.append(get_restaurants(city, api_key))
    
    # Print log history, based on the number of cities remaining
    count += 1
    rem = len(cities_df["City"]) - count
    print("-----------------------------------------")
    if rem == 0:
        print("Full list of cities processed!")
    elif rem == 1:
        print("Now getting results for final city. Almost there!")
        print("-----------------------------------------")
    else:
        print(f"Data retrieval for {city} complete")
        print(f"Getting results for next city.. there are {rem} cities left")
        print("-----------------------------------------")

cities_df["Total # of restaurants"] = totals_count

LOG HISTORY OF API CALLS:
---------------------------
Now processing query set 1 of approx 20 for Benicia
Now processing query set 2 of approx 20 for Benicia
Now processing query set 3 of approx 20 for Benicia
Now processing query set 4 of approx 20 for Benicia
Now processing query set 5 of approx 20 for Benicia
Now processing query set 6 of approx 20 for Benicia
Now processing query set 7 of approx 20 for Benicia
Now processing query set 8 of approx 20 for Benicia
Now processing query set 9 of approx 20 for Benicia


356
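
The log above stops at nine query sets rather than twenty because the loop breaks on the first empty page. Assuming the 356 printed above is the `total` reported for that city, the number of pages that actually contain results can be worked out ahead of time. The sketch below is only an illustration; `page_offsets` is a hypothetical helper, and the 50-per-page limit and 1000-result ceiling are taken from the `range(0, 1000, 50)` loop in this notebook.

```python
import math

PAGE_SIZE = 50      # "limit" used in the search request
MAX_RESULTS = 1000  # ceiling implied by range(0, 1000, 50)


def page_offsets(total, page_size=PAGE_SIZE, max_results=MAX_RESULTS):
    """Return the offsets of the pages that should contain results."""
    reachable = min(total, max_results)
    pages = math.ceil(reachable / page_size)
    return [page * page_size for page in range(pages)]


offsets = page_offsets(356)
print(len(offsets), offsets[-1])  # 8 pages with results, last offset 350
```

Eight pages hold data for a total of 356, so the ninth logged query set is the empty page that triggers the break.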

In [128]:
# Show updated cities_df
cities_df.head()

| | City | County | Population | Area (sq. mi) | Total # of restaurants |
|---|---|---|---|---|---|
| 0 | Albany | Alameda | 18539.0 | 1.79 | 227 |
| 1 | Colma | | | | 180 |
| 2 | Yountville | | | | 32 |


### Create Dataframe

- Read csv file
- Clean data

In [130]:
# Read csv file (utf-8 matches the encoding used when the file was written)
restaurants_df = pd.read_csv("restaurant_data.csv", encoding="utf-8",
                            names=["City", "Name", "Lat", "Lng", "Address", "Zip", "Rating", "# of Reviews", 
                                   "Price Level", "Category", "Yelp ID"])

# Replace NaN entries with blank string
restaurants_df = restaurants_df.fillna('')
restaurants_df

Unnamed: 0,City,Name,Lat,Lng,Address,Zip,Rating,# of Reviews,Price Level,Category,Yelp ID
0,Zaytoon Mediterranean Restaurant & Bar,37.89052,-122.297730,1133 Solano Ave,94706,4.5,320.0,$$,Mediterranean,g15dMYbefEL-ylCgk0MBbw,
1,310 Eatery,37.89241,-122.299270,747 San Pablo Ave,94706,4.5,648.0,$$,Burgers,1ErPhzdCaoMVSHsXc9TOmQ,
2,Juanita & Maude,37.89099,-122.298800,825 San Pablo Ave,94706,4.5,162.0,$$$$,American (New),jzmCjMb4nJscElnEgtY-Pw,
3,Wojia Hunan Cuisine,37.88954,-122.298340,917 San Pablo Ave,94706,4.5,141.0,$$,Szechuan,shWuD8dJ5wbXppAzEybpgw,
4,Bowl'd Korean Rice Bar,37.8911161,-122.288295,1479 Solano Ave,94706,4,1087.0,$$,Korean,vD0mp-ZGHixwQrdzCcxuGw,
5,DaNang,37.89005,-122.298620,905 San Pablo Ave,94706,4.5,50.0,,Vietnamese,TyIzSjcr0z0jimwLeISMQg,
6,The Hot Shop,37.88998,-122.298590,909 San Pablo Ave,94706,4.5,330.0,$,Mexican,4llVHbcdMPfrZImiiF-D9w,
7,El Mono,37.905185576526,-122.304549,10264 San Pablo Ave,94530,4.5,1308.0,$$,Peruvian,Uq3u_kbGVGLjEDVU45WLog,
8,938 Crawfish,37.8889,-122.298860,938 San Pablo Ave,94706,4,474.0,$$,Seafood,u5-xaRYrBeeVa5VxMzOs2Q,
9,Oori Rice Triangles,37.8907327309419,-122.293740,1247 Solano Ave,94706,4,339.0,$,Japanese,UuC0xMHDZ3xKlbBKSvqSqQ,
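
The "Clean data" step is not settled in this notebook, so the following is only one possible approach, sketched on made-up rows that reuse the column names of `restaurants_df`. The `MIN_REVIEWS` cutoff, the numeric `Price Num` column, and treating a blank address or a "Food Trucks" category as a mobile vendor are all assumptions for illustration.

```python
import pandas as pd

# Made-up rows using the same column names as restaurants_df
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Category": ["Pizza", "Food Trucks", "Thai", "Mexican"],
    "Price Level": ["$$", "", "$", "$$$"],
    "# of Reviews": [320, 12, 3, 50],
    "Address": ["1 Main St", "", "2 Oak Ave", "3 Elm St"],
})

MIN_REVIEWS = 10  # assumed cutoff for "low review counts"

# Encode price level as the number of "$" characters; blank entries become NaN
df["Price Num"] = df["Price Level"].str.len().where(df["Price Level"] != "")

# Treat food trucks and rows without a street address as mobile vendors (assumption)
is_mobile = df["Category"].eq("Food Trucks") | df["Address"].eq("")

# Keep only fixed-location restaurants with enough reviews
enough_reviews = df["# of Reviews"] >= MIN_REVIEWS
cleaned = df[~is_mobile & enough_reviews].reset_index(drop=True)
print(cleaned)
```

Keeping missing prices as NaN rather than dropping those rows means they can still count toward totals and category shares while being excluded from price ratios.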


In [97]:
# Clean data: decide how to treat missing address, zip, and price level,
# food trucks/mobile vendors, and low review counts

restaurants_df["Category"].value_counts()

Pizza                        19
Mexican                      17
Food Trucks                  15
Burgers                      13
Thai                         12
Chinese                      12
Japanese                     11
Indian                        9
Breakfast & Brunch            7
Vietnamese                    7
Coffee & Tea                  6
Bakeries                      5
American (New)                5
Sushi Bars                    5
Himalayan/Nepalese            5
Fast Food                     5
Sandwiches                    4
Szechuan                      4
Mediterranean                 4
Italian                       3
Brazilian                     3
French                        3
Cafes                         3
Hot Dogs                      3
Seafood                       3
Korean                        3
American (Traditional)        3
Vegan                         3
Latin American                2
Delis                         2
Soul Food                     2
Desserts