# Yelp Project
Today we will mine Yelp data to find the zip codes with the highest-rated restaurants for different categories of food, using the Yelp Fusion API. An API (application programming interface) lets developers interact directly with a server and makes it easy to request specific data. Fusion is the third generation of Yelp's APIs, and it is well documented.

## Make a Yelp account
1. Visit the [Yelp Fusion home page](https://www.yelp.com/developers/documentation/v3) and click the button for creating an app.
1. It's not important what you name your app; enter anything.
1. After the app is created you will get both a client id and a client secret. Yelp uses these to keep track of how you use their API.
1. Copy and paste the client id and client secret into the strings below.

## Authenticate Yourself

Before you can use the API, you must authenticate yourself using the standard OAuth2 protocol. You will make a POST request with your id and secret, and Yelp will respond with an access token. Once you have your access token, you can start using the API.

In [2]:
import pandas as pd
import requests

In [3]:
# replace CLIENT_ID and CLIENT_SECRET with your info
app_id = 'CLIENT_ID'
app_secret = 'CLIENT_SECRET'

# don't edit these lines
data = {'grant_type': 'client_credentials',
        'client_id': app_id,
        'client_secret': app_secret}
token = requests.post('https://api.yelp.com/oauth2/token', data=data)
access_token = token.json()['access_token']

### Using the API
The Yelp API is fairly easy to use. There are about six different **endpoints** that you can use. An endpoint is a URL that you make a web request to. Along with the endpoint you send a list of parameters that specify the results that you would like. This project will use the business search endpoint. It has a url of **https://api.yelp.com/v3/businesses/search**. 

There are about a dozen parameters you can use to specify an exact search. Check the [search documentation](https://www.yelp.com/developers/documentation/v3/business_search) for more detail.
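
To make the relationship between an endpoint and its parameters concrete, here is a minimal sketch that builds the full request URL by hand with the standard library. The parameter values are the ones used for the first search in this notebook. `requests` performs roughly this encoding for you when you pass `params=`, so the sketch is only meant to show what ends up in the URL.

```python
from urllib.parse import urlencode

# Business search endpoint and the parameters used for the first search
url = 'https://api.yelp.com/v3/businesses/search'
params = {'location': 'Houston',
          'categories': 'italian',
          'limit': '50',
          'sort_by': 'rating',
          'price': '1,2,3'}

# urlencode turns the dictionary into a query string;
# the commas in the price value are percent-encoded as %2C
query_string = urlencode(params)
full_url = f'{url}?{query_string}'
print(full_url)
```

Each key/value pair becomes one `key=value` piece of the query string, joined by `&` and appended to the endpoint after a `?`.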

### Your first search
The following API call to the business search endpoint searches for the top 50 Italian restaurants in Houston, sorted by rating, with a price level of 1, 2 or 3 (the highest price level is 4). It makes a web request, so it will take a second or so.

In [4]:
# don't edit these lines
url = 'https://api.yelp.com/v3/businesses/search'
headers = {'Authorization': f'bearer {access_token}'}

# change these to make an API call
params = {'location': 'Houston',
          'categories':'italian',
          'limit':'50',
          'sort_by':'rating',
          'price':'1,2,3'
         }

resp = requests.get(url=url, params=params, headers=headers)

### Examine response
Look back at the documentation and you will see a sample response. Most APIs respond with JSON data, which converts directly into a Python dictionary. JSON data is hierarchical and nested, similar to how your file system stores files and directories.
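
As a small illustration of that nesting, the sketch below parses a hand-written JSON string with the standard-library `json` module and walks down to a nested value. The string is only shaped like a piece of a search response; its values are made up for the example and are not real API output.

```python
import json

# A hand-written JSON string shaped like a small piece of a search response
# (illustrative values, not real API output)
raw = ('{"total": 1, "businesses": [{"name": "Example Cafe", '
       '"location": {"city": "Houston", "zip_code": "77098"}}]}')

# json.loads converts the JSON text into nested Python dictionaries and lists
parsed = json.loads(raw)

# Walk down the hierarchy one level at a time, like folders in a file system
first = parsed['businesses'][0]
zip_code = first['location']['zip_code']

print(type(parsed).__name__)
print(zip_code)
```

JSON objects become dictionaries and JSON arrays become lists, so each level is reached with either a key or an index.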

### Convert response into Python dictionary
The following command converts the response into a Python dictionary.

In [5]:
# data is now a Python dictionary 
data = resp.json()

### The data keys
The entire response now resides in the **`data`** Python variable, which is a dictionary. There are three keys in the dictionary.

In [12]:
data.keys()

dict_keys(['businesses', 'total', 'region'])

### The businesses key
The main data resides under the **`businesses`** key. The number of results is found under **`total`**, and the latitude/longitude of the center of the search region is stored under **`region`**.

In [13]:
data['total']

40

In [14]:
data['region']

{'center': {'latitude': 29.763025675716687, 'longitude': -95.35995483398438}}

In [18]:
# Since data['businesses'] is a list, this will output the first 3 restaurants
data['businesses'][:3]

[{'categories': [{'alias': 'italian', 'title': 'Italian'},
   {'alias': 'pastashops', 'title': 'Pasta Shops'}],
  'coordinates': {'latitude': 29.73824, 'longitude': -95.412087},
  'display_phone': '(713) 528-1329',
  'distance': 5737.959021541999,
  'id': 'fabios-fresh-pasta-houston-3',
  'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/-8G1sHB7qROEcq7qC4bmng/o.jpg',
  'is_closed': False,
  'location': {'address1': '2129 W Alabama',
   'address2': '',
   'address3': '',
   'city': 'Houston',
   'country': 'US',
   'display_address': ['2129 W Alabama', 'Houston, TX 77098'],
   'state': 'TX',
   'zip_code': '77098'},
  'name': "Fabio's Fresh Pasta",
  'phone': '+17135281329',
  'price': '$$',
  'rating': 5.0,
  'review_count': 169,
  'transactions': [],
  'url': 'https://www.yelp.com/biz/fabios-fresh-pasta-houston-3?adjust_creative=GpXHZzZXfAL6u6beIC0cIA&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GpXHZzZXfAL6u6beIC0cIA'},
 {'categories': [{'alias': 'desser

### A list of restaurants (dictionaries)
By examining the output of the **`businesses`** key we see a list. This list is composed of dictionaries that contain the actual restaurant data.

In [19]:
# the 'businesses' key contains a list of restaurants (dictionaries)
type(data['businesses'])

list

In [20]:
# we specified 50 results returned
len(data['businesses'])

50

### Look at the first restaurant

In [21]:
# Let's go to this restaurant!
data['businesses'][0]

{'categories': [{'alias': 'italian', 'title': 'Italian'},
  {'alias': 'pastashops', 'title': 'Pasta Shops'}],
 'coordinates': {'latitude': 29.73824, 'longitude': -95.412087},
 'display_phone': '(713) 528-1329',
 'distance': 5737.959021541999,
 'id': 'fabios-fresh-pasta-houston-3',
 'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/-8G1sHB7qROEcq7qC4bmng/o.jpg',
 'is_closed': False,
 'location': {'address1': '2129 W Alabama',
  'address2': '',
  'address3': '',
  'city': 'Houston',
  'country': 'US',
  'display_address': ['2129 W Alabama', 'Houston, TX 77098'],
  'state': 'TX',
  'zip_code': '77098'},
 'name': "Fabio's Fresh Pasta",
 'phone': '+17135281329',
 'price': '$$',
 'rating': 5.0,
 'review_count': 169,
 'transactions': [],
 'url': 'https://www.yelp.com/biz/fabios-fresh-pasta-houston-3?adjust_creative=GpXHZzZXfAL6u6beIC0cIA&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GpXHZzZXfAL6u6beIC0cIA'}
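
Because every restaurant is a dictionary with the same layout, a loop can pull the same few fields out of each one. The sketch below uses a trimmed stand-in for `data['businesses']`: the first entry copies fields from the output above, and the second is a made-up placeholder with no `price` key, included to show why `.get` is useful for optional fields.

```python
# A trimmed stand-in for data['businesses']; the first entry copies fields from
# the output above, the second is a made-up placeholder for illustration
businesses = [
    {'name': "Fabio's Fresh Pasta", 'rating': 5.0, 'price': '$$',
     'location': {'zip_code': '77098'}},
    {'name': 'Hypothetical Trattoria', 'rating': 4.5,
     'location': {'zip_code': '77006'}},
]

summary = []
for business in businesses:
    summary.append((business['name'],
                    business['rating'],
                    business.get('price'),            # None when 'price' is absent
                    business['location']['zip_code']))  # nested dictionary lookup

for entry in summary:
    print(entry)
```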

# Your turn
It's now your turn to make a completely different request and examine the results. Use the Yelp documentation to make a different search.

In [183]:
# don't edit these lines
url = 'https://api.yelp.com/v3/businesses/search'
headers = {'Authorization': f'bearer {access_token}'}

# CHANGE THESE PARAMETERS to make the search
params = {'location': 'Houston',
          'categories':'italian',
          'limit':'50',
          'sort_by':'rating',
          'price':'1,2,3'
         }

resp = requests.get(url=url, params=params, headers=headers)

# data is now a Python dictionary that contains all the data
data = resp.json()

### Use the next few lines to examine the results

# Project: Find the best restaurants at the best price for each zip code
We will now walk through in class how to turn the JSON data into a pandas DataFrame in order to answer more interesting questions like finding the best restaurants at the best price for each zip code.

In [140]:
with open('zip_codes.txt', 'r') as f:
    zip_codes = [int(line.strip()) for line in f.readlines()]

In [106]:
# this will take a long time. You might want to use fewer zip codes
all_restaurants = []
for zip_code in zip_codes:
    params = {'location': f'{zip_code}',
              'categories':'restaurants',
              'offset':'0',
              'limit':'50'
             }

    resp = requests.get(url=url, params=params, headers=headers)
    # .get avoids a KeyError when a request fails and returns no 'businesses' key
    cur_data = resp.json().get('businesses', [])
    all_restaurants.extend(cur_data)
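
The loop above asks only for the first page of results (`offset` of `'0'`, `limit` of `'50'`) for each zip code. If you wanted more than 50 restaurants per zip code, one possible approach is to repeat the request with increasing offsets. The helper below is a hypothetical sketch that only computes those offsets. `page_offsets` and its `max_results` cap are assumptions for illustration, so check the search documentation for the real limit before relying on it.

```python
def page_offsets(total, limit=50, max_results=1000):
    """Return the offsets needed to page through `total` results.

    `max_results` is an assumed cap on how deep the API lets you page;
    check the search documentation for the real value.
    """
    reachable = min(total, max_results)
    return list(range(0, reachable, limit))


print(page_offsets(120))  # [0, 50, 100]
print(page_offsets(40))   # [0]
print(page_offsets(0))    # []
```

Each offset would replace the `'0'` in the `offset` parameter, with `total` taken from the `total` key of the first response for that zip code.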

In [147]:
rows = []
for restaurant in all_restaurants:
    row = {}
    row['Category'] = restaurant['categories'][0]['title']
    row['Latitude'] = restaurant['coordinates']['latitude']
    row['Longitude'] = restaurant['coordinates']['longitude']
    row['Phone'] = restaurant['display_phone']
    row['ID'] = restaurant['id']
    row['image_url'] = restaurant['image_url']
    row['Address'] = restaurant['location']['address1']
    row['City'] = restaurant['location']['city']
    row['State'] = restaurant['location']['state']
    row['Zip Code'] = restaurant['location']['zip_code']
    row['Name'] = restaurant['name']
    row['Price'] = restaurant.get('price', None)
    row['Rating'] = restaurant['rating']
    row['Review Count'] = restaurant['review_count']
    row['URL'] = restaurant['url']
    
    rows.append(row)

In [154]:
df_restaurants = pd.DataFrame(rows)
df_restaurants = df_restaurants.drop_duplicates()

In [185]:
df_restaurants.head(10)

| Unnamed: 0 | Address | Category | City | ID | Latitude | Longitude | Name | Phone | Price | Rating | Review Count | State | URL | Zip Code | image_url |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 607 South Friendswood Dr | Brasseries | Friendswood | brasserie-1895-friendswood | 29.527583 | -95.196976 | Brasserie 1895 | (832) 385-2278 | \$\$ | 4.5 | 77 | TX | https://www.yelp.com/biz/brasserie-1895-friend... | 77546 | https://s3-media2.fl.yelpcdn.com/bphoto/frI1jM... |
| 1 | 709 W Parkwood | Italian | Friendswood | amici-restaurant-friendswood | 29.505117 | -95.192062 | Amici Restaurant | (832) 569-5736 | \$\$ | 4.5 | 90 | TX | https://www.yelp.com/biz/amici-restaurant-frie... | 77546 | https://s3-media4.fl.yelpcdn.com/bphoto/za6Swg... |
| 2 | 5105 Fm 2351 Rd | Tex-Mex | Friendswood | habaneros-tex-mex-friendswood-2 | 29.545499 | -95.193596 | Habaneros Tex-Mex | (832) 569-2289 | \$ | 4.5 | 69 | TX | https://www.yelp.com/biz/habaneros-tex-mex-fri... | 77546 | https://s3-media4.fl.yelpcdn.com/bphoto/Nx_Jme... |
| 3 | 111 S Friendswood Dr | Bars | Friendswood | friends-uncorked-friendswood | 29.532714 | -95.204429 | Friends Uncorked | (281) 648-1707 | \$\$ | 4.0 | 22 | TX | https://www.yelp.com/biz/friends-uncorked-frie... | 77546 | https://s3-media1.fl.yelpcdn.com/bphoto/VED4G4... |
| 4 | 3640 E Fm 528 Rd | Vietnamese | Friendswood | nobi-asian-grill-friendswood | 29.522029 | -95.169271 | Nobi Asian Grill | (281) 482-6624 | \$ | 4.0 | 170 | TX | https://www.yelp.com/biz/nobi-asian-grill-frie... | 77546 | https://s3-media3.fl.yelpcdn.com/bphoto/i1cX_D... |
| 5 | 400 W Parkwood | Mexican | Friendswood | la-escondida-mexican-grill-friendswood | 29.508131 | -95.191543 | La Escondida Mexican Grill | (832) 569-5785 | \$\$ | 4.0 | 58 | TX | https://www.yelp.com/biz/la-escondida-mexican-... | 77546 | https://s3-media2.fl.yelpcdn.com/bphoto/t7qoYJ... |
| 6 | 700 Baybrook Mall | Cajun/Creole | Webster | the-rouxpour-webster | 29.546088 | -95.149401 | The Rouxpour | (281) 480-4052 | \$\$ | 3.5 | 117 | TX | https://www.yelp.com/biz/the-rouxpour-webster?... | 77546 | https://s3-media4.fl.yelpcdn.com/bphoto/Yw_evF... |
| 7 | 6011 W Main St | Breakfast & Brunch | League City | red-oak-cafe-league-city | 29.488402 | -95.157472 | Red Oak Cafe | (832) 905-3150 | \$\$ | 4.0 | 140 | TX | https://www.yelp.com/biz/red-oak-cafe-league-c... | 77573 | https://s3-media2.fl.yelpcdn.com/bphoto/4LHERD... |
| 8 | 700 Baybrook Mall | American (New) | Friendswood | yard-house-friendswood-2 | 29.546362 | -95.150104 | Yard House | (281) 282-9273 | \$\$ | 3.5 | 170 | TX | https://www.yelp.com/biz/yard-house-friendswoo... | 77546 | https://s3-media1.fl.yelpcdn.com/bphoto/YM91-c... |
| 9 | 700 Baybrook Mall | Salvadoran | Friendswood | glorias-latin-cuisine-friendswood-2 | 29.545375 | -95.14933 | Gloria's Latin Cuisine | (281) 667-9869 | \$\$ | 4.0 | 38 | TX | https://www.yelp.com/biz/glorias-latin-cuisine... | 77546 | https://s3-media3.fl.yelpcdn.com/bphoto/25kk3S... |
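
With the restaurants in tabular form, the project question comes down to grouping by zip code and price and then averaging the ratings. The sketch below spells out that grouping with only the standard library so each step is visible. It uses five rows copied from the table above (name, zip code, price and rating only). In the notebook itself, something like `df_restaurants.groupby(['Zip Code', 'Price'])['Rating'].mean()` should give a roughly equivalent result; this is one possible approach, not necessarily the one used in class.

```python
from collections import defaultdict
from statistics import mean

# Five rows copied from the table above, keeping only the columns we need
rows = [
    {'Name': 'Brasserie 1895', 'Zip Code': '77546', 'Price': '$$', 'Rating': 4.5},
    {'Name': 'Amici Restaurant', 'Zip Code': '77546', 'Price': '$$', 'Rating': 4.5},
    {'Name': 'Habaneros Tex-Mex', 'Zip Code': '77546', 'Price': '$', 'Rating': 4.5},
    {'Name': 'Nobi Asian Grill', 'Zip Code': '77546', 'Price': '$', 'Rating': 4.0},
    {'Name': 'Red Oak Cafe', 'Zip Code': '77573', 'Price': '$$', 'Rating': 4.0},
]

# Collect the ratings for every (zip code, price) combination
ratings = defaultdict(list)
for row in rows:
    ratings[(row['Zip Code'], row['Price'])].append(row['Rating'])

# Average the ratings within each group
avg_rating = {key: mean(values) for key, values in ratings.items()}

for (zip_code, price), value in sorted(avg_rating.items()):
    print(zip_code, price, value)
```

With only a handful of rows the averages are not meaningful, but the same grouping applied to the full set of restaurants is what lets you compare zip codes at each price level.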
