## Yelp data

You can use the code below to pull Yelp search results for businesses and other establishments in a given neighborhood. It is currently set up to pull all search results for Red Hook, regardless of establishment type. Each API call returns a list of dictionaries, and each dictionary corresponds to one establishment.

Note that to run the code, you will need to enter your own API key in the `API_KEY` cell below. You can also refine the searches by adding specific search terms (like "bar" or "restaurant"), price, and other search filters available on Yelp.
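As a minimal sketch of that kind of refinement, the helper below builds the query parameters for a narrower search. `build_search_params` is a hypothetical name that does not appear in this notebook. It assumes the search endpoint accepts `term` and `price` parameters, with `price` given as comma-separated tier numbers; check the Yelp documentation before relying on either.

```python
def build_search_params(location, offset=0, limit=50, term=None, price=None):
    """Build query parameters for a business search (hypothetical helper)."""
    params = {'location': location, 'limit': limit, 'offset': offset}
    if term:
        # Free-text search term such as "bar" or "restaurant"
        params['term'] = term
    if price:
        # Assumption: price tiers are sent as comma-separated numbers, e.g. "1,2"
        params['price'] = ','.join(str(p) for p in price)
    return params

params = build_search_params('Red Hook, Brooklyn, NY', term='bar', price=[1, 2])
print(params)
```

A dictionary like this could be passed as `url_params` to the notebook's `request` function in place of the one built inside `search`.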

In [2]:
# Libraries
import requests
import json
import pandas as pd
import geopandas as gpd
import shapely
from fiona.crs import from_epsg
import matplotlib.pyplot as plt
import urllib.request
import copy

try:
    # For Python 3.0 and later
    from urllib.error import HTTPError
    from urllib.parse import quote
    from urllib.parse import urlencode
except ImportError:
    # Fall back to Python 2's urllib2 and urllib
    from urllib2 import HTTPError
    from urllib import quote
    from urllib import urlencode

In [127]:
# Your Yelp API key here
API_KEY = ''

In [128]:
# You no longer need to provide a Client ID to fetch data.
# Requests are now authenticated with a private API key.
# You can find yours at
# https://www.yelp.com/developers/v3/manage_app

# API constants, you shouldn't have to change these.
API_HOST = 'https://api.yelp.com'
SEARCH_PATH = '/v3/businesses/search'
BUSINESS_PATH = '/v3/businesses/'  # Business ID will come after slash.

# Set search limit due to API restrictions
SEARCH_LIMIT = 50

def request(host, path, api_key, url_params=None):
    """Given your API key, send a GET request to the API.
    Args:
        host (str): The domain host of the API.
        path (str): The path of the API after the domain.
        api_key (str): Your API key.
        url_params (dict): An optional set of query parameters in the request.
    Returns:
        dict: The JSON response from the request. HTTP error statuses are not
            raised; the parsed body is returned whatever the status code.
    Raises:
        ValueError: If the response body is not valid JSON.
    """
    url_params = url_params or {}
    url = '{0}{1}'.format(host, quote(path.encode('utf8')))
    headers = {
        'Authorization': 'Bearer %s' % api_key,
    }

    response = requests.request('GET', url, headers=headers, params=url_params)

    return response.json()

def search(api_key, location, offset):
    """Query the Search API by location, one page of results at a time.
    Args:
        api_key (str): Your API key.
        location (str): The search location passed to the API.
        offset (int): The number of results to skip before this page.
    Returns:
        dict: The JSON response from the request.
    """

    url_params = {
        'location': location.replace(' ', '+'),
        'limit': SEARCH_LIMIT,
        'offset': offset
    }
    return request(API_HOST, SEARCH_PATH, api_key, url_params=url_params)

def get_business(api_key, business_id):
    """Query the Business API by a business ID.
    Args:
        business_id (str): The ID of the business to query.
    Returns:
        dict: The JSON response from the request.
    """
    business_path = BUSINESS_PATH + business_id

    return request(API_HOST, business_path, api_key)

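def query_api(location, offset):
    """Queries the API by the input values from the user.
    Args:
        location (str): The location of the businesses to query.
        offset (int): The number of results to skip before this page.
    Returns:
        list: The business dictionaries in the response, or None if the
            response has no 'businesses' key.
    """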
    response = search(API_KEY, location, offset)

    businesses = response.get('businesses')
    return businesses
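The functions above return whatever JSON the API sends back, so a failed call (for example, with an empty API key) makes `query_api` return `None` rather than a list. A minimal sketch of a guard follows. `businesses_or_raise` is a hypothetical helper, and the shape of the error payload (an `error` object with `code` and `description` fields) is an assumption, not something shown in this notebook.

```python
def businesses_or_raise(response):
    """Return the list of businesses, or raise if the response looks like an error."""
    if 'error' in response:
        # Assumed error shape: {'error': {'code': ..., 'description': ...}}
        err = response['error']
        raise RuntimeError('{0}: {1}'.format(err.get('code'), err.get('description')))
    # Fall back to an empty list so callers can always iterate over the result
    return response.get('businesses', [])

ok = businesses_or_raise({'businesses': [{'name': 'Baked'}], 'total': 1})
print(len(ok))
```

Wrapping the response this way would surface a bad key or a rate limit at the point of the call instead of later, when the results are indexed.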

In [129]:
# Create offsets for multiple API calls, since Yelp limits the number of results returned per request (50 here)
offsets = [0]
# Control the total number of results using the second argument in the range below (total = 50 * that value)
for i in range(1,10):
    off = i*50
    offsets.append(off)
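The same offsets can be generated in one step with the step argument of `range`. A small sketch, where `TOTAL_RESULTS` is a hypothetical name for the number of results wanted:

```python
SEARCH_LIMIT = 50    # results per request, matching the constant set earlier
TOTAL_RESULTS = 500  # hypothetical name: total number of results wanted

# Start at 0 and step by one page until the total is reached
offsets = list(range(0, TOTAL_RESULTS, SEARCH_LIMIT))
print(offsets)
```

Changing `TOTAL_RESULTS` is then the only edit needed to request more or fewer pages.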

In [130]:
"""
Loop through multiple queries to get the first 500 results for Red Hook.

Note that the API calls return 10 lists of 50 dictionaries (500 in total) - each dictionary contains the information for one business.
"""
results = []
for i in range(len(offsets)):
    dives = query_api('Red Hook, Brooklyn, NY', offsets[i])
    results.append(dives)

In [131]:
# Check to make sure we have 500 results (10 sets of 50 results)
for i in range(len(results)):
    print(len(results[i]))

50
50
50
50
50
50
50
50
50
50
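For later steps it can be handier to work with one flat list of businesses instead of a list of pages. A minimal sketch using `itertools.chain`, with placeholder data standing in for `results`; `sample_results` and `all_businesses` are hypothetical names.

```python
from itertools import chain

# Placeholder standing in for `results`: 10 pages of 50 dictionaries each
sample_results = [[{'name': 'biz-{0}-{1}'.format(p, i)} for i in range(50)]
                  for p in range(10)]

# Skip pages that came back as None or empty before flattening
all_businesses = list(chain.from_iterable(page for page in sample_results if page))
print(len(all_businesses))
```

With a flat list, the total count is a single `len` call and each business can be visited with one loop.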


In [132]:
# Sample result from the API call - this particular dictionary holds the data for Hometown Bar-B-Que
results[0][0]

{'id': 'Ms3CAGddVbgetiQrpzqxPQ',
 'alias': 'hometown-bar-b-que-brooklyn-3',
 'name': 'Hometown Bar-B-Que',
 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/Eo9NJvaF8j9HLa0GX9yNUA/o.jpg',
 'is_closed': False,
 'url': 'https://www.yelp.com/biz/hometown-bar-b-que-brooklyn-3?adjust_creative=blQ-cNUMXpZs8T2qda_yow&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=blQ-cNUMXpZs8T2qda_yow',
 'review_count': 1191,
 'categories': [{'alias': 'bbq', 'title': 'Barbeque'},
  {'alias': 'smokehouse', 'title': 'Smokehouse'}],
 'rating': 4.0,
 'coordinates': {'latitude': 40.6748965703426, 'longitude': -74.0160489746129},
 'transactions': [],
 'price': '$$',
 'location': {'address1': '454 Van Brunt St',
  'address2': '',
  'address3': None,
  'city': 'Brooklyn',
  'zip_code': '11231',
  'country': 'US',
  'state': 'NY',
  'display_address': ['454 Van Brunt St', 'Brooklyn, NY 11231']},
 'phone': '+13472944644',
 'display_phone': '(347) 294-4644',
 'distance': 546.3809722270964}

In [133]:
results

[[{'id': 'Ms3CAGddVbgetiQrpzqxPQ',
   'alias': 'hometown-bar-b-que-brooklyn-3',
   'name': 'Hometown Bar-B-Que',
   'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/Eo9NJvaF8j9HLa0GX9yNUA/o.jpg',
   'is_closed': False,
   'url': 'https://www.yelp.com/biz/hometown-bar-b-que-brooklyn-3?adjust_creative=blQ-cNUMXpZs8T2qda_yow&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=blQ-cNUMXpZs8T2qda_yow',
   'review_count': 1191,
   'categories': [{'alias': 'bbq', 'title': 'Barbeque'},
    {'alias': 'smokehouse', 'title': 'Smokehouse'}],
   'rating': 4.0,
   'coordinates': {'latitude': 40.6748965703426,
    'longitude': -74.0160489746129},
   'transactions': [],
   'price': '$$',
   'location': {'address1': '454 Van Brunt St',
    'address2': '',
    'address3': None,
    'city': 'Brooklyn',
    'zip_code': '11231',
    'country': 'US',
    'state': 'NY',
    'display_address': ['454 Van Brunt St', 'Brooklyn, NY 11231']},
   'phone': '+13472944644',
   ... (output truncated)

In [134]:
# Make lists of the business names, types, prices, lats, and lons
names = []
types = []
prices = []
lats = []
lons = []
for page in results:
    for biz in page:
        names.append(biz['name'])
        # Use the first listed category as the business type
        types.append(biz['categories'][0]['alias'])
        # Not every business has a price tier
        prices.append(biz.get('price', 'N/A'))
        lats.append(biz['coordinates']['latitude'])
        lons.append(biz['coordinates']['longitude'])

In [135]:
# Make a dataframe
rh_biz = pd.DataFrame(
    {'name': names,
     'type': types,
     'price': prices,
     'lat': lats,
     'lon': lons
    })

In [136]:
"""
Note that the initial request pulls in results for businesses and landmarks not located directly in Red Hook. This
will be corrected with further refinement of the API query or through spatial joins using the latitude and
longitude information contained here along with spatial data for Red Hook.
"""
rh_biz.head(50)

   name                             type         price  lat        lon
0  Hometown Bar-B-Que               bbq          $$     40.674897  -74.016049
1  Red Hook Lobster Pound           seafood      $$     40.679769  -74.010369
2  Steve's Authentic Key Lime Pies  bakeries     $$     40.67776   -74.018067
3  Baked                            bakeries     $$     40.676789  -74.013211
4  Buttermilk Channel               newamerican  $$     40.675919  -73.999059
5  Defontes                         sandwiches   $$     40.678944  -74.005369
6  Lucali                           pizza        $$     40.6818    -74.00024
7  The Good Fork                    newamerican  $$     40.67599   -74.01432
8  Brooklyn Ice House               bbq          $      40.67919   -74.011099
9  Sunny's Bar                      bars         $      40.67569   -74.01689
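The spatial refinement mentioned above amounts to keeping only the points that fall inside the neighborhood boundary. In practice a geopandas spatial join against a real Red Hook boundary file would do this; the sketch below uses a plain-Python ray-casting test instead so that it runs without extra data. The polygon is a made-up rectangle for illustration only, not the actual Red Hook boundary, and `point_in_polygon` is a hypothetical helper.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

# Illustrative rectangle as (lon, lat) pairs; NOT the real neighborhood boundary
fake_boundary = [(-74.020, 40.670), (-74.005, 40.670),
                 (-74.005, 40.682), (-74.020, 40.682)]

# Three rows taken from the table above: (name, lat, lon)
rows = [('Hometown Bar-B-Que', 40.674897, -74.016049),
        ('Buttermilk Channel', 40.675919, -73.999059),
        ('Lucali', 40.6818, -74.00024)]

kept = [name for name, lat, lon in rows
        if point_in_polygon(lon, lat, fake_boundary)]
print(kept)
```

Swapping the rectangle for real boundary coordinates would leave the filtering logic unchanged.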
