# Yelp data

You can use the tool below to pull in search results from Yelp for businesses and other establishments in a given neighborhood. The tool is currently set up to pull in all search results for Red Hook regardless of the establishment type. The API call will return lists containing dictionaries. Each dictionary corresponds to one specific establishment.

Note that in order to run the code, you will need your own API key, which the notebook reads from a text file (`yelp_api.txt`) in a cell below. You can also refine the searches by adding specific search terms (like "bar" or "restaurant"), price, and other search filters available on Yelp.
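
As a rough sketch of how such a refinement could look, the helper below assembles the query parameters for a search. The function name `build_search_params` is hypothetical and not part of this notebook; `term` and `price` are Yelp Fusion search parameters, and the price levels are assumed to be passed as a comma-separated string.

```python
def build_search_params(latitude, longitude, radius, term=None, price=None,
                        limit=50, offset=0):
    """Assemble query parameters for a Yelp business search (illustrative helper)."""
    params = {
        'latitude': latitude,
        'longitude': longitude,
        'radius': radius,
        'limit': limit,
        'offset': offset,
    }
    if term:
        params['term'] = term  # free-text search term, e.g. "bar"
    if price:
        # Price levels 1-4 joined into a comma-separated string such as "1,2"
        params['price'] = ','.join(str(p) for p in price)
    return params

# Example: restaurants in the two cheapest price levels around the Red Hook centroid
params = build_search_params(40.67554871068841, -74.0091782600863, 800,
                             term='restaurant', price=[1, 2])
print(params)
```

The resulting dictionary could be passed as `url_params` to the `request` function defined later in the notebook.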

In [1]:
# Libraries
import requests
import json
import pandas as pd
import matplotlib.pyplot as plt
import urllib.request
import copy

try:
    # For Python 3.0 and later
    from urllib.error import HTTPError
    from urllib.parse import quote
    from urllib.parse import urlencode
except ImportError:
    # Fall back to Python 2's urllib2 and urllib
    from urllib2 import HTTPError
    from urllib import quote
    from urllib import urlencode

In [2]:
# API key stored in a separate .txt file (first line of the file)
with open('yelp_api.txt', 'r') as f:
    API_KEY = f.readline().strip()

## Functions for API calls

In [3]:
# Yelp Fusion (v3) authenticates requests with a private API key;
# a Client ID is no longer needed to fetch data.
# You can find your key at
# https://www.yelp.com/developers/v3/manage_app
# API_KEY is read from yelp_api.txt in the previous cell.

# API constants, you shouldn't have to change these.
API_HOST = 'https://api.yelp.com'
SEARCH_PATH = '/v3/businesses/search'
BUSINESS_PATH = '/v3/businesses/'  # Business ID will come after slash.
REVIEW_PATH = '/v3/businesses/{}/reviews'

# Set search limit due to API restrictions
SEARCH_LIMIT = 50

def request(host, path, api_key, url_params=None):
    """Given your API key, send a GET request to the API.
    Args:
        host (str): The domain host of the API.
        path (str): The path of the API after the domain.
        api_key (str): Your API Key.
        url_params (dict): An optional set of query parameters in the request.
    Returns:
        dict: The JSON response from the request. HTTP error statuses are not
            raised here; the parsed response body is returned as-is.
    """
    url_params = url_params or {}
    url = '{0}{1}'.format(host, quote(path.encode('utf8')))
    headers = {
        'Authorization': 'Bearer %s' % api_key,
    }

    response = requests.request('GET', url, headers=headers, params=url_params)

    return response.json()

def radius_search(api_key, latitude, longitude, radius, offset):
    """Query the Search API
    Returns:
        dict: The JSON response from the request.
    """

    url_params = {
        'latitude': latitude,
        'longitude': longitude,
        'radius': radius,
        'limit': SEARCH_LIMIT,
        'offset': offset
    }
    return request(API_HOST, SEARCH_PATH, api_key, url_params=url_params)

def query_api(latitude, longitude, radius, offset):
    """Run a radius search and return the list of businesses (empty if none)."""
    response = radius_search(API_KEY, latitude, longitude, radius, offset)

    # An error response has no 'businesses' key; fall back to an empty list
    return response.get('businesses', [])

def get_review(api_key, biz_id):
    return request(API_HOST, REVIEW_PATH.format(biz_id), api_key)

# Still working on this part, not sure how to get it to work right (but the previous function works)
def get_reviews(api_key, biz_ids):
    responses = []
    for biz_id in biz_ids:
        response = request(API_HOST, REVIEW_PATH.format(biz_id), api_key)
        responses.append(response)
    return responses

## Getting establishments near site

In [26]:
# Create offsets for multiple API calls, since Yelp returns at most 50 results per request
# Control the total number of results with the upper bound of the range (50 * number of calls)
offsets = list(range(0, 500, SEARCH_LIMIT))

In [27]:
# Set search radius (in meters) - max allowable value is 40,000
# 800 meter radius around Red Hook centroid roughly covers all of Red Hook
radius = 800

In [28]:
"""
Loop through multiple queries to get up to the first 500 results for Red Hook.

Each call returns a list of dictionaries; each dictionary contains the information for one business.
"""
results = []
for offset in offsets:
    # Using centroid for Red Hook CTs 53, 59, and 85
    businesses = query_api(40.67554871068841, -74.0091782600863, radius, offset)
    # Using the latitude and longitude of the site centroid
#     businesses = query_api(40.67840802364635, -74.01521633676086, radius, offset)
    results.append(businesses)

In [29]:
# To see how many total results were received from query
for i in range(len(results)):
    print("Batch {}: {} results".format(i,len(results[i])))

Batch 0: 50 results
Batch 1: 12 results
Batch 2: 0 results
Batch 3: 0 results
Batch 4: 0 results
Batch 5: 0 results
Batch 6: 0 results
Batch 7: 0 results
Batch 8: 0 results
Batch 9: 0 results
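
The batch counts above show that only the first two calls returned data, so the remaining eight requests were spent on empty pages. A minimal sketch of a loop that stops at the first short page follows. The names `fetch_all` and `fake_fetch` are hypothetical, and the fake fetcher stands in for the API so the example runs without a key.

```python
def fetch_all(fetch_page, page_size=50, max_results=500):
    """Collect results page by page, stopping at the first short page."""
    collected = []
    for offset in range(0, max_results, page_size):
        page = fetch_page(offset) or []  # treat a missing page as empty
        collected.extend(page)
        if len(page) < page_size:
            break  # a short page means there is nothing left to fetch
    return collected

# Stand-in for the API: 62 fake records served 50 at a time
fake_records = [{'id': i} for i in range(62)]
calls = []

def fake_fetch(offset):
    calls.append(offset)
    return fake_records[offset:offset + 50]

all_results = fetch_all(fake_fetch)
print(len(all_results), calls)
```

With the notebook's own functions, the fetcher could be something like `lambda off: query_api(40.67554871068841, -74.0091782600863, radius, off)`. Note that this version returns one flat list rather than a list of batches.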


In [30]:
# Sample of result from API call - this particular dictionary includes data on Hometown BBQ
results[0][0]

{'id': 'Ms3CAGddVbgetiQrpzqxPQ',
 'alias': 'hometown-bar-b-que-brooklyn-3',
 'name': 'Hometown Bar-B-Que',
 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/Eo9NJvaF8j9HLa0GX9yNUA/o.jpg',
 'is_closed': False,
 'url': 'https://www.yelp.com/biz/hometown-bar-b-que-brooklyn-3?adjust_creative=blQ-cNUMXpZs8T2qda_yow&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=blQ-cNUMXpZs8T2qda_yow',
 'review_count': 1195,
 'categories': [{'alias': 'bbq', 'title': 'Barbeque'},
  {'alias': 'smokehouse', 'title': 'Smokehouse'}],
 'rating': 4.0,
 'coordinates': {'latitude': 40.6748965703426, 'longitude': -74.0160489746129},
 'transactions': [],
 'price': '$$',
 'location': {'address1': '454 Van Brunt St',
  'address2': '',
  'address3': None,
  'city': 'Brooklyn',
  'zip_code': '11231',
  'country': 'US',
  'state': 'NY',
  'display_address': ['454 Van Brunt St', 'Brooklyn, NY 11231']},
 'phone': '+13472944644',
 'display_phone': '(347) 294-4644',
 'distance': 583.9407138632308}
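
The `distance` field is reported in meters from the search location. As a sanity check, the sketch below recomputes it with the haversine formula from the Red Hook centroid used in the query and the coordinates in the sample above. The function name `haversine_m` and the Earth radius of 6,371 km are assumptions for illustration; Yelp's own calculation may differ slightly.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2, earth_radius_m=6371000):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))

search_center = (40.67554871068841, -74.0091782600863)  # Red Hook centroid used in the query
hometown = (40.6748965703426, -74.0160489746129)        # coordinates from the sample result
approx_distance = haversine_m(*search_center, *hometown)
print(round(approx_distance, 1))
```

The result should land within a few meters of the reported 583.94, which also confirms that the search was centered on the Red Hook centroid rather than the site centroid.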

In [31]:
# Make a df from the results
dfs_list = []
for i in range(len(results)):
    temp_df = pd.DataFrame(results[i])
    dfs_list.append(temp_df)
biz_df = pd.concat(dfs_list)
biz_df.reset_index(inplace=True)
biz_df.head()

Unnamed: 0,index,alias,categories,coordinates,display_phone,distance,id,image_url,is_closed,location,name,phone,price,rating,review_count,transactions,url
0,0,hometown-bar-b-que-brooklyn-3,"[{'alias': 'bbq', 'title': 'Barbeque'}, {'alia...","{'latitude': 40.6748965703426, 'longitude': -7...",(347) 294-4644,583.940714,Ms3CAGddVbgetiQrpzqxPQ,https://s3-media2.fl.yelpcdn.com/bphoto/Eo9NJv...,False,"{'address1': '454 Van Brunt St', 'address2': '...",Hometown Bar-B-Que,13472944644,$$,4.0,1195,[],https://www.yelp.com/biz/hometown-bar-b-que-br...
1,1,red-hook-lobster-pound-brooklyn,"[{'alias': 'seafood', 'title': 'Seafood'}]","{'latitude': 40.6797687107191, 'longitude': -7...",(718) 858-7650,479.860312,nOjGNqPcu5jHRRElOndQqQ,https://s3-media3.fl.yelpcdn.com/bphoto/saTV5k...,False,"{'address1': '284 Van Brunt St', 'address2': '...",Red Hook Lobster Pound,17188587650,$$,4.0,944,"[pickup, delivery]",https://www.yelp.com/biz/red-hook-lobster-poun...
2,2,baked-brooklyn,"[{'alias': 'bakeries', 'title': 'Bakeries'}, {...","{'latitude': 40.676789, 'longitude': -74.013211}",(718) 222-0345,366.984075,Q_7J5E-cYCQfHNCkCyMdLA,https://s3-media1.fl.yelpcdn.com/bphoto/vpA4Pd...,False,"{'address1': '359 Van Brunt St', 'address2': '...",Baked,17182220345,$$,4.0,495,[],https://www.yelp.com/biz/baked-brooklyn?adjust...
3,3,defontes-brooklyn,"[{'alias': 'sandwiches', 'title': 'Sandwiches'}]","{'latitude': 40.6789444357833, 'longitude': -7...",(718) 625-8052,495.737279,d_rQ-nVpY6Z5C722Q5wpog,https://s3-media3.fl.yelpcdn.com/bphoto/5ugsEH...,False,"{'address1': '379 Columbia St', 'address2': ''...",Defontes,17186258052,$$,4.5,309,[],https://www.yelp.com/biz/defontes-brooklyn?adj...
4,4,the-good-fork-brooklyn,"[{'alias': 'newamerican', 'title': 'American (...","{'latitude': 40.67599, 'longitude': -74.01432}",(718) 643-6636,434.34488,-BOAKHyKKAXE1WhuVpMFkQ,https://s3-media2.fl.yelpcdn.com/bphoto/JFWO8Z...,False,"{'address1': '391 Van Brunt St', 'address2': '...",The Good Fork,17186436636,$$,4.0,399,[],https://www.yelp.com/biz/the-good-fork-brookly...
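
Several columns in this table (`categories`, `coordinates`, `location`) still hold nested dictionaries or lists, which makes filtering and mapping awkward. One possible way to flatten them before building the DataFrame is sketched below. The helper `flatten_business` and its choice of output fields are assumptions, and the sample record is an abridged copy of the Hometown Bar-B-Que result.

```python
def flatten_business(biz):
    """Return a flat dict with selected nested Yelp fields pulled to the top level."""
    coords = biz.get('coordinates') or {}
    location = biz.get('location') or {}
    categories = biz.get('categories') or []
    return {
        'id': biz.get('id'),
        'name': biz.get('name'),
        'rating': biz.get('rating'),
        'review_count': biz.get('review_count'),
        'price': biz.get('price'),  # missing for some businesses
        'latitude': coords.get('latitude'),
        'longitude': coords.get('longitude'),
        'address': location.get('address1'),
        'zip_code': location.get('zip_code'),
        'categories': ', '.join(c['title'] for c in categories),
    }

# Abridged copy of the sample result shown earlier
sample = {
    'id': 'Ms3CAGddVbgetiQrpzqxPQ',
    'name': 'Hometown Bar-B-Que',
    'review_count': 1195,
    'categories': [{'alias': 'bbq', 'title': 'Barbeque'},
                   {'alias': 'smokehouse', 'title': 'Smokehouse'}],
    'rating': 4.0,
    'coordinates': {'latitude': 40.6748965703426, 'longitude': -74.0160489746129},
    'price': '$$',
    'location': {'address1': '454 Van Brunt St', 'zip_code': '11231'},
}
flat = flatten_business(sample)
print(flat)
```

A flat table could then be built with `pd.DataFrame([flatten_business(b) for batch in results for b in batch])`.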


In [32]:
print("Number of establishments listed on Yelp within {} meters of the Red Hook centroid: {}".format(
    radius, biz_df.shape[0]))

Number of establishments listed on Yelp within 800 meters of the Red Hook centroid: 62


In [None]:
# Red Hook centroid in red with buffer and site in blue with buffer
f, ax = plt.subplots(figsize=(10,10))
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
# Red Hook combined census tracts with centroid and buffer
rh_comb_tract.plot(alpha=.5,linewidth=3,ax=ax,color='pink',edgecolor='black')
# Comment out the next two lines to hide the Red Hook center and buffer
rh_comb_center.plot(alpha=1, linewidth=0.2,ax=ax,color='r',edgecolor='black',markersize=200)
rh_buffer.plot(alpha=.3,linewidth=1,ax=ax,color='r',edgecolor='black')
# Site block with centroid and buffer
site_poly.plot(alpha=1,linewidth=0.8,ax=ax,color='w',edgecolor='black')
site_centroid.plot(alpha=1, linewidth=0.2,ax=ax,color='b',edgecolor='black',markersize=50)
site_buffer.plot(alpha=.1,linewidth=1,ax=ax,color='b',edgecolor='b')
# Show plot
plt.title("Red Hook and Project Site", fontsize=20)
plt.show()

## Getting Yelp reviews from nearby establishments

Here's more info on the API query for Yelp reviews:

https://www.yelp.com/developers/documentation/v3/business_reviews

Note that this API call is limited to only 3 reviews per establishment.

In [34]:
# Use the businesses that were returned from the previous query
biz_ids = list(biz_df.id)

In [35]:
# API call for reviews
reviews = []
for biz_id in biz_ids:
    review = get_review(API_KEY, biz_id)
    reviews.append(review)

In [36]:
# Make a df from those reviews
reviews_df = pd.DataFrame(reviews)
reviews_df.head()

Unnamed: 0,possible_languages,reviews,total
0,"[fr, en, it, ja]","[{'id': 'gfJfccAseabkH9Hhp41Qww', 'url': 'http...",1195
1,"[en, it]","[{'id': 'WLhCbdA75IFVMm9Isk8LLg', 'url': 'http...",944
2,[en],"[{'id': '63UNNOriaxChZdtugnBhuw', 'url': 'http...",495
3,"[en, es]","[{'id': 'WRl3GkRzzoQsvapYIm4l3Q', 'url': 'http...",309
4,[en],"[{'id': 'S7lv-5qAj9bAvXxrekpHzA', 'url': 'http...",399


In [37]:
# Split the reviews into separate columns
reviews_df = pd.concat(
    [reviews_df, reviews_df['reviews'].apply(pd.Series)], axis=1
).drop('reviews', axis=1)
# And break out each part of each review into separate columns
for i in range(3):
    df_sep = reviews_df[i].apply(pd.Series)
    df_sep.rename(columns={'id': 'id_{}'.format(i), 'url': 'url_{}'.format(i),
                           'text': 'text_{}'.format(i), 'rating': 'rating_{}'.format(i),
                           'time_created': 'time_created_{}'.format(i),
                           'user': 'user_{}'.format(i)}, inplace=True)
    reviews_df = pd.concat([reviews_df, df_sep], axis=1).drop([i], axis=1)

reviews_df.head()


Unnamed: 0,possible_languages,total,id_0,url_0,text_0,rating_0,time_created_0,user_0,id_1,url_1,...,time_created_1,user_1,0,id_2,url_2,text_2,rating_2,time_created_2,user_2,0.1
0,"[fr, en, it, ja]",1195,gfJfccAseabkH9Hhp41Qww,https://www.yelp.com/biz/hometown-bar-b-que-br...,What a solid BBQ joint! \n\nMy friends and I c...,5,2019-04-20 13:47:44,"{'id': 'RoO8V10M8wLrJT-JnuNVig', 'profile_url'...",dhbYgl2ubq5Ckio26jmlcA,https://www.yelp.com/biz/hometown-bar-b-que-br...,...,2019-04-26 13:06:48,"{'id': 'xh84BlvTytrjuyOkSyevHg', 'profile_url'...",,h066kAT2eX89It6WPSJYPw,https://www.yelp.com/biz/hometown-bar-b-que-br...,Incredible BBQ. Loved every bite of the Briske...,4.0,2019-04-13 19:01:43,"{'id': 'gthIh2LBOUDwdjI4atXS6A', 'profile_url'...",
1,"[en, it]",944,WLhCbdA75IFVMm9Isk8LLg,https://www.yelp.com/biz/red-hook-lobster-poun...,"I had a major lobster craving, hauled myself t...",5,2019-01-04 06:29:40,"{'id': '-mHn6PHX8V8QepZMaifSNQ', 'profile_url'...",d6Mf18WZbG8me_6VODZ_HA,https://www.yelp.com/biz/red-hook-lobster-poun...,...,2018-11-29 17:36:08,"{'id': 'fhjoJL5oixDvuDKFFRb6Fg', 'profile_url'...",,kngx60XRwUpQER6MZmQumQ,https://www.yelp.com/biz/red-hook-lobster-poun...,there are times when I'm ok to wander out for ...,4.0,2018-11-29 13:59:03,"{'id': 'NeVJULvjNMzbWGZJf7XU0w', 'profile_url'...",
2,[en],495,63UNNOriaxChZdtugnBhuw,https://www.yelp.com/biz/baked-brooklyn?adjust...,"I'm drooling writing this review. Great, now I...",5,2019-03-27 16:55:56,"{'id': 'bt51F2SgYVcPWvNuDwZwXQ', 'profile_url'...",eo5zpZ5ene8-humhz6vkBw,https://www.yelp.com/biz/baked-brooklyn?adjust...,...,2019-04-20 11:40:09,"{'id': 'W8AK0XlpePkv2epHeqLWxg', 'profile_url'...",,aNFenmhOPRMwykJj7Zgdnw,https://www.yelp.com/biz/baked-brooklyn?adjust...,Coffee/pastries are great but literally every ...,1.0,2019-04-19 10:40:57,"{'id': 'hotAE5Qt-0CubkmHzzJ7Ng', 'profile_url'...",
3,"[en, es]",309,WRl3GkRzzoQsvapYIm4l3Q,https://www.yelp.com/biz/defontes-brooklyn?adj...,Roast beef mozz and eggplant sandwich = heaven...,5,2019-04-05 19:37:05,"{'id': 'IuP4sr2yvJjaAuWYPFjiHg', 'profile_url'...",8phz-YlOtGkMzObgWxB7fQ,https://www.yelp.com/biz/defontes-brooklyn?adj...,...,2018-12-31 15:00:01,"{'id': '0Sw9sDJy9beVwPX6EfOKeg', 'profile_url'...",,yqai75uk3INLYhSFZG-RLg,https://www.yelp.com/biz/defontes-brooklyn?adj...,Defonte's eggplant parmesan sandwich is a hot ...,5.0,2018-12-05 15:29:32,"{'id': 'ld0JdpptiauEAMjXdp0FJQ', 'profile_url'...",
4,[en],399,S7lv-5qAj9bAvXxrekpHzA,https://www.yelp.com/biz/the-good-fork-brookly...,One of my favorite itineraries for showing out...,5,2018-11-21 07:52:57,"{'id': '-yEhhXT6URxh_yxHko5Gzg', 'profile_url'...",FuZ4iXW-hq5oK-U6rNm0SQ,https://www.yelp.com/biz/the-good-fork-brookly...,...,2019-04-04 10:10:39,"{'id': 'SwvoOnfwvtxVDZOX_JcRGA', 'profile_url'...",,vtsyyAbCFJKNqKxpkDC00A,https://www.yelp.com/biz/the-good-fork-brookly...,A great accidental find in Red Hook. They have...,4.0,2018-08-05 11:46:14,"{'id': 'epyaaMECqFyTzDY_4B7rIQ', 'profile_url'...",


In [38]:
# Merge back with the establishment names
reviews_merge = pd.concat([reviews_df,biz_df.name],axis=1)
reviews_merge.head()

Unnamed: 0,possible_languages,total,id_0,url_0,text_0,rating_0,time_created_0,user_0,id_1,url_1,...,user_1,0,id_2,url_2,text_2,rating_2,time_created_2,user_2,0.1,name
0,"[fr, en, it, ja]",1195,gfJfccAseabkH9Hhp41Qww,https://www.yelp.com/biz/hometown-bar-b-que-br...,What a solid BBQ joint! \n\nMy friends and I c...,5,2019-04-20 13:47:44,"{'id': 'RoO8V10M8wLrJT-JnuNVig', 'profile_url'...",dhbYgl2ubq5Ckio26jmlcA,https://www.yelp.com/biz/hometown-bar-b-que-br...,...,"{'id': 'xh84BlvTytrjuyOkSyevHg', 'profile_url'...",,h066kAT2eX89It6WPSJYPw,https://www.yelp.com/biz/hometown-bar-b-que-br...,Incredible BBQ. Loved every bite of the Briske...,4.0,2019-04-13 19:01:43,"{'id': 'gthIh2LBOUDwdjI4atXS6A', 'profile_url'...",,Hometown Bar-B-Que
1,"[en, it]",944,WLhCbdA75IFVMm9Isk8LLg,https://www.yelp.com/biz/red-hook-lobster-poun...,"I had a major lobster craving, hauled myself t...",5,2019-01-04 06:29:40,"{'id': '-mHn6PHX8V8QepZMaifSNQ', 'profile_url'...",d6Mf18WZbG8me_6VODZ_HA,https://www.yelp.com/biz/red-hook-lobster-poun...,...,"{'id': 'fhjoJL5oixDvuDKFFRb6Fg', 'profile_url'...",,kngx60XRwUpQER6MZmQumQ,https://www.yelp.com/biz/red-hook-lobster-poun...,there are times when I'm ok to wander out for ...,4.0,2018-11-29 13:59:03,"{'id': 'NeVJULvjNMzbWGZJf7XU0w', 'profile_url'...",,Red Hook Lobster Pound
2,[en],495,63UNNOriaxChZdtugnBhuw,https://www.yelp.com/biz/baked-brooklyn?adjust...,"I'm drooling writing this review. Great, now I...",5,2019-03-27 16:55:56,"{'id': 'bt51F2SgYVcPWvNuDwZwXQ', 'profile_url'...",eo5zpZ5ene8-humhz6vkBw,https://www.yelp.com/biz/baked-brooklyn?adjust...,...,"{'id': 'W8AK0XlpePkv2epHeqLWxg', 'profile_url'...",,aNFenmhOPRMwykJj7Zgdnw,https://www.yelp.com/biz/baked-brooklyn?adjust...,Coffee/pastries are great but literally every ...,1.0,2019-04-19 10:40:57,"{'id': 'hotAE5Qt-0CubkmHzzJ7Ng', 'profile_url'...",,Baked
3,"[en, es]",309,WRl3GkRzzoQsvapYIm4l3Q,https://www.yelp.com/biz/defontes-brooklyn?adj...,Roast beef mozz and eggplant sandwich = heaven...,5,2019-04-05 19:37:05,"{'id': 'IuP4sr2yvJjaAuWYPFjiHg', 'profile_url'...",8phz-YlOtGkMzObgWxB7fQ,https://www.yelp.com/biz/defontes-brooklyn?adj...,...,"{'id': '0Sw9sDJy9beVwPX6EfOKeg', 'profile_url'...",,yqai75uk3INLYhSFZG-RLg,https://www.yelp.com/biz/defontes-brooklyn?adj...,Defonte's eggplant parmesan sandwich is a hot ...,5.0,2018-12-05 15:29:32,"{'id': 'ld0JdpptiauEAMjXdp0FJQ', 'profile_url'...",,Defontes
4,[en],399,S7lv-5qAj9bAvXxrekpHzA,https://www.yelp.com/biz/the-good-fork-brookly...,One of my favorite itineraries for showing out...,5,2018-11-21 07:52:57,"{'id': '-yEhhXT6URxh_yxHko5Gzg', 'profile_url'...",FuZ4iXW-hq5oK-U6rNm0SQ,https://www.yelp.com/biz/the-good-fork-brookly...,...,"{'id': 'SwvoOnfwvtxVDZOX_JcRGA', 'profile_url'...",,vtsyyAbCFJKNqKxpkDC00A,https://www.yelp.com/biz/the-good-fork-brookly...,A great accidental find in Red Hook. They have...,4.0,2018-08-05 11:46:14,"{'id': 'epyaaMECqFyTzDY_4B7rIQ', 'profile_url'...",,The Good Fork
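
The wide layout above spreads each establishment's reviews across numbered columns and leaves stray columns (`0`, `0.1`) where an establishment has fewer than three reviews. An alternative, sketched here under the assumption that each response looks like the ones returned by `get_review`, is a long layout with one row per review. The helper `reviews_to_rows` is hypothetical, and the sample responses are invented placeholders rather than real Yelp data.

```python
def reviews_to_rows(biz_ids, responses):
    """Turn a list of review responses into one flat record per review."""
    rows = []
    for biz_id, response in zip(biz_ids, responses):
        # Error responses have no 'reviews' key and simply contribute no rows
        for review in response.get('reviews', []):
            user = review.get('user') or {}
            rows.append({
                'business_id': biz_id,
                'review_id': review.get('id'),
                'rating': review.get('rating'),
                'time_created': review.get('time_created'),
                'text': review.get('text'),
                'user_id': user.get('id'),
            })
    return rows

# Invented placeholder responses: one business with two reviews, one error response
sample_ids = ['biz-a', 'biz-b']
sample_responses = [
    {'reviews': [
        {'id': 'r1', 'rating': 5, 'text': 'Great.', 'time_created': '2019-04-20 13:47:44',
         'user': {'id': 'u1'}},
        {'id': 'r2', 'rating': 4, 'text': 'Good.', 'time_created': '2019-04-13 19:01:43',
         'user': {'id': 'u2'}},
    ], 'total': 2, 'possible_languages': ['en']},
    {'error': {'code': 'EXAMPLE_ERROR'}},
]
rows = reviews_to_rows(sample_ids, sample_responses)
print(len(rows), rows[0]['business_id'])
```

In the notebook, `pd.DataFrame(reviews_to_rows(biz_ids, reviews))` would give a table that can be merged with `biz_df` on the business id instead of relying on row order.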


# Next steps/to dos:

* How to process reviews/text?

## Word search

This section searches the review text for words that may indicate gentrification. Candidate words were pulled from:

https://wordassociations.net/en/words-associated-with/Gentrification?start=0

In [39]:
# List of words associated with gentrification for searching
gent_words = ['gentrification', 'gentrify', 'gentrified', 'change', 'redevelopment', 'redeveloped',
              'displaced', 'displacement', 'renewal', 'segregation', 'enclave', 'regeneration',
              'decay', 'decline', 'trend', 'trendy', 'starbucks', 'marxist', 'affluent',
              'demographic', 'displace', 'rehabilitate']

In [40]:
# List of words associated with neighborhoods in general for searching
neighb_words = ['neighborhood', 'neighb']

In [41]:
rev0 = reviews_df['text_0']  # text of the first review for each establishment

In [42]:
rev0[rev0.str.contains('|'.join(gent_words))]

52    This place has survived the gentrification of ...
Name: text_0, dtype: object

In [43]:
rev0[rev0.str.contains('|'.join(neighb_words))]

9     Oh! My! God! Chef David! \nOne of the Best exp...
32    Basic Chinese takeout with delivery, but I rea...
42    Coffey Park is your friendly neighborhood park...
51    I was getting my car fixed nearby and was wand...
Name: text_0, dtype: object
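
The searches above are case-sensitive and match substrings, so "Gentrification" at the start of a sentence would be missed while "change" would match inside "exchange". A minimal sketch of a case-insensitive, whole-word search using only the standard library is shown below. The helper `find_keyword_matches` is hypothetical, and the sample texts are invented for illustration.

```python
import re

def find_keyword_matches(texts, keywords):
    """Map the position of each matching text to the sorted keywords found in it."""
    pattern = re.compile(
        r'\b(?:' + '|'.join(re.escape(k) for k in keywords) + r')\b',
        flags=re.IGNORECASE,
    )
    matches = {}
    for idx, text in enumerate(texts):
        if not isinstance(text, str):
            continue  # skip missing reviews (None or NaN)
        found = sorted({m.group(0).lower() for m in pattern.finditer(text)})
        if found:
            matches[idx] = found
    return matches

# Invented sample texts
sample_texts = [
    'This place has survived the Gentrification of the area.',
    'Exact change only at the register.',
    'I love the exchange of ideas here.',
    None,
]
hits = find_keyword_matches(sample_texts, ['gentrification', 'change', 'trend'])
print(hits)
```

Whole-word matching is a trade-off: it avoids hits such as "exchange", but a stem like `neighb` would no longer match "neighborhood", so stems would need to be replaced by full words or matched with a separate substring pattern. Generic words such as "change" will still produce false positives.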