# Background

In this exercise, I will assume I am planning a trip to Asheville, North Carolina. I want to try many of the most popular coffee shops in the area, and will use Yelp data to help me decide which shops to visit. I will set the search location to the Kimpton Hotel in downtown Asheville (coordinates: 35.5952198778522, -82.55219933877078), and work through the 50 coffee shops Yelp returns around that point.

To do this, I will request data from the Yelp API, then clean and format the results.

In [174]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import folium
import apiKey

from yelpapi import YelpAPI
import argparse
from pprint import pprint
import json
%matplotlib inline

In [45]:
yelp_api = YelpAPI(apiKey.myKey)

In [43]:
coffee_places = yelp_api.search_query(categories='coffee', longitude=-82.55219933877078, latitude=35.5952198778522, limit=50)

In [175]:
#Add the list of business records to a DataFrame
df_raw = pd.DataFrame(coffee_places['businesses'])
df_raw.head()

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,coordinates,transactions,price,location,phone,display_phone,distance
0,MQvTn_qKW3KVJv8-a7uyDQ,bomba-asheville,Bomba,https://s3-media3.fl.yelpcdn.com/bphoto/pXE154...,False,https://www.yelp.com/biz/bomba-asheville?adjus...,204,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",4.5,"{'latitude': 35.5948836605335, 'longitude': -8...",[delivery],$$,"{'address1': '1 SW Pack Sq', 'address2': '', '...",18282540209,(828) 254-0209,42.480837
1,9iB5AkYLPkqNrt13gsBoNw,green-sage-cafe-downtown-asheville-3,Green Sage Cafe - Downtown,https://s3-media2.fl.yelpcdn.com/bphoto/QXZSNf...,False,https://www.yelp.com/biz/green-sage-cafe-downt...,552,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",4.0,"{'latitude': 35.595611, 'longitude': -82.552078}",[],$$,"{'address1': '5 Broadway St', 'address2': '', ...",18282524450,(828) 252-4450,44.853229
2,iRJCIMQmI64TKlKdRGAORg,old-europe-pastries-asheville,Old Europe Pastries,https://s3-media3.fl.yelpcdn.com/bphoto/ixhppj...,False,https://www.yelp.com/biz/old-europe-pastries-a...,680,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",4.5,"{'latitude': 35.59595628241617, 'longitude': -...",[delivery],$$,"{'address1': '18 Broadway St', 'address2': '',...",18282555999,(828) 255-5999,85.794931
3,pmDjbJXNKTN0lQ0ntqTgEA,bebettes-a-new-orleans-coffeehouse-asheville-4,Bebettes: A New Orleans Coffeehouse,https://s3-media3.fl.yelpcdn.com/bphoto/CW83U4...,False,https://www.yelp.com/biz/bebettes-a-new-orlean...,206,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",4.5,"{'latitude': 35.59538689, 'longitude': -82.556...",[delivery],$,"{'address1': '10 Page Ave', 'address2': '', 'a...",18283333130,(828) 333-3130,355.329551
4,NG_-kyEApUNSzxO4fWcFzQ,the-times-bar-and-coffee-shop-asheville,The Times Bar & Coffee Shop,https://s3-media2.fl.yelpcdn.com/bphoto/Bk3AhT...,False,https://www.yelp.com/biz/the-times-bar-and-cof...,156,"[{'alias': 'cocktailbars', 'title': 'Cocktail ...",4.5,"{'latitude': 35.59452, 'longitude': -82.55392}",[],$$,"{'address1': '56 Patton Ave', 'address2': '', ...",18287745028,(828) 774-5028,182.613758


Looking at the data, the coordinates and location columns hold nested dictionaries, so they need to be flattened into separate columns.

Quite a few columns are not helpful here (id, alias, image_url, etc.), so let's drop those to simplify the set.
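As an alternative to unpacking each nested column separately, `pd.json_normalize` can flatten the whole list of business records in one call. The sketch below is only illustrative: `sample_businesses` is a hand-trimmed stand-in for `coffee_places['businesses']` built from values shown in this notebook, and the `sep="_"` column naming is my own choice rather than anything Yelp requires.

```python
import pandas as pd

# Hand-trimmed stand-in for coffee_places['businesses'] (assumption: the real
# records carry more fields than shown here)
sample_businesses = [
    {
        "name": "Bomba",
        "rating": 4.5,
        "review_count": 204,
        "coordinates": {"latitude": 35.5948836605335, "longitude": -82.551976},
        "location": {"address1": "1 SW Pack Sq", "city": "Asheville",
                     "zip_code": "28801", "state": "NC"},
        "distance": 42.480837,
    },
]

# Nested dicts become prefixed columns such as coordinates_latitude
flat = pd.json_normalize(sample_businesses, sep="_")
print(flat.columns.tolist())
```

With this approach the unhelpful columns can still be dropped afterwards by selecting only the flattened columns of interest.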

In [176]:
#Let's grab the relevant columns
df = df_raw[['name','rating','is_closed','review_count','location','display_phone','coordinates','distance']].copy()

#The location and coordinates columns already hold dicts from the API response,
#so they can be flattened directly without a str/eval round trip
df_location = pd.json_normalize(df['location'].tolist())
df_coordinates = pd.json_normalize(df['coordinates'].tolist())

#Merge the flattened columns back onto the original rows by index
df_final = df.merge(df_coordinates,left_index=True,right_index=True)
df_final = df_final.merge(df_location,left_index=True,right_index=True)

#Drop the old coordinates and location columns, as well as address3, which is not helpful
df_final.drop(['location', 'coordinates','address3'], axis=1, inplace=True)

df_final.head()

Unnamed: 0,name,rating,is_closed,review_count,display_phone,distance,latitude,longitude,address1,address2,city,zip_code,country,state,display_address
0,Bomba,4.5,False,204,(828) 254-0209,42.480837,35.594884,-82.551976,1 SW Pack Sq,,Asheville,28801,US,NC,"[1 SW Pack Sq, Asheville, NC 28801]"
1,Green Sage Cafe - Downtown,4.0,False,552,(828) 252-4450,44.853229,35.595611,-82.552078,5 Broadway St,,Asheville,28801,US,NC,"[5 Broadway St, Asheville, NC 28801]"
2,Old Europe Pastries,4.5,False,680,(828) 255-5999,85.794931,35.595956,-82.551916,18 Broadway St,,Asheville,28801,US,NC,"[18 Broadway St, Asheville, NC 28801]"
3,Bebettes: A New Orleans Coffeehouse,4.5,False,206,(828) 333-3130,355.329551,35.595387,-82.556182,10 Page Ave,,Asheville,28801,US,NC,"[10 Page Ave, Asheville, NC 28801]"
4,The Times Bar & Coffee Shop,4.5,False,156,(828) 774-5028,182.613758,35.59452,-82.55392,56 Patton Ave,,Asheville,28801,US,NC,"[56 Patton Ave, Asheville, NC 28801]"


Now that I've cleaned up the data, I'm going to check whether any places are closed. Then, let's choose a shop.

In [177]:
#Check if there are any closed stores. There are not, as the only value in the column is False, so let's go ahead and drop this column
print(df_final['is_closed'].unique())
df_final.drop(['is_closed'], axis=1, inplace = True)

#Let's plot the shops and see how close they will be to the hotel
f = folium.Figure(width=800, height=500)
m = folium.Map(location=[35.5952198778522, -82.55219933877078], zoom_start=12.5, tiles='CartoDB positron').add_to(f)

for point in range(0, len(df_final)):
    lat = df_final['latitude'][point]
    long = df_final['longitude'][point]
    temp = lat,long
    folium.CircleMarker(temp,radius=0.001).add_to(m)
m

Looks like we have shops all over Asheville. This is great, but I'm going to focus on shops near the hotel so I can enjoy being downtown.
I'll also drop shops with fewer than 100 reviews.

I'll set a low distance limit of 350 meters to help narrow it down.
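The `distance` values Yelp returns are in meters from the search coordinates. As a quick sanity check on that column, here is a minimal haversine sketch, assuming the distance is a straight-line (great-circle) measurement; the helper name `haversine_m` is my own and not part of `yelpapi`.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

hotel = (35.5952198778522, -82.55219933877078)
bomba = (35.5948836605335, -82.551976)  # longitude as rounded in the table above

print(round(haversine_m(*hotel, *bomba), 1))
```

The result should land within a meter or so of the 42.48 that Yelp reports for Bomba, which supports reading the 350 limit as meters.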

In [183]:
df_close = df_final.copy()
#Drop shops with fewer than 100 reviews, then shops more than 350 meters from the hotel
df_close.drop(df_close[df_close['review_count'] < 100].index, inplace = True)
df_close.drop(df_close[df_close['distance'] > 350].index, inplace = True)
print("We've narrowed the list to",len(df_close),"shops")

We've narrowed the list to 8 shops
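The same filter can also be written as a single boolean mask, which avoids dropping rows in place. This is a sketch on a small sample rather than the full result set: the first two rows use values from this notebook, while "Hypothetical Cafe" and the constant names are made up here for illustration.

```python
import pandas as pd

# Small sample; the first two rows come from the results above, the third is
# a hypothetical low-review shop added only to exercise the review filter
sample = pd.DataFrame({
    "name": ["Bomba", "Bebettes: A New Orleans Coffeehouse", "Hypothetical Cafe"],
    "review_count": [204, 206, 12],
    "distance": [42.480837, 355.329551, 120.0],
})

MIN_REVIEWS = 100      # keep shops with at least this many reviews
MAX_DISTANCE_M = 350   # keep shops within this many meters of the hotel

# Both conditions must hold for a shop to stay on the list
keep = (sample["review_count"] >= MIN_REVIEWS) & (sample["distance"] <= MAX_DISTANCE_M)
close = sample[keep].reset_index(drop=True)
print(close["name"].tolist())
```

Here Bebettes falls just outside the distance limit and the hypothetical shop fails the review threshold, so only Bomba remains.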


In [184]:
#Now that it is narrowed down to 8, let's order them by rating (highest first), then by distance (closest first).
df_close = df_close.sort_values(['rating', 'distance'], ascending=[False, True])
df_close

Unnamed: 0,name,rating,review_count,display_phone,distance,latitude,longitude,address1,address2,city,zip_code,country,state,display_address
0,Bomba,4.5,204,(828) 254-0209,42.480837,35.594884,-82.551976,1 SW Pack Sq,,Asheville,28801,US,NC,"[1 SW Pack Sq, Asheville, NC 28801]"
2,Old Europe Pastries,4.5,680,(828) 255-5999,85.794931,35.595956,-82.551916,18 Broadway St,,Asheville,28801,US,NC,"[18 Broadway St, Asheville, NC 28801]"
7,High Five Coffee,4.5,281,(828) 713-5291,130.19253,35.59573,-82.55349,13 Rankin Ave,,Asheville,28801,US,NC,"[13 Rankin Ave, Asheville, NC 28801]"
4,The Times Bar & Coffee Shop,4.5,156,(828) 774-5028,182.613758,35.59452,-82.55392,56 Patton Ave,,Asheville,28801,US,NC,"[56 Patton Ave, Asheville, NC 28801]"
6,Double D's Coffee & Desserts,4.5,455,(828) 505-2439,231.136985,35.593242,-82.551413,41 Biltmore Ave,,Asheville,28801,US,NC,"[41 Biltmore Ave, Asheville, NC 28801]"
1,Green Sage Cafe - Downtown,4.0,552,(828) 252-4450,44.853229,35.595611,-82.552078,5 Broadway St,,Asheville,28801,US,NC,"[5 Broadway St, Asheville, NC 28801]"
16,Izzy's Coffee Den,4.0,151,(828) 258-2004,260.485004,35.59743,-82.55332,74 N Lexington Ave,,Asheville,28801,US,NC,"[74 N Lexington Ave, Asheville, NC 28801]"
12,Trade and Lore Coffee,4.0,200,(828) 552-5353,330.43835,35.59472,-82.55575,37 Wall St,,Asheville,28801,US,NC,"[37 Wall St, Asheville, NC 28801]"


Eight shops give me plenty to choose from for a few days.

All are within walking distance of the hotel, and I have names, phone numbers, and addresses for any questions or directions.
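To put "walking distance" into rough numbers, the sketch below converts the nearest and farthest distances on the final list into walking times. The 1.4 m/s pace is my own assumption for a typical walking speed, and since the distances appear to be straight-line measurements, actual routes along streets will take somewhat longer.

```python
WALKING_SPEED_M_PER_S = 1.4  # assumed typical walking pace, not from Yelp

# Distances in meters for the nearest and farthest shops on the final list
shops = {"Bomba": 42.480837, "Trade and Lore Coffee": 330.43835}

# Convert meters to minutes at the assumed pace
walk_minutes = {
    name: meters / WALKING_SPEED_M_PER_S / 60 for name, meters in shops.items()
}

for name, minutes in walk_minutes.items():
    print(f"{name}: about {minutes:.1f} min")
```

Even the farthest shop comes out to only a few minutes on foot under these assumptions.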