## The Most Overrated and Underrated Restaurants Nearby

### 2. Data

First, we will get the latitude and longitude of a location (default is North York, ON).  

Second, we are going to use the **explore** function of the Foursquare API to extract the venue IDs of recommended restaurants nearby.  

Third, based on each venue ID, we will use the **venues** function to get details of each restaurant, including **price**, the count of **likes**, and the **rating**. The first two features are used to construct our independent variables, and the last one is the dependent variable in our multiple linear regression model.

#### 2.1. Get the latitude and longitude of a location

In [1]:
!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
!pip install msgpack
!pip install geocoder
import geocoder


In [2]:
# Input a location. Default is North York, ON.
address = 'North York, ON'
geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(address, [latitude, longitude])

North York, ON [43.7709163, -79.4124102]
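
geopy's `geocode` returns `None` when it cannot resolve an address, in which case `location.latitude` in the cell above would raise an `AttributeError`. The following is a minimal sketch of a guard, not part of the original notebook. The helper `resolve_coordinates` and the stand-in `fake_geocode` are hypothetical names introduced for illustration, and the fallback coordinates are simply the North York values printed above.

```python
from collections import namedtuple

# Fallback taken from the notebook output for 'North York, ON'.
DEFAULT_COORDS = (43.7709163, -79.4124102)


def resolve_coordinates(address, geocode, default=DEFAULT_COORDS):
    """Return (latitude, longitude) for `address`, or `default` if lookup fails.

    `geocode` is any callable shaped like geolocator.geocode: it returns an
    object with .latitude and .longitude, or None when nothing is found.
    """
    location = geocode(address)
    if location is None:
        return default
    return (location.latitude, location.longitude)


# Stand-in geocoder (hypothetical) so the sketch runs without network access.
FakeLocation = namedtuple('FakeLocation', ['latitude', 'longitude'])


def fake_geocode(address):
    known = {'North York, ON': FakeLocation(43.7709163, -79.4124102)}
    return known.get(address)


print(resolve_coordinates('North York, ON', fake_geocode))
print(resolve_coordinates('Nowhere, ZZ', fake_geocode))
```

In the notebook itself, the call would be `resolve_coordinates(address, geolocator.geocode)`.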


#### 2.2. Utilize the Foursquare API to explore the restaurants nearby

Define Foursquare Credentials and Version

In [3]:
# The code was removed by Watson Studio for sharing.

Get the top 50 venues that are within a radius of 1,000 meters

In [5]:
VERSION = '20181105'
radius = 1000 # within 1000 meters
LIMIT = 50 # maximum 50 venues
SECTION = 'food' # restaurants

url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}&section={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, radius, LIMIT, SECTION)
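
The long `format` call above is easy to get wrong, because the placeholders are matched by position only. As an alternative sketch, the same query string can be assembled from a dictionary with the standard library's `urlencode`. The function name `build_explore_url` and the placeholder credentials `'MY_ID'` and `'MY_SECRET'` are assumptions made for illustration. The parameter names and default values are the ones used in the cell above.

```python
from urllib.parse import urlencode, urlparse, parse_qs

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(client_id, client_secret, lat, lng,
                      version='20181105', radius=1000, limit=50,
                      section='food'):
    """Build the explore URL from named parameters instead of positions."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(lat, lng),  # "latitude,longitude"
        'v': version,
        'radius': radius,    # meters
        'limit': limit,      # maximum number of venues
        'section': section,  # 'food' restricts results to restaurants
    }
    # urlencode percent-encodes values, so the comma in `ll` is escaped.
    return EXPLORE_ENDPOINT + '?' + urlencode(params)


# Placeholder credentials, for illustration only.
example_url = build_explore_url('MY_ID', 'MY_SECRET', 43.7709163, -79.4124102)

# Parse the URL back to confirm each parameter landed under the right key.
query = parse_qs(urlparse(example_url).query)
print(query['ll'], query['radius'], query['section'])
```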

Send the GET request and examine the results

In [6]:
import requests
results = requests.get(url).json()
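
The cells that follow index straight into `results['response']['groups'][0]['items']`, which fails with a `KeyError` or `IndexError` if the request was rejected or returned no groups. The sketch below shows one way to check the payload first. It assumes the response carries a `meta` object with a numeric `code` (and an `errorDetail` on failure). The helper name `extract_items` and both toy payloads are hypothetical and written by hand for this example.

```python
def extract_items(payload):
    """Return the list of recommended items, or raise if the call failed.

    Assumes a payload shaped like {'meta': {'code': ...}, 'response': {...}}.
    """
    meta = payload.get('meta', {})
    code = meta.get('code')
    if code != 200:
        raise ValueError('Foursquare request failed: code={}, detail={}'.format(
            code, meta.get('errorDetail')))
    groups = payload.get('response', {}).get('groups', [])
    if not groups:
        return []  # a valid response with nothing recommended
    return groups[0].get('items', [])


# Hand-written payloads that only mimic the structure used in this notebook.
ok_payload = {
    'meta': {'code': 200},
    'response': {'groups': [{'items': [{'venue': {'id': 'abc'}}]}]},
}
bad_payload = {'meta': {'code': 400, 'errorDetail': 'Missing credentials'}}

print(len(extract_items(ok_payload)))
try:
    extract_items(bad_payload)
except ValueError as err:
    print(err)
```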

In [7]:
count = len(results['response']['groups'][0]['items'])
def restaurant():
    if count <= 1:
        return 'restaurant' 
    else:
        return 'restaurants'

In [8]:
scope = round(radius/1000, 2)
if scope <= 1:
    km = 'km'
else:
    km = 'kms'

In [9]:
'{} {} are found within {} {} radius of {}.'.format(count, restaurant(), scope, km, address)

'50 restaurants are found within 1.0 km radius of North York, ON.'

Now we are ready to clean the JSON and structure it into a pandas dataframe

In [10]:
items = results['response']['groups'][0]['items']
items[0]

{'reasons': {'count': 0,
  'items': [{'summary': 'This spot is popular',
    'type': 'general',
    'reasonName': 'globalInteractionReason'}]},
 'venue': {'id': '563d44fccd1044ad67a744fb',
  'name': "The Captain's Boil",
  'location': {'address': '5313 Yonge St',
   'lat': 43.773255217045026,
   'lng': -79.41380541792645,
   'labeledLatLngs': [{'label': 'display',
     'lat': 43.773255217045026,
     'lng': -79.41380541792645}],
   'distance': 283,
   'postalCode': 'M2N 5R4',
   'cc': 'CA',
   'city': 'Toronto',
   'state': 'ON',
   'country': 'Canada',
   'formattedAddress': ['5313 Yonge St', 'Toronto ON M2N 5R4', 'Canada']},
  'categories': [{'id': '4bf58dd8d48988d1ce941735',
    'name': 'Seafood Restaurant',
    'pluralName': 'Seafood Restaurants',
    'shortName': 'Seafood',
    'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/seafood_',
     'suffix': '.png'},
    'primary': True}],
  'photos': {'count': 0, 'groups': []}},
 'referralId': 'e-3-563d44fccd1044ad67a744f

Create a dataframe that contains all restaurants around the input location

In [11]:
import pandas as pd
# In pandas 1.0 and later, json_normalize is also available as pd.json_normalize
from pandas.io.json import json_normalize

In [12]:
dataframe = json_normalize(items) # flatten JSON
dataframe.head()

| | reasons.count | reasons.items | referralId | venue.categories | venue.id | venue.location.address | venue.location.cc | venue.location.city | venue.location.country | venue.location.crossStreet | ... | venue.location.labeledLatLngs | venue.location.lat | venue.location.lng | venue.location.neighborhood | venue.location.postalCode | venue.location.state | venue.name | venue.photos.count | venue.photos.groups | venue.venuePage.id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | [{'summary': 'This spot is popular', 'type': '... | e-3-563d44fccd1044ad67a744fb-0 | [{'id': '4bf58dd8d48988d1ce941735', 'name': 'S... | 563d44fccd1044ad67a744fb | 5313 Yonge St | CA | Toronto | Canada | | ... | [{'label': 'display', 'lat': 43.77325521704502... | 43.773255 | -79.413805 | | M2N 5R4 | ON | The Captain's Boil | 0 | [] | |
| 1 | 0 | [{'summary': 'This spot is popular', 'type': '... | e-3-5a35b4443abcaf37eb1a0d88-1 | [{'id': '4bf58dd8d48988d1cc941735', 'name': 'S... | 5a35b4443abcaf37eb1a0d88 | | CA | Toronto | Canada | | ... | [{'label': 'display', 'lat': 43.7665789176648,... | 43.766579 | -79.412131 | Willowdale | M2N 5P1 | ON | The Keg | 0 | [] | |
| 2 | 0 | [{'summary': 'This spot is popular', 'type': '... | e-3-529f667511d2b09b2a210b5f-2 | [{'id': '4bf58dd8d48988d153941735', 'name': 'B... | 529f667511d2b09b2a210b5f | 5314 Yonge St | CA | Toronto | Canada | at McKee Ave | ... | [{'label': 'display', 'lat': 43.77305366245606... | 43.773054 | -79.414082 | | M2N 6V1 | ON | Burrito Boyz | 0 | [] | |
| 3 | 0 | [{'summary': 'This spot is popular', 'type': '... | e-3-5a02789d0a464d3112a58785-3 | [{'id': '55a59bace4b013909087cb24', 'name': 'R... | 5a02789d0a464d3112a58785 | 5051 Yonge St | CA | Toronto | Canada | btwn Elmwood & Hillcrest Ave | ... | [{'label': 'display', 'lat': 43.76699771023422... | 43.766998 | -79.412222 | Willowdale | M2N 5P2 | ON | Konjiki Ramen | 0 | [] | |
| 4 | 0 | [{'summary': 'This spot is popular', 'type': '... | e-3-53c7201c498ef6785edb6856-4 | [{'id': '4bf58dd8d48988d113941735', 'name': 'K... | 53c7201c498ef6785edb6856 | 5310 Yonge St | CA | | Canada | | ... | [{'label': 'display', 'lat': 43.77300999884723... | 43.77301 | -79.413875 | | M2N 5P9 | | Dakgogi | 0 | [] | |
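
`json_normalize` turns nested dictionaries into dot-separated column names such as `venue.location.lat`, while lists (for example `venue.categories`) stay in a single cell. The sketch below mimics that naming rule for one record in plain Python. It is an illustration of the idea, not pandas' implementation. The function name `flatten` is hypothetical and the sample item is a trimmed copy of `items[0]` shown earlier.

```python
def flatten(record, parent_key='', sep='.'):
    """Flatten nested dicts into one dict with dot-separated keys."""
    flat = {}
    for key, value in record.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries, extending the key prefix.
            flat.update(flatten(value, full_key, sep))
        else:
            # Scalars and lists are stored unchanged under the dotted key.
            flat[full_key] = value
    return flat


# Trimmed copy of the first item returned by the explore call.
item = {
    'reasons': {'count': 0},
    'venue': {
        'id': '563d44fccd1044ad67a744fb',
        'name': "The Captain's Boil",
        'location': {'lat': 43.773255217045026, 'lng': -79.41380541792645},
        'categories': [{'name': 'Seafood Restaurant'}],
    },
}

flat = flatten(item)
for column in sorted(flat):
    print(column)
```

Because `venue.categories` remains a list of dictionaries after flattening, a separate step such as `get_category_type` is still needed to pull out the category name.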


In [13]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [14]:
# filter columns
filtered_columns = ['venue.name', 'venue.categories'] + [col for col in dataframe.columns if col.startswith('venue.location.')] + ['venue.id']
dataframe_filtered = dataframe.loc[:, filtered_columns].copy()  # work on a copy so later column assignments are safe

# filter the category for each row
dataframe_filtered['venue.categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean columns
dataframe_filtered.columns = [col.split('.')[-1] for col in dataframe_filtered.columns]
dataframe_filtered

| | name | categories | address | cc | city | country | crossStreet | distance | formattedAddress | labeledLatLngs | lat | lng | neighborhood | postalCode | state | id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | The Captain's Boil | Seafood Restaurant | 5313 Yonge St | CA | Toronto | Canada | | 283 | [5313 Yonge St, Toronto ON M2N 5R4, Canada] | [{'label': 'display', 'lat': 43.77325521704502... | 43.773255 | -79.413805 | | M2N 5R4 | ON | 563d44fccd1044ad67a744fb |
| 1 | The Keg | Steakhouse | | CA | Toronto | Canada | | 483 | [Toronto ON M2N 5P1, Canada] | [{'label': 'display', 'lat': 43.7665789176648,... | 43.766579 | -79.412131 | Willowdale | M2N 5P1 | ON | 5a35b4443abcaf37eb1a0d88 |
| 2 | Burrito Boyz | Burrito Place | 5314 Yonge St | CA | Toronto | Canada | at McKee Ave | 273 | [5314 Yonge St (at McKee Ave), Toronto ON M2N ... | [{'label': 'display', 'lat': 43.77305366245606... | 43.773054 | -79.414082 | | M2N 6V1 | ON | 529f667511d2b09b2a210b5f |
| 3 | Konjiki Ramen | Ramen Restaurant | 5051 Yonge St | CA | Toronto | Canada | btwn Elmwood & Hillcrest Ave | 436 | [5051 Yonge St (btwn Elmwood & Hillcrest Ave),... | [{'label': 'display', 'lat': 43.76699771023422... | 43.766998 | -79.412222 | Willowdale | M2N 5P2 | ON | 5a02789d0a464d3112a58785 |
| 4 | Dakgogi | Korean Restaurant | 5310 Yonge St | CA | | Canada | | 261 | [5310 Yonge St, M2N 5P9, Canada] | [{'label': 'display', 'lat': 43.77300999884723... | 43.77301 | -79.413875 | | M2N 5P9 | | 53c7201c498ef6785edb6856 |
| 5 | Aroma Espresso Bar | Café | 6 Parkhome | CA | North York | Canada | | 172 | [6 Parkhome, North York ON, Canada] | [{'label': 'display', 'lat': 43.76944882099181... | 43.769449 | -79.413081 | | | ON | 557767bc498ea4a20c8043f6 |
| 6 | Sushi Bong | Sushi Restaurant | 5 Northtown Way | CA | Toronto | Canada | at Yonge St | 512 | [5 Northtown Way (at Yonge St), Toronto ON M2N... | [{'label': 'display', 'lat': 43.77542805510968... | 43.775428 | -79.413654 | | M2N 7A1 | ON | 4b2c1999f964a52098c124e3 |
| 7 | Pastel Creperie & Dessert House | Creperie | 5417 Yonge St | CA | Toronto | Canada | at Finch Ave. | 617 | [5417 Yonge St (at Finch Ave.), Toronto ON M2N... | [{'label': 'display', 'lat': 43.77621893353755... | 43.776219 | -79.414648 | | M2N 5R6 | ON | 4d446252e198721e4c23bb8b |
| 8 | Buk Chang Dong Soon Tofu 북창동 순두부 돌솥밥 | Korean Restaurant | 5445 Yonge St. | CA | Toronto | Canada | at Kempford Blvd. | 728 | [5445 Yonge St. (at Kempford Blvd.), Toronto O... | [{'label': 'display', 'lat': 43.77721927435775... | 43.777219 | -79.414861 | | M2N 5S1 | ON | 4b5c97d1f964a520a83829e3 |
| 9 | St. Louis Bar & Grill | Wings Joint | 2050 Yonge St. | CA | Toronto | Canada | | 260 | [2050 Yonge St., Toronto ON M4S 1Z9, Canada] | [{'label': 'display', 'lat': 43.77304026482439... | 43.77304 | -79.413771 | | M4S 1Z9 | ON | 4b590484f964a520b27828e3 |


#### 2.3. Collect **venues** data of each restaurant, including **price**,  the count of **likes**, the count of **tips**, and their **rating**

In [15]:
ratings = list()
prices = list()
likes = list()
tips = list()

for venue_id in dataframe_filtered.id:
    url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)
    result = requests.get(url).json()
    # A field that is missing from the response is recorded as None
    try:
        ratings.append(result['response']['venue']['rating'])
    except (KeyError, TypeError):
        ratings.append(None)
    try:
        prices.append(result['response']['venue']['price']['tier'])
    except (KeyError, TypeError):
        prices.append(None)
    try:
        likes.append(result['response']['venue']['likes']['count'])
    except (KeyError, TypeError):
        likes.append(None)
    try:
        tips.append(result['response']['venue']['stats']['tipCount'])
    except (KeyError, TypeError):
        tips.append(None)

dataframe_filtered['rating'] = ratings
dataframe_filtered['price'] = prices
dataframe_filtered['likes'] = likes
dataframe_filtered['tips'] = tips  # keep the tip counts alongside the other features
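
The loop above repeats the same pattern four times: walk down a chain of keys and fall back to `None` if any level is missing. A small helper can express that once. This is a sketch only. The name `dig` is hypothetical, and the sample payload is written by hand to mirror the key paths used in the loop, with the values shown for The Captain's Boil.

```python
def dig(data, *keys, default=None):
    """Follow `keys` through nested dicts; return `default` if any is missing."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hand-written payload with the same key paths as the loop above;
# 'stats' is left out on purpose to show the fallback.
sample = {
    'response': {
        'venue': {
            'rating': 8.1,
            'price': {'tier': 3},
            'likes': {'count': 41},
        }
    }
}

print(dig(sample, 'response', 'venue', 'rating'))
print(dig(sample, 'response', 'venue', 'price', 'tier'))
print(dig(sample, 'response', 'venue', 'stats', 'tipCount'))
```

With such a helper, each `try`/`except` pair in the loop could shrink to a single line, for example `ratings.append(dig(result, 'response', 'venue', 'rating'))`.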

In [16]:
restaurant = dataframe_filtered[['name','categories', 'address', 'postalCode', 'lat', 'lng', 'price', 'likes', 'rating']]
restaurant

| | name | categories | address | postalCode | lat | lng | price | likes | rating |
|---|---|---|---|---|---|---|---|---|---|
| 0 | The Captain's Boil | Seafood Restaurant | 5313 Yonge St | M2N 5R4 | 43.773255 | -79.413805 | 3.0 | 41 | 8.1 |
| 1 | The Keg | Steakhouse | | M2N 5P1 | 43.766579 | -79.412131 | 4.0 | 17 | 8.5 |
| 2 | Burrito Boyz | Burrito Place | 5314 Yonge St | M2N 6V1 | 43.773054 | -79.414082 | 2.0 | 43 | 7.8 |
| 3 | Konjiki Ramen | Ramen Restaurant | 5051 Yonge St | M2N 5P2 | 43.766998 | -79.412222 | | 25 | 8.0 |
| 4 | Dakgogi | Korean Restaurant | 5310 Yonge St | M2N 5P9 | 43.77301 | -79.413875 | 2.0 | 11 | 7.6 |
| 5 | Aroma Espresso Bar | Café | 6 Parkhome | | 43.769449 | -79.413081 | 1.0 | 17 | 7.5 |
| 6 | Sushi Bong | Sushi Restaurant | 5 Northtown Way | M2N 7A1 | 43.775428 | -79.413654 | 2.0 | 67 | 7.9 |
| 7 | Pastel Creperie & Dessert House | Creperie | 5417 Yonge St | M2N 5R6 | 43.776219 | -79.414648 | 2.0 | 64 | 8.2 |
| 8 | Buk Chang Dong Soon Tofu 북창동 순두부 돌솥밥 | Korean Restaurant | 5445 Yonge St. | M2N 5S1 | 43.777219 | -79.414861 | 1.0 | 98 | 8.4 |
| 9 | St. Louis Bar & Grill | Wings Joint | 2050 Yonge St. | M4S 1Z9 | 43.77304 | -79.413771 | 2.0 | 60 | 7.3 |
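
Konjiki Ramen has no price tier in the table above, and a regression on price and likes needs all three values for every row. How missing values are handled is not stated at this point, so the sketch below shows only one possible approach: dropping incomplete rows. The helper name `complete_rows` is hypothetical, and the three sample rows are copied from the table, with the missing price written as `None`.

```python
def complete_rows(rows, required=('price', 'likes', 'rating')):
    """Keep only the rows where every required field has a value."""
    return [
        row for row in rows
        if all(row.get(field) is not None for field in required)
    ]


# Three rows copied from the table above; the missing price is None here.
rows = [
    {'name': "The Captain's Boil", 'price': 3.0, 'likes': 41, 'rating': 8.1},
    {'name': 'Konjiki Ramen', 'price': None, 'likes': 25, 'rating': 8.0},
    {'name': 'The Keg', 'price': 4.0, 'likes': 17, 'rating': 8.5},
]

usable = complete_rows(rows)
print([row['name'] for row in usable])
```

On the dataframe itself, the equivalent pandas call would be `restaurant.dropna(subset=['price', 'likes', 'rating'])`.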
