## The Best Neighborhood for Opening a Chinese Restaurant in Toronto

### Introduction

A friend of mine wants to open a Chinese restaurant in Toronto and is looking for a suitable location.
He knows that I am learning data science, so he asked me to recommend where to open it.

### Data

In an earlier project, Segmenting and Clustering Neighborhoods in Toronto, I built two datasets that I reuse here. I also use Foursquare location data to carry out the analysis.

The first dataset is 'Toronto_Venues_Sorted'. I use it to rank each 'Neighborhood' according to whether 'Chinese Restaurant' appears among its most common venues.

The second dataset is 'Borough_Toronto', which includes the latitude and longitude of each neighborhood. I use it to find the Chinese restaurants near each neighborhood in Toronto. I can then look up the rating and the number of tips for each restaurant.

Using the common venue score, the ratings, and the tips, I can recommend the best place for my friend.

In [129]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner

#### The dataset: Toronto_Venues_Sorted

In [130]:
toronto_venues_sorted = pd.read_csv("toronto_venues_sorted.csv")
toronto_venues_sorted.drop('Unnamed: 0', axis=1, inplace=True)

toronto_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Coffee Shop,Café,Sushi Restaurant,Bar,Thai Restaurant,Asian Restaurant,Gym,Steakhouse,Hotel,American Restaurant
1,Agincourt,Chinese Restaurant,Coffee Shop,Indian Restaurant,Supermarket,Pharmacy,Restaurant,Caribbean Restaurant,Bookstore,Breakfast Spot,Clothing Store
2,"Agincourt North, L'Amoreaux East, Milliken, St...",Chinese Restaurant,Vietnamese Restaurant,Coffee Shop,Bubble Tea Shop,Bakery,Pharmacy,Dessert Shop,Noodle House,Korean Restaurant,Japanese Restaurant
3,"Albion Gardens, Beaumond Heights, Humbergate, ...",Coffee Shop,Fast Food Restaurant,Asian Restaurant,Pharmacy,Sandwich Place,Grocery Store,Chinese Restaurant,Caribbean Restaurant,Indian Restaurant,Sushi Restaurant
4,"Alderwood, Long Branch",Coffee Shop,Burger Joint,Furniture / Home Store,Café,Breakfast Spot,Bakery,Middle Eastern Restaurant,Burrito Place,Seafood Restaurant,Grocery Store


#### Getting 'Common Venue Score' and ranking the dataset by it.

In [131]:
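# score each neighborhood by how high 'Chinese Restaurant' ranks among its most common venues:
# 1st-3rd -> 10, 4th-6th -> 7, 7th-10th -> 4, not in the top 10 -> 0
for i in range(toronto_venues_sorted.shape[0]):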
    if toronto_venues_sorted.loc[i,'1st Most Common Venue'].find('Chinese Restaurant')>=0 or   \
       toronto_venues_sorted.loc[i,'2nd Most Common Venue'].find('Chinese Restaurant')>=0 or   \
       toronto_venues_sorted.loc[i,'3rd Most Common Venue'].find('Chinese Restaurant')>=0:
            toronto_venues_sorted.loc[i, 'Common Venue Score'] = 10
    elif toronto_venues_sorted.loc[i,'4th Most Common Venue'].find('Chinese Restaurant')>=0 or \
       toronto_venues_sorted.loc[i,'5th Most Common Venue'].find('Chinese Restaurant')>=0 or   \
       toronto_venues_sorted.loc[i,'6th Most Common Venue'].find('Chinese Restaurant')>=0:
            toronto_venues_sorted.loc[i, 'Common Venue Score'] = 7
    elif toronto_venues_sorted.loc[i,'7th Most Common Venue'].find('Chinese Restaurant')>=0 or \
       toronto_venues_sorted.loc[i,'8th Most Common Venue'].find('Chinese Restaurant')>=0 or   \
       toronto_venues_sorted.loc[i,'9th Most Common Venue'].find('Chinese Restaurant')>=0 or   \
       toronto_venues_sorted.loc[i,'10th Most Common Venue'].find('Chinese Restaurant')>=0:
            toronto_venues_sorted.loc[i, 'Common Venue Score'] = 4
    else:
            toronto_venues_sorted.loc[i, 'Common Venue Score'] = 0
            
toronto_venues_sorted.sort_values('Common Venue Score' , ascending = False, inplace = True)   
toronto_venues_sorted.reset_index(drop=True, inplace=True)
toronto_venues_sorted[['Neighborhood','Common Venue Score']].head(12)

Unnamed: 0,Neighborhood,Common Venue Score
0,Cedarbrae,10.0
1,"Agincourt North, L'Amoreaux East, Milliken, St...",10.0
2,"Dorset Park, Scarborough Town Centre, Wexford ...",10.0
3,"Fairview, Henry Farm, Oriole",10.0
4,Bayview Village,10.0
5,Hillcrest Village,10.0
6,L'Amoreaux West,10.0
7,Agincourt,10.0
8,"Maryvale, Wexford",10.0
9,"Clarks Corners, Sullivan, Tam O'Shanter",10.0


#### The dataset: Borough_Toronto, which includes the latitude and longitude

In [132]:
borough_toronto = pd.read_csv("toronto_borough.csv")
borough_toronto.drop('Unnamed: 0', axis=1, inplace=True)
borough_toronto.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Harbourfront, Regent Park",43.65426,-79.360636
3,M6A,North York,"Lawrence Heights, Lawrence Manor",43.718518,-79.464763
4,M7A,Queen's Park,Queen's Park,43.662301,-79.389494


#### Getting all Chinese restaurants of neighborhoods in Toronto

In [133]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID (do not publish real credentials)
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100
radius = 1000

In [134]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [135]:
def getNearbyVenues(names, latitudes, longitudes):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):      
        # create the API request URL (note: `radius` defined above is not passed in this request)
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&limit={}'.format(
            CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, LIMIT)
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']      
        # return only relevant information for each nearby venue
        venues_list.append([(name, lat, lng, v['venue']['id'],
            v['venue']['name'], v['venue']['location']['lat'], v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])    
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 'Neighborhood Latitude', 'Neighborhood Longitude', 
                  'Venue ID','Venue Name', 'Venue Latitude','Venue Longitude',  'Venue Category']
    return(nearby_venues)
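

The list comprehension in `getNearbyVenues` indexes `categories[0]` directly, which would raise an error for a venue that has no category. The sketch below shows one way the parsing step could be made more defensive. It is an illustration only: the helper name `rows_from_explore_response` is hypothetical, and `sample_payload` is a made-up dictionary (with placeholder IDs and names) that only imitates the nested shape the function above reads.

```python
def rows_from_explore_response(name, lat, lng, payload):
    """Flatten one explore-style response into row tuples.

    Venues without any category are skipped instead of raising IndexError.
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        if not categories:
            continue  # nothing to classify this venue by
        rows.append((name, lat, lng, venue['id'], venue['name'],
                     venue['location']['lat'], venue['location']['lng'],
                     categories[0]['name']))
    return rows


# Made-up payload for illustration; IDs and names are placeholders.
sample_payload = {
    'response': {'groups': [{'items': [
        {'venue': {'id': 'id-1', 'name': 'Example Noodles',
                   'location': {'lat': 43.77, 'lng': -79.33},
                   'categories': [{'name': 'Chinese Restaurant'}]}},
        {'venue': {'id': 'id-2', 'name': 'Uncategorized Place',
                   'location': {'lat': 43.76, 'lng': -79.32},
                   'categories': []}},
    ]}]}
}

rows = rows_from_explore_response('Parkwoods', 43.753259, -79.329656, sample_payload)
print(rows)
```

An empty or malformed response simply yields an empty list, so the caller can keep looping over the remaining neighborhoods.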

In [136]:
toronto_venues = getNearbyVenues(names=borough_toronto['Neighborhood'],
                                   latitudes=borough_toronto['Latitude'],
                                   longitudes=borough_toronto['Longitude']
                                  )

# keep only the venues whose category contains 'Chinese Restaurant'
is_chinese = toronto_venues['Venue Category'].str.contains('Chinese Restaurant', regex=False)
toronto_chn_rest = toronto_venues[is_chinese].reset_index(drop=True)

In [137]:
print('dataset shape: ', toronto_chn_rest.shape)
print('\n')
toronto_chn_rest.head()

dataset shape:  (160, 8)




Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue ID,Venue Name,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.753259,-79.329656,5b6a321d340a58002cc0d9db,Omni Palace Noodle House,43.771047,-79.33157,Chinese Restaurant
1,Parkwoods,43.753259,-79.329656,584e235102b60e2d40263821,天天渔港 Captain's Catch,43.774961,-79.333873,Chinese Restaurant
2,Parkwoods,43.753259,-79.329656,4ae71b0cf964a52078a821e3,Noodle Delight,43.772399,-79.320209,Chinese Restaurant
3,Victoria Village,43.725882,-79.315572,55dded07498eecf46ed3e0d9,Hakka Legend,43.726046,-79.286561,Chinese Restaurant
4,Victoria Village,43.725882,-79.315572,5269be82498e1cf7de5d5dd4,Super Hakka Restaurant,43.742892,-79.304949,Chinese Restaurant
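
One possible next step, which the notebook does not take here, is to count how many Chinese restaurants were found near each neighborhood. The sketch below uses only the five rows shown above, re-typed by hand into a small frame. The variable names and the 'Chinese Restaurant Count' column name are assumptions made for illustration.

```python
import pandas as pd

# The five rows displayed above, reduced to the two columns needed here.
chn_sample = pd.DataFrame({
    'Neighborhood': ['Parkwoods', 'Parkwoods', 'Parkwoods',
                     'Victoria Village', 'Victoria Village'],
    'Venue Name': ['Omni Palace Noodle House', "天天渔港 Captain's Catch",
                   'Noodle Delight', 'Hakka Legend', 'Super Hakka Restaurant'],
})

# Count restaurants per neighborhood and list the largest counts first.
counts = (chn_sample.groupby('Neighborhood')['Venue Name']
          .count()
          .rename('Chinese Restaurant Count')
          .sort_values(ascending=False)
          .reset_index())
print(counts)
```

The same grouping could be run on the full `toronto_chn_rest` frame. Note that nearby neighborhoods may share venues, so the counts can overlap.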


#### Getting the rating and tips of each restaurant

For example, we can get the rating and the number of tips for the following two restaurants: Noodle Delight and 天天渔港 Captain's Catch.

In [138]:
venue_id = '4ae71b0cf964a52078a821e3' # ID of Noodle Delight
url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)

result = requests.get(url).json()
try:
    print('Rating: ', result['response']['venue']['rating'])
except KeyError:
    print('This venue has not been rated yet.')
    
print('# of Tips: ', result['response']['venue']['tips']['count'])

Rating:  7.3
# of Tips:  12


In [139]:
venue_id = '584e235102b60e2d40263821' # ID of 天天渔港 Captain's Catch (per the venue table above)
url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)

result = requests.get(url).json()
try:
    print('Rating: ', result['response']['venue']['rating'])
except KeyError:
    print('This venue has not been rated yet.')
    
print('# of Tips: ', result['response']['venue']['tips']['count'])

Rating:  7.7
# of Tips:  3
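
The two cells above repeat the same lookup, so the extraction step could be wrapped in a small helper. This is a sketch under stated assumptions: the function name `rating_and_tips` is hypothetical, and it only reads the keys the cells above already use (`response`, `venue`, `rating`, `tips`, `count`). The sample dictionaries are hand-built to mirror the Noodle Delight output rather than fetched from the API.

```python
def rating_and_tips(result):
    """Return (rating, tip_count) from a venue-details response dict.

    rating is None when the venue has not been rated yet;
    tip_count falls back to 0 when the tips block is missing.
    """
    venue = result.get('response', {}).get('venue', {})
    rating = venue.get('rating')
    tip_count = venue.get('tips', {}).get('count', 0)
    return rating, tip_count


# Hand-built examples shaped like the responses used above.
rated = {'response': {'venue': {'rating': 7.3, 'tips': {'count': 12}}}}
unrated = {'response': {'venue': {'tips': {'count': 2}}}}

print(rating_and_tips(rated))
print(rating_and_tips(unrated))
```

Called once per 'Venue ID' in `toronto_chn_rest`, a helper like this would give the rating and tip count columns needed for the final recommendation.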
