# Capstone Project - The Battle of the Neighborhoods (Week 2)
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

My friend, who is a very recent immigrant in Canada and has a great passion for cuisine, wants to start a restaurant business targeting Chinese students in the college town. Right now, she has two options in mind, the University of Toronto or the University of British Columbia (in Vancouver, Canada). 

Although she has been well-informed that many Chinese families reside in these two cities, she has her concerns. What if these two college towns are already filled with high densities of a Chinese restaurant and her business would be engaged in fierce competitions from the beginning?

Therefore, she asks me to help her out, trying to evaluate the nearby neighborhoods, and figure out whether she should continue her business plan. If not, why not? Otherwise, where should she land the new business and, if possible, what kind of Chinese dishes should she be serving?

## Data <a name="data"></a>

In [1]:
# import libraries
import numpy as np

import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json

import requests
from pandas.io.json import json_normalize

import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans

import folium

print('Libraries imported.')

Libraries imported.


In [2]:
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

### The University of Toronto

#### Geopy.geocoders
To get the longtitude and langtitude of the universities. 

the University of British Columbia

In [3]:
# get coordinates of the university of toronto
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="myGeocoder")
address = 'the University of Toronto'
coordinates = geolocator.geocode(address)
ut_center = [coordinates.latitude, coordinates.longitude]

print(coordinates.latitude, coordinates.longitude)

43.663461999999996 -79.39775965337452


In [4]:
# transfer longitude and latitude to x, y

import shapely.geometry
import pyproj
import math

def lonlat_to_xy(lon, lat):
    proj_latlon = pyproj.Proj(proj="latlong", datum="WGS84")
    proj_xy = pyproj.Proj(proj='utm', zone=17, datum="WGS84")
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0],xy[1]

def xy_to_lonlat(x, y):
    proj_latlon = pyproj.Proj(proj="latlong", datum="WGS84")
    proj_xy = pyproj.Proj(proj='utm', zone=17, datum="WGS84")
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)

print('Coordinate transformation check')
print('-------------------------------')
print('the University of Toronto longitude={}, latitude={}'.format(ut_center[1], ut_center[0]))
x, y = lonlat_to_xy(ut_center[1], ut_center[0])
print('the University of Toronto x={}, y={}'.format(x, y))
lo, la = xy_to_lonlat(x, y)
print('The University of Toronto longitude={}, latitude={}'.format(lo, la))

Coordinate transformation check
-------------------------------
the University of Toronto longitude=-79.39775965337452, latitude=43.663461999999996
the University of Toronto x=629182.8789669871, y=4835742.658336909
The University of Toronto longitude=-79.39775965337452, latitude=43.66346199999999


In [5]:
# create a hexagonal grid of cells

ut_center_x, ut_center_y = lonlat_to_xy(ut_center[1], ut_center[0]) # the center of University of Toronto

k = math.sqrt(5) / 2
x_min = ut_center_x - 5000
x_step = 590
y_min = ut_center_y - 5000 - (int(35/k)*k*500 - 10000)/2
y_step = 515 * k

latitudes = []
longitudes = []
distances_from_center = []
xs = []
ys = []
for i in range(0, int(35/k)):
    y = y_min + i * y_step
    x_offset = 200 if i%5==0 else 0
    for j in range(0, 35):
        x = x_min + j * x_step + x_offset
        distance_from_center = calc_xy_distance(ut_center_x, ut_center_y, x, y)
        if (distance_from_center <= 5001):
            lon, lat = xy_to_lonlat(x, y)
            latitudes.append(lat)
            longitudes.append(lon)
            distances_from_center.append(distance_from_center)
            xs.append(x)
            ys.append(y)

print(len(latitudes), 'candidate neighborhood centers generated.')

228 candidate neighborhood centers generated.


In [6]:
# Visualize the map of ut

map_ut = folium.Map(location=ut_center, zoom_start=13)
folium.Marker(ut_center, popup='University of Toronoto').add_to(map_ut)
for lat, lon in zip(latitudes, longitudes):
    folium.Circle([lat, lon], radius=300, color='blue', fill=False).add_to(map_ut)
map_ut

In [7]:
# get the address of the locations

def get_address(latlon):
    geolocator = Nominatim(user_agent="myGeocoder")
    try:
        result = geolocator.reverse(latlon)
        address = result.raw['display_name']
        return address
    except:
        return None

In [8]:
address_list = []
for lat,long in zip(latitudes, longitudes):
    latlon = str(lat) + ',' + str(long)
    address = get_address(latlon)
    if address is None:
        address = 'NO ADDRESS'
    address_list.append(address)
    print(' .', end='')
print(' done.')

 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.


In [9]:
# make the dataframe with locations

ut_locations = pd.DataFrame({
                'Address': address_list,
                'Latitude': latitudes,
                'Longitude': longitudes,
                'X': xs,
                'Y': ys,
                'Distance_from_center': distance_from_center})

ut_locations.head()

Unnamed: 0,Address,Latitude,Longitude,X,Y,Distance_from_center
0,NO ADDRESS,43.622003,-79.416957,627722.878967,4831108.0,17520.847582
1,"William G. Davis Trail, Fort York, Spadina—For...",43.621902,-79.409647,628312.878967,4831108.0,17520.847582
2,"Bathurst Quay, Spadina—Fort York, Old Toronto,...",43.6218,-79.402337,628902.878967,4831108.0,17520.847582
3,"Welcome to the Toronto Islands Sand Dunes, Lak...",43.621697,-79.395027,629492.878967,4831108.0,17520.847582
4,"Island Yacht Club, 2, Lakeshore Avenue, Spadin...",43.621594,-79.387718,630082.878967,4831108.0,17520.847582


#### Foursquare API
To get Chinese restaurants in the listed neighborhoods.

In [10]:
# get FourSquare API

CLIENT_ID = 'B4TH4BXPF14E1QG3NPDWQFKAJK22N4PPXK3YQPIBNPLRWAZ0'
CLIENT_SECRET = 'QH2AOS2NEV2XSYBLZZR5W42EJIF43401OAK4RXXN2RQL302G'
VERSION = '20180605'
LIMIT = 100

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: B4TH4BXPF14E1QG3NPDWQFKAJK22N4PPXK3YQPIBNPLRWAZ0
CLIENT_SECRET:QH2AOS2NEV2XSYBLZZR5W42EJIF43401OAK4RXXN2RQL302G


In [24]:
food_category = '4d4b7105d754a06374d81259'

chinese_restaurant_categories = ['4bf58dd8d48988d145941735',"52af3a5e3cf9994f4e043bea",
"52af3a723cf9994f4e043bec","52af3a7c3cf9994f4e043bed","58daa1558bbb0b01f18ec1d3",
"52af3a673cf9994f4e043beb","52af3a903cf9994f4e043bee","4bf58dd8d48988d1f5931735",
"52af3a9f3cf9994f4e043bef","52af3aaa3cf9994f4e043bf0","52af3ab53cf9994f4e043bf1",
"52af3abe3cf9994f4e043bf2","52af3ac83cf9994f4e043bf3","52af3ad23cf9994f4e043bf4",
"52af3add3cf9994f4e043bf5","52af3af23cf9994f4e043bf7","52af3ae63cf9994f4e043bf6",
"52af3afc3cf9994f4e043bf8","52af3b053cf9994f4e043bf9","52af3b213cf9994f4e043bfa",
"52af3b293cf9994f4e043bfb","52af3b343cf9994f4e043bfc","52af3b3b3cf9994f4e043bfd",
"52af3b463cf9994f4e043bfe","52af3b633cf9994f4e043c01","52af3b513cf9994f4e043bff",
"52af3b593cf9994f4e043c00","52af3b6e3cf9994f4e043c02","52af3b773cf9994f4e043c03",
"52af3b813cf9994f4e043c04","52af3b893cf9994f4e043c05","52af3b913cf9994f4e043c06",
"52af3b9a3cf9994f4e043c07","52af3ba23cf9994f4e043c08",]

    
def is_restaurant(categories, specific_filter=None):
    restaurant_words = ['ramen','dumpling','vietnam','japanese','thai','sushi']
    
    # to find similiarly asian flavor restaurants in the neighborhoods
    
    restaurant = False
    specific = False
    for c in categories:
        category_name = c[0].lower()
        category_id = c[1]
        for r in restaurant_words:
            if r in category_name:
                restaurant = True
            if not(specific_filter is None) and (category_id in specific_filter):
                specific = True
                restaurant = True
                
    return restaurant, specific
        

def get_categories(categories):
    return [(cat['name'], cat['id']) for cat in categories]


def get_venues_near_location(lat, lng, category, radius=500):
    url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&categoryID={}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            category,
            radius,
            LIMIT)
    
    try:
        results = requests.get(url).json()["response"]['groups'][0]['items']
        venues = [(
            item['venue']['id'],
            item['venue']['name'],
            get_categories(item['venue']['categories']),
            (item['venue']['location']['lat'], item['venue']['location']['lng']),
            item['venue']['location']['formattedAddress'][0],
            item['venue']['location']['distance'],
            item['venue']['categories']) for item in results]
        
    except:
        venues = []
        
    return venues


def get_restaurants(lats, lons):
    restaurants = {} # dict for remove any duplicates resulting from area overlaps
    chinese_restaurants = {}
    location_restaurants = []
    
    print('Obtaining venues around candidate locations:', end="")
    for lat, lon in zip(lats,lons):
        venues = get_venues_near_location(lat, lon, food_category, radius=500)
        area_restaurants = [] # to gather all restaurants near THIS location
        for venue in venues:
            venue_id = venue[0]
            venue_name = venue[1]
            venue_categories = venue[2]
            venue_latlon = venue[3]
            venue_address = venue[4]
            venue_distance = venue[5]
            is_res, is_chinese = is_restaurant(venue_categories, specific_filter=chinese_restaurant_categories)
            
            if is_res:
                x,y = lonlat_to_xy(venue_latlon[1], venue_latlon[0])
                restaurant = (venue_id, venue_name, venue_latlon[0], venue_latlon[1], venue_address, venue_distance, is_chinese, x, y, venue_categories)
                print(restaurant)
                if venue_distance <= 300:
                    area_restaurants.append(restaurant)
                restaurants[venue_id] = restaurant
                if is_chinese:
                    chinese_restaurants[venue_id] = restaurant
        location_restaurants.append(area_restaurants)
        print(' .', end="")
    print("done")
    return restaurants, chinese_restaurants, location_restaurants

In [25]:
import pickle

restaurant = {}
chinese_restaurants = {}
location_restaurants = []
loaded = False
try:
    with open('pickles/ut_restaurants.pickle','rb') as f:
        restaurants = pickle.load(f)
    
    with open('pickles/ut_chinese_restaurants.pickle','rb') as f:
        chinese_restaurants = pickle.load(f)

    with open('pickles/ut_location_restaurants.pickle','rb') as f:
        location_restaurants = pickle.load(f)
        
    print('ut restaurants data loaded.')
    loaded = True
        
except:
    pass

if not loaded:
    restaurants, chinese_restaurants, location_restaurants = get_restaurants(latitudes, longitudes)
    
    with open('pickles/ut_restaurants.pickle','wb') as f:
        pickle.dump(restaurants,f)

    with open('pickles/ut_chinese_restaurants.pickle','wb') as f:
        pickle.dump(chinese_restaurants,f)

    with open('pickles/ut_location_restaurants.pickle','wb') as f:
        pickle.dump(location_restaurants,f)

Obtaining venues around candidate locations: . . . . . . . . . . . . . . . . . . . . . . . . . . . .('4b4fa8d4f964a5209b0f27e3', 'Rice & Noodle', 43.64028335648371, -79.43835074476583, '1508 Queen St. West', 478, True, 625958.5705346143, 4833105.965088503, [('Chinese Restaurant', '4bf58dd8d48988d145941735')])
 .('4b4fa8d4f964a5209b0f27e3', 'Rice & Noodle', 43.64028335648371, -79.43835074476583, '1508 Queen St. West', 333, True, 625958.5705346143, 4833105.965088503, [('Chinese Restaurant', '4bf58dd8d48988d145941735')])
 .('5792a2e4cd10f4e248ad94e2', 'Guu Izakaya', 43.641845549029775, -79.43108645308222, '1314 Queen Street West', 498, False, 626541.2169619111, 4833290.516345014, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
 . . . .('5b12b96b31ac6c00397662c0', 'Sansotei Ramen', 43.63917614914748, -79.39818352488898, 'Fort York Blvd & Dan Leckie Way', 231, False, 629200.7681652745, 4833044.736329042, [('Ramen Restaurant', '55a59bace4b013909087cb24')])
('4b03122bf964a520744c22e3',

 .('501d4d55e4b0825080212f3c', 'Gonoe Sushi Japanese Restaurant', 43.63901430970685, -79.38591360078077, '262 Queens Quay W', 381, False, 630190.8091923266, 4833045.934787308, [('Sushi Restaurant', '4bf58dd8d48988d1d2941735')])
('4b50aa13f964a5202e2c27e3', 'Spice Thai', 43.63940652819094, -79.38445898645135, '246 Queens Quay. W', 391, False, 630307.2907779303, 4833091.77838637, [('Thai Restaurant', '4bf58dd8d48988d149941735')])
 .('56201ed4498e7f700c462170', 'Miku', 43.64137436, -79.37753063, '10 Bay St (at Queens Quay W)', 209, False, 630861.8574182086, 4833321.23225318, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
('4ae33054f964a520759121e3', 'Pearl Harbourfront', 43.63815654013541, -79.38068763743328, '200-207 Queens Quay W.', 457, True, 630614.1949429085, 4832958.878858992, [('Chinese Restaurant', '4bf58dd8d48988d145941735')])
('5a4fdf56772fbc5e9fa73c7f', 'Chotto Matte', 43.646473, -79.378782, '161 Bay St', 481, False, 630749.864018983, 4833885.530360119, [('Japanese Rest

 .('4ae73054f964a5203ca921e3', 'Ki Modern Japanese + Bar', 43.647223, -79.3793738, '181 Bay St (at Wellington St. W)', 33, False, 630700.508592509, 4833967.895454468, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
('5a4fdf56772fbc5e9fa73c7f', 'Chotto Matte', 43.646473, -79.378782, '161 Bay St', 127, False, 630749.864018983, 4833885.530360119, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
('50b7c53616485cd9efad60d5', 'Sukhothai', 43.64848710249909, -79.37454735245633, '52 Wellington St. E. (at Church St)', 433, False, 631087.0078955343, 4834115.903261915, [('Thai Restaurant', '4bf58dd8d48988d149941735')])
('4b999927f964a5207c8635e3', 'NAMI', 43.650853361785124, -79.37588746641697, '55 Adelaide Street East (at Church street)', 492, False, 630973.7876666919, 4834376.591433672, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
('5632426a498e0433cad5aa04', 'Ruby Thai (First Canadian Place)', 43.64909140219368, -79.38159990145141, 'Canada', 241, False, 630516.928487627,

 .('4af45a8af964a52097f121e3', 'Banh Mi Nguyen Huong', 43.653628006211044, -79.3983759881993, '322 Spadina Ave. (at Dundas St E)', 349, False, 629154.2682209668, 4834649.503636821, [('Vietnamese Restaurant', '4bf58dd8d48988d14a941735')])
('4ad9f607f964a520691c21e3', 'Manpuku まんぷく', 43.653612411792935, -79.39061276446213, '105 McCaul St. Unit 29-31 (at Dundas St. W.)', 302, False, 629780.3328016791, 4834659.884215475, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
('4ddbe8697d8b771c0b09b885', 'Dim Sum King Seafood Restaurant', 43.653503, -79.395405, '421 Dundas St W', 125, True, 629394.1189355598, 4834640.248604937, [('Dim Sum Restaurant', '4bf58dd8d48988d1f5931735')])
('58a7c100076be13f60d1dff5', 'Saigon Lotus Restaurant', 43.654310800343325, -79.39922450076845, '6 Saint Andrew St (Spadina Ave)', 438, False, 629084.3807507063, 4834724.016569896, [('Vietnamese Restaurant', '4bf58dd8d48988d14a941735')])
('4b119787f964a520188023e3', 'Dumpling House', 43.65386003934455, -79.3985583

 .('52c5dd8d498e25dee9f19457', 'Isshin Ramen 一心', 43.65651655724417, -79.40694540739058, '421 College St (at Bathurst St)', 232, False, 628457.0661708091, 4834957.01286019, [('Ramen Restaurant', '55a59bace4b013909087cb24')])
 .('4bbfe851461576b02a7d7932', 'Thai Country Kitchen', 43.65615859831661, -79.39942261134613, '412 Spadina (Spadina & Nassau)', 265, False, 629064.4465840565, 4834928.930709187, [('Thai Restaurant', '4bf58dd8d48988d149941735')])
('52c5dd8d498e25dee9f19457', 'Isshin Ramen 一心', 43.65651655724417, -79.40694540739058, '421 College St (at Bathurst St)', 481, False, 628457.0661708091, 4834957.01286019, [('Ramen Restaurant', '55a59bace4b013909087cb24')])
('58a7c100076be13f60d1dff5', 'Saigon Lotus Restaurant', 43.654310800343325, -79.39922450076845, '6 Saint Andrew St (Spadina Ave)', 453, False, 629084.3807507063, 4834724.016569896, [('Vietnamese Restaurant', '4bf58dd8d48988d14a941735')])
 .('5cb0d23175dcb7002cb5e3ad', 'Koh Lipe', 43.655933, -79.39348, '35 Baldwin Street',

 .('4adf9880f964a520f47b21e3', 'Mazz Sushi', 43.66104944385259, -79.4298421488191, '993 Bloor St W (Dovercourt)', 322, False, 626601.2219981928, 4835425.256923007, [('Sushi Restaurant', '4bf58dd8d48988d1d2941735')])
('5668bfa4498eede0b632cf59', 'COO Café Bread or Rice', 43.660555, -79.431919, '1049 Bloor St W', 464, False, 626434.8021168628, 4835367.1754493695, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
('5a62bde660255e18e3b6de0e', 'Vit Beo', 43.662384, -79.42435, '858 Bloor Street West', 326, False, 627041.2423809166, 4835581.872146941, [('Vietnamese Restaurant', '4bf58dd8d48988d14a941735')])
('525b39ca498e5d8e8c489d58', 'Thai Room', 43.66263290519075, -79.42257743493886, '810 Bloor St W (Ossington)', 451, False, 627183.6366111843, 4835612.232114344, [('Thai Restaurant', '4bf58dd8d48988d149941735')])
 .('5884fa852f91cb3bc7e8ae95', 'Japanhako', 43.663811, -79.417655, '712 Bloor Street West (Christie and Bloor)', 247, False, 627578.029624121, 4835750.632871837, [('Japanese R

 .('4b683ef1f964a520d66d2be3', 'Thai To Go', 43.663418, -79.36071, '452 Gerrard Street East (btw River & Sumach)', 142, False, 632170.1947624765, 4835796.128849879, [('Thai Restaurant', '4bf58dd8d48988d149941735')])
('4ee1490961aff90fe37e76da', 'Qi sushi', 43.66255180762697, -79.36425843422568, '358 Gerrard St.E.', 167, False, 631885.9899545432, 4835694.279599476, [('Sushi Restaurant', '4bf58dd8d48988d1d2941735')])
 .('526b048e11d269d7538f9d0b', 'Pho House', 43.66574006403083, -79.3514929027799, '610 Gerrard St. E. (Gerrard & Broadview)', 442, False, 632908.2222350212, 4836068.7506238585, [('Vietnamese Restaurant', '4bf58dd8d48988d14a941735')])
('4b683ef1f964a520d66d2be3', 'Thai To Go', 43.663418, -79.36071, '452 Gerrard Street East (btw River & Sumach)', 479, False, 632170.1947624765, 4835796.128849879, [('Thai Restaurant', '4bf58dd8d48988d149941735')])
('4f133f22e4b09e81da8cf1b7', "Rose's Vietnamese Sandwiches", 43.665611822958766, -79.35168477178863, '601 Gerrard St E (Broadview Ave

 .('5966b520fc9e94307406dafe', 'Si Lom', 43.66501007731949, -79.38068288199048, 'Toronto ON M4Y 1C5', 368, False, 630556.3793431564, 4835941.325023554, [('Thai Restaurant', '4bf58dd8d48988d149941735')])
('4c193c77838020a1e768e561', 'Kawa Sushi', 43.66389438938988, -79.38021009464505, '451 Church St.', 479, False, 630596.9172926486, 4835818.157075251, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
('4af36863f964a52053ed21e3', 'Ginger', 43.665371542785785, -79.38084590182032, '546 Church St (at Wellesley)', 336, False, 630542.4522924952, 4835981.214075953, [('Vietnamese Restaurant', '4bf58dd8d48988d14a941735')])
('59c54d4f2d2fd97564d4cfc8', 'Onnki Donburi', 43.669757426551236, -79.3845741987884, '40 Hayden Street', 472, False, 630232.3733728937, 4836462.465497529, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
('4ae787b9f964a52020ac21e3', 'Tokyo Kitchen', 43.66878262373862, -79.38515305215851, 'Charles St. (at Yonge St.)', 488, False, 630187.8143114208, 4836353.2919198815

 .('4b50f565f964a5209b3a27e3', 'Nijo Japanese Restaurant', 43.671848798792055, -79.37882430813714, '345 Bloor St. E (at Huntley St.)', 163, False, 630691.3844956524, 4836703.783667797, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
('4b02e714f964a520dd4a22e3', 'Ichiriki', 43.67085072767399, -79.3836698744015, '120 Bloor St. East (Church St.)', 464, False, 630302.9131762073, 4836585.31146159, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
('4af0b965f964a52094de21e3', 'Asahi Sushi', 43.66987385309732, -79.38294265075162, '640 Church St. (at Hayden St.)', 496, False, 630363.65510332, 4836477.958953864, [('Sushi Restaurant', '4bf58dd8d48988d1d2941735')])
 . . .('5998af0bccad6b5a979043ec', 'Ryus Noodle Bar', 43.67708634637724, -79.35889401670757, '786 Broadview Ave', 477, False, 632286.5807168416, 4837317.07863972, [('Ramen Restaurant', '55a59bace4b013909087cb24')])
 . . . .('5aabed888fb09e249951f426', 'Shunoko', 43.677539, -79.443972, '1201 St Clair Ave W (at Dufferin St)',

 .('524b52fd11d27b378fdc5dc6', 'Nakayoshi: Ramen Fine Japanese Cuisine', 43.67973628639116, -79.34137624567992, '812 Danforth Ave.', 385, False, 633692.7896800942, 4837639.4796604095, [('Ramen Restaurant', '55a59bace4b013909087cb24')])
('4ec8524d6c251306cc822d86', 'Ha gow Dim Sum House', 43.68044093157399, -79.33759006459042, '988 Danforth Ave. (at Donlands Ave.)', 475, True, 633996.4089107632, 4837723.850395432, [('Dim Sum Restaurant', '4bf58dd8d48988d1f5931735')])
('4d72d530ec075481ab7c8fbf', 'Number One Chinese Restaurant', 43.684585940432, -79.34690952301025, '897 Pape Ave (at Sammon Ave)', 415, True, 633236.0399157745, 4838169.196071903, [('Chinese Restaurant', '4bf58dd8d48988d145941735')])
 . .('4f10d060e4b09e81d8085704', 'Monami 153 Sushi', 43.68753999230954, -79.43888263723545, 'Oakwood (Rogers)', 412, False, 625816.9110588173, 4838353.632664722, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
 . . .('4b11a781f964a520808123e3', 'Edo-Ko Japanese Restaurant', 43.6880782978

 .('4dcda6b7e4cd130e16534f51', 'Thai Spicy House', 43.7019616162845, -79.3875129672506, '517 Mount Pleasant Road (North of Davisville)', 329, False, 629925.907824201, 4840034.57432796, [('Thai Restaurant', '4bf58dd8d48988d149941735')])
 . . .('4c06fc75b4aa0f471d066562', 'Nigiri-Ya', 43.70321796865987, -79.36427941347655, '897 Millwood Rd (at Sutherland Dr)', 470, False, 631795.2186628426, 4840210.783170139, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
 .('4b107e81f964a520b07123e3', 'EDO', 43.703753667738965, -79.41280203647945, '484 Eglinton Ave. W (btwn Heddington & Castleknock)', 205, False, 627884.4097086436, 4840194.285647992, [('Japanese Restaurant', '4bf58dd8d48988d111941735')])
('4c54a7d21b46c9b6b7ccebce', 'Tokyo Sushi', 43.704145677479275, -79.41063075217247, '373 Eglinton St. W (Avenue)', 344, False, 628058.5228000036, 4840241.175484507, [('Sushi Restaurant', '4bf58dd8d48988d1d2941735')])
('4afb75c5f964a520271e22e3', 'Kimono', 43.70424116320532, -79.41008497921086, '

In [206]:
print('Total number of asian flavor restaurants around UofT: {}'.format(len(restaurants)))
print('Total number of Chinese restaurants around UofT: {}'.format(len(chinese_restaurants)))
print('Percentage of Chinese restayrants around UofT: {}%'.format(round(len(chinese_restaurants)/len(restaurants) *100),2))

Total number of asian flavor restaurants: 216
Total number of Chinese restaurants: 25
Percentage of Chinese restayrants: 12%


In [208]:
print('List of all asian flavor restaurants around UofT')
print('-----------------------------------')
for r in list(restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(restaurants))

List of all asian flavor restaurants around UofT
-----------------------------------
('4b4fa8d4f964a5209b0f27e3', 'Rice & Noodle', 43.64028335648371, -79.43835074476583, '1508 Queen St. West', 306, True, 625958.5705346143, 4833105.965088503)
('5792a2e4cd10f4e248ad94e2', 'Guu Izakaya', 43.641845549029775, -79.43108645308222, '1314 Queen Street West', 121, False, 626541.2169619111, 4833290.516345014)
('5b12b96b31ac6c00397662c0', 'Sansotei Ramen', 43.63917614914748, -79.39818352488898, 'Fort York Blvd & Dan Leckie Way', 472, False, 629200.7681652745, 4833044.736329042)
('4b03122bf964a520744c22e3', 'Guirei Sushi', 43.636867379583244, -79.39692662586845, '600 Queens Quay (Bathurst)', 387, False, 629307.1018584601, 4832790.275033057)
('53fb5e2a498e8b37590dfcc8', 'Iruka Sushi', 43.636900740901396, -79.3966095935637, '550 Queens Quay West - #11', 362, False, 629332.6030735838, 4832794.474169866)
('4c0a7b036071a5937795df32', 'Mi-Ne', 43.64073639213635, -79.39111356908798, '325 Bremmer Blvd. (at

In [209]:
print('List of all Chinese restaurants around UofT')
print('-----------------------------------')
for r in list(chinese_restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(chinese_restaurants))

List of all Chinese restaurants around UofT
-----------------------------------
('4b4fa8d4f964a5209b0f27e3', 'Rice & Noodle', 43.64028335648371, -79.43835074476583, '1508 Queen St. West', 306, True, 625958.5705346143, 4833105.965088503)
('4ae33054f964a520759121e3', 'Pearl Harbourfront', 43.63815654013541, -79.38068763743328, '200-207 Queens Quay W.', 457, True, 630614.1949429085, 4832958.878858992)
('55df3345498e28c71648d892', 'Szechuan Express', 43.64134582504235, -79.37796013792254, "88 Queen's Quay East (At Bay St)", 484, True, 630827.2765176193, 4833317.385956355)
('5956abf335d3fc2a96d57cd1', 'Chop Chop', 43.652060345335286, -79.4070732835661, '771 Dundas St W', 457, True, 628456.2555137224, 4834461.894016317)
('4bead677415e20a1c380e5bb', 'China Island', 43.65611381294685, -79.4537759895975, '1572 Bloor St. W. (at Dorval Rd.)', 312, True, 624681.6379443393, 4834840.848418098)
('4ddbe8697d8b771c0b09b885', 'Dim Sum King Seafood Restaurant', 43.653503, -79.395405, '421 Dundas St W', 1

In [215]:
print('Restaurants around location')
print('---------------------------')
for i in range(79,89):
    rs = location_restaurants[i][:8]
    names = ','.join([r[1] for r in rs])
    print('Restaurants around location {}: {}'.format(i+1, names))

Restaurants around location
---------------------------
Restaurants around location 80: Chop Chop
Restaurants around location 81: Banh Mi Nguyen Huong,Saigon Lotus Restaurant,Dumpling House,Juicy Dumpling,Gushi
Restaurants around location 82: Dim Sum King Seafood Restaurant
Restaurants around location 83: Sansotei Ramen 三草亭
Restaurants around location 84: 
Restaurants around location 85: Gyu-Kaku Japanese BBQ
Restaurants around location 86: 
Restaurants around location 87: 
Restaurants around location 88: 
Restaurants around location 89: 


In [28]:
# map the restaurants around university of toronto
map_ut = folium.Map(location=ut_center, zoom_start=13)
folium.Marker(ut_center, popup='University of Toronto').add_to(map_ut)
for res in restaurants.values():
    lat = res[2]; lon = res[3]
    is_chinese = res[6]
    color = 'red' if is_chinese else 'blue'
    folium.CircleMarker([lat,lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_ut)
map_ut

### the University of British Columbia
Let's copy the process for the University of British Columbia

#### Geopy.geocoders

In [29]:
# get coordinates of the University of British Columbia

address = 'the University of British Columbia'
coordinates = geolocator.geocode(address)
ubc_center = [coordinates.latitude, coordinates.longitude]

print(coordinates.latitude, coordinates.longitude)

49.258393749999996 -123.24658161001929


In [30]:
# transfer longitude and latitude to x, y

import shapely.geometry
import pyproj
import math

def lonlat_to_xy(lon, lat):
    proj_latlon = pyproj.Proj(proj="latlong", datum="WGS84")
    proj_xy = pyproj.Proj(proj='utm', zone=10, datum="WGS84")
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0],xy[1]

def xy_to_lonlat(x, y):
    proj_latlon = pyproj.Proj(proj="latlong", datum="WGS84")
    proj_xy = pyproj.Proj(proj='utm', zone=10, datum="WGS84")
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)

print('Coordinate transformation check')
print('-------------------------------')
print('the University of British Columbia longitude={}, latitude={}'.format(ubc_center[1], ubc_center[0]))
x, y = lonlat_to_xy(ubc_center[1], ubc_center[0])
print('the University of British Columbia x={}, y={}'.format(x, y))
lo, la = xy_to_lonlat(x, y)
print('the University of British Columbia longitude={}, latitude={}'.format(lo, la))

Coordinate transformation check
-------------------------------
the University of British Columbia longitude=-123.24658161001929, latitude=49.258393749999996
the University of British Columbia x=482057.8885961555, y=5456210.086487809
the University of British Columbia longitude=-123.2465816100193, latitude=49.258393749999996


In [31]:
# create a hexagonal grid of cells

ubc_center_x, ubc_center_y = lonlat_to_xy(ubc_center[1], ubc_center[0]) # the center of University of Toronto

k = math.sqrt(3) / 2
x_min = ubc_center_x
x_step = 600
y_min = ubc_center_y - 3000 - (int(13/k)*k*600 - 3000)/2
y_step = 600 * k

latitudes = []
longitudes = []
distances_from_center = []
xs = []
ys = []
for i in range(0, int(13/k)):
    y = y_min + i * y_step
    x_offset = 300 if i%2==0 else 0
    for j in range(0, 21):
        x = x_min + j * x_step + x_offset
        distance_from_center = calc_xy_distance(ubc_center_x, ubc_center_y, x, y)
        if (distance_from_center <= 8001): # Because the UBC is surrounded by sea and reserve park, we need to extend the distance.
            lon, lat = xy_to_lonlat(x, y)
            latitudes.append(lat)
            longitudes.append(lon)
            distances_from_center.append(distance_from_center)
            xs.append(x)
            ys.append(y)

print(len(latitudes), 'candidate neighborhood centers generated.')

190 candidate neighborhood centers generated.


In [32]:
map_ubc = folium.Map(location=ubc_center, zoom_start=12)
folium.Marker(ubc_center, popup='the University of British Columbia').add_to(map_ubc)
for lat,lon in zip(latitudes, longitudes):
    folium.Circle([lat,lon], radius=300, color='yellow',fill=False).add_to(map_ubc)
map_ubc

In [None]:
# get the address of the locations

def get_address(latlon):
    geolocator = Nominatim(user_agent="myGeocoder")
    try:
        result = geolocator.reverse(latlon)
        address = result.raw['display_name']
        return address
    except:
        return None
    
address_list = []
for lat,long in zip(latitudes, longitudes):
    latlon = str(lat) + ',' + str(long)
    address = get_address(latlon)
    if address is None:
        address = 'NO ADDRESS'
    address_list.append(address)
    print(' .', end='')
print(' done.')

 . . . . . . . . . .

In [None]:
# make the dataframe with locations

ubc_locations = pd.DataFrame({
                'Address': address_list,
                'Latitude': latitudes,
                'Longitude': longitudes,
                'X': xs,
                'Y': ys,
                'Distance_from_center': distance_from_center})

ubc_locations.head()

#### Foursquare API

In [None]:
food_category = '4d4b7105d754a06374d81259'

chinese_restaurant_categories = ['4bf58dd8d48988d145941735',"52af3a5e3cf9994f4e043bea",
"52af3a723cf9994f4e043bec","52af3a7c3cf9994f4e043bed","58daa1558bbb0b01f18ec1d3",
"52af3a673cf9994f4e043beb","52af3a903cf9994f4e043bee","4bf58dd8d48988d1f5931735",
"52af3a9f3cf9994f4e043bef","52af3aaa3cf9994f4e043bf0","52af3ab53cf9994f4e043bf1",
"52af3abe3cf9994f4e043bf2","52af3ac83cf9994f4e043bf3","52af3ad23cf9994f4e043bf4",
"52af3add3cf9994f4e043bf5","52af3af23cf9994f4e043bf7","52af3ae63cf9994f4e043bf6",
"52af3afc3cf9994f4e043bf8","52af3b053cf9994f4e043bf9","52af3b213cf9994f4e043bfa",
"52af3b293cf9994f4e043bfb","52af3b343cf9994f4e043bfc","52af3b3b3cf9994f4e043bfd",
"52af3b463cf9994f4e043bfe","52af3b633cf9994f4e043c01","52af3b513cf9994f4e043bff",
"52af3b593cf9994f4e043c00","52af3b6e3cf9994f4e043c02","52af3b773cf9994f4e043c03",
"52af3b813cf9994f4e043c04","52af3b893cf9994f4e043c05","52af3b913cf9994f4e043c06",
"52af3b9a3cf9994f4e043c07","52af3ba23cf9994f4e043c08",]

    
def is_restaurant(categories, specific_filter=None):
    restaurant_words = ['ramen','dumpling','vietnam','japanese','thai','sushi']
    
    # to find similiarly asian flavor restaurants in the neighborhoods
    
    restaurant = False
    specific = False
    for c in categories:
        category_name = c[0].lower()
        category_id = c[1]
        for r in restaurant_words:
            if r in category_name:
                restaurant = True
            if not(specific_filter is None) and (category_id in specific_filter):
                specific = True
                restaurant = True
                
    return restaurant, specific
        

def get_categories(categories):
    return [(cat['name'], cat['id']) for cat in categories]


def get_venues_near_location(lat, lng, category, radius=500):
    url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&categoryID={}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            category,
            radius,
            LIMIT)
    
    try:
        results = requests.get(url).json()["response"]['groups'][0]['items']
        print(results)
        venues = [(
            item['venue']['id'],
            item['venue']['name'],
            get_categories(item['venue']['categories']),
            (item['venue']['location']['lat'], item['venue']['location']['lng']),
            item['venue']['location']['formattedAddress'][0],
            item['venue']['location']['distance']) for item in results]
        
    except:
        venues = []
        
    return venues


def get_restaurants(lats, lons):
    restaurants = {} # dict for remove any duplicates resulting from area overlaps
    chinese_restaurants = {}
    location_restaurants = []
    
    print('Obtaining venues around candidate locations:', end="")
    for lat, lon in zip(lats,lons):
        venues = get_venues_near_location(lat, lon, food_category, radius=500)
        area_restaurants = [] # to gather all restaurants near THIS location
        for venue in venues:
            venue_id = venue[0]
            venue_name = venue[1]
            venue_categories = venue[2]
            venue_latlon = venue[3]
            venue_address = venue[4]
            venue_distance = venue[5]
            is_res, is_chinese = is_restaurant(venue_categories, specific_filter=chinese_restaurant_categories)
            
            if is_res:
                x,y = lonlat_to_xy(venue_latlon[1], venue_latlon[0])
                restaurant = (venue_id, venue_name, venue_latlon[0], venue_latlon[1], venue_address, venue_distance, is_chinese, x, y)
                if venue_distance <= 300:
                    area_restaurants.append(restaurant)
                restaurants[venue_id] = restaurant
                if is_chinese:
                    chinese_restaurants[venue_id] = restaurant
        location_restaurants.append(area_restaurants)
        print(' .', end="")
    print("done")
    return restaurants, chinese_restaurants, location_restaurants

In [None]:
import pickle

restaurant = {}
chinese_restaurants = {}
location_restaurants = []
loaded = False
try:
    with open('pickles/ubc_restaurants.pickle','rb') as f:
        restaurants = pickle.load(f)
    
    with open('pickles/ubc_chinese_restaurants.pickle','rb') as f:
        chinese_restaurants = pickle.load(f)

    with open('pickles/ubc_location_restaurants.pickle','rb') as f:
        location_restaurants = pickle.load(f)
        
    print('ubc restaurants data loaded.')
    loaded = True
        
except:
    pass

if not loaded:
    restaurants, chinese_restaurants, location_restaurants = get_restaurants(latitudes, longitudes)
    
    with open('pickles/ubc_restaurants.pickle','wb') as f:
        pickle.dump(restaurants,f)

    with open('pickles/ubc_chinese_restaurants.pickle','wb') as f:
        pickle.dump(chinese_restaurants,f)

    with open('pickles/ubc_location_restaurants.pickle','wb') as f:
        pickle.dump(location_restaurants,f)

In [None]:
print('Total number of asian flavor restaurants around UBC: {}'.format(len(restaurants)))
print('Total number of Chinese restaurants around UBC: {}'.format(len(chinese_restaurants)))
print('Percentage of Chinese restayrants around UBC: {}%'.format(round(len(chinese_restaurants)/len(restaurants) *100),2))

In [None]:
print('List of all asian flavor restaurants around UBC')
print('-----------------------------------')
for r in list(restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(restaurants))

In [None]:
print('Restaurants around location')
print('---------------------------')
for i in range(120,129):
    rs = location_restaurants[i][:8]
    names = ','.join([r[1] for r in rs])
    print('Restaurants around location {}: {}'.format(i+1, names))

In [None]:
map_ubc = folium.Map(location=ubc_center, zoom_start=13)
folium.Marker(ubc_center, popup='the University of British Columbia').add_to(map_ubc)
for res in restaurants.values():
    lat = res[2]; lon = res[3]; name = res[1];
    is_chinese = res[6]
    color = "red" if is_chinese else "blue"
    folium.CircleMarker([lat,lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_ubc)
map_ubc

## Analysis

Since we are trying to decide the best location for a new Chinese restaurant that is also close to the college campus, we need to figure out the following two questions:  

(1) Which street has a high density of Asian-flavor restaurants while not many Chinese restaurants are there?  
The more Asian-flavour restaurants located in the neighborhood, the more residents around there are interested in Asian-flavor food. And a low density of Chinese restaurants means the new Chinese restaurants can avoid facing competition from similar tastes of food.  

(2) Which street is closest to the campus?  
After narrowing the street lists, we need to find the street which is closest to the campus.

### The University of Toronto

In [429]:
with open("pickles/ut_locations.pickle","rb") as f:
    ut_locations = pickle.load(f)

# location_restaurants_count = [len(res) for res in location_restaurants]
# ut_locations['Asian-flavor Restaurants in area'] = location_restaurants_count

# find nearest Chinese restaurants to each location    
distance_to_chinese_restaurant = []

for area_x, area_y in zip(xs,ys):
    max_distance =  1500 # As long as the chinese restaurants is far away than 1.5km, we are fine.
    for res in chinese_restaurants.values():
        res_x = res[7]
        res_y = res[8]
        d = calc_xy_distance(area_x, area_y, res_x, res_y)
        if d<max_distance:
            max_distance = d
    distance_to_chinese_restaurant.append(max_distance)
        
ut_locations['distance to nearest chinese restaurants'] = distance_to_chinese_restaurant


min_distance = 500

ut_locations = (
    ut_locations
    .loc[
        lambda x: x['Restaurants in area']>1 # exclude areas without any asian-flavor restaurants
    ]
    .loc[
        lambda x:x['distance to nearest chinese restaurants'] > min_distance
    ]
    .sort_values(by=['Restaurants in area'], ascending=False)
    .reset_index(drop=True)
)

print("{} areas own more than one Asian-flavour restaurants.".format(len(ut_locations)))
ut_locations

15 areas own more than one Asian-flavour restaurants.


Unnamed: 0,Address,Latitude,Longitude,X,Y,Distance_from_center,Restaurants in area,distance to nearest chinese restaurants
0,"90, Beatrice Street, Little Italy, University—...",43.653098,-79.41614,627722.878967,4834563.0,17520.847582,6,740.331267
1,"The LakeShore Condos, Bathurst Street, Bathurs...",43.637312,-79.399447,629102.878967,4832836.0,17520.847582,3,1500.0
2,"121, Galley Avenue, Roncesvalles, Parkdale—Hig...",43.643135,-79.445663,625362.878967,4833412.0,17520.847582,3,669.503658
3,"187, Bay Street, Commerce Court, Old Toronto, ...",43.647403,-79.379712,630672.878967,4833987.0,17520.847582,3,687.519956
4,"18, Woodlawn Avenue West, Deer Park, Toronto—S...",43.683886,-79.39337,629492.878967,4838018.0,17520.847582,3,544.759154
5,"282, Eglinton Avenue West, Chaplin Estates, Eg...",43.70482,-79.407456,628312.878967,4840321.0,17520.847582,3,1500.0
6,"45, Bathurst Street, King West, Spadina—Fort Y...",43.642529,-79.401788,628902.878967,4833412.0,17520.847582,2,1141.349801
7,"239, Dovercourt Road, Spadina—Fort York, Old T...",43.648017,-79.423589,627132.878967,4833987.0,17520.847582,2,1405.88843
8,"220, King Street West, Entertainment District,...",43.647506,-79.387024,630082.878967,4833987.0,17520.847582,2,949.038214
9,"85, Hepbourne Street, Dufferin Grove, Davenpor...",43.658482,-79.430633,626542.878967,4835139.0,17520.847582,2,1490.31923


But it turns out, the best area candidates all have the same distance from the center. Hence, we are going to use the kMeans cluster to decide the center of zones containing good locations.

In [431]:
number_of_clusters = 8

good_xys = ut_locations[['X','Y']].values
good_latitudes = ut_locations['Latitude'].values
good_longitudes = ut_locations['Longitude'].values
kmeans = KMeans(n_clusters=number_of_clusters, random_state=1).fit(good_xys)

cluster_centers = [xy_to_lonlat(cc[0], cc[1]) for cc in kmeans.cluster_centers_]

map_ut = folium.Map(location=ut_center, zoom_start=13)
folium.Marker(ut_center, popup='The University of Toronto').add_to(map_ut)
for lon,lat in cluster_centers:
    folium.Circle([lat,lon], radius=500, color='yellow', fill=True, fill_opacity=0.25).add_to(map_ut)
    
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat,lon], radisu=1, color='gray', fill=True, fill_color='gray', fill_opacity=0.5).add_to(map_ut)
    
for res in restaurants.values():
    lat = res[2]; lon = res[3]; name = res[1];
    is_chinese = res[6]
    color = "red" if is_chinese else "blue"
    folium.CircleMarker([lat,lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_ut)

map_ut

In [440]:
ut_center_x, ut_center_y = lonlat_to_xy(ut_center[1], ut_center[0])
candidate_area_addresses_ut = []

print('==============================================================')
print('Recommended locations for a new chinese restaurant in the University of Toronto')
print('==============================================================\n')

for lon,lat in cluster_centers:
    latlon = str(lat) + ',' + str(long)
    address = get_address(latlon)
    candidate_area_addresses_ut.append(address)
    x,y = lonlat_to_xy(lon,lat)
    d = calc_xy_distance(x, y, ut_center_x, ut_center_y)
    print("{} => {:.1f}km from The University of Toronto".format(address, d/1000))
    print("")

Recommended locations for a new chinese restaurant in the University of Toronto

36, Maclennan Avenue, Moore Park, University—Rosedale, Old Toronto, Toronto, Golden Horseshoe, Ontario, M4T 1C7, Canada => 2.7km from The University of Toronto

78, Bond Street, Garden District, Toronto Centre, Old Toronto, Toronto, Golden Horseshoe, Ontario, M5B 1X2, Canada => 1.9km from The University of Toronto

None => 2.9km from The University of Toronto

1, Harbour Square, Harbourfront, Spadina—Fort York, Old Toronto, Toronto, Golden Horseshoe, Ontario, M5J 2H2, Canada => 2.6km from The University of Toronto

12, Lamport Avenue, Rosedale, University—Rosedale, Old Toronto, Toronto, Golden Horseshoe, Ontario, M4W 1R1, Canada => 4.2km from The University of Toronto

Pinnacle Centre, Gardiner Expressway, South Core, Spadina—Fort York, Old Toronto, Toronto, Golden Horseshoe, Ontario, M5E, Canada => 4.5km from The University of Toronto

Brookfield Place, Wellington Street West, Toronto Centre, Old Toronto,

### The University of British Columbia

In [393]:
with open("pickles/ubc_locations.pickles","rb") as f:
    ubc_locations = pickle.load(f)
    
    
# find nearest Chinese restaurants to each location    
distance_to_chinese_restaurant = []

for area_x, area_y in zip(xs,ys):
    max_distance =  1500 # As long as the chinese restaurants is far away than 1.5km, we are fine.
    for res in chinese_restaurants.values():
        res_x = res[7]
        res_y = res[8]
        d = calc_xy_distance(area_x, area_y, res_x, res_y)
        if d<max_distance:
            max_distance = d
    distance_to_chinese_restaurant.append(max_distance)
        
ubc_locations['distance to nearest chinese restaurants'] = distance_to_chinese_restaurant


min_distance = 500

# filter locations with high density of asian-flavour resturants
location_restaurants_count = [len(res) for res in location_restaurants]
ubc_locations['Asian-flavor Restaurants in area'] = location_restaurants_count
    
ubc_locations = (
    ubc_locations
    .loc[
        lambda x:x['Asian-flavor Restaurants in area']>1
    ]
    .loc[
        lambda x:x['distance to nearest chinese restaurants'] > min_distance
    ]
    .sort_values(by=['Asian-flavor Restaurants in area'], ascending=False)
    .reset_index(drop=True)   
)

print("{} areas own more than one Asian-flavour restaurants.".format(len(ubc_locations)))
ubc_locations

7 areas own more than one Asian-flavour restaurants.


Unnamed: 0,Address,Latitude,Longitude,X,Y,Distance_from_center,Asian-flavor Restaurants in area,distance to nearest chinese restaurants
0,"West 8th Avenue, Point Grey, West Point Grey, ...",49.266007,-123.209507,484757.888596,5457048.0,12442.467712,5,1500.0
1,"Alma Street at West 7th Avenue, West 7th Avenu...",49.26605,-123.184766,486557.888596,5457048.0,12442.467712,3,1500.0
2,"4316, Dunbar Street, Dunbar-Southlands, Vancou...",49.247353,-123.184696,486557.888596,5454970.0,12442.467712,2,1500.0
3,"Macdonald Street, Arbutus-Ridge, Vancouver, Di...",49.256727,-123.16824,487757.888596,5456009.0,12442.467712,2,1090.575098
4,"Macdonald Street (NB) at West 7th Avenue, Macd...",49.266075,-123.168272,487757.888596,5457048.0,12442.467712,2,676.028338
5,"Point Grey Road, Kitsilano, Vancouver, Distric...",49.270743,-123.172411,487457.888596,5457568.0,12442.467712,2,975.161631
6,"1688, Cypress Street, Kitsilano, Vancouver, Di...",49.270777,-123.147668,489257.888596,5457568.0,12442.467712,2,895.564015


But it turns out, the best area candidates all have the same distance from the center. Hence, we are going to use the kMeans cluster to decide the center of zones containing good locations.

In [448]:
number_of_clusters = 5

good_xys = ubc_locations[['X','Y']].values
good_latitudes = ubc_locations['Latitude'].values
good_longitudes = ubc_locations['Longitude'].values
kmeans = KMeans(n_clusters=number_of_clusters, random_state=1).fit(good_xys)

cluster_centers = [xy_to_lonlat(cc[0], cc[1]) for cc in kmeans.cluster_centers_]

map_ubc = folium.Map(location=ubc_center, zoom_start=13)
folium.Marker(ubc_center, popup='The University of Brish Columbia').add_to(map_ubc)
for lon,lat in cluster_centers:
    folium.Circle([lat,lon], radius=500, color='yellow', fill=True, fill_opacity=0.25).add_to(map_ubc)
    
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat,lon], radisu=1, color='gray', fill=True, fill_color='gray', fill_opacity=0.5).add_to(map_ubc)
    
for res in restaurants.values():
    lat = res[2]; lon = res[3]; name = res[1];
    is_chinese = res[6]
    color = "red" if is_chinese else "blue"
    folium.CircleMarker([lat,lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_ubc)

map_ubc

In [449]:
ubc_center_x, ubc_center_y = lonlat_to_xy(ubc_center[1], ubc_center[0])
candidate_area_addresses_ubc = []

print('==============================================================')
print('Recommended locations for a new chinese restaurant in the University of British Columbia')
print('==============================================================\n')

for lon,lat in cluster_centers:
    latlon = str(lat) + ',' + str(long)
    address = get_address(latlon)
    candidate_area_addresses_ubc.append(address)
    x,y = lonlat_to_xy(lon,lat)
    d = calc_xy_distance(x, y, ubc_center_x, ubc_center_y)
    print("{} => {:.1f}km from the University of British Columbia".format(address, d/1000))
    print("")

Recommended locations for a new chinese restaurant in the University of British Columbia

Eeyou Istchee Baie-James, Nord-du-Québec, Québec, Canada => 5.3km from the University of British Columbia

Eeyou Istchee Baie-James, Nord-du-Québec, Québec, Canada => 2.8km from the University of British Columbia

Eeyou Istchee Baie-James, Nord-du-Québec, Québec, Canada => 4.7km from the University of British Columbia

Eeyou Istchee Baie-James, Nord-du-Québec, Québec, Canada => 7.3km from the University of British Columbia

Eeyou Istchee Baie-James, Nord-du-Québec, Québec, Canada => 5.7km from the University of British Columbia

