## Introduction

I live in Kennedy Town, Hong Kong, very close to the underground station. The area balances Eastern and Western culture. It has easy access to downtown, yet it is far enough away to be quiet and relaxed. Recently, my boss invited me to work in Toronto. The package is attractive, so I decided to accept it. I am very excited, and at the same time very busy preparing for the move. I am looking for an apartment in Toronto with an ambience similar to that of my current home. The question is: which neighborhood should I look in?


## Business Problem

The goal is to find the neighborhood in Toronto whose characteristics most closely match those of my current home, Kennedy Town. The steps could include:

1. Getting the characteristics of Kennedy Town
2. Matching the characteristics of Kennedy Town to a neighborhood (or a few neighborhoods) in Toronto for consideration.
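
One possible way to carry out the matching step is to describe each neighborhood by how often each venue category occurs and compare those counts with cosine similarity. The sketch below is only an illustration under that assumption; the helper names (`cosine_similarity`, `rank_neighborhoods`) and the toy venue lists are hypothetical and are not results from this notebook.

```python
import math
from collections import Counter


def cosine_similarity(counts_a, counts_b):
    """Cosine similarity between two category-count mappings (0.0 if either is empty)."""
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[name] * counts_b[name] for name in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_neighborhoods(target_counts, candidate_counts):
    """Return (neighborhood, score) pairs, most similar to the target first."""
    scores = {
        name: cosine_similarity(target_counts, counts)
        for name, counts in candidate_counts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data, made up purely for illustration.
kennedy_town = Counter(["Coffee Shop", "Coffee Shop", "Dim Sum Restaurant", "Brewery"])
candidates = {
    "Neighborhood A": Counter(["Coffee Shop", "Brewery", "Coffee Shop"]),
    "Neighborhood B": Counter(["Park", "Gym"]),
}
ranking = rank_neighborhoods(kennedy_town, candidates)
print(ranking)
```

Cosine similarity ignores the total number of venues, so a neighborhood with 30 venues can still match one with 77 if the mix of categories is alike. Other choices, such as clustering, would also fit the two steps above.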

## Data

We will use the Toronto data prepared in week 3 and get the characteristics of Kennedy Town from Foursquare.

### Import Libraries

In [36]:
import pandas as pd
import pickle
import folium
import requests
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

from pandas import json_normalize # transform nested JSON into a flat pandas dataframe (pandas >= 1.0)


### Toronto Data

In [37]:
with open(r'canada_postcodes_and_coordinates_df.pkl', 'rb') as f:
    toronto_df = pickle.load(f)

In [38]:
toronto_df = toronto_df.rename({'Neighbourhood': 'Neighborhood'}, axis=1)

In [39]:
toronto_df.head()

| | Postcode | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | M1B | Scarborough | Rouge, Malvern | 43.806686 | -79.194353 |
| 1 | M1C | Scarborough | Highland Creek, Rouge Hill, Port Union | 43.784535 | -79.160497 |
| 2 | M1E | Scarborough | Guildwood, Morningside, West Hill | 43.763573 | -79.188711 |
| 3 | M1G | Scarborough | Woburn | 43.770992 | -79.216917 |
| 4 | M1H | Scarborough | Cedarbrae | 43.773136 | -79.239476 |


### Kennedy Town Data

#### Use the geopy library to get the latitude and longitude of Kennedy Town

In [40]:
kennedy_town_address = 'Kennedy Town, Hong Kong'

geolocator = Nominatim(user_agent="explorer")
kt_location = geolocator.geocode(kennedy_town_address)
kt_latitude = kt_location.latitude
kt_longitude = kt_location.longitude
print('The geographical coordinates of Kennedy Town are {}, {}.'.format(kt_latitude, kt_longitude))

The geographical coordinates of Kennedy Town are 22.28131165, 114.12916039816602.
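
The cell above assumes the lookup succeeds. geopy's `geocode` returns `None` when nothing matches, and a public geocoding service can time out, so a small wrapper may be worth having. The sketch below is one possible shape; `geocode_or_fail` and its parameters are hypothetical names, and it accepts any callable that behaves like `geolocator.geocode`.

```python
import time


def geocode_or_fail(geocode, query, attempts=3, delay_seconds=1.0):
    """Call geocode(query), retrying on errors, and return (latitude, longitude).

    `geocode` is any callable shaped like geolocator.geocode (an assumption
    of this sketch). Raises ValueError if the address is not found and
    RuntimeError if every attempt raised an exception.
    """
    last_error = None
    for _ in range(attempts):
        try:
            location = geocode(query)
        except Exception as exc:  # e.g. a timeout raised by the geocoder
            last_error = exc
            time.sleep(delay_seconds)
            continue
        if location is None:
            raise ValueError("No geocoding result for {!r}".format(query))
        return location.latitude, location.longitude
    raise RuntimeError("Geocoding failed after {} attempts".format(attempts)) from last_error
```

With the objects defined in this notebook, the call would look like `kt_latitude, kt_longitude = geocode_or_fail(geolocator.geocode, kennedy_town_address)`.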


#### Show a map of where Kennedy Town is

In [33]:
map_kt = folium.Map(location=[kt_latitude, kt_longitude], zoom_start=16)

In [34]:
map_kt

#### Get characteristics of Kennedy Town from Foursquare

In [21]:
with open(r'foursquare_credentials.pkl', 'rb') as f:
    (CLIENT_ID, CLIENT_SECRET) = pickle.load(f)

In [22]:
VERSION = '20180605' # Foursquare API version

Now, let's get up to 100 venues within a 500-meter radius of Kennedy Town.

In [23]:
radius = 500
LIMIT = 100


url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    kt_latitude, 
    kt_longitude, 
    radius, 
    LIMIT)
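
As an alternative to formatting the query string by hand, the same URL can be assembled from a dictionary, which keeps each parameter next to its name and handles escaping. This is a minimal sketch using only the standard library; `build_explore_url` is a hypothetical helper, and the credentials in the example call are placeholders.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, latitude, longitude, radius, limit):
    """Build the explore URL with the same parameters as the cell above."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(latitude, longitude),
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in "ll" readable instead of escaping it as %2C.
    return "{}?{}".format(EXPLORE_ENDPOINT, urlencode(params, safe=","))


# Placeholder credentials; the coordinates are the ones printed earlier.
example_url = build_explore_url(
    "MY_ID", "MY_SECRET", "20180605", 22.28131165, 114.12916039816602, 500, 100
)
print(example_url)
```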

In [26]:
results = requests.get(url).json()
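
The later cells index straight into `results['response']['groups'][0]['items']`, which fails with an unhelpful `KeyError` or `IndexError` if the request was rejected. A defensive version might look like the sketch below. It assumes the response carries a `meta` object with a `code` field and, on failure, `errorType` and `errorDetail` fields; `extract_items` is a hypothetical helper.

```python
def extract_items(results):
    """Return the venue items from an explore response, or raise with the API's error."""
    meta = results.get("meta", {})
    if meta.get("code") != 200:
        raise RuntimeError(
            "Foursquare request failed: {}: {}".format(
                meta.get("errorType"), meta.get("errorDetail")
            )
        )
    groups = results.get("response", {}).get("groups", [])
    if not groups:
        return []  # a valid response that contains no venue groups
    return groups[0].get("items", [])
```

It can then stand in for the direct indexing, for example `venues = extract_items(results)`.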

In [27]:
# Extract the name of a venue's first category, or None if it has no categories.
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:  # the flattened frame names the column 'venue.categories'
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [30]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Winstons Coffee | Coffee Shop | 22.281374 | 114.127172 |
| 1 | Sun Hing Restaurant (新興食家) | Dim Sum Restaurant | 22.283036 | 114.128209 |
| 2 | Comptoir | French Restaurant | 22.281209 | 114.126975 |
| 3 | Little Creatures | Brewery | 22.28395 | 114.128264 |
| 4 | Catch. | Breakfast Spot | 22.283152 | 114.126988 |


And how many venues were returned by Foursquare?

In [31]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

77 venues were returned by Foursquare.
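
To turn the venue list into "characteristics", one simple option is to compute the share of venues in each category. The sketch below does this with the standard library; `category_shares` is a hypothetical helper, and the sample uses only the five categories visible in the `head()` output above (plus a `None` to mimic a venue without a category), not all 77 venues.

```python
from collections import Counter


def category_shares(categories):
    """Return (category, share) pairs, most common first; None entries are skipped."""
    counts = Counter(c for c in categories if c is not None)
    total = sum(counts.values())
    return [(name, count / total) for name, count in counts.most_common()]


sample = [
    "Coffee Shop",
    "Dim Sum Restaurant",
    "French Restaurant",
    "Brewery",
    "Breakfast Spot",
    None,
]
shares = category_shares(sample)
for name, share in shares:
    print("{}: {:.0%}".format(name, share))
```

On the full dataframe, the input would be `nearby_venues['categories'].tolist()`.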
