# Toronto Neighbourhoods - clustering
#### This is part of the Course [<u>*Applied Data Science Capstone*</u>](https://www.coursera.org/learn/applied-data-science-capstone/) on Coursera, to complete the Specialization <u>*IBM Data Science Professional Certificate*</u>

This exercise clusters the Toronto neighbourhoods using the geocodes obtained in the second [notebook](https://github.com/rareal/Coursera_Capstone/blob/master/Toronto_Neighborhoods_LatLong.ipynb).

---------------
Importing dependencies:

In [1]:
import requests
import pandas as pd
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
import folium

Read the dataset generated in the second notebook, which contains the postal code, borough, neighbourhood and coordinates.

In [4]:
ToNeighLL = pd.read_csv('Toronto_neigh_latlon.csv',index_col=0)

In [5]:
ToNeighLL.head()

| | PostalCode | Borough | Neighbourhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | M3A | North York | Parkwoods | 43.7545 | -79.33 |
| 1 | M4A | North York | Victoria Village | 43.7276 | -79.3148 |
| 2 | M5A | Downtown Toronto | Harbourfront, Regent Park | 43.6555 | -79.3626 |
| 3 | M6A | North York | Lawrence Heights, Lawrence Manor | 43.7223 | -79.4504 |
| 4 | M7A | Queen's Park | Queen's Park | 43.6641 | -79.3889 |


We need the coordinates of Toronto itself to centre the map. I'll use LocationIQ, as in the second notebook.

In [6]:
apikey = pd.read_table('locationiq_api_key',header=None).iloc[0,0]

In [7]:
# Search / Forward Geocoding url
search_url = "https://us1.locationiq.com/v1/search.php"
data = {'key': apikey,'q': 'Toronto, Ontario, CA','format': 'json'}
response = requests.get(search_url, params=data)

In [8]:
TO = response.json()[0]
print(TO['display_name'])
print('latitude: ',TO['lat'])
print('longitude: ',TO['lon'])

Toronto, Ontario, M6K 1X9, Canada
latitude:  43.653963
longitude:  -79.387207
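
The response carries `lat` and `lon` as strings, which is why they are wrapped in `float()` when the map is created. A minimal sketch of pulling the first hit out of the parsed JSON with a basic sanity check is shown here; `first_hit_coordinates` is a hypothetical helper, not part of the original notebook, and it assumes that a failed lookup does not come back as a non-empty list.

```python
def first_hit_coordinates(payload):
    """Return (display_name, lat, lon) for the first search result.

    `payload` is the already-parsed JSON body, e.g. response.json().
    Assumption: a successful search yields a non-empty list of dicts.
    """
    if not isinstance(payload, list) or not payload:
        raise ValueError("expected a non-empty list of search results")
    hit = payload[0]
    # Coordinates arrive as strings, so convert them explicitly.
    return hit["display_name"], float(hit["lat"]), float(hit["lon"])


# Sample payload built from the values printed above (no network call needed).
sample = [{"display_name": "Toronto, Ontario, M6K 1X9, Canada",
           "lat": "43.653963", "lon": "-79.387207"}]
name, lat, lon = first_hit_coordinates(sample)
print(name, lat, lon)
```

In the notebook this would be called as `first_hit_coordinates(response.json())`.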


Create a map of Toronto and plot the neighbourhoods.

In [9]:
def embed_map(m):
    from IPython.display import IFrame

    m.save('index.html')
    return IFrame('index.html', width='100%', height='750px')

In [10]:
# Toronto map
toto_map = folium.Map(location=[float(TO['lat']), float(TO['lon'])], zoom_start=14)

# add markers to map
for lat, lng, borough, neighborhood, postcode in zip(ToNeighLL['Latitude'], ToNeighLL['Longitude'], 
                                           ToNeighLL['Borough'], ToNeighLL['Neighbourhood'], ToNeighLL['PostalCode']):
    label = '{}, {}, {}'.format(neighborhood, borough, postcode)
    label = folium.Popup(label, parse_html=True)
    folium.Circle(  # Circle has a fixed radius in metres; CircleMarker keeps the same pixel size at every zoom level
        [lat, lng],
        radius=200,  # meters
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
    ).add_to(toto_map)
    
embed_map(toto_map)

In [12]:
ToNeighLL.Borough.unique()

array(['North York', 'Downtown Toronto', "Queen's Park", 'Etobicoke',
       'Scarborough', 'East York', 'York', 'East Toronto', 'West Toronto',
       'Central Toronto', 'Mississauga'], dtype=object)

In [19]:
ToNeighLL_to = ToNeighLL[ToNeighLL.Borough.str.contains('Toronto')].sort_values(by='Borough').reset_index(drop=True)
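
The filter above keeps only boroughs whose name contains the substring `Toronto`, so `North York`, `Queen's Park` and `Mississauga` are dropped. A small self-contained sketch of the same pattern on a toy frame follows; the `boroughs` frame is made up for illustration and uses a handful of the names listed in the output above.

```python
import pandas as pd

# Toy data: a few borough names taken from the unique() output above.
boroughs = pd.DataFrame({
    "Borough": ["North York", "Downtown Toronto", "Queen's Park",
                "East Toronto", "Mississauga", "Central Toronto"],
})

# regex=False treats 'Toronto' as a plain substring rather than a pattern.
mask = boroughs["Borough"].str.contains("Toronto", regex=False)

# Same chain as in the notebook: filter, sort by borough, renumber the rows.
toronto_only = boroughs[mask].sort_values(by="Borough").reset_index(drop=True)
print(toronto_only["Borough"].tolist())
```

`reset_index(drop=True)` discards the old row labels so the filtered frame is numbered from 0 again.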

In [21]:
foursquare_cred = pd.read_csv('foursquare_cred') # saved locally and ignored in .gitignore