# Segmenting and Clustering Neighborhoods in Toronto

## Problem 3

### Explore and cluster the neighborhoods in Toronto. You can decide to work with only boroughs that contain the word Toronto and then replicate the same analysis we did to the New York City data. It is up to you.

### Just make sure:

### to add enough Markdown cells to explain what you decided to do and to report any observations you make.
### to generate maps to visualize your neighborhoods and how they cluster together.

### Installing and Importing the required Libraries

In [2]:
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
!pip install folium
import folium # plotting library
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors
print('Libraries imported.')

Collecting folium
  Downloading https://files.pythonhosted.org/packages/a4/f0/44e69d50519880287cc41e7c8a6acc58daa9a9acf5f6afc52bcc70f69a6d/folium-0.11.0-py2.py3-none-any.whl (93kB)
Collecting branca>=0.3.0 (from folium)
  Downloading https://files.pythonhosted.org/packages/13/fb/9eacc24ba3216510c6b59a4ea1cd53d87f25ba76237d7f4393abeaf4c94e/branca-0.4.1-py3-none-any.whl
Installing collected packages: branca, folium
Successfully installed branca-0.4.1 folium-0.11.0
Libraries imported.


In [4]:
# import csv file to data frame
df = pd.read_csv("df_final.csv")

In [5]:
df.head()

Unnamed: 0,Postcode,Borough,Neighborhood,Latitude,Longitude
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
4,M4E,East Toronto,The Beaches,43.676357,-79.293031
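
The problem statement allows restricting the analysis to boroughs whose name contains the word Toronto. A minimal sketch of that filter is shown below on a small stand-in frame. The `sample` and `toronto_only` names are illustrative, and the "North York" row is an assumed example of a non-matching borough, not a row taken from `df_final.csv`.

```python
import pandas as pd

# Small stand-in frame; the "North York" row is an assumed example
# of a borough that should be filtered out.
sample = pd.DataFrame({
    "Borough": ["Downtown Toronto", "East Toronto", "North York"],
    "Neighborhood": ["St. James Town", "The Beaches", "Parkwoods"],
})

# Keep only rows whose Borough contains the word "Toronto".
mask = sample["Borough"].str.contains("Toronto")
toronto_only = sample[mask].reset_index(drop=True)

print(toronto_only)
```

The same mask could be applied to `df` if the loaded file turns out to contain other boroughs.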


### Visualizing all the Neighbourhoods of the above data frame using Folium

In [10]:
# center the map on Toronto
map_toronto = folium.Map(location=[43.651070, -79.347015], zoom_start=10)

# add one circle marker per neighbourhood, labelled "neighbourhood, borough"
for lat, lng, borough, neighbourhood in zip(df['Latitude'], df['Longitude'], df['Borough'], df['Neighborhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)

map_toronto

### The map might not be visible on GitHub. Check out the README for the map.

### Clustering the neighbourhoods with KMeans

In [12]:
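k = 5
# drop the text columns so that only the coordinates remain
toronto_clustering = df.drop(columns=['Postcode', 'Borough', 'Neighborhood'])
kmeans = KMeans(n_clusters=k, random_state=0).fit(toronto_clustering)
# store each row's cluster label in the first column
df.insert(0, 'Cluster Labels', kmeans.labels_)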
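
To make explicit what clustering on latitude and longitude does, here is a minimal pure-Python sketch of Lloyd's algorithm, the basic k-means iteration. It is not scikit-learn's implementation, which adds smarter initialization and multiple restarts. The `lloyd_kmeans` function is a hypothetical helper, and the coordinates are the five rows shown by `df.head()`.

```python
import math
import random

def lloyd_kmeans(points, k, n_iter=100, seed=0):
    """Cluster 2-D points with plain Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct input points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center in place
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return labels, centers

# (Latitude, Longitude) pairs from the df.head() output
coords = [
    (43.654260, -79.360636),  # Regent Park, Harbourfront
    (43.662301, -79.389494),  # Queen's Park, Ontario Provincial Government
    (43.657162, -79.378937),  # Garden District, Ryerson
    (43.651494, -79.375418),  # St. James Town
    (43.676357, -79.293031),  # The Beaches
]

labels, centers = lloyd_kmeans(coords, k=2)
print(labels)
print(centers)
```

Raw degrees are treated as plane coordinates here, as in the notebook's KMeans call. That is a reasonable approximation at city scale, although a degree of longitude covers less ground than a degree of latitude at Toronto's latitude.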

In [13]:
df

Unnamed: 0,Cluster Labels,Postcode,Borough,Neighborhood,Latitude,Longitude
0,0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
1,0,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
2,0,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
3,0,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
4,4,M4E,East Toronto,The Beaches,43.676357,-79.293031
5,0,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
6,0,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383
7,3,M6G,Downtown Toronto,Christie,43.669542,-79.422564
8,0,M5H,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568
9,1,M6H,West Toronto,"Dufferin, Dovercourt Village",43.669005,-79.442259


In [14]:
# create map
map_clusters = folium.Map(location=[43.651070, -79.347015], zoom_start=10)

# set color scheme for the clusters: one rainbow color per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, k))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, neighbourhood, cluster in zip(df['Latitude'], df['Longitude'], df['Neighborhood'], df['Cluster Labels']):
    label = folium.Popup('{} Cluster {}'.format(neighbourhood, cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # labels run from 0 to k-1, so index directly
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters

### The map might not be visible on GitHub. Check out the README for the map.