# Segmenting and Clustering Neighborhoods in Toronto III

Explore and cluster the neighborhoods in Toronto. You can decide to work with only boroughs that contain the word Toronto and then replicate the same analysis we did to the New York City data. It is up to you. 

Just make sure:

1. to add enough Markdown cells to explain what you decided to do and to report any observations you make. 
2. to generate maps to visualize your neighborhoods and how they cluster together. 

## Start!

Import Libraries

In [1]:
!pip install folium
import pandas as pd
import numpy as np
from geopy.geocoders import Nominatim
import folium
import matplotlib.cm as cm
import matplotlib.colors as colors


Load the dataset from Part 2

In [2]:
df = pd.read_csv('Toronto_Data_Geo.csv')
df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Queen's Park,Ontario Provincial Government,43.662301,-79.389494


Keep only the boroughs whose name contains the word Toronto

In [3]:
df_T = df[df['Borough'].str.contains('Toronto')]
df_Toronto = df_T.reset_index(drop=True)
df_Toronto.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
1,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
2,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
3,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
4,M5G,Downtown Toronto,Bay Street,43.657952,-79.387383


In [4]:
df_Toronto.shape

(21, 5)

Use Borough as the label for clustering

In [5]:
df_Toronto['Borough'].value_counts()

Downtown Toronto    18
Central Toronto      1
East Toronto         1
North Toronto        1
Name: Borough, dtype: int64
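The borough counts above serve as ready-made group labels. If data-driven clusters were wanted instead, one option is k-means. The sketch below is a simplification, not the method used in this notebook: it clusters on coordinates alone, assumes scikit-learn is installed (the notebook does not import it), and uses made-up points rather than the Toronto data.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up (latitude, longitude) pairs for illustration only:
# three points near one location and three near another.
coords = np.array([
    [43.65, -79.38], [43.66, -79.39], [43.65, -79.37],
    [43.75, -79.20], [43.76, -79.21], [43.75, -79.19],
])

# n_clusters is a free choice; 2 matches the two groups in the toy data.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(coords)
cluster_ids = kmeans.labels_
print(cluster_ids)
```

The resulting `cluster_ids` could stand in for a label column, with the caveat that k-means numbers its clusters from 0 and in no particular order.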

Encode each borough as an integer label

In [6]:
df_Toronto['Label'] = df_Toronto['Borough'].replace(to_replace=['Downtown Toronto', 'North Toronto', 'Central Toronto', 'East Toronto'], value=[1,2,3,4], inplace=False)
df_Toronto.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Label
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1
1,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,1
2,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,1
3,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,1
4,M5G,Downtown Toronto,Bay Street,43.657952,-79.387383,1
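As an alternative to the two parallel lists passed to `replace`, the same encoding can be written as a dictionary lookup. This is a minimal sketch on a toy frame; the names `boroughs` and `label_map` are introduced here for illustration and do not appear elsewhere in the notebook.

```python
import pandas as pd

# Toy frame with the four borough names that appear in this notebook.
boroughs = pd.DataFrame({
    'Borough': ['Downtown Toronto', 'North Toronto', 'Central Toronto',
                'East Toronto', 'Downtown Toronto']
})

# Explicit name -> integer mapping, using the same codes as the cell above.
label_map = {
    'Downtown Toronto': 1,
    'North Toronto': 2,
    'Central Toronto': 3,
    'East Toronto': 4,
}

# Series.map returns NaN for any borough missing from the dictionary,
# which makes an unmapped name easy to spot.
boroughs['Label'] = boroughs['Borough'].map(label_map)

labels = boroughs['Label'].tolist()
print(labels)
```

Keeping each name next to its code in one dictionary avoids the risk of the two lists drifting out of order.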


### Visualize Data

Use the geopy library to get the latitude and longitude of Toronto (as in the previous lab).

In [7]:
address = 'Toronto'
geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


In [8]:
kclusters = len(df_Toronto.Label.unique())

# create map
toronto_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per label
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, borough, cluster in zip(df_Toronto['Latitude'], df_Toronto['Longitude'], df_Toronto['Borough'], df_Toronto['Label']):
    # label each marker with its own borough name rather than the whole Borough column
    label = folium.Popup(str(borough) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(toronto_clusters)
       
toronto_clusters
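The marker colors depend on how integer labels index into the palette. The sketch below isolates that step so it can be checked without rendering a map. It assumes four labels numbered from 1, as in this notebook; `color_for_label` is a name introduced here for illustration.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 4  # four borough labels, as in this notebook

# One evenly spaced rainbow color per label, converted to hex strings.
rainbow = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]

# Labels start at 1, so label n uses rainbow[n - 1].
color_for_label = {label: rainbow[label - 1] for label in range(1, kclusters + 1)}
print(color_for_label)
```

If labels started at 0 instead (as k-means labels do), the `- 1` offset would need to be dropped.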