# Toronto Neighborhood Visualizations
## Maps for Potential Movers

#### Author: Patrick de Guzman 

# Synopsis
The following are the final visualization maps for the capstone project of the Applied Data Science Specialization Program (offered by Coursera). 

The Foursquare API was used to identify the types of venues within each neighborhood of Toronto, Ontario. Data from the [City of Toronto website](https://www.toronto.ca/) was used to create choropleth maps of four dimensions per neighborhood: average age, household size, unit size (average number of bedrooms), and income.

Neighborhoods are clustered by venue type using K-means, and the resulting cluster markers are overlaid on a choropleth map for each of the dimensions above. The result is a simple tool that helps people considering a move to Toronto understand the character of each neighborhood and the distribution of its residents.
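
The clustering step described above can be sketched roughly as follows. This is a minimal illustration on made-up data, not the notebook's actual pipeline: the `venues` table, its column names, and the choice of two clusters are assumptions for the example. It one-hot encodes venue categories, averages them per neighborhood to get a venue-type profile, and runs K-means on those profiles.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical venue listings: one row per venue found in a neighborhood.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "B", "B", "C", "C", "C", "D", "D"],
    "Venue Category": [
        "Coffee Shop", "Coffee Shop", "Café",
        "Coffee Shop", "Café",
        "Park", "Park", "Field",
        "Park", "Field",
    ],
})

# One-hot encode the venue category, then average per neighborhood so each row
# holds the share of that neighborhood's venues falling in each category.
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
profiles = onehot.groupby("Neighborhood").mean()

# Cluster the neighborhood profiles; k=2 is an arbitrary choice for this toy data.
kclusters = 2
kmeans = KMeans(n_clusters=kclusters, n_init=10, random_state=0).fit(profiles)
profiles["Cluster Labels"] = kmeans.labels_
print(profiles)
```

In this toy data the café-heavy neighborhoods (A, B) and the park-heavy ones (C, D) end up in separate clusters. The numeric label assigned to each cluster is arbitrary.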


In [4]:
from IPython.display import HTML

import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

from bs4 import BeautifulSoup # import BeautifulSoup for web scraping

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

from zipfile import ZipFile

In [6]:
HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
The raw code for this IPython notebook is by default hidden for easier reading.
To toggle on/off the raw code, click <a href="javascript:code_toggle()">here</a>.''')

## Neighborhood Maps

#### i) Age Distribution (Weighted Average by Neighborhood)

In [162]:
# Choose data to plot on choropleth
data = df_ages
variable = "average_age"

# Create map of Toronto, ON 
age_map = folium.Map(location = [latitude, longitude], zoom_start = 11)

folium.Choropleth(
    geo_data = gj,
    data = data,
    columns = ['neighbourhood_number', variable],
    name = 'choropleth',
    key_on = 'feature.properties.HOODNUM',
    fill_color='BuPu').add_to(age_map)

# Set color scheme for clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# Add markers to cluster map 
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhoods'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html = True)
    folium.CircleMarker(
        [lat,lon],
        radius = 5,
        popup = label,
        color = rainbow[cluster-1],
        fill = True, 
        fill_color = rainbow[cluster-1],
        fill_opacity=0.7).add_to(age_map)

age_map
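
The `average_age` column plotted above is described as a weighted average by neighborhood, but the notebook does not show how `df_ages` was built. One plausible way to compute such a value is sketched below, assuming the source data gives population counts per age bracket and that each bracket can be represented by its midpoint. The bracket names, counts, and midpoints are all made up for illustration.

```python
import pandas as pd

# Hypothetical population counts per age bracket; the real City of Toronto
# table and its bracket boundaries are not shown in this notebook.
counts = pd.DataFrame(
    {
        "neighbourhood_number": [1, 2],
        "0-14": [100, 300],
        "15-64": [800, 500],
        "65-84": [100, 200],
    }
).set_index("neighbourhood_number")

# Assumed representative age for each bracket (its midpoint).
midpoints = pd.Series({"0-14": 7.0, "15-64": 39.5, "65-84": 74.5})

# Weighted average: sum(count * midpoint) / sum(count), computed per row.
average_age = counts.mul(midpoints, axis=1).sum(axis=1) / counts.sum(axis=1)
df_ages_sketch = average_age.rename("average_age").reset_index()
print(df_ages_sketch)
```

The result has the same two columns the choropleth expects (`neighbourhood_number` and `average_age`), so a table built this way could be passed to the map cell unchanged.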

#### ii) Income Distribution

In [108]:
# Choose data to plot on choropleth
data = df_income
variable = "average_income"

# Create map of Toronto, ON 
income_map = folium.Map(location = [latitude, longitude], zoom_start = 11)

folium.Choropleth(
    geo_data = gj,
    data = data,
    columns = ['neighbourhood_number', variable],
    name = 'choropleth',
    key_on = 'feature.properties.HOODNUM',
    fill_color='YlGn').add_to(income_map)

# Set color scheme for clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# Add markers to cluster map 
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhoods'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html = True)
    folium.CircleMarker(
        [lat,lon],
        radius = 5,
        popup = label,
        color = rainbow[cluster-1],
        fill = True, 
        fill_color = rainbow[cluster-1],
        fill_opacity=0.7).add_to(income_map)

income_map

#### iii) Household Sizes

In [109]:
# Choose data to plot on choropleth
data = df_hhsize
variable = "average_household_size"

# Create map of Toronto, ON 
hh_map = folium.Map(location = [latitude, longitude], zoom_start = 11)

folium.Choropleth(
    geo_data = gj,
    data = data,
    columns = ['neighbourhood_number', variable],
    name = 'choropleth',
    key_on = 'feature.properties.HOODNUM',
    fill_color='Blues').add_to(hh_map)

# Set color scheme for clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# Add markers to cluster map 
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhoods'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html = True)
    folium.CircleMarker(
        [lat,lon],
        radius = 5,
        popup = label,
        color = rainbow[cluster-1],
        fill = True, 
        fill_color = rainbow[cluster-1],
        fill_opacity=0.7).add_to(hh_map)

hh_map

#### iv) Average Number of Bedrooms 

In [110]:
# Choose data to plot on choropleth
data = df_bedrooms
variable = "average_brs"

# Create map of Toronto, ON 
br_map = folium.Map(location = [latitude, longitude], zoom_start = 11)

folium.Choropleth(
    geo_data = gj,
    data = data,
    columns = ['neighbourhood_number', variable],
    name = 'choropleth',
    key_on = 'feature.properties.HOODNUM',
    fill_color='YlOrRd').add_to(br_map)

# Set color scheme for clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# Add markers to cluster map 
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhoods'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html = True)
    folium.CircleMarker(
        [lat,lon],
        radius = 5,
        popup = label,
        color = rainbow[cluster-1],
        fill = True, 
        fill_color = rainbow[cluster-1],
        fill_opacity=0.7).add_to(br_map)

br_map
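
All four maps above join each data table to the GeoJSON polygons by matching the `neighbourhood_number` column against `feature.properties.HOODNUM`. If the two sets of keys differ in value or in type (for example integers on one side and strings on the other), some polygons may not be shaded as expected. The sketch below shows one way to check for mismatches before plotting. It assumes `gj` is a parsed GeoJSON dictionary; `gj_sketch`, `neighbourhood_numbers`, and `unmatched_keys` are hypothetical names used only for this example.

```python
# Hypothetical stand-in for the `gj` GeoJSON object: two features keyed by HOODNUM.
gj_sketch = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"HOODNUM": 1}, "geometry": None},
        {"type": "Feature", "properties": {"HOODNUM": 2}, "geometry": None},
    ],
}

# Hypothetical stand-in for the `neighbourhood_number` column of a data table.
neighbourhood_numbers = [1, 3]


def unmatched_keys(geojson, data_keys, prop="HOODNUM"):
    """Return (keys found only in the GeoJSON, keys found only in the data)."""
    geo_keys = {feature["properties"][prop] for feature in geojson["features"]}
    data_keys = set(data_keys)
    return sorted(geo_keys - data_keys), sorted(data_keys - geo_keys)


missing_in_data, missing_in_geojson = unmatched_keys(gj_sketch, neighbourhood_numbers)
print("Polygons with no data row:", missing_in_data)
print("Data rows with no polygon:", missing_in_geojson)
```

Two empty lists would indicate that every polygon has exactly one matching data row under this comparison.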

## Neighborhood Character Listings (by Cluster)

In [156]:
for cluster in np.unique(kmeans.labels_): 
    print("---- "+'Cluster #'+cluster.astype(str)+" ----")
    # Keep the column at position 1 plus every column from position 5 onward, for this cluster's rows only
    cluster_table = toronto_merged.loc[toronto_merged['Cluster Labels'] == cluster, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]
    # Stack the remaining columns into one 'value' column and keep the 10 most frequent values
    cluster_table = pd.DataFrame(pd.melt(cluster_table, id_vars = ['Borough','Cluster Labels'])['value'].value_counts().head(10)).reset_index().iloc[:,0]
    print(cluster_table)
    print('\n')

---- Cluster #0 ----
0             Coffee Shop
1              Restaurant
2                    Café
3    Fast Food Restaurant
4          Sandwich Place
5             Pizza Place
6          Farmers Market
7                Festival
8      Italian Restaurant
9      Falafel Restaurant
Name: index, dtype: object


---- Cluster #1 ----
0       Electronics Store
1             Pizza Place
2                Festival
3                 Dog Run
4             Yoga Studio
5          Farmers Market
6    Ethiopian Restaurant
7                   Field
8      Falafel Restaurant
9    Fast Food Restaurant
Name: index, dtype: object


---- Cluster #2 ----
0      Falafel Restaurant
1                Festival
2          Farmers Market
3                   Field
4                    Park
5             Yoga Studio
6    Ethiopian Restaurant
7    Fast Food Restaurant
8       Electronics Store
9                 Dog Run
Name: index, dtype: object


---- Cluster #3 ----
0      Falafel Restaurant
1                Festiv
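
The listings above come from reshaping each cluster's table with `pd.melt` and counting values. A small sketch of that reshaping on made-up data may make the one-liner easier to follow. The `toy` table and its venue column names are assumptions for illustration; the real column layout of `toronto_merged` is not shown in this notebook.

```python
import pandas as pd

# Hypothetical slice of a clustered table: each row is a neighborhood and the
# venue columns hold venue categories.
toy = pd.DataFrame({
    "Borough": ["X", "X", "Y"],
    "Cluster Labels": [0, 0, 0],
    "1st Most Common Venue": ["Coffee Shop", "Coffee Shop", "Park"],
    "2nd Most Common Venue": ["Café", "Park", "Coffee Shop"],
})

# melt() stacks every venue column into a single "value" column, repeating the
# id columns on each row.
long_form = pd.melt(toy, id_vars=["Borough", "Cluster Labels"])

# value_counts() then tallies how often each category appears in the cluster,
# sorted from most to least frequent.
top_venues = long_form["value"].value_counts().head(10)
print(top_venues)
```

Under this reading, each listing ranks categories by how often they appear across the cluster's venue columns, which is not necessarily the same as ranking by the raw number of venues.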

## References 

- Toronto Postal Codes & Neighborhoods ([Wikipedia](https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M))
- Toronto GeoJSON Data ([Adam Wisniewski, GitHub](http://adamw523.com/toronto-geojson/))
- Toronto Neighborhood Data: Age, Income, Household Sizes, Unit Sizes ([City of Toronto](https://open.toronto.ca/dataset/neighbourhood-profiles/))
- Toronto Neighborhood Venue Data ([Foursquare](https://foursquare.com/))