<a href="https://colab.research.google.com/github/tahsinhassan/coursera_ml/blob/master/CourseraCapstoneExploreToronto_Task.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Explore and cluster the neighborhoods in Toronto**

**Author - Tahsin Hassan**

In [0]:
!pip install --upgrade -q gspread

In [0]:
from google.colab import auth
auth.authenticate_user()

import gspread
from oauth2client.client import GoogleCredentials

gc = gspread.authorize(GoogleCredentials.get_application_default())

worksheet = gc.open('Wiki_TN').sheet1

# get_all_values gives a list of rows.
rows = worksheet.get_all_values()
print(rows)

**Build the dataframe from the list of values**

In [0]:
import pandas as pd
df = pd.DataFrame.from_records(rows, columns = ['Postcode','Borough','Neighbourhood'])
df.head()

**Cleanse the data and transform as per the given requirements**

In [0]:
# Cleanse the data and transform as per the given requirements
df.drop(df[df['Borough'] == 'Not assigned'].index, inplace=True)
df.loc[df.Neighbourhood == 'Not assigned', "Neighbourhood"] = df.Borough
df.head()
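
The two cleansing rules above can be tried out on a small frame first. This is a minimal sketch on hypothetical sample rows (not the scraped data): rows without a borough are dropped, and a missing neighbourhood falls back to the borough name.

```python
import pandas as pd

# Hypothetical sample rows that mimic the shape of the sheet (not the real data).
sample = pd.DataFrame(
    [
        ("M1A", "Not assigned", "Not assigned"),
        ("M5A", "Downtown Toronto", "Harbourfront"),
        ("M7A", "Queen's Park", "Not assigned"),
    ],
    columns=["Postcode", "Borough", "Neighbourhood"],
)

# Rule 1: keep only rows whose borough is assigned.
cleaned = sample[sample["Borough"] != "Not assigned"].copy()

# Rule 2: where the neighbourhood is missing, reuse the borough name.
missing = cleaned["Neighbourhood"] == "Not assigned"
cleaned.loc[missing, "Neighbourhood"] = cleaned.loc[missing, "Borough"]

print(cleaned.to_string(index=False))
```

Filtering with a boolean mask and `.copy()` avoids mutating the original frame, which makes the cell safe to re-run.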

**Group the data by Postcode & Borough**

In [71]:
df_grp = df.groupby(['Postcode', 'Borough'])['Neighbourhood'].apply(', '.join).reset_index()
df_grp.columns = ['Postcode', 'Borough', 'Neighbourhood']
df_grp.tail(10)

|     | Postcode | Borough    | Neighbourhood |
|-----|----------|------------|---------------|
| 93  | M9A      | Etobicoke  | Islington Avenue |
| 94  | M9B      | Etobicoke  | Cloverdale, Islington, Martin Grove, Princess ... |
| 95  | M9C      | Etobicoke  | Bloordale Gardens, Eringate, Markland Wood, Ol... |
| 96  | M9L      | North York | Humber Summit |
| 97  | M9M      | North York | Emery, Humberlea |
| 98  | M9N      | York       | Weston |
| 99  | M9P      | Etobicoke  | Westmount |
| 100 | M9R      | Etobicoke  | Kingsview Village, Martin Grove Gardens, Richv... |
| 101 | M9V      | Etobicoke  | Albion Gardens, Beaumond Heights, Humbergate, ... |
| 102 | M9W      | Etobicoke  | Northwest |
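
To see what the grouping step does to rows that share a postcode, here is a small sketch on hypothetical sample rows. It uses `agg` rather than `apply`, which should give the same result for a plain string join.

```python
import pandas as pd

# Hypothetical sample: two neighbourhoods share the postcode M5A.
sample = pd.DataFrame(
    [
        ("M5A", "Downtown Toronto", "Harbourfront"),
        ("M5A", "Downtown Toronto", "Regent Park"),
        ("M9A", "Etobicoke", "Islington Avenue"),
    ],
    columns=["Postcode", "Borough", "Neighbourhood"],
)

# One row per (Postcode, Borough); neighbourhood names are joined with ", ".
grouped = (
    sample.groupby(["Postcode", "Borough"])["Neighbourhood"]
    .agg(", ".join)
    .reset_index()
)

print(grouped.to_string(index=False))
```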


**Dataframe dimension**

In [72]:
df_grp.shape

(103, 3)

**Read the geospatial CSV file and inner join it with df_grp**

In [0]:
df_ll = pd.read_csv('http://cocl.us/Geospatial_data')
df_ll.columns = ['Postcode', 'Latitude', 'Longitude']
df_join = pd.merge(df_grp, df_ll, on=['Postcode'], how='inner')
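
An inner join silently drops postcodes that have no coordinates. As an optional sanity check, the sketch below uses a left join with `indicator=True` to list unmatched postcodes. The frames are hypothetical samples, and `M9Z` is a made-up postcode included only to trigger the mismatch.

```python
import pandas as pd

# Hypothetical samples; "M9Z" is a made-up postcode with no coordinates.
left = pd.DataFrame(
    {
        "Postcode": ["M1B", "M1C", "M9Z"],
        "Borough": ["Scarborough", "Scarborough", "Hypothetical"],
    }
)
coords = pd.DataFrame(
    {
        "Postcode": ["M1B", "M1C"],
        "Latitude": [43.806686, 43.784535],
        "Longitude": [-79.194353, -79.160497],
    }
)

# indicator=True adds a "_merge" column describing where each row came from;
# validate="one_to_one" raises if a postcode is duplicated on either side.
check = pd.merge(
    left, coords, on="Postcode", how="left", indicator=True, validate="one_to_one"
)
unmatched = check.loc[check["_merge"] == "left_only", "Postcode"].tolist()

print("Postcodes without coordinates:", unmatched)
```

If the list comes back empty for the real data, the inner join keeps every row of `df_grp`.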

Question: 2
Dataframe with location mapped

In [53]:
df_join.head(20)

|   | Postcode | Borough     | Neighbourhood | Latitude | Longitude |
|---|----------|-------------|---------------|----------|-----------|
| 0 | M1B      | Scarborough | Rouge, Malvern | 43.806686 | -79.194353 |
| 1 | M1C      | Scarborough | Highland Creek, Rouge Hill, Port Union | 43.784535 | -79.160497 |
| 2 | M1E      | Scarborough | Guildwood, Morningside, West Hill | 43.763573 | -79.188711 |
| 3 | M1G      | Scarborough | Woburn | 43.770992 | -79.216917 |
| 4 | M1H      | Scarborough | Cedarbrae | 43.773136 | -79.239476 |
| 5 | M1J      | Scarborough | Scarborough Village | 43.744734 | -79.239476 |
| 6 | M1K      | Scarborough | East Birchmount Park, Ionview, Kennedy Park | 43.727929 | -79.262029 |
| 7 | M1L      | Scarborough | Clairlea, Golden Mile, Oakridge | 43.711112 | -79.284577 |
| 8 | M1M      | Scarborough | Cliffcrest, Cliffside, Scarborough Village West | 43.716316 | -79.239476 |
| 9 | M1N      | Scarborough | Birch Cliff, Cliffside West | 43.692657 | -79.264848 |


In [54]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(df_join['Borough'].unique()),
        df_join.shape[0]
    )
)

The dataframe has 11 boroughs and 103 neighborhoods.


In [56]:
from geopy.geocoders import Nominatim

address = 'Toronto, Canada'

# Recent geopy releases require an explicit user_agent;
# 'toronto_explorer' is a placeholder name, use your own application name.
geolocator = Nominatim(user_agent='toronto_explorer')
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.


**Creating map of Toronto using latitude and longitude values**

In [62]:
import folium
import matplotlib.cm as cm
import matplotlib.colors as colors

map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# Add one circle marker per postcode row of the joined dataframe.
for lat, lng, borough, neighbourhood in zip(df_join['Latitude'], df_join['Longitude'], df_join['Borough'], df_join['Neighbourhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)

map_toronto