<h1 align=center><font size = 5>Neighborhoods in Toronto</font></h1>

## Introduction

In this notebook, we convert addresses into their equivalent latitude and longitude values. The Folium library is used to visualize the neighborhoods in Toronto.

## Importing required libraries

In [5]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files
import csv

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library

#BeautifulSoup for Scraping Web
!pip install beautifulsoup4
from bs4 import BeautifulSoup
!pip install lxml
import lxml
!pip install requests
import requests

print('Libraries imported.')

Solving environment: done

# All requested packages already installed.

[31mtensorflow 1.3.0 requires tensorflow-tensorboard<0.2.0,>=0.1.0, which is not installed.[0m
[31mtensorflow 1.3.0 requires tensorflow-tensorboard<0.2.0,>=0.1.0, which is not installed.[0m
[31mtensorflow 1.3.0 requires tensorflow-tensorboard<0.2.0,>=0.1.0, which is not installed.[0m
Libraries imported.


#### We make a request to the Wikipedia page containing imformation of all the postal codes of Canada using the Requests library. The Beautiful Soup library is then used to parse this response and the title of the page is printed to check if the request if successful, and if the correct information is returned or not.

In [10]:
sourceHTML = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')
soup = BeautifulSoup(sourceHTML.content, 'lxml')

table = soup.find('tbody')
rows = table.select('tr')
row = [r.get_text() for r in rows]

#### Creating the Data Frame

In [19]:
df = pd.DataFrame(row)
df1 = df[0].str.split('\n', expand=True)
df2 = df1.rename(columns=df1.iloc[0])
df3 = df2.drop(df2.index[0])
df3.head()

Unnamed: 0,Unnamed: 1,Postcode,Borough,Neighbourhood,Unnamed: 5
1,,M1A,Not assigned,Not assigned,
2,,M2A,Not assigned,Not assigned,
3,,M3A,North York,Parkwoods,
4,,M4A,North York,Victoria Village,
5,,M5A,Downtown Toronto,Harbourfront,


#### Cleaning Data

In [38]:
# Dropping rows where Borough isn't assigned.
df4 = df3[df3.Borough != 'Not assigned']
df4.head()

# Grouping rows which have same Postcode
df5 = df4.groupby(['Postcode', 'Borough'], sort = False).agg(','.join)
df5.reset_index(inplace = True)
df5.head()

#Replacing Neighbourhood value with Borough's name where it isn't defined.
df5['Neighbourhood'] = np.where(df5['Neighbourhood'] == 'Not assigned', df5['Borough'], df5['Neighbourhood'])
df5.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Harbourfront,Regent Park"
3,M6A,North York,"Lawrence Heights,Lawrence Manor"
4,M7A,Queen's Park,Queen's Park


In [39]:
df5.shape

(103, 3)

#### Read CSV File

In [40]:
url = "http://cocl.us/Geospatial_data"
df7 = pd.read_csv(url)

df7.rename(columns={'Postal Code': 'Postcode'}, inplace=True)

#Merge the two DFs
df8 = pd.merge(df6, df7, on='Postcode')
df8.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636
3,M6A,North York,"Lawrence Heights,Lawrence Manor",43.718518,-79.464763
4,M7A,Queen's Park,Not assigned,43.662301,-79.389494


#### Creating a Data Frame that only contains the word Toronto

In [42]:
TorontoDF = df8[df8['Borough'].str.contains('Toronto')]
TorontoDF.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
2,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636
9,M5B,Downtown Toronto,"Ryerson,Garden District",43.657162,-79.378937
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
19,M4E,East Toronto,The Beaches,43.676357,-79.293031
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306


#### Generate maps to visualize neighborhoods

In [44]:
address = 'Toronto'
geolocator = Nominatim(user_agent="Toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

Toronto_map = folium.Map(location=[latitude, longitude], zoom_start=10)

for lat, lng, borough, neighborhood in zip(TorontoDF['Latitude'], TorontoDF['Longitude'], 
                                           TorontoDF['Borough'], TorontoDF['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(Toronto_map)  
    
Toronto_map