# Segmenting and Clustering Neighborhoods in Toronto - Part 2

This project explores, segments, and clusters the neighborhoods of the city of Toronto.

This part obtains the latitude and longitude of each neighborhood.

First, we import the dependencies needed for the project.

In [2]:
import pandas as pd
import numpy as np
import requests

print('Libraries imported.')

Libraries imported.


Next, we read the neighborhood data from the Wikipedia list of postal codes.

In [3]:
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

# pd.read_html returns a list of DataFrames, one per HTML table found on the page
table = pd.read_html(url, header=0, keep_default_na=False)

table_df = table[0]
table_df.columns = ['PostalCode','Borough','Neighborhood']
table_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
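
Assigning `table_df.columns` directly assumes that the first table on the page has exactly three columns. A minimal sketch of a guard against a changed page layout, assuming a hypothetical helper `pick_postal_table` (not part of the original notebook) and stand-in DataFrames in place of a live `pd.read_html` call:

```python
import pandas as pd

EXPECTED_COLUMNS = ['PostalCode', 'Borough', 'Neighborhood']

def pick_postal_table(tables):
    """Return the first three-column table, renamed to EXPECTED_COLUMNS (hypothetical helper)."""
    for candidate in tables:
        if candidate.shape[1] == len(EXPECTED_COLUMNS):
            picked = candidate.copy()
            picked.columns = EXPECTED_COLUMNS
            return picked
    raise ValueError('No three-column table found; the page layout may have changed.')

# Stand-in for the list returned by pd.read_html (illustrative data only)
tables = [
    pd.DataFrame({'note': ['navigation box']}),
    pd.DataFrame({'Postcode': ['M3A', 'M4A'],
                  'Borough': ['North York', 'North York'],
                  'Neighbourhood': ['Parkwoods', 'Victoria Village']}),
]

postal_df = pick_postal_table(tables)
print(postal_df.columns.tolist())
```

Matching on column count is only one possible heuristic; checking the header text would be stricter.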


Next, we drop the rows whose Borough is "Not assigned".

In [4]:
table_df = table_df.query('Borough != "Not assigned"').reset_index(drop=True)

table_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M5A,Downtown Toronto,Regent Park
4,M6A,North York,Lawrence Heights
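
The same filter can be written with a boolean mask, which some readers find easier to follow than a query string. A small sketch on made-up rows, assuming the same column names as above:

```python
import pandas as pd

# Illustrative rows only; the real data comes from the Wikipedia table
sample = pd.DataFrame({
    'PostalCode': ['M1A', 'M3A', 'M5A'],
    'Borough': ['Not assigned', 'North York', 'Downtown Toronto'],
    'Neighborhood': ['Not assigned', 'Parkwoods', 'Harbourfront'],
})

# Query-string form, as used in the notebook
via_query = sample.query('Borough != "Not assigned"').reset_index(drop=True)

# Equivalent boolean-mask form
via_mask = sample[sample['Borough'] != 'Not assigned'].reset_index(drop=True)

print(via_query.equals(via_mask))
print(via_query['PostalCode'].tolist())
```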


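In this step, we merge the rows that share a postal code, joining their neighborhoods into one comma-separated string. Because the code collects the names in a set, their order within a cell is arbitrary.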

In [5]:
table_df = table_df.groupby('PostalCode', as_index=False).agg(lambda x: ', '.join(set(x.dropna())))
table_df.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Morningside, West Hill, Guildwood"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
5,M1J,Scarborough,Scarborough Village
6,M1K,Scarborough,"Ionview, East Birchmount Park, Kennedy Park"
7,M1L,Scarborough,"Oakridge, Clairlea, Golden Mile"
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West"
9,M1N,Scarborough,"Cliffside West, Birch Cliff"
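
If a stable order of names is wanted, one option is to remove duplicates with `dict.fromkeys`, which keeps first-seen order, instead of `set`. This is a swapped-in variant rather than the notebook's own code; `join_unique` is a hypothetical helper and the rows are illustrative:

```python
import pandas as pd

# Illustrative rows only
sample = pd.DataFrame({
    'PostalCode': ['M5A', 'M5A', 'M6A', 'M6A'],
    'Borough': ['Downtown Toronto', 'Downtown Toronto', 'North York', 'North York'],
    'Neighborhood': ['Harbourfront', 'Regent Park', 'Lawrence Heights', 'Lawrence Manor'],
})

def join_unique(values):
    # dict.fromkeys drops duplicates while preserving first-seen order
    return ', '.join(dict.fromkeys(values.dropna()))

grouped = sample.groupby('PostalCode', as_index=False).agg(join_unique)
print(grouped)
```

With this variant the repeated borough collapses to a single value and the neighborhoods appear in the order they occur in the source table.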


In this step, we replace any "Not assigned" value in the Neighborhood column with the Borough of the same row.

In [6]:
table_df.loc[table_df['Neighborhood'] == 'Not assigned', 'Neighborhood'] = table_df['Borough']
table_df.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Morningside, West Hill, Guildwood"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
5,M1J,Scarborough,Scarborough Village
6,M1K,Scarborough,"Ionview, East Birchmount Park, Kennedy Park"
7,M1L,Scarborough,"Oakridge, Clairlea, Golden Mile"
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West"
9,M1N,Scarborough,"Cliffside West, Birch Cliff"
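
The `.loc` assignment above relies on index alignment between the masked rows and the Borough column. A sketch of an alternative using `Series.mask`, on illustrative rows that are not taken from the output above:

```python
import pandas as pd

# Illustrative rows only
sample = pd.DataFrame({
    'PostalCode': ['M7A', 'M3A'],
    'Borough': ["Queen's Park", 'North York'],
    'Neighborhood': ['Not assigned', 'Parkwoods'],
})

# True where the neighborhood is missing
unassigned = sample['Neighborhood'] == 'Not assigned'

# mask() replaces values where the condition is True with the aligned Borough value
sample['Neighborhood'] = sample['Neighborhood'].mask(unassigned, sample['Borough'])
print(sample['Neighborhood'].tolist())
```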


We then check how many postal codes remain.

In [7]:
table_df.shape[0]

103

Next, we load the latitude and longitude of each postal code from a CSV file.

In [8]:
latlon_df = pd.read_csv('http://cocl.us/Geospatial_data')

latlon_df.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In this step, we rename the "Postal Code" column to "PostalCode" so that it matches the previous DataFrame.

In [9]:
latlon_df.rename(columns={'Postal Code':'PostalCode'}, inplace=True)
latlon_df.head()

Unnamed: 0,PostalCode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476
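
As a side note, `rename` can also return a new DataFrame instead of modifying the existing one, which keeps the raw frame available for re-running a cell. A short sketch using the first two coordinate rows shown above:

```python
import pandas as pd

coords = pd.DataFrame({
    'Postal Code': ['M1B', 'M1C'],
    'Latitude': [43.806686, 43.784535],
    'Longitude': [-79.194353, -79.160497],
})

# Without inplace=True, rename leaves `coords` untouched and returns a renamed copy
renamed = coords.rename(columns={'Postal Code': 'PostalCode'})

print(coords.columns.tolist())
print(renamed.columns.tolist())
```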


Finally, we merge the two DataFrames on their shared PostalCode column.

In [10]:
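# PostalCode is the only column the two frames share, so name it explicitly as the join key
Toronto_df = pd.merge(left=table_df, right=latlon_df, on='PostalCode')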
Toronto_df.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Morningside, West Hill, Guildwood",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"Ionview, East Birchmount Park, Kennedy Park",43.727929,-79.262029
7,M1L,Scarborough,"Oakridge, Clairlea, Golden Mile",43.711112,-79.284577
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Cliffside West, Birch Cliff",43.692657,-79.264848


We confirm that the merged DataFrame has the same number of rows as before.

In [11]:
Toronto_df.shape[0]

103
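
Comparing row counts shows that no postal code was lost, but it does not say which code is missing when the counts differ. A sketch of a stricter check, assuming a left join with pandas' `indicator` and `validate` options; the code `M9Z` and its row are made up to illustrate an unmatched entry:

```python
import pandas as pd

neighborhoods = pd.DataFrame({
    'PostalCode': ['M1B', 'M1C', 'M9Z'],  # 'M9Z' is a made-up code with no coordinates
    'Borough': ['Scarborough', 'Scarborough', 'Hypothetical'],
    'Neighborhood': ['Rouge, Malvern', 'Highland Creek, Rouge Hill, Port Union', 'Nowhere'],
})
coords = pd.DataFrame({
    'PostalCode': ['M1B', 'M1C'],
    'Latitude': [43.806686, 43.784535],
    'Longitude': [-79.194353, -79.160497],
})

# how='left' keeps every neighborhood row; validate raises if a postal code is duplicated;
# indicator adds a '_merge' column recording where each row was found
merged = pd.merge(neighborhoods, coords, on='PostalCode', how='left',
                  validate='one_to_one', indicator=True)

# Postal codes that found no coordinates
unmatched = merged.loc[merged['_merge'] == 'left_only', 'PostalCode'].tolist()
print(unmatched)
```

An empty `unmatched` list would correspond to the situation above, where both DataFrames have 103 rows.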