In [30]:
import numpy as np
import pandas as pd

Read in the postal codes from the Wikipedia page.

In [42]:
# read_html returns a list of DataFrames, one per table on the page; take the first.
df = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M', header=0)[0]
df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


Drop rows containing 'Not assigned' for Borough.

In [43]:
# Keep only rows with an assigned Borough (exact match rather than a substring/regex search).
df = df[df['Borough'] != 'Not assigned']
df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


Determine if any rows have duplicate postal codes.

In [44]:
df.describe()

Unnamed: 0,Postal Code,Borough,Neighbourhood
count,103,103,103
unique,103,10,99
top,M4W,North York,Downsview
freq,1,24,4


Every row has a unique postal code (count and unique are both 103 for Postal Code), so there is no need to combine Neighbourhood values for rows that share a code. The Wikipedia page already lists them together in a single cell.
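
As a more direct alternative to reading the describe() output, the sketch below checks for repeated postal codes with duplicated() and shows how neighbourhoods could be combined with groupby had any duplicates existed. The frame `toy` is made-up illustration data, not the Wikipedia table.

```python
import pandas as pd

# Hypothetical toy data with one repeated postal code (not the real table).
toy = pd.DataFrame({
    "Postal Code": ["M5A", "M5A", "M3A"],
    "Borough": ["Downtown Toronto", "Downtown Toronto", "North York"],
    "Neighbourhood": ["Regent Park", "Harbourfront", "Parkwoods"],
})

# True if any postal code appears in more than one row.
has_duplicates = bool(toy["Postal Code"].duplicated().any())

# Join the neighbourhoods of rows that share a postal code and borough.
combined = (
    toy.groupby(["Postal Code", "Borough"], sort=False)["Neighbourhood"]
       .agg(", ".join)
       .reset_index()
)

print(has_duplicates)
print(combined)
```

On the real data, `df['Postal Code'].duplicated().any()` would be the equivalent one-line check.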

Next, check whether any remaining rows still have 'Not assigned' as the Neighbourhood now that the rows with an unassigned Borough have been dropped.

In [45]:
df[df.Neighbourhood.str.contains("Not assigned")].describe()

Unnamed: 0,Postal Code,Borough,Neighbourhood
count,0,0,0
unique,0,0,0


Again, no rows match, so there is no need to copy the Borough name into the Neighbourhood column.
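
Had any such rows existed, the fill could be done with a boolean mask. This is a minimal sketch on made-up data; the frame `toy` and its values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical toy data: one row has a Borough but no Neighbourhood.
toy = pd.DataFrame({
    "Postal Code": ["M7A", "M3A"],
    "Borough": ["Queen's Park", "North York"],
    "Neighbourhood": ["Not assigned", "Parkwoods"],
})

# Select the rows whose Neighbourhood is unassigned...
mask = toy["Neighbourhood"] == "Not assigned"

# ...and copy the Borough name into the Neighbourhood column for those rows only.
toy.loc[mask, "Neighbourhood"] = toy.loc[mask, "Borough"]

print(toy)
```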

Now load the geospatial data CSV into a separate dataframe.

In [46]:
df2 = pd.read_csv('https://cocl.us/Geospatial_data')

In [47]:
df2.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


Merge the two tables using the Postal Code column.

In [54]:
df = df.merge(df2, on='Postal Code', how='inner')
df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
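
An inner join silently drops any postal code that is missing from either table. One way to check for that, sketched here on made-up data, is a left join with `indicator=True`, plus `validate='one_to_one'` to raise an error if a key repeats on either side. The frames `left` and `right` and the code 'M9Z' are hypothetical.

```python
import pandas as pd

# Hypothetical toy tables; 'M9Z' has no coordinates on the right-hand side.
left = pd.DataFrame({
    "Postal Code": ["M3A", "M4A", "M9Z"],
    "Borough": ["North York", "North York", "Hypothetical"],
})
right = pd.DataFrame({
    "Postal Code": ["M3A", "M4A"],
    "Latitude": [43.753259, 43.725882],
    "Longitude": [-79.329656, -79.315572],
})

# Left join keeps every row of `left`; `_merge` records where each row came from.
checked = left.merge(
    right,
    on="Postal Code",
    how="left",
    validate="one_to_one",  # raises MergeError if a postal code repeats in either table
    indicator=True,
)

# Postal codes that found no match in the coordinates table.
unmatched = checked.loc[checked["_merge"] == "left_only", "Postal Code"].tolist()
print(unmatched)
```

If `unmatched` comes back empty on the real data, the inner join and the left join give the same rows.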
