# Canadian Postcode Analysis
Use the notebook to scrape the Wikipedia page https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M, extract the table of postal codes, and transform it into a pandas dataframe like the one shown below:


In [4]:
import pandas as pd

# read_html returns a list of every table on the page; the postcode table is the first one
pcode = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')[0]

print(pcode.head())

  Postcode           Borough     Neighbourhood
0      M1A      Not assigned      Not assigned
1      M2A      Not assigned      Not assigned
2      M3A        North York         Parkwoods
3      M4A        North York  Victoria Village
4      M5A  Downtown Toronto      Harbourfront
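
Taking element `[0]` relies on the postcode table being the first table on the page, which may not hold if the page layout changes. The following is a minimal sketch of a more defensive selection, assuming the column names shown in the output above. The helper `pick_postcode_table` and the stand-in `tables` list are hypothetical and not part of the original notebook; in practice `tables` would be the list returned by `pd.read_html(...)`.

```python
import pandas as pd

# Column names taken from the output above
EXPECTED_COLUMNS = ["Postcode", "Borough", "Neighbourhood"]


def pick_postcode_table(tables, expected=EXPECTED_COLUMNS):
    """Return the first table whose columns match `expected`, else raise."""
    for table in tables:
        if list(table.columns) == list(expected):
            return table
    raise ValueError(f"no table with columns {expected}")


# Stand-in for the list of dataframes that pd.read_html would return
tables = [
    pd.DataFrame({"Unrelated": [1, 2]}),
    pd.DataFrame({
        "Postcode": ["M1A", "M3A"],
        "Borough": ["Not assigned", "North York"],
        "Neighbourhood": ["Not assigned", "Parkwoods"],
    }),
]

pcode_demo = pick_postcode_table(tables)
print(pcode_demo)
```

Failing loudly when no table matches makes a silent schema change easier to spot than a later `KeyError`.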


#### Ignore cells with a borough that is Not assigned.

In [9]:
# Keep only the rows whose borough has been assigned
pcodes_filtered = pcode[pcode["Borough"] != 'Not assigned']

print(pcodes_filtered.head())

  Postcode           Borough     Neighbourhood
2      M3A        North York         Parkwoods
3      M4A        North York  Victoria Village
4      M5A  Downtown Toronto      Harbourfront
5      M6A        North York  Lawrence Heights
6      M6A        North York    Lawrence Manor


#### Combine rows that share a postcode into one row, with the neighborhoods separated by a comma

In [10]:
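# Collapse rows that share a postcode; set() drops duplicate values but does not guarantee their order
pcodes_final = pcodes_filtered.groupby("Postcode").agg(lambda y: ','.join(set(y)))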

print(pcodes_final.head(100))

              Borough                                      Neighbourhood
Postcode                                                                
M1B       Scarborough                                      Malvern,Rouge
M1C       Scarborough               Highland Creek,Port Union,Rouge Hill
M1E       Scarborough                    West Hill,Morningside,Guildwood
M1G       Scarborough                                             Woburn
M1H       Scarborough                                          Cedarbrae
...               ...                                                ...
M9C         Etobicoke  Eringate,Bloordale Gardens,Old Burnhamthorpe,M...
M9L        North York                                      Humber Summit
M9M        North York                                    Emery,Humberlea
M9N              York                                             Weston
M9P         Etobicoke                                          Westmount

[100 rows x 2 columns]
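
Because `set` does not preserve insertion order, the neighborhoods within a row can appear in a different order from one run to the next. If a stable order matters, one option is to deduplicate with `dict.fromkeys`, which keeps the first occurrence of each value. This is a small sketch on stand-in data; the frame `rows` and the helper `join_unique` are hypothetical names, not part of the original notebook.

```python
import pandas as pd

# Stand-in rows shaped like pcodes_filtered, with one repeated neighbourhood
rows = pd.DataFrame({
    "Postcode": ["M5A", "M6A", "M6A", "M6A"],
    "Borough": ["Downtown Toronto", "North York", "North York", "North York"],
    "Neighbourhood": ["Harbourfront", "Lawrence Heights",
                      "Lawrence Manor", "Lawrence Heights"],
})


def join_unique(values):
    # dict.fromkeys keeps the first occurrence of each value, in original order
    return ",".join(dict.fromkeys(values))


grouped = rows.groupby("Postcode").agg(join_unique)
print(grouped)
```

The same function is applied to both columns, so a borough repeated within a postcode collapses to a single value, as in the notebook's version.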


#### If a cell has a borough but a Not assigned neighborhood, then the neighborhood will be the same as the borough.
As the test below shows, there are no neighborhoods like that, so nothing is required.

In [11]:
pcode_test = pcodes_filtered[pcodes_filtered["Neighbourhood"] == 'Not assigned']
print(pcode_test)

Empty DataFrame
Columns: [Postcode, Borough, Neighbourhood]
Index: []
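
The test comes back empty for this data, but if such rows did exist, the rule in the heading could be applied along the following lines. This is a sketch on made-up rows (the postcode `X1A` and `Example Borough` are placeholders, not real data), and `sample` and `missing` are hypothetical names.

```python
import pandas as pd

# Made-up rows: the first has a borough but no assigned neighbourhood
sample = pd.DataFrame({
    "Postcode": ["X1A", "M5A"],
    "Borough": ["Example Borough", "Downtown Toronto"],
    "Neighbourhood": ["Not assigned", "Harbourfront"],
})

# Boolean mask of the rows that need a neighbourhood filled in
missing = sample["Neighbourhood"] == "Not assigned"

# Copy the borough into the neighbourhood column for those rows only
sample.loc[missing, "Neighbourhood"] = sample.loc[missing, "Borough"]
print(sample)
```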


In [12]:
# Number of distinct postcodes after grouping, and number of columns
print(pcodes_final.shape)

(103, 2)


# Stage 2
Import the CSV file with the geographic coordinates for each postal code.


In [27]:
geodata = pd.read_csv('/Users/Paul/Coding/IBM Courses Info/Capstone/Geospatial_Coordinates.csv')
print(geodata)

    Postal Code   Latitude  Longitude
0           M1B  43.806686 -79.194353
1           M1C  43.784535 -79.160497
2           M1E  43.763573 -79.188711
3           M1G  43.770992 -79.216917
4           M1H  43.773136 -79.239476
..          ...        ...        ...
98          M9N  43.706876 -79.518188
99          M9P  43.696319 -79.532242
100         M9R  43.688905 -79.554724
101         M9V  43.739416 -79.588437
102         M9W  43.706748 -79.594054

[103 rows x 3 columns]


Rename the `Postal Code` column so that the key column has the same name in both dataframes.


In [28]:
geodata = geodata.rename(columns={'Postal Code': 'Postcode'})
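
Before merging, it can be worth confirming that both dataframes cover the same postcodes. The sketch below compares the two key sets on small stand-in frames, using a few coordinate rows from the output above. The names `left`, `right`, `only_left` and `only_right` are hypothetical; in the notebook the inputs would be `pcodes_final` (indexed by `Postcode`) and `geodata`.

```python
import pandas as pd

# Stand-in for pcodes_final: Postcode is the index after the groupby
left = pd.DataFrame(
    {"Borough": ["Scarborough", "Scarborough"]},
    index=pd.Index(["M1B", "M1C"], name="Postcode"),
)

# Stand-in for geodata after the rename: Postcode is an ordinary column
right = pd.DataFrame({
    "Postcode": ["M1B", "M1C", "M9W"],
    "Latitude": [43.806686, 43.784535, 43.706748],
    "Longitude": [-79.194353, -79.160497, -79.594054],
})

left_keys = set(left.index)
right_keys = set(right["Postcode"])

# Postcodes present on one side only; both lists are empty when the keys line up
only_left = sorted(left_keys - right_keys)
only_right = sorted(right_keys - left_keys)
print(only_left, only_right)
```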


Then merge the dataframes on `Postcode`.


In [30]:
# pcodes_final is indexed by Postcode; merge accepts an index level name as the key
canada = pd.merge(pcodes_final, geodata, on='Postcode')
print(canada.head())
print(canada.shape)

  Postcode      Borough                         Neighbourhood   Latitude  \
0      M1B  Scarborough                         Malvern,Rouge  43.806686   
1      M1C  Scarborough  Highland Creek,Port Union,Rouge Hill  43.784535   
2      M1E  Scarborough       West Hill,Morningside,Guildwood  43.763573   
3      M1G  Scarborough                                Woburn  43.770992   
4      M1H  Scarborough                             Cedarbrae  43.773136   

   Longitude  
0 -79.194353  
1 -79.160497  
2 -79.188711  
3 -79.216917  
4 -79.239476  
(103, 5)
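
The default inner merge silently drops any postcode that is missing from either side. Here all 103 rows survive, so nothing was lost. As a sketch of how a mismatch could be surfaced instead, the example below uses a left merge with pandas' `validate` and `indicator` options on stand-in data. The frames `codes` and `coords` and the postcode `X0X` are made up for illustration.

```python
import pandas as pd

# Stand-in frames; X0X is a made-up postcode with no coordinates
codes = pd.DataFrame({
    "Postcode": ["M1B", "M1C", "X0X"],
    "Borough": ["Scarborough", "Scarborough", "Example"],
})
coords = pd.DataFrame({
    "Postcode": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

merged = pd.merge(
    codes,
    coords,
    on="Postcode",
    how="left",             # keep every postcode from the left frame
    validate="one_to_one",  # raise if a postcode repeats on either side
    indicator=True,         # add a _merge column recording where each row matched
)

# Postcodes that found no coordinates
unmatched = merged.loc[merged["_merge"] == "left_only", "Postcode"].tolist()
print(unmatched)
```

An empty `unmatched` list would confirm that every postcode received coordinates.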
