# Segmenting and Clustering Neighbourhoods in Toronto - Part 02
## Applied Data Science Capstone 
### IBM Data Science Professional Certificate

In [1]:
import pandas as pd
import geocoder
from operator import eq

### Load in the dataframe that was saved in the previous notebook

In [3]:
postalCodesDF = pd.read_csv("postalCodes.csv")
postalCodesDF.head()

| | PostalCode | Borough | Neighborhood |
|---|---|---|---|
| 0 | M3A | North York | Parkwoods |
| 1 | M4A | North York | Victoria Village |
| 2 | M5A | Downtown Toronto | Regent Park, Harbourfront |
| 3 | M6A | North York | Lawrence Manor, Lawrence Heights |
| 4 | M7A | Downtown Toronto | Queen's Park, Ontario Provincial Government |



### Use the Geospatial_Coordinates.csv to add the latitudes and longitudes to the postal codes dataframe

The geocoding API took a prohibitively long time to run, so I'll use the provided CSV file instead. The geospatial coordinates file lists the postal codes in a different order from the postal codes dataframe from Part 01. Since the ordering of the postal codes doesn't matter, the simplest way to combine the two is to sort both dataframes by postal code (so that their rows are in the same order), then concatenate the latitude and longitude columns from the geospatial coordinates dataframe onto the postal code, borough, and neighborhood columns from the postal codes dataframe.
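One caveat with sorting and then concatenating: `pd.concat(..., axis=1)` pairs rows by index label, not by position, and `sort_values` keeps the original labels. The sketch below illustrates this on toy data. The frame names (`codes`, `coords`) and the latitude values are made up for illustration and are not taken from the real files.

```python
import pandas as pd

# Toy data with placeholder latitudes (NOT real coordinates),
# deliberately stored in different orders.
codes = pd.DataFrame({"PostalCode": ["M3A", "M1B", "M2H"]})
coords = pd.DataFrame({
    "Postal Code": ["M2H", "M3A", "M1B"],
    "Latitude": [2.0, 3.0, 1.0],  # intended pairing: M1B -> 1.0, M2H -> 2.0, M3A -> 3.0
})

codes_sorted = codes.sort_values(by="PostalCode")
coords_sorted = coords.sort_values(by="Postal Code")

# Sorting alone keeps the original index labels, so concat pairs rows
# by those old labels and the latitudes land on the wrong postal codes.
misaligned = pd.concat([codes_sorted, coords_sorted[["Latitude"]]], axis=1)

# Resetting both indexes makes the labels match the sorted positions.
aligned = pd.concat(
    [
        codes_sorted.reset_index(drop=True),
        coords_sorted.reset_index(drop=True)[["Latitude"]],
    ],
    axis=1,
)

print(misaligned)
print(aligned)
```

If the coordinates file happens to be sorted by postal code already, the sort is a no-op and the labels line up by coincidence. Resetting the index on both frames removes that dependency.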

In [4]:
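geoCoordsDF = pd.read_csv("geospatial_Coordinates.csv")

# Sort both dataframes by postal code and reset both indexes, because pd.concat(axis=1) aligns rows on index labels
geoCoordsDF  .sort_values(by=["Postal Code"], inplace=True)
geoCoordsDF  .reset_index(drop=True, inplace=True)
postalCodesDF.sort_values(by=["PostalCode"],  inplace=True)
postalCodesDF.reset_index(drop=True, inplace=True)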

# Check that the postal codes match row by row before concatenating, so each lat/lon pair lands on the right postal code
assert len(geoCoordsDF) == len(postalCodesDF)
assert all(map(eq, geoCoordsDF["Postal Code"], postalCodesDF["PostalCode"]))

neighbourhoodsDF = pd.concat([postalCodesDF, geoCoordsDF[["Latitude", "Longitude"]]], axis=1)

neighbourhoodsDF.to_csv("neighbourhoods.csv", index=False)

neighbourhoodsDF.head()

| | PostalCode | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | M1B | Scarborough | Malvern, Rouge | 43.806686 | -79.194353 |
| 1 | M1C | Scarborough | Rouge Hill, Port Union, Highland Creek | 43.784535 | -79.160497 |
| 2 | M1E | Scarborough | Guildwood, Morningside, West Hill | 43.763573 | -79.188711 |
| 3 | M1G | Scarborough | Woburn | 43.770992 | -79.216917 |
| 4 | M1H | Scarborough | Cedarbrae | 43.773136 | -79.239476 |
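
As an alternative to sorting and concatenating, the two dataframes could be joined on the postal code with `DataFrame.merge`, which does not depend on row order at all. This is a minimal sketch of that different technique, not the approach used in this notebook. The frame names (`postal`, `coords`) are hypothetical, and the two rows are copied from the output above.

```python
import pandas as pd

# Toy stand-ins for the two dataframes; the rows are deliberately in different orders.
postal = pd.DataFrame({
    "PostalCode": ["M1C", "M1B"],
    "Borough": ["Scarborough", "Scarborough"],
    "Neighborhood": ["Rouge Hill, Port Union, Highland Creek", "Malvern, Rouge"],
})
coords = pd.DataFrame({
    "Postal Code": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

# Join on the postal code. validate="one_to_one" raises an error if either
# side contains a duplicated postal code.
merged = postal.merge(
    coords,
    left_on="PostalCode",
    right_on="Postal Code",
    how="left",
    validate="one_to_one",
).drop(columns="Postal Code")  # drop the duplicate key column from the right-hand frame

print(merged)
```

With `how="left"`, a postal code that is missing from the coordinates file would appear with `NaN` coordinates rather than being dropped silently, so a check such as `merged["Latitude"].notna().all()` can stand in for the assertion.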
