<h2>Objective</h2>
Obtain the location coordinates (latitude and longitude) for each postal code and append them to the corresponding record.

<h3> Importing required libraries </h3>

In [34]:
import pandas as pd

<h3> Reading the data file that contains the data scraped from the Wikipedia page </h3>

In [35]:
wikipedia_table = pd.read_csv('scrapped_data.csv')
wikipedia_table

Unnamed: 0,Postal Code,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North"
99,M4Y,Downtown Toronto,Church and Wellesley
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C..."
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu..."


<h3> The instructor offered the choice of obtaining location coordinates either from the Geocoder API or from a CSV file. I chose the latter. </h3>

In [36]:
coordinates_data = pd.read_csv('Geospatial_Coordinates.csv')
coordinates_data

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476
...,...,...,...
98,M9N,43.706876,-79.518188
99,M9P,43.696319,-79.532242
100,M9R,43.688905,-79.554724
101,M9V,43.739416,-79.588437


<h3> Converting the dataframe of location coordinates into a dictionary keyed by Postal Code, for easy lookup </h3>

In [42]:
coordinates_dict = coordinates_data.set_index('Postal Code').T.to_dict('list')
coordinates_dict

{'M1B': [43.806686299999996, -79.19435340000001],
 'M1C': [43.7845351, -79.16049709999999],
 'M1E': [43.7635726, -79.1887115],
 'M1G': [43.7709921, -79.21691740000001],
 'M1H': [43.773136, -79.23947609999999],
 'M1J': [43.7447342, -79.23947609999999],
 'M1K': [43.7279292, -79.26202940000002],
 'M1L': [43.711111700000004, -79.2845772],
 'M1M': [43.716316, -79.23947609999999],
 'M1N': [43.692657000000004, -79.2648481],
 'M1P': [43.7574096, -79.27330400000001],
 'M1R': [43.750071500000004, -79.2958491],
 'M1S': [43.7942003, -79.26202940000002],
 'M1T': [43.7816375, -79.3043021],
 'M1V': [43.8152522, -79.2845772],
 'M1W': [43.799525200000005, -79.3183887],
 'M1X': [43.836124700000006, -79.20563609999999],
 'M2H': [43.8037622, -79.3634517],
 'M2J': [43.7785175, -79.3465557],
 'M2K': [43.7869473, -79.385975],
 'M2L': [43.7574902, -79.37471409999999],
 'M2M': [43.789053, -79.40849279999999],
 'M2N': [43.7701199, -79.40849279999999],
 'M2P': [43.752758299999996, -79.4000493],
 'M2R': [43.78273
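
To make the conversion step easier to follow, here is a minimal sketch on a small sample. The `sample` frame and the names `sample_dict`, `lat`, and `lon` are assumptions made for illustration; only the `set_index('Postal Code').T.to_dict('list')` chain comes from the cell above.

```python
import pandas as pd

# Hypothetical three-row sample in the same shape as Geospatial_Coordinates.csv.
sample = pd.DataFrame({
    "Postal Code": ["M1B", "M1C", "M1E"],
    "Latitude": [43.806686, 43.784535, 43.763573],
    "Longitude": [-79.194353, -79.160497, -79.188711],
})

# set_index makes the postal code the row label, .T turns each postal code
# into a column, and to_dict("list") maps each column name to its values.
sample_dict = sample.set_index("Postal Code").T.to_dict("list")

# Each key now returns a [latitude, longitude] pair that can be unpacked.
lat, lon = sample_dict["M1C"]
print(lat, lon)
```

The transpose is what makes the postal codes become dictionary keys: `to_dict("list")` keys the result by column, so the codes have to be columns first.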

<h3> For each record in the scraped data, we obtain the latitude and longitude values from the dictionary above </h3>

In [45]:
# Look up each postal code in coordinates_dict and collect the rows in a list.
# DataFrame.append was removed in pandas 2.0, so the frame is built once at the end.
rows = []
for _, row in wikipedia_table.iterrows():
    pc = row["Postal Code"]
    bo = row["Borough"]
    nh = row["Neighborhood"]
    lat, long = coordinates_dict[pc]
    rows.append([pc, bo, nh, lat, long])
result_df = pd.DataFrame(rows, columns = ["Postal Code", "Borough", "Neighborhood", "Latitude", "Longitude"])
result_df

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.654260,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
...,...,...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.653654,-79.506944
99,M4Y,Downtown Toronto,Church and Wellesley,43.665860,-79.383160
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.662744,-79.321558
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu...",43.636258,-79.498509
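
The same join can also be expressed with `DataFrame.merge`. This is an alternative sketch, not the approach used in the cell above; the `neighborhoods` and `coordinates` frames are hypothetical samples standing in for `wikipedia_table` and `coordinates_data`.

```python
import pandas as pd

# Hypothetical samples standing in for wikipedia_table and coordinates_data.
neighborhoods = pd.DataFrame({
    "Postal Code": ["M3A", "M4A"],
    "Borough": ["North York", "North York"],
    "Neighborhood": ["Parkwoods", "Victoria Village"],
})
coordinates = pd.DataFrame({
    "Postal Code": ["M4A", "M3A"],
    "Latitude": [43.725882, 43.753259],
    "Longitude": [-79.315572, -79.329656],
})

# A left join keeps every neighborhood row in its original order, and
# validate="many_to_one" raises if a postal code repeats in coordinates.
merged = neighborhoods.merge(
    coordinates, on="Postal Code", how="left", validate="many_to_one"
)

# An unmatched postal code yields NaN here rather than the KeyError
# that a plain dictionary lookup would raise.
missing = int(merged["Latitude"].isna().sum())
print(merged)
print("rows without coordinates:", missing)
```

A merge avoids the explicit Python loop. The trade-off is that missing codes pass silently as NaN, so counting them afterwards is a reasonable safeguard.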


In [46]:
result_df.to_csv('scrapped_data_coordinates.csv', index = False)
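
The following sketch shows why the row index is suppressed when saving. It writes to in-memory buffers rather than files. The one-row `frame` and the names `cols_with_index` and `cols_without_index` are assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical one-row frame standing in for result_df.
frame = pd.DataFrame({
    "Postal Code": ["M3A"],
    "Borough": ["North York"],
    "Neighborhood": ["Parkwoods"],
    "Latitude": [43.753259],
    "Longitude": [-79.329656],
})

# By default, to_csv writes the row index as an extra first column with no header.
with_index = io.StringIO()
frame.to_csv(with_index)

# With index=False, only the data columns are written.
without_index = io.StringIO()
frame.to_csv(without_index, index=False)

# Rewind both buffers and read them back to compare the resulting columns.
with_index.seek(0)
without_index.seek(0)
cols_with_index = list(pd.read_csv(with_index).columns)
cols_without_index = list(pd.read_csv(without_index).columns)
print(cols_with_index)
print(cols_without_index)
```

When the index is written, reading the file back produces an extra `Unnamed: 0` column, which then has to be dropped or ignored in later steps.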