# Obtaining Neighbourhood Coordinates

This notebook adds a latitude and longitude to each postal code in the list of neighbourhoods.

In [22]:
import geocoder
import pandas as pd

First, read the previously scraped neighbourhood data from the CSV file.

In [23]:
df = pd.read_csv('data/toronto_neighbourhoods.csv')
df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


Iterate over each postal code, using a geocoder to obtain its latitude and longitude.

**Note:** This section is for reference only. The Google geocoder did not return any results, and the OpenStreetMap geocoder (used in the cells below) does not return accurate results when the query is a postal code. We will need to use an alternate source of location data.

In [24]:
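# Test cell: geocode a single postal code with the OpenStreetMap geocoder

x = None
for _ in range(5):  # cap retries so a failed lookup cannot loop forever
    test = geocoder.osm('M3A, Toronto, Ontario')
    x = test.latlng
    if x:  # stop as soon as coordinates come back
        break

print(x)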

[43.65238435, -79.38356765]
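
As a rough sanity check on the result above, the great-circle distance between the OpenStreetMap answer and the M3A entry from the coordinates CSV used later in this notebook (43.753259, -79.329656) can be estimated with the haversine formula. The helper name `haversine_km` and the mean Earth radius of 6371 km are assumptions made for this sketch; they are not part of the original notebook.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * radius_km * asin(sqrt(a))

osm_result = (43.65238435, -79.38356765)  # output of the test cell above
csv_m3a = (43.753259, -79.329656)         # M3A row from the coordinates CSV

distance_km = haversine_km(*osm_result, *csv_m3a)
print(round(distance_km, 1))
```

A gap on the order of ten kilometres suggests the geocoder resolved the query to a general Toronto location rather than to the postal code itself.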


In [7]:
latitude = []
longitude = []
max_attempts = 5  # cap retries so a failed lookup cannot loop forever

for postal_code in df['Postal Code']:
    lat_lng_coords = None
    # retry until the geocoder returns coordinates or the attempts run out
    for _ in range(max_attempts):
        g = geocoder.osm('{}, Toronto, Ontario'.format(postal_code))
        lat_lng_coords = g.latlng
        if lat_lng_coords:
            break
    # append the values to their lists; use None when the lookup never succeeded
    lat, lng = lat_lng_coords if lat_lng_coords else (None, None)
    latitude.append(lat)
    longitude.append(lng)

print(latitude[0:5])
print(longitude[0:5])

## Location Data from CSV

Since the Google geocoder returned no results and the OpenStreetMap results were inaccurate for postal codes, we will use the provided CSV file to obtain the coordinate data.

In [25]:
# Read the CSV file

coord = pd.read_csv('data/toronto_coordinates.csv')
coord.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [26]:
coord.dtypes # Check column data types

Postal Code     object
Latitude       float64
Longitude      float64
dtype: object

Use an inner join to merge the two dataframes, with Postal Code as the key.

In [29]:
df_merged = pd.merge(df, coord, on='Postal Code', how='inner')
df_merged.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494


In [30]:
df_merged.shape # Check that the number of rows hasn't changed

(103, 5)
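
The row-count check can be complemented by asking pandas which keys failed to match. The sketch below uses small made-up frames rather than the notebook's data (the `M9Z` row is hypothetical, included only to show a miss). `indicator=True` and `validate='one_to_one'` are standard `pd.merge` options.

```python
import pandas as pd

# Toy stand-ins for df and coord; 'M9Z' is a made-up code with no coordinates
left = pd.DataFrame({'Postal Code': ['M3A', 'M4A', 'M9Z'],
                     'Borough': ['North York', 'North York', 'Hypothetical']})
right = pd.DataFrame({'Postal Code': ['M3A', 'M4A'],
                      'Latitude': [43.753259, 43.725882],
                      'Longitude': [-79.329656, -79.315572]})

# A left join keeps every row of `left`; the _merge column records match status.
# validate='one_to_one' raises if a postal code is duplicated in either frame.
check = pd.merge(left, right, on='Postal Code', how='left',
                 indicator=True, validate='one_to_one')

# Postal codes that an inner join would have silently dropped
unmatched = check.loc[check['_merge'] == 'left_only', 'Postal Code'].tolist()
print(unmatched)
```

An empty list here would mean the inner join kept every row, which is the same conclusion as the unchanged shape.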

Save data to a new CSV file.

In [40]:
df_merged.to_csv('data/toronto_coordinates_merged.csv', index=False)
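
Passing `index=False` keeps the row index out of the file, so reading it back yields the same columns. A small illustration, assuming a one-row frame built from the M3A values shown earlier and an in-memory buffer in place of a file on disk:

```python
import io
import pandas as pd

sample = pd.DataFrame({'Postal Code': ['M3A'],
                       'Latitude': [43.753259],
                       'Longitude': [-79.329656]})

buffer = io.StringIO()
sample.to_csv(buffer, index=False)  # no index column is written
buffer.seek(0)                      # rewind before reading

reloaded = pd.read_csv(buffer)
print(list(reloaded.columns))
```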