# Getting the Coordinates

We now have a dataframe with the postal code, borough name and neighborhood name of each Toronto neighborhood. To use the Foursquare location data, we also need the latitude and longitude coordinates of each neighborhood.

## 1. Assignment Requirements

Use the Geocoder package to update the dataframe from the previous part of the assignment by adding Latitude and Longitude columns for each row (PostalCode).

## 2. Reading Dataframe

First, let's import the libraries that we will use, then read and check the dataframe from the last step.

In [1]:
import pandas as pd
import numpy as np
import geocoder

toronto_geo_df = pd.read_csv('toronto_df.csv', index_col=0)
toronto_geo_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Harbourfront, Regent Park"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Queen's Park,Queen's Park


## 3. Getting Coordinates and Updating Dataframe

The problem with the Geocoder package is that you sometimes have to be persistent to get the geographical coordinates of a given postal code. A call for the latitude and longitude of a postal code can return `None`, and the same call made again can then return the coordinates.

So, to make sure that we get the coordinates for all of our neighborhoods, we will run a `while` loop for each postal code until we get the coordinates, and then update our dataframe accordingly.

In [2]:
# let's first add empty columns latitude and longitude to our dataframe

toronto_geo_df['Latitude'] = np.nan
toronto_geo_df['Longitude'] = np.nan

toronto_geo_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,,
1,M4A,North York,Victoria Village,,
2,M5A,Downtown Toronto,"Harbourfront, Regent Park",,
3,M6A,North York,"Lawrence Manor, Lawrence Heights",,
4,M7A,Queen's Park,Queen's Park,,


We then iterate through the rows (indices) of our dataframe, extracting each postal code, getting its coordinates, and adding them to the corresponding Latitude and Longitude columns.

In [3]:
for i in toronto_geo_df.index:
    # initialize coordinates variable for the row to None
    lat_lng_coords = None
    
    # loop until we get the coordinates
    # (note: this never exits if the service keeps returning None for a code)
    while lat_lng_coords is None:
        g = geocoder.arcgis('{}, Toronto, Ontario'.format(toronto_geo_df.loc[i, 'PostalCode']))
        lat_lng_coords = g.latlng
    
    toronto_geo_df.loc[i, ['Latitude', 'Longitude']] = lat_lng_coords
    
toronto_geo_df.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.75244,-79.329271
1,M4A,North York,Victoria Village,43.730421,-79.31332
2,M5A,Downtown Toronto,"Harbourfront, Regent Park",43.65512,-79.36264
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.723125,-79.451589
4,M7A,Queen's Park,Queen's Park,43.661102,-79.391035
5,M9A,Etobicoke,Islington Avenue,43.662242,-79.528379
6,M1B,Scarborough,"Rouge, Malvern",43.811525,-79.195517
7,M3B,North York,Don Mills North,43.749195,-79.361905
8,M4B,East York,"Parkview Hill, Woodbine Gardens",43.707535,-79.311773
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657363,-79.37818
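
The `while` loop above keeps calling the service until it answers, so it never finishes if a postal code simply cannot be resolved. A minimal sketch of a bounded alternative follows. The helper name `latlng_with_retries`, its parameters, and the `fake_lookup` stand-in are all hypothetical and are not part of the Geocoder package; the stand-in replaces the real network call so the example runs offline.

```python
import time
from typing import Callable, Optional, Sequence


def latlng_with_retries(
    lookup: Callable[[str], Optional[Sequence[float]]],
    query: str,
    max_attempts: int = 5,
    delay_seconds: float = 0.0,
) -> Optional[Sequence[float]]:
    """Call lookup(query) until it returns coordinates or attempts run out."""
    for attempt in range(max_attempts):
        coords = lookup(query)
        # Same check as the notebook loop: None means "no result yet".
        if coords is not None:
            return coords
        # Pause before the next attempt (skipped after the final one).
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)
    return None


# Hypothetical stand-in for the geocoding call: fails twice, then succeeds.
# In the notebook one could pass `lambda q: geocoder.arcgis(q).latlng` instead.
responses = iter([None, None, [43.75244, -79.329271]])
calls = []


def fake_lookup(query):
    calls.append(query)
    return next(responses)


coords = latlng_with_retries(fake_lookup, 'M3A, Toronto, Ontario')
print(coords, len(calls))
```

With a bounded helper like this, a postal code that never resolves is left as `NaN` in the dataframe, where the missing-value check can report it.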


Finally, let's check that we obtained Latitude and Longitude for all postal codes in our Toronto dataframe. We use the `isnull()` method to flag missing values and then aggregate with `sum()`, which counts the `True` values, i.e. the missing entries, in each of the Latitude and Longitude columns.

In [4]:
toronto_geo_df[['Latitude', 'Longitude']].isnull().sum()

Latitude     0
Longitude    0
dtype: int64
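
To see what the check reports when something is missing, here is a small sketch on a made-up three-row frame. The `sample` dataframe is hypothetical: its second row simulates a failed lookup and is not taken from the Toronto data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the second row simulates a postal code with no coordinates.
sample = pd.DataFrame({
    'PostalCode': ['M3A', 'M4A', 'M5A'],
    'Latitude': [43.75244, np.nan, 43.65512],
    'Longitude': [-79.329271, np.nan, -79.36264],
})

# isnull() gives a boolean frame; sum() counts the True values per column.
missing_counts = sample[['Latitude', 'Longitude']].isnull().sum()
print(missing_counts)

# Rows where either coordinate is missing, e.g. to retry only those codes.
missing_rows = sample[sample[['Latitude', 'Longitude']].isnull().any(axis=1)]
print(missing_rows['PostalCode'].tolist())
```

Here each column reports one missing value, and the row filter singles out `M4A` as the code to look up again.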

Great! Looks like we got all the coordinates and can proceed further with no need to handle any missing values.

Let's save the resulting dataframe so we can use it later in the assignment.

In [5]:
toronto_geo_df.to_csv('toronto_geo_df.csv')
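
As a quick sanity check on the saved file, the sketch below writes a small frame to a temporary CSV and reads it back. The `sample` frame and the file name `sample_geo_df.csv` are hypothetical. Passing `index_col=0` when reading, as in the first cell of this notebook, keeps the saved index from coming back as an extra column.

```python
import os
import tempfile

import pandas as pd

# Hypothetical two-row frame standing in for toronto_geo_df.
sample = pd.DataFrame({
    'PostalCode': ['M3A', 'M4A'],
    'Latitude': [43.75244, 43.730421],
    'Longitude': [-79.329271, -79.31332],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample_geo_df.csv')
    # to_csv writes the index as the first column by default.
    sample.to_csv(path)
    # index_col=0 turns that first column back into the index.
    restored = pd.read_csv(path, index_col=0)

print(list(restored.columns))
print(restored['PostalCode'].tolist())
```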