# Segmenting and Clustering Neighborhoods in Toronto - Part 2

Now that we have built a dataframe containing the postal code, borough name, and neighborhood name of each neighborhood, we need the latitude and longitude coordinates of each neighborhood in order to use the Foursquare location data.

An older version of this course used the Google Maps Geocoding API to get the latitude and longitude coordinates of each neighborhood. However, Google has since started charging for that API: http://geoawesomeness.com/developers-up-in-arms-over-google-maps-api-insane-price-hike/, so we will use the Geocoder Python package instead: https://geocoder.readthedocs.io/index.html.

The problem with this package is that you sometimes have to be persistent to get the geographical coordinates of a given postal code. A call for a given postal code can return None, and the same call made again can then return the coordinates. So, to make sure that we get the coordinates for all of our neighborhoods, we can run a while loop for each postal code.

In [None]:
!pip install geocoder

In [1]:
import pandas as pd
import numpy as np
import geocoder
import pickle
print("Libraries imported!")

Libraries imported!


#### Reading the pickle file saved from part 1 and doing a data sanity check

In [2]:
with open('part1.pkl','rb') as f:
    toronto_df = pickle.load(f)

In [3]:
toronto_df.shape

(103, 3)

In [4]:
toronto_df.head()

Unnamed: 0,Postalcode,Borough,Neighborhood
0,M1B,Scarborough,"Malvern, Rouge"
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
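
Beyond the shape and the first rows, a few extra checks can catch problems before geocoding. The following is a minimal sketch, assuming the column names shown above; `sanity_check` and `sample_df` are hypothetical names introduced for illustration, and the sample rows are copied from the output above rather than loaded from `part1.pkl`.

```python
import pandas as pd


def sanity_check(df, key='Postalcode'):
    """Summarize row count, missing values, and duplicated keys (hypothetical helper)."""
    return {
        'rows': len(df),
        # Total number of NaN/None cells across all columns
        'missing_values': int(df.isna().sum().sum()),
        # Number of rows whose key repeats an earlier row's key
        'duplicate_keys': int(df[key].duplicated().sum()),
    }


# Small stand-in for toronto_df, built from the first rows shown above
sample_df = pd.DataFrame({
    'Postalcode': ['M1B', 'M1C', 'M1E'],
    'Borough': ['Scarborough', 'Scarborough', 'Scarborough'],
    'Neighborhood': [
        'Malvern, Rouge',
        'Rouge Hill, Port Union, Highland Creek',
        'Guildwood, Morningside, West Hill',
    ],
})

report = sanity_check(sample_df)
print(report)
```

In the notebook, the same helper could be called as `sanity_check(toronto_df)`.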


### Let's get the Latitudes and Longitudes

In [12]:
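def get_latilong(postal_code):
    '''
    Return [latitude, longitude] for a postal code in Toronto, Ontario.
    The geocoder can intermittently return None, so the call is
    repeated until it returns coordinates.
    '''
    lat_long_coords = None
    # Keep asking until the geocoder returns a result
    while lat_long_coords is None:
        g = geocoder.arcgis('{}, Toronto, Ontario'.format(postal_code))
        lat_long_coords = g.latlng
    return lat_long_coords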
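
A loop that retries until a result arrives never terminates if a postal code cannot be geocoded at all. As a hedged alternative, here is a minimal sketch of a bounded retry. The function name `get_latlng_with_retries`, the parameters `max_attempts` and `delay_seconds`, and the `geocode` callable are all assumptions introduced for illustration; they are not part of the Geocoder package.

```python
import time


def get_latlng_with_retries(postal_code, geocode, max_attempts=5, delay_seconds=1.0):
    """Return [lat, lng] for a postal code, or None after max_attempts failures.

    `geocode` is a hypothetical callable that takes a query string and
    returns [lat, lng] or None.
    """
    query = '{}, Toronto, Ontario'.format(postal_code)
    for attempt in range(1, max_attempts + 1):
        coords = geocode(query)
        if coords is not None:
            return coords
        if attempt < max_attempts:
            # Pause briefly before asking again
            time.sleep(delay_seconds)
    # Give up instead of looping forever
    return None
```

In this notebook, `geocode` could be `lambda query: geocoder.arcgis(query).latlng`, which wraps the same call used in `get_latilong`. Returning None after the last attempt leaves it to the caller to decide how to handle postal codes that could not be resolved.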

In [13]:
# Retrieve the coordinates for every postal code
postal_codes = toronto_df['Postalcode']
coords = [get_latilong(postal_code) for postal_code in postal_codes.tolist()]

In [15]:
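# Add the Latitude and Longitude columns.
# Reusing toronto_df's index keeps the rows aligned even if the index is not 0..n-1.
coords_df = pd.DataFrame(coords, columns=['Latitude', 'Longitude'], index=toronto_df.index)
toronto_df['Latitude'] = coords_df['Latitude']
toronto_df['Longitude'] = coords_df['Longitude']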

In [16]:
toronto_df.head()

Unnamed: 0,Postalcode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.81139,-79.19662
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.78574,-79.15875
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.76575,-79.1747
3,M1G,Scarborough,Woburn,43.76812,-79.21761
4,M1H,Scarborough,Cedarbrae,43.76944,-79.23892


In [17]:
toronto_df.shape

(103, 5)

In [18]:
# with open('part2.pkl', 'wb') as handle:
#     pickle.dump(toronto_df, handle, protocol=pickle.HIGHEST_PROTOCOL)