# Coordinate Retrieval: Toronto Postal Codes

The project uses the following Wikipedia page as one of its data sources:
https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M

Objective:
- Previously, we scraped the table data into a pandas DataFrame in the `Part1_Web_Scraping.ipynb` notebook.

- Now, we will get the latitude and longitude coordinates of each postal code area so that we can use the Foursquare location data later in the separate main project notebook.

Let's show some spirit by importing some basic libraries

In [13]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from tqdm import tqdm

#### Load Data

In [2]:
toronto_df = pd.read_csv('/Users/mac/Desktop/DataScience/Pojects_ds/coffee_shop/dataset/toronto_postal_codes.csv')
toronto_df.head()

,PostalCode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Regent Park / Harbourfront
3,M6A,North York,Lawrence Manor / Lawrence Heights
4,M7A,Queen's Park,Ontario Provincial Government


### Geocoder
- We will be using the Geocoder module to retrieve the coordinates for each postal code.

For more information: https://geocoder.readthedocs.io/index.html

We will use the geocoder function `geocoder.arcgis()`, which queries the ArcGIS geocoding service, to retrieve the coordinates.

**Wikipedia**: _ArcGIS is a geographic information system for working with maps and geographic information maintained by the Environmental Systems Research Institute._

Let's try one postal code and see the result:

In [8]:
import geocoder
g = geocoder.arcgis('M3A, Toronto, Ontario')
print(g.latlng)

[43.75356779300006, -79.32923091699996]


In [22]:
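main_list = []
for postal_code in tqdm(toronto_df['PostalCode']):
    # Query ArcGIS with the same address format as the single test above
    g = geocoder.arcgis(f'{postal_code}, Toronto, Ontario')
    # g.latlng is None when the lookup fails, so fall back to missing values
    lat, lng = g.latlng if g.latlng else (None, None)
    main_list.append({'PostalCode': postal_code, 'Latitude': lat, 'Longitude': lng})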

100%|██████████| 103/103 [02:11<00:00,  1.28s/it]
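
The loop above makes one network request per postal code, so a single transient failure can leave a row without coordinates. One possible way to make the lookup more robust is a small retry helper. The sketch below is only an illustration: the function name `geocode_postal_code` and the `lookup`, `retries`, and `delay` parameters are hypothetical, and it assumes only that the lookup result exposes a `latlng` attribute, as `geocoder.arcgis` does.

```python
import time


def geocode_postal_code(postal_code, lookup, retries=3, delay=1.0):
    """Return (lat, lng) for a postal code, or (None, None) if all attempts fail.

    `lookup` is any callable that takes an address string and returns an
    object with a `latlng` attribute, for example `geocoder.arcgis`.
    """
    address = f"{postal_code}, Toronto, Ontario"
    for attempt in range(retries):
        result = lookup(address)
        if result.latlng:  # None (or empty) when the lookup failed
            return tuple(result.latlng)
        if attempt < retries - 1:
            time.sleep(delay)  # brief pause before trying again
    return (None, None)
```

In the notebook this could be called as `geocode_postal_code('M3A', geocoder.arcgis)`. Passing the lookup function in as an argument also makes the helper easy to try out without a network connection.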


In [23]:
coord_df = pd.DataFrame(main_list)
coord_df.head()

,PostalCode,Latitude,Longitude
0,M3A,43.753568,-79.329231
1,M4A,43.727292,-79.313085
2,M5A,43.655201,-79.363016
3,M6A,43.722086,-79.450457
4,M7A,43.662437,-79.388758


In [24]:
toronto_cdf = pd.concat([toronto_df,coord_df[['Latitude','Longitude']]],axis=1)
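
The `pd.concat(..., axis=1)` call above pairs rows by index, which works here because both dataframes were built in the same order. As a hedged alternative, joining on the `PostalCode` key does not depend on row order. The sketch below uses two tiny stand-in dataframes (`left` and `right` are illustrative names, with values copied from the outputs shown earlier) rather than the full dataset.

```python
import pandas as pd

# Stand-in for toronto_df (first two rows only)
left = pd.DataFrame({
    "PostalCode": ["M3A", "M4A"],
    "Borough": ["North York", "North York"],
    "Neighbourhood": ["Parkwoods", "Victoria Village"],
})

# Stand-in for coord_df, deliberately in a different row order
right = pd.DataFrame({
    "PostalCode": ["M4A", "M3A"],
    "Latitude": [43.727292, 43.753568],
    "Longitude": [-79.313085, -79.329231],
})

# Join on the key; validate raises an error if a postal code is duplicated
merged = left.merge(right, on="PostalCode", how="left", validate="one_to_one")
print(merged)
```

With `how="left"`, every row of the left dataframe is kept in its original order, and any postal code without a match simply gets missing coordinates.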

In [25]:
toronto_cdf.head()

,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753568,-79.329231
1,M4A,North York,Victoria Village,43.727292,-79.313085
2,M5A,Downtown Toronto,Regent Park / Harbourfront,43.655201,-79.363016
3,M6A,North York,Lawrence Manor / Lawrence Heights,43.722086,-79.450457
4,M7A,Queen's Park,Ontario Provincial Government,43.662437,-79.388758


In [26]:
# Save the new dataframe to a new csv file
toronto_cdf.to_csv('/Users/mac/Desktop/DataScience/Pojects_ds/coffee_shop/dataset/toronto_poscode_latlng.csv', index=False)

# Print the shape of the new dataframe
toronto_cdf.shape

(103, 5)
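
Before the coordinates are used with Foursquare, a quick sanity check can catch failed or misplaced lookups. The sketch below is an illustration only: the helper name `find_suspect_rows` is hypothetical, the bounding box is a rough assumption about the Toronto area rather than an official boundary, and `X0X` is a made-up placeholder code used to show a failed lookup.

```python
import pandas as pd

# Rough bounding box for the Toronto area (an assumption for illustration)
LAT_MIN, LAT_MAX = 43.5, 43.9
LNG_MIN, LNG_MAX = -79.7, -79.1


def find_suspect_rows(df):
    """Return rows whose coordinates are missing or outside the bounding box."""
    missing = df["Latitude"].isna() | df["Longitude"].isna()
    in_box = (
        df["Latitude"].between(LAT_MIN, LAT_MAX)
        & df["Longitude"].between(LNG_MIN, LNG_MAX)
    )
    return df[missing | ~in_box]


# Small stand-in sample; the last row mimics a lookup that returned nothing
sample = pd.DataFrame({
    "PostalCode": ["M3A", "M4A", "X0X"],
    "Latitude": [43.753568, 43.727292, None],
    "Longitude": [-79.329231, -79.313085, None],
})

suspect = find_suspect_rows(sample)
print(suspect["PostalCode"].tolist())
```

In the notebook, `find_suspect_rows(toronto_cdf)` would ideally return an empty dataframe; any rows it does return are candidates for a second geocoding attempt.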