<a href="https://www.coursera.org/professional-certificates/ibm-data-science?"><img src="https://blog.coursera.org/wp-content/uploads/2018/08/Image.1534447201044.png" align="center"></a>

<h2 align="center">Peer-graded Assignment: Segmenting and Clustering Neighborhoods in Toronto</h2>

<h2 align="center">Author: Diego Ignacio López Orellana</h2>

<hr>

## Notebook 2/3: Get the Latitude and Longitude coordinates of the neighborhoods in Toronto and merge them into the *toronto_postcodes* dataframe

This notebook retrieves the Latitude and Longitude coordinates of each neighborhood in Toronto, Canada, listed in the *toronto_postcodes* dataframe, using the **Geocoder package**. That dataframe is created by importing the *canada_postcodes* **CSV file** into a *Pandas dataframe* called *toronto_postcodes*.

After that, the Latitude and Longitude values for each neighborhood are appended to lists, which are then added as columns to the *toronto_postcodes* dataframe.

Finally, this Jupyter Notebook will be submitted to my **Coursera Capstone GitHub repository**, which is available at the following [link](https://github.com/diegolopez92/Coursera_Capstone).

### I. Download, install and import libraries and dependencies

In [5]:
!conda install -c conda-forge geocoder --yes # Install geocoder, package used to convert postal code values to Latitude and Longitude coordinates of each
                                             # neighborhood in Toronto, Canada

# Data manipulation libraries

import pandas as pd # Library for manipulating and analyzing data (including the creation of dataframes)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import numpy as np # Math library to work with N-dimensional arrays in Python

Collecting package metadata (current_repodata.json): done
Solving environment: done





# All requested packages already installed.



### II. Import *canada_postcodes* **.CSV file** into a *Pandas dataframe*

In [6]:
tto_postcodes = pd.read_csv('canada_postcodes.csv') # Read the .csv file into a dataframe called 'tto_postcodes'
toronto_postcodes = tto_postcodes.drop('Unnamed: 0', axis=1) # Drop the column 'Unnamed: 0' (the row indexes saved in the file) and store the result as
                                                             # 'toronto_postcodes'
toronto_postcodes.head(10) # Show the first 10 rows of the dataframe 'toronto_postcodes'

| | Postal Code | Borough | Neighborhood |
|---|---|---|---|
| 0 | M3A | North York | Parkwoods |
| 1 | M4A | North York | Victoria Village |
| 2 | M5A | Downtown Toronto | Regent Park, Harbourfront |
| 3 | M6A | North York | Lawrence Manor, Lawrence Heights |
| 4 | M7A | Downtown Toronto | Queen's Park, Ontario Provincial Government |
| 5 | M9A | Etobicoke | Islington Avenue, Humber Valley Village |
| 6 | M1B | Scarborough | Malvern, Rouge |
| 7 | M3B | North York | Don Mills |
| 8 | M4B | East York | Parkview Hill, Woodbine Gardens |
| 9 | M5B | Downtown Toronto | Garden District, Ryerson |
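
The `Unnamed: 0` column typically appears when a CSV file was written together with its index and then read back without declaring it. As a minimal sketch (the inline CSV text is a small stand-in for *canada_postcodes.csv*, built from the first rows shown above, and `sample_postcodes` is a hypothetical name), passing `index_col=0` lets pandas use that column as the row index on read, so no separate `drop` step is needed:

```python
import io

import pandas as pd

# Stand-in for canada_postcodes.csv (assumption: the file was saved with its
# index, which is why the first header field is empty).
csv_text = """,Postal Code,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
"""

# index_col=0 uses the first column as the row index instead of loading it
# as a data column named 'Unnamed: 0'.
sample_postcodes = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(sample_postcodes.columns.tolist())
```

Either approach should give the same three data columns; dropping the column afterwards, as done in this notebook, works just as well.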


### III. Get the **Latitude and Longitude geographical coordinates** for each neighborhood in Toronto, according to its Postal Code

In [18]:
import geocoder # Import geocoder package
postal_code = toronto_postcodes['Postal Code'] # Postal code for each neighborhood in Toronto, Canada

# Initialize your variable to 'None'
lat_lng_coords = None

# Create an empty list to append the Latitude values
lat_toronto = []

# Create an empty list to append the Longitude values
lon_toronto = []

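# For each postal code, loop until getting the geographical coordinates
for post in postal_code:
    lat_lng_coords = None # Reset before each lookup
    # g.latlng is None when a request fails, so retry until coordinates are returned
    while lat_lng_coords is None:
        g = geocoder.arcgis('{}, Toronto, Ontario'.format(post))
        lat_lng_coords = g.latlng
    lat_toronto.append(lat_lng_coords[0])
    lon_toronto.append(lat_lng_coords[1])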
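
A lookup can return `None` when a request fails, and retrying without a limit can hang the notebook if the service stays unavailable. The following is a minimal sketch of a bounded retry, not the method used in this notebook: `geocode_with_retries`, `max_attempts`, `pause_seconds` and the stub `fake_lookup` are hypothetical names introduced for illustration, and the stub replaces the real network call so the example runs offline.

```python
import time


def geocode_with_retries(query, lookup, max_attempts=3, pause_seconds=0.0):
    """Return (lat, lng) from lookup(query), or (None, None) if every attempt fails."""
    for _ in range(max_attempts):
        coords = lookup(query)
        if coords is not None:
            return coords[0], coords[1]
        time.sleep(pause_seconds)  # Wait before trying again
    return None, None


# Stub lookup used only for illustration: fails once, then returns a fixed pair.
calls = {"count": 0}


def fake_lookup(query):
    calls["count"] += 1
    return None if calls["count"] == 1 else [43.75188, -79.33036]


lat, lng = geocode_with_retries("M3A, Toronto, Ontario", fake_lookup)
print(lat, lng)
```

In this notebook, the `lookup` argument could be something like `lambda q: geocoder.arcgis(q).latlng`, and any postal code that ends up with `(None, None)` could be inspected afterwards instead of stopping the whole run.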

In [21]:
# Show the Latitude coordinates for the first 10 neighborhoods from toronto_postcodes dataframe
lat_toronto[:10]

[43.75188000000003,
 43.73042000000004,
 43.655140000000074,
 43.72321000000005,
 43.66449000000006,
 43.66277000000008,
 43.81153000000006,
 43.74929000000003,
 43.707940000000065,
 43.65736000000004]

In [22]:
# Show the Longitude coordinates for the first 10 neighborhoods from toronto_postcodes dataframe
lon_toronto[:10]

[-79.33035999999998,
 -79.31281999999999,
 -79.36264999999997,
 -79.45140999999995,
 -79.39301999999998,
 -79.52830999999998,
 -79.19551999999999,
 -79.36168999999995,
 -79.31159999999994,
 -79.37817999999999]

In [31]:
# Print the number of Latitude values returned by the ArcGIS geocoder
print(len(lat_toronto))

103


In [32]:
# Print the number of Longitude values returned by the ArcGIS geocoder
print(len(lon_toronto))

103


#### This confirms that a Latitude and a Longitude value were obtained for each Postal Code (neighborhood) in the *toronto_postcodes* dataframe.
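
The two length checks above can also be folded into a single test that compares both lists against the dataframe and looks for missing values. This is only a sketch: `check_coordinates` and `sample_df` are hypothetical names, and the sample values are the first two rows from the output above.

```python
import pandas as pd


def check_coordinates(df, lats, lons):
    """Return True when there is one non-missing latitude and longitude per dataframe row."""
    same_length = len(lats) == len(lons) == len(df)
    no_missing = all(v is not None for v in lats) and all(v is not None for v in lons)
    return same_length and no_missing


sample_df = pd.DataFrame({"Postal Code": ["M3A", "M4A"]})

# Complete case: two rows, two latitudes, two longitudes
ok = check_coordinates(sample_df, [43.75188, 43.73042], [-79.33036, -79.31282])

# Incomplete case: one coordinate pair is missing
incomplete = check_coordinates(sample_df, [43.75188], [-79.33036])

print(ok, incomplete)
```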

## **Important note:** for this task, I used **ArcGIS** as the geocoding provider to get the geographical coordinates for each Postal Code (neighborhood), instead of **Google**. As a result, the Latitude and Longitude values may differ slightly from those in the dataframe shown in the Peer-graded Assignment instructions.
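
One way to quantify how far two providers disagree is the great-circle (haversine) distance between their coordinates for the same postal code. The sketch below is an illustration under stated assumptions: `haversine_km` is a hypothetical helper, and `reference_m3a` is a made-up placeholder pair, not a value taken from the assignment, so it should be replaced with real reference coordinates before drawing any conclusion.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    radius_km = 6371.0088  # Mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


arcgis_m3a = (43.75188, -79.33036)    # ArcGIS result for M3A from this notebook
reference_m3a = (43.7533, -79.3297)   # Hypothetical placeholder, NOT an actual reference value

offset_km = haversine_km(*arcgis_m3a, *reference_m3a)
print(round(offset_km, 3))
```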

### IV. Add the Latitude and Longitude of each neighborhood as columns of the *toronto_postcodes* dataframe

In [53]:
# Add the 'Latitude' column, from lat_toronto, to the dataframe toronto_postcodes.
toronto_postcodes['Latitude'] = lat_toronto

In [55]:
# Add the 'Longitude' column, from lon_toronto, to the dataframe toronto_postcodes.
toronto_postcodes['Longitude'] = lon_toronto

In [56]:
toronto_postcodes.head(10)

| | Postal Code | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | M3A | North York | Parkwoods | 43.75188 | -79.33036 |
| 1 | M4A | North York | Victoria Village | 43.73042 | -79.31282 |
| 2 | M5A | Downtown Toronto | Regent Park, Harbourfront | 43.65514 | -79.36265 |
| 3 | M6A | North York | Lawrence Manor, Lawrence Heights | 43.72321 | -79.45141 |
| 4 | M7A | Downtown Toronto | Queen's Park, Ontario Provincial Government | 43.66449 | -79.39302 |
| 5 | M9A | Etobicoke | Islington Avenue, Humber Valley Village | 43.66277 | -79.52831 |
| 6 | M1B | Scarborough | Malvern, Rouge | 43.81153 | -79.19552 |
| 7 | M3B | North York | Don Mills | 43.74929 | -79.36169 |
| 8 | M4B | East York | Parkview Hill, Woodbine Gardens | 43.70794 | -79.3116 |
| 9 | M5B | Downtown Toronto | Garden District, Ryerson | 43.65736 | -79.37818 |
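
Assigning lists as columns relies on the lists being in the same row order as the dataframe. As an alternative sketch, not the approach used in this notebook, the coordinates can be kept in their own dataframe keyed by Postal Code and joined with `merge`, which matches rows by key rather than by position. The names `postcodes`, `coords` and `merged` are hypothetical, and the values are taken from the first rows above.

```python
import pandas as pd

postcodes = pd.DataFrame({
    "Postal Code": ["M3A", "M4A", "M5A"],
    "Borough": ["North York", "North York", "Downtown Toronto"],
})

# Coordinates keyed by postal code, deliberately listed in a different row order
coords = pd.DataFrame({
    "Postal Code": ["M5A", "M3A", "M4A"],
    "Latitude": [43.65514, 43.75188, 43.73042],
    "Longitude": [-79.36265, -79.33036, -79.31282],
})

# validate="one_to_one" raises an error if a postal code is duplicated on either side
merged = postcodes.merge(coords, on="Postal Code", how="left", validate="one_to_one")
print(merged)
```

With `how="left"`, every row of `postcodes` is kept in its original order, and a postal code without coordinates would show up as missing values instead of shifting the other rows.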


### V. Export *toronto_postcodes* dataframe to a **.csv file** with the same name

In [57]:
toronto_postcodes.to_csv('toronto_postcodes.csv', index=False)
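
Writing with `index=False` keeps the row index out of the file, so reading it back does not produce an extra `Unnamed: 0` column. A minimal round-trip sketch, using an in-memory buffer in place of a file on disk (`sample`, `buffer` and `reloaded` are hypothetical names):

```python
import io

import pandas as pd

sample = pd.DataFrame({
    "Postal Code": ["M3A", "M4A"],
    "Latitude": [43.75188, 43.73042],
    "Longitude": [-79.33036, -79.31282],
})

# Write the dataframe without its index, then read it back from the same buffer
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

print(reloaded.columns.tolist())
```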