# Toronto vs Montreal: How similar or dissimilar are Toronto and Montreal neighborhoods?

### Introduction

This project is for new Canadians who are deciding which of the two metropolitan cities to live in. Moving to a new country is challenging in its own right. Hopefully, this project will capture the essence of the two cities and help newcomers choose their new home wisely based on their lifestyle.

### The data

The data comes from Wikipedia (postal codes and neighborhoods), the Geocoder library (latitude and longitude), and the Foursquare API (location data from before COVID-19). The Foursquare location data concentrates on nearby venues such as restaurants, cafes, parks, and museums, to name a few.

### Data Analysis and Visualization

The common venues in each neighborhood of both cities, Toronto and Montreal, will be grouped into clusters using k-means clustering, an unsupervised machine learning technique. Folium maps and other visualization methods will be used to showcase the results.
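
As a rough illustration of that clustering step, the sketch below one-hot encodes venue categories, averages them per neighborhood, and runs k-means on the resulting profiles. The venue records, the neighborhood labels `A` to `D`, and the choice of two clusters are all made up for the example, and scikit-learn's `KMeans` is assumed as the implementation; the real input would be the Foursquare venue data.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical venue records; in the project these would come from Foursquare.
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D'],
    'Venue Category': ['Cafe', 'Cafe', 'Restaurant',
                       'Park', 'Park', 'Museum',
                       'Cafe', 'Restaurant', 'Restaurant',
                       'Park', 'Museum', 'Museum'],
})

# One-hot encode the categories, then average per neighborhood to get
# the share of each venue category in that neighborhood.
onehot = pd.get_dummies(venues['Venue Category']).astype(float)
onehot['Neighborhood'] = venues['Neighborhood']
profiles = onehot.groupby('Neighborhood').mean()

# Group the neighborhood profiles into k clusters (k=2 is an arbitrary choice here).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profiles)
profiles['Cluster'] = kmeans.labels_
print(profiles)
```

In this toy data the cafe and restaurant neighborhoods (`A`, `C`) should land in one cluster and the park and museum neighborhoods (`B`, `D`) in the other; which cluster gets label 0 or 1 is arbitrary.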

## Notebook 

### Toronto Data 

In [1]:
import pandas as pd
import numpy as np 

print('Pandas and NumPy imported')

Pandas and NumPy imported


In [2]:
#Create first Toronto dataframe based on Wiki data 
to_df = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')
to_df = to_df[0]

#Remove Not assigned and reset index
to_df.replace('Not assigned', np.nan, inplace = True)
to_df.dropna(subset = ['Borough'], axis = 0, inplace = True)
to_df.drop(columns = ['Borough'], inplace = True)
to_df.reset_index(drop = True, inplace = True)
to_df.head()

|   | Postal Code | Neighbourhood |
|---|---|---|
| 0 | M3A | Parkwoods |
| 1 | M4A | Victoria Village |
| 2 | M5A | Regent Park, Harbourfront |
| 3 | M6A | Lawrence Manor, Lawrence Heights |
| 4 | M7A | Queen's Park, Ontario Provincial Government |
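
The same cleaning steps can be tried offline on a small hand-made frame. The sketch below is only an illustration: the `raw` table is a made-up stand-in for the scraped Wikipedia table, and it chains the operations instead of using `inplace=True`, which is a style choice rather than a change in behavior.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the table scraped from Wikipedia.
raw = pd.DataFrame({
    'Postal Code': ['M1A', 'M2A', 'M3A', 'M4A'],
    'Borough': ['Not assigned', 'Not assigned', 'North York', 'North York'],
    'Neighbourhood': ['Not assigned', 'Not assigned', 'Parkwoods', 'Victoria Village'],
})

cleaned = (
    raw.replace('Not assigned', np.nan)   # mark unassigned cells as missing
       .dropna(subset=['Borough'])        # drop postal codes with no borough
       .drop(columns=['Borough'])         # keep postal code and neighbourhood
       .reset_index(drop=True)            # renumber the remaining rows from 0
)
print(cleaned)
```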


In [4]:
!conda install -c conda-forge geocoder --yes
import geocoder 

print('Geocoder imported!')


In [5]:
# Create a new dataframe with postal code, latitude, and longitude for Toronto
geo_to = pd.read_csv('http://cocl.us/Geospatial_data')

# Merge both dataframes to create the foundational dataframe toronto_df
toronto_df = pd.merge(to_df, geo_to, how = 'outer', on='Postal Code')
toronto_df.head()

|   | Postal Code | Neighbourhood | Latitude | Longitude |
|---|---|---|---|---|
| 0 | M3A | Parkwoods | 43.753259 | -79.329656 |
| 1 | M4A | Victoria Village | 43.725882 | -79.315572 |
| 2 | M5A | Regent Park, Harbourfront | 43.65426 | -79.360636 |
| 3 | M6A | Lawrence Manor, Lawrence Heights | 43.718518 | -79.464763 |
| 4 | M7A | Queen's Park, Ontario Provincial Government | 43.662301 | -79.389494 |
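
An outer merge keeps postal codes that appear in only one of the two tables, so it can be worth checking for unmatched rows. The sketch below shows one way to do that with the `validate` and `indicator` options of `pd.merge`. The two small frames are made-up miniatures of `to_df` and `geo_to`, and the `M9Z` row with its coordinates is invented purely to produce a mismatch.

```python
import pandas as pd

# Hypothetical miniature versions of to_df and geo_to.
neighbourhoods = pd.DataFrame({
    'Postal Code': ['M3A', 'M4A', 'M5A'],
    'Neighbourhood': ['Parkwoods', 'Victoria Village', 'Regent Park, Harbourfront'],
})
coordinates = pd.DataFrame({
    'Postal Code': ['M3A', 'M4A', 'M9Z'],   # 'M9Z' is an invented code
    'Latitude': [43.753259, 43.725882, 43.0],
    'Longitude': [-79.329656, -79.315572, -79.0],
})

# validate raises an error if a postal code is duplicated on either side;
# indicator adds a '_merge' column saying where each row came from.
merged = pd.merge(neighbourhoods, coordinates, how='outer', on='Postal Code',
                  validate='one_to_one', indicator=True)

# Rows present in only one table end up with missing values after the merge.
unmatched = merged[merged['_merge'] != 'both']
print(unmatched[['Postal Code', '_merge']])
```

If `unmatched` is empty on the real data, the outer merge behaves the same as an inner one.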


### Montreal Data

In [6]:
#https://www.geonames.org/postal-codes/CA/QC/quebec.html
#https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_H

mtl_data = {'Postal Code': ['H0M', 'H1A', 'H1B', 'H1C', 'H1E', 'H1G', 'H1H', 'H1J', 'H1K', 'H1L', 'H1M', 'H1N', 'H1P', 'H1R', 
                    'H1S', 'H1T', 'H1V', 'H1W', 'H1X', 'H1Y', 'H1Z', 'H2A', 'H2B', 'H2C', 'H2E', 'H2G', 'H2H', 'H2J', 
                    'H2K', 'H2L', 'H2M', 'H2N', 'H2P', 'H2R', 'H2S', 'H2T', 'H2V', 'H2W', 'H2X', 'H2Y', 'H2Z', 'H3A', 
                    'H3B', 'H3C', 'H3E', 'H3G', 'H3H', 'H3J', 'H3K', 'H3L', 'H3M', 'H3N', 'H3P', 'H3R', 'H3S', 'H3T', 
                    'H3V', 'H3W', 'H3X', 'H3X', 'H3Y', 'H3Z', 'H4A', 'H4B', 'H4C', 'H4E', 'H4G', 'H4H', 'H4J', 'H4K', 
                    'H4L', 'H4M', 'H4N', 'H4P', 'H4R', 'H4S', 'H4T', 'H4V', 'H4W', 'H4X', 'H4Y', 'H4Z', 'H5A', 'H5B', 
                    'H7A', 'H7B', 'H7C', 'H7E', 'H7G', 'H7H', 'H7J', 'H7K', 'H7L', 'H7M', 'H7N', 'H7P', 'H7R', 'H7S', 
                    'H7T', 'H7V', 'H7W', 'H7X', 'H7Y', 'H8N', 'H8P', 'H8R', 'H8R', 'H8S', 'H8T', 'H8Y', 'H8Z', 'H9A', 
                    'H9B', 'H9C', 'H9E', 'H9G', 'H9H', 'H9H', 'H9J', 'H9K', 'H9P', 'H9R', 'H9S', 'H9S', 'H9W', 'H9X'],
            'Neighborhood': ['Akwesasne Region', 'Pointe-Aux-Trembles', 'Montreal-East', 'Rivière-des-Prairies', 'Rivière-des-Prairies', 
                     'Montreal-Nord', 'Montreal-Nord', 'Anjou', 'Anjou', 'Mercier', 'Mercier', 'Mercier', 'Saint-Leonard', 'Saint-Leonard', 
                     'Saint-Leonard', 'Rosemont', 'Maisonneuve', 'Hochlelaga', 'Rosemount', 'Rosemount', 'Saint-Michel',
                     'Saint-Michel', 'Ahunstic', 'Ahunstic', 'Villeray', 'Petite-Patrie', 'Plateau Mount-Royal', 'Plateau Mount-Royal', 
                     'Centre-Sud', 'Centre-Sud', 'Ahunstic', 'Ahunstic', 'Villeray', 'Villeray', 'Petite-Patrie', 'Plateau Mount-Royal', 
                     'Outremount', 'Plateau Mount-Royal', 'Plateau Mount-Royal', 'Old Montreal', 'Downtown Montreal', 
                     'Downtown Montreal (McGill University)', 'Downtown Montreal', 'Griffin Town (Université de Montréal)', "Nun's Island", 
                     'Downtown Montreal (Concordia University)', 'Downtown Montreal', 'Petite Bourgogne', 'Pointe-Saint-Charles', 
                     'Ahunstic', 'Catierville', 'Parc Extension', 'Mount Royal', 'Mount Royal', 'Côte-des-Neiges', 'Côte-des-Neiges', 
                     'Côte-des-Neiges', 'Côte-des-Neiges', 'Hampstead', 'Côte-Saint-Luc', 'Westmount', 'Westmount', 'Notre-Dame-de-Grâce', 
                     'Notre-Dame-de-Grâce', 'Saint-Henri', 'Ville-Émard', 'Verdun', 'Verdun', 'Cartierville', 'Cartierville', 
                     'Saint-Laurent', 'Saint-Laurent', 'Saint-Laurent', 'Mount Royal', 'Saint-Laurent', 'Saint-Laurent', 'Saint-Laurent', 
                     'Côte-Saint-Luc', 'Côte-Saint-Luc', 'Montreal West', 'Dorval', 'Tour de la Bourse', 'Place Bonaventure', 'Place Desjardins', 
                     'Duvernay-Est', 'Saint-François', 'Saint-Vincent-de-Paul', 'Duvernay', 'Point-Viau', 'Auteuil', 'Auteuil', 'Auteuil', 
                     'Saint Rose', 'Vimont', 'Laval-des-Rapides', 'Fabreville', 'Laval-sur-le-Lac', 'Chomedy', 'Chomedy', 'Chomedy', 'Chomedy', 
                     'Sainte-Dorothée', 'Îles-Laval', 'LaSalle', 'LaSalle', 'LaSalle', 'Ville Saint-Pierre', 'Lachine', 'Lachine', 
                     'Pierrefonds-Roxboro', 'Pierrefonds', 'Dollard-des-Ormeaux', 'Dollard-des-Ormeaux', 'Île-Bizard', 'Île-Bizard', 
                     'Dollard-des-Ormeaux', 'Pierrefonds', 'Sainte-Geneviève', 'Kirkland', 'Senneville', 'Dorval', 'Pointe-Claire', 'Dorval', 
                     "L'Île-Dorval", 'Beaconsfield', 'Sainte-Anne-de-Bellevue']}

In [7]:
mtl_df = pd.DataFrame(mtl_data)

print(mtl_df)
print('\n''source: https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_H')

    Postal Code             Neighborhood
0           H0M         Akwesasne Region
1           H1A      Pointe-Aux-Trembles
2           H1B            Montreal-East
3           H1C     Rivière-des-Prairies
4           H1E     Rivière-des-Prairies
5           H1G            Montreal-Nord
6           H1H            Montreal-Nord
7           H1J                    Anjou
8           H1K                    Anjou
9           H1L                  Mercier
10          H1M                  Mercier
11          H1N                  Mercier
12          H1P            Saint-Leonard
13          H1R            Saint-Leonard
14          H1S            Saint-Leonard
15          H1T                 Rosemont
16          H1V              Maisonneuve
17          H1W               Hochlelaga
18          H1X                Rosemount
19          H1Y                Rosemount
20          H1Z             Saint-Michel
21          H2A             Saint-Michel
22          H2B                 Ahunstic
23          H2C 
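
The hand-typed list spells some neighborhoods in more than one way (for example `Rosemont` and `Rosemount`), and any later grouping by name would treat those as different neighborhoods. The sketch below shows one possible way to normalize them with a lookup table. The `corrections` mapping is a suggestion and not part of the original notebook, and `sample` is a four-row excerpt used so the example runs on its own.

```python
import pandas as pd

# Suggested spelling corrections (left: as typed above, right: usual spelling).
corrections = {
    'Rosemount': 'Rosemont',
    'Hochlelaga': 'Hochelaga',
    'Ahunstic': 'Ahuntsic',
    'Outremount': 'Outremont',
    'Catierville': 'Cartierville',
    'Chomedy': 'Chomedey',
    'Point-Viau': 'Pont-Viau',
    'Saint Rose': 'Sainte-Rose',
}

# Small excerpt of the rows above; the full mtl_df could be handled the same way.
sample = pd.DataFrame({
    'Postal Code': ['H1T', 'H1W', 'H1X', 'H2B'],
    'Neighborhood': ['Rosemont', 'Hochlelaga', 'Rosemount', 'Ahunstic'],
})

# Series.replace with a dict swaps whole cell values that match a key.
sample['Neighborhood'] = sample['Neighborhood'].replace(corrections)
print(sample)
```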

In [15]:
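# g.latlng is a [latitude, longitude] pair, or None if the lookup failed

def get_lat(postal_code):
    """Return the latitude that ArcGIS reports for a Montreal postal code."""
    lat_lng_coords = None
    # Query again until the geocoder returns coordinates
    while lat_lng_coords is None:
        g = geocoder.arcgis('{}, Montreal, Quebec'.format(postal_code))
        lat_lng_coords = g.latlng
    # Return only after the loop, so a failed lookup is retried, not indexed
    return lat_lng_coords[0]

def get_lon(postal_code):
    """Return the longitude that ArcGIS reports for a Montreal postal code."""
    lat_lng_coords = None
    # Query again until the geocoder returns coordinates
    while lat_lng_coords is None:
        g = geocoder.arcgis('{}, Montreal, Quebec'.format(postal_code))
        lat_lng_coords = g.latlng
    return lat_lng_coords[1]

print('latitude retrieved by geocoder for H9R:', get_lat('H9R'))   # test function
print('longitude retrieved by geocoder for H9R:', get_lon('H9R'))  # test function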

latitude retrieved by geocoder for H9R: 45.46047000000004
longitude retrieved by geocoder for H9R: -73.81335999999999


In [16]:
mtl_df['Latitude'] = mtl_df['Postal Code'].apply(get_lat)
mtl_df['Longitude'] = mtl_df['Postal Code'].apply(get_lon)
mtl_df.head()

|   | Postal Code | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|
| 0 | H0M | Akwesasne Region | 45.5124 | -73.55469 |
| 1 | H1A | Pointe-Aux-Trembles | 45.67415 | -73.50059 |
| 2 | H1B | Montreal-East | 45.62939 | -73.52003 |
| 3 | H1C | Rivière-des-Prairies | 45.66019 | -73.54076 |
| 4 | H1E | Rivière-des-Prairies | 45.63678 | -73.58602 |
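
Calling `get_lat` and `get_lon` separately geocodes every postal code twice. One possible alternative, sketched below, fetches both coordinates in a single lookup and gives up after a fixed number of attempts. The function name `geocode_postal_code`, the `lookup` parameter, and `max_attempts` are hypothetical names introduced for this example; `fake_lookup` is an offline stand-in so the sketch runs without network access.

```python
def geocode_postal_code(postal_code, lookup, max_attempts=3):
    """Return (latitude, longitude), or (None, None) if every attempt fails.

    `lookup` is a hypothetical callable that takes a query string and
    returns a [lat, lng] list or None.
    """
    query = '{}, Montreal, Quebec'.format(postal_code)
    for _ in range(max_attempts):
        coords = lookup(query)
        if coords is not None:
            return coords[0], coords[1]
    return None, None


# Offline stand-in for the geocoder: fails on the first call, then succeeds.
calls = []

def fake_lookup(query):
    calls.append(query)
    return None if len(calls) == 1 else [45.46047, -73.81336]


print(geocode_postal_code('H9R', fake_lookup))
```

With the real library, `lookup` could be `lambda q: geocoder.arcgis(q).latlng`, which is the same call the notebook's functions make.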
