# Capstone Project - Evaluating the best Neighborhood in Ontario, Canada to immigrate to

## Shaun Diplock - 22nd March 2021

## 1. Introduction

The problem addressed in this report and study is a very personal one; the target audience is myself and my future wife and family. This 'business problem' is therefore fairly unconventional, as I am performing the study with myself as the main stakeholder. I'll attempt to keep personal details to a minimum throughout this report; however, as a natural consequence of the subject matter, I will occasionally reference items of personal interest.

I met my fiancée over 5 years ago in Ottawa, Canada, whilst I was working in the area on a business trip. I live in England and work as a system development manager and engineer, and frequently have to travel abroad for client site visits and business meetings. Our long-distance relationship has developed to the point where we are now engaged, and I am actively trying to immigrate to Canada so we can seriously start our life together.

Moving to another country is a daunting prospect, and one that merits time and research into identifying the best area for both of us. Naturally, I am very anxious (but also excited) for what the future holds - this project represents a very real, genuine attempt to evaluate some areas that will maximise the chance of our move and choice being a success.

Our minimum criteria are as follows:

1. Within a 60 minute drive (approximately 70 km) of Smiths Falls, Ontario (this is where my fiancée currently works).
2. We do not want to live in Quebec.
3. We cannot live in the United States (Smiths Falls is relatively close to the border).
4. We do not want to live in a very small town - the town / city must contain more than 2,500 residents.
5. We do not want to live in a major city or urban area - the town / city must contain fewer than 50,000 residents.


Provided the above minimum criteria are met, towns / cities / neighborhoods will be ranked using the following attributes:

1. The number of restaurants and bars in the town / city, and their ratings.
2. The number of gyms / fitness studios, and their ratings.
3. The number of outdoor spaces, such as parks and trails.
4. The number of shopping outlets and retail stores.
5. The average cost of housing in the area.
6. The distance from Smiths Falls, Ontario.

With the above criteria evaluated, we will hopefully be able to identify some suitable areas for us to move to when I immigrate, to maximise the chance of our future life together being happy and successful.


## 2. Data Acquisition and Preparation


The source data for this problem will be acquired from [Distantias' location proximity tool](https://www.distantias.com/towns-radius-smiths_falls-ontario-canada.htm). This website provides an easy-to-use tool that quickly searches for all towns, cities and inhabited places in proximity to another town or place, and returns the location data in a convenient .csv file that is easy to process for the study. Unfortunately, the tool charges a small fee. Despite a thorough search for equivalent libraries, functions and data sources, the convenience of this tool outweighed the small extra charge to access the data.

The data appears reliable, and the provider is transparent about population figures that may be missing from a query: *'We don't have data for every town and city in Canada and we specify this with NA in our data table. Population data is sourced from a variety of national and international databases some of which are more current than others. The oldest data set is from 2011 but we do make ongoing updates as new census data is released'.* A preliminary review of my queried data did indeed reveal hamlets and settlements with no population data; checking these manually showed that the settlements are so tiny that no census data has ever been collected from them. They can therefore simply be dropped from the data, as they do not fulfil criterion 4 as detailed above.

This data will then be combined with Foursquare data in order to evaluate the areas, towns, cities and neighborhoods that meet the minimum acceptable criteria. The Foursquare location and venue data will then be used to evaluate our preferred neighborhood attributes.

Foursquare data can provide a lot of meaningful and relevant information; for instance, it can be used to examine and cluster the frequency of various amenities in an area, as shown by the following results from a previous, related exercise:

![alt text](https://github.com/ShaunDiplock/Coursera_Capstone/blob/main/Neighborhood%20data%20example.PNG?raw=true "Example clustered neighborhood data")

In addition to the Distantias and Foursquare data, I will also use the [MLS Home Price Index (HPI)](https://www.crea.ca/housing-market-stats/mls-home-price-index/) data from the [Canadian Real Estate Association](https://www.crea.ca/) website in order to evaluate the average price of housing in each area. This data is available for download using this link: https://www.crea.ca/hpi-tools-terms-of-use/.

Ultimately, I will be using all of this data to assign each area a 'weighted score' to help form a list of the best three suitable neighborhoods, with some final discussion about the pros and cons of each area.
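As a rough illustration of the weighted-score idea, the sketch below min-max normalises each attribute and combines the results with weights. The attribute names, the example figures and the weights are all hypothetical placeholders chosen for the illustration; they are not results from this study.

```python
# Hypothetical sketch: attribute names, weights and figures are illustrative only.

def min_max(values):
    """Scale a list of numbers to the range 0..1 (all zeros if they are equal)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_scores(towns, weights, lower_is_better=()):
    """Return {town: score}; attributes named in lower_is_better are inverted."""
    names = list(towns)
    scores = {name: 0.0 for name in names}
    for attr, weight in weights.items():
        scaled = min_max([towns[name][attr] for name in names])
        if attr in lower_is_better:
            scaled = [1.0 - s for s in scaled]
        for name, s in zip(names, scaled):
            scores[name] += weight * s
    return scores


example_towns = {
    "Town A": {"restaurants": 10, "distance_km": 20.0},
    "Town B": {"restaurants": 30, "distance_km": 40.0},
}
example_weights = {"restaurants": 0.6, "distance_km": 0.4}
example_scores = weighted_scores(
    example_towns, example_weights, lower_is_better=("distance_km",)
)
print(example_scores)
```

Attributes where a smaller value is preferable, such as distance or housing cost, are inverted after scaling so that a higher score is always better.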

### Import required libraries

In [1]:
import pandas as pd

### Import Distantias data

In [124]:
# Raw string so that the backslashes in the Windows path are not treated as escape sequences
path=r'D:\GithubProjects\Coursera_Capstone\smiths_falls_distantias_data.csv'

df_ontariodata = pd.read_csv(path)

df_ontariodata.head(10)

Unnamed: 0,Town Name,Web Link,Distance Miles,Distance KM,Precise Drive time and Directions URL,Approx Drive miles,Approx Drive Time,Assumed Average MPH,Total Minutes,Region,Country,Population,Direction,Latitude,Longitude,Date
0,Smiths Falls,https://www.distantias.com/towns-radius-smiths...,0.0,0.0,https://www.distantias.com/distance-from-smith...,0.0,0 hour(s) and 0 minutes,0,0.0,Ontario,Canada,9403,NE,44.9,-76.0167,2011
1,Merrickville,https://www.distantias.com/towns-radius-merric...,9.05,14.564527,https://www.distantias.com/distance-from-smith...,10.77,0 hour(s) and 25 minutes,26,24.9,Ontario,Canada,3067,NE,44.9167,-75.8333,2011
2,Perth,https://www.distantias.com/towns-radius-perth-...,10.67,17.171658,https://www.distantias.com/distance-from-smith...,13.22,0 hour(s) and 21 minutes,37,21.4,Ontario,Canada,6211,SW,44.8833,-76.2333,2011
3,Eloida,https://www.distantias.com/towns-radius-eloida...,15.17,24.413688,https://www.distantias.com/distance-from-smith...,18.8,0 hour(s) and 30 minutes,37,30.5,Ontario,Canada,not available,SE,44.6833,-75.9667,2011
4,Carleton Place,https://www.distantias.com/towns-radius-carlet...,17.09,27.503621,https://www.distantias.com/distance-from-smith...,21.17,0 hour(s) and 34 minutes,37,34.3,Ontario,Canada,10013,NW,45.1333,-76.1333,2011
5,Kemptville,https://www.distantias.com/towns-radius-kemptv...,20.39,32.814443,https://www.distantias.com/distance-from-smith...,24.77,0 hour(s) and 27 minutes,55,27.0,Ontario,Canada,3532,NE,45.0167,-75.6333,2011
6,Crosby,https://www.distantias.com/towns-radius-crosby...,20.74,33.377712,https://www.distantias.com/distance-from-smith...,25.2,0 hour(s) and 27 minutes,55,27.5,Ontario,Canada,not available,SW,44.65,-76.25,2011
7,Richmond,https://www.distantias.com/towns-radius-richmo...,21.52,34.632997,https://www.distantias.com/distance-from-smith...,26.15,0 hour(s) and 29 minutes,55,28.5,Ontario,Canada,3797,NE,45.1833,-75.8333,2011
8,Newboro,https://www.distantias.com/towns-radius-newbor...,22.71,36.548111,https://www.distantias.com/distance-from-smith...,27.59,0 hour(s) and 30 minutes,55,30.1,Ontario,Canada,not available,SW,44.65,-76.3167,2011
9,Almonte,https://www.distantias.com/towns-radius-almont...,23.63,38.028704,https://www.distantias.com/distance-from-smith...,28.71,0 hour(s) and 31 minutes,55,31.3,Ontario,Canada,4752,NW,45.2167,-76.2,2011


In [125]:
df_ontariodata.dtypes

Town Name                                 object
Web Link                                  object
Distance Miles                           float64
Distance KM                              float64
Precise Drive time and Directions URL     object
Approx Drive miles                       float64
Approx Drive Time                         object
Assumed Average MPH                        int64
Total Minutes                            float64
Region                                    object
Country                                   object
Population                                object
Direction                                 object
Latitude                                 float64
Longitude                                float64
Date                                       int64
dtype: object

In [126]:
df_ontariodata.shape

(135, 16)

### Filter for towns that meet criterion 1 (within 70 km)

In [127]:
df1=df_ontariodata[df_ontariodata['Distance KM'] < 70]

df1.shape

(62, 16)
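Criterion 1 is stated as a drive time, and the data also carries a `Total Minutes` column (which appears to be `Approx Drive miles` divided by `Assumed Average MPH`, so it is only an estimate). A minimal sketch of filtering on that column instead of distance follows; the small frame is a stand-in for `df_ontariodata`, and its third row is invented to show a town being excluded.

```python
import pandas as pd

# Stand-in for df_ontariodata: the first two rows copy values shown above,
# the third is a made-up row that exceeds the limit.
sample = pd.DataFrame({
    "Town Name": ["Perth", "Braeside", "Hypothetical Town"],
    "Distance KM": [17.171658, 69.748796, 85.0],
    "Total Minutes": [21.4, 57.5, 72.0],
})

MAX_DRIVE_MINUTES = 60  # criterion 1 expressed as a drive time

# Keep only towns whose estimated drive time is within the limit
within_drive = sample[sample["Total Minutes"] <= MAX_DRIVE_MINUTES]
print(within_drive["Town Name"].tolist())
```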

In [128]:
df1

Unnamed: 0,Town Name,Web Link,Distance Miles,Distance KM,Precise Drive time and Directions URL,Approx Drive miles,Approx Drive Time,Assumed Average MPH,Total Minutes,Region,Country,Population,Direction,Latitude,Longitude,Date
0,Smiths Falls,https://www.distantias.com/towns-radius-smiths...,0.00,0.000000,https://www.distantias.com/distance-from-smith...,0.00,0 hour(s) and 0 minutes,0,0.0,Ontario,Canada,9403,NE,44.9000,-76.0167,2011
1,Merrickville,https://www.distantias.com/towns-radius-merric...,9.05,14.564527,https://www.distantias.com/distance-from-smith...,10.77,0 hour(s) and 25 minutes,26,24.9,Ontario,Canada,3067,NE,44.9167,-75.8333,2011
2,Perth,https://www.distantias.com/towns-radius-perth-...,10.67,17.171658,https://www.distantias.com/distance-from-smith...,13.22,0 hour(s) and 21 minutes,37,21.4,Ontario,Canada,6211,SW,44.8833,-76.2333,2011
3,Eloida,https://www.distantias.com/towns-radius-eloida...,15.17,24.413688,https://www.distantias.com/distance-from-smith...,18.80,0 hour(s) and 30 minutes,37,30.5,Ontario,Canada,not available,SE,44.6833,-75.9667,2011
4,Carleton Place,https://www.distantias.com/towns-radius-carlet...,17.09,27.503621,https://www.distantias.com/distance-from-smith...,21.17,0 hour(s) and 34 minutes,37,34.3,Ontario,Canada,10013,NW,45.1333,-76.1333,2011
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
57,Old Chelsea,https://www.distantias.com/towns-radius-old_ch...,42.58,68.525697,https://www.distantias.com/distance-from-smith...,51.73,0 hour(s) and 56 minutes,55,56.4,Quebec,Canada,not available,NE,45.5000,-75.8167,2011
58,Luskville,https://www.distantias.com/towns-radius-luskvi...,42.62,68.590071,https://www.distantias.com/distance-from-smith...,51.78,0 hour(s) and 56 minutes,55,56.5,Quebec,Canada,not available,NE,45.5167,-76.0000,2011
59,Fishers Landing,https://www.distantias.com/towns-radius-fisher...,42.97,69.153340,https://www.distantias.com/distance-from-smith...,52.21,0 hour(s) and 57 minutes,55,57.0,New York,United States,89,SE,44.2782,-76.0083,2011
60,Braeside,https://www.distantias.com/towns-radius-braesi...,43.34,69.748796,https://www.distantias.com/distance-from-smith...,52.66,0 hour(s) and 57 minutes,55,57.5,Ontario,Canada,7178,NW,45.4667,-76.4000,2011
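The `Distance KM` values look like straight-line distances rather than road distances: recomputing one from the coordinates in the table gives almost the same figure. The sketch below uses the haversine formula for that check; the mean Earth radius of 6371 km and the function name are assumptions of the sketch.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Smiths Falls and Perth coordinates as listed in the table above
smiths_to_perth = haversine_km(44.9, -76.0167, 44.8833, -76.2333)
print(round(smiths_to_perth, 2))
```

The result is close to the 17.17 km listed for Perth, which suggests that the 70 km cut-off is applied to straight-line distance and that actual driving distances will be somewhat longer.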


### Filter for towns that meet criterion 2 (not in Quebec)

In [129]:
#First remove whitespace which is erroneously present in the .csv file; this step is needed so that the filter works

df1['Region']=df1['Region'].str.strip()

#Then filter for regions not called Quebec
df2=df1[df1['Region'] != "Quebec"]

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df1['Region']=df1['Region'].str.strip()


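The warning above is pandas' SettingWithCopyWarning. Because `df1` was created by filtering `df_ontariodata`, pandas cannot tell whether the assignment is intended for `df1` alone or for the original dataframe. In this case the assignment does take effect on `df1` (the Quebec filter that follows works as intended), so the result is the desired one and *not* a mistake, although the warning can be avoided by creating `df1` with an explicit `.copy()`.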
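One way to avoid the warning altogether is to take an explicit copy when filtering, so that the later assignment clearly targets the new dataframe. A minimal sketch, using a small stand-in frame with invented whitespace in `Region`:

```python
import pandas as pd

# Stand-in for df_ontariodata, with stray whitespace in 'Region'
raw = pd.DataFrame({
    "Town Name": ["Perth", "Old Chelsea", "Braeside"],
    "Distance KM": [17.171658, 68.525697, 69.748796],
    "Region": [" Ontario", " Quebec ", "Ontario "],
})

# .copy() makes the filtered frame independent of `raw`,
# so the assignment below is unambiguous
nearby = raw[raw["Distance KM"] < 70].copy()
nearby["Region"] = nearby["Region"].str.strip()

# Filter on the cleaned values
not_quebec = nearby[nearby["Region"] != "Quebec"]
print(not_quebec["Town Name"].tolist())
```

The original frame `raw` keeps its unstripped values; only the copy is modified.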

In [130]:
df2.shape

(58, 16)

In [131]:
df2.tail(10)

Unnamed: 0,Town Name,Web Link,Distance Miles,Distance KM,Precise Drive time and Directions URL,Approx Drive miles,Approx Drive Time,Assumed Average MPH,Total Minutes,Region,Country,Population,Direction,Latitude,Longitude,Date
50,Rockcliffe Park,https://www.distantias.com/towns-radius-rockcl...,41.3,66.465742,https://www.distantias.com/distance-from-smith...,50.18,0 hour(s) and 55 minutes,55,54.7,Ontario,Canada,1932,NE,45.45,-75.6833,2011
51,Chesterville,https://www.distantias.com/towns-radius-cheste...,41.4,66.626676,https://www.distantias.com/distance-from-smith...,50.3,0 hour(s) and 55 minutes,55,54.9,Ontario,Canada,1514,NE,45.1,-75.2167,2011
52,Madrid,https://www.distantias.com/towns-radius-madrid...,41.69,67.093385,https://www.distantias.com/distance-from-smith...,50.65,0 hour(s) and 55 minutes,55,55.3,New York,United States,1651,SE,44.7756,-75.1851,2018
53,Waddington,https://www.distantias.com/towns-radius-waddin...,41.75,67.189945,https://www.distantias.com/distance-from-smith...,50.73,0 hour(s) and 55 minutes,55,55.3,New York,United States,2214,SE,44.8474,-75.1678,2018
54,Redwood,https://www.distantias.com/towns-radius-redwoo...,42.19,67.898055,https://www.distantias.com/distance-from-smith...,51.26,0 hour(s) and 56 minutes,55,55.9,New York,United States,1510,SE,44.3237,-75.735,2011
55,Mountain Grove,https://www.distantias.com/towns-radius-mounta...,42.5,68.39695,https://www.distantias.com/distance-from-smith...,51.64,0 hour(s) and 56 minutes,55,56.3,Ontario,Canada,not available,SW,44.7333,-76.85,2011
56,Thousand Island Park,https://www.distantias.com/towns-radius-thousa...,42.51,68.413043,https://www.distantias.com/distance-from-smith...,51.65,0 hour(s) and 56 minutes,55,56.4,New York,United States,31,SW,44.2849,-76.0299,2011
59,Fishers Landing,https://www.distantias.com/towns-radius-fisher...,42.97,69.15334,https://www.distantias.com/distance-from-smith...,52.21,0 hour(s) and 57 minutes,55,57.0,New York,United States,89,SE,44.2782,-76.0083,2011
60,Braeside,https://www.distantias.com/towns-radius-braesi...,43.34,69.748796,https://www.distantias.com/distance-from-smith...,52.66,0 hour(s) and 57 minutes,55,57.5,Ontario,Canada,7178,NW,45.4667,-76.4,2011
61,Plessis,https://www.distantias.com/towns-radius-plessi...,43.35,69.764889,https://www.distantias.com/distance-from-smith...,52.67,0 hour(s) and 57 minutes,55,57.5,New York,United States,164,SE,44.2847,-75.8458,2011


### Filter for towns that meet criterion 3 (not in the United States)

In [132]:
#Filter for countries other than the United States
df3=df2[df2['Country'] != "United States"]
df3.shape

(41, 16)

In [133]:
df3.tail(10)

Unnamed: 0,Town Name,Web Link,Distance Miles,Distance KM,Precise Drive time and Directions URL,Approx Drive miles,Approx Drive Time,Assumed Average MPH,Total Minutes,Region,Country,Population,Direction,Latitude,Longitude,Date
43,Gananoque,https://www.distantias.com/towns-radius-ganano...,39.85,64.132199,https://www.distantias.com/distance-from-smith...,48.42,0 hour(s) and 53 minutes,55,52.8,Ontario,Canada,5194,SW,44.3333,-76.1667,2011
45,Russell,https://www.distantias.com/towns-radius-russel...,40.49,65.162177,https://www.distantias.com/distance-from-smith...,49.2,0 hour(s) and 54 minutes,55,53.7,Ontario,Canada,3759,NE,45.26,-75.36,2011
46,Arnprior,https://www.distantias.com/towns-radius-arnpri...,40.57,65.290924,https://www.distantias.com/distance-from-smith...,49.29,0 hour(s) and 54 minutes,55,53.8,Ontario,Canada,10099,NW,45.4333,-76.3667,2011
47,Ompah,https://www.distantias.com/towns-radius-ompah-...,40.7,65.500138,https://www.distantias.com/distance-from-smith...,49.45,0 hour(s) and 54 minutes,55,54.0,Ontario,Canada,1675,NW,45.0167,-76.8333,2011
48,Morrisburg,https://www.distantias.com/towns-radius-morris...,40.79,65.644979,https://www.distantias.com/distance-from-smith...,49.56,0 hour(s) and 54 minutes,55,54.1,Ontario,Canada,2756,NE,44.9,-75.1833,2011
49,Gloucester,https://www.distantias.com/towns-radius-glouce...,41.03,66.03122,https://www.distantias.com/distance-from-smith...,49.85,0 hour(s) and 54 minutes,55,54.4,Ontario,Canada,133280,NE,45.4167,-75.6,2011
50,Rockcliffe Park,https://www.distantias.com/towns-radius-rockcl...,41.3,66.465742,https://www.distantias.com/distance-from-smith...,50.18,0 hour(s) and 55 minutes,55,54.7,Ontario,Canada,1932,NE,45.45,-75.6833,2011
51,Chesterville,https://www.distantias.com/towns-radius-cheste...,41.4,66.626676,https://www.distantias.com/distance-from-smith...,50.3,0 hour(s) and 55 minutes,55,54.9,Ontario,Canada,1514,NE,45.1,-75.2167,2011
55,Mountain Grove,https://www.distantias.com/towns-radius-mounta...,42.5,68.39695,https://www.distantias.com/distance-from-smith...,51.64,0 hour(s) and 56 minutes,55,56.3,Ontario,Canada,not available,SW,44.7333,-76.85,2011
60,Braeside,https://www.distantias.com/towns-radius-braesi...,43.34,69.748796,https://www.distantias.com/distance-from-smith...,52.66,0 hour(s) and 57 minutes,55,57.5,Ontario,Canada,7178,NW,45.4667,-76.4,2011


### Filter for towns that meet criterion 4 (more than 2,500 residents)

First, drop any tiny settlements or hamlets (those with 'not available' listed in the Population column).

In [134]:
df4 = df3[df3['Population'] != 'not available']
df4.shape

(35, 16)

In [135]:
df4.head(10)

Unnamed: 0,Town Name,Web Link,Distance Miles,Distance KM,Precise Drive time and Directions URL,Approx Drive miles,Approx Drive Time,Assumed Average MPH,Total Minutes,Region,Country,Population,Direction,Latitude,Longitude,Date
0,Smiths Falls,https://www.distantias.com/towns-radius-smiths...,0.0,0.0,https://www.distantias.com/distance-from-smith...,0.0,0 hour(s) and 0 minutes,0,0.0,Ontario,Canada,9403,NE,44.9,-76.0167,2011
1,Merrickville,https://www.distantias.com/towns-radius-merric...,9.05,14.564527,https://www.distantias.com/distance-from-smith...,10.77,0 hour(s) and 25 minutes,26,24.9,Ontario,Canada,3067,NE,44.9167,-75.8333,2011
2,Perth,https://www.distantias.com/towns-radius-perth-...,10.67,17.171658,https://www.distantias.com/distance-from-smith...,13.22,0 hour(s) and 21 minutes,37,21.4,Ontario,Canada,6211,SW,44.8833,-76.2333,2011
4,Carleton Place,https://www.distantias.com/towns-radius-carlet...,17.09,27.503621,https://www.distantias.com/distance-from-smith...,21.17,0 hour(s) and 34 minutes,37,34.3,Ontario,Canada,10013,NW,45.1333,-76.1333,2011
5,Kemptville,https://www.distantias.com/towns-radius-kemptv...,20.39,32.814443,https://www.distantias.com/distance-from-smith...,24.77,0 hour(s) and 27 minutes,55,27.0,Ontario,Canada,3532,NE,45.0167,-75.6333,2011
7,Richmond,https://www.distantias.com/towns-radius-richmo...,21.52,34.632997,https://www.distantias.com/distance-from-smith...,26.15,0 hour(s) and 29 minutes,55,28.5,Ontario,Canada,3797,NE,45.1833,-75.8333,2011
9,Almonte,https://www.distantias.com/towns-radius-almont...,23.63,38.028704,https://www.distantias.com/distance-from-smith...,28.71,0 hour(s) and 31 minutes,55,31.3,Ontario,Canada,4752,NW,45.2167,-76.2,2011
10,Westport,https://www.distantias.com/towns-radius-westpo...,24.06,38.72072,https://www.distantias.com/distance-from-smith...,29.23,0 hour(s) and 32 minutes,55,31.9,Ontario,Canada,590,SW,44.6833,-76.4,2011
11,Stittsville,https://www.distantias.com/towns-radius-stitts...,24.67,39.702418,https://www.distantias.com/distance-from-smith...,29.97,0 hour(s) and 33 minutes,55,32.7,Ontario,Canada,41350,NE,45.25,-75.9167,2011
12,Brockville,https://www.distantias.com/towns-radius-brockv...,27.35,44.015449,https://www.distantias.com/distance-from-smith...,33.23,0 hour(s) and 36 minutes,55,36.3,Ontario,Canada,23354,SE,44.5833,-75.6833,2011


Next, convert the Population column to numeric values.

In [136]:
df4["Population"] = pd.to_numeric(df4["Population"])
df4.dtypes

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df4["Population"] = pd.to_numeric(df4["Population"])


Town Name                                 object
Web Link                                  object
Distance Miles                           float64
Distance KM                              float64
Precise Drive time and Directions URL     object
Approx Drive miles                       float64
Approx Drive Time                         object
Assumed Average MPH                        int64
Total Minutes                            float64
Region                                    object
Country                                   object
Population                                 int64
Direction                                 object
Latitude                                 float64
Longitude                                float64
Date                                       int64
dtype: object
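The two steps above (dropping the 'not available' rows, then converting the column) can also be done in one pass by letting `pd.to_numeric` coerce unparseable values to `NaN`. The sketch below uses a three-row stand-in frame with values copied from the tables above; it is an alternative, not the approach taken in this notebook.

```python
import pandas as pd

# Stand-in frame: Population arrives as text because of 'not available'
towns = pd.DataFrame({
    "Town Name": ["Perth", "Eloida", "Westport"],
    "Population": ["6211", "not available", "590"],
})

# errors="coerce" turns anything non-numeric (here 'not available') into NaN
population = pd.to_numeric(towns["Population"], errors="coerce")

# Replace the column, drop the rows without a population, and store integers
cleaned = towns.assign(Population=population).dropna(subset=["Population"])
cleaned = cleaned.astype({"Population": "int64"})
print(cleaned)
```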

Now that the population values are integers, we can keep only the towns with more than 2,500 residents.

In [137]:
df5=df4[df4['Population'] > 2500]
df5.shape

(24, 16)

In [138]:
df5

Unnamed: 0,Town Name,Web Link,Distance Miles,Distance KM,Precise Drive time and Directions URL,Approx Drive miles,Approx Drive Time,Assumed Average MPH,Total Minutes,Region,Country,Population,Direction,Latitude,Longitude,Date
0,Smiths Falls,https://www.distantias.com/towns-radius-smiths...,0.0,0.0,https://www.distantias.com/distance-from-smith...,0.0,0 hour(s) and 0 minutes,0,0.0,Ontario,Canada,9403,NE,44.9,-76.0167,2011
1,Merrickville,https://www.distantias.com/towns-radius-merric...,9.05,14.564527,https://www.distantias.com/distance-from-smith...,10.77,0 hour(s) and 25 minutes,26,24.9,Ontario,Canada,3067,NE,44.9167,-75.8333,2011
2,Perth,https://www.distantias.com/towns-radius-perth-...,10.67,17.171658,https://www.distantias.com/distance-from-smith...,13.22,0 hour(s) and 21 minutes,37,21.4,Ontario,Canada,6211,SW,44.8833,-76.2333,2011
4,Carleton Place,https://www.distantias.com/towns-radius-carlet...,17.09,27.503621,https://www.distantias.com/distance-from-smith...,21.17,0 hour(s) and 34 minutes,37,34.3,Ontario,Canada,10013,NW,45.1333,-76.1333,2011
5,Kemptville,https://www.distantias.com/towns-radius-kemptv...,20.39,32.814443,https://www.distantias.com/distance-from-smith...,24.77,0 hour(s) and 27 minutes,55,27.0,Ontario,Canada,3532,NE,45.0167,-75.6333,2011
7,Richmond,https://www.distantias.com/towns-radius-richmo...,21.52,34.632997,https://www.distantias.com/distance-from-smith...,26.15,0 hour(s) and 29 minutes,55,28.5,Ontario,Canada,3797,NE,45.1833,-75.8333,2011
9,Almonte,https://www.distantias.com/towns-radius-almont...,23.63,38.028704,https://www.distantias.com/distance-from-smith...,28.71,0 hour(s) and 31 minutes,55,31.3,Ontario,Canada,4752,NW,45.2167,-76.2,2011
11,Stittsville,https://www.distantias.com/towns-radius-stitts...,24.67,39.702418,https://www.distantias.com/distance-from-smith...,29.97,0 hour(s) and 33 minutes,55,32.7,Ontario,Canada,41350,NE,45.25,-75.9167,2011
12,Brockville,https://www.distantias.com/towns-radius-brockv...,27.35,44.015449,https://www.distantias.com/distance-from-smith...,33.23,0 hour(s) and 36 minutes,55,36.3,Ontario,Canada,23354,SE,44.5833,-75.6833,2011
13,Prescott,https://www.distantias.com/towns-radius-presco...,27.63,44.466064,https://www.distantias.com/distance-from-smith...,33.57,0 hour(s) and 37 minutes,55,36.6,Ontario,Canada,4284,SE,44.7167,-75.5167,2011


### Filter for towns that meet criterion 5 (fewer than 50,000 residents)

In [139]:
df6=df5[df5['Population'] < 50000]
df6.shape

(20, 16)

In [141]:
df6.head(10)

Unnamed: 0,Town Name,Web Link,Distance Miles,Distance KM,Precise Drive time and Directions URL,Approx Drive miles,Approx Drive Time,Assumed Average MPH,Total Minutes,Region,Country,Population,Direction,Latitude,Longitude,Date
0,Smiths Falls,https://www.distantias.com/towns-radius-smiths...,0.0,0.0,https://www.distantias.com/distance-from-smith...,0.0,0 hour(s) and 0 minutes,0,0.0,Ontario,Canada,9403,NE,44.9,-76.0167,2011
1,Merrickville,https://www.distantias.com/towns-radius-merric...,9.05,14.564527,https://www.distantias.com/distance-from-smith...,10.77,0 hour(s) and 25 minutes,26,24.9,Ontario,Canada,3067,NE,44.9167,-75.8333,2011
2,Perth,https://www.distantias.com/towns-radius-perth-...,10.67,17.171658,https://www.distantias.com/distance-from-smith...,13.22,0 hour(s) and 21 minutes,37,21.4,Ontario,Canada,6211,SW,44.8833,-76.2333,2011
4,Carleton Place,https://www.distantias.com/towns-radius-carlet...,17.09,27.503621,https://www.distantias.com/distance-from-smith...,21.17,0 hour(s) and 34 minutes,37,34.3,Ontario,Canada,10013,NW,45.1333,-76.1333,2011
5,Kemptville,https://www.distantias.com/towns-radius-kemptv...,20.39,32.814443,https://www.distantias.com/distance-from-smith...,24.77,0 hour(s) and 27 minutes,55,27.0,Ontario,Canada,3532,NE,45.0167,-75.6333,2011
7,Richmond,https://www.distantias.com/towns-radius-richmo...,21.52,34.632997,https://www.distantias.com/distance-from-smith...,26.15,0 hour(s) and 29 minutes,55,28.5,Ontario,Canada,3797,NE,45.1833,-75.8333,2011
9,Almonte,https://www.distantias.com/towns-radius-almont...,23.63,38.028704,https://www.distantias.com/distance-from-smith...,28.71,0 hour(s) and 31 minutes,55,31.3,Ontario,Canada,4752,NW,45.2167,-76.2,2011
11,Stittsville,https://www.distantias.com/towns-radius-stitts...,24.67,39.702418,https://www.distantias.com/distance-from-smith...,29.97,0 hour(s) and 33 minutes,55,32.7,Ontario,Canada,41350,NE,45.25,-75.9167,2011
12,Brockville,https://www.distantias.com/towns-radius-brockv...,27.35,44.015449,https://www.distantias.com/distance-from-smith...,33.23,0 hour(s) and 36 minutes,55,36.3,Ontario,Canada,23354,SE,44.5833,-75.6833,2011
13,Prescott,https://www.distantias.com/towns-radius-presco...,27.63,44.466064,https://www.distantias.com/distance-from-smith...,33.57,0 hour(s) and 37 minutes,55,36.6,Ontario,Canada,4284,SE,44.7167,-75.5167,2011
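For reference, the five filters applied step by step above can be combined into a single boolean mask. The function name and its default thresholds are hypothetical, and the sample rows are copied from the tables above (with a leading space added to one `Region` value to mimic the stray whitespace).

```python
import pandas as pd


def apply_minimum_criteria(df, max_km=70, min_pop=2500, max_pop=50000):
    """Return the rows of df that satisfy criteria 1 to 5 in one pass."""
    region = df["Region"].str.strip()
    country = df["Country"].str.strip()
    population = pd.to_numeric(df["Population"], errors="coerce")
    mask = (
        (df["Distance KM"] < max_km)
        & (region != "Quebec")
        & (country != "United States")
        & (population > min_pop)  # NaN compares False, so 'not available' drops out
        & (population < max_pop)
    )
    return df[mask].assign(Population=population[mask].astype("int64"))


sample = pd.DataFrame({
    "Town Name": ["Perth", "Eloida", "Old Chelsea", "Madrid", "Gloucester", "Westport"],
    "Distance KM": [17.171658, 24.413688, 68.525697, 67.093385, 66.03122, 38.72072],
    "Region": ["Ontario", "Ontario", " Quebec", "New York", "Ontario", "Ontario"],
    "Country": ["Canada", "Canada", "Canada", "United States", "Canada", "Canada"],
    "Population": ["6211", "not available", "not available", "1651", "133280", "590"],
})
shortlist = apply_minimum_criteria(sample)
print(shortlist["Town Name"].tolist())
```

Keeping the thresholds as parameters makes it easy to re-run the shortlist with, say, a different population range.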


### Drop unrequired columns, then reset the index

In [144]:
# reset_index returns a new dataframe, so assign the result back to df_shortlist
df_shortlist=df6[['Town Name','Population','Distance KM','Latitude','Longitude']].reset_index(drop=True)
df_shortlist

Unnamed: 0,Town Name,Population,Distance KM,Latitude,Longitude
0,Smiths Falls,9403,0.0,44.9,-76.0167
1,Merrickville,3067,14.564527,44.9167,-75.8333
2,Perth,6211,17.171658,44.8833,-76.2333
3,Carleton Place,10013,27.503621,45.1333,-76.1333
4,Kemptville,3532,32.814443,45.0167,-75.6333
5,Richmond,3797,34.632997,45.1833,-75.8333
6,Almonte,4752,38.028704,45.2167,-76.2
7,Stittsville,41350,39.702418,45.25,-75.9167
8,Brockville,23354,44.015449,44.5833,-75.6833
9,Prescott,4284,44.466064,44.7167,-75.5167


This represents our final processed and cleaned dataframe, ready for evaluation using Foursquare.

## Import additional required libraries

In [145]:
import numpy as np # library to handle data in a vectorized manner

import json # library to handle JSON files

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, available since pandas 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

print('Libraries imported.')

Libraries imported.


### Create a map centered on Smiths Falls, with all towns that meet the minimum criteria superimposed on top

Obtain longitudes and latitudes from top row in dataframe

In [164]:
latitude = df_shortlist.iloc[0,3] 
longitude = df_shortlist.iloc[0,4] 

Create map with suitable zoom level

In [173]:
map_smithsFalls = folium.Map(location=[latitude, longitude], zoom_start=8)

Add markers to map

In [174]:
for lat, lng, town in zip(df_shortlist['Latitude'], df_shortlist['Longitude'], df_shortlist['Town Name']):
    label = '{}'.format(town)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_smithsFalls)  
    
map_smithsFalls

### Define my Foursquare Credentials and Version

In [175]:
CLIENT_ID = 'CCKOQHSERN1KJLQR44C4RZBZ4NH4QO4ELA3EU4FBLIMKBEGL'
CLIENT_SECRET = 'E0FLMAGE44ZKOJ44VXSHPX2J4NZDCPX34NJRKQIFDKEEJ2HO' 
VERSION = '20180605'
LIMIT = 100 
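Credentials written into a notebook are visible to anyone who can read it. A common alternative, sketched below, is to read them from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper function are assumptions made for this illustration.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read the client id and secret from environment variables.

    The variable names are hypothetical; fail early if either is missing.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Environment variable {missing} is not set") from None


# Demonstration with a stand-in mapping instead of the real environment
demo_env = {"FOURSQUARE_CLIENT_ID": "my-id", "FOURSQUARE_CLIENT_SECRET": "my-secret"}
demo_id, demo_secret = load_foursquare_credentials(demo_env)
print(demo_id)
```

In the notebook itself, `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` would then take the place of the hard-coded strings.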

### Create a function to iteratively explore the Towns

In [194]:
def getNearbyVenues(names, latitudes, longitudes, radius=3000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Town', 
                  'Town Latitude', 
                  'Town Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
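The function above indexes straight into the response, so a failed request or a venue without a category would raise an exception part-way through the loop. The sketch below shows a more defensive way to flatten one response. It assumes the same nesting that `getNearbyVenues` relies on; the helper name and the sample payload are invented for illustration.

```python
def extract_venue_rows(payload, town, lat, lng):
    """Flatten one explore-style response into row tuples for a town.

    Assumes the nesting used above:
    payload['response']['groups'][0]['items'][i]['venue'].
    """
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        # Use None instead of failing when a venue has no category
        category = categories[0].get("name") if categories else None
        rows.append((town, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"), category))
    return rows


# Made-up payload: one complete venue and one with no category
fake_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Example Bank", "location": {"lat": 44.9, "lng": -76.0},
               "categories": [{"name": "Bank"}]}},
    {"venue": {"name": "Unlabelled", "location": {"lat": 44.8, "lng": -76.1},
               "categories": []}},
]}]}}
rows = extract_venue_rows(fake_payload, "Smiths Falls", 44.9, -76.0167)
empty = extract_venue_rows({"meta": {"code": 400}}, "Nowhere", 0.0, 0.0)
print(len(rows), len(empty))
```

A town that returns no usable data then contributes an empty list rather than stopping the whole run.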

Run the above function on each town and create a new dataframe called shortlist_venues

In [195]:
shortlist_venues = getNearbyVenues(names=df_shortlist['Town Name'],
                                   latitudes=df_shortlist['Latitude'],
                                   longitudes=df_shortlist['Longitude']
                                  )

Smiths Falls
Merrickville
Perth
Carleton Place
Kemptville
Richmond
Almonte
Stittsville
Brockville
Prescott
Manotick
Mississippi Mills
Greely
Britannia
Rockport
Gananoque
Russell
Arnprior
Morrisburg
Braeside


Examine the size and print out a sample of the resulting dataframe

In [203]:
print(shortlist_venues.shape)
shortlist_venues

(412, 7)


Unnamed: 0,Town,Town Latitude,Town Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Smiths Falls,44.9000,-76.0167,Coffee Culture Cafe & Eatery,44.901396,-76.021224,Café
1,Smiths Falls,44.9000,-76.0167,TD Canada Trust,44.900042,-76.020867,Bank
2,Smiths Falls,44.9000,-76.0167,Pizza Hut,44.891743,-76.030096,Pizza Place
3,Smiths Falls,44.9000,-76.0167,Andress' Your Independent Grocer,44.892481,-76.026833,Grocery Store
4,Smiths Falls,44.9000,-76.0167,The Beer Store,44.893694,-76.027683,Beer Store
...,...,...,...,...,...,...,...
407,Braeside,45.4667,-76.4000,Dan Leblanc Bulldozing & Septic Systems,45.465448,-76.406106,Construction & Landscaping
408,Braeside,45.4667,-76.4000,Robyn Lamont Massage Therapy,45.467556,-76.408664,Massage Studio
409,Braeside,45.4667,-76.4000,My Country Home,45.471506,-76.376277,Gift Shop
410,Braeside,45.4667,-76.4000,Darlene's Place in the Prior,45.449172,-76.375622,Gift Shop


Examine how many venues were returned for each town

In [197]:
shortlist_venues.groupby('Town').count()

Unnamed: 0_level_0,Town Latitude,Town Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Town,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Almonte,6,6,6,6,6,6
Arnprior,24,24,24,24,24,24
Braeside,5,5,5,5,5,5
Britannia,100,100,100,100,100,100
Brockville,44,44,44,44,44,44
Carleton Place,30,30,30,30,30,30
Gananoque,21,21,21,21,21,21
Greely,4,4,4,4,4,4
Kemptville,24,24,24,24,24,24
Manotick,19,19,19,19,19,19
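The counts vary widely between towns, and Britannia returns exactly 100 venues, which equals `LIMIT`, so its true count may be higher. A quick way to flag such cases is sketched below, using counts copied from the table above; the threshold of 10 for a 'sparse' town is an arbitrary choice for the illustration.

```python
import pandas as pd

LIMIT = 100  # same cap as defined with the credentials

# Venue counts copied from the table above
venue_counts = pd.Series(
    {"Almonte": 6, "Arnprior": 24, "Braeside": 5, "Britannia": 100, "Brockville": 44},
    name="Venue",
)

# Towns whose count equals the cap may have had their results truncated
capped = venue_counts[venue_counts >= LIMIT].index.tolist()

# Towns with very few venues give little to compare on (threshold is arbitrary)
sparse = venue_counts[venue_counts < 10].index.tolist()
print(capped, sparse)
```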


Check how many unique categories can be curated from all the returned venues

In [192]:
print('There are {} unique categories.'.format(len(shortlist_venues['Venue Category'].unique())))

There are 82 unique categories.
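With 82 categories, some grouping is needed before the venues can be matched to the ranking attributes listed in the introduction. The sketch below counts venues per town under a few broad attribute groups using keyword matching. The group names and keywords are guesses made for illustration, not a Foursquare taxonomy, and the sample rows reuse categories shown above.

```python
from collections import Counter

# Hypothetical keyword groups; the keywords are guesses, not a Foursquare taxonomy
ATTRIBUTE_KEYWORDS = {
    "food_and_drink": ("restaurant", "pizza", "pub", "bar"),
    "fitness": ("gym", "fitness"),
    "outdoors": ("park", "trail"),
    "shopping": ("store", "shop"),
}


def attribute_for(category):
    """Map a venue category to an attribute group by whole-word match."""
    words = set(category.lower().replace("/", " ").split())
    for attribute, keywords in ATTRIBUTE_KEYWORDS.items():
        if words & set(keywords):
            return attribute
    return "other"


def count_attributes(rows):
    """rows: iterable of (town, category) pairs -> {town: Counter of groups}."""
    counts = {}
    for town, category in rows:
        counts.setdefault(town, Counter())[attribute_for(category)] += 1
    return counts


sample_rows = [
    ("Smiths Falls", "Pizza Place"),
    ("Smiths Falls", "Grocery Store"),
    ("Smiths Falls", "Bank"),
    ("Braeside", "Gift Shop"),
    ("Braeside", "Gift Shop"),
]
attribute_counts = count_attributes(sample_rows)
print(attribute_counts)
```

Matching on whole words rather than substrings avoids, for example, counting a barbershop as a bar.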
