# Segmenting and Clustering Neighborhoods in Toronto

<h3>Introduction</h3>
<p>
This is an assignment for the Introduction to Artificial Intelligence course (SOFE 3720U). In it, we explore how to segment and cluster the neighborhoods in Toronto.
</p>

<h3>Import Statements</h3>

In [1]:
from dotenv import load_dotenv
from dotenv import dotenv_values

import numpy as np

import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json
import geojson

import requests
from pandas import json_normalize

import folium

from bs4 import BeautifulSoup as bs

print('Libraries Imported')

Libraries Imported


<h3>Week 1 - Foursquare API</h3>
<p>
In this section we use the Foursquare API to find venues in the Toronto area, along with their latitude and longitude.
</p>

<h4>
Setting up Foursquare API
</h4>

In [2]:
#Load the values stored in the .env file; they hold the key used to access the API
config = dotenv_values(".env")
#Base URL of the Foursquare Places API "nearby" endpoint
url = "https://api.foursquare.com/v3/places/nearby"
#Headers sent with each request: accept JSON and pass the API key for authorization
headers = {"Accept": "application/json","Authorization": config["API_KEY"]}
#Send an initial GET request to the endpoint with these headers
response = requests.request("GET", url, headers=headers)

#Define findNearbyVenues, which searches for venues of the given categories near a location
def findNearbyVenues(location, categories, limit):
    #Build the search URL from the arguments; the two parts of location are joined by an encoded comma (%2C)
    url = "https://api.foursquare.com/v3/places/search?" + "categories=" + str(categories) + "&near=" + str(location[0]) + "%2C" + str(location[1]) + "&limit=" + str(limit)
    #Send the GET request to the API
    response = requests.request("GET", url, headers=headers)
    #Return the parsed JSON if the request succeeded
    if response.status_code == 200:
        return response.json()
    #Return False if the request failed
    else:
        return False
    
print('API initialize and custom function created')

API initialize and custom function created
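
`findNearbyVenues` assembles its search URL by string concatenation, which works as long as every argument is already a string and needs no escaping. As a minimal sketch of an alternative, the standard library can build and escape the query string. `build_search_url` is a hypothetical helper introduced here for illustration; it is not part of the notebook or of any Foursquare library, and it only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

# Endpoint taken from the notebook's findNearbyVenues function.
SEARCH_ENDPOINT = "https://api.foursquare.com/v3/places/search"

def build_search_url(location, categories, limit):
    """Hypothetical helper: build the search URL used by findNearbyVenues."""
    query = {
        "categories": categories,
        # Join the location parts with a comma; urlencode escapes it as %2C.
        "near": ",".join(str(part) for part in location),
        "limit": limit,  # may be an int or a string
    }
    return SEARCH_ENDPOINT + "?" + urlencode(query)

search_url = build_search_url(["Toronto", "ON"], "17000", 50)
print(search_url)
```

Because `urlencode` converts each value to text and escapes reserved characters, `limit` can be passed as a number and the location parts can contain spaces.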


<h4>
Creating a dataframe from the Foursquare API results
</h4>

In [3]:
#Assign lat and long for Toronto
latitude = 43.6532 
longitude = -79.3832
#Call the custom function to search for venues near Toronto, ON with the parameters below
results = findNearbyVenues(location = ["Toronto", "ON"], categories="17000", limit="50")
#Flatten the nested JSON results into a dataframe so they can be manipulated
venuesdf = json_normalize(results['results'], max_level=3)
#Drop unnecessary columns
venuesdf.drop(venuesdf.columns[[0,1,2,3,5,8,9,10,11,12,13,17,18,19,20,22]], axis=1, inplace=True)
#Display first five rows
venuesdf.head()

Unnamed: 0,name,geocodes.main.latitude,geocodes.main.longitude,location.locality,location.neighborhood,location.postcode,related_places.children
0,Fiesta Farms,43.668877,-79.420664,Toronto,[Christie Pitts],M6G 3B6,
1,Nordstrom,43.726102,-79.449217,Toronto,,M6A 2T9,"[{'fsq_id': '5810feea38faaa3930f3aa05', 'name'..."
2,Yorkdale,43.725902,-79.453193,Toronto,[Downsview],M6A 2T9,"[{'fsq_id': '4bca314a937ca593345da792', 'name'..."
3,Uniqlo ユニクロ,43.726446,-79.450564,Toronto,"[Lawrence Heights, Toronto, ON]",M6A 2T9,
4,Loblaws,43.661965,-79.379811,Toronto,,M5B 1J2,"[{'fsq_id': '4ed6319130f813e77a9f8a3b', 'name'..."
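
`json_normalize` turns nested keys into dotted column names such as `geocodes.main.latitude`. As a rough illustration of what that flattening does, the sketch below walks one nested record by hand. The `flatten` helper and the structure of `sample_venue` are assumptions made for illustration (the values are copied from the first row above); an actual API response contains many more fields.

```python
def flatten(record, parent_key="", sep="."):
    """Hypothetical helper: flatten nested dicts into dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the dotted prefix along.
            flat.update(flatten(value, full_key, sep))
        else:
            # Lists and scalars are kept as-is.
            flat[full_key] = value
    return flat

# Assumed shape of one venue record, with values from the first row above.
sample_venue = {
    "name": "Fiesta Farms",
    "geocodes": {"main": {"latitude": 43.668877, "longitude": -79.420664}},
    "location": {
        "locality": "Toronto",
        "neighborhood": ["Christie Pitts"],
        "postcode": "M6G 3B6",
    },
}

flat_venue = flatten(sample_venue)
print(sorted(flat_venue))
```

Lists such as `location.neighborhood` are left intact, which matches the bracketed values shown in that column.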


In [4]:
#Create the map which is based on the coordinates for Toronto
map_creation = folium.Map(location=[latitude, longitude], zoom_start=10)
#Display Map
map_creation

<h3>Week 2 - Prepare your data</h3>
<p>
In this section we use the provided sources to build a large dataframe that contains the necessary information for the chosen correlations.
</p>

In [5]:
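from io import StringIO

#Pinned revision of the Wikipedia page listing Toronto postal codes (prefix M)
url = 'https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M&oldid=945633050'
#Download the page and parse the HTML
temp = requests.get(url)
data = temp.text
soup = bs(data,'html.parser')
#Take the first table on the page and read it into a dataframe
wiki = soup.find('table')
#Wrap the HTML in StringIO, since newer pandas versions deprecate passing literal HTML strings
df = pd.read_html(StringIO(str(wiki)))[0]
#Display first five rows
df.head()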

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


In [6]:
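#Remove rows whose borough is not assigned
df.drop(df[df['Borough'] == 'Not assigned'].index, inplace=True)
#Renumber the index from 0 after dropping rows
df.index = range(len(df))
#Display first five rows
df.head()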

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M6A,North York,Lawrence Heights
4,M6A,North York,Lawrence Manor


In [7]:
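#Load the latitude and longitude of each postal code
dfPostalCodes = pd.read_csv('Geospatial_Coordinates.csv')
#Rename the key column so it matches the Wikipedia dataframe
dfPostalCodes.rename(columns={'Postal Code':'Postcode'}, inplace=True)
#Join the coordinates onto the neighbourhood table by postcode
dfMerge = pd.merge(df, dfPostalCodes, on='Postcode')
#Display first five rows
dfMerge.head()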

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636
3,M6A,North York,Lawrence Heights,43.718518,-79.464763
4,M6A,North York,Lawrence Manor,43.718518,-79.464763
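
Several neighbourhoods can share one postcode (M6A appears twice above), so a single coordinate row may match more than one neighbourhood row. A sketch of making that expectation explicit uses the `validate` argument of `pd.merge`, which raises a `MergeError` if the keys are not unique on the side where they are expected to be. The two small frames are stand-ins built from rows shown above, not the full datasets.

```python
import pandas as pd

# Stand-in for the Wikipedia table: postcodes may repeat.
neighbourhoods = pd.DataFrame({
    "Postcode": ["M6A", "M6A", "M5A"],
    "Neighbourhood": ["Lawrence Heights", "Lawrence Manor", "Harbourfront"],
})
# Stand-in for the coordinates file: one row per postcode.
coordinates = pd.DataFrame({
    "Postcode": ["M5A", "M6A"],
    "Latitude": [43.65426, 43.718518],
    "Longitude": [-79.360636, -79.464763],
})

# many_to_one: postcodes may repeat on the left but must be unique on the right.
merged = pd.merge(neighbourhoods, coordinates, on="Postcode", validate="many_to_one")
print(merged)
```

Both M6A neighbourhoods receive the same coordinates, which is why Lawrence Heights and Lawrence Manor share a latitude and longitude in the output above.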


In [60]:
#Load the crime-rate GeoJSON file
with open('Crime_Rates.geojson') as f:
    data = geojson.load(f)
#Flatten each feature's properties into dataframe columns
dfCrime=pd.json_normalize(data["features"])
#Drop columns by position in several passes (positions shift after each drop),
#leaving the neighbourhood name, the population, and the average for each crime type
dfCrime.drop(dfCrime.columns[[0,1,2,3,5,7,8,9,10,11,12,14]], axis=1, inplace=True)
dfCrime.drop(dfCrime.columns[[3,4,5,6,7,8,9,11]], axis=1, inplace=True)
dfCrime.drop(dfCrime.columns[[4,5,6,7,8,9,10,12]], axis=1, inplace=True)
dfCrime.drop(dfCrime.columns[[5,6,8,9,10,11,13]], axis=1, inplace=True)
dfCrime.drop(dfCrime.columns[[5,7]], axis=1, inplace=True)
dfCrime.drop(dfCrime.columns[[6,7,8,9,10,11,13,14]], axis=1, inplace=True)
dfCrime.drop(dfCrime.columns[[7,8,9,10,11,12,14]], axis=1, inplace=True)
dfCrime.drop(dfCrime.columns[[8,9,10]], axis=1, inplace=True)
#Rename the key column so it matches dfMerge
dfCrime.rename(columns={'properties.Neighbourhood':'Neighbourhood'}, inplace=True)
#Display first five rows
dfCrime.head()

Unnamed: 0,Neighbourhood,properties.Population,properties.Assault_AVG,properties.AutoTheft_AVG,properties.BreakandEnter_AVG,properties.Homicide_AVG,properties.Robbery_AVG,properties.TheftOver_AVG
0,Yonge-St.Clair,12528,31.0,4.3,23.3,0.0,5.7,4.3
1,York University Heights,27593,333.2,106.3,113.2,0.8,75.8,36.3
2,Lansing-Westgate,16164,70.7,23.7,38.8,1.7,14.7,7.0
3,Yorkdale-Glen Park,14804,160.2,55.5,63.3,1.2,31.5,22.5
4,Stonegate-Queensway,25051,83.2,28.7,52.8,0.0,20.7,6.0
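
The chain of positional `drop` calls depends on the exact column order of the GeoJSON file. As a sketch of one alternative, assuming the goal is to keep the neighbourhood name, the population, and every `_AVG` column (as the output above suggests), the columns can be selected by name instead. The small frame below is a stand-in with a handful of columns; its values are copied from outputs in this notebook.

```python
import pandas as pd

# Stand-in for the normalized GeoJSON properties (only a few columns shown).
sample = pd.DataFrame({
    "properties.Neighbourhood": ["Victoria Village", "Rouge"],
    "properties.Hood_ID": [43, 131],
    "properties.Population": [17510, 46496],
    "properties.Assault_2014": [118, 177],
    "properties.Assault_AVG": [119.3, 173.3],
    "properties.AutoTheft_AVG": [16.5, 50.5],
})

# Keep the name, the population, and every column ending in "_AVG".
keep = ["properties.Neighbourhood", "properties.Population"]
keep += [col for col in sample.columns if col.endswith("_AVG")]

crime_avg = sample[keep].rename(columns={"properties.Neighbourhood": "Neighbourhood"})
print(crime_avg.columns.tolist())
```

Selecting by name keeps working if the source file gains or reorders columns, whereas positional indices would need to be recomputed.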


In [9]:
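#Join the crime statistics onto the neighbourhood table by neighbourhood name
dfAll = pd.merge(dfMerge, dfCrime, on='Neighbourhood')
#Display first five rows
dfAll.head()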

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,properties.Hood_ID,properties.Population,properties.Assault_2014,properties.Assault_2015,properties.Assault_2016,properties.Assault_2017,properties.Assault_2018,properties.Assault_2019,properties.Assault_AVG,properties.Assault_CHG,properties.Assault_Rate_2019,properties.AutoTheft_2014,properties.AutoTheft_2015,properties.AutoTheft_2016,properties.AutoTheft_2017,properties.AutoTheft_2018,properties.AutoTheft_2019,properties.AutoTheft_AVG,properties.AutoTheft_CHG,properties.AutoTheft_Rate_2019,properties.BreakandEnter_2014,properties.BreakandEnter_2015,properties.BreakandEnter_2016,properties.BreakandEnter_2017,properties.BreakandEnter_2018,properties.BreakandEnter_2019,properties.BreakandEnter_AVG,properties.BreakandEnter_CHG,properties.BreakandEnter_Rate_2019,properties.Homicide_2014,properties.Homicide_2015,properties.Homicide_2016,properties.Homicide_2017,properties.Homicide_2018,properties.Homicide_2019,properties.Homicide_AVG,properties.Homicide_CHG,properties.Homicide_Rate_2019,properties.Robbery_2014,properties.Robbery_2015,properties.Robbery_2016,properties.Robbery_2017,properties.Robbery_2018,properties.Robbery_2019,properties.Robbery_AVG,properties.Robbery_CHG,properties.Robbery_Rate_2019,properties.TheftOver_2014,properties.TheftOver_2015,properties.TheftOver_2016,properties.TheftOver_2017,properties.TheftOver_2018,properties.TheftOver_2019,properties.TheftOver_AVG,properties.TheftOver_CHG,properties.TheftOver_Rate_2019
0,M4A,North York,Victoria Village,43.725882,-79.315572,43,17510,118,138,133,83,112,132,119.3,0.18,753.9,20,14,20,14,13,18,16.5,0.38,102.8,38,46,23,32,35,60,39.0,0.71,342.7,0,0,2,1,0,1,0.7,1.0,5.7,10,18,13,10,14,14,13.2,0.0,80.0,6,6,4,5,4,5,5.0,0.25,28.6
1,M1B,Scarborough,Rouge,43.806686,-79.194353,131,46496,177,167,145,185,184,182,173.3,-0.01,391.4,57,50,21,33,55,87,50.5,0.58,187.1,75,73,75,71,81,59,72.3,-0.27,126.9,1,0,0,4,0,0,0.8,0.0,0.0,24,37,31,42,28,32,32.3,0.14,68.8,10,8,13,11,16,13,11.8,-0.19,28.0
2,M1B,Scarborough,Malvern,43.806686,-79.194353,132,43794,215,280,280,289,272,333,278.2,0.22,760.4,25,31,36,56,64,71,47.2,0.11,162.1,63,43,65,35,64,44,52.3,-0.31,100.5,0,0,4,2,3,1,1.7,-0.67,2.3,40,45,64,64,45,46,50.7,0.02,105.0,10,3,8,7,16,10,9.0,-0.38,22.8
3,M1C,Scarborough,Highland Creek,43.784535,-79.160497,134,12494,49,50,62,64,43,58,54.3,0.35,464.2,25,11,8,14,20,27,17.5,0.35,216.1,28,36,35,29,30,33,31.8,0.1,264.1,0,1,0,0,0,1,0.3,1.0,8.0,16,28,8,15,4,9,13.3,1.25,72.0,3,5,1,1,3,1,2.3,-0.67,8.0
4,M3C,North York,Flemingdon Park,43.7259,-79.340923,44,21933,128,147,153,122,145,152,141.2,0.05,693.0,8,7,4,12,8,8,7.8,0.0,36.5,50,20,14,12,14,16,21.0,0.14,72.9,1,0,0,0,0,0,0.2,0.0,0.0,25,22,24,20,16,13,20.0,-0.19,59.3,9,3,3,4,1,3,3.8,2.0,13.7
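
`pd.merge` performs an inner join by default, so a neighbourhood whose name is missing or spelled differently in the crime data is dropped without warning. For instance, Parkwoods is the first row of `dfMerge` but does not appear in the first rows of the merged output above. A sketch of how one might list the unmatched names uses a left join with `indicator=True`. The two small frames are stand-ins built from rows shown in this notebook, not the full datasets.

```python
import pandas as pd

# Stand-in for dfMerge.
left = pd.DataFrame({
    "Postcode": ["M3A", "M4A"],
    "Neighbourhood": ["Parkwoods", "Victoria Village"],
})
# Stand-in for dfCrime.
right = pd.DataFrame({
    "Neighbourhood": ["Victoria Village", "Rouge"],
    "properties.Population": [17510, 46496],
})

# A left join keeps every row of `left`; the `_merge` column records
# whether each row was found in both tables or only in the left one.
checked = pd.merge(left, right, on="Neighbourhood", how="left", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "Neighbourhood"].tolist()
print(unmatched)
```

The resulting list shows which names would need to be reconciled before the two sources can be joined without losing neighbourhoods.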
