<h1><b>Machine Learning Based Clustering and Segmentation for Navigation</b></h1>

<h3><b>Introduction</b></h3>
    <p>
    This project develops an ML-based navigation algorithm that weighs several neighbourhood-level factors, such as crime rate and population density, to recommend the most efficient route to a desired destination.
    </p>
<h3><b>Project Contribution</b></h3>
    <p>
    This project looks for correlations among crime rates, population information, and income sources. This Jupyter notebook focuses on the following correlations:
        <ul>
            <li>Correlation between hospitals and homicide rates</li>
            <li>Correlation between police stations and assault rates</li>
            <li>Correlation between car dealerships and auto-theft rates</li>
        </ul>
    </p>
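<p>
As a minimal sketch of how one of these correlations could be measured, the block below computes a Pearson correlation coefficient in plain Python. The function name <code>pearson</code> and the per-neighbourhood counts are hypothetical and made up for illustration; they are not results from this project.
</p>

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance, up to a constant factor)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-neighbourhood counts, made up purely for illustration
police_stations = [1, 2, 3, 4, 5]
assault_counts = [12, 9, 9, 5, 3]
r = pearson(police_stations, assault_counts)
print(f"Pearson r = {r:.3f}")
```

<p>
A coefficient near -1 or 1 indicates a strong linear relationship and a value near 0 indicates little or none. With the data held in pandas, <code>Series.corr</code> computes the same statistic.
</p>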
<h3><b>Prerequisites</b></h3>
<ul>
    <li>Foursquare API</li>
</ul>
<h3><b>Datasets Used</b></h3>

<h3><b>Import Statements</b></h3>

In [4]:
from dotenv import load_dotenv
from dotenv import dotenv_values
import folium
import requests
import pandas as pd 
from pandas import json_normalize
from bs4 import BeautifulSoup as bs
import os

<h3><b>Foursquare API Initialization / Check</b></h3>
<h4><b>Category Codes:</b></h4>
<ul>
    <li>10000 - Arts and Entertainment</li>
    <li>11000 - Business and Professional Services</li>
    <li>12000 - Community and Government</li>
    <li>13000 - Dining and Drinking</li>
    <li>14000 - Event</li>
    <li>15000 - Health and Medicine</li>
    <li>16000 - Landmarks and Outdoors</li>
    <li>17000 - Retail</li>
    <li>18000 - Sports and Recreation</li>
    <li>19000 - Travel and Transportation</li>
</ul>
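
<p>
One possible way to keep these codes available in code is a small lookup table, sketched below. The names <code>CATEGORY_CODES</code> and <code>category_name</code> are hypothetical and are not used elsewhere in this notebook.
</p>

```python
# Top-level Foursquare category codes, copied from the list above
CATEGORY_CODES = {
    "10000": "Arts and Entertainment",
    "11000": "Business and Professional Services",
    "12000": "Community and Government",
    "13000": "Dining and Drinking",
    "14000": "Event",
    "15000": "Health and Medicine",
    "16000": "Landmarks and Outdoors",
    "17000": "Retail",
    "18000": "Sports and Recreation",
    "19000": "Travel and Transportation",
}

def category_name(code):
    """Return the category name for a code, or None if it is not in the list."""
    return CATEGORY_CODES.get(str(code))

print(category_name("12000"))
```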

In [4]:
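# Load the Foursquare API key from a local .env file
config = dotenv_values(".env")
url = "https://api.foursquare.com/v3/places/nearby"

# Headers sent with every Foursquare request
headers = {"Accept": "application/json",
           "Authorization": config["API_KEY"]}

# Initial request to check that the API key is accepted
response = requests.get(url, headers=headers)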

def create_request(coords=None, location=None, categories=None, limit="10"):
    """
        Query the Foursquare Places search endpoint.

        Important:
            - coords and location cannot be entered together
            - location and radius cannot be entered together

        coords is a list of [latitude, longitude].
        location is a list of [city, province], such as ["Oshawa", "ON"].
        categories is a string from the above codes, with a default of None.
        limit is a string giving the number of results, with a maximum of 50 and a default of "10".

        Returns the parsed JSON response, or False if no coords or location are given or the request fails.

        Examples:
            - create_request(coords=[43.895962, -72.848752], limit="1")
            - create_request(coords=[43.895962, -72.848752], categories="10000", limit="2")
            - create_request(location=["Oshawa", "ON"], limit="2")
            - create_request(location=["Oshawa", "ON"], categories="10000", limit="20")
    """

    # Build the query string; coords take precedence over location if both are given
    base = "https://api.foursquare.com/v3/places/search?"
    if coords:
        query = "ll=" + str(coords[0]) + "%2C" + str(coords[1]) + "&radius=100000"
    elif location:
        query = "near=" + str(location[0]) + "%2C" + str(location[1])
    else:
        return False

    # The category filter is optional
    if categories:
        query += "&categories=" + str(categories)
    url = base + query + "&limit=" + str(limit)

    response = requests.get(url, headers=headers)

    # Return the parsed JSON body on success, False otherwise
    if response.status_code == 200:
        return response.json()
    return False
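
<p>
The function above assembles the query string by hand. As an alternative sketch, the standard library's <code>urllib.parse.urlencode</code> can do the percent-encoding, which also handles place names containing spaces. The helper name <code>build_search_url</code> and its <code>radius</code> parameter are hypothetical; the endpoint and parameter names are the ones already used above.
</p>

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.foursquare.com/v3/places/search"

def build_search_url(coords=None, location=None, categories=None,
                     limit=10, radius=100000):
    """Hypothetical helper: build the search URL without sending a request."""
    if coords and location:
        raise ValueError("coords and location cannot be entered together")
    if coords:
        # coords is [latitude, longitude]; radius only applies to coordinate searches
        params = {"ll": f"{coords[0]},{coords[1]}", "radius": radius}
    elif location:
        # location is [city, province], e.g. ["Oshawa", "ON"]
        params = {"near": ",".join(location)}
    else:
        raise ValueError("either coords or location is required")
    if categories:
        params["categories"] = categories
    params["limit"] = limit
    # urlencode percent-encodes the comma and any spaces in the values
    return SEARCH_URL + "?" + urlencode(params)

print(build_search_url(location=["Toronto", "ON"], categories="12000", limit=50))
```

<p>
Raising an error for conflicting arguments, rather than returning <code>False</code>, is a design choice of this sketch and differs from <code>create_request</code>.
</p>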

<h3><b>Creating Venue DataFrame</b></h3>

In [5]:
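# Coordinates of Toronto, used later to centre the map
latitude = 43.6532
longitude = -79.3832

# Request up to 50 Community and Government venues (category 12000) in Toronto
results = create_request(location=["Toronto", "ON"], categories="12000", limit="50")

# Flatten the nested JSON results from the Foursquare API into a DataFrame
venues = json_normalize(results['results'], max_level=3)

# Drop unused columns by position, keeping the name, coordinates, locality,
# neighbourhood, postcode and parent place
venues.drop(venues.columns[[0, 1, 2, 3, 5, 8, 9, 10, 11, 13, 12, 17, 18, 19, 20, 21]], axis=1, inplace=True)

# Display the venue DataFrame
venues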


Unnamed: 0,name,geocodes.main.latitude,geocodes.main.longitude,location.locality,location.neighborhood,location.postcode,related_places.parent.name
0,Evergreen Brick Works,43.684421,-79.365094,Toronto,[East York],M4W 3X8,
1,Toronto Public Library - Toronto Reference Lib...,43.671795,-79.386944,Toronto,,M4W 2G8,
2,Toronto Public Library,43.708118,-79.399986,Toronto,[Yonge and Eglinton],M4R 1B9,
3,Trinity-St Paul's United Church,43.666094,-79.405674,Toronto,[Midtown],M5S 1X7,
4,Mount Pleasant Cemetery,43.69654,-79.38307,Toronto,[East York],M4T 2V8,
5,North York Public Library,43.768473,-79.412959,Toronto,[Willowdale],M2N 5N9,North York City Centre
6,Toronto St Lawrence Comm Ctr,43.649667,-79.36514,Toronto,[Saint Lawrence],M5A 4J6,
7,Edithvale Community Centre,43.776989,-79.426464,Toronto,,M2N 2H8,
8,Eatonville Public Library,43.646159,-79.559471,Etobicoke,[Eatonville],M9B 2B1,
9,Don Montgomery Community Centre,43.73262,-79.261769,Toronto,,M1K 2R1,
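
<p>
Dropping columns by position depends on the column order the API happens to return. A possible alternative, sketched below, picks the displayed fields by name from each entry of <code>results['results']</code>. The helper name <code>flatten_venue</code> and the shortened output keys are hypothetical; the nested field names are inferred from the column headers in the table above.
</p>

```python
def flatten_venue(place):
    """Pick the displayed fields out of one Foursquare result entry."""
    main = place.get("geocodes", {}).get("main", {})
    location = place.get("location", {})
    parent = place.get("related_places", {}).get("parent", {})
    # Missing fields come back as None instead of raising KeyError
    return {
        "name": place.get("name"),
        "latitude": main.get("latitude"),
        "longitude": main.get("longitude"),
        "locality": location.get("locality"),
        "neighborhood": location.get("neighborhood"),
        "postcode": location.get("postcode"),
        "parent_name": parent.get("name"),
    }

# Sample record shaped like the first row of the table above
sample = {
    "name": "Evergreen Brick Works",
    "geocodes": {"main": {"latitude": 43.684421, "longitude": -79.365094}},
    "location": {"locality": "Toronto", "neighborhood": ["East York"],
                 "postcode": "M4W 3X8"},
}
row = flatten_venue(sample)
print(row)
```

<p>
Passing a list of such rows to <code>pd.DataFrame</code> would give a table with the same information as the one above.
</p>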


<h3><b>Scraping the Wikipedia Page for Postal Codes</b></h3>

In [16]:
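# Load the neighbourhood postal code data from the local datasets folder
path = os.path.join(os.getcwd(), "datasets", "neighborhoodData.csv")
postcodes = pd.read_csv(path)

# Drop the first column by position
postcodes.drop(postcodes.columns[[0]], axis=1, inplace=True)
postcodes.head()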



Unnamed: 0,Postcode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M6A,North York,Lawrence Heights
4,M6A,North York,Lawrence Manor
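
<p>
The notebook does not state how the venue table and the postcode table are linked. One plausible approach, shown here only as an assumption, is to match the first three characters of a venue's postal code against the <code>Postcode</code> column. The function name <code>forward_sortation_area</code> is hypothetical; the sample rows are copied from the two tables in this notebook.
</p>

```python
def forward_sortation_area(postcode):
    """Return the first three characters of a postal code, e.g. 'M4W 3X8' -> 'M4W'."""
    return postcode.strip().upper()[:3]

# Rows copied from the postcode table above: (Postcode, Neighbourhood)
postcode_rows = [
    ("M3A", "Parkwoods"),
    ("M4A", "Victoria Village"),
    ("M5A", "Harbourfront"),
    ("M6A", "Lawrence Heights"),
    ("M6A", "Lawrence Manor"),
]

# A postcode can cover several neighbourhoods, so collect them in lists
neighbourhoods = {}
for code, name in postcode_rows:
    neighbourhoods.setdefault(code, []).append(name)

# Venue copied from the venue table earlier in the notebook
venue = {"name": "Toronto St Lawrence Comm Ctr", "postcode": "M5A 4J6"}
match = neighbourhoods.get(forward_sortation_area(venue["postcode"]), [])
print(venue["name"], "->", match)
```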


<h3><b>Scraping for Homicide Rates and Crime Rates</b></h3>

In [19]:
# Load the neighbourhood crime rate data from the local datasets folder
path2 = os.path.join(os.getcwd(), "datasets", "Neighbourhood_Crime_Rates.csv")
crimedata = pd.read_csv(path2)

# Drop the first column by position
crimedata.drop(crimedata.columns[[0]], axis=1, inplace=True)
crimedata.head()

KeyError: "['Postcode'] not found in axis"

<h3><b>Cluster and Map Creation</b></h3>


In [None]:
map_creation = folium.Map(location=[latitude, longitude], zoom_start=10)
map_creation