# Business Problem and Goal of the Project
According to the City of Toronto, a Business Improvement Area (BIA) is an "association of commercial property owners and tenants within a defined area who work in partnership with the City to create thriving, competitive, and safe business areas that attract shoppers, diners, tourists, and new businesses". Business owners could benefit from a tool that shows whether their area of interest has the potential to be approved as a BIA before they start the application process. The goal of this project is to leverage venue data to help commercial property owners and tenants define new BIAs based on the location and type of businesses in the city.

# Data Description
For this project, I will use three datasets from the City of Toronto alongside venue data from the Foursquare API. The first dataset contains pedestrian and vehicle volumes at intersections throughout the city. The second contains geographical data for the city's neighbourhood profiles. The third contains geographical data for the city's business improvement areas.

In [1]:
import pandas as pd
import geopandas  # read GeoJSON into GeoDataFrames
import numpy as np

import requests  # handle requests
import json

## Pedestrian Volume in Toronto
The first dataset contains the pedestrian volume at intersections across the city. I will use it to determine the busiest neighbourhoods, which in turn lets me choose an area with high pedestrian volume in which to explore BIAs.

In [2]:
# Get the dataset metadata by passing the package id to the package_show endpoint,
# then read the first resource listed in that metadata
url = "https://ckan0.cf.opendata.inter.prod-toronto.ca/api/3/action/package_show"
payload = {"id": "ae4e10a2-9eaf-4da4-83fb-f3731a30c124"}
response = requests.get(url, params=payload).json()
traffic_df = pd.read_excel(response["result"]['resources'][0]['url'])
print('Number of rows read in:', traffic_df.shape[0], '\n')
traffic_df.head()

Number of rows read in: 2280 



| | TCS # | Main | Midblock Route | Side 1 Route | Side 2 Route | Activation Date | Latitude | Longitude | Count Date | 8 Peak Hr Vehicle Volume | 8 Peak Hr Pedestrian Volume |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | JARVIS ST | | FRONT ST E | | 11/15/1948 | 43.649418 | -79.371446 | 2017-06-21 | 15662 | 13535 |
| 1 | 3 | KING ST E | | JARVIS ST | | 08/23/1950 | 43.650461 | -79.371924 | 2016-09-17 | 12960 | 7333 |
| 2 | 4 | JARVIS ST | | ADELAIDE ST E | | 09/12/1958 | 43.651534 | -79.37236 | 2016-11-08 | 17770 | 7083 |
| 3 | 5 | JARVIS ST | | RICHMOND ST E | | 04/21/1962 | 43.652718 | -79.372824 | 2015-12-08 | 19678 | 4369 |
| 4 | 6 | JARVIS ST | | QUEEN ST E | | 08/24/1928 | 43.653704 | -79.373238 | 2016-09-17 | 14487 | 3368 |
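
The cell above reads the first resource listed in the package metadata. If the order of resources ever changes, picking a resource by its format may be more robust. The following is a minimal sketch of that idea, not part of the original analysis: the helper `pick_resource_url` and the `sample_package` dictionary are hypothetical, and the sketch assumes each resource entry carries `format` and `url` keys, as CKAN resource metadata generally does.

```python
def pick_resource_url(package, fmt):
    """Return the URL of the first resource whose format matches fmt (case-insensitive)."""
    for resource in package["result"]["resources"]:
        if (resource.get("format") or "").lower() == fmt.lower():
            return resource["url"]
    raise LookupError(f"no {fmt} resource in package")


# Hypothetical metadata shaped like a package_show response (not real data)
sample_package = {
    "result": {
        "resources": [
            {"format": "CSV", "url": "https://example.org/volumes.csv"},
            {"format": "XLSX", "url": "https://example.org/volumes.xlsx"},
        ]
    }
}

xlsx_url = pick_resource_url(sample_package, "xlsx")
print(xlsx_url)
```

With the real metadata, `pick_resource_url(response, "xlsx")` could stand in for `response["result"]['resources'][0]['url']`, provided the dataset actually publishes a resource in that format.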


## Neighbourhood Profiles & Foursquare API
This dataset contains the geographical data for neighbourhoods in the City of Toronto. I will use it, together with the pedestrian volume data, to select the area of the city with the highest pedestrian volume. I will then use the __Foursquare API__ to retrieve the venues inside this area. You can see a map of the neighbourhoods [here](https://open.toronto.ca/dataset/neighbourhoods/).
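
To give a sense of what a venue request might look like, here is a minimal sketch that only builds the request URL and does not call the API. It assumes the legacy Foursquare v2 `venues/explore` endpoint with its `client_id`, `client_secret`, `v`, `ll`, `radius` and `limit` parameters. Foursquare has since released newer API versions, so treat the endpoint, the helper `build_explore_url`, the default values and the placeholder credentials as assumptions.

```python
from urllib.parse import urlencode

# Assumption: legacy Foursquare v2 endpoint; newer API versions differ
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      radius=500, limit=100, version="20180605"):
    """Build a venues/explore URL for venues within `radius` metres of a point."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,            # API version date, YYYYMMDD
        "ll": f"{lat},{lng}",    # "latitude,longitude"
        "radius": radius,
        "limit": limit,
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


# Coordinates of the Jarvis St / Front St E intersection from the traffic data
demo_url = build_explore_url(43.649418, -79.371446, "YOUR_ID", "YOUR_SECRET")
print(demo_url)
```

The resulting URL could then be passed to `requests.get(...).json()`, in the same way as the open data request earlier.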

In [3]:
nbh_gdf = geopandas.read_file('https://ckan0.cf.opendata.inter.prod-toronto.ca/download_resource/a083c865-6d60-4d1d-b6c6-b0c8a85f9c15?format=geojson&projection=4326')
print('Number of rows read in:', nbh_gdf.shape[0])
nbh_gdf.head()

Number of rows read in: 140


| | _id | AREA_ID | AREA_ATTR_ID | PARENT_AREA_ID | AREA_SHORT_CODE | AREA_LONG_CODE | AREA_NAME | AREA_DESC | X | Y | LONGITUDE | LATITUDE | OBJECTID | Shape__Area | Shape__Length | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6301 | 25886861 | 25926662 | 49885 | 94 | 94 | Wychwood (94) | Wychwood (94) | | | -79.425515 | 43.676919 | 16491505 | 3217960.0 | 7515.779658 | POLYGON ((-79.43592 43.68015, -79.43492 43.680... |
| 1 | 6302 | 25886820 | 25926663 | 49885 | 100 | 100 | Yonge-Eglinton (100) | Yonge-Eglinton (100) | | | -79.40359 | 43.704689 | 16491521 | 3160334.0 | 7872.021074 | POLYGON ((-79.41096 43.70408, -79.40962 43.704... |
| 2 | 6303 | 25886834 | 25926664 | 49885 | 97 | 97 | Yonge-St.Clair (97) | Yonge-St.Clair (97) | | | -79.397871 | 43.687859 | 16491537 | 2222464.0 | 8130.411276 | POLYGON ((-79.39119 43.68108, -79.39141 43.680... |
| 3 | 6304 | 25886593 | 25926665 | 49885 | 27 | 27 | York University Heights (27) | York University Heights (27) | | | -79.488883 | 43.765736 | 16491553 | 25418210.0 | 25632.335242 | POLYGON ((-79.50529 43.75987, -79.50488 43.759... |
| 4 | 6305 | 25886688 | 25926666 | 49885 | 31 | 31 | Yorkdale-Glen Park (31) | Yorkdale-Glen Park (31) | | | -79.457108 | 43.714672 | 16491569 | 11566690.0 | 13953.408098 | POLYGON ((-79.43969 43.70561, -79.44011 43.705... |
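
One way to combine the pedestrian volume data with the neighbourhood polygons is a spatial join that assigns each intersection to the polygon containing it and then sums the volumes per neighbourhood. The sketch below illustrates this on small made-up data. The helper `pedestrian_volume_by_area` is hypothetical, and both the choice of a "within" join and the use of a plain sum are assumptions rather than the method this project necessarily follows.

```python
import geopandas
import pandas as pd
from shapely.geometry import Polygon


def pedestrian_volume_by_area(points_df, areas_gdf, area_col="AREA_NAME",
                              volume_col="8 Peak Hr Pedestrian Volume"):
    """Sum pedestrian volume of the intersections that fall within each area."""
    # Turn the Latitude/Longitude columns into point geometries (WGS 84)
    points_gdf = geopandas.GeoDataFrame(
        points_df,
        geometry=geopandas.points_from_xy(points_df["Longitude"], points_df["Latitude"]),
        crs="EPSG:4326",
    )
    # Keep only the area name and geometry to avoid column name clashes.
    # The `predicate` keyword requires geopandas 0.10 or later.
    joined = geopandas.sjoin(
        points_gdf, areas_gdf[[area_col, "geometry"]], how="inner", predicate="within"
    )
    return joined.groupby(area_col)[volume_col].sum().sort_values(ascending=False)


# Made-up areas and intersections, for illustration only
demo_areas = geopandas.GeoDataFrame(
    {"AREA_NAME": ["West", "East"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(1, 0), (2, 0), (2, 1), (1, 1)]),
    ],
    crs="EPSG:4326",
)
demo_points = pd.DataFrame({
    "Longitude": [0.5, 1.5, 1.6],
    "Latitude": [0.5, 0.5, 0.4],
    "8 Peak Hr Pedestrian Volume": [100, 250, 50],
})

totals = pedestrian_volume_by_area(demo_points, demo_areas)
print(totals)
```

With the frames loaded above, `pedestrian_volume_by_area(traffic_df, nbh_gdf)` would rank the neighbourhoods by total pedestrian volume, assuming both use the same coordinate reference system.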


## Business Improvement Areas
The last dataset contains the geographical data of the BIAs in the city. I will select the BIAs that fall within the area defined using the two previous datasets. You can see a map of the BIAs [here](https://open.toronto.ca/dataset/business-improvement-areas/).

In [4]:
bia_gdf = geopandas.read_file('https://ckan0.cf.opendata.inter.prod-toronto.ca/download_resource/d173e644-ace0-45e0-be43-8ba02fb116eb?format=geojson&projection=4326')
print('Number of rows read in:', bia_gdf.shape[0])

bia_gdf.drop([
    'AREA_ID', 'DATE_EFFECTIVE', 'AREA_ATTR_ID', 'PARENT_AREA_ID', 'AREA_SHORT_CODE', 
    'AREA_LONG_CODE', 'AREA_DESC', 'X', 'Y', 'OBJECTID', 'Shape__Area', 'Shape__Length'
    ], axis=1, inplace=True
)

bia_gdf.rename(
    columns={'_id': 'Id', 'AREA_NAME': 'BIA', 'LONGITUDE': 'Longitude', 'LATITUDE': 'Latitude'},
    inplace=True
)
bia_gdf.head()

Number of rows read in: 83


| | Id | BIA | Longitude | Latitude | geometry |
|---|---|---|---|---|---|
| 0 | 3796 | Rogers Road | -79.46989 | 43.681791 | MULTIPOLYGON (((-79.46624 43.68241, -79.46617 ... |
| 1 | 3797 | Bloor-Yorkville | -79.389159 | 43.670401 | MULTIPOLYGON (((-79.38722 43.67408, -79.38679 ... |
| 2 | 3798 | Little Italy | -79.414394 | 43.655397 | MULTIPOLYGON (((-79.42050 43.65520, -79.42053 ... |
| 3 | 3799 | Liberty Village | -79.421265 | 43.63767 | MULTIPOLYGON (((-79.42466 43.63938, -79.42236 ... |
| 4 | 3800 | Leslieville | -79.333555 | 43.66246 | MULTIPOLYGON (((-79.32410 43.66505, -79.32398 ... |
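
Selecting the BIAs that fall within a chosen area can be sketched with a geometric filter. The example below uses made-up rectangles and a hypothetical helper, `bias_in_area`. It keeps every BIA that intersects the study area, which is an assumption: a stricter "fully within" rule would use `.within(...)` instead.

```python
import geopandas
from shapely.geometry import box


def bias_in_area(bia_gdf, area_geometry):
    """Return the rows of bia_gdf whose geometry intersects area_geometry."""
    mask = bia_gdf.intersects(area_geometry)
    return bia_gdf[mask]


# Made-up BIAs and study area, for illustration only
demo_bias = geopandas.GeoDataFrame(
    {"BIA": ["Inside", "Overlapping", "Outside"]},
    geometry=[
        box(0.2, 0.2, 0.4, 0.4),   # entirely inside the study area
        box(0.9, 0.9, 1.2, 1.2),   # straddles the boundary
        box(2.0, 2.0, 3.0, 3.0),   # entirely outside
    ],
    crs="EPSG:4326",
)
study_area = box(0, 0, 1, 1)

selected = bias_in_area(demo_bias, study_area)
print(selected["BIA"].tolist())
```

For the real data, `area_geometry` would be the polygon (or union of polygons) of the neighbourhoods chosen from the pedestrian volume ranking.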
