<h1>Introduction</h1>
Toronto is an incredibly beautiful and diverse city to live in, and Canadians are avid coffee drinkers. As a potential Toronto migrant and coffee lover, I want to know which Toronto neighbourhood has the most independent coffee shops or cafes to indulge my coffee cravings. We will explore both raw counts and customer reviews of these spots.

<h1>Data</h1>
We will use the FourSquare API and map visuals to aggregate coffee shop data and plot coffee shop concentrations across Toronto by neighbourhood. The Wikipedia page provided in an earlier lab supplies the Toronto postal codes and neighbourhood names, which are matched to coordinates and passed to the FourSquare API.

In [94]:
#!pip install bs4
#!pip install geocoder

from bs4 import BeautifulSoup
import requests
import pandas as pd
import geocoder
import json
import folium

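url = "https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M"
source = requests.get(url).text
# Name the parser explicitly so bs4 does not have to guess one (and warn about it).
soup = BeautifulSoup(source, 'html.parser')
table = soup.find('table')  # first table on the page
column_names = ['Postalcode', 'Borough', 'Neighborhood']
df = pd.DataFrame(columns=column_names)
for tr_cell in table.find_all('tr'):
    # Collect the stripped text of every <td> cell in this row.
    row_data = [td_cell.text.strip() for td_cell in tr_cell.find_all('td')]
    # Rows without exactly three <td> cells (such as a header row) are skipped.
    if len(row_data) == 3:
        df.loc[len(df)] = row_data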

In [48]:
df=df[df['Borough']!='Not assigned'].reset_index(drop=True)
df.head()

   Postalcode  Borough           Neighborhood
0  M3A         North York        Parkwoods
1  M4A         North York        Victoria Village
2  M5A         Downtown Toronto  Regent Park, Harbourfront
3  M6A         North York        Lawrence Manor, Lawrence Heights
4  M7A         Downtown Toronto  Queen's Park, Ontario Provincial Government


Data source: https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M

In [127]:
coord = pd.read_csv('Geospatial_Coordinates.csv')
df.rename(columns={'Postalcode':'Postal Code'}, inplace=True)
df = pd.merge(df,coord, on='Postal Code')
df['Coffee Shops Count'] = 0  # numeric placeholder, filled in by the FourSquare query
print('There are {} boroughs and {} neighbourhoods in our dataset.'.format(len(df['Borough'].unique()), df.shape[0]))

There are 10 boroughs and 103 neighbourhoods in our dataset.


In [138]:
df.head()

   Postal Code  Borough           Neighborhood                                 Latitude   Longitude   Coffee Shops Count
0  M3A          North York        Parkwoods                                    43.753259  -79.329656  0
1  M4A          North York        Victoria Village                             43.725882  -79.315572  1
2  M5A          Downtown Toronto  Regent Park, Harbourfront                    43.65426   -79.360636  16
3  M6A          North York        Lawrence Manor, Lawrence Heights             43.718518  -79.464763  1
4  M7A          Downtown Toronto  Queen's Park, Ontario Provincial Government  43.662301  -79.389494  26


In [130]:
import os  # credentials come from the environment; these variable names are an assumption
url = 'https://api.foursquare.com/v2/venues/explore'
params = {'client_id': os.environ['FOURSQUARE_CLIENT_ID'],
          'client_secret': os.environ['FOURSQUARE_CLIENT_SECRET'],
          'v': '20200521', 'query': 'coffee', 'radius': 500, 'limit': 50}

for index, row in df.iterrows():
    # One explore call per postal-code centre; the response holds at most `limit` venues.
    params['ll'] = '{},{}'.format(row['Latitude'], row['Longitude'])
    items = requests.get(url, params=params).json()['response']['groups'][0]['items']
    #print(str(index) + " " + str(row['Latitude']) + " " + str(row['Longitude']) + " " + str(len(items)))
    df.at[index, 'Coffee Shops Count'] = len(items)


#coffee_dict = json.loads(coffee_json)
#coffee_dict['response']['groups']
  
#coffee = pd.DataFrame.from_dict(coffee_dict)
#coffee


0 43.7532586 -79.3296565 0
1 43.725882299999995 -79.31557159999998 1
2 43.6542599 -79.3606359 16
3 43.718517999999996 -79.46476329999999 1
4 43.6623015 -79.3894938 26
5 43.6678556 -79.53224240000002 0
6 43.806686299999996 -79.19435340000001 0
7 43.745905799999996 -79.352188 1
8 43.7063972 -79.309937 5
9 43.6571618 -79.37893709999999 50
10 43.709577 -79.44507259999999 2
11 43.6509432 -79.55472440000001 0
12 43.7845351 -79.16049709999999 0
13 43.72589970000001 -79.340923 4
14 43.695343900000005 -79.3183887 2
15 43.6514939 -79.3754179 50
16 43.6937813 -79.42819140000002 0
17 43.6435152 -79.57720079999999 2
18 43.7635726 -79.1887115 0
19 43.67635739999999 -79.2930312 2
20 43.644770799999996 -79.3733064 24
21 43.6890256 -79.453512 1
22 43.7709921 -79.21691740000001 2
23 43.7090604 -79.3634517 4
24 43.6579524 -79.3873826 44
25 43.669542 -79.4225637 6
26 43.773136 -79.23947609999999 1
27 43.8037622 -79.3634517 0
28 43.7543283 -79.4422593 2
29 43.7053689 -79.34937190000001 4
30 43.650571200000

In [137]:
#with pd.option_context('display.max_rows', None, 'display.max_columns', None):  # more options can be specified also
    #print(df)

    Postal Code           Borough  \
0           M3A        North York   
1           M4A        North York   
2           M5A  Downtown Toronto   
3           M6A        North York   
4           M7A  Downtown Toronto   
5           M9A         Etobicoke   
6           M1B       Scarborough   
7           M3B        North York   
8           M4B         East York   
9           M5B  Downtown Toronto   
10          M6B        North York   
11          M9B         Etobicoke   
12          M1C       Scarborough   
13          M3C        North York   
14          M4C         East York   
15          M5C  Downtown Toronto   
16          M6C              York   
17          M9C         Etobicoke   
18          M1E       Scarborough   
19          M4E      East Toronto   
20          M5E  Downtown Toronto   
21          M6E              York   
22          M1G       Scarborough   
23          M4G         East York   
24          M5G  Downtown Toronto   
25          M6G  Downtown Toronto   
2

<h1>Methodology</h1>

I used the FourSquare API to query each Toronto neighbourhood for coffee shops and record the count. The search radius was kept to a modest 500 metres, and the result limit was maxed out at 50 to contrast the neighbourhoods as much as possible. Because of that limit, a count of 50 may understate the true number of coffee shops within the radius. With the coffee shop counts in hand, the next step is creating the heat map.
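The query settings described above can be sketched as two small helpers: one that builds the explore URL with the 500 metre radius and limit of 50, and one that counts the venues in a parsed response and flags whether the limit was hit. The helper names `build_explore_url` and `count_venues` are hypothetical and not part of the notebook; the response shape (`response` → `groups` → `items`) is taken from the notebook's own query cell.

```python
from urllib.parse import urlencode

EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"
RADIUS_M = 500  # search radius in metres, as in the notebook
LIMIT = 50      # maximum results requested per call, as in the notebook


def build_explore_url(lat, lng, client_id, client_secret):
    """Build the explore query URL for one neighbourhood centre (hypothetical helper)."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",
        "v": "20200521",
        "query": "coffee",
        "radius": RADIUS_M,
        "limit": LIMIT,
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


def count_venues(payload):
    """Return (count, capped) for a parsed explore response (hypothetical helper).

    `capped` is True when the count reached LIMIT, meaning more venues
    may exist within the radius than the response contains.
    """
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    count = len(items)
    return count, count >= LIMIT
```

A neighbourhood whose count comes back with `capped` set to True is best read as "at least 50" rather than "exactly 50".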

Using the postal codes compiled from the Wikipedia page, we generated a GeoJSON file and applied it to a choropleth map of Toronto. The result is a heat map of Toronto neighbourhoods in which darker segments indicate a higher concentration of coffee shops or cafes for us to enjoy.
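A choropleth only shades an area when the key in the data matches a key in the GeoJSON, so it can help to compare the two before plotting. The sketch below assumes each feature stores its postal code under `properties.CFSAUID` (the property named in the notebook's `key_on` argument); the function `unmatched_keys` and the tiny `sample_geojson` are hypothetical stand-ins, not the real `Toronto.geojson`.

```python
def unmatched_keys(geojson, postal_codes, prop="CFSAUID"):
    """Return (codes with no polygon, polygons with no data row), both sorted."""
    feature_keys = {f["properties"][prop] for f in geojson["features"]}
    codes = set(postal_codes)
    return sorted(codes - feature_keys), sorted(feature_keys - codes)


# Hand-made stand-in for Toronto.geojson (geometry omitted for brevity).
sample_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"CFSAUID": "M3A"}, "geometry": None},
        {"type": "Feature", "properties": {"CFSAUID": "M4A"}, "geometry": None},
        {"type": "Feature", "properties": {"CFSAUID": "M5A"}, "geometry": None},
    ],
}

missing_shapes, missing_data = unmatched_keys(sample_geojson, ["M3A", "M4A", "M7A"])
print(missing_shapes)  # postal codes in the data that have no polygon
print(missing_data)    # polygons that have no row in the data
```

With the real files, the same call could be made on `json.load(open('Toronto.geojson'))` and `df['Postal Code']`.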

<h1>Results</h1>

From the map, we can see that the highest concentration is in downtown Toronto, roughly the area surrounding the Eaton Centre. This is in line with what I expected: downtown Toronto is the most densely populated part of the city and, from anecdotal experience, has a very high concentration of restaurants and cafes.

In [136]:
# Named toronto_map to avoid shadowing the built-in map().
toronto_map = folium.Map(
       location=[43.65,-79.38],
       zoom_start=12)
# The choropleth needs numeric values, so make sure the counts are integers.
data = df[['Postal Code', 'Coffee Shops Count']].astype({'Coffee Shops Count': int})
# folium.Choropleth is used here because Map.choropleth is deprecated in recent folium releases.
folium.Choropleth(
    geo_data='Toronto.geojson',
    data=data,
    columns=['Postal Code', 'Coffee Shops Count'],
    key_on='feature.properties.CFSAUID',
    fill_color='BuGn',
    legend_name='Coffee Shop Count in Toronto, by Neighbourhood',
).add_to(toronto_map)
toronto_map
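The downtown concentration visible on the map can also be checked numerically by summing counts per borough. The sketch below uses only the first ten rows shown in the cell outputs above (postal code, borough, count), so the totals are illustrative rather than the full result; the variable names are hypothetical.

```python
from collections import Counter

# First ten rows copied from the notebook's printed outputs.
rows = [
    ("M3A", "North York", 0),
    ("M4A", "North York", 1),
    ("M5A", "Downtown Toronto", 16),
    ("M6A", "North York", 1),
    ("M7A", "Downtown Toronto", 26),
    ("M9A", "Etobicoke", 0),
    ("M1B", "Scarborough", 0),
    ("M3B", "North York", 1),
    ("M4B", "East York", 5),
    ("M5B", "Downtown Toronto", 50),
]

# Sum the coffee shop counts for each borough.
totals = Counter()
for _postal_code, borough, count in rows:
    totals[borough] += count

# Print boroughs from highest to lowest total.
for borough, total in totals.most_common():
    print(borough, total)
```

On the full DataFrame, the equivalent would be `df.groupby('Borough')['Coffee Shops Count'].sum()`.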

<h1>Discussion</h1>

This heat map clearly indicates a four-neighbourhood section of Toronto that would be perfect for any coffee lover to live in, or at least visit frequently.
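The text does not name the four neighbourhoods, but one way to pull the highest-count postal codes out of the data is sketched below. It uses a handful of (postal code, count) pairs copied from the cell outputs above; since those outputs are truncated, the ranking is only an illustration of the approach, and the variable names are hypothetical.

```python
import heapq

# A subset of counts copied from the notebook's printed outputs.
counts = {
    "M5A": 16,
    "M7A": 26,
    "M5B": 50,
    "M5C": 50,
    "M5E": 24,
    "M5G": 44,
    "M6G": 6,
}

# Take the four postal codes with the largest coffee shop counts.
top_four = heapq.nlargest(4, counts.items(), key=lambda item: item[1])

for postal_code, count in top_four:
    print(postal_code, count)
```

On the full DataFrame, `df.nlargest(4, 'Coffee Shops Count')` would give the corresponding rows, keeping in mind that counts of 50 have reached the API limit.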

<h1>Conclusion</h1>

In this capstone project, I scraped a Wikipedia page of Toronto postal codes, queried the FourSquare API for coffee shops around each one, and created a heat map of Toronto depicting coffee shop counts across all the neighbourhoods. Overall, if you are a true coffee lover, downtown Toronto is where you want to be.