# Coursera Capstone Project

This notebook will be used for the capstone project.

### Criteria

For the **first week**, you will be required to submit the following:
* A description of the problem and a discussion of the background. (15 marks)
* A description of the data and how it will be used to solve the problem. (15 marks)

For the **second week**, the final deliverables of the project will be:
* A link to your Notebook on your Github repository, showing your code. (15 marks)
* A full report consisting of all of the following components (15 marks):
    * Introduction where you discuss the business problem and who would be interested in this project.
    * Data where you describe the data that will be used to solve the problem and the source of the data.
    * Methodology section, the main component of the report, where you discuss and describe any exploratory data analysis that you did, any inferential statistical testing that you performed, and which machine learning methods were used and why.
    * Results section where you discuss the results.
    * Discussion section where you discuss any observations you noted and any recommendations you can make based on the results.
    * Conclusion section where you conclude the report.
* Your choice of a presentation or blogpost. (10 marks)

In [1]:
# Importing libraries

import pandas as pd
import numpy as np
import random
import requests

# module to convert an address into latitude and longitude values
from geopy.geocoders import Nominatim 

# modules to work with geodata
import geopandas as gp
from geopandas.tools import geocode
import folium
from folium.plugins import HeatMap  # used for the venue density layer

# libraries for displaying images
from IPython.display import Image, HTML

# library for transforming a json file into a pandas dataframe
from pandas import json_normalize
import json

# import tools for webscraping
from bs4 import BeautifulSoup
from urllib.request import urlopen
import urllib

## 1. Problem Description and Background Discussion

### 1.1 Problem Description
As part of the Capstone Project for the Applied Data Science Coursera Course, I have chosen to analyze the effectiveness of the Business Improvement Area (BIA) program of Toronto, ON, Canada. The question I will answer is: **"Does the BIA help venues to get better ratings on Foursquare?"** To answer this question, I will compare the ratings of venues that lie within the boundaries of BIAs to the ratings of venues in the areas surrounding the BIAs.

### 1.2 Background Discussion
A **Business Improvement Area (BIA)** is an association of commercial property owners and tenants within a defined area who work in partnership with the City to create thriving, competitive, and safe business areas that attract shoppers, diners, tourists, and new businesses. The question is how effective these associations and the areas they define are at attracting shoppers, diners, tourists, and new businesses.

## 2. Data Description 

### 2.1 Description of Data and Data Source
The BIA layer represents the active BIAs in the City of Toronto that have been enacted by Council. Each BIA has been defined by a by-law and is represented by a Board of Management. The layer is updated as BIAs are created, amended or deleted by Council. This file is a polygon file that shows the BIA areas.

The second part of the data for the analysis comes from the Foursquare API. This dataset contains venues located in Toronto with their location, name, venue category and user rating.

### 2.2 How will the Data be used to solve the Problem
In a first step, the venue data will be split depending on whether the venue is located within the boundaries of a BIA or not. In a second step, the average rating will be calculated for the venues within each BIA and for the venues not in a BIA. In a third step, the average rating for the BIAs will be compared with the overall average rating. This comparison will show whether venues within BIAs fare better than venues outside BIAs, and therefore give a first clue as to whether BIAs are effective in improving venues.

A second way to measure the effectiveness of BIAs is their power to attract venues to their location. This can be measured by comparing the number of venues within the BIA boundaries to the number of venues located elsewhere in the city.
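
The first step, deciding whether a venue lies inside a BIA boundary, can be sketched with a plain ray-casting test. This is a minimal illustration under stated assumptions, not the notebook's method: the helper names `point_in_ring` and `venue_in_bia` are hypothetical, the geometry is assumed to be a GeoJSON `Polygon` string with coordinates in `[longitude, latitude]` order, and the square used here is invented for the example.

```python
import json

def point_in_ring(lng, lat, ring):
    """Ray-casting test: is (lng, lat) inside the closed ring of [lng, lat] pairs?"""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i][0], ring[i][1]
        xj, yj = ring[j][0], ring[j][1]
        # does the edge (j -> i) straddle the horizontal line through the point?
        if (yi > lat) != (yj > lat):
            # longitude at which the edge crosses that line
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lng < x_cross:
                inside = not inside
        j = i
    return inside

def venue_in_bia(lng, lat, geometry_json):
    """True if the point is in the polygon's outer ring and in none of its holes."""
    geometry = json.loads(geometry_json)
    if geometry["type"] != "Polygon":
        raise ValueError("this sketch only handles Polygon geometries")
    outer, holes = geometry["coordinates"][0], geometry["coordinates"][1:]
    return point_in_ring(lng, lat, outer) and not any(
        point_in_ring(lng, lat, hole) for hole in holes)

# hypothetical square "BIA" for illustration, not a real boundary
square = json.dumps({"type": "Polygon",
                     "coordinates": [[[-79.40, 43.65], [-79.38, 43.65],
                                      [-79.38, 43.67], [-79.40, 43.67],
                                      [-79.40, 43.65]]]})
print(venue_in_bia(-79.39, 43.66, square))  # point inside the square
print(venue_in_bia(-79.37, 43.66, square))  # point east of the square
```

Applying such a test to every venue against every BIA polygon would give both the inside/outside split for the rating comparison and the venue counts for the second measure.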


### Getting the BIAs Data

The BIA data is retrieved via the open data API provided by the City of Toronto.

In [42]:
# Get the dataset metadata by passing the package id to the package_show endpoint
# For example, to retrieve the metadata for this dataset:

url = "https://ckan0.cf.opendata.inter.prod-toronto.ca/api/3/action/package_show"
params = { "id": "9edb9628-1213-42bd-8352-5c4ed28e9e42"}
response = urllib.request.urlopen(url, data=bytes(json.dumps(params), encoding="utf-8"))
package = json.loads(response.read())

# Get the data by passing the resource_id to the datastore_search endpoint
# See https://docs.ckan.org/en/latest/maintaining/datastore.html for detailed parameters options
# For example, to retrieve the data content for the first resource in the datastore:

for idx, resource in enumerate(package["result"]["resources"]):
    if resource["datastore_active"]:
        url = "https://ckan0.cf.opendata.inter.prod-toronto.ca/api/3/action/datastore_search"
        p = { "id": resource["id"] }
        r = urllib.request.urlopen(url, data=bytes(json.dumps(p), encoding="utf-8"))
        data = json.loads(r.read())
        df_BIAs = pd.DataFrame(data["result"]["records"])
        break
df_BIAs.head()

Unnamed: 0,_id,AREA_ID,DATE_EFFECTIVE,AREA_ATTR_ID,PARENT_AREA_ID,AREA_SHORT_CODE,AREA_LONG_CODE,AREA_NAME,AREA_DESC,X,Y,LONGITUDE,LATITUDE,OBJECTID,Shape__Area,Shape__Length,geometry
0,3215,2481875,2020-02-04T17:20:36,26006975,,115-00,115-00,Rogers Road,Rogers Road,307227.635,4837983.077,-79.46989,43.681791,17568785,351093.855469,5936.862796,"{""type"": ""Polygon"", ""coordinates"": [[[-79.4662..."
1,3216,2481874,2020-02-04T17:20:36,26006974,,031-02,031-02,Bloor-Yorkville,Bloor-Yorkville,313738.285,4836723.196,-79.389159,43.670401,17568801,918046.484375,6613.691633,"{""type"": ""Polygon"", ""coordinates"": [[[-79.3872..."
2,3217,2481873,2020-02-04T17:20:36,26006973,,020-01,020-01,Little Italy,Little Italy,311705.037,4835053.901,-79.414394,43.655397,17568817,232341.589844,3917.542802,"{""type"": ""Polygon"", ""coordinates"": [[[-79.4205..."
3,3218,2481872,2020-02-04T17:20:36,26006972,,042-01,042-01,Liberty Village,Liberty Village,311152.727,4833083.985,-79.421265,43.63767,17568833,797292.066406,4400.913504,"{""type"": ""Polygon"", ""coordinates"": [[[-79.4246..."
4,3219,2481871,2020-02-04T17:20:36,26006971,,093-01,093-01,Leslieville,Leslieville,318224.026,4835848.463,-79.333555,43.66246,17568849,351302.890625,6457.749078,"{""type"": ""Polygon"", ""coordinates"": [[[-79.3240..."
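
The `geometry` column in the preview appears to hold each boundary as a GeoJSON string rather than a parsed object. As a hedged sketch, assuming that format and a `Polygon` type, the string can be decoded with the standard `json` module, for example to get a bounding box. The helper `polygon_bounds` and the triangle are hypothetical and only serve as illustration.

```python
import json

def polygon_bounds(geometry_json):
    """Return (min_lng, min_lat, max_lng, max_lat) of a GeoJSON Polygon's outer ring."""
    geometry = json.loads(geometry_json)
    outer_ring = geometry["coordinates"][0]  # first ring is the outer boundary
    lngs = [pt[0] for pt in outer_ring]      # GeoJSON stores longitude first
    lats = [pt[1] for pt in outer_ring]
    return min(lngs), min(lats), max(lngs), max(lats)

# invented triangle, not a real BIA boundary
triangle = ('{"type": "Polygon", "coordinates": '
            '[[[-79.42, 43.64], [-79.40, 43.64], [-79.41, 43.66], [-79.42, 43.64]]]}')
print(polygon_bounds(triangle))
```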


In [43]:
df_BIAs.shape

(83, 17)

In [44]:
# dropping BIAs that are out of the central area of Toronto
area_names = ['Albion Islington Square', 'Wilson Village', 'Sheppard East Village', 'Emery Village', 'DuKe Heights', 'Kennedy Road', 'Wexford Heights', 'Crossroads of the Danforth']
index_drop = df_BIAs['AREA_NAME'].isin(area_names)
index_drop = index_drop[index_drop]  # keep only the rows flagged for removal
index_drop.index

Int64Index([5, 17, 22, 28, 45, 50, 58, 79], dtype='int64')

In [45]:
df_BIAs = df_BIAs.drop(index_drop.index)
df_BIAs.head()

Unnamed: 0,_id,AREA_ID,DATE_EFFECTIVE,AREA_ATTR_ID,PARENT_AREA_ID,AREA_SHORT_CODE,AREA_LONG_CODE,AREA_NAME,AREA_DESC,X,Y,LONGITUDE,LATITUDE,OBJECTID,Shape__Area,Shape__Length,geometry
0,3215,2481875,2020-02-04T17:20:36,26006975,,115-00,115-00,Rogers Road,Rogers Road,307227.635,4837983.077,-79.46989,43.681791,17568785,351093.855469,5936.862796,"{""type"": ""Polygon"", ""coordinates"": [[[-79.4662..."
1,3216,2481874,2020-02-04T17:20:36,26006974,,031-02,031-02,Bloor-Yorkville,Bloor-Yorkville,313738.285,4836723.196,-79.389159,43.670401,17568801,918046.484375,6613.691633,"{""type"": ""Polygon"", ""coordinates"": [[[-79.3872..."
2,3217,2481873,2020-02-04T17:20:36,26006973,,020-01,020-01,Little Italy,Little Italy,311705.037,4835053.901,-79.414394,43.655397,17568817,232341.589844,3917.542802,"{""type"": ""Polygon"", ""coordinates"": [[[-79.4205..."
3,3218,2481872,2020-02-04T17:20:36,26006972,,042-01,042-01,Liberty Village,Liberty Village,311152.727,4833083.985,-79.421265,43.63767,17568833,797292.066406,4400.913504,"{""type"": ""Polygon"", ""coordinates"": [[[-79.4246..."
4,3219,2481871,2020-02-04T17:20:36,26006971,,093-01,093-01,Leslieville,Leslieville,318224.026,4835848.463,-79.333555,43.66246,17568849,351302.890625,6457.749078,"{""type"": ""Polygon"", ""coordinates"": [[[-79.3240..."


In [46]:
df_BIAs.shape

(75, 17)

In [47]:
df_BIAs.reset_index(drop = True, inplace = True)

### Plotting the BIA-Areas on a map

In [49]:
# get location for map centering from Downtown Toronto
address = 'Toronto, Downtown'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto Downtown are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto Downtown are 43.6541737, -79.38081164513409.


In [50]:
# create an OpenStreetMap-based map, center it on a location using latitude and longitude and give it a starting zoom factor
m = folium.Map(location = [latitude, longitude], tiles = 'Stamen Toner', zoom_start = 12)

# create a feature group for the map
fg = folium.map.FeatureGroup(name='BIAs').add_to(m)

# add geojson data for the BIAs to map
for i in range(len(df_BIAs['geometry'])):
    b = folium.GeoJson(df_BIAs['geometry'][i])
    b.add_child(folium.Popup(df_BIAs['AREA_NAME'][i]))
    fg.add_child(b)
    
    
folium.LayerControl().add_to(m)
    
# display the map
m

### Setting up the API for accessing Foursquare data

In [51]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID (redacted; never publish real credentials)
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret (redacted)
VERSION = '20180604'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET: YOUR_FOURSQUARE_CLIENT_SECRET


### Getting Data for the Venues in the BIAs
Via an API request, the data for the venues in the BIA areas is collected and stored in a DataFrame for easy data manipulation. Given the location of each BIA, the function defined below returns the names of the venues near that BIA, their exact location (latitude, longitude) and their venue category.

In [282]:
# defining a function to get the venue ID, name, location and category via the Foursquare API

def getNearbyVenues(names, latitudes, longitudes, radius=300, LIMIT = 100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['id'],
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['BIA', 
                  'BIA Latitude', 
                  'BIA Longitude',
                  'Venue ID',
                  'Venue', 
                  'Venue Latitude',
                  'Venue Longitude',
                  'Venue Category']
    
    return(nearby_venues)
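
The request URL above is assembled with string formatting. A possible alternative, shown here only as a sketch, is to let `urllib.parse.urlencode` build the query string so every value is escaped consistently. The helper name `build_explore_url` and the credential values `"ID"` and `"SECRET"` are placeholders, not part of the notebook.

```python
from urllib.parse import urlencode

def build_explore_url(lat, lng, client_id, client_secret, version,
                      radius=300, limit=100):
    """Build the explore request URL; urlencode escapes each parameter value."""
    base = "https://api.foursquare.com/v2/venues/explore"
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),  # latitude,longitude of the search centre
        "radius": radius,                # search radius in metres
        "limit": limit,                  # maximum number of venues returned
    }
    return base + "?" + urlencode(params)

# placeholder credentials for illustration
url = build_explore_url(43.670401, -79.389159, "ID", "SECRET", "20180604")
print(url)
```

With `requests`, the same dictionary could instead be passed as `params=` to `requests.get`, which performs the encoding itself.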

In [55]:
bia_venues = getNearbyVenues(names=df_BIAs['AREA_NAME'],
                                   latitudes=df_BIAs['LATITUDE'],
                                   longitudes=df_BIAs['LONGITUDE']
                                  )

Rogers Road
Bloor-Yorkville
Little Italy
Liberty Village
Leslieville
Lakeshore Village
Korea Town
Kensington Market
Historic Queen East
Hillcrest Village
Harbord Street
Greektown on the Danforth
Gerrard India Bazaar
Forest Hill Village
Financial District
Fairbank Village
Eglinton Hill
Dupont by the Castle
Church-Wellesley Village
Little Portugal On Dundas
Downtown Yonge
Dovercourt Village
Danforth Village
Danforth Mosaic
College West
Corso Italia
College Promenade
CityPlace and Fort York
Chinatown
Cabbagetown
Broadview Danforth
Long Branch
Queen Street West
Bloordale Village
Junction Gardens
shoptheQueensway.com
The Eglinton Way
West Queen West
York-Eglinton
Uptown Yonge
Upper Village
Toronto Entertainment District
The Kingsway
Yonge Lawrence Village
Roncesvalles Village
St. Clair Gardens
The Beach
Trinity-Bellwoods
Wychwood Heights
St. Lawrence Market Neighbourhood
Yonge & St. Clair
The Waterfront
Rosedale Main Street
Weston Village
Village of Islington
Riverside District
Mount Pleasa

In [56]:
print(bia_venues.shape)
bia_venues.head()

(611, 8)


Unnamed: 0,BIA,BIA Latitude,BIA Longitude,Venue ID,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Bloor-Yorkville,43.670401,-79.389159,5b2d934e59c423002c0c6db6,Eataly,43.669754,-79.38872,Gourmet Shop
1,Bloor-Yorkville,43.670401,-79.389159,554ea8bd498efa064ec03031,Paramount Fine Foods,43.670677,-79.389865,Middle Eastern Restaurant
2,Bloor-Yorkville,43.670401,-79.389159,5738ca80cd10b91a6747abde,Pi Co.,43.670107,-79.389852,Pizza Place
3,Bloor-Yorkville,43.670401,-79.389159,57e074a7498ef24d3980d2f5,Planta Yorkville,43.670213,-79.389512,Vegetarian / Vegan Restaurant
4,Bloor-Yorkville,43.670401,-79.389159,4d249ace0e998cfa43b9253f,Starbucks,43.67034,-79.388262,Coffee Shop


In [283]:
# define a function to get the ratings for the venues

def getRating(v):

    rating = []
    ID = []
    
    for i in v:
        ID.append(i)
    
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(
            i, 
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION)

        # make the GET request
        results = requests.get(url).json()
        try:
            res = results['response']['venue']['rating']
            rating.append(res)
        except (KeyError, TypeError):
            # no rating in the response (e.g. unrated venue or failed request)
            rating.append(np.nan)
        
    venues_rating = pd.DataFrame(columns=['Venue_ID', 'Venue Rating'])
    venues_rating['Venue_ID'] = ID
    venues_rating['Venue Rating'] = rating
        
        
    return(venues_rating)
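
The rating lookup inside `getRating` can also be written without a `try` block. The sketch below assumes the response layout used above (`response` → `venue` → `rating`) and returns `None` when any level is missing; `extract_rating` is a hypothetical helper name.

```python
def extract_rating(payload):
    """Return the venue rating as a float, or None when the response has no rating."""
    # each .get falls back to an empty dict, so a missing level does not raise
    venue = payload.get("response", {}).get("venue", {})
    rating = venue.get("rating")
    return float(rating) if rating is not None else None

print(extract_rating({"response": {"venue": {"rating": 8.6}}}))  # rated venue
print(extract_rating({"response": {"venue": {}}}))               # venue without a rating
```

Keeping missing ratings as `None` (or `NaN`) rather than the string `'NaN'` lets pandas treat the column as numeric, so `mean()` skips the gaps without an extra filter.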

In [366]:
venues_rating = getRating(v = bia_venues['Venue ID'])

In [59]:
# load the venues rating data from csv
venues_rating = pd.read_csv('Data/venues_rating.csv')

In [374]:
venues_rating[venues_rating['Venue Rating'] != 'NaN'].mean()

Venue Rating    7.304595
dtype: float64

In [69]:
venues_rating.head()

Unnamed: 0.1,Unnamed: 0,Venue_ID,Venue Rating
0,0,55be4bd4498e08d9ccc061e4,8.9
1,1,518e6d3a498e3e5f52a938e3,8.3
2,2,4e11d1fc1495c8d31bc9a291,7.8
3,3,4b646bedf964a520d5b12ae3,7.3
4,4,5be625b081635b002c676360,7.0


In [70]:
# merge on venues id with venue data
df_merged = bia_venues.merge(venues_rating, left_on = 'Venue ID', right_on = 'Venue_ID')
df_merged.drop(['Unnamed: 0', 'Venue_ID'], axis = 1, inplace = True)
df_merged.head()

Unnamed: 0,BIA,BIA Latitude,BIA Longitude,Venue ID,Venue,Venue Latitude,Venue Longitude,Venue Category,Venue Rating
0,Bloor-Yorkville,43.670401,-79.389159,5b2d934e59c423002c0c6db6,Eataly,43.669754,-79.38872,Gourmet Shop,8.6
1,Bloor-Yorkville,43.670401,-79.389159,5b2d934e59c423002c0c6db6,Eataly,43.669754,-79.38872,Gourmet Shop,8.6
2,Bloor Street,43.669995,-79.388414,5b2d934e59c423002c0c6db6,Eataly,43.669754,-79.38872,Gourmet Shop,8.6
3,Bloor Street,43.669995,-79.388414,5b2d934e59c423002c0c6db6,Eataly,43.669754,-79.38872,Gourmet Shop,8.6
4,Bloor-Yorkville,43.670401,-79.389159,554ea8bd498efa064ec03031,Paramount Fine Foods,43.670677,-79.389865,Middle Eastern Restaurant,8.6
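
The preview above lists the same venue several times. A merge multiplies rows whenever the key is repeated on both sides, which is presumably what happens here if a venue ID occurs more than once in both `bia_venues` and `venues_rating` (an assumption, since the full tables are not shown). The toy frames below, with invented IDs, sketch the effect and one possible remedy using `drop_duplicates`.

```python
import pandas as pd

# toy data: one venue that falls within the radius of two BIAs
venues = pd.DataFrame({
    "BIA": ["Bloor-Yorkville", "Bloor Street"],
    "Venue ID": ["a1", "a1"],
})
# the rating was fetched once per venue row, so the ID appears twice
ratings = pd.DataFrame({
    "Venue_ID": ["a1", "a1"],
    "Venue Rating": [8.6, 8.6],
})

# naive merge: 2 x 2 matching keys give 4 rows
naive = venues.merge(ratings, left_on="Venue ID", right_on="Venue_ID")

# keep one rating per venue before merging: 2 rows, one per BIA
deduped = venues.merge(ratings.drop_duplicates(subset="Venue_ID"),
                       left_on="Venue ID", right_on="Venue_ID")

print(len(naive), len(deduped))
```

Duplicated rows give some venues extra weight in any average computed afterwards, so it may be worth checking the row count after a merge.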


In [74]:
# save to csv
df_merged.to_csv('Data/df_merged.csv')

### Generating random Locations in Toronto
To get data about venues not located within the BIAs, random locations in Toronto are generated. A function is defined that generates random longitudes and latitudes between a given minimum and maximum longitude and latitude.

In [183]:
# import random number generator
import random

# define function for generating random numbers
def Rand(start, end, num): 
    res = [] 
  
    for j in range(num): 
        res.append(random.uniform(start, end)) 
  
    return res 
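
Because `Rand` draws from the global random state, each run of the notebook samples different locations. A seeded variant, sketched below under the assumption that reproducible locations are wanted, uses its own `random.Random` instance. The function name `random_points` and the seed value are illustrative choices.

```python
import random

def random_points(lat_range, lng_range, num, seed=None):
    """Draw num (latitude, longitude) pairs uniformly from the given ranges."""
    rng = random.Random(seed)  # private generator, so the seed affects only this call
    return [(rng.uniform(*lat_range), rng.uniform(*lng_range)) for _ in range(num)]

# same bounds as used in the notebook; the seed makes the sample repeatable
points = random_points((43.647, 43.695), (-79.5, -79.3), 60, seed=42)
print(len(points))
```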

In [293]:
# generating 60 random latitudes and longitudes within the max and min latitude, longitude of BIA_venues
lat = Rand(43.647, 43.695, 60)
long = Rand(-79.5, -79.3, 60)
# label the random locations 'Location 0' ... 'Location 59'
Location = ['Location ' + str(x) for x in range(60)]

In [294]:
df_location = pd.DataFrame(columns = ['Location','lat', 'long'])
df_location['lat'] = lat
df_location['long'] = long
df_location['Location'] = Location
df_location.head()

Unnamed: 0,Location,lat,long
0,Location 0,43.664331,-79.34547
1,Location 1,43.654724,-79.321122
2,Location 2,43.654243,-79.372209
3,Location 3,43.658513,-79.32107
4,Location 4,43.691147,-79.342159


### Collecting venue information for the randomly generated Locations
Via the Foursquare API, information about venues near the randomly generated locations in Toronto is gathered. The information about each venue includes the venue name, venue location, venue category and venue rating. This information is used to fill the DataFrame `df_notBIAVenues`.

In [307]:
# collect venues via Foursquare API near random locations, using the function defaults: LIMIT = 100, radius = 300
df_notBIAVenues = getNearbyVenues(names=df_location['Location'],
                                   latitudes=df_location['lat'],
                                   longitudes=df_location['long']
                                  )

Location 0
Location 1
Location 2
Location 3
Location 4
Location 5
Location 6
Location 7
Location 8
Location 9
Location 10
Location 11
Location 12
Location 13
Location 14
Location 15
Location 16
Location 17
Location 18
Location 19
Location 20
Location 21
Location 22
Location 23
Location 24
Location 25
Location 26
Location 27
Location 28
Location 29
Location 30
Location 31
Location 32
Location 33
Location 34
Location 35
Location 36
Location 37
Location 38
Location 39
Location 40
Location 41
Location 42
Location 43
Location 44
Location 45
Location 46
Location 47
Location 48
Location 49
Location 50
Location 51
Location 52
Location 53
Location 54
Location 55
Location 56
Location 57
Location 58
Location 59


In [308]:
print(df_notBIAVenues.shape)
df_notBIAVenues.head()

(672, 8)


Unnamed: 0,BIA,BIA Latitude,BIA Longitude,Venue ID,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Location 0,43.664331,-79.34547,4b05d9a3f964a5205ce422e3,Batifole,43.666651,-79.347261,French Restaurant
1,Location 0,43.664331,-79.34547,56d331c5498e8113529782ed,Hailed Coffee,43.6669,-79.345432,Coffee Shop
2,Location 0,43.664331,-79.34547,4e21f9fbcc3fee7eee221f0f,Rock Oasis,43.66588,-79.342794,Climbing Gym
3,Location 0,43.664331,-79.34547,4af5eb2af964a520cffe21e3,Simon's Wok,43.667026,-79.345569,Vegetarian / Vegan Restaurant
4,Location 0,43.664331,-79.34547,4cb5d0239c7ba35dd5218d06,Artists' Play Dance,43.663973,-79.344101,Dance Studio


In [310]:
venues_rating_notBIA = getRating(v = df_notBIAVenues['Venue ID'])

In [314]:
print(venues_rating_notBIA.shape)
venues_rating_notBIA.head()

(672, 2)


Unnamed: 0,Venue_ID,Venue Rating
0,4b05d9a3f964a5205ce422e3,8.8
1,56d331c5498e8113529782ed,8.4
2,4e21f9fbcc3fee7eee221f0f,8.3
3,4af5eb2af964a520cffe21e3,7.5
4,4cb5d0239c7ba35dd5218d06,


In [318]:
# merge on df_notBIAVenues with venues_rating_notBIA
df_mergedNotBIA = df_notBIAVenues.merge(venues_rating_notBIA, left_on = 'Venue ID', right_on = 'Venue_ID')
df_mergedNotBIA.drop(['Venue_ID'], axis = 1, inplace = True)
df_mergedNotBIA.head()

Unnamed: 0,BIA,BIA Latitude,BIA Longitude,Venue ID,Venue,Venue Latitude,Venue Longitude,Venue Category,Venue Rating
0,Location 0,43.664331,-79.34547,4b05d9a3f964a5205ce422e3,Batifole,43.666651,-79.347261,French Restaurant,8.8
1,Location 0,43.664331,-79.34547,56d331c5498e8113529782ed,Hailed Coffee,43.6669,-79.345432,Coffee Shop,8.4
2,Location 0,43.664331,-79.34547,4e21f9fbcc3fee7eee221f0f,Rock Oasis,43.66588,-79.342794,Climbing Gym,8.3
3,Location 0,43.664331,-79.34547,4af5eb2af964a520cffe21e3,Simon's Wok,43.667026,-79.345569,Vegetarian / Vegan Restaurant,7.5
4,Location 0,43.664331,-79.34547,4cb5d0239c7ba35dd5218d06,Artists' Play Dance,43.663973,-79.344101,Dance Studio,


In [323]:
# Prepare the data for Heatmap
# Ensure you're handing it floats
df_mergedNotBIA['Venue Latitude'] = df_mergedNotBIA['Venue Latitude'].astype(float)
df_mergedNotBIA['Venue Longitude'] = df_mergedNotBIA['Venue Longitude'].astype(float)

# Filter the DF for rows, then columns, then remove NaNs
heat_df = df_mergedNotBIA[['Venue Latitude', 'Venue Longitude']]
heat_df = heat_df.dropna(axis=0, subset=['Venue Latitude','Venue Longitude'])

# List comprehension to make our list of lists
heat_data = [[row['Venue Latitude'],row['Venue Longitude']] for index, row in heat_df.iterrows()]

## Analysing the Data
The collected data is analysed in the following ways:
1. Mapping the BIAs and the Venues for overview
2. Comparing the overall average Venue Rating with the Venue Ratings for Venues in the BIAs

### 1. Mapping the BIAs and Venues for Overview

In [327]:
# create map of Toronto with BIAs (blue), venues within the BIAs (red), random locations (green),
# venues near the random locations (yellow) and a heatmap of venue density

toronto_map = folium.Map(location = [latitude, longitude], tiles = 'Stamen Toner', zoom_start = 12)

# create a feature group for the map
fg = folium.map.FeatureGroup(name='BIAs').add_to(toronto_map)
fg1 = folium.map.FeatureGroup(name = 'BIA Venues').add_to(toronto_map)
fg2 = folium.map.FeatureGroup(name = 'Location Not BIA Venues').add_to(toronto_map)
fg3 = folium.map.FeatureGroup(name = 'Not BIA Venues').add_to(toronto_map)
fg4 = folium.map.FeatureGroup(name = 'Density of Venues').add_to(toronto_map)

# add geojson data for the BIAs to map
for i in range(len(df_BIAs['geometry'])):
    b = folium.GeoJson(df_BIAs['geometry'][i])
    b.add_child(folium.Popup(df_BIAs['AREA_NAME'][i]))
    fg.add_child(b)
    
for lat, long, name in zip(df_merged['Venue Latitude'], df_merged['Venue Longitude'], df_merged['Venue']):
    name = folium.Popup(name, parse_html = True)
    c = folium.Circle(
    [lat, long],
    radius = 2,
    popup = name,
    color = 'red',
    fill = True,
    parse_html = False)
    fg1.add_child(c)

for lat, long, name in zip(df_location['lat'], df_location['long'], df_location['Location']):
    name = folium.Popup(name, parse_html = True)
    d = folium.Circle(
    [lat, long],
    radius = 300,
    popup = name,
    color = 'green',
    fill = True,
    parse_html = False)
    fg2.add_child(d)    
    
for lat, long, name in zip(df_notBIAVenues['Venue Latitude'], df_notBIAVenues['Venue Longitude'], df_notBIAVenues['Venue']):
    name = folium.Popup(name, parse_html = True)
    f = folium.Circle(
    [lat, long],
    radius = 2,
    popup = name,
    color = 'yellow',
    fill = True,
    parse_html = False)
    fg3.add_child(f)    
    
    

# Plot heatmap
h = HeatMap(heat_data)
fg4.add_child(h)    
    
folium.LayerControl().add_to(toronto_map)
    
# display the map
toronto_map

### 2. Comparing the average Ratings for BIA and not BIA

In [329]:
# Average Rating BIA
meanBIARating = df_merged['Venue Rating'].mean()
print('The mean Rating for Venues in BIAs is: ', meanBIARating)

The mean Rating for Venues in BIAs is:  7.317195767195767


In [334]:
# Average Rating for Venues not in BIA
df = df_mergedNotBIA[df_mergedNotBIA['Venue Rating'] != 'NaN']
meanRating = df['Venue Rating'].mean()
print('The mean Rating for Venues not in BIAs is: ', meanRating)

The mean Rating for Venues not in BIAs is:  7.399047619047619
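
The two means differ by roughly 0.08 points. Whether such a gap is meaningful depends on the spread of the ratings and the number of venues in each group. One way to check this, sketched here with made-up numbers rather than the notebook's data, is Welch's t statistic for two independent samples. The helper `welch_t` is a hypothetical name and the sample lists are purely illustrative.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    # standard error of the difference in means, using sample variances
    se = sqrt(variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se

# illustrative numbers only, not the ratings collected above
in_bia = [1.0, 2.0, 3.0, 4.0, 5.0]
not_in_bia = [2.0, 3.0, 4.0, 5.0, 6.0]
print(welch_t(in_bia, not_in_bia))
```

A statistic close to zero would suggest that the observed difference is small relative to the variation within the groups. Turning it into a p-value additionally requires the Welch-Satterthwaite degrees of freedom, which this sketch leaves out.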
