# The Battle of Neighborhoods
## Toronto neighborhood - Vegan restaurants

**Author**: Hércules Sant'Ana da Silva José

_Report created in July/2020_

## Summary

**Introduction section:**
- Presents the business problem concerning vegan restaurants in Toronto and identifies who would be interested in this project.

**Data section:**
- Describes the data used to solve the problem and its sources.

**Methodology section:**
- The main component of the report: describes the exploratory data analysis, any inferential statistical testing performed, and the machine learning techniques used and why.

**Results section:**
- Presents the results.

**Discussion section:**
- Discusses the observations we noted and our recommendations to the audience based on the results.

**Conclusion section:**
- Concludes the report.

## 1. Introduction

### 1.1. Scenario

Recent studies suggest that about 8% of the world's population is vegan. Still, relatively few people around the world are committed to living a plant-based lifestyle. If you are interested in eating more cleanly, do not want to exploit animals, or simply enjoy how you feel as a vegan, it can sometimes be difficult to find great vegan restaurants.

However, this is starting to change. Major cities around the world are slowly but surely developing strong vegan cultures, from coffee shops and cooperatives to some of the best vegan restaurants in the world. In short, you now have more options than simply ordering the only salad on the menu.

Toronto is one of the most ethnically and culturally diverse cities in the world. Located on the shores of Lake Ontario, it is home to people from all over the world, from different cultures and ethnicities, which results in a wide variety of delicious food. No matter what type of cuisine you like, you will surely find great restaurants that serve it. Now, if you are a vegan, you may be wondering: all of this is good, but what if I am looking for wonderful vegan cuisine to enjoy in Toronto? Will I be able to find something good? Will I, as a student, find good vegan restaurants near universities? In which neighborhoods can a company that specializes in vegan products promote its products?

### 1.2. Business problem

Identify existing vegan restaurants in Toronto, and identify the best neighborhoods where vegan companies can partner with restaurants to promote their products or open new physical stores.

### 1.3. Audience

Vegans who come to Toronto for work, study, or tourism, as well as companies interested in selling and promoting their vegan products.

## 2. Data

### 2.1. Data description

For this analysis, we need to explore, segment, and cluster the neighborhoods in the city of Toronto. However, the neighborhood data is not readily available in a structured format on the internet. We therefore use a Wikipedia page that contains the information we need to explore and cluster the neighborhoods in Toronto. The Wikipedia page is available at https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M.

We scrape the Wikipedia page, clean the data, and read it into a Pandas dataframe so that it is in a structured format suitable for our analysis.

We use the Foursquare API to identify the existing vegan restaurants in Toronto and all other restaurants within 500 meters of each neighborhood. To obtain the venues of each borough and neighborhood, we use a list of Toronto's boroughs and neighborhoods with their coordinates (latitude and longitude). The coordinates dataset is available at http://cocl.us/Geospatial_data.

### 2.2. Data preparation

The data preparation process began by obtaining the postal codes for the city of Toronto from the Wikipedia page. Rows whose borough is "Not assigned" were removed from the dataframe, and a neighborhood marked "Not assigned" takes the name of its borough.

In [1]:
import pandas as pd

# Create dataframe by using pandas and show first 10 lines
df = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')
df[0].head(10)

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
7,M8A,Not assigned,Not assigned
8,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
9,M1B,Scarborough,"Malvern, Rouge"


In [2]:
# Take the first table on the page as our working dataframe
df_toronto = df[0]

# Drop rows whose borough is "Not assigned"
index_NA = df_toronto[(df_toronto['Borough']=='Not assigned')].index
df_toronto.drop(index_NA, inplace=True)
df_toronto.head(10)

Unnamed: 0,Postal Code,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
8,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
9,M1B,Scarborough,"Malvern, Rouge"
11,M3B,North York,Don Mills
12,M4B,East York,"Parkview Hill, Woodbine Gardens"
13,M5B,Downtown Toronto,"Garden District, Ryerson"


In [3]:
# If a row has a borough but a "Not assigned" neighborhood, the neighborhood takes the borough's name
df_toronto.loc[df_toronto.Neighbourhood=='Not assigned','Neighbourhood'] = df_toronto.Borough
df_toronto.head(10)

Unnamed: 0,Postal Code,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
8,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
9,M1B,Scarborough,"Malvern, Rouge"
11,M3B,North York,Don Mills
12,M4B,East York,"Parkview Hill, Woodbine Gardens"
13,M5B,Downtown Toronto,"Garden District, Ryerson"
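The cleaning rules used above can be sketched on a small, made-up table. This is only an illustration under assumed data: the three sample rows and the `clean_postal_table` helper are hypothetical and not part of the original notebook.

```python
import pandas as pd


def clean_postal_table(df: pd.DataFrame) -> pd.DataFrame:
    """Drop unassigned boroughs and fill unassigned neighbourhoods (hypothetical helper)."""
    # Keep only rows whose borough is assigned; copy to avoid mutating the input
    cleaned = df[df["Borough"] != "Not assigned"].copy()
    # Where the neighbourhood is unassigned, fall back to the borough name
    missing = cleaned["Neighbourhood"] == "Not assigned"
    cleaned.loc[missing, "Neighbourhood"] = cleaned.loc[missing, "Borough"]
    return cleaned.reset_index(drop=True)


# Made-up sample rows that mimic the structure of the Wikipedia table
sample = pd.DataFrame({
    "Postal Code": ["M1A", "M3A", "M9Z"],
    "Borough": ["Not assigned", "North York", "Etobicoke"],
    "Neighbourhood": ["Not assigned", "Parkwoods", "Not assigned"],
})
cleaned_sample = clean_postal_table(sample)
print(cleaned_sample)
```

Returning a new dataframe instead of dropping rows in place keeps the raw table available for re-checking.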


Next, we obtained the geographic data for the Toronto neighborhoods. We merged the two dataframes, creating a new one with all neighborhoods and their respective coordinates (latitude and longitude).

In [5]:
# We used Pandas to read csv file that has the geographical coordinates of each postal code to a dataframe
df_coordinators = pd.read_csv('http://cocl.us/Geospatial_data')
df_coordinators.head(10)

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476
5,M1J,43.744734,-79.239476
6,M1K,43.727929,-79.262029
7,M1L,43.711112,-79.284577
8,M1M,43.716316,-79.239476
9,M1N,43.692657,-79.264848


In [6]:
# We merged the df_coordinators into df_toronto.
df_toronto = pd.merge(left=df_toronto,right=df_coordinators,on='Postal Code')
df_toronto.head(10)

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
5,M9A,Etobicoke,"Islington Avenue, Humber Valley Village",43.667856,-79.532242
6,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
7,M3B,North York,Don Mills,43.745906,-79.352188
8,M4B,East York,"Parkview Hill, Woodbine Gardens",43.706397,-79.309937
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
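An inner merge silently drops postal codes that appear in only one of the two tables. A minimal sketch, on made-up rows, of how such a merge could be checked: `validate` and `indicator` are standard `pd.merge` parameters, while the sample data (including the postal code `M0X`) is an assumption for illustration.

```python
import pandas as pd

left = pd.DataFrame({
    "Postal Code": ["M3A", "M4A", "M0X"],
    "Borough": ["North York", "North York", "Hypothetical"],
})
right = pd.DataFrame({
    "Postal Code": ["M3A", "M4A"],
    "Latitude": [43.753259, 43.725882],
    "Longitude": [-79.329656, -79.315572],
})

# validate raises MergeError if a postal code is duplicated on either side;
# indicator adds a _merge column recording where each row came from
checked = pd.merge(left, right, on="Postal Code", how="left",
                   validate="one_to_one", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "Postal Code"].tolist()
print(unmatched)
```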


In [8]:
# Check numbers of Boroughs and Neighborhoods in the df_toronto
print('The dataframe has {} boroughs and {} neighborhoods for our analysis.'.format(
        len(df_toronto['Borough'].unique()),
        df_toronto.shape[0]
    )
)

The dataframe has 10 boroughs and 103 neighborhoods for our analysis.


After this, we checked the collected geographic data by plotting the neighborhoods on a map, using the coordinates of Toronto as the starting point.

The map below shows all of Toronto's neighborhoods.

In [9]:
import folium

# Toronto, Ontario
latitude = 43.6529
longitude = -79.3849

#Create map of Toronto using Latitude and Longitude values
map_toronto = folium.Map(location=[latitude,longitude], zoom_start=11)

#add markers to map for neighborhoods
for lat,lng, borough, neighborhood in zip(df_toronto['Latitude'],df_toronto['Longitude'],df_toronto['Borough'],df_toronto['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat,lng],
        radius=5,
        popup=label,
        color='yellow',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)
    
map_toronto

### 2.3. Checking Foursquare API data

We call the Foursquare API to inspect the data it returns. In this test, we check how many venues exist within 1,000 meters of the University of Toronto.

In [10]:
# University of Toronto, Toronto, Ontario
neighborhood_latitude = 43.72149
neighborhood_longitude = -79.37881

In [11]:
# Foursquare credentials
CLIENT_ID = '<CLIENT_ID>' # your Foursquare ID
CLIENT_SECRET = '<CLIENT_SECRET>' # your Foursquare Secret
VERSION = '20180604'

# URL
LIMIT=100
radius=1000
url='https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    VERSION,
    neighborhood_latitude,
    neighborhood_longitude,
    radius,
    LIMIT
)
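As an alternative to formatting the query string by hand, the parameters can be kept in a dictionary and encoded by the standard library. A minimal sketch: the endpoint and parameter names are the ones used above, while `build_explore_url` is a hypothetical helper and the credentials are placeholders.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the explore URL from named parameters (hypothetical helper)."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    # urlencode percent-encodes reserved characters such as the comma in "ll"
    return EXPLORE_ENDPOINT + "?" + urlencode(params)


example_url = build_explore_url("<CLIENT_ID>", "<CLIENT_SECRET>", "20180604",
                                43.72149, -79.37881, radius=1000, limit=100)
print(example_url)
```

Keeping the parameters named makes it harder to swap two positional arguments by mistake, for example `radius` and `limit`.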

In [12]:
import requests # library to handle requests

results = requests.get(url).json()

In [13]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

We created a function to extract the venue category from the JSON returned by the API.

We then cleaned the returned JSON and structured it into a pandas dataframe.

In [16]:
from pandas import json_normalize # transform JSON into a pandas dataframe (pandas >= 1.0)

venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
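The flattening step can be tried without calling the API. The sketch below uses a made-up payload that only mimics the nesting of the `items` list; the venue names and coordinates are assumptions, not real Foursquare data.

```python
import pandas as pd

# Made-up payload that mimics the nesting of the "items" list used above
mock_items = [
    {"venue": {"name": "Example Cafe",
               "categories": [{"name": "Coffee Shop"}],
               "location": {"lat": 43.72, "lng": -79.37}}},
    {"venue": {"name": "Example Lot",
               "categories": [],
               "location": {"lat": 43.73, "lng": -79.38}}},
]

# json_normalize flattens nested dictionaries into dotted column names
flat = pd.json_normalize(mock_items)

# Keep the first category name, or None when the list is empty
flat["venue.categories"] = flat["venue.categories"].apply(
    lambda cats: cats[0]["name"] if cats else None
)
flat = flat[["venue.name", "venue.categories", "venue.location.lat", "venue.location.lng"]]

# Keep only the last part of each dotted column name
flat.columns = [col.split(".")[-1] for col in flat.columns]
print(flat)
```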

In [18]:
print('{} venues were returned by Foursquare near University of Toronto.'.format(nearby_venues.shape[0]))

15 venues were returned by Foursquare near University of Toronto.


The table below shows the contents of the dataframe created from the json returned by the Foursquare API.

In [19]:
nearby_venues

Unnamed: 0,name,categories,lat,lng
0,Sherwood Park,Park,43.716551,-79.387776
1,Whole Foods Market,Grocery Store,43.71416,-79.377966
2,Second Cup,Coffee Shop,43.721382,-79.376428
3,Dogs Off-Leash Area,Dog Run,43.716589,-79.384246
4,Rollian Sushi,Japanese Restaurant,43.712637,-79.376914
5,Tim Hortons,Coffee Shop,43.727324,-79.379563
6,Shoppers Drug Mart,Pharmacy,43.714158,-79.378098
7,Bistro On the Go: Sunnybrook,Food Court,43.72163,-79.376425
8,Druxy's,Deli / Bodega,43.720452,-79.377939
9,Glendon Forest,Trail,43.727226,-79.378413


## 3. Methodology

We performed an exploratory analysis of the data returned by the Foursquare API. First, we looked for a specific category for vegan restaurants.

We then identified all other types of restaurants. Using a map of the city (built with the Python library Folium), we plotted all restaurants and visually compared the number of vegan restaurants with the number of other restaurants.

Finally, we used the KMeans clustering technique to determine which neighborhoods are promising for companies to partner with local restaurants to sell vegan products.

We created a function to search for venues around the coordinates of each neighborhood. We limited the search to venues within 500 meters. For performance reasons, the collected data was saved to a CSV file.

In [21]:
# Function that returns the venues near each neighborhood
# Default search radius: 500 meters
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                   'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    print('Success in obtaining the data!')
    return(nearby_venues)
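The list comprehension above assumes that every venue has at least one category; a venue with an empty `categories` list would raise an `IndexError`. A hedged sketch of a more defensive extraction, where `extract_rows` and the mock items are hypothetical:

```python
def extract_rows(name, lat, lng, items):
    """Turn raw API items into flat tuples, tolerating missing categories (hypothetical helper)."""
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # Use None instead of failing when the venue has no category
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"], category))
    return rows


# Made-up items: the second venue has no category
mock_venue_items = [
    {"venue": {"name": "Example Diner", "categories": [{"name": "Diner"}],
               "location": {"lat": 43.7, "lng": -79.3}}},
    {"venue": {"name": "Example Spot", "categories": [],
               "location": {"lat": 43.8, "lng": -79.4}}},
]
rows = extract_rows("Sample Neighborhood", 43.75, -79.35, mock_venue_items)
print(rows)
```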

In [23]:
df_toronto_venues = getNearbyVenues(names=df_toronto['Neighbourhood'],
                                   latitudes=df_toronto['Latitude'],
                                   longitudes=df_toronto['Longitude'],
                                  )
df_toronto_venues.head(10)

Success in obtaining the data!


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park
1,Parkwoods,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop
2,Victoria Village,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena
3,Victoria Village,43.725882,-79.315572,Portugril,43.725819,-79.312785,Portuguese Restaurant
4,Victoria Village,43.725882,-79.315572,Tim Hortons,43.725517,-79.313103,Coffee Shop
5,Victoria Village,43.725882,-79.315572,The Frig,43.727051,-79.317418,French Restaurant
6,Victoria Village,43.725882,-79.315572,Pizza Nova,43.725824,-79.31286,Pizza Place
7,"Regent Park, Harbourfront",43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
8,"Regent Park, Harbourfront",43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
9,"Regent Park, Harbourfront",43.65426,-79.360636,Cooper Koo Family YMCA,43.653249,-79.358008,Distribution Center
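One way to sanity-check the 500-meter radius is to compute the distance between a neighborhood's coordinates and one of its returned venues. The sketch below uses the haversine formula with an assumed spherical Earth of radius 6,371 km; `haversine_m` is a hypothetical helper, and the coordinates are taken from the first row of the table above.

```python
import math


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters, assuming a spherical Earth of radius 6,371 km."""
    r = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


# Parkwoods coordinates and Brookbanks Park, from the first row of the table above
distance = haversine_m(43.753259, -79.329656, 43.751976, -79.33214)
print(round(distance))
```

For this pair the result is a few hundred meters, inside the 500-meter search radius.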


In [24]:
print('Unique categories: {}.'.format(len(df_toronto_venues['Venue Category'].unique())))

Unique categories: 271.


In [25]:
df_toronto_venues['Venue Category'].unique()

array(['Park', 'Food & Drink Shop', 'Hockey Arena',
       'Portuguese Restaurant', 'Coffee Shop', 'French Restaurant',
       'Pizza Place', 'Bakery', 'Distribution Center', 'Spa',
       'Restaurant', 'Pub', 'Breakfast Spot', 'Gym / Fitness Center',
       'Historic Site', 'Farmers Market', 'Dessert Shop',
       'Chocolate Shop', 'Performing Arts Venue', 'Mexican Restaurant',
       'Café', 'Yoga Studio', 'Theater', 'Event Space',
       'Asian Restaurant', 'Shoe Store', 'Ice Cream Shop',
       'Electronics Store', 'Art Gallery', 'Bank', 'Beer Store', 'Hotel',
       'Wine Shop', 'Antique Shop', 'Boutique', 'Furniture / Home Store',
       'Vietnamese Restaurant', 'Clothing Store', 'Accessories Store',
       'Miscellaneous Shop', 'Creperie', 'Sushi Restaurant',
       'Arts & Crafts Store', 'Burrito Place', 'Beer Bar', 'Hobby Shop',
       'Diner', 'Fried Chicken Joint', 'Smoothie Shop', 'Sandwich Place',
       'Gym', 'College Auditorium', 'Bar', 'College Cafeteria',
       'Gene

In [26]:
# Saving the results
df_toronto_venues.to_csv('toronto_venues.csv', index=False)
print('Data saved!')

Data saved!
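The save-then-reload pattern used here can be wrapped in a small caching helper so that the API is only called when no saved file exists. This is a sketch under assumptions: `load_or_fetch` and `fake_fetch` are hypothetical names, and the stand-in fetch function replaces the real API call.

```python
import os
import tempfile

import pandas as pd


def load_or_fetch(path, fetch):
    """Return the cached CSV if it exists; otherwise call fetch() and cache its result."""
    if os.path.exists(path):
        return pd.read_csv(path)
    df = fetch()
    df.to_csv(path, index=False)
    return df


calls = []


def fake_fetch():
    # Stand-in for the slow API call; records how often it runs
    calls.append(1)
    return pd.DataFrame({"Neighborhood": ["Parkwoods"], "Venue Category": ["Park"]})


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "venues_cache.csv")
    first = load_or_fetch(cache_path, fake_fetch)   # fetches and writes the cache
    second = load_or_fetch(cache_path, fake_fetch)  # reads the cache

print(len(calls))
```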


## 4. Results

In [27]:
import pandas as pd

# Loading the saved results
df_toronto_venues = pd.read_csv('toronto_venues.csv')
df_toronto_venues.head(10)

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park
1,Parkwoods,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop
2,Victoria Village,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena
3,Victoria Village,43.725882,-79.315572,Portugril,43.725819,-79.312785,Portuguese Restaurant
4,Victoria Village,43.725882,-79.315572,Tim Hortons,43.725517,-79.313103,Coffee Shop
5,Victoria Village,43.725882,-79.315572,The Frig,43.727051,-79.317418,French Restaurant
6,Victoria Village,43.725882,-79.315572,Pizza Nova,43.725824,-79.31286,Pizza Place
7,"Regent Park, Harbourfront",43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
8,"Regent Park, Harbourfront",43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
9,"Regent Park, Harbourfront",43.65426,-79.360636,Cooper Koo Family YMCA,43.653249,-79.358008,Distribution Center


Exploring our dataset, we identified a specific category for vegan restaurants called "Vegetarian / Vegan Restaurant".

In [28]:
# Check whether a venue category containing 'Vegan' exists
for i in df_toronto_venues['Venue Category'].unique():
    if 'Vegan' in i:
        print(i)
        break

Vegetarian / Vegan Restaurant


Counting all venues in this category, we found only 18 registered venues.

In [29]:
df_vegan_restaurants = df_toronto_venues[df_toronto_venues['Venue Category'] == 'Vegetarian / Vegan Restaurant']
print('Vegetarian / Vegan restaurants in Toronto: {}.'.format(df_vegan_restaurants.shape[0]))

Vegetarian / Vegan restaurants in Toronto: 18.


In contrast, when we look for the word 'Restaurant' in the venue category names, we find 47 other restaurant categories.

In [30]:
# Count all venue categories that contain 'Restaurant', except the vegan one.
i = 0
for item in df_toronto_venues['Venue Category'].unique():
    if 'Restaurant' in item and 'Vegan' not in item:
        i+=1
print('Quantity for restaurant category: ', i)

Quantity for restaurant category:  47
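The same count can be expressed with vectorized string methods instead of an explicit loop. A small sketch on made-up category values (the sample series is an assumption, not the Foursquare data):

```python
import pandas as pd

# Made-up category values for illustration
categories = pd.Series([
    "Park", "Sushi Restaurant", "Vegetarian / Vegan Restaurant",
    "Restaurant", "Sushi Restaurant", "Coffee Shop",
])

unique_categories = pd.Series(categories.unique())
is_restaurant = unique_categories.str.contains("Restaurant", regex=False)
is_vegan = unique_categories.str.contains("Vegan", regex=False)

# Number of distinct restaurant categories, excluding the vegan one
other_restaurant_categories = int((is_restaurant & ~is_vegan).sum())
print(other_restaurant_categories)
```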


Before applying the KMeans clustering technique, we grouped the venues by neighborhood and created a dataframe containing, for each neighborhood, the share of venues in the 'Restaurant' category.

In [38]:
# Build a dataframe that stores each neighborhood and its venue categories
# First, one-hot encode the venue categories
df_toronto_venues_neighborhood = pd.get_dummies(df_toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# Second, add the neighborhood column back to df_toronto_venues_neighborhood
df_toronto_venues_neighborhood['Neighborhood'] = df_toronto_venues['Neighborhood']

# Group by neighborhood and take the mean, i.e. the share of venues in each category
Neighborhoods_grouped = df_toronto_venues_neighborhood.groupby(["Neighborhood"]).mean().reset_index()

# For the final step, clustering, keep only the 'Restaurant' column
Restaurant_df = Neighborhoods_grouped[["Neighborhood","Restaurant"]]
Restaurant_df.head(10)

Unnamed: 0,Neighborhood,Restaurant
0,Agincourt,0.0
1,"Alderwood, Long Branch",0.0
2,"Bathurst Manor, Wilson Heights, Downsview North",0.047619
3,Bayview Village,0.0
4,"Bedford Park, Lawrence Manor East",0.076923
5,Berczy Park,0.033898
6,"Birch Cliff, Cliffside West",0.0
7,"Brockton, Parkdale Village, Exhibition Place",0.043478
8,"Business reply mail Processing Centre, South C...",0.052632
9,"CN Tower, King and Spadina, Railway Lands, Har...",0.0
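To make the meaning of the `Restaurant` column concrete, the one-hot encoding and group mean can be reproduced on a toy table. The neighborhoods `A` and `B` and their venues are made up; note that only the exact category `Restaurant` feeds this column, so a category such as `Sushi Restaurant` would be counted in its own column.

```python
import pandas as pd

# Made-up venues: neighborhood A has 2 restaurants out of 4 venues, B has none
toy = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Restaurant", "Park", "Coffee Shop", "Restaurant",
                       "Park", "Coffee Shop"],
})

# One column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(toy["Venue Category"], dtype=float)
onehot["Neighborhood"] = toy["Neighborhood"]

# The mean of a 0/1 column is the share of venues in that category
share = onehot.groupby("Neighborhood").mean().reset_index()
print(share[["Neighborhood", "Restaurant"]])
```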


We applied the K-Means clustering technique, where the neighborhoods are grouped into three clusters. The labels created were associated with each neighborhood contained in the dataframe.

In [39]:
from sklearn.cluster import KMeans

# set number of clusters
Noclusters = 3

# keep only the numeric feature used for clustering
clustering_df = Restaurant_df.drop(columns=["Neighborhood"])

# run k-means clustering
kmeans = KMeans(n_clusters=Noclusters, random_state=0).fit(clustering_df)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 0, 2, 0, 1, 2, 0, 2, 2, 0])
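K-Means assigns label numbers arbitrarily, so label 2 does not necessarily mean "more restaurants" than label 1. One possible way to make the labels easier to read is to rank the clusters by their centroid. The sketch below does this on made-up shares; the data and the `ordered_labels` name are assumptions, not the notebook's results.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up restaurant shares forming three well-separated groups
shares = np.array([0.0, 0.0, 0.0, 0.04, 0.045, 0.05, 0.12, 0.125, 0.13]).reshape(-1, 1)

model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(shares)

# Rank the clusters by centroid so that 0 = lowest share and 2 = highest share
order = np.argsort(model.cluster_centers_.ravel())
rank_of = {int(old): new for new, old in enumerate(order)}
ordered_labels = [rank_of[int(label)] for label in model.labels_]
print(ordered_labels)
```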

In [41]:
# create a new dataframe that includes the cluster label for each neighborhood
Cluster_df = Restaurant_df.copy()

# add clustering labels
Cluster_df["Cluster Labels"] = kmeans.labels_

# merge Cluster_df with df_toronto to add latitude/longitude for each neighborhood
Cluster_df = Cluster_df.join(df_toronto.set_index("Neighbourhood"), on="Neighborhood")

print(Cluster_df.shape)
Cluster_df.head(10)

(97, 7)


Unnamed: 0,Neighborhood,Restaurant,Cluster Labels,Postal Code,Borough,Latitude,Longitude
0,Agincourt,0.0,0,M1S,Scarborough,43.7942,-79.262029
1,"Alderwood, Long Branch",0.0,0,M8W,Etobicoke,43.602414,-79.543484
2,"Bathurst Manor, Wilson Heights, Downsview North",0.047619,2,M3H,North York,43.754328,-79.442259
3,Bayview Village,0.0,0,M2K,North York,43.786947,-79.385975
4,"Bedford Park, Lawrence Manor East",0.076923,1,M5M,North York,43.733283,-79.41975
5,Berczy Park,0.033898,2,M5E,Downtown Toronto,43.644771,-79.373306
6,"Birch Cliff, Cliffside West",0.0,0,M1N,Scarborough,43.692657,-79.264848
7,"Brockton, Parkdale Village, Exhibition Place",0.043478,2,M6K,West Toronto,43.636847,-79.428191
8,"Business reply mail Processing Centre, South C...",0.052632,2,M7Y,East Toronto,43.662744,-79.321558
9,"CN Tower, King and Spadina, Railway Lands, Har...",0.0,0,M5V,Downtown Toronto,43.628947,-79.39442


## 5. Discussion

We plotted all restaurants on the map. Green points represent vegan restaurants, and red points represent other types of restaurants.

In [44]:
df_all_restaurants = df_toronto_venues[df_toronto_venues['Venue Category'].str.contains('Restaurant')]

import folium

# Toronto, Ontario
latitude = 43.6529
longitude = -79.3849

#Create map of Toronto using Latitude and Longitude values
map_toronto = folium.Map(location=[latitude,longitude], zoom_start=14)

#add markers to map for restaurants
for lat, lng, category, neighborhood in zip(df_all_restaurants['Venue Latitude'], df_all_restaurants['Venue Longitude'], df_all_restaurants['Venue Category'], df_all_restaurants['Neighborhood']):
    label = '{}, {}'.format(neighborhood, category)
    label = folium.Popup(label, parse_html=True)
    is_vegan = 'Vegan' in category
    folium.CircleMarker(
        [lat,lng],
        radius=3,
        popup=label,
        color='green' if is_vegan else 'red',
        fill=True,
        fill_color='green' if is_vegan else 'red',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)
    
map_toronto

We observed on the map that most vegan restaurants are present in places where other categories of restaurants already exist.

For example, students at the University of Toronto have a vegan restaurant option close to the university.

However, when comparing the number of existing restaurants with the number of vegan / vegetarian restaurants, we still have few options to serve an audience that continues to grow.
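The comparison can be expressed as a single ratio: vegan restaurants divided by all restaurants. A minimal sketch on made-up venues; with the real data, the same filters would be applied to `df_toronto_venues`.

```python
import pandas as pd

# Made-up venues for illustration only
venues = pd.DataFrame({
    "Venue Category": ["Vegetarian / Vegan Restaurant", "Sushi Restaurant",
                       "Restaurant", "Italian Restaurant", "Park"],
})

# All restaurant venues, then the vegan subset
restaurants = venues[venues["Venue Category"].str.contains("Restaurant", regex=False)]
vegan = restaurants[restaurants["Venue Category"].str.contains("Vegan", regex=False)]

vegan_share = len(vegan) / len(restaurants)
print("{:.0%} of restaurants are vegan/vegetarian".format(vegan_share))
```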

In [45]:
import folium
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

# Toronto, Ontario
latitude = 43.6529
longitude = -79.3849

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(Noclusters)
ys = [i+x+(i*x)**2 for i in range(Noclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Cluster_df['Latitude'], Cluster_df['Longitude'], Cluster_df['Neighborhood'], Cluster_df['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
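Because the marker color is looked up with `rainbow[cluster-1]`, label 0 uses index -1, which wraps around to the last color in the list. The sketch below spells out the resulting legend; the color names are rough approximations of `cm.rainbow` sampled at 0, 0.5, and 1, used here as an assumption for readability.

```python
# Approximate names for cm.rainbow sampled at 0, 0.5 and 1 (an assumption for readability)
palette = ["violet-blue", "green", "red"]

# Same indexing as the map code: label 0 gives index -1, which wraps to the last entry
legend = {label: palette[label - 1] for label in range(len(palette))}
print(legend)
```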

When we analyze the clusters created for the neighborhoods, we see that restaurants are generally concentrated downtown (green dots on the map, cluster label 2). This concentration suggests few opportunities to open new physical stores there, but it makes downtown a rich area for promoting vegan products.



In [48]:
#Cluster 2
Cluster_df.loc[Cluster_df['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Restaurant,Cluster Labels,Postal Code,Borough,Latitude,Longitude
2,"Bathurst Manor, Wilson Heights, Downsview North",0.047619,2,M3H,North York,43.754328,-79.442259
5,Berczy Park,0.033898,2,M5E,Downtown Toronto,43.644771,-79.373306
7,"Brockton, Parkdale Village, Exhibition Place",0.043478,2,M6K,West Toronto,43.636847,-79.428191
8,"Business reply mail Processing Centre, South C...",0.052632,2,M7Y,East Toronto,43.662744,-79.321558
14,Christie,0.058824,2,M6G,Downtown Toronto,43.669542,-79.422564
15,Church and Wellesley,0.039474,2,M4Y,Downtown Toronto,43.66586,-79.38316
19,Davisville,0.03125,2,M4S,Central Toronto,43.704324,-79.38879
28,"Fairview, Henry Farm, Oriole",0.042857,2,M2J,North York,43.778517,-79.346556
29,"First Canadian Place, Underground city",0.04,2,M5X,Downtown Toronto,43.648429,-79.38228
35,"Harbourfront East, Union Station, Toronto Islands",0.03,2,M5J,Downtown Toronto,43.640816,-79.381752


The analysis also revealed neighborhoods with a low concentration of restaurants (blue dots on the map, cluster label 1). In these neighborhoods, companies can invest in opening physical stores without major concerns about competition.



In [47]:
#Cluster 1
Cluster_df.loc[Cluster_df['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Restaurant,Cluster Labels,Postal Code,Borough,Latitude,Longitude
4,"Bedford Park, Lawrence Manor East",0.076923,1,M5M,North York,43.733283,-79.41975
18,"Commerce Court, Victoria Hotel",0.07,1,M5L,Downtown Toronto,43.648198,-79.379817
22,Don Mills,0.076923,1,M3B,North York,43.745906,-79.352188
22,Don Mills,0.076923,1,M3C,North York,43.7259,-79.340923
34,"Guildwood, Morningside, West Hill",0.125,1,M1E,Scarborough,43.763573,-79.188711
53,"New Toronto, Mimico South, Humber Bay Shores",0.076923,1,M8V,Etobicoke,43.605647,-79.501321
59,"Parkdale, Roncesvalles",0.071429,1,M6R,West Toronto,43.64896,-79.456325
77,"Summerhill West, Rathnelly, South Hill, Forest...",0.0625,1,M4V,Central Toronto,43.686412,-79.400049


The red dots on the map (cluster label 0) mark neighborhoods with few dining options, where food is available mainly from other types of venues. Tourists looking for restaurants, whether vegan or serving other cuisines, will not find many options in these neighborhoods.

In [46]:
#Cluster 0
Cluster_df.loc[Cluster_df['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Restaurant,Cluster Labels,Postal Code,Borough,Latitude,Longitude
0,Agincourt,0.0,0,M1S,Scarborough,43.794200,-79.262029
1,"Alderwood, Long Branch",0.0,0,M8W,Etobicoke,43.602414,-79.543484
3,Bayview Village,0.0,0,M2K,North York,43.786947,-79.385975
6,"Birch Cliff, Cliffside West",0.0,0,M1N,Scarborough,43.692657,-79.264848
9,"CN Tower, King and Spadina, Railway Lands, Har...",0.0,0,M5V,Downtown Toronto,43.628947,-79.394420
...,...,...,...,...,...,...,...
87,"Wexford, Maryvale",0.0,0,M1R,Scarborough,43.750072,-79.295849
89,"Willowdale, Willowdale West",0.0,0,M2R,North York,43.782736,-79.442259
90,Woburn,0.0,0,M1G,Scarborough,43.770992,-79.216917
91,Woodbine Heights,0.0,0,M4C,East York,43.695344,-79.318389
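When reading the clusters, it can help to summarize the clustering feature per label. The sketch below does this for a handful of (share, label) pairs copied from the cluster tables above; it is not the full dataset, and on the complete `Cluster_df` the same `groupby` call would apply.

```python
import pandas as pd

# A few (share, label) pairs copied from the cluster tables above
rows = pd.DataFrame({
    "Restaurant": [0.0, 0.0, 0.0, 0.076923, 0.07, 0.125, 0.047619, 0.033898, 0.058824],
    "Cluster Labels": [0, 0, 0, 1, 1, 1, 2, 2, 2],
})

# Minimum, mean and maximum restaurant share within each cluster
summary = rows.groupby("Cluster Labels")["Restaurant"].agg(["min", "mean", "max"])
print(summary)
```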


## 6. Conclusion

This report set out to show where vegan restaurants are located in Toronto and to identify promising locations for companies specializing in vegan products to partner with restaurants to sell their products.

The data showed that students at the University of Toronto have vegan dining options nearby, but overall the number of vegan restaurants is still small compared to the total number of restaurants.

The data also revealed neighborhoods with potential for opening new vegan restaurants or stores, since the concentration of restaurants there is not very high.

Finally, we found neighborhoods in Toronto with few or no restaurant options. In these neighborhoods, tourists need to look for other food options.


## 7. References

https://bigseventravel.com/2020/02/best-vegan-restaurants-in-the-world-2020/

https://wtvox.com/sustainable-living/2019-the-world-of-vegan-but-how-many-vegans-are-in-the-world/

https://www.dailyhawker.ca/vegan-restaurants-in-toronto/