# Best Neighbourhood in Toronto for a New Restaurant

### Problem Statement: 
John Smith is a good chef. After discussing it with his family, he has decided to open a new restaurant in Toronto. However, he does not know which neighbourhood in Toronto is best for a new restaurant. He has asked us to research the question and help him find the best one.

### Data Source: 
We will use the Toronto data generated in the last assignment, which contains the Borough, Postal Code, Neighbourhood, Latitude, and Longitude of each area. We will then use the Foursquare API to pull venue data and find the neighbourhood with the most venues but the fewest restaurants.

# Methodology

In [88]:
# Import and load the toronto data from the previous assignment's result

from project_lib import Project
project = Project(project_id="f3101ff4-e278-4a2c-8141-00afc30b3e09",project_access_token="p-d28f16c105a5ce75a81225274d24530c67c9bc51")

# Fetch the file
my_file = project.get_file("torontodata.csv")

# Read the CSV data file from the object storage into a pandas DataFrame
my_file.seek(0)
import pandas as pd
torontodata = pd.read_csv(my_file)

In [89]:
# Explore the data
torontodata.head()

Unnamed: 0,Cluster,Postal Code,Borough,Neighbourhood,Latitude,Longitude,Borough_num
0,0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1
1,0,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1
2,0,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,1
3,0,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,1
4,3,M4E,East Toronto,The Beaches,43.676357,-79.293031,2


In [90]:
# Import necessary libraries

import requests # library to handle requests
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!pip install geopy
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.display import HTML

! pip install folium==0.5.0
import folium # plotting library

# library for transforming JSON into a pandas DataFrame
from pandas import json_normalize  # pandas >= 1.0; older versions expose it as pandas.io.json.json_normalize

print('All Libraries imported.')

All Libraries imported.


In [91]:
# The code was removed by Watson Studio for sharing.

Your credentails are setup


In [92]:
# Define a function to get all venues near a neighborhood in Toronto

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues

# run the function on each neighborhood and generate dataframe 
toronto_venues = getNearbyVenues(torontodata.Neighbourhood, torontodata.Latitude, torontodata.Longitude, radius=500)
toronto_venues.head()

Regent Park, Harbourfront
Queen's Park, Ontario Provincial Government
Garden District, Ryerson
St. James Town
The Beaches
Berczy Park
Central Bay Street
Christie
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
The Danforth West, Riverdale
Toronto Dominion Centre, Design Exchange
Brockton, Parkdale Village, Exhibition Place
India Bazaar, The Beaches West
Commerce Court, Victoria Hotel
Studio District
Lawrence Park
Roselawn
Davisville North
Forest Hill North & West, Forest Hill Road Park
High Park, The Junction South
North Toronto West, Lawrence Park
The Annex, North Midtown, Yorkville
Parkdale, Roncesvalles
Davisville
University of Toronto, Harbord
Runnymede, Swansea
Moore Park, Summerhill East
Kensington Market, Chinatown, Grange Park
Summerhill West, Rathnelly, South Hill, Forest Hill SE, Deer Park
CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport
R

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Regent Park, Harbourfront",43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
1,"Regent Park, Harbourfront",43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
2,"Regent Park, Harbourfront",43.65426,-79.360636,Cooper Koo Family YMCA,43.653249,-79.358008,Distribution Center
3,"Regent Park, Harbourfront",43.65426,-79.360636,Morning Glory Cafe,43.653947,-79.361149,Breakfast Spot
4,"Regent Park, Harbourfront",43.65426,-79.360636,Body Blitz Spa East,43.654735,-79.359874,Spa
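
The function in In [92] reads `v['venue']['categories'][0]['name']` directly, which would raise an `IndexError` if a venue came back with an empty category list. The sketch below is one possible way to make that parsing step tolerant of such gaps. It is not the code that produced the output above: `parse_explore_items` is a hypothetical helper, the `"Unknown"` label is an assumption, and the payload is hand-written to mimic the response structure used in In [92] (the second venue is made up for illustration).

```python
def parse_explore_items(name, lat, lng, payload):
    """Flatten one venues/explore JSON payload into rows for a neighbourhood."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # Assumption: venues without a category are kept and labelled "Unknown"
        category = categories[0]["name"] if categories else "Unknown"
        rows.append((
            name, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows


# Hand-written payload; the second venue is hypothetical and has no category
sample_payload = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "Tandem Coffee",
                   "location": {"lat": 43.653559, "lng": -79.361809},
                   "categories": [{"name": "Coffee Shop"}]}},
        {"venue": {"name": "Example Venue Without Category",
                   "location": {"lat": 43.6540, "lng": -79.3600},
                   "categories": []}},
    ]}]}
}

rows = parse_explore_items("Regent Park, Harbourfront", 43.65426, -79.360636, sample_payload)
for row in rows:
    print(row)
```

Each returned tuple has the same seven fields, in the same order, as the columns assigned to `nearby_venues` in In [92], so the rows could be passed to `pd.DataFrame` in the same way.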


# Results

In [93]:
# Generate neighbourhoods with the number of total venues within a radius of 500m.
totalvenue = toronto_venues.groupby('Neighbourhood').count().sort_values(by='Venue', ascending=False)[['Venue']]
totalvenue.rename(columns={'Venue':'Venue Number'}, inplace=True)
totalvenue.head()

Neighbourhood,Venue Number
"Toronto Dominion Centre, Design Exchange",100
"Commerce Court, Victoria Hotel",100
"First Canadian Place, Underground city",100
"Garden District, Ryerson",100
"Harbourfront East, Union Station, Toronto Islands",100


In [94]:
# Generate neighbourhoods with the number of total restaurants within a radius of 500m.
restaurantvenue = toronto_venues[toronto_venues['Venue Category'] == 'Restaurant']
topneigh = restaurantvenue.groupby('Neighbourhood').count().sort_values(by='Venue', ascending=False)[['Venue']]
topneigh.rename(columns={'Venue':'Restaurant Number'}, inplace=True)
topneigh.head()

Neighbourhood,Restaurant Number
"Commerce Court, Victoria Hotel",7
"Toronto Dominion Centre, Design Exchange",4
"First Canadian Place, Underground city",4
"Richmond, Adelaide, King",4
"Harbourfront East, Union Station, Toronto Islands",3


In [95]:
# merge the number of venues and restaurants into one dataframe
bestneigh = pd.merge(totalvenue, topneigh, on='Neighbourhood')
topneigh = bestneigh[bestneigh['Venue Number'] == bestneigh['Venue Number'].max()].sort_values('Restaurant Number',ascending=True)
topneigh

Neighbourhood,Venue Number,Restaurant Number
"Garden District, Ryerson",100,1
"Harbourfront East, Union Station, Toronto Islands",100,3
"Toronto Dominion Centre, Design Exchange",100,4
"First Canadian Place, Underground city",100,4
"Commerce Court, Victoria Hotel",100,7
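
Two details of the cells above affect this ranking. The filter in In [94] keeps only venues whose category is exactly `'Restaurant'`, so a more specific label (for example a hypothetical `'Italian Restaurant'`) would not be counted. The `pd.merge` in In [95] uses the default inner join, so a neighbourhood with no exact `'Restaurant'` rows drops out of `bestneigh` entirely. In pandas, one option would be `how='left'` followed by `fillna(0)`. The library-free sketch below illustrates the same idea on toy data. `rank_neighbourhoods` and the neighbourhood names "A", "B" and "C" are hypothetical, and substring matching is an assumption about how restaurants should be counted, not the method used above.

```python
from collections import Counter


def rank_neighbourhoods(venues):
    """Rank neighbourhoods by most venues, then by fewest restaurants.

    venues: iterable of (neighbourhood, category) pairs.
    Returns a list of (neighbourhood, venue_count, restaurant_count).
    """
    totals = Counter()
    restaurants = Counter()
    for neighbourhood, category in venues:
        totals[neighbourhood] += 1
        # Assumption: any category containing "restaurant" counts as one
        if "restaurant" in category.lower():
            restaurants[neighbourhood] += 1
    # A Counter returns 0 for missing keys, so neighbourhoods without
    # restaurants are kept with a count of 0 instead of being dropped
    ranked = [(n, totals[n], restaurants[n]) for n in totals]
    ranked.sort(key=lambda r: (-r[1], r[2], r[0]))
    return ranked


# Toy data for illustration only (not taken from the Foursquare results)
toy_venues = [
    ("A", "Coffee Shop"), ("A", "Italian Restaurant"), ("A", "Restaurant"),
    ("B", "Coffee Shop"), ("B", "Plaza"), ("B", "Gym"),
    ("C", "Restaurant"),
]

ranking = rank_neighbourhoods(toy_venues)
print(ranking)  # [('B', 3, 0), ('A', 3, 2), ('C', 1, 1)]
```

In the toy data, "B" ranks first because it ties "A" on venue count and has no restaurants. With an inner join on restaurant counts, "B" would not appear in the result at all.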


# Discussion

### Limitation
The Foursquare explore API returns at most 100 venues per request, so the neighbourhoods tied at 100 may actually contain more venues, and this analysis cannot rank them by their true venue totals.
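
One simple way to make this limitation visible in the results is to flag every neighbourhood whose count reached the cap, since those counts are lower bounds, not exact totals. The sketch below assumes a cap of 100, taken from the limitation above; the `LIMIT` actually used is set in the hidden cell In [91]. `flag_capped` is a hypothetical helper, and "Example Neighbourhood" with 42 venues is made up for contrast.

```python
def flag_capped(venue_counts, limit=100):
    """Return the neighbourhoods whose venue count reached the request limit."""
    return sorted(n for n, count in venue_counts.items() if count >= limit)


# Two counts taken from the results above, plus one hypothetical entry
venue_counts = {
    "Garden District, Ryerson": 100,
    "Commerce Court, Victoria Hotel": 100,
    "Example Neighbourhood": 42,  # hypothetical, for illustration
}

capped = flag_capped(venue_counts)
print(capped)
```

Any neighbourhood in `capped` should be read as having "at least 100" venues within the search radius.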

# Conclusion

In [96]:
print('The Neighbourhood {} is the best one because it has the most venues and least restaurants'.format(str(topneigh.index[0])))

The Neighbourhood Garden District, Ryerson is the best one because it has the most venues and least restaurants


# Presentation

In [97]:
nearres = toronto_venues[toronto_venues['Neighbourhood'] == 'Garden District, Ryerson']
nearres

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
80,"Garden District, Ryerson",43.657162,-79.378937,UNIQLO ユニクロ,43.655910,-79.380641,Clothing Store
81,"Garden District, Ryerson",43.657162,-79.378937,Page One Cafe,43.657772,-79.376073,Café
82,"Garden District, Ryerson",43.657162,-79.378937,Silver Snail Comics,43.657031,-79.381403,Comic Shop
83,"Garden District, Ryerson",43.657162,-79.378937,Yonge-Dundas Square,43.656054,-79.380495,Plaza
84,"Garden District, Ryerson",43.657162,-79.378937,Burrito Boyz,43.656265,-79.378343,Burrito Place
...,...,...,...,...,...,...,...
175,"Garden District, Ryerson",43.657162,-79.378937,Shoppers Drug Mart,43.658475,-79.384868,Pharmacy
176,"Garden District, Ryerson",43.657162,-79.378937,Abercrombie & Fitch,43.652915,-79.380495,Clothing Store
177,"Garden District, Ryerson",43.657162,-79.378937,GoodLife Fitness Toronto Bell Trinity Centre,43.653436,-79.382314,Gym
178,"Garden District, Ryerson",43.657162,-79.378937,Good Earth Coffeehouse,43.656850,-79.374719,Coffee Shop


In [98]:
# Visualize the venues around the best neighbourhood

latitude = nearres.iloc[0,1]
longitude = nearres.iloc[0,2]

venues_map = folium.Map(location=[latitude, longitude], zoom_start=15) # generate map centred around Garden District, Ryerson
# add the neighbourhood centre as a red circle with a 500 m radius
folium.Circle(
    [latitude, longitude],
    radius=500,
    popup='Neighbourhood: Garden District, Ryerson',
    fill=True,
    color='red',
    fill_color='red',
    fill_opacity=0.2
    ).add_to(venues_map)

# add popular spots to the map as blue circle markers
for lat, lng, label in zip(nearres['Venue Latitude'], nearres['Venue Longitude'], nearres['Venue Category']):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        fill=True,
        color='blue',
        fill_color='blue',
        fill_opacity=0.6
        ).add_to(venues_map)
# display map
venues_map
