# Introduction/Business Problem

#### The problem we will solve is: which neighbourhoods of Toronto are most suitable for opening a restaurant?

#### To solve this problem we will use the dataframe from the previous practical work. The dataframe is loaded as follows:

In [1]:
import pandas as pd 
import numpy as np
import requests
from pandas import json_normalize  # available at the top level since pandas 1.0

csv_path='/resources/data/Toronto_neighbourhoods_2.csv'
df=pd.read_csv(csv_path, sep=",", encoding='cp1252')
df.drop(['Unnamed: 0'], axis=1,inplace=True)

#### The dataframe df contains all Toronto postcodes and boroughs, as well as the Toronto neighbourhoods grouped by postcode. The last two columns contain the latitude and longitude of each neighbourhood group.

In [2]:
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476


#### The dataframe df contains 103 rows and 5 columns

In [3]:
df.shape

(103, 5)

## Define Foursquare Credentials and Version

In [4]:
CLIENT_ID = 'OTQDBGJPMXHTIMHHNO5OPVE1VJBEOZ3NHSWWMRMUB1N0MW5H' # your Foursquare ID
CLIENT_SECRET = 'QHV23BYJTV2BATLWELNUMSXG1VTPU4UV2MAYW4MIZMP2UN55' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT=30

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: OTQDBGJPMXHTIMHHNO5OPVE1VJBEOZ3NHSWWMRMUB1N0MW5H
CLIENT_SECRET:QHV23BYJTV2BATLWELNUMSXG1VTPU4UV2MAYW4MIZMP2UN55
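
Hard-coding the keys works for a demonstration, but a shared notebook then exposes them. A minimal sketch of one alternative is to read them from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_credentials` are assumptions made for this illustration, not part of the original notebook or of the Foursquare API.

```python
import os


def load_credentials(env=None):
    """Return (client_id, client_secret) read from a mapping of environment variables."""
    env = os.environ if env is None else env
    # Hypothetical variable names, chosen only for this sketch
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Foursquare credentials are not set')
    return client_id, client_secret


# Demonstration with an explicit mapping instead of the real environment
demo_env = {'FOURSQUARE_CLIENT_ID': 'my-id', 'FOURSQUARE_CLIENT_SECRET': 'my-secret'}
CLIENT_ID_DEMO, CLIENT_SECRET_DEMO = load_credentials(demo_env)
print('CLIENT_ID: ' + CLIENT_ID_DEMO)
```

In the notebook itself, `CLIENT_ID, CLIENT_SECRET = load_credentials()` would then replace the two literal assignments, provided the variables are set before the kernel starts.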


#### For each group of neighbourhoods in df we are interested in the number of restaurants within 1000 meters of the geographical position of that group (given by the last two columns of the corresponding row of df). So we first set radius = 1000 and search_query = 'restaurant'.

In [5]:
radius=1000
search_query='restaurant'

## Consider an example of the Foursquare query that we will use below. For this example we take the latitude and longitude from the third row of the dataframe df (i.e. for the postal code M1E).

In [6]:
lat = 43.763573
lng = -79.188711

#### Define the corresponding URL

In [7]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, lat, lng, VERSION, search_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=OTQDBGJPMXHTIMHHNO5OPVE1VJBEOZ3NHSWWMRMUB1N0MW5H&client_secret=QHV23BYJTV2BATLWELNUMSXG1VTPU4UV2MAYW4MIZMP2UN55&ll=43.763573,-79.188711&v=20180605&query=restaurant&radius=1000&limit=30'
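
The URL above is assembled by string formatting, which is fine for a one-word query such as 'restaurant' but would not escape spaces or special characters. As a sketch, the same URL can be built with the standard library's `urlencode`. The function name `build_search_url` and the placeholder credentials are assumptions for illustration only.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.foursquare.com/v2/venues/search'


def build_search_url(client_id, client_secret, lat, lng, version, query, radius, limit):
    """Build the venues/search URL with properly escaped query parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(lat, lng),
        'v': version,
        'query': query,
        'radius': radius,
        'limit': limit,
    }
    # safe=',' leaves the comma between latitude and longitude unescaped
    return BASE_URL + '?' + urlencode(params, safe=',')


# Placeholder credentials; the other values are the ones used in the notebook
example_url = build_search_url('ID', 'SECRET', 43.763573, -79.188711,
                               '20180605', 'restaurant', 1000, 30)
print(example_url)
```

With these inputs the parameters appear in the same order as in the formatted string above, so the result differs from the notebook's URL only in the credentials.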

#### Send the GET Request and examine the results

In [8]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5bf5a97d351e3d1640b76b93'},
 'response': {'venues': [{'id': '4b7054e3f964a5204d132de3',
    'name': 'Wonder Season Chinese Restaurant',
    'location': {'address': '4379 Kingston Road',
     'lat': 43.76535407,
     'lng': -79.19053556,
     'labeledLatLngs': [{'label': 'display',
       'lat': 43.76535407,
       'lng': -79.19053556}],
     'distance': 246,
     'cc': 'CA',
     'city': 'Toronto',
     'state': 'ON',
     'country': 'Canada',
     'formattedAddress': ['4379 Kingston Road', 'Toronto ON', 'Canada']},
    'categories': [],
    'referralId': 'v-1542826365',
    'hasPerk': False},
   {'id': '4ea863ad5c5cc8e499272572',
    'name': 'Mahar Restaurant',
    'location': {'address': 'Gerrard Street',
     'lat': 43.76934,
     'lng': -79.18818,
     'labeledLatLngs': [{'label': 'display',
       'lat': 43.76934,
       'lng': -79.18818}],
     'distance': 643,
     'cc': 'CA',
     'city': 'Toronto',
     'state': 'ON',
     'country': 'Canad
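
The response carries a status in `meta.code` (200 in the output above) next to the venue list in `response.venues`. A minimal sketch of checking that status before using the venues follows. The helper name `extract_venues` is an assumption, and the sample dictionary is a hand-shortened imitation of the output above, not a real response.

```python
def extract_venues(results):
    """Return the venue list from a parsed response, or raise if the request failed."""
    code = results.get('meta', {}).get('code')
    if code != 200:
        raise ValueError('Foursquare request failed with code {}'.format(code))
    return results.get('response', {}).get('venues', [])


# Hand-shortened sample with the same shape as the output above (illustrative only)
sample = {
    'meta': {'code': 200},
    'response': {'venues': [
        {'name': 'Wonder Season Chinese Restaurant', 'location': {'distance': 246}},
        {'name': 'Mahar Restaurant', 'location': {'distance': 643}},
    ]},
}

for venue in extract_venues(sample):
    print(venue['name'], venue['location']['distance'])
```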

#### Get the relevant part of the JSON and transform it into a *pandas* dataframe

In [9]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

Unnamed: 0,categories,hasPerk,id,location.address,location.cc,location.city,location.country,location.crossStreet,location.distance,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.postalCode,location.state,name,referralId
0,[],False,4b7054e3f964a5204d132de3,4379 Kingston Road,CA,Toronto,Canada,,246,"[4379 Kingston Road, Toronto ON, Canada]","[{'label': 'display', 'lat': 43.76535407, 'lng...",43.765354,-79.190536,,ON,Wonder Season Chinese Restaurant,v-1542826365
1,[],False,4ea863ad5c5cc8e499272572,Gerrard Street,CA,Toronto,Canada,,643,"[Gerrard Street, Toronto ON, Canada]","[{'label': 'display', 'lat': 43.76934, 'lng': ...",43.76934,-79.18818,,ON,Mahar Restaurant,v-1542826365
2,"[{'id': '4bf58dd8d48988d16e941735', 'name': 'F...",False,4b9023e9f964a5200e7833e3,4434 Kingston Rd,CA,Scarborough,Canada,Lawrence Ave E,532,"[4434 Kingston Rd (Lawrence Ave E), Scarboroug...","[{'label': 'display', 'lat': 43.76834717796655...",43.768347,-79.188368,M1E,ON,McDonald's,v-1542826365
3,"[{'id': '4bf58dd8d48988d145941735', 'name': 'C...",False,4ccc6b0dba0a5481623f3d59,4190 Kingston Road,CA,Toronto,Canada,,883,"[4190 Kingston Road, Toronto ON, Canada]","[{'label': 'display', 'lat': 43.759544, 'lng':...",43.759544,-79.19818,,ON,Tai Chi Restaurant,v-1542826365
4,"[{'id': '4bf58dd8d48988d145941735', 'name': 'C...",False,4c85aa5bee6fef3b1d1d3e5c,4532 Kingston Road,CA,Scarborough,Canada,Morningside Ave.,957,"[4532 Kingston Road (Morningside Ave.), Scarbo...","[{'label': 'display', 'lat': 43.77194591410752...",43.771946,-79.185976,M1E 2N8,ON,Peking Garden Restaurant,v-1542826365
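
The `location.distance` column appears to be the distance in meters from the query point to the venue, which is what the radius of 1000 is compared against. As a rough check, the sketch below computes the great-circle (haversine) distance from the M1E coordinates to the first venue in the table. The spherical Earth radius of 6,371 km is an assumption of this sketch, and the source does not say how Foursquare itself computes the value.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters; an assumption of this sketch


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (latitude, longitude) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# Query point for M1E and the first venue listed above
d = haversine_m(43.763573, -79.188711, 43.76535407, -79.19053556)
print(round(d))  # close to the 246 reported in location.distance
```

The same calculation for the second row gives a value close to its reported 643, which supports reading the column as meters from the query point.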


## Get the data about the number of restaurants we are interested in

#### First add to df a column 'sum', which will hold the number of restaurants within 1000 meters of each neighbourhood group

In [10]:
df['sum']=""

#### Next, for each row in df we take the latitude and longitude and build the Foursquare API query for them, with radius and search_query as defined above. The resulting url is requested and the response is parsed as JSON (by the command results = requests.get(url).json()).
#### We then store in the corresponding row of the column 'sum' the length of results['response']['venues'], which is the number of restaurants returned within 1000 meters of that point. Because the query is sent with limit=LIMIT, this count can never exceed 30.

In [11]:
for k, row in df.iterrows():
    lat, lng = row["Latitude"], row["Longitude"]
    url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, lat, lng, VERSION, search_query, radius, LIMIT)
    results = requests.get(url).json()
    # df.at replaces set_value, which was deprecated and then removed in pandas 1.0
    df.at[k, 'sum'] = len(results['response']['venues'])
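
The loop above assumes every request succeeds and returns a 'venues' list. A more defensive sketch of the same counting step is given below. It takes the request as an injected function so that it runs without network access. The names `count_venues` and `fake_fetch` and the 'capped' flag are assumptions added for illustration and are not in the original notebook.

```python
def count_venues(points, fetch, limit=30):
    """Count returned venues for each (lat, lng) point.

    fetch is any callable taking (lat, lng) and returning the parsed JSON
    response; passing it in keeps this sketch runnable without network access.
    """
    rows = []
    for lat, lng in points:
        results = fetch(lat, lng)
        # Fall back to an empty list if the response has no venues
        venues = results.get('response', {}).get('venues', [])
        n = len(venues)
        # A count equal to the limit may mean that more venues exist than were returned
        rows.append({'lat': lat, 'lng': lng, 'count': n, 'capped': n >= limit})
    return rows


def fake_fetch(lat, lng):
    """Stand-in for the real request; returns canned data (illustrative only)."""
    return {'response': {'venues': [{'name': 'A'}, {'name': 'B'}]}}


rows = count_venues([(43.763573, -79.188711)], fake_fetch, limit=30)
print(rows)
```

In the notebook, `fetch` would wrap the existing `requests.get(url).json()` call, and the resulting counts would be written to the 'sum' column as before.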


#### Here are the first 10 rows of the final dataframe

In [12]:
df.head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,sum
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353,0
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497,1
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,5
3,M1G,Scarborough,Woburn,43.770992,-79.216917,3
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476,8
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476,7
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",43.727929,-79.262029,5
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577,3
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",43.716316,-79.239476,1
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848,4


#### So df is the dataframe that we will use in the next step for solving the business problem