# Restaurant Suggestion in New Delhi

## Table of Contents
<div class="alert alert-block alert-info" >

<font size = 3>

1. <a href="#item1">Introduction/Business Problem</a>

2. <a href="#item2">Data</a>

3. <a href="#item3">Download and Explore Dataset</a>

4. <a href="#item4">Data Processing</a>

5. <a href="#item5">Methodology</a>    

6. <a href="#item6">Results and Discussions</a> 
7. <a href="#item7">Conclusion</a> 
</font>
</div>

##  <a id="item1"></a>Introduction/Business Problem

The aim of this project is to identify best restaurants in <b>New Delhi,India</b> based on their average prices and ratings.We will use <b>FourSquare API</b> and <b>Zomato data </b> of various restaurants present in Delhi, for the analysis.
This project will help various visitors to look up for best restaurant in their neighborhood (in Delhi), based on users average rating.<br>
The idea behind this project is that whenever a tourist enters a new city or area he/she starts looking for best tourist places and secondly famous restaurants or cafe's to enjoy a good meal.<br>
Here I am using <b>Foursquare location data</b> to look up for famous venues in Delhi and <b> Zomato data</b> to look up for the average ratings and average price of an restaurant.


## <a id="item2"></a> Data

For this project apart from location data we also need average ratings and average prices of restaurants present in New Delhi.
Thus for the location data we will be dealing with <b>Foursquare API</b> and for the restaurants data, I just simply searched it on Kaggel and found <b>Zomato data</b> consisting of various restaurants not just from Delhi but all over the world, further we will filter it to extract the required data.<br>
<b>Foursquare API:</b> We will collect the information of various venues in New Delhi within a radius of <b>25 km</b> and futher we wiil filter it to keep only restaurants inforamtion.<br>
<b>Zomato data:</b> This data downloaded from Kaggle will help to get the average ratings and prices for each restaurant present in city, which is also in common with restaurants got from Foursquare API.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


## <a id="item3"></a> Download and Explore Dataset
we will be using Foursquare Api location to download location data and a datasheet from kaggle of various restaurants present in New Delhi.

In [593]:
Client_id='id'
Client_secret='client__'
VERSION = '20180605'


Using geocode Nominatim to get the coordinates of New Delhi

In [594]:
geolocator=Nominatim()
location=geolocator.geocode('New Delhi')
lat=location.latitude
lon=location.longitude

  """Entry point for launching an IPython kernel.


In [595]:
radius=25000
venues=220
limit=220

After looking at json file got from foursquare API I found out that categories row is further enclosed in a dict so this function will take out the categories of each Venue

In [652]:
def get_category(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [653]:
pd.set_option('display.max_rows', None)

offset = 0
total_venues = 0
df_restaurant = pd.DataFrame(columns = ['name', 'categories', 'lat', 'lng'])

while True:
    url=('https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&offset={}').format(Client_id,Client_secret,VERSION,location.latitude,location.longitude,radius,limit,offset)
    result = requests.get(url).json()
    data=result['response']['groups'][0]['items']
    venues_fetched = len(data)
    total_venues = total_venues + venues_fetched
    print("Total {} venues fetched within a total radius of {} Km".format(venues_fetched, radius/1000))
    if venues_fetched==0:
        break

    
    venues = json_normalize(data)

    # Filter the columns
    columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
    venues = venues.loc[:, columns]

    # Filtering the category further for each row
    venues['venue.categories'] = venues.apply(get_category, axis = 1)

    # Clean all column names
    new_names=[]
    for column_ in venues.columns:
        new_names.append(column_.split(".")[-1])
    
    venues.columns=new_names
    # add the data to previous dataframe
    df_restaurant = pd.concat([df_restaurant, venues], axis = 0, sort = False)
    
    if (total_venues > 700):
        break
    else:
        offset = offset + 10
    #reseting the index of dataframe
    df_restaurant= df_restaurant.reset_index(drop = True)

print("Total {} venues were fetched from foursquare API location".format(total_venues))

Total 100 venues fetched within a total radius of 25.0 Km
Total 100 venues fetched within a total radius of 25.0 Km
Total 93 venues fetched within a total radius of 25.0 Km
Total 83 venues fetched within a total radius of 25.0 Km
Total 73 venues fetched within a total radius of 25.0 Km
Total 63 venues fetched within a total radius of 25.0 Km
Total 53 venues fetched within a total radius of 25.0 Km
Total 43 venues fetched within a total radius of 25.0 Km
Total 33 venues fetched within a total radius of 25.0 Km
Total 23 venues fetched within a total radius of 25.0 Km
Total 13 venues fetched within a total radius of 25.0 Km
Total 3 venues fetched within a total radius of 25.0 Km
Total 0 venues fetched within a total radius of 25.0 Km
Total 680 venues were fetched from foursquare API location


In [654]:
df_restaurant.head()

Unnamed: 0,name,categories,lat,lng
0,The Imperial,Hotel,28.625548,77.218664
1,Tamra,Restaurant,28.620543,77.218174
2,Pandey Paan,Smoke Shop,28.622249,77.201075
3,Amour Bistro,Café,28.601569,77.185923
4,The Big Chill Cafe,Italian Restaurant,28.600686,77.227636


In [609]:
dd=pd.read_csv('zomato.csv',encoding='latin-1')
dd.head()

Unnamed: 0,Restaurant ID,Restaurant Name,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,Average Cost for two,Currency,Has Table booking,Has Online delivery,Is delivering now,Switch to order menu,Price range,Aggregate rating,Rating color,Rating text,Votes
0,6317637,Le Petit Souffle,162,Makati City,"Third Floor, Century City Mall, Kalayaan Avenu...","Century City Mall, Poblacion, Makati City","Century City Mall, Poblacion, Makati City, Mak...",121.027535,14.565443,"French, Japanese, Desserts",1100,Botswana Pula(P),Yes,No,No,No,3,4.8,Dark Green,Excellent,314
1,6304287,Izakaya Kikufuji,162,Makati City,"Little Tokyo, 2277 Chino Roces Avenue, Legaspi...","Little Tokyo, Legaspi Village, Makati City","Little Tokyo, Legaspi Village, Makati City, Ma...",121.014101,14.553708,Japanese,1200,Botswana Pula(P),Yes,No,No,No,3,4.5,Dark Green,Excellent,591
2,6300002,Heat - Edsa Shangri-La,162,Mandaluyong City,"Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal...","Edsa Shangri-La, Ortigas, Mandaluyong City","Edsa Shangri-La, Ortigas, Mandaluyong City, Ma...",121.056831,14.581404,"Seafood, Asian, Filipino, Indian",4000,Botswana Pula(P),Yes,No,No,No,4,4.4,Green,Very Good,270
3,6318506,Ooma,162,Mandaluyong City,"Third Floor, Mega Fashion Hall, SM Megamall, O...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.056475,14.585318,"Japanese, Sushi",1500,Botswana Pula(P),No,No,No,No,4,4.9,Dark Green,Excellent,365
4,6314302,Sambo Kojin,162,Mandaluyong City,"Third Floor, Mega Atrium, SM Megamall, Ortigas...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.057508,14.58445,"Japanese, Korean",1500,Botswana Pula(P),Yes,No,No,No,4,4.8,Dark Green,Excellent,229


## <a id="item4"></a>Data Processing
Now after downloading and extracting data we need to clean it and process it to make it ready for modelling before passing it to Kmeans Cluster algorathim.
<br>
At first we will just clean Zomato Dataset, Hence we will get the restaurants and cafe's only present inNew Delhi.
<br>
Later parallely we will clean up Locaton Data


In [611]:
df_delhi=dd[dd['City']=='New Delhi']
df_delhi.head()

Unnamed: 0,Restaurant ID,Restaurant Name,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,Average Cost for two,Currency,Has Table booking,Has Online delivery,Is delivering now,Switch to order menu,Price range,Aggregate rating,Rating color,Rating text,Votes
2560,18287358,Food Cloud,1,New Delhi,"Aaya Nagar, New Delhi",Aaya Nagar,"Aaya Nagar, New Delhi",0.0,0.0,Cuisine Varies,500,Indian Rupees(Rs.),No,No,No,No,2,0.0,White,Not rated,2
2561,18216944,Burger.in,1,New Delhi,"84, Near Honda Showroom, Adchini, New Delhi",Adchini,"Adchini, New Delhi",77.196923,28.535382,Fast Food,350,Indian Rupees(Rs.),No,Yes,No,No,1,3.2,Orange,Average,46
2562,313333,Days of the Raj,1,New Delhi,"81/3, 1st Floor, Qutub Residency, Adchini, New...",Adchini,"Adchini, New Delhi",77.197475,28.535493,"North Indian, Seafood, Continental",1500,Indian Rupees(Rs.),Yes,Yes,No,No,3,3.4,Orange,Average,45
2563,18384127,Dilli Ka Dhaba,1,New Delhi,"66 A, Ground Floor, Sri Aurobindo Marg, Adchin...",Adchini,"Adchini, New Delhi",77.198033,28.537547,"South Indian, North Indian",500,Indian Rupees(Rs.),No,No,No,No,2,2.6,Orange,Average,11
2564,582,Govardhan,1,New Delhi,"84, Adjacent Hero Motor Bike Showroom, Main Me...",Adchini,"Adchini, New Delhi",77.196924,28.535523,"South Indian, North Indian, Chinese",500,Indian Rupees(Rs.),No,Yes,No,No,2,3.4,Orange,Average,238


In [617]:
df_restaurant.sort_values(by='name',inplace=True)
df_restaurant.head()

Unnamed: 0,name,categories,lat,lng
397,Aloft New Delhi Aerocity,Hotel,28.552446,77.123437
324,Aloft New Delhi Aerocity,Hotel,28.552446,77.123437
151,Aloft New Delhi Aerocity,Hotel,28.552446,77.123437
241,Aloft New Delhi Aerocity,Hotel,28.552446,77.123437
513,Aloft New Delhi Aerocity,Hotel,28.552446,77.123437


In [618]:
df_delhi.sort_values(by='Restaurant Name',inplace=True)

df_delhi.reset_index(inplace=True)
df_delhi.head()

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


Unnamed: 0,index,Restaurant ID,Restaurant Name,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,Average Cost for two,Currency,Has Table booking,Has Online delivery,Is delivering now,Switch to order menu,Price range,Aggregate rating,Rating color,Rating text,Votes
0,2613,18311951,#InstaFreeze,1,New Delhi,"B-17, Alaknanda Shopping Complex, Alaknanda, N...",Alaknanda,"Alaknanda, New Delhi",77.253694,28.52542,Ice Cream,300,Indian Rupees(Rs.),No,No,No,No,1,0.0,White,Not rated,2
1,6998,18336489,#OFF Campus,1,New Delhi,"284, Opposite Sri Venkateshwara College, Satya...",Satyaniketan,"Satyaniketan, New Delhi",77.168737,28.588521,"Cafe, Continental, Italian, Fast Food",800,Indian Rupees(Rs.),Yes,Yes,No,No,2,3.7,Yellow,Good,216
2,5379,18489842,#hashtag,1,New Delhi,"1092/1, Mehrauli Bus Stand, Mehrauli, New Delhi",Mehrauli,"Mehrauli, New Delhi",77.181865,28.522205,Cafe,500,Indian Rupees(Rs.),No,No,No,No,2,0.0,White,Not rated,0
3,7392,18430898,13 Cafe,1,New Delhi,"4/175, Subhash Nagar, New Delhi",Subhash Nagar,"Subhash Nagar, New Delhi",77.112695,28.637299,Cafe,500,Indian Rupees(Rs.),No,No,No,No,2,0.0,White,Not rated,0
4,3488,18361752,17 Degree Food Service,1,New Delhi,"Shop 41/1, Hari Complex 304, Garhi, East of Ka...",East of Kailash,"East of Kailash, New Delhi",0.0,0.0,North Indian,260,Indian Rupees(Rs.),No,No,No,No,1,0.0,White,Not rated,1


In [623]:
df_restaurant.drop_duplicates('name',inplace=True)
df_restaurant.shape

(106, 4)

#### List of common Restuarants
After removing duplicates we are left with only 106 rows.Now we will create a list of restaurants and cafe's which are present in both of oyr dataset.So that we have all the necessary information for modelling

In [624]:
k=df_restaurant['name']
l=df_delhi['Restaurant Name']
count=0
res=[]
for i in k:
    for j in l:
        if i==j:
            res.append(j)
            print(i)
            count+=1


BTW
BTW
BTW
BTW
BTW
BTW
BTW
Big Chill
Big Chill
Big Chill
Big Chill
Big Yellow Door
Big Yellow Door
Big Yellow Door
Biryani Blues
Biryani Blues
Biryani Blues
Biryani Blues
Biryani Blues
Blue Tokai Coffee Roasters
Blue Tokai Coffee Roasters
Cafe Delhi Heights
Diggin
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza
Domino's Pizza

In [625]:
df_common=pd.DataFrame()
df_common

In [629]:
df_delhi.set_index('Restaurant Name',inplace=True)
df_common=df_delhi.loc[res]


df_common.sort_values(by='Votes',inplace=True,ascending=False)
df_common.head(8)

Unnamed: 0_level_0,index,Restaurant ID,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,Average Cost for two,Currency,Has Table booking,Has Online delivery,Is delivering now,Switch to order menu,Price range,Aggregate rating,Rating color,Rating text,Votes
Restaurant Name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
Big Chill,4638,1614,1,New Delhi,"68-A, Khan Market, New Delhi",Khan Market,"Khan Market, New Delhi",77.227447,28.600624,"Italian, Continental, European, Cafe",1500,Indian Rupees(Rs.),No,No,No,No,3,4.5,Dark Green,Excellent,4986
Big Chill,4638,1614,1,New Delhi,"68-A, Khan Market, New Delhi",Khan Market,"Khan Market, New Delhi",77.227447,28.600624,"Italian, Continental, European, Cafe",1500,Indian Rupees(Rs.),No,No,No,No,3,4.5,Dark Green,Excellent,4986
Big Chill,4638,1614,1,New Delhi,"68-A, Khan Market, New Delhi",Khan Market,"Khan Market, New Delhi",77.227447,28.600624,"Italian, Continental, European, Cafe",1500,Indian Rupees(Rs.),No,No,No,No,3,4.5,Dark Green,Excellent,4986
Big Chill,4638,1614,1,New Delhi,"68-A, Khan Market, New Delhi",Khan Market,"Khan Market, New Delhi",77.227447,28.600624,"Italian, Continental, European, Cafe",1500,Indian Rupees(Rs.),No,No,No,No,3,4.5,Dark Green,Excellent,4986
Big Yellow Door,7863,301700,1,New Delhi,"H-8 B, Near GTB Nagar Metro Station, Opposite ...",Vijay Nagar,"Vijay Nagar, New Delhi",77.204991,28.693444,"Cafe, Italian, Fast Food",600,Indian Rupees(Rs.),No,No,No,No,2,4.3,Green,Very Good,3986
Big Yellow Door,7863,301700,1,New Delhi,"H-8 B, Near GTB Nagar Metro Station, Opposite ...",Vijay Nagar,"Vijay Nagar, New Delhi",77.204991,28.693444,"Cafe, Italian, Fast Food",600,Indian Rupees(Rs.),No,No,No,No,2,4.3,Green,Very Good,3986
Big Yellow Door,7863,301700,1,New Delhi,"H-8 B, Near GTB Nagar Metro Station, Opposite ...",Vijay Nagar,"Vijay Nagar, New Delhi",77.204991,28.693444,"Cafe, Italian, Fast Food",600,Indian Rupees(Rs.),No,No,No,No,2,4.3,Green,Very Good,3986
Big Yellow Door,7033,306503,1,New Delhi,"H-8, Opposite Venkateswara College, Satyaniket...",Satyaniketan,"Satyaniketan, New Delhi",77.167524,28.587912,"Cafe, Fast Food, Italian",600,Indian Rupees(Rs.),No,No,No,No,2,4.2,Green,Very Good,3311


In [630]:
df_common.reset_index(inplace=True)
dfff=df_common.drop_duplicates(subset='Restaurant Name')
dfff.sort_values(by='Restaurant Name',inplace=True)

dfff.set_index('Restaurant Name',inplace=True)
dfff

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  This is separate from the ipykernel package so we can avoid doing imports until


Unnamed: 0_level_0,index,Restaurant ID,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,Average Cost for two,Currency,Has Table booking,Has Online delivery,Is delivering now,Switch to order menu,Price range,Aggregate rating,Rating color,Rating text,Votes
Restaurant Name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
BTW,5875,3372,1,New Delhi,"G-46, Aggarwal Millenium Tower, Netaji Subhash...",Netaji Subhash Place,"Netaji Subhash Place, New Delhi",77.149909,28.693716,Street Food,200,Indian Rupees(Rs.),No,No,No,No,1,3.7,Yellow,Good,459
Big Chill,4638,1614,1,New Delhi,"68-A, Khan Market, New Delhi",Khan Market,"Khan Market, New Delhi",77.227447,28.600624,"Italian, Continental, European, Cafe",1500,Indian Rupees(Rs.),No,No,No,No,3,4.5,Dark Green,Excellent,4986
Big Yellow Door,7863,301700,1,New Delhi,"H-8 B, Near GTB Nagar Metro Station, Opposite ...",Vijay Nagar,"Vijay Nagar, New Delhi",77.204991,28.693444,"Cafe, Italian, Fast Food",600,Indian Rupees(Rs.),No,No,No,No,2,4.3,Green,Very Good,3986
Biryani Blues,3097,18216915,1,New Delhi,"Showroom 9, Scindia House, Connaught Circus, C...",Connaught Place,"Connaught Place, New Delhi",77.220531,28.629983,"Biryani, Hyderabadi",1000,Indian Rupees(Rs.),No,Yes,No,No,3,4.0,Green,Very Good,510
Blue Tokai Coffee Roasters,6919,18217023,1,New Delhi,"Khasra 258, Lane 3, Westend Marg, Saidulajab, ...",Saket,"Saket, New Delhi",77.200089,28.517303,Cafe,350,Indian Rupees(Rs.),No,Yes,No,No,1,4.4,Green,Very Good,269
Cafe Delhi Heights,6925,18126111,1,New Delhi,"Shop 1-2, Ground Floor, Sangam Courtyard, R K ...","Sangam Courtyard, RK Puram","Sangam Courtyard, RK Puram, New Delhi",77.1735,28.571681,"Continental, American, Italian, Seafood, North...",2000,Indian Rupees(Rs.),Yes,No,No,No,4,4.0,Green,Very Good,304
Diggin,2635,307113,1,New Delhi,"Anand Lok Shopping Centre, Opposite Gargi Coll...",Anand Lok,"Anand Lok, New Delhi",77.219498,28.555635,"Italian, Continental, Cafe",1400,Indian Rupees(Rs.),Yes,Yes,No,No,3,4.2,Green,Very Good,2131
Domino's Pizza,3031,143,1,New Delhi,"M-42, Connaught Place, New Delhi",Connaught Place,"Connaught Place, New Delhi",77.222896,28.633231,"Pizza, Fast Food",700,Indian Rupees(Rs.),No,No,No,No,2,3.7,Yellow,Good,336
Hawkers,7707,3072,1,New Delhi,"B-1, Vasant Kunj, New Delhi",Vasant Kunj,"Vasant Kunj, New Delhi",77.157316,28.523209,Chinese,600,Indian Rupees(Rs.),No,Yes,No,No,2,3.4,Orange,Average,398
Imperfecto,3986,301442,1,New Delhi,"1-A/1, Hauz Khas Village, New Delhi",Hauz Khas Village,"Hauz Khas Village, New Delhi",77.195143,28.554686,"Mediterranean, Italian, Continental, Spanish, ...",1800,Indian Rupees(Rs.),Yes,No,No,No,3,3.7,Yellow,Good,2247


In [631]:

df_restaurant.set_index('name',inplace=True)

Names were used as index to make it easy to extract only the restaurants which are available in res list, later we will again change the index to drop duplicate values.

In [633]:
d_Final=df_restaurant.loc[res]
d_Final.reset_index(inplace=True)
d_Final.drop_duplicates('name',inplace=True)
d_Final.set_index('name',inplace=True)
d_Final


Unnamed: 0_level_0,categories,lat,lng
name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
BTW,Indian Restaurant,28.541246,77.296934
Big Chill,Italian Restaurant,28.542758,77.156446
Big Yellow Door,Café,28.693245,77.204948
Biryani Blues,Indian Restaurant,28.46228,77.087233
Blue Tokai Coffee Roasters,Coffee Shop,28.517214,77.200021
Cafe Delhi Heights,Café,28.468201,77.083786
Diggin,Café,28.555665,77.21859
Domino's Pizza,Pizza Place,28.43,77.297
Hawkers,Chinese Restaurant,28.523172,77.157362
Imperfecto,Mediterranean Restaurant,28.554657,77.195092


## <a id="item5"></a> Methodology

This project aims at identifying the venues in Delhi based on their average rating and average costs. This would enable any visitor to identify the venues he/she wants to visit based on their rating and cost preference.
<br><br>
As a first step, we retrieved the <b>data from Foursquare API</b> and <b> datasheet of Zomato</b> from Kaggle. We extract venue information from the center of New Delhi, upto a distance of 25 Km. The latitude and longitude values are gathered by using geocode and then fetched to <b>foursquare API</b>.
<br>
Later we just removed the duplicates from dataframe of foursquare API and our data was shrinked to just 106 rows from 600+ rows.<br><br>
Further we just run a for loop to create a list of common restaurants in both dataset, and <b>extracted </b>them from our dataframes.
<br><br>
Next, we'll <b>analyse the data</b> that we created based on the ratings and price of each venue. We'll **identify places where many venues are located** so that any visitor can go to one place and enjoy the option to choose amongst many venue options. We'll also explore **areas that are high rated and those that are low rated**  Lastly, we'll **cluster the venues** based on the available information of each venue. This will allow us to clearly identify which venues can be recommended and with what characteristics.<br>

Finally, we'll discuss and conclude which venues to be explored based on visitor requirement of rating and cost.

In [634]:
df_final=d_Final.join(dfff,on='name')
df_final

Unnamed: 0_level_0,categories,lat,lng,index,Restaurant ID,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,Average Cost for two,Currency,Has Table booking,Has Online delivery,Is delivering now,Switch to order menu,Price range,Aggregate rating,Rating color,Rating text,Votes
name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1
BTW,Indian Restaurant,28.541246,77.296934,5875,3372,1,New Delhi,"G-46, Aggarwal Millenium Tower, Netaji Subhash...",Netaji Subhash Place,"Netaji Subhash Place, New Delhi",77.149909,28.693716,Street Food,200,Indian Rupees(Rs.),No,No,No,No,1,3.7,Yellow,Good,459
Big Chill,Italian Restaurant,28.542758,77.156446,4638,1614,1,New Delhi,"68-A, Khan Market, New Delhi",Khan Market,"Khan Market, New Delhi",77.227447,28.600624,"Italian, Continental, European, Cafe",1500,Indian Rupees(Rs.),No,No,No,No,3,4.5,Dark Green,Excellent,4986
Big Yellow Door,Café,28.693245,77.204948,7863,301700,1,New Delhi,"H-8 B, Near GTB Nagar Metro Station, Opposite ...",Vijay Nagar,"Vijay Nagar, New Delhi",77.204991,28.693444,"Cafe, Italian, Fast Food",600,Indian Rupees(Rs.),No,No,No,No,2,4.3,Green,Very Good,3986
Biryani Blues,Indian Restaurant,28.46228,77.087233,3097,18216915,1,New Delhi,"Showroom 9, Scindia House, Connaught Circus, C...",Connaught Place,"Connaught Place, New Delhi",77.220531,28.629983,"Biryani, Hyderabadi",1000,Indian Rupees(Rs.),No,Yes,No,No,3,4.0,Green,Very Good,510
Blue Tokai Coffee Roasters,Coffee Shop,28.517214,77.200021,6919,18217023,1,New Delhi,"Khasra 258, Lane 3, Westend Marg, Saidulajab, ...",Saket,"Saket, New Delhi",77.200089,28.517303,Cafe,350,Indian Rupees(Rs.),No,Yes,No,No,1,4.4,Green,Very Good,269
Cafe Delhi Heights,Café,28.468201,77.083786,6925,18126111,1,New Delhi,"Shop 1-2, Ground Floor, Sangam Courtyard, R K ...","Sangam Courtyard, RK Puram","Sangam Courtyard, RK Puram, New Delhi",77.1735,28.571681,"Continental, American, Italian, Seafood, North...",2000,Indian Rupees(Rs.),Yes,No,No,No,4,4.0,Green,Very Good,304
Diggin,Café,28.555665,77.21859,2635,307113,1,New Delhi,"Anand Lok Shopping Centre, Opposite Gargi Coll...",Anand Lok,"Anand Lok, New Delhi",77.219498,28.555635,"Italian, Continental, Cafe",1400,Indian Rupees(Rs.),Yes,Yes,No,No,3,4.2,Green,Very Good,2131
Domino's Pizza,Pizza Place,28.43,77.297,3031,143,1,New Delhi,"M-42, Connaught Place, New Delhi",Connaught Place,"Connaught Place, New Delhi",77.222896,28.633231,"Pizza, Fast Food",700,Indian Rupees(Rs.),No,No,No,No,2,3.7,Yellow,Good,336
Hawkers,Chinese Restaurant,28.523172,77.157362,7707,3072,1,New Delhi,"B-1, Vasant Kunj, New Delhi",Vasant Kunj,"Vasant Kunj, New Delhi",77.157316,28.523209,Chinese,600,Indian Rupees(Rs.),No,Yes,No,No,2,3.4,Orange,Average,398
Imperfecto,Mediterranean Restaurant,28.554657,77.195092,3986,301442,1,New Delhi,"1-A/1, Hauz Khas Village, New Delhi",Hauz Khas Village,"Hauz Khas Village, New Delhi",77.195143,28.554686,"Mediterranean, Italian, Continental, Spanish, ...",1800,Indian Rupees(Rs.),Yes,No,No,No,3,3.7,Yellow,Good,2247


### Further Cleaning
By removing unnecessary columns we will further clean our dataset, Here I am using drop function to carry this operation

In [635]:
df_final.drop(['City','lat','lng','index'],axis=1,inplace=True)
df_final

Unnamed: 0_level_0,categories,Restaurant ID,Country Code,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,Average Cost for two,Currency,Has Table booking,Has Online delivery,Is delivering now,Switch to order menu,Price range,Aggregate rating,Rating color,Rating text,Votes
name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1
BTW,Indian Restaurant,3372,1,"G-46, Aggarwal Millenium Tower, Netaji Subhash...",Netaji Subhash Place,"Netaji Subhash Place, New Delhi",77.149909,28.693716,Street Food,200,Indian Rupees(Rs.),No,No,No,No,1,3.7,Yellow,Good,459
Big Chill,Italian Restaurant,1614,1,"68-A, Khan Market, New Delhi",Khan Market,"Khan Market, New Delhi",77.227447,28.600624,"Italian, Continental, European, Cafe",1500,Indian Rupees(Rs.),No,No,No,No,3,4.5,Dark Green,Excellent,4986
Big Yellow Door,Café,301700,1,"H-8 B, Near GTB Nagar Metro Station, Opposite ...",Vijay Nagar,"Vijay Nagar, New Delhi",77.204991,28.693444,"Cafe, Italian, Fast Food",600,Indian Rupees(Rs.),No,No,No,No,2,4.3,Green,Very Good,3986
Biryani Blues,Indian Restaurant,18216915,1,"Showroom 9, Scindia House, Connaught Circus, C...",Connaught Place,"Connaught Place, New Delhi",77.220531,28.629983,"Biryani, Hyderabadi",1000,Indian Rupees(Rs.),No,Yes,No,No,3,4.0,Green,Very Good,510
Blue Tokai Coffee Roasters,Coffee Shop,18217023,1,"Khasra 258, Lane 3, Westend Marg, Saidulajab, ...",Saket,"Saket, New Delhi",77.200089,28.517303,Cafe,350,Indian Rupees(Rs.),No,Yes,No,No,1,4.4,Green,Very Good,269
Cafe Delhi Heights,Café,18126111,1,"Shop 1-2, Ground Floor, Sangam Courtyard, R K ...","Sangam Courtyard, RK Puram","Sangam Courtyard, RK Puram, New Delhi",77.1735,28.571681,"Continental, American, Italian, Seafood, North...",2000,Indian Rupees(Rs.),Yes,No,No,No,4,4.0,Green,Very Good,304
Diggin,Café,307113,1,"Anand Lok Shopping Centre, Opposite Gargi Coll...",Anand Lok,"Anand Lok, New Delhi",77.219498,28.555635,"Italian, Continental, Cafe",1400,Indian Rupees(Rs.),Yes,Yes,No,No,3,4.2,Green,Very Good,2131
Domino's Pizza,Pizza Place,143,1,"M-42, Connaught Place, New Delhi",Connaught Place,"Connaught Place, New Delhi",77.222896,28.633231,"Pizza, Fast Food",700,Indian Rupees(Rs.),No,No,No,No,2,3.7,Yellow,Good,336
Hawkers,Chinese Restaurant,3072,1,"B-1, Vasant Kunj, New Delhi",Vasant Kunj,"Vasant Kunj, New Delhi",77.157316,28.523209,Chinese,600,Indian Rupees(Rs.),No,Yes,No,No,2,3.4,Orange,Average,398
Imperfecto,Mediterranean Restaurant,301442,1,"1-A/1, Hauz Khas Village, New Delhi",Hauz Khas Village,"Hauz Khas Village, New Delhi",77.195143,28.554686,"Mediterranean, Italian, Continental, Spanish, ...",1800,Indian Rupees(Rs.),Yes,No,No,No,3,3.7,Yellow,Good,2247


In [636]:
df_final.drop(['Country Code','Restaurant ID','Country Code','Address','Locality Verbose','Has Table booking','Has Online delivery','Is delivering now','Switch to order menu'],axis=1,inplace=True)
df_final

Unnamed: 0_level_0,categories,Locality,Longitude,Latitude,Cuisines,Average Cost for two,Currency,Price range,Aggregate rating,Rating color,Rating text,Votes
name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1
BTW,Indian Restaurant,Netaji Subhash Place,77.149909,28.693716,Street Food,200,Indian Rupees(Rs.),1,3.7,Yellow,Good,459
Big Chill,Italian Restaurant,Khan Market,77.227447,28.600624,"Italian, Continental, European, Cafe",1500,Indian Rupees(Rs.),3,4.5,Dark Green,Excellent,4986
Big Yellow Door,Café,Vijay Nagar,77.204991,28.693444,"Cafe, Italian, Fast Food",600,Indian Rupees(Rs.),2,4.3,Green,Very Good,3986
Biryani Blues,Indian Restaurant,Connaught Place,77.220531,28.629983,"Biryani, Hyderabadi",1000,Indian Rupees(Rs.),3,4.0,Green,Very Good,510
Blue Tokai Coffee Roasters,Coffee Shop,Saket,77.200089,28.517303,Cafe,350,Indian Rupees(Rs.),1,4.4,Green,Very Good,269
Cafe Delhi Heights,Café,"Sangam Courtyard, RK Puram",77.1735,28.571681,"Continental, American, Italian, Seafood, North...",2000,Indian Rupees(Rs.),4,4.0,Green,Very Good,304
Diggin,Café,Anand Lok,77.219498,28.555635,"Italian, Continental, Cafe",1400,Indian Rupees(Rs.),3,4.2,Green,Very Good,2131
Domino's Pizza,Pizza Place,Connaught Place,77.222896,28.633231,"Pizza, Fast Food",700,Indian Rupees(Rs.),2,3.7,Yellow,Good,336
Hawkers,Chinese Restaurant,Vasant Kunj,77.157316,28.523209,Chinese,600,Indian Rupees(Rs.),2,3.4,Orange,Average,398
Imperfecto,Mediterranean Restaurant,Hauz Khas Village,77.195143,28.554686,"Mediterranean, Italian, Continental, Spanish, ...",1800,Indian Rupees(Rs.),3,3.7,Yellow,Good,2247


In [637]:
df_final.drop(['categories','Votes','Rating color','Currency'],axis=1,inplace=True)

df_final.sort_values(by='Aggregate rating',ascending=False,inplace=True)
df_final=df_final.reset_index()
df_final.head(11)

Unnamed: 0,name,Locality,Longitude,Latitude,Cuisines,Average Cost for two,Price range,Aggregate rating,Rating text
0,Naturals Ice Cream,Connaught Place,77.222148,28.634348,Ice Cream,150,1,4.9,Excellent
1,Big Chill,Khan Market,77.227447,28.600624,"Italian, Continental, European, Cafe",1500,3,4.5,Excellent
2,Blue Tokai Coffee Roasters,Saket,77.200089,28.517303,Cafe,350,1,4.4,Very Good
3,Big Yellow Door,Vijay Nagar,77.204991,28.693444,"Cafe, Italian, Fast Food",600,2,4.3,Very Good
4,Naivedyam,Hauz Khas Village,77.195275,28.555157,South Indian,500,2,4.2,Very Good
5,Diggin,Anand Lok,77.219498,28.555635,"Italian, Continental, Cafe",1400,3,4.2,Very Good
6,Kunzum Travel Cafe,Hauz Khas Village,77.194322,28.55333,Cafe,200,1,4.2,Very Good
7,Starbucks,Connaught Place,77.217702,28.632177,Cafe,700,2,4.1,Very Good
8,Biryani Blues,Connaught Place,77.220531,28.629983,"Biryani, Hyderabadi",1000,3,4.0,Very Good
9,Cafe Delhi Heights,"Sangam Courtyard, RK Puram",77.1735,28.571681,"Continental, American, Italian, Seafood, North...",2000,4,4.0,Very Good


This is a visual of our final dataframe before modelling stage.The data is now ready for modelling and evaluation stage.

## Clustering
<br>Here we will carry out our modelling stage of the training data for descriptive analysis

In [638]:
from sklearn.cluster import KMeans
cluster=3
selected=df_final.drop(['name','Locality','Cuisines','Rating text'],1)
kmeans=KMeans(n_clusters=cluster,random_state=0).fit(selected)
kmeans.labels_[0:10]

array([0, 1, 0, 2, 2, 1, 0, 2, 2, 1])

In [639]:
df_final.insert(0,'Cluster',kmeans.labels_)
df_final.head(11)

Unnamed: 0,Cluster,name,Locality,Longitude,Latitude,Cuisines,Average Cost for two,Price range,Aggregate rating,Rating text
0,0,Naturals Ice Cream,Connaught Place,77.222148,28.634348,Ice Cream,150,1,4.9,Excellent
1,1,Big Chill,Khan Market,77.227447,28.600624,"Italian, Continental, European, Cafe",1500,3,4.5,Excellent
2,0,Blue Tokai Coffee Roasters,Saket,77.200089,28.517303,Cafe,350,1,4.4,Very Good
3,2,Big Yellow Door,Vijay Nagar,77.204991,28.693444,"Cafe, Italian, Fast Food",600,2,4.3,Very Good
4,2,Naivedyam,Hauz Khas Village,77.195275,28.555157,South Indian,500,2,4.2,Very Good
5,1,Diggin,Anand Lok,77.219498,28.555635,"Italian, Continental, Cafe",1400,3,4.2,Very Good
6,0,Kunzum Travel Cafe,Hauz Khas Village,77.194322,28.55333,Cafe,200,1,4.2,Very Good
7,2,Starbucks,Connaught Place,77.217702,28.632177,Cafe,700,2,4.1,Very Good
8,2,Biryani Blues,Connaught Place,77.220531,28.629983,"Biryani, Hyderabadi",1000,3,4.0,Very Good
9,1,Cafe Delhi Heights,"Sangam Courtyard, RK Puram",77.1735,28.571681,"Continental, American, Italian, Seafood, North...",2000,4,4.0,Very Good


## Examine Clusters

Here in this section we will take a look on all three clusters averae rating and average prices

In [647]:
result_1=df_final[df_final['Cluster']==0]
result_1

Unnamed: 0,Cluster,name,Locality,Longitude,Latitude,Cuisines,Average Cost for two,Price range,Aggregate rating,Rating text
0,0,Naturals Ice Cream,Connaught Place,77.222148,28.634348,Ice Cream,150,1,4.9,Excellent
2,0,Blue Tokai Coffee Roasters,Saket,77.200089,28.517303,Cafe,350,1,4.4,Very Good
6,0,Kunzum Travel Cafe,Hauz Khas Village,77.194322,28.55333,Cafe,200,1,4.2,Very Good
10,0,L'Opera,Khan Market,77.22619,28.599787,"Bakery, Desserts, Fast Food",400,1,4.0,Very Good
12,0,BTW,Netaji Subhash Place,77.149909,28.693716,Street Food,200,1,3.7,Good


In [646]:
result_2=df_final[df_final['Cluster']==1]
result_2

Unnamed: 0,Cluster,name,Locality,Longitude,Latitude,Cuisines,Average Cost for two,Price range,Aggregate rating,Rating text
1,1,Big Chill,Khan Market,77.227447,28.600624,"Italian, Continental, European, Cafe",1500,3,4.5,Excellent
5,1,Diggin,Anand Lok,77.219498,28.555635,"Italian, Continental, Cafe",1400,3,4.2,Very Good
9,1,Cafe Delhi Heights,"Sangam Courtyard, RK Puram",77.1735,28.571681,"Continental, American, Italian, Seafood, North...",2000,4,4.0,Very Good
14,1,Imperfecto,Hauz Khas Village,77.195143,28.554686,"Mediterranean, Italian, Continental, Spanish, ...",1800,3,3.7,Good
18,1,Lighthouse 13,"MGF Metropolitan Mall, Saket",77.219503,28.530136,"North Indian, European, Chinese",2100,4,3.4,Average


In [645]:
result_3=df_final[df_final['Cluster']==2]
result_3

Unnamed: 0,Cluster,name,Locality,Longitude,Latitude,Cuisines,Average Cost for two,Price range,Aggregate rating,Rating text
3,2,Big Yellow Door,Vijay Nagar,77.204991,28.693444,"Cafe, Italian, Fast Food",600,2,4.3,Very Good
4,2,Naivedyam,Hauz Khas Village,77.195275,28.555157,South Indian,500,2,4.2,Very Good
7,2,Starbucks,Connaught Place,77.217702,28.632177,Cafe,700,2,4.1,Very Good
8,2,Biryani Blues,Connaught Place,77.220531,28.629983,"Biryani, Hyderabadi",1000,3,4.0,Very Good
11,2,Pita Pit,Kailash Colony,77.236741,28.557442,"Healthy Food, Salad",600,2,3.8,Good
13,2,Khan Chacha,Khan Market,77.227313,28.600746,"Mughlai, North Indian",650,2,3.7,Good
15,2,Domino's Pizza,Connaught Place,77.222896,28.633231,"Pizza, Fast Food",700,2,3.7,Good
16,2,Pizza Hut,Connaught Place,77.222822,28.632218,"Italian, Pizza, Fast Food",1000,3,3.5,Good
17,2,Subway,Connaught Place,77.222238,28.631131,"American, Fast Food, Salad, Healthy Food",500,2,3.5,Good
19,2,Hawkers,Vasant Kunj,77.157316,28.523209,Chinese,600,2,3.4,Average


## <a id="item6"></a> Results and Discussions
<br>
Let's print our result in dataframe format for each clusters

In [643]:
column=['Cluster','avg price range','avg rating','avg cost for 2']
result=pd.DataFrame(columns=column)

result.loc[0,column]=(0,result_1['Price range'].mean(),result_1['Aggregate rating'].mean(),result_1['Average Cost for two'].mean())
result.loc[1,column]=(1,result_2['Price range'].mean(),result_2['Aggregate rating'].mean(),result_2['Average Cost for two'].mean())
result.loc[2,column]=(2,result_3['Price range'].mean(),result_3['Aggregate rating'].mean(),result_3['Average Cost for two'].mean())
result

Unnamed: 0,Cluster,avg price range,avg rating,avg cost for 2
0,0,1.0,4.24,260
1,1,3.4,3.96,1760
2,2,2.2,3.82,685


From our analysis we can draw number of conclusions to help our visitors in New Delhi.<br>
<br>
We can easily see that if a traveller has lot's of money in his/her pocket then for a good meal and best experience he/she needs to visit <b>Overseas restaurants</b> where highest budgets restaurants are available with overall good rating.
<br><br>
If a traveller wants to save some extra money without compromising the meal then<b> Cluster 2 and Cluster 0</b> is best place to suite them.

<br><br>
Finally it can  be easily claimed after seeing result data that going for having a meal at <b>cluster 0</b> category restaurants is<b> is worthit</b>. Since it's budget <b>much</b> less than <b>double</b> than that of cluster 2 still avg rating is much better,there was change of <b> approx 10%</b> in avg rating.<br>
We can easily see that reason of being cluster 0 much cheaper is because it contains cafe's and street food type meals.
<br>
<br>A company can use this information to build up an online website/mobile application, to provide users with up to date information about various venues in the city based on the search criteria (name, rating and price).

## <a id="item7"></a> Conclusion

The purpose of this project was to explore the places that a person visiting Delhi could visit. The venues have been identified using Foursquare and Zomato datasheet.
<br>
From our result section we can easily conclude that <b>Cluster 1</b> will be worth for maximum percentage of people's in <b>India</b> and <b> cluster 0</b> will be the worst case scenario to choose from other cases, maximum number of people will try to avoid it.
<br>
So most of the travellers are left with only 2 clusters and they can choose any one of them based on price and rating requirements.