# Introduction/Business Problem
This project aims to help tourists visiting India, as well as locals, find the best city for food based on its variety of cuisines.
Food is closely linked with tourism in this global village era, so we want to help food lovers easily find the best place to go. This can
also help cities attract customers. We will analyze restaurant locations in four major cities of India and identify the best one among
them. This should spare tourists the effort of hunting for places themselves.

First things first, let's import all necessary libraries.

In [2]:
#let's import all important libraries here
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe
!conda install -c conda-forge folium=0.5.0
import folium # map rendering library

print('Libraries imported, All Set!')


# Procuring Data
We use the Foursquare API to collect the locations of restaurants/food places in four major tourist cities in India: Mumbai, Delhi,
Chennai and Kolkata. We then work out which city has the highest density of restaurants/food places. Here, we assume that the density of food places
is directly proportional to the variety of cuisines, since we treat all these cities as comparably populous and all are famous tourist locations.
First, we connect to the Foursquare API and obtain data for these cities.

In [4]:
#The credentials below are used to connect to the Foursquare API
CLIENT_ID = 'P1W3PI2AM4F4CUSQ1JUVENDZMPGGN3YAOIEMNOV4OPDFPCTP' 
CLIENT_SECRET = '' 
VERSION = '20180605' 

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)

Your credentials:
CLIENT_ID: P1W3PI2AM4F4CUSQ1JUVENDZMPGGN3YAOIEMNOV4OPDFPCTP


The code below saves the data obtained through the Foursquare API. Although LIMIT is set to 500, we work with only the first 100 results for each city because of Foursquare call limitations.

In [5]:
#The code below stores the restaurant location data for all four cities of choice
LIMIT = 500 # requested number of results per city
FOOD_CATEGORY_ID = '4d4b7105d754a06374d81259' # FOOD CATEGORY ID
cities = ['Delhi, IN', 'Mumbai, IN', 'Chennai, IN', 'Kolkata, IN']
results = {}
for city in cities:
    # pass the query as params so that requests URL-encodes values such as 'Delhi, IN'
    params = {
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET,
        'v': VERSION,
        'near': city,
        'limit': LIMIT,
        'categoryId': FOOD_CATEGORY_ID}
    results[city] = requests.get('https://api.foursquare.com/v2/venues/explore', params=params).json()

In [7]:
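#Flatten the nested JSON of each response into a dataframe with one row per venue
df_places={}
for city in cities:
    places = json_normalize(results[city]['response']['groups'][0]['items'])
    # .copy() so that renaming the columns does not act on a slice of places
    df_places[city] = places[['venue.name', 'venue.location.address', 'venue.location.lat', 'venue.location.lng']].copy()
    df_places[city].columns = ['Name', 'Address', 'Lat', 'Lng']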
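
Some venues may come back without every field; an address, for instance, may be absent. As a minimal sketch of the same flattening step in plain Python, assuming items shaped like the ones selected above, the helper below extracts the four columns and fills a missing address with None instead of failing. The name flatten_items and the sample items are hypothetical and only serve as illustration.

```python
def flatten_items(items):
    """Return one dict per venue with the Name/Address/Lat/Lng columns."""
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        rows.append({
            'Name': venue.get('name'),
            'Address': location.get('address'),  # None when the field is absent
            'Lat': location.get('lat'),
            'Lng': location.get('lng'),
        })
    return rows

# Made-up sample items, for illustration only (not real API output)
sample_items = [
    {'venue': {'name': 'Cafe A',
               'location': {'address': '1 Example Road', 'lat': 19.0, 'lng': 72.8}}},
    {'venue': {'name': 'Stall B',
               'location': {'lat': 19.1, 'lng': 72.9}}},
]

rows = flatten_items(sample_items)
for row in rows:
    print(row)
```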

# Plotting individual city food places density
The code below plots the restaurants/food places of all four cities on maps, so that their densities can be compared visually, and prints the total number of results reported for each city.

In [8]:
#Let's plot map for visualizing density of food places in all four cities
maps = {}
for city in cities:
    city_lat = np.mean([results[city]['response']['geocode']['geometry']['bounds']['ne']['lat'],
                        results[city]['response']['geocode']['geometry']['bounds']['sw']['lat']])
    city_lng = np.mean([results[city]['response']['geocode']['geometry']['bounds']['ne']['lng'],
                        results[city]['response']['geocode']['geometry']['bounds']['sw']['lng']])
    maps[city] = folium.Map(location=[city_lat, city_lng], zoom_start=11)

    # add markers to map
    for lat, lng, label in zip(df_places[city]['Lat'], df_places[city]['Lng'], df_places[city]['Name']):
        label = folium.Popup(label, parse_html=True)
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            popup=label,
            color='blue',
            fill=True,
            fill_color='#3186cc',
            fill_opacity=0.7,
            parse_html=False).add_to(maps[city])  
    print(f"Total number of restaurants in {city} = ", results[city]['response']['totalResults'])
    print("Showing Top 100")

Total number of restaurants in Delhi, IN =  131
Showing Top 100
Total number of restaurants in Mumbai, IN =  166
Showing Top 100
Total number of restaurants in Chennai, IN =  115
Showing Top 100
Total number of restaurants in Kolkata, IN =  98
Showing Top 100


Now, let's look at the map for each city. Going by the initial counts above, Mumbai appears to have the highest density of food places.

In [9]:
maps[cities[0]]

In [10]:
maps[cities[1]]

In [11]:
maps[cities[2]]

In [12]:
maps[cities[3]]

Though we have the initial results here, let's quantify the density like a true Data Scientist! We will find the spot in each city where most restaurants are concentrated. To get an indicator of the density of food places, I first calculate the center coordinate of the venues, i.e. their mean latitude and longitude. Then I calculate the mean of the Euclidean distances from each venue to this mean coordinate. That is the indicator: the mean distance to the mean coordinate. The distances are computed directly on latitude/longitude values, so they are expressed in degrees, and a smaller value means the venues are packed more tightly. Locals and tourists can then easily head to these places to satisfy their hunger pangs and enjoy delicious cuisines!
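
To make the indicator concrete, here is a minimal sketch of the same calculation in plain Python on four made-up points. The function name and the sample coordinates are assumptions for illustration; the notebook itself computes the value with NumPy on the Lat/Lng columns.

```python
import math

def mean_distance_to_mean_coordinate(points):
    """points: list of (lat, lng) pairs.

    Returns the mean coordinate and the mean Euclidean distance
    (in degrees) from each point to that mean coordinate.
    """
    n = len(points)
    mean_lat = sum(lat for lat, _ in points) / n
    mean_lng = sum(lng for _, lng in points) / n
    distances = [math.hypot(lat - mean_lat, lng - mean_lng) for lat, lng in points]
    return (mean_lat, mean_lng), sum(distances) / n

# Four made-up venues at the corners of a small square
points = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0)]
center, mdmc = mean_distance_to_mean_coordinate(points)
print(center, mdmc)
```

Because the distance is taken in degrees, and a degree of longitude covers less ground the farther a city is from the equator, comparisons between cities at different latitudes are only approximate.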

In [16]:
maps = {}
for city in cities:
    city_lat = np.mean([results[city]['response']['geocode']['geometry']['bounds']['ne']['lat'],
                        results[city]['response']['geocode']['geometry']['bounds']['sw']['lat']])
    city_lng = np.mean([results[city]['response']['geocode']['geometry']['bounds']['ne']['lng'],
                        results[city]['response']['geocode']['geometry']['bounds']['sw']['lng']])
    maps[city] = folium.Map(location=[city_lat, city_lng], zoom_start=11)
    places_mean_coor = [df_places[city]['Lat'].mean(), df_places[city]['Lng'].mean()] 
    # add markers to map
    for lat, lng, label in zip(df_places[city]['Lat'], df_places[city]['Lng'], df_places[city]['Name']):
        label = folium.Popup(label, parse_html=True)
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            popup=label,
            color='blue',
            fill=True,
            fill_color='#3186cc',
            fill_opacity=0.7,
            parse_html=False).add_to(maps[city])
        folium.PolyLine([places_mean_coor, [lat, lng]], color="green", weight=1.5, opacity=0.5).add_to(maps[city])
    
    label = folium.Popup("Mean Co-ordinate", parse_html=True)
    folium.CircleMarker(
        places_mean_coor,
        radius=10,
        popup=label,
        color='green',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(maps[city])

    print(city)
    print(places_mean_coor, [lat, lng])
    print("Mean Distance from Mean coordinates")
    print(np.mean(np.apply_along_axis(lambda x: np.linalg.norm(x - places_mean_coor),1,df_places[city][['Lat','Lng']].values)))


Delhi, IN
[28.584661406589277, 77.20788416712709] [28.69388244100248, 77.14998090659716]
Mean Distance from Mean coordinates
0.05177313485446367
Mumbai, IN
[19.033275651564235, 72.83747964648356] [19.15251493490186, 72.83092221126502]
Mean Distance from Mean coordinates
0.0731607650555944
Chennai, IN
[13.050365212027755, 80.2435103957549] [12.996417106009028, 80.26850213041183]
Mean Distance from Mean coordinates
0.02771038427140823
Kolkata, IN
[22.552139493508275, 88.37299833095761] [22.63313975970929, 88.43453647133795]
Mean Distance from Mean coordinates
0.03498408488623034


We can now view the map of each city, with every venue connected to the mean coordinate used for the mean distance to mean coordinates (MDMC).

In [17]:
maps[cities[0]]


In [18]:
maps[cities[1]]


In [19]:
maps[cities[2]]


In [20]:
maps[cities[3]]


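Going by the total number of restaurants reported above, Mumbai is the richest in food joints/restaurants. The MDMC values, however, measure spread rather than count: by that indicator Chennai's listed venues are the most tightly clustered (about 0.028) and Mumbai's the most spread out (about 0.073). I would also recommend that tourists book a hotel close to the mean coordinate so that a wide choice is available nearby. Ranking the cities by number of restaurants gives:
  1. Mumbai
  2. Delhi
  3. Chennai
  4. Kolkata

Now you know where to head first!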
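
The ranking above can be reproduced from the totals printed earlier in the notebook. This is a small sketch that hard-codes those printed totals rather than reading them from the API responses; the variable names are chosen here for illustration.

```python
# Totals as printed by the notebook for each city
total_results = {
    'Delhi, IN': 131,
    'Mumbai, IN': 166,
    'Chennai, IN': 115,
    'Kolkata, IN': 98,
}

# Sort city names by their total, largest first
ranking = sorted(total_results, key=total_results.get, reverse=True)
for position, city in enumerate(ranking, start=1):
    print(position, city, total_results[city])
```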

One thing I noticed in the maps is that the restaurants/food joints mapped for Delhi actually lie across the Delhi NCR region.
The Delhi NCR region is spread across 3 states, which gives it a lower density than Mumbai. One direction for further work
is to move the location of the Foursquare API query around until we have collected all the food places in each city,
and then repeat the calculations. Here, the call limitation is a shortcoming. As a further improvement, we can compare the quality of food places
using user ratings.
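
One possible way to move the query location is to cover each city's bounding box with a grid of query centers, run one query per center, and merge the results while dropping venues seen more than once. The sketch below covers only the grid step. It assumes the box would come from the geocode bounds already used to center the maps; grid_points and the sample box are hypothetical, and how each point is passed to the API is left open.

```python
def grid_points(sw, ne, steps):
    """Return steps x steps (lat, lng) points evenly spaced over a bounding box.

    sw and ne are the (lat, lng) of the south-west and north-east corners.
    Each point is the center of one grid cell.
    """
    sw_lat, sw_lng = sw
    ne_lat, ne_lng = ne
    points = []
    for i in range(steps):
        for j in range(steps):
            lat = sw_lat + (ne_lat - sw_lat) * (i + 0.5) / steps
            lng = sw_lng + (ne_lng - sw_lng) * (j + 0.5) / steps
            points.append((lat, lng))
    return points

# Made-up bounding box, for illustration only
query_centers = grid_points(sw=(18.9, 72.7), ne=(19.3, 73.0), steps=3)
print(len(query_centers))
for lat, lng in query_centers:
    print(round(lat, 4), round(lng, 4))
```

A finer grid collects more venues per city but also uses more API calls, so the number of steps has to be balanced against the call limitation mentioned above.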