## Business Problem (BY Kumar Shivam) <a name="introduction"></a>

In this project I will try to find an optimal location for opening a new Restaurant. 

Specifically, this report will be targeted to stakeholders interested in opening an **Punjabi restaurant** in **Connaught Place**, Delhi.

Since there are lots of restaurants in Delhi at **Cannaught Place** I will try to detect **locations that are not already crowded with restaurants**. I am also particularly interested in **areas with no Punjabi restaurants in vicinity**. I would also prefer locations **as close to Cannaught Place center as possible**, assuming that first two conditions are met.


Based on definition of Bussines problem, factors that will influence our decission are:
* number of existing restaurants in the neighborhood (any type of restaurant)
* number of and distance to Punjabi restaurants in the neighborhood, if any
* distance of neighborhood from Cannaught Place (centre)

Following data sources will be needed to extract/generate the required information:
* centers of candidate areas will be generated algorithmically and approximate addresses of centers of those areas will be obtained using **Reverse Geocoding**
* number of restaurants and their type and location in every neighborhood will be obtained using **Foursquare API**
* coordinate of Cannaught Place will be obtained using **Geopy geocoding** of well known Delhi location (Cannaught Place)

### API Used

FOURSUQUARE API

### Pip Install
1. !pip install shapely
2. !pip install pyproj
3. !pip install folium
4. !pip install geopy
5. !pip install json
6. !pip install pickle
7. !pip install requests
 

### Libarary used

1 Pandas
2 Numpy
3 Geopy
4 Geojson,Json
5 pyproj
6 pickle
7 Folium
8 KMeans



In [23]:
import pandas as pd
import numpy as np 
import random
import plotly.express as px
import plotly.graph_objects as go

import requests
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# tranforming json file into a pandas dataframe library
from pandas.io.json import json_normalize
import json
import geojson
import pyproj
import pickle

import folium

In [2]:
CLIENT_ID = 'FKZSXSUZ15N15NDT0L02LQKUVQOJPLBWRLKHXX3SFOBLYDSO' # your Foursquare ID
CLIENT_SECRET = 'XPGX1VPF3NP2NCDTYMWZXSKCZJYLETAWONQYSBNSKD3D0UJO' # your Foursquare Secret
ACCESS_TOKEN = '4HLZVV31IRIKGULK52WDBHUCHFBP4P04YZNQAXWCWGHZEIBK' # your FourSquare Access Token
VERSION = '20211505'
LIMIT = 30
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: FKZSXSUZ15N15NDT0L02LQKUVQOJPLBWRLKHXX3SFOBLYDSO
CLIENT_SECRET:XPGX1VPF3NP2NCDTYMWZXSKCZJYLETAWONQYSBNSKD3D0UJO


In [3]:
address = 'Connaught Place,New Delhi,India'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
delhi_centre = [latitude,longitude]
print(delhi_centre)

[28.6314022, 77.2193791]


### Neighborhood Candidates

Let's create latitude & longitude coordinates for centroids of our candidate neighborhoods. We will create a grid of cells covering our area of interest which is aprox. 6*6 killometers centered around Cannaught Place center.

Let's first find the latitude & longitude of Cannaught Place center, using specific, well known address and Geocoder geocoding API.

In [5]:
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

In [6]:

import shapely.geometry
import pyproj
import math

def lonlat_to_xy(lon, lat):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0], xy[1]

def xy_to_lonlat(x, y):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)

print('Coordinate transformation check')
print('-------------------------------')
print('Delhi center longitude={}, latitude={}'.format(delhi_centre[1], delhi_centre[0]))
x, y = lonlat_to_xy(delhi_centre[1], delhi_centre[0])
print('DElhi center UTM X={}, Y={}'.format(x, y))
lo, la = xy_to_lonlat(x, y)
print('Delhi center longitude={}, latitude={}'.format(lo, la))

Coordinate transformation check
-------------------------------
Delhi center longitude=77.2193791, latitude=28.6314022
DElhi center UTM X=7113896.987048123, Y=5500340.958062688
Delhi center longitude=77.21937909999885, latitude=28.631402200001293


Let's create a **hexagonal grid of cells**: we offset every other row, and adjust vertical row spacing so that **every cell center is equally distant from all it's neighbors**.

Now let's create a grid of area candidates, equaly spaced, centered around city center and within ~6km from Cannaught Place. Our neighborhoods will be defined as circular areas with a radius of 300 meters, so our neighborhood centers will be 600 meters apart.

To accurately calculate distances we need to create our grid of locations in Cartesian 2D coordinate system which allows us to calculate distances in meters (not in latitude/longitude degrees). Then we'll project those coordinates back to latitude/longitude degrees to be shown on Folium map. So let's create functions to convert between WGS84 spherical coordinate system (latitude/longitude degrees) and UTM Cartesian coordinate system (X/Y coordinates in  meters).

In [11]:
%%time
delhi_center_x, delhi_center_y = lonlat_to_xy(delhi_centre[1], delhi_centre[0]) # City center in Cartesian coordinates

k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_min = delhi_center_x - 6000
x_step = 600
y_min = delhi_center_y - 6000 - (int(21/k)*k*600 - 12000)/2
y_step = 600 * k 

latitudes = []
longitudes = []
distances_from_center = []
xs = []
ys = []
for i in range(0, int(21/k)):
    y = y_min + i * y_step
    x_offset = 300 if i%2==0 else 0
    for j in range(0, 21):
        x = x_min + j * x_step + x_offset
        distance_from_center = calc_xy_distance(delhi_center_x, delhi_center_y, x, y)
        if (distance_from_center <= 5001):
            lon, lat = xy_to_lonlat(x, y)
            latitudes.append(lat)
            longitudes.append(lon)
            distances_from_center.append(distance_from_center)
            xs.append(x)
            ys.append(y)

print(len(latitudes), 'candidate neighborhood centers generated.')

254 candidate neighborhood centers generated.
Wall time: 26.8 s


Let's visualize the data we have so far: city center location and candidate neighborhood centers:

In [12]:
delhi_centre

[28.6314022, 77.2193791]

In [13]:
%%time
map_delhi = folium.Map(location=delhi_centre, zoom_start=13)
folium.Marker(delhi_centre, popup='New Delhi').add_to(map_delhi)
for lat, lon in zip(latitudes, longitudes):
    #folium.CircleMarker([lat, lon], radius=1, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_delhi) 
    folium.Circle([lat, lon], radius=180, color='blue', fill=False).add_to(map_delhi)
    #folium.Marker([lat, lon]).add_to(map_delhi)
map_delhi

Wall time: 1.05 s


OK, we now have the coordinates of centers of neighborhoods/areas to be evaluated, equally spaced (distance from every point to it's neighbors is exactly the same) and within ~6km from Cannught place centre. 

Let's now use Google Maps API to get approximate addresses of those locations.

In [14]:
coords = list(zip(latitudes, longitudes))
address =[]
for i in range(len(coords)):
    print('\r'+'Processing Status: ' + str(round(i/len(coords)*100,1)) + ' % ', end='')
    location = geolocator.reverse(coords[i])
    location = str(location).replace(', India','')
    
    address.append(location)
        

Processing Status: 99.6 % 

In [15]:

df_locations = pd.DataFrame({'Address': address,
                             'Latitude': latitudes,
                             'Longitude': longitudes,
                             #'Distance from center':dist
                             })

df_locations.head(10)

Unnamed: 0,Address,Latitude,Longitude
0,"Raisina Hill, Rakab Ganj, Chanakya Puri Tehsil...",28.616404,77.193322
1,"Mughal Gardens, Rajpath, Raisina Hill, Rakab G...",28.614101,77.196174
2,"Dara Shikoh Road, Raisina Hill, Rakab Ganj, Ch...",28.611798,77.199025
3,"Tyagaraj Marg, Raisina Hill, Rakab Ganj, Chana...",28.609494,77.201876
4,"Rakab Ganj, Chanakya Puri Tehsil, New Delhi, D...",28.607191,77.204727
5,"Vande Matram Marg, Central Ridge Reserve Fores...",28.624343,77.188452
6,"Talkatora Stadium, Mother Teresa Crescent, Mal...",28.622039,77.191305
7,"Rakab Ganj, Chanakya Puri Tehsil, New Delhi, D...",28.619736,77.194157
8,"Raisina Hill, Rakab Ganj, Chanakya Puri Tehsil...",28.617432,77.197009
9,"Rashtrapati Bhavan, Rajpath, Raisina Hill, Rak...",28.615128,77.19986


In [16]:
import geopy.distance

dist = []
coords_1 = (delhi_centre[0],delhi_centre[1])
for lat, lon in zip(df_locations.Latitude, df_locations.Longitude):
    coords_2 = (lat, lon)
    dist.append(geopy.distance.distance(coords_1, coords_2).m)

In [17]:
df_locations['Distance_from_center']=dist
df_locations.head()

Unnamed: 0,Address,Latitude,Longitude,Distance_from_center
0,"Raisina Hill, Rakab Ganj, Chanakya Puri Tehsil...",28.616404,77.193322,3042.335927
1,"Mughal Gardens, Rajpath, Raisina Hill, Rakab G...",28.614101,77.196174,2970.912716
2,"Dara Shikoh Road, Raisina Hill, Rakab Ganj, Ch...",28.611798,77.199025,2946.650887
3,"Tyagaraj Marg, Raisina Hill, Rakab Ganj, Chana...",28.609494,77.201876,2970.69562
4,"Rakab Ganj, Chanakya Puri Tehsil, New Delhi, D...",28.607191,77.204727,3041.891312


Looking good. Let's now place all this into a Pandas dataframe.

## Exported the generated data into csv

In [18]:

df_locations.to_csv('df_locations_final.csv')

In [19]:
df_locations = pd.read_csv('df_locations_final.csv',index_col=0)

In [20]:
df_locations.head()

Unnamed: 0,Address,Latitude,Longitude,Distance_from_center
0,"Raisina Hill, Rakab Ganj, Chanakya Puri Tehsil...",28.616404,77.193322,3042.335927
1,"Mughal Gardens, Rajpath, Raisina Hill, Rakab G...",28.614101,77.196174,2970.912716
2,"Dara Shikoh Road, Raisina Hill, Rakab Ganj, Ch...",28.611798,77.199025,2946.650887
3,"Tyagaraj Marg, Raisina Hill, Rakab Ganj, Chana...",28.609494,77.201876,2970.69562
4,"Rakab Ganj, Chanakya Puri Tehsil, New Delhi, D...",28.607191,77.204727,3041.891312


### Foursquare
Now that we have our location candidates, let's use Foursquare API to get info on restaurants in each neighborhood.

We're interested in venues in 'food' category, but only those that are proper restaurants - coffe shops, pizza places, bakeries etc. are not direct competitors so we don't care about those. So we will include in out list only venues that have 'restaurant' in category name, and we'll make sure to detect and include all the subcategories of specific 'Punjabi restaurant' category, as we need info on Punjabi restaurants in the neighborhood.

(https://developer.foursquare.com/docs/resources/categories):

In [21]:
# Category IDs corresponding to Indian/Punjabi restaurants were taken from Foursquare web site 

food_category = '4d4b7105d754a06374d81259' # 'Root' category for all food-related venues

punjabi_restaurant_categories = ['4bf58dd8d48988d10f941735','54135bf5e4b08f3d2429dff0','54135bf5e4b08f3d2429dfde','54135bf5e4b08f3d2429dfdf']

def is_restaurant(categories, specific_filter=None):
    restaurant_words = ['restaurant', 'diner', 'hotel', 'punjab']
    restaurant = False
    specific = False
    for c in categories:
        category_name = c[0].lower()
        category_id = c[1]
        for r in restaurant_words:
            if r in category_name:
                restaurant = True
        if 'fast food' in category_name:
            restaurant = False
        if not(specific_filter is None) and (category_id in specific_filter):
            specific = True
            restaurant = True
    return restaurant, specific

def get_categories(categories):
    return [(cat['name'], cat['id']) for cat in categories]

def format_address(location):
    address = ', '.join(location['formattedAddress'])
    address = address.replace(', New Delhi', '')
    address = address.replace(', India', '')
    return address

def get_venues_near_location(lat, lon, category, client_id, client_secret, radius=500, limit=100):
    version = '20210501'
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
        client_id, client_secret, version, lat, lon, category, radius, limit)
    try:
        results = requests.get(url).json()['response']['groups'][0]['items']
        venues = [(item['venue']['id'],
                   item['venue']['name'],
                   get_categories(item['venue']['categories']),
                   (item['venue']['location']['lat'], item['venue']['location']['lng']),
                   format_address(item['venue']['location']),
                   item['venue']['location']['distance']) for item in results]        
    except:
        venues = []
    return venues

In [22]:
# Let's now go over our neighborhood locations and get nearby restaurants; we'll also maintain a dictionary of all found restaurants and all found Punjabi restaurants

import pickle

def get_restaurants(lats, lons):
    restaurants = {}
    punjabi_restaurants = {}
    location_restaurants = []

    print('Obtaining venues around candidate locations:', end='')
    for lat, lon in zip(lats, lons):
        # Using radius=350 to meke sure we have overlaps/full coverage so we don't miss any restaurant (we're using dictionaries to remove any duplicates resulting from area overlaps)
        venues = get_venues_near_location(lat, lon, food_category,'FKZSXSUZ15N15NDT0L02LQKUVQOJPLBWRLKHXX3SFOBLYDSO'
,'XPGX1VPF3NP2NCDTYMWZXSKCZJYLETAWONQYSBNSKD3D0UJO', radius=500, limit=100)
        area_restaurants = []
        for venue in venues:
            venue_id = venue[0]
            venue_name = venue[1]
            venue_categories = venue[2]
            venue_latlon = venue[3]
            venue_address = venue[4]
            venue_distance = venue[5]
            is_res, is_punjabi = is_restaurant(venue_categories, specific_filter=punjabi_restaurant_categories)
            if is_res:
                x, y = lonlat_to_xy(venue_latlon[1], venue_latlon[0])
                restaurant = (venue_id, venue_name, venue_latlon[0], venue_latlon[1], venue_address, venue_distance, is_punjabi, x, y)
                if venue_distance<=300:
                    area_restaurants.append(restaurant)
                restaurants[venue_id] = restaurant
                if is_punjabi:
                    punjabi_restaurants[venue_id] = restaurant
        location_restaurants.append(area_restaurants)
        print(' .', end='')
    print(' done.')
    return restaurants, punjabi_restaurants, location_restaurants

# Try to load from local file system in case we did this before
restaurants = {}
punjabi_restaurants = {}
location_restaurants = []

restaurants,punjabi_restaurants, location_restaurants = get_restaurants(latitudes, longitudes)
    
    # Let's persists this in local file system
with open('restaurants_final(350m).pkl', 'wb') as f:
    pickle.dump(restaurants, f)
with open('punjabi_restaurants_final(350).pkl', 'wb') as f:
    pickle.dump(punjabi_restaurants, f)
with open('location_restaurants_final(350).pkl', 'wb') as f:
    pickle.dump(location_restaurants, f)
        

Obtaining venues around candidate locations: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.


In [25]:
restaurants = {}
punjabi_restaurants = {}
location_restaurants = []
with open('restaurants_350.pkl', 'rb') as f:
    restaurants = pickle.load(f)
with open('punjabi_restaurants_350.pkl', 'rb') as f:
    punjabi_restaurants = pickle.load(f)
with open('location_restaurants_350.pkl', 'rb') as f:
    location_restaurants = pickle.load(f)
print('Restaurant data loaded')

Restaurant data loaded


In [24]:
import numpy as np

print('Total number of restaurants:', len(restaurants))
print('Total number of punjabi restaurants:', len(punjabi_restaurants))
print('Percentage of punjabi restaurants: {:.2f}%'.format(len(punjabi_restaurants) / len(restaurants) * 100))
print('Average number of restaurants in neighborhood:', np.array([len(r) for r in location_restaurants]).mean())

Total number of restaurants: 226
Total number of punjabi restaurants: 110
Percentage of punjabi restaurants: 48.67%
Average number of restaurants in neighborhood: 1.263779527559055


In [25]:
print('List of all restaurants')
print('-----------------------')
for r in list(restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(restaurants))

List of all restaurants
-----------------------
('4bbdfc7c8a4fb71386a23d9d', 'Sevilla', 28.60114350202022, 77.21617449259902, 'The Claridges Hotel (12 Aurangzeb Rd), Delhi', 420, False, 7117126.468986577, 5496079.950845045)
('4d974c720caaa143ac4b8eb3', 'Dhaba by Claridges', 28.600633368284306, 77.21655356969086, '12 Aurangzeb Rd (Motilal Nehru Marg) 110011, Delhi', 462, False, 7117230.508048081, 5496053.508217669)
('4fd9f90ae4b0a04895bb04b3', 'Jade', 28.600383802307814, 77.21678961059109, '12 Aurangzeb Road, Delhi', 483, False, 7117287.20030896, 5496045.878006362)
('4ece7088775bbb5f309e359c', 'Fujiya Japanese n Chinese restaurant', 28.623761974124704, 77.19956425110736, 'Malcha Marg, Delhi', 312, False, 7112537.677410544, 5497272.926601561)
('59978aa58b98fd4558ba68f7', 'Restaurante Grappa', 28.61394, 77.20902, 'New Delhi 110004, Delhi', 411, False, 7114786.7242692495, 5496990.05150785)
('54df38bf498eefaf2e334623', 'Tantunicim', 28.6136417388916, 77.2093734741211, 'Yeni Mah', 367, False

In [26]:
print('List of punjabi restaurants')
print('---------------------------')
for r in list(punjabi_restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(punjabi_restaurants))

List of punjabi restaurants
---------------------------
('4ba09fd3f964a520cd7437e3', 'Gola sizzlers', 28.633770127572962, 77.18979948132089, 'Cp, Delhi', 475, True, 7110231.901459199, 5497547.694279956)
('4db055c693a061576840af32', 'Tee pee o', 28.62399466001383, 77.2023211508556, 'Basement Mohan Singh Place 110001, Delhi', 436, True, 7112825.528588652, 5497592.039121725)
('4c65834faebea5939e8870d0', 'Varq | वर्क', 28.60454694756125, 77.2237810008916, 'Taj Mahal Hotel,  Lower Lobby Level (1 Mansingh Road), Delhi', 414, True, 7117593.018605844, 5497319.456540414)
('535a98f5498e466e491f5aae', 'Wok In The Clouds', 28.600501884081112, 77.22722973943226, 'Khan Market, Delhi', 439, True, 7118468.829207039, 5497156.40144725)
('4c04967a47049c740f0dae4d', 'Khan Chacha', 28.600618250461434, 77.22723729363604, '50, Middle Lane, Khan Market (Middle Lane, Khan Market) 110003, Delhi', 426, True, 7118455.859789653, 5497172.29423919)
('4d39c67d687ca35dc2ef82c4', "Azam's Mughlai", 28.600110036726374, 7

In [29]:
print('Restaurants around location')
print('---------------------------')
for i in range(25, 110):
    rs = location_restaurants[i][:8]
    names = ', '.join([r[1] for r in rs])
    print('Restaurants around location {}: {}'.format(i+1, names))

Restaurants around location
---------------------------
Restaurants around location 26: 
Restaurants around location 27: Gola sizzlers
Restaurants around location 28: 
Restaurants around location 29: 
Restaurants around location 30: 
Restaurants around location 31: Fujiya Japanese n Chinese restaurant
Restaurants around location 32: Tee pee o, Fujiya Japanese n Chinese restaurant
Restaurants around location 33: 
Restaurants around location 34: 
Restaurants around location 35: 
Restaurants around location 36: Punjabi Nawab's - Best Western Taurus, Ardor Catering Company
Restaurants around location 37: 
Restaurants around location 38: 
Restaurants around location 39: Varq | वर्क, Machan, Wasabi by Morimoto, The House of Ming
Restaurants around location 40: Varq | वर्क, Machan, Wasabi by Morimoto, The House of Ming
Restaurants around location 41: 
Restaurants around location 42: Gola sizzlers
Restaurants around location 43: 
Restaurants around location 44: 
Restaurants around location 45:

Let's now see all the collected restaurants in our area of interest on map, and let's also show Punjabi restaurants in different color.

In [30]:
map_delhi = folium.Map(location=delhi_centre, zoom_start=13)
folium.Marker(delhi_centre, popup='delhi').add_to(map_delhi)
for res in restaurants.values():
    lat = res[2]; lon = res[3]
    is_punjabi = res[6]
    color = 'red' if is_punjabi else 'blue'
    folium.CircleMarker([lat, lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_delhi)
map_delhi

Looking good. So now we have all the restaurants in area within few kilometers from Cannaught Place, and we know which ones are Punjabi restaurants! We also know which restaurants exactly are in vicinity of every neighborhood candidate center.

This concludes the data gathering phase - we're now ready to use this data for analysis to produce the report on optimal locations for a new Punjabi restaurant!

## Methodology

In this project we will direct our efforts on detecting areas of CP,Delhi that have low restaurant density, particularly those with low number of Punjabu restaurants. We will limit our analysis to area ~6km around CP center.

In first step we have collected the required **data: location and type (category) of every restaurant within 6km from Cannaught Place center** . i have also **identified Punjabi restaurants** (according to Foursquare categorization).

Second step in our analysis will be calculation and exploration of '**restaurant density**' across different areas near of CP - i  will use **heatmaps** to identify a few promising areas close to center with low number of restaurants in general (*and* no Punjabi restaurants in vicinity) and focus our attention on those areas.

In third and final step we will focus on most promising areas and within those create **clusters of locations that meet some basic requirements** established in discussion with stakeholders: i will take into consideration locations with **no more than two restaurants in radius of 250 meters**, and we want locations **without Punjabi restaurants in radius of 400 meters**. We will present map of all such locations but also create clusters (using **k-means clustering**) of those locations to identify general zones / neighborhoods / addresses which should be a starting point for final 'street level' exploration and search for optimal venue location by stakeholders.

## Detailed Analysis

In [34]:
location_restaurants_count = [len(res) for res in location_restaurants]

df_locations['Restaurants in area'] = location_restaurants_count

print('Average number of restaurants in every area with radius=300m:', np.array(location_restaurants_count).mean())


Average number of restaurants in every area with radius=300m: 1.263779527559055


In [50]:
fig = px.scatter(df_locations,y='Distance_from_center',color='Distance_from_center')
fig.update_layout(template='plotly_dark')

In [36]:
df_locations.head()

Unnamed: 0,Address,Latitude,Longitude,Distance_from_center,Restaurants in area
0,"Raisina Hill, Rakab Ganj, Chanakya Puri Tehsil...",28.616404,77.193322,3042.335927,0
1,"Mughal Gardens, Rajpath, Raisina Hill, Rakab G...",28.614101,77.196174,2970.912716,0
2,"Dara Shikoh Road, Raisina Hill, Rakab Ganj, Ch...",28.611798,77.199025,2946.650887,0
3,"Tyagaraj Marg, Raisina Hill, Rakab Ganj, Chana...",28.609494,77.201876,2970.69562,0
4,"Rakab Ganj, Chanakya Puri Tehsil, New Delhi, D...",28.607191,77.204727,3041.891312,0


In [52]:
fig = px.histogram(df_locations,x='Restaurants in area',color='Restaurants in area')
fig.update_layout(template='plotly_dark')

#### Calculating the **distance to nearest Punjabi restaurant from every area candidate center** (not only those within 300m - we want distance to closest one, regardless of how distant it is).

In [38]:
%%time
x_cor = []
y_cor =[]
for lat,lan in zip(df_locations.Latitude,df_locations.Longitude):
    x, y= lonlat_to_xy(lat,lon)
    x_cor.append(x)
    y_cor.append(y)
print('Done')

Done
Wall time: 28.8 s


In [39]:
#Adding the above calulated Coordinates into our **df_locations**
df_locations['x_coordinate'] = x_cor
df_locations['y_coordinate'] = y_cor
df_locations.head()

Unnamed: 0,Address,Latitude,Longitude,Distance_from_center,Restaurants in area,x_coordinate,y_coordinate
0,"Raisina Hill, Rakab Ganj, Chanakya Puri Tehsil...",28.616404,77.193322,3042.335927,0,832691.987792,8613088.0
1,"Mughal Gardens, Rajpath, Raisina Hill, Rakab G...",28.614101,77.196174,2970.912716,0,832636.671451,8613075.0
2,"Dara Shikoh Road, Raisina Hill, Rakab Ganj, Ch...",28.611798,77.199025,2946.650887,0,832581.357165,8613062.0
3,"Tyagaraj Marg, Raisina Hill, Rakab Ganj, Chana...",28.609494,77.201876,2970.69562,0,832526.044933,8613049.0
4,"Rakab Ganj, Chanakya Puri Tehsil, New Delhi, D...",28.607191,77.204727,3041.891312,0,832470.734756,8613036.0


In [40]:
# Now I am trying to getting the distance of each candidate address from Punjabi Restro
distances_to_punjabi_restaurant = []

for area_x, area_y in zip(xs,ys):
    min_distance = 10000
    for res in punjabi_restaurants.values():
        res_x = res[7]
        res_y = res[8]
        d = calc_xy_distance(area_x, area_y, res_x, res_y)
        if d<min_distance:
            min_distance = d
    distances_to_punjabi_restaurant.append(min_distance)

df_locations['Distance to Punjabi restaurant'] = distances_to_punjabi_restaurant

df_locations.head()

Unnamed: 0,Address,Latitude,Longitude,Distance_from_center,Restaurants in area,x_coordinate,y_coordinate,Distance to Punjabi restaurant
0,"Raisina Hill, Rakab Ganj, Chanakya Puri Tehsil...",28.616404,77.193322,3042.335927,0,832691.987792,8613088.0,1931.899326
1,"Mughal Gardens, Rajpath, Raisina Hill, Rakab G...",28.614101,77.196174,2970.912716,0,832636.671451,8613075.0,1984.435727
2,"Dara Shikoh Road, Raisina Hill, Rakab Ganj, Ch...",28.611798,77.199025,2946.650887,0,832581.357165,8613062.0,2205.387791
3,"Tyagaraj Marg, Raisina Hill, Rakab Ganj, Chana...",28.609494,77.201876,2970.69562,0,832526.044933,8613049.0,2551.36933
4,"Rakab Ganj, Chanakya Puri Tehsil, New Delhi, D...",28.607191,77.204727,3041.891312,0,832470.734756,8613036.0,2912.725997


In [45]:
fig = px.scatter(df_locations,y='Distance to Punjabi restaurant',color='Distance to Punjabi restaurant')
fig.update_layout(template='plotly_dark')

In [49]:
print('Average distance to closest Punjabi restaurant from each Area center:', df_locations['Distance to Punjabi restaurant'].mean().round(3),'meter')

Average distance to closest Punjabi restaurant from each Area center: 674.378 meter


#### Next Steps

1. OK, so **on average Punjabi restaurant can be found within ~500m** from every area center candidate. That's fairly close, so we need to filter our areas carefully!

2. Let's crete a map showing **heatmap / density of restaurants** and try to extract some meaningfull info from that. Also, let's show **borders of Delhi Wards** on our map and a few circles indicating distance of 500m, 1km,2km and 3km from Cannaught Place.

In [54]:
# getting Latitudes and Longitudes of Restaurants and Punjabi_Restros (That we earlier generated)
restaurant_latlons = [[res[2], res[3]] for res in restaurants.values()]

punjabi_latlons = [[res[2], res[3]] for res in punjabi_restaurants.values()]

#### In Folium map i am using Delhi_Boundary.geojson file (source:Github)

In [56]:
def boroughs_style(feature):
    return { 'color': 'blue', 'fill': False }

from folium import plugins
from folium.plugins import HeatMap

map_delhi = folium.Map(location=delhi_centre, zoom_start=13)
folium.TileLayer('cartodbpositron').add_to(map_delhi) #cartodbpositron cartodbdark_matter
HeatMap(restaurant_latlons).add_to(map_delhi)
folium.Marker(delhi_centre).add_to(map_delhi)
folium.Circle(delhi_centre, radius=500, fill=False, color='white').add_to(map_delhi)
folium.Circle(delhi_centre, radius=1000, fill=False, color='white').add_to(map_delhi)
folium.Circle(delhi_centre, radius=2000, fill=False, color='white').add_to(map_delhi)
folium.Circle(delhi_centre, radius=3000, fill=False, color='white').add_to(map_delhi)
folium.GeoJson(open('Delhi_Boundary.geojson').read(), name='geojson').add_to(map_delhi)
map_delhi

Now i can see few pockets of low restaurant density closest to Cannaught Place can be found **North/East of Cannaught Place**. 

Let's create another heatmap map showing **heatmap/density of Punjabi restaurants** only.

In [58]:
map_delhi = folium.Map(location=delhi_centre, zoom_start=13)
folium.TileLayer('cartodbpositron').add_to(map_delhi) #cartodbpositron cartodbdark_matter
HeatMap(punjabi_latlons).add_to(map_delhi)
folium.Marker(delhi_centre).add_to(map_delhi)
folium.Circle(delhi_centre, radius=500, fill=False, color='white').add_to(map_delhi)
folium.Circle(delhi_centre, radius=1000, fill=False, color='white').add_to(map_delhi)
folium.Circle(delhi_centre, radius=2000, fill=False, color='white').add_to(map_delhi)
folium.Circle(delhi_centre, radius=3000, fill=False, color='white').add_to(map_delhi)
folium.GeoJson(open('Delhi_Boundary.geojson').read(), name='geojson').add_to(map_delhi)
map_delhi

In [59]:
### As i got some visuals but it is not so good to tell my results here i have to do further analysis on the low density areas for more accurate results

#### Marking Area closer to Center and having less density of restros

In [60]:
roi_x_min = delhi_center_x + 200
roi_y_max = delhi_center_y -300
roi_width = 5000
roi_height = 5000
roi_center_x = roi_x_min - 100
roi_center_y = roi_y_max + 350
roi_center_lon, roi_center_lat = xy_to_lonlat(roi_center_x, roi_center_y)
roi_center = [roi_center_lat, roi_center_lon]

map_delhi = folium.Map(location=roi_center, zoom_start=14)
HeatMap(restaurant_latlons).add_to(map_delhi)
folium.Marker(delhi_centre).add_to(map_delhi)
folium.Circle(roi_center, radius=300, color='white', fill=True, fill_opacity=0.4).add_to(map_delhi)
folium.GeoJson(open('Delhi_Boundary.geojson').read(), name='geojson').add_to(map_delhi)
map_delhi

#### Now I am doing the same method for generating the grids that i did in the staring but having different Parameters

In [62]:
%%time
### Yeah i got that required area marked now i am creating a more denser GRID of locations candidates restricted to our new region of interest (let's make our location candidates 100m appart).

k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_step = 100
y_step = 100 * k 
roi_y_min = roi_center_y - 2500

roi_latitudes = []
roi_longitudes = []
roi_xs = []
roi_ys = []
for i in range(0, int(51/k)):
    y = roi_y_min + i * y_step
    x_offset = 50 if i%2==0 else 0
    for j in range(0, 51):
        x = roi_x_min + j * x_step + x_offset
        d = calc_xy_distance(roi_center_x, roi_center_y, x, y)
        if (d <= 2501):
            lon, lat = xy_to_lonlat(x, y)
            roi_latitudes.append(lat)
            roi_longitudes.append(lon)
            roi_xs.append(x)
            roi_ys.append(y)

print(len(roi_latitudes), 'Candidate Area centers generated.')

1087 Candidate Area centers generated.
Wall time: 2min 10s


In [63]:
# Creating functions for getting our desired values
def count_restaurants_nearby(x, y, restaurants, radius=250):    
    count = 0
    for res in restaurants.values():
        res_x = res[7]; res_y = res[8]
        d = calc_xy_distance(x, y, res_x, res_y)
        if d<=radius:
            count += 1
    return count

def find_nearest_restaurant(x, y, restaurants):
    d_min = 100000
    for res in restaurants.values():
        res_x = res[7]; res_y = res[8]
        d = calc_xy_distance(x, y, res_x, res_y)
        if d<=d_min:
            d_min = d
    return d_min

roi_restaurant_counts = []
roi_punjabi_distances = []

print('Generating data on location candidates... ', end='')
for x, y in zip(roi_xs, roi_ys):
    count = count_restaurants_nearby(x, y, restaurants, radius=250)
    roi_restaurant_counts.append(count)
    distance = find_nearest_restaurant(x, y, punjabi_restaurants)
    roi_punjabi_distances.append(distance)
print('done.')


Generating data on location candidates... done.


#### I got around 1000+ location candidates inside our low density zone:

1. Number of restaurants in vicinity (we'll use radius of 250 meters)
2. Distance to closest Punjabi restaurant.

In [65]:
# Let's put this into dataframe
df_lowdensity = pd.DataFrame({'Latitude':roi_latitudes,
                                 'Longitude':roi_longitudes,
                                 'X':roi_xs,
                                 'Y':roi_ys,
                                 'Restaurants_nearby':roi_restaurant_counts,
                                 'closest_punjabi_restro':roi_punjabi_distances})

df_lowdensity.head(10)

Unnamed: 0,Latitude,Longitude,X,Y,Restaurants_nearby,closest_punjabi_restro
0,28.620728,77.21004,7114097.0,5497978.0,0,1040.05312
1,28.620344,77.210516,7114197.0,5497978.0,0,1115.971735
2,28.61996,77.210991,7114297.0,5497978.0,0,1120.274202
3,28.619576,77.211466,7114397.0,5497978.0,0,1036.953327
4,28.619192,77.211941,7114497.0,5497978.0,1,956.83338
5,28.618808,77.212416,7114597.0,5497978.0,1,880.788302
6,28.620899,77.210655,7114147.0,5498064.0,0,1023.305031
7,28.620515,77.21113,7114247.0,5498064.0,0,1104.914299
8,28.620131,77.211605,7114347.0,5498064.0,0,1032.932627
9,28.619747,77.21208,7114447.0,5498064.0,0,947.210498


In [67]:
df_lowdensity.shape

(1087, 6)

In [72]:
fig = px.scatter(df_lowdensity,y='closest_punjabi_restro',color='closest_punjabi_restro')
fig.update_layout(template='plotly_dark')

In [75]:
fig = px.histogram(df_lowdensity,x='Restaurants_nearby',color='Restaurants_nearby')
fig.update_layout(template='plotly_dark')

In [73]:
df_lowdensity['closest_punjabi_restro'].mean().round(3)

387.892

In [76]:
#Lets export this dataset for our further use:

df_lowdensity.to_csv('area_lowdensity_restro.csv')

#### Now i am getting more closer:

**Filtering the locations on the basis of **

1. Locations with no more than two restaurants in radius of 250 meters.
2. No Punjabi restaurants in radius of 400 meters.

In [80]:
# Both above conditions to be true:
df_good_area = df_lowdensity[(df_lowdensity['Restaurants_nearby']<=2) & (df_lowdensity['closest_punjabi_restro']>=400)] 

df_good_area.shape


(466, 6)

In [81]:
# Visualizing these Good area location:

good_latitudes = df_good_area['Latitude'].values
good_longitudes = df_good_area['Longitude'].values

good_locations = [[lat, lon] for lat, lon in zip(good_latitudes, good_longitudes)]

map_delhi = folium.Map(location=roi_center, zoom_start=14)
folium.TileLayer('cartodbpositron').add_to(map_delhi)
HeatMap(restaurant_latlons).add_to(map_delhi)
folium.Circle(roi_center, radius=2500, color='white', fill=True, fill_opacity=0.6).add_to(map_delhi)
folium.Marker(delhi_centre).add_to(map_delhi)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_delhi) 
folium.GeoJson(open('Delhi_Boundary.geojson').read(), name='geojson').add_to(map_delhi)
map_delhi

Looking good.
Now I have a bunch of locations fairly close to  and we know that each of those locations has no more than two restaurants in radius of 250m, and no Punjabi restaurant closer than 400m. Any of those locations is a **potential candidate for a new Punjabi restaurant**, at least based on nearby competition.

Let's now show those good locations in a form of heatmap:

In [82]:
map_delhi = folium.Map(location=roi_center, zoom_start=14)
HeatMap(good_locations, radius=25).add_to(map_delhi)
folium.Marker(delhi_centre).add_to(map_delhi)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_delhi)
folium.GeoJson(open('Delhi_Boundary.geojson').read(), name='geojson').add_to(map_delhi)
map_delhi

## Now i have to analyis these candidate location points and get some better center points.


1. NOw I will try to **cluster** those locations to create **centers of zones containing good locations**. Those zones, their centers and addresses will be the final result of our analysis.

In [84]:
#Using Kmeans algo to make clusters as it is like unsupervised learning type

from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


In [90]:
good_xys = df_good_area[['X', 'Y']].values

range_n_clusters = np.arange(2,20)
for n_clusters in range_n_clusters:
    clust_model = KMeans(n_clusters=n_clusters)
    preds = clust_model.fit_predict(good_xys)
    #centers = clusterer.cluster_centers

    score = silhouette_score(good_xys, preds).round(2)*100
    print("For k = {},  silhouette score is {} %)".format(n_clusters, score))


For k = 2,  silhouette score is 52.0 %)
For k = 3,  silhouette score is 51.0 %)
For k = 4,  silhouette score is 53.0 %)
For k = 5,  silhouette score is 53.0 %)
For k = 6,  silhouette score is 52.0 %)
For k = 7,  silhouette score is 49.0 %)
For k = 8,  silhouette score is 45.0 %)
For k = 9,  silhouette score is 42.0 %)
For k = 10,  silhouette score is 43.0 %)
For k = 11,  silhouette score is 43.0 %)
For k = 12,  silhouette score is 45.0 %)
For k = 13,  silhouette score is 45.0 %)
For k = 14,  silhouette score is 44.0 %)
For k = 15,  silhouette score is 43.0 %)
For k = 16,  silhouette score is 42.0 %)
For k = 17,  silhouette score is 42.0 %)
For k = 18,  silhouette score is 41.0 %)
For k = 19,  silhouette score is 40.0 %)


In [91]:
Ks = 20
mean_acc = np.zeros((Ks-1))
for n_clusters in range(2,Ks):
    clusterer = KMeans(n_clusters=n_clusters)
    preds = clusterer.fit_predict(good_xys)
    mean_acc[n_clusters-1] = silhouette_score(good_xys, preds)
print(f'The best silhouette percentage is {mean_acc.max()*100} % for  K={mean_acc.argmax()+1}')

The best silhouette percentage is 53.46778321963042 % for  K=4


In [94]:
# Let see major Areas 


number_of_clusters = 5

good_xys = df_good_area[['X', 'Y']].values
kmeans = KMeans(n_clusters=number_of_clusters, random_state=0).fit(good_xys)

cluster_centers = [xy_to_lonlat(cc[0], cc[1]) for cc in kmeans.cluster_centers_]

map_delhi = folium.Map(location=roi_center, zoom_start=14)
folium.Marker(delhi_centre).add_to(map_delhi)
folium.TileLayer('cartodbpositron').add_to(map_delhi)
HeatMap(restaurant_latlons).add_to(map_delhi)
folium.Circle(roi_center, radius=2500, color='white', fill=True, fill_opacity=0.4).add_to(map_delhi)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=350, color='green', fill=True, fill_opacity=0.25).add_to(map_delhi) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_delhi)
folium.GeoJson(open('Delhi_Boundary.geojson').read(), name='geojson').add_to(map_delhi)
map_delhi

#### But,Here i will select K according to the number of candidates area that i want to be in my final results

In [95]:

number_of_clusters = 15

good_xys = df_good_area[['X', 'Y']].values
kmeans = KMeans(n_clusters=number_of_clusters, random_state=0).fit(good_xys)

cluster_centers = [xy_to_lonlat(cc[0], cc[1]) for cc in kmeans.cluster_centers_]

map_delhi = folium.Map(location=roi_center, zoom_start=14)
folium.Marker(delhi_centre).add_to(map_delhi)
folium.TileLayer('cartodbpositron').add_to(map_delhi)
HeatMap(restaurant_latlons).add_to(map_delhi)
folium.Circle(roi_center, radius=2500, color='white', fill=True, fill_opacity=0.4).add_to(map_delhi)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=350, color='green', fill=True, fill_opacity=0.25).add_to(map_delhi) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_delhi)
folium.GeoJson(open('Delhi_Boundary.geojson').read(), name='geojson').add_to(map_delhi)
map_delhi

Lets have some clear view without heatmap

In [89]:
map_delhi = folium.Map(location=roi_center, zoom_start=14)
folium.Marker(delhi_centre).add_to(map_delhi)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#00000000', fill=True, fill_color='#0066ff', fill_opacity=0.07).add_to(map_delhi)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_delhi)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=400, color='green', fill=False).add_to(map_delhi) 
folium.GeoJson(open('Delhi_Boundary.geojson').read(), name='geojson').add_to(map_delhi)
map_delhi

For coordinates finding i used:
https://www.findlatitudeandlongitude.com/r/20%2C+Ashoka+Road%2C+New+Delhi%2C+Delhi+110001%2C+India/880046/

Visualizing Candidate areas in **Ashok road**:


In [100]:
map_delhi = folium.Map(location=[28.622515,77.214242], zoom_start=15)
folium.Marker(delhi_centre).add_to(map_delhi)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=500, color='green', fill=False).add_to(map_delhi) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.07).add_to(map_delhi)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_delhi)
folium.GeoJson(open('Delhi_Boundary.geojson').read(), name='geojson').add_to(map_delhi)
map_delhi


Visualizing Candidate areas in **Indrajit Verma Marg**:

In [99]:
map_delhi = folium.Map(location=[28.633547,77.232888], zoom_start=15)
folium.Marker(delhi_centre).add_to(map_delhi)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=500, color='green', fill=False).add_to(map_delhi) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.07).add_to(map_delhi)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_delhi)
folium.GeoJson(open('Delhi_Boundary.geojson').read(), name='geojson').add_to(map_delhi)
map_delhi

### Finaly, let's **reverse geocode those candidate area centers to get the addresses** which can be presented to stakeholders.

In [96]:
# Getting the centers of the clusters and storing it for reverse geocoding
lati = []
lng  = []
for lat,lon in cluster_centers:
    lati.append(lon)
    lng.append(lat)
print('done')

done


In [97]:
#Transforming in the coordinate format for a Geocoder
cluster_centers_1 = list(zip(lati,lng))
cluster_centers_1[0]

(28.631582980875045, 77.23475681971075)

In [98]:
# checking reverse geocoder
geolocator.reverse('28.631582980875045, 77.23475681971075')

Location(Rouse Avenue Court, Indrajit Verma Marg, Rouse Avenue, Delhi, Kotwali Tehsil, Central Delhi, Delhi, 110002, India, (28.6319525, 77.23432411189678, 0.0))

In [101]:
# Calculating the distance from Cannaught Place(centre) to clusters_Center

import geopy.distance
dist = []
coords_1 = (delhi_centre[0],delhi_centre[1])
for lat, lon in cluster_centers_1:
    coords_2 = (lat, lon)
    dist.append(geopy.distance.distance(coords_1, coords_2).m)

 **Important**

 Doing Reverse Geocoding for the clusters Centres

In [104]:
%%time
candidate_area_addresses = []
distance = []
print('==============================================================')
print('Addresses of centers of areas recommended for Opening New Punjabi Resturant')
print('==============================================================\n')
for i in range(len(cluster_centers_1)):
    location = geolocator.reverse(cluster_centers_1[i])
    location = str(location).replace(', India','')
    candidate_area_addresses.append(location)  

coords_1 = (delhi_centre[0],delhi_centre[1])
for lat, lon in cluster_centers_1:
    coords_2 = (lat, lon)
    distance.append(geopy.distance.distance(coords_1, coords_2).m)
print('All recommended places are found!!! \n')    

Addresses of centers of areas recommended for Opening New Punjabi Resturant

All recommended places are found 

Wall time: 7.28 s


# Recommended Area addresss along with distance from Cannaught Place

In [105]:
# Creating a Dataframe having my Analysis Results

columns = ['Area','Distance from Cannaught Place_KM']
df_prefered = pd.DataFrame(columns=columns)
df_prefered['Area']=candidate_area_addresses
df_prefered['Distance from Cannaught Place_KM']= distance
df_prefered['Distance from Cannaught Place_KM'] = df_prefered['Distance from Cannaught Place_KM']/1000
df_prefered

Unnamed: 0,Area,Distance from Cannaught Place_KM
0,"Rouse Avenue Court, Indrajit Verma Marg, Rouse...",1.503805
1,"Patel Chok, Ashoka Road, Rakab Ganj, Chanakya ...",1.001312
2,"Pandit Ravi Shankar Shukla Lane, Chanakya Puri...",1.256852
3,"Maharaja Ranjeet Singh Marg, Delhi, Kotwali Te...",1.429144
4,"Kasturba Gandhi Marg Crossing, Firoz Shah Road...",1.131471
5,"Barakhamba Road, Shankar Market, Chanakya Puri...",1.114253
6,"Shankar Market, Chanakya Puri Tehsil, New Delh...",0.488816
7,"Chanakya Puri Tehsil, New Delhi, Delhi, 110 001",1.516319
8,"Rakab Ganj, Chanakya Puri Tehsil, New Delhi, D...",1.398064
9,"Atul Grove Road, Connaught Place, Chanakya Pur...",0.796054


In [108]:
# Exporting my Analysis resluts
df_prefered.to_csv('Prefferd_location_centre.csv')

### On map

In [106]:
map_delhi = folium.Map(location=roi_center, zoom_start=14)
folium.Circle(delhi_centre, radius=50, color='red', fill=True, fill_color='red', fill_opacity=1).add_to(map_delhi)
for lonlat, candidate_area_addresses in zip(cluster_centers, candidate_area_addresses):
    folium.Marker([lonlat[1], lonlat[0]], popup=location).add_to(map_delhi) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.05).add_to(map_delhi)
map_delhi

### Outcome

#### This concludes my analysis. I have created 15 addresses representing centers of zones containing locations with low number of restaurants and no Punjabi restaurants nearby, all zones being fairly close to Cannaught Place.


### Further Idea

Their centers/addresses should be considered only as a starting point for exploring area nearby in search for potential restaurant locations. Most of the zones are located in within 2km radius from centre,which we have identified as interesting due to being popular with tourists, fairly close to city center and well connected by public transport.

## A Analysis and a method proposed by Kumar Shivam (an aspiring Data Scientist)

@copyrights_reserved

Thankyou