# A recommendation system for setting up a new restaurant / eatery focussed primarily on the corporate demographic (based on employee count and location of corporate offices in Bangalore, India)

## Brief Introduction

## Part 1: Description of problem

### India is one of the most densely populated countries in the world, with more than 1.34 billion residents.
### High real estate costs make it difficult to start a business here.
### An entrepreneur targeting a corporate-centric market should therefore know the best places to set up shop.

### A large share of Bangalore's population falls into this corporate demographic: the city hosts more than 200 corporations and 800+ startups, so eating out and snacking are more popular and convenient than ever. The goal is therefore to find the best places in Bangalore to set up a new food shop or restaurant.
### The example chosen is Bangalore, but the same approach can be applied to other locations such as Chennai, Mumbai or Delhi.

### The objective is to find the optimal location for setting up a new business, based on the locations of offices and eateries in Bangalore, India.

### Target audience: 
### Entrepreneurs and small-scale businessmen/women interested in the food and snacks industry who target the corporate demographic to maximise profits.


## Part 2: Data that is needed

### 1. We need a list of the corporate offices in Bangalore. Their latitude and longitude will be obtained with the Nominatim geocoder from geopy (a Python library).

This data can be found on Wikipedia, as well as many other websites.

For instance: https://en.wikipedia.org/wiki/Category:Companies_based_in_Bangalore

### 2. **Then we can use the Foursquare API to find the number of eateries within a 1 km radius of each office.** The API will provide us with Postal Code, Neighborhood, Venue, Venue Summary and Venue Category.

Foursquare is a local search-and-discovery service mobile app which provides search results for its users (Wikipedia). It has more than 60 million users.
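To make "within a radius of an office" concrete, here is a minimal sketch of a great-circle (haversine) distance check, using only the standard library. It is not part of the original workflow, which leaves the radius filtering to the API; the function names `haversine_m` and `within_radius` are hypothetical, and the two coordinate pairs are the Amazon office and the venue Jalsa as they appear in this notebook's outputs.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(office, venue, radius_m=1000):
    """True if the venue lies no further than radius_m metres from the office."""
    return haversine_m(*office, *venue) <= radius_m


office = (12.9795028, 77.6959454)            # Amazon, from the geocoding output
venue = (12.97768337190332, 77.6954755254431)  # Jalsa, from the API response

print(round(haversine_m(*office, *venue)), "m")
print(within_radius(office, venue))
```

For these two points the distance comes out at roughly 200 m, so the venue falls inside both a 1 km and a 500 m radius.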

### 3. Process the retrieved data and create a structured DataFrame for all the venues, grouped by office.

### 4. Select the relevant venues (food related only).

### **The offices with the highest ratio of `(no. of employees)/(no. of eateries)` would be the best places to start a restaurant** (high demand, low supply).
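The ranking rule can be sketched in a few lines of plain Python. Everything below is illustrative: the office names and the employee and eatery counts are hypothetical placeholders, not data from this project, and `rank_offices` is a name introduced here. Treating an office with zero eateries as if it had one (to avoid dividing by zero) is also an assumption of this sketch.

```python
def rank_offices(employees, eateries):
    """Return (office, ratio) pairs sorted by employees per eatery, highest first."""
    ranked = []
    for office, n_employees in employees.items():
        n_eateries = eateries.get(office, 0)
        # Assumption: count at least one eatery so the ratio is always defined.
        ratio = n_employees / max(n_eateries, 1)
        ranked.append((office, ratio))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical figures, for illustration only.
employees = {'Office A': 5000, 'Office B': 12000, 'Office C': 800}
eateries = {'Office A': 7, 'Office B': 30, 'Office C': 0}

for office, ratio in rank_offices(employees, eateries):
    print(office, round(ratio, 1))
```

With these made-up numbers, Office C ranks first (no eateries nearby), followed by Office A and then Office B, even though Office B has the most employees.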

We can also create clusters of the areas most densely populated by office employees.

In [1]:
import requests  # library to handle requests
import pandas as pd  # library for data analysis
import numpy as np  # library to handle data in a vectorized manner
import random  # library for random number generation

# module to convert an address into latitude and longitude values
from geopy.geocoders import Nominatim

# libraries for displaying images
from IPython.display import Image
from IPython.core.display import HTML

# json_normalize flattens nested JSON into a pandas DataFrame
# (exposed at the top level of pandas since version 1.0)
from pandas import json_normalize
import folium  # plotting library

print('Folium installed')
print('Libraries imported.')

Folium installed
Libraries imported.


In [2]:
CLIENT_ID = '1NW3OMZEIVJXCFGGIYLXLLH4CQWIYX3GSO4ERVDST4FXYI4E'  # your Foursquare ID
# your Foursquare Secret
CLIENT_SECRET = 'NNWGYNG2RFCIPLNG0ER4BNC15GPE2CSY10UA32BJFCBYOO0Y'
VERSION = '20180604'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 1NW3OMZEIVJXCFGGIYLXLLH4CQWIYX3GSO4ERVDST4FXYI4E
CLIENT_SECRET:NNWGYNG2RFCIPLNG0ER4BNC15GPE2CSY10UA32BJFCBYOO0Y


In [3]:
# Amazon Development Center, Aquila building
geolocator = Nominatim(user_agent='myapplication')
location = geolocator.geocode("Amazon Bangalore India").raw
lat = location['lat']
lon = location['lon']
print("Latitude: ", lat)
print("Longitude: ", lon)

Latitude:  12.9795028
Longitude:  77.6959454


In [4]:
radius = 1000
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    VERSION,
    lat,
    lon,
    radius,
    100)
url

'https://api.foursquare.com/v2/venues/explore?&client_id=1NW3OMZEIVJXCFGGIYLXLLH4CQWIYX3GSO4ERVDST4FXYI4E&client_secret=NNWGYNG2RFCIPLNG0ER4BNC15GPE2CSY10UA32BJFCBYOO0Y&v=20180604&ll=12.9795028,77.6959454&radius=1000&limit=100'
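As an alternative to formatting the query string by hand, the same URL can be assembled with `urllib.parse.urlencode`, which takes care of escaping. The sketch below also reads the credentials from environment variables so they do not have to appear in the notebook. The function name `build_explore_url` and the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are assumptions made for this example; the endpoint and parameter names are the ones used in the cell above.

```python
import os
from urllib.parse import urlencode

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(lat, lon, radius, limit,
                      client_id, client_secret, version='20180604'):
    """Build a venues/explore URL with the same parameters as the cell above."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lon),
        'radius': radius,
        'limit': limit,
    }
    # safe=',' keeps the comma in "lat,lon" readable instead of encoding it as %2C
    return EXPLORE_ENDPOINT + '?' + urlencode(params, safe=',')


# Hypothetical environment variable names; the fallbacks are placeholders.
client_id = os.environ.get('FOURSQUARE_CLIENT_ID', 'YOUR_CLIENT_ID')
client_secret = os.environ.get('FOURSQUARE_CLIENT_SECRET', 'YOUR_CLIENT_SECRET')

url = build_explore_url(12.9795028, 77.6959454, radius=1000, limit=100,
                        client_id=client_id, client_secret=client_secret)
# Print only the endpoint part so the secret is not echoed into the output.
print(url.split('?')[0])
```

The resulting `url` can be passed to `requests.get(url).json()` exactly like the hand-formatted one.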

In [5]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5dda6862e826ac0022d64225'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Current map view',
  'headerFullLocation': 'Current map view',
  'headerLocationGranularity': 'unknown',
  'totalResults': 27,
  'suggestedBounds': {'ne': {'lat': 12.988502809000009,
    'lng': 77.70516413934406},
   'sw': {'lat': 12.970502790999992, 'lng': 77.68672666065594}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4be967c2aecb76b073c95a80',
       'name': 'Jalsa',
       'location': {'address': 'Doddanekundi',
        'crossStreet': 'at Outer Ring Rd.',
        'lat': 12.97768337190332,
        'lng': 77.6954755254431,
        'labeledLatLngs': [{'label': 'dis

In [6]:
# function that extracts the category of the venue
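def get_category_type(row):
    # the column is called 'categories' or 'venue.categories',
    # depending on whether the column names have been cleaned yet
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    # a venue can have no category at all
    if len(categories_list) == 0:
        return None
    else:
        # keep only the name of the first (primary) category
        return categories_list[0]['name']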

In [7]:
venues = results['response']['groups'][0]['items']

nearby_venues = json_normalize(venues)  # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories',
                    'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(
    get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
print(nearby_venues.shape)
nearby_venues

(27, 4)


| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Jalsa | Indian Restaurant | 12.977683 | 77.695476 |
| 1 | X-Torque | Motorcycle Shop | 12.97757 | 77.693483 |
| 2 | Kabab Magic | Fried Chicken Joint | 12.978704 | 77.695532 |
| 3 | Subway | Sandwich Place | 12.980265 | 77.694095 |
| 4 | Flechazo | Mediterranean Restaurant | 12.974379 | 77.6974 |
| 5 | Subway | Sandwich Place | 12.985817 | 77.6944 |
| 6 | Cafe Coffee Day | Coffee Shop | 12.97955 | 77.694346 |
| 7 | CineMAX | Multiplex | 12.979687 | 77.69391 |
| 8 | Cafe Coffee Day | Coffee Shop | 12.982808 | 77.693899 |
| 9 | Callista Bar & Kitchen | Food | 12.980596 | 77.693488 |


In [8]:
nearby_venues.categories.unique()

array(['Indian Restaurant', 'Motorcycle Shop', 'Fried Chicken Joint',
       'Sandwich Place', 'Mediterranean Restaurant', 'Coffee Shop',
       'Multiplex', 'Food', 'Hotel', 'Movie Theater', 'Cafeteria',
       'BBQ Joint', 'Café', 'Business Service', 'Chinese Restaurant',
       'Burrito Place', 'Department Store', 'Park', 'Breakfast Spot',
       'Bus Station'], dtype=object)

### Listing the offices we will study

In [9]:
offices = ['Amazon','Infosys','Dell','HP','Tech Mahindra','SAP','Samsung R&D','Accenture','Wipro','TCS', 'IBM','Oracle',
           'Cognizant','Capgemini','Cisco','Mindtree','HCL','Mu Sigma','Robert Bosch','Thomson Reuters','Honeywell','CGI',
           'Mphasis','EY','Deloitte','Nokia','Intel','Huawei','Goldman Sachs','Flipkart']
print(len(offices))

30


Appending "Bangalore" to each name helps Nominatim resolve the coordinates of the right location.

In [10]:
offices = [x+" Bangalore" for x in offices]
offices

['Amazon Bangalore',
 'Infosys Bangalore',
 'Dell Bangalore',
 'HP Bangalore',
 'Tech Mahindra Bangalore',
 'SAP Bangalore',
 'Samsung R&D Bangalore',
 'Accenture Bangalore',
 'Wipro Bangalore',
 'TCS Bangalore',
 'IBM Bangalore',
 'Oracle Bangalore',
 'Cognizant Bangalore',
 'Capgemini Bangalore',
 'Cisco Bangalore',
 'Mindtree Bangalore',
 'HCL Bangalore',
 'Mu Sigma Bangalore',
 'Robert Bosch Bangalore',
 'Thomson Reuters Bangalore',
 'Honeywell Bangalore',
 'CGI Bangalore',
 'Mphasis Bangalore',
 'EY Bangalore',
 'Deloitte Bangalore',
 'Nokia Bangalore',
 'Intel Bangalore',
 'Huawei Bangalore',
 'Goldman Sachs Bangalore',
 'Flipkart Bangalore']

### Function to get latitude and longitude of each office

In [11]:
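def coords(office):
    """Geocode an office name; return a dict with its coordinates, or -1 if not found."""
    d = {}
    d['office'] = office
    geolocator = Nominatim(user_agent='myapplication')
    try:
        # geocode() returns None when nothing matches, so .raw raises here
        location = geolocator.geocode(office).raw
        d['latitude'] = location['lat']
        d['longitude'] = location['lon']
        return d
    except Exception:
        print("Office %s not found" % office)
        return -1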

In [12]:
l = []
for i in offices:
    details = coords(i)
    if details != -1:
        # reuse the result instead of geocoding the same office a second time
        l.append(details)
print(l)
print(len(l))

Office Mphasis Bangalore not found
Office Goldman Sachs Bangalore not found
[{'office': 'Amazon Bangalore', 'latitude': '12.9795028', 'longitude': '77.6959454'}, {'office': 'Infosys Bangalore', 'latitude': '12.84508845', 'longitude': '77.6649530443891'}, {'office': 'Dell Bangalore', 'latitude': '12.9383649', 'longitude': '77.629499'}, {'office': 'HP Bangalore', 'latitude': '12.99646925', 'longitude': '77.6888267135983'}, {'office': 'Tech Mahindra Bangalore', 'latitude': '12.85094555', 'longitude': '77.6778630567074'}, {'office': 'SAP Bangalore', 'latitude': '12.9608001', 'longitude': '77.6372345'}, {'office': 'Samsung R&D Bangalore', 'latitude': '13.0043687', 'longitude': '77.5524019'}, {'office': 'Accenture Bangalore', 'latitude': '12.9678934', 'longitude': '77.7240130513179'}, {'office': 'Wipro Bangalore', 'latitude': '12.9135669', 'longitude': '77.6856346215575'}, {'office': 'TCS Bangalore', 'latitude': '12.84830835', 'longitude': '77.6789892079896'}, {'office': 'IBM Bangalore', 'la
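The public Nominatim service asks clients to send at most one request per second, so a batch lookup benefits from a pause between calls. The following is a minimal sketch of such a loop, not the notebook's own code: `geocode_all` is a hypothetical helper that accepts any geocoding callable returning an object with `latitude` and `longitude` attributes (as geopy's `geocode` does) or `None`. A stand-in geocoder is used so the example runs offline; its single entry is taken from the output above.

```python
import time
from types import SimpleNamespace


def geocode_all(names, geocode, delay_s=1.0, sleep=time.sleep):
    """Geocode each name exactly once; return (found, missing)."""
    found, missing = [], []
    for i, name in enumerate(names):
        if i > 0:
            sleep(delay_s)  # pause between consecutive requests
        try:
            location = geocode(name)
        except Exception:
            location = None  # treat lookup errors like "not found"
        if location is None:
            missing.append(name)
            continue
        found.append({'office': name,
                      'latitude': float(location.latitude),
                      'longitude': float(location.longitude)})
    return found, missing


# Stand-in geocoder (hypothetical) so the sketch runs without network access.
def fake_geocode(name):
    known = {'Amazon Bangalore': (12.9795028, 77.6959454)}
    if name not in known:
        return None
    lat, lon = known[name]
    return SimpleNamespace(latitude=lat, longitude=lon)


found, missing = geocode_all(['Amazon Bangalore', 'Mphasis Bangalore'],
                             fake_geocode, sleep=lambda s: None)
print(found)
print(missing)
```

With geopy, the call would look like `geocode_all(offices, Nominatim(user_agent='myapplication').geocode)`. Storing the coordinates as floats also removes the need for `float(...)` conversions when plotting.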

Latitude and Longitude of Bangalore, India

In [13]:
ban_lat = 12.9716
ban_lon = 77.5946

#### Plotting all the offices that we are considering

In [14]:
office_map = folium.Map(location=[ban_lat, ban_lon], zoom_start=12)

for d in l:
    folium.CircleMarker(
        [float(d['latitude']), float(d['longitude'])],
        radius=5,
        popup=d['office'],
        fill=True,
        color='#0012EE',
        fill_color='red',
        fill_opacity=0.5
    ).add_to(office_map)

office_map

Food related categories

In [15]:
food_cats = ['Ice Cream Shop', 'Sandwich Place', 'Food Truck', 'Fast Food Restaurant', 'Indian Restaurant', 'Steakhouse',
            'Chinese Restaurant', 'Coffee Shop', 'Bakery', 'Café', 'Middle Eastern Restaurant']
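An exact-match list only catches the categories it names, so labels returned earlier such as 'Fried Chicken Joint' or 'BBQ Joint' would be skipped. One possible alternative, shown here purely as a sketch and not as the approach used in the rest of this notebook, is to match on keywords. The keyword tuple `FOOD_KEYWORDS` and the helper `is_food_category` are assumptions of this example and would need tuning against real category names.

```python
# Hypothetical keyword list; tune it against the categories actually returned.
FOOD_KEYWORDS = ('restaurant', 'food', 'café', 'cafe', 'coffee', 'bakery',
                 'joint', 'sandwich', 'ice cream', 'steakhouse',
                 'breakfast', 'burrito')


def is_food_category(category):
    """True if the category name contains any of the food keywords."""
    if not category:
        return False  # venues without a category are not counted as food
    name = category.lower()
    return any(keyword in name for keyword in FOOD_KEYWORDS)


# Categories copied from the output of nearby_venues.categories.unique() above.
categories = ['Indian Restaurant', 'Motorcycle Shop', 'Fried Chicken Joint',
              'Sandwich Place', 'Mediterranean Restaurant', 'Coffee Shop',
              'Multiplex', 'Food', 'Hotel', 'Movie Theater', 'Cafeteria',
              'BBQ Joint', 'Café', 'Business Service', 'Chinese Restaurant',
              'Burrito Place', 'Department Store', 'Park', 'Breakfast Spot',
              'Bus Station']

food = [c for c in categories if is_food_category(c)]
print(food)
```

Keyword matching trades precision for recall: it picks up more eateries, but a broad keyword can also match unrelated categories, so the result is worth inspecting by eye.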

Testing plot for nearby venues for Amazon

In [16]:
print(l[0])
office1 = l[0]['office']
lat = l[0]['latitude']
lon = l[0]['longitude']

{'office': 'Amazon Bangalore', 'latitude': '12.9795028', 'longitude': '77.6959454'}


In [17]:
radius = 500
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    VERSION,
    lat,
    lon,
    radius,
    100)
url

'https://api.foursquare.com/v2/venues/explore?&client_id=1NW3OMZEIVJXCFGGIYLXLLH4CQWIYX3GSO4ERVDST4FXYI4E&client_secret=NNWGYNG2RFCIPLNG0ER4BNC15GPE2CSY10UA32BJFCBYOO0Y&v=20180604&ll=12.9795028,77.6959454&radius=500&limit=100'

In [18]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5dda68ba78a484001bfd94a8'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Current map view',
  'headerFullLocation': 'Current map view',
  'headerLocationGranularity': 'unknown',
  'totalResults': 15,
  'suggestedBounds': {'ne': {'lat': 12.984002804500005,
    'lng': 77.70055476967202},
   'sw': {'lat': 12.975002795499996, 'lng': 77.69133603032797}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4be967c2aecb76b073c95a80',
       'name': 'Jalsa',
       'location': {'address': 'Doddanekundi',
        'crossStreet': 'at Outer Ring Rd.',
        'lat': 12.97768337190332,
        'lng': 77.6954755254431,
        'labeledLatLngs': [{'label': 'dis

#### Filtering only food related venues

In [19]:
venues = results['response']['groups'][0]['items']

nearby_venues = json_normalize(venues)  # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories',
                    'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(
    get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
nearby_venues = nearby_venues[nearby_venues['categories'].isin(food_cats)]
print(nearby_venues.shape)
nearby_venues

(7, 4)


| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Jalsa | Indian Restaurant | 12.977683 | 77.695476 |
| 3 | Subway | Sandwich Place | 12.980265 | 77.694095 |
| 4 | Cafe Coffee Day | Coffee Shop | 12.97955 | 77.694346 |
| 6 | Cafe Coffee Day | Coffee Shop | 12.982808 | 77.693899 |
| 7 | spree hotels | Indian Restaurant | 12.978961 | 77.694559 |
| 11 | Hind Ka Chulha | Indian Restaurant | 12.980291 | 77.693474 |
| 14 | 3 Kings Cafe | Café | 12.976808 | 77.692859 |
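The same filtering can be done directly on the parsed JSON, which is convenient when only a per-office count of eateries is needed. The sketch below assumes the response layout visible in the truncated output above (`response` → `groups` → `items` → `venue`); `food_venues` is a hypothetical helper, and the sample response is a hand-trimmed stand-in containing the two venues Jalsa and X-Torque with the values shown earlier in this notebook.

```python
def food_venues(results, food_categories):
    """Return name, category and coordinates of venues whose primary category is food related."""
    items = results['response']['groups'][0]['items']
    matches = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        # the first category is treated as the primary one; a venue may have none
        category = categories[0]['name'] if categories else None
        if category in food_categories:
            matches.append({'name': venue['name'],
                            'category': category,
                            'lat': venue['location']['lat'],
                            'lng': venue['location']['lng']})
    return matches


# Hand-trimmed stand-in for an API response (only the fields used here).
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Jalsa',
               'location': {'lat': 12.97768337190332, 'lng': 77.6954755254431},
               'categories': [{'name': 'Indian Restaurant'}]}},
    {'venue': {'name': 'X-Torque',
               'location': {'lat': 12.97757, 'lng': 77.693483},
               'categories': [{'name': 'Motorcycle Shop'}]}},
]}]}}

eateries = food_venues(sample, ['Indian Restaurant', 'Coffee Shop'])
print(len(eateries), [v['name'] for v in eateries])
```

Calling `len(food_venues(results, food_cats))` for each office would give the eatery count that the employees-to-eateries ratio needs.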


In [20]:
venues_i1 = []
for index, row in nearby_venues.iterrows():
    d = {}
    d['name'] = row['name']
    d['lat'] = row['lat']
    d['lng'] = row['lng']
    venues_i1.append(d)

#### Plotting the food venues near Amazon

In [21]:
rest_map = folium.Map(location=[float(lat), float(lon)], zoom_start=15)

for d in venues_i1:
    folium.CircleMarker(
        [float(d['lat']), float(d['lng'])],
        radius=5,
        popup=d['name'],
        fill=True,
        color='blue',
        fill_color='blue',
        fill_opacity=0.5
    ).add_to(rest_map)

folium.CircleMarker(
    [float(lat), float(lon)],
    radius=7.5,
    popup=office1,
    fill=True,
    color='red',
    fill_color='red'
).add_to(rest_map)
rest_map