# A recommendation system for food stalls aimed at students (based on student count and location of schools/ colleges in Mumbai, India)

## Brief Introduction

## Part 1: Description of problem

### Mumbai, India is one of the most densely populated cities in the world, with more than 18 million residents.

### High real estate costs make it tough to start a business here, so an entrepreneur targeting the student market (the 13 to 20 year old demographic) needs to know the best places to set up shop.

### A large share of Mumbai's population falls in this student demographic (the city has more than 50 schools), and eating snack foods out is more popular and convenient than ever. We will therefore look for the best places in Mumbai to set up a food shop or restaurant.

### Target audience: 
### Entrepreneurs and small-scale businessmen/women interested in the food/ snacks industry, aiming at the student demographic


## Part 2: Data that is needed

### 1. We need a list of the most populated schools in Mumbai. Their latitude and longitude will be calculated using geopy Nominatim. 

This data can be found on Wikipedia, as well as the school websites.

For instance: https://en.wikipedia.org/wiki/List_of_educational_institutions_in_Mumbai

### 2. Then we can use the Foursquare API to find the number of eateries within a 750 m radius of each school. The API will provide us with Postal Code, Neighborhood, Venue, Venue Summary and Venue Category.

Foursquare is a local search-and-discovery service mobile app which provides search results for its users (Wikipedia). It has more than 60 million users.

### 3. We can also use the Foursquare API to retrieve all food-related categories, which we will use as a filter.

### 4. Processing the retrieved data and creating a structured DataFrame of all the venues, grouped by school.

### 5. Selecting relevant venues (food related only).

### **The schools with highest ratio of `(no. of students)/(no. of eateries)` would be the best places to start a food stall/ restaurant.** (supply and demand)

We can also cluster the areas with the highest student populations.
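
The ratio described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `rank_by_students_per_eatery` is hypothetical, and the student and eatery figures are placeholders, not data collected in this notebook.

```python
def rank_by_students_per_eatery(students, eateries):
    """Return (institute, ratio) pairs sorted from highest to lowest ratio."""
    ratios = {}
    for name, n_students in students.items():
        n_eateries = eateries.get(name, 0)
        # With no eateries at all, demand is entirely unmet: treat the ratio as infinite
        ratios[name] = n_students / n_eateries if n_eateries else float('inf')
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Placeholder figures for illustration only (NOT real data)
students = {'School A': 3000, 'School B': 1200, 'School C': 4400}
eateries = {'School A': 30, 'School B': 60, 'School C': 6}

for name, ratio in rank_by_students_per_eatery(students, eateries):
    print(name, round(ratio, 1))
```

The institute at the top of the ranking is the one where each existing eatery serves the most students, which is the supply-and-demand signal this analysis looks for.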

### Thank you for your time, I would greatly appreciate any feedback (sidjain1412@gmail.com)

### Imports

In [12]:
import requests  # library to handle HTTP requests
import pandas as pd  # library for data analysis
import numpy as np  # library to handle data in a vectorized manner
import random  # library for random number generation
import string  # used to strip punctuation from names shown on the Folium map
# module to convert an address into latitude and longitude values
from geopy.geocoders import Nominatim

# libraries for displaying images
from IPython.display import Image, HTML

# transforming a JSON response into a pandas DataFrame
# (top-level import, available in pandas >= 1.0)
from pandas import json_normalize
import folium  # plotting library

print('Libraries imported.')

Libraries imported.


### Declaring API keys

In [3]:
CLIENT_ID = 'WEMY4AM5NRBMPJ55IDUZ1XRYOHE52FANWWSHMCT2S0I1JUG3'  # your Foursquare ID
# your Foursquare Secret
CLIENT_SECRET = 'GAGO1KZFQ1DI3IKT1DG42DNQLGHPEBSJIE0QMDRXBJIHGJB1'
VERSION = '20180604'
LIMIT = 40
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: WEMY4AM5NRBMPJ55IDUZ1XRYOHE52FANWWSHMCT2S0I1JUG3
CLIENT_SECRET:GAGO1KZFQ1DI3IKT1DG42DNQLGHPEBSJIE0QMDRXBJIHGJB1
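
Hard-coding and printing API keys in a notebook exposes them to anyone who reads it. A minimal sketch of one alternative is to read them from environment variables. The function name and the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are assumptions made for this example, not something Foursquare prescribes.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical; use whatever names you export.
    """
    try:
        return env['FOURSQUARE_CLIENT_ID'], env['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        raise RuntimeError('Set the environment variable %s first' % missing) from None


# Usage, once both variables are exported in the shell that starts the notebook:
# CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()
```

Passing the mapping as a parameter keeps the function easy to try out with an ordinary dictionary.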


### Listing the schools/ colleges we will study

In [4]:
insts = ['Lilavatibai podar santacruz', 'Narsee Monjee College', 'University of Mumbai', 'Jai Hind College',
         'Mithibai College', 'Ramnarain Ruia College', 'Sophia College', "St. Andrew's College",
         'St. Xaviers College', 'Wilson College', 'IIT Bombay', 'Arya Vidya Mandir', 'BD Somani', 'Cambridge School',
         'Don Bosco High School', 'Hiranandani Foundation School Powai', 'Oberoi International', 'Vibgyor High School']
print(len(insts))

18


In [5]:
insts = [x+" Mumbai" for x in insts]
insts

['Lilavatibai podar santacruz Mumbai',
 'Narsee Monjee College Mumbai',
 'University of Mumbai Mumbai',
 'Jai Hind College Mumbai',
 'Mithibai College Mumbai',
 'Ramnarain Ruia College Mumbai',
 'Sophia College Mumbai',
 "St. Andrew's College Mumbai",
 'St. Xaviers College Mumbai',
 'Wilson College Mumbai',
 'IIT Bombay Mumbai',
 'Arya Vidya Mandir Mumbai',
 'BD Somani Mumbai',
 'Cambridge School Mumbai',
 'Don Bosco High School Mumbai',
 'Hiranandani Foundation School Powai Mumbai',
 'Oberoi International Mumbai',
 'Vibgyor High School Mumbai']

### Function to get latitude and longitude of each institute

In [6]:
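# Create the geocoder once and reuse it for every lookup
geolocator = Nominatim(user_agent='myapplication')

def coords(institute):
    """Return a dict with the institute's name, latitude and longitude, or -1 if it cannot be geocoded."""
    try:
        location = geolocator.geocode(institute)
    except Exception as e:
        # network errors, timeouts, etc.
        print("Institute %s not found (%s)" % (institute, e))
        return -1
    if location is None:  # geocode() returns None when nothing matches
        print("Institute %s not found" % institute)
        return -1
    # Nominatim returns the coordinates as strings; they are converted to float when plotting
    return {'institute': institute,
            'latitude': location.raw['lat'],
            'longitude': location.raw['lon']}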

In [11]:
institutes = []
for i in insts:
    details = coords(i)
    if details != -1:
        # reuse the result instead of geocoding the same name a second time
        institutes.append(details)
print(institutes)
print(len(institutes))

Institute University of Mumbai Mumbai not found
[{'institute': 'Lilavatibai podar santacruz Mumbai', 'latitude': '19.0810735', 'longitude': '72.8371727'}, {'institute': 'Narsee Monjee College Mumbai', 'latitude': '19.1037065', 'longitude': '72.837347688538'}, {'institute': 'Jai Hind College Mumbai', 'latitude': '18.93455995', 'longitude': '72.8251531862371'}, {'institute': 'Mithibai College Mumbai', 'latitude': '19.1028853', 'longitude': '72.8374936781393'}, {'institute': 'Ramnarain Ruia College Mumbai', 'latitude': '19.02381515', 'longitude': '72.8500989494695'}, {'institute': 'Sophia College Mumbai', 'latitude': '18.970042', 'longitude': '72.8070136'}, {'institute': "St. Andrew's College Mumbai", 'latitude': '19.0566226', 'longitude': '72.8287305'}, {'institute': 'St. Xaviers College Mumbai', 'latitude': '18.943156', 'longitude': '72.831870310951'}, {'institute': 'Wilson College Mumbai', 'latitude': '18.9567432', 'longitude': '72.810628561733'}, {'institute': 'IIT Bombay Mumbai', 'la

We were able to find the locations of 17 of the 18 institutes; only University of Mumbai could not be geocoded.

Latitude and Longitude of Mumbai, India

In [8]:
mum_lat = 19.0760
mum_lon = 72.8777

### Plotting all the institutes that we are considering

In [15]:
inst_map = folium.Map(location=[mum_lat, mum_lon], zoom_start=11)

# `institutes` is the list of geocoded dicts built above
for d in institutes:
    folium.CircleMarker(
        [float(d['latitude']), float(d['longitude'])],
        radius=5,
        # strip punctuation (e.g. the apostrophe in "St. Andrew's") from the popup text
        popup=d['institute'].translate(str.maketrans('', '', string.punctuation)),
        fill=True,
        color='#0012EE',
        fill_color='red',
        fill_opacity=0.5
    ).add_to(inst_map)

inst_map

### Using the FourSquare Venues API to find all food related categories

In [16]:
url = 'https://api.foursquare.com/v2/venues/categories?&client_id={}&client_secret={}&v={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    VERSION,)

results = requests.get(url).json()
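
The request URL above is assembled with `str.format`. As a hedged alternative, the query string can be built from keyword arguments with the standard library's `urlencode`, which also takes care of escaping. The helper name `build_url` and the placeholder credentials are illustrative only.

```python
from urllib.parse import urlencode

# Same host and version prefix as the URLs used in this notebook
API_ROOT = 'https://api.foursquare.com/v2/'


def build_url(endpoint, client_id, client_secret, version, **extra):
    """Return the request URL for `endpoint` with all parameters URL-encoded."""
    params = {'client_id': client_id, 'client_secret': client_secret, 'v': version}
    params.update(extra)  # e.g. ll, radius, limit for venues/explore
    return API_ROOT + endpoint + '?' + urlencode(params)


# Placeholder credentials; the coordinates are the Mumbai centre used for the maps
url = build_url('venues/explore', 'YOUR_ID', 'YOUR_SECRET', '20180604',
                ll='19.0760,72.8777', radius=750, limit=100)
print(url)
```

The same helper covers the categories request by passing `'venues/categories'` with no extra parameters.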

In [19]:
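food_categs = []
# Entry 3 of the top-level category list is the food category in this response;
# collect the names of its direct subcategories
for i in results['response']['categories'][3]['categories']:
    food_categs.append(i['name'])
print(len(food_categs))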

91


We have 91 food related categories

In [18]:
food_categs[:5]

['Afghan Restaurant',
 'African Restaurant',
 'American Restaurant',
 'Asian Restaurant',
 'Australian Restaurant']
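
The loop above collects only the direct children of the food category. If the response nests more specific categories under some of them (an assumption here, since the raw response is not shown), a venue tagged with a deeper category would be dropped by the later filter. A minimal sketch of collecting names at every depth follows; `collect_category_names` and the sample dictionary are hypothetical.

```python
def collect_category_names(category):
    """Return the names of all categories nested below `category`, at any depth."""
    names = []
    for child in category.get('categories', []):
        names.append(child['name'])
        names.extend(collect_category_names(child))  # descend into sub-subcategories
    return names


# Hypothetical miniature of the response structure, for illustration only
sample_food = {
    'name': 'Food',
    'categories': [
        {'name': 'Asian Restaurant',
         'categories': [{'name': 'Chinese Restaurant', 'categories': []}]},
        {'name': 'Bakery', 'categories': []},
    ],
}
print(collect_category_names(sample_food))
```

With the real response, the argument would be `results['response']['categories'][3]`.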

### Function to generate a random color hex code:

In [20]:
import random
def randomcol():
    # three random channel values (0-255) formatted as an uppercase hex colour code
    r, g, b = (random.randint(0, 255) for _ in range(3))
    return '#%02X%02X%02X' % (r, g, b)

### Function to extract the category of a venue from a dataframe

In [23]:
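def get_category_type(row):
    # the column is called 'categories' or 'venue.categories',
    # depending on how the JSON was flattened
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    # the first entry in the list is used as the venue's category
    return categories_list[0]['name']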

## Main plotting function for all institutes and all the food venues within 750 metres

In [24]:
main_map = folium.Map(location=[mum_lat, mum_lon], zoom_start=11)
radius = 750

def fullplot():
    for i in institutes:
        name = i['institute']
        print(name, end=' ')
        lat = i['latitude']
        lon = i['longitude']
        
        # Using the foursquare venues API to find nearby venues for an institute
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lon,
            radius,
            100)
        results = requests.get(url).json()
        
        venues = results['response']['groups'][0]['items']

        nearby_venues = json_normalize(venues)  # flatten JSON

        # filter columns
        filtered_columns = ['venue.name', 'venue.categories',
                            'venue.location.lat', 'venue.location.lng']
        nearby_venues = nearby_venues.loc[:, filtered_columns]

        # filter the category for each row
        nearby_venues['venue.categories'] = nearby_venues.apply(
            get_category_type, axis=1)

        # clean columns
        nearby_venues.columns = [col.split(".")[-1]
                                 for col in nearby_venues.columns]
        nearby_venues = nearby_venues[nearby_venues['categories'].isin(
            food_categs)]
        
        print(", Venues: ",nearby_venues.shape[0])

        venues_i = []
        for index, row in nearby_venues.iterrows():
            d = {}
            d['name'] = row['name'].translate(str.maketrans('', '', string.punctuation))
            d['lat'] = row['lat']
            d['lng'] = row['lng']
            venues_i.append(d)
        
        # Generating a random color
        color = randomcol()
        
        # Plotting venues
        for d in venues_i:
            folium.CircleMarker(
                [float(d['lat']), float(d['lng'])],
                radius=2,
                popup=d['name'].translate(str.maketrans('', '', string.punctuation)),
                fill=True,
                color=color,
                fill_color='blue',
                fill_opacity=0.5
            ).add_to(main_map)

        # Plotting institute
        folium.CircleMarker(
            [float(lat), float(lon)],
            radius=5,
            popup=name.translate(str.maketrans('', '', string.punctuation)),
            fill=True,
            color=color,
            fill_color='red'
        ).add_to(main_map)
        
# Calling the function that adds markers to the map
fullplot()

# Printing our map
main_map

Lilavatibai podar santacruz Mumbai , Venues:  15
Narsee Monjee College Mumbai , Venues:  30
Jai Hind College Mumbai , Venues:  33
Mithibai College Mumbai , Venues:  31
Ramnarain Ruia College Mumbai , Venues:  34
Sophia College Mumbai , Venues:  28
St. Andrew's College Mumbai , Venues:  61
St. Xaviers College Mumbai , Venues:  20
Wilson College Mumbai , Venues:  29
IIT Bombay Mumbai , Venues:  6
Arya Vidya Mandir Mumbai , Venues:  59
BD Somani Mumbai , Venues:  13
Cambridge School Mumbai , Venues:  17
Don Bosco High School Mumbai , Venues:  12
Hiranandani Foundation School Powai Mumbai , Venues:  43
Oberoi International Mumbai , Venues:  6
Vibgyor High School Mumbai , Venues:  6


**The above map is completely interactive**
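
Because the venue counts depend on the 750 m search radius, it can be useful to check how far apart two points actually are. Below is a minimal sketch using the haversine formula, assuming a spherical Earth with a mean radius of 6371 km. The function name `distance_m` is illustrative, and the coordinates are the geocoded values for Narsee Monjee College and Mithibai College printed earlier.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres (spherical approximation)


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    # float() lets the function accept the string coordinates returned by the geocoder
    lat1, lon1, lat2, lon2 = (radians(float(v)) for v in (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


d = distance_m('19.1037065', '72.837347688538', '19.1028853', '72.8374936781393')
print(round(d), 'metres;', 'within radius' if d <= 750 else 'outside radius')
```

These two colleges are geocoded less than 100 metres apart, so their search circles overlap almost entirely and their venue counts are likely to describe largely the same eateries.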

### Inferences from our analysis

**We can see that *IIT Bombay* has very few eateries around it.**

**With only 6 eateries within a 750 metre radius (the lowest count in the list, shared with Oberoi International and Vibgyor High School), its students have very few options for eating out.**

Let us look into some details about this institute.

Around **1100** students enrol each year, mostly for a 4-year degree (Bachelor of Technology), so we can assume that roughly 1100 × 4 = 4400 students are on campus at any given time.

This seems to be a relatively small number, but we must take into account the wildly popular festivals held at this institute.

**IIT Bombay Techfest** sees a footfall of more than 150,000 people from all over India. Most of these visitors are from colleges and lie in the student demographic.

**Mood Indigo**, a cultural festival held here in December, also sees a footfall of more than 140,000 college students.

These large student footfalls can easily be converted into revenue if a good food outlet is opened in the vicinity.

## Hence we find that the best place for a food shop aimed at students aged 13 to 20 is near IIT Bombay in Powai, Mumbai