# A recommendation system for food stalls aimed at students (based on student count and location of schools/colleges in Mumbai, India)

## Brief Introduction

## Part 1: Description of problem

### Mumbai, India is one of the most densely populated cities in the world, with more than 18 million residents.

### High real estate costs make it tough to start a business here, so an entrepreneur targeting a student-centric market (the 13-20 year old demographic) should know the best places to set up shop.

### A large share of Mumbai's population falls in this student demographic (the city has more than 50 schools), and eating snack foods out is more popular and convenient than ever. Hence, we will find the best places in Mumbai to set up a food shop/restaurant.

### Target audience:
### Entrepreneurs and small-scale businessmen/women interested in the food/snacks industry who are targeting the student demographic


## Part 2: Data that is needed

### 1. **We need a list of the most populated schools in Mumbai.** Their latitude and longitude will be looked up using geopy's Nominatim geocoder.

This data can be found on Wikipedia, as well as the school websites.

For instance: https://en.wikipedia.org/wiki/List_of_educational_institutions_in_Mumbai

### 2. **Then we can use the Foursquare API to find the number of eateries in a 1 km radius around each school.** The API will provide us with Postal Code, Neighborhood, Venue, Venue Summary and Venue Category.

Foursquare is a local search-and-discovery mobile app that provides search results for its users (Wikipedia). It has more than 60 million users.
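As a sanity check on the 1 km radius, the straight-line distance between a school and a venue can be computed locally with the haversine formula. The sketch below is illustrative only: the function name `haversine_m` and the spherical-Earth radius of 6,371 km are assumptions, and the coordinates are those of the school and the first venue shown in the data examples.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (assumes a spherical Earth)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


school = (19.0810735, 72.8371727)               # geocoded school location
venue = (19.081607436400976, 72.83551217986849)  # first venue in the API response

distance = haversine_m(*school, *venue)
within_radius = distance <= 1000  # True when the venue lies inside the 1 km radius
print(round(distance), within_radius)
```

For this pair the result comes out at about 184 m, in line with the `distance` field that Foursquare reports for the same venue.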

### 3. Process the retrieved data and create a structured DataFrame of all the venues, grouped by school.

### 4. Select the relevant venues (food related only).
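Step 3 could be sketched roughly as follows. This is a minimal illustration, not the final pipeline: `flatten_venues` is a hypothetical helper, the school label is shortened for readability, and the sample input is hand-written in the shape of the Foursquare `explore` items.

```python
def flatten_venues(items_by_school):
    """Turn {school: [explore items]} into one flat list of row dicts."""
    rows = []
    for school, items in items_by_school.items():
        for item in items:
            venue = item["venue"]
            categories = venue.get("categories", [])
            rows.append({
                "school": school,
                "name": venue.get("name"),
                # keep only the first category name, or None if there is none
                "category": categories[0]["name"] if categories else None,
                "lat": venue["location"]["lat"],
                "lng": venue["location"]["lng"],
            })
    return rows


# Hand-written sample (hypothetical grouping) mimicking the API response shape
sample = {
    "Lilavatibai Podar": [
        {"venue": {"name": "Gokul Icecreams",
                   "categories": [{"name": "Ice Cream Shop"}],
                   "location": {"lat": 19.081607, "lng": 72.835512}}},
        {"venue": {"name": "Seasons",
                   "categories": [{"name": "Women's Store"}],
                   "location": {"lat": 19.080887, "lng": 72.83831}}},
    ],
}

rows = flatten_venues(sample)
print(rows[0])
```

Passing `rows` to `pd.DataFrame(rows)` would give one row per venue with a `school` column that can be used with `groupby`.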

### **The schools with the highest ratio of `(no. of students)/(no. of eateries)` would be the best places to start a food stall/restaurant.** (supply and demand)
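The ranking by this ratio can be illustrated with a few lines of Python. The school names and figures below are invented purely for illustration, and treating a school with zero nearby eateries as having an infinite ratio is an assumption about how that edge case should be handled.

```python
def rank_by_demand(schools):
    """Sort schools by students per eatery, highest ratio first."""
    def ratio(school):
        # Assumption: no eateries nearby means fully unmet demand (infinite ratio)
        if school["eateries"] == 0:
            return float("inf")
        return school["students"] / school["eateries"]

    return sorted(schools, key=ratio, reverse=True)


# Hypothetical figures, not real data
schools = [
    {"name": "School A", "students": 3000, "eateries": 30},  # ratio 100
    {"name": "School B", "students": 1200, "eateries": 4},   # ratio 300
    {"name": "School C", "students": 800, "eateries": 0},    # no eateries at all
]

ranking = [s["name"] for s in rank_by_demand(schools)]
print(ranking)
```

Note that the largest school does not necessarily rank first: School A has the most students but also the most competition.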

We can also cluster the areas with the highest student populations.
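One possible way to form such clusters is a k-means variant in which each school is weighted by its student count. The sketch below is a small pure-Python illustration under stated assumptions: the coordinates and student counts are hypothetical, the choice of k-means is an assumption rather than a decided method, and plain Euclidean distance on latitude/longitude is used as a rough approximation that is acceptable only over a city-sized area.

```python
def weighted_kmeans(points, weights, k, iterations=20):
    """Cluster (lat, lon) points, weighting each point by its student count."""
    # Deterministic start: the k most populous schools are the initial centroids
    heaviest = sorted(range(len(points)), key=lambda i: weights[i], reverse=True)[:k]
    centroids = [points[i] for i in heaviest]
    labels = [0] * len(points)

    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        labels = [
            min(range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: each centroid moves to the weighted mean of its members
        for c in range(k):
            members = [i for i, label in enumerate(labels) if label == c]
            total = sum(weights[i] for i in members)
            if total:
                centroids[c] = (
                    sum(points[i][0] * weights[i] for i in members) / total,
                    sum(points[i][1] * weights[i] for i in members) / total,
                )
    return labels, centroids


# Hypothetical school locations (lat, lon) and student counts
points = [(19.08, 72.84), (19.09, 72.83), (18.93, 72.83), (18.94, 72.82)]
weights = [3000, 1000, 2500, 800]

labels, centroids = weighted_kmeans(points, weights, k=2)
print(labels, centroids)
```

Each centroid is pulled toward the larger schools in its cluster, so it marks a student-weighted centre rather than a purely geographic one.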

### Thank you for your time, I would greatly appreciate any feedback (sidjain1412@gmail.com)

**Data examples:**

In [11]:
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent='myapplication')
location = geolocator.geocode("Lilavatibai podar santacruz Mumbai India").raw
lat = location['lat']
lon = location['lon']
print("Latitude: ", lat)
print("Longitude: ", lon)

Latitude:  19.0810735
Longitude:  72.8371727


In [19]:
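# CLIENT_ID, CLIENT_SECRET and VERSION (the API version date, YYYYMMDD) must be defined beforehand
radius = 1000  # search radius in metres
limit = 100    # maximum number of venues to return
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    lat, 
    lon, 
    radius, 
    limit)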
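Since the same request has to be issued once per school, the URL construction can be wrapped in a small function. The following is a sketch using only the standard library; `build_explore_url` is a hypothetical helper, and the credentials and version date passed in the example are placeholders, not real values.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lon, radius=1000, limit=100):
    """Build the explore URL for one location, with properly encoded parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lon}",
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma between latitude and longitude unescaped
    return f"{EXPLORE_ENDPOINT}?{urlencode(params, safe=',')}"


# Placeholder credentials and version date, for illustration only
example_url = build_explore_url("YOUR_ID", "YOUR_SECRET", "20190301",
                                "19.0810735", "72.8371727")
print(example_url)
```

Letting `urlencode` assemble the query string avoids hand-placed `&` separators and escapes any special characters in the parameter values.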

In [15]:
import requests

results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5c7c2d8f6a60714c80d24d4e'},
 'response': {'groups': [{'items': [{'reasons': {'count': 0,
       'items': [{'reasonName': 'globalInteractionReason',
         'summary': 'This spot is popular',
         'type': 'general'}]},
      'referralId': 'e-0-4ce017cbdb125481d7a13ace-0',
      'venue': {'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/icecream_',
          'suffix': '.png'},
         'id': '4bf58dd8d48988d1c9941735',
         'name': 'Ice Cream Shop',
         'pluralName': 'Ice Cream Shops',
         'primary': True,
         'shortName': 'Ice Cream'}],
       'id': '4ce017cbdb125481d7a13ace',
       'location': {'cc': 'IN',
        'country': 'India',
        'distance': 184,
        'formattedAddress': ['Mahārāshtra', 'India'],
        'labeledLatLngs': [{'label': 'display',
          'lat': 19.081607436400976,
          'lng': 72.83551217986849}],
        'lat': 19.081607436400976,
        'lng': 72.8355121798
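Before the venues are processed, it may be worth checking that the request actually succeeded. The sketch below is a hypothetical helper (`extract_items` is not part of any library) that relies only on the `meta.code` and `response.groups[0].items` fields visible in the output above; the sample dictionaries are hand-written stand-ins for real responses.

```python
def extract_items(results):
    """Return the venue items of an explore response, or raise if the call failed."""
    code = results.get("meta", {}).get("code")
    if code != 200:
        raise ValueError(f"Foursquare request failed with meta.code={code}")
    groups = results.get("response", {}).get("groups", [])
    # An empty groups list simply means no venues were found
    return groups[0]["items"] if groups else []


# Hand-written stand-ins for a successful response and an empty one
ok_response = {
    "meta": {"code": 200},
    "response": {"groups": [{"items": [{"venue": {"name": "Gokul Icecreams"}}]}]},
}
empty_response = {"meta": {"code": 200}, "response": {"groups": []}}

items = extract_items(ok_response)
no_items = extract_items(empty_response)
print(len(items), len(no_items))
```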

In [16]:
# function that extracts the primary category name of a venue
def get_category_type(row):
    # rows use 'categories' or, after json_normalize, the flattened 'venue.categories' key
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']

In [17]:
import pandas as pd

venues = results['response']['groups'][0]['items']

# flatten the nested JSON into a flat table
# (pd.json_normalize replaces pandas.io.json.json_normalize, deprecated since pandas 1.0)
nearby_venues = pd.json_normalize(venues)

# keep only the columns we need
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns].copy()

# replace each row's category list with the name of its first category
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean column names: keep only the part after the last dot
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
print(nearby_venues.shape)
nearby_venues.head(15)

(53, 4)

| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Gokul Icecreams | Ice Cream Shop | 19.081607 | 72.835512 |
| 1 | Sandwizzaa | Sandwich Place | 19.0807 | 72.840414 |
| 2 | Seasons | Women's Store | 19.080887 | 72.83831 |
| 3 | Society Stores | Convenience Store | 19.084338 | 72.836298 |
| 4 | Ram & Shyam | Food Truck | 19.07822 | 72.836411 |
| 5 | Being Human Store | Clothing Store | 19.079782 | 72.834403 |
| 6 | Three Wise Men | Pub | 19.084432 | 72.835128 |
| 7 | Atrang Fashions | Jewelry Store | 19.082495 | 72.838007 |
| 8 | Joss | Japanese Restaurant | 19.085574 | 72.834691 |
| 9 | Nice Fast Food Corner | Fast Food Restaurant | 19.077202 | 72.837742 |


In [18]:
nearby_venues.categories.unique()

array(['Ice Cream Shop', 'Sandwich Place', "Women's Store",
       'Convenience Store', 'Food Truck', 'Clothing Store', 'Pub',
       'Jewelry Store', 'Japanese Restaurant', 'Fast Food Restaurant',
       'Gym', 'Coffee Shop', "Men's Store", 'Lounge',
       'Gym / Fitness Center', 'French Restaurant',
       'Furniture / Home Store', 'Boutique', 'Steakhouse', 'Snack Place',
       'Thai Restaurant', 'Indian Restaurant', 'Juice Bar', 'Music Venue',
       'Chinese Restaurant', 'Train Station', 'Pizza Place', 'Café',
       'Bakery', 'Park', 'Market', 'Moving Target', 'Shopping Mall',
       'Hotel', 'Yoga Studio', 'Metro Station', 'Multiplex',
       'Bus Station'], dtype=object)
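Only some of these categories are food related. A simple keyword match is one possible way to select them; the sketch below is an assumption about how the filter could work, not a settled rule. The keyword list is hand-picked, and leaving out `Pub`, `Lounge` and `Market` is a judgment call based on the 13-20 year old target group.

```python
# Hand-picked keywords (an assumption); "caf" matches both "Café" and "Cafe"
FOOD_KEYWORDS = ("restaurant", "food", "ice cream", "sandwich", "snack",
                 "juice", "pizza", "caf", "coffee", "bakery", "steakhouse")


def is_food_category(name):
    """Return True if a venue category name looks food related."""
    if not name:
        return False
    lowered = name.lower()
    return any(keyword in lowered for keyword in FOOD_KEYWORDS)


# The unique categories returned for the example school
categories = [
    'Ice Cream Shop', 'Sandwich Place', "Women's Store", 'Convenience Store',
    'Food Truck', 'Clothing Store', 'Pub', 'Jewelry Store', 'Japanese Restaurant',
    'Fast Food Restaurant', 'Gym', 'Coffee Shop', "Men's Store", 'Lounge',
    'Gym / Fitness Center', 'French Restaurant', 'Furniture / Home Store',
    'Boutique', 'Steakhouse', 'Snack Place', 'Thai Restaurant',
    'Indian Restaurant', 'Juice Bar', 'Music Venue', 'Chinese Restaurant',
    'Train Station', 'Pizza Place', 'Café', 'Bakery', 'Park', 'Market',
    'Moving Target', 'Shopping Mall', 'Hotel', 'Yoga Studio', 'Metro Station',
    'Multiplex', 'Bus Station',
]

food_categories = [c for c in categories if is_food_category(c)]
print(len(food_categories), food_categories)
```

Applied to the venue table, the same predicate could be used as `nearby_venues[nearby_venues['categories'].apply(is_food_category)]` to keep only the eateries before counting them per school.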