# __Capstone Project - Explore the optimal location for a restaurant in Mangalore__

Table of contents
1. Introduction: Business Problem
2. Data
3. Methodology
4. Analysis
5. Results and Discussion
6. Conclusion

## 1. Introduction: Business Problem

### 1.1 Background
Mangalore, officially known as Mangaluru, is the chief port city of the Indian state of Karnataka. Mangalore is the state's only city to have all four modes of transport—air, road, rail and sea. The population of the urban agglomeration is nearly a million.

Mangalore is also the administrative headquarters of the Dakshina Kannada district, and its international airport is the second-largest in Karnataka. The city is a __commercial, industrial, educational, healthcare and startup hub.__ Its population and the number of visitors are both growing, which creates ample scope for opening new restaurants.


### 1.2 Business Problem
In this project we try to find a good location for a restaurant in Mangalore city. The city already has many restaurants, so we look for a crowded location that has either no restaurant nearby or just one.

Identifying the __optimal location for opening a new restaurant__ is the business problem of this project.

Restaurants around __colleges, healthcare facilities, factories, government buildings and schools__ normally do good business because such locations are usually crowded. The project should propose an optimal location __within 200 meters (walkable distance)__ of a main spot such as a college or hospital.

## 2. Data

### 2.1 Data requirements
Based on our business problem, we need to:
1. Identify the top 5 locations in each category (Colleges, Hospitals, Government buildings, Factories and Schools) in Mangalore.
2. For each such location, identify the existing restaurants within a 200-meter radius.
3. Analyze the obtained data to identify the optimal location.


### 2.2 Data sources
The following data sources are used to get the required information:
1. The coordinates of the Mangalore city center are obtained from Google Maps.
2. The Foursquare API is used to get the top 5 locations for Colleges, Hospitals, Government buildings, Factories and Schools, and also the nearby restaurants for each such location.
3. Geopy Nominatim is used to get the latitude and longitude for an address.


In [0]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd
import requests
import math

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

!pip install geopy
from geopy.geocoders import Nominatim 

# transform JSON into a pandas dataframe
# (top-level import; the pandas.io.json location is deprecated since pandas 1.0)
from pandas import json_normalize

# import k-means from clustering stage
from sklearn.cluster import KMeans

!pip install folium
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


Let's get the latitude and longitude of Mangalore city using the geopy library.

In [0]:
# get the address of Mangalore city
address = 'Mangalore, Karnataka, India'

# get geolocator
geolocator = Nominatim(user_agent="ny_explorer")

# get location from address
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

print('The geographical coordinates of Mangalore City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Mangalore City are 12.8698101, 74.8430082.
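
The cell above assumes that the geocoder always finds the address. geopy geocoders return `None` when there is no match, so a small guard can make that failure explicit. The sketch below is only an illustration: `lookup_coordinates` is a hypothetical helper, not part of the original notebook, and it accepts any callable with the same shape as `geolocator.geocode`.

```python
from typing import Callable, Optional, Tuple


def lookup_coordinates(geocode: Callable[[str], Optional[object]], address: str) -> Tuple[float, float]:
    """Return (latitude, longitude) for an address, or raise if nothing is found."""
    location = geocode(address)
    if location is None:
        # geopy geocoders return None when the service finds no match
        raise ValueError("No geocoding result for address: {!r}".format(address))
    return location.latitude, location.longitude
```

In the notebook this could be called as `latitude, longitude = lookup_coordinates(geolocator.geocode, address)`.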


The Foursquare API is our main data source: we use it to get the top locations for each category and then the nearby restaurants for each location. We use the venues/search endpoint for category-specific searches based on latitude and longitude. The category ids for the 5 location categories and for the Food category come from the Foursquare API documentation.

In [0]:
#@title Foursquare details
CLIENT_ID = 'TWECZIHLL3ZZYQ1R5FQWRP0FAFOVJJ2AJHLJJRITXA4EI5H5' # your Foursquare ID
CLIENT_SECRET = 'ZBMPN2T42QB45M0QKMVCHSZPG2IUTYGJ5HLUH0VMBMCRFVYS' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

Now, define a function that builds the request URL for the top 100 venues of a given category id within a 5000-meter radius of Mangalore.

Later we use the same Foursquare API to get information on the restaurants near each category location. There we are interested in venues in the 'food' category.

In [0]:
radius = 5000
LIMIT = 100

def get_Url(categoryId):
    return 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}&categoryId={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, radius, LIMIT, categoryId)
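
The URL above is assembled with `str.format` and depends on the module-level `radius` and `LIMIT` values. As a hedged alternative, the sketch below builds the same query string with `urllib.parse.urlencode`, which percent-encodes every value, and takes the radius and limit as explicit arguments. `build_search_url` and `SEARCH_ENDPOINT` are illustrative names that do not appear in the original notebook.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"


def build_search_url(client_id, client_secret, lat, lng, version, radius, limit, category_id):
    """Build a venues/search URL with every query parameter percent-encoded."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": "{},{}".format(lat, lng),  # latitude,longitude pair
        "v": version,
        "radius": radius,                # search radius in meters
        "limit": limit,                  # maximum number of venues to return
        "categoryId": category_id,
    }
    return "{}?{}".format(SEARCH_ENDPOINT, urlencode(params))
```

Passing `radius` and `limit` explicitly means the city-wide search (5000 m, 100 venues) and the per-spot search (200 m, 5 venues) cannot affect each other through shared globals.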

Define utility functions to extract the venues for a category

In [0]:
def get_nearby_venues(results):
    # clean the data
    venues = results['response']['venues']

    nearby_venues = json_normalize(venues) # flatten JSON

    if nearby_venues.shape[0] > 0:
        # filter columns
        filtered_columns = ['name', 'categories', 'location.lat', 'location.lng']
        nearby_venues = nearby_venues.loc[:, filtered_columns]

        # filter the category for each row
        nearby_venues['categories'] = nearby_venues.apply(get_category_type, axis=1)

        # clean columns
        nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

    return nearby_venues

In [0]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        # fall back to the column name used by other Foursquare endpoints
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
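
To illustrate what the two helpers above produce, here is a pure-Python equivalent applied to a hand-written sample. The sample only mimics the fields the code reads (`name`, `categories`, `location.lat`, `location.lng`); it is not real API output, and `first_category_name` and `flatten_venues` are hypothetical names used for this sketch.

```python
def first_category_name(categories):
    """Return the name of the first category, or None when the list is empty."""
    return categories[0]["name"] if categories else None


def flatten_venues(venues):
    """Reduce raw venue dicts to the four fields used in this notebook."""
    rows = []
    for venue in venues:
        rows.append({
            "name": venue["name"],
            "categories": first_category_name(venue.get("categories", [])),
            "lat": venue["location"]["lat"],
            "lng": venue["location"]["lng"],
        })
    return rows


# Hand-written sample in the shape the code above expects (not real API output)
sample_venues = [
    {"name": "Canara College",
     "categories": [{"name": "College Academic Building"}],
     "location": {"lat": 12.879053, "lng": 74.842549}},
    {"name": "Uncategorised venue",
     "categories": [],
     "location": {"lat": 12.87, "lng": 74.84}},
]

rows = flatten_venues(sample_venues)
```

The second sample venue has an empty category list, which is the case where `get_category_type` returns `None`.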

In [0]:
# get the nearby venues based on the category id in Mangalore within 5KM radius
def get_nearby_venues_for_category(categoryId):
    # first get the url
    url = get_Url(categoryId)

    # get the json response from url
    results = requests.get(url).json()

    # return the venues for the provided category
    return get_nearby_venues(results)
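
The function above indexes straight into `results['response']['venues']`, so a failed request (for example invalid credentials or an exhausted quota) would likely surface as a `KeyError`. The sketch below assumes the Foursquare v2 response envelope with a `meta` object carrying `code`, `errorType` and `errorDetail`. Treat those field names as an assumption to verify against the API documentation. `extract_venues` is a hypothetical helper.

```python
def extract_venues(payload):
    """Return the venue list from a venues/search payload, or raise on an error envelope."""
    meta = payload.get("meta", {})
    code = meta.get("code")
    if code != 200:
        # assumed envelope: {"meta": {"code": ..., "errorType": ..., "errorDetail": ...}}
        raise RuntimeError("Foursquare request failed (code={}, type={}): {}".format(
            code, meta.get("errorType"), meta.get("errorDetail")))
    # a successful search with no matches yields an empty list
    return payload.get("response", {}).get("venues", [])
```

With such a check in place, an empty result ("no venues here") can be told apart from a request that did not succeed at all.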


Define the category ids for the categories Food, Colleges, Factories, Hospitals, Government Buildings and Schools.

In [0]:
# Foursquare category codes
category_food = "4d4b7105d754a06374d81259"
category_colleges = "4d4b7105d754a06372d81259"
category_factories = "4eb1bea83b7b6f98df247e06"
category_government_builiding = "4bf58dd8d48988d126941735"
category_hospitals = "4bf58dd8d48988d196941735"
category_schools = "4bf58dd8d48988d13b941735"

Let's first get the top 100 colleges within a 5 km radius of Mangalore city, then clean the data to keep the top 5 colleges.

Finally, display the college list.

In [0]:
# get the colleges list
colleges = get_nearby_venues_for_category(category_colleges)

# clean the college data based on the category - College Academic Building
colleges = colleges[colleges['categories'] == 'College Academic Building'].reset_index(drop=True)

# reset the category name
colleges = colleges.assign(categories='College')

# get the top 5 colleges
colleges = colleges[0:5]

colleges


index,name,categories,lat,lng
0,St Aloysius PU College,College,12.873166,74.844708
1,Mahesh P.U. College,College,12.913561,74.835975
2,"SDM Law College, Mangalore",College,12.87873,74.841349
3,Canara College,College,12.879053,74.842549
4,University College Mangalore,College,12.865785,74.840139


Let's get the top 100 factories within a 5 km radius of Mangalore city, then clean the data to keep the top 5 factories.

Finally, display the factory list.

In [0]:
# get the factories list
factories = get_nearby_venues_for_category(category_factories)

# keep the order returned by Foursquare: the previous sort_values call was never
# assigned, and sorting by raw lat/lng would not rank venues by distance from the center

# get the top 5 factories
factories = factories[0:5]

factories


index,name,categories,lat,lng
0,shakthi creations,Factory,12.869485,74.838037
1,Rudra Industries,Factory,12.877755,74.846977
2,Sarasija Foods (NARAN'S),Factory,12.896301,74.86104
3,Sovereign Tile Works,Factory,12.880381,74.827174
4,Sarkar laminates,Factory,12.822981,74.859305
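
The factories cell mentions ordering venues by how near they are to the city center. Sorting on raw latitude and longitude does not do that, so one possible approach is to rank by great-circle distance instead. The sketch below uses the haversine formula on the five rows shown above. `haversine_m` and `factory_rows` are illustrative names, and the center point is the coordinate pair printed earlier in this notebook.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


center = (12.8698101, 74.8430082)  # city-center coordinates printed earlier
factory_rows = [
    ("shakthi creations", 12.869485, 74.838037),
    ("Rudra Industries", 12.877755, 74.846977),
    ("Sarasija Foods (NARAN'S)", 12.896301, 74.86104),
    ("Sovereign Tile Works", 12.880381, 74.827174),
    ("Sarkar laminates", 12.822981, 74.859305),
]

# nearest factory first
ranked = sorted(factory_rows, key=lambda r: haversine_m(center[0], center[1], r[1], r[2]))
for name, lat, lng in ranked:
    print(name, round(haversine_m(center[0], center[1], lat, lng)), "m")
```

With a pandas dataframe, the same distance could be stored in a new column and passed to `sort_values`, remembering to assign the sorted result back.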


Let's get the top 100 government building locations within a 5 km radius of Mangalore city, then clean the data to keep the top 5 locations.

Finally, display the government building location list.

In [0]:
# get the government buildings list
government_buildings = get_nearby_venues_for_category(category_government_builiding)

# based on category Government Building filter the data
government_buildings = government_buildings[government_buildings['categories'] == 'Government Building'].reset_index(drop=True)

# get the top 5 entries
government_buildings = government_buildings[0:5]

government_buildings


index,name,categories,lat,lng
0,Mangalore One,Government Building,12.871469,74.845673
1,Petroleum And Explosives Safety Organisation,Government Building,12.870521,74.84546
2,RTO,Government Building,12.861331,74.839396
3,Mangalore City Corporation,Government Building,12.884741,74.839219
4,D C Office,Government Building,12.861403,74.836295


Let's get the top 100 hospital locations within a 5 km radius of Mangalore city, then clean the data to keep the top 5 hospitals.

Finally, display the hospital list.

In [0]:
# get the hospitals list
hospitals = get_nearby_venues_for_category(category_hospitals)

hospitals = hospitals[hospitals['categories'] == 'Hospital'].reset_index(drop=True)

# keep the order returned by Foursquare (the previous sort_values call was never assigned)

# drop rows 3 and 5: they are duplicate entries for hospitals already in the list
hospitals.drop(hospitals.index[[3, 5]], inplace=True)

# get the top 5 hospitals
hospitals = hospitals[0:5]

hospitals


index,name,categories,lat,lng
0,Tara Hospital,Hospital,12.869495,74.840137
1,Wenlock Government Hospital,Hospital,12.867602,74.843103
2,Yenepoya Hospital,Hospital,12.870697,74.846558
4,KMC Ambedkar Hospital,Hospital,12.872169,74.848706
6,KMC Hospital Life's On,Hospital,12.872002,74.848127


Let's get the top 100 school locations within a 5 km radius of Mangalore city, then clean the data to keep the top 5 locations.

Finally, display the list of the top 5 schools.

In [0]:
# get the nearby schools
schools = get_nearby_venues_for_category(category_schools)

# filter the data based on category - High School
schools = schools[schools['categories'] == 'High School'].reset_index(drop=True)

# slice the top 5 entries
schools = schools[0:5]

schools


index,name,categories,lat,lng
0,Nalanda School,High School,12.873374,74.840744
1,Milagres School,High School,12.867199,74.843828
2,Canara Girls High School,High School,12.877045,74.84268
3,Government High School,High School,12.870202,74.834983
4,St. Aloysius High School,High School,12.873727,74.84505


Let's combine the top 5 locations of all the categories.

In [0]:
# merge the top 5 spots of all the categories
top_spots = pd.concat([government_buildings, hospitals, colleges, schools, factories]).reset_index(drop=True)
top_spots

index,name,categories,lat,lng
0,Mangalore One,Government Building,12.871469,74.845673
1,Petroleum And Explosives Safety Organisation,Government Building,12.870521,74.84546
2,RTO,Government Building,12.861331,74.839396
3,Mangalore City Corporation,Government Building,12.884741,74.839219
4,D C Office,Government Building,12.861403,74.836295
5,Tara Hospital,Hospital,12.869495,74.840137
6,Wenlock Government Hospital,Hospital,12.867602,74.843103
7,Yenepoya Hospital,Hospital,12.870697,74.846558
8,KMC Ambedkar Hospital,Hospital,12.872169,74.848706
9,KMC Hospital Life's On,Hospital,12.872002,74.848127


Let's plot these top spots on a map centered on Mangalore city. Each location is shown as an orange circle marker.

In [0]:
# create map of Mangalore using latitude and longitude values
map_mangalore = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, category, name in zip(top_spots['lat'], top_spots['lng'], top_spots['categories'], top_spots['name']):
    label = '{}, {}'.format(name, category)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=7,
        popup=label,
        color='orange',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_mangalore) 
    
    
map_mangalore

## 3. Methodology

In this project we focus on identifying the optimal location for a restaurant within a 5 km range of the Mangalore city center.
We first identified the categories of places that are usually crowded: __Colleges, Hospitals, Schools, Factories and Government Buildings__. These 5 categories tend to be more crowded than categories such as Art, Museum or Pool.

1. In the first step we collected the top 100 locations for each category and cleaned the data to keep only the top 5 per category. The details for each location include its name, latitude, longitude and category name. We plotted these locations on the map in orange.

2. The second step is to get the nearby restaurants __within 200 m__ of each of these locations. We consider 200 m a reasonable walking distance from any location. We then plot the restaurants on the map alongside the locations.

3. The third step is to analyse the consolidated data obtained in step 2. This analysis sheds light on the recommended locations.

4. The final step is to recommend the optimal location to the stakeholders.


Let's define a function to get the nearby restaurants for a given location.

In [0]:
# set the radius and limit for the restaurant search
# (this reassigns the globals that get_Url above also reads)
radius = 200 # 200 meters
LIMIT = 5 # at most 5 nearby venues per spot

# function to get nearby restaurant for a spot
def get_nearby_restaurants_for_a_spot(latitude, longitude):

    spot_url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}&categoryId={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, radius, LIMIT, category_food)

    results = requests.get(spot_url).json()

    # return the venues for the provided category
    return get_nearby_venues(results)

Let's find the nearby restaurants for each spot _within 200 m, which is a walkable distance from the spot_.

1. Note that the code also __prints a message for each location that has no, one or two nearby restaurants.__
2. If we don't find any nearby restaurants for a location, _we don't add that location to the dataframe_.

In [0]:
# for each spot in top_spots get the nearby restaurants within 200m
column_names = ["location", "location_category", "location_lat", "location_lng", "name", "categories", "lat", "lng"]

restaurants_combined = pd.DataFrame(columns = column_names)
df = top_spots
for lat, lng, name, category in zip(df['lat'], df['lng'], df['name'], df['categories']):
    # get the nearby restaurants for a spot
    restaurants = get_nearby_restaurants_for_a_spot(lat, lng)

    if restaurants.shape[0] > 0:
        # keep only venues whose category contains "Restaurant";
        # na=False treats venues without a category as non-matches
        restaurants = restaurants[restaurants['categories'].str.contains("Restaurant", na=False)].reset_index(drop=True)
        restaurants.insert(0, 'location', name)
        restaurants.insert(1, 'location_category', category)
        restaurants.insert(2, 'location_lat', lat)
        restaurants.insert(3, 'location_lng', lng)
        
        if restaurants.shape[0] == 1 :
            print("Only one Restaurant found near ", name)
        if restaurants.shape[0] == 2 :
            print("Only two Restaurant found near ", name)

        # concat the restaurants data to the combined dataframe
        restaurants_combined = pd.concat([restaurants_combined, restaurants])
    else:
        print("NO Restaurants found near ", name)

restaurants_combined.reset_index(drop=True)


Only one Restaurant found near  RTO
Only two Restaurant found near  D C Office
Only two Restaurant found near  KMC Ambedkar Hospital
Only two Restaurant found near  KMC Hospital Life's On
NO Restaurants found near  Mahesh P.U. College
Only one Restaurant found near  SDM Law College, Mangalore
Only one Restaurant found near  Canara College
Only two Restaurant found near  Nalanda School
Only two Restaurant found near  Milagres School
Only two Restaurant found near  Canara Girls High School
Only two Restaurant found near  Government High School
Only one Restaurant found near  St. Aloysius High School
Only one Restaurant found near  shakthi creations
Only one Restaurant found near  Sarasija Foods (NARAN'S)
Only one Restaurant found near  Sovereign Tile Works
NO Restaurants found near  Sarkar laminates


index,location,location_category,location_lat,location_lng,name,categories,lat,lng
0,RTO,Government Building,12.861331,74.839396,New Danish Arabian Treat,Middle Eastern Restaurant,12.860274,74.838933
1,Mangalore City Corporation,Government Building,12.884741,74.839219,Chicken Tikka Halal,Middle Eastern Restaurant,12.885498,74.83954
2,Mangalore City Corporation,Government Building,12.884741,74.839219,Mathura Vegetarian,Vegetarian / Vegan Restaurant,12.885307,74.84013
3,Mangalore City Corporation,Government Building,12.884741,74.839219,Hotel annaapoorna,Indian Restaurant,12.885206,74.838214
4,D C Office,Government Building,12.861403,74.836295,Cardamom,Indian Restaurant,12.861908,74.835169
5,D C Office,Government Building,12.861403,74.836295,Hotel Swagath,Udupi Restaurant,12.861357,74.83693
6,Tara Hospital,Hospital,12.869495,74.840137,Punjab Da Pind,Punjabi Restaurant,12.870084,74.841372
7,Tara Hospital,Hospital,12.869495,74.840137,Hotel Maya Darshini,South Indian Restaurant,12.868126,74.84042
8,Tara Hospital,Hospital,12.869495,74.840137,Fish bowl,Seafood Restaurant,12.870882,74.841335
9,Wenlock Government Hospital,Hospital,12.867602,74.843103,Taj Mahal Veg Restaurant,Vegetarian / Vegan Restaurant,12.868113,74.842896


The printed log above the dataframe clearly shows the line
"NO Restaurants found near __Mahesh P.U. College__". The same message is printed for __Sarkar laminates__.

## 4. Analysis

Let's perform some basic exploratory data analysis and derive additional information from our raw data. We plot all the nearby restaurants on the map that was already rendered to display the top spots.

1. Top locations are displayed as orange circles
2. Nearby restaurants are displayed in blue

In [0]:
df = restaurants_combined

# add markers to the map to display the nearby restaurants for each top spot in blue
for lat, lng, category, name in zip(df['lat'], df['lng'], df['categories'], df['name']):
    label = '{}, {}'.format(name, category)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=7,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_mangalore) 
    
    
map_mangalore

Let's group the restaurants by location name. Note that __Mahesh P.U. College__ is not present in the dataframe because we didn't find any nearby restaurants within the 200 m radius. According to the printed log, the same applies to __Sarkar laminates__.

In [0]:
restaurants_combined = restaurants_combined.groupby('location').size().reset_index(drop=False)
restaurants_combined.rename(columns={0: "count"}, inplace=True)
restaurants_combined.sort_values('count', inplace=True)
type(restaurants_combined)

pandas.core.frame.DataFrame

Let's reset the index and display the final result.

In [0]:
restaurants_combined.reset_index(drop=True)

index,location,count
0,Canara College,1
1,St. Aloysius High School,1
2,Sovereign Tile Works,1
3,Sarasija Foods (NARAN'S),1
4,"SDM Law College, Mangalore",1
5,RTO,1
6,shakthi creations,1
7,Milagres School,2
8,Nalanda School,2
9,KMC Ambedkar Hospital,2


From the above result it is clear that, within a walkable distance of 200 m:

1. __7 locations__ have __one nearby restaurant__
2. __7 locations__ have __two nearby restaurants__
3. __4 locations__ have __three nearby restaurants__
4. __1 location__ has __four nearby restaurants__
5. __1 location__ has __five nearby restaurants__

Note that the printed log also reports __locations without any nearby restaurants__: __Mahesh P.U. College__ and __Sarkar laminates__.
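
The grouped dataframe cannot show locations with zero restaurants, because those locations never contribute a row. A hedged way to keep them visible is to store an explicit count per location, zeros included, and tally from there. The sketch below uses only the counts that can be read off the printed log earlier in this notebook. The log reports only locations with zero, one or two restaurants, so busier locations are left out. `restaurants_near` is an illustrative name.

```python
from collections import Counter

# Restaurant counts read off the printed log; locations with three or more
# restaurants are not reported there and are therefore omitted.
restaurants_near = {
    "RTO": 1,
    "D C Office": 2,
    "KMC Ambedkar Hospital": 2,
    "KMC Hospital Life's On": 2,
    "Mahesh P.U. College": 0,
    "SDM Law College, Mangalore": 1,
    "Canara College": 1,
    "Nalanda School": 2,
    "Milagres School": 2,
    "Canara Girls High School": 2,
    "Government High School": 2,
    "St. Aloysius High School": 1,
    "shakthi creations": 1,
    "Sarasija Foods (NARAN'S)": 1,
    "Sovereign Tile Works": 1,
    "Sarkar laminates": 0,
}

# how many locations have 0, 1 or 2 nearby restaurants
locations_per_count = Counter(restaurants_near.values())

# locations ordered from least to most nearby competition
least_competition = sorted(restaurants_near, key=restaurants_near.get)

print(dict(locations_per_count))
print(least_competition[:2])
```

Because the zero counts stay in the dictionary, the locations with no nearby restaurant come first in `least_competition` instead of disappearing from the summary.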

## 5. Results and Discussion

Our overall data analysis shows that Mangalore has many restaurants near crowded places. To identify the optimal location, we first identified locations in the Colleges, Hospitals, Factories, Schools and Government Buildings categories within a 5 km range of the Mangalore center point, because we wanted the optimal location to lie within that city radius. We then filtered this data to keep the top 5 locations for each category.

We could have opted for 10 locations per category, but that may have moved us away from the city center point. To stay near the city center, I preferred to select the top 5 locations.

Once we had the top 5 locations for each category, we used the Foursquare API to get the existing restaurants within 200 m of each location.

We chose 200 m as the radius limit because we wanted to look for restaurants within a walkable distance. We could have gone with 500 m, but many people do not consider 500 m walkable and prefer a two-wheeler or four-wheeler for that distance. Hence 200 m is a decent walkable distance.
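
To put the two candidate radii in perspective, the arithmetic below converts them into approximate walking times. The walking speed of 1.4 m/s (about 5 km/h) is an assumption chosen for illustration, not a figure from this project, and the distances are straight-line radii rather than actual street routes.

```python
WALKING_SPEED_M_PER_S = 1.4  # assumed typical adult walking pace (about 5 km/h)


def walking_minutes(distance_m, speed_m_per_s=WALKING_SPEED_M_PER_S):
    """Approximate one-way walking time in minutes for a straight-line distance."""
    return distance_m / speed_m_per_s / 60


for distance in (200, 500):
    print("{} m: about {:.1f} min one way".format(distance, walking_minutes(distance)))
```

Under this assumption a 200 m radius corresponds to a walk of a few minutes each way, while 500 m takes about two and a half times as long.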

Finally, we merged all the data and plotted it on the map, with orange for category locations and blue for restaurants.

We then prepared a table listing each location name and its number of nearby restaurants. The printed log shows two locations without any nearby restaurant (Mahesh P.U. College and Sarkar laminates), and there are 7 locations each with one and with two nearby restaurants.

__Recommendations:__
Based on the obtained results, I would recommend the following locations:
1. For the __Mahesh P.U. College__ location, we didn't find any nearby restaurants within 200 m. This location is my first recommendation. The printed log reports the same for __Sarkar laminates__.
2. There are __7 locations__ with only __1 nearby__ restaurant: __Canara College; St. Aloysius High School; Sovereign Tile Works; Sarasija Foods (NARAN'S); SDM Law College, Mangalore; RTO; and shakthi creations__. Interested stakeholders can look into any of these locations.


## 6. Conclusion

The purpose of this project is to identify the optimal location for opening a new restaurant within a 5 km range of Mangalore city, which should help stakeholders narrow down their search. I first identified the top 5 locations in several categories that normally attract crowds and hence offer more scope for opening a restaurant nearby. I then identified the existing restaurants near those locations.

My findings and recommendations are discussed in the _Results and Discussion_ section. The final choice among the recommended locations is left to the stakeholder and depends on other factors such as rent, water availability, parking slots and connectivity.


