# Coursera Capstone Project - The Battle of Neighborhoods (Week 2)
## Applied Data Science Capstone

## Table of contents
* [Introduction:  Problem Statement](#introduction)
* [Data Requirement](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

---
## 1. Problem Statement<a name="introduction"></a>

In this project I will try to identify locations in **Bangalore, India** that would be optimal for opening a **French Restaurant**.

This project is aimed at **stakeholders** who want to open a **new French Restaurant** in the city of **Bangalore, India**. I will try to identify the **locations** that are best suited to a **French Restaurant** and that are less crowded with restaurants, especially **French Restaurants**.

**Bangalore** is one of the fastest growing cities in **Asia** and is the **fifth largest city in India**. It is known as the Silicon Valley of India and thus boasts a large population in the age group of **20-45 years**. **Bangalore** has a high density of restaurants; through this project I will try to identify locations that have a low density of restaurants and a high potential for opening a French Restaurant.

The **criteria** for choosing the location of the restaurant will be:
* The density or number of restaurants in a Neighborhood
* The type of restaurants in a Neighborhood
* The distance of the nearest French Restaurant from the Neighborhood
* Location of the Neighborhood (to determine the affordability and accessibility)
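
The third criterion needs a distance between a Neighborhood and a restaurant. A minimal sketch of one way to compute it is the haversine great-circle formula below. The function name `haversine_km` and the Earth-radius constant are assumptions made for illustration and are not part of the notebook; the sample coordinates are the Adugodi and Agara values that appear in the tables further down.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two (lat, lng) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# distance between the Adugodi and Agara neighborhood coordinates
print(round(haversine_km(12.942004, 77.608304, 12.923065, 77.646453), 2))
```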

---
## 2. Data Requirement<a name="data"></a>

The data that I will require for this project are:
* Location data of Bangalore, India 
* Location data of the Neighborhoods of Bangalore
* Number of restaurants in each Neighborhood
* The types of restaurants in each Neighborhood

 The location data will be obtained using **Google Maps API geocoding** \
 Restaurant data will be obtained using the **Foursquare API**

The above data will help me to determine the optimal locations for opening a restaurant in **Bangalore** and to recommend locations for a **French Restaurant** to potential stakeholders.

### Making a dataframe of Neighborhoods of Bangalore by Postal code with location data

Scraping the list of Neighborhoods of Bangalore along with Postal code from "www.finkode.com" 

In [12]:
import pandas as pd
import numpy as np
import requests
import lxml
from IPython.display import display_html
from bs4 import BeautifulSoup

In [13]:
url = 'https://finkode.com/ka/bangalore.html'
source = requests.get(url).text

soup = BeautifulSoup(source,'lxml')
print(soup.title)

table = str(soup.table)
display_html(table,raw=True)

<title>Bangalore District Pincode List, Karnataka Postal Pin Codes | FinKode.com</title>


Post Office,District,Pincode
A F Station Yelahanka S.O,Bangalore,560063
Adugodi S.O,Bangalore,560030
Agara B.O,Bangalore,560034
Agram S.O,Bangalore,560007
Amruthahalli B.O,Bangalore,560092
Anandnagar S.O (Bangalore),Bangalore,560024
Anekal S.O,Bangalore,562106
Anekalbazar B.O,Bangalore,562106
Arabic College S.O,Bangalore,560045
Ashoknagar S.O (Bangalore),Bangalore,560050


Converting the list into a pandas dataframe

In [14]:
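from io import StringIO  # newer pandas versions expect a file-like object rather than a raw HTML string
df = pd.read_html(StringIO(table))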
df1 = df[0]
df1.head()

Unnamed: 0,Post Office,District,Pincode
0,A F Station Yelahanka S.O,Bangalore,560063
1,Adugodi S.O,Bangalore,560030
2,Agara B.O,Bangalore,560034
3,Agram S.O,Bangalore,560007
4,Amruthahalli B.O,Bangalore,560092


In [15]:
# removing the literal suffixes 'S.O' & 'B.O'; regex=False so that '.' is not treated as a regex wildcard
df1['Post Office'] = df1['Post Office'].str.replace('S.O', '', regex=False)
df1['Post Office'] = df1['Post Office'].str.replace('B.O', '', regex=False)
df1.head()

Unnamed: 0,Post Office,District,Pincode
0,A F Station Yelahanka,Bangalore,560063
1,Adugodi,Bangalore,560030
2,Agara,Bangalore,560034
3,Agram,Bangalore,560007
4,Amruthahalli,Bangalore,560092
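
Depending on the pandas version, `str.replace` may treat its pattern as a regular expression, in which case the dot in `'S.O'` matches any character. A hedged sketch of a stricter clean-up is shown below: it removes only a literal ` S.O` or ` B.O` token and tidies the leftover whitespace. The helper name `strip_office_suffix` is hypothetical and the sample names are taken from the tables in this notebook.

```python
import pandas as pd

def strip_office_suffix(names: pd.Series) -> pd.Series:
    """Remove a literal ' S.O' or ' B.O' token and tidy the remaining whitespace."""
    # \. matches a literal dot; (?!\w) stops the token from matching inside a longer word
    cleaned = names.str.replace(r'\s+[SB]\.O(?!\w)', '', regex=True)
    # collapse repeated spaces and drop leading/trailing ones
    return cleaned.str.replace(r'\s+', ' ', regex=True).str.strip()

sample = pd.Series(['Adugodi S.O', 'Agara B.O', 'Anandnagar S.O (Bangalore)', 'H.A.L II Stage H.O'])
print(strip_office_suffix(sample).tolist())
```

Note that this variant also removes the trailing space that the plain replacement leaves behind, so names cleaned this way would not match hand-typed names that still carry that space.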


In [16]:
# Dropping 'District' column
df2 = df1.drop(['District'], axis = 1)
df2.head()

Unnamed: 0,Post Office,Pincode
0,A F Station Yelahanka,560063
1,Adugodi,560030
2,Agara,560034
3,Agram,560007
4,Amruthahalli,560092


In [17]:
df3 = df2.rename(columns={'Post Office': 'Neighborhood', 'Pincode': 'Postal_Code'})
df3.head()

Unnamed: 0,Neighborhood,Postal_Code
0,A F Station Yelahanka,560063
1,Adugodi,560030
2,Agara,560034
3,Agram,560007
4,Amruthahalli,560092


In [18]:
[maxRow,maxCol] = df3.shape
maxRow,maxCol

(270, 2)

## Getting location data of all the Neighborhoods

In [20]:
!pip install -U googlemaps

import googlemaps

gm = googlemaps.Client(key = api_key_google)

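def Geocode(query):
    # geocode the query and return a (latitude, longitude) tuple
    try:
        geocode_result = gm.geocode(query)[0]
        latitude = geocode_result['geometry']['location']['lat']
        longitude = geocode_result['geometry']['location']['lng']
        return latitude, longitude
    except IndexError:
        # no match: return a (0, 0) sentinel so that the caller can still index the result
        return 0, 0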

Collecting googlemaps
  Downloading https://files.pythonhosted.org/packages/07/b8/bd7ab78014a4290853250ac8a1744c5a200e569811b7e0cc9222d38fc296/googlemaps-4.2.0-py3-none-any.whl
Installing collected packages: googlemaps
Successfully installed googlemaps-4.2.0


In [21]:
def GeocodeStreetLocationCity(data):
    lat = []                          # latitude of each neighborhood
    lng = []                          # longitude of each neighborhood
    for name in data.Neighborhood:    # iterate over every row of the dataframe passed in
        # build the query as "<neighborhood>, Bangalore"
        query = str(name).strip() + ', Bangalore'
        result = Geocode(query)

        # store the results
        lat.append(result[0])     # latitude
        lng.append(result[1])     # longitude
    return lat, lng

In [22]:
# call the geocoding function
[lat,lng]=GeocodeStreetLocationCity(df3)

# we put the list of latitude,longitude into pandas data frame
df4 = pd.DataFrame(
    {'Neighborhood': df3.Neighborhood, 'latitude': lat,
     'longitude': lng
    })

In [23]:
df4.head()

Unnamed: 0,Neighborhood,latitude,longitude
0,A F Station Yelahanka,13.137243,77.610284
1,Adugodi,12.942004,77.608304
2,Agara,12.923065,77.646453
3,Agram,12.957993,77.630838
4,Amruthahalli,13.065879,77.604206
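
A geocoder can return the same coarse match for several names it cannot resolve precisely, so it may be worth checking whether different Neighborhoods ended up with identical coordinates. The sketch below flags such rows on a small demonstration frame. The names `demo` and `shared_coords` are hypothetical, and the values are copied from tables that appear in this notebook.

```python
import pandas as pd

demo = pd.DataFrame({
    'Neighborhood': ['Adugodi', 'Bangalore City', 'Chudenapura', 'Malkand Lines'],
    'latitude':  [12.942004, 12.971599, 12.971599, 12.971599],
    'longitude': [77.608304, 77.594563, 77.594563, 77.594563],
})

# keep=False marks every member of a duplicated coordinate pair, not only the repeats
shared_coords = demo[demo.duplicated(subset=['latitude', 'longitude'], keep=False)]
print(shared_coords['Neighborhood'].tolist())
```

In the notebook the same check could be run on `df4`; rows that share coordinates will also receive identical venue lists from any radius-based search.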


In [24]:
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values
!pip install folium
import folium # plotting library

Collecting folium
  Downloading https://files.pythonhosted.org/packages/fd/a0/ccb3094026649cda4acd55bf2c3822bb8c277eb11446d13d384e5be35257/folium-0.10.1-py2.py3-none-any.whl (91kB)
Collecting branca>=0.3.0 (from folium)
  Downloading https://files.pythonhosted.org/packages/81/6d/31c83485189a2521a75b4130f1fee5364f772a0375f81afff619004e5237/branca-0.4.0-py3-none-any.whl
Installing collected packages: branca, folium
Successfully installed branca-0.4.0 folium-0.10.1


## Loading the latitude and longitude of Bangalore, India

In [25]:
address = 'Bangalore, India'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print("latitude of Bangalore is :", latitude)
print("longitude of Bangalore is :", longitude)

latitude of Bangalore is : 12.9791198
longitude of Bangalore is : 77.5912997


## Plotting the map of Bangalore with all Neighborhoods

In [183]:
map_bangalore = folium.Map(location=[latitude,longitude],zoom_start=12)

for lat, lng, Neighbourhood in zip(df4.latitude,df4.longitude,df4.Neighborhood):
    label = '{}'.format(Neighbourhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_bangalore)

map_bangalore

## Foursquare
Now that I have the location data of the Neighborhoods of Bangalore, I will use the Foursquare API to get information on the restaurants in each Neighborhood.

I am interested in venues in the 'food' category, especially those that are proper restaurants. So I will include in my list only venues that have 'restaurant' in the category name, and I will make sure to detect and include all the subcategories of the specific 'French Restaurant' category.

In [29]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
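        # create the API request URL
        # (CLIENT_ID, CLIENT_SECRET, VERSION and LIMIT are assumed to be defined in an earlier credentials cell)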
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
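
The request URL above is assembled with a long format string. As a sketch under stated assumptions, the same query string can be built from a dictionary so that each parameter is named and URL-encoded automatically. The helper name `build_explore_url`, the default `limit` of 100 and the version string `'20180605'` are illustrative assumptions and not values taken from this notebook; the endpoint is the one used in the function above.

```python
from urllib.parse import urlencode

EXPLORE_URL = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(lat, lng, client_id, client_secret, version, radius=500, limit=100):
    """Return the explore URL for one point; credentials are passed in, never hard-coded."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': f'{lat},{lng}',   # "latitude,longitude" of the search centre
        'radius': radius,       # search radius in metres
        'limit': limit,         # maximum number of venues to return
    }
    return f'{EXPLORE_URL}?{urlencode(params)}'

# placeholder credentials, for illustration only
url = build_explore_url(12.942004, 77.608304, 'ID', 'SECRET', '20180605')
print(url)
```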

In [30]:
df5 = getNearbyVenues(names=df4.Neighborhood, latitudes=df4.latitude,
                    longitudes=df4.longitude)
df5.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Adugodi,12.942004,77.608304,Bharathi Refreshments(South Indian Food) - Adu...,12.943388,77.60784,Fast Food Restaurant
1,Adugodi,12.942004,77.608304,Stoneart,12.941271,77.608701,Design Studio
2,Adugodi,12.942004,77.608304,adigas,12.940589,77.60878,Indian Restaurant
3,Adugodi,12.942004,77.608304,audugodi,12.942543,77.607353,Bus Station
4,Agara,12.923065,77.646453,Grigliato,12.922214,77.645357,Italian Restaurant


In [31]:
df6 = df5[df5['Venue Category'].str.contains('Restaurant',case=False)]
df6.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Adugodi,12.942004,77.608304,Bharathi Refreshments(South Indian Food) - Adu...,12.943388,77.60784,Fast Food Restaurant
2,Adugodi,12.942004,77.608304,adigas,12.940589,77.60878,Indian Restaurant
4,Agara,12.923065,77.646453,Grigliato,12.922214,77.645357,Italian Restaurant
5,Agara,12.923065,77.646453,Sichuan,12.921003,77.645143,Chinese Restaurant
6,Agara,12.923065,77.646453,Biriyani Zone,12.920138,77.645004,Hyderabadi Restaurant


In [32]:
# save our results to a file
df6.to_csv("neighborhood_venues_500m.csv")

In [33]:
df6.shape

(926, 7)

In [34]:
len(df6.Venue.unique())

553

In [35]:
len(df6["Venue Category"].unique())

47

In [209]:
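# df9 (the French-restaurant subset) is only created in a later cell,
# so the same filter is rebuilt here to let this cell run top to bottom
french_venues = df5[df5['Venue Category'].str.contains('French', case=False)]

print('Total number of restaurants:', len(df6.Venue.unique()))
print('Total number of French restaurants:', len(french_venues.Venue.unique()))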

Total number of restaurants: 553
Total number of French restaurants: 3


In [36]:
df7 = df6.groupby('Neighborhood').count()
df8 = df7.drop(['Neighborhood Latitude', 'Neighborhood Longitude', 'Venue Latitude', 'Venue Longitude', 'Venue Category'], axis = 1)
df8.head()

Unnamed: 0_level_0,Venue
Neighborhood,Unnamed: 1_level_1
Adugodi,2
Agara,4
Anandnagar (Bangalore),1
Ashoknagar (Bangalore),26
Attibele,2


In [37]:
df8.sort_values('Venue', ascending=False)

Unnamed: 0_level_0,Venue
Neighborhood,Unnamed: 1_level_1
Indiranagar (Bangalore),39
Tilaknagar (Bangalore),31
Jayanagar West,31
Jayanagar H.O,31
Koramangala,28
Ashoknagar (Bangalore),26
Jayangar III Block,22
H.A.L II Stage H.O,20
Fraser Town,19
Koramangala VI Bk,19


### Visualize all the restaurants in Bangalore

In [203]:
map_bangalore_restaurant = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, Venue in zip(df6['Venue Latitude'], df6['Venue Longitude'], df6['Venue']):
    label = '{}'.format(Venue)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_bangalore_restaurant)  
    
map_bangalore_restaurant

### From df8 we can see that places such as Indiranagar and Tilaknagar, which are at the heart of the city, are overcrowded with restaurants, while places such as Yeswanthpura, which are on the outskirts of the city, have very few restaurants.

### Let's now see how many French restaurants are in each Neighborhood of Bangalore

In [205]:
df9 = df5[df5['Venue Category'].str.contains('French',case=False)]
df9

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
192,Bangalore City,12.971599,77.594563,Café Noir,12.971995,77.596001,French Restaurant
286,Bangalore International Airport,13.198635,77.706593,Café Noir,13.200013,77.70864,French Restaurant
367,Bestamaranahalli,12.971599,77.594563,Café Noir,12.971995,77.596001,French Restaurant
492,Chudenapura,12.971599,77.594563,Café Noir,12.971995,77.596001,French Restaurant
1037,Jalavayuvihar,12.971599,77.594563,Café Noir,12.971995,77.596001,French Restaurant
1717,Malkand Lines,12.971599,77.594563,Café Noir,12.971995,77.596001,French Restaurant
1868,Milk Colony,13.007876,77.55763,Cafe Noir,13.0114,77.555295,French Restaurant
1953,NAL,12.960894,77.65149,Le Cirque Signature,12.960797,77.648597,French Restaurant
2421,Venkatarangapura,12.971599,77.594563,Café Noir,12.971995,77.596001,French Restaurant


In [210]:
df10 = df9.groupby('Neighborhood').count()
df11 = df10.drop(['Neighborhood Latitude', 'Neighborhood Longitude', 'Venue Latitude', 'Venue Longitude', 'Venue Category'], axis = 1)
print("Total No of French Restaurant in Bangalore is:", len(df9.Venue.unique()))
df11.head(10)

Total No of French Restaurant in Bangalore is: 3


Unnamed: 0_level_0,Venue
Neighborhood,Unnamed: 1_level_1
Bangalore City,1
Bangalore International Airport,1
Bestamaranahalli,1
Chudenapura,1
Jalavayuvihar,1
Malkand Lines,1
Milk Colony,1
NAL,1
Venkatarangapura,1


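### French restaurants show up for 9 Neighborhoods, but these rows contain only 3 unique venue names (Café Noir, Cafe Noir and Le Cirque Signature); the same Café Noir venue is returned for the 6 Neighborhoods that share identical coordinates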
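
Counting rows or names can miscount venues: one venue can be returned for several Neighborhoods, and one name can cover several branches. A hedged sketch of counting distinct venue locations instead is shown below, using a few rows copied from the French-restaurant table above. The frame name `french` and the variable `distinct_venues` are hypothetical.

```python
import pandas as pd

french = pd.DataFrame({
    'Neighborhood': ['Bangalore City', 'Bangalore International Airport',
                     'Chudenapura', 'Milk Colony', 'NAL'],
    'Venue': ['Café Noir', 'Café Noir', 'Café Noir', 'Cafe Noir', 'Le Cirque Signature'],
    'Venue Latitude': [12.971995, 13.200013, 12.971995, 13.011400, 12.960797],
    'Venue Longitude': [77.596001, 77.708640, 77.596001, 77.555295, 77.648597],
})

# one row per physical venue: identical coordinates are treated as the same place
distinct_venues = french.drop_duplicates(subset=['Venue Latitude', 'Venue Longitude'])
print(len(french), 'rows ->', len(distinct_venues), 'distinct venue locations')
```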

### Visualize all the French Restaurants in Bangalore

In [212]:
map_bangalore_french = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, Venue in zip(df9['Venue Latitude'], df9['Venue Longitude'], df9['Venue']):
    label = '{}'.format(Venue)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_bangalore_french)  
    
map_bangalore_french

### Now let's analyze each Neighborhood of Bangalore

In [41]:
# one hot encoding
df6_onehot = pd.get_dummies(df6[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
df6_onehot['Neighborhood'] = df6['Neighborhood'] 

# move neighborhood column to the first column
cols = df6_onehot.columns.tolist()
cols.insert(0, cols.pop(cols.index('Neighborhood')))

df6_onehot = df6_onehot.reindex(columns= cols)
df6_onehot.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Bengali Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Doner Restaurant,...,Seafood Restaurant,South Indian Restaurant,Southern / Soul Food Restaurant,Swiss Restaurant,Tex-Mex Restaurant,Thai Restaurant,Tibetan Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
0,Adugodi,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Adugodi,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Agara,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,Agara,0,0,0,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,0,0
6,Agara,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [42]:
df12 = df6_onehot.groupby('Neighborhood').mean().reset_index()
df12.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Bengali Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Doner Restaurant,...,Seafood Restaurant,South Indian Restaurant,Southern / Soul Food Restaurant,Swiss Restaurant,Tex-Mex Restaurant,Thai Restaurant,Tibetan Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
0,Adugodi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Agara,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Anandnagar (Bangalore),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Ashoknagar (Bangalore),0.0,0.038462,0.038462,0.0,0.0,0.0,0.0,0.115385,0.0,...,0.0,0.0,0.0,0.0,0.0,0.038462,0.038462,0.0,0.0,0.0
4,Attibele,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
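
The grouped mean of one-hot columns is the share of each restaurant category within a Neighborhood, which is why Agara shows 0.25 for Chinese Restaurant and Attibele shows 0.5 for South Indian Restaurant. The toy frame below is a sketch of that calculation; some of its category values are made up for illustration and the name `toy` is hypothetical.

```python
import pandas as pd

toy = pd.DataFrame({
    'Neighborhood': ['Agara', 'Agara', 'Agara', 'Agara', 'Attibele', 'Attibele'],
    'Venue Category': ['Italian Restaurant', 'Chinese Restaurant', 'Hyderabadi Restaurant',
                       'Indian Restaurant', 'South Indian Restaurant', 'Indian Restaurant'],
})

# one column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(toy['Venue Category'], dtype=float)
onehot.insert(0, 'Neighborhood', toy['Neighborhood'])

# the mean of a 0/1 column is the fraction of venues in that category
freq = onehot.groupby('Neighborhood').mean()
print(freq.loc['Agara', 'Chinese Restaurant'])
```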


## Clustering the Neighborhoods of Bangalore using K Means Clustering 

In [44]:
from sklearn.cluster import KMeans
# set number of clusters
kclusters = 8

neighborhood_clustering = df12.drop(columns='Neighborhood')  # keyword form; the positional axis argument is not accepted by newer pandas

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(neighborhood_clustering)

# check cluster labels generated for each row in the dataframe
labels = kmeans.labels_
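
The notebook fixes `kclusters = 8` without showing how that value was chosen. One common way to compare candidate values is the silhouette score. The sketch below applies it to synthetic data, not to the notebook's frequency matrix, so the array `X` and the range of `k` values are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# synthetic stand-in for the feature matrix: three well-separated blobs
X = np.vstack([rng.normal(loc=c, scale=0.05, size=(30, 2)) for c in (0.0, 1.0, 2.0)])

scores = {}
for k in range(2, 7):
    labels_k = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels_k)  # higher means tighter, better separated clusters

best_k = max(scores, key=scores.get)
print(best_k)
```

On real data the scores are usually much closer together, so the result is a guide rather than a definitive answer.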

In [45]:
df13 = df8.copy()  # copy so that adding the label column does not modify df8
df13['label'] = labels
df13.sort_values('Venue', ascending=False)

Unnamed: 0_level_0,Venue,label
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1
Indiranagar (Bangalore),39,6
Tilaknagar (Bangalore),31,0
Jayanagar West,31,0
Jayanagar H.O,31,0
Koramangala,28,6
Ashoknagar (Bangalore),26,0
Jayangar III Block,22,0
H.A.L II Stage H.O,20,4
Fraser Town,19,0
Koramangala VI Bk,19,0


In [46]:
df14 = pd.merge(df4, df13, on='Neighborhood')
df14.head()

Unnamed: 0,Neighborhood,latitude,longitude,Venue,label
0,Adugodi,12.942004,77.608304,2,6
1,Agara,12.923065,77.646453,4,4
2,Anandnagar (Bangalore),13.031328,77.591313,1,3
3,Ashoknagar (Bangalore),12.971885,77.607009,26,0
4,Attibele,12.778963,77.770203,2,5


### Visualize the resulting clusters

In [213]:
import matplotlib.cm as cm
import matplotlib.colors as colors

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
#rainbow = ["#FF33F6","#33E0FF","#7A33FF","#FF7D33","#7E0548"]
# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(df14['latitude'], df14['longitude'], df14['Neighborhood'], df14['label']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=4,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.5).add_to(map_clusters)
       
map_clusters

### Let's see in which clusters the Neighborhoods with a French Restaurant fall

In [138]:
df15 = pd.merge(df5, df13, on='Neighborhood')
df15.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue_x,Venue Latitude,Venue Longitude,Venue Category,Venue_y,label
0,Adugodi,12.942004,77.608304,Bharathi Refreshments(South Indian Food) - Adu...,12.943388,77.60784,Fast Food Restaurant,2,6
1,Adugodi,12.942004,77.608304,Stoneart,12.941271,77.608701,Design Studio,2,6
2,Adugodi,12.942004,77.608304,adigas,12.940589,77.60878,Indian Restaurant,2,6
3,Adugodi,12.942004,77.608304,audugodi,12.942543,77.607353,Bus Station,2,6
4,Agara,12.923065,77.646453,Grigliato,12.922214,77.645357,Italian Restaurant,4,4


In [140]:
French_Restaurant_Clus = df15[df15['Venue Category'].str.contains('French',case=False)]
French_Restaurant_Clus.head(10)

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue_x,Venue Latitude,Venue Longitude,Venue Category,Venue_y,label
176,Bangalore City,12.971599,77.594563,Café Noir,12.971995,77.596001,French Restaurant,18,4
265,Bangalore International Airport,13.198635,77.706593,Café Noir,13.200013,77.70864,French Restaurant,8,4
342,Bestamaranahalli,12.971599,77.594563,Café Noir,12.971995,77.596001,French Restaurant,18,4
459,Chudenapura,12.971599,77.594563,Café Noir,12.971995,77.596001,French Restaurant,18,4
970,Jalavayuvihar,12.971599,77.594563,Café Noir,12.971995,77.596001,French Restaurant,18,4
1625,Malkand Lines,12.971599,77.594563,Café Noir,12.971995,77.596001,French Restaurant,18,4
1768,Milk Colony,13.007876,77.55763,Cafe Noir,13.0114,77.555295,French Restaurant,13,4
1845,NAL,12.960894,77.65149,Le Cirque Signature,12.960797,77.648597,French Restaurant,8,0
2281,Venkatarangapura,12.971599,77.594563,Café Noir,12.971995,77.596001,French Restaurant,18,4


### Visualize Neighborhoods with French Restaurant

In [182]:
map_bangalore_french = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, Venue in zip(French_Restaurant_Clus['Venue Latitude'], French_Restaurant_Clus['Venue Longitude']
                                  , French_Restaurant_Clus['Venue_x']):
    label = '{}'.format(Venue)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_bangalore_french)  
    
map_bangalore_french

### From French_Restaurant_Clus we can see that all the Neighborhoods which have a French Restaurant belong to cluster 4, except NAL, which belongs to cluster 0
**Thus, the Neighborhoods which fall under Cluster 4 and do not have a French Restaurant are the most eligible locations to open a new French Restaurant**

In [169]:
cluster_4=df14[df14["label"].isin([4])]
cluster_4.sort_values('Venue', ascending=False)

Unnamed: 0,Neighborhood,latitude,longitude,Venue,label
52,H.A.L II Stage H.O,12.968478,77.642574,20,4
144,Venkatarangapura,12.971599,77.594563,18,4
97,Malkand Lines,12.971599,77.594563,18,4
20,Bestamaranahalli,12.971599,77.594563,18,4
67,Jalavayuvihar,12.971599,77.594563,18,4
36,Chudenapura,12.971599,77.594563,18,4
10,Bangalore City,12.971599,77.594563,18,4
103,Milk Colony,13.007876,77.55763,13,4
60,Hulsur Bazaar,12.97632,77.622825,11,4
13,Bangalore International Airport,13.198635,77.706593,8,4


In [170]:
neigh_list = cluster_4.Neighborhood.tolist()
neigh_list

['Agara ',
 'Bangalore City ',
 'Bangalore International Airport ',
 'Begur ',
 'Bestamaranahalli ',
 'Chandra Lay Out ',
 'Chudenapura ',
 'Dr. Shivarama Karanth Nagar ',
 'H.A.L II Stage H.O',
 'Hulsur Bazaar ',
 'Jalavayuvihar ',
 'Malkand Lines ',
 'Milk Colony ',
 'Mount St Joseph ',
 'Rv Niketan ',
 'Sri Chowdeshwari ',
 'St. Thomas Town ',
 'Venkatarangapura ']

In [171]:
candidate_location = ['Agara ','Begur ','Chandra Lay Out ','Dr. Shivarama Karanth Nagar ','H.A.L II Stage H.O','Hulsur Bazaar ',
                      'Mount St Joseph ','Rv Niketan ','Sri Chowdeshwari ','St. Thomas Town ']

In [175]:
df_candidate_location = pd.DataFrame(candidate_location, columns = ['Neighborhood'])
df_candidate_location

Unnamed: 0,Neighborhood
0,Agara
1,Begur
2,Chandra Lay Out
3,Dr. Shivarama Karanth Nagar
4,H.A.L II Stage H.O
5,Hulsur Bazaar
6,Mount St Joseph
7,Rv Niketan
8,Sri Chowdeshwari
9,St. Thomas Town
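
The candidate list above was typed by hand from `neigh_list`. As a sketch, it could instead be derived by removing the Neighborhoods that already have a French restaurant from the cluster members. The two series below are small hypothetical stand-ins; in the notebook they would correspond to `cluster_4.Neighborhood` and `French_Restaurant_Clus.Neighborhood`.

```python
import pandas as pd

cluster_neighborhoods = pd.Series(['Agara ', 'Bangalore City ', 'Begur ', 'Milk Colony ', 'Hulsur Bazaar '])
french_neighborhoods = pd.Series(['Bangalore City ', 'Milk Colony ', 'NAL '])

# keep the cluster members that do not already have a French restaurant nearby
candidates = cluster_neighborhoods[~cluster_neighborhoods.isin(french_neighborhoods)].tolist()
print(candidates)
```

Deriving the list this way keeps it in step with the data if the clustering or the venue search is rerun.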


In [179]:
df16 = pd.merge(df_candidate_location, cluster_4, on='Neighborhood')
df16.sort_values('Venue', ascending=False)

Unnamed: 0,Neighborhood,latitude,longitude,Venue,label
4,H.A.L II Stage H.O,12.968478,77.642574,20,4
5,Hulsur Bazaar,12.97632,77.622825,11,4
0,Agara,12.923065,77.646453,4,4
7,Rv Niketan,12.91555,77.51347,2,4
1,Begur,12.878767,77.637668,1,4
2,Chandra Lay Out,12.954459,77.522755,1,4
3,Dr. Shivarama Karanth Nagar,13.071902,77.626405,1,4
6,Mount St Joseph,12.868318,77.591079,1,4
8,Sri Chowdeshwari,13.032589,77.55493,1,4
9,St. Thomas Town,13.01313,77.624257,1,4


### Visualize our candidate locations

In [180]:
map_bangalore_candidate = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, neighborhood in zip(df16['latitude'], df16['longitude'], df16['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_bangalore_candidate)  
    
map_bangalore_candidate