# <center> Opening a Michelin Guide starred restaurant in New York City</center>

## Introduction & Business Problem

An investor wants to open a restaurant in New York City (NYC) with the ambition of being rated in the Michelin Guide.

New York City is the most populous city in the USA and is home to "nearly one thousand of the finest and most diverse haute cuisine restaurants in the world", according to Michelin. As of 2019, there were 27,043 restaurants in the city, up from 24,865 in 2017. This means that the market is highly competitive. Therefore, the project of opening a new restaurant in New York City needs to be analyzed carefully.

The business decision for a restaurant project is based on multiple factors. One key factor is the ability to distinguish yourself from the competition.

With that in mind, the US investor asks a consultant to provide him with the landscape of Michelin-rated restaurants in New York City by type of cuisine.

The project will be successful if it yields a sound recommendation of a borough and neighborhood for the US investor, based on the lack of such restaurants in the recommended area.

## Data acquisition

We will use the following data sets in our project:

#### Data 1

We will retrieve the names and locations of New York City boroughs and neighborhoods from the following dataset: https://geo.nyu.edu/catalog/nyu_2451_34572


#### Data 2

We will retrieve the list of Michelin starred restaurants in New York City from Wikipedia:

https://en.wikipedia.org/wiki/List_of_Michelin_starred_restaurants_in_New_York_City

#### Data 3

The list of Michelin starred restaurants in New York City will then be combined with the Foursquare API, which provides venue information (name, location, category) for each neighborhood.

## Data preparation

We download the dependencies needed.

In [1]:
import numpy as np 
import pandas as pd
import json
!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim 
import requests 
from pandas.io.json import json_normalize
import matplotlib.cm as cm
import matplotlib.colors as colors
!conda install -c conda-forge folium=0.5.0 --yes 
import folium
import csv 
print('Libraries imported.')


We retrieve the NYC data and put it into a dataframe.

In [2]:
!wget -q -O 'newyork_data.json' https://ibm.box.com/shared/static/fbpwbovar7lf8p5sgddm06cgipa2rxpe.json
print('Data downloaded!')

Data downloaded!


In [3]:
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)

In [4]:
neighborhoods_data = newyork_data['features']
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 
neighborhoods = pd.DataFrame(columns=column_names)

In [5]:
# Collect one record per neighborhood, then build the dataframe in a single step
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

neighborhoods = pd.DataFrame(rows, columns=column_names)

In [6]:
neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


We create a map of NYC with neighborhoods superimposed on top.

In [7]:
map_NewYork = folium.Map(location=[40.7308619, -73.9871558], zoom_start=10)

for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=4,
        popup=label,
        color='red',
        fill=True,
        fill_color='red',
        fill_opacity=0.6,
        parse_html=False).add_to(map_NewYork)  
    
map_NewYork

**Now let's scrape the Wikipedia page with BeautifulSoup to retrieve the list of Michelin starred restaurants in New York City.**

In [8]:
import urllib.request
from bs4 import BeautifulSoup
import pandas as pd

In [9]:
url = "https://en.wikipedia.org/wiki/List_of_Michelin_starred_restaurants_in_New_York_City"
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page, "lxml")


In [10]:
td = soup.findAll('tr')[1:]

In [11]:
Rest_NYC = []

for data in td:
    cols = data.find_all('td')
    details = {}
    for i, col in enumerate(cols):
        if i == 0:
            details['Venue'] = col.text.replace('\n', '')
        if i == 1:
            details['Borough'] = col.text.replace('\n', '')
        if i == 16:
            # the <img> tag in this cell carries the star rating (None if the cell has no image)
            details['Star_rating'] = col.img

    Rest_NYC.append(details)
Rest_NYC

[{'Venue': '15 East', 'Borough': 'Manhattan', 'Star_rating': None},
 {'Venue': 'A Voce Columbus', 'Borough': 'Manhattan'},
 {'Venue': 'A Voce Madison', 'Borough': 'Manhattan'},
 {'Venue': 'Adour', 'Borough': 'Manhattan'},
 {'Venue': 'Agern',
  'Borough': 'Manhattan',
  'Star_rating': <img alt="1 star" data-file-height="33" data-file-width="30" decoding="async" height="15" src="//upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Michelin-1.gif/14px-Michelin-1.gif" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Michelin-1.gif/21px-Michelin-1.gif 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Michelin-1.gif/28px-Michelin-1.gif 2x" width="14"/>},
 {'Venue': 'Ai Fiori',
  'Borough': 'Manhattan',
  'Star_rating': <img alt="1 star" data-file-height="33" data-file-width="30" decoding="async" height="15" src="//upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Michelin-1.gif/14px-Michelin-1.gif" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Michelin-1.gif/2

In [12]:
df= pd.DataFrame(Rest_NYC) 

Now let's put our table into a clean format.

We reorder the columns.

In [13]:
df= df[['Venue', 'Borough', 'Star_rating']]

In [14]:
df.head(10)

Unnamed: 0,Venue,Borough,Star_rating
0,15 East,Manhattan,
1,A Voce Columbus,Manhattan,
2,A Voce Madison,Manhattan,
3,Adour,Manhattan,
4,Agern,Manhattan,"<img alt=""1 star"" data-file-height=""33"" data-f..."
5,Ai Fiori,Manhattan,"<img alt=""1 star"" data-file-height=""33"" data-f..."
6,Alain Ducasse at the Essex House,Manhattan,
7,Aldea,Manhattan,"<img alt=""1 star"" data-file-height=""33"" data-f..."
8,Allen & Delancey,Manhattan,
9,Alto,Manhattan,


We drop the restaurants that have no star.

In [15]:
df.replace("None", np.nan, inplace=True)


In [16]:
df.dropna(inplace=True)


In [17]:
df.reset_index(drop=True, inplace=True)

Now we keep only the number of stars from the Star_rating column.

In [18]:
df['S_rate'] = df.Star_rating.astype('str').str.slice(start=10, stop=12)

In [19]:
df.drop('Star_rating', axis = 1, inplace=True)

In [20]:
df.head()

Unnamed: 0,Venue,Borough,S_rate
0,Agern,Manhattan,1
1,Ai Fiori,Manhattan,1
2,Aldea,Manhattan,1
3,L'Appart,Manhattan,1
4,Aquavit,Manhattan,2
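The slice above relies on the star count sitting at a fixed character offset in the `<img>` tag, and it also keeps the character that follows the digit. As a hedged alternative, the sketch below reads the number from the `alt` text with a regular expression. The helper name `stars_from_img_html` and the sample strings are assumptions made for illustration; they are not part of the original notebook.

```python
import re

def stars_from_img_html(img_html):
    """Return the star count found in an <img alt="N star"> tag, or None."""
    match = re.search(r'alt="(\d+)\s+stars?"', str(img_html))
    return int(match.group(1)) if match else None

# Sample strings shaped like the scraped cells shown earlier (illustrative only)
one_star = '<img alt="1 star" data-file-height="33" height="15" width="14"/>'
no_image = None  # rows without a star image hold None

print(stars_from_img_html(one_star))
print(stars_from_img_html(no_image))
```

Applied to the dataframe, this could be written as `df['S_rate'] = df['Star_rating'].map(stars_from_img_html)`, which yields integers rather than strings.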


We now have our list of NYC restaurants rated in the Michelin Guide, together with their ratings.

This is the last step of our data preparation. We will retrieve information about the restaurants using the Foursquare API. At the end of this step, our final table will be ready to be analyzed.

In [21]:
import matplotlib.pyplot as plt

from sklearn.cluster import KMeans

from sklearn.metrics import silhouette_score

We define the Foursquare credentials and API version.

In [22]:
import os

# Read the Foursquare credentials from the environment instead of hard-coding them.
# The variable names below are placeholders; use whatever names your setup defines.
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']
VERSION = '20180604'

print('Credentials loaded.')

Credentials loaded.


In [23]:
address = 'New York, NY'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

40.7127281 -74.0060152


In [24]:
def getNearbyVenues(names, latitudes, longitudes, LIMIT=200, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
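The function above indexes straight into `["response"]['groups'][0]['items']` and `['categories'][0]`, so a failed request or a venue without a category would raise an exception partway through the loop. The sketch below shows one defensive way to flatten a single response. The helper name `flatten_explore_response` and the hand-written payload are assumptions for illustration; the payload only mimics the fields the notebook reads and is not real API output.

```python
def flatten_explore_response(payload, name, lat, lng):
    """Turn one 'explore' JSON payload into a list of flat row tuples."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories', [])
        # Some venues may come back without a category
        category = categories[0].get('name') if categories else None
        rows.append((name, lat, lng, venue.get('name'),
                     location.get('lat'), location.get('lng'), category))
    return rows

# Hand-written payload for illustration only (not real API output)
fake_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Lollipops Gelato',
               'location': {'lat': 40.894123, 'lng': -73.845892},
               'categories': [{'name': 'Dessert Shop'}]}},
    {'venue': {'name': 'Uncategorized Place',
               'location': {'lat': 40.89, 'lng': -73.84},
               'categories': []}},
]}]}}

rows = flatten_explore_response(fake_payload, 'Wakefield', 40.894705, -73.847201)
for row in rows:
    print(row)
```

An empty or error payload simply produces no rows, so the loop over neighborhoods can continue.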

In [25]:

BM_venues = getNearbyVenues(names=neighborhoods['Neighborhood'],
                                  latitudes=neighborhoods['Latitude'],
                                  longitudes=neighborhoods['Longitude'])

BM_venues.head()

Wakefield
Co-op City
Eastchester
Fieldston
Riverdale
Kingsbridge
Marble Hill
Woodlawn
Norwood
Williamsbridge
Baychester
Pelham Parkway
City Island
Bedford Park
University Heights
Morris Heights
Fordham
East Tremont
West Farms
High  Bridge
Melrose
Mott Haven
Port Morris
Longwood
Hunts Point
Morrisania
Soundview
Clason Point
Throgs Neck
Country Club
Parkchester
Westchester Square
Van Nest
Morris Park
Belmont
Spuyten Duyvil
North Riverdale
Pelham Bay
Schuylerville
Edgewater Park
Castle Hill
Olinville
Pelham Gardens
Concourse
Unionport
Edenwald
Bay Ridge
Bensonhurst
Sunset Park
Greenpoint
Gravesend
Brighton Beach
Sheepshead Bay
Manhattan Terrace
Flatbush
Crown Heights
East Flatbush
Kensington
Windsor Terrace
Prospect Heights
Brownsville
Williamsburg
Bushwick
Bedford Stuyvesant
Brooklyn Heights
Cobble Hill
Carroll Gardens
Red Hook
Gowanus
Fort Greene
Park Slope
Cypress Hills
East New York
Starrett City
Canarsie
Flatlands
Mill Island
Manhattan Beach
Coney Island
Bath Beach
Borough Park
Dyker

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Wakefield,40.894705,-73.847201,Lollipops Gelato,40.894123,-73.845892,Dessert Shop
1,Wakefield,40.894705,-73.847201,Ripe Kitchen & Bar,40.898152,-73.838875,Caribbean Restaurant
2,Wakefield,40.894705,-73.847201,Ali's Roti Shop,40.894036,-73.856935,Caribbean Restaurant
3,Wakefield,40.894705,-73.847201,Jackie's West Indian Bakery,40.889283,-73.84331,Caribbean Restaurant
4,Wakefield,40.894705,-73.847201,Carvel Ice Cream,40.890487,-73.848568,Ice Cream Shop


In [26]:
BM_venues= BM_venues.set_index('Venue')


In [27]:
df=df.set_index('Venue')

In [28]:
BM_venues.head()

Venue,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Venue Category
Lollipops Gelato,Wakefield,40.894705,-73.847201,40.894123,-73.845892,Dessert Shop
Ripe Kitchen & Bar,Wakefield,40.894705,-73.847201,40.898152,-73.838875,Caribbean Restaurant
Ali's Roti Shop,Wakefield,40.894705,-73.847201,40.894036,-73.856935,Caribbean Restaurant
Jackie's West Indian Bakery,Wakefield,40.894705,-73.847201,40.889283,-73.84331,Caribbean Restaurant
Carvel Ice Cream,Wakefield,40.894705,-73.847201,40.890487,-73.848568,Ice Cream Shop


In [29]:
result = pd.merge(df, BM_venues, on='Venue' )

In [30]:
result.head()

Venue,Borough,S_rate,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Venue Category
Ai Fiori,Manhattan,1,Midtown,40.754691,-73.981669,40.750075,-73.983784,Italian Restaurant
Ai Fiori,Manhattan,1,Murray Hill,40.748303,-73.978332,40.750075,-73.983784,Italian Restaurant
Ai Fiori,Manhattan,1,Midtown South,40.74851,-73.988713,40.750075,-73.983784,Italian Restaurant
Atera,Manhattan,2,Tribeca,40.721522,-74.010683,40.716752,-74.005712,Molecular Gastronomy Restaurant
Atera,Manhattan,2,Civic Center,40.715229,-74.005415,40.716752,-74.005712,Molecular Gastronomy Restaurant


In [31]:
result.reset_index(inplace=True)
result.head()

Unnamed: 0,Venue,Borough,S_rate,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Venue Category
0,Ai Fiori,Manhattan,1,Midtown,40.754691,-73.981669,40.750075,-73.983784,Italian Restaurant
1,Ai Fiori,Manhattan,1,Murray Hill,40.748303,-73.978332,40.750075,-73.983784,Italian Restaurant
2,Ai Fiori,Manhattan,1,Midtown South,40.74851,-73.988713,40.750075,-73.983784,Italian Restaurant
3,Atera,Manhattan,2,Tribeca,40.721522,-74.010683,40.716752,-74.005712,Molecular Gastronomy Restaurant
4,Atera,Manhattan,2,Civic Center,40.715229,-74.005415,40.716752,-74.005712,Molecular Gastronomy Restaurant


In [32]:
result.drop_duplicates(subset='Venue', inplace=True)
result.reset_index(drop=True,inplace=True)
result.head()

Unnamed: 0,Venue,Borough,S_rate,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Venue Category
0,Ai Fiori,Manhattan,1,Midtown,40.754691,-73.981669,40.750075,-73.983784,Italian Restaurant
1,Atera,Manhattan,2,Tribeca,40.721522,-74.010683,40.716752,-74.005712,Molecular Gastronomy Restaurant
2,Atomix,Manhattan,2,Murray Hill,40.748303,-73.978332,40.744306,-73.982945,Korean Restaurant
3,Bâtard,Manhattan,1,Tribeca,40.721522,-74.010683,40.719624,-74.005788,Modern European Restaurant
4,Blanca,Brooklyn,2,East Williamsburg,40.708492,-73.938858,40.705033,-73.933774,New American Restaurant
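Because the search radius is 1000 m, one venue can be returned for several neighboring neighborhoods, and `drop_duplicates` keeps whichever match happens to come first. If the neighborhood assignment matters, one option is to keep the neighborhood whose center is closest to the venue. The sketch below illustrates that idea on the three "Ai Fiori" rows from the merged table; the `haversine_km` helper and the `Distance km` column are assumptions introduced for the example.

```python
import numpy as np
import pandas as pd

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (inputs in decimal degrees)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

# Three rows for one venue, copied from the merged table
matches = pd.DataFrame({
    'Venue': ['Ai Fiori'] * 3,
    'Neighborhood': ['Midtown', 'Murray Hill', 'Midtown South'],
    'Neighborhood Latitude': [40.754691, 40.748303, 40.74851],
    'Neighborhood Longitude': [-73.981669, -73.978332, -73.988713],
    'Venue Latitude': [40.750075] * 3,
    'Venue Longitude': [-73.983784] * 3,
})

matches['Distance km'] = haversine_km(
    matches['Neighborhood Latitude'], matches['Neighborhood Longitude'],
    matches['Venue Latitude'], matches['Venue Longitude'])

# Sort so the nearest neighborhood comes first, then keep one row per venue
closest = (matches.sort_values('Distance km')
                  .drop_duplicates(subset='Venue')
                  .reset_index(drop=True))
print(closest[['Venue', 'Neighborhood', 'Distance km']])
```

Note that the merge itself matches on the venue name only, so two different venues sharing a name would still be conflated; this sketch does not address that.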


Let's visualize our result.

In [33]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=13) 

for lat, lng, Venue, S_rate in zip(result['Venue Latitude'], result['Venue Longitude'], result['Venue'], result['S_rate']):
    label = '{}, {}'.format(Venue, S_rate)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=4,
        popup=label,
        color='blue',
        fill=True,
        fill_color='red',
        fill_opacity=0.6,
        parse_html=False).add_to(venues_map)  
    
venues_map

Let's check how many venues are in each neighborhood.

In [34]:
df3= result.groupby('Neighborhood').count()

In [35]:
df3

Neighborhood,Venue,Borough,S_rate,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Venue Category
Central Harlem,1,1,1,1,1,1,1,1
Chelsea,1,1,1,1,1,1,1,1
Chinatown,1,1,1,1,1,1,1,1
Clinton Hill,1,1,1,1,1,1,1,1
East Village,1,1,1,1,1,1,1,1
East Williamsburg,1,1,1,1,1,1,1,1
Financial District,1,1,1,1,1,1,1,1
Fulton Ferry,1,1,1,1,1,1,1,1
Gowanus,1,1,1,1,1,1,1,1
Gramercy,3,3,3,3,3,3,3,3


## Clustering the restaurants using k-means

In [36]:
from sklearn.preprocessing import StandardScaler

In [48]:
# Keep only the numeric columns used for clustering
df4 = result.drop(['Venue', 'Borough', 'Neighborhood', 'Venue Category'], axis=1)

In [40]:
df4.head()

Unnamed: 0,S_rate,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude
0,1,40.754691,-73.981669,40.750075,-73.983784
1,2,40.721522,-74.010683,40.716752,-74.005712
2,2,40.748303,-73.978332,40.744306,-73.982945
3,1,40.721522,-74.010683,40.719624,-74.005788
4,2,40.708492,-73.938858,40.705033,-73.933774

In [41]:
X = df4.values[:,1:]
X = np.nan_to_num(X)
cluster_dataset = StandardScaler().fit_transform(X)
cluster_dataset



array([[ 8.49445922e-01, -1.63868476e-02,  7.25220565e-01,
        -1.22578006e-01],
       [-3.50623412e-01, -1.48576168e+00, -5.17732951e-01,
        -1.27470940e+00],
       [ 6.18325232e-01,  1.52595581e-01,  5.10016802e-01,
        -7.85334888e-02],
       [-3.50623412e-01, -1.48576168e+00, -4.10599374e-01,
        -1.27872838e+00],
       [-8.22036721e-01,  2.15166768e+00, -9.54827102e-01,
         2.50502059e+00],
       [-1.54854767e-01, -9.40375866e-01,  5.37271765e-02,
        -9.56384790e-01],
       [-1.54854767e-01, -9.40375866e-01, -1.01785215e-01,
        -9.81782449e-01],
       [ 4.41440199e-01,  1.39153530e+00,  4.75260255e-01,
         1.42447005e+00],
       [ 2.16968293e-01, -1.55462222e-03,  1.96811815e-01,
        -3.00594673e-01],
       [-2.07247381e+00, -6.63199552e-01, -1.98499679e+00,
        -2.48190375e-01],
       [-8.72157407e-01, -1.48485819e+00, -9.11797731e-01,
        -1.36816018e+00],
       [ 1.33504186e+00,  1.13873294e+00,  1.35013892e+00,
      

Let's run our model and group the restaurants into three clusters.

In [42]:
num_clusters = 3

k_means = KMeans(init="k-means++", n_clusters=num_clusters, n_init=12)
k_means.fit(cluster_dataset)
labels = k_means.labels_

print(labels)

[2 0 2 0 1 0 0 2 2 0 0 2 0 2 0 1 1 2 2 2 0 2 1 0 1 1 2 1 0 2 0 2 2 0 0]
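The notebook imports `silhouette_score` but fixes the number of clusters at three without checking it. A minimal sketch of how the score could guide that choice is shown below. It runs on synthetic data standing in for `cluster_dataset`; the helper name `silhouette_by_k`, the random seed and the toy points are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def silhouette_by_k(data, k_values, seed=0):
    """Fit k-means for each k and return {k: mean silhouette score}."""
    scores = {}
    for k in k_values:
        model = KMeans(init="k-means++", n_clusters=k, n_init=12, random_state=seed)
        scores[k] = silhouette_score(data, model.fit_predict(data))
    return scores

# Synthetic stand-in for cluster_dataset: three well-separated 2-D groups
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
toy_data = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

scores = silhouette_by_k(toy_data, range(2, 7))
best_k = max(scores, key=scores.get)
print(scores)
print('k with the highest silhouette score:', best_k)
```

On the real data the same call would be `silhouette_by_k(cluster_dataset, range(2, 7))`. With only a few dozen restaurants the scores can be noisy, so they are best read as a rough guide rather than a firm answer.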


In [49]:
df4["Labels"] = labels
df4.head()

Unnamed: 0,S_rate,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Labels
0,1,40.754691,-73.981669,40.750075,-73.983784,2
1,2,40.721522,-74.010683,40.716752,-74.005712,0
2,2,40.748303,-73.978332,40.744306,-73.982945,2
3,1,40.721522,-74.010683,40.719624,-74.005788,0
4,2,40.708492,-73.938858,40.705033,-73.933774,1


We have our different clusters!

In [43]:
result["Labels"] = labels
result

Unnamed: 0,Venue,Borough,S_rate,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Venue Category,Labels
0,Ai Fiori,Manhattan,1,Midtown,40.754691,-73.981669,40.750075,-73.983784,Italian Restaurant,2
1,Atera,Manhattan,2,Tribeca,40.721522,-74.010683,40.716752,-74.005712,Molecular Gastronomy Restaurant,0
2,Atomix,Manhattan,2,Murray Hill,40.748303,-73.978332,40.744306,-73.982945,Korean Restaurant,2
3,Bâtard,Manhattan,1,Tribeca,40.721522,-74.010683,40.719624,-74.005788,Modern European Restaurant,0
4,Blanca,Brooklyn,2,East Williamsburg,40.708492,-73.938858,40.705033,-73.933774,New American Restaurant,1
5,Blue Hill,Manhattan,1,Greenwich Village,40.726933,-73.999914,40.732073,-73.999653,American Restaurant,0
6,Carbone,Manhattan,1,Greenwich Village,40.726933,-73.999914,40.727903,-74.000136,Italian Restaurant,0
7,Casa Enrique,Queens,1,Hunters Point,40.743414,-73.953868,40.743374,-73.954339,Mexican Restaurant,2
8,Casa Mono,Manhattan,1,Gramercy,40.73721,-73.981376,40.735909,-73.987172,Spanish Restaurant,2
9,Claro,Brooklyn,1,Gowanus,40.673931,-73.994441,40.677415,-73.986174,Mexican Restaurant,0


In [47]:
df5 = df4.groupby('Labels').mean(numeric_only=True)  # S_rate is stored as text, so only the coordinates are averaged
df5.reset_index(inplace=True)
df5

Unnamed: 0,Labels,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude
0,0,40.719634,-74.000474,40.719451,-73.998869
1,1,40.704311,-73.957164,40.704032,-73.956808
2,2,40.756243,-73.974307,40.755113,-73.976354


Let's visualize the cluster centers together with our list of restaurants and the NYC neighborhoods.

In [51]:
Cluster_map = folium.Map(location=[latitude, longitude], zoom_start=13) 

for lat, lng, Labels in zip(df5['Venue Latitude'], df5['Venue Longitude'], df5['Labels']):
    label = 'Cluster {}'.format(Labels)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=70,
        popup=label,
        color='green',
        fill=True,
        fill_color='green',
        fill_opacity=0.6,
        parse_html=False).add_to(Cluster_map)  
for lat, lng, Venue, S_rate in zip(result['Venue Latitude'], result['Venue Longitude'], result['Venue'], result['S_rate']):
    label = '{}, {}'.format(Venue, S_rate)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=4,
        popup=label,
        color='blue',
        fill=True,
        fill_color='blue',
        fill_opacity=0.6,
        parse_html=False).add_to(Cluster_map)  
    

for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=4,
        popup=label,
        color='red',
        fill=True,
        fill_color='red',
        fill_opacity=0.6,
        parse_html=False).add_to(Cluster_map)   
    
Cluster_map

Last but not least: the different types of restaurants in each neighborhood.

In [52]:
# one hot encoding
BM_onehot = pd.get_dummies(result[['Venue Category']], prefix="", prefix_sep="")

#column lists before adding neighborhood
column_names = ['Neighborhood'] + list(BM_onehot.columns)

# add neighborhood column back to dataframe
BM_onehot['Neighborhood'] = result['Neighborhood'] 

# move neighborhood column to the first column
BM_onehot = BM_onehot[column_names]

BM_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Asian Restaurant,French Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Mediterranean Restaurant,Mexican Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant,New American Restaurant,Restaurant,Spanish Restaurant,Steakhouse,Sushi Restaurant,Thai Restaurant,Wine Bar
0,Midtown,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Tribeca,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
2,Murray Hill,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0
3,Tribeca,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
4,East Williamsburg,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0


In [53]:
BM_onehot

Unnamed: 0,Neighborhood,American Restaurant,Asian Restaurant,French Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Mediterranean Restaurant,Mexican Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant,New American Restaurant,Restaurant,Spanish Restaurant,Steakhouse,Sushi Restaurant,Thai Restaurant,Wine Bar
0,Midtown,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Tribeca,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
2,Murray Hill,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0
3,Tribeca,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
4,East Williamsburg,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
5,Greenwich Village,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,Greenwich Village,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
7,Hunters Point,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
8,Gramercy,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0
9,Gowanus,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
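To read the cuisine landscape per neighborhood directly, the one-hot rows can be summed by neighborhood, or equivalently the two columns can be cross-tabulated. The sketch below takes the second route on a handful of rows copied from the `result` table; the names `sample` and `cuisine_counts` are assumptions for the example.

```python
import pandas as pd

# A few (Neighborhood, Venue Category) pairs copied from the result table
sample = pd.DataFrame({
    'Neighborhood': ['Midtown', 'Tribeca', 'Tribeca',
                     'Greenwich Village', 'Greenwich Village'],
    'Venue Category': ['Italian Restaurant', 'Molecular Gastronomy Restaurant',
                       'Modern European Restaurant', 'American Restaurant',
                       'Italian Restaurant'],
})

# Rows are neighborhoods, columns are cuisines, cells are restaurant counts
cuisine_counts = pd.crosstab(sample['Neighborhood'], sample['Venue Category'])
print(cuisine_counts)
```

A zero in this table marks a cuisine with no starred restaurant in that neighborhood, which is the kind of gap the recommendation is looking for. On the full data the call would be `pd.crosstab(result['Neighborhood'], result['Venue Category'])`.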
