## Introduction

For this exercise I'd like to introduce you to my friend Jack. Jack is a talented chef: his mom is French, his dad is Italian, he grew up in Asia, and he lives in Manhattan. Jack can cook any cuisine you can think of to perfection. <b>Here is the problem:</b> Jack wants to open a restaurant or food venue in Manhattan, and he wants to use data to help him make the hard decision of what kind of restaurant to open and where to open it. Ideally he would like a recommendation for each neighborhood in Manhattan, based on the existing food places in that neighborhood.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # comment this line out if folium is already installed
import folium # map rendering library

print('Libraries imported.')


## Dataset

To complete the task I need to isolate the Manhattan data from the New York dataset. The dataset must fulfill these requirements:
- The dataset must contain food venues only.
- The dataset must include the food category of each venue.
- The dataset must include the latitude and longitude coordinates of each neighborhood.

## Methodology

1) Download the data. <br>
2) Draw a map of the Manhattan neighborhoods. <br>
3) Find the top 10 food-venue types for all of Manhattan. <br>
4) Isolate the top 10 food-venue types and build a dataframe of these for the Manhattan neighborhoods. <br>
5) Based on the average number of venues of each type, score each food-venue type for each neighborhood. <br>

## 1) Download and prep the data

In [2]:
!wget -q -O 'newyork_data.json' https://cocl.us/new_york_dataset
print('Data downloaded!')

Data downloaded!


In [3]:
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)

In [4]:
neighborhoods_data = newyork_data['features']

#### The above just returns a JSON object for all of New York; the next step is to transform the data into a dataframe

In [5]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)
neighborhoods

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude


#### Fill the dataframe with all New York neighborhoods

In [6]:
# Fill in the data for all New York neighborhoods
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']
    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]
    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

# DataFrame.append was removed in pandas 2.0, so build the dataframe in one step
neighborhoods = pd.DataFrame(rows, columns=column_names)
neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


In [7]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 5 boroughs and 306 neighborhoods.


#### Time to isolate the Manhattan data

In [9]:
# Isolate the Manhattan data
manhattan_data = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


## 2) Draw a map of the Manhattan neighborhoods

####  Get Manhattan coordinates for the map viewport

In [10]:
address = 'Manhattan, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Manhattan are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Manhattan are 40.7900869, -73.9598295.


#### Create map of Manhattan using latitude and longitude values


In [11]:
# Create map
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)  
    
map_manhattan

## 3) Find the top 10 food-venue types for all of Manhattan

####  Secret Foursquare credentials

In [12]:
# The code was removed by Watson Studio for sharing.

#### Next step is to create a function that gets the food-venue data. Notice the section is set to food, so it only returns food-venues.

In [13]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    section = 'food'
    venues_list=[]
    LIMIT = 100
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&section={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            section,
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
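
The row-building step inside `getNearbyVenues` can be tried without API credentials. The sketch below is only an illustration: `sample_items` is a hand-written stand-in that mimics just the fields the function reads (it is not a real Foursquare response), and `items_to_rows` is a hypothetical helper name that does not appear in the notebook.

```python
import pandas as pd

# Hand-written stand-in for response['groups'][0]['items'];
# it only mimics the fields read by getNearbyVenues and is NOT a real API payload.
sample_items = [
    {"venue": {"name": "Arturo's",
               "location": {"lat": 40.874412, "lng": -73.910271},
               "categories": [{"name": "Pizza Place"}]}},
    {"venue": {"name": "Tibbett Diner",
               "location": {"lat": 40.880404, "lng": -73.908937},
               "categories": [{"name": "Diner"}]}},
]

def items_to_rows(name, lat, lng, items):
    """Flatten venue items into tuples, one per venue (hypothetical helper)."""
    return [(name, lat, lng,
             item["venue"]["name"],
             item["venue"]["location"]["lat"],
             item["venue"]["location"]["lng"],
             item["venue"]["categories"][0]["name"])
            for item in items]

columns = ["Neighborhood", "Neighborhood Latitude", "Neighborhood Longitude",
           "Venue", "Venue Latitude", "Venue Longitude", "Venue Category"]

rows = items_to_rows("Marble Hill", 40.876551, -73.91066, sample_items)
sample_venues = pd.DataFrame(rows, columns=columns)
print(sample_venues[["Neighborhood", "Venue", "Venue Category"]])
```

Each venue becomes one row that repeats the neighborhood name and coordinates, which is what later makes grouping by neighborhood possible.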

#### Set up dataframe for all neighborhoods

In [None]:


manhattan_venues = getNearbyVenues(names=manhattan_data['Neighborhood'],
                                   latitudes=manhattan_data['Latitude'],
                                   longitudes=manhattan_data['Longitude']
                                  )



Marble Hill
Chinatown


#### Now we should have a data frame with Neighborhood, coordinates, venue name and venue category

In [14]:
print(manhattan_venues.shape)
manhattan_venues.head()

(2883, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Marble Hill,40.876551,-73.91066,Arturo's,40.874412,-73.910271,Pizza Place
1,Marble Hill,40.876551,-73.91066,Tibbett Diner,40.880404,-73.908937,Diner
2,Marble Hill,40.876551,-73.91066,Dunkin',40.877136,-73.906666,Donut Shop
3,Marble Hill,40.876551,-73.91066,Land & Sea Restaurant,40.877885,-73.905873,Seafood Restaurant
4,Marble Hill,40.876551,-73.91066,Parrilla Latina,40.877473,-73.906073,Steakhouse


In [15]:
# I want to know the 10 most common food-venue categories
df = manhattan_venues['Venue Category'].value_counts()
df.head(10)

Italian Restaurant     247
Pizza Place            169
Café                   143
American Restaurant    141
Deli / Bodega          131
Sandwich Place         119
Chinese Restaurant     110
Mexican Restaurant     110
Bakery                 106
French Restaurant       90
Name: Venue Category, dtype: int64
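
The category names in this ranking can also be collected programmatically, so they do not have to be typed out by hand in a later step. This is a minimal sketch on a toy Series; `categories`, `top_n` and `top_categories` are illustrative names, and the toy values are not the real Manhattan counts.

```python
import pandas as pd

# Toy stand-in for manhattan_venues['Venue Category']; values are illustrative only.
categories = pd.Series(
    ["Pizza Place", "Diner", "Pizza Place", "Café", "Pizza Place", "Café"],
    name="Venue Category",
)

top_n = 2  # the notebook uses 10
# value_counts() sorts by frequency, so the first top_n index labels are the top categories
top_categories = categories.value_counts().head(top_n).index.tolist()
print(top_categories)
```

Note that categories with equal counts (such as Chinese Restaurant and Mexican Restaurant above, both at 110) may come out in either order.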

## 4) Isolate the top 10 food-venue types and build a dataframe of these for the Manhattan neighborhoods


#### Now we set up a dataframe that shows the neighborhood-location of each type of restaurant.

In [16]:
# one hot encoding
manhattan_onehot = pd.get_dummies(manhattan_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
manhattan_onehot['Neighborhood'] = manhattan_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [manhattan_onehot.columns[-1]] + list(manhattan_onehot.columns[:-1])
manhattan_onehot = manhattan_onehot[fixed_columns]
manhattan_onehot.shape
manhattan_onehot.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,African Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Asian Restaurant,Australian Restaurant,Austrian Restaurant,BBQ Joint,Bagel Shop,Bakery,Belgian Restaurant,Bistro,Brazilian Restaurant,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Cafeteria,Café,Cajun / Creole Restaurant,Cambodian Restaurant,Cantonese Restaurant,Caribbean Restaurant,Caucasian Restaurant,Chinese Restaurant,Creperie,Cuban Restaurant,Czech Restaurant,Deli / Bodega,Dim Sum Restaurant,Diner,Donut Shop,Dosa Place,Dumpling Restaurant,Eastern European Restaurant,Empanada Restaurant,English Restaurant,Ethiopian Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Food,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Gastropub,German Restaurant,Gluten-free Restaurant,Greek Restaurant,Hawaiian Restaurant,Himalayan Restaurant,Hot Dog Joint,Hotpot Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jewish Restaurant,Kebab Restaurant,Korean Restaurant,Kosher Restaurant,Latin American Restaurant,Lebanese Restaurant,Mac & Cheese Joint,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant,Mongolian Restaurant,Moroccan Restaurant,New American Restaurant,Noodle House,North Indian Restaurant,Paella Restaurant,Pakistani Restaurant,Peking Duck Restaurant,Persian Restaurant,Peruvian Restaurant,Pet Café,Pizza Place,Poke Place,Portuguese Restaurant,Poutine Place,Ramen Restaurant,Restaurant,Russian Restaurant,Salad Place,Sandwich Place,Scandinavian Restaurant,Seafood Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Snack Place,Soba Restaurant,Soup Place,South American Restaurant,South Indian Restaurant,Southern / Soul Food Restaurant,Spanish Restaurant,Sri Lankan Restaurant,Steakhouse,Sushi Restaurant,Swiss Restaurant,Szechuan Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Thai Restaurant,Theme Restaurant,Tonkatsu Restaurant,Turkish Restaurant,Udon Restaurant,Ukrainian Restaurant,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Vietnamese Restaurant,Wings Joint
0,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


#### Next, I group rows by neighborhood and sum each category. Only the top 10 venue categories, listed in the code below, are kept; all other categories are dropped.

In [17]:
manhattan_grouped = manhattan_onehot.groupby('Neighborhood').sum()
manhattan_grouped = manhattan_grouped[['Italian Restaurant','Pizza Place','Café','American Restaurant','Deli / Bodega','Sandwich Place','Mexican Restaurant','Chinese Restaurant','Bakery','French Restaurant']]
manhattan_grouped.head()

Neighborhood,Italian Restaurant,Pizza Place,Café,American Restaurant,Deli / Bodega,Sandwich Place,Mexican Restaurant,Chinese Restaurant,Bakery,French Restaurant
Battery Park City,4,4,0,2,0,2,1,3,1,0
Carnegie Hill,4,8,5,2,1,0,2,1,6,3
Central Harlem,0,3,1,2,3,2,0,4,1,2
Chelsea,5,4,5,4,2,3,4,2,8,6
Chinatown,1,2,3,3,0,3,4,19,7,0
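
The one-hot encoding followed by a grouped sum produces a count of venues per neighborhood and category. As a hedged aside, `pd.crosstab` should give the same table in one call; the sketch below compares the two on a small made-up dataframe (`venues` and `keep` are illustrative names, and the rows are not real data).

```python
import pandas as pd

# Small made-up venue list, only to compare the two approaches.
venues = pd.DataFrame({
    "Neighborhood": ["Marble Hill", "Marble Hill", "Chinatown", "Chinatown", "Chinatown"],
    "Venue Category": ["Pizza Place", "Diner", "Chinese Restaurant",
                       "Chinese Restaurant", "Bakery"],
})

# Approach used in the notebook: one-hot encode, then sum per neighborhood
onehot = pd.get_dummies(venues["Venue Category"])
grouped = onehot.groupby(venues["Neighborhood"]).sum()

# Alternative: count (neighborhood, category) pairs directly
counts = pd.crosstab(venues["Neighborhood"], venues["Venue Category"])

# Keep only the categories of interest, as the notebook does with its top 10
keep = ["Chinese Restaurant", "Pizza Place"]
print(counts[keep])
```

Both tables have one row per neighborhood and one column per category, so selecting the top categories works the same way on either.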


#### Done, it's starting to look pretty useful. The next step is to find the average count of each top 10 category per neighborhood and add it as the last row (called AVG).

In [18]:
# Let's find the average count of each top 10 category per neighborhood, and add it as the last row.
manhattan_grouped.loc['AVG'] = manhattan_grouped.mean()
manhattan_grouped.tail(10)

Neighborhood,Italian Restaurant,Pizza Place,Café,American Restaurant,Deli / Bodega,Sandwich Place,Mexican Restaurant,Chinese Restaurant,Bakery,French Restaurant
Sutton Place,7.0,9.0,1.0,4.0,1.0,1.0,3.0,3.0,1.0,4.0
Tribeca,7.0,1.0,5.0,7.0,4.0,3.0,1.0,2.0,3.0,2.0
Tudor City,2.0,5.0,7.0,3.0,9.0,3.0,5.0,2.0,0.0,1.0
Turtle Bay,10.0,3.0,6.0,3.0,9.0,3.0,1.0,0.0,0.0,3.0
Upper East Side,15.0,5.0,2.0,6.0,4.0,1.0,2.0,2.0,3.0,4.0
Upper West Side,7.0,3.0,2.0,2.0,0.0,0.0,2.0,1.0,3.0,2.0
Washington Heights,2.0,9.0,3.0,1.0,7.0,3.0,4.0,6.0,4.0,0.0
West Village,17.0,4.0,2.0,9.0,0.0,2.0,4.0,2.0,1.0,4.0
Yorkville,10.0,10.0,2.0,2.0,9.0,4.0,3.0,3.0,3.0,1.0
AVG,6.175,4.225,3.575,3.525,3.275,2.975,2.75,2.75,2.65,2.25


## 5) Based on the average number of venues of each type, score each food-venue type for each neighborhood

Now we are getting to the recommendation part. I'll subtract the actual number of food places in a category from the Manhattan average for that category, which gives each food category a score for each neighborhood. <b>For example</b>, if the average number of pizza places is 4 and a neighborhood has only 2, the pizza-place score for this neighborhood is +2. If a neighborhood already has 8 pizza places, the score is -4, meaning 'Don't open a pizza place here, there are plenty'.
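
As a small worked example of this scoring rule, the sketch below applies it to the Pizza Place column only, using the counts and the average (4.225) from the tables above. The names `pizza_counts`, `manhattan_avg` and `pizza_score` are illustrative and not part of the notebook.

```python
import pandas as pd

# Pizza Place counts for three neighborhoods, taken from the grouped table
pizza_counts = pd.Series(
    {"Battery Park City": 4, "Carnegie Hill": 8, "Central Harlem": 3},
    name="Pizza Place",
)

manhattan_avg = 4.225  # Pizza Place value in the AVG row

# score = average - actual count: positive means fewer pizza places than average
pizza_score = (manhattan_avg - pizza_counts).round(3)
print(pizza_score)
```

A neighborhood with 8 pizza places lands well below zero, while one with 3 gets a positive score.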

In [19]:
# Subtract the average [AVG] from every neighborhood row, then flip the sign so that
# under-served categories get positive points and crowded categories get negative points
df = manhattan_grouped.drop(index='AVG').subtract(manhattan_grouped.loc['AVG'])
df = df * -1
df.head(10)

Neighborhood,Italian Restaurant,Pizza Place,Café,American Restaurant,Deli / Bodega,Sandwich Place,Mexican Restaurant,Chinese Restaurant,Bakery,French Restaurant
Battery Park City,2.175,0.225,3.575,1.525,3.275,0.975,1.75,-0.25,1.65,2.25
Carnegie Hill,2.175,-3.775,-1.425,1.525,2.275,2.975,0.75,1.75,-3.35,-0.75
Central Harlem,6.175,1.225,2.575,1.525,0.275,0.975,2.75,-1.25,1.65,0.25
Chelsea,1.175,0.225,-1.425,-0.475,1.275,-0.025,-1.25,0.75,-5.35,-3.75
Chinatown,5.175,2.225,0.575,0.525,3.275,-0.025,-1.25,-16.25,-4.35,2.25
Civic Center,-5.825,1.225,-1.425,-2.475,0.275,-5.025,-2.25,1.75,-1.35,-3.75
Clinton,-3.825,1.225,-0.425,-4.475,-4.725,-2.025,-0.25,-2.25,1.65,0.25
East Harlem,6.175,-1.775,1.575,3.525,-2.725,0.975,-4.25,1.75,-2.35,1.25
East Village,1.175,-4.775,0.575,1.525,0.275,2.975,-2.25,-3.25,1.65,-1.75
Financial District,0.175,-1.775,-2.425,-3.475,-0.725,-5.025,-2.25,1.75,0.65,1.25


It works! It looks like it would be a good idea to look into opening an Italian restaurant in Central Harlem, and a less good idea to open a Chinese restaurant in Chinatown or a bakery in Chelsea. It would be nice if the data were easier to read, though.

#### Highlight the results with the pandas style function:

In [20]:
def highlight_max(s):
    '''
    highlight the maximum in a Series yellow.
    '''
    is_max = s == s.max()
    return ['background-color: yellow' if v else '' for v in is_max]

In [21]:
df.style.apply(highlight_max)

Neighborhood,Italian Restaurant,Pizza Place,Café,American Restaurant,Deli / Bodega,Sandwich Place,Mexican Restaurant,Chinese Restaurant,Bakery,French Restaurant
Battery Park City,2.175,0.225,3.575,1.525,3.275,0.975,1.75,-0.25,1.65,2.25
Carnegie Hill,2.175,-3.775,-1.425,1.525,2.275,2.975,0.75,1.75,-3.35,-0.75
Central Harlem,6.175,1.225,2.575,1.525,0.275,0.975,2.75,-1.25,1.65,0.25
Chelsea,1.175,0.225,-1.425,-0.475,1.275,-0.025,-1.25,0.75,-5.35,-3.75
Chinatown,5.175,2.225,0.575,0.525,3.275,-0.025,-1.25,-16.25,-4.35,2.25
Civic Center,-5.825,1.225,-1.425,-2.475,0.275,-5.025,-2.25,1.75,-1.35,-3.75
Clinton,-3.825,1.225,-0.425,-4.475,-4.725,-2.025,-0.25,-2.25,1.65,0.25
East Harlem,6.175,-1.775,1.575,3.525,-2.725,0.975,-4.25,1.75,-2.35,1.25
East Village,1.175,-4.775,0.575,1.525,0.275,2.975,-2.25,-3.25,1.65,-1.75
Financial District,0.175,-1.775,-2.425,-3.475,-0.725,-5.025,-2.25,1.75,0.65,1.25
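
With its default axis, `df.style.apply(highlight_max)` marks the highest value in each column, that is, the best neighborhood for each category. To read the table the other way round (the best category for each neighborhood), one option is `idxmax(axis=1)`. The sketch below assumes a small excerpt of the score table with three columns and three rows copied from the output above; `scores` and `best` are illustrative names.

```python
import pandas as pd

# Excerpt of the score table above (three categories, three neighborhoods)
scores = pd.DataFrame(
    {"Italian Restaurant": [2.175, 5.175, -5.825],
     "Café": [3.575, 0.575, -1.425],
     "Chinese Restaurant": [-0.25, -16.25, 1.75]},
    index=["Battery Park City", "Chinatown", "Civic Center"],
)

# For every neighborhood (row), pick the category with the highest score
best = pd.DataFrame({
    "Best category": scores.idxmax(axis=1),
    "Score": scores.max(axis=1),
})
print(best)
```

The same row-wise view should also be available in the styled table by passing `axis=1` to `df.style.apply(highlight_max, axis=1)`.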


## Result

The dataframe above is the result of the exercise. Jack's wish was a recommendation for each neighborhood in Manhattan, based on the existing food places in the neighborhood. The resulting dataframe provides that: it scores each food category in each neighborhood and highlights the highest score in each category. The category that scores the highest is "Italian Restaurant", with a value of 6.175 in 6 neighborhoods, so I would recommend that Jack start looking at those first. We also get some strong advice <b>not</b> to open an Italian restaurant in West Village (negative 10.8 points).

## Discussion

Although the results do satisfy the business problem, there are ways to improve them. We could, for example, add the population of each neighborhood, to check whether enough people live there relative to the number of restaurants. Also, the model is based only on existing competition, so it would not recommend opening a Chinese restaurant in Chinatown but would instead recommend opening an Italian restaurant there. This might or might not be an issue with the model; it's hard to say at this point, and it would require more research.

## Conclusion

Jack's wish was a recommendation for each neighborhood in Manhattan, based on the existing food places in the neighborhood. The resulting dataframe provides that: it highlights the restaurant types that score the highest, are among the ten most common food categories in Manhattan, and do not have too much competition in the neighborhood.