# Capstone Project - Finding an optimal location for a restaurant

## Table of contents
* [Introduction: Background & Problem Description](#introduction)
* [Data Preparation](#Data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Background & Problem Description <a name="introduction"></a>

**New York City**, the most populous city in the United States and one of the world's great metropolises, is a dream destination for gourmets in search of fine cuisine. Its food culture includes an array of international cuisines influenced by the city's immigrant history. Central and Eastern European immigrants, especially Jewish immigrants from those regions, brought bagels, cheesecake, hot dogs, knishes, and delicatessens (or delis) to the city. Italian immigrants brought New York-style pizza and Italian cuisine to the city, while Jewish and Irish immigrants brought pastrami and corned beef, respectively. Chinese and other Asian restaurants, sandwich joints, trattorias, diners, and coffeehouses are ubiquitous throughout the city. Some 4,000 mobile food vendors licensed by the city, many immigrant-owned, have made Middle Eastern foods such as falafel and kebabs examples of modern New York street food. The city is home to "nearly one thousand of the finest and most diverse haute cuisine restaurants in the world," according to Michelin. As of 2019, there were 27,043 restaurants in the city, up from 24,865 in 2017[1].

In this project, we will try to find an optimal location for a restaurant. The strengths of each candidate area will be clearly described so that stakeholders can choose the best final location. Because New York has so many restaurants, we will look for places that are not already crowded with them. We are particularly interested in areas with no Chinese restaurants nearby. Provided these two conditions are met, we also want to be as close to the city center as possible.

## Data Preparation <a name="Data"></a>

The data used in the analysis were obtained and prepared as follows:
* Load the neighborhood location data from 'newyork_data.json' (via IBM Watson Studio).
* Clean the data and reduce it to the neighborhoods of each NYC borough, with geographic coordinates for the venue analysis.
* Use the Nominatim geocoder (geopy) to get the coordinates of the center of New York City (Manhattan).
* Use the Foursquare API to get the number, type, and location of restaurants in every neighborhood.

In [12]:
import numpy as np 
import pandas as pd 
import json
#!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim 
import requests 
from pandas import json_normalize  # available at the top level since pandas 1.0
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
print('Libraries imported.')

Libraries imported.


In [13]:
address = 'Manhattan, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Manhattan are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Manhattan are 40.7896239, -73.9598939.


In [14]:
!wget -q -O 'newyork_data.json' https://cocl.us/new_york_dataset
print('Data downloaded!')

Data downloaded!


In [15]:
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)

In [16]:
# define the dataframe columns
neighborhoods_data = newyork_data['features']
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude']

# collect one row per neighborhood, then build the dataframe in a single step
# (DataFrame.append was removed in pandas 2.0)
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})
neighborhoods = pd.DataFrame(rows, columns=column_names)
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)
neighborhoods.head()

The dataframe has 5 boroughs and 306 neighborhoods.


Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


Manhattan is often described as the economic and cultural center of the United States. It is the seat of New York City's central business district, home to many well-known companies, often called an economic center of the world, and the wealthiest borough in New York.

Manhattan is a long, narrow island, divided from north to south into Uptown, Midtown, and Downtown. It is home to landmarks such as the Empire State Building, the Chrysler Building, Rockefeller Center, Madison Square Garden, the Metropolitan Life Insurance Building, Lincoln Center for the Performing Arts, and the United Nations headquarters, and it has one of the world's largest concentrations of skyscrapers.

We therefore chose Manhattan as the area of interest for the analysis.

In [17]:
manhattan_data = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
print('manhattan has {} neighborhoods.'.format(manhattan_data.shape[0]))
manhattan_data.head()

manhattan has 40 neighborhoods.


Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


In [None]:
!conda install -c conda-forge pyproj

In [21]:
#!conda install -c conda-forge pyproj
import pyproj
import math
def lonlat_to_xy(lng, lat):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    xy = pyproj.transform(proj_latlon, proj_xy, lng, lat)
    return xy[0], xy[1]
def xy_to_lonlat(x, y):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)
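
One caveat on the helpers above: `zone=33` is the UTM zone that covers central Europe, whereas New York City lies in zone 18, so the projected coordinates and the distances derived from them may be noticeably distorted. As a rough cross-check that needs no projection, the sketch below computes great-circle distances with the haversine formula. The function name `haversine_m` and the spherical Earth radius of 6,371 km are assumptions made for this illustration and are not part of the original notebook.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumes a spherical Earth)

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Geocoded center of Manhattan and the Marble Hill neighborhood center from the data above
center = (40.7896239, -73.9598939)
marble_hill = (40.876551, -73.91066)
marble_hill_distance = haversine_m(*center, *marble_hill)
print('Marble Hill to center: {:.0f} m'.format(marble_hill_distance))
```

For Marble Hill this gives roughly 10.5 km, which can be compared with the 'Distance from center' value that the projected coordinates produce for the same neighborhood.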


In [22]:
xs = []
ys = []
latitudes=manhattan_data['Latitude']
longitudes=manhattan_data['Longitude']
for lat,lng in zip(latitudes,longitudes):
    x,y=lonlat_to_xy(lng,lat)
    xs.append(x)
    ys.append(y)

In [23]:
manhattan_center_x, manhattan_center_y = lonlat_to_xy(longitude,latitude) 
distance_from_centers=[]
for x,y in zip(xs,ys):
    distance_from_center = calc_xy_distance(manhattan_center_x, manhattan_center_y, x, y)
    distance_from_centers.append(distance_from_center)

manhattan_data['X'] = xs
manhattan_data['Y'] = ys
manhattan_data['Distance from center'] = distance_from_centers

manhattan_data.head(10)

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,X,Y,Distance from center
0,Manhattan,Marble Hill,40.876551,-73.91066,-5794205.0,9858099.0,16020.661541
1,Manhattan,Chinatown,40.715618,-73.994279,-5821760.0,9868103.0,13309.592195
2,Manhattan,Washington Heights,40.851903,-73.9369,-5798470.0,9861349.0,10953.332666
3,Manhattan,Inwood,40.867684,-73.92121,-5795743.0,9859410.0,14122.05588
4,Manhattan,Hamilton Heights,40.823604,-73.949688,-5803305.0,9862859.0,5903.904961
5,Manhattan,Manhattanville,40.816934,-73.957385,-5804461.0,9863817.0,4637.67885
6,Manhattan,Central Harlem,40.815976,-73.943211,-5804573.0,9861989.0,4953.789819
7,Manhattan,East Harlem,40.792249,-73.944182,-5808594.0,9862002.0,2071.67311
8,Manhattan,Upper East Side,40.775639,-73.960508,-5811466.0,9864025.0,2371.412055
9,Manhattan,Yorkville,40.77593,-73.947118,-5811369.0,9862302.0,2845.043858


* Let's visualize Manhattan and the neighborhoods in it.

In [None]:
!conda install -c conda-forge folium

In [24]:
#!conda install -c conda-forge folium
import folium

# create map of Manhattan using latitude and longitude values
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)   
    folium.Circle(
        [lat, lng],
        radius=500,
        popup=label,
        color='blue',
        fill=False,parse_html=False).add_to(map_manhattan)  
    
map_manhattan

### Foursquare

Now that we have our location candidates, let's use the Foursquare API to get information on the restaurants in each neighborhood.

In [25]:
import os

# Read the Foursquare credentials from environment variables instead of hard-coding
# them in the notebook (the variable names below are a convention chosen here)
client_id = os.environ.get('FOURSQUARE_CLIENT_ID', '')
client_secret = os.environ.get('FOURSQUARE_CLIENT_SECRET', '')
version = '20180724'

print('Credentials loaded.')

Credentials loaded.


In [26]:
food_category = '4d4b7105d754a06374d81259' # 'Root' category for all food-related venues

chinese_restaurant_categories = ['4bf58dd8d48988d145941735','52af3a5e3cf9994f4e043bea','52af3a723cf9994f4e043bec','52af3a7c3cf9994f4e043bed','58daa1558bbb0b01f18ec1d3',
                                 '52af3a673cf9994f4e043beb','52af3a903cf9994f4e043bee','4bf58dd8d48988d1f5931735','52af3a9f3cf9994f4e043bef','52af3aaa3cf9994f4e043bf0',
                                 '52af3ab53cf9994f4e043bf1','52af3abe3cf9994f4e043bf2','52af3ac83cf9994f4e043bf3','52af3ad23cf9994f4e043bf4','52af3add3cf9994f4e043bf5',
                                 '52af3af23cf9994f4e043bf7','52af3ae63cf9994f4e043bf6','52af3afc3cf9994f4e043bf8','52af3b053cf9994f4e043bf9','52af3b213cf9994f4e043bfa',
                                 '52af3b293cf9994f4e043bfb','52af3b343cf9994f4e043bfc','52af3b3b3cf9994f4e043bfd','52af3b463cf9994f4e043bfe','52af3b633cf9994f4e043c01',
                                 '52af3b513cf9994f4e043bff','52af3b593cf9994f4e043c00','52af3b6e3cf9994f4e043c02','52af3b773cf9994f4e043c03','52af3b813cf9994f4e043c04',
                                 '52af3b893cf9994f4e043c05','52af3b913cf9994f4e043c06','52af3b9a3cf9994f4e043c07','52af3ba23cf9994f4e043c08']

def is_restaurant(categories, specific_filter=None):
    restaurant_words = ['restaurant', 'food', 'chinese']
    restaurant = False
    specific = False
    for c in categories:
        category_name = c[0].lower()
        category_id = c[1]
        for r in restaurant_words:
            if r in category_name:
                restaurant = True
        if 'fast food' in category_name:
            restaurant = False
        if not(specific_filter is None) and (category_id in specific_filter):
            specific = True
            restaurant = True
    return restaurant, specific

def get_categories(categories):
    return [(cat['name'], cat['id']) for cat in categories]

def format_address(location):
    address = ', '.join(location['formattedAddress'])
    address = address.replace(', United States', '')
    return address

def get_venues_near_location(lat, lng, category, client_id, client_secret, radius=500, limit=100):
    version = '20180724'
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
        client_id, client_secret, version, lat, lng, category, radius, limit)
    try:
        results = requests.get(url).json()['response']['groups'][0]['items']
        venues = [(item['venue']['id'],
                   item['venue']['name'],
                   get_categories(item['venue']['categories']),
                   (item['venue']['location']['lat'], item['venue']['location']['lng']),
                   format_address(item['venue']['location']),
                   item['venue']['location']['distance']) for item in results]
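    except (requests.RequestException, KeyError, IndexError, ValueError):
        # the request failed or the response lacked the expected fields
        venues = []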
    return venues
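
To make the filtering rules in `is_restaurant` concrete, here is a small self-contained sketch that restates the function and applies it to a few hand-made category lists. The sample category names and the `hypothetical-id-*` values are invented for illustration; the only real ID is the first entry of `chinese_restaurant_categories` above.

```python
def is_restaurant(categories, specific_filter=None):
    """Return (is_restaurant, matches_specific_filter) for a list of (name, id) pairs."""
    restaurant_words = ['restaurant', 'food', 'chinese']
    restaurant = False
    specific = False
    for name, category_id in categories:
        category_name = name.lower()
        if any(word in category_name for word in restaurant_words):
            restaurant = True
        # fast food venues are excluded unless they match the specific filter below
        if 'fast food' in category_name:
            restaurant = False
        if specific_filter is not None and category_id in specific_filter:
            specific = True
            restaurant = True
    return restaurant, specific

chinese_ids = ['4bf58dd8d48988d145941735']  # first ID of chinese_restaurant_categories

# Hypothetical venues, each with a single (category name, category id) pair
samples = {
    'chinese': [('Chinese Restaurant', '4bf58dd8d48988d145941735')],
    'italian': [('Italian Restaurant', 'hypothetical-id-1')],
    'fast_food': [('Fast Food Restaurant', 'hypothetical-id-2')],
    'cafe': [('Café', 'hypothetical-id-3')],
}
results = {label: is_restaurant(cats, specific_filter=chinese_ids)
           for label, cats in samples.items()}
for label, flags in results.items():
    print(label, flags)
```

In this sketch only the first two venues count as restaurants, and only the first is flagged as Chinese; the fast food venue is rejected even though its category name contains 'restaurant'.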

In [27]:
import pickle

def get_restaurants(lats, lngs):
    restaurants = {}
    chinese_restaurants = {}
    location_restaurants = []

    print('Obtaining venues around candidate locations:', end='')
    for lat, lng in zip(lats, lngs):
        venues = get_venues_near_location(lat, lng, food_category, client_id, client_secret, radius=350, limit=100)
        area_restaurants = []
        for venue in venues:
            venue_id = venue[0]
            venue_name = venue[1]
            venue_categories = venue[2]
            venue_latlon = venue[3]
            venue_address = venue[4]
            venue_distance = venue[5]
            is_res, is_chinese = is_restaurant(venue_categories, specific_filter=chinese_restaurant_categories)
            if is_res:
                x, y = lonlat_to_xy(venue_latlon[1], venue_latlon[0])
                restaurant = (venue_id, venue_name, venue_latlon[0], venue_latlon[1], venue_address, venue_distance, is_chinese, x, y)
                restaurants[venue_id] = restaurant
                if venue_distance<=300:
                    area_restaurants.append(restaurant)
                if is_chinese:
                    chinese_restaurants[venue_id] = restaurant
        location_restaurants.append(area_restaurants)
        print(' .', end='')
    print(' done.')
    return restaurants, chinese_restaurants, location_restaurants

restaurants, chinese_restaurants, location_restaurants = get_restaurants(manhattan_data['Latitude'], manhattan_data['Longitude'])      

Obtaining venues around candidate locations: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.


In [28]:
import numpy as np

print('Total number of restaurants:', len(restaurants))
print('Total number of chinese restaurants:', len(chinese_restaurants))
print('Percentage of chinese restaurants: {:.2f}%'.format(len(chinese_restaurants) / len(restaurants) * 100))
print('Average number of restaurants in neighborhood:', np.array([len(r) for r in location_restaurants]).mean())

Total number of restaurants: 1128
Total number of chinese restaurants: 93
Percentage of chinese restaurants: 8.24%
Average number of restaurants in neighborhood: 20.875


In [29]:
print('List of chinese restaurants')
print('---------------------------')
for r in list(chinese_restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(chinese_restaurants))

List of chinese restaurants
---------------------------
('4b7f2f48f964a520561d30e3', 'China Wang', 40.87434742150794, -73.91054009348214, '109 W 225th St, Bronx, NY 10463', 245, True, -5794577.1341078505, 9858073.22516032)
('4db3374590a0843f295fb69b', 'Spicy Village', 40.71701, -73.99353, '68 Forsyth St Frnt B (btwn Grand & Hester St), New York, NY 10002', 167, True, -5821521.342536928, 9868012.52110206)
('4a96bf8ff964a520ce2620e3', 'Wah Fung Number 1 Fast Food 華豐快飯店', 40.71727831655619, -73.99417731304892, '79 Chrystie St (btwn Hester St & Grand St), New York, NY 10002', 348, True, -5821478.065359891, 9868097.287691277)
('5894c9a15e56b417cf79e553', "Xi'an Famous Foods", 40.715231941715004, -73.99726288220869, '45 Bayard St (Bowery), New York, NY 10013', 255, True, -5821835.849460265, 9868486.268862609)
('5ded51eaf492de00080966ed', "Joe's Shanghai 鹿嗚春", 40.71566097685305, -73.99669268189086, '46 Bowery, New York, NY 10013', 203, True, -5821761.103620904, 9868414.612930615)
('3fd66200f9

Let's now show all the collected restaurants in our area of interest on a map, with Chinese restaurants in a different color.

In [30]:
map_manhattan1 = folium.Map(location=[latitude, longitude], zoom_start=13)
folium.Marker([latitude, longitude], popup='manhattan').add_to(map_manhattan1)
for res in restaurants.values():
    lat = res[2]; lon = res[3]
    is_chinese = res[6]
    color = 'red' if is_chinese else 'blue'
    folium.CircleMarker([lat, lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_manhattan1)
map_manhattan1

Now we have all the restaurants in the area, and we know which ones are Chinese restaurants. We also know exactly which restaurants are in the vicinity of each candidate neighborhood center.

## Methodology <a name="methodology"></a>

In this project, we look for areas of Manhattan with a low density of restaurants, especially areas with few Chinese restaurants.

First, we collected the required data: the location and type (category) of each restaurant in Manhattan. We also identified all Chinese restaurants (according to the Foursquare categorization).

Second, we calculate the density of restaurants in different parts of Manhattan. We use heat maps to identify promising areas near the center with few restaurants and no Chinese restaurants nearby.

Third, we focus on the most promising areas and look for sites within them that meet some basic requirements: no more than two restaurants within a radius of 250 meters and, ideally, no Chinese restaurants within a radius of 400 meters. We use maps to show all locations.
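
The site requirements in the third step amount to counting venues within a given radius of a candidate point. A minimal sketch follows, assuming planar coordinates in meters. The helper names `count_within` and `meets_criteria` and all sample coordinates are hypothetical and are not taken from the notebook.

```python
import math

def count_within(x, y, points, radius):
    """Count the points (px, py) that lie within `radius` meters of (x, y)."""
    return sum(1 for px, py in points if math.hypot(px - x, py - y) <= radius)

def meets_criteria(x, y, restaurants_xy, chinese_xy, max_restaurants=2,
                   restaurant_radius=250, chinese_radius=400):
    """True if at most `max_restaurants` restaurants lie within `restaurant_radius`
    and no Chinese restaurant lies within `chinese_radius`."""
    few_restaurants = count_within(x, y, restaurants_xy, restaurant_radius) <= max_restaurants
    no_chinese = count_within(x, y, chinese_xy, chinese_radius) == 0
    return few_restaurants and no_chinese

# Hypothetical planar coordinates in meters, for illustration only
restaurants_xy = [(100, 0), (0, 200), (300, 300), (50, 50), (900, 900)]
chinese_xy = [(300, 300)]

site_a = meets_criteria(1500, 1500, restaurants_xy, chinese_xy)  # isolated site
site_b = meets_criteria(0, 0, restaurants_xy, chinese_xy)        # three restaurants within 250 m
print(site_a, site_b)
```

Here the isolated site passes both checks, while the site at the origin fails because three restaurants fall inside its 250 meter radius.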

## Analysis <a name="analysis"></a>

First, let's calculate the number of restaurants in each candidate region:

In [31]:
location_restaurants_count= [len(res) for res in location_restaurants]
manhattan_center_x, manhattan_center_y = lonlat_to_xy(longitude,latitude) 
distance_from_centers=[]
for x,y in zip(xs,ys):
    distance_from_center = calc_xy_distance(manhattan_center_x, manhattan_center_y, x, y)
    distance_from_centers.append(distance_from_center)
manhattan_data['X'] = xs
manhattan_data['Y'] = ys
manhattan_data['Distance from center'] = distance_from_centers
manhattan_data['Restaurants in area'] = location_restaurants_count

print('Average number of restaurants in every area with radius=300m:', np.array(location_restaurants_count).mean())

manhattan_data.head(10)

Average number of restaurants in every area with radius=300m: 20.875


Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,X,Y,Distance from center,Restaurants in area
0,Manhattan,Marble Hill,40.876551,-73.91066,-5794205.0,9858099.0,16020.661541,3
1,Manhattan,Chinatown,40.715618,-73.994279,-5821760.0,9868103.0,13309.592195,40
2,Manhattan,Washington Heights,40.851903,-73.9369,-5798470.0,9861349.0,10953.332666,16
3,Manhattan,Inwood,40.867684,-73.92121,-5795743.0,9859410.0,14122.05588,15
4,Manhattan,Hamilton Heights,40.823604,-73.949688,-5803305.0,9862859.0,5903.904961,20
5,Manhattan,Manhattanville,40.816934,-73.957385,-5804461.0,9863817.0,4637.67885,7
6,Manhattan,Central Harlem,40.815976,-73.943211,-5804573.0,9861989.0,4953.789819,10
7,Manhattan,East Harlem,40.792249,-73.944182,-5808594.0,9862002.0,2071.67311,8
8,Manhattan,Upper East Side,40.775639,-73.960508,-5811466.0,9864025.0,2371.412055,13
9,Manhattan,Yorkville,40.77593,-73.947118,-5811369.0,9862302.0,2845.043858,13


In [32]:
distances_to_chinese_restaurant = []

for area_x, area_y in zip(xs, ys):
    min_distance = 10000
    for res in chinese_restaurants.values():
        res_x = res[7]
        res_y = res[8]
        d = calc_xy_distance(area_x, area_y, res_x, res_y)
        if d<min_distance:
            min_distance = d
    distances_to_chinese_restaurant.append(min_distance)

manhattan_data['Distance to Chinese restaurant'] = distances_to_chinese_restaurant
manhattan_data.head(10)

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,X,Y,Distance from center,Restaurants in area,Distance to Chinese restaurant
0,Manhattan,Marble Hill,40.876551,-73.91066,-5794205.0,9858099.0,16020.661541,3,373.027265
1,Manhattan,Chinatown,40.715618,-73.994279,-5821760.0,9868103.0,13309.592195,40,255.197513
2,Manhattan,Washington Heights,40.851903,-73.9369,-5798470.0,9861349.0,10953.332666,16,255.971806
3,Manhattan,Inwood,40.867684,-73.92121,-5795743.0,9859410.0,14122.05588,15,52.482328
4,Manhattan,Hamilton Heights,40.823604,-73.949688,-5803305.0,9862859.0,5903.904961,20,274.653363
5,Manhattan,Manhattanville,40.816934,-73.957385,-5804461.0,9863817.0,4637.67885,7,162.870594
6,Manhattan,Central Harlem,40.815976,-73.943211,-5804573.0,9861989.0,4953.789819,10,482.295374
7,Manhattan,East Harlem,40.792249,-73.944182,-5808594.0,9862002.0,2071.67311,8,2322.160621
8,Manhattan,Upper East Side,40.775639,-73.960508,-5811466.0,9864025.0,2371.412055,13,473.75047
9,Manhattan,Yorkville,40.77593,-73.947118,-5811369.0,9862302.0,2845.043858,13,348.758893


In [33]:
print('Average distance to closest Chinese restaurant from each area center:', manhattan_data['Distance to Chinese restaurant'].mean())

Average distance to closest Chinese restaurant from each area center: 453.2523954185277
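
The nearest-restaurant loop above starts from a sentinel of 10,000, so any area farther than that from every Chinese restaurant would be reported as exactly 10,000. A minimal alternative sketch uses `min` with an infinite default instead; the helper name `nearest_distance` and the sample coordinates are hypothetical.

```python
import math

def nearest_distance(x, y, points):
    """Distance from (x, y) to the closest point, or math.inf if `points` is empty."""
    return min((math.hypot(px - x, py - y) for px, py in points), default=math.inf)

# Hypothetical planar coordinates in meters, for illustration only
chinese_xy = [(300.0, 400.0), (1000.0, 0.0)]
d_origin = nearest_distance(0.0, 0.0, chinese_xy)
d_empty = nearest_distance(0.0, 0.0, [])
print(d_origin, d_empty)
```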


Let's create a map that shows the "heat map/restaurant density" and then try to extract some meaningful information from it.

In [34]:
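# NOTE: this URL points to a GeoJSON of Berlin district boundaries, which will not
# overlay on a Manhattan map; swap in a GeoJSON of New York borough boundaries.
manhattan_boroughs_url = 'https://raw.githubusercontent.com/m-hoerz/berlin-shapes/master/berliner-bezirke.geojson'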
manhattan_boroughs = requests.get(manhattan_boroughs_url).json()

def boroughs_style(feature):
    return { 'color': 'blue', 'fill': False }

In [35]:
restaurant_latlons = [[res[2], res[3]] for res in restaurants.values()]

chinese_latlons = [[res[2], res[3]] for res in chinese_restaurants.values()]

In [36]:
from folium import plugins
from folium.plugins import HeatMap

map_manhattan2 = folium.Map(location=[latitude, longitude], zoom_start=12)
folium.TileLayer('cartodbpositron').add_to(map_manhattan2) #cartodbpositron cartodbdark_matter
HeatMap(restaurant_latlons).add_to(map_manhattan2)
folium.GeoJson(manhattan_boroughs, style_function=boroughs_style, name='geojson').add_to(map_manhattan2)
map_manhattan2

West of Central Park, it seems possible to find places with a low density of restaurants.

Let's create another heat map showing the density of Chinese restaurants only.

In [37]:
map_manhattan3 = folium.Map(location=[latitude,longitude], zoom_start=12)
folium.TileLayer('cartodbpositron').add_to(map_manhattan3) #cartodbpositron cartodbdark_matter
HeatMap(chinese_latlons).add_to(map_manhattan3)
folium.GeoJson(manhattan_boroughs, style_function=boroughs_style, name='geojson').add_to(map_manhattan3)
map_manhattan3

Now we can see more clearly where there are few restaurants and no Chinese restaurants nearby. As the map above shows, the density of Chinese restaurants is lower west of Central Park. We therefore focus on the neighborhoods 'Upper West Side', 'Manhattan Valley', and 'Morningside Heights'.

Now let's collect these neighborhoods in one table. Their centers will be the end result of our analysis.

In [38]:
good_location1 = manhattan_data[manhattan_data['Neighborhood']=='Upper West Side']
good_location2 = manhattan_data[manhattan_data['Neighborhood']=='Morningside Heights']
good_location3 = manhattan_data[manhattan_data['Neighborhood']=='Manhattan Valley']
good_location=pd.concat([good_location1, good_location2,good_location3]).reset_index(drop=True)
good_location.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,X,Y,Distance from center,Restaurants in area,Distance to Chinese restaurant
0,Manhattan,Upper West Side,40.787658,-73.977059,-5809488.0,9866213.0,2235.693968,15,90.35919
1,Manhattan,Morningside Heights,40.808,-73.963896,-5805997.0,9864613.0,3155.515994,6,406.208031
2,Manhattan,Manhattan Valley,40.797307,-73.964286,-5807809.0,9864613.0,1419.342078,7,1464.039418


In [40]:
map_g = folium.Map(location=[latitude, longitude], zoom_start=14)
folium.Marker([40.787658, -73.977059], popup='Upper West Side').add_to(map_g)
folium.Marker([40.808000, -73.963896], popup='Morningside Heights').add_to(map_g)
folium.Marker([40.797307, -73.964286], popup='Manhattan Valley').add_to(map_g)
folium.Circle([latitude, longitude], radius=1000, color='white',fill=True, fill_opacity=0.4).add_to(map_g)
folium.Circle([40.787658, -73.977059], radius=400,color='green',fill=False).add_to(map_g)
folium.Circle([40.808000, -73.963896], radius=400,color='green',fill=False).add_to(map_g)
folium.Circle([40.797307, -73.964286], radius=400,color='green',fill=True, fill_opacity=0.25).add_to(map_g)
for res in restaurants.values():
    lat = res[2]; lon = res[3]
    is_chinese = res[6]
    color = 'red' if is_chinese else 'blue'
    folium.CircleMarker([lat, lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_g)

map_g

Manhattan Valley is the closest to the city center, and there's no Chinese restaurant within 400 metres.
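
The comparison can be reproduced from the `good_location` table with a short sketch: drop candidates that have a Chinese restaurant within 400 meters, then order the rest by distance from the center. The figures are copied as reported in the table; the tuple layout and the names `candidates`, `shortlist`, and `ranked_names` are assumptions for this illustration.

```python
# Values copied from the good_location table above:
# (neighborhood, distance from center [m], restaurants in area,
#  distance to nearest Chinese restaurant [m])
candidates = [
    ('Upper West Side', 2235.693968, 15, 90.35919),
    ('Morningside Heights', 3155.515994, 6, 406.208031),
    ('Manhattan Valley', 1419.342078, 7, 1464.039418),
]

MIN_CHINESE_DISTANCE_M = 400  # threshold from the Methodology section

# Keep candidates with no Chinese restaurant within the threshold,
# then order them by distance from the center (closest first)
shortlist = sorted(
    (c for c in candidates if c[3] >= MIN_CHINESE_DISTANCE_M),
    key=lambda c: c[1],
)
ranked_names = [name for name, *_ in shortlist]
print(ranked_names)
```

Upper West Side is dropped because its nearest Chinese restaurant is about 90 meters away, and Manhattan Valley ranks ahead of Morningside Heights on distance from the center.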

## Results and Discussion <a name="results"></a>

The analysis shows that although Manhattan has a large number of restaurants (1,128 in our Foursquare sample), there are some low-density areas close to the center. Restaurant density is highest in southern Manhattan, so we focused on the lower-density northwest and chose the neighborhoods 'Upper West Side', 'Manhattan Valley', and 'Morningside Heights', which are popular with tourists, close to the center, and socioeconomically dynamic.

From the maps in this report, we conclude that Manhattan Valley combines a low restaurant density with the shortest distance to the center, and it has no Chinese restaurants within 400 meters. An optimal location can likely be found in this area. However, the proposed area should be treated only as a starting point for more detailed analysis; the final choice of site should take into account not only competition but also other factors.

## Conclusion <a name="conclusion"></a>

The objective of this project was to identify areas of Manhattan with few restaurants, and in particular few Chinese restaurants, to help stakeholders narrow down the search for the best location for a new restaurant. By calculating the restaurant density distribution from Foursquare data, we first surveyed all neighborhoods, then identified the target area, and finally found a set of locations that meet the basic requirements regarding existing nearby restaurants. The final restaurant location will be chosen by stakeholders based on the characteristics of the specific communities in each recommended area, taking additional factors into account.