# Capstone Project - The Battle of the Neighborhoods (Week 2)
### Applied Data Science Capstone by IBM/Coursera


## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

In this project I will provide a geographic overview of East Asian restaurants in **Manhattan, New York**. Specifically, this report is aimed at residents of, and tourists to, Manhattan who are interested in **Japanese, Chinese and Korean** food.

This report should also be helpful to stakeholders interested in **opening an East Asian restaurant** in Manhattan, who can use it to find neighborhoods that are not already crowded with such restaurants or that have none yet.

Since there are many restaurants in Manhattan, we will first show the most common restaurant types in each neighborhood. Then we will examine the **distribution of the East Asian restaurants** mentioned above in particular.

Data science techniques will be used to find the neighborhoods with the most East Asian restaurants, so that **Manhattan residents as well as tourists** can choose where to go.

## Data <a name="data"></a>

The following information will be needed to address the above goals:
* number of existing restaurants in Manhattan and in each of its neighborhoods
* distribution of Japanese, Chinese and Korean restaurants among all neighborhoods
* clusters of neighborhoods crowded with the above restaurants

The following data sources will be needed to extract/generate the required information:
* the number of restaurants in every neighborhood, with their type and location, will be obtained using the **Foursquare API**
* a dataset that contains the Manhattan neighborhoods as well as the latitude and longitude coordinates of each neighborhood. This dataset is freely available on the web: https://geo.nyu.edu/catalog/nyu_2451_34572


### Neighborhood Candidates

First of all, get the borough, neighborhood name, and latitude and longitude coordinates for the neighborhoods in the Manhattan area.

In [313]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
import json # library to handle JSON files
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import requests # library to handle requests
import folium # map rendering library

# This dataset exists for free on the web. Here is the link: https://cocl.us/new_york_dataset
url = "https://cocl.us/new_york_dataset"
results = requests.get(url).json()
neighborhoods_data = results['features']

Then, place all these data into a Pandas dataframe.

In [314]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude']

# collect one row per neighborhood, then build the dataframe in a single step
# (DataFrame.append was removed in pandas 2.0)
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON coordinates are ordered [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

neighborhoods = pd.DataFrame(rows, columns=column_names)

In [394]:
# We are interested in Manhattan only.
manhattan_data = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data.shape

(40, 4)
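To make the field access in the loop above concrete, the following minimal sketch flattens one GeoJSON-style feature into a row. The sample feature and its coordinate values are made up for illustration (they are not taken from the dataset), and `feature_to_row` is a hypothetical helper name. The detail worth noting is that GeoJSON stores coordinates as `[longitude, latitude]`, which is why the notebook reads index 1 for the latitude.

```python
# Hypothetical sample feature; the values are illustrative, not real data.
sample_feature = {
    "properties": {"borough": "Manhattan", "name": "Example Neighborhood"},
    "geometry": {"type": "Point", "coordinates": [-73.99, 40.72]},
}

def feature_to_row(feature):
    """Flatten one GeoJSON-style feature into a dict with the dataframe's columns."""
    lon, lat = feature["geometry"]["coordinates"][:2]  # GeoJSON order: lon, lat
    return {
        "Borough": feature["properties"]["borough"],
        "Neighborhood": feature["properties"]["name"],
        "Latitude": lat,
        "Longitude": lon,
    }

row = feature_to_row(sample_feature)
print(row)
```

A list of such rows can be handed to `pd.DataFrame` in one call, which avoids growing a dataframe row by row.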

### Foursquare
Now the Foursquare API will be used to get information on the restaurants in each neighborhood.

Foursquare credentials are defined in the hidden cell below.

In [387]:
CLIENT_ID = 'JLGN3RMFHJRUERHN4NDFY0V5DPP3XJVU3LYU1ODNH3EL5C2T' # your Foursquare ID
CLIENT_SECRET = 'A4UVQEIZOI0KIXYDPHDUO1PHQI1QQ3LJTHKTNCC0HO4Q4S5Q' # your Foursquare Secret
VERSION = '20200818' # Foursquare API version
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # define radius
RESTAURANT=1
CHINESE=3
JAPANESE=4
KOREAN=5

# Category IDs corresponding to restaurants were taken from Foursquare web site (https://developer.foursquare.com/docs/resources/categories):
food_category = '4d4b7105d754a06374d81259' # 'Root' category for all food-related venues

# Since each East Asian cuisine is divided into multiple sub-categories,
# we have to combine these sub-categories to get an accurate count for each cuisine
chinese_restaurant_categories = ['4bf58dd8d48988d145941735','52af3a5e3cf9994f4e043bea','52af3a723cf9994f4e043bec',
                                 '52af3a7c3cf9994f4e043bed','58daa1558bbb0b01f18ec1d3','52af3a673cf9994f4e043beb',
                                 '52af3a903cf9994f4e043bee','4bf58dd8d48988d1f5931735','52af3a9f3cf9994f4e043bef',
                                 '52af3aaa3cf9994f4e043bf0','52af3ab53cf9994f4e043bf1','52af3abe3cf9994f4e043bf2',
                                 '52af3ac83cf9994f4e043bf3','52af3ad23cf9994f4e043bf4','52af3add3cf9994f4e043bf5',
                                 '52af3af23cf9994f4e043bf7','52af3ae63cf9994f4e043bf6','52af3afc3cf9994f4e043bf8',
                                 '52af3b053cf9994f4e043bf9','52af3b213cf9994f4e043bfa','52af3b293cf9994f4e043bfb',
                                 '52af3b343cf9994f4e043bfc','52af3b3b3cf9994f4e043bfd','52af3b463cf9994f4e043bfe',
                                 '52af3b633cf9994f4e043c01','52af3b513cf9994f4e043bff','52af3b593cf9994f4e043c00',
                                 '52af3b6e3cf9994f4e043c02','52af3b773cf9994f4e043c03','52af3b813cf9994f4e043c04',
                                 '52af3b893cf9994f4e043c05','52af3b913cf9994f4e043c06','52af3b9a3cf9994f4e043c07','52af3ba23cf9994f4e043c08']

japanese_restaurant_categories = ['4bf58dd8d48988d111941735','55a59bace4b013909087cb0c','55a59bace4b013909087cb30',
                                 '55a59bace4b013909087cb21','55a59bace4b013909087cb06','55a59bace4b013909087cb1b',
                                 '55a59bace4b013909087cb1e','55a59bace4b013909087cb18','55a59bace4b013909087cb24',
                                 '55a59bace4b013909087cb15','55a59bace4b013909087cb27','55a59bace4b013909087cb12',
                                 '4bf58dd8d48988d1d2941735','55a59bace4b013909087cb2d','55a59a31e4b013909087cb00',
                                 '55a59af1e4b013909087cb03','55a59bace4b013909087cb2a','55a59bace4b013909087cb0f',
                                 '55a59bace4b013909087cb33','55a59bace4b013909087cb09','55a59bace4b013909087cb09',
                                 '55a59bace4b013909087cb36']

korean_restaurant_categories = ['4bf58dd8d48988d113941735','56aa371be4b08b9a8d5734e4','56aa371be4b08b9a8d5734f0',
                                 '56aa371be4b08b9a8d5734e7','56aa371be4b08b9a8d5734ed','56aa371be4b08b9a8d5734ea']

In [388]:
# Identify the restaurant type; the sub-categories of each East Asian cuisine are combined into one type.
def getRestaurantType(categories):
    restaurant_words = ['restaurant', 'diner', 'taverna', 'steakhouse']
    restaurant = 0
    category_name = categories['name'].lower()
    category_id = categories['id']
    for r in restaurant_words:
        if r in category_name:
            restaurant = RESTAURANT
    if 'fast food' in category_name:
        restaurant = 0
    if (category_id in chinese_restaurant_categories):
        restaurant = CHINESE
    if (category_id in japanese_restaurant_categories):
        restaurant = JAPANESE
    if (category_id in korean_restaurant_categories):
        restaurant = KOREAN
    return restaurant
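A quick way to see how `getRestaurantType` resolves its checks is to run the same logic on a few hand-written category dictionaries. The sketch below is a compact restatement under stated assumptions: `classify` is a hypothetical name, the ID sets contain only the first ID from each list above, and the sample category names and placeholder IDs are assumed for illustration rather than returned by the API.

```python
RESTAURANT, CHINESE, JAPANESE, KOREAN = 1, 3, 4, 5

# Only the first ID from each list in the notebook, to keep the sketch short.
CHINESE_IDS = {"4bf58dd8d48988d145941735"}
JAPANESE_IDS = {"4bf58dd8d48988d111941735"}
KOREAN_IDS = {"4bf58dd8d48988d113941735"}

RESTAURANT_WORDS = ("restaurant", "diner", "taverna", "steakhouse")

def classify(category):
    """Return the restaurant type code for one category dict with 'id' and 'name'."""
    name = category["name"].lower()
    kind = RESTAURANT if any(w in name for w in RESTAURANT_WORDS) else 0
    if "fast food" in name:
        kind = 0  # fast food is excluded from the generic restaurant count
    # Cuisine IDs are checked last, so they override the keyword result.
    for ids, code in ((CHINESE_IDS, CHINESE), (JAPANESE_IDS, JAPANESE), (KOREAN_IDS, KOREAN)):
        if category["id"] in ids:
            kind = code
    return kind

# Placeholder IDs below are hypothetical; only the first one appears in the notebook.
samples = [
    {"id": "4bf58dd8d48988d145941735", "name": "Chinese Restaurant"},
    {"id": "hypothetical-id-1", "name": "Italian Restaurant"},
    {"id": "hypothetical-id-2", "name": "Fast Food Restaurant"},
    {"id": "hypothetical-id-3", "name": "Coffee Shop"},
]
codes = [classify(c) for c in samples]
print(codes)
```

The order of the checks matters: a venue whose category ID is in one of the cuisine lists is labeled with that cuisine even if its name would not match any of the keywords.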

In [385]:
def getNearbyVenues(names, latitudes, longitudes, category, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        #print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            category,
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name'],
            getRestaurantType(v['venue']['categories'][0])) for v in results])
        #break

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category', 
                  'Restaurant Type']
    
    return(nearby_venues)
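The request and the parsing step in `getNearbyVenues` can be separated so that each is easier to check. The sketch below is one possible arrangement, not the notebook's code: `build_explore_params` and `parse_explore_items` are hypothetical helper names, and the sample payload is a hand-written stand-in that mirrors only the fields the notebook reads. The parameter dictionary could be passed to `requests.get(url, params=...)` instead of formatting the URL by hand.

```python
def build_explore_params(client_id, client_secret, version, lat, lng,
                         category, radius=500, limit=100):
    """Collect the query parameters used by the explore request into a dict."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "categoryId": category,
        "radius": radius,
        "limit": limit,
    }

def parse_explore_items(payload):
    """Extract (name, lat, lng, first category) tuples; tolerate missing groups."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories", [])
        if not categories:
            continue  # skip venues without a category instead of raising IndexError
        rows.append((venue["name"], venue["location"]["lat"],
                     venue["location"]["lng"], categories[0]["name"]))
    return rows

# Hand-written stand-in for a response body; values are illustrative only.
sample_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Example Sushi", "location": {"lat": 40.7, "lng": -74.0},
               "categories": [{"id": "x", "name": "Sushi Restaurant"}]}},
    {"venue": {"name": "Uncategorized Place", "location": {"lat": 40.7, "lng": -74.0},
               "categories": []}},
]}]}}

params = build_explore_params("ID", "SECRET", "20200818", 40.7, -74.0, "food")
rows = parse_explore_items(sample_payload)
print(params["ll"], rows)
```

Parsing defensively matters here because the notebook indexes `['groups'][0]` and `['categories'][0]` directly, so a single empty response or uncategorized venue would stop the whole loop.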


In [None]:
# Get all restaurants in Manhattan.
manhattan_venues = getNearbyVenues(names=manhattan_data['Neighborhood'],
                                   latitudes=manhattan_data['Latitude'],
                                   longitudes=manhattan_data['Longitude'],
                                   category=food_category
                                  )
restaurants=manhattan_venues[manhattan_venues['Restaurant Type']>0]

In [386]:
restaurants.shape

(1725, 8)

In [376]:
address = 'Manhattan, NY'
geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
def ShowMahattanRestaurants(res,color):    
    # add markers to map
    for lat, lng, label in zip(res['Neighborhood Latitude'], res['Neighborhood Longitude'], res['Venue']):
        label = folium.Popup(label, parse_html=True)
        folium.CircleMarker(
            [lat, lng],
            radius=4,
            popup=label,
            color=color,
            fill=True,
            fill_color=color,
            fill_opacity=0.7,
            parse_html=False).add_to(map_manhattan_restaurants)  

In [374]:
# create map of Manhattan using latitude and longitude values
map_manhattan_restaurants = folium.Map(location=[latitude, longitude], zoom_start=11)
ShowMahattanRestaurants(restaurants,'red')
map_manhattan_restaurants

In [413]:
# Prepare the lists of Japanese, Chinese and Korean restaurants
japaneseRestaurants=manhattan_venues[manhattan_venues['Restaurant Type']==JAPANESE]
chineseRestaurants=manhattan_venues[manhattan_venues['Restaurant Type']==CHINESE]
koreanRestaurants=manhattan_venues[manhattan_venues['Restaurant Type']==KOREAN]

In [393]:
print('Total number of Japanese restaurants:', len(japaneseRestaurants))
print('Total number of Chinese restaurants:', len(chineseRestaurants))
print('Total number of Korean restaurants:', len(koreanRestaurants))
print('Percentage of Japanese restaurants: {:.2f}%'.format(len(japaneseRestaurants) / len(restaurants) * 100))
print('Percentage of Chinese restaurants: {:.2f}%'.format(len(chineseRestaurants) / len(restaurants) * 100))
print('Percentage of Korean restaurants: {:.2f}%'.format(len(koreanRestaurants) / len(restaurants) * 100))

Total number of Japanese restaurants: 196
Total number of Chinese restaurants: 128
Total number of Korean restaurants: 44
Percentage of Japanese restaurants: 11.36%
Percentage of Chinese restaurants: 7.42%
Percentage of Korean restaurants: 2.55%


We find that there are **196 Japanese** restaurants, **128 Chinese** restaurants and **44 Korean** restaurants in Manhattan.    
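The percentages printed above follow directly from the counts and the 1725 rows of the `restaurants` dataframe. As a small worked check, using only numbers already shown in this notebook (the variable names are illustrative):

```python
counts = {"Japanese": 196, "Chinese": 128, "Korean": 44}
total_restaurants = 1725  # rows in the `restaurants` dataframe

# Share of each cuisine among all identified restaurants, in percent.
shares = {name: round(n / total_restaurants * 100, 2) for name, n in counts.items()}

# The three cuisines taken together.
east_asian_total = sum(counts.values())
east_asian_share = round(east_asian_total / total_restaurants * 100, 2)

print(shares)
print(east_asian_total, east_asian_share)
```

Taken together, the three cuisines account for 368 of the 1725 identified restaurants, or roughly a fifth.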

So now we have all the **identified restaurants** in Manhattan (according to the Foursquare categorization) and the lists of Japanese, Chinese and Korean restaurants after combining all sub-categories.

This concludes the data gathering phase - we're now ready to use this data to analyze the distribution of restaurants across all the neighborhoods!

## Methodology <a name="methodology"></a>

In this project, a geographic overview of East Asian restaurants in Manhattan, New York is provided for residents of, and tourists to, Manhattan who love **Japanese, Chinese and Korean** food, and also for stakeholders who are interested in **opening an East Asian restaurant** in Manhattan.

First, we look at all the restaurants in Manhattan to get an idea of which restaurant types are most common in each neighborhood as well as in Manhattan as a whole.

Next, we calculate and explore the distribution of East Asian restaurants among all the neighborhoods in Manhattan. K-means will be used to detect any clusters formed by neighborhoods that are close to each other and have many restaurants of the same category.

In the final step, the neighborhoods already crowded with each type of restaurant will be identified and recommended to the audience.

In [357]:
# one hot encoding
manhattan_onehot = pd.get_dummies(restaurants[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
manhattan_onehot['Neighborhood'] = restaurants['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [manhattan_onehot.columns[-1]] + list(manhattan_onehot.columns[:-1])
manhattan_onehot = manhattan_onehot[fixed_columns]

In [358]:
a=pd.DataFrame(manhattan_onehot.sum())
a=a[1:]
a.columns=['Total']
a.sort_values(by=['Total'], ascending=False)

| Venue Category | Total |
|---|---|
| Italian Restaurant | 239 |
| American Restaurant | 132 |
| Mexican Restaurant | 109 |
| Chinese Restaurant | 104 |
| Sushi Restaurant | 87 |
| French Restaurant | 73 |
| Japanese Restaurant | 72 |
| Restaurant | 58 |
| Seafood Restaurant | 52 |
| Mediterranean Restaurant | 50 |


So, according to the Foursquare categorization, the six most common restaurant types in Manhattan are Italian (239), Japanese (159, counting Sushi (87) and Japanese (72) together), American (132), Mexican (109), Chinese (104) and French (73).
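Summing one-hot columns is equivalent to counting how often each category occurs. The sketch below shows this on a tiny hand-made list of categories (illustrative values, not the Foursquare data) using only the standard library; the variable names are hypothetical.

```python
from collections import Counter

# Illustrative venue categories, one entry per venue (not real data).
venue_categories = ["Italian Restaurant", "Sushi Restaurant", "Italian Restaurant",
                    "Korean Restaurant", "Italian Restaurant"]

# One-hot encode: one column per distinct category, one row per venue.
columns = sorted(set(venue_categories))
one_hot = [[1 if cat == col else 0 for col in columns] for cat in venue_categories]

# Column sums of the one-hot matrix give the number of venues per category.
totals = {col: sum(row[i] for row in one_hot) for i, col in enumerate(columns)}

print(totals)
assert totals == dict(Counter(venue_categories))
```

The one-hot form becomes useful in the next step, where the same columns are summed per neighborhood rather than over all of Manhattan.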

Next, let's take a look at the statistics for individual neighborhoods.

In [359]:
manhattan_grouped = manhattan_onehot.groupby('Neighborhood').sum().reset_index()
manhattan_grouped.head()

num_top_venues = 8
init = True
for hood in manhattan_grouped['Neighborhood']:
    #print("----"+hood+"----")
    temp = manhattan_grouped[manhattan_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','count']
    temp = temp.iloc[1:]
    tmp=temp.sort_values('count', ascending=False).reset_index(drop=True).head(num_top_venues)
    tmp[hood] = tmp['venue'].map(str).str.replace('Restaurant','') + '(' + tmp['count'].map(str) + ')' 
    tmp=pd.DataFrame(tmp[hood])
    if init:
        manhattan_resranks=tmp.T
        init=False
    else:
        frames=[manhattan_resranks,tmp.T]
        manhattan_resranks=pd.concat(frames)
        
manhattan_resranks.columns = ['Most','2nd Most','3rd Most','4th Most','5th Most','6th Most','7th Most','8th Most']
manhattan_resranks

| Neighborhood | Most | 2nd Most | 3rd Most | 4th Most | 5th Most | 6th Most | 7th Most | 8th Most |
|---|---|---|---|---|---|---|---|---|
| Battery Park City | Chinese (3) | Mexican (1) | Japanese (1) | Steakhouse(1) | Italian (1) | Seafood (1) | American (1) | Moroccan (0) |
| Carnegie Hill | Sushi (6) | Italian (6) | French (3) | Mexican (3) | Chinese (2) | (2) | Indian (2) | Diner(2) |
| Central Harlem | Chinese (4) | Seafood (3) | Caribbean (3) | African (3) | Southern / Soul Food (3) | American (2) | (2) | French (2) |
| Chelsea | French (7) | American (5) | Italian (4) | Japanese (4) | Sushi (3) | Seafood (3) | (2) | Indian (2) |
| Chinatown | Chinese (18) | Dumpling (5) | Vietnamese (4) | Mexican (4) | Hotpot (4) | Dim Sum (4) | Shanghai (3) | American (3) |
| Civic Center | Italian (11) | American (5) | French (5) | Sushi (4) | Diner(3) | Mexican (3) | Indian (3) | Falafel (2) |
| Clinton | Italian (10) | American (7) | (7) | Chinese (5) | Thai (4) | Mexican (4) | Mediterranean (2) | Steakhouse(2) |
| East Harlem | Mexican (9) | Latin American (3) | Thai (3) | Steakhouse(2) | Chinese (1) | French (1) | (1) | New American (1) |
| East Village | Mexican (7) | Vietnamese (5) | Vegetarian / Vegan (5) | Korean (4) | Japanese (4) | Italian (4) | Chinese (3) | Ramen (3) |
| Financial District | American (7) | Italian (7) | Mexican (5) | Steakhouse(4) | Falafel (3) | Japanese (3) | (3) | Chinese (2) |
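The ranking loop above can also be expressed with one counter per neighborhood. The following standard-library sketch uses made-up (neighborhood, category) pairs; `top_categories` is a hypothetical helper, and the label format imitates the `name (count)` strings in the table.

```python
from collections import Counter, defaultdict

# Illustrative (neighborhood, venue category) pairs, not real data.
venues = [
    ("Hood A", "Italian Restaurant"), ("Hood A", "Italian Restaurant"),
    ("Hood A", "Italian Restaurant"), ("Hood A", "Sushi Restaurant"),
    ("Hood A", "Sushi Restaurant"), ("Hood A", "Diner"),
    ("Hood B", "Korean Restaurant"), ("Hood B", "Korean Restaurant"),
    ("Hood B", "Thai Restaurant"),
]

def top_categories(pairs, n):
    """Return {neighborhood: ['Name (count)', ...]} with the n most common categories."""
    per_hood = defaultdict(Counter)
    for hood, category in pairs:
        per_hood[hood][category] += 1
    ranked = {}
    for hood, counter in per_hood.items():
        ranked[hood] = [
            "{} ({})".format(cat.replace("Restaurant", "").strip(), count)
            for cat, count in counter.most_common(n)
        ]
    return ranked

ranks = top_categories(venues, 2)
print(ranks)
```

Stripping the word "Restaurant" from every label also appears to explain the entries shown as just `(2)` or `(7)` in the table: they most likely come from the generic `Restaurant` category, whose name becomes empty once the word is removed.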


Now we see that the most common restaurant types across the neighborhoods are mostly **Italian, American, Mexican and Chinese**. Since East Asian restaurants, **Japanese and Korean** ones in particular, are divided into multiple categories, we have to combine these sub-categories to get more accurate counts. So let's use the lists of Japanese, Chinese and Korean restaurants built after combining all sub-categories for a more detailed exploration.

In [391]:
frames=[chineseRestaurants,japaneseRestaurants,koreanRestaurants]
eastAsianRestaurants=pd.concat(frames)

In [392]:
# For all East Asian restaurants
e=eastAsianRestaurants.groupby('Neighborhood').count()['Neighborhood Latitude']
e.sort_values(ascending=False)

Neighborhood
Midtown South          37
Chinatown              35
East Village           20
Noho                   17
Flatiron               16
Greenwich Village      15
Little Italy           15
Midtown                14
Yorkville              13
Turtle Bay             13
Lenox Hill             13
Murray Hill            12
Carnegie Hill          11
Chelsea                10
Clinton                10
Manhattanville          9
Lower East Side         9
Washington Heights      8
Sutton Place            7
West Village            7
Civic Center            7
Tudor City              7
Financial District      7
Soho                    6
Hamilton Heights        6
Tribeca                 5
Upper West Side         5
Upper East Side         5
Manhattan Valley        5
Central Harlem          4
Morningside Heights     4
Battery Park City       4
Lincoln Square          3
Gramercy                2
Inwood                  2
Roosevelt Island        2
East Harlem             1
Stuyvesant Town         1

In [396]:
map_manhattan_restaurants = folium.Map(location=[latitude, longitude], zoom_start=11)
ShowMahattanRestaurants(eastAsianRestaurants,'red')
map_manhattan_restaurants

We find that the top 5 neighborhoods for East Asian restaurants are **Midtown South, Chinatown, East Village, Noho and Flatiron**. Now let's look at how each type of restaurant is distributed.

In [363]:
# For Japanese restaurants
j=japaneseRestaurants.groupby('Neighborhood').count()['Neighborhood Latitude']
j.sort_values(ascending=False)

Neighborhood
Noho                  15
Turtle Bay            13
East Village          12
Flatiron              12
Midtown               12
Yorkville             11
Greenwich Village     11
Midtown South         10
Lenox Hill            10
Carnegie Hill          9
Chelsea                8
Murray Hill            7
Civic Center           5
Financial District     5
Manhattanville         5
Sutton Place           5
Tudor City             5
Lower East Side        4
Soho                   4
Upper West Side        4
West Village           4
Upper East Side        4
Little Italy           3
Hamilton Heights       3
Clinton                3
Chinatown              2
Manhattan Valley       2
Washington Heights     2
Tribeca                2
Roosevelt Island       1
Gramercy               1
Stuyvesant Town        1
Battery Park City      1
Name: Neighborhood Latitude, dtype: int64

So Japanese restaurants are spread fairly evenly across many neighborhoods. Nine of them (**Noho, Turtle Bay, East Village, Flatiron, Midtown, Yorkville, Greenwich Village, Midtown South, Lenox Hill**) each have between 10 and 15 Japanese restaurants.

Among the 40 neighborhoods in Manhattan, 33 appear in the listing above, meaning they have at least one Japanese restaurant.

In [486]:
map_manhattan_restaurants = folium.Map(location=[latitude, longitude], zoom_start=11)
ShowMahattanRestaurants(japaneseRestaurants,'green')
map_manhattan_restaurants

Now, let's see whether some of these neighborhoods that are close to each other form a cluster crowded with Japanese restaurants. The following code runs k-means to group the neighborhoods into 12 clusters.

In [453]:
from sklearn.cluster import KMeans # k-means clustering

kclusters = 12

japres=japaneseRestaurants[['Neighborhood Latitude','Neighborhood Longitude']]
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(japres)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([ 4,  4, 10, 10,  2,  2,  2,  2,  2,  2,  2,  2,  3,  3,  3,  3,  7,
        7,  7,  7,  7,  7,  7,  7,  7,  7,  7,  3,  3,  3,  3,  3,  3,  3,
        3,  3,  3,  3,  5,  5,  5,  5,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 11, 11, 11, 11,
       11, 11, 11, 11,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  1,  1,
        1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  8,  8,  4,
        4,  4,  4,  4,  4,  4, 11, 11, 11, 11,  5,  5,  6,  8,  8,  8,  8,
        8,  8,  7,  7,  7,  7,  7,  7,  7,  7,  7,  1,  1,  1,  1,  1,  1,
        1,  1,  1,  1,  1,  1,  1,  1,  1,  8,  8,  8,  8,  8,  6,  6,  6,
        6,  6,  6,  6,  6,  6,  6,  3,  3,  3,  3,  3,  9,  9,  9,  9,  9,
        9,  9,  9,  9,  9,  9,  9,  9,  9,  9,  9,  9,  9,  1,  6,  6,  6,
        6,  6,  6,  6,  6,  6,  6,  6,  6])

In [473]:
# add clustering labels
japaneseRestaurants=manhattan_venues[manhattan_venues['Restaurant Type']==JAPANESE]
japaneseRestaurants.insert(0, 'Cluster Labels', kmeans.labels_)

j=japaneseRestaurants.groupby('Cluster Labels').count()['Neighborhood']
j.sort_values(ascending=False)

Cluster Labels
1     32
6     23
0     22
7     20
4     20
3     20
9     18
8     13
11    12
2      8
5      6
10     2
Name: Neighborhood, dtype: int64

We do find a cluster with 32 Japanese restaurants. Let's see which neighborhoods are in this cluster.

In [477]:
cl=japaneseRestaurants[japaneseRestaurants['Cluster Labels']==1]
cl['Neighborhood'].unique()

array(['East Village', 'Lower East Side', 'Noho', 'Stuyvesant Town'],
      dtype=object)

So **East Village, Lower East Side, Noho, and Stuyvesant Town** form a cluster with 32 Japanese restaurants. Let's visualize them on a map.

In [489]:
map_manhattan_restaurants = folium.Map(location=[latitude, longitude], zoom_start=11)
ShowMahattanRestaurants(cl,'green')
map_manhattan_restaurants

Let's look at Chinese restaurants next.

In [398]:
# For Chinese restaurants
c=chineseRestaurants.groupby('Neighborhood').count()['Neighborhood Latitude']
c.sort_values(ascending=False)

Neighborhood
Chinatown              31
Little Italy           11
Washington Heights      6
Clinton                 6
Lower East Side         5
Manhattanville          4
Central Harlem          4
East Village            4
Greenwich Village       4
Murray Hill             4
Hamilton Heights        3
Lenox Hill              3
Lincoln Square          3
Battery Park City       3
Morningside Heights     3
Midtown                 2
Soho                    2
Carnegie Hill           2
Chelsea                 2
Tudor City              2
Tribeca                 2
Financial District      2
Sutton Place            2
Yorkville               2
Inwood                  2
Noho                    2
West Village            2
Manhattan Valley        2
Hudson Yards            1
Midtown South           1
Flatiron                1
Roosevelt Island        1
East Harlem             1
Upper East Side         1
Upper West Side         1
Gramercy                1
Name: Neighborhood Latitude, dtype: int64

Not surprisingly, **Chinatown** has the most Chinese restaurants (31) among all neighborhoods. **Little Italy** also has 11 Chinese restaurants. All the remaining neighborhoods have only a single-digit number of Chinese restaurants, so there is no need to check whether adjacent neighborhoods form clusters.

In [378]:
map_manhattan_restaurants = folium.Map(location=[latitude, longitude], zoom_start=11)
ShowMahattanRestaurants(chineseRestaurants,'yellow')
map_manhattan_restaurants

In [367]:
# For Korean restaurants.
k=koreanRestaurants.groupby('Neighborhood').count()['Neighborhood Latitude']
k.sort_values(ascending=False)

Neighborhood
Midtown South          26
East Village            4
Flatiron                3
Civic Center            2
Chinatown               2
West Village            1
Tribeca                 1
Murray Hill             1
Morningside Heights     1
Manhattan Valley        1
Little Italy            1
Clinton                 1
Name: Neighborhood Latitude, dtype: int64

Of the 44 Korean restaurants, 26 are in Midtown South. All the remaining neighborhoods have only a single-digit number (1-4) of Korean restaurants. As with Chinese restaurants, there is no need to look for clusters.

In [379]:
map_manhattan_restaurants = folium.Map(location=[latitude, longitude], zoom_start=11)
ShowMahattanRestaurants(koreanRestaurants,'red')
map_manhattan_restaurants
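As a consistency check, the per-cuisine counts can be added up and compared with the combined East Asian totals listed earlier. The sketch below copies the counts for the top five neighborhoods from the outputs above; the variable names are illustrative, and a neighborhood missing from a listing (Noho for Korean restaurants) is treated as zero.

```python
from collections import Counter

# Per-neighborhood counts copied from the outputs above (top five neighborhoods only).
chinese = Counter({"Midtown South": 1, "Chinatown": 31, "East Village": 4,
                   "Noho": 2, "Flatiron": 1})
japanese = Counter({"Midtown South": 10, "Chinatown": 2, "East Village": 12,
                    "Noho": 15, "Flatiron": 12})
korean = Counter({"Midtown South": 26, "Chinatown": 2, "East Village": 4,
                  "Flatiron": 3})

# Adding Counters sums the counts key by key.
combined = chinese + japanese + korean
print(combined.most_common())
```

The sums (37, 35, 20, 17 and 16) match the combined listing, and they show how different the composition is: Midtown South is dominated by Korean restaurants and Chinatown by Chinese ones, while Noho, East Village and Flatiron are mostly Japanese.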

## Results and Discussion <a name="results"></a>

Our analysis shows that there are **196 Japanese** restaurants, **128 Chinese** restaurants and **44 Korean** restaurants in Manhattan. The top 5 neighborhoods for these East Asian restaurants are **Midtown South, Chinatown, East Village, Noho and Flatiron**.

Further analysis reveals that **Japanese** restaurants are the most numerous East Asian restaurants in Manhattan. Of the 40 neighborhoods in Manhattan, 33 have Japanese restaurants in our data. Nine neighborhoods (**Noho, Turtle Bay, East Village, Flatiron, Midtown, Yorkville, Greenwich Village, Midtown South, Lenox Hill**) each have between 10 and 15 Japanese restaurants. So Manhattan residents and tourists interested in Japanese food can find a Japanese restaurant in most parts of the borough.

**Chinese** restaurants are also fairly common in Manhattan and can be found in 36 out of 40 neighborhoods. But unlike Japanese restaurants, Chinese restaurants are heavily concentrated in two neighborhoods - **Chinatown** and **Little Italy**. There are 31 Chinese restaurants in **Chinatown**, which is not surprising, and 11 in **Little Italy**. All the remaining neighborhoods have only a single-digit number (1-6) of Chinese restaurants. So for Manhattan residents and tourists who love Chinese food, **Chinatown** and **Little Italy** are the neighborhoods to explore first.

**Korean** restaurants are even more concentrated: 26 of the 44 Korean restaurants are in **Midtown South**. The other 11 neighborhoods with Korean restaurants each have only a single-digit number (1-4). So for Korean food lovers, **Midtown South** is the one-stop neighborhood to visit.

For stakeholders interested in opening a **Japanese** restaurant, crowdedness within a neighborhood is not a big factor to worry about, except that East Village, Lower East Side and Noho form a concentrated cluster of Japanese restaurants. Additional factors such as customer flow, attractiveness of the location, real estate availability, prices, and the social and economic dynamics of each neighborhood might carry more weight in the final decision.
As for opening a **Chinese** restaurant, you have to be careful with **Chinatown** and **Little Italy**, which are already crowded with Chinese restaurants. For those who would like to open a **Korean** restaurant, **Midtown South** is the neighborhood to avoid because of its high concentration of Korean restaurants.

Of course, other factors should also be taken into account when opening a Chinese or Korean restaurant, although they might not be as important as crowdedness.
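One way to put a number on "concentrated" is the share of each cuisine's restaurants that sits in its single largest neighborhood. The worked arithmetic below uses only counts from the per-neighborhood outputs in the Analysis section; the variable names are illustrative.

```python
# (largest neighborhood, restaurants there, restaurants in all of Manhattan),
# taken from the per-neighborhood outputs in the Analysis section.
largest = {
    "Japanese": ("Noho", 15, 196),
    "Chinese": ("Chinatown", 31, 128),
    "Korean": ("Midtown South", 26, 44),
}

# Percentage of each cuisine's restaurants located in its largest neighborhood.
top_share = {
    cuisine: round(top / total * 100, 1)
    for cuisine, (_, top, total) in largest.items()
}
print(top_share)
```

By this measure, about 8% of Japanese restaurants are in their largest neighborhood, compared with about 24% for Chinese and about 59% for Korean restaurants, which is consistent with the ordering described above.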

## Conclusion <a name="conclusion"></a>

The purpose of this project is to provide a geographic overview of East Asian restaurants in **Manhattan, New York** to aid Manhattan residents and tourists who love **Japanese, Chinese and Korean** food, as well as stakeholders who are interested in **opening an East Asian restaurant** in Manhattan. By downloading food category data from Foursquare, we first identified all East Asian restaurants in Manhattan and then generated a detailed distribution of Japanese, Chinese and Korean restaurants. We found specific characteristics for Japanese (even distribution among most neighborhoods, with a concentration near East Village, Lower East Side, and Noho), Chinese (available in most neighborhoods but fairly concentrated in two of them) and Korean (fairly scarce, but with one neighborhood of high concentration).

These findings should be helpful for Manhattan residents and tourists who love East Asian food, and for those who are interested in opening an East Asian restaurant and want to avoid crowded neighborhoods. The final decision should also be based on other factors such as customer flow, attractiveness of the location, real estate availability, prices, and the social and economic dynamics of each neighborhood.