# Capstone Project
## Problem: Where should someone open a restaurant in Queens?

### Introduction
The data come from the New York GeoJSON file and the Foursquare API. We keep only venues in the 'food' categories and record each venue's distance to Flushing Main Street, which serves as a proxy for traffic convenience. Together these give the food-shop density and the traffic convenience of each neighborhood. Clustering analysis then shows which neighborhoods have similar characteristics.

### Data

Based on our problem, the features we want are: <br>
* Every restaurant or food-related shop in Queens
* The category composition of each neighborhood
* The average distance from the venues in each neighborhood to Flushing Main Street

The data sources are as follows: <br>
* The Foursquare API, used to explore food-related shops and restaurants in a specific region
* The latitude and longitude of each venue, from which the distance to Flushing Main Street can be calculated

### Methodology
* In the first step, we get the latitude and longitude of Flushing Main Street with the geopy Nominatim geocoder. Then we use the GeoJSON data from this course to get the neighborhood information for Queens. <br>
* In the second step, we use the Foursquare API and the neighborhood coordinates from the previous step to get the list of venues in each neighborhood. The venues are restricted to food categories such as food shops and restaurants. <br>
* In the third step, the distance from each venue to Flushing Main Street train station is calculated from latitude and longitude. This distance represents the traffic convenience of each venue. <br>
* In the final step, we use one-hot encoding to turn the category data into numeric values, combine them with the distance data, and then try different methods to cluster the neighborhoods.
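
The final step can be illustrated on toy data. The following is a minimal sketch, not the notebook's actual clustering code: the rows are made up, the min-max scaling of the distance and the choice of `n_clusters=2` are assumptions made only for illustration.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Toy venue table (made-up rows, not Foursquare data)
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B", "B", "C", "C"],
    "Venue Category": ["Pizza Place", "Diner", "Pizza Place", "Pizza Place",
                       "Korean Restaurant", "Korean Restaurant"],
    "Distance": [7000, 7200, 6900, 7100, 900, 1100],  # metres to the reference point
})

# One-hot encode the categories, then average per neighborhood to get category shares
onehot = pd.get_dummies(venues["Venue Category"]).astype(float)
onehot["Neighborhood"] = venues["Neighborhood"]
shares = onehot.groupby("Neighborhood").mean()

# Mean distance per neighborhood, rescaled to [0, 1] so it is comparable to the shares
# (min-max scaling is an assumption of this sketch)
mean_dist = venues.groupby("Neighborhood")["Distance"].mean()
scaled = (mean_dist - mean_dist.min()) / (mean_dist.max() - mean_dist.min())

features = shares.join(scaled.rename("Distance (scaled)"))

# k-means with an arbitrary number of clusters; n_init is set explicitly
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
features["Cluster"] = kmeans.labels_
print(features)
```

Without some rescaling, a distance in metres would dominate the category shares, which all lie between 0 and 1.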

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')


## Using the New York Geojson data

In [2]:
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)
neighborhoods_data = newyork_data['features']

# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude']

# collect one row per neighborhood feature
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON coordinates are ordered [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

# build the dataframe in one step (DataFrame.append was removed in pandas 2.0)
neighborhoods = pd.DataFrame(rows, columns=column_names)
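
To check the extraction logic without the full file, the same fields can be pulled from a tiny hand-written sample. This is only a sketch: the `sample` dictionary below imitates the layout the cell above relies on (`properties.borough`, `properties.name`, `geometry.coordinates`) and is not part of the real dataset file.

```python
import pandas as pd

# Two hand-written features that mimic the assumed layout of newyork_data.json
sample = {
    "features": [
        {"properties": {"borough": "Queens", "name": "Astoria"},
         "geometry": {"coordinates": [-73.915654, 40.768509]}},
        {"properties": {"borough": "Bronx", "name": "Wakefield"},
         "geometry": {"coordinates": [-73.847201, 40.894705]}},
    ]
}

rows = [
    {
        "Borough": f["properties"]["borough"],
        "Neighborhood": f["properties"]["name"],
        # GeoJSON stores coordinates as [longitude, latitude]
        "Latitude": f["geometry"]["coordinates"][1],
        "Longitude": f["geometry"]["coordinates"][0],
    }
    for f in sample["features"]
]
neighborhoods_sample = pd.DataFrame(rows)
print(neighborhoods_sample)
```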

## Get the latitude and longitude of Flushing Main Street

In [3]:
address = 'Flushing Main Street'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Flushing Main Street are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Flushing Main Street are 40.7585195, -73.8298574.
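
geopy's `geocode` returns `None` when an address cannot be resolved, so accessing `.latitude` directly can fail with an `AttributeError`. A small guard, sketched below, makes that case explicit. The helper name `geocode_or_raise` and the `FakeGeocoder` stand-in are hypothetical and exist only so the example runs offline.

```python
def geocode_or_raise(geolocator, address):
    """Return (latitude, longitude) for address, or raise if nothing is found."""
    location = geolocator.geocode(address)
    if location is None:
        raise ValueError("No geocoding result for {!r}".format(address))
    return location.latitude, location.longitude


# Stand-in geocoder used only to exercise the helper offline (hypothetical class)
class FakeGeocoder:
    def geocode(self, address):
        if address == "Flushing Main Street":
            # Coordinates copied from the output above
            return type("Location", (), {"latitude": 40.7585195,
                                         "longitude": -73.8298574})()
        return None


print(geocode_or_raise(FakeGeocoder(), "Flushing Main Street"))
```

In the notebook, the `Nominatim` instance created above would be passed in place of `FakeGeocoder()`.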


In [4]:
# create map of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_newyork)  
    
map_newyork

In [5]:
print(neighborhoods.shape)
neighborhoods.head()

(306, 4)


Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


## Get the data of Queens

In [6]:
queen_data = neighborhoods[neighborhoods['Borough'] == 'Queens'].reset_index(drop=True)
print(queen_data.shape)
queen_data.head()

(81, 4)


Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Queens,Astoria,40.768509,-73.915654
1,Queens,Woodside,40.746349,-73.901842
2,Queens,Jackson Heights,40.751981,-73.882821
3,Queens,Elmhurst,40.744049,-73.881656
4,Queens,Howard Beach,40.654225,-73.838138


In [7]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [8]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID (redacted)
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret (redacted)
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_CLIENT_ID
CLIENT_SECRET: YOUR_CLIENT_SECRET


In [9]:
cat = '4d4b7105d754a06374d81259'
LIMIT = 200
radius = 10000
sortP = 1
sec = 'food'

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId={}&sortByPopularity={}&section={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    latitude, 
    longitude,
    radius,
    LIMIT,
    cat,
    sortP,
    sec)

results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5f30fcdc2cc28e57cedd2c4e'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'},
    {'name': '$-$$$$', 'key': 'price'}]},
  'headerLocation': 'Queens',
  'headerFullLocation': 'Queens',
  'headerLocationGranularity': 'city',
  'query': 'food',
  'totalResults': 250,
  'suggestedBounds': {'ne': {'lat': 40.84851959000009,
    'lng': -73.71126221232828},
   'sw': {'lat': 40.66851940999991, 'lng': -73.94845258767171}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4b3ea6eaf964a52058a025e3',
       'name': 'Peter Luger Steak House',
       'location': {'address': '225 Northern Blvd',
        'crossStreet': 'btwn Tain Dr & Merrivale Rd',
        'lat': 40.777245373268975,
   

In [10]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()



Unnamed: 0,name,categories,lat,lng
0,Peter Luger Steak House,Steakhouse,40.777245,-73.727541
1,Mexicocina,Mexican Restaurant,40.811837,-73.909512
2,Trinciti Roti Shop,Caribbean Restaurant,40.680314,-73.821202
3,Patsy's Pizza - East Harlem,Pizza Place,40.797108,-73.934626
4,You Garden Xiao Long Bao,Shanghai Restaurant,40.763364,-73.770664


In [11]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

100 venues were returned by Foursquare.


In [12]:
distance = venues[0]['venue']['location']['distance']
type(distance)

int

In [13]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        cat = '4d4b7105d754a06374d81259'
        LIMIT = 200
        sortP = 1
        sec = 'food'
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId={}&sortByPopularity={}&section={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT,
            cat,
            sortP,
            sec)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'], 
            v['venue']['location']['distance'],
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue distance',
                  'Venue Category']
    
    return(nearby_venues)
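
The request URL in `getNearbyVenues` is assembled with a long format string, where the order of the arguments must match the order of the placeholders. One possible alternative, sketched here with the standard library only, is to keep the query parameters in a dictionary. The helper name `build_explore_url` is hypothetical; the parameter names and values are the ones already used above.

```python
from urllib.parse import urlencode


def build_explore_url(client_id, client_secret, version, lat, lng,
                      radius=500, limit=200,
                      category_id="4d4b7105d754a06374d81259"):
    """Build a venues/explore URL with the same query parameters used above."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
        "categoryId": category_id,
        "sortByPopularity": 1,
        "section": "food",
    }
    # urlencode escapes each value, so the comma in "ll" becomes %2C
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params)


url = build_explore_url("ID", "SECRET", "20180605", 40.768509, -73.915654)
print(url)
```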

In [14]:
Q_venues = getNearbyVenues(names=queen_data['Neighborhood'],
                                   latitudes=queen_data['Latitude'],
                                   longitudes=queen_data['Longitude']
                                  )

Astoria
Woodside
Jackson Heights
Elmhurst
Howard Beach
Corona
Forest Hills
Kew Gardens
Richmond Hill
Flushing
Long Island City
Sunnyside
East Elmhurst
Maspeth
Ridgewood
Glendale
Rego Park
Woodhaven
Ozone Park
South Ozone Park
College Point
Whitestone
Bayside
Auburndale
Little Neck
Douglaston
Glen Oaks
Bellerose
Kew Gardens Hills
Fresh Meadows
Briarwood
Jamaica Center
Oakland Gardens
Queens Village
Hollis
South Jamaica
St. Albans
Rochdale
Springfield Gardens
Cambria Heights
Rosedale
Far Rockaway
Broad Channel
Breezy Point
Steinway
Beechhurst
Bay Terrace
Edgemere
Arverne
Rockaway Beach
Neponsit
Murray Hill
Floral Park
Holliswood
Jamaica Estates
Queensboro Hill
Hillcrest
Ravenswood
Lindenwood
Laurelton
Lefrak City
Belle Harbor
Rockaway Park
Somerville
Brookville
Bellaire
North Corona
Forest Hills Gardens
Jamaica Hills
Utopia
Pomonok
Astoria Heights
Hunters Point
Sunnyside Gardens
Blissville
Roxbury
Middle Village
Malba
Hammels
Bayswater
Queensbridge


In [15]:
print(Q_venues.shape)
Q_venues.head()

(1610, 8)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue distance,Venue Category
0,Astoria,40.768509,-73.915654,Neptune Diner,40.770788,-73.91689,274,Diner
1,Astoria,40.768509,-73.915654,Sugar Freak,40.764443,-73.916055,453,Cajun / Creole Restaurant
2,Astoria,40.768509,-73.915654,New York City Bagel & Coffee House,40.765841,-73.919441,436,Bagel Shop
3,Astoria,40.768509,-73.915654,Brooklyn Bagel & Coffee Co.,40.764732,-73.916944,434,Bagel Shop
4,Astoria,40.768509,-73.915654,George's Deli,40.766316,-73.915284,246,Deli / Bodega


In [16]:
Q_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue distance,Venue Category
Arverne,8,8,8,8,8,8,8
Astoria,89,89,89,89,89,89,89
Astoria Heights,9,9,9,9,9,9,9
Auburndale,10,10,10,10,10,10,10
Bay Terrace,11,11,11,11,11,11,11
Bayside,59,59,59,59,59,59,59
Beechhurst,7,7,7,7,7,7,7
Bellaire,8,8,8,8,8,8,8
Belle Harbor,8,8,8,8,8,8,8
Bellerose,16,16,16,16,16,16,16


In [17]:
print('There are {} unique categories.'.format(len(Q_venues['Venue Category'].unique())))

There are 98 unique categories.


In [18]:
# one hot encoding
Q_onehot = pd.get_dummies(Q_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Q_onehot['Neighborhood'] = Q_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Q_onehot.columns[-1]] + list(Q_onehot.columns[:-1])
Q_onehot = Q_onehot[fixed_columns]

Q_onehot.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bistro,Brazilian Restaurant,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Cafeteria,Café,Cajun / Creole Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Colombian Restaurant,Comfort Food Restaurant,Cuban Restaurant,Deli / Bodega,Dim Sum Restaurant,Diner,Donut Shop,Dosa Place,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Empanada Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Food,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Gastropub,German Restaurant,Gluten-free Restaurant,Greek Restaurant,Halal Restaurant,Himalayan Restaurant,Hot Dog Joint,Hotpot Restaurant,Hunan Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Korean Restaurant,Kosher Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Moroccan Restaurant,New American Restaurant,Noodle House,North Indian Restaurant,Persian Restaurant,Peruvian Restaurant,Pizza Place,Poke Place,Polish Restaurant,Ramen Restaurant,Restaurant,Romanian Restaurant,Salad Place,Sandwich Place,Seafood Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Snack Place,South American Restaurant,Southern / Soul Food Restaurant,Souvlaki Shop,Spanish Restaurant,Steakhouse,Sushi Restaurant,Szechuan Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Thai Restaurant,Tibetan Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wings Joint
0,Astoria,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Astoria,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Astoria,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Astoria,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Astoria,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [19]:
Q_onehot.shape

(1610, 99)

In [20]:
Q_grouped = Q_onehot.groupby('Neighborhood').mean().reset_index()
Q_grouped

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bistro,Brazilian Restaurant,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Cafeteria,Café,Cajun / Creole Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Colombian Restaurant,Comfort Food Restaurant,Cuban Restaurant,Deli / Bodega,Dim Sum Restaurant,Diner,Donut Shop,Dosa Place,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Empanada Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Food,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Gastropub,German Restaurant,Gluten-free Restaurant,Greek Restaurant,Halal Restaurant,Himalayan Restaurant,Hot Dog Joint,Hotpot Restaurant,Hunan Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Korean Restaurant,Kosher Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Moroccan Restaurant,New American Restaurant,Noodle House,North Indian Restaurant,Persian Restaurant,Peruvian Restaurant,Pizza Place,Poke Place,Polish Restaurant,Ramen Restaurant,Restaurant,Romanian Restaurant,Salad Place,Sandwich Place,Seafood Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Snack Place,South American Restaurant,Southern / Soul Food Restaurant,Souvlaki Shop,Spanish Restaurant,Steakhouse,Sushi Restaurant,Szechuan Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Thai Restaurant,Tibetan Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wings Joint
0,Arverne,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.125,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0
1,Astoria,0.0,0.011236,0.0,0.0,0.0,0.011236,0.022472,0.067416,0.0,0.011236,0.011236,0.0,0.011236,0.0,0.0,0.044944,0.011236,0.0,0.0,0.033708,0.0,0.011236,0.0,0.078652,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.011236,0.0,0.0,0.011236,0.0,0.0,0.033708,0.0,0.0,0.0,0.0,0.0,0.067416,0.011236,0.0,0.0,0.0,0.0,0.067416,0.0,0.011236,0.022472,0.0,0.022472,0.011236,0.0,0.011236,0.0,0.044944,0.022472,0.101124,0.011236,0.011236,0.0,0.0,0.0,0.0,0.05618,0.011236,0.0,0.0,0.011236,0.0,0.0,0.011236,0.044944,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.0
2,Astoria Heights,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Auburndale,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.3,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Bay Terrace,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Bayside,0.0,0.050847,0.0,0.0,0.016949,0.0,0.016949,0.016949,0.016949,0.0,0.0,0.0,0.033898,0.0,0.0,0.033898,0.016949,0.0,0.0,0.067797,0.0,0.0,0.016949,0.016949,0.0,0.0,0.033898,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.016949,0.016949,0.016949,0.0,0.0,0.050847,0.0,0.0,0.0,0.0,0.0,0.067797,0.0,0.0,0.033898,0.0,0.0,0.033898,0.0,0.016949,0.0,0.033898,0.050847,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.084746,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.016949,0.0,0.016949,0.0,0.016949,0.0,0.0,0.016949,0.033898,0.050847,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.016949,0.0
6,Beechhurst,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.142857,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Bellaire,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.375,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Belle Harbor,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.25,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Bellerose,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.25,0.0,0.0625,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1875,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [21]:
Q_grouped.shape

(77, 99)
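
The grouped table has 77 rows although 81 Queens neighborhoods were queried, so some neighborhoods have no rows in `Q_venues`, presumably because the request returned no food venues within the 500 m radius. A quick way to list them is a set difference, sketched here on made-up stand-ins for `queen_data` and `Q_venues`.

```python
import pandas as pd

# Toy stand-ins for queen_data and Q_venues (made-up rows)
all_hoods = pd.DataFrame({"Neighborhood": ["A", "B", "C", "D"]})
venues = pd.DataFrame({"Neighborhood": ["A", "A", "C"], "Venue": ["x", "y", "z"]})

# Neighborhoods present in the input but absent from the venue results
missing = sorted(set(all_hoods["Neighborhood"]) - set(venues["Neighborhood"]))
print(missing)
```

Neighborhoods found this way are silently excluded from any later clustering, which is worth keeping in mind when reading the results.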

In [22]:
num_top_venues = 5

for hood in Q_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = Q_grouped[Q_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Arverne----
             venue  freq
0   Sandwich Place  0.25
1       Restaurant  0.12
2             Café  0.12
3       Donut Shop  0.12
4  Thai Restaurant  0.12


----Astoria----
                       venue  freq
0  Middle Eastern Restaurant  0.10
1              Deli / Bodega  0.08
2           Greek Restaurant  0.07
3          Indian Restaurant  0.07
4                     Bakery  0.07


----Astoria Heights----
                venue  freq
0  Italian Restaurant  0.22
1  Chinese Restaurant  0.22
2       Deli / Bodega  0.11
3         Pizza Place  0.11
4              Bakery  0.11


----Auburndale----
                  venue  freq
0    Italian Restaurant   0.3
1                 Diner   0.1
2     Korean Restaurant   0.1
3  Fast Food Restaurant   0.1
4          Noodle House   0.1


----Bay Terrace----
                 venue  freq
0           Donut Shop  0.18
1  American Restaurant  0.18
2          Pizza Place  0.18
3               Bakery  0.09
4          Salad Place  0.09


----Bayside--

                venue  freq
0       Deli / Bodega  0.15
1         Pizza Place  0.12
2               Diner  0.12
3  Chinese Restaurant  0.12
4              Bakery  0.08


----Middle Village----
                venue  freq
0  Chinese Restaurant  0.20
1              Bakery  0.13
2  Spanish Restaurant  0.07
3       Burrito Place  0.07
4    Sushi Restaurant  0.07


----Murray Hill----
                 venue  freq
0    Korean Restaurant  0.70
1        Deli / Bodega  0.05
2          Pizza Place  0.02
3           Restaurant  0.02
4  Fried Chicken Joint  0.02


----North Corona----
                venue  freq
0       Deli / Bodega  0.22
1         Pizza Place  0.17
2              Bakery  0.17
3  Mexican Restaurant  0.11
4  Spanish Restaurant  0.06


----Oakland Gardens----
                venue  freq
0   Korean Restaurant  0.29
1  Chinese Restaurant  0.18
2          Donut Shop  0.07
3       Deli / Bodega  0.07
4    Sushi Restaurant  0.04


----Ozone Park----
                       venue  freq
0 
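
The per-neighborhood loop above prints one small table at a time. If a single table is preferred, the same ranking can be obtained by reshaping the grouped frame to long format. This is a sketch on toy data; the helper name `top_categories` is hypothetical.

```python
import pandas as pd


def top_categories(grouped, n=5):
    """Return the n most frequent categories per neighborhood in long format."""
    long = grouped.melt(id_vars="Neighborhood", var_name="venue", value_name="freq")
    long = long.sort_values(["Neighborhood", "freq"], ascending=[True, False])
    return long.groupby("Neighborhood").head(n).reset_index(drop=True)


# Toy frame with the same layout as Q_grouped (made-up frequencies)
toy_grouped = pd.DataFrame({
    "Neighborhood": ["A", "B"],
    "Bakery": [0.2, 0.0],
    "Diner": [0.5, 0.25],
    "Pizza Place": [0.3, 0.75],
})
print(top_categories(toy_grouped, n=2))
```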

In [23]:
Q_venues.groupby(['Neighborhood', 'Venue Category']).mean(numeric_only=True)

Neighborhood,Venue Category,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Venue distance
Arverne,Burrito Place,40.589144,-73.791992,40.590085,-73.797505,477.0
Arverne,Café,40.589144,-73.791992,40.590558,-73.79735,479.0
Arverne,Donut Shop,40.589144,-73.791992,40.590657,-73.797239,474.0
Arverne,Pizza Place,40.589144,-73.791992,40.590689,-73.797161,469.0
Arverne,Restaurant,40.589144,-73.791992,40.590759,-73.797119,469.0
Arverne,Sandwich Place,40.589144,-73.791992,40.591874,-73.795733,481.0
Arverne,Thai Restaurant,40.589144,-73.791992,40.590823,-73.796706,440.0
Astoria,American Restaurant,40.768509,-73.915654,40.764629,-73.916879,444.0
Astoria,BBQ Joint,40.768509,-73.915654,40.764437,-73.916278,456.0
Astoria,Bagel Shop,40.768509,-73.915654,40.765286,-73.918192,435.0


## Define a function that calculates the distance from each venue to Flushing Main Street from latitude and longitude

In [24]:
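import math

def distance_cal(lat1, lon1, lat2, lon2):
    """Return the haversine (great-circle) distance in metres between two points given in degrees."""
    # Mean Earth radius in metres. The outputs shown in this notebook were
    # produced with 6137e3 (digits transposed), so they are about 3.7% too small.
    R = 6371e3
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return R * c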
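
For reference, the function above follows the standard haversine formula. In the notation of the code, with $\varphi_1, \varphi_2$ the latitudes in radians (`phi1`, `phi2`), $\Delta\varphi$ and $\Delta\lambda$ the latitude and longitude differences in radians (`dphi`, `dlam`), and $R$ the Earth radius in metres:

```latex
\begin{aligned}
a &= \sin^2\!\left(\frac{\Delta\varphi}{2}\right)
   + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\Delta\lambda}{2}\right) \\
c &= 2\,\operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1-a}\right) \\
d &= R\,c
\end{aligned}
```

The result $d$ is a straight-line distance over the Earth's surface, so it only approximates actual travel distance or travel time.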

In [25]:
# Flushing Main Street latitude and longitude
lat1 = 40.7585195
lon1 = -73.8298574
# Calculate the distance to Flushing Main Street of each venue
dist_to_Flushing = []
for name, lat2, lon2 in zip(Q_venues['Venue'], Q_venues['Venue Latitude'], Q_venues['Venue Longitude']):
    d = distance_cal(lat1, lon1, lat2, lon2)
    dist_to_Flushing.append([name, int(d)])

dist_df = pd.DataFrame(dist_to_Flushing)
dist_df.rename(columns = {0:'Venue', 1:'Distance'}, inplace = True)
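
The loop above calls the distance function once per venue. With NumPy the same haversine formula can be applied to whole columns at once. The sketch below is one possible vectorized variant; the name `haversine_m` is hypothetical, and `radius_m` defaults to 6371e3 m, the commonly used mean Earth radius.

```python
import numpy as np


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371e3):
    """Great-circle distance in metres; lat2/lon2 may be arrays or pandas Series."""
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dphi = phi2 - phi1
    dlam = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(phi1) * np.cos(phi2) * np.sin(dlam / 2) ** 2
    # arcsin form of the haversine formula, equivalent to the atan2 form
    return 2 * radius_m * np.arcsin(np.sqrt(a))


# Sanity check: one degree of longitude along the equator is about 111.19 km
# for this radius
print(haversine_m(0.0, 0.0, np.array([0.0]), np.array([1.0])))
```

In the notebook this could be called as `haversine_m(lat1, lon1, Q_venues['Venue Latitude'], Q_venues['Venue Longitude'])`, which returns one distance per row.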

In [26]:
print(dist_df.shape)
dist_df

(1610, 2)


Unnamed: 0,Venue,Distance
0,Neptune Diner,7181
1,Sugar Freak,7021
2,New York City Bagel & Coffee House,7309
3,Brooklyn Bagel & Coffee Co.,7096
4,George's Deli,6980
5,Zyara Restaurant,6777
6,Butcher Bar,7039
7,The Grand,7056
8,Koroni Souvlaki & Grill,7485
9,Queens Comfort,7082


In [27]:
Q_venues_d = pd.concat([Q_venues, dist_df['Distance']], axis = 1)
Q_venues_d.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue distance,Venue Category,Distance
0,Astoria,40.768509,-73.915654,Neptune Diner,40.770788,-73.91689,274,Diner,7181
1,Astoria,40.768509,-73.915654,Sugar Freak,40.764443,-73.916055,453,Cajun / Creole Restaurant,7021
2,Astoria,40.768509,-73.915654,New York City Bagel & Coffee House,40.765841,-73.919441,436,Bagel Shop,7309
3,Astoria,40.768509,-73.915654,Brooklyn Bagel & Coffee Co.,40.764732,-73.916944,434,Bagel Shop,7096
4,Astoria,40.768509,-73.915654,George's Deli,40.766316,-73.915284,246,Deli / Bodega,6980


In [28]:
Q_venues_d.drop(['Venue distance'], axis = 1, inplace = True)

In [29]:
Q_venues_d

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Distance
0,Astoria,40.768509,-73.915654,Neptune Diner,40.770788,-73.91689,Diner,7181
1,Astoria,40.768509,-73.915654,Sugar Freak,40.764443,-73.916055,Cajun / Creole Restaurant,7021
2,Astoria,40.768509,-73.915654,New York City Bagel & Coffee House,40.765841,-73.919441,Bagel Shop,7309
3,Astoria,40.768509,-73.915654,Brooklyn Bagel & Coffee Co.,40.764732,-73.916944,Bagel Shop,7096
4,Astoria,40.768509,-73.915654,George's Deli,40.766316,-73.915284,Deli / Bodega,6980
5,Astoria,40.768509,-73.915654,Zyara Restaurant,40.766591,-73.912713,Restaurant,6777
6,Astoria,40.768509,-73.915654,Butcher Bar,40.764437,-73.916278,BBQ Joint,7039
7,Astoria,40.768509,-73.915654,The Grand,40.764649,-73.91646,Mediterranean Restaurant,7056
8,Astoria,40.768509,-73.915654,Koroni Souvlaki & Grill,40.768474,-73.921187,Souvlaki Shop,7485
9,Astoria,40.768509,-73.915654,Queens Comfort,40.764648,-73.916775,Comfort Food Restaurant,7082


## Get the average distance of all venues in the neighborhood

In [99]:
Q_distance_M = Q_venues_d.groupby('Neighborhood').mean(numeric_only=True)
Q_distance_M

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Distance
Arverne,40.589144,-73.791992,40.590915,-73.796818,18151.5
Astoria,40.768509,-73.915654,40.766635,-73.915896,7036.640449
Astoria Heights,40.770317,-73.89468,40.76904,-73.894469,5363.222222
Auburndale,40.76173,-73.791762,40.759913,-73.791757,3112.3
Bay Terrace,40.782843,-73.776802,40.779842,-73.776648,4883.272727
Bayside,40.766041,-73.774274,40.765085,-73.771904,4756.677966
Beechhurst,40.792781,-73.804365,40.79319,-73.807187,4143.714286
Bellaire,40.733014,-73.738892,40.731272,-73.737397,8053.25
Belle Harbor,40.576156,-73.854018,40.578453,-73.84943,19352.0
Bellerose,40.728573,-73.720128,40.725507,-73.720191,9576.8125
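
Only the `Distance` column is needed from this table later on, and a mean over very few venues (Beechhurst has 7, for example) is less stable than one over many. A possible refinement, sketched here on made-up rows, is to aggregate just that column and keep the venue count next to the mean.

```python
import pandas as pd

# Toy stand-in for Q_venues_d (made-up rows)
toy = pd.DataFrame({
    "Neighborhood": ["A", "A", "B"],
    "Venue": ["x", "y", "z"],
    "Distance": [7000, 7200, 3100],
})

# Aggregate only the Distance column; keep the venue count next to the mean
summary = toy.groupby("Neighborhood")["Distance"].agg(["mean", "count"])
print(summary)
```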


## Transform the dataframe with one-hot encoding

In [31]:
# one hot encoding
Q_onehot = pd.get_dummies(Q_venues_d[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Q_onehot['Neighborhood'] = Q_venues_d['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Q_onehot.columns[-1]] + list(Q_onehot.columns[:-1])
Q_onehot = Q_onehot[fixed_columns]

Q_onehot.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bistro,Brazilian Restaurant,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Cafeteria,Café,Cajun / Creole Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Colombian Restaurant,Comfort Food Restaurant,Cuban Restaurant,Deli / Bodega,Dim Sum Restaurant,Diner,Donut Shop,Dosa Place,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Empanada Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Food,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Gastropub,German Restaurant,Gluten-free Restaurant,Greek Restaurant,Halal Restaurant,Himalayan Restaurant,Hot Dog Joint,Hotpot Restaurant,Hunan Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Korean Restaurant,Kosher Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Moroccan Restaurant,New American Restaurant,Noodle House,North Indian Restaurant,Persian Restaurant,Peruvian Restaurant,Pizza Place,Poke Place,Polish Restaurant,Ramen Restaurant,Restaurant,Romanian Restaurant,Salad Place,Sandwich Place,Seafood Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Snack Place,South American Restaurant,Southern / Soul Food Restaurant,Souvlaki Shop,Spanish Restaurant,Steakhouse,Sushi Restaurant,Szechuan Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Thai Restaurant,Tibetan Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wings Joint
0,Astoria,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Astoria,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Astoria,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Astoria,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Astoria,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [32]:
Q_grouped = Q_onehot.groupby('Neighborhood').mean().reset_index()
Q_grouped.sort_values(by='Neighborhood',inplace = True)
Q_grouped

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bistro,Brazilian Restaurant,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Cafeteria,Café,Cajun / Creole Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Colombian Restaurant,Comfort Food Restaurant,Cuban Restaurant,Deli / Bodega,Dim Sum Restaurant,Diner,Donut Shop,Dosa Place,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Empanada Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Food,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Gastropub,German Restaurant,Gluten-free Restaurant,Greek Restaurant,Halal Restaurant,Himalayan Restaurant,Hot Dog Joint,Hotpot Restaurant,Hunan Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Korean Restaurant,Kosher Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Moroccan Restaurant,New American Restaurant,Noodle House,North Indian Restaurant,Persian Restaurant,Peruvian Restaurant,Pizza Place,Poke Place,Polish Restaurant,Ramen Restaurant,Restaurant,Romanian Restaurant,Salad Place,Sandwich Place,Seafood Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Snack Place,South American Restaurant,Southern / Soul Food Restaurant,Souvlaki Shop,Spanish Restaurant,Steakhouse,Sushi Restaurant,Szechuan Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Thai Restaurant,Tibetan Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wings Joint
0,Arverne,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.125,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0
1,Astoria,0.0,0.011236,0.0,0.0,0.0,0.011236,0.022472,0.067416,0.0,0.011236,0.011236,0.0,0.011236,0.0,0.0,0.044944,0.011236,0.0,0.0,0.033708,0.0,0.011236,0.0,0.078652,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.011236,0.0,0.0,0.011236,0.0,0.0,0.033708,0.0,0.0,0.0,0.0,0.0,0.067416,0.011236,0.0,0.0,0.0,0.0,0.067416,0.0,0.011236,0.022472,0.0,0.022472,0.011236,0.0,0.011236,0.0,0.044944,0.022472,0.101124,0.011236,0.011236,0.0,0.0,0.0,0.0,0.05618,0.011236,0.0,0.0,0.011236,0.0,0.0,0.011236,0.044944,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.0
2,Astoria Heights,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Auburndale,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.3,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Bay Terrace,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Bayside,0.0,0.050847,0.0,0.0,0.016949,0.0,0.016949,0.016949,0.016949,0.0,0.0,0.0,0.033898,0.0,0.0,0.033898,0.016949,0.0,0.0,0.067797,0.0,0.0,0.016949,0.016949,0.0,0.0,0.033898,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.016949,0.016949,0.016949,0.0,0.0,0.050847,0.0,0.0,0.0,0.0,0.0,0.067797,0.0,0.0,0.033898,0.0,0.0,0.033898,0.0,0.016949,0.0,0.033898,0.050847,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.084746,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.016949,0.0,0.016949,0.0,0.016949,0.0,0.0,0.016949,0.033898,0.050847,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.016949,0.0
6,Beechhurst,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.142857,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Bellaire,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.375,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Belle Harbor,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.25,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Bellerose,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.25,0.0,0.0625,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1875,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
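
The grouped table above can be read as the share of each category among a neighborhood's food venues, because the mean of 0/1 indicator columns is a proportion. The following is a minimal sketch of that idea on a hypothetical toy table (the `venues` frame and its values are made up for illustration; the notebook itself works on `Q_venues_d`):

```python
import pandas as pd

# Hypothetical toy venues table; the real notebook uses Q_venues_d.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Bakery", "Bakery", "Diner", "Pizza Place", "Diner", "Diner"],
})

# One indicator column per category, one row per venue.
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

# The mean of 0/1 indicators is the share of venues in each category,
# so every row of the result sums to 1.
shares = onehot.groupby("Neighborhood").mean()
print(shares)
```

Read this way, a value such as 0.125 in the Arverne row is consistent with one venue out of eight falling into that category.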


In [33]:
Q_mergeD = Q_grouped.merge(Q_distance_M, on = 'Neighborhood')

In [37]:
# keep only the category shares and the distance; the coordinates are not clustering features
Q_mergeD.drop(columns=['Neighborhood Latitude', 'Neighborhood Longitude', 'Venue Latitude', 'Venue Longitude'], inplace = True)

In [38]:
Q_mergeD

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bistro,Brazilian Restaurant,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Cafeteria,Café,Cajun / Creole Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Colombian Restaurant,Comfort Food Restaurant,Cuban Restaurant,Deli / Bodega,Dim Sum Restaurant,Diner,Donut Shop,Dosa Place,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Empanada Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Food,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Gastropub,German Restaurant,Gluten-free Restaurant,Greek Restaurant,Halal Restaurant,Himalayan Restaurant,Hot Dog Joint,Hotpot Restaurant,Hunan Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Korean Restaurant,Kosher Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Moroccan Restaurant,New American Restaurant,Noodle House,North Indian Restaurant,Persian Restaurant,Peruvian Restaurant,Pizza Place,Poke Place,Polish Restaurant,Ramen Restaurant,Restaurant,Romanian Restaurant,Salad Place,Sandwich Place,Seafood Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Snack Place,South American Restaurant,Southern / Soul Food Restaurant,Souvlaki Shop,Spanish Restaurant,Steakhouse,Sushi Restaurant,Szechuan Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Thai Restaurant,Tibetan Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wings Joint,Distance
0,Arverne,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.125,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,18151.5
1,Astoria,0.0,0.011236,0.0,0.0,0.0,0.011236,0.022472,0.067416,0.0,0.011236,0.011236,0.0,0.011236,0.0,0.0,0.044944,0.011236,0.0,0.0,0.033708,0.0,0.011236,0.0,0.078652,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.011236,0.0,0.0,0.011236,0.0,0.0,0.033708,0.0,0.0,0.0,0.0,0.0,0.067416,0.011236,0.0,0.0,0.0,0.0,0.067416,0.0,0.011236,0.022472,0.0,0.022472,0.011236,0.0,0.011236,0.0,0.044944,0.022472,0.101124,0.011236,0.011236,0.0,0.0,0.0,0.0,0.05618,0.011236,0.0,0.0,0.011236,0.0,0.0,0.011236,0.044944,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.0,7036.640449
2,Astoria Heights,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5363.222222
3,Auburndale,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.3,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3112.3
4,Bay Terrace,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4883.272727
5,Bayside,0.0,0.050847,0.0,0.0,0.016949,0.0,0.016949,0.016949,0.016949,0.0,0.0,0.0,0.033898,0.0,0.0,0.033898,0.016949,0.0,0.0,0.067797,0.0,0.0,0.016949,0.016949,0.0,0.0,0.033898,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.016949,0.016949,0.016949,0.0,0.0,0.050847,0.0,0.0,0.0,0.0,0.0,0.067797,0.0,0.0,0.033898,0.0,0.0,0.033898,0.0,0.016949,0.0,0.033898,0.050847,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.084746,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.016949,0.0,0.016949,0.0,0.016949,0.0,0.0,0.016949,0.033898,0.050847,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.016949,0.0,4756.677966
6,Beechhurst,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.142857,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4143.714286
7,Bellaire,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.375,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,8053.25
8,Belle Harbor,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.25,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,19352.0
9,Bellerose,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.25,0.0,0.0625,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1875,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,9576.8125


In [43]:
from sklearn.cluster import KMeans, DBSCAN
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler

In [98]:
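# feature matrix: every column except 'Neighborhood' (category shares plus Distance)
X = Q_mergeD.values[:,1:]
kclusters = 10
# NOTE: clus_Kdata holds the standardized features, but the model below is fit on the
# raw X, so the Distance column (values in the thousands) dominates this clustering
clus_Kdata = StandardScaler().fit_transform(X)
k_means = KMeans(init = "k-means++", n_clusters = kclusters, n_init = 100)
k_means.fit(X)
K_labels = k_means.labels_
K_labels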

array([2, 1, 6, 8, 3, 3, 3, 7, 9, 4, 7, 6, 2, 5, 4, 8, 8, 1, 8, 2, 3, 2,
       4, 0, 3, 3, 3, 4, 1, 2, 3, 1, 1, 5, 4, 3, 6, 6, 6, 6, 3, 5, 3, 4,
       1, 7, 8, 6, 6, 0, 8, 6, 7, 8, 7, 0, 7, 7, 3, 6, 1, 4, 2, 9, 5, 9,
       1, 4, 5, 7, 6, 1, 1, 3, 8, 1, 6], dtype=int32)
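
The cell above computes a standardized matrix (`clus_Kdata`) but fits K-means on the raw `X`, where the distance column is several orders of magnitude larger than the category shares. The sketch below illustrates, on synthetic data that is not taken from the notebook, how much that choice can matter. The group structure, the column values, and the helper `cluster` are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

# Synthetic data (not from the notebook): 40 "neighborhoods" in two groups
# that differ only in their category shares; distance is unrelated to the group.
group = np.arange(40) % 2
share_a = np.where(group == 0, 0.9, 0.1)
share_b = 1.0 - share_a
distance = 1000.0 + 200.0 * np.arange(40)  # ranges from 1000 to 8800
X = np.column_stack([share_a, share_b, distance])

def cluster(data):
    """Hypothetical helper: two-cluster K-means with a fixed seed."""
    return KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(data)

# Adjusted Rand index against the known groups: 1 means perfect recovery,
# values near 0 mean the clusters are unrelated to the groups.
ari_raw = adjusted_rand_score(group, cluster(X))
ari_scaled = adjusted_rand_score(group, cluster(StandardScaler().fit_transform(X)))
print(f"raw: {ari_raw:.2f}, scaled: {ari_scaled:.2f}")
```

On the raw features the split follows distance almost entirely, while on the standardized features the category composition drives the result. Which behavior is wanted depends on how much weight traffic convenience should carry in the analysis.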

In [100]:
# add the K-means labels (clustering with distance) as the first column
Q_distance_M.insert(0, 'Cluster Labels', K_labels)
Q_distance_M

Neighborhood,Cluster Labels,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Distance
Arverne,2,40.589144,-73.791992,40.590915,-73.796818,18151.5
Astoria,1,40.768509,-73.915654,40.766635,-73.915896,7036.640449
Astoria Heights,6,40.770317,-73.89468,40.76904,-73.894469,5363.222222
Auburndale,8,40.76173,-73.791762,40.759913,-73.791757,3112.3
Bay Terrace,3,40.782843,-73.776802,40.779842,-73.776648,4883.272727
Bayside,3,40.766041,-73.774274,40.765085,-73.771904,4756.677966
Beechhurst,3,40.792781,-73.804365,40.79319,-73.807187,4143.714286
Bellaire,7,40.733014,-73.738892,40.731272,-73.737397,8053.25
Belle Harbor,9,40.576156,-73.854018,40.578453,-73.84943,19352.0
Bellerose,4,40.728573,-73.720128,40.725507,-73.720191,9576.8125


In [125]:
# create map
from folium.features import DivIcon
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(Q_distance_M['Neighborhood Latitude'], Q_distance_M['Neighborhood Longitude'], Q_distance_M.index, Q_distance_M['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
    folium.Marker([lat, lon], icon = DivIcon(icon_size=(150,36),
        icon_anchor=(0,0),
        html=f"""<div style="font-family: courier new; color: black ">{"{:.0f}".format(cluster)}</div>""",
        )).add_to(map_clusters)
       
map_clusters

In [102]:
# second feature set: category shares only, without the distance column
Q_merge = Q_mergeD.drop(columns=['Distance'])

In [103]:
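# feature matrix without distance: every column except 'Neighborhood'
Y = Q_merge.values[:,1:]
# standardize the category shares and fit K-means on the scaled features
clus_KdataY = StandardScaler().fit_transform(Y)
k_means = KMeans(init = "k-means++", n_clusters = kclusters, n_init = 100)
k_means.fit(clus_KdataY)
K_labelsY = k_means.labels_
K_labelsY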

array([3, 8, 3, 3, 3, 3, 2, 3, 2, 2, 2, 2, 3, 2, 0, 3, 3, 2, 3, 0, 5, 0,
       2, 6, 3, 3, 2, 3, 2, 0, 3, 0, 0, 3, 3, 3, 3, 2, 0, 3, 3, 0, 3, 2,
       2, 3, 2, 3, 2, 3, 2, 7, 3, 3, 0, 3, 2, 3, 3, 0, 1, 0, 4, 3, 0, 2,
       2, 2, 0, 0, 3, 3, 9, 2, 2, 2, 3], dtype=int32)

In [104]:
Q_distance_M.insert(0, 'Cluster Labels woD', K_labelsY)
Q_distance_M

Neighborhood,Cluster Labels woD,Cluster Labels,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Distance
Arverne,3,2,40.589144,-73.791992,40.590915,-73.796818,18151.5
Astoria,8,1,40.768509,-73.915654,40.766635,-73.915896,7036.640449
Astoria Heights,3,6,40.770317,-73.89468,40.76904,-73.894469,5363.222222
Auburndale,3,8,40.76173,-73.791762,40.759913,-73.791757,3112.3
Bay Terrace,3,3,40.782843,-73.776802,40.779842,-73.776648,4883.272727
Bayside,3,3,40.766041,-73.774274,40.765085,-73.771904,4756.677966
Beechhurst,2,3,40.792781,-73.804365,40.79319,-73.807187,4143.714286
Bellaire,3,7,40.733014,-73.738892,40.731272,-73.737397,8053.25
Belle Harbor,2,9,40.576156,-73.854018,40.578453,-73.84943,19352.0
Bellerose,2,4,40.728573,-73.720128,40.725507,-73.720191,9576.8125
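
To see how far the two clusterings agree, the two label columns can be cross-tabulated. The sketch below uses only the ten rows displayed above; the frame name `labels` and the variable `agreement` are illustrative, and the full notebook table would be handled the same way.

```python
import pandas as pd

# Label pairs copied from the ten rows shown in the table above.
labels = pd.DataFrame(
    {
        "Cluster Labels woD": [3, 8, 3, 3, 3, 3, 2, 3, 2, 2],
        "Cluster Labels": [2, 1, 6, 8, 3, 3, 3, 7, 9, 4],
    },
    index=["Arverne", "Astoria", "Astoria Heights", "Auburndale", "Bay Terrace",
           "Bayside", "Beechhurst", "Bellaire", "Belle Harbor", "Bellerose"],
)

# Rows: clusters without distance; columns: clusters with distance.
# Each cell counts the neighborhoods that received that pair of labels.
agreement = pd.crosstab(labels["Cluster Labels woD"], labels["Cluster Labels"])
print(agreement)
```

Because K-means label numbers are arbitrary, matching numbers across the two columns carry no meaning by themselves. What matters is whether neighborhoods that share a cluster in one run also share a cluster in the other.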


In [124]:
# create map
map_clustersY = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(Q_distance_M['Neighborhood Latitude'], Q_distance_M['Neighborhood Longitude'], Q_distance_M.index, Q_distance_M['Cluster Labels woD']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clustersY)
    folium.Marker([lat, lon], icon = DivIcon(icon_size=(150,36),
        icon_anchor=(0,0),
        html=f"""<div style="font-family: courier new; color: black ">{"{:.0f}".format(cluster)}</div>""",
        )).add_to(map_clustersY)
       
map_clustersY

In [107]:
Q_mergeD.insert(0, 'Cluster Labels Distance', K_labels)

In [108]:
Q_mergeD.insert(0, 'Cluster Labels woD', K_labelsY)

In [110]:
Q_MD = Q_mergeD.set_index('Neighborhood')

### Group the neighborhoods by cluster labels (with distance)

In [114]:
Q_mergeD.groupby(['Cluster Labels Distance', 'Neighborhood']).mean()

Cluster Labels Distance,Neighborhood,Cluster Labels woD,Afghan Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bistro,Brazilian Restaurant,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Cafeteria,Café,Cajun / Creole Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Colombian Restaurant,Comfort Food Restaurant,Cuban Restaurant,Deli / Bodega,Dim Sum Restaurant,Diner,Donut Shop,Dosa Place,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Empanada Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Food,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Gastropub,German Restaurant,Gluten-free Restaurant,Greek Restaurant,Halal Restaurant,Himalayan Restaurant,Hot Dog Joint,Hotpot Restaurant,Hunan Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Korean Restaurant,Kosher Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Moroccan Restaurant,New American Restaurant,Noodle House,North Indian Restaurant,Persian Restaurant,Peruvian Restaurant,Pizza Place,Poke Place,Polish Restaurant,Ramen Restaurant,Restaurant,Romanian Restaurant,Salad Place,Sandwich Place,Seafood Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Snack Place,South American Restaurant,Southern / Soul Food Restaurant,Souvlaki Shop,Spanish Restaurant,Steakhouse,Sushi Restaurant,Szechuan Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Thai Restaurant,Tibetan Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wings Joint,Distance
0,Flushing,6,0.0,0.0,0.0,0.0,0.044776,0.029851,0.0,0.089552,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029851,0.0,0.014925,0.0,0.134328,0.0,0.0,0.0,0.029851,0.029851,0.0,0.014925,0.0,0.014925,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.029851,0.0,0.0,0.029851,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.089552,0.014925,0.0,0.0,0.0,0.014925,0.014925,0.0,0.164179,0.0,0.0,0.014925,0.0,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.0,0.0,0.029851,0.0,0.014925,0.0,0.0,0.0,0.0,0.0,0.0,0.044776,0.044776,0.0,0.014925,0.0,0.0,0.0,0.0,0.0,0.0,0.0,479.940299
0,Murray Hill,3,0.0,0.0,0.0,0.0,0.025,0.025,0.0,0.025,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.7,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1519.8
0,Queensboro Hill,3,0.0,0.0,0.0,0.0,0.166667,0.0,0.083333,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.083333,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1528.166667
1,Astoria,8,0.0,0.011236,0.0,0.0,0.0,0.011236,0.022472,0.067416,0.0,0.011236,0.011236,0.0,0.011236,0.0,0.0,0.044944,0.011236,0.0,0.0,0.033708,0.0,0.011236,0.0,0.078652,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.011236,0.0,0.0,0.011236,0.0,0.0,0.033708,0.0,0.0,0.0,0.0,0.0,0.067416,0.011236,0.0,0.0,0.0,0.0,0.067416,0.0,0.011236,0.022472,0.0,0.022472,0.011236,0.0,0.011236,0.0,0.044944,0.022472,0.101124,0.011236,0.011236,0.0,0.0,0.0,0.0,0.05618,0.011236,0.0,0.0,0.011236,0.0,0.0,0.011236,0.044944,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.0,7036.640449
1,Douglaston,2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.130435,0.0,0.0,0.0,0.173913,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,7180.826087
1,Glendale,2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.428571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.428571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6716.285714
1,Hollis,0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7397.2
1,Holliswood,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6864.0
1,Little Neck,2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.119048,0.0,0.0,0.0,0.071429,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.02381,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.047619,0.095238,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.02381,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.02381,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.02381,0.02381,7657.285714
1,Ridgewood,1,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.238095,0.0,0.02381,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.071429,0.0,0.02381,0.02381,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.095238,0.0,0.02381,0.0,0.047619,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.02381,0.0,0.02381,0.02381,0.0,0.0,7847.595238


### Group the neighborhoods by cluster labels (without distance)
This table shows the average category composition of each cluster. A value of zero means that no venue of that category was found in any neighborhood of the cluster, which suggests that a new restaurant or food shop of that type would have no direct competitor there.
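
As a rough sketch of how the zero entries could be pulled out programmatically, the example below lists, for each cluster, the categories whose mean share is exactly zero. The `cluster_means` frame is hypothetical and only mimics the shape of the table produced by the next cell.

```python
import pandas as pd

# Hypothetical cluster-level means; the column names mimic the notebook's table.
cluster_means = pd.DataFrame(
    {
        "Bakery": [0.05, 0.00, 0.07],
        "Diner": [0.00, 0.00, 0.02],
        "Pizza Place": [0.09, 0.10, 0.00],
    },
    index=pd.Index([0, 1, 2], name="Cluster Labels woD"),
)

# For every cluster, collect the categories with a mean share of exactly zero,
# i.e. categories with no venue in any neighborhood of that cluster.
absent = {
    cluster: row.index[row == 0].tolist()
    for cluster, row in cluster_means.iterrows()
}
print(absent)
```

A zero here only reflects the venues returned by the data source, so it is best treated as a hint about missing competitors rather than as proof.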

In [116]:
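# average only the numeric columns; 'Neighborhood' holds strings and cannot be averaged
Q_mergeD.groupby(['Cluster Labels woD']).mean(numeric_only=True)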

Cluster Labels woD,Cluster Labels Distance,Afghan Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bistro,Brazilian Restaurant,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Cafeteria,Café,Cajun / Creole Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Colombian Restaurant,Comfort Food Restaurant,Cuban Restaurant,Deli / Bodega,Dim Sum Restaurant,Diner,Donut Shop,Dosa Place,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Empanada Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Food,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Gastropub,German Restaurant,Gluten-free Restaurant,Greek Restaurant,Halal Restaurant,Himalayan Restaurant,Hot Dog Joint,Hotpot Restaurant,Hunan Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Korean Restaurant,Kosher Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Moroccan Restaurant,New American Restaurant,Noodle House,North Indian Restaurant,Persian Restaurant,Peruvian Restaurant,Pizza Place,Poke Place,Polish Restaurant,Ramen Restaurant,Restaurant,Romanian Restaurant,Salad Place,Sandwich Place,Seafood Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Snack Place,South American Restaurant,Southern / Soul Food Restaurant,Souvlaki Shop,Spanish Restaurant,Steakhouse,Sushi Restaurant,Szechuan Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Thai Restaurant,Tibetan Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wings Joint,Distance
0,4.071429,0.003759,0.0,0.0,0.0,0.039708,0.0,0.0,0.04622,0.0,0.0,0.015558,0.0,0.0,0.0,0.0,0.008861,0.0,0.0,0.144686,0.094288,0.0,0.002976,0.0,0.080653,0.0,0.026974,0.077478,0.0,0.0,0.0,0.0,0.0,0.0,0.053481,0.005952,0.0,0.025422,0.0,0.0,0.014286,0.0,0.09474,0.0,0.0,0.0,0.0,0.004926,0.0,0.0,0.0,0.0,0.011149,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019424,0.0,0.0,0.017344,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.046887,0.0,0.0,0.0,0.018336,0.0,0.0,0.084199,0.027885,0.0,0.0,0.0,0.0,0.024827,0.0,0.003759,0.0,0.0,0.0,0.003759,0.0,0.0,0.002463,0.0,0.0,0.0,0.0,0.0,11129.669037
1,1.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.238095,0.0,0.02381,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.071429,0.0,0.02381,0.02381,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.095238,0.0,0.02381,0.0,0.047619,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.02381,0.0,0.02381,0.02381,0.0,0.0,7847.595238
2,4.73913,0.003953,0.002717,0.00207,0.0,0.0,0.0,0.016201,0.051295,0.0,0.0,0.0,0.0,0.0,0.002899,0.003344,0.0,0.00189,0.00207,0.008627,0.103322,0.0,0.0,0.0,0.262936,0.0,0.007506,0.041647,0.006211,0.0,0.0,0.0,0.0,0.0,0.02557,0.0,0.0,0.013767,0.0,0.0,0.01413,0.001035,0.003106,0.0,0.0,0.0,0.002926,0.0,0.0,0.0,0.0,0.0,0.079434,0.0,0.014493,0.027391,0.0,0.013251,0.011874,0.0,0.004486,0.0,0.0,0.014406,0.001035,0.0,0.0,0.0,0.0,0.0,0.001035,0.086486,0.0,0.0,0.0,0.005415,0.0,0.006211,0.035041,0.002717,0.001035,0.003961,0.0,0.009267,0.0,0.0,0.01608,0.0,0.01452,0.0,0.0,0.0,0.0,0.010231,0.0,0.0,0.062338,0.001035,0.001035,8039.651667
3,4.787879,0.000797,0.026055,0.000439,0.000497,0.01469,0.004676,0.023654,0.049061,0.001957,0.000797,0.010239,0.001305,0.012668,0.004461,0.000705,0.036764,0.00101,0.002525,0.00392,0.083781,0.000497,0.0,0.001684,0.07827,0.001894,0.020643,0.049097,0.0,0.003214,0.004444,0.0,0.002437,0.002755,0.019398,0.001814,0.000758,0.016582,0.004058,0.0,0.033017,0.002564,0.009462,0.001626,0.0,0.002755,0.010051,0.010288,0.000439,0.0,0.0,0.0,0.009386,0.0,0.001894,0.048152,0.0,0.020128,0.025974,0.000631,0.023528,0.0,0.008002,0.027358,0.006795,0.0,0.003278,0.007464,0.001166,0.0,0.006833,0.084732,0.0,0.001595,0.003066,0.021257,0.000705,0.002755,0.044117,0.006371,0.0,0.001145,0.008562,0.011255,0.0,0.0,0.005426,0.010512,0.025387,0.000673,0.003972,0.0,0.002996,0.015508,0.000439,0.0,0.0,0.001187,0.0,6621.43232
4,2.0,0.0,0.0,0.066667,0.0,0.0,0.066667,0.066667,0.0,0.0,0.033333,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.066667,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,18752.666667
5,3.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.16,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.08,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.28,0.0,0.0,0.0,0.08,0.0,4539.04
6,0.0,0.0,0.0,0.0,0.0,0.044776,0.029851,0.0,0.089552,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029851,0.0,0.014925,0.0,0.134328,0.0,0.0,0.0,0.029851,0.029851,0.0,0.014925,0.0,0.014925,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.029851,0.0,0.0,0.029851,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.089552,0.014925,0.0,0.0,0.0,0.014925,0.014925,0.0,0.164179,0.0,0.0,0.014925,0.0,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.0,0.0,0.029851,0.0,0.014925,0.0,0.0,0.0,0.0,0.0,0.0,0.044776,0.044776,0.0,0.014925,0.0,0.0,0.0,0.0,0.0,0.0,0.0,479.940299
7,6.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.178571,0.0,0.0,0.0,0.071429,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.035714,0.035714,6063.928571
8,1.0,0.0,0.011236,0.0,0.0,0.0,0.011236,0.022472,0.067416,0.0,0.011236,0.011236,0.0,0.011236,0.0,0.0,0.044944,0.011236,0.0,0.0,0.033708,0.0,0.011236,0.0,0.078652,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.011236,0.0,0.0,0.011236,0.0,0.0,0.033708,0.0,0.0,0.0,0.0,0.0,0.067416,0.011236,0.0,0.0,0.0,0.0,0.067416,0.0,0.011236,0.022472,0.0,0.022472,0.011236,0.0,0.011236,0.0,0.044944,0.022472,0.101124,0.011236,0.011236,0.0,0.0,0.0,0.0,0.05618,0.011236,0.0,0.0,0.011236,0.0,0.0,0.011236,0.044944,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.0,7036.640449
9,1.0,0.0,0.050633,0.0,0.0,0.025316,0.0,0.0,0.075949,0.012658,0.0,0.0,0.0,0.012658,0.0,0.0,0.025316,0.0,0.0,0.0,0.037975,0.0,0.0,0.0,0.075949,0.0,0.012658,0.037975,0.0,0.0,0.0,0.0,0.0,0.0,0.037975,0.025316,0.0,0.0,0.0,0.0,0.063291,0.025316,0.0,0.012658,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.025316,0.037975,0.0,0.012658,0.0,0.0,0.037975,0.012658,0.0,0.0,0.0,0.0,0.0,0.025316,0.075949,0.0,0.0,0.012658,0.0,0.012658,0.0,0.037975,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.037975,0.0,0.0,0.0,0.0,0.050633,0.0,0.037975,0.0,0.0,0.0,7320.556962


### Results and Discussion
With K-means clustering, we get two clustering results. The first clusters neighborhoods by distance, which represents traffic convenience. The second clusters them by the categories of their restaurants and food shops. With these two results, anyone who wants to open a restaurant or food shop can decide which kind of restaurant is likely to do well and which location offers better traffic convenience or a larger crowd. <br>
For example, someone who wants to open a Korean restaurant can choose cluster 0, 4, or 5 from the table grouped by labels without distance data, because these clusters contain no competing Korean restaurant. Then, since the distance to Flushing Main Street is similar in cluster 0 and in some of cluster 5, they can choose one of those neighborhoods for the restaurant.
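
The selection described in the Korean restaurant example can be sketched in pandas. This is only an illustration: the table below is a toy stand-in with made-up values, and the column names `Korean Restaurant`, `Pizza Place`, and `Distance`, as well as the helper `candidate_clusters`, are assumptions rather than names taken from the notebook.

```python
import pandas as pd

# Toy stand-in for the table grouped by cluster label.
# All values and column names here are made up for illustration.
grouped = pd.DataFrame(
    {
        "Korean Restaurant": [0.0, 0.05, 0.0, 0.12],
        "Pizza Place": [0.10, 0.02, 0.08, 0.0],
        "Distance": [3000.0, 7000.0, 1000.0, 500.0],  # mean distance in meters
    },
    index=pd.Index([0, 1, 2, 3], name="Cluster Labels"),
)


def candidate_clusters(table, category, distance_col="Distance"):
    """Return labels of clusters with no venue of `category`, nearest first."""
    # Keep only clusters where the mean frequency of the category is zero.
    no_competitor = table[table[category] == 0]
    # Order the remaining clusters by their mean distance to the reference point.
    return no_competitor.sort_values(distance_col).index.tolist()


candidates = candidate_clusters(grouped, "Korean Restaurant")
print(candidates)
```

A mean frequency of exactly zero only means that no venue of that category was returned for the cluster, so the result is best read as a shortlist to inspect rather than a final answer.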

### Conclusion
Using the GeoJSON data and the Foursquare API, we obtain the venue categories and distance information, and with a clustering algorithm in Python we group neighborhoods that have similar restaurant categories or similar distances. Depending on the kind of restaurant or food shop someone wants to open, the table grouped by category cluster labels shows which clusters have no competitor in that category. Then, using the clusters grouped by distance, we can choose the closest neighborhood. With the analysis method in this project, we can determine which neighborhood is appropriate for opening a new restaurant or food shop.
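
The distance information that the analysis relies on can be computed from latitude and longitude alone. The sketch below uses the haversine formula, which is an assumption here since the text does not say which formula the notebook uses. The function name `haversine_m`, the reference coordinates (an approximate location for Flushing Main Street), and the sample venue are all hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# Approximate reference point, assumed for illustration only.
FLUSHING_MAIN_ST = (40.7596, -73.8300)

# Hypothetical venue coordinates, not taken from the data set.
venue = (40.7282, -73.7949)
distance = haversine_m(venue[0], venue[1], *FLUSHING_MAIN_ST)
print(f"{distance:.0f} m")
```

Straight-line distance is only a proxy for traffic convenience, because it ignores the street network and transit lines.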