# Toronto & New York

### The printing shop specializes in printing flyers, business cards, and T-shirts. The company currently operates several outlets in New York and plans to open a new store in Toronto. We therefore analyze both New York and Toronto to understand how similar they are and to find the best area for the new store.

# Toronto DATA:

In [1]:
import pandas as pd
import numpy as np
import requests
import json

In [2]:
# read data
wiki_url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
wiki_page = requests.get(wiki_url)
wiki_raw = pd.read_html(wiki_page.content, header = 0)[0]

df = wiki_raw[wiki_raw.Borough != 'Not assigned']
df = df.reset_index()
df = df.set_index('Postal Code')
df.head()

Postal Code,index,Borough,Neighbourhood
M3A,2,North York,Parkwoods
M4A,3,North York,Victoria Village
M5A,4,Downtown Toronto,"Regent Park, Harbourfront"
M6A,5,North York,"Lawrence Manor, Lawrence Heights"
M7A,6,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


In [3]:
geo_url = 'http://cocl.us/Geospatial_data'
df_geo = pd.read_csv(geo_url)
df_geo = df_geo.set_index('Postal Code')
df_geo.head()

Postal Code,Latitude,Longitude
M1B,43.806686,-79.194353
M1C,43.784535,-79.160497
M1E,43.763573,-79.188711
M1G,43.770992,-79.216917
M1H,43.773136,-79.239476


In [4]:
# data cleaning
big_data = pd.merge(df, df_geo, left_index=True, right_index=True)
big_data = big_data.reset_index()
big_data = big_data.drop(['index'], axis = 1)
big_data # all Toronto boroughs

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.654260,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
...,...,...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.653654,-79.506944
99,M4Y,Downtown Toronto,Church and Wellesley,43.665860,-79.383160
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.662744,-79.321558
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu...",43.636258,-79.498509
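
The merge above joins the two tables on their shared `Postal Code` index. As a minimal sketch of what that inner join does, the toy frames below (illustrative values, not the real data) show that only postal codes present in both tables survive, and how the dropped codes could be listed. The names `left`, `right`, and `unmatched` are assumptions made for this example.

```python
import pandas as pd

# Toy stand-ins for the borough table and the coordinate table (illustrative values).
left = pd.DataFrame(
    {"Borough": ["North York", "Downtown Toronto"]},
    index=pd.Index(["M3A", "M5A"], name="Postal Code"),
)
right = pd.DataFrame(
    {"Latitude": [43.75, 43.65, 43.80], "Longitude": [-79.33, -79.36, -79.19]},
    index=pd.Index(["M3A", "M5A", "M1B"], name="Postal Code"),
)

# Inner join on the index: only postal codes found in both tables are kept.
merged = pd.merge(left, right, left_index=True, right_index=True)

# Postal codes that have coordinates but no borough row (dropped by the join).
unmatched = right.index.difference(left.index)

print(merged)
print(list(unmatched))
```

Checking `unmatched` after a merge like this is a cheap way to confirm that no neighbourhood was lost silently.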


In [5]:
pd.get_dummies(big_data['Borough']).sum()

Central Toronto      9
Downtown Toronto    19
East Toronto         5
East York            5
Etobicoke           12
Mississauga          1
North York          24
Scarborough         17
West Toronto         6
York                 5
dtype: int64
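
Summing the one-hot columns counts how many postal codes fall in each borough. A more direct way to get the same counts is `value_counts`, sketched here on a small hypothetical series; the borough list is made up for the example.

```python
import pandas as pd

# Hypothetical borough column with a repeated value.
boroughs = pd.Series(["North York", "Etobicoke", "North York", "York"], name="Borough")

# Approach used in the notebook: one-hot encode, then sum each column.
via_dummies = pd.get_dummies(boroughs).sum()

# Equivalent count, sorted by label so both results line up.
via_counts = boroughs.value_counts().sort_index()

print(via_dummies)
print(via_counts)
```

Both results list the boroughs alphabetically with the same totals.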

In [6]:
big_data = big_data[big_data['Borough'] == 'Downtown Toronto'].reset_index(drop=True)
big_data # Downtown Toronto only

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
4,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
5,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383
6,M6G,Downtown Toronto,Christie,43.669542,-79.422564
7,M5H,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568
8,M5J,Downtown Toronto,"Harbourfront East, Union Station, Toronto Islands",43.640816,-79.381752
9,M5K,Downtown Toronto,"Toronto Dominion Centre, Design Exchange",43.647177,-79.381576


# New York DATA:

In [7]:
!wget -q -O 'newyork_data.json' https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/labs/newyork_data.json
print('Data downloaded!')
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)

neighborhoods_data = newyork_data['features'] # all data is in the 'features' key, which is a list of neighborhoods
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude']  # define the dataframe columns

# collect one dict per neighborhood; DataFrame.append was removed in pandas 2.0
rows = []
for data in neighborhoods_data:
    neighborhood_latlon = data['geometry']['coordinates']  # GeoJSON order: [longitude, latitude]
    rows.append({'Borough': data['properties']['borough'],
                 'Neighborhood': data['properties']['name'],
                 'Latitude': neighborhood_latlon[1],
                 'Longitude': neighborhood_latlon[0]})

neighborhoods = pd.DataFrame(rows, columns=column_names) # build the dataframe in one step


print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]))
neighborhoods.head() # all NY DATA

manhattan_data = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data.head() # Manhattan only DATA

Data downloaded!
The dataframe has 5 boroughs and 306 neighborhoods.


Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688
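
As an alternative to walking the `features` list by hand, `pd.json_normalize` can flatten the nested records first. This is only a sketch on three illustrative records shaped like the entries of `newyork_data['features']`; the third record is hypothetical and exists only so the Manhattan filter has something to exclude.

```python
import pandas as pd

# Illustrative records; the last one is a made-up non-Manhattan entry.
features = [
    {"properties": {"borough": "Manhattan", "name": "Marble Hill"},
     "geometry": {"coordinates": [-73.91066, 40.876551]}},
    {"properties": {"borough": "Manhattan", "name": "Chinatown"},
     "geometry": {"coordinates": [-73.994279, 40.715618]}},
    {"properties": {"borough": "Example Borough", "name": "Example Neighborhood"},
     "geometry": {"coordinates": [-74.0, 40.7]}},
]

# Nested dicts become dotted column names; the coordinate lists stay as lists.
flat = pd.json_normalize(features)

# GeoJSON stores points as [longitude, latitude], so split in that order.
coords = pd.DataFrame(flat["geometry.coordinates"].tolist(),
                      columns=["Longitude", "Latitude"])

neighborhoods = pd.DataFrame({
    "Borough": flat["properties.borough"],
    "Neighborhood": flat["properties.name"],
    "Latitude": coords["Latitude"],
    "Longitude": coords["Longitude"],
})

manhattan = neighborhoods[neighborhoods["Borough"] == "Manhattan"].reset_index(drop=True)
print(manhattan)
```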


In [9]:
big_data = big_data.rename(columns = {"Neighbourhood": "Neighborhood"})
all_data=pd.concat([manhattan_data,big_data],sort=False,axis=0)
all_data = all_data.drop(['Postal Code'], axis = 1)
print(all_data.shape)
all_data.head()

(59, 4)


Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


# PART 3. Put all Toronto and Manhattan neighborhoods on the map:

In [10]:
!pip install geopy
#print('Library installed.')



In [11]:
!pip install folium
#print('Library installed.')

Collecting folium
  Downloading folium-0.12.1-py2.py3-none-any.whl (94 kB)
Collecting branca>=0.3.0
  Downloading branca-0.4.2-py3-none-any.whl (24 kB)
Installing collected packages: branca, folium
Successfully installed branca-0.4.2 folium-0.12.1


In [12]:
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
import folium 
print('Libraries imported.')


Libraries imported.


In [14]:
address = 'Toronto'
geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
t_latitude = location.latitude
t_longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(t_latitude, t_longitude))

address = 'Manhattan, NY'
location = geolocator.geocode(address)  # reuse the geolocator created above
m_latitude = location.latitude
m_longitude = location.longitude
print('The geographical coordinates of Manhattan are {}, {}.'.format(m_latitude, m_longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.
The geographical coordinates of Manhattan are 40.7896239, -73.9598939.


In [15]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[t_latitude, t_longitude], zoom_start=11)
# create map of Manhattan using latitude and longitude values
map_manhattan = folium.Map(location=[m_latitude, m_longitude], zoom_start=12)


# add Downtown Toronto markers to map_toronto
for lat, lng, borough, neighborhood, postalcode in zip(big_data['Latitude'],big_data['Longitude'],big_data['Borough'],big_data['Neighborhood'],big_data['Postal Code']):
    label = '{}, {}, {}'.format(neighborhood, borough, postalcode)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
# add Manhattan markers to map_manhattan
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)

In [16]:
map_toronto

In [17]:
map_manhattan

# Foursquare:

In [18]:
import os  # credentials come from the environment (variable names are an assumed convention), not the notebook
CLIENT_ID = os.environ.get('FOURSQUARE_CLIENT_ID', '') # your Foursquare ID
CLIENT_SECRET = os.environ.get('FOURSQUARE_CLIENT_SECRET', '') # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

In [19]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
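
`getNearbyVenues` indexes straight into the JSON response, so a response without `groups`, or a venue without categories, would raise an error. The sketch below shows one way the parsing step could be made defensive. The helper name `extract_venues` and the sample payload are hypothetical; the payload only mimics the nesting that the function above relies on and is not a real API response.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style payload."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # Fall back to None when a venue carries no category.
        category = categories[0]["name"] if categories else None
        rows.append((venue["name"],
                     venue["location"]["lat"],
                     venue["location"]["lng"],
                     category))
    return rows


# Hand-written sample with one categorised and one uncategorised venue.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Cafe A",
               "location": {"lat": 43.65, "lng": -79.38},
               "categories": [{"name": "Coffee Shop"}]}},
    {"venue": {"name": "Venue B",
               "location": {"lat": 43.66, "lng": -79.39},
               "categories": []}},
]}]}}

print(extract_venues(sample))
print(extract_venues({}))  # an empty or failed response yields no rows
```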

In [20]:
toronto_venues = getNearbyVenues(names=big_data['Neighborhood'],
                                latitudes=big_data['Latitude'],
                                longitudes=big_data['Longitude'])

Regent Park, Harbourfront
Queen's Park, Ontario Provincial Government
Garden District, Ryerson
St. James Town
Berczy Park
Central Bay Street
Christie
Richmond, Adelaide, King
Harbourfront East, Union Station, Toronto Islands
Toronto Dominion Centre, Design Exchange
Commerce Court, Victoria Hotel
University of Toronto, Harbord
Kensington Market, Chinatown, Grange Park
CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport
Rosedale
Stn A PO Boxes
St. James Town, Cabbagetown
First Canadian Place, Underground city
Church and Wellesley


In [21]:
manhattan_venues = getNearbyVenues(names=manhattan_data['Neighborhood'],
                                   latitudes=manhattan_data['Latitude'],
                                   longitudes=manhattan_data['Longitude'])

Marble Hill
Chinatown
Washington Heights
Inwood
Hamilton Heights
Manhattanville
Central Harlem
East Harlem
Upper East Side
Yorkville
Lenox Hill
Roosevelt Island
Upper West Side
Lincoln Square
Clinton
Midtown
Murray Hill
Chelsea
Greenwich Village
East Village
Lower East Side
Tribeca
Little Italy
Soho
West Village
Manhattan Valley
Morningside Heights
Gramercy
Battery Park City
Financial District
Carnegie Hill
Noho
Civic Center
Midtown South
Sutton Place
Turtle Bay
Tudor City
Stuyvesant Town
Flatiron
Hudson Yards


# Analyze neighborhood

In [22]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()

In [23]:
# one hot encoding
manhattan_onehot = pd.get_dummies(manhattan_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
manhattan_onehot['Neighborhood'] = manhattan_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [manhattan_onehot.columns[-1]] + list(manhattan_onehot.columns[:-1])
manhattan_onehot = manhattan_onehot[fixed_columns]
manhattan_grouped = manhattan_onehot.groupby('Neighborhood').mean().reset_index()
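
The two cells above turn the venue list into one row per neighborhood, where each value is the share of that neighborhood's venues in a given category. A toy example with made-up venues makes the arithmetic visible.

```python
import pandas as pd

# Made-up venue list: neighborhood "A" has four venues, "B" has two.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Coffee Shop", "Coffee Shop", "Park", "Bakery", "Park", "Park"],
})

# One column per category, 1.0 where the venue belongs to it.
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

# The mean of a 0/1 column is the fraction of venues in that category.
grouped = onehot.groupby("Neighborhood").mean().reset_index()
print(grouped)
```

Each row of `grouped` sums to 1 across the category columns, which is what lets neighborhoods with different venue counts be compared.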

In [24]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
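
One behaviour of `return_most_common_venues` is worth noting: when a neighborhood has fewer distinct categories than `num_top_venues`, the remaining slots are filled with categories whose frequency is zero. The sketch below repeats the function on a toy row and adds a hypothetical variant, `top_nonzero_venues`, that keeps only categories that actually occur.

```python
import pandas as pd


def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]


def top_nonzero_venues(row, num_top_venues):
    # Hypothetical variant: drop zero-frequency categories before ranking.
    freqs = row.iloc[1:].astype(float)
    freqs = freqs[freqs > 0].sort_values(ascending=False)
    return list(freqs.index[:num_top_venues])


# Toy row: first entry is the neighborhood name, the rest are frequencies.
row = pd.Series({"Neighborhood": "A", "Bakery": 0.25, "Coffee Shop": 0.75, "Park": 0.0})

print(list(return_most_common_venues(row, 3)))  # includes the zero-frequency "Park"
print(top_nonzero_venues(row, 3))               # only categories that occur
```

Ties between equal frequencies are not ordered in any guaranteed way, so the tail of a top-10 list should be read with some caution.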

In [25]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_t_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_t_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_t_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_t_venues_sorted['Group'] = "Toronto"
print(neighborhoods_t_venues_sorted.shape)
neighborhoods_t_venues_sorted.head()

(19, 12)


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Group
0,Berczy Park,Coffee Shop,Cocktail Bar,Cheese Shop,Farmers Market,Seafood Restaurant,Beer Bar,Restaurant,Bakery,Breakfast Spot,Clothing Store,Toronto
1,"CN Tower, King and Spadina, Railway Lands, Har...",Airport Lounge,Airport Service,Airport Terminal,Coffee Shop,Bar,Plane,Rental Car Location,Boutique,Sculpture Garden,Boat or Ferry,Toronto
2,Central Bay Street,Coffee Shop,Sandwich Place,Café,Italian Restaurant,Burger Joint,Salad Place,Bubble Tea Shop,Portuguese Restaurant,Ramen Restaurant,Poke Place,Toronto
3,Christie,Grocery Store,Café,Park,Nightclub,Italian Restaurant,Restaurant,Candy Store,Baby Store,Coffee Shop,Dessert Shop,Toronto
4,Church and Wellesley,Coffee Shop,Sushi Restaurant,Japanese Restaurant,Restaurant,Gay Bar,Yoga Studio,Pub,Fast Food Restaurant,Burger Joint,Hotel,Toronto


In [26]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_m_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_m_venues_sorted['Neighborhood'] = manhattan_grouped['Neighborhood']

for ind in np.arange(manhattan_grouped.shape[0]):
    neighborhoods_m_venues_sorted.iloc[ind, 1:] = return_most_common_venues(manhattan_grouped.iloc[ind, :], num_top_venues)

neighborhoods_m_venues_sorted['Group'] = "Manhattan"
print(neighborhoods_m_venues_sorted.shape)
neighborhoods_m_venues_sorted.head()

(40, 12)


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Group
0,Battery Park City,Coffee Shop,Hotel,Park,Gym,Memorial Site,Clothing Store,Gourmet Shop,Beer Garden,Food Court,Shopping Mall,Manhattan
1,Carnegie Hill,Coffee Shop,Café,Yoga Studio,Gym,Cosmetics Shop,Pizza Place,Bookstore,French Restaurant,Wine Shop,Bakery,Manhattan
2,Central Harlem,African Restaurant,Chinese Restaurant,Cosmetics Shop,Seafood Restaurant,Gym / Fitness Center,Bar,Art Gallery,American Restaurant,French Restaurant,Library,Manhattan
3,Chelsea,Coffee Shop,Art Gallery,Bakery,American Restaurant,French Restaurant,Ice Cream Shop,Wine Shop,Seafood Restaurant,Market,Bookstore,Manhattan
4,Chinatown,Bakery,Chinese Restaurant,Cocktail Bar,Dessert Shop,Hotpot Restaurant,Spa,American Restaurant,Bubble Tea Shop,Salon / Barbershop,Ice Cream Shop,Manhattan
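
The column labels for the Toronto and Manhattan tables are built the same way in both cells. A small helper could generate them in one place. The function name `ordinal` is an assumption made for this sketch; unlike the `indicators` lookup, it also handles numbers such as 11 or 21 correctly.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the other teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


num_top_venues = 10
columns = ["Neighborhood"] + [
    f"{ordinal(i)} Most Common Venue" for i in range(1, num_top_venues + 1)
]
print(columns)
```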


In [27]:
neighborhoods_all_venues_sorted=pd.concat([neighborhoods_m_venues_sorted,neighborhoods_t_venues_sorted],sort=False,axis=0)
print(neighborhoods_all_venues_sorted.shape)
neighborhoods_all_venues_sorted.head()

(59, 12)


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Group
0,Battery Park City,Coffee Shop,Hotel,Park,Gym,Memorial Site,Clothing Store,Gourmet Shop,Beer Garden,Food Court,Shopping Mall,Manhattan
1,Carnegie Hill,Coffee Shop,Café,Yoga Studio,Gym,Cosmetics Shop,Pizza Place,Bookstore,French Restaurant,Wine Shop,Bakery,Manhattan
2,Central Harlem,African Restaurant,Chinese Restaurant,Cosmetics Shop,Seafood Restaurant,Gym / Fitness Center,Bar,Art Gallery,American Restaurant,French Restaurant,Library,Manhattan
3,Chelsea,Coffee Shop,Art Gallery,Bakery,American Restaurant,French Restaurant,Ice Cream Shop,Wine Shop,Seafood Restaurant,Market,Bookstore,Manhattan
4,Chinatown,Bakery,Chinese Restaurant,Cocktail Bar,Dessert Shop,Hotpot Restaurant,Spa,American Restaurant,Bubble Tea Shop,Salon / Barbershop,Ice Cream Shop,Manhattan


In [28]:
all_grouped=pd.concat([manhattan_grouped,toronto_grouped],sort=False,axis=0)
print(all_grouped.shape)
all_grouped=all_grouped.fillna(0)
all_grouped.head()

(59, 367)


Unnamed: 0,Neighborhood,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,American Restaurant,Antique Shop,Arepa Restaurant,Argentinian Restaurant,Art Gallery,...,Hospital,IT Services,Lake,Plane,Portuguese Restaurant,Poutine Place,Sculpture Garden,Smoothie Shop,Tanning Salon,Theme Restaurant
0,Battery Park City,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Carnegie Hill,0.0,0.0,0.0,0.0,0.011364,0.0,0.0,0.011364,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Central Harlem,0.0,0.0,0.0,0.066667,0.044444,0.0,0.0,0.0,0.044444,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Chelsea,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.05,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Chinatown,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
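
Manhattan and Toronto do not share the same set of venue categories, so concatenating the two grouped tables leaves gaps that `fillna(0)` then fills. The toy frames below (illustrative values) sketch that behaviour. Passing `ignore_index=True` is an optional choice made here, not something the notebook does, to avoid repeated row labels in the combined table.

```python
import pandas as pd

# Toy grouped tables with partly different category columns (illustrative values).
m = pd.DataFrame({"Neighborhood": ["Chelsea"], "Art Gallery": [0.05], "Bakery": [0.03]})
t = pd.DataFrame({"Neighborhood": ["Christie"], "Bakery": [0.0], "Poutine Place": [0.1]})

# The result has the union of the columns; a missing category becomes NaN, then 0.
combined = pd.concat([m, t], sort=False, ignore_index=True).fillna(0)
print(combined)
```

A zero here means "no venue of this category was returned for this neighborhood", which is the intended reading for the clustering step.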


In [29]:
kclusters = 5
all_grouped_clustering = all_grouped.drop('Neighborhood', axis=1)
kmeans = KMeans(n_clusters=kclusters)
kmeans.fit(all_grouped_clustering)
kmeans.labels_

array([3, 3, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 3,
       3, 0, 0, 3, 3, 0, 4, 0, 3, 0, 0, 0, 3, 0, 0, 0, 0, 0, 3, 3, 3, 2,
       3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 3], dtype=int32)
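
In the labels printed above, clusters 1, 2, and 4 each contain a single neighborhood, and k-means starts from random centroids, so a rerun may number or split the clusters differently. The sketch below, on made-up two-dimensional points, shows how fixing `random_state` makes a run repeatable and how cluster sizes could be inspected. It illustrates the idea only and does not reproduce the labels shown in this notebook.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated toy blobs of three points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

# Same seed and same data: both fits return the same labels.
km_a = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
km_b = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Number of points assigned to each cluster.
sizes = np.bincount(km_a.labels_)

print(km_a.labels_, km_b.labels_)
print(sizes)
```

Looking at the sizes first helps spot clusters that contain only one or two neighborhoods before interpreting them.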

In [30]:
neighborhoods_all_venues_sorted.insert(0, 'Cluster', kmeans.labels_)
all_merged = all_data
all_merged = all_merged.join(neighborhoods_all_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

In [31]:
all_merged=all_merged.dropna(subset = ['Cluster'])[:]
all_merged['Cluster'] = all_merged['Cluster'].astype(int)
all_merged.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Group
0,Manhattan,Marble Hill,40.876551,-73.91066,3,Gym,Discount Store,Sandwich Place,Coffee Shop,Yoga Studio,Pizza Place,Steakhouse,Shopping Mall,Seafood Restaurant,Deli / Bodega,Manhattan
1,Manhattan,Chinatown,40.715618,-73.994279,0,Bakery,Chinese Restaurant,Cocktail Bar,Dessert Shop,Hotpot Restaurant,Spa,American Restaurant,Bubble Tea Shop,Salon / Barbershop,Ice Cream Shop,Manhattan
2,Manhattan,Washington Heights,40.851903,-73.9369,0,Café,Bakery,Mobile Phone Shop,Bank,Grocery Store,Deli / Bodega,Gym,Latin American Restaurant,Tapas Restaurant,Italian Restaurant,Manhattan
3,Manhattan,Inwood,40.867684,-73.92121,0,Mexican Restaurant,Lounge,Restaurant,Café,Caribbean Restaurant,Bakery,Chinese Restaurant,Park,Pizza Place,Wine Bar,Manhattan
4,Manhattan,Hamilton Heights,40.823604,-73.949688,3,Pizza Place,Coffee Shop,Café,Mexican Restaurant,Deli / Bodega,Bakery,Park,Cocktail Bar,Sandwich Place,Chinese Restaurant,Manhattan


In [32]:
all0=all_merged.loc[all_merged['Cluster'] == 0, all_merged.columns[[1] + list(range(5, all_merged.shape[1]))]]
all0

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Group
1,Chinatown,Bakery,Chinese Restaurant,Cocktail Bar,Dessert Shop,Hotpot Restaurant,Spa,American Restaurant,Bubble Tea Shop,Salon / Barbershop,Ice Cream Shop,Manhattan
2,Washington Heights,Café,Bakery,Mobile Phone Shop,Bank,Grocery Store,Deli / Bodega,Gym,Latin American Restaurant,Tapas Restaurant,Italian Restaurant,Manhattan
3,Inwood,Mexican Restaurant,Lounge,Restaurant,Café,Caribbean Restaurant,Bakery,Chinese Restaurant,Park,Pizza Place,Wine Bar,Manhattan
6,Central Harlem,African Restaurant,Chinese Restaurant,Cosmetics Shop,Seafood Restaurant,Gym / Fitness Center,Bar,Art Gallery,American Restaurant,French Restaurant,Library,Manhattan
7,East Harlem,Mexican Restaurant,Bakery,Deli / Bodega,Thai Restaurant,Sandwich Place,Spa,Latin American Restaurant,Restaurant,Steakhouse,Beer Bar,Manhattan
8,Upper East Side,Coffee Shop,Italian Restaurant,Bakery,Exhibit,Gym / Fitness Center,American Restaurant,Yoga Studio,Spa,French Restaurant,Juice Bar,Manhattan
9,Yorkville,Italian Restaurant,Coffee Shop,Gym,Deli / Bodega,Sushi Restaurant,Bar,Wine Shop,Japanese Restaurant,Diner,Gym / Fitness Center,Manhattan
10,Lenox Hill,Italian Restaurant,Cocktail Bar,Coffee Shop,Pizza Place,Sushi Restaurant,Gym,Burger Joint,Gym / Fitness Center,Café,Salon / Barbershop,Manhattan
12,Upper West Side,Italian Restaurant,Café,Bar,Bakery,Indian Restaurant,Wine Bar,Sushi Restaurant,Vegetarian / Vegan Restaurant,Breakfast Spot,Ice Cream Shop,Manhattan
13,Lincoln Square,Plaza,Theater,Concert Hall,Performing Arts Venue,Café,Gym / Fitness Center,Indie Movie Theater,Wine Shop,Cycle Studio,Cosmetics Shop,Manhattan


In [33]:
all1=all_merged.loc[all_merged['Cluster'] == 1, all_merged.columns[[1] + list(range(5, all_merged.shape[1]))]]
all1

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Group
14,Rosedale,Park,Trail,Playground,Department Store,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run,Toronto


In [34]:
all2=all_merged.loc[all_merged['Cluster'] == 2, all_merged.columns[[1] + list(range(5, all_merged.shape[1]))]]
all2

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Group
6,Christie,Grocery Store,Café,Park,Nightclub,Italian Restaurant,Restaurant,Candy Store,Baby Store,Coffee Shop,Dessert Shop,Toronto


In [35]:
all3=all_merged.loc[all_merged['Cluster'] == 3, all_merged.columns[[1] + list(range(5, all_merged.shape[1]))]]
all3

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Group
0,Marble Hill,Gym,Discount Store,Sandwich Place,Coffee Shop,Yoga Studio,Pizza Place,Steakhouse,Shopping Mall,Seafood Restaurant,Deli / Bodega,Manhattan
4,Hamilton Heights,Pizza Place,Coffee Shop,Café,Mexican Restaurant,Deli / Bodega,Bakery,Park,Cocktail Bar,Sandwich Place,Chinese Restaurant,Manhattan
5,Manhattanville,Coffee Shop,Seafood Restaurant,Deli / Bodega,Sushi Restaurant,Italian Restaurant,Chinese Restaurant,Mexican Restaurant,Bus Stop,Lounge,Boutique,Manhattan
16,Murray Hill,Coffee Shop,Sandwich Place,Japanese Restaurant,Bar,Hotel,American Restaurant,Burger Joint,Gym / Fitness Center,Deli / Bodega,Grocery Store,Manhattan
26,Morningside Heights,Park,Coffee Shop,Bookstore,American Restaurant,Burger Joint,Café,Farmers Market,Tennis Court,Supermarket,Mediterranean Restaurant,Manhattan
28,Battery Park City,Coffee Shop,Hotel,Park,Gym,Memorial Site,Clothing Store,Gourmet Shop,Beer Garden,Food Court,Shopping Mall,Manhattan
29,Financial District,Coffee Shop,Cocktail Bar,Bar,Italian Restaurant,Pizza Place,Park,Gym,Gym / Fitness Center,Mexican Restaurant,Salad Place,Manhattan
30,Carnegie Hill,Coffee Shop,Café,Yoga Studio,Gym,Cosmetics Shop,Pizza Place,Bookstore,French Restaurant,Wine Shop,Bakery,Manhattan
35,Turtle Bay,Coffee Shop,Deli / Bodega,Italian Restaurant,Sushi Restaurant,Seafood Restaurant,Park,Japanese Restaurant,Hotel,Garden,Thai Restaurant,Manhattan
37,Stuyvesant Town,Park,Bar,Coffee Shop,Boat or Ferry,Farmers Market,Gym / Fitness Center,Fountain,Harbor / Marina,Gas Station,Cocktail Bar,Manhattan


In [36]:
all4=all_merged.loc[all_merged['Cluster'] == 4, all_merged.columns[[1] + list(range(5, all_merged.shape[1]))]]
all4

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Group
11,Roosevelt Island,Park,Playground,Outdoors & Recreation,School,Liquor Store,Supermarket,Dry Cleaner,Soccer Field,Coffee Shop,Greek Restaurant,Manhattan
