# Capstone Project - The Battle of the Neighborhoods (Week 2)
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data Description](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

**Toronto and New York** are both financial and tourist capitals of their respective countries.
Both are diverse in many ways. The goal of our study is to find how similar or dissimilar these cities are from a tourist's point of view, in terms of food, accommodation, scenic places, and more.

Today, tourism is one of the pillars of the economy, and people most often visit countries that are rich in heritage and that feel welcoming to foreign visitors, for example through a friendly environment. Every city is unique in its own way and offers something new. Location information for almost any place in the world is now at your fingertips, which makes exploring easier. Tourists are therefore eager to travel to different places based on the information available, and a comparison between two cities (one part of that information) helps them choose specific places according to their preferences.

## Data Description <a name="data"></a>

For this problem, we will use the Foursquare API to explore the data of the two cities in terms of their neighborhoods. The data also includes information about the venues around each neighborhood, such as restaurants, hotels, coffee shops, parks, theaters, art galleries, museums, and more. We selected one borough from each city to analyze its neighborhoods: **Manhattan from New York and Downtown Toronto from Toronto**. We will use a machine learning technique, clustering, to segment the neighborhoods into groups with similar venues, based on each neighborhood's data. These venues will be prioritized by foot traffic (activity) in their respective neighborhoods. This will help locate tourist areas and hubs, and we can then judge the similarity or dissimilarity between the two cities on that basis.
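The clustering idea described above can be sketched on a toy example. The text does not name a specific algorithm, so k-means is an assumption here, and the venue frequencies below are invented purely for illustration.

```python
import numpy as np

def kmeans(points, k, n_iter=20):
    """Plain Lloyd's algorithm; the first k points serve as initial centroids."""
    points = np.asarray(points, dtype=float)
    centroids = points[:k].copy()
    for _ in range(n_iter):
        # distance from every point to every centroid, shape (n_points, k)
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# hypothetical share of [restaurant, park, museum] venues per neighborhood
profiles = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

Neighborhoods with similar venue mixes end up with the same label, which is the property the comparison between the two cities relies on.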

### Load Toronto Data

In [2]:
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', None)

In [3]:
# Load the Toronto data, already scraped from Wikipedia and stored on the local file system
path=r'C:\Users\ramshast\Desktop\DS\IBM DS Certification\Coursera_Capstone_Project\capstone_project_dataset.csv'
df_toronto=pd.read_csv(path)
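The absolute Windows path above only works on one machine. A minimal sketch of a more portable lookup follows; the environment variable name `CAPSTONE_DATA_DIR` and the helper `data_path` are hypothetical and not part of the original notebook.

```python
import os
from pathlib import Path

def data_path(filename, environ=None, env_var="CAPSTONE_DATA_DIR"):
    """Build a path to a data file under a configurable base directory."""
    environ = os.environ if environ is None else environ
    # fall back to the current working directory when the variable is unset
    base_dir = Path(environ.get(env_var, "."))
    return base_dir / filename

# demonstration with an explicit mapping instead of the real environment
toronto_csv = data_path("capstone_project_dataset.csv",
                        environ={"CAPSTONE_DATA_DIR": "data"})
print(toronto_csv)
# df_toronto = pd.read_csv(toronto_csv)
```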

In [4]:
df_toronto

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
5,M9A,Etobicoke,"Islington Avenue, Humber Valley Village",43.667856,-79.532242
6,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
7,M3B,North York,Don Mills,43.745906,-79.352188
8,M4B,East York,"Parkview Hill, Woodbine Gardens",43.706397,-79.309937
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937


In [5]:
# Load the New York data, downloaded as a JSON file from https://cocl.us/new_york_dataset
# The JSON file is then converted into a dataframe
import json
path=r'C:\Users\ramshast\Desktop\DS\IBM DS Certification\Coursera_Capstone_Project\nyu_2451_34572-geojson.json'
with open(path) as json_data:
    newyork_data = json.load(json_data)

In [6]:
neighborhoods_data = newyork_data['features']

In [7]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
df_nyc = pd.DataFrame(columns=column_names)

In [8]:
# collect one record per neighborhood, then build the dataframe once
# (DataFrame.append was removed in pandas 2.0)
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

df_nyc = pd.DataFrame(rows, columns=column_names)
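As an alternative to building the rows in a loop, `pd.json_normalize` can flatten the nested feature records in one call. The sketch below uses two hand-written features that mimic the structure accessed above; it is an illustration, not the notebook's own method.

```python
import pandas as pd

# two hand-written features with the same nesting as newyork_data['features']
features = [
    {"properties": {"borough": "Bronx", "name": "Wakefield"},
     "geometry": {"coordinates": [-73.847201, 40.894705]}},
    {"properties": {"borough": "Manhattan", "name": "Marble Hill"},
     "geometry": {"coordinates": [-73.91066, 40.876551]}},
]

# nested keys become dotted column names such as 'properties.borough'
flat = pd.json_normalize(features)

df = pd.DataFrame({
    "Borough": flat["properties.borough"],
    "Neighborhood": flat["properties.name"],
    # coordinates are stored as [longitude, latitude]
    "Latitude": flat["geometry.coordinates"].apply(lambda c: c[1]),
    "Longitude": flat["geometry.coordinates"].apply(lambda c: c[0]),
})
print(df)
```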

In [9]:
df_nyc

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585
5,Bronx,Kingsbridge,40.881687,-73.902818
6,Manhattan,Marble Hill,40.876551,-73.91066
7,Bronx,Woodlawn,40.898273,-73.867315
8,Bronx,Norwood,40.877224,-73.879391
9,Bronx,Williamsbridge,40.881039,-73.857446


In [10]:
manhattan_data = df_nyc[df_nyc['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


In [11]:
df_toronto.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494


In [12]:
downtown_data = df_toronto[df_toronto['Borough'] == 'Downtown Toronto'].reset_index(drop=True)
downtown_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
4,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306


In [13]:
downtown_data = downtown_data.drop(columns=['PostalCode'])

In [14]:
downtown_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
1,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
2,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
3,Downtown Toronto,St. James Town,43.651494,-79.375418
4,Downtown Toronto,Berczy Park,43.644771,-79.373306


In [15]:
manhattan_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


### Visualize Neighborhood

In [16]:
import folium
def city_map(city, lat, lon):
    # create a map centered on the given latitude and longitude
    map_city = folium.Map(location=[lat, lon], zoom_start=11)
    # add one marker per neighborhood
    for n_lat, n_lng, name in zip(city['Latitude'], city['Longitude'], city['Neighborhood']):
        label = folium.Popup(name, parse_html=True)
        folium.CircleMarker(
            [n_lat, n_lng],
            radius=5,
            popup=label,
            color='blue',
            fill=True,
            fill_color='#3186cc',
            fill_opacity=0.7,
            parse_html=False).add_to(map_city)
    return map_city

Map of Manhattan Neighborhoods

In [17]:
map_manhattan=city_map(manhattan_data,40.876551,-73.910660)
map_manhattan

Map of Downtown Toronto Neighborhoods

In [18]:
map_downtown=city_map(downtown_data,43.653908, -79.384293)
map_downtown
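The Manhattan map above is centered on hard-coded coordinates that match the Marble Hill row of the table. One possible alternative, sketched here with a hypothetical helper `map_center`, is to center the map on the mean of the neighborhood coordinates.

```python
import pandas as pd

def map_center(city):
    """Return the mean latitude and longitude of a neighborhood table."""
    return city["Latitude"].mean(), city["Longitude"].mean()

# two rows taken from the Manhattan table shown earlier
sample = pd.DataFrame({
    "Neighborhood": ["Marble Hill", "Chinatown"],
    "Latitude": [40.876551, 40.715618],
    "Longitude": [-73.91066, -73.994279],
})

center_lat, center_lon = map_center(sample)
print(center_lat, center_lon)
# map_sample = city_map(sample, center_lat, center_lon)
```

With the full `manhattan_data` table, the same call would place the map center near the middle of the borough rather than on a single neighborhood.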

Foursquare Credentials to Extract Nearby Venues

## Methodology <a name="methodology"></a>

In [19]:
CLIENT_ID = 'SJPO45IX0J02NAO0CKCPRIXLBEC0TKBTQF1PFRG3C1IDC050' # your Foursquare ID
CLIENT_SECRET = 'PID2MUPRKRIMABIBRTKEYJN2J1C1EKSWGVQZGVUVYPFHO25S' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: SJPO45IX0J02NAO0CKCPRIXLBEC0TKBTQF1PFRG3C1IDC050
CLIENT_SECRET:PID2MUPRKRIMABIBRTKEYJN2J1C1EKSWGVQZGVUVYPFHO25S
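Hard-coding and printing API credentials makes them visible to anyone who reads the notebook. A minimal sketch of one alternative follows, assuming the credentials are supplied through environment variables; the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helpers are hypothetical.

```python
import os

def load_foursquare_credentials(environ=None):
    """Read the client ID and secret from environment variables."""
    environ = os.environ if environ is None else environ
    try:
        return environ["FOURSQUARE_CLIENT_ID"], environ["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None

def mask(secret, visible=4):
    """Keep the first few characters and replace the rest with asterisks."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)

# placeholder values for demonstration only, not real credentials
demo_env = {
    "FOURSQUARE_CLIENT_ID": "EXAMPLEID123",
    "FOURSQUARE_CLIENT_SECRET": "EXAMPLESECRET456",
}
client_id, client_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID: " + mask(client_id))
print("CLIENT_SECRET: " + mask(client_secret))
```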


In [20]:
import requests
def getNearbyVenues(names, latitudes,longitudes, radius=500,limit=100):
    
    venues_list=[]
    for name, lat, lng in zip(names,latitudes,longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            limit)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
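The function above indexes straight into `["response"]['groups'][0]['items']` and `['categories'][0]`, which would raise an exception if a response lacks groups or a venue has no category. A defensive variant of the parsing step might look like the sketch below; `extract_venues` is a hypothetical helper, and the sample payload is hand-written to mirror the fields used above rather than taken from a real API response.

```python
def extract_venues(payload, name, lat, lng):
    """Flatten one explore-style response into rows, tolerating missing pieces."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # use None when a venue carries no category instead of failing
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"],
                     category))
    return rows

# hand-written payload with one complete venue and one without a category
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Cafe One",
               "location": {"lat": 43.65, "lng": -79.36},
               "categories": [{"name": "Coffee Shop"}]}},
    {"venue": {"name": "Unlabeled Spot",
               "location": {"lat": 43.66, "lng": -79.37},
               "categories": []}},
]}]}}

rows = extract_venues(sample, "Regent Park, Harbourfront", 43.65426, -79.360636)
for row in rows:
    print(row)
```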

In [24]:
# Run the function above on each neighborhood and create a new dataframe called downtown_toronto_venues.
downtown_toronto_venues = getNearbyVenues(names=downtown_data['Neighborhood'],
                                   latitudes=downtown_data['Latitude'],
                                   longitudes=downtown_data['Longitude'],
                                  )

Regent Park, Harbourfront
Queen's Park, Ontario Provincial Government
Garden District, Ryerson
St. James Town
Berczy Park
Central Bay Street
Christie
Richmond, Adelaide, King
Harbourfront East, Union Station, Toronto Islands
Toronto Dominion Centre, Design Exchange
Commerce Court, Victoria Hotel
University of Toronto, Harbord
Kensington Market, Chinatown, Grange Park
CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport
Rosedale
Stn A PO Boxes
St. James Town, Cabbagetown
First Canadian Place, Underground city
Church and Wellesley


In [25]:
# Run the function above on each neighborhood and create a new dataframe called manhattan_nyc_venues.
manhattan_nyc_venues = getNearbyVenues(names=manhattan_data['Neighborhood'],
                                   latitudes=manhattan_data['Latitude'],
                                   longitudes=manhattan_data['Longitude'],
                                  )

Marble Hill
Chinatown
Washington Heights
Inwood
Hamilton Heights
Manhattanville
Central Harlem
East Harlem
Upper East Side
Yorkville
Lenox Hill
Roosevelt Island
Upper West Side
Lincoln Square
Clinton
Midtown
Murray Hill
Chelsea
Greenwich Village
East Village
Lower East Side
Tribeca
Little Italy
Soho
West Village
Manhattan Valley
Morningside Heights
Gramercy
Battery Park City
Financial District
Carnegie Hill
Noho
Civic Center
Midtown South
Sutton Place
Turtle Bay
Tudor City
Stuyvesant Town
Flatiron
Hudson Yards


In [26]:
#save dataframe
manhattan_nyc_venues.to_csv('manhattan_nyc_venues.csv',index=False)

In [27]:
downtown_toronto_venues.to_csv('downtown_toronto_venues.csv',index=False)
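Once the venue tables are saved, later runs can reuse the files instead of calling the API again. The following is a minimal caching sketch; `load_or_fetch` and `fake_fetch` are hypothetical names, and `fake_fetch` merely stands in for a call such as `getNearbyVenues(...)`.

```python
import os
import tempfile

import pandas as pd

def load_or_fetch(csv_path, fetch):
    """Return the cached CSV if it exists; otherwise call fetch() and cache it."""
    if os.path.exists(csv_path):
        return pd.read_csv(csv_path)
    frame = fetch()
    frame.to_csv(csv_path, index=False)
    return frame

calls = []

def fake_fetch():
    # stand-in for the real API call; records how often it runs
    calls.append(1)
    return pd.DataFrame({"Neighborhood": ["Berczy Park"],
                         "Venue Category": ["Coffee Shop"]})

with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "downtown_toronto_venues.csv")
    first = load_or_fetch(cache, fake_fetch)   # fetches and writes the file
    second = load_or_fetch(cache, fake_fetch)  # reads the file, no fetch

print(len(calls))
```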

Quick check of the shape of each dataframe

In [28]:
downtown_toronto_venues.shape

(1236, 7)

In [29]:
manhattan_nyc_venues.shape

(3186, 7)

In [30]:
# quick check to see how many venues were returned for each neighborhood
downtown_toronto_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Berczy Park,58,58,58,58,58,58
"CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport",16,16,16,16,16,16
Central Bay Street,64,64,64,64,64,64
Christie,16,16,16,16,16,16
Church and Wellesley,75,75,75,75,75,75
"Commerce Court, Victoria Hotel",100,100,100,100,100,100
"First Canadian Place, Underground city",100,100,100,100,100,100
"Garden District, Ryerson",100,100,100,100,100,100
"Harbourfront East, Union Station, Toronto Islands",100,100,100,100,100,100
"Kensington Market, Chinatown, Grange Park",66,66,66,66,66,66


In [31]:
# quick check to see how many venues were returned for each neighborhood
manhattan_nyc_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Battery Park City,73,73,73,73,73,73
Carnegie Hill,91,91,91,91,91,91
Central Harlem,45,45,45,45,45,45
Chelsea,100,100,100,100,100,100
Chinatown,100,100,100,100,100,100
Civic Center,100,100,100,100,100,100
Clinton,100,100,100,100,100,100
East Harlem,42,42,42,42,42,42
East Village,100,100,100,100,100,100
Financial District,100,100,100,100,100,100


In [32]:
#Quick check for number of Unique Categories for Downtown Toronto
len(downtown_toronto_venues['Venue Category'].unique())

210

In [33]:
#Quick check for number of Unique Categories for Manhattan
len(manhattan_nyc_venues['Venue Category'].unique())

328
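Beyond counting unique categories per city, the two category sets can be compared directly. The sketch below computes the shared categories and a Jaccard index; this measure is an illustrative assumption rather than something the notebook computes, and the toy tables are invented.

```python
import pandas as pd

def category_overlap(a, b, column="Venue Category"):
    """Return the shared categories and the Jaccard index of two venue tables."""
    cats_a, cats_b = set(a[column]), set(b[column])
    shared = cats_a & cats_b
    union = cats_a | cats_b
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard

# invented toy tables with a handful of categories each
toronto_toy = pd.DataFrame({"Venue Category": ["Coffee Shop", "Park", "Poutine Place"]})
manhattan_toy = pd.DataFrame({"Venue Category": ["Coffee Shop", "Park", "Bagel Shop", "Pier"]})

shared, jaccard = category_overlap(toronto_toy, manhattan_toy)
print(sorted(shared), jaccard)
```

Applied to `downtown_toronto_venues` and `manhattan_nyc_venues`, a value near 1 would indicate that the two boroughs offer largely the same kinds of venues.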

## Analysis <a name="analysis"></a>

## Analyze Each Neighborhood

Analyze Neighborhoods for Toronto and Manhattan

In [34]:
def onehot_encoding(df):
    # one-hot encode the venue categories
    df_onehot = pd.get_dummies(df[['Venue Category']])
    # add the Neighborhood column back to the dataframe
    df_onehot['Neighborhood'] = df['Neighborhood']
    # move the Neighborhood column to the first position
    columns = [df_onehot.columns[-1]] + list(df_onehot.columns[:-1])
    df_onehot = df_onehot[columns]
    return df_onehot
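To make the effect of the encoding concrete, here is a toy run on three invented venues. Passing `dtype=int` keeps the 0/1 values on recent pandas versions, where `get_dummies` returns booleans by default. The per-neighborhood mean at the end is an assumed next step, shown only to illustrate how one-hot rows can be turned into a frequency profile.

```python
import pandas as pd

# three invented venues in two neighborhoods
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B"],
    "Venue Category": ["Bakery", "Coffee Shop", "Coffee Shop"],
})

# one column per category, prefixed with 'Venue Category_'
onehot = pd.get_dummies(venues[["Venue Category"]], dtype=int)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
print(onehot)

# share of each category among a neighborhood's venues
profile = onehot.groupby("Neighborhood").mean()
print(profile)
```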

In [35]:
#Convert Venue Category to one hot encoding for Toronto
downtown_toronto_onehot=onehot_encoding(downtown_toronto_venues)
#Convert Venue Category to one hot encoding for Manhattan
manhattan_nyc_onehot=onehot_encoding(manhattan_nyc_venues)

In [36]:
downtown_toronto_onehot.head()

Unnamed: 0,Neighborhood,Venue Category_Afghan Restaurant,Venue Category_Airport,Venue Category_Airport Food Court,Venue Category_Airport Lounge,Venue Category_Airport Service,Venue Category_Airport Terminal,Venue Category_American Restaurant,Venue Category_Antique Shop,Venue Category_Aquarium,Venue Category_Art Gallery,Venue Category_Art Museum,Venue Category_Arts & Crafts Store,Venue Category_Asian Restaurant,Venue Category_BBQ Joint,Venue Category_Baby Store,Venue Category_Bagel Shop,Venue Category_Bakery,Venue Category_Bank,Venue Category_Bar,Venue Category_Baseball Stadium,Venue Category_Basketball Stadium,Venue Category_Beach,Venue Category_Bed & Breakfast,Venue Category_Beer Bar,Venue Category_Beer Store,Venue Category_Belgian Restaurant,Venue Category_Bistro,Venue Category_Boat or Ferry,Venue Category_Bookstore,Venue Category_Boutique,Venue Category_Brazilian Restaurant,Venue Category_Breakfast Spot,Venue Category_Brewery,Venue Category_Bubble Tea Shop,Venue Category_Building,Venue Category_Burger Joint,Venue Category_Burrito Place,Venue Category_Butcher,Venue Category_Café,Venue Category_Camera Store,Venue Category_Candy Store,Venue Category_Caribbean Restaurant,Venue Category_Cheese Shop,Venue Category_Chinese Restaurant,Venue Category_Chocolate Shop,Venue Category_Church,Venue Category_Clothing Store,Venue Category_Cocktail Bar,Venue Category_Coffee Shop,Venue Category_College Arts Building,Venue Category_College Auditorium,Venue Category_College Gym,Venue Category_College Rec Center,Venue Category_Colombian Restaurant,Venue Category_Comfort Food Restaurant,Venue Category_Comic Shop,Venue Category_Concert Hall,Venue Category_Convenience Store,Venue Category_Cosmetics Shop,Venue Category_Creperie,Venue Category_Cupcake Shop,Venue Category_Dance Studio,Venue Category_Deli / Bodega,Venue Category_Department Store,Venue Category_Dessert Shop,Venue Category_Diner,Venue Category_Discount Store,Venue Category_Distribution Center,Venue Category_Dog Run,Venue Category_Doner Restaurant,Venue Category_Donut Shop,Venue Category_Dumpling Restaurant,Venue Category_Eastern European Restaurant,Venue Category_Electronics Store,Venue Category_Ethiopian Restaurant,Venue Category_Event Space,Venue Category_Falafel Restaurant,Venue Category_Farmers Market,Venue Category_Fast Food Restaurant,Venue Category_Filipino Restaurant,Venue Category_Fish Market,Venue Category_Food & Drink Shop,Venue Category_Food Court,Venue Category_Food Truck,Venue Category_Fountain,Venue Category_French Restaurant,Venue Category_Fried Chicken Joint,Venue Category_Furniture / Home Store,Venue Category_Gaming Cafe,Venue Category_Garden,Venue Category_Gastropub,Venue Category_Gay Bar,Venue Category_General Entertainment,Venue Category_General Travel,Venue Category_German Restaurant,Venue Category_Gift Shop,Venue Category_Gluten-free Restaurant,Venue Category_Gourmet Shop,Venue Category_Greek Restaurant,Venue Category_Grocery Store,Venue Category_Gym,Venue Category_Gym / Fitness Center,Venue Category_Harbor / Marina,Venue Category_Health & Beauty Service,Venue Category_Health Food Store,Venue Category_Historic Site,Venue Category_History Museum,Venue Category_Hobby Shop,Venue Category_Hookah Bar,Venue Category_Hospital,Venue Category_Hostel,Venue Category_Hotel,Venue Category_Hotel Bar,Venue Category_IT Services,Venue Category_Ice Cream Shop,Venue Category_Indian Restaurant,Venue Category_Indie Movie Theater,Venue Category_Irish Pub,Venue Category_Italian Restaurant,Venue Category_Japanese Restaurant,Venue Category_Jazz Club,Venue Category_Jewelry Store,Venue Category_Juice Bar,Venue Category_Knitting Store,Venue Category_Korean Restaurant,Venue Category_Lake,Venue Category_Latin American Restaurant,Venue Category_Lingerie Store,Venue Category_Liquor Store,Venue Category_Lounge,Venue Category_Market,Venue Category_Martial Arts School,Venue Category_Mediterranean Restaurant,Venue Category_Men's Store,Venue Category_Mexican Restaurant,Venue Category_Middle Eastern Restaurant,Venue Category_Miscellaneous Shop,Venue Category_Modern European Restaurant,Venue Category_Molecular Gastronomy Restaurant,Venue Category_Monument / Landmark,Venue Category_Moroccan Restaurant,Venue Category_Movie Theater,Venue Category_Museum,Venue Category_Music Venue,Venue Category_Neighborhood,Venue Category_New American Restaurant,Venue Category_Nightclub,Venue Category_Noodle House,Venue Category_Office,Venue Category_Opera House,Venue Category_Optical Shop,Venue Category_Organic Grocery,Venue Category_Other Great Outdoors,Venue Category_Outdoor Sculpture,Venue Category_Park,Venue Category_Performing Arts Venue,Venue Category_Pet Store,Venue Category_Pharmacy,Venue Category_Pizza Place,Venue Category_Plane,Venue Category_Playground,Venue Category_Plaza,Venue Category_Poke Place,Venue Category_Portuguese Restaurant,Venue Category_Poutine Place,Venue Category_Pub,Venue Category_Ramen Restaurant,Venue Category_Record Shop,Venue Category_Rental Car Location,Venue Category_Restaurant,Venue Category_Roof Deck,Venue Category_Sake Bar,Venue Category_Salad Place,Venue Category_Salon / Barbershop,Venue Category_Sandwich Place,Venue Category_Scenic Lookout,Venue Category_Sculpture Garden,Venue Category_Seafood Restaurant,Venue Category_Shoe Store,Venue Category_Shopping Mall,Venue Category_Skating Rink,Venue Category_Smoke Shop,Venue Category_Smoothie Shop,Venue Category_Snack Place,Venue Category_Soup Place,Venue Category_Spa,Venue Category_Speakeasy,Venue Category_Sporting Goods Shop,Venue Category_Sports Bar,Venue Category_Steakhouse,Venue Category_Strip Club,Venue Category_Supermarket,Venue Category_Sushi Restaurant,Venue Category_Taco Place,Venue Category_Tailor Shop,Venue Category_Taiwanese Restaurant,Venue Category_Tanning Salon,Venue Category_Tea Room,Venue Category_Thai Restaurant,Venue Category_Theater,Venue Category_Theme Restaurant,Venue Category_Toy / Game Store,Venue Category_Trail,Venue Category_Train Station,Venue Category_Vegetarian / Vegan Restaurant,Venue Category_Video Game Store,Venue Category_Vietnamese Restaurant,Venue Category_Wine Bar,Venue Category_Women's Store,Venue Category_Yoga Studio
0,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [37]:
manhattan_nyc_onehot.head()

Unnamed: 0,Neighborhood,Venue Category_Accessories Store,Venue Category_Adult Boutique,Venue Category_Afghan Restaurant,Venue Category_African Restaurant,Venue Category_American Restaurant,Venue Category_Antique Shop,Venue Category_Arepa Restaurant,Venue Category_Argentinian Restaurant,Venue Category_Art Gallery,Venue Category_Art Museum,Venue Category_Arts & Crafts Store,Venue Category_Asian Restaurant,Venue Category_Athletics & Sports,Venue Category_Auditorium,Venue Category_Australian Restaurant,Venue Category_Austrian Restaurant,Venue Category_BBQ Joint,Venue Category_Baby Store,Venue Category_Bagel Shop,Venue Category_Bakery,Venue Category_Bank,Venue Category_Bar,Venue Category_Baseball Field,Venue Category_Basketball Court,Venue Category_Beer Bar,Venue Category_Beer Garden,Venue Category_Beer Store,Venue Category_Big Box Store,Venue Category_Bike Rental / Bike Share,Venue Category_Bike Shop,Venue Category_Bike Trail,Venue Category_Bistro,Venue Category_Board Shop,Venue Category_Boat or Ferry,Venue Category_Bookstore,Venue Category_Boutique,Venue Category_Boxing Gym,Venue Category_Brazilian Restaurant,Venue Category_Breakfast Spot,Venue Category_Bridal Shop,Venue Category_Bridge,Venue Category_Bubble Tea Shop,Venue Category_Building,Venue Category_Burger Joint,Venue Category_Burrito Place,Venue Category_Bus Line,Venue Category_Bus Station,Venue Category_Bus Stop,Venue Category_Butcher,Venue Category_Cafeteria,Venue Category_Café,Venue Category_Cajun / Creole Restaurant,Venue Category_Camera Store,Venue Category_Candy Store,Venue Category_Cantonese Restaurant,Venue Category_Caribbean Restaurant,Venue Category_Caucasian Restaurant,Venue Category_Check Cashing Service,Venue Category_Cheese Shop,Venue Category_Chinese Restaurant,Venue Category_Chocolate Shop,Venue Category_Circus,Venue Category_Climbing Gym,Venue Category_Clothing Store,Venue Category_Club House,Venue Category_Cocktail Bar,Venue Category_Coffee Shop,Venue Category_College Academic Building,Venue Category_College Arts Building,Venue Category_College Bookstore,Venue Category_College Cafeteria,Venue Category_College Theater,Venue Category_Comedy Club,Venue Category_Community Center,Venue Category_Concert Hall,Venue Category_Convenience Store,Venue Category_Cooking School,Venue Category_Cosmetics Shop,Venue Category_Coworking Space,Venue Category_Creperie,Venue Category_Cuban Restaurant,Venue Category_Cupcake Shop,Venue Category_Cycle Studio,Venue Category_Czech Restaurant,Venue Category_Dance Studio,Venue Category_Daycare,Venue Category_Deli / Bodega,Venue Category_Department Store,Venue Category_Design Studio,Venue Category_Dessert Shop,Venue Category_Dim Sum Restaurant,Venue Category_Diner,Venue Category_Discount Store,Venue Category_Doctor's Office,Venue Category_Dog Run,Venue Category_Donut Shop,Venue Category_Drugstore,Venue Category_Dry Cleaner,Venue Category_Dumpling Restaurant,Venue Category_Duty-free Shop,Venue Category_Eastern European Restaurant,Venue Category_Egyptian Restaurant,Venue Category_Electronics Store,Venue Category_Empanada Restaurant,Venue Category_Ethiopian Restaurant,Venue Category_Event Space,Venue Category_Exhibit,Venue Category_Falafel Restaurant,Venue Category_Farmers Market,Venue Category_Fast Food Restaurant,Venue Category_Filipino Restaurant,Venue Category_Financial or Legal Service,Venue Category_Fish Market,Venue Category_Flea Market,Venue Category_Flower Shop,Venue Category_Food & Drink Shop,Venue Category_Food Court,Venue Category_Food Stand,Venue Category_Food Truck,Venue Category_Fountain,Venue Category_French Restaurant,Venue Category_Fried Chicken Joint,Venue Category_Frozen Yogurt Shop,Venue Category_Furniture / Home Store,Venue Category_Gaming Cafe,Venue Category_Garden,Venue Category_Garden Center,Venue Category_Gas Station,Venue Category_Gastropub,Venue Category_Gay Bar,Venue Category_General Entertainment,Venue Category_German Restaurant,Venue Category_Gift Shop,Venue Category_Golf Course,Venue Category_Gourmet Shop,Venue Category_Greek Restaurant,Venue Category_Grocery Store,Venue Category_Gym,Venue Category_Gym / Fitness Center,Venue Category_Gym Pool,Venue Category_Gymnastics Gym,Venue Category_Harbor / Marina,Venue Category_Hardware Store,Venue Category_Hawaiian Restaurant,Venue Category_Health & Beauty Service,Venue Category_Health Food Store,Venue Category_Heliport,Venue Category_High School,Venue Category_Himalayan Restaurant,Venue Category_Historic Site,Venue Category_History Museum,Venue Category_Hobby Shop,Venue Category_Hookah Bar,Venue Category_Hostel,Venue Category_Hot Dog Joint,Venue Category_Hotel,Venue Category_Hotel Bar,Venue Category_Hotpot Restaurant,Venue Category_Ice Cream Shop,Venue Category_Indian Restaurant,Venue Category_Indie Movie Theater,Venue Category_Indie Theater,Venue Category_Intersection,Venue Category_Irish Pub,Venue Category_Israeli Restaurant,Venue Category_Italian Restaurant,Venue Category_Japanese Curry Restaurant,Venue Category_Japanese Restaurant,Venue Category_Jazz Club,Venue Category_Jewelry Store,Venue Category_Jewish Restaurant,Venue Category_Juice Bar,Venue Category_Karaoke Bar,Venue Category_Kids Store,Venue Category_Kitchen Supply Store,Venue Category_Korean Restaurant,Venue Category_Kosher Restaurant,Venue Category_Latin American Restaurant,Venue Category_Laundry Service,Venue Category_Leather Goods Store,Venue Category_Lebanese Restaurant,Venue Category_Library,Venue Category_Lingerie Store,Venue Category_Liquor Store,Venue Category_Lounge,Venue Category_Malay Restaurant,Venue Category_Market,Venue Category_Martial Arts School,Venue Category_Massage Studio,Venue Category_Mattress Store,Venue Category_Medical Center,Venue Category_Mediterranean Restaurant,Venue Category_Memorial Site,Venue Category_Men's Store,Venue Category_Mexican Restaurant,Venue Category_Middle Eastern Restaurant,Venue Category_Mini Golf,Venue Category_Miscellaneous Shop,Venue Category_Mobile Phone Shop,Venue Category_Molecular Gastronomy Restaurant,Venue Category_Monument / Landmark,Venue Category_Moroccan Restaurant,Venue Category_Movie Theater,Venue Category_Moving Target,Venue Category_Museum,Venue Category_Music School,Venue Category_Music Venue,Venue Category_Nail Salon,Venue Category_New American Restaurant,Venue Category_Newsstand,Venue Category_Nightclub,Venue Category_Non-Profit,Venue Category_Noodle House,Venue Category_North Indian Restaurant,Venue Category_Office,Venue Category_Opera House,Venue Category_Optical Shop,Venue Category_Organic Grocery,Venue Category_Outdoor Sculpture,Venue Category_Outdoors & Recreation,Venue Category_Paella Restaurant,Venue Category_Paper / Office Supplies Store,Venue Category_Park,Venue Category_Pedestrian Plaza,Venue Category_Performing Arts Venue,Venue Category_Perfume Shop,Venue Category_Persian Restaurant,Venue Category_Peruvian Restaurant,Venue Category_Pet Café,Venue Category_Pet Service,Venue Category_Pet Store,Venue Category_Pharmacy,Venue Category_Photography Studio,Venue Category_Physical Therapist,Venue Category_Piano Bar,Venue Category_Pie Shop,Venue Category_Pier,Venue Category_Pilates Studio,Venue Category_Pizza Place,Venue Category_Playground,Venue Category_Plaza,Venue Category_Poke Place,Venue Category_Pool,Venue Category_Pub,Venue Category_Public Art,Venue Category_Ramen Restaurant,Venue Category_Record Shop,Venue Category_Rental Car Location,Venue Category_Residential Building (Apartment / Condo),Venue Category_Resort,Venue Category_Rest Area,Venue Category_Restaurant,Venue Category_River,Venue Category_Rock Climbing Spot,Venue Category_Rock Club,Venue Category_Roof Deck,Venue Category_Sake Bar,Venue Category_Salad Place,Venue Category_Salon / Barbershop,Venue Category_Sandwich Place,Venue Category_Scandinavian Restaurant,Venue Category_Scenic Lookout,Venue Category_School,Venue Category_Sculpture Garden,Venue Category_Seafood Restaurant,Venue Category_Shanghai Restaurant,Venue Category_Shipping Store,Venue Category_Shoe Store,Venue Category_Shopping Mall,Venue Category_Skate Park,Venue Category_Smoke Shop,Venue Category_Smoothie Shop,Venue Category_Snack Place,Venue Category_Soba Restaurant,Venue Category_Soccer Field,Venue Category_Social Club,Venue Category_Soup Place,Venue Category_South American Restaurant,Venue Category_South Indian Restaurant,Venue Category_Southern / Soul Food Restaurant,Venue Category_Spa,Venue Category_Spanish Restaurant,Venue Category_Speakeasy,Venue Category_Sporting Goods Shop,Venue Category_Sports Bar,Venue Category_Sports Club,Venue Category_Steakhouse,Venue Category_Street Art,Venue Category_Strip Club,Venue Category_Supermarket,Venue Category_Supplement Shop,Venue Category_Sushi Restaurant,Venue Category_Swiss Restaurant,Venue Category_Szechuan Restaurant,Venue Category_Taco Place,Venue Category_Tailor Shop,Venue Category_Taiwanese Restaurant,Venue Category_Tapas Restaurant,Venue Category_Tea Room,Venue Category_Tech Startup,Venue Category_Tennis Court,Venue Category_Tennis Stadium,Venue Category_Thai Restaurant,Venue Category_Theater,Venue Category_Theme Restaurant,Venue Category_Thrift / Vintage Store,Venue Category_Tiki Bar,Venue Category_Tourist Information Center,Venue Category_Toy / Game Store,Venue Category_Trail,Venue Category_Train Station,Venue Category_Turkish Restaurant,Venue Category_Udon Restaurant,Venue Category_Used Bookstore,Venue Category_Vegetarian / Vegan Restaurant,Venue Category_Venezuelan Restaurant,Venue Category_Veterinarian,Venue Category_Video Game Store,Venue Category_Video Store,Venue Category_Vietnamese Restaurant,Venue Category_Volleyball Court,Venue Category_Waterfront,Venue Category_Whisky Bar,Venue Category_Wine Bar,Venue Category_Wine Shop,Venue Category_Wings Joint,Venue Category_Women's Store,Venue Category_Yoga Studio
0,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
2,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [38]:
# Next, group rows by neighborhood and take the mean of each one-hot column, i.e. the frequency of occurrence of each category
downtown_toronto_grouped=downtown_toronto_onehot.groupby('Neighborhood').mean().reset_index()
manhattan_nyc_grouped=manhattan_nyc_onehot.groupby('Neighborhood').mean().reset_index()
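
To make the grouping step concrete, here is a minimal sketch on made-up data. The `venues` table and its values are hypothetical; only the one-hot-then-mean pattern mirrors the cell above. Because each venue row has a single 1, the per-neighborhood mean of a column is the share of that neighborhood's venues in that category.

```python
import pandas as pd

# Hypothetical toy data: one row per venue, tagged with its neighborhood.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Coffee Shop", "Coffee Shop", "Park", "Café", "Park", "Hotel"],
})

# One-hot encode the category; each row then contains a single 1.
onehot = pd.get_dummies(venues[["Venue Category"]], prefix="Venue Category", dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

# The mean of a 0/1 column per neighborhood is the share of venues in that category.
grouped = onehot.groupby("Neighborhood").mean().reset_index()
print(grouped)
```

Each row of `grouped` sums to 1 across the category columns, which is why the `Feq` values printed later can be read as proportions.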

In [39]:
def neighbor_anlongwithtop5_venue(grouped_df):
    top_venue=5
    for nb in grouped_df['Neighborhood']:
        print('*************'+nb+'**************')
        temp=grouped_df[grouped_df['Neighborhood']==nb].T.reset_index()
        temp.columns=['Venue','Feq']
        temp=temp.iloc[1:]
        temp['Feq']=temp['Feq'].astype(float)
        temp=temp.round({'Feq':2})
        print(temp.sort_values('Feq',ascending=False).reset_index(drop=True).head(top_venue))
        print('\n')
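
As an aside, the same top-N lookup can be sketched more compactly with `Series.nlargest`. This is an illustrative alternative, not the notebook's method; `top_venues` and the toy `grouped` table are hypothetical names, and the sketch assumes each neighborhood appears once, as it does after `groupby`.

```python
import pandas as pd

def top_venues(grouped_df, neighborhood, n=5):
    """Return the n most frequent categories for one neighborhood as a Series."""
    row = grouped_df.set_index("Neighborhood").loc[neighborhood]
    return row.astype(float).nlargest(n).round(2)

# Hypothetical frequency table in the same shape as the grouped dataframes.
grouped = pd.DataFrame({
    "Neighborhood": ["A", "B"],
    "Venue Category_Coffee Shop": [0.50, 0.00],
    "Venue Category_Park": [0.25, 0.60],
    "Venue Category_Hotel": [0.25, 0.40],
})
print(top_venues(grouped, "B", n=2))
```

Returning a Series instead of printing makes the result easier to reuse, for example when building a summary table.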

In [40]:
# Top 5 venues for each Downtown Toronto neighborhood
neighbor_anlongwithtop5_venue(downtown_toronto_grouped)

*************Berczy Park**************
                           Venue   Feq
0     Venue Category_Coffee Shop  0.10
1  Venue Category_Farmers Market  0.03
2            Venue Category_Café  0.03
3        Venue Category_Beer Bar  0.03
4     Venue Category_Cheese Shop  0.03


*************CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport**************
                             Venue   Feq
0    Venue Category_Airport Lounge  0.12
1   Venue Category_Airport Service  0.12
2  Venue Category_Airport Terminal  0.12
3             Venue Category_Plane  0.06
4   Venue Category_Harbor / Marina  0.06


*************Central Bay Street**************
                                Venue   Feq
0          Venue Category_Coffee Shop  0.17
1       Venue Category_Sandwich Place  0.08
2   Venue Category_Italian Restaurant  0.05
3  Venue Category_Japanese Restaurant  0.05
4                 Venue Category_Café  0.05


*************Christie**********

In [41]:
# Top 5 venues for each Manhattan neighborhood
neighbor_anlongwithtop5_venue(manhattan_nyc_grouped)

*************Battery Park City**************
                          Venue   Feq
0           Venue Category_Park  0.11
1          Venue Category_Hotel  0.07
2            Venue Category_Gym  0.05
3    Venue Category_Coffee Shop  0.05
4  Venue Category_Memorial Site  0.04


*************Carnegie Hill**************
                           Venue   Feq
0     Venue Category_Coffee Shop  0.08
1            Venue Category_Café  0.05
2     Venue Category_Yoga Studio  0.03
3  Venue Category_Cosmetics Shop  0.03
4       Venue Category_Bookstore  0.03


*************Central Harlem**************
                                Venue   Feq
0   Venue Category_African Restaurant  0.07
1   Venue Category_Chinese Restaurant  0.04
2   Venue Category_Seafood Restaurant  0.04
3  Venue Category_American Restaurant  0.04
4           Venue Category_Restaurant  0.04


*************Chelsea**************
                                Venue   Feq
0          Venue Category_Coffee Shop  0.09
1          Venue 

In [42]:
# Helper used to build a dataframe of each neighborhood and its top 10 venues:
# returns the most common venue categories for one row of a grouped dataframe
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [43]:
import numpy as np
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted_d = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted_d['Neighborhood'] = downtown_toronto_grouped['Neighborhood']

for ind in np.arange(downtown_toronto_grouped.shape[0]):
    neighborhoods_venues_sorted_d.iloc[ind, 1:] = return_most_common_venues(downtown_toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted_d

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Berczy Park,Venue Category_Coffee Shop,Venue Category_Cheese Shop,Venue Category_Farmers Market,Venue Category_Café,Venue Category_Bakery,Venue Category_Beer Bar,Venue Category_Seafood Restaurant,Venue Category_Restaurant,Venue Category_Cocktail Bar,Venue Category_Beach
1,"CN Tower, King and Spadina, Railway Lands, Har...",Venue Category_Airport Lounge,Venue Category_Airport Service,Venue Category_Airport Terminal,Venue Category_Sculpture Garden,Venue Category_Airport,Venue Category_Airport Food Court,Venue Category_Bar,Venue Category_Rental Car Location,Venue Category_Boutique,Venue Category_Harbor / Marina
2,Central Bay Street,Venue Category_Coffee Shop,Venue Category_Sandwich Place,Venue Category_Italian Restaurant,Venue Category_Japanese Restaurant,Venue Category_Café,Venue Category_Salad Place,Venue Category_Department Store,Venue Category_Bubble Tea Shop,Venue Category_Burger Joint,Venue Category_Portuguese Restaurant
3,Christie,Venue Category_Grocery Store,Venue Category_Café,Venue Category_Park,Venue Category_Italian Restaurant,Venue Category_Baby Store,Venue Category_Diner,Venue Category_Restaurant,Venue Category_Nightclub,Venue Category_Candy Store,Venue Category_Coffee Shop
4,Church and Wellesley,Venue Category_Coffee Shop,Venue Category_Japanese Restaurant,Venue Category_Sushi Restaurant,Venue Category_Gay Bar,Venue Category_Restaurant,Venue Category_Mediterranean Restaurant,Venue Category_Hotel,Venue Category_Café,Venue Category_Yoga Studio,Venue Category_Bubble Tea Shop
5,"Commerce Court, Victoria Hotel",Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Restaurant,Venue Category_Hotel,Venue Category_American Restaurant,Venue Category_Gym,Venue Category_Italian Restaurant,Venue Category_Seafood Restaurant,Venue Category_Japanese Restaurant,Venue Category_Cocktail Bar
6,"First Canadian Place, Underground city",Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Hotel,Venue Category_Restaurant,Venue Category_Gym,Venue Category_Japanese Restaurant,Venue Category_Seafood Restaurant,Venue Category_Asian Restaurant,Venue Category_Steakhouse,Venue Category_American Restaurant
7,"Garden District, Ryerson",Venue Category_Clothing Store,Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Cosmetics Shop,Venue Category_Bubble Tea Shop,Venue Category_Japanese Restaurant,Venue Category_Italian Restaurant,Venue Category_Middle Eastern Restaurant,Venue Category_Pizza Place,Venue Category_Plaza
8,"Harbourfront East, Union Station, Toronto Islands",Venue Category_Coffee Shop,Venue Category_Aquarium,Venue Category_Hotel,Venue Category_Café,Venue Category_Fried Chicken Joint,Venue Category_Restaurant,Venue Category_Italian Restaurant,Venue Category_Scenic Lookout,Venue Category_Brewery,Venue Category_Bar
9,"Kensington Market, Chinatown, Grange Park",Venue Category_Vegetarian / Vegan Restaurant,Venue Category_Café,Venue Category_Coffee Shop,Venue Category_Bar,Venue Category_Vietnamese Restaurant,Venue Category_Mexican Restaurant,Venue Category_Pizza Place,Venue Category_Gaming Cafe,Venue Category_Park,Venue Category_Bakery
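
The column labels above are built by indexing a three-element suffix list and falling back to "th". That works for a top-10 table, but it would label a 21st column "21th". A minimal sketch of a suffix helper without exception handling follows; `ordinal` is a hypothetical name, not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th, 13th and the rest of the teens always take 'th'.
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

# Build the same column labels as the cell above for a top-10 table.
columns = ["Neighborhood"] + [f"{ordinal(i)} Most Common Venue" for i in range(1, 11)]
print(columns)
```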


In [44]:
import numpy as np
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = manhattan_nyc_grouped['Neighborhood']

for ind in np.arange(manhattan_nyc_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(manhattan_nyc_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Battery Park City,Venue Category_Park,Venue Category_Hotel,Venue Category_Gym,Venue Category_Coffee Shop,Venue Category_Shopping Mall,Venue Category_Memorial Site,Venue Category_BBQ Joint,Venue Category_Boat or Ferry,Venue Category_Food Court,Venue Category_Gourmet Shop
1,Carnegie Hill,Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Yoga Studio,Venue Category_Bookstore,Venue Category_Gym / Fitness Center,Venue Category_Gym,Venue Category_French Restaurant,Venue Category_Cosmetics Shop,Venue Category_Pizza Place,Venue Category_Wine Shop
2,Central Harlem,Venue Category_African Restaurant,Venue Category_Seafood Restaurant,Venue Category_American Restaurant,Venue Category_Bar,Venue Category_French Restaurant,Venue Category_Chinese Restaurant,Venue Category_Restaurant,Venue Category_Cosmetics Shop,Venue Category_Southern / Soul Food Restaurant,Venue Category_Boutique
3,Chelsea,Venue Category_Coffee Shop,Venue Category_Art Gallery,Venue Category_American Restaurant,Venue Category_Bakery,Venue Category_French Restaurant,Venue Category_Italian Restaurant,Venue Category_Hotel,Venue Category_Seafood Restaurant,Venue Category_Market,Venue Category_Bar
4,Chinatown,Venue Category_Chinese Restaurant,Venue Category_Dessert Shop,Venue Category_Cocktail Bar,Venue Category_Vietnamese Restaurant,Venue Category_Bakery,Venue Category_Spa,Venue Category_Salon / Barbershop,Venue Category_American Restaurant,Venue Category_Optical Shop,Venue Category_Ice Cream Shop
5,Civic Center,Venue Category_Coffee Shop,Venue Category_Hotel,Venue Category_Cocktail Bar,Venue Category_French Restaurant,Venue Category_Gym / Fitness Center,Venue Category_Yoga Studio,Venue Category_Spa,Venue Category_Park,Venue Category_Italian Restaurant,Venue Category_Sushi Restaurant
6,Clinton,Venue Category_Gym / Fitness Center,Venue Category_Cocktail Bar,Venue Category_Sandwich Place,Venue Category_Italian Restaurant,Venue Category_Theater,Venue Category_Coffee Shop,Venue Category_American Restaurant,Venue Category_Gym,Venue Category_Hotel,Venue Category_Pizza Place
7,East Harlem,Venue Category_Mexican Restaurant,Venue Category_Thai Restaurant,Venue Category_Bakery,Venue Category_Latin American Restaurant,Venue Category_Deli / Bodega,Venue Category_Sandwich Place,Venue Category_Spa,Venue Category_Gas Station,Venue Category_Beer Bar,Venue Category_Cocktail Bar
8,East Village,Venue Category_Bar,Venue Category_Ice Cream Shop,Venue Category_Mexican Restaurant,Venue Category_Cocktail Bar,Venue Category_Pizza Place,Venue Category_Wine Bar,Venue Category_Speakeasy,Venue Category_Vegetarian / Vegan Restaurant,Venue Category_Italian Restaurant,Venue Category_Coffee Shop
9,Financial District,Venue Category_Coffee Shop,Venue Category_Pizza Place,Venue Category_Café,Venue Category_Cocktail Bar,Venue Category_Gym / Fitness Center,Venue Category_Mexican Restaurant,Venue Category_Bar,Venue Category_Italian Restaurant,Venue Category_Gym,Venue Category_Steakhouse


### Clustering Neighborhoods

In [45]:
from sklearn.cluster import KMeans
# set number of clusters
kclusters = 5

downtown_toronto_grouped_clustering = downtown_toronto_grouped.drop('Neighborhood', axis=1)
#print(downtown_toronto_grouped_clustering)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(downtown_toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([1, 2, 1, 4, 1, 1, 1, 1, 1, 1])
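
A small sketch of what `kmeans.labels_` holds may help when reading the array above. The feature matrix `X` is hypothetical. The point is that `labels_[i]` belongs to row `i` of whatever was passed to `fit`, and that the label numbers themselves carry no meaning beyond identifying a group.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical frequency vectors: rows 0-1 are coffee-heavy, rows 2-3 are park-heavy.
X = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
])

kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(X)

# labels_[i] is the cluster of row i of X, so the row order of X matters
# whenever the labels are later attached to another table.
print(kmeans.labels_)
```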

In [46]:
# Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.
downtown_toronto_merged = downtown_data

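# add clustering labels
# NOTE: kmeans.labels_ follows the row order of downtown_toronto_grouped; this positional
# assignment assumes downtown_data lists the neighborhoods in that same order
downtown_toronto_merged['Cluster Labels'] = kmeans.labels_

# join the top-10 venues table onto the neighborhood data (which holds latitude/longitude)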
downtown_toronto_merged = downtown_toronto_merged.join(neighborhoods_venues_sorted_d.set_index('Neighborhood'), on='Neighborhood')

downtown_toronto_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1,Venue Category_Coffee Shop,Venue Category_Pub,Venue Category_Bakery,Venue Category_Park,Venue Category_Restaurant,Venue Category_Breakfast Spot,Venue Category_Café,Venue Category_Theater,Venue Category_Farmers Market,Venue Category_Performing Arts Venue
1,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,2,Venue Category_Coffee Shop,Venue Category_Diner,Venue Category_Yoga Studio,Venue Category_College Auditorium,Venue Category_Beer Bar,Venue Category_Smoothie Shop,Venue Category_Sandwich Place,Venue Category_Burrito Place,Venue Category_Café,Venue Category_Portuguese Restaurant
2,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,1,Venue Category_Clothing Store,Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Cosmetics Shop,Venue Category_Bubble Tea Shop,Venue Category_Japanese Restaurant,Venue Category_Italian Restaurant,Venue Category_Middle Eastern Restaurant,Venue Category_Pizza Place,Venue Category_Plaza
3,Downtown Toronto,St. James Town,43.651494,-79.375418,4,Venue Category_Café,Venue Category_Coffee Shop,Venue Category_Clothing Store,Venue Category_American Restaurant,Venue Category_Cocktail Bar,Venue Category_Cosmetics Shop,Venue Category_Restaurant,Venue Category_Diner,Venue Category_Creperie,Venue Category_Park
4,Downtown Toronto,Berczy Park,43.644771,-79.373306,1,Venue Category_Coffee Shop,Venue Category_Cheese Shop,Venue Category_Farmers Market,Venue Category_Café,Venue Category_Bakery,Venue Category_Beer Bar,Venue Category_Seafood Restaurant,Venue Category_Restaurant,Venue Category_Cocktail Bar,Venue Category_Beach
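
In the output above, the first five cluster labels (1, 2, 1, 4, 1) are identical to `kmeans.labels_[0:5]`, although the rows are in a different order from the grouped table (which starts with Berczy Park). The labels therefore appear to have been attached by position rather than by neighborhood. A minimal sketch of attaching them by name instead is shown below. The tables and label values are hypothetical stand-ins, not the notebook's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins: `grouped` is sorted by name (as groupby leaves it),
# while `data` keeps a different, original row order.
grouped = pd.DataFrame({"Neighborhood": ["Berczy Park", "Christie", "St. James Town"]})
data = pd.DataFrame({"Neighborhood": ["St. James Town", "Berczy Park", "Christie"]})

# Stand-in for kmeans.labels_, which is in the row order of `grouped`.
labels = np.array([1, 4, 2])

# Pair each label with its neighborhood name, then merge on the name.
label_table = pd.DataFrame({
    "Neighborhood": grouped["Neighborhood"],
    "Cluster Labels": labels,
})
merged = data.merge(label_table, on="Neighborhood", how="left")
print(merged)
```

Merging on the name keeps every neighborhood paired with its own cluster regardless of row order. A neighborhood with no venues (and so no row in the grouped table) would receive `NaN` rather than a shifted label.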


In [47]:
import matplotlib.cm as cm
import matplotlib.colors as colors
# create map
map_clusters = folium.Map(location=[43.653908, -79.384293], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(downtown_toronto_merged['Latitude'], downtown_toronto_merged['Longitude'], downtown_toronto_merged['Neighborhood'], downtown_toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Examine Clusters

Now we can examine each cluster and identify the venue categories that distinguish it. Based on those defining categories, we can then assign a name to each cluster.
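
One way to find defining categories, sketched here on hypothetical data, is to average the category frequencies within each cluster and look at the largest values. The table `profile_input` and its numbers are made up for illustration; this is not a step the notebook performs.

```python
import pandas as pd

# Hypothetical frequency table with a cluster label already attached.
profile_input = pd.DataFrame({
    "Neighborhood": ["A", "B", "C"],
    "Cluster Labels": [0, 0, 1],
    "Venue Category_Coffee Shop": [0.30, 0.20, 0.00],
    "Venue Category_Airport Lounge": [0.00, 0.00, 0.12],
    "Venue Category_Park": [0.05, 0.15, 0.02],
})

# Mean frequency of every category within each cluster.
profile = profile_input.drop(columns="Neighborhood").groupby("Cluster Labels").mean()

# The highest-valued category per cluster is a candidate for the cluster's name.
top_category = profile.idxmax(axis=1)
print(top_category)
```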

### Cluster 1 (Airport Lounge, Coffee Shop, Cafe, Restaurants & Grocery Store)

In [48]:
downtown_toronto_merged.loc[downtown_toronto_merged['Cluster Labels'] == 0, downtown_toronto_merged.columns[[1] + list(range(5, downtown_toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,"Commerce Court, Victoria Hotel",Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Restaurant,Venue Category_Hotel,Venue Category_American Restaurant,Venue Category_Gym,Venue Category_Italian Restaurant,Venue Category_Seafood Restaurant,Venue Category_Japanese Restaurant,Venue Category_Cocktail Bar


### Cluster 2 (Gastropubs)

In [49]:
downtown_toronto_merged.loc[downtown_toronto_merged['Cluster Labels'] == 1, downtown_toronto_merged.columns[[1] + list(range(5, downtown_toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Regent Park, Harbourfront",Venue Category_Coffee Shop,Venue Category_Pub,Venue Category_Bakery,Venue Category_Park,Venue Category_Restaurant,Venue Category_Breakfast Spot,Venue Category_Café,Venue Category_Theater,Venue Category_Farmers Market,Venue Category_Performing Arts Venue
2,"Garden District, Ryerson",Venue Category_Clothing Store,Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Cosmetics Shop,Venue Category_Bubble Tea Shop,Venue Category_Japanese Restaurant,Venue Category_Italian Restaurant,Venue Category_Middle Eastern Restaurant,Venue Category_Pizza Place,Venue Category_Plaza
4,Berczy Park,Venue Category_Coffee Shop,Venue Category_Cheese Shop,Venue Category_Farmers Market,Venue Category_Café,Venue Category_Bakery,Venue Category_Beer Bar,Venue Category_Seafood Restaurant,Venue Category_Restaurant,Venue Category_Cocktail Bar,Venue Category_Beach
5,Central Bay Street,Venue Category_Coffee Shop,Venue Category_Sandwich Place,Venue Category_Italian Restaurant,Venue Category_Japanese Restaurant,Venue Category_Café,Venue Category_Salad Place,Venue Category_Department Store,Venue Category_Bubble Tea Shop,Venue Category_Burger Joint,Venue Category_Portuguese Restaurant
6,Christie,Venue Category_Grocery Store,Venue Category_Café,Venue Category_Park,Venue Category_Italian Restaurant,Venue Category_Baby Store,Venue Category_Diner,Venue Category_Restaurant,Venue Category_Nightclub,Venue Category_Candy Store,Venue Category_Coffee Shop
7,"Richmond, Adelaide, King",Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Bar,Venue Category_Clothing Store,Venue Category_Restaurant,Venue Category_Gym,Venue Category_Hotel,Venue Category_Steakhouse,Venue Category_Thai Restaurant,Venue Category_Office
8,"Harbourfront East, Union Station, Toronto Islands",Venue Category_Coffee Shop,Venue Category_Aquarium,Venue Category_Hotel,Venue Category_Café,Venue Category_Fried Chicken Joint,Venue Category_Restaurant,Venue Category_Italian Restaurant,Venue Category_Scenic Lookout,Venue Category_Brewery,Venue Category_Bar
9,"Toronto Dominion Centre, Design Exchange",Venue Category_Coffee Shop,Venue Category_Hotel,Venue Category_Café,Venue Category_Restaurant,Venue Category_Salad Place,Venue Category_Seafood Restaurant,Venue Category_Japanese Restaurant,Venue Category_American Restaurant,Venue Category_Sushi Restaurant,Venue Category_Asian Restaurant
11,"University of Toronto, Harbord",Venue Category_Café,Venue Category_Bookstore,Venue Category_Restaurant,Venue Category_Bar,Venue Category_Japanese Restaurant,Venue Category_Sandwich Place,Venue Category_Bakery,Venue Category_Theater,Venue Category_Italian Restaurant,Venue Category_Beer Bar
12,"Kensington Market, Chinatown, Grange Park",Venue Category_Vegetarian / Vegan Restaurant,Venue Category_Café,Venue Category_Coffee Shop,Venue Category_Bar,Venue Category_Vietnamese Restaurant,Venue Category_Mexican Restaurant,Venue Category_Pizza Place,Venue Category_Gaming Cafe,Venue Category_Park,Venue Category_Bakery


### Cluster 3 (Cafes)

In [50]:
downtown_toronto_merged.loc[downtown_toronto_merged['Cluster Labels'] == 2, downtown_toronto_merged.columns[[1] + list(range(5, downtown_toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,"Queen's Park, Ontario Provincial Government",Venue Category_Coffee Shop,Venue Category_Diner,Venue Category_Yoga Studio,Venue Category_College Auditorium,Venue Category_Beer Bar,Venue Category_Smoothie Shop,Venue Category_Sandwich Place,Venue Category_Burrito Place,Venue Category_Café,Venue Category_Portuguese Restaurant


### Cluster 4 (Coffee Shop, Cafe, Park & Japanese Restaurant)

In [51]:
downtown_toronto_merged.loc[downtown_toronto_merged['Cluster Labels'] == 3, downtown_toronto_merged.columns[[1] + list(range(5, downtown_toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
13,"CN Tower, King and Spadina, Railway Lands, Har...",Venue Category_Airport Lounge,Venue Category_Airport Service,Venue Category_Airport Terminal,Venue Category_Sculpture Garden,Venue Category_Airport,Venue Category_Airport Food Court,Venue Category_Bar,Venue Category_Rental Car Location,Venue Category_Boutique,Venue Category_Harbor / Marina


### Cluster 5 (Seafood, Steakhouse, Hotel & Cafe)

In [52]:
downtown_toronto_merged.loc[downtown_toronto_merged['Cluster Labels'] == 4, downtown_toronto_merged.columns[[1] + list(range(5, downtown_toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,St. James Town,Venue Category_Café,Venue Category_Coffee Shop,Venue Category_Clothing Store,Venue Category_American Restaurant,Venue Category_Cocktail Bar,Venue Category_Cosmetics Shop,Venue Category_Restaurant,Venue Category_Diner,Venue Category_Creperie,Venue Category_Park


### Exploring Neighborhoods in Manhattan

### Analyzing Manhattan Neighborhoods

In [53]:
# Helper that returns the most common venue categories for one neighborhood row
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [54]:
# Now let's create the new dataframe and display the top 10 venues for each neighborhood.
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = manhattan_nyc_grouped['Neighborhood']

for ind in np.arange(manhattan_nyc_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(manhattan_nyc_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Battery Park City,Venue Category_Park,Venue Category_Hotel,Venue Category_Gym,Venue Category_Coffee Shop,Venue Category_Shopping Mall,Venue Category_Memorial Site,Venue Category_BBQ Joint,Venue Category_Boat or Ferry,Venue Category_Food Court,Venue Category_Gourmet Shop
1,Carnegie Hill,Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Yoga Studio,Venue Category_Bookstore,Venue Category_Gym / Fitness Center,Venue Category_Gym,Venue Category_French Restaurant,Venue Category_Cosmetics Shop,Venue Category_Pizza Place,Venue Category_Wine Shop
2,Central Harlem,Venue Category_African Restaurant,Venue Category_Seafood Restaurant,Venue Category_American Restaurant,Venue Category_Bar,Venue Category_French Restaurant,Venue Category_Chinese Restaurant,Venue Category_Restaurant,Venue Category_Cosmetics Shop,Venue Category_Southern / Soul Food Restaurant,Venue Category_Boutique
3,Chelsea,Venue Category_Coffee Shop,Venue Category_Art Gallery,Venue Category_American Restaurant,Venue Category_Bakery,Venue Category_French Restaurant,Venue Category_Italian Restaurant,Venue Category_Hotel,Venue Category_Seafood Restaurant,Venue Category_Market,Venue Category_Bar
4,Chinatown,Venue Category_Chinese Restaurant,Venue Category_Dessert Shop,Venue Category_Cocktail Bar,Venue Category_Vietnamese Restaurant,Venue Category_Bakery,Venue Category_Spa,Venue Category_Salon / Barbershop,Venue Category_American Restaurant,Venue Category_Optical Shop,Venue Category_Ice Cream Shop
5,Civic Center,Venue Category_Coffee Shop,Venue Category_Hotel,Venue Category_Cocktail Bar,Venue Category_French Restaurant,Venue Category_Gym / Fitness Center,Venue Category_Yoga Studio,Venue Category_Spa,Venue Category_Park,Venue Category_Italian Restaurant,Venue Category_Sushi Restaurant
6,Clinton,Venue Category_Gym / Fitness Center,Venue Category_Cocktail Bar,Venue Category_Sandwich Place,Venue Category_Italian Restaurant,Venue Category_Theater,Venue Category_Coffee Shop,Venue Category_American Restaurant,Venue Category_Gym,Venue Category_Hotel,Venue Category_Pizza Place
7,East Harlem,Venue Category_Mexican Restaurant,Venue Category_Thai Restaurant,Venue Category_Bakery,Venue Category_Latin American Restaurant,Venue Category_Deli / Bodega,Venue Category_Sandwich Place,Venue Category_Spa,Venue Category_Gas Station,Venue Category_Beer Bar,Venue Category_Cocktail Bar
8,East Village,Venue Category_Bar,Venue Category_Ice Cream Shop,Venue Category_Mexican Restaurant,Venue Category_Cocktail Bar,Venue Category_Pizza Place,Venue Category_Wine Bar,Venue Category_Speakeasy,Venue Category_Vegetarian / Vegan Restaurant,Venue Category_Italian Restaurant,Venue Category_Coffee Shop
9,Financial District,Venue Category_Coffee Shop,Venue Category_Pizza Place,Venue Category_Café,Venue Category_Cocktail Bar,Venue Category_Gym / Fitness Center,Venue Category_Mexican Restaurant,Venue Category_Bar,Venue Category_Italian Restaurant,Venue Category_Gym,Venue Category_Steakhouse


### Clustering Neighborhoods

In [55]:
# Run k-means to cluster the neighborhood into 5 clusters.
# set number of clusters
kclusters = 5

manhattan_grouped_clustering = manhattan_nyc_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(manhattan_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 


array([0, 2, 4, 2, 4, 2, 2, 4, 4, 2])

In [56]:
# Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood
manhattan_merged = manhattan_data

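# add clustering labels
# NOTE: kmeans.labels_ follows the row order of manhattan_nyc_grouped; this positional
# assignment assumes manhattan_data lists the neighborhoods in that same order
manhattan_merged['Cluster Labels'] = kmeans.labels_

# join the top-10 venues table onto the neighborhood data (which holds latitude/longitude)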
manhattan_merged = manhattan_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

manhattan_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Manhattan,Marble Hill,40.876551,-73.91066,0,Venue Category_Gym,Venue Category_Sandwich Place,Venue Category_Coffee Shop,Venue Category_Yoga Studio,Venue Category_Deli / Bodega,Venue Category_Supplement Shop,Venue Category_Steakhouse,Venue Category_Shopping Mall,Venue Category_Seafood Restaurant,Venue Category_Pizza Place
1,Manhattan,Chinatown,40.715618,-73.994279,2,Venue Category_Chinese Restaurant,Venue Category_Dessert Shop,Venue Category_Cocktail Bar,Venue Category_Vietnamese Restaurant,Venue Category_Bakery,Venue Category_Spa,Venue Category_Salon / Barbershop,Venue Category_American Restaurant,Venue Category_Optical Shop,Venue Category_Ice Cream Shop
2,Manhattan,Washington Heights,40.851903,-73.9369,4,Venue Category_Café,Venue Category_Bakery,Venue Category_Bank,Venue Category_Mobile Phone Shop,Venue Category_Deli / Bodega,Venue Category_Pizza Place,Venue Category_Latin American Restaurant,Venue Category_Supermarket,Venue Category_Italian Restaurant,Venue Category_Sandwich Place
3,Manhattan,Inwood,40.867684,-73.92121,2,Venue Category_Lounge,Venue Category_Mexican Restaurant,Venue Category_Restaurant,Venue Category_Café,Venue Category_Park,Venue Category_Chinese Restaurant,Venue Category_Bakery,Venue Category_Spanish Restaurant,Venue Category_Frozen Yogurt Shop,Venue Category_American Restaurant
4,Manhattan,Hamilton Heights,40.823604,-73.949688,4,Venue Category_Pizza Place,Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Mexican Restaurant,Venue Category_Deli / Bodega,Venue Category_Yoga Studio,Venue Category_Chinese Restaurant,Venue Category_Sushi Restaurant,Venue Category_Cocktail Bar,Venue Category_School
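
The positional assignment of `kmeans.labels_` above is only correct if the rows of `manhattan_data` are in the same order as the rows of the matrix passed to `KMeans`. If that matrix was built with a `groupby` on the neighborhood name (which sorts names alphabetically), the two orders can differ. Below is a minimal sketch of attaching labels by neighborhood name instead. The frames `data` and `grouped` and the label values are made-up stand-ins for illustration, not the notebook's actual values.

```python
import pandas as pd

# Toy stand-in for manhattan_data (original row order).
data = pd.DataFrame({
    "Neighborhood": ["Marble Hill", "Chinatown", "Inwood"],
    "Latitude": [40.876551, 40.715618, 40.867684],
    "Longitude": [-73.91066, -73.994279, -73.92121],
})

# Toy stand-in for the grouped frame fed to k-means (names sorted alphabetically).
grouped = pd.DataFrame({"Neighborhood": sorted(data["Neighborhood"])})

# Hypothetical labels, one per row of `grouped`, in the same order as the fit input.
labels = [2, 1, 0]

# Key the labels by neighborhood name, then look each name up.
label_by_name = pd.Series(labels, index=grouped["Neighborhood"])
merged = data.copy()
merged["Cluster Labels"] = merged["Neighborhood"].map(label_by_name)

print(merged[["Neighborhood", "Cluster Labels"]])
```

Keying on the name makes the result independent of row order, at the cost of requiring neighborhood names to be unique within the borough.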


In [57]:
# create map
map_clusters = folium.Map(location=[40.876551,-73.910660], zoom_start=11)

# set color scheme for the clusters: one rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(manhattan_merged['Latitude'], manhattan_merged['Longitude'], manhattan_merged['Neighborhood'], manhattan_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Examine Clusters

Now, we can examine each cluster and determine the discriminating venue categories that distinguish each cluster. Based on the defining categories, we can then assign a name to each cluster.
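
One way to support the naming step is to count which categories appear most often among the ranked venue columns of each cluster. The sketch below uses a small hand-made frame with the same column naming as `manhattan_merged`. The rows are illustrative only, and the helper `top_categories` is a hypothetical name, not part of the notebook.

```python
import pandas as pd

# Small illustrative frame mimicking the layout of manhattan_merged.
toy = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D"],
    "Cluster Labels": [0, 0, 1, 1],
    "1st Most Common Venue": ["Park", "Park", "Hotel", "Bar"],
    "2nd Most Common Venue": ["Gym", "Hotel", "Bar", "Hotel"],
})

def top_categories(df, cluster, n=3):
    """Count how often each category appears in the ranked venue columns of one cluster."""
    venue_cols = [c for c in df.columns if c.endswith("Most Common Venue")]
    rows = df.loc[df["Cluster Labels"] == cluster, venue_cols]
    # melt() flattens the ranked columns into a single column of category names
    flat = rows.melt(value_name="category")["category"]
    return flat.value_counts().head(n)

for cluster in sorted(toy["Cluster Labels"].unique()):
    print(f"Cluster {cluster}")
    print(top_categories(toy, cluster))
```

Categories that dominate one cluster but are rare in the others are natural candidates for that cluster's name.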

### Manhattan

### Residential

In [58]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 0, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Marble Hill,Venue Category_Gym,Venue Category_Sandwich Place,Venue Category_Coffee Shop,Venue Category_Yoga Studio,Venue Category_Deli / Bodega,Venue Category_Supplement Shop,Venue Category_Steakhouse,Venue Category_Shopping Mall,Venue Category_Seafood Restaurant,Venue Category_Pizza Place
28,Battery Park City,Venue Category_Park,Venue Category_Hotel,Venue Category_Gym,Venue Category_Coffee Shop,Venue Category_Shopping Mall,Venue Category_Memorial Site,Venue Category_BBQ Joint,Venue Category_Boat or Ferry,Venue Category_Food Court,Venue Category_Gourmet Shop


### Commercial Places

In [59]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 1, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,Clinton,Venue Category_Gym / Fitness Center,Venue Category_Cocktail Bar,Venue Category_Sandwich Place,Venue Category_Italian Restaurant,Venue Category_Theater,Venue Category_Coffee Shop,Venue Category_American Restaurant,Venue Category_Gym,Venue Category_Hotel,Venue Category_Pizza Place
17,Chelsea,Venue Category_Coffee Shop,Venue Category_Art Gallery,Venue Category_American Restaurant,Venue Category_Bakery,Venue Category_French Restaurant,Venue Category_Italian Restaurant,Venue Category_Hotel,Venue Category_Seafood Restaurant,Venue Category_Market,Venue Category_Bar
23,Soho,Venue Category_Italian Restaurant,Venue Category_Coffee Shop,Venue Category_Clothing Store,Venue Category_Boutique,Venue Category_Mediterranean Restaurant,Venue Category_Bakery,Venue Category_Seafood Restaurant,Venue Category_Salon / Barbershop,Venue Category_Furniture / Home Store,Venue Category_Vegetarian / Vegan Restaurant


### Tourist Areas & Hubs

In [60]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 2, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Chinatown,Venue Category_Chinese Restaurant,Venue Category_Dessert Shop,Venue Category_Cocktail Bar,Venue Category_Vietnamese Restaurant,Venue Category_Bakery,Venue Category_Spa,Venue Category_Salon / Barbershop,Venue Category_American Restaurant,Venue Category_Optical Shop,Venue Category_Ice Cream Shop
3,Inwood,Venue Category_Lounge,Venue Category_Mexican Restaurant,Venue Category_Restaurant,Venue Category_Café,Venue Category_Park,Venue Category_Chinese Restaurant,Venue Category_Bakery,Venue Category_Spanish Restaurant,Venue Category_Frozen Yogurt Shop,Venue Category_American Restaurant
5,Manhattanville,Venue Category_Coffee Shop,Venue Category_Mexican Restaurant,Venue Category_Seafood Restaurant,Venue Category_Italian Restaurant,Venue Category_Bar,Venue Category_Deli / Bodega,Venue Category_Bus Station,Venue Category_Boutique,Venue Category_Bike Trail,Venue Category_Lounge
6,Central Harlem,Venue Category_African Restaurant,Venue Category_Seafood Restaurant,Venue Category_American Restaurant,Venue Category_Bar,Venue Category_French Restaurant,Venue Category_Chinese Restaurant,Venue Category_Restaurant,Venue Category_Cosmetics Shop,Venue Category_Southern / Soul Food Restaurant,Venue Category_Boutique
9,Yorkville,Venue Category_Italian Restaurant,Venue Category_Gym,Venue Category_Coffee Shop,Venue Category_Bar,Venue Category_Pizza Place,Venue Category_Sushi Restaurant,Venue Category_Deli / Bodega,Venue Category_Wine Shop,Venue Category_Diner,Venue Category_Mexican Restaurant
10,Lenox Hill,Venue Category_Sushi Restaurant,Venue Category_Coffee Shop,Venue Category_Pizza Place,Venue Category_Italian Restaurant,Venue Category_Cocktail Bar,Venue Category_Café,Venue Category_Gym / Fitness Center,Venue Category_Deli / Bodega,Venue Category_Gym,Venue Category_Burger Joint
12,Upper West Side,Venue Category_Italian Restaurant,Venue Category_Coffee Shop,Venue Category_Bakery,Venue Category_Bar,Venue Category_Dessert Shop,Venue Category_Indian Restaurant,Venue Category_Mediterranean Restaurant,Venue Category_Wine Bar,Venue Category_Pub,Venue Category_Ice Cream Shop
16,Murray Hill,Venue Category_Sandwich Place,Venue Category_Coffee Shop,Venue Category_American Restaurant,Venue Category_Japanese Restaurant,Venue Category_Mediterranean Restaurant,Venue Category_Gym / Fitness Center,Venue Category_Hotel,Venue Category_Bar,Venue Category_Bakery,Venue Category_Café
18,Greenwich Village,Venue Category_Italian Restaurant,Venue Category_Sushi Restaurant,Venue Category_Café,Venue Category_Bubble Tea Shop,Venue Category_Ice Cream Shop,Venue Category_Indian Restaurant,Venue Category_Gym,Venue Category_Clothing Store,Venue Category_Chinese Restaurant,Venue Category_Sandwich Place
24,West Village,Venue Category_Italian Restaurant,Venue Category_New American Restaurant,Venue Category_American Restaurant,Venue Category_Cocktail Bar,Venue Category_Park,Venue Category_Wine Bar,Venue Category_Jazz Club,Venue Category_Coffee Shop,Venue Category_Theater,Venue Category_Bakery


### Center Activity

In [61]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 3, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
30,Carnegie Hill,Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Yoga Studio,Venue Category_Bookstore,Venue Category_Gym / Fitness Center,Venue Category_Gym,Venue Category_French Restaurant,Venue Category_Cosmetics Shop,Venue Category_Pizza Place,Venue Category_Wine Shop


### Cultural & Going Out Places

In [62]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 4, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Washington Heights,Venue Category_Café,Venue Category_Bakery,Venue Category_Bank,Venue Category_Mobile Phone Shop,Venue Category_Deli / Bodega,Venue Category_Pizza Place,Venue Category_Latin American Restaurant,Venue Category_Supermarket,Venue Category_Italian Restaurant,Venue Category_Sandwich Place
4,Hamilton Heights,Venue Category_Pizza Place,Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Mexican Restaurant,Venue Category_Deli / Bodega,Venue Category_Yoga Studio,Venue Category_Chinese Restaurant,Venue Category_Sushi Restaurant,Venue Category_Cocktail Bar,Venue Category_School
7,East Harlem,Venue Category_Mexican Restaurant,Venue Category_Thai Restaurant,Venue Category_Bakery,Venue Category_Latin American Restaurant,Venue Category_Deli / Bodega,Venue Category_Sandwich Place,Venue Category_Spa,Venue Category_Gas Station,Venue Category_Beer Bar,Venue Category_Cocktail Bar
8,Upper East Side,Venue Category_Italian Restaurant,Venue Category_Coffee Shop,Venue Category_Bakery,Venue Category_Gym / Fitness Center,Venue Category_Yoga Studio,Venue Category_Hotel,Venue Category_Juice Bar,Venue Category_French Restaurant,Venue Category_Exhibit,Venue Category_Spa
11,Roosevelt Island,Venue Category_Park,Venue Category_Outdoors & Recreation,Venue Category_Residential Building (Apartment...,Venue Category_Scenic Lookout,Venue Category_Sandwich Place,Venue Category_Liquor Store,Venue Category_Noodle House,Venue Category_Dry Cleaner,Venue Category_Greek Restaurant,Venue Category_Gym
13,Lincoln Square,Venue Category_Plaza,Venue Category_Café,Venue Category_Gym / Fitness Center,Venue Category_Concert Hall,Venue Category_Performing Arts Venue,Venue Category_Theater,Venue Category_Italian Restaurant,Venue Category_American Restaurant,Venue Category_Clothing Store,Venue Category_Wine Shop
15,Midtown,Venue Category_Hotel,Venue Category_Clothing Store,Venue Category_Bakery,Venue Category_Coffee Shop,Venue Category_Sporting Goods Shop,Venue Category_Theater,Venue Category_American Restaurant,Venue Category_Bookstore,Venue Category_Steakhouse,Venue Category_Sandwich Place
19,East Village,Venue Category_Bar,Venue Category_Ice Cream Shop,Venue Category_Mexican Restaurant,Venue Category_Cocktail Bar,Venue Category_Pizza Place,Venue Category_Wine Bar,Venue Category_Speakeasy,Venue Category_Vegetarian / Vegan Restaurant,Venue Category_Italian Restaurant,Venue Category_Coffee Shop
20,Lower East Side,Venue Category_Chinese Restaurant,Venue Category_Bakery,Venue Category_Pizza Place,Venue Category_Café,Venue Category_Japanese Restaurant,Venue Category_Art Gallery,Venue Category_Coffee Shop,Venue Category_Cocktail Bar,Venue Category_Yoga Studio,Venue Category_Mediterranean Restaurant
21,Tribeca,Venue Category_Italian Restaurant,Venue Category_Park,Venue Category_Wine Bar,Venue Category_Spa,Venue Category_Greek Restaurant,Venue Category_American Restaurant,Venue Category_Coffee Shop,Venue Category_Café,Venue Category_Men's Store,Venue Category_Bakery


## Results and Discussion <a name="results"></a>

After clustering the neighborhoods of each borough, we find that both Manhattan and Downtown Toronto have venues that tourists can explore. The neighborhoods are very similar in features such as theaters, opera houses, food places, clubs, museums and parks. The differences lie in a few unique venue types, such as historical places and monuments.

## Observations & Recommendations

When we compare the tourist places, we observe that the historical place venue appears only in Downtown Toronto, while the monument or landmark venue appears in the Manhattan neighborhoods. Similarly, an airport facility, a harbor, a sculpture garden and boat or ferry services are available in Downtown Toronto, while venues such as nightlife spots, climbing gyms and museums are present in Manhattan.

As for recommendations, we suggest that tourists consider the Downtown Toronto neighborhoods first. The airport facility gives tourists easy travel access, which saves both time and money. The money saved can be spent on exploring more of the attractive venues.
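
The "present in one city but not the other" comparison above amounts to a set difference over venue categories. Below is a minimal sketch, assuming each city's venues are available as a frame with a `Venue Category` column. The frames `toronto_venues_toy` and `manhattan_venues_toy` and their rows are made up for illustration and do not reproduce the study's data. The overlap ratio at the end (Jaccard similarity) is an added assumption, not a measure used in the study.

```python
import pandas as pd

# Made-up venue lists, one frame per city.
toronto_venues_toy = pd.DataFrame({
    "Venue Category": ["Historic Site", "Airport", "Theater", "Park"],
})
manhattan_venues_toy = pd.DataFrame({
    "Venue Category": ["Monument / Landmark", "Museum", "Theater", "Park"],
})

toronto_cats = set(toronto_venues_toy["Venue Category"])
manhattan_cats = set(manhattan_venues_toy["Venue Category"])

# Categories found in both cities, and those unique to each.
shared = sorted(toronto_cats & manhattan_cats)
only_toronto = sorted(toronto_cats - manhattan_cats)
only_manhattan = sorted(manhattan_cats - toronto_cats)

# Share of all observed categories that the two cities have in common.
jaccard = len(shared) / len(toronto_cats | manhattan_cats)

print("Shared:", shared)
print("Only Toronto:", only_toronto)
print("Only Manhattan:", only_manhattan)
print(f"Jaccard similarity: {jaccard:.2f}")
```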

## Conclusion <a name="conclusion"></a>

The Downtown Toronto and Manhattan neighborhoods have largely similar venues. Every place is unique in its own way, and that holds for both sets of neighborhoods. The dissimilarity lies in a few different venues and facilities, but it is not large.