## Capstone Project Notebook
### *Clustering and Exploring the neighborhoods of Toronto and NYC*

So far we have constructed our dataframes for analysis; in this part of the project we apply them to explore and cluster the neighborhoods of both cities.

### Instruction
Explore and cluster the neighborhoods in Toronto and NYC. You can decide to work with only boroughs that contain the word Toronto and then replicate the same analysis we did to the New York City data. It is up to you.

Just make sure:

- to add enough Markdown cells to explain what you decided to do and to report any observations you make.
- to generate maps to visualize your neighborhoods and how they cluster together.
- once you are happy with your analysis, submit a link to the new Notebook on your GitHub repository.

In [1]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import json # library to handle JSON files
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import since pandas 1.0)
import matplotlib.cm as cm # Matplotlib and associated plotting modules
import matplotlib.colors as colors
from sklearn.cluster import KMeans # import k-means from clustering stage
import folium # map rendering library
print('Libraries imported.')

Libraries imported.


### 0. Processing the data; we will only use the borough to do the analysis

#### 0.1-Toronto

In [2]:
# load data
df=pd.read_csv("./tor_geoinfo.csv")
df.head()
# drop unnecessary columns
df.drop(["Unnamed: 0", "Postcode"], axis=1, inplace=True)
df_toronto=df
print(df_toronto.shape)
df_toronto.head()

(210, 4)


Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude
0,North York,Parkwoods,43.753259,-79.329656
1,North York,Victoria Village,43.725882,-79.315572
2,Downtown Toronto,Harbourfront,43.65426,-79.360636
3,North York,Lawrence Heights,43.718518,-79.464763
4,North York,Lawrence Manor,43.718518,-79.464763


#### 0.2-NYC

In [3]:
# load data
!wget -q -O 'newyork_data.json' https://cocl.us/new_york_dataset
print('Data downloaded!')
# make dataframe
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)
df = newyork_data['features']
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 
# collect one row per neighborhood, then build the dataframe once
# (DataFrame.append was removed in pandas 2.0)
rows = []
for data in df:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # coordinates are stored as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})
# instantiate the dataframe
df_nyc = pd.DataFrame(rows, columns=column_names)
    
print(df_nyc.shape)
df_nyc.head(10)

Data downloaded!
(306, 4)


Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585
5,Bronx,Kingsbridge,40.881687,-73.902818
6,Manhattan,Marble Hill,40.876551,-73.91066
7,Bronx,Woodlawn,40.898273,-73.867315
8,Bronx,Norwood,40.877224,-73.879391
9,Bronx,Williamsbridge,40.881039,-73.857446
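
As an alternative to looping over the features by hand, the same flattening can be sketched with `pd.json_normalize`. The two features below are a toy stand-in for `newyork_data['features']` (values taken from the output above), and `df_demo` is an illustrative name; the sketch assumes every feature has the `properties` and `geometry` keys used in the loop.

```python
import pandas as pd

# Toy stand-in for newyork_data['features'], with the nesting used in the loop above.
features = [
    {"properties": {"borough": "Bronx", "name": "Wakefield"},
     "geometry": {"coordinates": [-73.847201, 40.894705]}},
    {"properties": {"borough": "Manhattan", "name": "Marble Hill"},
     "geometry": {"coordinates": [-73.91066, 40.876551]}},
]

# Nested dict keys become dotted column names; the coordinate list stays one column.
flat = pd.json_normalize(features)

# Split the [longitude, latitude] pairs into two numeric columns.
coords = pd.DataFrame(flat["geometry.coordinates"].tolist(),
                      columns=["Longitude", "Latitude"])

df_demo = pd.DataFrame({
    "Borough": flat["properties.borough"],
    "Neighborhood": flat["properties.name"],
    "Latitude": coords["Latitude"],
    "Longitude": coords["Longitude"],
})
print(df_demo)
```

Building the frame in one step keeps the latitude and longitude columns numeric instead of the object dtype that row-by-row appending tends to produce.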


### 1. Use the geopy library to get the latitude and longitude values of Toronto and NYC
In order to define an instance of the geocoder, we need to define a user_agent. We will name our agent to_explorer, as shown below.

#### 1.1 Toronto

In [4]:
address = 'Toronto, ON'

geolocator = Nominatim(user_agent="to_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.
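
The geocoding call above assumes a result always comes back. `geocode` returns `None` when nothing matches, and the public service can time out, so a small retry wrapper may be useful. The sketch below is not part of the original notebook: `geocode_with_retry` and its parameters are hypothetical names, and the demo uses a fake geocoder so it runs without network access.

```python
import time
from types import SimpleNamespace

def geocode_with_retry(geocode, address, attempts=3, delay=1.0):
    """Call geocode(address) up to `attempts` times; return (lat, lon) or None."""
    for attempt in range(attempts):
        try:
            location = geocode(address)
        except Exception:
            # Broad on purpose for this sketch; with geopy one would likely
            # narrow this to geopy.exc.GeopyError.
            location = None
        if location is not None:
            return location.latitude, location.longitude
        if attempt < attempts - 1:
            time.sleep(delay)  # wait before the next try
    return None

# Demo with a fake geocoder that finds nothing on the first call, then succeeds.
calls = []

def fake_geocode(address):
    calls.append(address)
    if len(calls) == 1:
        return None
    return SimpleNamespace(latitude=43.653963, longitude=-79.387207)

coords = geocode_with_retry(fake_geocode, "Toronto, ON", delay=0)
missing = geocode_with_retry(lambda address: None, "Nowhere", attempts=2, delay=0)
print(coords, missing)
```

In the notebook the first argument would be `geolocator.geocode`, for example `geocode_with_retry(geolocator.geocode, 'Toronto, ON')`.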


#### Create a map of Toronto with neighborhoods superimposed on top.

In [5]:
data_toronto=df_toronto
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(df_toronto['Latitude'], df_toronto['Longitude'], df_toronto['Borough'], df_toronto['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

#### 1.2 NYC

In [6]:
address = 'New York City, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.


In [7]:
# create map of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10.25)
# add markers to map
for lat, lng, borough, neighborhood in zip(df_nyc['Latitude'], df_nyc['Longitude'], df_nyc['Borough'], df_nyc['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_newyork)     
map_newyork

### 2. Explore the neighborhoods of *Toronto* and *NYC*
#### *Let's create a function to repeat the same process to all the neighborhoods in Toronto and NYC*

In [8]:
# foursquare API
CLIENT_ID='YOUR_FOURSQUARE_CLIENT_ID' # replace with your own Foursquare client ID
CLIENT_SECRET='YOUR_FOURSQUARE_CLIENT_SECRET' # keep real secrets out of shared notebooks
VERSION='20191122'
LIMIT=100

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        #print(name)    
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = [ 'Neighborhood', 
                              'Neighborhood Latitude', 
                              'Neighborhood Longitude', 
                              'Venue', 
                              'Venue Latitude', 
                              'Venue Longitude', 
                              'Venue Category']
    return(nearby_venues)
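
The function above indexes straight into `["response"]['groups'][0]['items']` and `['categories'][0]`, so a single error response or a venue without a category would stop the whole run. A minimal sketch of a more defensive extraction step follows. `extract_venue_rows` is a hypothetical helper, the payloads are made up to mirror the structure used above, and the `"Unknown"` label is an assumption.

```python
def extract_venue_rows(payload, name, lat, lng):
    """Turn one explore-style JSON payload into a list of row tuples."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        # Fall back to an empty dict when a venue has no categories.
        categories = venue.get("categories") or [{}]
        rows.append((
            name,
            lat,
            lng,
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            categories[0].get("name", "Unknown"),  # "Unknown" is an assumed label
        ))
    return rows

# Made-up payloads: one normal, one error response without any groups.
good_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Brookbanks Park",
               "location": {"lat": 43.751976, "lng": -79.33214},
               "categories": [{"name": "Park"}]}},
    {"venue": {"name": "Unlabelled Venue",
               "location": {"lat": 43.75, "lng": -79.33},
               "categories": []}},
]}]}}
error_payload = {"meta": {"code": 429}, "response": {}}

good_rows = extract_venue_rows(good_payload, "Parkwoods", 43.753259, -79.329656)
error_rows = extract_venue_rows(error_payload, "Parkwoods", 43.753259, -79.329656)
print(good_rows)
print(error_rows)
```

An empty list for a failed request lets the surrounding loop continue with the next neighborhood instead of raising a `KeyError` or `IndexError`.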

#### 2.1-Toronto

#### *Now write the code to run the above function on each neighborhood and create a new dataframe called venues_toronto*.

In [9]:
venues_toronto = getNearbyVenues(names=data_toronto['Neighbourhood'],
                                 latitudes=data_toronto['Latitude'],
                                 longitudes=data_toronto['Longitude'])

#### *Let's check the size of the resulting dataframe*

In [10]:
print(venues_toronto.shape)
venues_toronto.head()

(4379, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park
1,Parkwoods,43.753259,-79.329656,GTA Restoration,43.753396,-79.333477,Fireworks Store
2,Parkwoods,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop
3,Victoria Village,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena
4,Victoria Village,43.725882,-79.315572,Tim Hortons,43.725517,-79.313103,Coffee Shop


#### *Let's check how many venues were returned for each neighborhood*

In [11]:
venues_toronto.groupby('Neighborhood').count().head()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Adelaide,100,100,100,100,100,100
Agincourt,4,4,4,4,4,4
Agincourt North,3,3,3,3,3,3
Albion Gardens,9,9,9,9,9,9
Alderwood,8,8,8,8,8,8
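
The counts above vary widely (100 venues for Adelaide, 3 for Agincourt North), and category frequencies computed from only a few venues are coarse. The sketch below, on made-up rows, shows one way to flag such neighborhoods before clustering. The cut-off `min_venues` is a hypothetical choice, not something the notebook prescribes.

```python
import pandas as pd

# Made-up venue rows in the same layout as venues_toronto (two columns only).
venues_demo = pd.DataFrame({
    "Neighborhood": ["Adelaide"] * 6 + ["Agincourt North"] * 3,
    "Venue Category": ["Coffee Shop", "Café", "Bar", "Coffee Shop", "Bakery", "Bar",
                       "Playground", "Park", "Coffee Shop"],
})

min_venues = 5  # hypothetical cut-off for "too few venues"

# size() counts rows per group, giving one number per neighborhood.
venue_counts = venues_demo.groupby("Neighborhood").size()
sparse = venue_counts[venue_counts < min_venues]
print(sparse)
```

Unlike `count()`, which repeats the same number in every column, `size()` returns a single Series that is easier to filter.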


#### Let's find out how many unique categories can be curated from all the returned venues

In [12]:
print('There are {} unique categories.'.format(len(venues_toronto['Venue Category'].unique())))

There are 270 unique categories.


#### 2.2-NYC

In [13]:
venues_nyc = getNearbyVenues(names=df_nyc['Neighborhood'],
                            latitudes=df_nyc['Latitude'],
                            longitudes=df_nyc['Longitude'])
venues_nyc.groupby('Neighborhood').count()
print('There are {} unique categories.'.format(len(venues_nyc['Venue Category'].unique())))

There are 429 unique categories.


### 3. Analyze Each Neighborhood

#### 3.1-Toronto

In [14]:
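# one hot encoding
onehot_toronto = pd.get_dummies(venues_toronto[['Venue Category']], prefix="", prefix_sep="")
# add neighborhood column back to dataframe
# (the returned venue categories include one that is itself named "Neighborhood", so this
# assignment overwrites that one-hot column in place instead of appending a new one)
onehot_toronto['Neighborhood'] = venues_toronto['Neighborhood'] 
# move the last column to the front; because of the name clash above this is
# 'Yoga Studio' rather than 'Neighborhood', as the output below shows
fixed_columns = [onehot_toronto.columns[-1]] + list(onehot_toronto.columns[:-1])
onehot_toronto = onehot_toronto[fixed_columns]
onehot_toronto.head()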

Unnamed: 0,Yoga Studio,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Court,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bike Shop,Bistro,Board Shop,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Butcher,Cafeteria,Café,Cajun / Creole Restaurant,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Gym,College Rec Center,College Stadium,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Curling Ice,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Empanada Restaurant,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Fireworks Store,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hakka Restaurant,Harbor / Marina,Hardware Store,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Hockey Arena,Home Service,Hospital,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Indoor Play Area,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Malay Restaurant,Market,Massage Studio,Medical Center,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Motel,Movie Theater,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Plane,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Record Shop,Recording Studio,Rental Car Location,Restaurant,River,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soup Place,Southern / Soul Food Restaurant,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Swim School,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Parkwoods,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Parkwoods,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Parkwoods,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Victoria Village,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Victoria Village,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
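
In the output above the neighborhood names sit in the middle of the table, under a venue category that is itself called "Neighborhood", and the first column is "Yoga Studio". One possible way to avoid that name clash is sketched below on toy data. The replacement label `Neighborhood (category)` is a hypothetical choice, and this is not the approach the notebook takes.

```python
import pandas as pd

# Toy venues, including one whose category is literally "Neighborhood".
venues_demo = pd.DataFrame({
    "Neighborhood": ["Parkwoods", "Parkwoods", "Victoria Village"],
    "Venue Category": ["Park", "Neighborhood", "Coffee Shop"],
})

onehot_demo = pd.get_dummies(venues_demo["Venue Category"], dtype=int)

# Rename the clashing category column before adding the neighborhood names.
onehot_demo = onehot_demo.rename(columns={"Neighborhood": "Neighborhood (category)"})

# insert() places the names in the first position directly, so no reordering is needed.
onehot_demo.insert(0, "Neighborhood", venues_demo["Neighborhood"])
print(onehot_demo)
```

With this variant the one-hot table has one more column than there are categories, so the shape reported for the Toronto data would become `(4379, 271)` rather than `(4379, 270)`.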


#### *And let's examine the new dataframe size*

In [15]:
onehot_toronto.shape

(4379, 270)

#### *Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category*

In [16]:
grouped_toronto = onehot_toronto.groupby('Neighborhood').mean().reset_index()
grouped_toronto.head()

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Court,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bike Shop,Bistro,Board Shop,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Butcher,Cafeteria,Café,Cajun / Creole Restaurant,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Gym,College Rec Center,College Stadium,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Curling Ice,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Empanada Restaurant,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Fireworks Store,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hakka Restaurant,Harbor / Marina,Hardware Store,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Hockey Arena,Home Service,Hospital,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Indoor Play Area,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Malay Restaurant,Market,Massage Studio,Medical Center,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Motel,Movie Theater,Museum,Music Venue,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Plane,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Record Shop,Recording Studio,Rental Car Location,Restaurant,River,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soup Place,Southern / Soul Food Restaurant,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Swim School,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,Adelaide,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.03,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.07,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0
1,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Agincourt North,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Albion Gardens,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Alderwood,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
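
Because each venue row contains a single 1, the per-neighborhood mean is simply the share of that neighborhood's venues in each category, and every row sums to 1. A minimal sketch on made-up data (four venues per neighborhood, so each venue contributes 0.25):

```python
import pandas as pd

# Made-up venues: four per neighborhood.
venues_demo = pd.DataFrame({
    "Neighborhood": ["Agincourt"] * 4 + ["Alderwood"] * 4,
    "Venue Category": ["Lounge", "Latin American Restaurant", "Skating Rink", "Breakfast Spot",
                       "Pizza Place", "Pizza Place", "Pub", "Gym"],
})

# One row per venue, one 0/1 column per category.
onehot_demo = pd.get_dummies(venues_demo["Venue Category"], dtype=int)

# Averaging the 0/1 indicators within a neighborhood gives relative frequencies.
grouped_demo = onehot_demo.groupby(venues_demo["Neighborhood"]).mean()
row_sums = grouped_demo.sum(axis=1)
print(grouped_demo)
```

Grouping by the neighborhood Series directly, as done here, avoids adding a name column to the one-hot frame at all.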


#### *Let's confirm the new size*

In [17]:
grouped_toronto.shape

(204, 270)

#### *Let's print each neighborhood along with the top 5 most common venues*

In [18]:
num_top_venues = 5
for hood in grouped_toronto['Neighborhood']:
    #print("----"+hood+"----")
    temp = grouped_toronto[grouped_toronto['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    #print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    #print('\n')

#### *Let's put that into a pandas dataframe*
First, let's write a function to sort the venues in descending order.

In [19]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

#### Now let's create the new dataframe and display the top 10 venues for each neighborhood.

In [20]:
num_top_venues = 10
indicators = ['st', 'nd', 'rd']
# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # ranks beyond the third fall back to the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))
# create a new dataframe
neighborhoods_venues_sorted_toronto = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted_toronto['Neighborhood'] = grouped_toronto['Neighborhood']
for ind in np.arange(grouped_toronto.shape[0]):
    neighborhoods_venues_sorted_toronto.iloc[ind, 1:] = return_most_common_venues(grouped_toronto.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted_toronto.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Adelaide,Coffee Shop,Café,Bar,Steakhouse,Cosmetics Shop,American Restaurant,Restaurant,Bakery,Sushi Restaurant,Asian Restaurant
1,Agincourt,Lounge,Latin American Restaurant,Skating Rink,Breakfast Spot,Ethiopian Restaurant,Empanada Restaurant,Electronics Store,Event Space,Dim Sum Restaurant,Eastern European Restaurant
2,Agincourt North,Playground,Park,Coffee Shop,Women's Store,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
3,Albion Gardens,Pizza Place,Beer Store,Fried Chicken Joint,Japanese Restaurant,Fast Food Restaurant,Discount Store,Pharmacy,Sandwich Place,Grocery Store,Airport Terminal
4,Alderwood,Pizza Place,Pub,Gym,Pharmacy,Sandwich Place,Coffee Shop,Skating Rink,Department Store,Dessert Shop,Dim Sum Restaurant
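
The column naming and the row-by-row fill above can also be wrapped in two small helpers. This is a sketch under stated assumptions rather than the notebook's method: `ordinal` and `top_venues_table` are hypothetical names, and the frequencies are made up. Categories with tied frequencies may come back in any order.

```python
import pandas as pd

def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1st, 2nd, 3rd, 4th, 11th."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def top_venues_table(grouped, num_top_venues):
    """Build one row per neighborhood listing its most frequent categories."""
    freqs = grouped.set_index("Neighborhood")
    rows = {
        hood: row.sort_values(ascending=False).index[:num_top_venues].tolist()
        for hood, row in freqs.iterrows()
    }
    columns = [f"{ordinal(i)} Most Common Venue" for i in range(1, num_top_venues + 1)]
    table = pd.DataFrame.from_dict(rows, orient="index", columns=columns)
    table.index.name = "Neighborhood"
    return table.reset_index()

# Made-up frequencies in the same layout as grouped_toronto.
grouped_demo = pd.DataFrame({
    "Neighborhood": ["A", "B"],
    "Coffee Shop": [0.5, 0.1],
    "Park": [0.3, 0.2],
    "Pub": [0.2, 0.7],
})
top_demo = top_venues_table(grouped_demo, 2)
print(top_demo)
```

Using the function for both cities, for example `top_venues_table(grouped_toronto, 10)`, would remove the duplicated cell in the NYC section, provided the first column of the grouped frame is `Neighborhood`.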


#### 3.2-NYC

In [21]:
onehot_nyc = pd.get_dummies(venues_nyc[['Venue Category']], prefix="", prefix_sep="")
onehot_nyc['Neighborhood'] = venues_nyc['Neighborhood'] 
fixed_columns = [onehot_nyc.columns[-1]] + list(onehot_nyc.columns[:-1])
onehot_nyc = onehot_nyc[fixed_columns]
grouped_nyc = onehot_nyc.groupby('Neighborhood').mean().reset_index()
num_top_venues = 5
for hood in grouped_nyc['Neighborhood']:
    print("----"+hood+"----")
    temp = grouped_nyc[grouped_nyc['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

num_top_venues = 10
indicators = ['st', 'nd', 'rd']
# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # ranks beyond the third fall back to the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))
# create a new dataframe
neighborhoods_venues_nyc_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_nyc_sorted['Neighborhood'] = grouped_nyc['Neighborhood']
for ind in np.arange(grouped_nyc.shape[0]):
    neighborhoods_venues_nyc_sorted.iloc[ind, 1:] = return_most_common_venues(grouped_nyc.iloc[ind, :], num_top_venues)

neighborhoods_venues_nyc_sorted.head()

----Allerton----
                venue  freq
0         Supermarket  0.10
1         Pizza Place  0.10
2       Deli / Bodega  0.06
3         Bus Station  0.06
4  Chinese Restaurant  0.06


----Annadale----
                 venue  freq
0  American Restaurant  0.21
1          Pizza Place  0.14
2                  Pub  0.07
3               Bakery  0.07
4         Dance Studio  0.07


----Arden Heights----
          venue  freq
0      Pharmacy   0.2
1      Bus Stop   0.2
2   Coffee Shop   0.2
3  Home Service   0.2
4   Pizza Place   0.2


----Arlington----
                 venue  freq
0        Boat or Ferry  0.17
1  American Restaurant  0.17
2        Grocery Store  0.17
3         Intersection  0.17
4             Bus Stop  0.17


----Arrochar----
                   venue  freq
0               Bus Stop  0.16
1     Italian Restaurant  0.11
2          Deli / Bodega  0.11
3                  Hotel  0.05
4  Outdoors & Recreation  0.05


----Arverne----
             venue  freq
0        Surf Spot  0.22
...

                       venue  freq
0              Deli / Bodega  0.10
1                     Bakery  0.08
2         Chinese Restaurant  0.05
3                   Pharmacy  0.05
4  Latin American Restaurant  0.05


----Concord----
                 venue  freq
0        Deli / Bodega   0.2
1   Italian Restaurant   0.1
2  Peruvian Restaurant   0.1
3          Supermarket   0.1
4           Bagel Shop   0.1


----Concourse----
                venue  freq
0       Deli / Bodega  0.15
1  Spanish Restaurant  0.08
2  Italian Restaurant  0.08
3            Pharmacy  0.04
4              Bakery  0.04


----Concourse Village----
                  venue  freq
0         Deli / Bodega  0.12
1  Fast Food Restaurant  0.09
2           Supermarket  0.06
3   Sporting Goods Shop  0.06
4        Sandwich Place  0.06


----Coney Island----
                          venue  freq
0              Baseball Stadium  0.12
1          Caribbean Restaurant  0.12
2                  Skating Rink  0.06
3                    Food S

4     Convenience Store  0.05


----Forest Hills Gardens----
                     venue  freq
0                     Food  0.08
1                   Bakery  0.08
2  New American Restaurant  0.04
3              Pizza Place  0.04
4               Food Truck  0.04


----Fort Greene----
                venue  freq
0  Italian Restaurant  0.06
1           Wine Shop  0.04
2             Theater  0.04
3         Coffee Shop  0.04
4         Flower Shop  0.04


----Fort Hamilton----
                  venue  freq
0         Deli / Bodega  0.06
1  Gym / Fitness Center  0.04
2           Pizza Place  0.04
3        Sandwich Place  0.04
4              Bus Stop  0.04


----Fox Hills----
            venue  freq
0        Bus Stop  0.50
1   Grocery Store  0.25
2  Sandwich Place  0.25
3     Yoga Studio  0.00
4    Outlet Store  0.00


----Fresh Meadows----
                           venue  freq
0                    Bus Station  0.19
1                       Pharmacy  0.12
2             Chinese Restaurant  0.12
3  

4     Paella Restaurant   0.0


----Lefrak City----
              venue  freq
0  Department Store  0.08
1            Bakery  0.08
2    Cosmetics Shop  0.08
3        Steakhouse  0.04
4       Dry Cleaner  0.04


----Lenox Hill----
                venue  freq
0  Italian Restaurant  0.06
1         Coffee Shop  0.06
2         Pizza Place  0.04
3    Sushi Restaurant  0.04
4        Cocktail Bar  0.03


----Lighthouse Hill----
                venue  freq
0  Italian Restaurant  0.25
1                Café  0.25
2          Art Museum  0.25
3                 Spa  0.25
4              Office  0.00


----Lincoln Square----
                  venue  freq
0               Theater  0.07
1    Italian Restaurant  0.05
2  Gym / Fitness Center  0.05
3                 Plaza  0.05
4                  Café  0.05


----Lindenwood----
                     venue  freq
0  Fruit & Vegetable Store  0.08
1               Donut Shop  0.08
2       Chinese Restaurant  0.08
3                     Bank  0.08
4                 

               venue  freq
0             Lawyer   0.5
1                Bar   0.5
2        Yoga Studio   0.0
3  Paella Restaurant   0.0
4        Pet Service   0.0


----Ocean Hill----
                 venue  freq
0        Deli / Bodega  0.16
1  Fried Chicken Joint  0.10
2          Supermarket  0.06
3   Chinese Restaurant  0.06
4    Convenience Store  0.06


----Ocean Parkway----
           venue  freq
0  Deli / Bodega  0.05
1  Grocery Store  0.05
2    Pizza Place  0.05
3     Bagel Shop  0.05
4       Pharmacy  0.05


----Old Town----
                venue  freq
0  Italian Restaurant  0.22
1         Video Store  0.06
2        Liquor Store  0.06
3      Mattress Store  0.06
4       Grocery Store  0.06


----Olinville----
                  venue  freq
0           Supermarket  0.17
1  Caribbean Restaurant  0.17
2      Basketball Court  0.08
3                  Food  0.08
4     Convenience Store  0.08


----Ozone Park----
         venue  freq
0  Pizza Place  0.09
1     Pharmacy  0.09
2        D

4                 Bar  0.08


----Silver Lake----
                 venue  freq
0         Burger Joint  0.25
1  American Restaurant  0.25
2                Beach  0.25
3          Golf Course  0.25
4          Yoga Studio  0.00


----Soho----
            venue  freq
0  Clothing Store  0.09
1        Boutique  0.08
2     Art Gallery  0.05
3   Women's Store  0.04
4      Shoe Store  0.04


----Somerville----
                 venue  freq
0                 Park   1.0
1          Yoga Studio   0.0
2            Pet Store   0.0
3             Pet Café   0.0
4  Peruvian Restaurant   0.0


----Soundview----
                 venue  freq
0   Chinese Restaurant  0.18
1         Liquor Store  0.12
2         Burger Joint  0.06
3  Fried Chicken Joint  0.06
4          Video Store  0.06


----South Beach----
                venue  freq
0                Pier   0.4
1       Deli / Bodega   0.2
2               Beach   0.2
3  Athletics & Sports   0.2
4         Yoga Studio   0.0


----South Jamaica----
           ven

4                   Office  0.00


----Whitestone----
               venue  freq
0    Bubble Tea Shop  0.25
1       Dance Studio  0.25
2      Deli / Bodega  0.25
3        Candy Store  0.25
4  Paella Restaurant  0.00


----Williamsbridge----
                  venue  freq
0  Caribbean Restaurant   0.4
1             Nightclub   0.2
2                   Bar   0.2
3            Soup Place   0.2
4           Pet Service   0.0


----Williamsburg----
              venue  freq
0               Bar  0.08
1       Coffee Shop  0.08
2       Pizza Place  0.05
3        Bagel Shop  0.05
4  Greek Restaurant  0.03


----Willowbrook----
                venue  freq
0            Bus Stop   0.3
1       Deli / Bodega   0.2
2                 Spa   0.1
3        Intersection   0.1
4  Chinese Restaurant   0.1


----Windsor Terrace----
           venue  freq
0  Deli / Bodega  0.10
1          Diner  0.07
2           Park  0.07
3  Grocery Store  0.07
4           Café  0.07


----Wingate----
                  venue  fre

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allerton,Pizza Place,Supermarket,Chinese Restaurant,Bus Station,Deli / Bodega,Fried Chicken Joint,Breakfast Spot,Grocery Store,Martial Arts Dojo,Bike Trail
1,Annadale,American Restaurant,Pizza Place,Bakery,Restaurant,Pharmacy,Train Station,Diner,Sushi Restaurant,Dance Studio,Pub
2,Arden Heights,Pizza Place,Pharmacy,Bus Stop,Coffee Shop,Home Service,Women's Store,Ethiopian Restaurant,Event Service,Event Space,Exhibit
3,Arlington,Bus Stop,Intersection,Boat or Ferry,American Restaurant,Deli / Bodega,Grocery Store,Women's Store,Field,Fast Food Restaurant,Farmers Market
4,Arrochar,Bus Stop,Deli / Bodega,Italian Restaurant,Food Truck,Supermarket,Mediterranean Restaurant,Outdoors & Recreation,Middle Eastern Restaurant,Bagel Shop,Pharmacy


### 4. Cluster Neighborhoods
Run k-means to group the neighborhoods of each city into 5 clusters, using the venue-category frequencies as features.

#### 4.1-Toronto

In [22]:
# set number of clusters
kclusters = 5
grouped_toronto_clustering = grouped_toronto.drop('Neighborhood', axis=1)
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(grouped_toronto_clustering)
# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 0, 2, 1, 1, 1, 1, 2, 1], dtype=int32)
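The number of clusters is fixed at 5 by hand in this notebook. One common way to sanity-check such a choice is to compare silhouette scores across several values of k. The following is a minimal sketch on synthetic data, not on the venue frequencies used here; `X`, `scores` and `best_k` are illustrative names that do not appear elsewhere in the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the clustering matrix: three well-separated blobs.
# (Assumption: real venue-frequency data will be far less clean than this.)
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((30, 2)) for c in centers])

# Fit k-means for a range of k and record the mean silhouette score of each fit.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The k with the highest score is the best-separated partition under this metric.
best_k = max(scores, key=scores.get)
print(scores)
print('best k:', best_k)
```

On the real data the same loop could be run over `grouped_toronto_clustering`; the scores are only a guide, and a fixed k may still be preferable for comparing the two cities.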

#### *Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood*

In [23]:
# add clustering labels
neighborhoods_venues_sorted_toronto.insert(0, 'Cluster Labels', kmeans.labels_)
merged_toronto = data_toronto
# join the top-venue table (now carrying the cluster labels) onto the location data, which holds latitude/longitude for each neighborhood
merged_toronto = merged_toronto.join(neighborhoods_venues_sorted_toronto.set_index('Neighborhood'), on='Neighbourhood')
merged_toronto.head() # check the last columns!

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,North York,Parkwoods,43.753259,-79.329656,0.0,Park,Fireworks Store,Food & Drink Shop,Women's Store,Donut Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Dumpling Restaurant
1,North York,Victoria Village,43.725882,-79.315572,1.0,Pizza Place,Coffee Shop,Intersection,Hockey Arena,Portuguese Restaurant,French Restaurant,Dumpling Restaurant,Eastern European Restaurant,Drugstore,Donut Shop
2,Downtown Toronto,Harbourfront,43.65426,-79.360636,1.0,Coffee Shop,Park,Pub,Bakery,Mexican Restaurant,Theater,Breakfast Spot,Café,Electronics Store,Chocolate Shop
3,North York,Lawrence Heights,43.718518,-79.464763,1.0,Clothing Store,Gift Shop,Miscellaneous Shop,Event Space,Furniture / Home Store,Coffee Shop,Women's Store,Vietnamese Restaurant,Boutique,Accessories Store
4,North York,Lawrence Manor,43.718518,-79.464763,1.0,Clothing Store,Gift Shop,Miscellaneous Shop,Event Space,Furniture / Home Store,Coffee Shop,Women's Store,Vietnamese Restaurant,Boutique,Accessories Store


#### Finally, let's visualize the resulting clusters

In [24]:
# drop rows containing NaN values, i.e. neighbourhoods that found no match in the venue table and so have no cluster label
merged_toronto.dropna(inplace=True)
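The cluster labels in the merged table are shown as floats (0.0, 1.0) because the left join introduces NaN for neighbourhoods without a match, which forces the column to a float dtype. The toy example below sketches that behaviour and one possible way to restore integer labels after dropping the unmatched rows; the frames `locations` and `venues` are made up for illustration.

```python
import pandas as pd

# Hypothetical location table: three neighbourhoods.
locations = pd.DataFrame({
    'Neighbourhood': ['A', 'B', 'C'],
    'Latitude': [43.70, 43.60, 43.80],
})

# Hypothetical venue table: 'B' is missing, as if no venues were returned for it.
venues = pd.DataFrame({
    'Neighborhood': ['A', 'C'],
    'Cluster Labels': [0, 2],
})

# Left join on the neighbourhood name; 'B' gets NaN, so the label column becomes float64.
merged = locations.join(venues.set_index('Neighborhood'), on='Neighbourhood')
print(merged['Cluster Labels'].dtype)

# Drop unmatched rows, then cast the labels back to integers.
cleaned = merged.dropna().copy()
cleaned['Cluster Labels'] = cleaned['Cluster Labels'].astype(int)
print(cleaned)
```

Casting back to `int` is optional here, since the map code already wraps each label in `int(...)`, but it keeps popups and tables free of trailing `.0`.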

In [25]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(merged_toronto['Latitude'], merged_toronto['Longitude'], merged_toronto['Neighbourhood'], merged_toronto['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
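The marker colors come from sampling the rainbow colormap once per cluster. As a small standalone sketch of that mapping, the snippet below builds the palette and looks a color up directly by label; `palette` and `cluster_color` are hypothetical names, and indexing by `label` rather than `label - 1` is a choice made for this sketch, not what the cell above does.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5

# Sample the rainbow colormap at kclusters evenly spaced points and convert to hex strings.
palette = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]

def cluster_color(label):
    """Return the hex color for a cluster label (accepts floats such as 1.0)."""
    return palette[int(label)]

for label in range(kclusters):
    print(label, cluster_color(label))
```

Either indexing scheme gives each cluster its own color; the only difference is which color ends up on which label.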

#### 4.2 NYC

In [26]:
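# set number of clusters
kclusters = 5
grouped_nyc_clustering = grouped_nyc.drop('Neighborhood', axis=1)
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(grouped_nyc_clustering)
# check cluster labels generated for the first ten rows
kmeans.labels_[0:10]

# add clustering labels
neighborhoods_venues_nyc_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
merged_nyc = df_nyc
# join the top-venue table onto the NYC location data
merged_nyc = merged_nyc.join(neighborhoods_venues_nyc_sorted.set_index('Neighborhood'), on='Neighborhood')

# drop rows containing NaN values
merged_nyc.dropna(inplace=True)

# create map (latitude and longitude should hold New York City's coordinates at this point, not Toronto's)
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)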

# set color scheme for the clusters: one rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(merged_nyc['Latitude'], merged_nyc['Longitude'], merged_nyc['Neighborhood'], merged_nyc['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### 5. Examine Clusters
Now we can examine each cluster and determine the venue categories that distinguish it. Based on those defining categories, each cluster can then be given a name; the naming itself is left as an exercise. Note that the headings below are 1-based while the labels are 0-based, so Cluster 1 corresponds to label 0.
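One possible way to find the defining category of each cluster, rather than reading the tables by eye, is to take the most frequent "1st Most Common Venue" within each label. The sketch below uses a small made-up table in place of `merged_toronto`; the column names match the notebook, but the rows and the names `toy`, `summary` and `sizes` are illustrative only.

```python
import pandas as pd

# Made-up stand-in for merged_toronto with just the two columns needed here.
toy = pd.DataFrame({
    'Cluster Labels': [0, 0, 0, 1, 1, 2],
    '1st Most Common Venue': ['Park', 'Park', 'Airport',
                              'Coffee Shop', 'Coffee Shop', 'Golf Course'],
})

# For each cluster, pick the venue category that most often ranks first.
summary = (
    toy.groupby('Cluster Labels')['1st Most Common Venue']
       .agg(lambda s: s.value_counts().idxmax())
)

# Cluster sizes help spot very small clusters that may be outliers.
sizes = toy['Cluster Labels'].value_counts().sort_index()

print(summary)
print(sizes)
```

A summary like this only looks at the top-ranked category, so it is a starting point for naming clusters, not a replacement for inspecting the full tables.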

#### Cluster 1

In [27]:
merged_toronto.loc[merged_toronto['Cluster Labels'] == 0, merged_toronto.columns[[1] + list(range(5, merged_toronto.shape[1]))]]

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Parkwoods,Park,Fireworks Store,Food & Drink Shop,Women's Store,Donut Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Dumpling Restaurant
37,Caledonia-Fairbanks,Park,Women's Store,Fast Food Restaurant,Market,Drugstore,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop
59,East Toronto,Park,Coffee Shop,Convenience Store,Women's Store,Drugstore,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop
69,CFB Toronto,Airport,Park,Electronics Store,Women's Store,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
70,Downsview East,Airport,Park,Electronics Store,Women's Store,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
81,Silver Hills,Cafeteria,Park,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant
82,York Mills,Cafeteria,Park,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant
111,Lawrence Park,Bus Line,Park,Swim School,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Women's Store
115,Weston,Park,Women's Store,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant
119,York Mills West,Park,Bank,Convenience Store,Women's Store,Drugstore,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop


#### Cluster 2

In [28]:
merged_toronto.loc[merged_toronto['Cluster Labels'] == 1, merged_toronto.columns[[1] + list(range(5, merged_toronto.shape[1]))]]

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Victoria Village,Pizza Place,Coffee Shop,Intersection,Hockey Arena,Portuguese Restaurant,French Restaurant,Dumpling Restaurant,Eastern European Restaurant,Drugstore,Donut Shop
2,Harbourfront,Coffee Shop,Park,Pub,Bakery,Mexican Restaurant,Theater,Breakfast Spot,Café,Electronics Store,Chocolate Shop
3,Lawrence Heights,Clothing Store,Gift Shop,Miscellaneous Shop,Event Space,Furniture / Home Store,Coffee Shop,Women's Store,Vietnamese Restaurant,Boutique,Accessories Store
4,Lawrence Manor,Clothing Store,Gift Shop,Miscellaneous Shop,Event Space,Furniture / Home Store,Coffee Shop,Women's Store,Vietnamese Restaurant,Boutique,Accessories Store
5,Queen's Park,Coffee Shop,Gym,Diner,Park,Yoga Studio,Nightclub,Smoothie Shop,Seafood Restaurant,Sandwich Place,Burger Joint
6,Queen's Park,Coffee Shop,Gym,Diner,Park,Yoga Studio,Nightclub,Smoothie Shop,Seafood Restaurant,Sandwich Place,Burger Joint
9,Don Mills North,Gym / Fitness Center,Japanese Restaurant,Café,Caribbean Restaurant,Baseball Field,Basketball Court,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Diner
12,Ryerson,Coffee Shop,Clothing Store,Cosmetics Shop,Fast Food Restaurant,Café,Bakery,Bookstore,Pizza Place,Bubble Tea Shop,Burger Joint
13,Garden District,Coffee Shop,Clothing Store,Cosmetics Shop,Fast Food Restaurant,Café,Bakery,Bookstore,Pizza Place,Bubble Tea Shop,Burger Joint
14,Glencairn,Pizza Place,Japanese Restaurant,Park,Pub,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant


#### Cluster 3

In [29]:
merged_toronto.loc[merged_toronto['Cluster Labels'] == 2, merged_toronto.columns[[1] + list(range(5, merged_toronto.shape[1]))]]

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,Rouge,Fast Food Restaurant,Department Store,Event Space,Ethiopian Restaurant,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Donut Shop
8,Malvern,Fast Food Restaurant,Department Store,Event Space,Ethiopian Restaurant,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Donut Shop
10,Woodbine Gardens,Pizza Place,Fast Food Restaurant,Athletics & Sports,Gastropub,Intersection,Pet Store,Pharmacy,Bus Line,Bank,Gym / Fitness Center
11,Parkview Hill,Pizza Place,Fast Food Restaurant,Athletics & Sports,Gastropub,Intersection,Pet Store,Pharmacy,Bus Line,Bank,Gym / Fitness Center
43,Hillcrest Village,Mediterranean Restaurant,Golf Course,Fast Food Restaurant,Pool,Dog Run,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Doner Restaurant
101,Del Ray,Fast Food Restaurant,Sandwich Place,Restaurant,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Women's Store
102,Keelesdale,Fast Food Restaurant,Sandwich Place,Restaurant,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Women's Store
103,Mount Dennis,Fast Food Restaurant,Sandwich Place,Restaurant,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Women's Store
104,Silverthorn,Fast Food Restaurant,Sandwich Place,Restaurant,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Women's Store
173,Albion Gardens,Pizza Place,Beer Store,Fried Chicken Joint,Japanese Restaurant,Fast Food Restaurant,Discount Store,Pharmacy,Sandwich Place,Grocery Store,Airport Terminal


#### Cluster 4

In [30]:
merged_toronto.loc[merged_toronto['Cluster Labels'] == 3, merged_toronto.columns[[1] + list(range(5, merged_toronto.shape[1]))]]

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
15,Cloverdale,Golf Course,Women's Store,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant
16,Islington,Golf Course,Women's Store,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant
17,Martin Grove,Golf Course,Women's Store,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant
18,Princess Gardens,Golf Course,Women's Store,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant
19,West Deane Park,Golf Course,Women's Store,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant


#### Cluster 5

In [31]:
merged_toronto.loc[merged_toronto['Cluster Labels'] == 4, merged_toronto.columns[[1] + list(range(5, merged_toronto.shape[1]))]]

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
97,Downsview Central,Home Service,Food Truck,Baseball Field,Women's Store,Dessert Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop
105,Emery,Baseball Field,Women's Store,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Fast Food Restaurant
106,Humberlea,Baseball Field,Women's Store,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Fast Food Restaurant
112,Roselawn,Garden,Home Service,Women's Store,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop
197,Humber Bay,Home Service,Baseball Field,Women's Store,Dessert Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
198,King's Mill Park,Home Service,Baseball Field,Women's Store,Dessert Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
199,Kingsway Park South East,Home Service,Baseball Field,Women's Store,Dessert Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
200,Mimico NE,Home Service,Baseball Field,Women's Store,Dessert Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
201,Old Mill South,Home Service,Baseball Field,Women's Store,Dessert Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
202,The Queensway East,Home Service,Baseball Field,Women's Store,Dessert Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
