# Best Neighborhood to Rent in Hamilton 

## Introduction

Hamilton is a Canadian port city on the western tip of Lake Ontario, known for its escarpment views and numerous waterfalls. Many people who work in Toronto are moving to Hamilton for its natural scenery, cheaper rent, and convenient transportation. The objective of this project is to analyze Hamilton's neighborhoods by their featured venues and median rent, to help people who are considering a move find the place to rent that best suits their needs.

## Table of Contents


1. Download and Explore Dataset 

2. Explore Neighborhoods in Hamilton, ON 

3. Analyze Each Neighborhood 

4. Cluster Neighborhoods

5. Examine Clusters 

First, import the libraries needed for the project.

In [2]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0; older versions import it from pandas.io.json)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Libraries imported.


## 1. Download and Explore Dataset

### 1.1 Get the Rent Dataframe

In [5]:
rent_data = {
'Durand' : 1300,
'Central Hamilton':1675,
'Landsdale': 1345,
'Beasley': 1395,
'Corktown': 1375,
'Gibson': 1199,
'Kirkendall North': 1720,
'Stipley': 1175,
'Ainslie Wood East': 525,
'Westdale South': 720,
'Riverdale West': 1450,
'Crown Point West': 1195,
'Stinson': 1310,
'Strathcona': 1272,
'St. Clair': 1250,
'Rosedale': 1424,
'Raleigh': 1348,
'Crown Point East': 1100,
'North End East': 1200,
'Waterdown': 2150,
'Ainslie Wood West': 1129,
'Ainslie Wood North': 560,
'University Gardens': 1375,
'Hill Park': 1115,
'Stoney Creek Estates': 949,
'Bakeley': 1320,
'Greenford': 1650}

In [90]:
rent_df = pd.DataFrame.from_dict(rent_data,orient='index').reset_index()
rent_df.columns = ['Neighbourhood','MedianRent']
rent_df

Unnamed: 0,Neighbourhood,MedianRent
0,Durand,1300
1,Central Hamilton,1675
2,Landsdale,1345
3,Beasley,1395
4,Corktown,1375
5,Gibson,1199
6,Kirkendall North,1720
7,Stipley,1175
8,Ainslie Wood East,525
9,Westdale South,720


### 1.2 Get the GIS location data

In [91]:
# the geolocator must exist before the helper below is called
geolocator = Nominatim(user_agent="ny_explorer")

def getLatLon(address):
    # geocode each neighbourhood once and reuse the result for both coordinates
    location = geolocator.geocode(address + ', Hamilton, Canada')
    if location is None:
        return (np.nan, np.nan) # nothing found; the row is dropped in the next cell
    return (location.latitude, location.longitude)

coords = rent_df['Neighbourhood'].apply(getLatLon)
rent_df['Latitude'] = [c[0] for c in coords]
rent_df['Longitude'] = [c[1] for c in coords]

In [94]:
rent_df.dropna(subset = ['Latitude'],inplace = True)

In [96]:
rent_df.reset_index()

Unnamed: 0,index,Neighbourhood,MedianRent,Latitude,Longitude
0,0,Durand,1300,43.250247,-79.875734
1,1,Central Hamilton,1675,43.25608,-79.872858
2,3,Beasley,1395,43.259204,-79.861012
3,4,Corktown,1375,43.250681,-79.868619
4,5,Gibson,1199,43.257866,-79.839098
5,6,Kirkendall North,1720,39.977308,-86.047118
6,9,Westdale South,720,43.261881,-79.905921
7,10,Riverdale West,1450,43.228332,-79.75982
8,12,Stinson,1310,43.246953,-79.852747
9,13,Strathcona,1272,43.265244,-79.883693
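
In the table above, Kirkendall North was geocoded to about 39.98, -86.05, far from the other rows (roughly 43.2 to 43.3, -79.9 to -79.7), so the geocoder has probably matched a different place. A minimal sketch of a sanity check follows; the bounding box values and the helper name `flag_out_of_area` are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

# Rough bounding box around Hamilton, ON (assumed limits; adjust as needed).
LAT_RANGE = (43.0, 43.5)
LON_RANGE = (-80.3, -79.5)

def flag_out_of_area(df, lat_col="Latitude", lon_col="Longitude"):
    """Return the rows whose coordinates fall outside the bounding box."""
    in_lat = df[lat_col].between(*LAT_RANGE)
    in_lon = df[lon_col].between(*LON_RANGE)
    return df[~(in_lat & in_lon)]

# Three rows copied from the table above.
sample = pd.DataFrame({
    "Neighbourhood": ["Durand", "Kirkendall North", "Strathcona"],
    "Latitude": [43.250247, 39.977308, 43.265244],
    "Longitude": [-79.875734, -86.047118, -79.883693],
})

suspect = flag_out_of_area(sample)
print(suspect["Neighbourhood"].tolist())
```

Rows flagged this way could be corrected by hand or dropped before the venue search, since a misplaced point pulls in venues from the wrong city.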


#### Create a map of Hamilton with neighborhoods superimposed on top.

In [97]:
address = 'Hamilton, ON, Canada'

geolocator = Nominatim(user_agent="hamilton_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Hamilton are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Hamilton are 43.2560802, -79.8728583.


In [101]:
# create map of Hamilton using latitude and longitude values
map_hamilton = folium.Map(location=[latitude, longitude], zoom_start=13)

# add markers to map
for lat, lng, neighborhood in zip(rent_df['Latitude'], rent_df['Longitude'], rent_df['Neighbourhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_hamilton)  
    
map_hamilton

Next, start utilizing the Foursquare API to explore the neighborhoods and segment them.

#### Define Foursquare Credentials and Version

In [103]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('My credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

My credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET: YOUR_FOURSQUARE_CLIENT_SECRET
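
API credentials are usually better kept out of the notebook itself. One common approach, sketched here, is to read them from environment variables and to mask them when printing. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are hypothetical, chosen only for this illustration.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read credentials from environment variables instead of hard-coding them.

    The variable names are assumptions; use whatever names you export.
    """
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first."
        )
    return client_id, client_secret

def mask(secret, visible=4):
    """Show only the first few characters when printing a secret."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)

# Usage with a plain dict standing in for the real environment.
demo_env = {"FOURSQUARE_CLIENT_ID": "abc123",
            "FOURSQUARE_CLIENT_SECRET": "s3cretvalue"}
demo_id, demo_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID:", mask(demo_id))
print("CLIENT_SECRET:", mask(demo_secret))
```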


## 2. Explore Neighborhoods in Hamilton, ON

In [104]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, limit=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL (limit is now a parameter instead of a global)
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            limit)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
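
The function above indexes straight into the response, so a neighborhood with an empty result or a venue without a category would raise an error. The sketch below shows one defensive way to flatten a response of the same shape. The helper name `extract_venues` and the hand-written payload are illustrative assumptions; only the keys used by the function above are relied on.

```python
def extract_venues(payload, name, lat, lng):
    """Turn one explore-style response into flat rows, skipping incomplete items."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories", [])
        if not categories or "lat" not in location or "lng" not in location:
            continue  # skip venues without a category or coordinates
        rows.append((name, lat, lng, venue.get("name"),
                     location["lat"], location["lng"], categories[0]["name"]))
    return rows

# Hand-written payload mimicking the structure indexed by getNearbyVenues.
fake = {"response": {"groups": [{"items": [
    {"venue": {"name": "Durand Coffee",
               "location": {"lat": 43.25152, "lng": -79.878845},
               "categories": [{"name": "Café"}]}},
    {"venue": {"name": "Uncategorized Spot",
               "location": {"lat": 43.25, "lng": -79.87},
               "categories": []}},
]}]}}

rows = extract_venues(fake, "Durand", 43.250247, -79.875734)
print(rows)
```

Because parsing is separated from the HTTP call, it can be checked with a small fake payload without using any API quota.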

#### Run the above function on each neighborhood and create a new dataframe called *hamilton_venues*.

In [107]:
LIMIT = 100
hamilton_venues = getNearbyVenues(names=rent_df['Neighbourhood'],
                                   latitudes=rent_df['Latitude'],
                                   longitudes=rent_df['Longitude']
                                  )


Durand
Central Hamilton
Beasley
Corktown
Gibson
Kirkendall North
Westdale South
Riverdale West
Stinson
Strathcona
St. Clair
Rosedale
Raleigh
North End East
Waterdown
University Gardens
Hill Park
Greenford


In [112]:
hamilton_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Durand,43.250247,-79.875734,One Duke Restaurant and Lounge,43.251812,-79.871852,Seafood Restaurant
1,Durand,43.250247,-79.875734,Red Crow Coffee,43.250061,-79.871915,Café
2,Durand,43.250247,-79.875734,Durand Coffee,43.25152,-79.878845,Café
3,Durand,43.250247,-79.875734,Bronzie's Place,43.250541,-79.871633,Italian Restaurant
4,Durand,43.250247,-79.875734,The Pheasant Plucker,43.25197,-79.870248,Pub


In [109]:
hamilton_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Beasley,10,10,10,10,10,10
Central Hamilton,67,67,67,67,67,67
Corktown,31,31,31,31,31,31
Durand,16,16,16,16,16,16
Gibson,4,4,4,4,4,4
Greenford,20,20,20,20,20,20
Hill Park,9,9,9,9,9,9
Kirkendall North,4,4,4,4,4,4
North End East,7,7,7,7,7,7
Raleigh,6,6,6,6,6,6


## 3. Analyze Each Neighborhood

In [123]:
# one hot encoding
hamilton_onehot = pd.get_dummies(hamilton_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
hamilton_onehot['Neighborhood'] = hamilton_venues['Neighborhood'] 

first_col = hamilton_onehot.pop('Neighborhood')
hamilton_onehot.insert(0, 'Neighborhood', first_col)
hamilton_onehot.head()

Unnamed: 0,Neighborhood,Adult Boutique,American Restaurant,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Auto Garage,Bagel Shop,Bakery,Bank,Bar,Beer Store,Big Box Store,Bookstore,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Café,Chinese Restaurant,Clothing Store,Coffee Shop,College Arts Building,College Baseball Diamond,College Basketball Court,College Cafeteria,Concert Hall,Convenience Store,Cosmetics Shop,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dog Run,Ethiopian Restaurant,Exhibit,Farm,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Food & Drink Shop,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gas Station,Gastropub,Gay Bar,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,Historic Site,History Museum,Hockey Arena,Home Service,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Italian Restaurant,Lake,Library,Liquor Store,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Movie Theater,Multiplex,Museum,Music Venue,New American Restaurant,Nightclub,Noodle House,Park,Performing Arts Venue,Pharmacy,Pizza Place,Pub,Record Shop,Restaurant,Rock Club,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shipping Store,Shopping Mall,Skating Rink,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Sports Bar,Steakhouse,Student Center,Supermarket,Sushi Restaurant,Taco Place,Tea Room,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Trail,Tree,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wings Joint,Yoga Studio
0,Durand,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Durand,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Durand,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Durand,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Durand,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


#### Next, group rows by neighborhood, taking the mean frequency of occurrence of each category.

In [124]:
hamilton_grouped = hamilton_onehot.groupby('Neighborhood').mean().reset_index()
hamilton_grouped

Unnamed: 0,Neighborhood,Adult Boutique,American Restaurant,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Auto Garage,Bagel Shop,Bakery,Bank,Bar,Beer Store,Big Box Store,Bookstore,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Café,Chinese Restaurant,Clothing Store,Coffee Shop,College Arts Building,College Baseball Diamond,College Basketball Court,College Cafeteria,Concert Hall,Convenience Store,Cosmetics Shop,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dog Run,Ethiopian Restaurant,Exhibit,Farm,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Food & Drink Shop,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gas Station,Gastropub,Gay Bar,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,Historic Site,History Museum,Hockey Arena,Home Service,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Italian Restaurant,Lake,Library,Liquor Store,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Movie Theater,Multiplex,Museum,Music Venue,New American Restaurant,Nightclub,Noodle House,Park,Performing Arts Venue,Pharmacy,Pizza Place,Pub,Record Shop,Restaurant,Rock Club,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shipping Store,Shopping Mall,Skating Rink,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Sports Bar,Steakhouse,Student Center,Supermarket,Sushi Restaurant,Taco Place,Tea Room,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Trail,Tree,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wings Joint,Yoga Studio
0,Beasley,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0
1,Central Hamilton,0.0,0.029851,0.014925,0.014925,0.0,0.014925,0.0,0.0,0.0,0.014925,0.029851,0.0,0.0,0.0,0.014925,0.0,0.0,0.014925,0.029851,0.029851,0.0,0.014925,0.104478,0.0,0.0,0.0,0.0,0.014925,0.014925,0.0,0.0,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.0,0.0,0.014925,0.044776,0.014925,0.014925,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014925,0.014925,0.014925,0.0,0.014925,0.0,0.014925,0.0,0.029851,0.0,0.029851,0.0,0.014925,0.0,0.0,0.014925,0.0,0.014925,0.029851,0.014925,0.014925,0.0,0.0,0.014925,0.029851,0.0,0.014925,0.014925,0.014925,0.014925,0.059701,0.014925,0.029851,0.0,0.0,0.029851,0.014925,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014925,0.0
2,Corktown,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.032258,0.0,0.032258,0.0,0.0,0.032258,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.096774,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.064516,0.0,0.0,0.032258,0.258065,0.0,0.032258,0.0,0.0,0.064516,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Durand,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0625,0.1875,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Gibson,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Greenford,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.1,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0
6,Hill Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.111111,0.0,0.0,0.0,0.0,0.0
7,Kirkendall North,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,North End East,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Raleigh,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.166667,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0
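
To make the two steps above concrete, here is a toy example with invented neighborhoods `A` and `B` (not real data). Averaging one-hot rows per neighborhood gives the share of each venue category, which is why each row of the grouped table sums to 1.

```python
import pandas as pd

# Toy data: neighborhood A has 4 venues, neighborhood B has 2.
toy = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Café", "Café", "Pub", "Park", "Pub", "Pub"],
})

# One column per category, 1 where the venue belongs to that category.
onehot = pd.get_dummies(toy["Venue Category"], dtype=int)
onehot.insert(0, "Neighborhood", toy["Neighborhood"])

# The mean of 0/1 indicators is the fraction of venues in each category.
freq = onehot.groupby("Neighborhood").mean()
print(freq)
```

Note that a neighborhood with only a handful of venues (Gibson has 4 in the counts above) gets coarse frequencies such as 0.25, so its profile is much noisier than that of Central Hamilton with 67.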


#### Write a function to sort the venues in descending order.

In [125]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

#### Create the new dataframe and display the top 10 venues for each neighborhood.

In [252]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError: # ranks beyond 3rd have no entry in indicators
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = hamilton_grouped['Neighborhood']

for ind in np.arange(hamilton_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(hamilton_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Beasley,Middle Eastern Restaurant,Coffee Shop,Vietnamese Restaurant,Fast Food Restaurant,Pharmacy,Beer Store,Asian Restaurant,Theater,Sushi Restaurant,Dog Run
1,Central Hamilton,Coffee Shop,Pub,Fast Food Restaurant,Sandwich Place,Bar,Café,Middle Eastern Restaurant,Indian Restaurant,Hotel,Burrito Place
2,Corktown,Pub,Italian Restaurant,Park,Sandwich Place,Fast Food Restaurant,Pizza Place,Mexican Restaurant,Restaurant,Coffee Shop,Seafood Restaurant
3,Durand,Pub,Café,Italian Restaurant,Pharmacy,Ethiopian Restaurant,Fast Food Restaurant,Breakfast Spot,Seafood Restaurant,Bank,Pizza Place
4,Gibson,Restaurant,Coffee Shop,Gas Station,Library,Fast Food Restaurant,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner
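
The column names above are built with a `try`/`except` around a three-item suffix list. An alternative sketch that computes the suffix directly is shown below; the helper name `ordinal` is an assumption for illustration, and it also handles ranks such as 11th or 21st should `num_top_venues` be raised.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th are irregular
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

num_top_venues = 10
columns = ["Neighborhood"] + [
    f"{ordinal(i)} Most Common Venue" for i in range(1, num_top_venues + 1)
]
print(columns)
```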


## 4. Cluster Neighborhoods

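Add the median rent to the dataframe. The merge in the next cell is left commented out; the rent is joined in later, together with the cluster labels.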

In [248]:
#avg_rent = rent_df[['Neighborhood','MedianRent']]
#avg_rent = avg_rent.sort_values(by = 'Neighborhood')
#df = pd.merge(neighborhoods_venues_sorted, avg_rent, on='Neighborhood', how='inner')
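
If the commented-out merge were re-enabled, note that the rent dataframe was created with the column `Neighbourhood` while the venue dataframe uses `Neighborhood`. A minimal sketch of a merge across differently named keys follows, using a few values from the tables above; it is an illustration under that assumption, not the cell the notebook ran.

```python
import pandas as pd

rent = pd.DataFrame({"Neighbourhood": ["Durand", "Gibson", "Stipley"],
                     "MedianRent": [1300, 1199, 1175]})
venues = pd.DataFrame({"Neighborhood": ["Durand", "Gibson"],
                       "1st Most Common Venue": ["Pub", "Restaurant"]})

# left_on/right_on name the key column on each side; the duplicate key is dropped after.
merged = pd.merge(venues, rent,
                  left_on="Neighborhood", right_on="Neighbourhood",
                  how="inner").drop(columns="Neighbourhood")
print(merged)
```

With `how="inner"`, neighborhoods present on only one side (Stipley here) are silently left out, so it is worth comparing row counts before and after.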


In [253]:
# set number of clusters
kclusters = 6

hamilton_grouped_clustering = hamilton_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(hamilton_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 0, 0, 2, 0, 0, 1, 0, 0], dtype=int32)
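
The notebook fixes `kclusters = 6` without showing how that value was chosen. One common heuristic, sketched here on synthetic data rather than the Hamilton frequencies, is to fit k-means for several values of k and look for the point where the inertia (within-cluster sum of squares) stops dropping sharply. The data and variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic data: three well-separated blobs of 20 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((20, 2)) for c in centers])

# Fit k-means for k = 1..6 and record the inertia of each fit.
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(k, round(value, 2))
```

On the real data, `X` would be `hamilton_grouped_clustering`. With only 18 neighborhoods, a large k tends to produce many single-member clusters, which is what the label array above suggests.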

Create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [254]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

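# rent_df names its key column 'Neighbourhood'; rename it to match the venue dataframe
hamilton_merged = rent_df.rename(columns={'Neighbourhood': 'Neighborhood'})

# join neighborhoods_venues_sorted onto the rent data to add the cluster label and top venues for each neighborhood
hamilton_merged = hamilton_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')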

hamilton_merged.head() # check the last columns!

Unnamed: 0,Neighborhood,MedianRent,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Durand,1300,43.250247,-79.875734,0,Pub,Café,Italian Restaurant,Pharmacy,Ethiopian Restaurant,Fast Food Restaurant,Breakfast Spot,Seafood Restaurant,Bank,Pizza Place
1,Central Hamilton,1675,43.25608,-79.872858,0,Coffee Shop,Pub,Fast Food Restaurant,Sandwich Place,Bar,Café,Middle Eastern Restaurant,Indian Restaurant,Hotel,Burrito Place
3,Beasley,1395,43.259204,-79.861012,0,Middle Eastern Restaurant,Coffee Shop,Vietnamese Restaurant,Fast Food Restaurant,Pharmacy,Beer Store,Asian Restaurant,Theater,Sushi Restaurant,Dog Run
4,Corktown,1375,43.250681,-79.868619,0,Pub,Italian Restaurant,Park,Sandwich Place,Fast Food Restaurant,Pizza Place,Mexican Restaurant,Restaurant,Coffee Shop,Seafood Restaurant
5,Gibson,1199,43.257866,-79.839098,2,Restaurant,Coffee Shop,Gas Station,Library,Fast Food Restaurant,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner


In [255]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster, rent in zip(hamilton_merged['Latitude'], hamilton_merged['Longitude'], hamilton_merged['Neighborhood'], hamilton_merged['Cluster Labels'],hamilton_merged['MedianRent']):
    

    label = folium.Popup(str(poi) 
                         + ", Cluster:  " + str(cluster) 
                         + ", Median Rent: " + str(rent), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## 5. Examine Clusters

#### Cluster 1

In [256]:
hamilton_merged.loc[hamilton_merged['Cluster Labels'] == 0, hamilton_merged.columns[[1] + list(range(5, hamilton_merged.shape[1]))]]

Unnamed: 0,MedianRent,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,1300,Pub,Café,Italian Restaurant,Pharmacy,Ethiopian Restaurant,Fast Food Restaurant,Breakfast Spot,Seafood Restaurant,Bank,Pizza Place
1,1675,Coffee Shop,Pub,Fast Food Restaurant,Sandwich Place,Bar,Café,Middle Eastern Restaurant,Indian Restaurant,Hotel,Burrito Place
3,1395,Middle Eastern Restaurant,Coffee Shop,Vietnamese Restaurant,Fast Food Restaurant,Pharmacy,Beer Store,Asian Restaurant,Theater,Sushi Restaurant,Dog Run
4,1375,Pub,Italian Restaurant,Park,Sandwich Place,Fast Food Restaurant,Pizza Place,Mexican Restaurant,Restaurant,Coffee Shop,Seafood Restaurant
9,720,Coffee Shop,Sandwich Place,Bank,Mediterranean Restaurant,Burger Joint,Mexican Restaurant,Cupcake Shop,Indie Movie Theater,Supermarket,Middle Eastern Restaurant
10,1450,Adult Boutique,Frozen Yogurt Shop,Pizza Place,Department Store,Sandwich Place,Chinese Restaurant,Shopping Mall,Liquor Store,Fast Food Restaurant,Ice Cream Shop
13,1272,Yoga Studio,Theater,Fast Food Restaurant,Pharmacy,Coffee Shop,Hotel,Gastropub,Gas Station,History Museum,Arts & Crafts Store
14,1250,Bar,Sushi Restaurant,Sandwich Place,Dessert Shop,Pizza Place,Indian Restaurant,Soccer Stadium,Fast Food Restaurant,College Cafeteria,Convenience Store
16,1348,Wings Joint,Pharmacy,Restaurant,Sandwich Place,Coffee Shop,Farmers Market,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop
18,1200,Restaurant,Breakfast Spot,Park,Coffee Shop,Skating Rink,Brewery,Convenience Store,Gas Station,Cupcake Shop,Deli / Bodega


In [257]:
hamilton_merged.groupby('Cluster Labels').mean(numeric_only=True)

Cluster Labels,MedianRent,Latitude,Longitude
0,1376.923077,36.697991,-60.556152
1,1720.0,39.977308,-86.047118
2,1199.0,43.257866,-79.839098
3,1375.0,43.263222,-79.936684
4,1310.0,43.246953,-79.852747
5,1424.0,43.226083,-79.807812
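
A mean per cluster is easier to interpret alongside the number of neighborhoods behind it. The sketch below uses five rows taken from the merged table shown earlier and reports both; the frame name `summary_input` is illustrative.

```python
import pandas as pd

# Five rows from the merged table: cluster label and median rent.
summary_input = pd.DataFrame({
    "Neighborhood": ["Durand", "Central Hamilton", "Beasley",
                     "Kirkendall North", "Gibson"],
    "Cluster Labels": [0, 0, 0, 1, 2],
    "MedianRent": [1300, 1675, 1395, 1720, 1199],
})

# Count and mean of rent per cluster in one table.
summary = (summary_input
           .groupby("Cluster Labels")["MedianRent"]
           .agg(["count", "mean"]))
print(summary)
```

A cluster with a count of 1 simply reproduces a single neighborhood's rent, so its "average" says nothing about a group.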


#### Cluster 2

In [258]:
hamilton_merged.loc[hamilton_merged['Cluster Labels'] == 1, hamilton_merged.columns[[1] + list(range(5, hamilton_merged.shape[1]))]]

Unnamed: 0,MedianRent,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,1720,Golf Course,Park,Dog Run,Soccer Field,Yoga Studio,Farmers Market,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop


#### Cluster 3

In [259]:
hamilton_merged.loc[hamilton_merged['Cluster Labels'] == 2, hamilton_merged.columns[[1] + list(range(5, hamilton_merged.shape[1]))]]

Unnamed: 0,MedianRent,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,1199,Restaurant,Coffee Shop,Gas Station,Library,Fast Food Restaurant,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner


#### Cluster 4

In [260]:
hamilton_merged.loc[hamilton_merged['Cluster Labels'] == 3, hamilton_merged.columns[[1] + list(range(5, hamilton_merged.shape[1]))]]

Unnamed: 0,MedianRent,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
22,1375,Thai Restaurant,Yoga Studio,Fish & Chips Shop,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dog Run


#### Cluster 5

In [261]:
hamilton_merged.loc[hamilton_merged['Cluster Labels'] == 4, hamilton_merged.columns[[1] + list(range(5, hamilton_merged.shape[1]))]]

Unnamed: 0,MedianRent,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
12,1310,Park,Trail,Coffee Shop,Yoga Studio,Fast Food Restaurant,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store


#### Cluster 6

In [262]:
hamilton_merged.loc[hamilton_merged['Cluster Labels'] == 5, hamilton_merged.columns[[1] + list(range(5, hamilton_merged.shape[1]))]]

Unnamed: 0,MedianRent,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
15,1424,Convenience Store,Park,Chinese Restaurant,Burrito Place,Cosmetics Shop,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store
