# Melbourne: Neighbourhood Analysis & Segmentation
## Introduction

This notebook is the final project for the Coursera Applied Data Science course. The main tasks are: (1) retrieve Melbourne postcodes and coordinates; (2) use the Foursquare API to collect venues in Melbourne; (3) explore and cluster the neighbourhoods of Melbourne.

## Table of Contents

1. <a href="#part01">Retrieve Melbourne postcodes & coordinates</a>
2. <a href="#part02">Foursquare API: Venues in Melbourne</a>  
3. <a href="#part03">Cluster the neighborhoods in Melbourne</a>  
4. <a href="#part04">Examine clusters</a>  


## Packages installation:

In [6]:
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import numpy as np

# import the library to open URLs
import urllib.request
# import the BeautifulSoup library to parse HTML and XML documents
!conda install -c conda-forge beautifulsoup4 --yes
from bs4 import BeautifulSoup

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform a JSON file into a pandas dataframe (pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # comment this line out if folium is already installed
import folium # map rendering library

print('------- Libraries imported. -------')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.


------- Libraries imported. -------


<a id="part01"></a>

<div class="alert alert-block alert-info" style="margin-top: 50px">
<font size = 8  color='black' font-weight="bold">
</font>
</div>

# **1. Retrieve Melbourne postcodes & coordinates**

<div class="alert alert-block alert-info">
<font size = 8  color='black' font-weight="bold">
</font>
</div>

## Reading data:
`australian_postcodes.csv` contains a database of Australian postcodes and localities with longitude and latitude values, sourced and supported by the community. Source: <a href='https://www.matthewproctor.com/australian_postcodes' target='blank'>Australian Postcodes</a>

In [7]:
df_postcodes = pd.read_csv('australian_postcodes.csv')
print('df_postcodes size: ', df_postcodes.shape)
df_postcodes.head()

df_postcodes size:  (18272, 14)


Unnamed: 0,id,postcode,locality,state,long,lat,dc,type,status,sa3,sa3name,sa4,sa4name,region
0,230,200,ANU,ACT,0.0,0.0,,,,,,,,R1
1,21820,200,Australian National University,ACT,149.1189,-35.2777,,,Added 19-Jan-2020,,,,,R1
2,232,800,DARWIN,NT,130.83668,-12.458684,,,Updated 6-Feb-2020,70101.0,Darwin City,701.0,Darwin,R1
3,233,801,DARWIN,NT,130.83668,-12.458684,,,Updated 25-Mar-2020 SA3,70101.0,Darwin City,701.0,Darwin,R1
4,234,804,PARAP,NT,130.873315,-12.428017,,,Updated 25-Mar-2020 SA3,70102.0,Darwin Suburbs,701.0,Darwin,R1
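
The first row of the preview has `long` and `lat` both equal to 0.0, which looks like a placeholder rather than a real location, and `isna()` would not flag it. Below is a minimal sketch of how such rows could be detected, using a toy frame built from the first two preview rows; the names `sample`, `placeholder` and `clean` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Toy frame copied from the first two rows of the preview above
sample = pd.DataFrame({
    'locality': ['ANU', 'Australian National University'],
    'long': [0.0, 149.1189],
    'lat': [0.0, -35.2777],
})

# Flag rows whose coordinates are exactly (0, 0)
placeholder = (sample['long'] == 0) & (sample['lat'] == 0)

# Keep only the rows with usable coordinates
clean = sample[~placeholder].reset_index(drop=True)
print(clean)
```

Whether any Victorian rows are affected is not shown in the outputs here, so this is only a check worth running, not a confirmed problem.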


## Extract records of __VIC__
__VIC__ = __Victoria__ state

In [8]:
df_vic = df_postcodes[df_postcodes['state']=='VIC']
print('df_vic size: ', df_vic.shape)
df_vic.head()

df_vic size:  (3531, 14)


Unnamed: 0,id,postcode,locality,state,long,lat,dc,type,status,sa3,sa3name,sa4,sa4name,region
6100,4746,3000,MELBOURNE,VIC,144.956776,-37.817403,CITY DELIVERY CENTRE,Delivery Area,Updated 6-Feb-2020,20604.0,Melbourne City,206.0,Melbourne - Inner,R1
6101,4747,3001,MELBOURNE,VIC,144.76592,-38.365017,CITY MAIL PROCESSING CENTRE,Post Office Boxes,Updated 25-Mar-2020 SA3,20605.0,Port Phillip,206.0,Melbourne - Inner,R1
6102,4748,3002,EAST MELBOURNE,VIC,144.982207,-37.818517,CITY DELIVERY CENTRE,Delivery Area,Updated 6-Feb-2020,20604.0,Melbourne City,206.0,Melbourne - Inner,R1
6103,4749,3003,WEST MELBOURNE,VIC,144.949592,-37.810871,CITY DELIVERY CENTRE,Delivery Area,Updated 6-Feb-2020,20604.0,Melbourne City,206.0,Melbourne - Inner,R1
6104,4750,3004,MELBOURNE,VIC,144.970161,-37.844246,CITY DELIVERY CENTRE,Delivery Area,Updated 6-Feb-2020,20605.0,Port Phillip,206.0,Melbourne - Inner,R1


## Handle missing values
### Check missing values

In [16]:
df_vic.isna().sum()

id            0
postcode      0
locality      0
state         0
long          0
lat           0
dc           56
type        213
status        7
sa3          19
sa3name      19
sa4          19
sa4name      19
region        0
dtype: int64

### Drop records with missing values

In [17]:
# df_vic = df_vic.apply (pd.to_numeric, errors='coerce')
df_vic = df_vic.dropna()
df_vic = df_vic.reset_index(drop=True)
print('df_vic size: ', df_vic.shape)
df_vic.isna().sum()

df_vic size:  (3305, 14)


id          0
postcode    0
locality    0
state       0
long        0
lat         0
dc          0
type        0
status      0
sa3         0
sa3name     0
sa4         0
sa4name     0
region      0
dtype: int64
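
Note that `dropna()` without arguments removes a row if any column is missing, including `dc`, `type` and `status`, which are not used later in this notebook. A possible alternative, sketched below on toy data, is to restrict the check to the columns the analysis relies on. The column list `needed` is an assumption based on the columns kept further down.

```python
import numpy as np
import pandas as pd

# Toy data: row 1 lacks sa4name, row 2 lacks only the unused 'dc' column
toy = pd.DataFrame({
    'postcode': [3000, 3001, 3002],
    'long': [144.96, 144.77, 144.98],
    'lat': [-37.82, -38.37, -37.82],
    'sa4name': ['Melbourne - Inner', np.nan, 'Melbourne - Inner'],
    'dc': ['CITY DELIVERY CENTRE', 'CITY MAIL PROCESSING CENTRE', np.nan],
})

# Columns assumed to matter for the later steps
needed = ['postcode', 'long', 'lat', 'sa4name']

kept_all = toy.dropna()                                         # drops rows 1 and 2
kept_subset = toy.dropna(subset=needed).reset_index(drop=True)  # drops row 1 only

print(len(kept_all), len(kept_subset))
```

With the subset version, a record is only discarded when a value that is actually needed is missing.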

## Extract records of __Melbourne__
The selection is based on the value of the `sa4name` column: postcodes belonging to Melbourne contain _'Melbourne'_ in their `sa4name` string, for example _'Melbourne - Inner'_, _'Melbourne - Inner East'_, _'Melbourne - Inner South'_, _'Melbourne - West'_, etc. More details can be found at the <a href='https://itt.abs.gov.au/itt/r.jsp?databyregion' target='blank'>Australian Bureau of Statistics</a>.

In [21]:
df_melbourne = df_vic[df_vic['sa4name'].str.contains('Melbourne')].reset_index(drop=True)

print("df_melbourne size: ", df_melbourne.shape)
df_melbourne.head(5)

df_melbourne size:  (712, 14)


Unnamed: 0,id,postcode,locality,state,long,lat,dc,type,status,sa3,sa3name,sa4,sa4name,region
0,4746,3000,MELBOURNE,VIC,144.956776,-37.817403,CITY DELIVERY CENTRE,Delivery Area,Updated 6-Feb-2020,20604.0,Melbourne City,206.0,Melbourne - Inner,R1
1,4747,3001,MELBOURNE,VIC,144.76592,-38.365017,CITY MAIL PROCESSING CENTRE,Post Office Boxes,Updated 25-Mar-2020 SA3,20605.0,Port Phillip,206.0,Melbourne - Inner,R1
2,4748,3002,EAST MELBOURNE,VIC,144.982207,-37.818517,CITY DELIVERY CENTRE,Delivery Area,Updated 6-Feb-2020,20604.0,Melbourne City,206.0,Melbourne - Inner,R1
3,4749,3003,WEST MELBOURNE,VIC,144.949592,-37.810871,CITY DELIVERY CENTRE,Delivery Area,Updated 6-Feb-2020,20604.0,Melbourne City,206.0,Melbourne - Inner,R1
4,4750,3004,MELBOURNE,VIC,144.970161,-37.844246,CITY DELIVERY CENTRE,Delivery Area,Updated 6-Feb-2020,20605.0,Port Phillip,206.0,Melbourne - Inner,R1


### Keep relevant columns

In [22]:
relevant_columns = ['id','postcode','locality','long','lat','sa3name','sa4name']
df_melbourne = df_melbourne[relevant_columns]
print('df_melbourne size: ', df_melbourne.shape)
df_melbourne.head()

df_melbourne size:  (712, 7)


Unnamed: 0,id,postcode,locality,long,lat,sa3name,sa4name
0,4746,3000,MELBOURNE,144.956776,-37.817403,Melbourne City,Melbourne - Inner
1,4747,3001,MELBOURNE,144.76592,-38.365017,Port Phillip,Melbourne - Inner
2,4748,3002,EAST MELBOURNE,144.982207,-37.818517,Melbourne City,Melbourne - Inner
3,4749,3003,WEST MELBOURNE,144.949592,-37.810871,Melbourne City,Melbourne - Inner
4,4750,3004,MELBOURNE,144.970161,-37.844246,Port Phillip,Melbourne - Inner


<a id="part02"></a>

<div class="alert alert-block alert-info" style="margin-top: 50px">
<font size = 8  color='black' font-weight="bold">
</font>
</div>

# **2. Foursquare API: Venues in Melbourne**

<div class="alert alert-block alert-info" style="margin-top: 0px">
<font size = 8  color='black' font-weight="bold">
</font>
</div>

## Define Foursquare Credentials and Version

In [24]:
CLIENT_ID = 'KGCFNTU2I3RZNLATJX3OQFHD5OW5KLGRYLTNIKR2RL5Y2NZX' # Foursquare ID
CLIENT_SECRET = 'NNYV15RRKAHC35PZT310HOZYDXISRJSPQ2IVWGU0CMO4ZPMN' # Foursquare Secret
VERSION = '20200531' # Foursquare API version
LIMIT = 100 
radius = 500

print('My credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

My credentials:
CLIENT_ID: KGCFNTU2I3RZNLATJX3OQFHD5OW5KLGRYLTNIKR2RL5Y2NZX
CLIENT_SECRET:NNYV15RRKAHC35PZT310HOZYDXISRJSPQ2IVWGU0CMO4ZPMN
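
Hard-coding and printing API credentials makes them visible to anyone who reads the notebook. One common alternative, sketched here, is to read them from environment variables and print only a masked form. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are assumptions for illustration, not part of the original notebook.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read credentials from environment variables (the names are assumptions)."""
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first.')
    return client_id, client_secret

def mask(secret, visible=4):
    """Show only the first few characters of a secret and star out the rest."""
    return secret[:visible] + '*' * max(0, len(secret) - visible)

# Example with a dummy mapping instead of the real environment
demo_id, demo_secret = load_foursquare_credentials({
    'FOURSQUARE_CLIENT_ID': 'EXAMPLEID123',
    'FOURSQUARE_CLIENT_SECRET': 'EXAMPLESECRET456',
})
print('CLIENT_ID: ' + mask(demo_id))
```

In the notebook, `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` would then replace the two literal assignments.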


## Function to retrieve venues & apply to all the neighborhoods in Melbourne
This function is taken from the Neighborhood Segmentation lab. It queries the Foursquare `explore` endpoint for each location and returns one row per venue.

In [25]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    """Return one row per Foursquare venue found within `radius` metres of each location."""
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # make the GET request and keep only the list of returned items
        results = requests.get(url).json()['response']['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-neighbourhood lists into a single dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['neighbourhood',
                             'neighbourhood latitude',
                             'neighbourhood longitude',
                             'Venue',
                             'Venue Latitude',
                             'Venue Longitude',
                             'Venue Category']

    return nearby_venues
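
The function above indexes straight into the JSON response, so a response without `groups`, or a venue without categories, would raise an exception partway through the loop. The sketch below separates the parsing step into a hypothetical helper, `extract_venue_rows`, that tolerates both cases. It assumes the same nested layout the function above relies on, and the `fake` payload is made up for the demonstration.

```python
def extract_venue_rows(payload, name, lat, lng):
    """Flatten one explore-style payload into tuples, tolerating missing pieces.

    Assumes the layout payload['response']['groups'][0]['items'][i]['venue'].
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Use None when a venue has no category instead of raising IndexError
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows

# Made-up payload with one categorised and one uncategorised venue
fake = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Example Cafe',
               'location': {'lat': -37.81, 'lng': 144.96},
               'categories': [{'name': 'Café'}]}},
    {'venue': {'name': 'Uncategorised Spot',
               'location': {'lat': -37.82, 'lng': 144.95},
               'categories': []}},
]}]}}

rows = extract_venue_rows(fake, 'Melbourne City', -37.817403, 144.956776)
empty = extract_venue_rows({'response': {}}, 'Nowhere', 0.0, 0.0)
print(len(rows), len(empty))
```

Keeping the parsing separate from the HTTP call also makes it possible to test it without spending any API quota.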

## Apply the `getNearbyVenues` function to get the list of venues
<mark>Do __NOT__ re-run the code cell below! A free Foursquare account has a daily limit on API calls.</mark>
<hr>

In [26]:
melbourne_venues = getNearbyVenues(names=df_melbourne['sa3name'],
                                   latitudes=df_melbourne['lat'],
                                   longitudes=df_melbourne['long']
                                  )
print(melbourne_venues.shape)
melbourne_venues.head()

(5378, 7)


Unnamed: 0,neighbourhood,neighbourhood latitude,neighbourhood longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Melbourne City,-37.817403,144.956776,Virgin Active Health Club,-37.818806,144.955917,Gym / Fitness Center
1,Melbourne City,-37.817403,144.956776,Bonnie Coffee Brewers,-37.818153,144.957636,Coffee Shop
2,Melbourne City,-37.817403,144.956776,Royal Stacks,-37.817867,144.958489,Burger Joint
3,Melbourne City,-37.817403,144.956776,Brim CC,-37.817764,144.954732,Japanese Restaurant
4,Melbourne City,-37.817403,144.956776,Don Don,-37.818174,144.956018,Japanese Restaurant
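
Because the call quota is limited, it can help to save the result to disk once and reload it on later runs. The following is a minimal sketch of that idea; `load_or_fetch` and the cache file name in the usage note are hypothetical and not part of the original notebook.

```python
import os
import pandas as pd

def load_or_fetch(cache_path, fetch):
    """Return the cached dataframe if cache_path exists; otherwise call fetch() and cache it."""
    if os.path.exists(cache_path):
        return pd.read_csv(cache_path)
    frame = fetch()
    frame.to_csv(cache_path, index=False)
    return frame
```

In this notebook it could be used as `melbourne_venues = load_or_fetch('melbourne_venues.csv', lambda: getNearbyVenues(names=df_melbourne['sa3name'], latitudes=df_melbourne['lat'], longitudes=df_melbourne['long']))`, so that the API is only queried when the file does not exist yet.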


<hr>

## Check how many venues were returned for each neighborhood

In [85]:
print('There are {} unique categories.'.format(len(melbourne_venues['Venue Category'].unique())))
melbourne_venues.groupby('neighbourhood').count()

There are 288 unique categories.


neighbourhood,neighbourhood latitude,neighbourhood longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Banyule,57,57,57,57,57,57
Bayside,161,161,161,161,161,161
Boroondara,131,131,131,131,131,131
Brimbank,160,160,160,160,160,160
Brunswick - Coburg,171,171,171,171,171,171
Cardinia,24,24,24,24,24,24
Casey - North,20,20,20,20,20,20
Casey - South,25,25,25,25,25,25
Dandenong,52,52,52,52,52,52
Darebin - North,258,258,258,258,258,258


## One-hot encoding

In [86]:
# one hot encoding
melbourne_onehot = pd.get_dummies(melbourne_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
melbourne_onehot['neighbourhood'] = melbourne_venues['neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [melbourne_onehot.columns[-1]] + list(melbourne_onehot.columns[:-1])
melbourne_onehot = melbourne_onehot[fixed_columns]
print('melbourne_onehot size: ', melbourne_onehot.shape)
melbourne_onehot.head()

melbourne_onehot size:  (5378, 289)


Unnamed: 0,neighbourhood,Accessories Store,Adult Boutique,Afghan Restaurant,Alternative Healer,American Restaurant,Antique Shop,Aquarium,Arcade,Arepa Restaurant,Art Gallery,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Australian Restaurant,Austrian Restaurant,Automotive Shop,BBQ Joint,Badminton Court,Bagel Shop,Bakery,Bar,Baseball Field,Basketball Court,Basketball Stadium,Beach,Bed & Breakfast,Beer Garden,Bistro,Board Shop,Bookstore,Boutique,Bowling Alley,Bowling Green,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Café,Cambodian Restaurant,Camera Store,Candy Store,Casino,Cemetery,Chaat Place,Cheese Shop,Chinese Restaurant,City Hall,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Bookstore,College Gym,College Theater,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Creperie,Cricket Ground,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Fish & Chips Shop,Fish Market,Fishing Spot,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Harbor / Marina,Health & Beauty Service,Hill,Historic Site,History Museum,Hockey Arena,Hockey Field,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hungarian Restaurant,IT Services,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indie Movie Theater,Indie Theater,Indonesian Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Lebanese Restaurant,Light Rail Station,Liquor Store,Lounge,Malay Restaurant,Market,Massage Studio,Medical Center,Medical School,Mediterranean Restaurant,Memorial Site,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,National Park,Nightclub,Noodle House,Office,Opera House,Optical Shop,Other Great Outdoors,Outlet Mall,Paintball Field,Paper / Office Supplies Store,Park,Pedestrian Plaza,Peking Duck Restaurant,Performing Arts Venue,Persian Restaurant,Pet Store,Pharmacy,Pier,Pizza Place,Platform,Playground,Plaza,Polish Restaurant,Pool,Portuguese Restaurant,Post Office,Pub,Ramen Restaurant,Record Shop,Recreation Center,Rental Car Location,Residential Building (Apartment / Condo),Rest Area,Restaurant,River,Rock Climbing Spot,Rock Club,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Snack Place,Soba Restaurant,Soccer Field,Soccer Stadium,Social Club,Soup Place,Southern / Soul Food Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tailor Shop,Tapas Restaurant,Tea Room,Temple,Tennis Court,Tennis Stadium,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Tibetan Restaurant,Tour Provider,Tourist Information Center,Toy / Game Store,Trail,Train,Train Station,Tram Station,Tree,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Yunnan Restaurant,Zoo Exhibit
0,Melbourne City,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Melbourne City,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Melbourne City,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Melbourne City,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Melbourne City,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


## Group rows by neighbourhood, taking the mean frequency of occurrence of each category

In [87]:
melbourne_grouped = melbourne_onehot.groupby('neighbourhood').mean().reset_index()
print('melbourne_grouped size: ', melbourne_grouped.shape)
melbourne_grouped.head()

melbourne_grouped size:  (38, 289)


Unnamed: 0,neighbourhood,Accessories Store,Adult Boutique,Afghan Restaurant,Alternative Healer,American Restaurant,Antique Shop,Aquarium,Arcade,Arepa Restaurant,Art Gallery,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Australian Restaurant,Austrian Restaurant,Automotive Shop,BBQ Joint,Badminton Court,Bagel Shop,Bakery,Bar,Baseball Field,Basketball Court,Basketball Stadium,Beach,Bed & Breakfast,Beer Garden,Bistro,Board Shop,Bookstore,Boutique,Bowling Alley,Bowling Green,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Café,Cambodian Restaurant,Camera Store,Candy Store,Casino,Cemetery,Chaat Place,Cheese Shop,Chinese Restaurant,City Hall,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Bookstore,College Gym,College Theater,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Creperie,Cricket Ground,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Fish & Chips Shop,Fish Market,Fishing Spot,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Harbor / Marina,Health & Beauty Service,Hill,Historic Site,History Museum,Hockey Arena,Hockey Field,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hungarian Restaurant,IT Services,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indie Movie Theater,Indie Theater,Indonesian Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Lebanese Restaurant,Light Rail Station,Liquor Store,Lounge,Malay Restaurant,Market,Massage Studio,Medical Center,Medical School,Mediterranean Restaurant,Memorial Site,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,National Park,Nightclub,Noodle House,Office,Opera House,Optical Shop,Other Great Outdoors,Outlet Mall,Paintball Field,Paper / Office Supplies Store,Park,Pedestrian Plaza,Peking Duck Restaurant,Performing Arts Venue,Persian Restaurant,Pet Store,Pharmacy,Pier,Pizza Place,Platform,Playground,Plaza,Polish Restaurant,Pool,Portuguese Restaurant,Post Office,Pub,Ramen Restaurant,Record Shop,Recreation Center,Rental Car Location,Residential Building (Apartment / Condo),Rest Area,Restaurant,River,Rock Climbing Spot,Rock Club,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Snack Place,Soba Restaurant,Soccer Field,Soccer Stadium,Social Club,Soup Place,Southern / Soul Food Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tailor Shop,Tapas Restaurant,Tea Room,Temple,Tennis Court,Tennis Stadium,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Tibetan Restaurant,Tour Provider,Tourist Information Center,Toy / Game Store,Trail,Train,Train Station,Tram Station,Tree,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Yunnan Restaurant,Zoo Exhibit
0,Banyule,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.087719,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.315789,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.105263,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.070175,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.087719,0.0,0.0
1,Bayside,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.006211,0.0,0.0,0.0,0.0,0.0,0.018634,0.018634,0.0,0.0,0.0,0.006211,0.0,0.0,0.0,0.0,0.018634,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.0,0.0,0.0,0.0,0.0,0.0,0.130435,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037267,0.0,0.0,0.018634,0.0,0.037267,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.0,0.018634,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.024845,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.0,0.0,0.0,0.018634,0.0,0.0,0.006211,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.0,0.043478,0.0,0.0,0.0,0.018634,0.018634,0.0,0.0,0.018634,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.031056,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055901,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.0,0.018634,0.0,0.0,0.0,0.0,0.055901,0.018634,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.062112,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037267,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018634,0.018634,0.0,0.0,0.0,0.0
2,Boroondara,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045802,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015267,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.145038,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015267,0.0,0.0,0.0,0.0,0.030534,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015267,0.0,0.0,0.0,0.0,0.045802,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030534,0.0,0.0,0.015267,0.015267,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015267,0.0,0.030534,0.007634,0.022901,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015267,0.0,0.0,0.0,0.0,0.015267,0.0,0.0,0.0,0.0,0.0,0.015267,0.0,0.0,0.0,0.10687,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045802,0.030534,0.0,0.015267,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045802,0.0,0.015267,0.0,0.0,0.0,0.0,0.0,0.015267,0.0,0.0,0.0,0.015267,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.053435,0.0,0.0,0.0,0.0,0.0,0.015267,0.0,0.022901,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015267,0.0,0.0,0.0,0.0,0.0,0.007634,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015267,0.0,0.0,0.0,0.0,0.0,0.0,0.015267,0.015267,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045802,0.0,0.0,0.007634,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Brimbank,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.09375,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.025,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04375,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.05625,0.05625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0875,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.08125,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Brunswick - Coburg,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.023392,0.076023,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.239766,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.005848,0.0,0.0,0.0,0.0,0.0,0.0,0.011696,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005848,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05848,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.017544,0.011696,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.046784,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.023392,0.0,0.0,0.0,0.0,0.0,0.005848,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.035088,0.0,0.040936,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005848,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005848,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.023392,0.0,0.0,0.0,0.005848,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.023392,0.017544,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0
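
The one-hot encoding and the group-wise mean are easier to follow on a small example. The sketch below uses made-up data with two neighbourhoods and three categories; all `toy_*` names are illustrative only.

```python
import pandas as pd

# Six made-up venues in two neighbourhoods
toy_venues = pd.DataFrame({
    'neighbourhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Café', 'Café', 'Park', 'Pub', 'Park', 'Park'],
})

# One indicator column per category (dtype=int gives 0/1 instead of booleans)
toy_onehot = pd.get_dummies(toy_venues[['Venue Category']],
                            prefix='', prefix_sep='', dtype=int)
toy_onehot.insert(0, 'neighbourhood', toy_venues['neighbourhood'])

# The mean of a 0/1 column within a group is that category's share of the group's venues
toy_grouped = toy_onehot.groupby('neighbourhood').mean().reset_index()
print(toy_grouped)
# A: Café 0.50, Park 0.25, Pub 0.25; B: Café 0.00, Park 1.00, Pub 0.00
```

Each row of the grouped frame sums to 1, so neighbourhoods with very different venue counts become comparable.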


## Function to sort the venues in descending order based on frequency
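This function is taken from the lab. It sorts the category frequencies of one row in descending order and returns the names of the top `num_top_venues` categories.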

In [88]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

## Dataframe to store the top 5 venues for each neighborhood

In [138]:
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['neighbourhood'] = melbourne_grouped['neighbourhood']

for ind in np.arange(melbourne_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(melbourne_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Banyule,Park,Playground,Yoga Studio,Fish & Chips Shop,Sports Club
1,Bayside,Café,Thai Restaurant,Pizza Place,Supermarket,Indian Restaurant
2,Boroondara,Café,Italian Restaurant,Park,Light Rail Station,Thai Restaurant
3,Brimbank,Café,Pizza Place,Vietnamese Restaurant,Grocery Store,Gym
4,Brunswick - Coburg,Café,Bar,Grocery Store,Light Rail Station,Pizza Place
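
The row-by-row loop above can also be written as a single array operation. This is a sketch of one possible vectorised variant, not the lab's method; `top_categories` and the `top_1`, `top_2` column names are hypothetical. Ties are resolved in column order here, which may differ from the ordering `sort_values` produces for tied frequencies.

```python
import numpy as np
import pandas as pd

def top_categories(grouped, n, key='neighbourhood'):
    """Return the n most frequent category names for each row of a grouped frame."""
    freqs = grouped.drop(columns=key)
    # Sorting the negated values ascending gives descending frequencies;
    # a stable sort keeps the original column order among ties
    order = np.argsort(-freqs.to_numpy(dtype=float), axis=1, kind='stable')[:, :n]
    top = pd.DataFrame(freqs.columns.to_numpy()[order],
                       columns=['top_{}'.format(i + 1) for i in range(n)])
    top.insert(0, key, grouped[key].to_numpy())
    return top

# Made-up frequencies for two neighbourhoods
toy = pd.DataFrame({
    'neighbourhood': ['A', 'B'],
    'Café': [0.5, 0.0],
    'Park': [0.25, 1.0],
    'Pub': [0.25, 0.0],
})
result = top_categories(toy, 2)
print(result)
```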


<a id="part03"></a>

<div class="alert alert-block alert-info" style="margin-top: 50px">
<font size = 8  color='black' font-weight="bold">
</font>
</div>

# __3. Cluster the neighbourhoods in Melbourne__

<div class="alert alert-block alert-info">
<font size = 8  color='black' font-weight="bold">
</font>
</div>

## Using k-Means with 7 clusters

In [139]:
# set number of clusters
kclusters = 7

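# keep only the numeric frequency columns (passing the axis positionally is deprecated in recent pandas)
melbourne_grouped_clustering = melbourne_grouped.drop(columns='neighbourhood')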

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(melbourne_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 


array([4, 2, 2, 5, 2, 5, 5, 4, 5, 5], dtype=int32)
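
Beyond the raw labels, the fitted centroids can hint at what each cluster represents. The sketch below is an illustration on made-up data, not the notebook's result: `toy` is a hypothetical frequency table with two obvious groups, and the dominant category of each centroid is read from `cluster_centers_`.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical frequency table: two café-heavy rows and two park-heavy rows
toy = pd.DataFrame({
    "Café": [0.8, 0.7, 0.1, 0.0],
    "Park": [0.1, 0.2, 0.8, 0.9],
    "Bar":  [0.1, 0.1, 0.1, 0.1],
})

km = KMeans(n_clusters=2, random_state=0, n_init=10).fit(toy)

# One row per cluster centroid, one column per venue category
centres = pd.DataFrame(km.cluster_centers_, columns=toy.columns)

# Category with the highest average frequency in each cluster
dominant = centres.idxmax(axis=1)
print(dominant)
```

Applied to the real feature table, the same idea would list the venue categories that weigh most in each centroid, which may help when labelling clusters by hand.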

## Create a new dataframe to store both cluster label & top 5 venues

In [140]:
df_postcode_short = pd.DataFrame(columns=['neighbourhood','long', 'lat'])
df_postcode_short.head()
neighbourhood_list = df_melbourne['sa3name'].unique()
melbourne_merged = df_melbourne

for i in range(len(neighbourhood_list)):
#     print(neighbourhood_list[i])
    count = 0
    sum_lat = 0
    sum_long = 0
    for j in range(len(melbourne_merged)):
        check = melbourne_merged.loc[j, "sa3name"] == neighbourhood_list[i]
        if check: 
            count += 1
            sum_lat += melbourne_merged.loc[j, "lat"]
            sum_long += melbourne_merged.loc[j, "long"]
    avg_lat = sum_lat / count
    avg_long = sum_long / count
    df_postcode_short.loc[i,'neighbourhood'] = neighbourhood_list[i]
    df_postcode_short.loc[i,'long'] = avg_long
    df_postcode_short.loc[i,'lat'] = avg_lat
print('df_postcode_short size: ', df_postcode_short.shape)
df_postcode_short.head()

df_postcode_short size:  (38, 3)


Unnamed: 0,neighbourhood,long,lat
0,Melbourne City,144.954,-37.8059
1,Port Phillip,144.951,-37.8925
2,Maribyrnong,144.879,-37.7969
3,Hobsons Bay,144.876,-37.8478
4,Brimbank,144.805,-37.7447
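
The nested loops above compute the mean latitude and longitude per `sa3name`. A more compact route is likely a `groupby`; this is a minimal sketch on a hypothetical stand-in (`toy_melbourne`) with made-up coordinates, assuming the same `sa3name`, `lat` and `long` columns.

```python
import pandas as pd

# Hypothetical stand-in for df_melbourne (made-up coordinates)
toy_melbourne = pd.DataFrame({
    "sa3name": ["Yarra", "Yarra", "Monash"],
    "lat": [-37.80, -37.82, -37.90],
    "long": [144.98, 145.00, 145.12],
})

toy_centres = (
    toy_melbourne
    .groupby("sa3name", sort=False)[["long", "lat"]]  # sort=False keeps first-appearance order
    .mean()
    .reset_index()
    .rename(columns={"sa3name": "neighbourhood"})
)
print(toy_centres)
```

Unlike the loop version, the resulting `long` and `lat` columns are numeric rather than `object` dtype, which tends to be easier to work with downstream.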


In [141]:
melbourne_merged = df_postcode_short
print('melbourne_merged size: ', melbourne_merged.shape)
print('neighborhoods_venues_sorted size: ', neighborhoods_venues_sorted.shape)

# add clustering labels
# neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
neighborhoods_venues_sorted['Cluster Labels'] = kmeans.labels_

# join the sorted venues and cluster labels onto the neighbourhood coordinates
melbourne_merged = melbourne_merged.join(neighborhoods_venues_sorted.set_index('neighbourhood'), on='neighbourhood')

print('melbourne_merged size: ', melbourne_merged.shape)
melbourne_merged.head()

melbourne_merged size:  (38, 3)
neighborhoods_venues_sorted size:  (38, 6)
melbourne_merged size:  (38, 9)


Unnamed: 0,neighbourhood,long,lat,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,Cluster Labels
0,Melbourne City,144.954,-37.8059,Café,Coffee Shop,Hotel,Italian Restaurant,Bar,2
1,Port Phillip,144.951,-37.8925,Café,Bar,Australian Restaurant,Breakfast Spot,Athletics & Sports,2
2,Maribyrnong,144.879,-37.7969,Café,Park,Construction & Landscaping,Vietnamese Restaurant,Restaurant,2
3,Hobsons Bay,144.876,-37.8478,Fast Food Restaurant,Café,Convenience Store,Beach,Grocery Store,2
4,Brimbank,144.805,-37.7447,Café,Pizza Place,Vietnamese Restaurant,Grocery Store,Gym,5
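
Because the join relies on neighbourhood names matching exactly, it can be worth checking for rows that found no partner. The sketch below uses `merge` with `indicator=True` on two hypothetical toy frames (`toy_coords`, `toy_sorted`); it is an optional sanity check, not part of the original workflow.

```python
import pandas as pd

# Hypothetical stand-ins for df_postcode_short and neighborhoods_venues_sorted
toy_coords = pd.DataFrame({
    "neighbourhood": ["A", "B", "C"],
    "lat": [-37.8, -37.9, -37.7],
    "long": [144.9, 145.0, 144.8],
})
toy_sorted = pd.DataFrame({
    "neighbourhood": ["A", "B"],
    "1st Most Common Venue": ["Café", "Park"],
    "Cluster Labels": [2, 5],
})

# A left merge keeps every coordinate row; _merge records whether a match was found
checked = toy_coords.merge(toy_sorted, on="neighbourhood", how="left", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "neighbourhood"].tolist()
print(unmatched)
```

Unmatched rows would carry `NaN` in `Cluster Labels`, which would later break any lookup that uses the label as a list index.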


## Visualise the resulting clusters

In [142]:
address = 'Melbourne, AU'
geolocator = Nominatim(user_agent="au_explorer")
location = geolocator.geocode(address)
latitude_melbourne = location.latitude
longitude_melbourne = location.longitude
print('The geographical coordinates of Melbourne are {}, {}.'.format(latitude_melbourne, longitude_melbourne))

The geographical coordinates of Melbourne are -37.8142176, 144.9631608.


In [155]:
# create map
map_clusters = folium.Map(location=[latitude_melbourne, longitude_melbourne], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow colour per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(melbourne_merged['lat'], melbourne_merged['long'], melbourne_merged['neighbourhood'], melbourne_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=8,
        popup=label,
        color='gray',
        fill=True,
        # labels run from 0 to kclusters-1, so index the palette directly
        fill_color=rainbow[int(cluster)],
        fill_opacity=0.99).add_to(map_clusters)
       
map_clusters
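
To make the colour assignment on the map easier to follow, here is a small standalone sketch of one way to give each cluster label its own hex colour. The `colour_for` dictionary is a hypothetical name; `kclusters = 7` mirrors the value used above.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 7

# Sample the rainbow colormap at kclusters evenly spaced points and convert to hex
palette = [colors.to_hex(rgba) for rgba in cm.rainbow(np.linspace(0, 1, kclusters))]

# Explicit label -> colour mapping (labels are assumed to run from 0 to kclusters - 1)
colour_for = {label: palette[label] for label in range(kclusters)}
for label, hex_colour in colour_for.items():
    print(label, hex_colour)
```

An explicit mapping like this can also be reused to build a legend, since folium circle markers do not add one automatically.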

## Check the number of neighbourhoods belonging to each cluster

In [144]:
melbourne_merged.groupby('Cluster Labels').count()

Cluster Labels,neighbourhood,long,lat,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,1,1,1,1,1,1,1,1
1,1,1,1,1,1,1,1,1
2,17,17,17,17,17,17,17,17
3,1,1,1,1,1,1,1,1
4,2,2,2,2,2,2,2,2
5,15,15,15,15,15,15,15,15
6,1,1,1,1,1,1,1,1
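
Since every column in the table above repeats the same count, a single series of cluster sizes may be easier to read. A minimal sketch with `value_counts`, using a made-up label series (`toy_labels`) rather than the real `Cluster Labels` column:

```python
import pandas as pd

# Hypothetical cluster labels for six neighbourhoods
toy_labels = pd.Series([2, 2, 5, 2, 5, 0], name="Cluster Labels")

# Number of neighbourhoods per cluster, ordered by cluster label
sizes = toy_labels.value_counts().sort_index()
print(sizes)
```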


## Comment: 
### With k = 3: 35 (out of 39) neighbourhoods fall into the same cluster, based on the 10 most common venues. 
### With k = 7: two larger clusters are identified, cluster 2 (17 neighbourhoods) and cluster 5 (15 neighbourhoods).
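
The comparison between values of k can be made a little more systematic. The sketch below is one possible approach, run on random made-up features (`toy_features`) rather than the Melbourne data: for each k it records the share of rows in the largest cluster and the silhouette score. The choice of k values and both metrics are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Hypothetical data: 38 rows of 20 frequency-like features that sum to 1 per row
rng = np.random.default_rng(0)
toy_features = rng.random((38, 20))
toy_features = toy_features / toy_features.sum(axis=1, keepdims=True)

summary = {}
for k in (3, 5, 7):
    labels = KMeans(n_clusters=k, random_state=0, n_init=10).fit_predict(toy_features)
    sizes = np.bincount(labels, minlength=k)
    summary[k] = {
        # fraction of all rows that ended up in the biggest cluster
        "largest_share": sizes.max() / sizes.sum(),
        # mean silhouette over all rows (closer to 1 suggests better separation)
        "silhouette": silhouette_score(toy_features, labels),
    }
    print(k, sizes, round(float(summary[k]["silhouette"]), 3))
```

A very high `largest_share` would correspond to the situation described for k = 3, where almost every neighbourhood lands in one cluster.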

<a id="part04"></a>

<div class="alert alert-block alert-info" style="margin-top: 50px">
<font size = 8  color='black' font-weight="bold">
</font>
</div>

# __4. Examine clusters__

<div class="alert alert-block alert-info">
<font size = 8  color='black' font-weight="bold">
</font>
</div>

## Cluster 0

In [150]:
melbourne_merged.loc[melbourne_merged['Cluster Labels'] == 0, melbourne_merged.columns[[0] + list(range(3, melbourne_merged.shape[1]))]]

Unnamed: 0,neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,Cluster Labels
20,Manningham - West,Playground,Football Stadium,Zoo Exhibit,Fast Food Restaurant,Field,0
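
The same selection expression is repeated for every cluster in this part. A small helper could reduce the repetition; this is a sketch under the assumption that the merged table has `long`, `lat` and `Cluster Labels` columns. `cluster_members` and `toy_merged` are hypothetical names.

```python
import pandas as pd

def cluster_members(merged, label, coord_cols=("long", "lat")):
    """Return the rows of `merged` in cluster `label`, without the coordinate columns."""
    keep = [c for c in merged.columns if c not in coord_cols]
    return merged.loc[merged["Cluster Labels"] == label, keep]

# Hypothetical stand-in for melbourne_merged (made-up rows)
toy_merged = pd.DataFrame({
    "neighbourhood": ["A", "B", "C"],
    "long": [144.9, 145.0, 144.8],
    "lat": [-37.8, -37.9, -37.7],
    "1st Most Common Venue": ["Café", "Park", "Café"],
    "Cluster Labels": [2, 4, 2],
})

print(cluster_members(toy_merged, 2))
```

Selecting columns by name rather than by position keeps the helper working if the number of top-venue columns changes.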


## Cluster 1

In [151]:
melbourne_merged.loc[melbourne_merged['Cluster Labels'] == 1, melbourne_merged.columns[[0] + list(range(3, melbourne_merged.shape[1]))]]

Unnamed: 0,neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,Cluster Labels
35,Sunbury,Diner,Home Service,Zoo Exhibit,Flea Market,Fast Food Restaurant,1


## Cluster 2

In [146]:
melbourne_merged.loc[melbourne_merged['Cluster Labels'] == 2, melbourne_merged.columns[[0] + list(range(3, melbourne_merged.shape[1]))]]

Unnamed: 0,neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,Cluster Labels
0,Melbourne City,Café,Coffee Shop,Hotel,Italian Restaurant,Bar,2
1,Port Phillip,Café,Bar,Australian Restaurant,Breakfast Spot,Athletics & Sports,2
2,Maribyrnong,Café,Park,Construction & Landscaping,Vietnamese Restaurant,Restaurant,2
3,Hobsons Bay,Fast Food Restaurant,Café,Convenience Store,Beach,Grocery Store,2
8,Essendon,Café,Light Rail Station,Italian Restaurant,Grocery Store,Asian Restaurant,2
11,Yarra,Café,Bar,Fast Food Restaurant,Pub,Greek Restaurant,2
12,Brunswick - Coburg,Café,Bar,Grocery Store,Light Rail Station,Pizza Place,2
18,Boroondara,Café,Italian Restaurant,Park,Light Rail Station,Thai Restaurant,2
19,Whitehorse - West,Café,Supermarket,Grocery Store,Sandwich Place,Department Store,2
22,Maroondah,Café,Bar,Park,Furniture / Home Store,Seafood Restaurant,2


## Cluster 3

In [134]:
melbourne_merged.loc[melbourne_merged['Cluster Labels'] == 3, melbourne_merged.columns[[0] + list(range(3, melbourne_merged.shape[1]))]]

Unnamed: 0,neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
13,Darebin - South,Café,Playground,Farmers Market,Field,Filipino Restaurant,Fish & Chips Shop,Fish Market,Fishing Spot,Flea Market,Zoo Exhibit


## Cluster 4

In [147]:
melbourne_merged.loc[melbourne_merged['Cluster Labels'] == 4, melbourne_merged.columns[[0] + list(range(3, melbourne_merged.shape[1]))]]

Unnamed: 0,neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,Cluster Labels
16,Banyule,Park,Playground,Yoga Studio,Fish & Chips Shop,Sports Club,4
37,Casey - South,Park,Gas Station,Convenience Store,Playground,Skate Park,4


## Cluster 5

In [148]:
melbourne_merged.loc[melbourne_merged['Cluster Labels'] == 5, melbourne_merged.columns[[0] + list(range(3, melbourne_merged.shape[1]))]]

Unnamed: 0,neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,Cluster Labels
4,Brimbank,Café,Pizza Place,Vietnamese Restaurant,Grocery Store,Gym,5
5,Melton - Bacchus Marsh,Playground,Park,Gym,Yoga Studio,Fast Food Restaurant,5
6,Wyndham,Playground,Fast Food Restaurant,Supermarket,Bus Station,Pizza Place,5
9,Moreland - North,Café,Supermarket,Bakery,Shopping Mall,Thai Restaurant,5
10,Tullamarine - Broadmeadows,Grocery Store,Bakery,Fish & Chips Shop,Pizza Place,Arcade,5
14,Darebin - North,Café,Chinese Restaurant,Bakery,Light Rail Station,Pizza Place,5
15,Whittlesea - Wallan,Pizza Place,Fish & Chips Shop,Supermarket,Portuguese Restaurant,Paper / Office Supplies Store,5
17,Nillumbik - Kinglake,Pizza Place,Café,Thrift / Vintage Store,Park,Bus Station,5
21,Manningham - East,Fast Food Restaurant,Japanese Restaurant,Chinese Restaurant,Restaurant,Thai Restaurant,5
25,Monash,Café,Greek Restaurant,Pizza Place,Supermarket,Asian Restaurant,5


## Cluster 6

In [153]:
melbourne_merged.loc[melbourne_merged['Cluster Labels'] == 6, melbourne_merged.columns[[0] + list(range(3, melbourne_merged.shape[1]))]]

Unnamed: 0,neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,Cluster Labels
7,Keilor,Gym / Fitness Center,Café,Food & Drink Shop,Scenic Lookout,Coffee Shop,6


# __5. Check parking meters__

In [157]:
df_parking_meters = pd.read_csv('On-street_Car_Parking_Meters_with_Location.csv')
print(df_parking_meters.shape)
df_parking_meters.head()

(982, 10)


Unnamed: 0,MeterId,AssetId,Barcode,CreditCard,TapAndGo,MeterType,StreetName,Longitude,Latitude,Location
0,342A,1647009,MPM1647009,YES,YES,Reino TVX,Queensberry Street,144.963976,-37.804537,"(-37.80453746283878, 144.9639761825564)"
1,DDS10,1620105,MPM1620105,YES,NO,Reino VX1,Docklands Drive,144.937064,-37.815179,"(-37.81517934836205, 144.93706352181104)"
2,VS1,1620106,MPM1620106,YES,NO,Reino VX1,Albert Street,144.973784,-37.809446,"(-37.809446131465286, 144.97378384265096)"
3,VS13A,1620107,MPM1620107,YES,NO,Reino VX1,Albert Street,144.982523,-37.810408,"(-37.81040847508652, 144.98252323875553)"
4,172D,1647093,MPM1647093,YES,YES,Reino TVX,Berkeley Street,144.95886,-37.802572,"(-37.802571973350894, 144.95886032823444)"
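
Before mapping the meters, a quick summary can give a feel for the table. The sketch below uses a made-up four-row frame (`toy_meters`) that borrows only the column names `MeterId`, `TapAndGo` and `StreetName` from the output above; the meter IDs and counts are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for df_parking_meters (made-up rows)
toy_meters = pd.DataFrame({
    "MeterId": ["M1", "M2", "M3", "M4"],
    "TapAndGo": ["YES", "NO", "NO", "YES"],
    "StreetName": ["Albert Street", "Albert Street", "Docklands Drive", "Berkeley Street"],
})

# Number of meters on each street, most common first
meters_per_street = toy_meters["StreetName"].value_counts()

# Share of meters that accept tap-and-go payment
tap_and_go_share = (toy_meters["TapAndGo"] == "YES").mean()

print(meters_per_street)
print(tap_and_go_share)
```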


## View on map

In [160]:
from folium import plugins

# start again with a clean copy of the map of Melbourne
mel_map = folium.Map(location=[latitude_melbourne, longitude_melbourne], zoom_start=11)

# instantiate a marker cluster object for the parking meters in the dataframe
parking_meters = plugins.MarkerCluster().add_to(mel_map)

# loop through the dataframe and add each data point to the marker cluster
for lat, lng, label in zip(df_parking_meters['Latitude'], df_parking_meters['Longitude'], df_parking_meters['MeterId']):
    folium.Marker(
        location=[lat, lng],
        icon=None,
        popup=label,
    ).add_to(parking_meters)

# display map
mel_map