# Coursera Capstone Week 5 Notebook

### Purpose

This notebook groups the 50 most populous Australian cities into clusters of similar places, based on the local services and facilities available to residents.

The aim is to help people who relocate find a city that offers a similar quality of life.


### Audience

The intended audience is anyone in the 25-40 age group. That group is having the hardest time entering the property market, and it is also the group most likely to have a family young enough that a relocation causes the least disruption to dependents' lives.


### Methodology 

We will use k-means clustering, an unsupervised machine learning method, to group cities by the relative frequency of each venue category nearby, and then inspect the most common venue categories in each cluster.

### Data

Population figures and latitude/longitude for the 50 most populous Australian cities (2017) - https://raw.githubusercontent.com/emmitk/Coursera_Capstone/master/Australian-Cities-Top50-ByPopulation-2017.csv

Foursquare places data (Foursquare API) - https://api.foursquare.com/v2/venues/explore



### Import all required libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
#from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library




### Create Dataframes

#### Data: Top50 Australian Cities with Geocoding

In [70]:
lv_top50_city_list_AU = "https://raw.githubusercontent.com/emmitk/Coursera_Capstone/master/Australian-Cities-Top50-ByPopulation-2017.csv"
top50_latlng_df = pd.read_csv(lv_top50_city_list_AU, header=0, quotechar = '"')
top50_latlng_df.columns = ['Rank','City','2017 Population','5-year growth','5 year growth %', '1 year growth','1 year growth %','lat','lng']
print (top50_latlng_df.shape)
top50_latlng_df.head()

(50, 9)


Unnamed: 0,Rank,City,2017 Population,5-year growth,5 year growth %,1 year growth,1 year growth %,lat,lng
0,1,Sydney,4741874,433750,10.10%,98079,2.10%,-33.794883,151.268071
1,2,Melbourne,4677157,557346,13.50%,119975,2.60%,-38.365017,144.76592
2,3,Brisbane,2326656,203040,9.60%,46366,2.00%,-27.46758,153.027892
3,4,Perth,2004696,141620,7.60%,19789,1.00%,-31.924074,115.91223
4,5,Adelaide,1315346,55749,4.40%,9535,0.70%,-34.92577,138.599732


In [71]:
top50_latlng_df = top50_latlng_df.drop(['Rank','2017 Population','5-year growth','5 year growth %', '1 year growth','1 year growth %'], axis=1)
top50_latlng_df.head()

Unnamed: 0,City,lat,lng
0,Sydney,-33.794883,151.268071
1,Melbourne,-38.365017,144.76592
2,Brisbane,-27.46758,153.027892
3,Perth,-31.924074,115.91223
4,Adelaide,-34.92577,138.599732


### Define function "getNearbyVenues" to return nearby venues from Foursquare APIs

In [72]:
def getNearbyVenues(names, latitudes, longitudes, radius=1500):
    # Set constants
    CLIENT_ID = 'ZOIG5VFRHSDUWK0G1KWTKRDH53PALXH55B4AU0NOFYE3Q0XT' 
    CLIENT_SECRET = 'HNU11JLRZWHF34YDWQBAGZALY0RYVMV4RJFMGBUR0VV3BBVG' 
    VERSION = '20180605' 
    LIMIT = 100 # maximum number of venues returned by the Foursquare API
    # radius is the search radius in metres around each city's coordinates

    # Section can be one of food, drinks, coffee, shops, arts, outdoors, sights, trending, nextVenues OR topPicks
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&section=topPicks'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['City', 
                  'City Latitude', 
                  'City Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
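
The function above builds the URL by string formatting and indexes straight into the JSON response. A slightly more defensive variant is sketched below: it passes the query as a `params` dict, reads the credentials from environment variables, and skips venues that have no category. The names `build_explore_params`, `parse_explore_items`, `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are illustrative assumptions, not part of the original notebook or of the Foursquare API.

```python
import os

EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_params(lat, lng, radius=1500, limit=100, version="20180605"):
    """Build the query parameters for one venues/explore request."""
    return {
        # Hypothetical environment variable names; set them before running.
        "client_id": os.environ.get("FOURSQUARE_CLIENT_ID", ""),
        "client_secret": os.environ.get("FOURSQUARE_CLIENT_SECRET", ""),
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
        "section": "topPicks",
    }


def parse_explore_items(payload, name, lat, lng):
    """Flatten one explore response into rows, tolerating missing fields."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories", [])
        if not categories:
            continue  # skip venues without a category
        location = venue.get("location", {})
        rows.append((name, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"),
                     categories[0]["name"]))
    return rows
```

With these helpers, the request inside the loop could become `requests.get(EXPLORE_URL, params=build_explore_params(lat, lng), timeout=10).json()`, and the result would be passed to `parse_explore_items`. A city with an empty or malformed response then contributes no rows instead of raising an exception.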

In [73]:
city_venues = getNearbyVenues(names=top50_latlng_df['City'],
                                   latitudes=top50_latlng_df['lat'],
                                   longitudes=top50_latlng_df['lng']
                                  )

Sydney
Melbourne
Brisbane
Perth
Adelaide
Gold Coast – Tweed Heads
Newcastle – Maitland
Canberra – Queanbeyan
Central Coast
Sunshine Coast
Wollongong
Geelong
Hobart
Townsville
Cairns
Toowoomba
Darwin
Ballarat
Bendigo
Albury – Wodonga
Launceston
Mackay
Rockhampton
Bunbury (WA)
Coffs Harbour
Bundaberg
Melton (VIC)
Wagga Wagga
Hervey Bay
Mildura – Wentworth
Shepparton – Mooroopna
Port Macquarie
Gladstone – Tannum Sands (QLD)
Tamworth
Traralgon – Morwell
Orange
Bowral – Mittagong
Busselton
Geraldton
Dubbo
Nowra – Bomaderry
Warragul – Drouin
Bathurst
Warrnambool
Albany
Kalgoorlie – Boulder
Devonport
Mount Gambier
Lismore (NSW)
Nelson Bay (NSW)


In [74]:
city_venues.head()

Unnamed: 0,City,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Sydney,-33.794883,151.268071,Fish Cafe,-33.793634,151.264552,Café
1,Sydney,-33.794883,151.268071,Coles,-33.78547,151.268455,Supermarket
2,Sydney,-33.794883,151.268071,Ajmer's Indian Restaurant,-33.794372,151.26447,Indian Restaurant
3,Sydney,-33.794883,151.268071,Forty Baskets Beach,-33.802796,151.269834,Beach
4,Sydney,-33.794883,151.268071,The Bistro,-33.799394,151.280993,Bistro


In [75]:
clean_city_venues = city_venues.drop(['Venue','Venue Latitude','Venue Longitude'], axis=1)
clean_city_venues.head(20)

Unnamed: 0,City,City Latitude,City Longitude,Venue Category
0,Sydney,-33.794883,151.268071,Café
1,Sydney,-33.794883,151.268071,Supermarket
2,Sydney,-33.794883,151.268071,Indian Restaurant
3,Sydney,-33.794883,151.268071,Beach
4,Sydney,-33.794883,151.268071,Bistro
5,Sydney,-33.794883,151.268071,Pie Shop
6,Sydney,-33.794883,151.268071,Indian Restaurant
7,Sydney,-33.794883,151.268071,Golf Course
8,Sydney,-33.794883,151.268071,Tennis Court
9,Melbourne,-38.365017,144.76592,Café


In [76]:
city_venues.groupby('City').count()

City,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Adelaide,48,48,48,48,48,48
Albany,7,7,7,7,7,7
Albury – Wodonga,20,20,20,20,20,20
Ballarat,16,16,16,16,16,16
Bathurst,5,5,5,5,5,5
Bendigo,19,19,19,19,19,19
Bowral – Mittagong,17,17,17,17,17,17
Brisbane,75,75,75,75,75,75
Bunbury (WA),17,17,17,17,17,17
Bundaberg,12,12,12,12,12,12


In [77]:
print('There are {} unique categories.'.format(len(city_venues['Venue Category'].unique())))

There are 146 unique categories.


In [78]:
city_venues['Venue Category'].unique()

array(['Café', 'Supermarket', 'Indian Restaurant', 'Beach', 'Bistro',
       'Pie Shop', 'Golf Course', 'Tennis Court', 'Pizza Place',
       'Seafood Restaurant', 'Surf Spot', 'Harbor / Marina', 'Bar',
       'Scenic Lookout', 'Vegetarian / Vegan Restaurant', 'Whisky Bar',
       'Dive Bar', 'Restaurant', 'Ice Cream Shop', 'Outdoor Sculpture',
       'Thai Restaurant', 'Cocktail Bar', 'Trail', 'French Restaurant',
       'Jazz Club', 'Pub', 'Music Venue', 'Dessert Shop', 'Park',
       'Japanese Restaurant', 'Beer Bar', 'German Restaurant',
       'Coffee Shop', 'Korean Restaurant', 'Australian Restaurant',
       'Pedestrian Plaza', 'Gastropub', 'Gay Bar',
       'Indonesian Restaurant', 'Brewery', 'Athletics & Sports',
       'Breakfast Spot', 'Lounge', 'Hot Dog Joint', 'Burger Joint',
       'Plaza', 'Hotel Bar', 'Bakery', 'Chinese Restaurant',
       'Asian Restaurant', 'Mexican Restaurant', 'Tapas Restaurant',
       'Sports Club', 'Liquor Store', 'Fish & Chips Shop', 'Yoga Studi

### Analyse each City - One Hot Encoding

In [79]:
# one hot encoding
city_onehot = pd.get_dummies(city_venues[['Venue Category']], prefix="", prefix_sep="")

city_onehot['City'] = city_venues['City']
cols = ['City']  + [col for col in city_onehot if col != 'City']
city_onehot = city_onehot[cols]

print(city_onehot.shape)
city_onehot.head()

(734, 147)


Unnamed: 0,City,African Restaurant,American Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,Beach,Beach Bar,Beer Bar,Beer Garden,Beer Store,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Breakfast Spot,Brewery,Buffet,Burger Joint,Burrito Place,Café,Cajun / Creole Restaurant,Campground,Chinese Restaurant,Climbing Gym,Cocktail Bar,Coffee Shop,Concert Hall,Construction & Landscaping,Convenience Store,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Dive Shop,Dry Cleaner,Electronics Store,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Food,Food & Drink Shop,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gastropub,Gay Bar,German Restaurant,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Harbor / Marina,History Museum,Hobby Shop,Hostel,Hot Dog Joint,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indonesian Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Korean Restaurant,Liquor Store,Lounge,Malay Restaurant,Massage Studio,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Motel,Movie Theater,Multiplex,Music Venue,Night Market,Noodle House,Optical Shop,Outdoor Sculpture,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Pharmacy,Pie Shop,Pizza Place,Playground,Plaza,Pool,Pub,Record Shop,Resort,Restaurant,River,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shop & Service,Shopping Mall,Shopping Plaza,Snack Place,Soccer Field,South American Restaurant,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Steakhouse,Supermarket,Surf Spot,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Trail,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Yoga Studio
0,Sydney,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Sydney,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Sydney,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Sydney,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Sydney,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [80]:
# Debug
#df = city_onehot.groupby('City').sum()

#df['Total'] = df.sum(axis=1)
#df.head()

### Assign the relative weighting of each category across all data for that City

For example, if a city has 100 venues in total and 6 of them are cafés, then the Café category ends up with a weight of 0.06 (i.e. 6%).
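
A small worked example, using made-up venue data for a hypothetical city called "Exampleville", shows why taking the mean of the one-hot columns yields each category's share of the city's venues.

```python
import pandas as pd

# Made-up data: 100 venues in a hypothetical city, 6 of them cafés.
venues = pd.DataFrame({
    "City": ["Exampleville"] * 100,
    "Venue Category": ["Café"] * 6 + ["Park"] * 94,
})

# One column per category, with a 1 (or True) in the row's own category.
onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="")
onehot["City"] = venues["City"]

# The mean of a 0/1 column is the fraction of rows equal to 1.
shares = onehot.groupby("City").mean()
cafe_share = float(shares.loc["Exampleville", "Café"])
park_share = float(shares.loc["Exampleville", "Park"])
```

Because every venue has exactly one category, the shares in each row add up to 1, which makes cities with very different venue counts comparable.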

In [81]:
city_grouped = city_onehot.groupby('City').mean().reset_index()
city_grouped.head()

Unnamed: 0,City,African Restaurant,American Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,Beach,Beach Bar,Beer Bar,Beer Garden,Beer Store,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Breakfast Spot,Brewery,Buffet,Burger Joint,Burrito Place,Café,Cajun / Creole Restaurant,Campground,Chinese Restaurant,Climbing Gym,Cocktail Bar,Coffee Shop,Concert Hall,Construction & Landscaping,Convenience Store,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Dive Shop,Dry Cleaner,Electronics Store,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Food,Food & Drink Shop,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gastropub,Gay Bar,German Restaurant,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Harbor / Marina,History Museum,Hobby Shop,Hostel,Hot Dog Joint,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indonesian Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Korean Restaurant,Liquor Store,Lounge,Malay Restaurant,Massage Studio,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Motel,Movie Theater,Multiplex,Music Venue,Night Market,Noodle House,Optical Shop,Outdoor Sculpture,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Pharmacy,Pie Shop,Pizza Place,Playground,Plaza,Pool,Pub,Record Shop,Resort,Restaurant,River,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shop & Service,Shopping Mall,Shopping Plaza,Snack Place,Soccer Field,South American Restaurant,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Steakhouse,Supermarket,Surf Spot,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Trail,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Yoga Studio
0,Adelaide,0.020833,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.104167,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.020833,0.0,0.041667,0.0,0.0625,0.020833,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.041667,0.020833,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.020833,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.041667,0.0,0.0,0.0,0.0,0.020833,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.083333,0.0
1,Albany,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Albury – Wodonga,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.05,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Ballarat,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0625,0.1875,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0
4,Bathurst,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


### Group all restaurant categories into a single "Restaurants" feature

In [83]:
df_cols = city_grouped.columns
df_cols = df_cols[df_cols.str.contains("Restaurant")]
print(df_cols)
#df_cols[0]
#city_grouped.iloc[df_cols[0]]
city_grouped['Restaurants'] = city_grouped[df_cols].sum(axis=1)
city_grouped.drop(df_cols, axis=1, inplace=True)
city_grouped.head()


Index(['African Restaurant', 'American Restaurant', 'Asian Restaurant',
       'Australian Restaurant', 'Cajun / Creole Restaurant',
       'Chinese Restaurant', 'Dim Sum Restaurant', 'Fast Food Restaurant',
       'French Restaurant', 'German Restaurant', 'Greek Restaurant',
       'Indian Restaurant', 'Indonesian Restaurant', 'Italian Restaurant',
       'Japanese Restaurant', 'Korean Restaurant', 'Malay Restaurant',
       'Mediterranean Restaurant', 'Mexican Restaurant',
       'Middle Eastern Restaurant', 'Modern European Restaurant', 'Restaurant',
       'Seafood Restaurant', 'South American Restaurant', 'Spanish Restaurant',
       'Sushi Restaurant', 'Szechuan Restaurant', 'Tapas Restaurant',
       'Thai Restaurant', 'Vegetarian / Vegan Restaurant',
       'Vietnamese Restaurant'],
      dtype='object')


Unnamed: 0,City,Art Gallery,Athletics & Sports,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,Beach,Beach Bar,Beer Bar,Beer Garden,Beer Store,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Breakfast Spot,Brewery,Buffet,Burger Joint,Burrito Place,Café,Campground,Climbing Gym,Cocktail Bar,Coffee Shop,Concert Hall,Construction & Landscaping,Convenience Store,Deli / Bodega,Department Store,Dessert Shop,Diner,Dive Bar,Dive Shop,Dry Cleaner,Electronics Store,Farmers Market,Fish & Chips Shop,Fish Market,Food,Food & Drink Shop,Food Truck,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gastropub,Gay Bar,Gift Shop,Golf Course,Gourmet Shop,Grocery Store,Gym,Harbor / Marina,History Museum,Hobby Shop,Hostel,Hot Dog Joint,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Jazz Club,Liquor Store,Lounge,Massage Studio,Motel,Movie Theater,Multiplex,Music Venue,Night Market,Noodle House,Optical Shop,Outdoor Sculpture,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Pharmacy,Pie Shop,Pizza Place,Playground,Plaza,Pool,Pub,Record Shop,Resort,River,Sandwich Place,Scenic Lookout,Sculpture Garden,Shop & Service,Shopping Mall,Shopping Plaza,Snack Place,Soccer Field,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Steakhouse,Supermarket,Surf Spot,Taco Place,Tea Room,Tennis Court,Theater,Thrift / Vintage Store,Trail,Video Game Store,Whisky Bar,Wine Bar,Yoga Studio,Restaurants
0,Adelaide,0.0,0.0,0.0,0.0,0.0,0.0,0.104167,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0625,0.020833,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.041667,0.0,0.0,0.0,0.0,0.020833,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.3125
1,Albany,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.428571
2,Albury – Wodonga,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.15
3,Ballarat,0.0,0.0,0.0,0.0,0.0,0.0625,0.1875,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5625
4,Bathurst,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2
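
The cell above folds together only the columns whose name contains "Restaurant". If other food categories such as Steakhouse or Noodle House should be folded in as well, one possible approach is a keyword list. The helper `fold_columns`, the keyword choices and the toy frequencies below are assumptions for illustration, not the notebook's method or data.

```python
import pandas as pd


def fold_columns(df, keywords, new_name):
    """Sum every column whose name contains any keyword into one new column."""
    matched = [c for c in df.columns
               if c != new_name and any(k in c for k in keywords)]
    out = df.drop(columns=matched)
    out[new_name] = df[matched].sum(axis=1)
    return out


# Toy category frequencies for two made-up cities.
toy = pd.DataFrame({
    "City": ["A", "B"],
    "Thai Restaurant": [0.2, 0.0],
    "Steakhouse": [0.1, 0.3],
    "Noodle House": [0.0, 0.1],
    "Park": [0.7, 0.6],
})

folded = fold_columns(toy, ["Restaurant", "Steakhouse", "Noodle House"], "Restaurants")
```

Substring matching is blunt, so the keyword list is worth checking against the printed column index before the matched columns are dropped.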


In [84]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [85]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['City']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
city_venues_sorted = pd.DataFrame(columns=columns)
city_venues_sorted['City'] = city_grouped['City']

for ind in np.arange(city_grouped.shape[0]):
    city_venues_sorted.iloc[ind, 1:] = return_most_common_venues(city_grouped.iloc[ind, :], num_top_venues)

city_venues_sorted.head()

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Adelaide,Restaurants,Bar,Wine Bar,Pub,Park,Cocktail Bar,Café,Garden,Performing Arts Venue,Hotel
1,Albany,Restaurants,Café,Gastropub,Department Store,Pub,Bagel Shop,Auto Garage,Diner,Dive Bar,Dive Shop
2,Albury – Wodonga,Café,Restaurants,Park,Bar,Taco Place,Gourmet Shop,Motel,Electronics Store,Pub,Sports Bar
3,Ballarat,Restaurants,Bar,Bakery,Burger Joint,Coffee Shop,Hotel,Dive Shop,Dive Bar,Food Truck,Dry Cleaner
4,Bathurst,Pub,Café,Restaurants,Auto Garage,Food Truck,Dessert Shop,Diner,Dive Bar,Dive Shop,Dry Cleaner
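
The suffix lookup in the cell above is enough for the first ten positions. If `num_top_venues` were ever raised past 20, positions such as 21 would need "st" again. The function `ordinal` below is a hypothetical helper sketching one general way to build the column names.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 10th to 20th, 110th to 120th, and so on
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


columns = ["City"] + ["{} Most Common Venue".format(ordinal(i)) for i in range(1, 11)]
```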


### Run a K-means cluster

In [86]:
# set number of clusters
kclusters = 5

city_grouped_clustering = city_grouped.drop(columns='City')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(city_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 3, 4, 2, 0, 3, 0, 3, 3])

Let's create a new dataframe that includes the cluster label as well as the top 10 venue categories for each city.

In [87]:
# add clustering labels
city_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

city_merged = top50_latlng_df

# join the sorted venue table onto the city table to add latitude/longitude for each city
city_merged = city_merged.join(city_venues_sorted.set_index('City'), on='City')

# Cluster labels become float here because cities with no venues get NaN from the join;
# they are cast back to int before plotting

city_merged.head() # check the last columns!

Unnamed: 0,City,lat,lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Sydney,-33.794883,151.268071,3.0,Restaurants,Tennis Court,Bistro,Supermarket,Café,Beach,Golf Course,Pie Shop,Dive Bar,Dive Shop
1,Melbourne,-38.365017,144.76592,3.0,Harbor / Marina,Beach,Pizza Place,Café,Surf Spot,Restaurants,Bakery,Athletics & Sports,Diner,Dive Bar
2,Brisbane,-27.46758,153.027892,0.0,Restaurants,Cocktail Bar,Bar,Pub,Park,Café,Dive Bar,Burger Joint,Beer Bar,Dessert Shop
3,Perth,-31.924074,115.91223,0.0,Restaurants,Pizza Place,Sports Club,Bar,Café,Liquor Store,River,Fish & Chips Shop,Coffee Shop,Yoga Studio
4,Adelaide,-34.92577,138.599732,0.0,Restaurants,Bar,Wine Bar,Pub,Park,Cocktail Bar,Café,Garden,Performing Arts Venue,Hotel


### Visualise the clusters

In [88]:

address = 'Australia'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Australia are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Australia are -24.7761086, 134.755.


In [89]:
# Some cities have no venues returned for them, so they have no cluster label; remove them
print(city_merged.shape)
city_merged = city_merged.dropna(axis=0, subset=['Cluster Labels']).copy()
city_merged['Cluster Labels'] = city_merged['Cluster Labels'].astype(int)
print(city_merged.shape)


(50, 14)
(48, 14)
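
The cluster labels come back from the join as floats because it is a left join: a city with no venues has no matching row, pandas fills the gap with NaN, and a column that holds NaN cannot stay an integer column. A toy reproduction with made-up city names:

```python
import pandas as pd

cities = pd.DataFrame({"City": ["A", "B", "C"]})
labels = pd.DataFrame({"City": ["A", "C"], "Cluster Labels": [0, 2]})

# Left join: "B" has no label, so it receives NaN and the column becomes float.
merged = cities.join(labels.set_index("City"), on="City")
dtype_after_join = str(merged["Cluster Labels"].dtype)

# Drop the unlabelled rows, then cast back to int so labels can index a list.
cleaned = merged.dropna(subset=["Cluster Labels"]).copy()
cleaned["Cluster Labels"] = cleaned["Cluster Labels"].astype(int)
```

The `.copy()` makes `cleaned` an independent dataframe, so the assignment on the last line does not trigger pandas' chained-assignment warning.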


In [96]:

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=5)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(city_merged['lat'], city_merged['lng'], city_merged['City'], city_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Examine the clusters

### Cluster 1

In [91]:
cluster1 = city_merged.loc[city_merged['Cluster Labels'] == 0, city_merged.columns[[0] + list(range(4, city_merged.shape[1]))]]
print (cluster1.shape)
cluster1.head(cluster1.shape[0])

(21, 11)


Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Brisbane,Restaurants,Cocktail Bar,Bar,Pub,Park,Café,Dive Bar,Burger Joint,Beer Bar,Dessert Shop
3,Perth,Restaurants,Pizza Place,Sports Club,Bar,Café,Liquor Store,River,Fish & Chips Shop,Coffee Shop,Yoga Studio
4,Adelaide,Restaurants,Bar,Wine Bar,Pub,Park,Cocktail Bar,Café,Garden,Performing Arts Venue,Hotel
5,Gold Coast – Tweed Heads,Restaurants,Shopping Mall,Fruit & Vegetable Store,Fish Market,Food & Drink Shop,Department Store,Dessert Shop,Diner,Dive Bar,Dive Shop
7,Canberra – Queanbeyan,Restaurants,Food Truck,Bar,Speakeasy,Gastropub,Hotel Bar,Lounge,Dry Cleaner,Pizza Place,Cocktail Bar
12,Hobart,Restaurants,Park,Bistro,Gastropub,Frozen Yogurt Shop,Pharmacy,Pub,Sandwich Place,Café,Snack Place
14,Cairns,Restaurants,Bar,Café,Sporting Goods Shop,Park,Resort,Ice Cream Shop,Playground,Dive Shop,Pub
18,Bendigo,Restaurants,Pub,Café,Hotel,Electronics Store,Pizza Place,Beer Store,IT Services,Park,Theater
21,Mackay,Restaurants,Coffee Shop,Steakhouse,Concert Hall,Harbor / Marina,Shopping Mall,Pub,Bar,Beach,Dessert Shop
22,Rockhampton,Restaurants,Steakhouse,Café,Sandwich Place,Sporting Goods Shop,Sports Bar,Coffee Shop,Bar,Deli / Bodega,Dessert Shop


### Cluster 2

In [92]:
cluster2 = city_merged.loc[city_merged['Cluster Labels'] == 1, city_merged.columns[[0] + list(range(4, city_merged.shape[1]))]]
print (cluster2.shape)
cluster2.head(cluster2.shape[0])

(3, 11)


Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Newcastle – Maitland,Coffee Shop,Bar,Restaurants,Burger Joint,Noodle House,Plaza,Food,Dessert Shop,Diner,Dive Bar
13,Townsville,Gym,Coffee Shop,Restaurants,Tea Room,Plaza,Café,Shop & Service,Sports Bar,Liquor Store,Movie Theater
27,Wagga Wagga,Pub,Coffee Shop,Liquor Store,Burrito Place,Soccer Field,Food,Department Store,Dessert Shop,Diner,Dive Bar


### Cluster 3

In [93]:
cluster3 = city_merged.loc[city_merged['Cluster Labels'] == 2, city_merged.columns[[0] + list(range(4, city_merged.shape[1]))]]
print (cluster3.shape)
cluster3.head(cluster3.shape[0])

(2, 11)


Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
34,Traralgon – Morwell,Grocery Store,Café,Pub,Electronics Store,Food & Drink Shop,Department Store,Dessert Shop,Diner,Dive Bar,Dive Shop
42,Bathurst,Pub,Café,Restaurants,Auto Garage,Food Truck,Dessert Shop,Diner,Dive Bar,Dive Shop,Dry Cleaner


### Cluster 4

In [94]:
cluster4 = city_merged.loc[city_merged['Cluster Labels'] == 3, city_merged.columns[[0] + list(range(4, city_merged.shape[1]))]]
print (cluster4.shape)
cluster4.head(cluster4.shape[0])

(19, 11)


Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Sydney,Restaurants,Tennis Court,Bistro,Supermarket,Café,Beach,Golf Course,Pie Shop,Dive Bar,Dive Shop
1,Melbourne,Harbor / Marina,Beach,Pizza Place,Café,Surf Spot,Restaurants,Bakery,Athletics & Sports,Diner,Dive Bar
8,Central Coast,Restaurants,Cocktail Bar,Bistro,Café,Bar,Music Venue,Dry Cleaner,Dive Shop,Food Truck,Farmers Market
9,Sunshine Coast,Restaurants,Park,Café,Shopping Mall,Liquor Store,Lounge,Movie Theater,Electronics Store,Paper / Office Supplies Store,Pharmacy
10,Wollongong,Café,Restaurants,Burger Joint,Beach Bar,Food,Cocktail Bar,Brewery,Breakfast Spot,Convenience Store,Beach
11,Geelong,Café,Bar,Restaurants,Park,Pizza Place,Pub,Steakhouse,Fried Chicken Joint,Diner,Dessert Shop
15,Toowoomba,Restaurants,Pub,Café,Steakhouse,Ice Cream Shop,Burger Joint,Music Venue,Breakfast Spot,Deli / Bodega,Sports Bar
19,Albury – Wodonga,Café,Restaurants,Park,Bar,Taco Place,Gourmet Shop,Motel,Electronics Store,Pub,Sports Bar
20,Launceston,Restaurants,Café,Bar,Steakhouse,BBQ Joint,Bakery,Beer Bar,Fish & Chips Shop,Hotel,Pub
23,Bunbury (WA),Café,Restaurants,Bar,Pub,Breakfast Spot,Pizza Place,Fried Chicken Joint,Liquor Store,Movie Theater,Department Store


### Cluster 5

In [95]:
cluster5 = city_merged.loc[city_merged['Cluster Labels'] == 4, city_merged.columns[[0] + list(range(4, city_merged.shape[1]))]]
print (cluster5.shape)
cluster5.head(cluster5.shape[0])

(3, 11)


Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,Ballarat,Restaurants,Bar,Bakery,Burger Joint,Coffee Shop,Hotel,Dive Shop,Dive Bar,Food Truck,Dry Cleaner
24,Coffs Harbour,Restaurants,Beach,Shopping Mall,Food & Drink Shop,Department Store,Dessert Shop,Diner,Dive Bar,Dive Shop,Dry Cleaner
40,Nowra – Bomaderry,Restaurants,Sports Bar,Department Store,Dessert Shop,Diner,Dive Bar,Dive Shop,Dry Cleaner,Electronics Store,Farmers Market


### Results

The five clusters are not well balanced: two of them hold 21 and 19 cities, while the other three hold only 3, 2 and 3. There is a very high incidence of restaurant and food-related venues, which is to be expected, as social media sites are predominantly focussed on food and food reviews.

The unbalanced clustering shows that the model would need to be tweaked to get a better result. There is possibly not enough data for some of these locations (2 locations had no data returned from Foursquare, so they were dropped).
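
One common way to tune the model is to compare several values of k using the silhouette score. The sketch below runs on random stand-in data, because the real `city_grouped_clustering` matrix is not reproduced here; which k it would favour on the notebook's data is not known.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Stand-in data: three well-separated blobs in 2-D (not the notebook's data).
rng = np.random.RandomState(0)
centres = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centres])

# Fit k-means for each candidate k and record the mean silhouette score.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, random_state=0, n_init=10).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# A higher score means tighter, better-separated clusters.
best_k = max(scores, key=scores.get)
```

To try this on the notebook's data, `X` would be replaced by `city_grouped_clustering`. With fewer than 50 rows and over a hundred sparse features, the scores may be low for every k, which would itself point to the data limitation described above.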


### Conclusions

Cluster 1 contains mostly food-related businesses. Cluster 4 also has a high number of food venues; however, there do seem to be more venues related to parks and outdoor activities. We would have expected most coastal areas to be grouped together; however, the Sunshine Coast is not grouped with Nelson Bay and Coffs Harbour. This may be due to the lack of Foursquare users in different cities around Australia.

To get a better comparison, other attributes could be introduced into the model to distinguish the cities further. For example, weather data is readily available. Property pricing was going to be included but was not available for free for these areas.