# Introduction

I'll be using the Toronto postal code data from the previous assignment, together with similar postal code data for Melbourne, Australia, to find similarities and differences between neighborhood clusters in the two cities. This would be particularly useful for someone who lives in Toronto and wants to find the best neighborhood in which to open a new café and attract as many customers as possible. In this exercise, our prospective business owner also has a business partner on the other side of the world in Melbourne, and the two want to compare the cities to see whether the best areas they each choose resemble one another.

# Data

I'll be leveraging the Toronto postal code data from the previous assignment, as well as similar postal code data for Melbourne, Australia in conjunction with the corresponding Foursquare location data.

[Toronto Postal Code Data](https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M)

[Australian Postal Code Database (downloaded and imported CSV file from this website)](https://www.matthewproctor.com/australian_postcodes)

The data from each source has to be cleaned and formatted into a dataframe that lists the postal codes, boroughs, and neighborhoods in Toronto and Melbourne. To obtain the geographical coordinates (latitude/longitude) for Toronto, I matched the Toronto postal codes against a preexisting geospatial coordinates CSV file, whereas the coordinates of the Australian postcodes in Victoria were already included in the Australian postcode database.

# Methodology

## Toronto Data
Because the Wikipedia page for Toronto postal codes no longer maintains the postal code data in a clean format, I scraped the page into a dataframe and cleaned it into a second dataframe in which each row lists one postal code, borough, and neighborhood under the correct column labels. I then mapped each postal code to geographical coordinates using a preexisting CSV file of geospatial coordinates, and used the Foursquare API to search for nearby venues in each neighborhood before conducting the cluster analysis.
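
The cleaning step can be illustrated on a single scraped cell. The sketch below is not the notebook's code: the helper name `parse_cell` and the regular expression are assumptions made for illustration, and it assumes every cell has the form postal code, borough, then the neighborhood list in parentheses.

```python
import re

# Hypothetical pattern: a 3-character postal code, the borough name,
# then the neighborhood text inside the first pair of parentheses.
CELL_PATTERN = re.compile(r"^(?P<code>[A-Z]\d[A-Z])(?P<borough>[^(]+)\((?P<hood>[^)]*)\)")

def parse_cell(entry):
    """Return (code, borough, neighborhood), or None for unassigned or unparseable cells."""
    if "Not assigned" in entry:
        return None
    match = CELL_PATTERN.match(entry)
    if match is None:
        return None
    return (match.group("code"),
            match.group("borough").strip(),
            match.group("hood").strip())

print(parse_cell("M3ANorth York(Parkwoods)"))
print(parse_cell("M1ANot assigned"))
```

The notebook itself slices each string by position rather than using a regular expression; either approach yields one record per assigned postal code.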

## Melbourne Data
I used an open source database of Australian postal code data that includes the postal code, locality, latitude, longitude, and neighborhood name for each entry. After downloading the data as a CSV file, I imported it and filtered it down to locales in Melbourne. I then used the Foursquare API to search for nearby venues in each neighborhood before conducting the cluster analysis.
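
The filtering step might look like the following minimal sketch. The rows are made up for illustration (they are not taken from the real CSV); the two postcode ranges are the ones this notebook uses to select Melbourne.

```python
import pandas as pd

# Toy stand-in for the downloaded postcode CSV; these rows are invented for illustration.
aus = pd.DataFrame({
    "postcode": [2000, 3000, 3207, 3500, 8001],
    "locality": ["A", "B", "C", "D", "E"],
})

# Series.between is inclusive on both ends by default
first_range = aus["postcode"].between(3000, 3207)
second_range = aus["postcode"].between(8000, 8399)

# Keep rows in either range and renumber the index from zero
mel = aus.loc[first_range | second_range].reset_index(drop=True)
print(mel)
```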

## Cluster Analysis
After cleaning the data in each city, I ran k-means clustering to cluster each city into four clusters, which would help determine the best cluster for our prospective business owners to open their cafés. The k-means clustering then led to a new dataframe for each city that includes the relevant clusters, as well as the top 10 venues for each neighborhood as determined using the Foursquare API.
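
As a rough sketch of this pipeline on toy data (the neighborhoods and venues below are invented, and two clusters are used instead of four to keep the example small): each venue category is one-hot encoded, the encodings are averaged per neighborhood, and k-means is fitted on the resulting frequency vectors.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Invented venue list: two café-heavy and two park-heavy neighborhoods
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Venue Category": ["Café", "Coffee Shop", "Café", "Café",
                       "Park", "Trail", "Park", "Park"],
})

# One-hot encode the categories, then average per neighborhood to get frequencies
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
grouped = onehot.groupby("Neighborhood").mean().reset_index()

# Cluster the frequency vectors (the neighborhood name is not a feature)
features = grouped.drop(columns="Neighborhood")
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
grouped.insert(1, "Cluster Labels", kmeans.labels_)

print(grouped[["Neighborhood", "Cluster Labels"]])
```

The numeric cluster labels themselves are arbitrary; only the grouping they induce carries meaning.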

The clusters in each city were then visualized on maps of Toronto and Melbourne. I then examined each cluster carefully to identify the venue categories that distinguish it from the others. The standout categories in each cluster provide key indicators for our business owners to make a final decision on the location of their grand openings.
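
In this notebook the clusters are inspected by eye. One possible way to summarize standout categories programmatically is to average the category frequencies within each cluster and list the largest ones. The sketch below uses invented frequencies and labels and is offered only as an illustration of that idea.

```python
import pandas as pd

# Invented per-neighborhood category frequencies with made-up cluster labels
grouped = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D"],
    "Cluster Labels": [0, 0, 1, 1],
    "Café": [0.5, 1.0, 0.0, 0.0],
    "Coffee Shop": [0.5, 0.0, 0.0, 0.0],
    "Park": [0.0, 0.0, 0.5, 1.0],
    "Trail": [0.0, 0.0, 0.5, 0.0],
})

# Mean frequency of each category within each cluster
profile = grouped.drop(columns="Neighborhood").groupby("Cluster Labels").mean()

# The highest-frequency categories per cluster are candidate standout categories
top_n = 2
for label, row in profile.iterrows():
    top = row.sort_values(ascending=False).head(top_n)
    print(label, list(top.index))
```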

# Results and Discussion
Based on the cluster analysis, we determined that The Danforth East is the best neighborhood in which to open a café in Toronto, whereas Port Phillip would be an ideal neighborhood in Melbourne, as cafés and coffee shops are very popular in those locales.

With this in mind, it's important to note key discrepancies between the two cities, which this exercise helped to illustrate. Melbourne is a more densely populated metropolitan area than Toronto, but neighborhoods in Toronto appear to be more spread out, while those in Melbourne are bunched closer together around the Melbourne City Centre. This makes cluster analysis in Melbourne somewhat more difficult, as the clusters will generally lie closer to each other and the analysis is susceptible to outlier clusters. For example, one of the detected clusters in Melbourne was far away from the city centre, near the airport.

# Conclusion
Our business owners have taken our analysis to heart and opened beautiful cafés in The Danforth East area of Toronto and the Port Phillip area of Melbourne. While other types of analysis might have suggested alternatives, they believe that both neighborhoods will make their new businesses a popular choice.

In [2]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
#import folium # map rendering library

print('Libraries imported.')

Libraries imported.


In [6]:
url = "https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M"
dfs = pd.read_html(url)
tor_codes = dfs[0]
tor_codes.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8
0,M1ANot assigned,M2ANot assigned,M3ANorth York(Parkwoods),M4ANorth York(Victoria Village),M5ADowntown Toronto(Regent Park / Harbourfront),M6ANorth York(Lawrence Manor / Lawrence Heights),M7AQueen's Park(Ontario Provincial Government),M8ANot assigned,M9AEtobicoke(Islington Avenue)
1,M1BScarborough(Malvern / Rouge),M2BNot assigned,M3BNorth York(Don Mills)North,M4BEast York(Parkview Hill / Woodbine Gardens),"M5BDowntown Toronto(Garden District, Ryerson)",M6BNorth York(Glencairn),M7BNot assigned,M8BNot assigned,M9BEtobicoke(West Deane Park / Princess Garden...
2,M1CScarborough(Rouge Hill / Port Union / Highl...,M2CNot assigned,M3CNorth York(Don Mills)South(Flemingdon Park),M4CEast York(Woodbine Heights),M5CDowntown Toronto(St. James Town),M6CYork(Humewood-Cedarvale),M7CNot assigned,M8CNot assigned,M9CEtobicoke(Eringate / Bloordale Gardens / Ol...
3,M1EScarborough(Guildwood / Morningside / West ...,M2ENot assigned,M3ENot assigned,M4EEast Toronto(The Beaches),M5EDowntown Toronto(Berczy Park),M6EYork(Caledonia-Fairbanks),M7ENot assigned,M8ENot assigned,M9ENot assigned
4,M1GScarborough(Woburn),M2GNot assigned,M3GNot assigned,M4GEast York(Leaside),M5GDowntown Toronto(Central Bay Street),M6GDowntown Toronto(Christie),M7GNot assigned,M8GNot assigned,M9GNot assigned


In [5]:
# Collect one record per assigned postal code, then build the dataframe in one step
# (DataFrame.append was removed in pandas 2.0)
rows = []
for x in tor_codes.columns:
    for y in range(len(tor_codes)):
        entry = tor_codes[x][y]
        code = entry[0:3]  # the first three characters are the postal code
        if entry.find('Not assigned') == -1:
            left = entry.find("(")
            bor = entry[3:left]  # borough sits between the code and the opening parenthesis
            neigh = entry[left+1:len(entry)-1]  # text after the opening parenthesis, minus the final character
            rows.append({'Postal Code': code, 'Borough': bor, 'Neighbourhood': neigh})

tor_codes_df = pd.DataFrame(rows, columns = ['Postal Code', 'Borough', 'Neighbourhood'])
tor_codes_df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1B,Scarborough,Malvern / Rouge
1,M1C,Scarborough,Rouge Hill / Port Union / Highland Creek
2,M1E,Scarborough,Guildwood / Morningside / West Hill
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


In [8]:
# The code was removed by Watson Studio for sharing.

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [9]:
# Attach the coordinates by joining on the postal code
# (df_data_1 is the geospatial coordinates dataframe loaded in the previous cell)
tor_codes_df = tor_codes_df.merge(df_data_1[["Postal Code", "Latitude", "Longitude"]],
                                  on="Postal Code", how="left")
tor_codes_df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,Malvern / Rouge,43.806686,-79.194353
1,M1C,Scarborough,Rouge Hill / Port Union / Highland Creek,43.784535,-79.160497
2,M1E,Scarborough,Guildwood / Morningside / West Hill,43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476


In [10]:
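import os

# Read the Foursquare credentials from the environment rather than hard-coding them
# (the environment variable names below are placeholders, not part of the Foursquare API)
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret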
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value
!pip install geopy
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# library for transforming a JSON file into a pandas dataframe
from pandas.io.json import json_normalize


! pip install folium==0.5.0
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Collecting folium==0.5.0
  Downloading folium-0.5.0.tar.gz (79 kB)
Collecting branca
  Downloading branca-0.4.2-py3-none-any.whl (24 kB)
Building wheels for collected packages: folium
  Building wheel for folium (setup.py) ... done
  Created wheel for folium: filename=folium-0.5.0-py3-none-any.whl size=76240 sha256=e54c63520af02ca0f55a83c18942d5210faaf7da34f686d5de5a1a04d0584901
  Stored in directory: /tmp/wsuser/.cache/pip/wheels/b2/2f/2c/109e446b990d663ea5ce9b078b5e7c1a9c45cca91f377080f8
Successfully built folium
Installing collected packages: branca, folium
Successfully installed branca-0.4.2 folium-0.5.0
Folium installed
Libraries imported.


In [11]:
def getNearbyVenues(boroughs, names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for borough, name, lat, lng in zip(boroughs, names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            borough,
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Borough','Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues

toronto_venues = getNearbyVenues(boroughs=tor_codes_df["Borough"],
                                 names=tor_codes_df["Neighbourhood"],latitudes=tor_codes_df["Latitude"],
                                 longitudes=tor_codes_df["Longitude"])
toronto_venues.head()

Unnamed: 0,Borough,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Scarborough,Malvern / Rouge,43.806686,-79.194353,Wendy’s,43.807448,-79.199056,Fast Food Restaurant
1,Scarborough,Rouge Hill / Port Union / Highland Creek,43.784535,-79.160497,Chris Effects Painting,43.784343,-79.163742,Construction & Landscaping
2,Scarborough,Rouge Hill / Port Union / Highland Creek,43.784535,-79.160497,Royal Canadian Legion,43.782533,-79.163085,Bar
3,Scarborough,Guildwood / Morningside / West Hill,43.763573,-79.188711,RBC Royal Bank,43.76679,-79.191151,Bank
4,Scarborough,Guildwood / Morningside / West Hill,43.763573,-79.188711,G & G Electronics,43.765309,-79.191537,Electronics Store
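
One fragile spot in `getNearbyVenues` is the expression `v['venue']['categories'][0]['name']`, which would raise an `IndexError` if a venue were returned with an empty `categories` list. The sketch below shows one way to guard against that. The helper name `extract_venue` and the mock items are assumptions for illustration; the mock items only mimic the fields this notebook reads and are not real API output.

```python
def extract_venue(item):
    """Return (name, lat, lng, category) for one response item; category may be None."""
    venue = item["venue"]
    categories = venue.get("categories") or []
    category = categories[0]["name"] if categories else None
    return (venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category)

# Mock items shaped like the fields accessed in getNearbyVenues (invented values)
items = [
    {"venue": {"name": "Example Café", "location": {"lat": 43.0, "lng": -79.0},
               "categories": [{"name": "Café"}]}},
    {"venue": {"name": "Uncategorized Spot", "location": {"lat": 43.1, "lng": -79.1},
               "categories": []}},
]

for item in items:
    print(extract_venue(item))
```

Rows with a missing category could then be dropped or given a placeholder label before one-hot encoding.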


In [12]:
toronto = toronto_venues.loc[toronto_venues["Borough"].str.contains("Toronto")]
toronto.head()

Unnamed: 0,Borough,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
302,East Toronto,The Beaches,43.676357,-79.293031,Glen Manor Ravine,43.676821,-79.293942,Trail
303,East Toronto,The Beaches,43.676357,-79.293031,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store
304,East Toronto,The Beaches,43.676357,-79.293031,Grover Pub and Grub,43.679181,-79.297215,Pub
305,East Toronto,The Beaches,43.676357,-79.293031,Upper Beaches,43.680563,-79.292869,Neighborhood
358,East YorkEast Toronto,The Danforth East,43.685347,-79.338106,The Path,43.683923,-79.335007,Park


In [13]:
toronto_onehot = pd.get_dummies(toronto[["Venue Category"]], prefix = "", prefix_sep="")
toronto_onehot["Neighborhood"] = toronto["Neighborhood"]
hood_index = toronto_onehot.columns.get_loc("Neighborhood")

# move neighborhood column to the first column
fixed_columns = ["Neighborhood"] + list(toronto_onehot.columns[:hood_index]) + list(toronto_onehot.columns[hood_index+1:])
toronto_onehot = toronto_onehot[fixed_columns]
toronto_onehot.head()
toronto_grouped = toronto_onehot.groupby("Neighborhood").mean().reset_index()
toronto_grouped.head()

Unnamed: 0,Neighborhood,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auto Workshop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bistro,Boat or Ferry,Bookstore,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Business Service,Butcher,Cable Car,Café,Cajun / Creole Restaurant,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Home Service,Hookah Bar,Hospital,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Market,Martial Arts School,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Moving Target,Museum,Music Venue,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Plane,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Restaurant,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Repair,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Sushi Restaurant,Swim School,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Yoga Studio
0,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.051724,0.0,0.0,0.0,0.017241,0.017241,0.0,0.034483,0.0,0.017241,0.017241,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.017241,0.051724,0.068966,0.0,0.0,0.0,0.0,0.017241,0.0,0.017241,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.017241,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.017241,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.017241,0.017241,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.017241,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.0,0.0,0.017241,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0
1,Brockton / Parkdale Village / Exhibition Place,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.136364,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,CN Tower / King and Spadina / Railway Lands / ...,0.0625,0.0625,0.0625,0.125,0.125,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Central Bay Street,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.031746,0.0,0.031746,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.174603,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.031746,0.0,0.015873,0.015873,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.015873,0.0,0.0,0.0,0.047619,0.031746,0.0,0.0,0.015873,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.015873,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.015873,0.0,0.0,0.015873,0.0,0.0,0.015873,0.0,0.0,0.031746,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.031746,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.015873,0.015873
4,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.266667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [15]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # no 'st'/'nd'/'rd' suffix left, so fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Berczy Park,Coffee Shop,Bakery,Cocktail Bar,Pharmacy,Seafood Restaurant,Farmers Market,Beer Bar,Restaurant,Cheese Shop,Comfort Food Restaurant
1,Brockton / Parkdale Village / Exhibition Place,Café,Coffee Shop,Breakfast Spot,Pet Store,Bakery,Performing Arts Venue,Nightclub,Climbing Gym,Restaurant,Burrito Place
2,CN Tower / King and Spadina / Railway Lands / ...,Airport Lounge,Airport Service,Airport Terminal,Airport,Airport Food Court,Airport Gate,Sculpture Garden,Harbor / Marina,Bar,Rental Car Location
3,Central Bay Street,Coffee Shop,Italian Restaurant,Café,Sandwich Place,Japanese Restaurant,Department Store,Salad Place,Burger Joint,Bubble Tea Shop,Thai Restaurant
4,Christie,Grocery Store,Café,Park,Baby Store,Coffee Shop,Italian Restaurant,Candy Store,Nightclub,Restaurant,Dumpling Restaurant


In [16]:
# set number of clusters
kclusters = 4

toronto_grouped_clustering = toronto_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = toronto

# join the cluster labels and top venues onto the Toronto venue rows, which carry the latitude/longitude of each neighborhood
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
302,East Toronto,The Beaches,43.676357,-79.293031,Glen Manor Ravine,43.676821,-79.293942,Trail,0,Health Food Store,Trail,Pub,Yoga Studio,Diner,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
303,East Toronto,The Beaches,43.676357,-79.293031,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store,0,Health Food Store,Trail,Pub,Yoga Studio,Diner,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
304,East Toronto,The Beaches,43.676357,-79.293031,Grover Pub and Grub,43.679181,-79.297215,Pub,0,Health Food Store,Trail,Pub,Yoga Studio,Diner,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
305,East Toronto,The Beaches,43.676357,-79.293031,Upper Beaches,43.680563,-79.292869,Neighborhood,0,Health Food Store,Trail,Pub,Yoga Studio,Diner,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
358,East YorkEast Toronto,The Danforth East,43.685347,-79.338106,The Path,43.683923,-79.335007,Park,2,Park,Coffee Shop,Convenience Store,Yoga Studio,Discount Store,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant


In [17]:
address = 'Toronto, ON'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


In [18]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Neighborhood Latitude'], toronto_merged['Neighborhood Longitude'], 
                                  toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [24]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]].head()

Unnamed: 0,Neighborhood,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
302,The Beaches,43.676821,-79.293942,Trail,0,Health Food Store,Trail,Pub,Yoga Studio,Diner,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
303,The Beaches,43.678879,-79.297734,Health Food Store,0,Health Food Store,Trail,Pub,Yoga Studio,Diner,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
304,The Beaches,43.679181,-79.297215,Pub,0,Health Food Store,Trail,Pub,Yoga Studio,Diner,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
305,The Beaches,43.680563,-79.292869,Neighborhood,0,Health Food Store,Trail,Pub,Yoga Studio,Diner,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
361,The Danforth West / Riverdale,43.67782,-79.351265,Cosmetics Shop,0,Greek Restaurant,Coffee Shop,Italian Restaurant,Bookstore,Furniture / Home Store,Restaurant,Ice Cream Shop,Caribbean Restaurant,Pub,Café


In [25]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]].head()

Unnamed: 0,Neighborhood,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
524,Moore Park / Summerhill East,43.69027,-79.383438,Park,1,Park,Playground,Trail,Restaurant,Comic Shop,Concert Hall,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
525,Moore Park / Summerhill East,43.690356,-79.386841,Trail,1,Park,Playground,Trail,Restaurant,Comic Shop,Concert Hall,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
526,Moore Park / Summerhill East,43.692816,-79.384504,Restaurant,1,Park,Playground,Trail,Restaurant,Comic Shop,Concert Hall,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
527,Moore Park / Summerhill East,43.69361,-79.383465,Playground,1,Park,Playground,Trail,Restaurant,Comic Shop,Concert Hall,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
542,Rosedale,43.682328,-79.378934,Playground,1,Park,Playground,Trail,Yoga Studio,Dessert Shop,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant


In [26]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]].head()

Unnamed: 0,Neighborhood,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
358,The Danforth East,43.683923,-79.335007,Park,2,Park,Coffee Shop,Convenience Store,Yoga Studio,Discount Store,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
359,The Danforth East,43.686951,-79.335007,Convenience Store,2,Park,Coffee Shop,Convenience Store,Yoga Studio,Discount Store,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
360,The Danforth East,43.688048,-79.333274,Coffee Shop,2,Park,Coffee Shop,Convenience Store,Yoga Studio,Discount Store,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant


In [27]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]].head()

Unnamed: 0,Neighborhood,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1425,Roselawn,43.709067,-79.415858,Fast Food Restaurant,3,Fast Food Restaurant,Home Service,Garden,Diner,Falafel Restaurant,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
1426,Roselawn,43.713891,-79.420702,Home Service,3,Fast Food Restaurant,Home Service,Garden,Diner,Falafel Restaurant,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
1427,Roselawn,43.712189,-79.411978,Garden,3,Fast Food Restaurant,Home Service,Garden,Diner,Falafel Restaurant,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant


In [34]:

body = client_14a477c36fda4eb083cdedeb43b34b20.get_object(Bucket='capstoneproject-donotdelete-pr-7gxmdvarzlqgag',Key='australian_postcodes.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )
aus_codes = pd.read_csv(body)
# keep postcodes in the ranges 3000-3207 or 8000-8399
mel_codes = aus_codes.loc[((aus_codes['postcode'] >= 3000) & (aus_codes['postcode'] <= 3207)) |
                          ((aus_codes['postcode'] >= 8000) & (aus_codes['postcode'] <= 8399))]
# reset the index and keep only the columns used downstream
mel_codes4 = mel_codes.reset_index(drop=True)[['postcode', 'locality', 'sa3name', 'lat', 'long']]
# drop rows with a zero latitude or longitude
mel_codes5 = mel_codes4.loc[(mel_codes4['lat'] != 0) & (mel_codes4['long'] != 0)]
mel_codes5.head()

Unnamed: 0,postcode,locality,sa3name,lat,long
0,3000,MELBOURNE,Melbourne City,-37.817403,144.956776
1,3001,MELBOURNE,Port Phillip,-37.817403,144.956776
2,3002,EAST MELBOURNE,Melbourne City,-37.818517,144.982207
3,3003,WEST MELBOURNE,Melbourne City,-37.810871,144.949592
4,3004,MELBOURNE,Port Phillip,-37.844246,144.970161
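
The range filter and the zero-coordinate check in the cell above can be tried on a tiny hand-made frame. This is only a sketch: the rows are made-up illustrative values, not records from the postcode file, and `toy`, `in_range`, and `has_coords` are hypothetical names.

```python
import pandas as pd

# Made-up rows for illustration only (not taken from australian_postcodes.csv)
toy = pd.DataFrame({
    "postcode": [2999, 3000, 3207, 3208, 8001, 8400],
    "locality": ["A", "B", "C", "D", "E", "F"],
    "lat": [-37.0, -37.8, 0.0, -37.9, -37.8, -37.7],
    "long": [144.0, 144.9, 0.0, 145.0, 144.9, 144.8],
})

# Series.between is inclusive on both ends by default, matching >= and <=
in_range = toy["postcode"].between(3000, 3207) | toy["postcode"].between(8000, 8399)
# keep rows whose latitude and longitude are both non-zero
has_coords = (toy["lat"] != 0) & (toy["long"] != 0)

toy_mel = toy.loc[in_range & has_coords].reset_index(drop=True)
print(toy_mel["postcode"].tolist())
```

Postcode 3207 falls inside the range but is dropped because its toy coordinates are zero, which mirrors the two-step filtering in the cell.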


In [35]:
melbourne_venues = getNearbyVenues(boroughs=mel_codes5["locality"], names=mel_codes5["sa3name"], 
                                   latitudes=mel_codes5["lat"], longitudes=mel_codes5["long"])
melbourne_venues.head()

Unnamed: 0,Borough,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,MELBOURNE,Melbourne City,-37.817403,144.956776,Virgin Active Health Club,-37.818806,144.955917,Gym / Fitness Center
1,MELBOURNE,Melbourne City,-37.817403,144.956776,The Lui Bar,-37.819067,144.957739,Cocktail Bar
2,MELBOURNE,Melbourne City,-37.817403,144.956776,Bonnie Coffee Brewers,-37.818153,144.957636,Coffee Shop
3,MELBOURNE,Melbourne City,-37.817403,144.956776,Brim CC,-37.817764,144.954732,Japanese Restaurant
4,MELBOURNE,Melbourne City,-37.817403,144.956776,Royal Stacks,-37.817867,144.958489,Burger Joint


In [36]:
melbourne_city_venues = melbourne_venues.loc[melbourne_venues['Borough'].str.contains('MELBOURNE')]
melbourne_city_venues.head()

Unnamed: 0,Borough,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,MELBOURNE,Melbourne City,-37.817403,144.956776,Virgin Active Health Club,-37.818806,144.955917,Gym / Fitness Center
1,MELBOURNE,Melbourne City,-37.817403,144.956776,The Lui Bar,-37.819067,144.957739,Cocktail Bar
2,MELBOURNE,Melbourne City,-37.817403,144.956776,Bonnie Coffee Brewers,-37.818153,144.957636,Coffee Shop
3,MELBOURNE,Melbourne City,-37.817403,144.956776,Brim CC,-37.817764,144.954732,Japanese Restaurant
4,MELBOURNE,Melbourne City,-37.817403,144.956776,Royal Stacks,-37.817867,144.958489,Burger Joint


In [37]:
melbourne_onehot = pd.get_dummies(melbourne_city_venues[["Venue Category"]], prefix = "", prefix_sep="")
melbourne_onehot["Neighborhood"] = melbourne_city_venues["Neighborhood"]
hood_index = melbourne_onehot.columns.get_loc("Neighborhood")

# move neighborhood column to the first column
fixed_columns = ["Neighborhood"] + list(melbourne_onehot.columns[:hood_index]) + list(melbourne_onehot.columns[hood_index+1:])
melbourne_onehot = melbourne_onehot[fixed_columns]
melbourne_onehot.head()
melbourne_grouped = melbourne_onehot.groupby("Neighborhood").mean().reset_index()
melbourne_grouped.head()

Unnamed: 0,Neighborhood,Aquarium,Art Gallery,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Basketball Court,Beer Bar,Bookstore,Breakfast Spot,Burger Joint,Café,Candy Store,Chaat Place,Climbing Gym,Cocktail Bar,Coffee Shop,College Gym,Comic Shop,Concert Hall,Convenience Store,Cricket Ground,Dessert Shop,Dim Sum Restaurant,Dive Bar,Donut Shop,Electronics Store,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Football Stadium,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Gastropub,General Entertainment,Grocery Store,Gym,Gym / Fitness Center,History Museum,Hockey Arena,Home Service,Hostel,Hotel,Indian Restaurant,Indie Theater,Indonesian Restaurant,Italian Restaurant,Japanese Restaurant,Juice Bar,Kebab Restaurant,Kitchen Supply Store,Korean Restaurant,Lake,Lebanese Restaurant,Liquor Store,Lounge,Malay Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Modern European Restaurant,Movie Theater,Moving Target,Museum,Nightclub,Paper / Office Supplies Store,Park,Persian Restaurant,Pet Store,Pharmacy,Pizza Place,Platform,Playground,Plaza,Pool,Portuguese Restaurant,Pub,Recreation Center,Rental Car Location,Restaurant,Salad Place,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Social Club,Spanish Restaurant,Sporting Goods Shop,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Tennis Court,Tennis Stadium,Thai Restaurant,Tourist Information Center,Train Station,Tram Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Wine Shop,Yoga Studio,Zoo Exhibit
0,Hobsons Bay,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Melbourne City,0.003968,0.007937,0.0,0.007937,0.003968,0.0,0.003968,0.007937,0.02381,0.003968,0.0,0.0,0.0,0.007937,0.15873,0.003968,0.003968,0.0,0.003968,0.075397,0.003968,0.003968,0.003968,0.02381,0.047619,0.003968,0.003968,0.0,0.003968,0.003968,0.0,0.007937,0.0,0.0,0.007937,0.003968,0.003968,0.0,0.0,0.0,0.003968,0.003968,0.007937,0.007937,0.003968,0.003968,0.0,0.007937,0.039683,0.019841,0.003968,0.003968,0.015873,0.031746,0.003968,0.007937,0.0,0.015873,0.0,0.003968,0.0,0.003968,0.0,0.003968,0.003968,0.003968,0.0,0.0,0.0,0.007937,0.0,0.0,0.015873,0.003968,0.0,0.0,0.011905,0.011905,0.003968,0.007937,0.003968,0.003968,0.031746,0.003968,0.0,0.015873,0.0,0.015873,0.0,0.003968,0.0,0.003968,0.003968,0.007937,0.007937,0.007937,0.003968,0.003968,0.011905,0.003968,0.015873,0.047619,0.011905,0.007937,0.007937,0.031746,0.007937,0.0,0.007937,0.003968,0.015873
2,Port Phillip,0.003802,0.0,0.007605,0.0,0.04943,0.007605,0.003802,0.022814,0.019011,0.0,0.007605,0.007605,0.053232,0.015209,0.1673,0.003802,0.003802,0.0,0.003802,0.060837,0.0,0.003802,0.0,0.003802,0.0,0.011407,0.003802,0.007605,0.003802,0.0,0.007605,0.015209,0.015209,0.007605,0.0,0.003802,0.011407,0.007605,0.015209,0.015209,0.0,0.007605,0.007605,0.003802,0.003802,0.0,0.003802,0.011407,0.011407,0.003802,0.0,0.003802,0.011407,0.019011,0.0,0.007605,0.007605,0.011407,0.045627,0.003802,0.007605,0.0,0.007605,0.019011,0.003802,0.003802,0.007605,0.007605,0.0,0.0,0.007605,0.007605,0.003802,0.0,0.015209,0.007605,0.003802,0.003802,0.0,0.0,0.0,0.003802,0.026616,0.0,0.007605,0.011407,0.007605,0.019011,0.0,0.0,0.007605,0.003802,0.003802,0.0,0.0,0.0,0.003802,0.003802,0.003802,0.003802,0.0,0.0,0.015209,0.0,0.0,0.007605,0.015209,0.007605,0.007605,0.0,0.0
3,Tullamarine - Broadmeadows,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
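
To make the numbers in the table above easier to read, here is a minimal sketch of the same one-hot-then-average step on a made-up venue list. The neighborhood names and venues are invented for illustration. Each value in the result is the share of a neighborhood's venues that fall in that category, so every row sums to 1.

```python
import pandas as pd

# Made-up venue list for illustration only
toy_venues = pd.DataFrame({
    "Neighborhood": ["North", "North", "North", "North", "South", "South"],
    "Venue Category": ["Café", "Café", "Park", "Pub", "Café", "Park"],
})

# one column per category, 1.0 where the venue belongs to that category
onehot = pd.get_dummies(toy_venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", toy_venues["Neighborhood"])

# averaging the indicator columns gives per-neighborhood category frequencies
grouped = onehot.groupby("Neighborhood").mean().reset_index()
print(grouped)
```

This matches the Hobsons Bay row in the table, where the only non-zero entries are 0.5, 0.25 and 0.25.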


In [38]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the 3rd use the generic 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
mel_neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
mel_neighborhoods_venues_sorted['Neighborhood'] = melbourne_grouped['Neighborhood']

for ind in np.arange(melbourne_grouped.shape[0]):
    mel_neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(melbourne_grouped.iloc[ind, :], num_top_venues)

mel_neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Hobsons Bay,Fast Food Restaurant,Convenience Store,Climbing Gym,Zoo Exhibit,Donut Shop,Farmers Market,Fish & Chips Shop,Fish Market,Football Stadium,French Restaurant
1,Melbourne City,Café,Coffee Shop,Tennis Stadium,Cricket Ground,Hotel,Pub,Tram Station,Japanese Restaurant,Convenience Store,Bar
2,Port Phillip,Café,Coffee Shop,Breakfast Spot,Australian Restaurant,Lake,Pub,Bakery,Sandwich Place,Bar,Japanese Restaurant
3,Tullamarine - Broadmeadows,Moving Target,Scenic Lookout,Dive Bar,Electronics Store,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Football Stadium,French Restaurant
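
The cell above relies on `return_most_common_venues`, which is defined elsewhere in the notebook. As a rough sketch of what such a helper might look like, the hypothetical function `top_venue_categories` below sorts one grouped row by frequency and returns the leading category names. The name, the stable sort, and the toy row are assumptions for illustration, not the notebook's actual definition.

```python
import pandas as pd

def top_venue_categories(row, num_top_venues):
    """Return the category names with the highest frequencies in one grouped row."""
    # skip the first entry (the neighborhood name) and treat the rest as numbers
    freqs = row.iloc[1:].astype(float)
    # mergesort is stable, so tied categories keep their column order
    ranked = freqs.sort_values(ascending=False, kind="mergesort")
    return ranked.index.values[:num_top_venues]

# Made-up row for illustration only
toy_row = pd.Series({"Neighborhood": "North", "Bar": 0.1, "Café": 0.5, "Park": 0.0, "Pub": 0.4})
top3 = list(top_venue_categories(toy_row, 3))
print(top3)
```

When a neighborhood has fewer distinct categories than `num_top_venues`, the remaining slots are filled with zero-frequency categories. That is the case for Hobsons Bay above, where only the first three entries correspond to venues that were actually returned.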


In [39]:
# set number of clusters
kclusters = 4

mel_grouped_clustering = melbourne_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(mel_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 
# add clustering labels
mel_neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

mel_merged = melbourne_city_venues

# join the per-neighborhood cluster labels and top venues onto the venue-level rows
mel_merged = mel_merged.join(mel_neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

mel_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,MELBOURNE,Melbourne City,-37.817403,144.956776,Virgin Active Health Club,-37.818806,144.955917,Gym / Fitness Center,3,Café,Coffee Shop,Tennis Stadium,Cricket Ground,Hotel,Pub,Tram Station,Japanese Restaurant,Convenience Store,Bar
1,MELBOURNE,Melbourne City,-37.817403,144.956776,The Lui Bar,-37.819067,144.957739,Cocktail Bar,3,Café,Coffee Shop,Tennis Stadium,Cricket Ground,Hotel,Pub,Tram Station,Japanese Restaurant,Convenience Store,Bar
2,MELBOURNE,Melbourne City,-37.817403,144.956776,Bonnie Coffee Brewers,-37.818153,144.957636,Coffee Shop,3,Café,Coffee Shop,Tennis Stadium,Cricket Ground,Hotel,Pub,Tram Station,Japanese Restaurant,Convenience Store,Bar
3,MELBOURNE,Melbourne City,-37.817403,144.956776,Brim CC,-37.817764,144.954732,Japanese Restaurant,3,Café,Coffee Shop,Tennis Stadium,Cricket Ground,Hotel,Pub,Tram Station,Japanese Restaurant,Convenience Store,Bar
4,MELBOURNE,Melbourne City,-37.817403,144.956776,Royal Stacks,-37.817867,144.958489,Burger Joint,3,Café,Coffee Shop,Tennis Stadium,Cricket Ground,Hotel,Pub,Tram Station,Japanese Restaurant,Convenience Store,Bar
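
One thing worth noting about this step: `melbourne_grouped` contains four neighborhoods and `kclusters` is also 4, so k-means has as many clusters as rows. The sketch below uses made-up frequency rows (not the Melbourne data) to illustrate what happens in that situation: every row becomes its own cluster. `n_init=10` is set explicitly as an assumption so the behavior does not depend on the scikit-learn version's default.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up frequency rows for illustration only: four "neighborhoods", four categories
toy_freqs = np.array([
    [0.50, 0.25, 0.25, 0.00],
    [0.10, 0.60, 0.20, 0.10],
    [0.00, 0.10, 0.10, 0.80],
    [0.30, 0.30, 0.30, 0.10],
])

# as many clusters as rows
toy_kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(toy_freqs)

# count how many different labels were assigned
n_distinct = len(set(toy_kmeans.labels_.tolist()))
print(n_distinct)
```

With one neighborhood per cluster, the labels identify neighborhoods rather than group similar ones, which is worth keeping in mind when reading the per-cluster tables.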


In [40]:
address = 'Melbourne, VIC'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Melbourne are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Melbourne are -37.8142176, 144.9631608.


In [41]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(mel_merged['Neighborhood Latitude'], mel_merged['Neighborhood Longitude'], 
                                  mel_merged['Neighborhood'], mel_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # cluster labels are 0-based, so index the palette directly
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [42]:
mel_merged.loc[mel_merged['Cluster Labels'] == 0, mel_merged.columns[[1] + list(range(5, mel_merged.shape[1]))]].head()

Unnamed: 0,Neighborhood,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5717,Hobsons Bay,-37.82978,144.917327,Fast Food Restaurant,0,Fast Food Restaurant,Convenience Store,Climbing Gym,Zoo Exhibit,Donut Shop,Farmers Market,Fish & Chips Shop,Fish Market,Football Stadium,French Restaurant
5718,Hobsons Bay,-37.82953,144.91709,Fast Food Restaurant,0,Fast Food Restaurant,Convenience Store,Climbing Gym,Zoo Exhibit,Donut Shop,Farmers Market,Fish & Chips Shop,Fish Market,Football Stadium,French Restaurant
5719,Hobsons Bay,-37.829827,144.916305,Convenience Store,0,Fast Food Restaurant,Convenience Store,Climbing Gym,Zoo Exhibit,Donut Shop,Farmers Market,Fish & Chips Shop,Fish Market,Football Stadium,French Restaurant
5720,Hobsons Bay,-37.83236,144.92181,Climbing Gym,0,Fast Food Restaurant,Convenience Store,Climbing Gym,Zoo Exhibit,Donut Shop,Farmers Market,Fish & Chips Shop,Fish Market,Football Stadium,French Restaurant


In [43]:
mel_merged.loc[mel_merged['Cluster Labels'] == 1, mel_merged.columns[[1] + list(range(5, mel_merged.shape[1]))]].head()

Unnamed: 0,Neighborhood,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
77,Port Phillip,-37.818806,144.955917,Gym / Fitness Center,1,Café,Coffee Shop,Breakfast Spot,Australian Restaurant,Lake,Pub,Bakery,Sandwich Place,Bar,Japanese Restaurant
78,Port Phillip,-37.819067,144.957739,Cocktail Bar,1,Café,Coffee Shop,Breakfast Spot,Australian Restaurant,Lake,Pub,Bakery,Sandwich Place,Bar,Japanese Restaurant
79,Port Phillip,-37.818153,144.957636,Coffee Shop,1,Café,Coffee Shop,Breakfast Spot,Australian Restaurant,Lake,Pub,Bakery,Sandwich Place,Bar,Japanese Restaurant
80,Port Phillip,-37.817764,144.954732,Japanese Restaurant,1,Café,Coffee Shop,Breakfast Spot,Australian Restaurant,Lake,Pub,Bakery,Sandwich Place,Bar,Japanese Restaurant
81,Port Phillip,-37.817867,144.958489,Burger Joint,1,Café,Coffee Shop,Breakfast Spot,Australian Restaurant,Lake,Pub,Bakery,Sandwich Place,Bar,Japanese Restaurant


In [44]:
mel_merged.loc[mel_merged['Cluster Labels'] == 2, mel_merged.columns[[1] + list(range(5, mel_merged.shape[1]))]].head()

Unnamed: 0,Neighborhood,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1071,Tullamarine - Broadmeadows,-37.677851,144.835389,Scenic Lookout,2,Moving Target,Scenic Lookout,Dive Bar,Electronics Store,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Football Stadium,French Restaurant
1072,Tullamarine - Broadmeadows,-37.674939,144.831147,Moving Target,2,Moving Target,Scenic Lookout,Dive Bar,Electronics Store,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Football Stadium,French Restaurant


In [45]:
mel_merged.loc[mel_merged['Cluster Labels'] == 3, mel_merged.columns[[1] + list(range(5, mel_merged.shape[1]))]].head()

Unnamed: 0,Neighborhood,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Melbourne City,-37.818806,144.955917,Gym / Fitness Center,3,Café,Coffee Shop,Tennis Stadium,Cricket Ground,Hotel,Pub,Tram Station,Japanese Restaurant,Convenience Store,Bar
1,Melbourne City,-37.819067,144.957739,Cocktail Bar,3,Café,Coffee Shop,Tennis Stadium,Cricket Ground,Hotel,Pub,Tram Station,Japanese Restaurant,Convenience Store,Bar
2,Melbourne City,-37.818153,144.957636,Coffee Shop,3,Café,Coffee Shop,Tennis Stadium,Cricket Ground,Hotel,Pub,Tram Station,Japanese Restaurant,Convenience Store,Bar
3,Melbourne City,-37.817764,144.954732,Japanese Restaurant,3,Café,Coffee Shop,Tennis Stadium,Cricket Ground,Hotel,Pub,Tram Station,Japanese Restaurant,Convenience Store,Bar
4,Melbourne City,-37.817867,144.958489,Burger Joint,3,Café,Coffee Shop,Tennis Stadium,Cricket Ground,Hotel,Pub,Tram Station,Japanese Restaurant,Convenience Store,Bar
