<a href="https://colab.research.google.com/github/MaguireMaName/Coursera_Capstone/blob/master/The_Battle_of_Neighborhoods.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Capstone: Battle of the Neighbourhoods

##Introduction/Business Problem

This project will bring together neighboured information on venues (from Foursquare) and neighbourhood crime statistics to find complementary neighbourhood concerning local venues, but to compare and contrast alike neighbourhoods with crime statistics.

This information will inform a business who wishes to set up shop in a reasonably safe neighbourhood or enable an individual to identify a venue to visit in a reasonably safe neighbourhood.

## Data description

The Foursquare data will be used to identify similar neighbourhoods by the frequency of venues by type. This alikeness will allow businesses to identify those neighbourhoods with relatively high levels of competition. For example, if I wanted to set up a coffee shop, I might want to avoid a neighbourhood saturated with coffee shops!

Furthermore, the value of the Foursquare dataset is to be extended by the inclusion of local crime dataset alongside resident population dataset. The integration of these datasets will enable business or individuals to identify further those venues and neighbourhoods that intersect with unreasonably high incidences of crime on a per capita basis.

This information will enable business or individuals to identify on a map those neighbourhoods based on the type of venues and the incidence of crime per capita.  Allowing businesses to target and assess the potential location of their operations by frequency of venues (i.e., competition) and low crime per capita.  Similarly, individuals can use the information to assess what venues to visit in what neighbourhood by type of venue and the incidence of crime in that neighbourhood.

In [0]:
# !pip install geocoder

In [0]:
# load dependancies

import pandas as pd
import numpy as np
import geocoder
import folium
from geopy.geocoders import Nominatim
import requests
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors

In [7]:
cbr = pd.read_csv('Canberra suburbs.csv')
print(cbr.shape)
cbr.head()

(154, 4)


Unnamed: 0,Neighborhood,Postcode,Country,Region
0,Barton,2600,Australia,Australian Capital Territory
1,Canberra,2600,Australia,Australian Capital Territory
2,Page,2614,Australia,Australian Capital Territory
3,City,2601,Australia,Australian Capital Territory
4,Canberra,2601,Australia,Australian Capital Territory


In [0]:
Lat_list=[]
Lng_list=[]

for i in range(cbr.shape[0]):
    address='{}, Canberra, Australia'.format(cbr.at[i,'Neighborhood'])
    g = geocoder.arcgis(address)
    Lat_list.append(g.latlng[0])
    Lng_list.append(g.latlng[1])

In [13]:
for i in range(cbr.shape[0]): 
  
    cbr['Latitude'] = Lat_list
    cbr['Longitude'] = Lng_list
    
print(cbr.shape)
cbr.head()

(154, 6)


Unnamed: 0,Neighborhood,Postcode,Country,Region,Latitude,Longitude
0,Barton,2600,Australia,Australian Capital Territory,-35.30829,149.13354
1,Canberra,2600,Australia,Australian Capital Territory,-35.30654,149.12655
2,Page,2614,Australia,Australian Capital Territory,-35.23954,149.04826
3,City,2601,Australia,Australian Capital Territory,-35.28007,149.13093
4,Canberra,2601,Australia,Australian Capital Territory,-35.30654,149.12655


In [14]:
address = 'Canberra, Australian Capital Territory'

geolocator = Nominatim(user_agent="canberra_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

# create map of Canberra using latitude and longitude values
map_cbr = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, label in zip(cbr['Latitude'], cbr['Longitude'], cbr['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='blue',
        fill_opacity=0.8,
        parse_html=False).add_to(map_cbr)  
    
map_cbr

In [0]:
# define Foursquare Credentials and Version

client_id = 'KL5SVGOS40RKZBQK4G1VXYBKBICWCDQL2NMCASHFYER432SS' #  Foursquare ID'
client_secret = '1A5KPYJQIATH0SDZXPPZ5YK0SHLBYVEGPER5AAIIMDXLZ0AB' #  Foursquare Secret
version = '20180604'
limit = 30

In [0]:
# let's create a function to repeat the same process to all the neighborhoods in toronto

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            client_id, 
            client_secret, 
            version, 
            lat, 
            lng, 
            radius, 
            limit)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [17]:

# run the above function on each neighborhood and create a new dataframe called toronto_venues

cbr_venues = getNearbyVenues(names=cbr['Neighborhood'],
                                   latitudes=cbr['Latitude'],
                                   longitudes=cbr['Longitude'],
                                  )

Barton
Canberra
Page
City
Canberra
Barton
Hmas Creswell
Jervis Bay
Wreck Bay
Hmas Harman
Parliament House
Yarralumla
Harman
Capital Hill
Russell
Parkes
Deakin
Acton
Watson
Hackett
Ainslie
O'Connor
Downer
Dickson
Lyneham
Manuka
Forrest
Griffith
Red Hill
Kingston
Narrabundah
Causeway
Garran
Hughes
Curtin
Swinger Hill
Woden
O'Malley
Chifley
Lyons
Phillip
Torrens
Isaacs
Mawson
Pearce
Farrer
Civic Square
Canberra Airport
Majura
Pialligo
Fyshwick
Canberra Mc
Uriarra
Uriarra Village
Wright
Duffy
Weston Creek
Weston
Coree
Fisher
Coombs
Stromlo
Turner
Braddon
Campbell
Reid
Jamison Centre
Weetangera
Scullin
Macquarie
Cook
Aranda
Hawker
Kippax Centre
Kippax
Florey
Dunlop
Macgregor
Latham
Charnwood
Fraser
Melba
Flynn
Higgins
Holt
Spence
Belconnen
Belconnen DC
Bruce
Lawson
Belconnen
Hall
Paddys River
Kowen
Hume
Oaks Estate
Beard
Kambah Village
Oxley
Macarthur
Monash
Crace
Kinlyside
Franklin
Taylor
Casey
Moncrieff
Harrison
Jacka
Forde
Bonner
Australian National University
Deakin West
Duntroon
Black 

In [18]:
# check dimensions and data

print(cbr_venues.shape)
cbr_venues.head()

(901, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Barton,-35.30829,149.13354,Ottoman Cuisine,-35.305615,149.136704,Turkish Restaurant
1,Barton,-35.30829,149.13354,Ostani,-35.311509,149.133533,Hotel Bar
2,Barton,-35.30829,149.13354,Little Bird,-35.306099,149.135331,Café
3,Barton,-35.30829,149.13354,LiloTang,-35.311991,149.133847,Japanese Restaurant
4,Barton,-35.30829,149.13354,National Archives of Australia,-35.304637,149.131004,History Museum


In [20]:
# treat cafe and coffee shops as the same

cbr_venues.replace({'Venue Category': 'Café'}, {'Venue Category': 'Coffee shop'}, regex=True)

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Barton,-35.308290,149.133540,Ottoman Cuisine,-35.305615,149.136704,Turkish Restaurant
1,Barton,-35.308290,149.133540,Ostani,-35.311509,149.133533,Hotel Bar
2,Barton,-35.308290,149.133540,Little Bird,-35.306099,149.135331,Café
3,Barton,-35.308290,149.133540,LiloTang,-35.311991,149.133847,Japanese Restaurant
4,Barton,-35.308290,149.133540,National Archives of Australia,-35.304637,149.131004,History Museum
5,Barton,-35.308290,149.133540,Hotel Realm,-35.311592,149.133483,Hotel
6,Barton,-35.308290,149.133540,Little National,-35.310967,149.131799,Hotel
7,Barton,-35.308290,149.133540,Maple + Clove,-35.312387,149.133460,Café
8,Barton,-35.308290,149.133540,Hideout,-35.306748,149.133673,Coffee Shop
9,Barton,-35.308290,149.133540,Burbury Hotel & Apartments,-35.311741,149.133716,Hotel


In [15]:
# the number of venues returned for each neighborhood

cbr_venues.groupby('Neighborhood').count()

print('There are {} unique venue categories.'.format(len(cbr_venues['Venue Category'].unique())))
print(cbr_venues.head())

There are 166 unique venue categories.
  Neighborhood  Neighborhood Latitude  ...  Venue Longitude       Venue Category
0       Barton              -35.30829  ...       149.136704   Turkish Restaurant
1       Barton              -35.30829  ...       149.133533            Hotel Bar
2       Barton              -35.30829  ...       149.135331                 Café
3       Barton              -35.30829  ...       149.133847  Japanese Restaurant
4       Barton              -35.30829  ...       149.131004       History Museum

[5 rows x 7 columns]


In [16]:
# analyse each neighbourhood

# one hot encoding
cbr_onehot = pd.get_dummies(cbr_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
cbr_onehot['Neighborhood'] = cbr_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [cbr_onehot.columns[-1]] + list(cbr_onehot.columns[:-1])
cbr_onehot = cbr_onehot[fixed_columns]

cbr_onehot.head()

Unnamed: 0,Neighborhood,Airport,Airport Lounge,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Baby Store,Bakery,Bar,Baseball Field,Beer Bar,Bistro,Boat or Ferry,Bookstore,Breakfast Spot,Brewery,Burger Joint,Burmese Restaurant,Burrito Place,Bus Station,Bus Stop,Business Service,Café,Campaign Office,Cantonese Restaurant,Chinese Restaurant,Chocolate Shop,Cocktail Bar,Coffee Shop,Comic Shop,Concert Hall,Convenience Store,Cricket Ground,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,...,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Soccer Field,Social Club,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Sports Club,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Thrift / Vintage Store,Tiki Bar,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Yoga Studio
0,Barton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
1,Barton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Barton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Barton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Barton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [17]:
cbr_grouped = cbr_onehot.groupby('Neighborhood').mean().reset_index()
cbr_grouped

Unnamed: 0,Neighborhood,Airport,Airport Lounge,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Baby Store,Bakery,Bar,Baseball Field,Beer Bar,Bistro,Boat or Ferry,Bookstore,Breakfast Spot,Brewery,Burger Joint,Burmese Restaurant,Burrito Place,Bus Station,Bus Stop,Business Service,Café,Campaign Office,Cantonese Restaurant,Chinese Restaurant,Chocolate Shop,Cocktail Bar,Coffee Shop,Comic Shop,Concert Hall,Convenience Store,Cricket Ground,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,...,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Soccer Field,Social Club,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Sports Club,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Thrift / Vintage Store,Tiki Bar,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Yoga Studio
0,Acton,0.000000,0.000000,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.111111,0.00,0.000000,0.000000,0.00,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.222222,0.000000,0.000000,0.00,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000
1,Ainslie,0.000000,0.000000,0.0,0.0,0.0,0.00,0.142857,0.000000,0.142857,0.000000,0.00,0.000000,0.000000,0.00,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.142857,0.000000,0.000000,0.00,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.000000,0.000000,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000
2,Amaroo,0.000000,0.000000,0.0,0.0,0.0,1.00,0.000000,0.000000,0.000000,0.000000,0.00,0.000000,0.000000,0.00,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.00,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000
3,Aranda,0.000000,0.000000,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.200000,0.00,0.000000,0.000000,0.00,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.400000,0.000000,0.000000,0.00,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.200000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000
4,Australian National University,0.000000,0.000000,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.000000,0.00,0.033333,0.000000,0.00,0.033333,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.166667,0.000000,0.000000,0.00,0.000000,0.033333,0.200000,0.000000,0.033333,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000,0.033333,0.033333,0.033333,0.0,0.033333,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.033333,0.033333,0.033333,0.000000,0.000000
5,Banks,0.000000,0.000000,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.000000,0.00,0.000000,0.000000,0.00,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.00,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000
6,Barton,0.000000,0.000000,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.000000,0.00,0.000000,0.041667,0.00,0.000000,0.041667,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.250000,0.000000,0.041667,0.00,0.000000,0.000000,0.083333,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.041667,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.041667,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000
7,Beard,0.000000,0.000000,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.000000,0.00,0.000000,0.000000,0.00,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.00,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.333333,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000
8,Belconnen,0.000000,0.000000,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.000000,0.00,0.000000,0.000000,0.00,0.033333,0.000000,0.0,0.033333,0.0,0.066667,0.000000,0.000000,0.0,0.066667,0.000000,0.000000,0.00,0.000000,0.000000,0.100000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.033333,...,0.033333,0.00,0.0,0.000000,0.033333,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.033333,0.033333,0.000000,0.000000,0.000000,0.0,0.033333,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.033333,0.000000,0.000000,0.000000,0.033333
9,Belconnen DC,0.000000,0.000000,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.000000,0.00,0.000000,0.000000,0.00,0.033333,0.000000,0.0,0.033333,0.0,0.066667,0.000000,0.000000,0.0,0.066667,0.000000,0.000000,0.00,0.000000,0.000000,0.100000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.033333,...,0.033333,0.00,0.0,0.000000,0.033333,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.033333,0.033333,0.000000,0.000000,0.000000,0.0,0.033333,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.033333,0.000000,0.000000,0.000000,0.033333


In [19]:
# top 3 frequencies

num_top_venues = 10

for hood in cbr_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = cbr_grouped[cbr_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Acton----
                 venue  freq
0                 Café  0.22
1                Hotel  0.22
2  Indie Movie Theater  0.11
3            Hotel Bar  0.11
4                  Bar  0.11
5               Museum  0.11
6        Movie Theater  0.11
7              Airport  0.00
8   Photography Studio  0.00
9         Outlet Store  0.00


----Ainslie----
                   venue  freq
0                   Café  0.14
1      Fish & Chips Shop  0.14
2                    Pub  0.14
3  Australian Restaurant  0.14
4         Shopping Plaza  0.14
5                 Bakery  0.14
6          Grocery Store  0.14
7     Photography Studio  0.00
8   Outdoor Supply Store  0.00
9           Outlet Store  0.00


----Amaroo----
                           venue  freq
0             Athletics & Sports   1.0
1                        Airport   0.0
2                    Planetarium   0.0
3           Outdoor Supply Store   0.0
4                   Outlet Store   0.0
5  Paper / Office Supplies Store   0.0
6                 

In [0]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [21]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = cbr_grouped['Neighborhood']

for ind in np.arange(cbr_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(cbr_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Acton,Café,Hotel,Indie Movie Theater,Bar,Movie Theater,Museum,Hotel Bar,Hostel,Food Truck,Gaming Cafe
1,Ainslie,Pub,Fish & Chips Shop,Australian Restaurant,Bakery,Grocery Store,Café,Shopping Plaza,Food Truck,Gaming Cafe,Furniture / Home Store
2,Amaroo,Athletics & Sports,Yoga Studio,Food Court,Garden,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant
3,Aranda,Café,Recreation Center,Dance Studio,Bar,Food Court,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint
4,Australian National University,Coffee Shop,Café,Food Truck,Whisky Bar,Thai Restaurant,Gaming Cafe,Cocktail Bar,Wine Bar,Steakhouse,Concert Hall


In [22]:
# cluster neighbourhoods

# set number of clusters
kclusters = 5

cbr_grouped_clustering = cbr_grouped.drop('Neighborhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(cbr_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:5]

array([3, 0, 0, 3, 0], dtype=int32)

In [0]:
# add clustering labels
neighborhoods_venues_sorted.insert(1, 'Cluster Labels', kmeans.labels_)

In [24]:
neighborhoods_venues_sorted.tail()

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
130,Weetangera,4,Yoga Studio,Bus Station,Ice Cream Shop,IT Services,Garden,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint
131,Weston,3,Café,Sandwich Place,Fast Food Restaurant,Cricket Ground,Yoga Studio,Food Court,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint
132,Woden,0,Toy / Game Store,Furniture / Home Store,Home Service,Paper / Office Supplies Store,Yoga Studio,Gaming Cafe,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant
133,Wreck Bay,3,Café,Gift Shop,Gym / Fitness Center,History Museum,Food Truck,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint
134,Wright,2,River,Yoga Studio,Food & Drink Shop,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck


In [25]:
# merge dataset and check output

cbr_merged = pd.merge(cbr, neighborhoods_venues_sorted, on='Neighborhood')
cbr_merged.tail()

Unnamed: 0,Neighborhood,Postcode,Country,Region,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
133,Ginninderra Village,2913,Australia,Australian Capital Territory,-35.231177,149.081971,3,Café,Yoga Studio,Food Truck,Garden,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant
134,Palmerston,2913,Australia,Australian Capital Territory,-35.19725,149.11758,0,Bus Station,Grocery Store,Baseball Field,Bus Stop,Filipino Restaurant,Gas Station,Garden Center,Garden,Gaming Cafe,Furniture / Home Store
135,Ngunnawal,2913,Australia,Australian Capital Territory,-35.17319,149.10802,0,Grocery Store,Bus Station,Fast Food Restaurant,Southern / Soul Food Restaurant,Food Truck,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint
136,Nicholls,2913,Australia,Australian Capital Territory,-35.18418,149.09916,0,Resort,Grocery Store,Soccer Field,Event Space,Farmers Market,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant
137,Amaroo,2914,Australia,Australian Capital Territory,-35.16922,149.12637,0,Athletics & Sports,Yoga Studio,Food Court,Garden,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant


In [26]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=10)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(cbr_merged['Latitude'], cbr_merged['Longitude'], cbr_merged['Neighborhood'], cbr_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [27]:
# cluster 0
cbr_merged.loc[cbr_merged['Cluster Labels'] == 0, cbr_merged.columns[[1] + list(range(5, cbr_merged.shape[1]))]]

Unnamed: 0,Postcode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,2614,149.048260,0,Bus Station,Steakhouse,Vietnamese Restaurant,Motel,Food Truck,Garden,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop
5,2601,149.130930,0,Coffee Shop,Café,Record Shop,Theater,Electronics Store,Restaurant,Comic Shop,Cocktail Bar,Shopping Mall,Chocolate Shop
9,2600,149.200910,0,Gym,Yoga Studio,Food Truck,Garden,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant
11,2600,149.200910,0,Gym,Yoga Studio,Food Truck,Garden,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant
17,2602,149.157310,0,Café,Filipino Restaurant,Fish & Chips Shop,Shopping Plaza,Grocery Store,Yoga Studio,Food Truck,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop
19,2602,149.146550,0,Pub,Fish & Chips Shop,Australian Restaurant,Bakery,Grocery Store,Café,Shopping Plaza,Food Truck,Gaming Cafe,Furniture / Home Store
20,2602,149.115310,0,Health & Beauty Service,Bus Stop,Sports Club,Business Service,Fast Food Restaurant,Filipino Restaurant,Garden,Event Space,Gaming Cafe,Furniture / Home Store
22,2602,149.142950,0,Café,Pool,Vegetarian / Vegan Restaurant,Bus Stop,Social Club,Bakery,Department Store,Pharmacy,Hostel,Hotel
23,2602,149.126790,0,Pub,Gym,Chinese Restaurant,Fruit & Vegetable Store,Athletics & Sports,Tennis Court,Hockey Arena,Café,Gym / Fitness Center,Furniture / Home Store
25,2603,149.128440,0,Hotel,Athletics & Sports,Australian Restaurant,Tennis Court,Martial Arts Dojo,Yoga Studio,Food Truck,Garden,Gaming Cafe,Furniture / Home Store


In [28]:
# cluster 1
cbr_merged.loc[cbr_merged['Cluster Labels'] == 1, cbr_merged.columns[[1] + list(range(5, cbr_merged.shape[1]))]]

Unnamed: 0,Postcode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
57,2612,149.16005,1,Bus Stop,Food Truck,Garden Center,Garden,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant
69,2615,149.02221,1,Bus Stop,Food Truck,Garden Center,Garden,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant
90,2904,149.09226,1,Bus Stop,Food Truck,Garden Center,Garden,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant


In [29]:
# cluster 2
cbr_merged.loc[cbr_merged['Cluster Labels'] == 2, cbr_merged.columns[[1] + list(range(5, cbr_merged.shape[1]))]]

Unnamed: 0,Postcode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
50,2611,149.04089,2,River,Yoga Studio,Food & Drink Shop,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck
54,2611,149.0424,2,River,Yoga Studio,Food & Drink Shop,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck


In [30]:
# cluster 3
cbr_merged.loc[cbr_merged['Cluster Labels'] == 3, cbr_merged.columns[[1] + list(range(5, cbr_merged.shape[1]))]]

Unnamed: 0,Postcode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,2600,149.13354,3,Café,Hotel,Coffee Shop,Bistro,French Restaurant,Gym,Cantonese Restaurant,Event Space,History Museum,Breakfast Spot
1,221,149.13354,3,Café,Hotel,Coffee Shop,Bistro,French Restaurant,Gym,Cantonese Restaurant,Event Space,History Museum,Breakfast Spot
2,2600,149.12655,3,Café,Gift Shop,Gym / Fitness Center,History Museum,Food Truck,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint
3,2601,149.12655,3,Café,Gift Shop,Gym / Fitness Center,History Museum,Food Truck,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint
6,2540,149.144621,3,Pub,Café,Trail,Park,Yoga Studio,Food Court,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint
7,2540,149.108064,3,Café,Grocery Store,Sports Bar,Gas Station,Seafood Restaurant,Pizza Place,Gym,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint
8,2540,149.12655,3,Café,Gift Shop,Gym / Fitness Center,History Museum,Food Truck,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint
10,2600,149.12668,3,Café,Gift Shop,Gym / Fitness Center,History Museum,Food Truck,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint
12,2600,149.12655,3,Café,Gift Shop,Gym / Fitness Center,History Museum,Food Truck,Gaming Cafe,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint
13,2600,149.14992,3,Café,Plaza,Playground,Coffee Shop,Yoga Studio,Food Truck,Furniture / Home Store,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint


In [0]:
# cluster 4
cbr_merged.loc[cbr_merged['Cluster Labels'] == 4, cbr_merged.columns[[1] + list(range(5, cbr_merged.shape[1]))]]

Unnamed: 0,Postcode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
70,2615,149.0443,4,Convenience Store,Yoga Studio,Fish Market,Fruit & Vegetable Store,Frozen Yogurt Shop
110,2617,149.09891,4,Convenience Store,Yoga Studio,Fish Market,Fruit & Vegetable Store,Frozen Yogurt Shop
124,2905,149.12418,4,Convenience Store,Yoga Studio,Fish Market,Fruit & Vegetable Store,Frozen Yogurt Shop


In [0]:
# cluster 5
cbr_merged.loc[cbr_merged['Cluster Labels'] == 5, cbr_merged.columns[[1] + list(range(5, cbr_merged.shape[1]))]]

Unnamed: 0,Postcode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
