<a href="https://colab.research.google.com/github/MaguireMaName/Coursera_Capstone/blob/master/The_Battle_of_Neighborhoods.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Capstone: Battle of the Neighbourhoods

## Introduction/Business Problem

This project brings together neighbourhood information on venues (from Foursquare) and neighbourhood crime statistics. The aim is first to find neighbourhoods that are alike in their local venues, and then to compare and contrast those similar neighbourhoods using crime statistics.

This information will inform a business that wishes to set up shop in a reasonably safe neighbourhood, or enable an individual to identify a venue to visit in a reasonably safe neighbourhood.

## Data description

The Foursquare data will be used to identify similar neighbourhoods by the frequency of venues by type. This similarity will allow businesses to identify neighbourhoods with relatively high levels of competition. For example, if I wanted to set up a coffee shop, I might want to avoid a neighbourhood saturated with coffee shops!

Furthermore, the value of the Foursquare dataset is extended by adding a local crime dataset and a resident population dataset. Integrating these datasets will enable businesses or individuals to further identify venues and neighbourhoods with unreasonably high incidences of crime on a per capita basis.

This information will enable businesses or individuals to identify neighbourhoods on a map by type of venue and incidence of crime per capita. Businesses can then target and assess potential locations for their operations by frequency of venues (i.e., competition) and low crime per capita. Similarly, individuals can use the information to decide which venues to visit in which neighbourhood, by type of venue and the incidence of crime in that neighbourhood.
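
As a rough illustration of the per capita comparison described above, the sketch below joins a crime table to a population table and computes offences per 1,000 residents. The column names (`Offences`, `Residents`) and all figures are made-up placeholders, not values from the datasets used in this notebook.

```python
import pandas as pd

# Toy figures only: the column names and numbers below are illustrative
# assumptions, not values from the notebook's datasets.
crime = pd.DataFrame({
    "Neighborhood": ["Alpha", "Beta", "Gamma"],
    "Offences": [120, 45, 300],
})
population = pd.DataFrame({
    "Neighborhood": ["Alpha", "Beta", "Gamma"],
    "Residents": [4000, 1500, 20000],
})

# Join the two tables on the neighbourhood name.
rates = crime.merge(population, on="Neighborhood", how="inner")

# Offences per 1,000 residents makes neighbourhoods of different sizes comparable.
rates["Offences per 1000"] = rates["Offences"] / rates["Residents"] * 1000

print(rates.sort_values("Offences per 1000"))
```

Scaling by population matters because a large suburb with many recorded offences can still have a lower rate than a small suburb with few.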

In [0]:
# !pip install geocoder

In [3]:
# load dependencies

import pandas as pd
import numpy as np
import geocoder
import folium
from geopy.geocoders import Nominatim
import requests
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors



In [4]:
cbr = pd.read_csv('Canberra suburbs.csv')
print(cbr.shape)
cbr.head()

(155, 4)


Unnamed: 0,Neighborhood,Postcode,Country,Region
0,Barton,2600,Australia,Australian Capital Territory
1,Canberra,2600,Australia,Australian Capital Territory
2,Page,2614,Australia,Australian Capital Territory
3,City,2601,Australia,Australian Capital Territory
4,Canberra,2601,Australia,Australian Capital Territory


In [0]:
Lat_list=[]
Lng_list=[]

for i in range(cbr.shape[0]):
    address='{}, Canberra, Australia'.format(cbr.at[i,'Neighborhood'])
    g = geocoder.arcgis(address)
    Lat_list.append(g.latlng[0])
    Lng_list.append(g.latlng[1])
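
The loop above indexes `g.latlng` directly, which would fail if a lookup returned no match. A minimal sketch of a guard follows, assuming a failed lookup leaves `latlng` empty or `None`; the helper `latlng_or_nan` and the `FakeResult` stand-in are hypothetical names used only for illustration.

```python
import math

def latlng_or_nan(result):
    """Return (lat, lng) from a geocoder result, or (nan, nan) if the lookup failed."""
    latlng = getattr(result, "latlng", None)
    if not latlng:
        return (math.nan, math.nan)
    return (latlng[0], latlng[1])

# Stand-in objects that mimic the `latlng` attribute used in the loop above.
class FakeResult:
    def __init__(self, latlng):
        self.latlng = latlng

found = latlng_or_nan(FakeResult([-35.30829, 149.13354]))
missing = latlng_or_nan(FakeResult(None))
print(found, missing)
```

Recording `nan` instead of raising keeps the coordinate lists the same length as the dataframe, so failed rows can be inspected or dropped afterwards.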

In [7]:
# add the geocoded coordinates as new columns (no loop needed)
cbr['Latitude'] = Lat_list
cbr['Longitude'] = Lng_list

print(cbr.shape)
cbr.head()

(155, 6)


Unnamed: 0,Neighborhood,Postcode,Country,Region,Latitude,Longitude
0,Barton,2600,Australia,Australian Capital Territory,-35.30829,149.13354
1,Canberra,2600,Australia,Australian Capital Territory,-35.30654,149.12655
2,Page,2614,Australia,Australian Capital Territory,-35.23954,149.04826
3,City,2601,Australia,Australian Capital Territory,-35.28007,149.13093
4,Canberra,2601,Australia,Australian Capital Territory,-35.30654,149.12655


In [8]:
address = 'Canberra, Australian Capital Territory'

geolocator = Nominatim(user_agent="canberra_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

# create map of Canberra using latitude and longitude values
map_cbr = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, label in zip(cbr['Latitude'], cbr['Longitude'], cbr['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='blue',
        fill_opacity=0.8,
        parse_html=False).add_to(map_cbr)  
    
map_cbr

In [0]:
# define Foursquare Credentials and Version

client_id = 'KL5SVGOS40RKZBQK4G1VXYBKBICWCDQL2NMCASHFYER432SS' # Foursquare ID
client_secret = '1A5KPYJQIATH0SDZXPPZ5YK0SHLBYVEGPER5AAIIMDXLZ0AB' # Foursquare Secret
version = '20180604'
limit = 100

In [0]:
# let's create a function to repeat the same process for all the neighbourhoods in Canberra

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            client_id, 
            client_secret, 
            version, 
            lat, 
            lng, 
            radius, 
            limit)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
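
The function above assumes every response contains `response`, `groups`, and at least one category per venue. The sketch below shows one way the same fields could be read more defensively. The helper name `extract_venue_rows` and the sample payload are hypothetical; the payload only imitates the keys read in `getNearbyVenues` and is not a real API response.

```python
def extract_venue_rows(payload, name, lat, lng):
    """Flatten one explore-style response into rows, tolerating missing pieces."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories", [])
        # Venues without a category get None instead of raising IndexError.
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"), category))
    return rows

# Hand-written payload that imitates the fields read in getNearbyVenues.
# "Uncategorised place" is an invented entry used to exercise the empty-category path.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Little Bird",
               "location": {"lat": -35.306099, "lng": 149.135331},
               "categories": [{"name": "Café"}]}},
    {"venue": {"name": "Uncategorised place",
               "location": {"lat": -35.3, "lng": 149.1},
               "categories": []}},
]}]}}

rows = extract_venue_rows(sample, "Barton", -35.30829, 149.13354)
empty = extract_venue_rows({"meta": {"code": 429}}, "Barton", -35.30829, 149.13354)
print(rows, empty)
```

With this shape, a failed or rate-limited request yields an empty list for that neighbourhood rather than stopping the whole run.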

In [12]:
# run the above function on each neighborhood and create a new dataframe called cbr_venues

cbr_venues = getNearbyVenues(names=cbr['Neighborhood'],
                                   latitudes=cbr['Latitude'],
                                   longitudes=cbr['Longitude']
                                  )

Barton
Canberra
Page
City
Canberra
Barton
Hmas Creswell
Jervis Bay
Wreck Bay
Hmas Harman
Parliament House
Yarralumla
Harman
Capital Hill
Russell
Parkes
Deakin
Acton
Watson
Hackett
Ainslie
O'Connor
Downer
Dickson
Lyneham
Manuka
Forrest
Griffith
Red Hill
Kingston
Narrabundah
Causeway
Garran
Hughes
Curtin
Swinger Hill
Woden
O'Malley
Chifley
Lyons
Phillip
Torrens
Isaacs
Mawson
Pearce
Farrer
Civic Square
Canberra Airport
Majura
Pialligo
Fyshwick
Canberra Mc
Uriarra
Uriarra Village
Wright
Duffy
Weston Creek
Weston
Coree
Fisher
Coombs
Stromlo
Turner
Braddon
Campbell
Reid
Jamison Centre
Weetangera
Scullin
Macquarie
Cook
Aranda
Hawker
Kippax Centre
Kippax
Florey
Dunlop
Macgregor
Latham
Charnwood
Fraser
Melba
Flynn
Higgins
Holt
Spence
Belconnen
Belconnen DC
Bruce
Lawson
Belconnen
Hall
Paddys River
Kowen
Hume
Oaks Estate
Beard
Kambah Village
Oxley
Macarthur
Monash
Crace
Kinlyside
Franklin
Taylor
Casey
Moncrieff
Harrison
Jacka
Forde
Bonner
Australian National University
Deakin West
Duntroon
Black 

In [15]:
# check dimensions and data

print(cbr_venues.shape)
cbr_venues.head()

(1180, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Barton,-35.30829,149.13354,Ottoman Cuisine,-35.305615,149.136704,Turkish Restaurant
1,Barton,-35.30829,149.13354,Ostani,-35.311509,149.133533,Hotel Bar
2,Barton,-35.30829,149.13354,Little Bird,-35.306099,149.135331,Café
3,Barton,-35.30829,149.13354,LiloTang,-35.311991,149.133847,Japanese Restaurant
4,Barton,-35.30829,149.13354,National Archives of Australia,-35.304637,149.131004,History Museum


In [19]:
# the number of venues returned for each neighborhood

cbr_venues.groupby('Neighborhood').count()

print('There are {} unique venue categories.'.format(len(cbr_venues['Venue Category'].unique())))
print(cbr_venues.head())

There are 188 unique venue categories.
  Neighborhood  Neighborhood Latitude  ...  Venue Longitude       Venue Category
0       Barton              -35.30829  ...       149.136704   Turkish Restaurant
1       Barton              -35.30829  ...       149.133533            Hotel Bar
2       Barton              -35.30829  ...       149.135331                 Café
3       Barton              -35.30829  ...       149.133847  Japanese Restaurant
4       Barton              -35.30829  ...       149.131004       History Museum

[5 rows x 7 columns]


In [23]:
# analyse each neighbourhood

# one hot encoding
cbr_onehot = pd.get_dummies(cbr_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
cbr_onehot['Neighborhood'] = cbr_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [cbr_onehot.columns[-1]] + list(cbr_onehot.columns[:-1])
cbr_onehot = cbr_onehot[fixed_columns]

cbr_onehot.head()

Unnamed: 0,Neighborhood,Airport,Airport Lounge,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Garage,Baby Store,Bakery,Bar,Baseball Field,Beer Bar,Bike Trail,Bistro,Boat or Ferry,Bookstore,Breakfast Spot,Brewery,Burger Joint,Burmese Restaurant,Burrito Place,Bus Station,Bus Stop,Business Service,Café,Campaign Office,Cantonese Restaurant,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Creperie,...,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Soccer Field,Social Club,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Sports Club,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Thrift / Vintage Store,Tiki Bar,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Yoga Studio
0,Barton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
1,Barton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Barton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Barton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Barton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [24]:
cbr_grouped = cbr_onehot.groupby('Neighborhood').mean().reset_index()
cbr_grouped

Unnamed: 0,Neighborhood,Airport,Airport Lounge,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Garage,Baby Store,Bakery,Bar,Baseball Field,Beer Bar,Bike Trail,Bistro,Boat or Ferry,Bookstore,Breakfast Spot,Brewery,Burger Joint,Burmese Restaurant,Burrito Place,Bus Station,Bus Stop,Business Service,Café,Campaign Office,Cantonese Restaurant,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Creperie,...,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Soccer Field,Social Club,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Sports Club,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Thrift / Vintage Store,Tiki Bar,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Yoga Studio
0,Acton,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.00,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.00,0.00,0.000000,0.00,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.222222,0.000000,0.00,0.000000,0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.00,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.000000,0.0,0.000000,0.00,0.00,0.0,0.00,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.00
1,Ainslie,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.00,0.142857,0.0,0.000000,0.142857,0.000000,0.0,0.000000,0.000000,0.00,0.00,0.000000,0.00,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.142857,0.000000,0.00,0.000000,0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.00,0.000000,0.142857,0.0,0.0,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.000000,0.0,0.000000,0.00,0.00,0.0,0.00,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.00
2,Amaroo,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.50,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.00,0.00,0.000000,0.00,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.00,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.000000,0.0,0.000000,0.00,0.00,0.0,0.00,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.00
3,Aranda,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.00,0.000000,0.0,0.000000,0.000000,0.250000,0.0,0.000000,0.000000,0.00,0.00,0.000000,0.00,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.250000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.00,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.000000,0.0,0.000000,0.00,0.00,0.0,0.00,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.00
4,Australian National University,0.000000,0.000000,0.0,0.012821,0.0,0.012821,0.00,0.012821,0.0,0.000000,0.012821,0.025641,0.0,0.012821,0.000000,0.00,0.00,0.012821,0.00,0.0,0.000000,0.0,0.012821,0.000000,0.012821,0.0,0.115385,0.000000,0.00,0.025641,0.00,0.000000,0.012821,0.076923,0.00,0.012821,0.0,0.000000,0.012821,...,0.000000,0.00,0.0,0.00,0.012821,0.000000,0.0,0.0,0.0,0.0,0.0,0.00,0.000000,0.000000,0.012821,0.000000,0.00000,0.012821,0.012821,0.012821,0.0,0.025641,0.00,0.00,0.0,0.00,0.000000,0.000000,0.0,0.000000,0.012821,0.0,0.0,0.0,0.0,0.038462,0.012821,0.012821,0.000000,0.00
5,Banks,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.00,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.00,0.00,0.000000,0.00,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.00,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.000000,0.0,0.000000,0.00,0.00,0.0,0.00,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.00
6,Barton,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.00,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.04,0.00,0.000000,0.04,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.280000,0.000000,0.04,0.000000,0.00,0.000000,0.000000,0.040000,0.00,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.00,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.00,0.040000,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.000000,0.0,0.000000,0.00,0.00,0.0,0.00,0.000000,0.000000,0.0,0.000000,0.040000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.00
7,Beard,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.00,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.00,0.00,0.000000,0.00,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.0,0.000000,0.000000,...,0.000000,0.00,0.0,0.00,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.00,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.000000,0.0,0.000000,0.00,0.00,0.0,0.00,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.00
8,Belconnen,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.00,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.00,0.00,0.040000,0.00,0.0,0.020000,0.0,0.040000,0.000000,0.000000,0.0,0.100000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,0.100000,0.00,0.000000,0.0,0.000000,0.000000,...,0.020000,0.00,0.0,0.00,0.020000,0.000000,0.0,0.0,0.0,0.0,0.0,0.02,0.000000,0.000000,0.000000,0.020000,0.02000,0.000000,0.000000,0.000000,0.0,0.020000,0.00,0.00,0.0,0.00,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.020000,0.000000,0.000000,0.000000,0.02
9,Belconnen DC,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.00,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.00,0.00,0.040000,0.00,0.0,0.020000,0.0,0.040000,0.000000,0.000000,0.0,0.100000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,0.100000,0.00,0.000000,0.0,0.000000,0.000000,...,0.020000,0.00,0.0,0.00,0.020000,0.000000,0.0,0.0,0.0,0.0,0.0,0.02,0.000000,0.000000,0.000000,0.020000,0.02000,0.000000,0.000000,0.000000,0.0,0.020000,0.00,0.00,0.0,0.00,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.020000,0.000000,0.000000,0.000000,0.02
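
To make the frequencies in the table above easier to read, here is a small sketch of the same two steps (one-hot encoding, then a per-neighbourhood mean) on toy data. The neighbourhood names `A` and `B` and the venue rows are placeholders, not rows from `cbr_venues`.

```python
import pandas as pd

# Toy venue list; names are placeholders, not rows from cbr_venues.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Café", "Café", "Pub", "Hotel", "Pub", "Pub"],
})

# One indicator column per category, then the per-neighbourhood mean of each column.
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
grouped = onehot.groupby("Neighborhood").mean()

print(grouped)
```

Because each venue contributes exactly one `1` in its row, the mean of a column is the share of that neighbourhood's venues in that category, and each row of the grouped table sums to 1.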


In [25]:
# top 10 most frequent venue categories per neighbourhood

num_top_venues = 10

for hood in cbr_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = cbr_grouped[cbr_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Acton----
                           venue  freq
0                  Movie Theater  0.22
1                           Café  0.22
2                          Hotel  0.22
3                      Hotel Bar  0.11
4                         Museum  0.11
5            Indie Movie Theater  0.11
6            Japanese Restaurant  0.00
7  Paper / Office Supplies Store  0.00
8                   Music School  0.00
9                    Music Venue  0.00


----Ainslie----
                   venue  freq
0                    Pub  0.14
1  Australian Restaurant  0.14
2          Grocery Store  0.14
3         Shopping Plaza  0.14
4                 Bakery  0.14
5      Fish & Chips Shop  0.14
6                   Café  0.14
7            Planetarium  0.00
8            Pizza Place  0.00
9           Music School  0.00


----Amaroo----
                           venue  freq
0             Athletics & Sports   0.5
1                       Pharmacy   0.5
2                        Airport   0.0
3  Paper / Office Supplie

In [0]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
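
One thing to keep in mind with the function above: when a neighbourhood has fewer than `num_top_venues` categories, the remaining slots are filled with categories whose frequency is zero. The sketch below is one possible variant that skips zero frequencies; `top_nonzero_venues` is a hypothetical helper, and the sample row reuses the Amaroo frequencies printed earlier.

```python
import pandas as pd

def top_nonzero_venues(freqs, num_top_venues):
    """Return up to num_top_venues category names, skipping zero frequencies."""
    nonzero = freqs[freqs > 0]
    # A stable sort keeps tied categories in their original column order.
    ranked = nonzero.sort_values(ascending=False, kind="mergesort")
    return list(ranked.index[:num_top_venues])

# Frequencies shaped like one row of cbr_grouped without the name column.
row = pd.Series({"Airport": 0.0, "Athletics & Sports": 0.5,
                 "Pharmacy": 0.5, "Yoga Studio": 0.0})
top = top_nonzero_venues(row, 10)
print(top)
```

The returned list can be shorter than `num_top_venues`, so a caller filling fixed-width columns would need to pad it, for example with `None`.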

In [51]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = cbr_grouped['Neighborhood']

for ind in np.arange(cbr_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(cbr_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Acton,Hotel,Movie Theater,Café,Museum,Hotel Bar,Indie Movie Theater,Home Service,Fish & Chips Shop,Hot Dog Joint,French Restaurant
1,Ainslie,Pub,Australian Restaurant,Shopping Plaza,Grocery Store,Bakery,Fish & Chips Shop,Café,Event Space,Frozen Yogurt Shop,Fried Chicken Joint
2,Amaroo,Athletics & Sports,Pharmacy,Yoga Studio,Fish & Chips Shop,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop
3,Aranda,Bar,Café,Recreation Center,Dance Studio,Fish & Chips Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop
4,Australian National University,Café,Coffee Shop,Japanese Restaurant,Grocery Store,Italian Restaurant,Vietnamese Restaurant,Plaza,Hotel,Chinese Restaurant,Restaurant


In [52]:
# cluster neighbourhoods

# set number of clusters
kclusters = 4

cbr_grouped_clustering = cbr_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(cbr_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:5]

array([1, 1, 1, 1, 1], dtype=int32)
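
The value `kclusters = 4` is set by hand here. One common way to sanity-check such a choice is to compare silhouette scores across several values of k. The sketch below does this on synthetic points, not on `cbr_grouped_clustering`, and is offered only as an illustration of the technique.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic points around three well-separated centres (not notebook data).
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centres])

# Score each candidate k; a higher silhouette means tighter, better-separated clusters.
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(points)
    scores[k] = silhouette_score(points, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On sparse venue-frequency data the scores may be much flatter than on this toy example, so the result is better treated as a guide than as a rule.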

In [0]:
# add clustering labels
neighborhoods_venues_sorted.insert(1, 'Cluster Labels', kmeans.labels_)

In [54]:
neighborhoods_venues_sorted.tail()

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
129,Weston Creek,1,Skate Park,Malay Restaurant,Yoga Studio,Fish & Chips Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop
130,Woden,1,Furniture / Home Store,Auto Garage,Gym / Fitness Center,Paper / Office Supplies Store,Martial Arts Dojo,Fish & Chips Shop,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck
131,Wreck Bay,1,Café,Gift Shop,History Museum,Gym / Fitness Center,Yoga Studio,Fish Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Court
132,Wright,1,River,Yoga Studio,Filipino Restaurant,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market
133,Yarralumla,3,Bus Stop,Fish Market,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop


In [55]:
# merge dataset and check output

cbr_merged = pd.merge(cbr, neighborhoods_venues_sorted, on='Neighborhood')
cbr_merged.tail()

Unnamed: 0,Neighborhood,Postcode,Country,Region,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
132,Ginninderra Village,2913,Australia,Australian Capital Territory,-35.231177,149.081971,1,Café,Yoga Studio,Fish Market,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop
133,Palmerston,2913,Australia,Australian Capital Territory,-35.19725,149.11758,0,Grocery Store,Massage Studio,Fruit & Vegetable Store,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market
134,Ngunnawal,2913,Australia,Australian Capital Territory,-35.17319,149.10802,0,Grocery Store,Diner,Fast Food Restaurant,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop
135,Nicholls,2913,Australia,Australian Capital Territory,-35.18418,149.09916,0,Grocery Store,Resort,Soccer Field,Filipino Restaurant,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market
136,Amaroo,2914,Australia,Australian Capital Territory,-35.16922,149.12637,1,Athletics & Sports,Pharmacy,Yoga Studio,Fish & Chips Shop,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop
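
The merged table above ends at index 136, fewer rows than the 155 suburbs in `cbr`, because the default inner merge drops suburbs with no match in `neighborhoods_venues_sorted`. The sketch below shows how a left merge with `indicator=True` can list the unmatched names. The toy tables are illustrative; treating "Kowen" as a suburb without venues is an assumption made only for the example.

```python
import pandas as pd

# Toy tables: "Kowen" has no venue row here, which is an assumption for illustration.
suburbs = pd.DataFrame({"Neighborhood": ["Acton", "Kowen", "Yarralumla"]})
venue_summary = pd.DataFrame({
    "Neighborhood": ["Acton", "Yarralumla"],
    "Cluster Labels": [1, 3],
})

# A left join with indicator=True keeps every suburb and records where each row came from.
checked = suburbs.merge(venue_summary, on="Neighborhood", how="left", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "Neighborhood"].tolist()
print(unmatched)
```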


In [56]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=10)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(cbr_merged['Latitude'], cbr_merged['Longitude'], cbr_merged['Neighborhood'], cbr_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
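
The colour list used for the map can also be built directly from the number of clusters. A minimal sketch, assuming one colour per cluster label is all that is needed; `palette` is a name introduced here for illustration.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 4  # same value as in the clustering cell

# One evenly spaced colour per cluster label, so label i maps to palette[i].
palette = [colors.to_hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]

for label in range(kclusters):
    print(label, palette[label])
```

Indexing the palette with the label itself keeps the mapping easy to follow when reading the map alongside the per-cluster tables.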

In [57]:
# cluster 0
cbr_merged.loc[cbr_merged['Cluster Labels'] == 0, cbr_merged.columns[[1] + list(range(5, cbr_merged.shape[1]))]]

Unnamed: 0,Postcode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
26,2603,149.12669,0,Asian Restaurant,Grocery Store,Yoga Studio,Fish Market,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop
30,2605,149.10875,0,Grocery Store,Malay Restaurant,Bakery,Yoga Studio,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop
38,2607,149.08789,0,Bus Station,Burger Joint,Shop & Service,Dry Cleaner,Flea Market,Frozen Yogurt Shop,Donut Shop,Fried Chicken Joint,French Restaurant,Food Truck
60,2614,149.04042,0,Grocery Store,Bus Station,Moving Target,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop
70,2615,149.03168,0,Grocery Store,Soccer Field,Fruit & Vegetable Store,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market
76,2615,149.06022,0,Grocery Store,Convenience Store,Shopping Mall,Filipino Restaurant,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market
88,2903,149.08206,0,Bus Station,Yoga Studio,Fish Market,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop
91,2911,149.10606,0,Grocery Store,IT Services,Gastropub,Park,Filipino Restaurant,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop
96,2914,149.16094,0,Park,Bus Station,Yoga Studio,Fish Market,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop
107,2611,149.04076,0,Track,Grocery Store,Clothing Store,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market


In [58]:
# cluster 1
cbr_merged.loc[cbr_merged['Cluster Labels'] == 1, cbr_merged.columns[[1] + list(range(5, cbr_merged.shape[1]))]]

Unnamed: 0,Postcode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,2600,149.133540,1,Café,Hotel,Bistro,Hotel Bar,Cantonese Restaurant,History Museum,Japanese Restaurant,Event Space,Turkish Restaurant,Sports Bar
1,221,149.133540,1,Café,Hotel,Bistro,Hotel Bar,Cantonese Restaurant,History Museum,Japanese Restaurant,Event Space,Turkish Restaurant,Sports Bar
2,2600,149.126550,1,Café,Gift Shop,History Museum,Gym / Fitness Center,Yoga Studio,Fish Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Court
3,2601,149.126550,1,Café,Gift Shop,History Museum,Gym / Fitness Center,Yoga Studio,Fish Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Court
4,2614,149.048260,1,Bus Station,Steakhouse,Vietnamese Restaurant,Home Service,Motel,Fish Market,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck
5,2601,149.130930,1,Coffee Shop,Café,Japanese Restaurant,Thai Restaurant,Korean Restaurant,Record Shop,Bookstore,Supermarket,Sushi Restaurant,Clothing Store
6,2540,149.144621,1,Pub,Café,Park,Trail,Filipino Restaurant,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop
7,2540,149.108064,1,Café,Sports Bar,Pizza Place,Gas Station,Gym,Grocery Store,Seafood Restaurant,Filipino Restaurant,Food Truck,Food Court
8,2540,149.126550,1,Café,Gift Shop,History Museum,Gym / Fitness Center,Yoga Studio,Fish Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Court
9,2600,149.126680,1,Café,Gift Shop,History Museum,Gym / Fitness Center,Yoga Studio,Fish Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Court


In [59]:
# cluster 2
cbr_merged.loc[cbr_merged['Cluster Labels'] == 2, cbr_merged.columns[[1] + list(range(5, cbr_merged.shape[1]))]]

Unnamed: 0,Postcode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
81,2617,149.10001,2,Shopping Mall,Yoga Studio,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fish Market
99,2914,149.14303,2,Shopping Mall,Yoga Studio,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fish Market
119,2903,149.105611,2,Shopping Mall,Yoga Studio,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fish Market


In [61]:
# cluster 3
cbr_merged.loc[cbr_merged['Cluster Labels'] == 3, cbr_merged.columns[[1] + list(range(5, cbr_merged.shape[1]))]]

Unnamed: 0,Postcode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,2600,149.1067,3,Bus Stop,Fish Market,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop
57,2612,149.16005,3,Bus Stop,Fish Market,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop
68,2615,149.02221,3,Bus Stop,Fish Market,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop
90,2904,149.09226,3,Bus Stop,Fish Market,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop


In [60]:
# cluster 4 (empty: with kclusters = 4 the labels run from 0 to 3)
cbr_merged.loc[cbr_merged['Cluster Labels'] == 4, cbr_merged.columns[[1] + list(range(5, cbr_merged.shape[1]))]]

Unnamed: 0,Postcode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue


In [38]:
# cluster 5 (empty: with kclusters = 4 the labels run from 0 to 3)
cbr_merged.loc[cbr_merged['Cluster Labels'] == 5, cbr_merged.columns[[1] + list(range(5, cbr_merged.shape[1]))]]

Unnamed: 0,Postcode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue


In [64]:
# read in crime statistics

cbr_crime = pd.read_excel('Canberra Suburb Crime Stats.xlsx')
print(cbr_crime.shape)
cbr_crime.head()

(124, 19)


Unnamed: 0,Neighbourhood,Sum of 1 Homicide,Sum of 2a Assault -FV,Sum of 2b Assault -Non-FV,Sum of 3 Sexual Assault,Sum of 5a Robbery -armed,Sum of 4 Other offences against a person,Sum of 5b Robbery - other,Sum of 6a Burglary dwellings,Sum of 6b Burglary shops,Sum of 6c Burglary other,Sum of 7 Motor vehicle theft,Sum of 9 Other offences,Sum of 8 Property damage,Sum of 91a TINs Speeding,Sum of 91b TINs Mobile Use,Sum of 91c TINs Seatbelts,Sum of 92 Theft (excluding Motor Vehicles),Sum of 91d TINs Other
0,ACTON,4,4,18,1,1,1,0,1,2,6,12,122,50,68,20,3,81,81
1,AINSLIE,0,15,21,1,0,1,2,34,1,8,14,218,78,3,2,0,136,34
2,AMAROO,0,9,19,5,0,1,0,26,0,3,7,49,28,4,2,3,83,17
3,ARANDA,0,8,3,1,0,2,1,12,0,3,3,103,13,199,14,3,30,127
4,BANKS,0,5,2,3,0,1,0,16,1,0,8,59,33,1,0,0,35,4
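
The crime table spells its key column `Neighbourhood` and stores names in upper case, while the venue tables use `Neighborhood` and title case, so a direct merge on the name would find no matches. The sketch below shows one way to build a shared key first. The toy rows reuse a few names and homicide counts from the output above; the trailing space in `"AINSLIE "` is an invented blemish to show why stripping whitespace can help.

```python
import pandas as pd

# Small stand-ins for the venue-side and crime-side tables.
venues_side = pd.DataFrame({"Neighborhood": ["Acton", "Ainslie", "O'Connor"]})
crime_side = pd.DataFrame({
    "Neighbourhood": ["ACTON", "AINSLIE ", "BANKS"],  # trailing space is invented
    "Sum of 1 Homicide": [4, 0, 0],
})

# Build a shared, case-insensitive key on both sides before merging.
venues_side["key"] = venues_side["Neighborhood"].str.strip().str.upper()
crime_side["key"] = crime_side["Neighbourhood"].str.strip().str.upper()

joined = venues_side.merge(crime_side.drop(columns="Neighbourhood"), on="key", how="left")
print(joined)
```

Rows that still show missing crime values after the join point to names that differ by more than case or whitespace and need a manual mapping.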
