
# Toronto Neighbourhood clustering

Created by Jennifer Badham for the IBM Professional Certificate in Data Science  
March 2020  

## Context

The objective is to cluster neighbourhoods in the city of Toronto based on similarity of the venues they offer. The general method is to obtain information about the venues available in each neighbourhood and count such venues by category (such as park or Italian restaurant). The profile of a neighbourhood describes the proportion of its venues in each of these categories. Neighbourhoods are clustered by similarity of their profiles.

In [0]:
# preparation - load libraries
import pandas as pd
import numpy as np

## Part 1: Obtain neighbourhood information

The Toronto postal code system assigns a code starting with M to each neighbourhood. The postal code information is scraped from the Wikipedia page at https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M&oldid=890001695 to construct a dataframe of neighbourhoods with their postal codes.

**Step 1: Scrape the webpage and confirm the text has been downloaded.**

In [2]:
from requests import get
url = 'https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M&oldid=890001695'
response = get(url)
print(response.text[:1000])


<!DOCTYPE html>
<html class="client-nojs" lang="en" dir="ltr">
<head>
<meta charset="UTF-8"/>
<title>List of postal codes of Canada: M - Wikipedia</title>
<script>document.documentElement.className="client-js";RLCONF={"wgBreakFrames":!1,"wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgRequestId":"Xn8@zApAAEYAAIzw0x4AAACM","wgCSPNonce":!1,"wgCanonicalNamespace":"","wgCanonicalSpecialPageName":!1,"wgNamespaceNumber":0,"wgPageName":"List_of_postal_codes_of_Canada:_M","wgTitle":"List of postal codes of Canada: M","wgCurRevisionId":947772202,"wgRevisionId":890001695,"wgArticleId":539066,"wgIsArticle":!0,"wgIsRedirect":!1,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":["Pages with citations using unsupported parameters","Communications in Ontario","Postal codes in Canada","Toronto","Ontari

**Step 2: Extract key information and format**

The required information is stored as cells in a table (html tag is 'td' within a row 'tr'). So, first create a separate record for each table row and inspect one.

In [3]:
# construct a BeautifulSoup object with the scraped webpage
from bs4 import BeautifulSoup
soup = BeautifulSoup(response.text, 'html.parser')

# extract table rows ('tr') and display count and example
nbr_info = soup.find_all('tr')
print("Number of cells found:", len(nbr_info))
print(nbr_info[1].text)
nbr_info[1].text

Number of cells found: 294

M1A
Not assigned
Not assigned



'\nM1A\nNot assigned\nNot assigned\n'

Examine the structure of a single row in the soup object and confirm the 'td' tags.

In [4]:
nbr_info[1].find_all('td')

[<td>M1A</td>, <td>Not assigned</td>, <td>Not assigned
 </td>]

For each of the results, the postal code is in the first table cell ('td') tag, the name of the borough is in the second, and the neighbourhood is in the third. So, construct lists of each of these data items with a loop through the rows.

In [0]:
# construct empty lists to store the extracted data
codes = []
boroughs = []
neighbourhoods = []

# loop through the table rows
for this_line in nbr_info:
    # collect the text of each table cell ('td') within this row
    names = []
    for this_tag in this_line.find_all('td'):
        names.append(this_tag.text)
    if len(names) > 2:
        codes.append(names[0])
        boroughs.append(names[1])
        neighbourhoods.append(names[2])
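
The row-by-row extraction can be checked on a small, self-contained table. The sketch below is illustrative only: the HTML string is a made-up fragment in the same shape as the Wikipedia table, and `get_text(strip=True)` is used to drop the trailing newline at extraction time, which is a slight variation on the loop above.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment shaped like the postal code table
html = """
<table>
<tr><th>Postcode</th><th>Borough</th><th>Neighbourhood</th></tr>
<tr><td>M1A</td><td>Not assigned</td><td>Not assigned
</td></tr>
<tr><td>M3A</td><td>North York</td><td>Parkwoods
</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

rows = []
for tr in soup.find_all("tr"):
    # the header row holds 'th' cells only, so it yields an empty list here
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if len(cells) == 3:
        rows.append(tuple(cells))

print(rows)
```

Rows without exactly three 'td' cells (such as the header) are skipped, which plays the same role as the `len(names) > 2` check.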

Convert to a dataframe and clean. Cleaning removes the last three records, postal codes that are not assigned, and combines neighbourhoods with the same postal code.

In [6]:
# create the dataframe
locs = pd.DataFrame({
    'PostalCode': codes,
    'Borough': boroughs,
    'Neighbourhood': neighbourhoods
})
print("Number of neighbourhoods before cleaning:", len(locs))

# From inspection, remove last three records (copy to avoid modifying a view)
locs = locs.iloc[:-3].copy()
# and remove the trailing newline character (\n) from each neighbourhood
locs['Neighbourhood'] = locs['Neighbourhood'].str.replace('\n', '', regex=False)
locs

Number of neighbourhoods before cleaning: 291


,PostalCode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
...,...,...,...
283,M8Z,Etobicoke,Mimico NW
284,M8Z,Etobicoke,The Queensway West
285,M8Z,Etobicoke,Royal York South West
286,M8Z,Etobicoke,South of Bloor


In [7]:
# remove the codes that are not assigned
locs2 = locs[(locs['Borough'] != "Not assigned")]
print("Number of neighbourhoods after deleting irrelevant rows:", len(locs2))

# combine the neighbourhoods that share a postal code
df_codes = locs2.groupby(['PostalCode', 'Borough'], as_index = False).agg({'Neighbourhood': ', '.join})
print("Number of postal codes in cleaned dataframe:", len(df_codes))

# if a neighbourhood is 'Not assigned', use the borough name
print("Neighbourhoods not assigned, before correction:", sum(df_codes['Neighbourhood'] == "Not assigned"))
df_codes.loc[df_codes['Neighbourhood'] == "Not assigned", "Neighbourhood"] = df_codes.loc[df_codes['Neighbourhood'] == "Not assigned", "Borough"]
print("Neighbourhoods not assigned, after correction:", sum(df_codes['Neighbourhood'] == "Not assigned"))

# inspect results
df_codes

Number of neighbourhoods after deleting irrelevant rows: 211
Number of postal codes in cleaned dataframe: 103
Neighbourhoods not assigned, before correction: 1
Neighbourhoods not assigned, after correction: 0


,PostalCode,Borough,Neighbourhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
...,...,...,...
98,M9N,York,Weston
99,M9P,Etobicoke,Westmount
100,M9R,Etobicoke,"Kingsview Village, Martin Grove Gardens, Richv..."
101,M9V,Etobicoke,"Albion Gardens, Beaumond Heights, Humbergate, ..."


The neighbourhood information is now in the required format. Report the number of rows.

In [8]:
print("Number of rows in neighbourhood dataframe:", df_codes.shape[0])

Number of rows in neighbourhood dataframe: 103


## Part 2: Add geolocation information

Use the Python geocoder package to obtain latitude and longitude information from the postal code. As advised in the assignment description, use a while loop (code provided) to repeat the request until the information is obtained.

In [0]:
# !pip install geocoder
# import geocoder

# initialise lists to receive latitudes and longitudes
# lats = []
# longs = []

# loop through the postal codes in the dataframe
# for postal_code in df_codes['PostalCode']:
#     # create variable to receive geodata for an individual postal code
#     lat_lng_coords = None
#     # loop until coordinates are obtained
#     while lat_lng_coords is None:
#         g = geocoder.google('{}, Toronto, Ontario'.format(postal_code))
#         lat_lng_coords = g.latlng
#     # geocoder is unreliable so report progress
#     print("Coordinates obtained for:", postal_code)
#     # add the results to the latitude and longitude lists
#     lats.append(lat_lng_coords[0])
#     longs.append(lat_lng_coords[1])
    



However, repeated attempts led to infinite loops waiting for geocoder to provide the information. Hence, interrupt the loop and import the backup csv file instead.

In [9]:
df_geocodes = pd.read_csv("http://cocl.us/Geospatial_data")
df_geocodes.head()

,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476
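
The unbounded `while` loop is what allows the geocoding step to hang. One possible alternative, sketched here under assumptions, is to cap the number of attempts and return `None` when they run out. The names `geocode_with_retry` and `flaky_lookup` are hypothetical; `lookup` stands in for any function that returns coordinates or `None`, and nothing here calls the real geocoder package.

```python
import time

def geocode_with_retry(lookup, query, max_attempts=5, delay_seconds=0.0):
    """Call lookup(query) until it returns coordinates or attempts run out."""
    for _ in range(max_attempts):
        coords = lookup(query)
        if coords is not None:
            return coords
        time.sleep(delay_seconds)  # pause before trying again
    return None  # the caller decides how to handle a missing result

# Stand-in for an unreliable service: fails twice, then answers
calls = {"n": 0}

def flaky_lookup(query):
    calls["n"] += 1
    return None if calls["n"] < 3 else [43.806686, -79.194353]

result = geocode_with_retry(flaky_lookup, "M1B, Toronto, Ontario")
always_fails = geocode_with_retry(lambda q: None, "M1C, Toronto, Ontario", max_attempts=3)
print(result, always_fails)
```

Postal codes that come back as `None` could then be filled from the backup csv file rather than blocking the whole loop.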


Merge the dataframes by postal code to attach the latitude and longitude values to the neighbourhood names.

In [10]:
# merge the dataframes
df_locations = pd.merge(df_codes, df_geocodes, left_on = "PostalCode", right_on = "Postal Code")
df_locations.drop('Postal Code', axis = 'columns', inplace=True)
print("Postal Codes in geo-located data:", df_locations.shape[0])
df_locations

Postal Codes in geo-located data: 103


,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
...,...,...,...,...,...
98,M9N,York,Weston,43.706876,-79.518188
99,M9P,Etobicoke,Westmount,43.696319,-79.532242
100,M9R,Etobicoke,"Kingsview Village, Martin Grove Gardens, Richv...",43.688905,-79.554724
101,M9V,Etobicoke,"Albion Gardens, Beaumond Heights, Humbergate, ...",43.739416,-79.588437
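
The row count before and after the merge (103 in both cases) suggests every postal code found a match. A more direct check, sketched below on toy frames, is an outer merge with `indicator=True`, which labels each row by the side it came from, together with `validate`, which raises an error if a key is duplicated. The code `M9Z` and its zero coordinates are made up to force a mismatch.

```python
import pandas as pd

codes = pd.DataFrame({
    "PostalCode": ["M1B", "M1C", "M1E"],
    "Neighbourhood": ["Rouge, Malvern",
                      "Highland Creek, Rouge Hill, Port Union",
                      "Guildwood, Morningside, West Hill"],
})
geo = pd.DataFrame({
    "Postal Code": ["M1B", "M1C", "M9Z"],   # 'M9Z' is a made-up code
    "Latitude": [43.806686, 43.784535, 0.0],
    "Longitude": [-79.194353, -79.160497, 0.0],
})

# outer merge keeps unmatched rows from both sides; _merge says which side
check = pd.merge(codes, geo, left_on="PostalCode", right_on="Postal Code",
                 how="outer", indicator=True, validate="one_to_one")

unmatched = check[check["_merge"] != "both"]
print(unmatched[["PostalCode", "Postal Code", "_merge"]])
```

An empty `unmatched` frame on the real data would confirm that no neighbourhood lost its coordinates.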


## Part 3: Cluster neighbourhoods by venue profile

Neighbourhoods are to be clustered based on the similarity of their mix of venue types. Foursquare data will be used to extract up to 100 venues for each neighbourhood. The neighbourhood profile describes the proportion of venues in the neighbourhood that fall into each of the most common venue categories overall.

**Step 1: Obtain venue information for each neighbourhood**

In [11]:
import json # library to handle JSON files
import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe

# set up Foursquare credentials
CLIENT_ID = 'CEWHNCOSODH3US5NVLI5H2ZKJN5FWHPEQS4BBJSRSUIKNIS1' # your Foursquare ID
CLIENT_SECRET = 'EAINZEE4FJBAUS5XLOHW0JQLC1RO0LBWOUHUKVKFT21MVVFN' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

# define function to return venue name and category
radius = 500   # within 500m of specified location
LIMIT = 100    # up to 100 venues

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['categories'][0]['name']) for v in results])

    # construct the dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Latitude', 
                  'Longitude', 
                  'Venue', 
                  'Category']
    
    return nearby_venues

# return the venue details for all Toronto neighbourhoods
df_venues = getNearbyVenues(names = df_locations['Neighbourhood'],
                            latitudes = df_locations['Latitude'],
                            longitudes = df_locations['Longitude']
                            )

df_venues

,Neighbourhood,Latitude,Longitude,Venue,Category
0,"Rouge, Malvern",43.806686,-79.194353,Wendy’s,Fast Food Restaurant
1,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497,Royal Canadian Legion,Bar
2,"Guildwood, Morningside, West Hill",43.763573,-79.188711,G & G Electronics,Electronics Store
3,"Guildwood, Morningside, West Hill",43.763573,-79.188711,Marina Spa,Spa
4,"Guildwood, Morningside, West Hill",43.763573,-79.188711,Big Bite Burrito,Mexican Restaurant
...,...,...,...,...,...
2244,"Albion Gardens, Beaumond Heights, Humbergate, ...",43.739416,-79.588437,McDonald's,Fast Food Restaurant
2245,"Albion Gardens, Beaumond Heights, Humbergate, ...",43.739416,-79.588437,Dollarama,Discount Store
2246,"Albion Gardens, Beaumond Heights, Humbergate, ...",43.739416,-79.588437,NORI SUSHI,Japanese Restaurant
2247,Northwest,43.706748,-79.594054,Economy Rent A Car,Rental Car Location


**Step 2: Identify the most common venue categories over all neighbourhoods**

Over 2000 venues have been returned. Count the number in each category and report the top 20.

In [12]:
num_categories = df_venues.groupby('Category').count()
num_categories.drop(['Neighbourhood', 'Latitude', 'Longitude'], axis = 'columns', inplace=True)
num_categories.sort_values('Venue', ascending=False, inplace=True)
num_categories.head(20)

Category,Venue
Coffee Shop,191
Café,97
Restaurant,77
Park,54
Pizza Place,53
Italian Restaurant,48
Bakery,47
Japanese Restaurant,43
Bar,42
Sandwich Place,40


Retain only the venues that are in one of these most common 20 categories.

In [13]:
# store the relevant categories
top_categories = num_categories.head(20).reset_index()
print("Total venues in top 20 categories:", top_categories['Venue'].sum())

Total venues in top 20 categories: 975


**Step 3: Construct neighbourhood profiles**

Restricting to the 20 most frequent categories reduces the total number of venues from over 2000 to around 1000 (the exact figure depends on when Foursquare is called). Construct a profile for each neighbourhood that contains the proportion of returned venues in each of these top 20 categories.

First, calculate the total number of venues by neighbourhood (including categories that are not in the most common 20). Then, count venues by category in each neighbourhood and calculate the proportion.
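
As a compact illustration of what a profile contains, the sketch below builds one from toy data with `pd.crosstab`. The neighbourhood names, categories, and the two-item `top` list are all illustrative. With `normalize="index"` each row is divided by the neighbourhood's total venue count, so venues outside the top categories (here `Zoo` and `Bar`) still count towards the denominator, as in the approach described above.

```python
import pandas as pd

venues = pd.DataFrame({
    "Neighbourhood": ["A", "A", "A", "A", "B", "B"],
    "Category": ["Coffee Shop", "Coffee Shop", "Park", "Zoo", "Park", "Bar"],
})
top = ["Coffee Shop", "Park"]  # stand-in for the 20 most common categories

# proportion of each neighbourhood's venues per category, then keep top ones
profile = pd.crosstab(venues["Neighbourhood"], venues["Category"],
                      normalize="index")[top]
print(profile)
```

Neighbourhood A has four venues, two of them coffee shops, so its coffee shop proportion is 0.5. This sketch does not apply the minimum venue count filter used later.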

In [14]:
# count the total number of venues returned
nbr_totals = df_venues.groupby('Neighbourhood').count()
nbr_totals.drop(['Venue', 'Latitude', 'Longitude'], axis = 'columns', inplace=True)
nbr_totals.rename(columns={'Category':'Total'}, inplace=True)
nbr_totals.head(3)

Neighbourhood,Total
"Adelaide, King, Richmond",100
Agincourt,5
"Agincourt North, L'Amoreaux East, Milliken, Steeles East",2


In [15]:
# count venues by category in each neighbourhood
df_cat_counts = df_venues.groupby(by = ['Neighbourhood', 'Category']).count()
df_cat_counts.drop(['Latitude', 'Longitude'], axis='columns', inplace=True)
df_cat_counts.reset_index(inplace=True)

# drop the less common categories
df_cat_counts_clean = df_cat_counts[df_cat_counts['Category'].isin(top_categories['Category'])]
print("Confirm filtered correctly:", df_cat_counts_clean['Venue'].sum())

# attach the total venue count
df_profile = pd.merge(df_cat_counts_clean, nbr_totals, left_on='Neighbourhood', right_index=True)

# calculate proportions and check
df_profile['Prop'] = df_profile['Venue'] / df_profile['Total']
df_profile

Confirm filtered correctly: 975


,Neighbourhood,Category,Venue,Total,Prop
0,"Adelaide, King, Richmond",American Restaurant,2,100,0.020000
4,"Adelaide, King, Richmond",Bakery,2,100,0.020000
5,"Adelaide, King, Richmond",Bar,3,100,0.030000
12,"Adelaide, King, Richmond",Café,4,100,0.040000
13,"Adelaide, King, Richmond",Clothing Store,2,100,0.020000
...,...,...,...,...,...
1564,"Woodbine Gardens, Parkview Hill",Fast Food Restaurant,1,13,0.076923
1570,"Woodbine Gardens, Parkview Hill",Pizza Place,2,13,0.153846
1576,Woodbine Heights,Park,1,11,0.090909
1581,York Mills West,Bank,1,3,0.333333


Since we are interested in the proportion of venues in common categories, remove the neighbourhoods with fewer than 20 venues, as their proportions are not meaningful.

In [16]:
df_profile = df_profile[df_profile['Total'] >= 20]
df_profile

,Neighbourhood,Category,Venue,Total,Prop
0,"Adelaide, King, Richmond",American Restaurant,2,100,0.020000
4,"Adelaide, King, Richmond",Bakery,2,100,0.020000
5,"Adelaide, King, Richmond",Bar,3,100,0.030000
12,"Adelaide, King, Richmond",Café,4,100,0.040000
13,"Adelaide, King, Richmond",Clothing Store,2,100,0.020000
...,...,...,...,...,...
1536,Willowdale South,Japanese Restaurant,1,34,0.029412
1542,Willowdale South,Pizza Place,2,34,0.058824
1545,Willowdale South,Restaurant,2,34,0.058824
1546,Willowdale South,Sandwich Place,2,34,0.058824


Finally, reshape the dataframe so that the category proportions are columns to match the standard format for features.

In [17]:
# pivot so that each category becomes a column
nbr_proportions = df_profile.pivot(index='Neighbourhood', columns='Category', values='Prop')
# replace the missing values with 0 
nbr_proportions.fillna(0, inplace=True)

# check reasonable
nbr_proportions.head()

Neighbourhood,American Restaurant,Bakery,Bank,Bar,Café,Clothing Store,Coffee Shop,Fast Food Restaurant,Gym,Hotel,Italian Restaurant,Japanese Restaurant,Park,Pizza Place,Pub,Restaurant,Sandwich Place,Seafood Restaurant,Sushi Restaurant,Thai Restaurant
"Adelaide, King, Richmond",0.02,0.02,0.0,0.03,0.04,0.02,0.07,0.01,0.02,0.02,0.0,0.01,0.0,0.02,0.0,0.05,0.01,0.02,0.02,0.04
"Bedford Park, Lawrence Manor East",0.04,0.0,0.0,0.0,0.04,0.0,0.08,0.04,0.0,0.0,0.08,0.0,0.0,0.08,0.04,0.08,0.08,0.0,0.04,0.04
Berczy Park,0.0,0.036364,0.0,0.0,0.036364,0.0,0.090909,0.0,0.0,0.018182,0.0,0.018182,0.018182,0.0,0.018182,0.036364,0.0,0.036364,0.0,0.018182
"Brockton, Exhibition Place, Parkdale Village",0.0,0.041667,0.0,0.041667,0.083333,0.0,0.083333,0.0,0.041667,0.0,0.041667,0.041667,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0
"Cabbagetown, St. James Town",0.0,0.045455,0.022727,0.0,0.068182,0.0,0.068182,0.0,0.0,0.0,0.045455,0.022727,0.045455,0.045455,0.045455,0.045455,0.022727,0.0,0.0,0.022727


**Step 4: Use profiles to cluster neighbourhoods**

Neighbourhood similarity is to be assessed by the difference between the profiles. That is, neighbourhoods are considered similar if, for example, their proportions of venues that are coffee shops or parks are similar. The neighbourhoods are then mapped, with the cluster used to colour each marker.

Create the clusters with the k-means algorithm. Features are not normalised, which retains the relative importance of the different venue categories. The scikit-learn implementation supports only Euclidean distance, but that is reasonable for a distance between vectors of proportions.

In [18]:
from sklearn.cluster import KMeans

# set number of clusters and run k-means clustering
kclusters = 5
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(nbr_proportions)

# attach cluster label to neighbourhood dataset
nbr_proportions['Cluster'] = kmeans.labels_
nbr_proportions.head(3)

Neighbourhood,American Restaurant,Bakery,Bank,Bar,Café,Clothing Store,Coffee Shop,Fast Food Restaurant,Gym,Hotel,Italian Restaurant,Japanese Restaurant,Park,Pizza Place,Pub,Restaurant,Sandwich Place,Seafood Restaurant,Sushi Restaurant,Thai Restaurant,Cluster
"Adelaide, King, Richmond",0.02,0.02,0.0,0.03,0.04,0.02,0.07,0.01,0.02,0.02,0.0,0.01,0.0,0.02,0.0,0.05,0.01,0.02,0.02,0.04,3
"Bedford Park, Lawrence Manor East",0.04,0.0,0.0,0.0,0.04,0.0,0.08,0.04,0.0,0.0,0.08,0.0,0.0,0.08,0.04,0.08,0.08,0.0,0.04,0.04,2
Berczy Park,0.0,0.036364,0.0,0.0,0.036364,0.0,0.090909,0.0,0.0,0.018182,0.0,0.018182,0.018182,0.0,0.018182,0.036364,0.0,0.036364,0.0,0.018182,3
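
The number of clusters is fixed at 5 above. One common way to sanity-check such a choice, shown here only as a sketch on synthetic data, is to compare silhouette scores across several values of k. The two artificial groups of three-category profiles below are assumptions made for illustration, and the outcome says nothing about the best k for the Toronto data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic profiles: two tight groups around different category mixes
rng = np.random.default_rng(0)
centre_a = np.array([0.6, 0.2, 0.2])
centre_b = np.array([0.1, 0.1, 0.8])
profiles = np.vstack([
    centre_a + rng.normal(0, 0.01, size=(10, 3)),
    centre_b + rng.normal(0, 0.01, size=(10, 3)),
])

# fit k-means for several k and record the mean silhouette score of each
scores = {}
for k in range(2, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(profiles)
    scores[k] = silhouette_score(profiles, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On these two well-separated groups the score is highest at k = 2. Applied to `nbr_proportions` (before the `Cluster` column is added), the same loop would indicate whether 5 clusters is a reasonable choice.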


Use the neighbourhood name to attach the cluster numbers to the dataset with the latitude and longitude of each neighbourhood, and map.

In [19]:
# attach cluster number to geolocation dataset
df_map = pd.merge(df_locations, nbr_proportions, left_on = 'Neighbourhood', right_index=True)
df_map = df_map[['PostalCode', 'Neighbourhood', 'Latitude', 'Longitude', 'Cluster']]
df_map.head()

,PostalCode,Neighbourhood,Latitude,Longitude,Cluster
18,M2J,"Fairview, Henry Farm, Oriole",43.778517,-79.346556,0
22,M2N,Willowdale South,43.77012,-79.408493,2
27,M3C,"Flemingdon Park, Don Mills South",43.7259,-79.340923,0
38,M4G,Leaside,43.70906,-79.363452,3
39,M4H,Thorncliffe Park,43.705369,-79.349372,3


In [22]:
import folium
import matplotlib.cm as cm
import matplotlib.colors as colors

# Toronto co-ordinates for map centre
lat_Toronto = 43.6532
long_Toronto = -79.3832

# create map
map_clusters = folium.Map(location=[lat_Toronto, long_Toronto], zoom_start=13)

# set color scheme for the clusters: one colour per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map, coloured by cluster label (0 to kclusters-1)
for lat, lon, poi, cluster in zip(df_map['Latitude'], df_map['Longitude'], df_map['Neighbourhood'], df_map['Cluster']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

From the map, the clustering algorithm has distinguished downtown areas from more suburban areas based on the profile of venue types available in each location.