<H1> Analyzing Neighbourhoods in Boise, Idaho for Opening a New Restaurant <H1>

<H2> Introduction

Boise has been one of the fastest growing cities in the United States for the past few years according to multiple sources. Main drivers for the population growth include families or retirees moving from other cities/states in search of lower home prices and crime rates, and increasingly more new hires recruited by high-tech companies such as Micron Technology and ON semiconductor. In addition, with the COVID19 outbreak and therefore more companies allowing employees to work remotely from home, Boise has also attracted a lot of out-of-the-states employees that are based in other cities/states. With a growing population comes with a growing demand in restaurants, thus the aim of this project is to study the neighbourhoods in Boise to determine possible locations for opening new restaurants. This project can benefit business owners and entrepreneurs who want to expand their business and invest in Boise. 

<H2> Data Collection

Data used in this project are collected from multiple sources, a summary of which is provided below. 

<H3> Neighbourhoods Data

The neighbourhood list is scrapped from the city of Boise development website: https://www.cityofboise.org/departments/planning-and-development-services/planning-and-zoning/comprehensive-planning/neighborhood-planning/neighborhood-almanac/ using BeautifulSoup. Data cleaning is performed to remove some inactive neighbourhoods as they are certainly not the ideal locations for any business. 

<H3> Geographical Coordinates 

Geographical coordinates are obtained from the GeoPy library in python for the neighbourhood list obtained above. 

<H3> Venue Data

Venue data are extracted using the Foursquare API and then KMeans clustering is performed to find out the ideal locations for opening new restaurants. 

<H3> Import Python libraries

In [1]:
!pip install geopy
!pip install geocoder
!pip install folium

import numpy as np
import pandas as pd
import seaborn as sns
from geopy.geocoders import Nominatim
import geocoder
import requests
import folium
import matplotlib.cm as cm
import matplotlib.colors as colors
import matplotlib.pyplot as plt

from sklearn.cluster import KMeans

from pandas.io.json import json_normalize
from sklearn.metrics import silhouette_score

from bs4 import BeautifulSoup


  from cryptography.utils import int_from_bytes
  from cryptography.utils import int_from_bytes
  from cryptography.utils import int_from_bytes
  from cryptography.utils import int_from_bytes
Collecting geocoder
  Downloading geocoder-1.38.1-py2.py3-none-any.whl (98 kB)
[K     |████████████████████████████████| 98 kB 9.6 MB/s  eta 0:00:01
Collecting ratelim
  Downloading ratelim-0.1.6-py2.py3-none-any.whl (4.0 kB)
Installing collected packages: ratelim, geocoder
Successfully installed geocoder-1.38.1 ratelim-0.1.6
  from cryptography.utils import int_from_bytes
  from cryptography.utils import int_from_bytes
Collecting folium
  Downloading folium-0.12.1-py2.py3-none-any.whl (94 kB)
[K     |████████████████████████████████| 94 kB 7.9 MB/s  eta 0:00:01
Collecting branca>=0.3.0
  Downloading branca-0.4.2-py3-none-any.whl (24 kB)
Installing collected packages: branca, folium
Successfully installed branca-0.4.2 folium-0.12.1


<H3> Data scraping and preprocessing

Extract table elements by using BeautifulSoup:

In [3]:
url = 'https://www.cityofboise.org/departments/planning-and-development-services/planning-and-zoning/comprehensive-planning/neighborhood-planning/neighborhood-almanac/'
html = requests.get(url)
soup = BeautifulSoup(html.content, "html5lib")
table=soup.find('table')
td = table.findAll('td')

td

[<td style="width:50%"><strong>PLANNING AREA</strong></td>,
 <td style="width:50%"><strong>NEIGHBORHOOD ASSOCIATIONS</strong></td>,
 <td style="width:50%"><strong><a data-udi="umb://media/ed0b58874e1f480ca290f05a7b025927" href="/media/9749/almanac_march2020_airport.pdf" rel="noopener" target="_blank" title="Airport Planning Area.pdf">Airport</a></strong></td>,
 <td style="width:50%">South Eisenman</td>,
 <td style="width:50%"><strong><a data-udi="umb://media/ed25f49982e8491b8062a849657e2927" href="/media/9750/almanac_march2020_barber-valley.pdf" rel="noopener" target="_blank" title="Barber Valley Planning Area .pdf">Barber Valley</a></strong></td>,
 <td style="width:50%">Barber Valley</td>,
 <td style="width:50%"><strong><a data-udi="umb://media/c13f8a8ef32d4f1c8ec957ac1a88dd1c" href="/media/9761/almanac_march2020_central-bench.pdf" rel="noopener" target="_blank" title="Central Bench Planning Area .pdf">Central Bench</a></strong></td>,
 <td style="width:50%"><p>Borah<br/>Central Bench<

By checking the table on the website and comparing it with the td elements, we know that the useful neighbourhood data is from 3,5,... in list td. Extract it with a for loop: 

In [5]:
neigh = []

for i in np.arange(3,len(td),2):
    neigh_add = str(td[i]).strip('<td style="width:50%">').strip('</p>').split('<br/>')
    neigh.extend(neigh_add)
    
neigh

['South Eisenman',
 'Barber Valley',
 'Borah',
 'Central Bench',
 'Central Rim',
 'Depot Bench',
 'Hillcrest',
 'Liberty Park',
 'Morris Hill',
 'Vista',
 'Downtown Boise',
 'Lusk District',
 'West Downtown',
 'Boise Heights',
 'Central Foothills',
 'Highlands',
 'Quail Ridge (inactive)',
 'Somerset (inactive)',
 'Warm Springs Mesa',
 'East End',
 'North End',
 'Sunset',
 "Veteran's Park",
 'West End',
 'Collister',
 'North West',
 'Pierce Park',
 'South Boise Village',
 'Southeast Boise',
 'South Cole',
 'Southwest Ada County Alliance',
 'Centennial',
 'Glenwood Rim',
 'West Bench',
 'West Valley',
 'Winstead Park']

We notice that some neighbourhoods are labeled as "inactive". Remove those from the list:

In [7]:
neigh_active = []    
neigh_active.extend(x for x in neigh if 'inactive' not in x)

neigh_active

['South Eisenman',
 'Barber Valley',
 'Borah',
 'Central Bench',
 'Central Rim',
 'Depot Bench',
 'Hillcrest',
 'Liberty Park',
 'Morris Hill',
 'Vista',
 'Downtown Boise',
 'Lusk District',
 'West Downtown',
 'Boise Heights',
 'Central Foothills',
 'Highlands',
 'Warm Springs Mesa',
 'East End',
 'North End',
 'Sunset',
 "Veteran's Park",
 'West End',
 'Collister',
 'North West',
 'Pierce Park',
 'South Boise Village',
 'Southeast Boise',
 'South Cole',
 'Southwest Ada County Alliance',
 'Centennial',
 'Glenwood Rim',
 'West Bench',
 'West Valley',
 'Winstead Park']

Converting this list to panda dataframe:

In [37]:
df = pd.DataFrame(neigh_active)
df.columns = ['Neighbourhoods']

df

Unnamed: 0,Neighbourhoods
0,South Eisenman
1,Barber Valley
2,Borah
3,Central Bench
4,Central Rim
5,Depot Bench
6,Hillcrest
7,Liberty Park
8,Morris Hill
9,Vista


Now the extraction and cleaning of neighbourhoods data has been completed. Next step is to obtain the geographical coordinates from geocoder:

In [44]:
Lat = []
Lon = []

for neigh in df['Neighbourhoods']:
    g = geocoder.arcgis('{}, Boise, Idaho'.format(neigh)) 
    Lat.append(g.latlng[0])
    Lon.append(g.latlng[1])


Add the geographical coordinates to the dataframe df. Also make a copy of Lat and Lon because geocoder sometimes will time out. 

In [90]:
Lat_copy = Lat.copy()
Lon_copy = Lon.copy()

df['Latitude'] = Lat
df['Longitude'] = Lon

df

Unnamed: 0,Neighbourhoods,Latitude,Longitude
0,South Eisenman,43.522998,-116.152223
1,Barber Valley,43.575474,-116.144475
2,Borah,43.59683,-116.27417
3,Central Bench,43.5971,-116.24423
4,Central Rim,43.61473,-116.23597
5,Depot Bench,43.60054,-116.22355
6,Hillcrest,43.56552,-116.18346
7,Liberty Park,43.60948,-116.2598
8,Morris Hill,43.60931,-116.24346
9,Vista,43.58356,-116.21367


<H3> Data visualization

Using Folium we can add all the neighbourhoods to the map:

In [91]:
address = "Boise, ID, USA"
geolocator = Nominatim(user_agent = "Boise_explorer")
location = geolocator.geocode(address)
Boise_lat = location.latitude
Boise_lon = location.longitude

In [92]:
map_Boise = folium.Map(location=[Boise_lat, Boise_lon], zoom_start=12)

for lat, lng, neigh in zip(df['Latitude'], df['Longitude'], df['Neighbourhoods']):
    label = '{}'.format(neigh)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3183cc',
        fill_opacity=0.3,
        parse_html=False).add_to(map_Boise)  
    
map_Boise

<H3> Obtaining venue info from Foursquare API

Create FSQ credentials:

In [119]:
CLIENT_ID = 'QILVY4PUK3QRAARVVJ4SOEFLHPZFJD1MBMAJZMBEHJIDW02D' # your Foursquare ID
CLIENT_SECRET = 'PZO2PIUOIAD3PKXOXRDKHOBUTTLOITSQW2L3TPBQ0YBXT153' # your Foursquare Secret
ACCESS_TOKEN = '20ES5WGNKRWBSJ5U5RWZFS3NW310OSB00FDN535TL2TCGTY5' # your FourSquare Access Token
VERSION = '20180604'
LIMIT = 100

Taking the first neighbourhood as an example and explore how the data is structured from FSQ API:

In [120]:
neighbourhood_name = df.loc[0, 'Neighbourhoods']
neighbourhood_lat = df.loc[0, 'Latitude']
neighbourhood_lon = df.loc[0, 'Longitude']
neighbourhood_name

'South Eisenman'

Find 100 venues in 1km range:

In [122]:
LIMIT = 100
radius = 1000

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighbourhood_lat, 
    neighbourhood_lon, 
    radius, 
    LIMIT)

results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '60fb6f590893ce40516453fe'},
 'response': {'headerLocation': 'Orchard',
  'headerFullLocation': 'Orchard',
  'headerLocationGranularity': 'city',
  'totalResults': 4,
  'suggestedBounds': {'ne': {'lat': 43.53199839960722,
    'lng': -116.13983409170933},
   'sw': {'lat': 43.513998381607195, 'lng': -116.1646119971671}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '5b9ae3bc4ac28a002c4dc078',
       'name': 'Peak Thermo King – Boise',
       'location': {'address': '8303 S Federal Way',
        'lat': 43.52657161396114,
        'lng': -116.14999091038135,
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.52657161396114,
          'lng': -116.14999091038135}],
        'distance': 436,
        'postalCode': '83716',
   

Define the get_category_type function from the course:

In [123]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

Then clean the JSON object (pick the relavant data) and store data in a dataframe:

In [124]:
venues = results['response']['groups'][0]['items']
    
venues_df = pd.json_normalize(venues)

needed_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
venues_df = venues_df.loc[:, needed_columns]

venues_df['venue.categories'] = venues_df.apply(get_category_type, axis=1)

venues_df.columns = [col.split(".")[-1] for col in venues_df.columns]

venues_df

Unnamed: 0,name,categories,lat,lng
0,Peak Thermo King – Boise,Home Service,43.526572,-116.149991
1,Cummins Sales and Service,Automotive Shop,43.520874,-116.147346
2,Eurest dining,Café,43.525227,-116.145209
3,Micron 17C Cafeteria,Cafeteria,43.530095,-116.149415


We can see this is not a very good location as it only has 4 venues in total, 2 out of which are cafeteria, but we got an idea of how the data is structured overall and verified how to obtain relavant data. Now we can repeat it for every neighbourhood in the list:

In [126]:
def get_venues(names, lats, lons, radius=1000):
    
    venues_list = []
    
    for name, lat, lon in zip(names, lats, lons):
            
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        results = requests.get(url).json()['response']['groups'][0]['items']
        
        venues_list.append([(
            name, 
            lat, 
            lng, 
            item['venue']['name'],
            item['venue']['categories'][0]['name'],
            item['venue']['location']['lat'], 
            item['venue']['location']['lng']) for item in results])

    nearby_venues = pd.DataFrame([item for venue in venues_list for item in venue])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Lat', 
                  'Neighbourhood Lon', 
                  'Venue', 
                  'Venue Category', 
                  'Venue Lat', 
                  'Venue Lon']
    
    return(nearby_venues)

In [127]:
Boise_venues = get_venues(df['Neighbourhoods'],df['Latitude'],df['Longitude'],radius = 1000)

In [128]:
Boise_venues

Unnamed: 0,Neighbourhood,Neighbourhood Lat,Neighbourhood Lon,Venue,Venue Category,Venue Lat,Venue Lon
0,South Eisenman,43.522998,-116.26159,World Center for Birds of Prey,Zoo,43.516729,-116.255983
1,South Eisenman,43.522998,-116.26159,El Rancho del Crabtreeio's,Scenic Lookout,43.522603,-116.251302
2,Barber Valley,43.575474,-116.26159,Mad Swede Brewing Company,Brewery,43.577719,-116.273188
3,Barber Valley,43.575474,-116.26159,Maverik Adventures First Stop,Gas Station,43.575330,-116.273377
4,Barber Valley,43.575474,-116.26159,Sherwin-Williams Floorcovering Store,Hardware Store,43.578249,-116.272041
...,...,...,...,...,...,...,...
586,Winstead Park,43.626460,-116.26159,The Original Sunrise Cafe,Breakfast Spot,43.617996,-116.264910
587,Winstead Park,43.626460,-116.26159,Over 19,Adult Boutique,43.630121,-116.250441
588,Winstead Park,43.626460,-116.26159,State Liquor Store Garden City-111,Liquor Store,43.632321,-116.252388
589,Winstead Park,43.626460,-116.26159,Loose Screw Beer Co.,Brewery,43.633308,-116.253609


In [135]:
Boise_venues['Venue Category'].value_counts()

Gas Station             33
Brewery                 27
Sandwich Place          25
Coffee Shop             23
Fast Food Restaurant    22
                        ..
Locksmith                1
Bookstore                1
Cuban Restaurant         1
Scenic Lookout           1
Food & Drink Shop        1
Name: Venue Category, Length: 102, dtype: int64

Check how many venues were returned for each neighbourhood to get a general idea of which neighbourhoods are more developed:

In [140]:
Boise_venues.groupby('Neighbourhood').Venue.count()

Neighbourhood
Barber Valley                    10
Boise Heights                    16
Borah                            29
Centennial                        1
Central Bench                    26
Central Foothills                26
Central Rim                      44
Collister                        15
Depot Bench                      13
Downtown Boise                   43
East End                         15
Glenwood Rim                     25
Highlands                        21
Hillcrest                         3
Liberty Park                     14
Lusk District                    10
Morris Hill                      14
North End                        15
North West                        1
Pierce Park                       5
South Boise Village              31
South Cole                        1
South Eisenman                    2
Southeast Boise                   8
Southwest Ada County Alliance     3
Sunset                           12
Veteran's Park                   16
Vista         

Filter neighbourhoods with cafe:

In [145]:
food = Boise_venues['Venue Category'].str.contains('Restaurant|Bar|Café|Cafe|Pub|Cuisine', case=False, regex=True)
Boise_foodvenues = Boise_venues[food].reset_index(drop=True)

Boise_foodvenues['Neighbourhood'].value_counts()

Warm Springs Mesa                9
South Boise Village              9
Downtown Boise                   7
Vista                            7
Central Rim                      7
Borah                            6
Central Bench                    6
Central Foothills                5
West Downtown                    5
Glenwood Rim                     5
Winstead Park                    5
Collister                        4
Morris Hill                      3
East End                         3
West Bench                       3
Boise Heights                    3
Liberty Park                     3
North End                        2
Highlands                        2
Depot Bench                      2
Veteran's Park                   2
West End                         2
Southeast Boise                  1
Southwest Ada County Alliance    1
Lusk District                    1
Hillcrest                        1
Name: Neighbourhood, dtype: int64

We can see that the top neighbourhoods with a lot of resturants are Warm Spring Mesa, South Boise Village, Downtown Boise etc. 

<H3> Run KMeans to cluster the neighbourhoods and find out which neighbourhoods have more restaurants. 

First need to do one-hot encoding:

In [160]:
Boise_onehot = pd.get_dummies(Boise_venues['Venue Category'])
Boise_onehot['Neighbourhood'] = Boise_venues['Neighbourhood']

columns = ['Neighbourhood'] + list(Boise_onehot.columns[0:-1])
Boise_onehot = Boise_onehot[columns]

Boise_onehot

Unnamed: 0,Neighbourhood,ATM,Accessories Store,Adult Boutique,Airport,Alternative Healer,American Restaurant,Aquarium,Arcade,Arts & Crafts Store,...,Sushi Restaurant,Taco Place,Thai Restaurant,Thrift / Vintage Store,Toy / Game Store,Trail,Video Store,Waste Facility,Wine Bar,Zoo
0,South Eisenman,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
1,South Eisenman,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Barber Valley,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Barber Valley,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Barber Valley,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
586,Winstead Park,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
587,Winstead Park,0,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
588,Winstead Park,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
589,Winstead Park,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Then group by neighbourhood and take its mean:

In [162]:
Boise_onehot_grouped = Boise_onehot.groupby('Neighbourhood').mean().reset_index()
Boise_onehot_grouped.head()

Unnamed: 0,Neighbourhood,ATM,Accessories Store,Adult Boutique,Airport,Alternative Healer,American Restaurant,Aquarium,Arcade,Arts & Crafts Store,...,Sushi Restaurant,Taco Place,Thai Restaurant,Thrift / Vintage Store,Toy / Game Store,Trail,Video Store,Waste Facility,Wine Bar,Zoo
0,Barber Valley,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Boise Heights,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0
2,Borah,0.034483,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,...,0.034483,0.034483,0.034483,0.034483,0.034483,0.0,0.0,0.0,0.0,0.0
3,Centennial,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Central Bench,0.038462,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,...,0.038462,0.038462,0.038462,0.038462,0.038462,0.0,0.0,0.0,0.0,0.0


Sort the venues in descending order:

In [165]:
# method to sort venues
def return_most_common_venues(row, num_top_venues=10):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Create a new dataframe to display the top 10 venues for each neighbourhood

In [187]:
indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for i in np.arange(10):
    try:
        columns.append('{}{} Most Common Venue'.format(i+1, indicators[i]))
    except:
        columns.append('{}th Most Common Venue'.format(i+1))

# create a new dataframe
Boise_venues_sorted = pd.DataFrame(columns=columns)
Boise_venues_sorted['Neighbourhood'] = Boise_onehot_grouped['Neighbourhood']

for i in np.arange(Boise_onehot_grouped.shape[0]):
    Boise_venues_sorted.iloc[i, 1:] = return_most_common_venues(Boise_onehot_grouped.iloc[i, :], 10)

Boise_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Barber Valley,Construction & Landscaping,Hardware Store,Airport,Moving Target,Gas Station,Business Service,Automotive Shop,Brewery,Zoo,Food
1,Boise Heights,Smoke Shop,Gas Station,Lingerie Store,Pet Store,Fast Food Restaurant,Sandwich Place,Liquor Store,Dive Bar,Food Truck,Brewery
2,Borah,Fast Food Restaurant,Cosmetics Shop,Sandwich Place,ATM,Paintball Field,Gas Station,Clothing Store,Credit Union,Middle Eastern Restaurant,Ice Cream Shop
3,Centennial,Ice Cream Shop,Zoo,Food Truck,Department Store,Diner,Discount Store,Dive Bar,Eastern European Restaurant,Eye Doctor,Farmers Market
4,Central Bench,Fast Food Restaurant,Sandwich Place,ATM,Auto Garage,Gym,Gym / Fitness Center,Fruit & Vegetable Store,IT Services,Ice Cream Shop,Middle Eastern Restaurant


Now we are ready to run KMeans with 5 clusters:

In [188]:
kclusters = 5

Boise_onehot_clustering = Boise_onehot_grouped.drop('Neighbourhood',1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=42).fit(Boise_onehot_clustering)

# add clustering labels
Boise_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

Boise_venues_sorted

Unnamed: 0,Cluster Labels,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,4,Barber Valley,Construction & Landscaping,Hardware Store,Airport,Moving Target,Gas Station,Business Service,Automotive Shop,Brewery,Zoo,Food
1,0,Boise Heights,Smoke Shop,Gas Station,Lingerie Store,Pet Store,Fast Food Restaurant,Sandwich Place,Liquor Store,Dive Bar,Food Truck,Brewery
2,0,Borah,Fast Food Restaurant,Cosmetics Shop,Sandwich Place,ATM,Paintball Field,Gas Station,Clothing Store,Credit Union,Middle Eastern Restaurant,Ice Cream Shop
3,3,Centennial,Ice Cream Shop,Zoo,Food Truck,Department Store,Diner,Discount Store,Dive Bar,Eastern European Restaurant,Eye Doctor,Farmers Market
4,0,Central Bench,Fast Food Restaurant,Sandwich Place,ATM,Auto Garage,Gym,Gym / Fitness Center,Fruit & Vegetable Store,IT Services,Ice Cream Shop,Middle Eastern Restaurant
5,4,Central Foothills,Coffee Shop,Brewery,Gym,Bowling Alley,Harbor / Marina,Home Service,Juice Bar,Motorsports Shop,Nightclub,Greek Restaurant
6,0,Central Rim,Gas Station,Pizza Place,Fast Food Restaurant,Pool Hall,Sandwich Place,Automotive Shop,Burger Joint,Mexican Restaurant,Furniture / Home Store,Burrito Place
7,4,Collister,Gym,Coffee Shop,Harbor / Marina,Cosmetics Shop,Nightclub,Juice Bar,Automotive Shop,Motorsports Shop,Pet Store,American Restaurant
8,0,Depot Bench,Cosmetics Shop,Spa,Gas Station,IT Services,Ice Cream Shop,Fruit & Vegetable Store,Middle Eastern Restaurant,Diner,Park,Clothing Store
9,0,Downtown Boise,Pizza Place,Gas Station,Automotive Shop,Burger Joint,Pool Hall,Sandwich Place,Fast Food Restaurant,Hobby Shop,Furniture / Home Store,Mexican Restaurant


Examine each cluster to see if it's suitable for restaurants: 

In [189]:
Boise_venues_sorted[Boise_venues_sorted['Cluster Labels'] == 0]

Unnamed: 0,Cluster Labels,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,0,Boise Heights,Smoke Shop,Gas Station,Lingerie Store,Pet Store,Fast Food Restaurant,Sandwich Place,Liquor Store,Dive Bar,Food Truck,Brewery
2,0,Borah,Fast Food Restaurant,Cosmetics Shop,Sandwich Place,ATM,Paintball Field,Gas Station,Clothing Store,Credit Union,Middle Eastern Restaurant,Ice Cream Shop
4,0,Central Bench,Fast Food Restaurant,Sandwich Place,ATM,Auto Garage,Gym,Gym / Fitness Center,Fruit & Vegetable Store,IT Services,Ice Cream Shop,Middle Eastern Restaurant
6,0,Central Rim,Gas Station,Pizza Place,Fast Food Restaurant,Pool Hall,Sandwich Place,Automotive Shop,Burger Joint,Mexican Restaurant,Furniture / Home Store,Burrito Place
8,0,Depot Bench,Cosmetics Shop,Spa,Gas Station,IT Services,Ice Cream Shop,Fruit & Vegetable Store,Middle Eastern Restaurant,Diner,Park,Clothing Store
9,0,Downtown Boise,Pizza Place,Gas Station,Automotive Shop,Burger Joint,Pool Hall,Sandwich Place,Fast Food Restaurant,Hobby Shop,Furniture / Home Store,Mexican Restaurant
10,0,East End,Gas Station,Cosmetics Shop,Fruit & Vegetable Store,Clothing Store,Eye Doctor,Middle Eastern Restaurant,Fast Food Restaurant,Sandwich Place,Spa,Bar
14,0,Liberty Park,Gas Station,Cosmetics Shop,Fruit & Vegetable Store,Clothing Store,Eye Doctor,Middle Eastern Restaurant,Fast Food Restaurant,Sandwich Place,Spa,Bar
15,0,Lusk District,Cosmetics Shop,Aquarium,Middle Eastern Restaurant,Spa,Fruit & Vegetable Store,Clothing Store,Sandwich Place,Gas Station,Ice Cream Shop,Hotel
16,0,Morris Hill,Gas Station,Cosmetics Shop,Fruit & Vegetable Store,Clothing Store,Eye Doctor,Middle Eastern Restaurant,Fast Food Restaurant,Sandwich Place,Spa,Bar


In [190]:
Boise_venues_sorted[Boise_venues_sorted['Cluster Labels'] == 1]

Unnamed: 0,Cluster Labels,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
18,1,North West,Trail,Zoo,Food & Drink Shop,Deli / Bodega,Department Store,Diner,Discount Store,Dive Bar,Eastern European Restaurant,Eye Doctor


In [191]:
Boise_venues_sorted[Boise_venues_sorted['Cluster Labels'] == 2]

Unnamed: 0,Cluster Labels,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
21,2,South Cole,River,Zoo,Credit Union,Deli / Bodega,Department Store,Diner,Discount Store,Dive Bar,Eastern European Restaurant,Eye Doctor


In [192]:
Boise_venues_sorted[Boise_venues_sorted['Cluster Labels'] == 3]

Unnamed: 0,Cluster Labels,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,3,Centennial,Ice Cream Shop,Zoo,Food Truck,Department Store,Diner,Discount Store,Dive Bar,Eastern European Restaurant,Eye Doctor,Farmers Market


In [193]:
Boise_venues_sorted[Boise_venues_sorted['Cluster Labels'] == 4]

Unnamed: 0,Cluster Labels,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,4,Barber Valley,Construction & Landscaping,Hardware Store,Airport,Moving Target,Gas Station,Business Service,Automotive Shop,Brewery,Zoo,Food
5,4,Central Foothills,Coffee Shop,Brewery,Gym,Bowling Alley,Harbor / Marina,Home Service,Juice Bar,Motorsports Shop,Nightclub,Greek Restaurant
7,4,Collister,Gym,Coffee Shop,Harbor / Marina,Cosmetics Shop,Nightclub,Juice Bar,Automotive Shop,Motorsports Shop,Pet Store,American Restaurant
11,4,Glenwood Rim,Coffee Shop,Brewery,Gym,Sports Bar,Nightclub,Greek Restaurant,Paper / Office Supplies Store,Pizza Place,Rental Car Location,Salon / Barbershop
12,4,Highlands,Coffee Shop,Brewery,Gym,ATM,Sports Bar,Home Service,Nightclub,Photography Studio,Rental Car Location,Bowling Alley
13,4,Hillcrest,Waste Facility,American Restaurant,Golf Course,Zoo,Food & Drink Shop,Department Store,Diner,Discount Store,Dive Bar,Eastern European Restaurant
23,4,Southeast Boise,Automotive Shop,Hardware Store,Paper / Office Supplies Store,Airport,Playground,Sushi Restaurant,Brewery,Food & Drink Shop,Diner,Discount Store
24,4,Southwest Ada County Alliance,Waste Facility,American Restaurant,Golf Course,Zoo,Food & Drink Shop,Department Store,Diner,Discount Store,Dive Bar,Eastern European Restaurant
25,4,Sunset,Brewery,Coffee Shop,ATM,Accessories Store,Paper / Office Supplies Store,Home Service,Photography Studio,Arts & Crafts Store,Nightclub,Diner
32,4,West Valley,Coffee Shop,Brewery,Paper / Office Supplies Store,Accessories Store,Gas Station,Arts & Crafts Store,Food,Department Store,Diner,Discount Store


<H3> Results and Discussion

By checking the five clusters we obtained above, we can notice that some of the clusters have more food related venues in their top 5 most common venues, while other clusters do not. It appears that neighbourhoods in cluster 0 are the most suitable for opening new restaurants. Finally we merge the coordinates back for plotting map:

In [196]:
Boise_venues_merged = df.copy()
Boise_venues_merged.rename(columns={'Neighbourhoods':'Neighbourhood'}, inplace=True)
Boise_venues_merged = Boise_venues_merged.join(Boise_venues_sorted.set_index('Neighbourhood'), on = 'Neighbourhood')

Boise_venues_merged.head()

Unnamed: 0,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,South Eisenman,43.522998,-116.152223,0,Zoo,Scenic Lookout,Credit Union,Deli / Bodega,Department Store,Diner,Discount Store,Dive Bar,Eastern European Restaurant,Eye Doctor
1,Barber Valley,43.575474,-116.144475,4,Construction & Landscaping,Hardware Store,Airport,Moving Target,Gas Station,Business Service,Automotive Shop,Brewery,Zoo,Food
2,Borah,43.59683,-116.27417,0,Fast Food Restaurant,Cosmetics Shop,Sandwich Place,ATM,Paintball Field,Gas Station,Clothing Store,Credit Union,Middle Eastern Restaurant,Ice Cream Shop
3,Central Bench,43.5971,-116.24423,0,Fast Food Restaurant,Sandwich Place,ATM,Auto Garage,Gym,Gym / Fitness Center,Fruit & Vegetable Store,IT Services,Ice Cream Shop,Middle Eastern Restaurant
4,Central Rim,43.61473,-116.23597,0,Gas Station,Pizza Place,Fast Food Restaurant,Pool Hall,Sandwich Place,Automotive Shop,Burger Joint,Mexican Restaurant,Furniture / Home Store,Burrito Place


In [197]:

# create map
map_clusters = folium.Map(location=[df["Latitude"][0], df["Longitude"][0]], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Boise_venues_merged['Latitude'], Boise_venues_merged['Longitude'], Boise_venues_merged['Neighbourhood'], Boise_venues_merged['Cluster Labels'].astype(int)):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters