# Where to open a restaurant in Glasgow - Battle of Neighbourhoods - Week1

## Introduction / Business Problem 

<p>Glasgow is the largest and most populous city in Scotland. </p>

<p>It is known for its vibrant character and is welcoming to new businesses. Often described as a business capital of Scotland, it attracts a steady flow of tourists and business visitors. </p>

## Target Audience

<p> The target audience for this report is restaurateurs who are looking for a suitable place to open a restaurant in one of Glasgow's constituencies.</p>

<p>Restaurants are not evenly distributed across the constituencies. This report therefore looks at where restaurants are located in Glasgow, what the ratio of restaurants to population is, and which constituency would be the better place to open a new restaurant. </p>

<img src="https://kali-capstone-assignment.s3.eu-gb.cloud-object-storage.appdomain.cloud/Glasgow.PNG" alt="Glasgow"></img>

# Data Section

<p> I found the postcodes of Scotland in CSV format at the following location: </p>
<a href="https://www.doogal.co.uk/PostcodeDownloads.php">Postal Codes in UK</a>

<p> I have uploaded the CSV to IBM Object Storage, as I am going to use IBM Watson Studio for this exercise. </p>
<p> Here is the link to the file in Object Storage: </p>
<a href="https://kali-capstone-assignment.s3.eu-gb.cloud-object-storage.appdomain.cloud/scotland.csv">https://kali-capstone-assignment.s3.eu-gb.cloud-object-storage.appdomain.cloud/scotland.csv</a>


<p> Let's explore the data and see how it will be used for my purposes. I start by loading the data into a pandas DataFrame. </p>

In [2]:
import pandas as pd
import numpy as np

In [1]:
!conda install -c conda-forge geopy --yes  
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes  
import folium # map rendering library

print('Libraries imported.')


In [3]:
scotland_df = pd.read_csv("https://kali-capstone-assignment.s3.eu-gb.cloud-object-storage.appdomain.cloud/scotland.csv")



### Print the data to see how it appears

In [4]:
scotland_df.head(5)

Unnamed: 0,Postcode,In Use?,Latitude,Longitude,Easting,Northing,Grid Ref,County,District,Ward,...,User Type,Last updated,Nearest station,Distance to station,Postcode area,Postcode district,Police force,Water company,Plus Code,Average Income
0,AB1 0AA,No,57.101474,-2.242851,385386.0,801193.0,NJ853011,,Aberdeen City,Lower Deeside,...,0,2020-02-19,Portlethen,8.31408,AB,AB1,Scotland,Scottish Water,9C9V4Q24+HV,
1,AB1 0AB,No,57.102554,-2.246308,385177.0,801314.0,NJ851013,,Aberdeen City,Lower Deeside,...,0,2020-02-19,Portlethen,8.55457,AB,AB1,Scotland,Scottish Water,9C9V4Q33+2F,
2,AB1 0AD,No,57.100556,-2.248342,385053.0,801092.0,NJ850010,,Aberdeen City,Lower Deeside,...,0,2020-02-19,Portlethen,8.54352,AB,AB1,Scotland,Scottish Water,9C9V4Q22+6M,
3,AB1 0AE,No,57.084444,-2.255708,384600.0,799300.0,NO845992,,Aberdeenshire,North Kincardine,...,0,2020-02-19,Portlethen,8.20809,AB,AB1,Scotland,Scottish Water,9C9V3PMV+QP,
4,AB1 0AF,No,57.096656,-2.258102,384460.0,800660.0,NJ844006,,Aberdeen City,Lower Deeside,...,1,2020-02-19,Portlethen,8.85583,AB,AB1,Scotland,Scottish Water,9C9V3PWR+MQ,


In [80]:
scotland_df.shape

(224804, 47)
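
The frame has 47 columns, but only a handful are used in the steps that follow. A minimal sketch, assuming you prefer to load just those columns with the `usecols` argument of `pd.read_csv`; the rows in `sample_csv` are made up for illustration and stand in for the real file.

```python
import io
import pandas as pd

# Made-up two-row stand-in for scotland.csv (the real file has 47 columns)
sample_csv = io.StringIO(
    "Postcode,In Use?,Latitude,Longitude,District,Ward,Constituency,Population,Average Income,Water company\n"
    "G1 1AA,Yes,55.8609,-4.2488,Glasgow City,Anderston/City/Yorkhill,Glasgow Central,120,,Scottish Water\n"
    "AB1 0AA,No,57.1014,-2.2428,Aberdeen City,Lower Deeside,Aberdeen South,,,Scottish Water\n"
)

# Columns that the later cells of this notebook read
needed = ["In Use?", "Latitude", "Longitude", "District", "Ward",
          "Constituency", "Population", "Average Income"]

# usecols skips every column that is not listed
slim_df = pd.read_csv(sample_csv, usecols=needed)
print(slim_df.shape)
```

With the real file, the same call would take the object storage URL in place of `sample_csv`.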

#### As the data contains very granular detail, we need to narrow it down further

<ul>First, extract the Glasgow-only rows from the Scotland dataset</ul>
<ul>This is done by filtering on the District column, where Glasgow appears as 'Glasgow City'</ul>
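
To see which values the District column holds before filtering, something like the following could be run. This is a sketch on a made-up four-row frame (`sample_df` is a stand-in for `scotland_df`), not output from the real data.

```python
import pandas as pd

# Made-up rows standing in for scotland_df
sample_df = pd.DataFrame({
    "Postcode": ["G1 1AA", "G2 2BB", "AB1 0AA", "EH1 1AA"],
    "District": ["Glasgow City", "Glasgow City", "Aberdeen City", "City of Edinburgh"],
})

# Distinct district names in alphabetical order
districts = sorted(sample_df["District"].dropna().unique())
print(districts)

# Number of postcode rows per district
district_counts = sample_df["District"].value_counts()
print(district_counts)
```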

##### Let's obtain the dataframe for Glasgow only

In [7]:
glasgow_only = scotland_df[scotland_df['District'] == 'Glasgow City' ]

###### Further data processing steps

<ul> Keep only the rows whose "In Use?" value is Yes </ul>
<ul> Take only the columns Ward, Constituency, Latitude, Longitude, Population and Average Income </ul> 
<ul> Group them by ward and create a new data frame that holds the aggregated values </ul>

In [10]:
glasgow_only_active = glasgow_only[ glasgow_only["In Use?"] == 'Yes']
glasgow_only_active.shape

(15413, 47)

In [12]:
glasgow_working_df = glasgow_only_active[["Ward","Constituency","Latitude","Longitude", "Population", "Average Income"]].reset_index()

In [13]:
glasgow_cons_population = glasgow_working_df.groupby("Ward")["Population"].sum().reset_index()

In [14]:
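# NOTE: the maximum latitude of a ward's postcodes is its northern edge, not its centre
glasgow_cons_latitude = glasgow_working_df.groupby("Ward")["Latitude"].max().reset_index()

In [15]:
# NOTE: the minimum longitude of a ward's postcodes is its western edge, not its centre
glasgow_cons_longitude = glasgow_working_df.groupby("Ward")["Longitude"].min().reset_index()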

In [16]:
glasgow_cons_latlong = pd.merge(glasgow_cons_latitude, glasgow_cons_longitude, on="Ward")

In [17]:
glasgow_df = pd.merge(glasgow_cons_latlong, glasgow_cons_population, on="Ward")

In [18]:
glasgow_df.head(10)

Unnamed: 0,Ward,Latitude,Longitude,Population
0,Anderston/City/Yorkhill,55.869576,-4.306318,28099.0
1,Baillieston,55.867866,-4.154606,21526.0
2,Calton,55.860109,-4.245577,22806.0
3,Canal,55.92641,-4.294789,26364.0
4,Cardonald,55.860704,-4.379913,29876.0
5,Dennistoun,55.877318,-4.24014,20024.0
6,Drumchapel/Anniesland,55.918846,-4.387423,28598.0
7,East Centre,55.87341,-4.198915,26529.0
8,Garscadden/Scotstounhill,55.900029,-4.385637,30088.0
9,Govan,55.868672,-4.352455,24722.0


In [81]:
glasgow_df.shape

(23, 4)
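
The coordinates above take the maximum latitude and minimum longitude of each ward, which places the point at the north-west corner of the ward's postcodes. A possible alternative, shown here only as a sketch, is to use the mean of the postcode coordinates as an approximate centre and to do all three aggregations in one `groupby(...).agg(...)` call (named aggregation, available from pandas 0.25). The frame `postcodes` is made up and stands in for `glasgow_working_df`.

```python
import pandas as pd

# Made-up postcode rows for two wards (stand-in for glasgow_working_df)
postcodes = pd.DataFrame({
    "Ward": ["Govan", "Govan", "Linn", "Linn"],
    "Latitude": [55.86, 55.84, 55.80, 55.78],
    "Longitude": [-4.32, -4.30, -4.24, -4.22],
    "Population": [100.0, 300.0, 50.0, 150.0],
})

# One pass: mean coordinates as an approximate centre, summed population
ward_summary = (
    postcodes.groupby("Ward")
    .agg(Latitude=("Latitude", "mean"),
         Longitude=("Longitude", "mean"),
         Population=("Population", "sum"))
    .reset_index()
)
print(ward_summary)
```

Switching to the mean would move every search centre, so the venue counts and clusters later in the notebook would change as well.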

#### This is the data that will be used with the Foursquare API to explore the venues around each of these wards. The wards are then clustered by their venue mix to see how highly restaurants rank in each of them, which helps decide which areas, if any, are good candidates for opening a restaurant 

### This is the End of Week 1 submission for Battle of Neighbourhoods - Capstone Project

In [19]:

address = 'Glasgow'

geolocator = Nominatim(user_agent="glasgow_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Glasgow are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Glasgow are 55.8609825, -4.2488787.


In [21]:
# create map of Glasgow using latitude and longitude values
map_glasgow = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to maps
for lat, lng, cons, popl in zip(glasgow_df['Latitude'], glasgow_df['Longitude'], glasgow_df['Ward'], glasgow_df['Population']):
    label = '{},Population={}'.format(cons, popl)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_glasgow)  
    
map_glasgow

In [29]:
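import os
import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Credentials are read from the environment instead of being hard-coded in the notebook;
# the variable names FOURSQUARE_CLIENT_ID / FOURSQUARE_CLIENT_SECRET are an assumption
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100
radius = 3000 # not passed to getNearbyVenues below, which falls back to its default of 500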

In [30]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request and keep the list of recommended items
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Ward', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
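
The function indexes `['groups'][0]['items']` and `['categories'][0]` directly, so a response without groups or a venue without categories would raise an error. Below is a sketch of a more defensive parser. `extract_venues` is a hypothetical helper, and `fake_payload` is hand-written to mirror only the keys the function above reads; it is not real API output.

```python
def extract_venues(payload, ward, ward_lat, ward_lng):
    """Flatten an explore-style response dict into one tuple per venue."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories") or []
        # Fall back to None when a venue carries no category
        category = categories[0]["name"] if categories else None
        location = venue.get("location", {})
        rows.append((ward, ward_lat, ward_lng, venue.get("name"),
                     location.get("lat"), location.get("lng"), category))
    return rows


# Hand-written payload for illustration only
fake_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Example Cafe",
               "location": {"lat": 55.86, "lng": -4.25},
               "categories": [{"name": "Café"}]}},
    {"venue": {"name": "Uncategorised Place",
               "location": {"lat": 55.87, "lng": -4.26},
               "categories": []}},
]}]}}

rows = extract_venues(fake_payload, "Calton", 55.860109, -4.245577)
empty = extract_venues({"response": {}}, "Canal", 55.92641, -4.294789)
print(len(rows), len(empty))
```

Keeping the parsing separate from the HTTP call also makes it possible to test it without network access or credentials.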

In [31]:
glasgow_venues = getNearbyVenues(names=glasgow_df['Ward'],
                                   latitudes=glasgow_df['Latitude'],
                                   longitudes=glasgow_df['Longitude']
                                  )

Anderston/City/Yorkhill
Baillieston
Calton
Canal
Cardonald
Dennistoun
Drumchapel/Anniesland
East Centre
Garscadden/Scotstounhill
Govan
Greater Pollok
Hillhead
Langside
Linn
Maryhill
Newlands/Auldburn
North East
Partick East/Kelvindale
Pollokshields
Shettleston
Southside Central
Springburn/Robroyston
Victoria Park


In [32]:
glasgow_venues.groupby('Ward').count()

Ward,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Anderston/City/Yorkhill,32,32,32,32,32,32
Calton,55,55,55,55,55,55
Cardonald,1,1,1,1,1,1
Dennistoun,2,2,2,2,2,2
East Centre,1,1,1,1,1,1
Garscadden/Scotstounhill,2,2,2,2,2,2
Govan,10,10,10,10,10,10
Hillhead,6,6,6,6,6,6
Langside,2,2,2,2,2,2
Linn,4,4,4,4,4,4
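
A ward for which the request returns no venues contributes no rows to `glasgow_venues`, so it drops out of this count and of everything derived from it. Baillieston and Canal, for example, show empty cluster labels after the merge further down. A sketch of how such wards could be listed explicitly, using made-up stand-ins (`wards_df`, `venues_df`) for `glasgow_df` and `glasgow_venues`:

```python
import pandas as pd

# Made-up stand-ins for glasgow_df and glasgow_venues
wards_df = pd.DataFrame({"Ward": ["Baillieston", "Calton", "Canal", "Govan"]})
venues_df = pd.DataFrame({
    "Ward": ["Calton", "Calton", "Govan"],
    "Venue": ["Venue A", "Venue B", "Venue C"],
})

# Count venues per ward, keeping wards without any venue as 0
venue_counts = (
    venues_df.groupby("Ward").size()
    .reindex(wards_df["Ward"], fill_value=0)
)

# Wards that returned nothing within the search radius
missing_wards = venue_counts[venue_counts == 0].index.tolist()
print(missing_wards)
```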


In [33]:
print('There are {} unique categories.'.format(len(glasgow_venues['Venue Category'].unique())))

There are 81 unique categories.


In [34]:
# one hot encoding
glasgow_onehot = pd.get_dummies(glasgow_venues[['Venue Category']], prefix="", prefix_sep="")

# add ward column back to dataframe
glasgow_onehot['Ward'] = glasgow_venues['Ward'] 

# move ward column to the first column
fixed_columns = [glasgow_onehot.columns[-1]] + list(glasgow_onehot.columns[:-1])
glasgow_onehot = glasgow_onehot[fixed_columns]

glasgow_onehot.head()

Unnamed: 0,Ward,American Restaurant,Athletics & Sports,Auto Garage,Bakery,Bank,Bar,Beer Bar,Bistro,Bowling Alley,...,Tapas Restaurant,Tea Room,Tennis Court,Theater,Track,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Warehouse Store,Whisky Bar
0,Anderston/City/Yorkhill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Anderston/City/Yorkhill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Anderston/City/Yorkhill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
3,Anderston/City/Yorkhill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Anderston/City/Yorkhill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
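
Each row of the one-hot frame is a single venue, with a 1 in the column of its category. Averaging these columns per ward, as the next cell does, therefore gives each category's share of the ward's venues. A toy illustration with made-up venues:

```python
import pandas as pd

# Made-up venues: four in one ward, one in another
toy_venues = pd.DataFrame({
    "Ward": ["Calton", "Calton", "Calton", "Calton", "Govan"],
    "Venue Category": ["Bar", "Café", "Bar", "Pub", "Hotel"],
})

# One indicator column per category, one row per venue
toy_onehot = pd.get_dummies(toy_venues["Venue Category"], dtype=int)
toy_onehot.insert(0, "Ward", toy_venues["Ward"])

# The mean of 0/1 indicators is the fraction of venues in that category
toy_freq = toy_onehot.groupby("Ward").mean()
print(toy_freq)
```

In a ward with a single venue, as with Govan here, one category gets frequency 1.0 and every other category is 0, so any ranking below first place is a tie between zeros.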


In [35]:
glasgow_grouped = glasgow_onehot.groupby('Ward').mean().reset_index()

num_top_venues = 5

for hood in glasgow_grouped['Ward']:
    print("----"+hood+"----")
    temp = glasgow_grouped[glasgow_grouped['Ward'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Anderston/City/Yorkhill----
         venue  freq
0         Café  0.12
1          Pub  0.06
2   Restaurant  0.06
3  Supermarket  0.06
4     Pharmacy  0.06


----Calton----
                venue  freq
0         Coffee Shop  0.09
1                 Bar  0.07
2        Cocktail Bar  0.07
3                Café  0.07
4  Seafood Restaurant  0.07


----Cardonald----
                   venue  freq
0          Go Kart Track   1.0
1                  River   0.0
2            Record Shop   0.0
3                    Pub   0.0
4  Portuguese Restaurant   0.0


----Dennistoun----
               venue  freq
0  Convenience Store   0.5
1             Bakery   0.5
2               Pier   0.0
3              River   0.0
4         Restaurant   0.0


----East Centre----
                 venue  freq
0    Indian Restaurant   1.0
1  American Restaurant   0.0
2             Pharmacy   0.0
3          Record Shop   0.0
4                  Pub   0.0


----Garscadden/Scotstounhill----
           venue  freq
0       Pharma

In [36]:
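def return_most_common_venues(row, num_top_venues):
    """Return the num_top_venues category names with the highest frequency in a ward's row."""
    row_categories = row.iloc[1:]  # skip the leading 'Ward' column
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]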

In [61]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Ward']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond 3rd use the generic 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Ward'] = glasgow_grouped['Ward']

for ind in np.arange(glasgow_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(glasgow_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Ward,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Anderston/City/Yorkhill,Café,Discount Store,Chinese Restaurant,Restaurant,Pharmacy,Supermarket,Pub,Bar,Outdoor Supply Store,Sandwich Place
1,Calton,Coffee Shop,Café,Cocktail Bar,Seafood Restaurant,Bar,Pub,Italian Restaurant,Plaza,Hotel,Restaurant
2,Cardonald,Go Kart Track,Whisky Bar,Garden Center,Deli / Bodega,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Furniture / Home Store
3,Dennistoun,Bakery,Convenience Store,Whisky Bar,Gas Station,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Furniture / Home Store
4,East Centre,Indian Restaurant,Italian Restaurant,Deli / Bodega,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Furniture / Home Store,Garden Center


### Cluster constituencies

In [62]:
# set number of clusters
kclusters = 5

glasgow_grouped_clustering = glasgow_grouped.drop(columns='Ward')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(glasgow_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 2, 3, 4, 3, 1, 1, 1, 1], dtype=int32)

In [63]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

glasgow_merged = glasgow_df

# merge the cluster labels and venue rankings into glasgow_df, which holds latitude/longitude for each ward
glasgow_merged = glasgow_merged.join(neighborhoods_venues_sorted.set_index('Ward'), on='Ward')

glasgow_merged.head() # check the last columns!

Unnamed: 0,Ward,Latitude,Longitude,Population,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Anderston/City/Yorkhill,55.869576,-4.306318,28099.0,1.0,Café,Discount Store,Chinese Restaurant,Restaurant,Pharmacy,Supermarket,Pub,Bar,Outdoor Supply Store,Sandwich Place
1,Baillieston,55.867866,-4.154606,21526.0,,,,,,,,,,,
2,Calton,55.860109,-4.245577,22806.0,1.0,Coffee Shop,Café,Cocktail Bar,Seafood Restaurant,Bar,Pub,Italian Restaurant,Plaza,Hotel,Restaurant
3,Canal,55.92641,-4.294789,26364.0,,,,,,,,,,,
4,Cardonald,55.860704,-4.379913,29876.0,2.0,Go Kart Track,Whisky Bar,Garden Center,Deli / Bodega,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Furniture / Home Store


In [64]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(glasgow_merged['Latitude'], glasgow_merged['Longitude'], glasgow_merged['Ward'], glasgow_merged['Cluster Labels']):
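    if np.isnan(cluster):
        # wards without venues were never clustered; draw them in grey
        # instead of relabelling them as cluster 0
        cluster_text, marker_color = 'no cluster', '#808080'
    else:
        int_cluster = int(cluster)
        cluster_text, marker_color = 'Cluster ' + str(int_cluster), rainbow[int_cluster-1]
    label = folium.Popup(str(poi) + ' ' + cluster_text, parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=marker_color,
        fill=True,
        fill_color=marker_color,
        fill_opacity=0.7).add_to(map_clusters)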
       
map_clusters

#### Cluster 1 

In [71]:
glasgow_merged.loc[glasgow_merged['Cluster Labels'] == 0, glasgow_merged.columns[[0] + [3] + list(range(5, glasgow_merged.shape[1]))]]

Unnamed: 0,Ward,Population,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
21,Springburn/Robroyston,25721.0,Chinese Restaurant,Whisky Bar,Convenience Store,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Furniture / Home Store,Garden Center


#### Cluster 2

In [73]:
glasgow_merged.loc[glasgow_merged['Cluster Labels'] == 1, glasgow_merged.columns[[0] + [3] +  list(range(5, glasgow_merged.shape[1]))]]

Unnamed: 0,Ward,Population,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Anderston/City/Yorkhill,28099.0,Café,Discount Store,Chinese Restaurant,Restaurant,Pharmacy,Supermarket,Pub,Bar,Outdoor Supply Store,Sandwich Place
2,Calton,22806.0,Coffee Shop,Café,Cocktail Bar,Seafood Restaurant,Bar,Pub,Italian Restaurant,Plaza,Hotel,Restaurant
9,Govan,24722.0,Hotel,Bistro,Gas Station,Garden Center,Furniture / Home Store,Pier,Scandinavian Restaurant,Restaurant,Auto Garage,Beer Bar
11,Hillhead,22905.0,Whisky Bar,Soccer Field,Convenience Store,Gym,Hotel,Jazz Club,Theater,College Cafeteria,Vegetarian / Vegan Restaurant,Deli / Bodega
12,Langside,28592.0,Supermarket,Train Station,Gas Station,Deli / Bodega,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Furniture / Home Store
13,Linn,29478.0,Train Station,Bakery,Tennis Court,Café,Whisky Bar,Garden Center,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant
14,Maryhill,19718.0,River,Bus Stop,Grocery Store,Park,Deli / Bodega,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop
15,Newlands/Auldburn,22502.0,College Cafeteria,Train Station,Track,Garden Center,Convenience Store,Deli / Bodega,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant
16,North East,19904.0,Pet Store,Café,Park,Whisky Bar,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Furniture / Home Store
17,Partick East/Kelvindale,26887.0,Discount Store,Chinese Restaurant,Pub,Gym / Fitness Center,Auto Garage,Bank,Grocery Store,Electronics Store,Fast Food Restaurant,Supermarket


#### Cluster 3 

In [74]:
glasgow_merged.loc[glasgow_merged['Cluster Labels'] == 2, glasgow_merged.columns[[0] + [3] +  list(range(5, glasgow_merged.shape[1]))]]

Unnamed: 0,Ward,Population,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Cardonald,29876.0,Go Kart Track,Whisky Bar,Garden Center,Deli / Bodega,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Furniture / Home Store


#### Cluster 4 

In [75]:
glasgow_merged.loc[glasgow_merged['Cluster Labels'] == 3, glasgow_merged.columns[[0] + [3] +  list(range(5, glasgow_merged.shape[1]))]]

Unnamed: 0,Ward,Population,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Dennistoun,20024.0,Bakery,Convenience Store,Whisky Bar,Gas Station,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Furniture / Home Store
8,Garscadden/Scotstounhill,30088.0,Moving Target,Pharmacy,Whisky Bar,Garden Center,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Furniture / Home Store
22,Victoria Park,21618.0,Bakery,Golf Course,Pharmacy,Whisky Bar,Garden Center,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop


#### Cluster 5

In [76]:
glasgow_merged.loc[glasgow_merged['Cluster Labels'] == 4, glasgow_merged.columns[[0] + [3] +  list(range(5, glasgow_merged.shape[1]))]]

Unnamed: 0,Ward,Population,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,East Centre,26529.0,Indian Restaurant,Italian Restaurant,Deli / Bodega,Discount Store,Electronics Store,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Furniture / Home Store,Garden Center


## Population of Cluster 2

In [78]:
glasgow_merged.loc[glasgow_merged['Cluster Labels'] == 1, glasgow_merged.columns[[3]]].sum()

Population    322842.0
dtype: float64

## Population of Cluster 4

In [79]:
glasgow_merged.loc[glasgow_merged['Cluster Labels'] == 3, glasgow_merged.columns[[3]]].sum()

Population    71730.0
dtype: float64

## The k-means run assigns the wards to 5 clusters (the number of clusters we chose) 

### Here are the observations based on 5 clusters

<ul><b>Cluster 5 </b> consists of the East Centre ward and is well served with restaurants. Its top 2 venue categories are Indian Restaurant and Italian Restaurant. </ul>
<ul><b>Cluster 1 </b> consists of the Springburn/Robroyston ward and also has its fair share of eateries, at positions 1, 6, 7 and 8 of its most common venues. </ul>

So these two clusters are not a good choice for starting a new restaurant. 

<ul><b>Cluster 3 </b> consists of the Cardonald ward, which has no restaurants in its top 5. It has a population of nearly 30,000 people. </ul>
<ul><b>Cluster 4 </b> consists of the Dennistoun, Garscadden/Scotstounhill and Victoria Park wards, none of which has a restaurant in its top 5. Their combined population is about 71,700 people. </ul>
<ul><b>Cluster 2 </b> consists of various wards and is underserved with restaurants. With the exception of a few wards, none of them has a restaurant in its top 5. Their combined population is about 323,000 people. </ul>
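
One way to make the comparison between wards more concrete is the restaurant-to-population ratio mentioned under Target Audience. The sketch below counts venues whose category contains "Restaurant" and expresses them per 10,000 residents. All numbers and ward names are made up, and `toy_wards` and `toy_venues` are stand-ins for `glasgow_df` and `glasgow_venues`.

```python
import pandas as pd

# Made-up wards and venues for illustration
toy_wards = pd.DataFrame({
    "Ward": ["Ward A", "Ward B", "Ward C"],
    "Population": [20000.0, 30000.0, 25000.0],
})
toy_venues = pd.DataFrame({
    "Ward": ["Ward A", "Ward A", "Ward A", "Ward B", "Ward B"],
    "Venue Category": ["Indian Restaurant", "Café", "Restaurant",
                       "Pub", "Italian Restaurant"],
})

# Count venues whose category name mentions "Restaurant"
is_restaurant = toy_venues["Venue Category"].str.contains("Restaurant")
restaurant_counts = (
    toy_venues[is_restaurant].groupby("Ward").size()
    .reset_index(name="Restaurants")
)

# Left merge keeps wards with no restaurants; fill their count with 0
ratio_df = toy_wards.merge(restaurant_counts, on="Ward", how="left")
ratio_df["Restaurants"] = ratio_df["Restaurants"].fillna(0).astype(int)
ratio_df["Per 10k residents"] = (
    ratio_df["Restaurants"] / ratio_df["Population"] * 10000
)

# Lowest ratio first: the least served wards come out on top
ratio_df = ratio_df.sort_values("Per 10k residents").reset_index(drop=True)
print(ratio_df)
```

The ratio only reflects venues returned within the search radius, so it depends on the radius and on the coordinates chosen for each ward.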

## It is therefore suggested to open restaurants in the following five wards (in this order)
<ol>
<li>Garscadden/Scotstounhill</li>
<li>Linn</li>
<li>Langside</li>
<li>Shettleston</li>
<li>Govan</li>
</ol>

