# Peer-graded Assignment: Capstone Project - The Battle of Neighborhoods
This report includes six sections as follows:

- Introduction [where you discuss the business problem and who would be interested in this project.]
- Data [where you describe the data that will be used to solve the problem and the source of the data.]
- Methodology [which represents the main component of the report where you discuss and describe any exploratory data analysis that you did, any inferential statistical testing that you performed, if any, and what machine learnings were used and why.]
- Results [where you discuss the results.]
- Discussion [where you discuss any observations you noted and any recommendations you can make based on the results.]
- Conclusion [where you conclude the report.]

## Introduction
Edinburgh and Glasgow are two major cities in Scotland. Both cities are noted for their culture, architecture, musical exports, turism, and business links.

According to Wikipedia (www.wikipedia.org):

* Edinburgh is Scotland's second most populous city and the seventh most populous in the United Kingdom. The official population estimates are 488,050 (2016) for the Locality of Edinburgh (Edinburgh pre 1975 regionalisation plus Currie and Balerno), 518,500 (2018) for the City of Edinburgh, and 1,339,380 (2014) for the city region. Edinburgh lies at the heart of the Edinburgh and South East Scotland city region comprising East Lothian, Edinburgh, Fife, Midlothian, Scottish Borders and West Lothian.

* Glasgow is the most populous city in Scotland, and the third most populous city in the United Kingdom, as of the 2017 estimated city population of 621,020. Historically part of Lanarkshire, the city now forms the Glasgow City council area, one of the 32 council areas of Scotland; the local authority is Glasgow City Council. Glasgow is situated on the River Clyde in the country's West Central Lowlands. It is the fifth most visited city in the UK.

In this project, the above two cities will be compared in details using machine learning segmentation and clustering along with Foursquare data. The objectives include:

* How similar these two cities are? 
* Which city is better for living for certain requirements?

## Data
In order to apply this study, basic geo data of this two cities need to be collected:

* Postcode for Edinburgh (https://en.wikipedia.org/wiki/EH_postcode_area)
* Postcode for Glasgow (https://en.wikipedia.org/wiki/G_postcode_area)

The latitude and longitude data also required:

* (https://www.freemaptools.com/download/outcode-postcodes/postcode-outcodes.csv)

In [1]:
#pip install wikipedia, lxml
!conda install -c conda-forge wikipedia --yes 
!conda install -c conda-forge lxml --yes
import wikipedia as wp
import pandas as pd 

!conda install -c conda-forge geopy --yes # For Latitude and Longitud
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

!conda install -c conda-forge folium=0.5.0 --yes # Foursquare API lab
import folium # map rendering library

#using beautiful soup to parsing html with postalcodes
from bs4 import BeautifulSoup
import requests

print('Libraries imported.')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - wikipedia


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2019.6.16  |       hecc5488_0         145 KB  conda-forge
    openssl-1.1.1c             |       h516909a_0         2.1 MB  conda-forge
    wikipedia-1.4.0            |             py_2          13 KB  conda-forge
    certifi-2019.6.16          |           py36_1         149 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.4 MB

The following NEW packages will be INSTALLED:

    wikipedia:       1.4.0-py_2        conda-forge

The following packages will be UPDATED:

    ca-certificates: 2019.5.15-0                   --> 2019.6.16-hecc5488_0 conda-forge
    certifi:         2019.6.16-py36_1       

In [2]:
#Firstly - Download CSV file with postcodes (longitud and latitud)

#get longitud and latitude data for the postcodes of Edinburgh and Glasgow
#load postcodes in a dataframe
postcode_data = pd.read_csv("https://www.freemaptools.com/download/outcode-postcodes/postcode-outcodes.csv")
#print dataframe
postcode_data

Unnamed: 0,id,postcode,latitude,longitude
0,2,AB10,57.135140,-2.117310
1,3,AB11,57.138750,-2.090890
2,4,AB12,57.101000,-2.110600
3,5,AB13,57.108010,-2.237760
4,6,AB14,57.100760,-2.270730
5,7,AB15,57.138680,-2.165250
6,8,AB16,57.161150,-2.155430
7,9,AB21,57.209600,-2.200330
8,10,AB22,57.187240,-2.119130
9,11,AB23,57.212420,-2.087760


In [3]:
#remove unnecessary columns of postcode data
postcode_data = postcode_data.drop('id',1)
postcode_data

Unnamed: 0,postcode,latitude,longitude
0,AB10,57.135140,-2.117310
1,AB11,57.138750,-2.090890
2,AB12,57.101000,-2.110600
3,AB13,57.108010,-2.237760
4,AB14,57.100760,-2.270730
5,AB15,57.138680,-2.165250
6,AB16,57.161150,-2.155430
7,AB21,57.209600,-2.200330
8,AB22,57.187240,-2.119130
9,AB23,57.212420,-2.087760


In [4]:
#get rows and cols
postcode_data.shape

(3003, 3)

In [5]:
#Second - Get Edinburgh postcodes
edi_url = "https://en.wikipedia.org/wiki/EH_postcode_area"
 
# Getting the webpage, creating a Response object.
response = requests.get(edi_url)
 
# Extracting the source code of the page.
edi_data = response.text
 
# Passing the source code to BeautifulSoup to create a BeautifulSoup object for it.
soup = BeautifulSoup(edi_data, 'lxml')

#Getting the data from the html
edi_table = soup.find('table', class_='wikitable sortable')# Grab the postcode table

In [6]:
#Edinburgh data processing - postcodes and neighborhood
postcode_name=''
posttown_name = ''
neighborhood_name =''

# define the dataframe columns
column_names = ['postcode','post town','neighborhood'] 

# instantiate the dataframe
edi_neighborhoods = pd.DataFrame(columns=column_names)
edi_neighborhoods

for row in edi_table.find_all('tr', style=''):
    postcode_name =  row.find('th').get_text().replace('\n','')
    #print(postcode_name)
    columns = row.find_all('td')
    #print(columns)
    neighborhood_name=''
    if len(columns)>0:
            posttown_name = columns[0].get_text().replace('\n','')
            # Extracting all the <a> tags into a list.
            tags = columns[1].find_all('a')
            
            neighborhood_name =', '.join([tag.get_text() for tag in tags ])
            
    if (neighborhood_name!=''):
        edi_neighborhoods = edi_neighborhoods.append({'postcode': postcode_name,
                                          'post town': posttown_name,
                                          'neighborhood': neighborhood_name}, ignore_index=True)

edi_neighborhoods #Edinburgh data of posttowns and neighborhoods

Unnamed: 0,postcode,post town,neighborhood
0,EH1,EDINBURGH,"Old Town, GPO, St. James Centre"
1,EH2,EDINBURGH,"New Town, Princes Street"
2,EH3,EDINBURGH,"Queen Street, Stockbridge, West End, Tollcross..."
3,EH4,EDINBURGH,"Dean Village, Comely Bank, A90, Barnton, Cramo..."
4,EH5,EDINBURGH,"Granton, Firth of Forth, Ferry Road"
5,EH6,EDINBURGH,"Leith, Newhaven"
6,EH7,EDINBURGH,"Restalrig, Craigentinny"
7,EH8,EDINBURGH,"Southside, Newington, Canongate, Holyrood Park..."
8,EH9,EDINBURGH,"Marchmont, Grange"
9,EH10,EDINBURGH,"A702, Bruntsfield, Morningside, Fairmilehead"


In [7]:
#Third - Get Glasgow postcodes
gla_url = "https://en.wikipedia.org/wiki/G_postcode_area"
 
# Getting the webpage, creating a Response object.
response = requests.get(gla_url)
 
# Extracting the source code of the page.
gla_data = response.text
 
# Passing the source code to BeautifulSoup to create a BeautifulSoup object for it.
soup = BeautifulSoup(gla_data, 'lxml')

#Getting the data from the html
gla_table = soup.find('table', class_='wikitable sortable')# Grab the postcode table

In [8]:
#Glasgow data processing - postcodes and neighborhood
postcode_name=''
posttown_name = ''
neighborhood_name =''

# define the dataframe columns
column_names = ['postcode','post town','neighborhood'] 

# instantiate the dataframe
gla_neighborhoods = pd.DataFrame(columns=column_names)
gla_neighborhoods

for row in gla_table.find_all('tr', style=''):
    postcode_name =  row.find('th').get_text().replace('\n','')
    #print(postcode_name)
    columns = row.find_all('td')
    #print(columns)
    neighborhood_name=''
    if len(columns)>0:
            posttown_name = columns[0].get_text().replace('\n','')
            # Extracting all the <a> tags into a list.
            tags = columns[1].find_all('a')
            
            neighborhood_name =', '.join([tag.get_text() for tag in tags ])
            
    if (neighborhood_name!=''):
        gla_neighborhoods = gla_neighborhoods.append({'postcode': postcode_name,
                                          'post town': posttown_name,
                                          'neighborhood': neighborhood_name}, ignore_index=True)

gla_neighborhoods #Glasgow data of posttowns and neighborhoods

Unnamed: 0,postcode,post town,neighborhood
0,G1,GLASGOW,Merchant City
1,G2,GLASGOW,"Blythswood Hill, Anderston"
2,G3,GLASGOW,"Anderston, Finnieston, Garnethill, Park, Woodl..."
3,G4,GLASGOW,"Calton, Cowcaddens, Kelvinbridge, Townhead, Wo..."
4,G5,GLASGOW,Gorbals
5,G11,GLASGOW,"Broomhill, Partick, Partickhill"
6,G12,GLASGOW,"West End, Dowanhill, Hillhead, Hyndland, Kelvi..."
7,G13,GLASGOW,"Anniesland, Knightswood, Yoker"
8,G14,GLASGOW,"Whiteinch, Scotstoun"
9,G15,GLASGOW,Drumchapel


In [9]:
#Merging Postcodes and Longitud and Latitud Values (EDI and GLA)
#===============================================================

#Edinburgh merging data with latitude and longitud
edi_neighborhoods = pd.merge(postcode_data, edi_neighborhoods, on='postcode')
edi_neighborhoods.shape #size of data for EDI

(23, 5)

In [10]:
#print EDI data
edi_neighborhoods #to be used for EDI data analysis

Unnamed: 0,postcode,latitude,longitude,post town,neighborhood
0,EH1,55.95243,-3.1884,EDINBURGH,"Old Town, GPO, St. James Centre"
1,EH10,55.92077,-3.20984,EDINBURGH,"A702, Bruntsfield, Morningside, Fairmilehead"
2,EH11,55.93387,-3.24867,EDINBURGH,"A71, Haymarket, Gorgie, Stenhouse, Sighthill, ..."
3,EH12,55.94262,-3.27137,EDINBURGH,"A8, Murrayfield, Corstorphine, Gyle"
4,EH13,55.90788,-3.24144,EDINBURGH,"Colinton, Oxgangs"
5,EH14,55.90925,-3.28308,"BALERNO, CURRIE, EDINBURGH, JUNIPER GREEN","Slateford, Longstone, Wester Hailes, Juniper G..."
6,EH15,55.94686,-3.11136,EDINBURGH,"Portobello, Duddingston"
7,EH16,55.92221,-3.15387,EDINBURGH,"Liberton, Cameron Toll, Craigmillar, Niddrie"
8,EH17,55.90704,-3.14222,EDINBURGH,"Gilmerton, Moredun, Mortonhall"
9,EH2,55.95417,-3.19486,EDINBURGH,"New Town, Princes Street"


In [11]:
#Glasgow merging data with latitude and longitud
gla_neighborhoods = pd.merge(postcode_data, gla_neighborhoods, on='postcode')
gla_neighborhoods.shape #size of data for GLA

(53, 5)

In [12]:
#print GLA data
gla_neighborhoods #to be used for GLA data analysis

Unnamed: 0,postcode,latitude,longitude,post town,neighborhood
0,G1,55.86038,-4.24671,GLASGOW,Merchant City
1,G11,55.87356,-4.31142,GLASGOW,"Broomhill, Partick, Partickhill"
2,G12,55.88006,-4.30061,GLASGOW,"West End, Dowanhill, Hillhead, Hyndland, Kelvi..."
3,G13,55.89358,-4.3462,GLASGOW,"Anniesland, Knightswood, Yoker"
4,G14,55.88095,-4.34864,GLASGOW,"Whiteinch, Scotstoun"
5,G15,55.9094,-4.36476,GLASGOW,Drumchapel
6,G2,55.86382,-4.2549,GLASGOW,"Blythswood Hill, Anderston"
7,G20,55.8858,-4.28176,GLASGOW,"Maryhill, North Kelvinside, Ruchill"
8,G21,55.88063,-4.22069,GLASGOW,"Balornock, Barmulloch, Cowlairs, Royston, Spri..."
9,G22,55.88998,-4.25002,GLASGOW,"Milton, Parkhouse, Possilpark"


In [13]:
#Edinburgh Map
#=============
#Get the geographical coordinates of Edinburgh.

address = 'Edinburgh, UK'

geolocator = Nominatim(user_agent="to_explorer")
location = geolocator.geocode(address)
edi_latitude = location.latitude
edi_longitude = location.longitude
print('The geograpical coordinate of Edinburgh are {}, {}.'.format(edi_latitude, edi_longitude))

The geograpical coordinate of Edinburgh are 55.9521476, -3.1889908.


In [14]:
# create map of Edinburgh using latitude and longitude values
map_edinburgh = folium.Map(location=[edi_latitude, edi_longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(edi_neighborhoods['latitude'], edi_neighborhoods['longitude'], edi_neighborhoods['neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_edinburgh)  
    
map_edinburgh

In [15]:
#Glasgow Map
#===========
#Get the geographical coordinates of Glasgow.

address = 'Glasgow, UK'

geolocator = Nominatim(user_agent="to_explorer")
location = geolocator.geocode(address)
gla_latitude = location.latitude
gla_longitude = location.longitude
print('The geograpical coordinate of Glasgow are {}, {}.'.format(gla_latitude, gla_longitude))

The geograpical coordinate of Glasgow are 55.8611389, -4.2501672.


In [16]:
# create map of Glasgow using latitude and longitude values
map_glasgow = folium.Map(location=[gla_latitude, gla_longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(gla_neighborhoods['latitude'], gla_neighborhoods['longitude'], gla_neighborhoods['neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_glasgow)  
    
map_glasgow

In [17]:
#Analysis of Data for both cities - Edinburgh and Glasgow
#========================================================

#import of libraries

#Json Libraries
import json # library to handle JSON files
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

import numpy as np # library to handle data in a vectorized manner

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

In [18]:
# @hidden_cell
CLIENT_ID = 'QJI2OIVERR1LLMYAFFN1IPYBCWSACKGWJAIUBTSX0JCH5TI2' # your Foursquare ID
CLIENT_SECRET = 'YPLEEQ5L2VO0FPKVNJFSJMXXLROC0PQDPULDZBKVK2FKQNMR' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

# Limit and radius use by Foursquare API
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # define radius

In [19]:
#Functions to be used for the data analysis

# get_category_type() function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
    
    
# getNearbyVenues(...) function to get the main venues for all neighborhoods in the cities
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['neighborhood', 
                  'neighborhood latitude', 
                  'neighborhood longitude', 
                  'venue', 
                  'venue latitude', 
                  'venue longitude', 
                  'venue category']
    
    return(nearby_venues)

# function to sort the venues in descending order.
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [20]:
#Edinburgh DATA Analysis
#=======================
#Let's explore the first neighborhood in our dataframe.
#Get the neighborhood's name.
edi_neighborhoods.loc[0, 'neighborhood']
#Get the neighborhood's latitude and longitude values.
edi_neighborhood_latitude = edi_neighborhoods.loc[0, 'latitude'] # neighborhood latitude value
edi_neighborhood_longitude = edi_neighborhoods.loc[0, 'longitude'] # neighborhood longitude value

edi_neighborhood_name = edi_neighborhoods.loc[0, 'neighborhood'] # neighborhood name
print('Latitude and longitude values of {} are {}, {}.'.format(edi_neighborhood_name, 
                                                               edi_neighborhood_latitude, 
                                                               edi_neighborhood_longitude))

Latitude and longitude values of Old Town, GPO, St. James Centre are 55.95243000000001, -3.1884.


In [21]:
# create Edinburgh URL
edi_url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    edi_neighborhood_latitude, 
    edi_neighborhood_longitude, 
    radius, 
    LIMIT)
edi_url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=QJI2OIVERR1LLMYAFFN1IPYBCWSACKGWJAIUBTSX0JCH5TI2&client_secret=YPLEEQ5L2VO0FPKVNJFSJMXXLROC0PQDPULDZBKVK2FKQNMR&v=20180605&ll=55.95243000000001,-3.1884&radius=500&limit=100'

In [22]:
#Send the GET request and examine the results of this neighbourhood
edi_results = requests.get(edi_url).json()
edi_results

{'meta': {'code': 200, 'requestId': '5d56a627a6ec98002c0b7af0'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Canongate',
  'headerFullLocation': 'Canongate, Edinburgh',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 140,
  'suggestedBounds': {'ne': {'lat': 55.95693000450001,
    'lng': -3.18037757683752},
   'sw': {'lat': 55.947929995500004, 'lng': -3.1964224231624803}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4ba35400f964a5206b3538e3',
       'name': 'The Balmoral Hotel',
       'location': {'address': '1 Princes St',
        'crossStreet': 'at North Bridge',
        'lat': 55.95311255845786,
        'lng': -3.189509384085317,
        'labeledLatLngs'

In [23]:
#Now we are ready to clean the json and structure it into a pandas dataframe.

edi_venues = edi_results['response']['groups'][0]['items']
    
edi_nearby_venues = json_normalize(edi_venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
edi_nearby_venues = edi_nearby_venues.loc[:, filtered_columns]

# filter the category for each row
edi_nearby_venues['venue.categories'] = edi_nearby_venues.apply(get_category_type, axis=1)

# clean columns
edi_nearby_venues.columns = [col.split(".")[-1] for col in edi_nearby_venues.columns]

#edi_nearby_venues.head()
edi_nearby_venues

Unnamed: 0,name,categories,lat,lng
0,The Balmoral Hotel,Hotel,55.953113,-3.189509
1,The Guildford Arms,Pub,55.953668,-3.190052
2,Apple Princes Street,Electronics Store,55.953354,-3.189947
3,The Voodoo Rooms,Bar,55.953622,-3.190504
4,Princes Street Suites,Hotel,55.953370,-3.186934
5,The Milkman,Coffee Shop,55.950650,-3.191010
6,Hilton Edinburgh Carlton,Hotel,55.950768,-3.187729
7,Viva Mexico,Mexican Restaurant,55.950691,-3.189522
8,SCOTCH Whisky Bar,Whisky Bar,55.953062,-3.190095
9,The Doric Tavern,Gastropub,55.950982,-3.190627


In [24]:
print('{} venues were returned by Foursquare.'.format(edi_nearby_venues.shape[0]))

100 venues were returned by Foursquare.


In [25]:
# Generate a dataframe with Edinburgh venues

edinburgh_venues = getNearbyVenues(names=edi_neighborhoods['neighborhood'],
                                   latitudes=edi_neighborhoods['latitude'],
                                   longitudes=edi_neighborhoods['longitude']
                                  )

print(edinburgh_venues.shape)
edinburgh_venues.head()

Old Town, GPO, St. James Centre
A702, Bruntsfield, Morningside, Fairmilehead
A71, Haymarket, Gorgie, Stenhouse, Sighthill, the Calders
A8, Murrayfield, Corstorphine, Gyle
Colinton, Oxgangs
Slateford, Longstone, Wester Hailes, Juniper Green, Currie, Balerno
Portobello, Duddingston
Liberton, Cameron Toll, Craigmillar, Niddrie
Gilmerton, Moredun, Mortonhall
New Town, Princes Street
Newbridge, Ratho
Kirkliston
Queen Street, Stockbridge, West End, Tollcross, Fountainbridge
South Queensferry
Dean Village, Comely Bank, A90, Barnton, Cramond, Sainsbury's, Craigleith, A90
Granton, Firth of Forth, Ferry Road
Leith, Newhaven
Restalrig, Craigentinny
Southside, Newington, Canongate, Holyrood Park, Abbeyhill, Mountcastle, Southside
Marchmont, Grange
Jobcentre Plus
Scottish Gas
Scottish Parliament
(466, 7)


Unnamed: 0,neighborhood,neighborhood latitude,neighborhood longitude,venue,venue latitude,venue longitude,venue category
0,"Old Town, GPO, St. James Centre",55.95243,-3.1884,The Balmoral Hotel,55.953113,-3.189509,Hotel
1,"Old Town, GPO, St. James Centre",55.95243,-3.1884,The Guildford Arms,55.953668,-3.190052,Pub
2,"Old Town, GPO, St. James Centre",55.95243,-3.1884,Apple Princes Street,55.953354,-3.189947,Electronics Store
3,"Old Town, GPO, St. James Centre",55.95243,-3.1884,The Voodoo Rooms,55.953622,-3.190504,Bar
4,"Old Town, GPO, St. James Centre",55.95243,-3.1884,Princes Street Suites,55.95337,-3.186934,Hotel


In [26]:
#number of venues for each neighbourhood
edinburgh_venues.groupby('neighborhood').count()

Unnamed: 0_level_0,neighborhood latitude,neighborhood longitude,venue,venue latitude,venue longitude,venue category
neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"A702, Bruntsfield, Morningside, Fairmilehead",5,5,5,5,5,5
"A71, Haymarket, Gorgie, Stenhouse, Sighthill, the Calders",6,6,6,6,6,6
"A8, Murrayfield, Corstorphine, Gyle",13,13,13,13,13,13
"Colinton, Oxgangs",4,4,4,4,4,4
"Dean Village, Comely Bank, A90, Barnton, Cramond, Sainsbury's, Craigleith, A90",3,3,3,3,3,3
"Gilmerton, Moredun, Mortonhall",5,5,5,5,5,5
"Granton, Firth of Forth, Ferry Road",4,4,4,4,4,4
Jobcentre Plus,10,10,10,10,10,10
Kirkliston,7,7,7,7,7,7
"Leith, Newhaven",26,26,26,26,26,26


In [27]:
#number of  unique categories can be curated from all the returned venues
print('There are {} uniques categories.'.format(len(edinburgh_venues['venue category'].unique())))

There are 134 uniques categories.


In [28]:
#Analyzing Each Neighborhood
# one hot encoding
edinburgh_onehot = pd.get_dummies(edinburgh_venues[['venue category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
edinburgh_onehot['neighborhood'] = edinburgh_venues['neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [edinburgh_onehot.columns[-1]] + list(edinburgh_onehot.columns[:-1])
edinburgh_onehot = edinburgh_onehot[fixed_columns]

#edinburgh_onehot.head()
edinburgh_onehot

Unnamed: 0,neighborhood,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Auto Garage,Baby Store,Bagel Shop,Bakery,...,Train Station,Tram Station,Vegetarian / Vegan Restaurant,Whisky Bar,Wine Bar,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,"Old Town, GPO, St. James Centre",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Old Town, GPO, St. James Centre",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,"Old Town, GPO, St. James Centre",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"Old Town, GPO, St. James Centre",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"Old Town, GPO, St. James Centre",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,"Old Town, GPO, St. James Centre",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
6,"Old Town, GPO, St. James Centre",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7,"Old Town, GPO, St. James Centre",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
8,"Old Town, GPO, St. James Centre",0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,0,0
9,"Old Town, GPO, St. James Centre",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [29]:
edinburgh_onehot.shape #num of rows & cols

(466, 135)

In [30]:
#grouping rows by neighborhood and by taking the mean of the frequency of occurrence of each category
edinburgh_grouped = edinburgh_onehot.groupby('neighborhood').mean().reset_index()
edinburgh_grouped

Unnamed: 0,neighborhood,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Auto Garage,Baby Store,Bagel Shop,Bakery,...,Train Station,Tram Station,Vegetarian / Vegan Restaurant,Whisky Bar,Wine Bar,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,"A702, Bruntsfield, Morningside, Fairmilehead",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"A71, Haymarket, Gorgie, Stenhouse, Sighthill, ...",0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"A8, Murrayfield, Corstorphine, Gyle",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.538462
3,"Colinton, Oxgangs",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Dean Village, Comely Bank, A90, Barnton, Cramo...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,"Gilmerton, Moredun, Mortonhall",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,"Granton, Firth of Forth, Ferry Road",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Jobcentre Plus,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,...,0.1,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Kirkliston,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,"Leith, Newhaven",0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.038462,0.038462,...,0.0,0.0,0.038462,0.038462,0.038462,0.0,0.0,0.0,0.0,0.0


In [31]:
#confirm the new group size

edinburgh_grouped.shape

(23, 135)

In [32]:
#printing each neighborhood along with the top 5 most common venues

num_top_venues = 5

for hood in edinburgh_grouped['neighborhood']:
    print("----"+hood+"----")
    temp = edinburgh_grouped[edinburgh_grouped['neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----A702, Bruntsfield, Morningside, Fairmilehead----
               venue  freq
0              Hotel   0.2
1  Fish & Chips Shop   0.2
2               Café   0.2
3               Park   0.2
4                Bar   0.2


----A71, Haymarket, Gorgie, Stenhouse, Sighthill, the Calders----
            venue  freq
0           River  0.17
1     Auto Garage  0.17
2            Park  0.17
3            Café  0.17
4  Discount Store  0.17


----A8, Murrayfield, Corstorphine, Gyle----
                venue  freq
0         Zoo Exhibit  0.54
1         Coffee Shop  0.15
2  Chinese Restaurant  0.08
3                 Zoo  0.08
4       Grocery Store  0.08


----Colinton, Oxgangs----
           venue  freq
0  Bowling Alley  0.25
1    Coffee Shop  0.25
2         Forest  0.25
3    Supermarket  0.25
4            Pub  0.00


----Dean Village, Comely Bank, A90, Barnton, Cramond, Sainsbury's, Craigleith, A90----
                 venue  freq
0    Indian Restaurant  0.33
1                 Café  0.33
2                

In [33]:
#Putting data that into a pandas dataframe

#Creating a new dataframe and display the top 10 venues for each neighborhood.

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
edi_neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
edi_neighborhoods_venues_sorted['neighborhood'] = edinburgh_grouped['neighborhood']

for ind in np.arange(edinburgh_grouped.shape[0]):
    edi_neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(edinburgh_grouped.iloc[ind, :], num_top_venues)

#edi_neighborhoods_venues_sorted.head()
edi_neighborhoods_venues_sorted

Unnamed: 0,neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"A702, Bruntsfield, Morningside, Fairmilehead",Fish & Chips Shop,Hotel,Park,Café,Bar,Zoo Exhibit,Diner,Donut Shop,Dog Run,Dive Bar
1,"A71, Haymarket, Gorgie, Stenhouse, Sighthill, ...",River,Café,Discount Store,Auto Garage,Park,Skate Park,Dog Run,Dive Bar,Diner,Dessert Shop
2,"A8, Murrayfield, Corstorphine, Gyle",Zoo Exhibit,Coffee Shop,Grocery Store,Zoo,Café,Chinese Restaurant,Argentinian Restaurant,Diner,Donut Shop,Dog Run
3,"Colinton, Oxgangs",Supermarket,Forest,Bowling Alley,Coffee Shop,General Entertainment,Electronics Store,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cricket Ground
4,"Dean Village, Comely Bank, A90, Barnton, Cramo...",Trail,Indian Restaurant,Café,Zoo Exhibit,Donut Shop,Dog Run,Dive Bar,Discount Store,Diner,Dessert Shop
5,"Gilmerton, Moredun, Mortonhall",Supermarket,Chinese Restaurant,Park,Bakery,Construction & Landscaping,Cupcake Shop,Deli / Bodega,Cricket Ground,Cosmetics Shop,Department Store
6,"Granton, Firth of Forth, Ferry Road",Grocery Store,Rugby Pitch,Pharmacy,Bed & Breakfast,Dessert Shop,Donut Shop,Dog Run,Dive Bar,Discount Store,Diner
7,Jobcentre Plus,Motorcycle Shop,Clothing Store,Sporting Goods Shop,Shopping Plaza,Supermarket,Baby Store,Tram Station,Train Station,Fast Food Restaurant,Donut Shop
8,Kirkliston,Health & Beauty Service,Bowling Alley,Fish & Chips Shop,Convenience Store,Pub,Construction & Landscaping,Grocery Store,Baby Store,Bagel Shop,Cricket Ground
9,"Leith, Newhaven",Bar,Supermarket,Coffee Shop,Art Gallery,Bagel Shop,Spanish Restaurant,Bistro,Beer Garden,Pool,Steakhouse


In [34]:
#Cluster Edinburgh Neighborhoods

# set number of clusters
kclusters = 5

edinburgh_grouped_clustering = edinburgh_grouped.drop('neighborhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(edinburgh_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)

In [35]:
#Creating a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

# add clustering labels
edi_neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

edinburgh_merged = edi_neighborhoods

#print(edinburgh_merged)

# merge edinburgh_grouped with edi_data to add latitude/longitude for each neighborhood
edinburgh_merged = edinburgh_merged.join(edi_neighborhoods_venues_sorted.set_index('neighborhood'), on='neighborhood')

#edinburgh_merged.head() # check the last columns!
edinburgh_merged

Unnamed: 0,postcode,latitude,longitude,post town,neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,EH1,55.95243,-3.1884,EDINBURGH,"Old Town, GPO, St. James Centre",0,Hotel,Bar,Café,Restaurant,Pub,Art Gallery,Steakhouse,Whisky Bar,Coffee Shop,French Restaurant
1,EH10,55.92077,-3.20984,EDINBURGH,"A702, Bruntsfield, Morningside, Fairmilehead",0,Fish & Chips Shop,Hotel,Park,Café,Bar,Zoo Exhibit,Diner,Donut Shop,Dog Run,Dive Bar
2,EH11,55.93387,-3.24867,EDINBURGH,"A71, Haymarket, Gorgie, Stenhouse, Sighthill, ...",0,River,Café,Discount Store,Auto Garage,Park,Skate Park,Dog Run,Dive Bar,Diner,Dessert Shop
3,EH12,55.94262,-3.27137,EDINBURGH,"A8, Murrayfield, Corstorphine, Gyle",0,Zoo Exhibit,Coffee Shop,Grocery Store,Zoo,Café,Chinese Restaurant,Argentinian Restaurant,Diner,Donut Shop,Dog Run
4,EH13,55.90788,-3.24144,EDINBURGH,"Colinton, Oxgangs",0,Supermarket,Forest,Bowling Alley,Coffee Shop,General Entertainment,Electronics Store,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cricket Ground
5,EH14,55.90925,-3.28308,"BALERNO, CURRIE, EDINBURGH, JUNIPER GREEN","Slateford, Longstone, Wester Hailes, Juniper G...",3,Golf Course,Sporting Goods Shop,Fish & Chips Shop,Zoo Exhibit,Dessert Shop,Dog Run,Dive Bar,Discount Store,Diner,Department Store
6,EH15,55.94686,-3.11136,EDINBURGH,"Portobello, Duddingston",2,Bus Stop,Zoo Exhibit,Diner,Electronics Store,Donut Shop,Dog Run,Dive Bar,Discount Store,Dessert Shop,Comedy Club
7,EH16,55.92221,-3.15387,EDINBURGH,"Liberton, Cameron Toll, Craigmillar, Niddrie",0,Grocery Store,Park,Hotel,Korean Restaurant,Donut Shop,Dog Run,Dive Bar,Discount Store,Diner,Dessert Shop
8,EH17,55.90704,-3.14222,EDINBURGH,"Gilmerton, Moredun, Mortonhall",0,Supermarket,Chinese Restaurant,Park,Bakery,Construction & Landscaping,Cupcake Shop,Deli / Bodega,Cricket Ground,Cosmetics Shop,Department Store
9,EH2,55.95417,-3.19486,EDINBURGH,"New Town, Princes Street",0,Café,Bar,Hotel,Coffee Shop,Art Gallery,Steakhouse,French Restaurant,Pub,Restaurant,Cocktail Bar


In [36]:
#visualizing the resulting clusters

# create map
edi_map_clusters = folium.Map(location=[edi_latitude, edi_longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(edinburgh_merged['latitude'], edinburgh_merged['longitude'], edinburgh_merged['neighborhood'], edinburgh_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(edi_map_clusters)
       
edi_map_clusters

In [37]:
#cluster analisis 
#Examine Cluster 1
edinburgh_merged.loc[edinburgh_merged['Cluster Labels'] == 0, edinburgh_merged.columns[[1] + list(range(5, edinburgh_merged.shape[1]))]]

Unnamed: 0,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,55.95243,0,Hotel,Bar,Café,Restaurant,Pub,Art Gallery,Steakhouse,Whisky Bar,Coffee Shop,French Restaurant
1,55.92077,0,Fish & Chips Shop,Hotel,Park,Café,Bar,Zoo Exhibit,Diner,Donut Shop,Dog Run,Dive Bar
2,55.93387,0,River,Café,Discount Store,Auto Garage,Park,Skate Park,Dog Run,Dive Bar,Diner,Dessert Shop
3,55.94262,0,Zoo Exhibit,Coffee Shop,Grocery Store,Zoo,Café,Chinese Restaurant,Argentinian Restaurant,Diner,Donut Shop,Dog Run
4,55.90788,0,Supermarket,Forest,Bowling Alley,Coffee Shop,General Entertainment,Electronics Store,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cricket Ground
7,55.92221,0,Grocery Store,Park,Hotel,Korean Restaurant,Donut Shop,Dog Run,Dive Bar,Discount Store,Diner,Dessert Shop
8,55.90704,0,Supermarket,Chinese Restaurant,Park,Bakery,Construction & Landscaping,Cupcake Shop,Deli / Bodega,Cricket Ground,Cosmetics Shop,Department Store
9,55.95417,0,Café,Bar,Hotel,Coffee Shop,Art Gallery,Steakhouse,French Restaurant,Pub,Restaurant,Cocktail Bar
11,55.95652,0,Health & Beauty Service,Bowling Alley,Fish & Chips Shop,Convenience Store,Pub,Construction & Landscaping,Grocery Store,Baby Store,Bagel Shop,Cricket Ground
12,55.95412,0,Bar,Café,Coffee Shop,Pub,French Restaurant,Sandwich Place,Cocktail Bar,Mexican Restaurant,Hotel,Thai Restaurant


In [38]:
#Examine Cluster 2
edinburgh_merged.loc[edinburgh_merged['Cluster Labels'] == 1, edinburgh_merged.columns[[1] + list(range(5, edinburgh_merged.shape[1]))]]


Unnamed: 0,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
13,55.98455,1,Train Station,Campground,Brewery,Park,Bus Stop,Dog Run,Dive Bar,Discount Store,Diner,Zoo Exhibit
18,55.94909,1,Park,Lake,Scenic Lookout,Mountain,Dessert Shop,Donut Shop,Dog Run,Dive Bar,Discount Store,Diner


In [39]:
#Examine Cluster 3
edinburgh_merged.loc[edinburgh_merged['Cluster Labels'] == 2, edinburgh_merged.columns[[1] + list(range(5, edinburgh_merged.shape[1]))]]

Unnamed: 0,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,55.94686,2,Bus Stop,Zoo Exhibit,Diner,Electronics Store,Donut Shop,Dog Run,Dive Bar,Discount Store,Dessert Shop,Comedy Club


In [40]:
#Examine Cluster 4
edinburgh_merged.loc[edinburgh_merged['Cluster Labels'] == 3, edinburgh_merged.columns[[1] + list(range(5, edinburgh_merged.shape[1]))]]

Unnamed: 0,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,55.90925,3,Golf Course,Sporting Goods Shop,Fish & Chips Shop,Zoo Exhibit,Dessert Shop,Dog Run,Dive Bar,Discount Store,Diner,Department Store


In [41]:
#Examine Cluster 5
edinburgh_merged.loc[edinburgh_merged['Cluster Labels'] == 4, edinburgh_merged.columns[[1] + list(range(5, edinburgh_merged.shape[1]))]]

Unnamed: 0,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,55.9306,4,Hotel,Food Truck,Bridal Shop,Zoo Exhibit,Diner,Donut Shop,Dog Run,Dive Bar,Discount Store,Dessert Shop


In [42]:
#Glasgow DATA Analysis
#=======================
#Let's explore the first neighborhood in our dataframe.
#Get the neighborhood's name.
gla_neighborhoods.loc[0, 'neighborhood']
#Get the neighborhood's latitude and longitude values.
gla_neighborhood_latitude = gla_neighborhoods.loc[0, 'latitude'] # neighborhood latitude value
gla_neighborhood_longitude = gla_neighborhoods.loc[0, 'longitude'] # neighborhood longitude value

gla_neighborhood_name = gla_neighborhoods.loc[0, 'neighborhood'] # neighborhood name
print('Latitude and longitude values of {} are {}, {}.'.format(gla_neighborhood_name, 
                                                               gla_neighborhood_latitude, 
                                                               gla_neighborhood_longitude))

Latitude and longitude values of Merchant City are 55.860380000000006, -4.24671.


In [43]:
# create Glasgow URL
gla_url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    gla_neighborhood_latitude, 
    gla_neighborhood_longitude, 
    radius, 
    LIMIT)
gla_url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=QJI2OIVERR1LLMYAFFN1IPYBCWSACKGWJAIUBTSX0JCH5TI2&client_secret=YPLEEQ5L2VO0FPKVNJFSJMXXLROC0PQDPULDZBKVK2FKQNMR&v=20180605&ll=55.860380000000006,-4.24671&radius=500&limit=100'

In [44]:
#Send the GET request and examine the results of this neighbourhood
gla_results = requests.get(gla_url).json()
gla_results

{'meta': {'code': 200, 'requestId': '5d56a6316adbf5003947337e'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Merchant City',
  'headerFullLocation': 'Merchant City, Glasgow',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 81,
  'suggestedBounds': {'ne': {'lat': 55.86488000450001,
    'lng': -4.23870659528205},
   'sw': {'lat': 55.855879995500004, 'lng': -4.2547134047179505}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '53e52950498ee2c0974731b8',
       'name': 'DogHouse Merchant City',
       'location': {'address': '99 Hutcheson St',
        'lat': 55.859376787425894,
        'lng': -4.247850296672637,
        'labeledLatLngs': [{'label': 'display',
     

In [45]:
#Now we are ready to clean the json and structure it into a pandas dataframe.

gla_venues = gla_results['response']['groups'][0]['items']
    
gla_nearby_venues = json_normalize(gla_venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
gla_nearby_venues = gla_nearby_venues.loc[:, filtered_columns]

# filter the category for each row
gla_nearby_venues['venue.categories'] = gla_nearby_venues.apply(get_category_type, axis=1)

# clean columns
gla_nearby_venues.columns = [col.split(".")[-1] for col in gla_nearby_venues.columns]

#gla_nearby_venues.head()
gla_nearby_venues

Unnamed: 0,name,categories,lat,lng
0,DogHouse Merchant City,Beer Bar,55.859377,-4.247850
1,Hutchesons Glasgow,Steakhouse,55.859800,-4.247892
2,Spitfire Espresso,Coffee Shop,55.859456,-4.245130
3,Italian Kitchen,Italian Restaurant,55.859361,-4.243795
4,iCafe,Coffee Shop,55.859379,-4.244422
5,Wilson Street Pantry,Breakfast Spot,55.858251,-4.245971
6,Dhabba,Indian Restaurant,55.858190,-4.246037
7,Paesano Pizza,Pizza Place,55.859721,-4.250754
8,George Square,Plaza,55.861164,-4.250207
9,The Z Hotel Glasgow,Hotel,55.861789,-4.248732


In [46]:
print('{} venues were returned by Foursquare.'.format(gla_nearby_venues.shape[0]))

81 venues were returned by Foursquare.


In [47]:
# Generate a dataframe with Glasgow venues

glasgow_venues = getNearbyVenues(names=gla_neighborhoods['neighborhood'],
                                   latitudes=gla_neighborhoods['latitude'],
                                   longitudes=gla_neighborhoods['longitude']
                                  )

print(glasgow_venues.shape)
#glasgow_venues.head()
glasgow_venues

Merchant City
Broomhill, Partick, Partickhill
West End, Dowanhill, Hillhead, Hyndland, Kelvindale, Kelvinside, Botanic Gardens, University of Glasgow
Anniesland, Knightswood, Yoker
Whiteinch, Scotstoun
Drumchapel
Blythswood Hill, Anderston
Maryhill, North Kelvinside, Ruchill
Balornock, Barmulloch, Cowlairs, Royston, Springburn, Sighthill
Milton, Parkhouse, Possilpark
Lambhill, Summerston
Anderston, Finnieston, Garnethill, Park, Woodlands, Yorkhill
Dennistoun, Haghill, Parkhead
Carmyle, Tollcross, Mount Vernon, Lightburn, Sandyhills, Shettleston, Springboig
Cardowan, Carntyne, Craigend, Cranhill, Garthamlock, Millerston, Provanmill, Queenslie, Riddrie, Robroyston, Ruchazie, Stepps, Wellhouse
Easterhouse, Easthall, Provanhall
Calton, Cowcaddens, Kelvinbridge, Townhead, Woodlands, Woodside
Bridgeton, Calton, Dalmarnock
Pollokshields, Shawlands
Battlefield, Govanhill, Mount Florida, Strathbungo, Toryglen
Mansewood, Newlands, Pollokshaws
Cathcart, Simshill, Croftfoot, King's Park, Muirend, 

Unnamed: 0,neighborhood,neighborhood latitude,neighborhood longitude,venue,venue latitude,venue longitude,venue category
0,Merchant City,55.860380,-4.246710,DogHouse Merchant City,55.859377,-4.247850,Beer Bar
1,Merchant City,55.860380,-4.246710,Hutchesons Glasgow,55.859800,-4.247892,Steakhouse
2,Merchant City,55.860380,-4.246710,Spitfire Espresso,55.859456,-4.245130,Coffee Shop
3,Merchant City,55.860380,-4.246710,Italian Kitchen,55.859361,-4.243795,Italian Restaurant
4,Merchant City,55.860380,-4.246710,iCafe,55.859379,-4.244422,Coffee Shop
5,Merchant City,55.860380,-4.246710,Wilson Street Pantry,55.858251,-4.245971,Breakfast Spot
6,Merchant City,55.860380,-4.246710,Dhabba,55.858190,-4.246037,Indian Restaurant
7,Merchant City,55.860380,-4.246710,Paesano Pizza,55.859721,-4.250754,Pizza Place
8,Merchant City,55.860380,-4.246710,George Square,55.861164,-4.250207,Plaza
9,Merchant City,55.860380,-4.246710,The Z Hotel Glasgow,55.861789,-4.248732,Hotel


In [48]:
#number of venues for each neighbourhood
glasgow_venues.groupby('neighborhood').count()

Unnamed: 0_level_0,neighborhood latitude,neighborhood longitude,venue,venue latitude,venue longitude,venue category
neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Alexandria, Arrochar, Aldochlay, Ardlui, Balloch, Bonhill, Gartocharn, Inverarnan, Jamestown, Luss, Tarbet",3,3,3,3,3,3
"Anderston, Finnieston, Garnethill, Park, Woodlands, Yorkhill",40,40,40,40,40,40
"Anniesland, Knightswood, Yoker",3,3,3,3,3,3
"Arden, Carnwadric, Deaconsbank, Giffnock, Kennishead, Thornliebank",3,3,3,3,3,3
"Auldhouse, East Kilbride",2,2,2,2,2,2
"Baillieston, Bargeddie, Chryston, Garrowhill, Gartcosh, Gartloch, Moodiesburn, Muirhead, Springhill",1,1,1,1,1,1
"Baldernock, Milngavie, Mugdock",10,10,10,10,10,10
"Balfron, Balmaha, Blanefield, Croftamie, Drymen, Dumgoyne, Fintry, Killearn, Rowardennan, Strathblane",4,4,4,4,4,4
"Balornock, Barmulloch, Cowlairs, Royston, Springburn, Sighthill",2,2,2,2,2,2
"Barrhead, Neilston, Uplawmoor",2,2,2,2,2,2


In [49]:
#number of  unique categories can be curated from all the returned venues
print('There are {} uniques categories.'.format(len(glasgow_venues['venue category'].unique())))

There are 126 uniques categories.


In [50]:
#Analyzing Each Neighborhood
# one hot encoding
glasgow_onehot = pd.get_dummies(glasgow_venues[['venue category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
glasgow_onehot['neighborhood'] = glasgow_venues['neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [glasgow_onehot.columns[-1]] + list(glasgow_onehot.columns[:-1])
glasgow_onehot = glasgow_onehot[fixed_columns]

#glasgow_onehot.head()
glasgow_onehot

Unnamed: 0,neighborhood,American Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,...,Thai Restaurant,Theater,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Tram Station,Vietnamese Restaurant,Warehouse Store,Whisky Bar
0,Merchant City,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Merchant City,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Merchant City,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Merchant City,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Merchant City,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,Merchant City,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
6,Merchant City,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7,Merchant City,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
8,Merchant City,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
9,Merchant City,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [51]:
glasgow_onehot.shape #num of rows & cols

(506, 127)

In [52]:
#grouping rows by neighborhood and by taking the mean of the frequency of occurrence of each category
glasgow_grouped = glasgow_onehot.groupby('neighborhood').mean().reset_index()
glasgow_grouped

Unnamed: 0,neighborhood,American Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,...,Thai Restaurant,Theater,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Tram Station,Vietnamese Restaurant,Warehouse Store,Whisky Bar
0,"Alexandria, Arrochar, Aldochlay, Ardlui, Ballo...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Anderston, Finnieston, Garnethill, Park, Woodl...",0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.125,...,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.025
2,"Anniesland, Knightswood, Yoker",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Arden, Carnwadric, Deaconsbank, Giffnock, Kenn...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Auldhouse, East Kilbride",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,"Baillieston, Bargeddie, Chryston, Garrowhill, ...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,"Baldernock, Milngavie, Mugdock",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,...,0.0,0.0,0.0,0.0,0.1,0.1,0.0,0.0,0.0,0.0
7,"Balfron, Balmaha, Blanefield, Croftamie, Dryme...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,"Balornock, Barmulloch, Cowlairs, Royston, Spri...",0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0
9,"Barrhead, Neilston, Uplawmoor",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [53]:
#confirm the new group size

glasgow_grouped.shape

(50, 127)

In [54]:
#printing each neighborhood along with the top 5 most common venues

num_top_venues = 5

for hood in glasgow_grouped['neighborhood']:
    print("----"+hood+"----")
    temp = glasgow_grouped[glasgow_grouped['neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Alexandria, Arrochar, Aldochlay, Ardlui, Balloch, Bonhill, Gartocharn, Inverarnan, Jamestown, Luss, Tarbet----
                 venue  freq
0    Outdoor Sculpture  0.33
1               Castle  0.33
2      Harbor / Marina  0.33
3  American Restaurant  0.00
4             Pharmacy  0.00


----Anderston, Finnieston, Garnethill, Park, Woodlands, Yorkhill----
               venue  freq
0                Bar  0.12
1  Indian Restaurant  0.10
2          Nightclub  0.08
3        Coffee Shop  0.08
4               Café  0.08


----Anniesland, Knightswood, Yoker----
                 venue  freq
0             Bus Stop  0.33
1                 Lake  0.33
2           Playground  0.33
3  American Restaurant  0.00
4            Racetrack  0.00


----Arden, Carnwadric, Deaconsbank, Giffnock, Kennishead, Thornliebank----
                 venue  freq
0        Shopping Mall  0.33
1         Soccer Field  0.33
2                 Park  0.33
3  American Restaurant  0.00
4            Racetrack  0.00


----Auldho

In [55]:
#Putting data that into a pandas dataframe

#Creating a new dataframe and display the top 10 venues for each neighborhood.

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
gla_neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
gla_neighborhoods_venues_sorted['neighborhood'] = glasgow_grouped['neighborhood']

for ind in np.arange(glasgow_grouped.shape[0]):
    gla_neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(glasgow_grouped.iloc[ind, :], num_top_venues)

#gla_neighborhoods_venues_sorted.head()
gla_neighborhoods_venues_sorted

Unnamed: 0,neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Alexandria, Arrochar, Aldochlay, Ardlui, Ballo...",Harbor / Marina,Outdoor Sculpture,Castle,Event Space,Food & Drink Shop,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,Whisky Bar,Gas Station
1,"Anderston, Finnieston, Garnethill, Park, Woodl...",Bar,Indian Restaurant,Café,Nightclub,Coffee Shop,Pub,Japanese Restaurant,Pizza Place,Whisky Bar,Movie Theater
2,"Anniesland, Knightswood, Yoker",Playground,Lake,Bus Stop,Whisky Bar,Gas Station,Discount Store,Dive Bar,Doner Restaurant,Electronics Store,English Restaurant
3,"Arden, Carnwadric, Deaconsbank, Giffnock, Kenn...",Shopping Mall,Park,Soccer Field,Whisky Bar,Food & Drink Shop,Department Store,Diner,Discount Store,Dive Bar,Doner Restaurant
4,"Auldhouse, East Kilbride",Scottish Restaurant,Bar,Whisky Bar,Gas Station,Diner,Discount Store,Dive Bar,Doner Restaurant,Electronics Store,English Restaurant
5,"Baillieston, Bargeddie, Chryston, Garrowhill, ...",Lake,Whisky Bar,Gastropub,Diner,Discount Store,Dive Bar,Doner Restaurant,Electronics Store,English Restaurant,Event Space
6,"Baldernock, Milngavie, Mugdock",Coffee Shop,Bar,Café,Train Station,Trail,Sandwich Place,Supermarket,Grocery Store,Whisky Bar,Event Space
7,"Balfron, Balmaha, Blanefield, Croftamie, Dryme...",Gastropub,Convenience Store,Flower Shop,Gift Shop,Gas Station,Diner,Discount Store,Dive Bar,Doner Restaurant,Electronics Store
8,"Balornock, Barmulloch, Cowlairs, Royston, Spri...",Train Station,Auto Garage,Whisky Bar,Event Space,French Restaurant,Food & Drink Shop,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,Electronics Store
9,"Barrhead, Neilston, Uplawmoor",Construction & Landscaping,Hotel,Whisky Bar,Gas Station,Diner,Discount Store,Dive Bar,Doner Restaurant,Electronics Store,English Restaurant


In [56]:
#Cluster Glasgow Neighborhoods

# set number of clusters
kclusters = 5

glasgow_grouped_clustering = glasgow_grouped.drop('neighborhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(glasgow_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

#Creating a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

# add clustering labels
gla_neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

In [57]:
#Creating a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.
glasgow_merged = gla_neighborhoods

# merge glasgow_grouped with edi_data to add latitude/longitude for each neighborhood
glasgow_merged = glasgow_merged.join(gla_neighborhoods_venues_sorted.set_index('neighborhood'), on='neighborhood')

#clean data
#drop postcodes that does not have a cluster label = NaN 
glasgow_merged = glasgow_merged[glasgow_merged['Cluster Labels'].notnull()]

#convert label in integer
glasgow_merged = glasgow_merged.astype({'Cluster Labels': 'int32'})

#glasgow_merged.head() # check the last columns!
glasgow_merged

Unnamed: 0,postcode,latitude,longitude,post town,neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,G1,55.86038,-4.24671,GLASGOW,Merchant City,1,Bar,Coffee Shop,Italian Restaurant,Pub,Cocktail Bar,Steakhouse,Seafood Restaurant,Sandwich Place,Japanese Restaurant,Shopping Mall
1,G11,55.87356,-4.31142,GLASGOW,"Broomhill, Partick, Partickhill",0,Coffee Shop,Café,Sandwich Place,Deli / Bodega,Supermarket,Beer Bar,Mexican Restaurant,Shopping Plaza,Restaurant,Outdoor Supply Store
2,G12,55.88006,-4.30061,GLASGOW,"West End, Dowanhill, Hillhead, Hyndland, Kelvi...",1,Convenience Store,Hotel,Gym,Italian Restaurant,Restaurant,Whisky Bar,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,Event Space
3,G13,55.89358,-4.3462,GLASGOW,"Anniesland, Knightswood, Yoker",1,Playground,Lake,Bus Stop,Whisky Bar,Gas Station,Discount Store,Dive Bar,Doner Restaurant,Electronics Store,English Restaurant
4,G14,55.88095,-4.34864,GLASGOW,"Whiteinch, Scotstoun",1,Sports Bar,Spanish Restaurant,Bus Stop,Rugby Pitch,English Restaurant,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,Event Space,Electronics Store
5,G15,55.9094,-4.36476,GLASGOW,Drumchapel,0,Supermarket,Discount Store,Shopping Mall,Whisky Bar,Event Space,Food & Drink Shop,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,English Restaurant
6,G2,55.86382,-4.2549,GLASGOW,"Blythswood Hill, Anderston",1,Bar,Hotel,Cocktail Bar,Café,Coffee Shop,Pub,Chinese Restaurant,Greek Restaurant,Indian Restaurant,Restaurant
7,G20,55.8858,-4.28176,GLASGOW,"Maryhill, North Kelvinside, Ruchill",0,Pub,Café,Supermarket,Chinese Restaurant,Grocery Store,Fast Food Restaurant,French Restaurant,Flower Shop,Fish & Chips Shop,Event Space
8,G21,55.88063,-4.22069,GLASGOW,"Balornock, Barmulloch, Cowlairs, Royston, Spri...",0,Train Station,Auto Garage,Whisky Bar,Event Space,French Restaurant,Food & Drink Shop,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,Electronics Store
9,G22,55.88998,-4.25002,GLASGOW,"Milton, Parkhouse, Possilpark",1,Racetrack,Gas Station,Train Station,Department Store,Diner,Discount Store,Dive Bar,Doner Restaurant,Electronics Store,English Restaurant


In [58]:
#visualizing the resulting clusters

# create map
gla_map_clusters = folium.Map(location=[gla_latitude, gla_longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(glasgow_merged['latitude'], glasgow_merged['longitude'], glasgow_merged['neighborhood'], glasgow_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(gla_map_clusters)
       
gla_map_clusters

In [59]:
#cluster analisis 
#Examine Cluster 1
glasgow_merged.loc[glasgow_merged['Cluster Labels'] == 0, glasgow_merged.columns[[1] + list(range(5, glasgow_merged.shape[1]))]]

Unnamed: 0,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,55.87356,0,Coffee Shop,Café,Sandwich Place,Deli / Bodega,Supermarket,Beer Bar,Mexican Restaurant,Shopping Plaza,Restaurant,Outdoor Supply Store
5,55.9094,0,Supermarket,Discount Store,Shopping Mall,Whisky Bar,Event Space,Food & Drink Shop,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,English Restaurant
7,55.8858,0,Pub,Café,Supermarket,Chinese Restaurant,Grocery Store,Fast Food Restaurant,French Restaurant,Flower Shop,Fish & Chips Shop,Event Space
8,55.88063,0,Train Station,Auto Garage,Whisky Bar,Event Space,French Restaurant,Food & Drink Shop,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,Electronics Store
10,55.90193,0,Bar,Playground,Supermarket,Whisky Bar,Food & Drink Shop,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,Event Space,English Restaurant
12,55.85748,0,Fast Food Restaurant,Electronics Store,Warehouse Store,Discount Store,Outlet Store,Pizza Place,BBQ Joint,Hardware Store,Supermarket,Pet Store
15,55.86817,0,IT Services,Performing Arts Venue,Café,Whisky Bar,French Restaurant,Diner,Discount Store,Dive Bar,Doner Restaurant,Electronics Store
18,55.83815,0,Café,Train Station,Bakery,Platform,Auto Garage,Event Space,Food & Drink Shop,Flower Shop,Fish & Chips Shop,Fast Food Restaurant
20,55.81825,0,Park,Chinese Restaurant,Café,Whisky Bar,Department Store,Discount Store,Dive Bar,Doner Restaurant,Electronics Store,English Restaurant
25,55.8477,0,Convenience Store,Pharmacy,Supermarket,Fast Food Restaurant,Bakery,Sandwich Place,Indian Restaurant,Deli / Bodega,Doner Restaurant,Dive Bar


In [60]:
#Examine Cluster 2
glasgow_merged.loc[glasgow_merged['Cluster Labels'] == 1, glasgow_merged.columns[[1] + list(range(5, glasgow_merged.shape[1]))]]

Unnamed: 0,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,55.86038,1,Bar,Coffee Shop,Italian Restaurant,Pub,Cocktail Bar,Steakhouse,Seafood Restaurant,Sandwich Place,Japanese Restaurant,Shopping Mall
2,55.88006,1,Convenience Store,Hotel,Gym,Italian Restaurant,Restaurant,Whisky Bar,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,Event Space
3,55.89358,1,Playground,Lake,Bus Stop,Whisky Bar,Gas Station,Discount Store,Dive Bar,Doner Restaurant,Electronics Store,English Restaurant
4,55.88095,1,Sports Bar,Spanish Restaurant,Bus Stop,Rugby Pitch,English Restaurant,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,Event Space,Electronics Store
6,55.86382,1,Bar,Hotel,Cocktail Bar,Café,Coffee Shop,Pub,Chinese Restaurant,Greek Restaurant,Indian Restaurant,Restaurant
9,55.88998,1,Racetrack,Gas Station,Train Station,Department Store,Diner,Discount Store,Dive Bar,Doner Restaurant,Electronics Store,English Restaurant
11,55.86619,1,Bar,Indian Restaurant,Café,Nightclub,Coffee Shop,Pub,Japanese Restaurant,Pizza Place,Whisky Bar,Movie Theater
13,55.8484,1,Bar,Shopping Plaza,Park,Grocery Store,Sandwich Place,Whisky Bar,Fish & Chips Shop,Fast Food Restaurant,Event Space,English Restaurant
14,55.87351,1,Bakery,Playground,Gym / Fitness Center,English Restaurant,Food & Drink Shop,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,Event Space,Whisky Bar
16,55.86837,1,Hotel,Sporting Goods Shop,Discount Store,Concert Hall,Chinese Restaurant,Café,Shoe Store,Movie Theater,Mobile Phone Shop,Coffee Shop


In [61]:
#Examine Cluster 3
glasgow_merged.loc[glasgow_merged['Cluster Labels'] == 2, glasgow_merged.columns[[1] + list(range(5, glasgow_merged.shape[1]))]]

Unnamed: 0,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
41,55.77811,2,Business Service,Whisky Bar,Deli / Bodega,Diner,Discount Store,Dive Bar,Doner Restaurant,Electronics Store,English Restaurant,Event Space


In [62]:
#Examine Cluster 4
glasgow_merged.loc[glasgow_merged['Cluster Labels'] == 3, glasgow_merged.columns[[1] + list(range(5, glasgow_merged.shape[1]))]]

Unnamed: 0,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,55.84824,3,Bakery,Grocery Store,Event Space,French Restaurant,Food & Drink Shop,Flower Shop,Fish & Chips Shop,Fast Food Restaurant,Whisky Bar,Gastropub
38,55.80406,3,Grocery Store,Whisky Bar,Gastropub,Diner,Discount Store,Dive Bar,Doner Restaurant,Electronics Store,English Restaurant,Event Space
44,55.91309,3,Convenience Store,Grocery Store,Gym,Fast Food Restaurant,Whisky Bar,Event Space,Food & Drink Shop,Flower Shop,Fish & Chips Shop,English Restaurant


In [63]:
#Examine Cluster 5
glasgow_merged.loc[glasgow_merged['Cluster Labels'] == 4, glasgow_merged.columns[[1] + list(range(5, glasgow_merged.shape[1]))]]

Unnamed: 0,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
36,55.87372,4,Lake,Whisky Bar,Gastropub,Diner,Discount Store,Dive Bar,Doner Restaurant,Electronics Store,English Restaurant,Event Space
