# IBM Data Science Capstone Project

# Let's explore the United Kingdom!

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction <a name="introduction"></a>

One of my life goals is to travel to Manchester to watch a football game at Old Trafford. While I am there, I want to explore the area and find the places of interest around the city. Using data from a UK government site and the Foursquare API, I will cluster the areas of Manchester and see which kinds of venues each one offers.

## Data <a name="data"></a>

The data is similar to that used in the labs and the previous submission. I will use the Google Maps API and the Foursquare API to retrieve location data for the UK. Using Folium, I will create a map of the UK showing the towns of the country. Venue data for each area will then be extracted, and I will apply the K-means clustering algorithm to group the areas together and observe the results. This project will be of interest to anyone who wants to explore a country through its geographical data and compare it with that of other developed countries.

## Methodology <a name="methodology"></a>

## First, let's extract the geographical data for the UK

In [1]:
import numpy as np # library to handle data in a vectorized manner
import gmaps
import gmaps.datasets
import pandas as pd # library for data analysis
from geopy.geocoders import Nominatim
import requests # library to handle requests
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
import json # library to handle JSON files
from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0)

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library
from bs4 import BeautifulSoup 
print('Libraries imported.')

Libraries imported.


In [2]:
UK_town_data = pd.read_csv(r'C:\Users\schandrapregaas-maha\Documents\IBM Data Science Program Code\Capstone Project\uk-towns-sample\uk-towns-sample\csv\uk-towns-sample.csv')
UK_town_data.head()

Unnamed: 0,id,name,county,country,grid_reference,easting,northing,latitude,longitude,elevation,postcode_sector,local_government_area,nuts_region,type
0,1,Aaron's Hill,Surrey,England,SU957435,495783,143522,51.18291,-0.63098,78,GU7 2,Waverley District,South East,Suburban Area
1,2,Abbas Combe,Somerset,England,ST707226,370749,122688,51.00283,-2.41825,91,BA8 0,South Somerset District,South West,Village
2,3,Abberley,Worcestershire,England,SO744675,374477,267522,52.30522,-2.37574,152,WR6 6,Malvern Hills District,West Midlands,Village
3,4,Abberton,Essex,England,TM006190,600637,219093,51.8344,0.91066,44,CO5 7,Colchester District,Eastern,Village
4,5,Abberton,Worcestershire,England,SO995534,399538,253477,52.17955,-2.00817,68,WR10 2,Wychavon District,West Midlands,Hamlet


I will use geopy to get the coordinates of the UK

In [3]:
address = 'United Kingdom'

geolocator = Nominatim(user_agent="UK_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of the UK are {}, {}.'.format(latitude, longitude))

The geographical coordinates of the UK are 54.7023545, -3.2765753.


Next, I will keep only the columns I need from this large dataframe.

In [4]:
df_uk_data = UK_town_data[['id','name', 'county', 'latitude', 'longitude', 'type']]
df_uk_data

Unnamed: 0,id,name,county,latitude,longitude,type
0,1,Aaron's Hill,Surrey,51.18291,-0.63098,Suburban Area
1,2,Abbas Combe,Somerset,51.00283,-2.41825,Village
2,3,Abberley,Worcestershire,52.30522,-2.37574,Village
3,4,Abberton,Essex,51.83440,0.91066,Village
4,5,Abberton,Worcestershire,52.17955,-2.00817,Hamlet
...,...,...,...,...,...,...
1797,1813,Ayton,Berwickshire,55.84232,-2.12285,Village
1798,1814,Ayton,Tyne and Wear,54.89416,-1.55643,Suburban Area
1799,1815,Ayton Castle,Berwickshire,55.84665,-2.12135,Locality
1800,1816,Aywick,Shetland,60.56017,-1.03269,Village


In [5]:
map_uk = folium.Map(location=[latitude, longitude], zoom_start=7)



# add markers to map
for lat, lng,name, county in zip(df_uk_data['latitude'], df_uk_data['longitude'], df_uk_data['name'], df_uk_data['county']):
    label = '{}, {}'.format(county, name)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_uk)  
    
map_uk

## Using the FourSquare API

In [6]:
CLIENT_ID = '2PAVVWKZKMDYAO3OEF2ATVCHYGHLEDPQYSAAXVCEPUEPPWXA' # your Foursquare ID
CLIENT_SECRET = 'KQE3TXD2B3VUMS4DWZCOY052SFUGI2OLHDWRJQJ3TNOUBBBK' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 2PAVVWKZKMDYAO3OEF2ATVCHYGHLEDPQYSAAXVCEPUEPPWXA
CLIENT_SECRET:KQE3TXD2B3VUMS4DWZCOY052SFUGI2OLHDWRJQJ3TNOUBBBK


## Let's isolate the data to the city of Manchester

In [7]:
address = 'Manchester'

geolocator = Nominatim(user_agent="Manchester_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Manchester are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Manchester are 53.4794892, -2.2451148.


In [8]:
manchester_data = df_uk_data[df_uk_data['county'] == 'Greater Manchester']
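# reset_index returns a new DataFrame; the result is not assigned here,
# so the original index labels (21, 175, ...) are kept and used with .loc below
manchester_data.reset_index(drop=True)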
manchester_data

Unnamed: 0,id,name,county,latitude,longitude,type
21,22,Abbey Hey,Greater Manchester,53.46514,-2.15963,Locality
175,176,Abram,Greater Manchester,53.50949,-2.59264,Village
176,177,Abram Brow,Greater Manchester,53.50791,-2.59027,Suburban Area
285,286,Acre,Greater Manchester,53.55218,-2.09701,Suburban Area
289,290,Acres,Greater Manchester,53.55834,-2.16579,Locality
369,370,Adswood,Greater Manchester,53.3913,-2.16962,Suburban Area
378,379,Affetside,Greater Manchester,53.61778,-2.37094,Village
422,423,Ainsworth,Greater Manchester,53.58799,-2.35791,Village
520,536,Alder Forest,Greater Manchester,53.4938,-2.37547,Suburban Area
533,549,Alder Root,Greater Manchester,53.53893,-2.13647,Suburban Area


## Let's create a map of Manchester

In [9]:
map_manchester = folium.Map(location=[latitude, longitude], zoom_start=10)



# add markers to map
for lat, lng,name, county in zip(manchester_data['latitude'], manchester_data['longitude'], manchester_data['name'], manchester_data['county']):
    label = '{}, {}'.format(county, name)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manchester)  
    
map_manchester

In [12]:
county_latitude = manchester_data.loc[21, 'latitude'] # latitude of the first area (index label 21)
county_longitude = manchester_data.loc[21, 'longitude'] # longitude of the first area

name = manchester_data.loc[21, 'name'] # area name

print('Latitude and longitude values of {} are {}, {}.'.format(name, 
                                                               county_latitude, 
                                                               county_longitude))

Latitude and longitude values of Abbey Hey are 53.465140000000005, -2.15963.


In [13]:
LIMIT = 100
RADIUS = 500

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    county_latitude, 
    county_longitude, 
    RADIUS, 
    LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?&client_id=2PAVVWKZKMDYAO3OEF2ATVCHYGHLEDPQYSAAXVCEPUEPPWXA&client_secret=KQE3TXD2B3VUMS4DWZCOY052SFUGI2OLHDWRJQJ3TNOUBBBK&v=20180605&ll=53.465140000000005,-2.15963&radius=500&limit=100'
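The URL above embeds the client ID and secret directly in the notebook output. As a minimal sketch of an alternative, assuming the credentials are stored in environment variables named `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` (hypothetical names, not part of the original notebook), the same query string can be built with the standard library. The helper `build_explore_url` is also an assumption made for illustration.

```python
import os
from urllib.parse import urlencode

# Endpoint taken from the cell above.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, radius=500, limit=100, version="20180605"):
    """Return an explore URL for one point, reading credentials from the environment."""
    params = {
        # Hypothetical variable names; placeholders are used when they are unset.
        "client_id": os.environ.get("FOURSQUARE_CLIENT_ID", "YOUR_CLIENT_ID"),
        "client_secret": os.environ.get("FOURSQUARE_CLIENT_SECRET", "YOUR_CLIENT_SECRET"),
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }
    # urlencode percent-encodes the comma in "ll" as %2C, which is a valid query string.
    return EXPLORE_URL + "?" + urlencode(params)


# Abbey Hey coordinates from the dataframe above.
abbey_hey_url = build_explore_url(53.46514, -2.15963)
```

Keeping the secret out of the source also means it is not echoed when the URL is displayed as a cell result.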

In [14]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '60c4964e62ba9f4e080777f8'},
 'response': {'headerLocation': 'Gorton North',
  'headerFullLocation': 'Gorton North, Manchester',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 5,
  'suggestedBounds': {'ne': {'lat': 53.46964000450001,
    'lng': -2.1520850415872688},
   'sw': {'lat': 53.4606399955, 'lng': -2.167174958412731}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '530f41b1498e5f25841b5407',
       'name': 'Donkey Sanctuary And Therapy Centre',
       'location': {'lat': 53.46488569011731,
        'lng': -2.156117265779587,
        'labeledLatLngs': [{'label': 'display',
          'lat': 53.46488569011731,
          'lng': -2.156117265779587}],
        'distance': 234,
        'cc': 'GB',
        'country': 'U

## Extract each venue category from data

In [15]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:  # fall back to the flattened column name
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [16]:
venues = []
for lat, long, ID, county, name in zip(manchester_data['latitude'], manchester_data['longitude'], manchester_data['id'], manchester_data['county'], manchester_data['name']):
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        RADIUS, 
        LIMIT)
    
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    for venue in results:
        venues.append((
            ID, 
            county,
            name,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
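The loop above indexes `venue['venue']['categories'][0]` directly, which would raise an `IndexError` for any venue returned without a category. A defensive variant might look like the sketch below; `extract_venue_rows` is a hypothetical helper, and the second sample item is invented purely to exercise the empty-category case.

```python
def extract_venue_rows(items, area_id, county, area_name, area_lat, area_lng):
    """Flatten Foursquare 'items' into tuples matching the columns used in this notebook."""
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        # Use None when a venue has no category instead of raising IndexError.
        category = categories[0].get("name") if categories else None
        rows.append((
            area_id, county, area_name, area_lat, area_lng,
            venue.get("name"), location.get("lat"), location.get("lng"), category,
        ))
    return rows


# First item is modelled on the response shown earlier; the second is made up.
sample_items = [
    {"venue": {"name": "Donkey Sanctuary And Therapy Centre",
               "location": {"lat": 53.46488569011731, "lng": -2.156117265779587},
               "categories": [{"name": "Stables"}]}},
    {"venue": {"name": "Uncategorised Place",
               "location": {"lat": 53.0, "lng": -2.0},
               "categories": []}},
]
rows = extract_venue_rows(sample_items, 22, "Greater Manchester", "Abbey Hey",
                          53.46514, -2.15963)
```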

In [17]:
# convert the venues list into a new DataFrame
venues_df = pd.DataFrame(venues)

# define the column names
venues_df.columns = ['id', 'county', 'name', 'Area_Latitude', 'Area_Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head()

(200, 9)


Unnamed: 0,id,county,name,Area_Latitude,Area_Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,22,Greater Manchester,Abbey Hey,53.46514,-2.15963,Donkey Sanctuary And Therapy Centre,53.464886,-2.156117,Stables
1,22,Greater Manchester,Abbey Hey,53.46514,-2.15963,Manchester Vacs,53.46697,-2.157742,Electronics Store
2,22,Greater Manchester,Abbey Hey,53.46514,-2.15963,Recording Studios Manchester,53.468611,-2.158323,Performing Arts Venue
3,22,Greater Manchester,Abbey Hey,53.46514,-2.15963,Heaven Pro Clean Ltd,53.467132,-2.154222,Home Service
4,22,Greater Manchester,Abbey Hey,53.46514,-2.15963,Gorton Reservoir,53.462963,-2.15364,Reservoir


In [18]:
venues_df.groupby('name').count()

name,id,county,Area_Latitude,Area_Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
Abbey Hey,5,5,5,5,5,5,5,5
Abram,5,5,5,5,5,5,5,5
Abram Brow,5,5,5,5,5,5,5,5
Acre,2,2,2,2,2,2,2,2
Adswood,5,5,5,5,5,5,5,5
Affetside,1,1,1,1,1,1,1,1
Ainsworth,4,4,4,4,4,4,4,4
Alder Forest,4,4,4,4,4,4,4,4
Alder Root,3,3,3,3,3,3,3,3
Alkrington Garden Village,4,4,4,4,4,4,4,4


Here we see how many unique categories there are

In [19]:
print('There are {} unique categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 77 unique categories.


## Analysis <a name="analysis"></a>

## Let's apply one-hot encoding to better understand the data

In [22]:
Manchester_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

# add the area name column back to the dataframe
Manchester_onehot['name'] = venues_df['name'] 

# move the area name column to the first position
fixed_columns = [Manchester_onehot.columns[-1]] + list(Manchester_onehot.columns[:-1])
Manchester_onehot = Manchester_onehot[fixed_columns]

Manchester_onehot.tail()

Unnamed: 0,name,American Restaurant,Arts & Crafts Store,Auto Garage,Bakery,Bar,Beer Bar,Bookstore,Brazilian Restaurant,Breakfast Spot,...,Supermarket,Sushi Restaurant,Tea Room,Thai Restaurant,Trail,Train Station,Tram Station,Vegetarian / Vegan Restaurant,Warehouse Store,Wine Bar
195,Audenshaw,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
196,Austerlands,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
197,Austerlands,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
198,Austerlands,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
199,Austerlands,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [23]:
Manchester_onehot.shape

(200, 78)
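The 78 columns are the 77 unique categories plus the `name` column. As a rough illustration of what the encoding step produces, the pure-Python sketch below one-hot encodes the five Abbey Hey categories listed earlier. The helper `one_hot` is hypothetical and only mimics the idea; it is not how `pd.get_dummies` is implemented.

```python
def one_hot(labels):
    """Return (sorted column names, list of 0/1 rows), one row per input label."""
    columns = sorted(set(labels))
    rows = [[1 if label == col else 0 for col in columns] for label in labels]
    return columns, rows


# Venue categories of the five Abbey Hey rows shown above.
categories = ["Stables", "Electronics Store", "Performing Arts Venue",
              "Home Service", "Reservoir"]
columns, encoded = one_hot(categories)

for label, row in zip(categories, encoded):
    print(label, row)
```

Each row contains exactly one 1, in the column of that venue's category.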

In [25]:
Manchester_grouped = Manchester_onehot.groupby('name').mean().reset_index()
Manchester_grouped

Unnamed: 0,name,American Restaurant,Arts & Crafts Store,Auto Garage,Bakery,Bar,Beer Bar,Bookstore,Brazilian Restaurant,Breakfast Spot,...,Supermarket,Sushi Restaurant,Tea Room,Thai Restaurant,Trail,Train Station,Tram Station,Vegetarian / Vegan Restaurant,Warehouse Store,Wine Bar
0,Abbey Hey,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Abram,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0
2,Abram Brow,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0
3,Acre,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0
4,Adswood,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Affetside,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Ainsworth,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Alder Forest,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Alder Root,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,...,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Alkrington Garden Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [26]:
Manchester_grouped.shape

(28, 78)
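Taking the mean of the one-hot columns per area gives the share of that area's venues that fall in each category, which is why an area with five venues in five different categories shows 0.2 for each. A minimal standard-library sketch of the same calculation follows; `category_shares` is a hypothetical helper, and the pairs are the Abbey Hey and Acre categories reported in this notebook.

```python
from collections import Counter, defaultdict


def category_shares(pairs):
    """Map each area to {category: share of that area's venues in the category}."""
    by_area = defaultdict(list)
    for area, category in pairs:
        by_area[area].append(category)
    return {
        area: {cat: count / len(cats) for cat, count in Counter(cats).items()}
        for area, cats in by_area.items()
    }


pairs = [
    ("Abbey Hey", "Stables"),
    ("Abbey Hey", "Electronics Store"),
    ("Abbey Hey", "Performing Arts Venue"),
    ("Abbey Hey", "Home Service"),
    ("Abbey Hey", "Reservoir"),
    ("Acre", "Convenience Store"),
    ("Acre", "Tram Station"),
]
shares = category_shares(pairs)
```

Because areas with very few venues produce large shares (0.5 or 1.0) from one or two data points, these rows can dominate the distance calculations used later by K-means.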

## Let's find the top 5 venues for each area in the city

In [27]:
num_top_venues = 5

for hood in Manchester_grouped['name']:
    print("----"+hood+"----")
    temp = Manchester_grouped[Manchester_grouped['name'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Abbey Hey----
                   venue  freq
0              Reservoir   0.2
1           Home Service   0.2
2                Stables   0.2
3  Performing Arts Venue   0.2
4      Electronics Store   0.2


----Abram----
                venue  freq
0                 Pub   0.2
1                 Bar   0.2
2               Trail   0.2
3                Park   0.2
4  Chinese Restaurant   0.2


----Abram Brow----
                venue  freq
0                 Pub   0.2
1                 Bar   0.2
2               Trail   0.2
3                Park   0.2
4  Chinese Restaurant   0.2


----Acre----
                 venue  freq
0    Convenience Store   0.5
1         Tram Station   0.5
2  American Restaurant   0.0
3            Pet Store   0.0
4                  Pub   0.0


----Adswood----
            venue  freq
0    Liquor Store   0.2
1       Pet Store   0.2
2             Pub   0.2
3  Breakfast Spot   0.2
4            Park   0.2


----Affetside----
                   venue  freq
0                    

In [28]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
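One caveat with ranking this way: an area with fewer categories than `num_top_venues` still has its list padded with categories whose frequency is 0, and ties are returned in no particular order. The sketch below shows one possible alternative that drops zero-frequency categories and breaks ties alphabetically. `top_categories` is a hypothetical helper, and the Acre frequencies are taken from the top-5 printout above.

```python
def top_categories(freqs, n):
    """Return up to n category names with non-zero frequency, most frequent first."""
    present = [(cat, freq) for cat, freq in freqs.items() if freq > 0]
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    present.sort(key=lambda item: (-item[1], item[0]))
    return [cat for cat, _ in present[:n]]


acre = {
    "Convenience Store": 0.5,
    "Tram Station": 0.5,
    "American Restaurant": 0.0,
    "Pet Store": 0.0,
    "Pub": 0.0,
}
top_acre = top_categories(acre, 10)
print(top_acre)
```

With this variant, Acre would list only its two observed categories rather than ten.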

In [31]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['name']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # positions after the 3rd use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
area_venues_sorted = pd.DataFrame(columns=columns)
area_venues_sorted['name'] = Manchester_grouped['name']

for ind in np.arange(Manchester_grouped.shape[0]):
    area_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Manchester_grouped.iloc[ind, :], num_top_venues)

area_venues_sorted.head()

Unnamed: 0,name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Abbey Hey,Home Service,Performing Arts Venue,Reservoir,Electronics Store,Stables,Gay Bar,Gastropub,Gas Station,Furniture / Home Store,French Restaurant
1,Abram,Pub,Trail,Bar,Chinese Restaurant,Park,Fish & Chips Shop,Department Store,Electronics Store,Event Service,Falafel Restaurant
2,Abram Brow,Pub,Trail,Bar,Chinese Restaurant,Park,Fish & Chips Shop,Department Store,Electronics Store,Event Service,Falafel Restaurant
3,Acre,Convenience Store,Tram Station,French Restaurant,Department Store,Electronics Store,Event Service,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop
4,Adswood,Park,Liquor Store,Pet Store,Breakfast Spot,Pub,Wine Bar,Electronics Store,Event Service,Falafel Restaurant,Farmers Market


## Applying K-Means to data

In [32]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

In [34]:
# set number of clusters
kclusters = 5

Manchester_grouped_clustering = Manchester_grouped.drop(columns='name')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Manchester_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 0, 0, 0, 1, 0, 0, 4, 0])
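The labels are zero-based, so label 0 corresponds to the section headed "Cluster 1" later on, label 1 to "Cluster 2", and so on. A quick way to see how unevenly the areas are spread across clusters is to count the labels. The sketch below uses only the ten labels printed above; on the full data one would pass `kmeans.labels_` instead.

```python
from collections import Counter

# First ten labels as printed above; the full array has one label per area.
first_labels = [0, 0, 0, 0, 0, 1, 0, 0, 4, 0]

cluster_sizes = Counter(first_labels)
for label, size in sorted(cluster_sizes.items()):
    print(f"label {label} (Cluster {label + 1}): {size} area(s)")
```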

In [35]:
# add clustering labels
area_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

Manchester_merged = manchester_data

# merge area_venues_sorted with manchester_data to add latitude/longitude for each area
Manchester_merged = Manchester_merged.join(area_venues_sorted.set_index('name'), on='name')



Manchester_merged.dropna(inplace=True)
Manchester_merged.head() # check the last columns!

Unnamed: 0,id,name,county,latitude,longitude,type,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
21,22,Abbey Hey,Greater Manchester,53.46514,-2.15963,Locality,0.0,Home Service,Performing Arts Venue,Reservoir,Electronics Store,Stables,Gay Bar,Gastropub,Gas Station,Furniture / Home Store,French Restaurant
175,176,Abram,Greater Manchester,53.50949,-2.59264,Village,0.0,Pub,Trail,Bar,Chinese Restaurant,Park,Fish & Chips Shop,Department Store,Electronics Store,Event Service,Falafel Restaurant
176,177,Abram Brow,Greater Manchester,53.50791,-2.59027,Suburban Area,0.0,Pub,Trail,Bar,Chinese Restaurant,Park,Fish & Chips Shop,Department Store,Electronics Store,Event Service,Falafel Restaurant
285,286,Acre,Greater Manchester,53.55218,-2.09701,Suburban Area,0.0,Convenience Store,Tram Station,French Restaurant,Department Store,Electronics Store,Event Service,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop
369,370,Adswood,Greater Manchester,53.3913,-2.16962,Suburban Area,0.0,Park,Liquor Store,Pet Store,Breakfast Spot,Pub,Wine Bar,Electronics Store,Event Service,Falafel Restaurant,Farmers Market


## Creating the cluster map

In [37]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Manchester_merged['latitude'], Manchester_merged['longitude'], Manchester_merged['name'], Manchester_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)],  # labels run from 0 to kclusters-1, so index directly
        fill=True,
        fill_color=rainbow[int(cluster)],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## Cluster 1

In [38]:
Manchester_merged.loc[Manchester_merged['Cluster Labels'] == 0, Manchester_merged.columns[[1] + list(range(5, Manchester_merged.shape[1]))]]

Unnamed: 0,name,type,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
21,Abbey Hey,Locality,0.0,Home Service,Performing Arts Venue,Reservoir,Electronics Store,Stables,Gay Bar,Gastropub,Gas Station,Furniture / Home Store,French Restaurant
175,Abram,Village,0.0,Pub,Trail,Bar,Chinese Restaurant,Park,Fish & Chips Shop,Department Store,Electronics Store,Event Service,Falafel Restaurant
176,Abram Brow,Suburban Area,0.0,Pub,Trail,Bar,Chinese Restaurant,Park,Fish & Chips Shop,Department Store,Electronics Store,Event Service,Falafel Restaurant
285,Acre,Suburban Area,0.0,Convenience Store,Tram Station,French Restaurant,Department Store,Electronics Store,Event Service,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop
369,Adswood,Suburban Area,0.0,Park,Liquor Store,Pet Store,Breakfast Spot,Pub,Wine Bar,Electronics Store,Event Service,Falafel Restaurant,Farmers Market
422,Ainsworth,Village,0.0,Home Service,Bar,Pub,Bus Stop,Wine Bar,French Restaurant,Electronics Store,Event Service,Falafel Restaurant,Farmers Market
520,Alder Forest,Suburban Area,0.0,Cricket Ground,Pub,Park,Home Service,Bakery,French Restaurant,Event Service,Falafel Restaurant,Farmers Market,Fast Food Restaurant
607,Alkrington Garden Village,Suburban Area,0.0,Grocery Store,Gay Bar,Nature Preserve,Pharmacy,Wine Bar,Fish & Chips Shop,Electronics Store,Event Service,Falafel Restaurant,Farmers Market
767,Altrincham,Town,0.0,Bar,Coffee Shop,Department Store,Pizza Place,Café,Clothing Store,Pub,French Restaurant,Pharmacy,Newsagent
863,Ancoats,Suburban Area,0.0,Coffee Shop,Platform,Bar,Beer Bar,Sandwich Place,Bakery,Pizza Place,Indian Restaurant,Hotel,Piercing Parlor


## Cluster 2

In [39]:
Manchester_merged.loc[Manchester_merged['Cluster Labels'] == 1, Manchester_merged.columns[[1] + list(range(5, Manchester_merged.shape[1]))]]

Unnamed: 0,name,type,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
378,Affetside,Village,1.0,Pub,French Restaurant,Department Store,Electronics Store,Event Service,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Wine Bar
741,Alt,Suburban Area,1.0,Pub,French Restaurant,Department Store,Electronics Store,Event Service,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Wine Bar
1490,Aspull Common,Suburban Area,1.0,Pub,Deli / Bodega,French Restaurant,Department Store,Electronics Store,Event Service,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop


## Cluster 3

In [40]:
Manchester_merged.loc[Manchester_merged['Cluster Labels'] == 2, Manchester_merged.columns[[1] + list(range(5, Manchester_merged.shape[1]))]]

Unnamed: 0,name,type,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1675,Audenshaw,Locality,2.0,Indian Restaurant,Wine Bar,French Restaurant,Department Store,Electronics Store,Event Service,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop


## Cluster 4

In [41]:
Manchester_merged.loc[Manchester_merged['Cluster Labels'] == 3, Manchester_merged.columns[[1] + list(range(5, Manchester_merged.shape[1]))]]

Unnamed: 0,name,type,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1388,Ashley Heath,Suburban Area,3.0,Restaurant,Wine Bar,Cricket Ground,Department Store,Electronics Store,Event Service,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop


## Cluster 5

In [42]:
Manchester_merged.loc[Manchester_merged['Cluster Labels'] == 4, Manchester_merged.columns[[1] + list(range(5, Manchester_merged.shape[1]))]]

Unnamed: 0,name,type,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
533,Alder Root,Suburban Area,4.0,Home Service,Auto Garage,Supermarket,Wine Bar,French Restaurant,Electronics Store,Event Service,Falafel Restaurant,Farmers Market,Fast Food Restaurant
1075,Arden Park,Suburban Area,4.0,Auto Garage,Fast Food Restaurant,Supermarket,Pub,Wine Bar,Fish & Chips Shop,Department Store,Electronics Store,Event Service,Falafel Restaurant
1434,Ashton Heath,Suburban Area,4.0,Supermarket,Racecourse,Wine Bar,Fish & Chips Shop,Deli / Bodega,Department Store,Electronics Store,Event Service,Falafel Restaurant,Farmers Market
1511,Astley Bridge,Suburban Area,4.0,Supermarket,Pizza Place,Chinese Restaurant,Furniture / Home Store,Pet Store,Pub,Wine Bar,Farmers Market,Department Store,Electronics Store
1589,Atherton,Town,4.0,Supermarket,Pub,Bar,Roller Rink,Sandwich Place,Soccer Field,Wine Bar,Fast Food Restaurant,Department Store,Electronics Store


## Results and Discussion <a name="results"></a>

From the cluster map and the cluster tables above, the majority of the areas within Manchester fall into Cluster 1. Most of the Cluster 1 areas also appear to be relatively close to the city centre, so they would probably be the most popular spots to visit on my trip.
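The closeness to the city centre is judged by eye from the map. One way to check it numerically would be to compute the great-circle distance from the Manchester coordinates returned by geopy to each area. The sketch below uses the haversine formula with a mean Earth radius of 6371 km (an approximation); `haversine_km` is a hypothetical helper, and the three areas are taken from the dataframe above.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


CENTRE = (53.4794892, -2.2451148)  # Manchester coordinates from geopy
areas = {
    "Abbey Hey": (53.46514, -2.15963),
    "Abram": (53.50949, -2.59264),
    "Affetside": (53.61778, -2.37094),
}
distances = {name: haversine_km(CENTRE[0], CENTRE[1], lat, lon)
             for name, (lat, lon) in areas.items()}

for name, km in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{name}: {km:.1f} km from the centre")
```

Averaging these distances per cluster label would show whether Cluster 1 is in fact closer to the centre than the other clusters.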

## Conclusion <a name="conclusion"></a>

The purpose of this project was to identify the places of interest within the city of Manchester that I would most likely visit while on my trip to the city.