# Opening a Sports Bar in Bath

### Business Problem 

I am looking to open a sports bar in the city of Bath (UK). I appreciate that there are already numerous pubs in the city; however, I want to open the bar in an area with sufficient pedestrian foot traffic and other popular venues nearby, to maximise the number of people in the vicinity of the bar.

Beyond that, choosing a location that is close to parking and public transport is key, so that both local people and customers from outside the area are able to reach it. Finally, I will consider how much competition there is in the location.

### Data 

I will use Foursquare data on Bath to find bars, restaurants and other venues, to see where the competition is located as well as where parking, transport links and other busy areas of the city are.

The Foursquare API will be used to request information on the top 100 venues in Bath, and to find the 10 most common venue categories for each of our assigned areas. The resulting information will be converted to a pandas dataframe. One-hot encoding and k-means clustering will allow me to cluster the areas by the mix of venues they contain. The venue categories will then be analysed and subgroups created to encompass multiple venue types. These include:

- Restaurants
- Pubs / bars
- Coffee shops / cafés
- Shops
- Arts
- Recreation
- Other

In more detail, the Foursquare developer API will be used to collect information on the top 100 venues within Bath. The results will be converted to a dataframe and the location of each venue visualised on an interactive folium map. The venue categories will then be one-hot encoded, k-means clustering will be applied, and the resulting clusters plotted on a folium map to show where they lie. Venue categories will also be grouped into the subgroups listed above and one-hot encoded. The resulting dataframe will be analysed visually with the Matplotlib `.plot` function, as bar charts of the number of venues per area, to see which areas contain the most of each type of venue.
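
The subgrouping step described above could be sketched as follows. The keyword rules and the `to_subgroup` helper are assumptions made for illustration (the notebook does not define the exact mapping), and plain substring matching is deliberately crude, so the rules would need checking against the full list of categories.

```python
# Hypothetical keyword rules: the grouping below is an assumption,
# not a mapping taken from the notebook. Earlier rules win.
SUBGROUP_KEYWORDS = [
    ("Pubs / bars", ("pub", "bar", "brewery")),
    ("Coffee shops / cafés", ("coffee", "café", "cafe", "tea room")),
    ("Restaurants", ("restaurant", "joint", "pizza", "steakhouse")),
    ("Shops", ("store", "shop", "supermarket")),
    ("Arts", ("museum", "gallery", "theater", "arts", "music venue")),
    ("Recreation", ("park", "trail", "stadium", "sports", "garden", "tennis")),
]


def to_subgroup(category):
    """Map a Foursquare category name to one of the broad subgroups."""
    name = category.lower()
    for subgroup, keywords in SUBGROUP_KEYWORDS:
        # substring match, so 'Gastropub' is caught by the 'pub' keyword
        if any(keyword in name for keyword in keywords):
            return subgroup
    return "Other"


for example in ["Cocktail Bar", "Coffee Shop", "Bookstore", "Train Station"]:
    print(example, "->", to_subgroup(example))
```

Applied to a dataframe, this could be used as `bath_venues['Venue Category'].map(to_subgroup)` before one-hot encoding.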

In [3]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# transforming JSON into a pandas dataframe
from pandas import json_normalize # pandas >= 1.0; previously pandas.io.json.json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Folium installed
Libraries imported.


### Foursquare API 

In [9]:
CLIENT_ID = 'NK0L23UBQE4JARFAXJZ5Q4CSQ2TXXRMN44LEAR0MDCSIWDHU' # your Foursquare ID
CLIENT_SECRET = 'AIVQ0OWWVJN1UHLQ3DLWREZGOSHKUCRSR12TDNND0QN2C43Z' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 200
print('Your credentials:')
print('CLIENT_ID:' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID:NK0L23UBQE4JARFAXJZ5Q4CSQ2TXXRMN44LEAR0MDCSIWDHU
CLIENT_SECRET:AIVQ0OWWVJN1UHLQ3DLWREZGOSHKUCRSR12TDNND0QN2C43Z
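
Since the cell above prints the API secret in full, a safer pattern is to read the credentials from the environment and mask them when printing. This is a minimal sketch: the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are hypothetical, chosen only for this example.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from a mapping of environment variables.

    The variable names are hypothetical and only used in this sketch.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


def mask(secret, visible=4):
    """Show only the first few characters of a secret."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


# Demonstration with placeholder values rather than real keys
demo_env = {"FOURSQUARE_CLIENT_ID": "EXAMPLEID123", "FOURSQUARE_CLIENT_SECRET": "EXAMPLESECRET"}
client_id, client_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID:", mask(client_id))
```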


##### Entering the address of the Roman Baths, which are located in Bath city centre, so that I can search for venues around this location

In [10]:
address = 'Abbey Churchyard, Bath BA1 1LZ'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

51.38134185 -2.3596754409643284


In [11]:
radius = 2000
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, radius, LIMIT)
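
As an alternative to formatting the query string by hand, the same URL can be assembled from a dictionary of parameters, which makes each parameter easier to read and change. This is only a sketch: `build_explore_url` is a hypothetical helper, and the parameter names are the ones already used in the URL above.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, lat, lng, version, radius, limit):
    """Build the explore URL from named parameters (hypothetical helper)."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",  # latitude,longitude of the search centre
        "radius": radius,      # search radius in metres
        "limit": limit,        # maximum number of venues requested
    }
    # urlencode percent-encodes the values, e.g. the comma in 'll'
    return f"{EXPLORE_ENDPOINT}?{urlencode(params)}"


print(build_explore_url("YOUR_ID", "YOUR_SECRET", 51.38, -2.36, "20180604", 2000, 100))
```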

In [12]:
import requests
results = requests.get(url).json()

In [13]:
items = results['response']['groups'][0]['items']
items[0]

{'reasons': {'count': 0,
  'items': [{'summary': 'This spot is popular',
    'type': 'general',
    'reasonName': 'globalInteractionReason'}]},
 'venue': {'id': '4f75f98ce4b04262b738ab1f',
  'name': 'Canary Gin And Wine Bar',
  'location': {'address': 'Queen St',
   'lat': 51.382825457825085,
   'lng': -2.3619487466976476,
   'labeledLatLngs': [{'label': 'display',
     'lat': 51.382825457825085,
     'lng': -2.3619487466976476}],
   'distance': 228,
   'cc': 'GB',
   'city': 'Bath',
   'state': 'Bath and North East Somerset',
   'country': 'United Kingdom',
   'formattedAddress': ['Queen St',
    'Bath',
    'Bath and North East Somerset',
    'United Kingdom']},
  'categories': [{'id': '4bf58dd8d48988d11e941735',
    'name': 'Cocktail Bar',
    'pluralName': 'Cocktail Bars',
    'shortName': 'Cocktail',
    'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/nightlife/cocktails_',
     'suffix': '.png'},
    'primary': True}],
  'photos': {'count': 0, 'groups': []}},
 'referra

In [14]:
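def get_category_type(row):
    # the category list is stored under 'categories' or 'venue.categories',
    # depending on how the JSON was flattened
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']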
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [15]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues


Unnamed: 0,name,categories,lat,lng
0,Canary Gin And Wine Bar,Cocktail Bar,51.382825,-2.361949
1,Acorn Vegetarian Kitchen,Vegetarian / Vegan Restaurant,51.380800,-2.358273
2,The Whole Bagel,Bagel Shop,51.382757,-2.360067
3,Sotto Sotto,Italian Restaurant,51.380802,-2.356590
4,Ben's Cookies,Bakery,51.382056,-2.360164
...,...,...,...,...
95,Oldfield Park Railway Station (OLF),Train Station,51.379209,-2.380352
96,Bath Canoe Club,Sports Club,51.390057,-2.356636
97,Sainsbury's Local,Grocery Store,51.376758,-2.378526
98,Kwik-Fit,Auto Garage,51.393098,-2.349320


In [16]:
nearby_venues.groupby('categories').count()

categories,name,lat,lng
Asian Restaurant,1,1,1
Auto Garage,1,1,1
BBQ Joint,1,1,1
Bagel Shop,1,1,1
Bakery,1,1,1
Bar,1,1,1
Bed & Breakfast,2,2,2
Bookstore,2,2,2
Botanical Garden,1,1,1
Burger Joint,1,1,1


Above is the list of the top 100 venues returned by the Foursquare API, followed by a count of the venues in each category.

### Visualising the overall Data

In [17]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=14) # generate map centred on the Roman Baths

# add a red circle marker to represent the Roman Baths (the search centre)
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Roman Baths',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add each nearby venue as a blue circle marker, labelled with its category
for lat, lng, label in zip(nearby_venues.lat, nearby_venues.lng, nearby_venues.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map

As the map shows, the densest concentration of popular venues is around Westgate Street, running up towards John Street. Just north of the river, towards Bath Spa train station and Bath bus station, there is little competition for our potential sports bar; however, this area should still have foot traffic from the transport links, and parking is available nearby (Avon Street and Bath Spa Station car parks).
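
Proximity to transport can also be checked numerically rather than by eye. The sketch below uses the haversine formula to estimate the straight-line distance between two coordinates taken from the outputs above: the search centre at Abbey Churchyard and Oldfield Park Railway Station, the only station whose coordinates appear in the venue table. The `haversine_m` helper is an illustrative addition, and walking distance will be longer than the straight-line figure.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


search_centre = (51.38134185, -2.3596754409643284)  # Abbey Churchyard, from the geocoder output
oldfield_park_station = (51.379209, -2.380352)       # from the venue table

distance = haversine_m(*search_centre, *oldfield_park_station)
print(f"Straight-line distance: {distance:.0f} m")
```

The same function could be applied to any candidate site and the coordinates of car parks or bus stops once those are known.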

### Bath Area Data

I will now import postcode data from an Excel spreadsheet obtained from https://data.bathhacked.org/datasets/bath-north-east-somerset-postcodes, which contains location data for Bath postcodes. I will read this spreadsheet into pandas and group the data by neighbourhood.

In [19]:
bath_data = pd.read_excel('/Users/Olliekesner/Desktop/Bath & North East  BA1 compact .xls')
bath_data.head()

Unnamed: 0,postcode,neighbourhood,latitude,longitude
0,BA1 0QD,Abbey,51.378855,-2.35556
1,BA1 0JE,Abbey,51.378855,-2.35556
2,BA1 2ER,Abbey,51.38629,-2.364571
3,BA1 5DZ,Abbey,51.386634,-2.36111
4,BA1 0AG,Abbey,51.378855,-2.35556
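
One way to group the postcodes by neighbourhood is to average their coordinates into a single centroid per neighbourhood. The sketch below does this for the five rows shown above; the notebook itself goes on to query every postcode individually, so the centroid approach is an alternative shown for illustration, not the method used in the following cells.

```python
import pandas as pd

# Rows copied from the bath_data.head() output above
sample = pd.DataFrame({
    "postcode": ["BA1 0QD", "BA1 0JE", "BA1 2ER", "BA1 5DZ", "BA1 0AG"],
    "neighbourhood": ["Abbey"] * 5,
    "latitude": [51.378855, 51.378855, 51.38629, 51.386634, 51.378855],
    "longitude": [-2.35556, -2.35556, -2.364571, -2.36111, -2.35556],
})

# One row per neighbourhood, holding the mean latitude and longitude of its postcodes
centroids = (
    sample.groupby("neighbourhood")[["latitude", "longitude"]]
    .mean()
    .reset_index()
)
print(centroids)
```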


In [20]:
address = 'Bath'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)


51.3813864 -2.3596963


In [21]:
map_bath = folium.Map(location = [latitude, longitude], zoom_start=13)

# add markers to map
for lat, lng, neighborhood in zip(bath_data['latitude'], bath_data['longitude'], bath_data['neighbourhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html = True)
    folium.CircleMarker(
        [lat, lng],
        radius = 5,
        popup = label,
        color = 'blue',
        fill = True,
        fill_color = '#3186cc',
        fill_opacity = 0.7,
        parse_html = False).add_to(map_bath)  
    
map_bath

I will now look at up to 100 popular venues within a 500-metre radius of each of these locations.

In [22]:
CLIENT_ID = 'NK0L23UBQE4JARFAXJZ5Q4CSQ2TXXRMN44LEAR0MDCSIWDHU' # your Foursquare ID
CLIENT_SECRET = 'AIVQ0OWWVJN1UHLQ3DLWREZGOSHKUCRSR12TDNND0QN2C43Z' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 100
print('Your credentials:')
print('CLIENT_ID:' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID:NK0L23UBQE4JARFAXJZ5Q4CSQ2TXXRMN44LEAR0MDCSIWDHU
CLIENT_SECRET:AIVQ0OWWVJN1UHLQ3DLWREZGOSHKUCRSR12TDNND0QN2C43Z


In [23]:
#defining radius and limit of venues to get
radius=500
LIMIT=100

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues

In [24]:
bath_venues = getNearbyVenues(names = bath_data['neighbourhood'],
                                   latitudes = bath_data['latitude'],
                                   longitudes = bath_data['longitude'])
bath_venues.head()

Abbey
Abbey
Abbey
Abbey
Abbey
Bathavon North
Bathavon North
Bathavon North
Bathavon North
Bathavon North
Kingsmead
Kingsmead
Kingsmead
Kingsmead
Kingsmead
Kingsmead
Kingsmead
Kingsmead
Lambridge
Lambridge
Lambridge
Lambridge
Lansdown
Lansdown
Lansdown
Lansdown
Lansdown
Lansdown
Newbridge
Newbridge
Newbridge
Newbridge
Walcot
Walcot
Walcot
Walcot
Weston
Weston
Weston
Weston
Widcombe
Widcombe


Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Abbey,51.378855,-2.35556,Sotto Sotto,51.380802,-2.35659,Italian Restaurant
1,Abbey,51.378855,-2.35556,Acorn Vegetarian Kitchen,51.3808,-2.358273,Vegetarian / Vegan Restaurant
2,Abbey,51.378855,-2.35556,The White Hart,51.376511,-2.353245,Pub
3,Abbey,51.378855,-2.35556,La Perla,51.380684,-2.356204,Tapas Restaurant
4,Abbey,51.378855,-2.35556,Green Rocket Cafe,51.380693,-2.357129,Vegetarian / Vegan Restaurant


In [26]:
bath_venues.groupby('Neighbourhood').count()


Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Abbey,224,224,224,224,224,224
Bathavon North,10,10,10,10,10,10
Kingsmead,314,314,314,314,314,314
Lambridge,30,30,30,30,30,30
Lansdown,84,84,84,84,84,84
Newbridge,38,38,38,38,38,38
Walcot,24,24,24,24,24,24
Weston,14,14,14,14,14,14
Widcombe,110,110,110,110,110,110
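
These counts may be inflated by repeats. Several postcodes in `bath_data` share identical coordinates (three of the first five Abbey rows sit at 51.378855, -2.35556), so the same venue can be returned once per postcode. A possible way to count each venue once per neighbourhood is sketched below; the three sample rows, including the repeated one, are made up for illustration from venues that appear in the output above.

```python
import pandas as pd

rows = pd.DataFrame(
    [
        ("Abbey", "Sotto Sotto", 51.380802, -2.35659),
        ("Abbey", "Sotto Sotto", 51.380802, -2.35659),  # same venue found from a second postcode
        ("Abbey", "The White Hart", 51.376511, -2.353245),
    ],
    columns=["Neighbourhood", "Venue", "Venue Latitude", "Venue Longitude"],
)

# Keep one row per distinct venue within each neighbourhood
unique_venues = rows.drop_duplicates(
    subset=["Neighbourhood", "Venue", "Venue Latitude", "Venue Longitude"]
)
counts = unique_venues.groupby("Neighbourhood")["Venue"].count()
print(counts)
```

Whether to deduplicate is a design choice: leaving the repeats in weights each neighbourhood by how many postcodes fall near a venue, whereas removing them counts distinct venues only.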


In [27]:
print('There are {} unique categories.'.format(len(bath_venues['Venue Category'].unique())))

There are 107 unique categories.


#### Analysing each neighbourhood 

In [28]:
# one hot encoding
bath_onehot = pd.get_dummies(bath_venues[['Venue Category']], prefix = "", prefix_sep = "")

# add neighbourhood column back to dataframe
bath_onehot['Neighbourhood'] = bath_venues['Neighbourhood'] 

# move neighbourhood column to the first column
fixed_columns = [bath_onehot.columns[-1]] + list(bath_onehot.columns[:-1])
bath_onehot = bath_onehot[fixed_columns]

bath_onehot.head()

Unnamed: 0,Neighbourhood,American Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,...,Tennis Court,Thai Restaurant,Theater,Trail,Vacation Rental,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Women's Store
0,Abbey,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Abbey,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
2,Abbey,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Abbey,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Abbey,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0


In [29]:
bath_grouped = bath_onehot.groupby('Neighbourhood').mean().reset_index()
bath_grouped.head()

Unnamed: 0,Neighbourhood,American Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,...,Tennis Court,Thai Restaurant,Theater,Trail,Vacation Rental,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Women's Store
0,Abbey,0.008929,0.0,0.008929,0.0,0.0,0.0,0.004464,0.022321,0.013393,...,0.0,0.013393,0.008929,0.0,0.0,0.026786,0.0,0.013393,0.0,0.0
1,Bathavon North,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0
2,Kingsmead,0.0,0.0,0.015924,0.0,0.0,0.0,0.015924,0.019108,0.012739,...,0.0,0.015924,0.031847,0.0,0.0,0.015924,0.0,0.015924,0.003185,0.0
3,Lambridge,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,...,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Lansdown,0.0,0.0,0.011905,0.0,0.0,0.0,0.0,0.0,0.0,...,0.011905,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
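
The values in this table are category frequencies: taking the mean of the one-hot columns within a neighbourhood gives the share of that neighbourhood's venues that fall in each category. A toy example with made-up rows shows the idea.

```python
import pandas as pd

# Made-up venues, used only to illustrate the encoding
toy = pd.DataFrame({
    "Neighbourhood": ["Abbey", "Abbey", "Abbey", "Abbey", "Weston", "Weston"],
    "Venue Category": ["Pub", "Pub", "Café", "Hotel", "Café", "Grocery Store"],
})

# One column per category, with a 1 in the column matching each venue
onehot = pd.get_dummies(toy["Venue Category"], dtype=int)
onehot.insert(0, "Neighbourhood", toy["Neighbourhood"])

# The mean of 0/1 indicators is the fraction of venues in each category
freq = onehot.groupby("Neighbourhood").mean()
print(freq)
```

Here two of Abbey's four toy venues are pubs, so its `Pub` frequency is 0.5, and each row of `freq` sums to 1.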


In [38]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending = False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [39]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        # append 'st', 'nd', 'rd' to the top 3 venues
        columns.append('{}{} Most Common Venue'.format(ind + 1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind + 1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns = columns)
neighborhoods_venues_sorted['Neighbourhood'] = bath_grouped['Neighbourhood']

for ind in np.arange(bath_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(bath_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Abbey,Pub,French Restaurant,Café,Coffee Shop,Hotel,Italian Restaurant,Tea Room,Vegetarian / Vegan Restaurant,Bookstore,History Museum
1,Bathavon North,Plaza,Hot Dog Joint,Park,Music Venue,Café,Bed & Breakfast,Construction & Landscaping,Vacation Rental,Gastropub,Athletics & Sports
2,Kingsmead,Café,Coffee Shop,Pub,History Museum,Cocktail Bar,Restaurant,Hotel,Supermarket,Seafood Restaurant,Bookstore
3,Lambridge,Hotel,Convenience Store,Pub,Theater,Gourmet Shop,Pharmacy,Grocery Store,Rugby Stadium,Burger Joint,Café
4,Lansdown,Pub,History Museum,Park,Hotel,Restaurant,Café,Steakhouse,Performing Arts Venue,French Restaurant,Italian Restaurant


In [40]:
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors
# set number of clusters
kclusters = 5
bath_grouped_clustering = bath_grouped.drop(columns='Neighbourhood')
# run k-means clustering
kmeans = KMeans(n_clusters = kclusters, random_state = 0).fit(bath_grouped_clustering)
# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 3, 0, 0, 0, 1, 2, 4, 0], dtype=int32)
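
The labels are 0-based and follow the alphabetical neighbourhood order of `bath_grouped`. The short sketch below pairs the printed labels with the nine neighbourhood names from the earlier count table, to show which neighbourhoods each cluster contains; the names and labels are copied from the outputs above.

```python
from collections import Counter

# Neighbourhoods in the alphabetical order produced by groupby
neighbourhoods = ["Abbey", "Bathavon North", "Kingsmead", "Lambridge", "Lansdown",
                  "Newbridge", "Walcot", "Weston", "Widcombe"]
labels = [0, 3, 0, 0, 0, 1, 2, 4, 0]  # kmeans.labels_ as printed above

# Collect the neighbourhood names under each cluster label
members = {}
for name, label in zip(neighbourhoods, labels):
    members.setdefault(label, []).append(name)

cluster_sizes = Counter(labels)
for label in sorted(members):
    print(label, cluster_sizes[label], members[label])
```

With nine neighbourhoods and five clusters, four of the clusters hold a single neighbourhood each, so a smaller value of `kclusters` may be worth trying.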

In [41]:
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

bath_merged = bath_venues
bath_merged = bath_merged.merge(neighborhoods_venues_sorted, on = 'Neighbourhood')

bath_merged.head()

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Abbey,51.378855,-2.35556,Sotto Sotto,51.380802,-2.35659,Italian Restaurant,0,Pub,French Restaurant,Café,Coffee Shop,Hotel,Italian Restaurant,Tea Room,Vegetarian / Vegan Restaurant,Bookstore,History Museum
1,Abbey,51.378855,-2.35556,Acorn Vegetarian Kitchen,51.3808,-2.358273,Vegetarian / Vegan Restaurant,0,Pub,French Restaurant,Café,Coffee Shop,Hotel,Italian Restaurant,Tea Room,Vegetarian / Vegan Restaurant,Bookstore,History Museum
2,Abbey,51.378855,-2.35556,The White Hart,51.376511,-2.353245,Pub,0,Pub,French Restaurant,Café,Coffee Shop,Hotel,Italian Restaurant,Tea Room,Vegetarian / Vegan Restaurant,Bookstore,History Museum
3,Abbey,51.378855,-2.35556,La Perla,51.380684,-2.356204,Tapas Restaurant,0,Pub,French Restaurant,Café,Coffee Shop,Hotel,Italian Restaurant,Tea Room,Vegetarian / Vegan Restaurant,Bookstore,History Museum
4,Abbey,51.378855,-2.35556,Green Rocket Cafe,51.380693,-2.357129,Vegetarian / Vegan Restaurant,0,Pub,French Restaurant,Café,Coffee Shop,Hotel,Italian Restaurant,Tea Room,Vegetarian / Vegan Restaurant,Bookstore,History Museum


In [42]:
map_clusters = folium.Map(location = [latitude, longitude], zoom_start = 13)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i * x) ** 2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(bath_merged['Neighbourhood Latitude'], bath_merged['Neighbourhood Longitude'], bath_merged['Neighbourhood'], bath_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius = 5,
        popup = label,
        color = rainbow[cluster - 1],
        fill = True,
        fill_color = rainbow[cluster - 1],
        fill_opacity = 0.7).add_to(map_clusters)
       
map_clusters

## Examining Clusters

We can now examine each cluster and determine the discriminating venue categories that distinguish each cluster.

In [44]:
bath_merged.loc[bath_merged['Cluster Labels'] == 0, bath_merged.columns[[1] + list(range(5, bath_merged.shape[1]))]]

Unnamed: 0,Neighbourhood Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,51.378855,-2.356590,Italian Restaurant,0,Pub,French Restaurant,Café,Coffee Shop,Hotel,Italian Restaurant,Tea Room,Vegetarian / Vegan Restaurant,Bookstore,History Museum
1,51.378855,-2.358273,Vegetarian / Vegan Restaurant,0,Pub,French Restaurant,Café,Coffee Shop,Hotel,Italian Restaurant,Tea Room,Vegetarian / Vegan Restaurant,Bookstore,History Museum
2,51.378855,-2.353245,Pub,0,Pub,French Restaurant,Café,Coffee Shop,Hotel,Italian Restaurant,Tea Room,Vegetarian / Vegan Restaurant,Bookstore,History Museum
3,51.378855,-2.356204,Tapas Restaurant,0,Pub,French Restaurant,Café,Coffee Shop,Hotel,Italian Restaurant,Tea Room,Vegetarian / Vegan Restaurant,Bookstore,History Museum
4,51.378855,-2.357129,Vegetarian / Vegan Restaurant,0,Pub,French Restaurant,Café,Coffee Shop,Hotel,Italian Restaurant,Tea Room,Vegetarian / Vegan Restaurant,Bookstore,History Museum
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
843,51.379923,-2.368810,Bed & Breakfast,0,Pub,Coffee Shop,Café,Hotel,Clothing Store,Bar,Pizza Place,Bakery,Italian Restaurant,Tea Room
844,51.379923,-2.368500,Supermarket,0,Pub,Coffee Shop,Café,Hotel,Clothing Store,Bar,Pizza Place,Bakery,Italian Restaurant,Tea Room
845,51.379923,-2.372184,Hardware Store,0,Pub,Coffee Shop,Café,Hotel,Clothing Store,Bar,Pizza Place,Bakery,Italian Restaurant,Tea Room
846,51.379923,-2.373672,Hotel,0,Pub,Coffee Shop,Café,Hotel,Clothing Store,Bar,Pizza Place,Bakery,Italian Restaurant,Tea Room


In [45]:
bath_merged.loc[bath_merged['Cluster Labels'] == 1, bath_merged.columns[[1] + list(range(5, bath_merged.shape[1]))]]

Unnamed: 0,Neighbourhood Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
662,51.386274,-2.395584,Bed & Breakfast,1,Grocery Store,Gastropub,Brewery,Bakery,Convenience Store,Café,Bike Shop,Rental Car Location,Coffee Shop,Trail
663,51.386274,-2.395345,Brewery,1,Grocery Store,Gastropub,Brewery,Bakery,Convenience Store,Café,Bike Shop,Rental Car Location,Coffee Shop,Trail
664,51.386274,-2.393486,Gastropub,1,Grocery Store,Gastropub,Brewery,Bakery,Convenience Store,Café,Bike Shop,Rental Car Location,Coffee Shop,Trail
665,51.386274,-2.398864,Hotel,1,Grocery Store,Gastropub,Brewery,Bakery,Convenience Store,Café,Bike Shop,Rental Car Location,Coffee Shop,Trail
666,51.386274,-2.388978,Bike Shop,1,Grocery Store,Gastropub,Brewery,Bakery,Convenience Store,Café,Bike Shop,Rental Car Location,Coffee Shop,Trail
667,51.386274,-2.395825,Rental Car Location,1,Grocery Store,Gastropub,Brewery,Bakery,Convenience Store,Café,Bike Shop,Rental Car Location,Coffee Shop,Trail
668,51.386274,-2.388873,Café,1,Grocery Store,Gastropub,Brewery,Bakery,Convenience Store,Café,Bike Shop,Rental Car Location,Coffee Shop,Trail
669,51.386274,-2.388749,Bakery,1,Grocery Store,Gastropub,Brewery,Bakery,Convenience Store,Café,Bike Shop,Rental Car Location,Coffee Shop,Trail
670,51.386274,-2.388689,Convenience Store,1,Grocery Store,Gastropub,Brewery,Bakery,Convenience Store,Café,Bike Shop,Rental Car Location,Coffee Shop,Trail
671,51.386274,-2.400624,Trail,1,Grocery Store,Gastropub,Brewery,Bakery,Convenience Store,Café,Bike Shop,Rental Car Location,Coffee Shop,Trail


In [46]:
bath_merged.loc[bath_merged['Cluster Labels'] == 2, bath_merged.columns[[1] + list(range(5, bath_merged.shape[1]))]]

Unnamed: 0,Neighbourhood Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
700,51.394057,-2.351122,Pub,2,Pub,Restaurant,Gourmet Shop,Convenience Store,Pharmacy,Grocery Store,Furniture / Home Store,Performing Arts Venue,Rugby Stadium,Pizza Place
701,51.394057,-2.349912,Restaurant,2,Pub,Restaurant,Gourmet Shop,Convenience Store,Pharmacy,Grocery Store,Furniture / Home Store,Performing Arts Venue,Rugby Stadium,Pizza Place
702,51.394057,-2.355744,Pub,2,Pub,Restaurant,Gourmet Shop,Convenience Store,Pharmacy,Grocery Store,Furniture / Home Store,Performing Arts Venue,Rugby Stadium,Pizza Place
703,51.394057,-2.354881,Pizza Place,2,Pub,Restaurant,Gourmet Shop,Convenience Store,Pharmacy,Grocery Store,Furniture / Home Store,Performing Arts Venue,Rugby Stadium,Pizza Place
704,51.394057,-2.358268,Furniture / Home Store,2,Pub,Restaurant,Gourmet Shop,Convenience Store,Pharmacy,Grocery Store,Furniture / Home Store,Performing Arts Venue,Rugby Stadium,Pizza Place
705,51.394057,-2.35631,Performing Arts Venue,2,Pub,Restaurant,Gourmet Shop,Convenience Store,Pharmacy,Grocery Store,Furniture / Home Store,Performing Arts Venue,Rugby Stadium,Pizza Place
706,51.394057,-2.356761,Pub,2,Pub,Restaurant,Gourmet Shop,Convenience Store,Pharmacy,Grocery Store,Furniture / Home Store,Performing Arts Venue,Rugby Stadium,Pizza Place
707,51.394183,-2.349912,Restaurant,2,Pub,Restaurant,Gourmet Shop,Convenience Store,Pharmacy,Grocery Store,Furniture / Home Store,Performing Arts Venue,Rugby Stadium,Pizza Place
708,51.394183,-2.351122,Pub,2,Pub,Restaurant,Gourmet Shop,Convenience Store,Pharmacy,Grocery Store,Furniture / Home Store,Performing Arts Venue,Rugby Stadium,Pizza Place
709,51.394183,-2.347091,Grocery Store,2,Pub,Restaurant,Gourmet Shop,Convenience Store,Pharmacy,Grocery Store,Furniture / Home Store,Performing Arts Venue,Rugby Stadium,Pizza Place


In [47]:
bath_merged.loc[bath_merged['Cluster Labels'] == 3, bath_merged.columns[[1] + list(range(5, bath_merged.shape[1]))]]

Unnamed: 0,Neighbourhood Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
224,51.405794,-2.340673,Plaza,3,Plaza,Hot Dog Joint,Park,Music Venue,Café,Bed & Breakfast,Construction & Landscaping,Vacation Rental,Gastropub,Athletics & Sports
225,51.400812,-2.307424,Gastropub,3,Plaza,Hot Dog Joint,Park,Music Venue,Café,Bed & Breakfast,Construction & Landscaping,Vacation Rental,Gastropub,Athletics & Sports
226,51.400812,-2.306805,Bed & Breakfast,3,Plaza,Hot Dog Joint,Park,Music Venue,Café,Bed & Breakfast,Construction & Landscaping,Vacation Rental,Gastropub,Athletics & Sports
227,51.400812,-2.302287,Park,3,Plaza,Hot Dog Joint,Park,Music Venue,Café,Bed & Breakfast,Construction & Landscaping,Vacation Rental,Gastropub,Athletics & Sports
228,51.400812,-2.308502,Athletics & Sports,3,Plaza,Hot Dog Joint,Park,Music Venue,Café,Bed & Breakfast,Construction & Landscaping,Vacation Rental,Gastropub,Athletics & Sports
229,51.400812,-2.30397,Vacation Rental,3,Plaza,Hot Dog Joint,Park,Music Venue,Café,Bed & Breakfast,Construction & Landscaping,Vacation Rental,Gastropub,Athletics & Sports
230,51.412482,-2.320677,Construction & Landscaping,3,Plaza,Hot Dog Joint,Park,Music Venue,Café,Bed & Breakfast,Construction & Landscaping,Vacation Rental,Gastropub,Athletics & Sports
231,51.412482,-2.31562,Music Venue,3,Plaza,Hot Dog Joint,Park,Music Venue,Café,Bed & Breakfast,Construction & Landscaping,Vacation Rental,Gastropub,Athletics & Sports
232,51.406303,-2.306645,Café,3,Plaza,Hot Dog Joint,Park,Music Venue,Café,Bed & Breakfast,Construction & Landscaping,Vacation Rental,Gastropub,Athletics & Sports
233,51.406303,-2.31272,Hot Dog Joint,3,Plaza,Hot Dog Joint,Park,Music Venue,Café,Bed & Breakfast,Construction & Landscaping,Vacation Rental,Gastropub,Athletics & Sports


In [48]:
bath_merged.loc[bath_merged['Cluster Labels'] == 4, bath_merged.columns[[1] + list(range(5, bath_merged.shape[1]))]]

Unnamed: 0,Neighbourhood Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
724,51.399922,-2.392801,Convenience Store,4,Grocery Store,Convenience Store,Gastropub,Café,Coffee Shop,Art Gallery,Shipping Store,Farm,Women's Store,Discount Store
725,51.399922,-2.397487,Art Gallery,4,Grocery Store,Convenience Store,Gastropub,Café,Coffee Shop,Art Gallery,Shipping Store,Farm,Women's Store,Discount Store
726,51.399922,-2.397429,Farm,4,Grocery Store,Convenience Store,Gastropub,Café,Coffee Shop,Art Gallery,Shipping Store,Farm,Women's Store,Discount Store
727,51.397358,-2.389964,Grocery Store,4,Grocery Store,Convenience Store,Gastropub,Café,Coffee Shop,Art Gallery,Shipping Store,Farm,Women's Store,Discount Store
728,51.397358,-2.391971,Café,4,Grocery Store,Convenience Store,Gastropub,Café,Coffee Shop,Art Gallery,Shipping Store,Farm,Women's Store,Discount Store
729,51.397358,-2.389389,Café,4,Grocery Store,Convenience Store,Gastropub,Café,Coffee Shop,Art Gallery,Shipping Store,Farm,Women's Store,Discount Store
730,51.397358,-2.388555,Gastropub,4,Grocery Store,Convenience Store,Gastropub,Café,Coffee Shop,Art Gallery,Shipping Store,Farm,Women's Store,Discount Store
731,51.399436,-2.389964,Grocery Store,4,Grocery Store,Convenience Store,Gastropub,Café,Coffee Shop,Art Gallery,Shipping Store,Farm,Women's Store,Discount Store
732,51.399436,-2.38882,Shipping Store,4,Grocery Store,Convenience Store,Gastropub,Café,Coffee Shop,Art Gallery,Shipping Store,Farm,Women's Store,Discount Store
733,51.399436,-2.392801,Convenience Store,4,Grocery Store,Convenience Store,Gastropub,Café,Coffee Shop,Art Gallery,Shipping Store,Farm,Women's Store,Discount Store


Looking at the clusters, those labelled 0, 1, 2 and 4 (the first, second, third and fifth clusters) all have either a pub or a gastropub among their three most common venues. This gives a first indication that the cluster labelled 3 (the fourth cluster, which contains Bathavon North) may be the best area for our sports bar, as it has the least competition and so may have the greatest need for such an establishment.
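
This observation can be checked with a few lines of code. The sketch below lists the three most common categories per neighbourhood, copied from the tables above, and finds the neighbourhoods with no pub-like category among them. Two things are assumptions: the `PUB_LIKE` set is a choice made for this example, and the neighbourhood names for Newbridge, Walcot, Weston and Widcombe are inferred from row order, because the cluster tables do not show the name column.

```python
# Top three categories per neighbourhood, copied from the tables above.
# Names for Newbridge, Walcot, Weston and Widcombe are inferred from row order.
top3 = {
    "Abbey": ["Pub", "French Restaurant", "Café"],
    "Bathavon North": ["Plaza", "Hot Dog Joint", "Park"],
    "Kingsmead": ["Café", "Coffee Shop", "Pub"],
    "Lambridge": ["Hotel", "Convenience Store", "Pub"],
    "Lansdown": ["Pub", "History Museum", "Park"],
    "Newbridge": ["Grocery Store", "Gastropub", "Brewery"],
    "Walcot": ["Pub", "Restaurant", "Gourmet Shop"],
    "Weston": ["Grocery Store", "Convenience Store", "Gastropub"],
    "Widcombe": ["Pub", "Coffee Shop", "Café"],
}

# Categories treated as direct competition (an assumption for this sketch)
PUB_LIKE = {"Pub", "Gastropub", "Bar", "Sports Bar"}

without_pub = [name for name, cats in top3.items() if not PUB_LIKE.intersection(cats)]
print(without_pub)
```

Only the top three categories are considered here, so a neighbourhood can still contain pubs further down its list, as Bathavon North does with a gastropub in ninth place.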

### Conclusions

In conclusion, the analysis indicates that the fourth cluster (cluster label 3) is likely to be the best area for our sports bar, as it has the least competition and so may have the greatest need for such an establishment.
