# Opening a Sports Bar in Bath

I plan to open a sports bar in the city of Bath (UK). Bath already has numerous pubs, so I will look for an area of the city with sufficient pedestrian foot traffic and other popular venues nearby, to maximise the number of people in the vicinity of the bar.

The location should also be close to parking and public transport, so that both local people and customers from outside the area can reach it. Finally, I will consider how much competition there is in the location.

I will use Foursquare data on Bath to find bars, restaurants and other venues, to see where the competition is located and where the parking, transport links and other busy areas of the city are.

The Foursquare API will be used to request information on the top 100 venues and the top 10 for each of our assigned Areas. The resulting information will be converted to a pandas dataframe. One-hot encoding and k-means analysis will allow me to cluster the venue locations. The venue categories will then be analysed and subgroups created to encompass multiple venue types. These include:

- Restaurants
- Pubs / bars
- Coffee shops / cafés
- Shops
- Arts
- Recreation
- Other
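
The grouping into these subgroups could be sketched as a keyword lookup on the Foursquare category name. The keyword sets and the `to_subgroup` helper below are assumptions for illustration only; this section does not say which categories were placed in which subgroup.

```python
import re

# Hypothetical keyword rules, checked in order; the actual grouping may differ.
SUBGROUP_KEYWORDS = [
    ("Restaurants", {"restaurant", "joint", "diner", "steakhouse"}),
    ("Pubs / bars", {"pub", "bar", "brewery"}),
    ("Coffee shops / cafés", {"coffee", "café", "cafe"}),
    ("Shops", {"shop", "store", "bookstore", "market", "bakery"}),
    ("Arts", {"museum", "gallery", "theater", "theatre"}),
    ("Recreation", {"park", "garden", "gym", "lookout"}),
]


def to_subgroup(category):
    """Return the first subgroup with a keyword equal to a word in the category name."""
    # Match whole words so that, for example, "Parking" does not count as "park".
    words = set(re.findall(r"\w+", (category or "").lower()))
    for subgroup, keywords in SUBGROUP_KEYWORDS:
        if words & keywords:
            return subgroup
    return "Other"


for name in ["Cocktail Bar", "Vegetarian / Vegan Restaurant", "Train Station"]:
    print(name, "->", to_subgroup(name))
```

Because the rules are checked in order, a label such as "Coffee Shop" lands in the coffee subgroup rather than in Shops.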

The Foursquare developer API will be used to collect information on the top 100 venues within Bath. The results will be converted to a dataframe and the location of each venue visualised on an interactive folium map. The venue categories will be one-hot encoded, k-means from the sklearn library will be applied to the dataset, and the resulting clusters will be plotted on a folium map to see where they lie. Venue categories will also be grouped into the subgroups listed above and one-hot encoded. The resulting dataframe will be analysed visually using the Matplotlib `.plot` function, as bar charts of the number of venues per 'Area', to discern which areas contain the most of each type of venue.
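
As a rough sketch of the two techniques named above, the snippet below one-hot encodes the venue categories with `pandas.get_dummies` and runs scikit-learn's `KMeans` on the coordinates. It uses nine rows copied from the results table in this notebook. Clustering on latitude and longitude with `k=3` is an assumption made for this small sample; the per-'Area' clustering described above is not reproduced here because the Areas are not defined in this section.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Nine venues copied from the results table in this notebook.
venues = pd.DataFrame(
    [
        ("Canary Gin And Wine Bar", "Cocktail Bar", 51.382825, -2.361949),
        ("Acorn Vegetarian Kitchen", "Vegetarian / Vegan Restaurant", 51.380800, -2.358273),
        ("The Whole Bagel", "Bagel Shop", 51.382757, -2.360067),
        ("Sotto Sotto", "Italian Restaurant", 51.380802, -2.356590),
        ("Ben's Cookies", "Bakery", 51.382056, -2.360164),
        ("Tesco Express", "Grocery Store", 51.384757, -2.381982),
        ("Oldfield Park Railway Station (OLF)", "Train Station", 51.379209, -2.380352),
        ("Sainsbury's Local", "Grocery Store", 51.376758, -2.378526),
        ("Sham Castle", "Scenic Lookout", 51.382288, -2.337577),
    ],
    columns=["name", "categories", "lat", "lng"],
)

# One-hot encoding: one indicator column per distinct venue category.
onehot = pd.get_dummies(venues["categories"])

# k-means on the coordinates; k=3 is an arbitrary choice for this toy sample.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
venues["cluster"] = kmeans.fit_predict(venues[["lat", "lng"]])

print(onehot.shape)
print(venues[["name", "cluster"]])
```

On the full dataset, the one-hot columns would typically be averaged per Area before clustering, so that each Area is described by its mix of venue types.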

In [1]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# library for transforming JSON into a pandas dataframe
from pandas import json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')


In [5]:
CLIENT_ID = 'BR2KRRUOLKFBA4KITZADFSRUY1W0IICICWOER5T4HWCYXQHN' # your Foursquare ID
CLIENT_SECRET = 'HL0RLE4WCWSJJ1VXRNLI0ZHQYBCAONJO2KBT52OUQSP0TDLQ' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 200
print('Your credentials:')
print('CLIENT_ID:' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID:BR2KRRUOLKFBA4KITZADFSRUY1W0IICICWOER5T4HWCYXQHN
CLIENT_SECRET:HL0RLE4WCWSJJ1VXRNLI0ZHQYBCAONJO2KBT52OUQSP0TDLQ


#### Enter the address of the Roman Baths, in Bath city centre, and search for locations around it

In [6]:
address = 'Abbey Churchyard, Bath BA1 1LZ'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

51.38134185 -2.3596754409643284
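
geopy's `geocode` returns `None` when no match is found, in which case `location.latitude` would fail with an `AttributeError`. A minimal guard might look like the sketch below; `geocode_or_raise` is a hypothetical helper name, not part of geopy.

```python
def geocode_or_raise(geolocator, address):
    """Return (latitude, longitude) for an address, or raise if nothing matched."""
    location = geolocator.geocode(address)
    if location is None:
        # Fail with a clear message instead of an AttributeError further down.
        raise ValueError(f"No geocoding result for address: {address!r}")
    return location.latitude, location.longitude
```

With the objects defined above, this would be called as `latitude, longitude = geocode_or_raise(geolocator, address)`.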


In [7]:
radius = 2000
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, radius, LIMIT)
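
The same URL can also be assembled from a dictionary of parameters, which keeps each value next to its name and handles URL encoding. This is a sketch using only the standard library; `build_explore_url` is a hypothetical helper and the credentials shown are placeholders.

```python
from urllib.parse import urlencode


def build_explore_url(client_id, client_secret, lat, lng, version, radius, limit):
    """Build the venues/explore URL with URL-encoded query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",  # latitude,longitude of the search centre
        "v": version,
        "radius": radius,      # search radius in metres
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params)


# Placeholder credentials; substitute your own.
example_url = build_explore_url(
    "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", 51.38, -2.36, "20180604", 2000, 200
)
print(example_url)
```

Alternatively, the same dictionary can be passed to `requests.get` through its `params` argument.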

In [9]:
import requests
results = requests.get(url).json()

In [10]:
items = results['response']['groups'][0]['items']
items[0]

{'reasons': {'count': 0,
  'items': [{'summary': 'This spot is popular',
    'type': 'general',
    'reasonName': 'globalInteractionReason'}]},
 'venue': {'id': '4f75f98ce4b04262b738ab1f',
  'name': 'Canary Gin And Wine Bar',
  'location': {'address': 'Queen St',
   'lat': 51.382825457825085,
   'lng': -2.3619487466976476,
   'labeledLatLngs': [{'label': 'display',
     'lat': 51.382825457825085,
     'lng': -2.3619487466976476}],
   'distance': 228,
   'cc': 'GB',
   'city': 'Bath',
   'state': 'Bath and North East Somerset',
   'country': 'United Kingdom',
   'formattedAddress': ['Queen St',
    'Bath',
    'Bath and North East Somerset',
    'United Kingdom']},
  'categories': [{'id': '4bf58dd8d48988d11e941735',
    'name': 'Cocktail Bar',
    'pluralName': 'Cocktail Bars',
    'shortName': 'Cocktail',
    'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/nightlife/cocktails_',
     'suffix': '.png'},
    'primary': True}],
  'photos': {'count': 0, 'groups': []}},
 'referra
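
Indexing `results['response']['groups'][0]['items']` directly raises an exception if the request failed or returned no groups. A more defensive version is sketched below. It assumes the response carries a `meta` object with a numeric `code`, which is not shown in the output above, and `extract_items` is a hypothetical helper name.

```python
def extract_items(payload):
    """Return the venue items from an explore response, or [] if none are present."""
    # Assumption: a 'meta' object with an HTTP-style status code may be present.
    meta_code = payload.get("meta", {}).get("code")
    if meta_code is not None and meta_code != 200:
        raise RuntimeError(f"Foursquare request failed with code {meta_code}")
    groups = payload.get("response", {}).get("groups", [])
    return groups[0].get("items", []) if groups else []
```

It would be used as `items = extract_items(results)`.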

In [23]:
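def get_category_type(row):
    """Return the name of a venue's first category, or None if it has none."""
    # the column is 'categories' or 'venue.categories', depending on the dataframe
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']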

In [25]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues



| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Canary Gin And Wine Bar | Cocktail Bar | 51.382825 | -2.361949 |
| 1 | Acorn Vegetarian Kitchen | Vegetarian / Vegan Restaurant | 51.380800 | -2.358273 |
| 2 | The Whole Bagel | Bagel Shop | 51.382757 | -2.360067 |
| 3 | Sotto Sotto | Italian Restaurant | 51.380802 | -2.356590 |
| 4 | Ben's Cookies | Bakery | 51.382056 | -2.360164 |
| ... | ... | ... | ... | ... |
| 94 | Tesco Express | Grocery Store | 51.384757 | -2.381982 |
| 95 | Oldfield Park Railway Station (OLF) | Train Station | 51.379209 | -2.380352 |
| 96 | Sainsbury's Local | Grocery Store | 51.376758 | -2.378526 |
| 97 | Sham Castle | Scenic Lookout | 51.382288 | -2.337577 |
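
The `lat` and `lng` columns make it possible to measure how far each venue is from the search centre. The haversine sketch below uses the geocoded Roman Baths coordinates and the coordinates of Canary Gin And Wine Bar from the sample response. Its result should come out close to the `distance` of 228 reported in that response, assuming that field is in metres. The helper name `haversine_m` and the Earth radius constant are choices made for this example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


origin = (51.38134185, -2.3596754409643284)        # geocoded Roman Baths address
canary = (51.382825457825085, -2.3619487466976476)  # Canary Gin And Wine Bar
print(round(haversine_m(*origin, *canary)))
```

Applied row by row, the same function could rank candidate sites by their distance to car parks or stations.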


In [29]:
nearby_venues.groupby('categories').count()

| categories | name | lat | lng |
|---|---|---|---|
| Asian Restaurant | 1 | 1 | 1 |
| BBQ Joint | 1 | 1 | 1 |
| Bagel Shop | 1 | 1 | 1 |
| Bakery | 1 | 1 | 1 |
| Bar | 1 | 1 | 1 |
| Bed & Breakfast | 2 | 2 | 2 |
| Bookstore | 2 | 2 | 2 |
| Botanical Garden | 1 | 1 | 1 |
| Burger Joint | 1 | 1 | 1 |
| Café | 3 | 3 | 3 |
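
`groupby(...).count()` repeats the same count in every column. If only the number of venues per category is needed, `value_counts` gives a single column sorted from most to least frequent. The sketch below uses a handful of category labels copied from the venues table rather than the full dataframe.

```python
import pandas as pd

# A few category labels copied from the venues table in this notebook.
categories = pd.Series(
    ["Cocktail Bar", "Bagel Shop", "Bakery", "Grocery Store", "Train Station", "Grocery Store"],
    name="categories",
)

# One count per category, most frequent first.
counts = categories.value_counts()
print(counts.head())
```

On the full dataframe the equivalent call would be `nearby_venues['categories'].value_counts()`.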


#### Visualise venues on map

In [38]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=14) # generate map centred on the Roman Baths

# add a red circle marker to represent the Roman Baths
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Roman Baths',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add the nearby venues as blue circle markers
for lat, lng, label in zip(nearby_venues.lat, nearby_venues.lng, nearby_venues.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map

As the map shows, popular venues are most densely concentrated around Westgate Street, up towards John Street. Just north of the river, towards Bath Spa train station and Bath bus station, there is little competition for our potential sports bar, yet there is certainly foot traffic from the transport links and the available parking (Avon Street and Bath Spa Station Car Park