# Where in London needs coffee?
## Aim: To use Foursquare data to identify areas of Central London that need new coffee shops, based on the number of coffee shops already available and the ratings of those coffee shops.

## Introduction

For this project, I set out to find the best place in London to open a new coffee shop, taking into account the number of coffee shops already in each area and the ratings they have. If an area has lots of coffee shops with mainly good reviews, its customers are likely to be loyal and unlikely to switch to a new business. The best area would therefore have few coffee shops, lower ratings on the coffee shops already there, or both.

## Data

The data I use is the coffee shop category from Foursquare for Central London, summarised as the number of coffee shops and the average rating of those coffee shops. I also acquired population data and rental data for the postcodes from doogal.co.uk and home.co.uk.

## Method

### Data fetching and cleaning

In [157]:
import pandas as pd
import numpy as np
!pip3 --quiet install lxml
!pip3 --quiet install folium
import folium
import requests
import json
!pip3 --quiet install scikit-learn
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors

In [3]:
url = 'https://en.wikipedia.org/wiki/EC_postcode_area'
df_list = pd.read_html(url)
east = df_list[1]
east.drop(['Post town', 'Local authority area(s)'], axis=1, inplace=True)
east.dropna(inplace=True)
url = 'https://en.wikipedia.org/wiki/WC_postcode_area'
df_list = pd.read_html(url)
west = df_list[1]
west.drop(['Post town', 'Local authority area(s)'], axis=1, inplace=True)
west.dropna(inplace=True)
central = pd.concat([east, west], ignore_index=True)  # DataFrame.append was removed in pandas 2.0
central.columns = ['Postcode', 'Area']
central.head()

Unnamed: 0,Postcode,Area
0,EC1A,St Bartholomew's Hospital
1,EC1M,"Clerkenwell, Farringdon"
2,EC1N,Hatton Garden
3,EC1R,"Finsbury, Finsbury Estate (west)"
4,EC1V,"Finsbury (east), Moorfields Eye Hospital"


In [4]:
coords = pd.read_csv('Central_London_Coordinates.csv')
coords.head()
nbrhds = pd.merge(central,coords)
nbrhds.columns = ['Postcode', 'Area', 'Lat', 'Long']
nbrhds.head()

Unnamed: 0,Postcode,Area,Lat,Long
0,EC1A,St Bartholomew's Hospital,51.5183,-0.0991
1,EC1M,"Clerkenwell, Farringdon",51.5209,-0.1006
2,EC1N,Hatton Garden,51.5196,-0.1079
3,EC1R,"Finsbury, Finsbury Estate (west)",51.5242,-0.1072
4,EC1V,"Finsbury (east), Moorfields Eye Hospital",51.5268,-0.0954


In [5]:
address = 'London, GB'

latitude = 51.51492
longitude = -0.10084
print('The geographical coordinates of London are {}, {}.'.format(latitude, longitude))

# create map of London using latitude and longitude values
map_london = folium.Map(location=[latitude, longitude], zoom_start=14)

# add markers to map
for lat, lng, area in zip(nbrhds['Lat'], nbrhds['Long'], nbrhds['Area']):
    label = area
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_london)

map_london

The geographical coordinates of London are 51.51492, -0.10084.


In [6]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID (redacted)
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret (redacted)
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_CLIENT_ID
CLIENT_SECRET:YOUR_CLIENT_SECRET


In [11]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT=50):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name'],
            v['venue']['id']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Area', 
                  'Area Latitude', 
                  'Area Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category',
                  'Venue ID']
    
    return(nearby_venues)

In [12]:
clondon_venues = getNearbyVenues(names=nbrhds['Area'],
                                   latitudes=nbrhds['Lat'],
                                   longitudes=nbrhds['Long']
                                  )
print('There are {} unique venues.'.format(len(clondon_venues['Venue'].unique())))

There are 854 unique venues.


Next, the coffee shops are added to the map in orange.

In [13]:
clondon_coffee = clondon_venues[clondon_venues['Venue Category'] == 'Coffee Shop']

for lat, lng, name in zip(clondon_coffee['Venue Latitude'], clondon_coffee['Venue Longitude'], clondon_coffee['Venue']):
    label = name
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='orange',
        fill=True,
        fill_color='#FFA500',
        fill_opacity=0.7,
        parse_html=False).add_to(map_london)
map_london

From this, we can work out which areas have the fewest coffee shops.

In [14]:
coffee_count = clondon_coffee[['Area','Venue']].groupby('Area').count()
coffee_count['Venue Count'] = coffee_count['Venue']/coffee_count['Venue'].sum()
coffee_count = coffee_count[coffee_count['Venue Count'] == min(coffee_count['Venue Count'])].reset_index()
coffee_count.drop('Venue', axis=1, inplace=True)
min_coffee = pd.merge(coffee_count, clondon_coffee)
min_coffee.head()

Unnamed: 0,Area,Venue Count,Area Latitude,Area Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Venue ID
0,Charing Cross,0.007143,51.5077,-0.1226,Notes Music & Coffee,51.509696,-0.126985,Coffee Shop,4cdadd2dc409b60cac66d11a
1,"Somerset House, Temple (west)",0.007143,51.5104,-0.1152,Lundenwic,51.512823,-0.118343,Coffee Shop,55c34aac498e536d77bc73b0


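As can be seen above, Charing Cross and Somerset House, Temple (west) have only one coffee shop each. Despite its name, the 'Venue Count' column holds each area's share of all coffee shop records, so 0.007143 corresponds to 1 record out of 140. These areas are shown in red on the map.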

In [15]:
for lat, lng, name in zip(min_coffee['Area Latitude'], min_coffee['Area Longitude'], min_coffee['Area']):
    label = name
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#FF0000',
        fill_opacity=0.7,
        parse_html=False).add_to(map_london)
map_london
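
One caveat with the count above: because only the coffee shop rows are grouped, an area with no coffee shops at all never appears in the result, so the minimum found is the minimum among areas that have at least one. A minimal sketch of a zero-aware count follows. The toy area names and the `all_areas` list are illustrative assumptions, not the project data.

```python
import pandas as pd

# Toy venue table; area names and venues are made up for illustration
venues = pd.DataFrame({
    'Area': ['A', 'A', 'B', 'C', 'C'],
    'Venue Category': ['Coffee Shop', 'Pub', 'Coffee Shop', 'Pub', 'Hotel'],
})
# Every area that was queried, including one ('D') that returned no venues
all_areas = ['A', 'B', 'C', 'D']

coffee = venues[venues['Venue Category'] == 'Coffee Shop']

# Count coffee shops per area, then reindex so missing areas show up as 0
coffee_per_area = (
    coffee.groupby('Area').size()
          .reindex(all_areas, fill_value=0)
          .rename('Coffee Shops')
)

# Areas tied for the fewest coffee shops (here the ones with none)
fewest = coffee_per_area[coffee_per_area == coffee_per_area.min()]
print(fewest)
```

In the notebook, the equivalent of `all_areas` would presumably be the 'Area' column of `nbrhds`.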

Next, we would like to know the ratings of the shops in those two areas.

In [37]:
rating_list = []
for x in min_coffee['Venue ID']:
    url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(x, CLIENT_ID, CLIENT_SECRET, VERSION)
    result = requests.get(url).json()
    rating_list.append(result['response']['venue']['rating'])
min_coffee['Rating'] = rating_list
min_coffee

Unnamed: 0,Area,Venue Count,Area Latitude,Area Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Venue ID,Rating
0,Charing Cross,0.007143,51.5077,-0.1226,Notes Music & Coffee,51.509696,-0.126985,Coffee Shop,4cdadd2dc409b60cac66d11a,8.3
1,"Somerset House, Temple (west)",0.007143,51.5104,-0.1152,Lundenwic,51.512823,-0.118343,Coffee Shop,55c34aac498e536d77bc73b0,8.8
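
The lookup above indexes `result['response']['venue']['rating']` directly, which raises a `KeyError` if a response has no `rating` field. A minimal sketch of a more forgiving extraction follows, assuming the same response shape as in the cell above. The helper `extract_rating` and the sample dictionaries are hypothetical stand-ins, not real Foursquare output.

```python
def extract_rating(payload):
    """Return the venue rating from a venue-details response, or None if absent."""
    venue = payload.get('response', {}).get('venue', {})
    return venue.get('rating')


# Hand-written stand-ins for API responses (not real Foursquare output)
with_rating = {'response': {'venue': {'name': 'Example Cafe', 'rating': 8.3}}}
without_rating = {'response': {'venue': {'name': 'New Cafe'}}}

ratings = [extract_rating(p) for p in (with_rating, without_rating)]
print(ratings)  # [8.3, None]
```

Missing ratings then appear as `None` (NaN once placed in a DataFrame column) instead of stopping the loop.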


These results are not very promising for opening a coffee shop: although each of these areas appears to have only one coffee shop, both shops are quite highly rated (8.3 and 8.8). Let's try a new approach: cluster the areas, then look at the number of coffee shops in each cluster.
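
The clustering that follows groups areas by their mix of venue categories. To make that input concrete, here is a toy sketch of how one-hot encoding followed by a per-area mean turns a venue list into category frequencies. The data is made up and only illustrates the shape of the feature table.

```python
import pandas as pd

# Toy venue list; two areas with different venue mixes
toy = pd.DataFrame({
    'Area': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Pub', 'Hotel', 'Pub', 'Pub'],
})

# One column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(toy['Venue Category'], dtype=float)
onehot.insert(0, 'Area', toy['Area'])

# The mean of the indicator columns is the share of each category in the area
profile = onehot.groupby('Area').mean()
print(profile)
```

Each row of `profile` sums to 1, so k-means compares areas by the proportion of each venue type rather than by raw venue counts.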

In [62]:
# one hot encoding
clondon_onehot = pd.get_dummies(clondon_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
clondon_onehot['Area'] = clondon_venues['Area'] 

# move neighborhood column to the first column
fixed_columns = [clondon_onehot.columns[-1]] + list(clondon_onehot.columns[:-1])
clondon_onehot = clondon_onehot[fixed_columns]

clondon_grouped = clondon_onehot.groupby('Area').mean().reset_index()

num_top_venues = 5

for hood in clondon_grouped['Area']:
    #print("----"+hood+"----")
    temp = clondon_grouped[clondon_grouped['Area'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    #print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    #print('\n')
    
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Area']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # past 'st', 'nd', 'rd': fall back to the generic 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Area'] = clondon_grouped['Area']

for ind in np.arange(clondon_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(clondon_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Area,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bank of England,Hotel,Gym / Fitness Center,Coffee Shop,Restaurant,Asian Restaurant,French Restaurant,Cocktail Bar,Café,Boxing Gym,Steakhouse
1,Barbican,Gym / Fitness Center,Coffee Shop,Hotel,Food Truck,Indie Movie Theater,Italian Restaurant,Bar,Concert Hall,Piano Bar,Burrito Place
2,Blackfriars,Coffee Shop,Art Museum,Sandwich Place,Bar,Bakery,Cocktail Bar,Pub,Bookstore,Pizza Place,Restaurant
3,"Bloomsbury, British Museum, Southampton Row",Hotel,Pub,Coffee Shop,Plaza,Bookstore,Hotel Bar,Exhibit,Café,Burger Joint,Gym / Fitness Center
4,"Broadgate, Liverpool Street",Food Truck,Coffee Shop,Cocktail Bar,Boxing Gym,Burger Joint,Plaza,Café,Pizza Place,Chinese Restaurant,Lounge


In [67]:
# set number of clusters
kclusters = 5

clondon_grouped_clustering = clondon_grouped.drop('Area', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(clondon_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

# add clustering labels
neighborhoods_venues_sorted.drop('Cluster Labels', axis=1, inplace=True, errors='ignore')
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

clondon_merged = nbrhds
# merge clondon_grouped with clondon_data to add latitude/longitude for each area
clondon_merged = clondon_merged.join(neighborhoods_venues_sorted.set_index('Area'), on='Area')

clondon_merged.head() # check the last columns!

Unnamed: 0,Postcode,Area,Lat,Long,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,EC1A,St Bartholomew's Hospital,51.5183,-0.0991,4,Gym / Fitness Center,Wine Bar,Italian Restaurant,Garden,Coffee Shop,French Restaurant,Art Gallery,Café,Sandwich Place,Cocktail Bar
1,EC1M,"Clerkenwell, Farringdon",51.5209,-0.1006,4,French Restaurant,Pub,Hotel,Gym / Fitness Center,Vietnamese Restaurant,Wine Bar,Beer Bar,Café,Coffee Shop,Modern European Restaurant
2,EC1N,Hatton Garden,51.5196,-0.1079,0,Coffee Shop,Pub,Hotel,Wine Bar,Vietnamese Restaurant,Gym / Fitness Center,Beer Bar,Sushi Restaurant,French Restaurant,Food Truck
3,EC1R,"Finsbury, Finsbury Estate (west)",51.5242,-0.1072,0,Pub,Bar,Hotel,Coffee Shop,Italian Restaurant,Pizza Place,Vietnamese Restaurant,Breakfast Spot,Bakery,Cocktail Bar
4,EC1V,"Finsbury (east), Moorfields Eye Hospital",51.5268,-0.0954,3,Food Truck,Coffee Shop,Pub,Art Gallery,Gym / Fitness Center,Italian Restaurant,Vietnamese Restaurant,Café,Turkish Restaurant,Ramen Restaurant


In [70]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=14)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(clondon_merged['Lat'], clondon_merged['Long'], clondon_merged['Area'], clondon_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [95]:
cluster0 = clondon_merged.loc[clondon_merged['Cluster Labels'] == 0]
cluster1 = clondon_merged.loc[clondon_merged['Cluster Labels'] == 1]
cluster2 = clondon_merged.loc[clondon_merged['Cluster Labels'] == 2]
cluster3 = clondon_merged.loc[clondon_merged['Cluster Labels'] == 3]
cluster4 = clondon_merged.loc[clondon_merged['Cluster Labels'] == 4]
coffee_cluster0 = pd.merge(cluster0, clondon_coffee)
coffee_cluster0 = coffee_cluster0[['Postcode', 'Area', 'Area Latitude', 'Area Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude']]
coffee_cluster1 = pd.merge(cluster1, clondon_coffee)
coffee_cluster1 = coffee_cluster1[['Postcode', 'Area', 'Area Latitude', 'Area Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude']]
coffee_cluster2 = pd.merge(cluster2, clondon_coffee)
coffee_cluster2 = coffee_cluster2[['Postcode', 'Area', 'Area Latitude', 'Area Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude']]
coffee_cluster3 = pd.merge(cluster3, clondon_coffee)
coffee_cluster3 = coffee_cluster3[['Postcode', 'Area', 'Area Latitude', 'Area Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude']]
coffee_cluster4 = pd.merge(cluster4, clondon_coffee)
coffee_cluster4 = coffee_cluster4[['Postcode', 'Area', 'Area Latitude', 'Area Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude']]

In [102]:
coffee_cluster2.shape

(8, 7)

Coffee cluster 2 has the fewest rows (8), i.e. the fewest coffee shops of the five clusters, so it could potentially be the best choice for opening a coffee shop. However, a low count on its own isn't very convincing evidence for a new business. Let's look at the population in each of these areas.
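
The per-cluster comparison can also be done in one step instead of inspecting five separate DataFrames. A minimal sketch with toy stand-ins for `clondon_merged` and `clondon_coffee` follows; the values are illustrative, not the project data.

```python
import pandas as pd

# Toy stand-in for clondon_merged: one row per area with its cluster label
areas = pd.DataFrame({
    'Area': ['A', 'B', 'C', 'D'],
    'Cluster Labels': [0, 0, 1, 2],
})
# Toy stand-in for clondon_coffee: one row per coffee shop
coffee = pd.DataFrame({
    'Area': ['A', 'A', 'B', 'C', 'D', 'D', 'D'],
    'Venue': ['v1', 'v2', 'v3', 'v4', 'v5', 'v6', 'v7'],
})

# Attach the cluster label to every coffee shop, then count per cluster
shops_per_cluster = (
    areas.merge(coffee, on='Area')
         .groupby('Cluster Labels')['Venue']
         .count()
)
print(shops_per_cluster)
print('Cluster with fewest coffee shops:', shops_per_cluster.idxmin())
```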

In [123]:
postcodes = pd.read_csv('Postcode districts.csv')
postcodes.dropna()  # note: the result is not assigned, so this line leaves postcodes unchanged
coffee_cluster0 = pd.merge(coffee_cluster0, postcodes)[['Postcode', 'Area', 'Area Latitude', 'Area Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude', 'Population']]
pop_cluster0 = coffee_cluster0['Population'].unique().sum()
coffee_cluster1 = pd.merge(coffee_cluster1, postcodes)[['Postcode', 'Area', 'Area Latitude', 'Area Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude', 'Population']]
pop_cluster1 = coffee_cluster1['Population'].unique().sum()
coffee_cluster2 = pd.merge(coffee_cluster2, postcodes)[['Postcode', 'Area', 'Area Latitude', 'Area Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude', 'Population']]
pop_cluster2 = coffee_cluster2['Population'].unique().sum()
coffee_cluster3 = pd.merge(coffee_cluster3, postcodes)[['Postcode', 'Area', 'Area Latitude', 'Area Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude', 'Population']]
pop_cluster3 = coffee_cluster3['Population'].unique().sum()
coffee_cluster4 = pd.merge(coffee_cluster4, postcodes)[['Postcode', 'Area', 'Area Latitude', 'Area Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude', 'Population']]
pop_cluster4 = coffee_cluster4['Population'].unique().sum()
print("The number of people per coffee shop in Cluster 0 %.2f, Cluster 1 %.2f, Cluster 2 %.2f, Cluster 3 %.2f and Cluster 4 %.2f"% (pop_cluster0/coffee_cluster0.shape[0], pop_cluster1/coffee_cluster1.shape[0], pop_cluster2/coffee_cluster2.shape[0], pop_cluster3/coffee_cluster3.shape[0], pop_cluster4/coffee_cluster4.shape[0]))

The number of people per coffee shop in Cluster 0 437.30, Cluster 1 53.40, Cluster 2 434.62, Cluster 3 1124.83 and Cluster 4 819.66


Judging by the population of each cluster, the place most in need of a coffee shop would be cluster 3, with 1124.83 people per coffee shop. That looks a little more promising!
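
The people-per-coffee-shop figure is the summed population of a cluster's postcodes divided by its number of coffee shop rows. The cell above sums the unique population values, which would undercount if two postcodes happened to share the same population. A sketch that deduplicates on postcode instead follows, using made-up numbers chosen to show the difference.

```python
import pandas as pd

# One row per coffee shop; toy numbers where X1 and X3 share a population
cluster = pd.DataFrame({
    'Postcode': ['X1', 'X1', 'X2', 'X3'],
    'Venue': ['v1', 'v2', 'v3', 'v4'],
    'Population': [5000, 5000, 3000, 5000],
})

# Count each postcode's population once, keyed on the postcode itself
population = cluster.drop_duplicates('Postcode')['Population'].sum()
people_per_shop = population / len(cluster)

# For comparison: summing unique values drops X3's population
by_value = cluster['Population'].unique().sum()

print(people_per_shop)            # 3250.0
print(by_value / len(cluster))    # 2000.0
```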

### Median rents per postcode
Using data from home.co.uk, the median rent for each postcode can be gleaned.

In [194]:
median_rent = pd.read_csv('Median_Rents_pcm.csv')
median_rent.columns = ['Postcode', 'Median Rent pcm']
coffee_cluster0 = pd.merge(coffee_cluster0, median_rent)
coffee_cluster1 = pd.merge(coffee_cluster1, median_rent)
coffee_cluster2 = pd.merge(coffee_cluster2, median_rent)
coffee_cluster3 = pd.merge(coffee_cluster3, median_rent)
coffee_cluster4 = pd.merge(coffee_cluster4, median_rent)
print("The mean rent pcm in Cluster 0 £%.2f, Cluster 1 £%.2f, Cluster 2 £%.2f, Cluster 3 £%.2f and Cluster 4 £%.2f" % (coffee_cluster0['Median Rent pcm'].mean(), coffee_cluster1['Median Rent pcm'].mean(), coffee_cluster2['Median Rent pcm'].mean(), coffee_cluster3['Median Rent pcm'].mean(), coffee_cluster4['Median Rent pcm'].mean()))

The mean rent pcm in Cluster 0 £2263.82, Cluster 1 £2305.29, Cluster 2 £2817.00, Cluster 3 £2324.56 and Cluster 4 £2292.81


From this data, the cheapest place to open a new coffee shop would be Cluster 0 (£2263.82 pcm).

Finally, let's see how much the rent would cost per potential customer.

In [189]:
print("Rent per potential customer: Cluster 0 £%.2f, Cluster 1 £%.2f, Cluster 2 £%.2f, Cluster 3 £%.2f, Cluster 4 £%.2f" % (coffee_cluster0['Median Rent pcm'].mean()/pop_cluster0, coffee_cluster1['Median Rent pcm'].mean()/pop_cluster1, coffee_cluster2['Median Rent pcm'].mean()/pop_cluster2, coffee_cluster3['Median Rent pcm'].mean()/pop_cluster3, coffee_cluster4['Median Rent pcm'].mean()/pop_cluster4))

Rent per potential customer: Cluster 0 £0.13, Cluster 1 £1.03, Cluster 2 £0.81, Cluster 3 £0.11, Cluster 4 £0.09


In [195]:
coffee_cluster3

Unnamed: 0,Postcode,Area,Area Latitude,Area Longitude,Venue,Venue Latitude,Venue Longitude,Population,Median Rent pcm
0,EC1V,"Finsbury (east), Moorfields Eye Hospital",51.5268,-0.0954,Central Street Cafe,51.526387,-0.096519,13065.0,2253
1,EC1V,"Finsbury (east), Moorfields Eye Hospital",51.5268,-0.0954,Jimmy And The Bee,51.526402,-0.100223,13065.0,2253
2,EC1V,"Finsbury (east), Moorfields Eye Hospital",51.5268,-0.0954,Westland Coffee & Wine,51.528194,-0.090467,13065.0,2253
3,EC1V,"Finsbury (east), Moorfields Eye Hospital",51.5268,-0.0954,Fix Coffee,51.523333,-0.09326,13065.0,2253
4,EC1V,"Finsbury (east), Moorfields Eye Hospital",51.5268,-0.0954,Goswell Road Coffee,51.525715,-0.099773,13065.0,2253
5,EC1Y,"St Luke's, Bunhill Fields",51.5235,-0.0903,Fix Coffee,51.523333,-0.09326,3928.0,2253
6,EC1Y,"St Luke's, Bunhill Fields",51.5235,-0.0903,Ozone Coffee Roasters,51.524693,-0.086737,3928.0,2253
7,EC1Y,"St Luke's, Bunhill Fields",51.5235,-0.0903,Giddy Up,51.522191,-0.09347,3928.0,2253
8,EC1Y,"St Luke's, Bunhill Fields",51.5235,-0.0903,Shoreditch Grind,51.525781,-0.087828,3928.0,2253
9,EC1Y,"St Luke's, Bunhill Fields",51.5235,-0.0903,Hermanos Colombian coffee Roasters,51.525835,-0.087688,3928.0,2253


## Results

As can be seen from the analysis above, there was no conclusive answer as to where the best place to open a coffee shop would be. The factors taken into consideration were the scale of competition from other coffee shops, the population of the area, the rent of the area and the rent in comparison to the number of potential customers. Based on these factors, I think the best place to open a coffee shop would be in cluster 3. Although its rent is not the cheapest (£2324.56 pcm), it has the highest potential customer base (1124.83 people per coffee shop) and one of the lowest rents per potential customer (£0.11, second only to cluster 4 at £0.09). Cluster 3 covers areas including Finsbury (east), Moorfields Eye Hospital, St Luke's, Bunhill Fields, Broadgate, Liverpool Street and Barbican (turquoise on the map above).
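
For reference, the three sets of figures printed above can be collected side by side. The sketch below copies those printed values by hand and picks out the best cluster on each measure. It does not recompute anything from the underlying data.

```python
# Figures copied from the printed outputs above, keyed by cluster label
people_per_shop = {0: 437.30, 1: 53.40, 2: 434.62, 3: 1124.83, 4: 819.66}
mean_rent_pcm = {0: 2263.82, 1: 2305.29, 2: 2817.00, 3: 2324.56, 4: 2292.81}
rent_per_customer = {0: 0.13, 1: 1.03, 2: 0.81, 3: 0.11, 4: 0.09}

# Best cluster on each measure
most_people_per_shop = max(people_per_shop, key=people_per_shop.get)
cheapest_rent = min(mean_rent_pcm, key=mean_rent_pcm.get)
cheapest_per_customer = min(rent_per_customer, key=rent_per_customer.get)

# Print a small comparison table
print(f"{'Cluster':>7} {'People/shop':>12} {'Mean rent':>10} {'Rent/customer':>14}")
for c in sorted(people_per_shop):
    print(f"{c:>7} {people_per_shop[c]:>12.2f} "
          f"{mean_rent_pcm[c]:>10.2f} {rent_per_customer[c]:>14.2f}")

print(most_people_per_shop, cheapest_rent, cheapest_per_customer)  # 3 0 4
```

No single cluster wins on every measure, which is why the choice of cluster 3 is a judgment call rather than a clear-cut result.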

## Discussion

During this project I encountered several issues, including a poorly defined brief, limited Foursquare queries and difficulty acquiring data. My brief was not well defined for this problem, as I did not have clear goals about which factors are the most important for a new business. If I were to do this project again, I would research more thoroughly which factor is the most important to capitalise on, e.g. rent, potential competitors or some other factor. I was also unable to query Foursquare as many times as I would have liked for the ratings of each coffee shop under the Sandbox tier account, and so had to change the way I went about this project. The last major problem was getting the data I would have liked to use: the ratings of each coffee shop, as mentioned, as well as the average sales for coffee shops per area, the split between independent and chain coffee shops in each area, and so on. Such datasets were hard to find by searching online, especially free ones, as most were behind a paywall.

## Conclusion

I concluded from this project that the best place to open a coffee shop would be in cluster 3, which includes the areas of Barbican, Broadgate and Liverpool Street.