# Where to open the most popular Cocktail bar in town?

<b>Introduction</b>

In the gastronomy sector, bars of all sorts play a major role. From a simple pub serving beer and perhaps some food to exclusive cocktail bars, the offering and style are unique to every venue. While the venue itself is of great importance for maintaining a solid customer base, attracting new customers is strongly influenced by location and neighborhood.

This study looks at where within Zürich to open a cocktail bar. No food will be served. While the venue wants to attract a large number of high-end customers who will travel to the bar specifically, the goal is to grow the customer base through location, essentially making the bar more accessible and appealing to a broader audience. All business aspects such as pricing and the products on offer are left out; the focus is solely on the most favorable location to support the venue while still making it a standout establishment.

As the bar will not serve any food and is meant as a unique venue for one or two drinks in the afternoon or evening, the following factors were taken into consideration:
- Proximity to an array of restaurants
- Reduced competition in the region


<b>Methods</b>

Using the Foursquare API, the locations of restaurants, bars, train and tram stations, and bus stops were selected. Restaurants and rival bars were each grouped into 6 clusters with k-means. Based on these data, it may be possible to identify the regions most promising for opening the cocktail bar. Ideally, regions with a high density of restaurants and a low density of bars would be found. To also assess accessibility for an after-work drink, public transport points (train, tram and bus stations) were clustered. The cells shown below cover the restaurant and bar clusters.

In [144]:
import numpy as np # library for vectorized computation
import pandas as pd # library to process data as dataframes

import matplotlib.cm as cm # colormaps for the cluster colors
import matplotlib.colors as colors # conversion of colors to hex strings
# backend for rendering plots within the browser
%matplotlib inline

from sklearn.cluster import KMeans # k-means clustering

import requests # library to handle HTTP requests

from geopy.geocoders import Nominatim # converts an address into latitude and longitude

! conda install -c conda-forge folium
import folium # map rendering library

print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Libraries imported.


In [145]:
#Foursquare Info (credentials replaced by placeholders, insert your own)
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20210615'
Limit = 300 # requested maximum number of venues per query

In [146]:
#other search parameters
city = 'Zürich'

In [147]:
#function to search for a query near a city
#returns a list holding one list of (name, lat, lng) tuples
def SearchVenues(query):

    url = 'https://api.foursquare.com/v2/venues/explore'
    # passing the parameters separately lets requests URL-encode 'Zürich'
    params = {
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET,
        'v': VERSION,
        'near': city,
        'limit': Limit,
        'query': query}
    response = requests.get(url, params=params)
    response.raise_for_status() # stop early on HTTP errors
    results = response.json()['response']['groups'][0]['items']

    # keep the nested shape (one inner list per query) used further down
    search_results = [[(
        v['venue']['name'],
        v['venue']['location']['lat'],
        v['venue']['location']['lng']) for v in results]]

    return search_results
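
The nested key lookups in `SearchVenues` raise a `KeyError` as soon as one venue lacks a field. A minimal offline sketch of a more defensive parser follows; `extract_venues` and the sample payload are hypothetical, with the payload shaped after the keys the function above accesses rather than after the Foursquare documentation.

```python
def extract_venues(payload):
    """Return (name, lat, lng) tuples from an explore-style response dict."""
    venues = []
    for group in payload.get('response', {}).get('groups', []):
        for item in group.get('items', []):
            venue = item.get('venue', {})
            location = venue.get('location', {})
            # skip entries without usable coordinates
            if 'lat' not in location or 'lng' not in location:
                continue
            venues.append((venue.get('name', ''), location['lat'], location['lng']))
    return venues


# hypothetical payload, made up for illustration only
sample_payload = {
    'response': {
        'groups': [
            {'items': [
                {'venue': {'name': 'Example Bar',
                           'location': {'lat': 47.37, 'lng': 8.54}}},
                {'venue': {'name': 'No Coordinates', 'location': {}}},
            ]}
        ]
    }
}

print(extract_venues(sample_payload))
print(extract_venues({'response': {}}))
```

The second call shows that a response without any groups yields an empty list instead of an exception, which may be easier to handle when several queries are run in a row.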
    
    

In [148]:
#collecting the restaurant venues as a list of (name, lat, lng) tuples
query1 = 'restaurant'
restaurants = SearchVenues(query1)
restaurants

[[('John Baker Ltd', 47.36720835584306, 8.54729328625253),
  ('Coco Grill & Bar', 47.368958637059286, 8.538430806857162),
  ('KIN', 47.374076573465295, 8.5188527405262),
  ('Waidhof', 47.42427, 8.5251699),
  ('Kafi Dihei', 47.375239, 8.513285),
  ('169 West', 47.373884, 8.518049),
  ('John Baker', 47.376282010318235, 8.526332291424712),
  ('Café Noir', 47.38205413373607, 8.5306430877368),
  ('Samurai', 47.37374701258848, 8.527696984598784),
  ('Äss-Bar', 47.37256105232428, 8.543692839575389),
  ('Co Chin Chin', 47.38380906569936, 8.528479899712615),
  ('Restaurant Gertrudhof', 47.37379063895187, 8.515933955106151),
  ('Ooki', 47.3722774666315, 8.518805943275494),
  ('Paninoteca Il Pentagramma', 47.38112303767148, 8.533554349862975),
  ('Miki Ramen', 47.375152, 8.516653),
  ('Ban Song Thai', 47.369395, 8.544136),
  ('Kokoro', 47.38068, 8.526546),
  ('Wirtschaft Degenried', 47.36739497739939, 8.579027236425036),
  ('Restaurant JOSEF', 47.384005824668954, 8.52902029772919),
  ('Restaurant

In [149]:
query2 = 'drinks'
bars = SearchVenues(query2)
bars

[[('169 West', 47.373884, 8.518049),
  ('Riffraff', 47.38283076143206, 8.52925380040824),
  ('Kronenhalle Bar', 47.367697234571004, 8.54575627368673),
  ('Widder Bar', 47.3724149, 8.5398629),
  ('Old Crow', 47.37209247639519, 8.541023989469323),
  ('Rimini Bar', 47.371375547689496, 8.532764972134194),
  ('Grande Café & Bar', 47.375479016798664, 8.54339495305972),
  ('Restaurant JOSEF', 47.384005824668954, 8.52902029772919),
  ('La Stanza', 47.368386297594476, 8.536837348137258),
  ('Tales', 47.372877935361316, 8.5328166820715),
  ('Bank', 47.376172675802614, 8.526431322097778),
  ('barfussbar', 47.368440546907514, 8.542180955021328),
  ('Grüntal', 47.39313457381005, 8.521172971199407),
  ("Frieda's Büxe", 47.37736322362934, 8.510998269310983),
  ('Frau Gerolds Garten', 47.385241637574936, 8.51894137614916),
  ('Restaurant Bärengasse', 47.370301, 8.538797),
  ('MOUDI - Bar & Restaurant', 47.369919844894525, 8.534484993124845),
  ('Sphères', 47.39190314991482, 8.518938880095412),
  ('Bar

In [150]:
#writing the venue lists into pandas dataframes
df_rest = pd.DataFrame(item for venue_list in restaurants for item in venue_list)
df_bars = pd.DataFrame(item for venue_list in bars for item in venue_list)
df_bars.head(5)

                 0          1         2
0         169 West  47.373884  8.518049
1         Riffraff  47.382831  8.529254
2  Kronenhalle Bar  47.367697  8.545756
3       Widder Bar  47.372415  8.539863
4         Old Crow  47.372092  8.541024


In [151]:
#view shape
print(df_rest.shape)
print(df_bars.shape)

#edit column titles
df_rest.columns = ('Name', 'Lat', 'Lng')
df_bars.columns = ('Name', 'Lat', 'Lng')

df_rest.head(5)

(100, 3)
(100, 3)


               Name        Lat       Lng
0    John Baker Ltd  47.367208  8.547293
1  Coco Grill & Bar  47.368959  8.538431
2               KIN  47.374077  8.518853
3           Waidhof   47.42427   8.52517
4        Kafi Dihei  47.375239  8.513285


In [195]:
#setting up k-means for the restaurants
k_meansR = KMeans(init="k-means++", n_clusters=6, n_init=12)
k_meansR.fit(df_rest[['Lat','Lng']])
k_meansR_labels = k_meansR.labels_ # labels of the fitted restaurant model
k_meansR_labels

array([0, 0, 1, 2, 1, 1, 3, 3, 3, 0, 3, 1, 1, 3, 1, 0, 3, 5, 3, 4, 0, 1,
       0, 0, 2, 3, 0, 1, 1, 0, 0, 0, 1, 1, 1, 5, 1, 3, 2, 5, 3, 3, 0, 0,
       0, 1, 0, 0, 1, 2, 0, 0, 1, 0, 3, 0, 3, 0, 0, 0, 5, 0, 0, 2, 1, 1,
       1, 3, 1, 0, 3, 3, 2, 3, 1, 0, 0, 1, 1, 1, 0, 3, 1, 3, 0, 2, 3, 1,
       3, 3, 4, 0, 3, 3, 3, 5, 0, 1, 1, 3], dtype=int32)

In [216]:
k_meansR_cluster_centers = k_meansR.cluster_centers_
k_meansR_cluster_centers

array([[47.37205861,  8.54116696],
       [47.37818179,  8.51497823],
       [47.40699208,  8.54226729],
       [47.37880834,  8.52914186],
       [47.33955581,  8.53804299],
       [47.36733442,  8.57198056]])

In [217]:
#same procedure for the bars
k_meansB = KMeans(init="k-means++", n_clusters=6, n_init=12)
k_meansB.fit(df_bars[['Lat','Lng']])
k_meansB_labels = k_meansB.labels_ # labels of the fitted bar model
k_meansB_cluster_centers = k_meansB.cluster_centers_
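
The k-means initialisation is random, so cluster numbering and centers can change between runs unless a seed is fixed. A minimal sketch, assuming scikit-learn's `random_state` parameter is set (the value 42 is arbitrary) and using a handful of restaurant coordinates from the output above; `fit_centers` is a hypothetical helper, not part of the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans

# a few restaurant coordinates copied from the output above
coords = np.array([
    [47.36720835584306, 8.54729328625253],    # John Baker Ltd
    [47.368958637059286, 8.538430806857162],  # Coco Grill & Bar
    [47.374076573465295, 8.5188527405262],    # KIN
    [47.42427, 8.5251699],                    # Waidhof
    [47.375239, 8.513285],                    # Kafi Dihei
    [47.373884, 8.518049],                    # 169 West
])


def fit_centers(points, n_clusters, seed):
    """Fit k-means with a fixed seed and return labels and centers."""
    model = KMeans(init="k-means++", n_clusters=n_clusters,
                   n_init=12, random_state=seed)
    model.fit(points)
    return model.labels_, model.cluster_centers_


labels_a, centers_a = fit_centers(coords, 3, seed=42)
labels_b, centers_b = fit_centers(coords, 3, seed=42)

# both runs use the same seed, so labels and centers should agree
print(np.array_equal(labels_a, labels_b))
print(np.allclose(centers_a, centers_b))
```

Fixing the seed matters here because later cells refer to specific cluster centers; without it, a rerun may produce centers in a different order.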

In [218]:
#append k-means cluster number to data frame for easy colouring
df_rest['Cluster Nr.'] = k_meansR_labels
df_bars['Cluster Nr.'] = k_meansB_labels

In [219]:
#plotting map
address = 'Zuerich'
geolocator = Nominatim(user_agent="zuerich_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

#setting cluster number as 6
k=6

# create map
MyMap = folium.Map(location=[latitude, longitude], zoom_start=11)

# setting color scheme for the clusters (R), one green shade per cluster
colors_array = cm.Greens(np.linspace(0, 1, k))
rest_colors = [colors.rgb2hex(i) for i in colors_array]

# adding markers to the map (R), popup shows venue name and cluster number
for name, lat, lng, cluster in zip(df_rest['Name'], df_rest['Lat'], df_rest['Lng'], df_rest['Cluster Nr.']):
    label = folium.Popup(str(name) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=rest_colors[cluster],
        fill=True,
        fill_color=rest_colors[cluster],
        fill_opacity=0.7).add_to(MyMap)


# setting color scheme for the clusters (B), one red shade per cluster
colors_array = cm.Reds(np.linspace(0, 1, k))
bar_colors = [colors.rgb2hex(i) for i in colors_array]

# adding markers to the map (B), popup shows venue name and cluster number
for name, lat, lng, cluster in zip(df_bars['Name'], df_bars['Lat'], df_bars['Lng'], df_bars['Cluster Nr.']):
    label = folium.Popup(str(name) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=bar_colors[cluster],
        fill=True,
        fill_color=bar_colors[cluster],
        fill_opacity=0.7).add_to(MyMap)

In [220]:
MyMap

In [221]:
tooltip = "Click me!"

#restaurant cluster centers as numbered potential locations
for number, (lat, lng) in enumerate(k_meansR_cluster_centers, start=1):
    folium.Marker(
        [float(lat), float(lng)],
        popup="<i>Potential Location {}</i>".format(number),
        tooltip=tooltip
    ).add_to(MyMap)

In [223]:
MyMap

<b>Results</b>

From the data displayed on the map, it is difficult to find a standout spot with a high density of restaurants and no bars in the same area. Nevertheless, some of the centers marked on the map above, which sit in the middle of restaurant clusters, have fewer bars in their vicinity, suggesting lower local competition. These spots might be good places to open the planned establishment.
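
The visual impression could be backed by a count of bars around each restaurant cluster center. The following is a minimal sketch, not the method used in this study: it assumes a radius of 500 m as "vicinity", uses the haversine formula for distances, and only includes the part of the bar list that is visible in the output above. `haversine_m` and `count_within` are hypothetical helper names.

```python
import math


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def count_within(center, venues, radius_m):
    """Count (name, lat, lng) venues no further than radius_m from center."""
    return sum(1 for _, lat, lng in venues
               if haversine_m(center[0], center[1], lat, lng) <= radius_m)


# restaurant cluster centers as printed above
centers = [
    (47.37205861, 8.54116696), (47.37818179, 8.51497823),
    (47.40699208, 8.54226729), (47.37880834, 8.52914186),
    (47.33955581, 8.53804299), (47.36733442, 8.57198056),
]

# visible part of the bar list printed above (the full list is truncated)
bars_sample = [
    ('169 West', 47.373884, 8.518049),
    ('Riffraff', 47.38283076143206, 8.52925380040824),
    ('Kronenhalle Bar', 47.367697234571004, 8.54575627368673),
    ('Widder Bar', 47.3724149, 8.5398629),
    ('Old Crow', 47.37209247639519, 8.541023989469323),
    ('Rimini Bar', 47.371375547689496, 8.532764972134194),
    ('Grande Café & Bar', 47.375479016798664, 8.54339495305972),
    ('Restaurant JOSEF', 47.384005824668954, 8.52902029772919),
    ('La Stanza', 47.368386297594476, 8.536837348137258),
    ('Tales', 47.372877935361316, 8.5328166820715),
    ('Bank', 47.376172675802614, 8.526431322097778),
    ('barfussbar', 47.368440546907514, 8.542180955021328),
    ('Grüntal', 47.39313457381005, 8.521172971199407),
    ("Frieda's Büxe", 47.37736322362934, 8.510998269310983),
    ('Frau Gerolds Garten', 47.385241637574936, 8.51894137614916),
    ('Restaurant Bärengasse', 47.370301, 8.538797),
    ('MOUDI - Bar & Restaurant', 47.369919844894525, 8.534484993124845),
    ('Sphères', 47.39190314991482, 8.518938880095412),
]

RADIUS_M = 500  # assumed walking distance, not a value from the study

# sort centers by the number of nearby bars, fewest first
ranking = sorted(
    (count_within(center, bars_sample, RADIUS_M), number)
    for number, center in enumerate(centers, start=1)
)
for bar_count, number in ranking:
    print('Center {}: {} bars within {} m'.format(number, bar_count, RADIUS_M))
```

Centers are numbered in the order they are printed above. A low bar count alone does not make a center attractive, since a center far from the city may simply have few venues of any kind, so the number of restaurants within the same radius would need to be counted alongside it.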

<b>Discussion</b>

This result is not surprising, mainly because the market regulates itself and therefore already balances supply and demand. Additionally, the number of venues returned with the free account (100 per query here) limits the accuracy of this experiment.

<b>Conclusion</b>

While it was educational to display the regional trends throughout the city, the obtained data are most likely not sufficient to support an informed decision. For a more accurate picture, a larger data set with a more detailed differentiation between restaurants and bars, to cater more accurately to the target group, might be of use.
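
One simple step toward such a differentiation is to check which venues appear in both result lists; in the outputs above, '169 West' and 'Restaurant JOSEF' are returned for both the 'restaurant' and the 'drinks' query. A minimal sketch, assuming that identical (name, lat, lng) tuples denote the same venue; `split_overlap` is a hypothetical helper and the samples are short excerpts of the outputs above.

```python
# short excerpts of the restaurant and bar outputs printed above
restaurants_sample = [
    ('KIN', 47.374076573465295, 8.5188527405262),
    ('Waidhof', 47.42427, 8.5251699),
    ('169 West', 47.373884, 8.518049),
    ('Restaurant JOSEF', 47.384005824668954, 8.52902029772919),
]
bars_sample = [
    ('169 West', 47.373884, 8.518049),
    ('Riffraff', 47.38283076143206, 8.52925380040824),
    ('Widder Bar', 47.3724149, 8.5398629),
    ('Restaurant JOSEF', 47.384005824668954, 8.52902029772919),
]


def split_overlap(restaurants, bars):
    """Split two venue lists into restaurant-only, bar-only and shared venues."""
    restaurant_set = set(restaurants)
    bar_set = set(bars)
    only_restaurants = [v for v in restaurants if v not in bar_set]
    only_bars = [v for v in bars if v not in restaurant_set]
    both = [v for v in restaurants if v in bar_set]
    return only_restaurants, only_bars, both


only_restaurants, only_bars, both = split_overlap(restaurants_sample, bars_sample)
print('Listed in both queries:', [name for name, _, _ in both])
```

Whether a shared venue should count as competition, as a nearby restaurant, or as both is a judgment call that this sketch leaves open.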

<b>References</b>

All of the "IBM Data Science" course material on the Coursera website, as well as the labs within the course.

<b>Acknowledgements</b>

The IBM help staff, who helped with many individual issues throughout the entire course.