# Battle of the Neighbourhoods

### Business Problem

This project investigates where to open a new coffee shop in London. Because the report is written for a first-time entrepreneur, it focuses on business districts that are not yet saturated with coffee shops, and in particular on up-and-coming business districts in London.

The research first identifies where new businesses are setting up their offices. The working assumption is that, unlike large established business locations such as Canary Wharf, these areas will either have few coffee shops already open or still show a gap between demand and supply.

New business districts will be explored, and the data science methodology learned throughout the course will be used to find the most promising district.

### Data

The final decision will be based on a number of factors including:

* Number of nearby businesses
* Distance from major bus stops and the Underground
* Number of nearby coffee shops
* Population

The data sources to be used will be:

* A CSV file listing the London districts and boroughs
* Google geocoding to retrieve the corresponding latitude and longitude coordinates
* Foursquare to retrieve all nearby coffee shops



In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files
from random import seed
#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests

# transform a JSON file into a pandas dataframe
try:
    from pandas import json_normalize # pandas 1.0 and later
except ImportError:
    from pandas.io.json import json_normalize # older pandas releases

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('imported everything')

Fetching package metadata .............
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /opt/conda/envs/DSX-Python35:
#
folium                    0.5.0                      py_0    conda-forge
imported everything


In [2]:
address = 'Hackney Central, London, UK'
geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

51.5470943 -0.0571899236646886
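
`geolocator.geocode` returns `None` when the address cannot be resolved, in which case the attribute access in the cell above would fail with an unhelpful `AttributeError`. The sketch below shows one way to fail with a clearer message. The helper `coordinates_for` and the offline stand-in `fake_geocode` are hypothetical names introduced for illustration; in the notebook the second argument would be `geolocator.geocode`.

```python
from types import SimpleNamespace


def coordinates_for(address, geocode):
    """Return (latitude, longitude) for an address, failing loudly when nothing is found."""
    location = geocode(address)
    if location is None:
        raise LookupError("No geocoding result for {!r}".format(address))
    return location.latitude, location.longitude


# Hypothetical stand-in for geolocator.geocode so the sketch runs offline.
# The coordinates are the ones printed by the cell above.
def fake_geocode(address):
    known = {
        "Hackney Central, London, UK": SimpleNamespace(
            latitude=51.5470943, longitude=-0.0571899236646886
        )
    }
    return known.get(address)


latitude, longitude = coordinates_for("Hackney Central, London, UK", fake_geocode)
print(latitude, longitude)
```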


In [3]:
map_hooray = folium.Map(location=[latitude, longitude], zoom_start = 16) # Uses lat then lon. The bigger the zoom number, the closer in you get
map_hooray # Calls the map to display

In [4]:
CLIENT_ID = '1RD1EFWPWGULHGQMSPPYBAJGGW1DGRO13SA3CNZR1SIDMO5K' # your Foursquare ID
CLIENT_SECRET = 'VKSV4NFHHBM5EN3N5YTILOZSG3JTM1RVVUQMD0C5D2H2DUSU' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 100
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 1RD1EFWPWGULHGQMSPPYBAJGGW1DGRO13SA3CNZR1SIDMO5K
CLIENT_SECRET:VKSV4NFHHBM5EN3N5YTILOZSG3JTM1RVVUQMD0C5D2H2DUSU
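
Hard-coding the credentials and printing them leaves the secret in the saved notebook. A minimal sketch of an alternative is to read them from environment variables and print only a masked form. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are assumptions made for this example, not part of the Foursquare API.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical; use whatever suits your setup.
    """
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first"
        )
    return client_id, client_secret


def mask(secret, visible=4):
    """Keep the first few characters and replace the rest with asterisks."""
    return secret[:visible] + "*" * (len(secret) - visible)


# Demonstration with a plain dict instead of the real environment.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "abc123",
    "FOURSQUARE_CLIENT_SECRET": "s3cr3tvalue",
}
demo_id, demo_secret = load_foursquare_credentials(demo_env)
print("CLIENT_SECRET:", mask(demo_secret))
```

With this in place, `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` could replace the two literal assignments.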


In [5]:
search_query = 'Coffee Shop'
radius = 1000

In [6]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=1RD1EFWPWGULHGQMSPPYBAJGGW1DGRO13SA3CNZR1SIDMO5K&client_secret=VKSV4NFHHBM5EN3N5YTILOZSG3JTM1RVVUQMD0C5D2H2DUSU&ll=51.5470943,-0.0571899236646886&v=20180604&query=Coffee Shop&radius=1000&limit=100'
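
The URL above contains a raw space in `query=Coffee Shop`. `requests` generally encodes this on sending, but the encoding can be made explicit by building the query string with the standard library. This is a sketch only: the function name `build_search_url` is hypothetical, the parameter names are the ones already used in the cell above, and `"ID"` and `"SECRET"` are placeholders.

```python
from urllib.parse import urlencode


def build_search_url(client_id, client_secret, lat, lng, version, query, radius, limit):
    """Build a Foursquare v2 venue-search URL with percent-encoded parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": "{},{}".format(lat, lng),
        "v": version,
        "query": query,
        "radius": radius,
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/search?" + urlencode(params)


example_url = build_search_url(
    "ID", "SECRET", 51.5470943, -0.0571899, "20180604", "Coffee Shop", 1000, 100
)
print(example_url)
```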

In [7]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5c9e3f79351e3d4c7d6a561c'},
 'response': {'venues': [{'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/coffeeshop_',
       'suffix': '.png'},
      'id': '4bf58dd8d48988d1e0931735',
      'name': 'Coffee Shop',
      'pluralName': 'Coffee Shops',
      'primary': True,
      'shortName': 'Coffee Shop'}],
    'hasPerk': False,
    'id': '524e8fa5498e5593dbb9edcd',
    'location': {'cc': 'GB',
     'country': 'United Kingdom',
     'distance': 40,
     'formattedAddress': ['United Kingdom'],
     'labeledLatLngs': [{'label': 'display',
       'lat': 51.547139,
       'lng': -0.056611}],
     'lat': 51.547139,
     'lng': -0.056611},
    'name': 'Coffee Station',
    'referralId': 'v-1553874809'},
   {'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/shops/pet_store_',
       'suffix': '.png'},
      'id': '4bf58dd8d48988d100951735',
      'name': 'Pet Store',
      'pluralName': 'Pet Stores',
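
The response is a nested dictionary: a `meta` block with the status code, then `response` → `venues`, each venue carrying a list of `categories` and a `location`. The sketch below walks that structure defensively, so that a failed request or a venue without categories does not raise. The function name `summarise_venues` is hypothetical, and the second sample venue is invented to exercise the empty-category case.

```python
def summarise_venues(payload):
    """Return (name, first category, distance in metres) for each venue in a search response."""
    if payload.get("meta", {}).get("code") != 200:
        raise ValueError("Foursquare request failed: {}".format(payload.get("meta")))
    rows = []
    for venue in payload.get("response", {}).get("venues", []):
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        distance = venue.get("location", {}).get("distance")
        rows.append((venue.get("name"), category, distance))
    return rows


# Trimmed-down sample in the same shape as the output above.
sample = {
    "meta": {"code": 200},
    "response": {
        "venues": [
            {
                "name": "Coffee Station",
                "categories": [{"name": "Coffee Shop"}],
                "location": {"distance": 40},
            },
            # Invented venue with no category and no distance.
            {"name": "Uncategorised Venue", "categories": [], "location": {}},
        ]
    },
}
summary = summarise_venues(sample)
print(summary)
```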
  

In [8]:
venues = results['response']['venues']

dataframe = json_normalize(venues)
dataframe.head()

Unnamed: 0,categories,hasPerk,id,location.address,location.cc,location.city,location.country,location.crossStreet,location.distance,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.neighborhood,location.postalCode,location.state,name,referralId,venuePage.id
0,"[{'pluralName': 'Coffee Shops', 'shortName': '...",False,524e8fa5498e5593dbb9edcd,,GB,,United Kingdom,,40,[United Kingdom],"[{'label': 'display', 'lng': -0.056611, 'lat':...",51.547139,-0.056611,,,,Coffee Station,v-1553874809,
1,"[{'pluralName': 'Pet Stores', 'shortName': 'Pe...",False,4cd55474fb595481146fdd50,40 Amhurst Rd.,GB,Hackney,United Kingdom,,126,"[40 Amhurst Rd., Hackney, Greater London, E8 1...","[{'label': 'display', 'lng': -0.05698456733322...",51.548227,-0.056985,,E8 1JN,Greater London,The Pet Shop,v-1553874809,
2,"[{'pluralName': 'Coffee Shops', 'shortName': '...",False,57572c25498e9a9df90f9e51,8 Amhurst Road,GB,London,United Kingdom,,94,"[8 Amhurst Road, London, Greater London, E8 2A...","[{'label': 'display', 'lng': -0.05603342819921...",51.547542,-0.056033,,E8 2AQ,Greater London,Costa Coffee,v-1553874809,
3,"[{'pluralName': 'Thrift / Vintage Stores', 'sh...",False,4e3143e4152071f36460b67a,4 Morning Lane,GB,Hackney,United Kingdom,,214,"[4 Morning Lane, Hackney, Greater London, E9 6...","[{'label': 'display', 'lng': -0.05462318658828...",51.546013,-0.054623,,E9 6NA,Greater London,Scope Charity Shop,v-1553874809,
4,"[{'pluralName': 'Coffee Shops', 'shortName': '...",False,5a6efb1cc876c825f70110b7,,GB,London,United Kingdom,,59,"[London, Greater London, E8 1FJ, United Kingdom]","[{'label': 'display', 'lng': -0.056338, 'lat':...",51.547147,-0.056338,,E8 1FJ,Greater London,Take 5 Coffee,v-1553874809,


In [9]:
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
# copy so that the assignments below do not act on a view of `dataframe`
dataframe_filtered = dataframe.loc[:, filtered_columns].copy()

def get_category_type(row):
    """Return the name of the first category of a venue, or None if it has none."""
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered.head()

Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,neighborhood,postalCode,state,id
0,Coffee Station,Coffee Shop,,GB,,United Kingdom,,40,[United Kingdom],"[{'label': 'display', 'lng': -0.056611, 'lat':...",51.547139,-0.056611,,,,524e8fa5498e5593dbb9edcd
1,The Pet Shop,Pet Store,40 Amhurst Rd.,GB,Hackney,United Kingdom,,126,"[40 Amhurst Rd., Hackney, Greater London, E8 1...","[{'label': 'display', 'lng': -0.05698456733322...",51.548227,-0.056985,,E8 1JN,Greater London,4cd55474fb595481146fdd50
2,Costa Coffee,Coffee Shop,8 Amhurst Road,GB,London,United Kingdom,,94,"[8 Amhurst Road, London, Greater London, E8 2A...","[{'label': 'display', 'lng': -0.05603342819921...",51.547542,-0.056033,,E8 2AQ,Greater London,57572c25498e9a9df90f9e51
3,Scope Charity Shop,Thrift / Vintage Store,4 Morning Lane,GB,Hackney,United Kingdom,,214,"[4 Morning Lane, Hackney, Greater London, E9 6...","[{'label': 'display', 'lng': -0.05462318658828...",51.546013,-0.054623,,E9 6NA,Greater London,4e3143e4152071f36460b67a
4,Take 5 Coffee,Coffee Shop,,GB,London,United Kingdom,,59,"[London, Greater London, E8 1FJ, United Kingdom]","[{'label': 'display', 'lng': -0.056338, 'lat':...",51.547147,-0.056338,,E8 1FJ,Greater London,5a6efb1cc876c825f70110b7
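
The table shows that the `'Coffee Shop'` query also returns venues such as a pet store and a charity shop, apparently because the search matches on the venue name. If only actual coffee shops should be counted, the rows can be filtered on the `categories` column. The sketch below uses plain tuples taken from the rows above so that it runs without the API; on the DataFrame the equivalent would be `dataframe_filtered[dataframe_filtered['categories'] == 'Coffee Shop']`. The helper name `keep_category` is hypothetical.

```python
# (name, category) pairs copied from the first five rows of the table above.
rows = [
    ("Coffee Station", "Coffee Shop"),
    ("The Pet Shop", "Pet Store"),
    ("Costa Coffee", "Coffee Shop"),
    ("Scope Charity Shop", "Thrift / Vintage Store"),
    ("Take 5 Coffee", "Coffee Shop"),
]


def keep_category(rows, wanted=("Coffee Shop",)):
    """Keep only the rows whose category is one of the wanted categories."""
    return [(name, category) for name, category in rows if category in wanted]


coffee_only = keep_category(rows)
print(coffee_only)
```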


In [10]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=15) 


# add the search centre (Hackney Central) as a red circle marker
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    popup='Hackney Central',
    fill=True,
    color='red',
    fill_color='red',
    fill_opacity=0.6
    ).add_to(venues_map)


# add the venues returned by the search as blue circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        fill=True,
        color='blue',
        fill_color='blue',
        fill_opacity=0.6
        ).add_to(venues_map)

In [11]:
venues_map

In [13]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['name', 
                  'Latitude', 
                  'Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [14]:
coffee_venues = getNearbyVenues(names=dataframe_filtered['name'],
                                   latitudes=dataframe_filtered['lat'],
                                   longitudes=dataframe_filtered['lng']
                                  )
print(coffee_venues.shape)
coffee_venues.head()

Coffee Station
The Pet Shop
Costa Coffee
Scope Charity Shop
Take 5 Coffee
Coffee8
Flower Shop
Coffee Lovers Café
Black Sheep Coffee
Coffee Is My Cup Of Tea
Costa Coffee
The Pet Shop London Ltd
Flying Horse Coffee
Burberry Outlet
Shop from Crisis
Pound Shop
Sense Charity Shop
O2 Shop Hackney
Lion Coffee + Records
Climpson & Sons
Momosan Shop
Black Box Coffee at Stage 3
Dose Coffee
Mother Kelly's Bottle Shop
Dark Arts Coffee
Prideaux House Charity Shop
Tempesta Coffee
Dose Coffee Lon
Camden Lock Book Shop
The Hackney Shop
FARM:shop & cafe
Costa Coffee
Shop on the Square
Friendly Bake Shop
RSPCA Charity Shop
Lion Coffee + Records
Nemrut Kebab Shop
Second Hand Shop
Tuck Shop
Dalston Shop
Union House Coffee
Queens Bridge Tyre Shop Ltd
DIY Art Shop
Wild Coffee Co.
Black and White Coffee Company @ Palm 2
Nomad Coffee
Black And White Coffee Co
Wayside Community Centre Charity Shop
Merito Coffee
Terrone & Co.
(2937, 7)


Unnamed: 0,name,Latitude,Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Coffee Station,51.547139,-0.056611,Temple of Hackney,51.546038,-0.054241,Vegetarian / Vegan Restaurant
1,Coffee Station,51.547139,-0.056611,Every Cloud,51.546098,-0.054405,Cocktail Bar
2,Coffee Station,51.547139,-0.056611,The Cock Tavern,51.546356,-0.055208,Pub
3,Coffee Station,51.547139,-0.056611,Bánh Mì Hội-An Vietnamese Street Food in London,51.546686,-0.055679,Sandwich Place
4,Coffee Station,51.547139,-0.056611,Paper Dress Vintage,51.547376,-0.054681,Thrift / Vintage Store


In [15]:
coffee_venues.groupby('Venue Category').count()

Venue Category,name,Latitude,Longitude,Venue,Venue Latitude,Venue Longitude
American Restaurant,2,2,2,2,2,2
Argentinian Restaurant,7,7,7,7,7,7
Art Gallery,28,28,28,28,28,28
Art Museum,7,7,7,7,7,7
Bakery,43,43,43,43,43,43
Bar,33,33,33,33,33,33
Beer Bar,6,6,6,6,6,6
Beer Store,19,19,19,19,19,19
Bistro,2,2,2,2,2,2
Bookstore,39,39,39,39,39,39


In [16]:
print('There are {} unique categories.'.format(len(coffee_venues['Venue Category'].unique())))

There are 127 unique categories.


In [17]:
# one hot encoding
coffee_onehot = pd.get_dummies(coffee_venues[['Venue Category']], prefix="", prefix_sep="")

# add the venue name column back to the dataframe
coffee_onehot['name'] = coffee_venues['name'] 

# move the name column to the first position
fixed_columns = [coffee_onehot.columns[-1]] + list(coffee_onehot.columns[:-1])
coffee_onehot = coffee_onehot[fixed_columns]

coffee_onehot.head()

Unnamed: 0,name,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Bakery,Bar,Beer Bar,Beer Store,Bistro,Bookstore,Boutique,Boxing Gym,Breakfast Spot,Brewery,Burger Joint,Burrito Place,Butcher,Café,Canal,Canal Lock,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Church,Clothing Store,Cocktail Bar,Coffee Shop,Convenience Store,Coworking Space,Creperie,Cuban Restaurant,Deli / Bodega,Department Store,Diner,Discount Store,Dive Bar,Donut Shop,Dumpling Restaurant,English Restaurant,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market,Flower Shop,Food & Drink Shop,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Garden,Gastropub,Gay Bar,Gift Shop,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Health Food Store,Historic Site,History Museum,Hotel,Ice Cream Shop,Indie Movie Theater,Intersection,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewish Restaurant,Kebab Restaurant,Lounge,Market,Martial Arts Dojo,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Modern European Restaurant,Movie Theater,Museum,Music Venue,Nightclub,Organic Grocery,Outlet Store,Park,Performing Arts Venue,Pharmacy,Pie Shop,Pilates Studio,Pizza Place,Plaza,Pool,Print Shop,Pub,Ramen Restaurant,Record Shop,Recording Studio,Restaurant,Roof Deck,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Seafood Restaurant,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Street Food Gathering,Supermarket,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Train Station,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Veneto Restaurant,Vietnamese Restaurant,Wine Bar,Wine Shop,Yakitori Restaurant,Yoga Studio
0,Coffee Station,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
1,Coffee Station,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Coffee Station,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Coffee Station,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Coffee Station,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0


In [18]:
coffee_grouped = coffee_onehot.groupby('name').mean().reset_index()
coffee_grouped.head()

Unnamed: 0,name,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Bakery,Bar,Beer Bar,Beer Store,Bistro,Bookstore,Boutique,Boxing Gym,Breakfast Spot,Brewery,Burger Joint,Burrito Place,Butcher,Café,Canal,Canal Lock,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Church,Clothing Store,Cocktail Bar,Coffee Shop,Convenience Store,Coworking Space,Creperie,Cuban Restaurant,Deli / Bodega,Department Store,Diner,Discount Store,Dive Bar,Donut Shop,Dumpling Restaurant,English Restaurant,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market,Flower Shop,Food & Drink Shop,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Garden,Gastropub,Gay Bar,Gift Shop,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Health Food Store,Historic Site,History Museum,Hotel,Ice Cream Shop,Indie Movie Theater,Intersection,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewish Restaurant,Kebab Restaurant,Lounge,Market,Martial Arts Dojo,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Modern European Restaurant,Movie Theater,Museum,Music Venue,Nightclub,Organic Grocery,Outlet Store,Park,Performing Arts Venue,Pharmacy,Pie Shop,Pilates Studio,Pizza Place,Plaza,Pool,Print Shop,Pub,Ramen Restaurant,Record Shop,Recording Studio,Restaurant,Roof Deck,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Seafood Restaurant,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Street Food Gathering,Supermarket,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Train Station,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Veneto Restaurant,Vietnamese Restaurant,Wine Bar,Wine Shop,Yakitori Restaurant,Yoga Studio
0,Black And White Coffee Co,0.0,0.011364,0.034091,0.011364,0.045455,0.011364,0.0,0.0,0.0,0.034091,0.011364,0.0,0.022727,0.022727,0.0,0.0,0.011364,0.113636,0.011364,0.011364,0.011364,0.011364,0.0,0.0,0.0,0.034091,0.056818,0.0,0.0,0.0,0.0,0.011364,0.0,0.0,0.0,0.0,0.011364,0.0,0.0,0.0,0.0,0.0,0.011364,0.0,0.0,0.011364,0.011364,0.0,0.011364,0.011364,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.011364,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011364,0.0,0.0,0.0,0.011364,0.011364,0.0,0.011364,0.0,0.0,0.0,0.0,0.0,0.0,0.011364,0.011364,0.0,0.0,0.011364,0.0,0.0,0.011364,0.011364,0.034091,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.034091,0.011364,0.0,0.011364,0.0,0.011364,0.011364,0.0,0.0,0.0,0.0,0.0,0.011364,0.011364,0.0,0.011364,0.0,0.0,0.011364,0.0,0.011364,0.022727,0.011364,0.022727,0.011364,0.011364,0.0,0.022727
1,Black Box Coffee at Stage 3,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.037736,0.018868,0.0,0.018868,0.075472,0.0,0.0,0.0,0.0,0.0,0.018868,0.037736,0.056604,0.075472,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.037736,0.0,0.018868,0.0,0.0,0.0,0.037736,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.018868,0.018868,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.018868,0.0,0.0,0.018868,0.0,0.0,0.0,0.09434,0.018868,0.0,0.0,0.037736,0.0,0.0,0.018868,0.018868,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.018868,0.018868,0.0,0.018868,0.037736,0.0,0.018868,0.0,0.018868,0.0,0.0
2,Black Sheep Coffee,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.036585,0.012195,0.0,0.0,0.121951,0.0,0.0,0.0,0.0,0.02439,0.012195,0.02439,0.04878,0.04878,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.012195,0.0,0.0,0.0,0.0,0.012195,0.0,0.012195,0.012195,0.0,0.0,0.012195,0.012195,0.04878,0.0,0.0,0.0,0.0,0.012195,0.02439,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.012195,0.012195,0.012195,0.012195,0.0,0.0,0.012195,0.012195,0.0,0.012195,0.012195,0.0,0.0,0.02439,0.0,0.0,0.0,0.085366,0.012195,0.0,0.0,0.036585,0.0,0.0,0.02439,0.012195,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.012195,0.012195,0.0,0.02439,0.036585,0.0,0.012195,0.0,0.012195,0.0,0.0
3,Black and White Coffee Company @ Palm 2,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.0,0.034483,0.0,0.0,0.0,0.034483,0.034483,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.0,0.0,0.034483,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.0,0.0,0.0,0.068966,0.0,0.034483,0.0,0.034483,0.0,0.0,0.0,0.0,0.034483,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.103448,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.0,0.0,0.0,0.034483,0.0,0.0
4,Burberry Outlet,0.0,0.0,0.0,0.0,0.0,0.013333,0.0,0.013333,0.0,0.0,0.026667,0.0,0.0,0.026667,0.0,0.0,0.0,0.12,0.0,0.0,0.0,0.0,0.013333,0.013333,0.026667,0.053333,0.093333,0.0,0.0,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026667,0.013333,0.013333,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.013333,0.026667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013333,0.013333,0.0,0.013333,0.013333,0.0,0.0,0.013333,0.013333,0.0,0.013333,0.013333,0.0,0.0,0.04,0.0,0.0,0.0,0.093333,0.013333,0.0,0.0,0.013333,0.0,0.0,0.026667,0.013333,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013333,0.013333,0.013333,0.0,0.026667,0.026667,0.0,0.013333,0.0,0.0,0.0,0.013333
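
Taking the mean of one-hot columns within a group gives the share of each category among that group's nearby venues. In the first row, for instance, 0.011364 is approximately 1/88 and 0.113636 is approximately 10/88, which is consistent with 88 venues having been returned around that location. The sketch below reproduces the same calculation with the standard library on a tiny invented data set; `category_shares` and the anchors `"A"` and `"B"` are hypothetical.

```python
from collections import Counter, defaultdict


def category_shares(pairs):
    """Map each anchor name to {category: share of that anchor's nearby venues}."""
    counts = defaultdict(Counter)
    for anchor, category in pairs:
        counts[anchor][category] += 1
    return {
        anchor: {cat: n / sum(counter.values()) for cat, n in counter.items()}
        for anchor, counter in counts.items()
    }


# Invented (anchor, venue category) pairs, in the same shape as the
# 'name' and 'Venue Category' columns of coffee_venues.
pairs = [
    ("A", "Café"), ("A", "Café"), ("A", "Pub"), ("A", "Bakery"),
    ("B", "Pub"), ("B", "Coffee Shop"),
]
shares = category_shares(pairs)
print(shares["A"])
```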


In [19]:
coffee_grouped.shape

(47, 128)
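
The grouped frame has 47 rows although `getNearbyVenues` printed 50 names. Grouping by `name` merges venues that share a name, so separate branches are averaged together. The check below, run on the names exactly as printed earlier, shows which names repeat. Grouping on a unique key instead, such as the `id` column of `dataframe_filtered`, would be one possible remedy; that is a suggestion, not something the notebook does.

```python
from collections import Counter

# The 50 names printed by the getNearbyVenues call, in order.
queried_names = [
    "Coffee Station", "The Pet Shop", "Costa Coffee", "Scope Charity Shop",
    "Take 5 Coffee", "Coffee8", "Flower Shop", "Coffee Lovers Café",
    "Black Sheep Coffee", "Coffee Is My Cup Of Tea", "Costa Coffee",
    "The Pet Shop London Ltd", "Flying Horse Coffee", "Burberry Outlet",
    "Shop from Crisis", "Pound Shop", "Sense Charity Shop", "O2 Shop Hackney",
    "Lion Coffee + Records", "Climpson & Sons", "Momosan Shop",
    "Black Box Coffee at Stage 3", "Dose Coffee", "Mother Kelly's Bottle Shop",
    "Dark Arts Coffee", "Prideaux House Charity Shop", "Tempesta Coffee",
    "Dose Coffee Lon", "Camden Lock Book Shop", "The Hackney Shop",
    "FARM:shop & cafe", "Costa Coffee", "Shop on the Square",
    "Friendly Bake Shop", "RSPCA Charity Shop", "Lion Coffee + Records",
    "Nemrut Kebab Shop", "Second Hand Shop", "Tuck Shop", "Dalston Shop",
    "Union House Coffee", "Queens Bridge Tyre Shop Ltd", "DIY Art Shop",
    "Wild Coffee Co.", "Black and White Coffee Company @ Palm 2",
    "Nomad Coffee", "Black And White Coffee Co",
    "Wayside Community Centre Charity Shop", "Merito Coffee", "Terrone & Co.",
]

# Names that occur more than once collapse into a single group.
duplicates = {name: n for name, n in Counter(queried_names).items() if n > 1}
print(len(queried_names), len(set(queried_names)), duplicates)
```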

In [20]:
num_top_venues = 5

for hood in coffee_grouped['name']:
    print("----"+hood+"----")
    temp = coffee_grouped[coffee_grouped['name'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Black And White Coffee Co----
         venue  freq
0         Café  0.11
1  Coffee Shop  0.06
2       Bakery  0.05
3          Pub  0.05
4  Art Gallery  0.03


----Black Box Coffee at Stage 3----
                           venue  freq
0                            Pub  0.09
1                    Coffee Shop  0.08
2                           Café  0.08
3                   Cocktail Bar  0.06
4  Vegetarian / Vegan Restaurant  0.04


----Black Sheep Coffee----
           venue  freq
0           Café  0.12
1            Pub  0.09
2   Cocktail Bar  0.05
3  Grocery Store  0.05
4    Coffee Shop  0.05


----Black and White Coffee Company @ Palm 2----
           venue  freq
0            Pub  0.10
1  Grocery Store  0.07
2           Café  0.07
3   Burger Joint  0.03
4       Creperie  0.03


----Burberry Outlet----
           venue  freq
0           Café  0.12
1    Coffee Shop  0.09
2            Pub  0.09
3   Cocktail Bar  0.05
4  Grocery Store  0.04


----Camden Lock Book Shop----
         venue  f

In [21]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [22]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['name']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
venues_sorted = pd.DataFrame(columns=columns)
venues_sorted['name'] = coffee_grouped['name']

for ind in np.arange(coffee_grouped.shape[0]):
    venues_sorted.iloc[ind, 1:] = return_most_common_venues(coffee_grouped.iloc[ind, :], num_top_venues)

venues_sorted.head()

Unnamed: 0,name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Black And White Coffee Co,Café,Coffee Shop,Pub,Bakery,Art Gallery,Restaurant,Bookstore,Cocktail Bar,Pizza Place,Yoga Studio
1,Black Box Coffee at Stage 3,Pub,Coffee Shop,Café,Cocktail Bar,Grocery Store,Restaurant,Brewery,Hotel,Vegetarian / Vegan Restaurant,Clothing Store
2,Black Sheep Coffee,Café,Pub,Coffee Shop,Cocktail Bar,Grocery Store,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
3,Black and White Coffee Company @ Palm 2,Pub,Grocery Store,Café,Dumpling Restaurant,Bookstore,Cocktail Bar,Coffee Shop,Pizza Place,Creperie,Japanese Restaurant
4,Burberry Outlet,Café,Pub,Coffee Shop,Cocktail Bar,Pizza Place,Grocery Store,Sandwich Place,Clothing Store,Hotel,Fast Food Restaurant
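
The column labels above rely on an index lookup that falls back to `th`, which works for the ten columns used here but would label a 21st column as "21th". A small helper, sketched below under the hypothetical name `ordinal`, handles the general case and yields the same ten headers.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1st, 2nd, 3rd, 4th, 11th, 21st."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


num_top_venues = 10
columns = ["name"] + [
    "{} Most Common Venue".format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```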


In [23]:
# set number of clusters
kclusters = 5

coffee_grouped_clustering = coffee_grouped.drop('name', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(coffee_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 


array([1, 2, 2, 3, 2, 1, 1, 2, 2, 2], dtype=int32)
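
To make the clustering step less of a black box, the sketch below shows the two alternating steps behind k-means (assign each row to its nearest centroid, then move each centroid to the mean of its rows) in plain Python. It is not scikit-learn's implementation, which also handles initialisation and restarts more carefully. The four two-dimensional rows are invented stand-ins for category-share vectors, for example (café share, pub share).

```python
def kmeans(points, k, iterations=20):
    """Tiny illustration of Lloyd's algorithm; returns one cluster label per point."""
    # Deterministic initialisation for the example: use the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Invented share vectors: two café-heavy rows and two pub-heavy rows.
points = [(0.10, 0.00), (0.00, 0.09), (0.12, 0.01), (0.01, 0.10)]
labels = kmeans(points, k=2)
print(labels)
```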

In [24]:
# add clustering labels

venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

venues_sorted.head()

Unnamed: 0,Cluster Labels,name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,1,Black And White Coffee Co,Café,Coffee Shop,Pub,Bakery,Art Gallery,Restaurant,Bookstore,Cocktail Bar,Pizza Place,Yoga Studio
1,2,Black Box Coffee at Stage 3,Pub,Coffee Shop,Café,Cocktail Bar,Grocery Store,Restaurant,Brewery,Hotel,Vegetarian / Vegan Restaurant,Clothing Store
2,2,Black Sheep Coffee,Café,Pub,Coffee Shop,Cocktail Bar,Grocery Store,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
3,3,Black and White Coffee Company @ Palm 2,Pub,Grocery Store,Café,Dumpling Restaurant,Bookstore,Cocktail Bar,Coffee Shop,Pizza Place,Creperie,Japanese Restaurant
4,2,Burberry Outlet,Café,Pub,Coffee Shop,Cocktail Bar,Pizza Place,Grocery Store,Sandwich Place,Clothing Store,Hotel,Fast Food Restaurant


In [25]:
coffee_venues.head()

Unnamed: 0,name,Latitude,Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Coffee Station,51.547139,-0.056611,Temple of Hackney,51.546038,-0.054241,Vegetarian / Vegan Restaurant
1,Coffee Station,51.547139,-0.056611,Every Cloud,51.546098,-0.054405,Cocktail Bar
2,Coffee Station,51.547139,-0.056611,The Cock Tavern,51.546356,-0.055208,Pub
3,Coffee Station,51.547139,-0.056611,Bánh Mì Hội-An Vietnamese Street Food in London,51.546686,-0.055679,Sandwich Place
4,Coffee Station,51.547139,-0.056611,Paper Dress Vintage,51.547376,-0.054681,Thrift / Vintage Store


In [26]:
coffee_merged = coffee_venues

coffee_merged = coffee_merged.join(venues_sorted.set_index('name'), on='name')

coffee_merged.head() # check the last columns!

Unnamed: 0,name,Latitude,Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Coffee Station,51.547139,-0.056611,Temple of Hackney,51.546038,-0.054241,Vegetarian / Vegan Restaurant,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
1,Coffee Station,51.547139,-0.056611,Every Cloud,51.546098,-0.054405,Cocktail Bar,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
2,Coffee Station,51.547139,-0.056611,The Cock Tavern,51.546356,-0.055208,Pub,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
3,Coffee Station,51.547139,-0.056611,Bánh Mì Hội-An Vietnamese Street Food in London,51.546686,-0.055679,Sandwich Place,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
4,Coffee Station,51.547139,-0.056611,Paper Dress Vintage,51.547376,-0.054681,Thrift / Vintage Store,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store


In [30]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=15)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
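
The cell above prepares a colour array, but no markers are added to `map_clusters` in the cells shown. As a sketch of how one colour per cluster label could be produced, the helper below (hypothetical name `cluster_colours`) spaces hues evenly with the standard library and returns hex strings. Strings of this form can be passed as the `color` and `fill_color` arguments of the `CircleMarker` calls used earlier.

```python
import colorsys


def cluster_colours(n_clusters):
    """Return n evenly spaced hues as hex strings, one per cluster label."""
    colours = []
    for i in range(n_clusters):
        r, g, b = colorsys.hsv_to_rgb(i / n_clusters, 1.0, 1.0)
        colours.append(
            "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
        )
    return colours


kclusters = 5
palette = cluster_colours(kclusters)
# palette[label] is the colour for a row whose 'Cluster Labels' value is label.
print(palette)
```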


In [31]:
coffee_merged.loc[coffee_merged['Cluster Labels'] == 0, coffee_merged.columns[[1] + list(range(5, coffee_merged.shape[1]))]]

Unnamed: 0,Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1531,51.543788,-0.046967,Pub,0,Coffee Shop,Café,Pub,Yoga Studio,Train Station,Flea Market,Sporting Goods Shop,Convenience Store,Boutique,Garden
1532,51.543788,-0.047339,Café,0,Coffee Shop,Café,Pub,Yoga Studio,Train Station,Flea Market,Sporting Goods Shop,Convenience Store,Boutique,Garden
1533,51.543788,-0.046634,Coffee Shop,0,Coffee Shop,Café,Pub,Yoga Studio,Train Station,Flea Market,Sporting Goods Shop,Convenience Store,Boutique,Garden
1534,51.543788,-0.044603,Pub,0,Coffee Shop,Café,Pub,Yoga Studio,Train Station,Flea Market,Sporting Goods Shop,Convenience Store,Boutique,Garden
1535,51.543788,-0.047573,Yoga Studio,0,Coffee Shop,Café,Pub,Yoga Studio,Train Station,Flea Market,Sporting Goods Shop,Convenience Store,Boutique,Garden
1536,51.543788,-0.046948,Coffee Shop,0,Coffee Shop,Café,Pub,Yoga Studio,Train Station,Flea Market,Sporting Goods Shop,Convenience Store,Boutique,Garden
1537,51.543788,-0.047178,Beer Store,0,Coffee Shop,Café,Pub,Yoga Studio,Train Station,Flea Market,Sporting Goods Shop,Convenience Store,Boutique,Garden
1538,51.543788,-0.050131,Pub,0,Coffee Shop,Café,Pub,Yoga Studio,Train Station,Flea Market,Sporting Goods Shop,Convenience Store,Boutique,Garden
1539,51.543788,-0.042677,Café,0,Coffee Shop,Café,Pub,Yoga Studio,Train Station,Flea Market,Sporting Goods Shop,Convenience Store,Boutique,Garden
1540,51.543788,-0.051329,Coffee Shop,0,Coffee Shop,Café,Pub,Yoga Studio,Train Station,Flea Market,Sporting Goods Shop,Convenience Store,Boutique,Garden


In [32]:
# Inspect cluster 1 with the same column selection (column 1 plus columns 5 onward)
coffee_merged.loc[coffee_merged['Cluster Labels'] == 1, coffee_merged.columns[[1] + list(range(5, coffee_merged.shape[1]))]]

Index,Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
787,51.537805,-0.057848,Roof Deck,1,Café,Pub,Coffee Shop,Bakery,Grocery Store,Pizza Place,Bookstore,Cocktail Bar,Art Gallery,Yoga Studio
788,51.537805,-0.058441,Bakery,1,Café,Pub,Coffee Shop,Bakery,Grocery Store,Pizza Place,Bookstore,Cocktail Bar,Art Gallery,Yoga Studio
789,51.537805,-0.057507,Restaurant,1,Café,Pub,Coffee Shop,Bakery,Grocery Store,Pizza Place,Bookstore,Cocktail Bar,Art Gallery,Yoga Studio
790,51.537805,-0.058591,Flea Market,1,Café,Pub,Coffee Shop,Bakery,Grocery Store,Pizza Place,Bookstore,Cocktail Bar,Art Gallery,Yoga Studio
791,51.537805,-0.057224,Coffee Shop,1,Café,Pub,Coffee Shop,Bakery,Grocery Store,Pizza Place,Bookstore,Cocktail Bar,Art Gallery,Yoga Studio
792,51.537805,-0.058824,Taiwanese Restaurant,1,Café,Pub,Coffee Shop,Bakery,Grocery Store,Pizza Place,Bookstore,Cocktail Bar,Art Gallery,Yoga Studio
793,51.537805,-0.058459,Vegetarian / Vegan Restaurant,1,Café,Pub,Coffee Shop,Bakery,Grocery Store,Pizza Place,Bookstore,Cocktail Bar,Art Gallery,Yoga Studio
794,51.537805,-0.057838,Breakfast Spot,1,Café,Pub,Coffee Shop,Bakery,Grocery Store,Pizza Place,Bookstore,Cocktail Bar,Art Gallery,Yoga Studio
795,51.537805,-0.057871,Bar,1,Café,Pub,Coffee Shop,Bakery,Grocery Store,Pizza Place,Bookstore,Cocktail Bar,Art Gallery,Yoga Studio
796,51.537805,-0.058329,Bakery,1,Café,Pub,Coffee Shop,Bakery,Grocery Store,Pizza Place,Bookstore,Cocktail Bar,Art Gallery,Yoga Studio


In [34]:
# Inspect cluster 2 with the same column selection (column 1 plus columns 5 onward)
coffee_merged.loc[coffee_merged['Cluster Labels'] == 2, coffee_merged.columns[[1] + list(range(5, coffee_merged.shape[1]))]]

Index,Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,51.547139,-0.054241,Vegetarian / Vegan Restaurant,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
1,51.547139,-0.054405,Cocktail Bar,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
2,51.547139,-0.055208,Pub,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
3,51.547139,-0.055679,Sandwich Place,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
4,51.547139,-0.054681,Thrift / Vintage Store,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
5,51.547139,-0.05692,Café,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
6,51.547139,-0.055399,Movie Theater,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
7,51.547139,-0.055004,Ramen Restaurant,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
8,51.547139,-0.055239,Theater,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store
9,51.547139,-0.054558,Pub,2,Café,Pub,Grocery Store,Coffee Shop,Cocktail Bar,Restaurant,Vegetarian / Vegan Restaurant,Brewery,Hotel,Clothing Store


In [None]:
# Inspect cluster 3 with the same column selection (column 1 plus columns 5 onward)
coffee_merged.loc[coffee_merged['Cluster Labels'] == 3, coffee_merged.columns[[1] + list(range(5, coffee_merged.shape[1]))]]