# LIVING IN LONDON: For the everyday person

## Capstone Project - The Battle of Neighborhoods (Week 2)

### For the IBM Applied Data Science Capstone

## 1. Import libraries

Before we get the data and start exploring it, let's download all the dependencies that we will need.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
!pip install geocoder
import geocoder

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library

print('Libraries imported.')

Solving environment: done

# All requested packages already installed.

Libraries imported.


## 2. Download and Explore Initial Dataset

### 2.1 Downloading Dataset
Here is the link to the dataset: https://en.wikipedia.org/wiki/List_of_areas_of_London

In [42]:
#Using Pandas to extract data
data = pd.read_html('https://en.wikipedia.org/wiki/List_of_areas_of_London')

In [43]:
data

[                                                   0
 0  Map all coordinates in "Category:Areas of Lond...
 1                 Download coordinates as: KML · GPX,
                                            Location  \
 0                                        Abbey Wood   
 1                                             Acton   
 2                                         Addington   
 3                                        Addiscombe   
 4                                       Albany Park   
 5                                  Aldborough Hatch   
 6                                           Aldgate   
 7                                           Aldwych   
 8                                          Alperton   
 9                                           Anerley   
 10                                            Angel   
 11                                        Aperfield   
 12                                          Archway   
 13                                   Ardleigh Green 

In [44]:
df=data[1]
df.head()

Unnamed: 0,Location,London borough,Post town,Postcode district,Dial code,OS grid ref
0,Abbey Wood,"Bexley, Greenwich [7]",LONDON,SE2,20,TQ465785
1,Acton,"Ealing, Hammersmith and Fulham[8]",LONDON,"W3, W4",20,TQ205805
2,Addington,Croydon[8],CROYDON,CR0,20,TQ375645
3,Addiscombe,Croydon[8],CROYDON,CR0,20,TQ345665
4,Albany Park,Bexley,"BEXLEY, SIDCUP","DA5, DA14",20,TQ478728
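
Selecting `data[1]` by position depends on how the page was laid out when it was scraped. A minimal sketch of a more defensive pick, assuming the wanted table is the one that contains a given column; `pick_table` and the toy tables are hypothetical and not part of the original notebook:

```python
import pandas as pd

def pick_table(tables, required_column):
    """Return the first DataFrame whose columns include required_column."""
    for table in tables:
        # Normalise non-breaking spaces, which the Wikipedia headers contain
        names = [str(c).replace('\xa0', ' ') for c in table.columns]
        if required_column in names:
            return table
    raise ValueError('no table with column {!r}'.format(required_column))

# Toy stand-ins for the list returned by pd.read_html (made-up values)
tables = [
    pd.DataFrame({0: ['Map all coordinates', 'Download coordinates']}),
    pd.DataFrame({'Location': ['Abbey Wood', 'Acton'],
                  'London\xa0borough': ['Bexley, Greenwich', 'Ealing']}),
]
areas = pick_table(tables, 'London borough')
print(areas.shape)
```

With the real scrape, the same call would be `pick_table(data, 'London borough')`, which keeps working if the position of the areas table changes.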


In [45]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 533 entries, 0 to 532
Data columns (total 6 columns):
Location             533 non-null object
London borough       533 non-null object
Post town            533 non-null object
Postcode district    533 non-null object
Dial code            533 non-null object
OS grid ref          531 non-null object
dtypes: object(6)
memory usage: 25.1+ KB


### 2.2 Data Cleaning

In [46]:
#Checking location of null values
df.loc[df['OS grid ref'].isnull()]

Unnamed: 0,Location,London borough,Post town,Postcode district,Dial code,OS grid ref
53,Blendon,Bexley,BEXLEY,DA 5,20,
233,Hazelwood,Bromley,ORPINGTON,BR6,1689,


In [47]:
#Removing rows in 'OS grid ref' with 'NaN'
df.dropna(subset=['OS grid ref'], axis=0, inplace=True)
df['OS grid ref'].isnull().any()

False

In [48]:
#Checking column names
print(df.columns.tolist())

['Location', 'London\xa0borough', 'Post town', 'Postcode\xa0district', 'Dial\xa0code', 'OS grid ref']


In [49]:
#Keeping the Location, Borough and Postcode variables
df=df[['Location', 'London\xa0borough', 'Postcode\xa0district']]
df.head()

Unnamed: 0,Location,London borough,Postcode district
0,Abbey Wood,"Bexley, Greenwich [7]",SE2
1,Acton,"Ealing, Hammersmith and Fulham[8]","W3, W4"
2,Addington,Croydon[8],CR0
3,Addiscombe,Croydon[8],CR0
4,Albany Park,Bexley,"DA5, DA14"


In [50]:
#Renaming columns
df.rename(columns={'Location': 'Neighborhood', 'London\xa0borough': 'Borough', 'Postcode\xa0district': 'PostalCode'}, inplace=True)

In [51]:
#Removing all parentheses, brackets and texts in between
df['Neighborhood'] = df['Neighborhood'].str.replace(r"\(.*\)", "", regex=True)
df['Borough'] = df['Borough'].str.replace(r"\[.*\]", "", regex=True)
df['Neighborhood'] = df['Neighborhood'].str.strip()
df.head()

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  from ipykernel import kernelapp as app


Unnamed: 0,Neighborhood,Borough,PostalCode
0,Abbey Wood,"Bexley, Greenwich",SE2
1,Acton,"Ealing, Hammersmith and Fulham","W3, W4"
2,Addington,Croydon,CR0
3,Addiscombe,Croydon,CR0
4,Albany Park,Bexley,"DA5, DA14"
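
The bracket pattern in the cell above is greedy: `\[.*\]` matches from the first `[` to the last `]` in a value. A small sketch of the difference using the standard `re` module; the input string with two markers is illustrative, and whether any scraped row actually looks like this has not been checked here. `strip_citations` is a hypothetical helper:

```python
import re

def strip_citations(text):
    """Remove each [n] footnote marker (and any space before it) separately."""
    return re.sub(r"\s*\[.*?\]", "", text).strip()

raw = "Ealing[8], Hammersmith and Fulham[8]"   # illustrative value with two markers
greedy = re.sub(r"\[.*\]", "", raw)            # greedy pattern, as in the cell above
lazy = strip_citations(raw)                    # non-greedy alternative

print(repr(greedy))  # 'Ealing'
print(repr(lazy))    # 'Ealing, Hammersmith and Fulham'
```

The non-greedy `.*?` stops at the nearest closing bracket, so text between two markers survives.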


In [52]:
#Checking Bexley row which was initially "Bexley (also Old Bexley, Bexley Village)"
df.loc[df['Neighborhood']=='Bexley']

Unnamed: 0,Neighborhood,Borough,PostalCode
44,Bexley,Bexley,DA5


In [53]:
#Resetting index
df.reset_index(drop=True, inplace=True)
df.loc[53,:]

Neighborhood    Bloomsbury
Borough             Camden
PostalCode             WC1
Name: 53, dtype: object

In [54]:
df.loc[52:54,:]

Unnamed: 0,Neighborhood,Borough,PostalCode
52,Blackwall,Tower Hamlets,E14
53,Bloomsbury,Camden,WC1
54,Botany Bay,Enfield,EN2


#### Row 53 is now Bloomsbury, which confirms that Blendon was removed and the dataset is clean.

## 3. Get the Geographical coordinates

In [55]:
#Define a function to get coordinates
def get_latlng(location):
    #Initialize your variable to None
    lat_lng_coords = None
    #Loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, London, United Kingdom'.format(location))
        lat_lng_coords = g.latlng
    return lat_lng_coords

#Call the function to get the coordinates, store in a new list using list comprehension
coords = [ get_latlng(location) for location in df["Neighborhood"].tolist() ]
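
The `while` loop above never exits if the geocoder keeps returning `None` for a location. A minimal sketch of a bounded retry, assuming the geocoding call is passed in as a function; `get_latlng_with_retries`, `geocode_fn`, `max_tries` and `pause` are hypothetical names, and the fake geocoder exists only so the example runs offline:

```python
import time

def get_latlng_with_retries(location, geocode_fn, max_tries=3, pause=0.0):
    """Call geocode_fn(query) up to max_tries times; return [lat, lng] or None."""
    query = '{}, London, United Kingdom'.format(location)
    for attempt in range(max_tries):
        coords = geocode_fn(query)
        if coords is not None:
            return coords
        time.sleep(pause)  # brief pause before trying again
    return None

# Fake geocoder for demonstration: fails once, then succeeds (made-up coordinates)
calls = {'n': 0}
def flaky_geocoder(query):
    calls['n'] += 1
    return None if calls['n'] < 2 else [51.5, -0.1]

print(get_latlng_with_retries('Acton', flaky_geocoder))      # [51.5, -0.1]
print(get_latlng_with_retries('Nowhere', lambda q: None))    # None
```

In the notebook, `geocode_fn` could be `lambda q: geocoder.arcgis(q).latlng`, which mirrors the call made in `get_latlng`. Locations that come back as `None` would then need handling before the coordinates are put into a dataframe.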

In [56]:
coords[0:5]

[[51.492450000000076, 0.12127000000003818],
 [51.51324000000005, -0.2674599999999714],
 [51.575810032233505, -0.10933991526687237],
 [51.472748982987284, -0.2033256827571753],
 [51.48582000000005, -0.08025999999995292]]

In [57]:
#Create temporary dataframe to populate the coordinates into Latitude and Longitude
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])

In [58]:
#Merge the coordinates into the original dataframe
df['Latitude'] = df_coords['Latitude']
df['Longitude'] = df_coords['Longitude']

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  from ipykernel import kernelapp as app


In [59]:
df.head()

Unnamed: 0,Neighborhood,Borough,PostalCode,Latitude,Longitude
0,Abbey Wood,"Bexley, Greenwich",SE2,51.49245,0.12127
1,Acton,"Ealing, Hammersmith and Fulham","W3, W4",51.51324,-0.26746
2,Addington,Croydon,CR0,51.57581,-0.10934
3,Addiscombe,Croydon,CR0,51.472749,-0.203326
4,Albany Park,Bexley,"DA5, DA14",51.48582,-0.08026
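
Assigning `df_coords['Latitude']` into `df` matches rows by index label, not by position, which is why the earlier `reset_index` matters. A toy sketch of what happens with and without it (all values are made up):

```python
import pandas as pd

# Frame whose index has a gap, as happens after dropping a row (toy values)
places = pd.DataFrame({'Neighborhood': ['A', 'B', 'C']}, index=[0, 2, 3])
coords = pd.DataFrame([[51.1, 0.1], [51.2, 0.2], [51.3, 0.3]],
                      columns=['Latitude', 'Longitude'])

# Assignment aligns on index labels, so rows 2 and 3 do not get rows 1 and 2
misaligned = places.copy()
misaligned['Latitude'] = coords['Latitude']

# After resetting the index, labels match positions and the columns line up
aligned = places.reset_index(drop=True)
aligned['Latitude'] = coords['Latitude']

print(misaligned['Latitude'].tolist())  # [51.1, 51.3, nan]
print(aligned['Latitude'].tolist())     # [51.1, 51.2, 51.3]
```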


In [60]:
df.shape

(531, 5)

In [61]:
# save the DataFrame as CSV file
df.to_csv("london_df.csv", index=False)

## 4. Explore and Cluster Neighborhoods

### 4.1 Explore locations in London through visualisations
Use geopy to get the latitude and longitude of London, UK.

In [62]:
address = 'London, UK'

geolocator = Nominatim(user_agent="London")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of London are {}, {}.'.format(latitude, longitude))

The geographical coordinates of London are 51.5073219, -0.1276474.


Create a map of London with neighborhoods superimposed on top.

In [63]:
map_london = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, location in zip(df['Latitude'], df['Longitude'], df['Borough'], df['Neighborhood']):
    label = '{}, {}'.format(location, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_london)  

map_london

### 4.2 Explore the locations with Foursquare API
Define Foursquare Credentials and Version

In [64]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID (placeholder; never publish real keys)
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret (placeholder)
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET: YOUR_FOURSQUARE_CLIENT_SECRET


Define Get Nearby Venues function

In [65]:
def getNearbyVenues(names, latitudes, longitudes):
    radius=500
    LIMIT=100
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
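
`getNearbyVenues` indexes straight into the response and into `categories[0]`, so an empty group list or a venue without a category would raise an exception. A sketch of a more tolerant parsing step, assuming the response has the same nested shape the function above reads; `extract_venues` is a hypothetical helper and the payload is hand-written, not real API output:

```python
def extract_venues(name, lat, lng, payload):
    """Flatten one explore-style response into (neighborhood, ..., category) rows."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Fall back to None when a venue carries no category
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows

# Hand-written payload mimicking the fields the notebook reads (not real API output)
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Corner Cafe',
               'location': {'lat': 51.5, 'lng': -0.1},
               'categories': [{'name': 'Café'}]}},
    {'venue': {'name': 'Unlabelled Place',
               'location': {'lat': 51.6, 'lng': -0.2},
               'categories': []}},
]}]}}

rows = extract_venues('Acton', 51.51, -0.27, sample)
print(len(rows), rows[1][-1])                                  # 2 None
print(extract_venues('Empty', 0.0, 0.0, {'response': {}}))     # []
```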

In [66]:
London_venues = getNearbyVenues(names=df['Neighborhood'],
                                 latitudes=df['Latitude'],
                                 longitudes=df['Longitude'])

Abbey Wood
Acton
Addington
Addiscombe
Albany Park
Aldborough Hatch
Aldgate
Aldwych
Alperton
Anerley
Angel
Aperfield
Archway
Ardleigh Green
Arkley
Arnos Grove
Balham
Bankside
Barbican
Barking
Barkingside
Barnehurst
Barnes
Barnes Cray
Barnet Gate
Barnet
Barnsbury
Battersea
Bayswater
Beckenham
Beckton
Becontree
Becontree Heath
Beddington
Bedford Park
Belgravia
Bellingham
Belmont
Belmont
Belsize Park
Belvedere
Bermondsey
Berrylands
Bethnal Green
Bexley
Bexleyheath
Bickley
Biggin Hill
Blackfen
Blackfriars
Blackheath
Blackheath Royal Standard
Blackwall
Bloomsbury
Botany Bay
Bounds Green
Bow
Bowes Park
Brentford
Brent Cross
Brent Park
Brimsdown
Brixton
Brockley
Bromley
Bromley
Bromley Common
Brompton
Brondesbury
Brunswick Park
Bulls Cross
Burnt Oak
Burroughs, The
Camberwell
Cambridge Heath
Camden Town
Canary Wharf
Cann Hall
Canning Town
Canonbury
Carshalton
Castelnau
Castle Green
Catford
Chadwell Heath
Chalk Farm
Charing Cross
Charlton
Chase Cross
Cheam
Chelsea
Chelsfield
Chessington
Childs H

Checking the number of venues returned for each neighborhood

In [67]:
London_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Abbey Wood,7,7,7,7,7,7
Acton,7,7,7,7,7,7
Addington,7,7,7,7,7,7
Addiscombe,63,63,63,63,63,63
Albany Park,8,8,8,8,8,8
Aldborough Hatch,4,4,4,4,4,4
Aldgate,71,71,71,71,71,71
Aldwych,80,80,80,80,80,80
Alperton,18,18,18,18,18,18
Anerley,5,5,5,5,5,5


Encoding Categories of Venues

In [68]:
#One hot encoding
London_onehot = pd.get_dummies(London_venues[['Venue Category']], prefix="", prefix_sep="")

#Dropping the venue category that is itself named 'Neighborhood', then adding the neighborhood names as the first column
London_onehot.drop(['Neighborhood'],axis=1,inplace=True) 
London_onehot.insert(loc=0, column='Neighborhood', value=London_venues['Neighborhood'] )

#Getting the shape of the dataframe
London_onehot.shape

(14103, 415)

Group rows by neighborhood and take the mean frequency of occurrence of each category

In [69]:
London_grouped = London_onehot.groupby('Neighborhood').mean().reset_index()
London_grouped.head()

Unnamed: 0,Neighborhood,ATM,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Arcade,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Australian Restaurant,Austrian Restaurant,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bar,Baseball Field,Basketball Court,Beach,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Betting Shop,Bike Rental / Bike Share,Bike Shop,Bistro,Boarding House,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Boxing Gym,Brasserie,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Buddhist Temple,Building,Bulgarian Restaurant,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Butcher,Cable Car,Cafeteria,Café,Camera Store,Campground,Canal,Canal Lock,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Carpet Store,Casino,Castle,Caucasian Restaurant,Cemetery,Champagne Bar,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Circus School,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Roaster,Coffee Shop,College Cafeteria,College Quad,College Residence Hall,Colombian Restaurant,Comedy Club,Comfort Food Restaurant,Comic Shop,Community Center,Concert Hall,Construction & Landscaping,Convenience Store,Convention Center,Cosmetics Shop,Costume Shop,Coworking Space,Creperie,Cricket Ground,Cuban Restaurant,Cupcake Shop,Cycle Studio,Czech Restaurant,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distillery,Dive Bar,Dog Run,Doner Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Entertainment Service,Ethiopian Restaurant,Event Service,Event Space,Exhibit,Fabric Shop,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Film Studio,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Stand,Food Truck,Football Stadium,Forest,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Go Kart Track,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Halal Restaurant,Harbor / Marina,Hardware Store,Health & Beauty Service,Health Food Store,Herbs & Spices Store,Himalayan Restaurant,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Housing Development,Hunan Restaurant,Hungarian Restaurant,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Indonesian Restaurant,Indoor Play Area,Intersection,Irish Pub,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Korean Restaurant,Kosher Restaurant,Lake,Latin American Restaurant,Lawyer,Leather Goods Store,Lebanese Restaurant,Library,Light Rail Station,Lighting Store,Lingerie Store,Liquor Store,Locksmith,Lounge,Malay Restaurant,Market,Martial Arts Dojo,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Military Base,Mini Golf,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Motorcycle Shop,Movie Theater,Multiplex,Museum,Music Store,Music Venue,Nail Salon,Nature Preserve,New American Restaurant,Nightclub,Noodle House,North Indian Restaurant,Office,Okonomiyaki Restaurant,Opera House,Optical Shop,Organic Grocery,Other Repair Shop,Outdoor Gym,Outdoor Sculpture,Outdoor Supply Store,Outdoors & Recreation,Outlet Mall,Pakistani Restaurant,Palace,Paper / Office Supplies Store,Park,Pawn Shop,Pedestrian Plaza,Performing Arts Venue,Perfume Shop,Persian Restaurant,Peruvian Restaurant,Pet Café,Pet Service,Pet Store,Pharmacy,Photography Studio,Pie Shop,Pier,Pilates Studio,Pizza Place,Platform,Playground,Plaza,Poke Place,Polish Restaurant,Pool,Pool Hall,Portuguese Restaurant,Post Office,Print Shop,Pub,Public Art,Radio Station,Rafting,Ramen Restaurant,Record Shop,Recording Studio,Recreation Center,Rental Car Location,Reservoir,Residential Building (Apartment / Condo),Restaurant,River,Road,Rock Climbing Spot,Rock Club,Roller Rink,Roof Deck,Rugby Pitch,Russian Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Salsa Club,Sandwich Place,Sausage Shop,Scandinavian Restaurant,Scenic Lookout,School,Science Museum,Scottish Restaurant,Sculpture Garden,Seafood Restaurant,Shaanxi Restaurant,Shabu-Shabu Restaurant,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Social Club,Soup Place,South American Restaurant,South Indian Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Sri Lankan Restaurant,Stables,Stadium,Stationery Store,Steakhouse,Street Art,Street Food Gathering,Student Center,Supermarket,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Thrift / Vintage Store,Tour Provider,Tourist Information Center,Toy / Game Store,Track,Track Stadium,Trail,Train Station,Tram Station,Transportation Service,Tree,Tunnel,Turkish Restaurant,Udon Restaurant,University,Used Auto Dealership,Used Bookstore,Vape Store,Vegetarian / Vegan Restaurant,Veneto Restaurant,Veterinarian,Video Game Store,Vietnamese Restaurant,Warehouse Store,Watch Shop,Whisky Bar,Windmill,Wine Bar,Wine Shop,Wings Joint,Women's Store,Xinjiang Restaurant,Yakitori Restaurant,Yoga Studio,Zoo,Zoo Exhibit
0,Abbey Wood,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Acton,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Addington,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Addiscombe,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.063492,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.079365,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.031746,0.0,0.0,0.0,0.063492,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.031746,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.031746,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.095238,0.0,0.015873,0.0,0.0,0.031746,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.031746,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.031746,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.063492,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.031746,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.015873,0.0,0.015873,0.0,0.0,0.031746,0.0,0.0
4,Albany Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
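
The values in this table are shares: averaging 0/1 indicator columns within a neighborhood gives the fraction of its venues in each category. Abbey Wood, with 7 venues, therefore shows 0.142857 (1/7) and 0.285714 (2/7). A toy sketch of the same two steps on made-up data; `dtype=float` is passed here only so the averaged columns are explicitly numeric:

```python
import pandas as pd

venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B'],
    'Venue Category': ['Pub', 'Pub', 'Café', 'Park'],
})

# One column per category; dtype=float keeps the later mean purely numeric
onehot = pd.get_dummies(venues[['Venue Category']], prefix='', prefix_sep='', dtype=float)
onehot.insert(loc=0, column='Neighborhood', value=venues['Neighborhood'])

# Mean of 0/1 indicators = share of the neighborhood's venues in that category
grouped = onehot.groupby('Neighborhood').mean()
print(grouped.round(3))
```

Each row of `grouped` sums to 1, so neighborhoods with very few venues produce coarse shares such as 0.25 or 0.5.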


## 5. Analysis for Supermarket and Train Station

Create a new dataframe with the Supermarket and Train Station columns and run clustering on it.

In [70]:
#smts stands for supermarket and train station
London_smts=London_grouped[['Neighborhood', 'Supermarket', 'Train Station']]

Making clusters for neighborhoods

In [71]:
#Set number of clusters
kclusters = 5

London_smts_clustering = London_smts.drop('Neighborhood', axis=1)

#Run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(London_smts_clustering)

#Check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([3, 2, 2, 1, 1, 1, 1, 1, 1, 4], dtype=int32)
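
The notebook fixes `kclusters = 5` without showing how that number was chosen. One common way to sanity-check it is the elbow heuristic: fit k-means for several values of k and look at where the inertia stops dropping sharply. A sketch on synthetic two-feature data, which stands in for the Supermarket and Train Station shares and is not the notebook's data:

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic two-feature data standing in for the Supermarket / Train Station shares
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [0.3, 0.0], [0.0, 0.3]])
X = np.vstack([c + 0.02 * rng.randn(40, 2) for c in centers])

inertias = []
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)  # within-cluster sum of squared distances

for k, inertia in zip(range(1, 7), inertias):
    print(k, round(inertia, 4))
```

On this toy data the inertia falls steeply up to k = 3 (the number of generated blobs) and only slowly afterwards. Running the same loop on `London_smts_clustering` would show whether 5 is a reasonable choice there.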

In [72]:
#Create new dataset for merging
London_merged = London_smts

#Add clustering labels
London_merged['Clusters']=kmeans.labels_
London_merged.head()

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy


Unnamed: 0,Neighborhood,Supermarket,Train Station,Clusters
0,Abbey Wood,0.285714,0.142857,3
1,Acton,0.0,0.142857,2
2,Addington,0.0,0.142857,2
3,Addiscombe,0.015873,0.0,1
4,Albany Park,0.0,0.0,1


Add the most common venues for each location.
First, define a function that sorts venue categories in descending order of frequency.

In [73]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Create a dataframe that displays the top 10 venues for each neighborhood

In [74]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = London_grouped['Neighborhood']

for ind in np.arange(London_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(London_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Abbey Wood,Supermarket,Train Station,Coffee Shop,Convenience Store,Platform,Historic Site,Ethiopian Restaurant,Event Service,Event Space,Exhibit
1,Acton,Grocery Store,Breakfast Spot,Indian Restaurant,Bed & Breakfast,Park,Train Station,Zoo Exhibit,Filipino Restaurant,Exhibit,Fabric Shop
2,Addington,Café,Trail,Park,Coffee Shop,Tapas Restaurant,Train Station,Convenience Store,Food Court,Fast Food Restaurant,Event Service
3,Addiscombe,Italian Restaurant,Café,Coffee Shop,Bakery,Pub,Grocery Store,Juice Bar,Climbing Gym,Park,Thai Restaurant
4,Albany Park,Café,Building,Pub,Fast Food Restaurant,Garden,Bar,Sporting Goods Shop,Food Service,Film Studio,Exhibit
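
The column labels above rely on a three-item suffix list plus an exception for everything else, which is fine for 1 to 10 but would label position 21 as "21th". A small sketch of a general suffix helper; `ordinal` is a hypothetical function, not part of the original notebook:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'          # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

columns = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns[1], '|', columns[4], '|', columns[10])
# 1st Most Common Venue | 4th Most Common Venue | 10th Most Common Venue
```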


Merging Cluster dataset and Most Common Venue dataset

In [75]:
London = London_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
London = London.join(df.set_index('Neighborhood'), on='Neighborhood')
London.head()

Unnamed: 0,Neighborhood,Supermarket,Train Station,Clusters,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,PostalCode,Latitude,Longitude
0,Abbey Wood,0.285714,0.142857,3,Supermarket,Train Station,Coffee Shop,Convenience Store,Platform,Historic Site,Ethiopian Restaurant,Event Service,Event Space,Exhibit,"Bexley, Greenwich",SE2,51.49245,0.12127
1,Acton,0.0,0.142857,2,Grocery Store,Breakfast Spot,Indian Restaurant,Bed & Breakfast,Park,Train Station,Zoo Exhibit,Filipino Restaurant,Exhibit,Fabric Shop,"Ealing, Hammersmith and Fulham","W3, W4",51.51324,-0.26746
2,Addington,0.0,0.142857,2,Café,Trail,Park,Coffee Shop,Tapas Restaurant,Train Station,Convenience Store,Food Court,Fast Food Restaurant,Event Service,Croydon,CR0,51.57581,-0.10934
3,Addiscombe,0.015873,0.0,1,Italian Restaurant,Café,Coffee Shop,Bakery,Pub,Grocery Store,Juice Bar,Climbing Gym,Park,Thai Restaurant,Croydon,CR0,51.472749,-0.203326
4,Albany Park,0.0,0.0,1,Café,Building,Pub,Fast Food Restaurant,Garden,Bar,Sporting Goods Shop,Food Service,Film Studio,Exhibit,Bexley,"DA5, DA14",51.48582,-0.08026
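
A side effect of joining on `Neighborhood` is worth keeping in mind: if the lookup table lists a name more than once, the join returns one row per match. The Cluster 2 output further down shows this for row 63 (Bromley), which appears once with borough `Bromley` and once with `Tower Hamlets`. The sketch below reproduces the effect on invented data; `venues` and `areas` are hypothetical stand-ins for `London_merged` and `df`.

```python
import pandas as pd

# Stand-in for the clustered dataframe: one row per neighborhood.
venues = pd.DataFrame({
    'Neighborhood': ['Bromley', 'Acton'],
    'Clusters': [2, 2],
})

# Stand-in for the lookup table: the same name listed under two boroughs.
areas = pd.DataFrame({
    'Neighborhood': ['Bromley', 'Bromley', 'Acton'],
    'Borough': ['Bromley', 'Tower Hamlets', 'Ealing'],
})

# Same join pattern as the notebook; the duplicated key yields two Bromley rows.
joined = venues.join(areas.set_index('Neighborhood'), on='Neighborhood')
print(joined)

# Listing the duplicated keys before joining shows which rows will multiply.
dupes = areas[areas.duplicated('Neighborhood', keep=False)]
print(dupes)
```

Whether to keep both rows or pick one is a judgment call. The point is only that row counts taken after the join can exceed the number of distinct neighborhoods.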


## 6. Results

### 6.1 Creating a new map

In [76]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=10)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(London['Latitude'], London['Longitude'], London['Neighborhood'], London['Clusters']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
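
To read the map, it helps to know which colour each cluster label receives. The markers are coloured with `rainbow[cluster-1]`, so label 0 wraps around to the last palette entry and every other label takes the entry just before its own index. The sketch below prints that mapping without drawing a map. It assumes `kclusters = 5`, inferred from the five clusters listed in the results, and builds the palette straight from `kclusters`, which yields the same five colours as the `ys` list does.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5  # assumption: the results list clusters 0 to 4

# One evenly spaced colour per cluster, converted to hex strings.
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(c) for c in colors_array]

# Show the colour that the cluster-1 indexing assigns to each label.
for cluster in range(kclusters):
    print('cluster', cluster, '->', rainbow[cluster - 1])
```

All five labels still receive distinct colours. Only the order is shifted by one position.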

### 6.2 Viewing individual clusters

In [77]:
#Cluster 0
London.loc[London['Clusters'] == 0]

Unnamed: 0,Neighborhood,Supermarket,Train Station,Clusters,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,PostalCode,Latitude,Longitude
14,Arkley,0.133333,0.0,0,Supermarket,Grocery Store,Coffee Shop,Furniture / Home Store,Café,Discount Store,Warehouse Store,Pie Shop,Gym / Fitness Center,Pub,Barnet,"EN5, NW7",51.579763,-0.030331
27,Battersea,0.076923,0.038462,0,Café,Indian Restaurant,Pub,Supermarket,Italian Restaurant,Grocery Store,Sandwich Place,Gym / Fitness Center,Train Station,Gym,Wandsworth,SW11,51.4676,-0.1629
30,Beckton,0.0625,0.0,0,Pub,Bakery,Sandwich Place,Bookstore,Fast Food Restaurant,Grocery Store,Supermarket,Clothing Store,Warehouse Store,Park,Newham,"E6, E16, IG11",51.53292,0.05461
44,Bexleyheath,0.0625,0.0,0,Pub,Recreation Center,Pizza Place,Business Service,Supermarket,Bowling Alley,Fast Food Restaurant,Mexican Restaurant,Greek Restaurant,Gym,Bexley,"DA6, DA7, SE2",51.46013,0.13808
45,Bickley,0.083333,0.0,0,Grocery Store,Pub,Asian Restaurant,Supermarket,Vegetarian / Vegan Restaurant,Café,Park,Farm,Sandwich Place,Coffee Shop,Bromley,BR3,51.572825,-0.01433
57,Brent Cross,0.066667,0.0,0,Café,Coffee Shop,Clothing Store,Supermarket,Department Store,Women's Store,Electronics Store,Sporting Goods Shop,Men's Store,Bookstore,Barnet,"NW2, NW4",51.576501,-0.217572
58,Brent Park,0.142857,0.0,0,Scandinavian Restaurant,Sandwich Place,Supermarket,Furniture / Home Store,Portuguese Restaurant,Fast Food Restaurant,Zoo Exhibit,Event Service,Event Space,Exhibit,Brent,NW10,51.55384,-0.25845
85,Charlton,0.133333,0.0,0,Coffee Shop,Supermarket,Sporting Goods Shop,Department Store,Soccer Stadium,Discount Store,Warehouse Store,Clothing Store,Thai Restaurant,Electronics Store,Greenwich,SE7,51.48702,0.03188
101,Colindale,0.142857,0.0,0,Gym / Fitness Center,Supermarket,Fast Food Restaurant,Grocery Store,Bus Stop,Convenience Store,Pub,Coffee Shop,Park,Pizza Place,Barnet,NW9,51.594082,-0.252251
105,Colyers,0.111111,0.0,0,Grocery Store,Gym / Fitness Center,Skating Rink,Kebab Restaurant,Pub,Supermarket,Café,Gastropub,Fast Food Restaurant,Event Space,Bexley,DA8,51.424583,-0.137596


In [78]:
#Cluster 1
London.loc[London['Clusters'] == 1]

Unnamed: 0,Neighborhood,Supermarket,Train Station,Clusters,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,PostalCode,Latitude,Longitude
3,Addiscombe,0.015873,0.0,1,Italian Restaurant,Café,Coffee Shop,Bakery,Pub,Grocery Store,Juice Bar,Climbing Gym,Park,Thai Restaurant,Croydon,CR0,51.472749,-0.203326
4,Albany Park,0.0,0.0,1,Café,Building,Pub,Fast Food Restaurant,Garden,Bar,Sporting Goods Shop,Food Service,Film Studio,Exhibit,Bexley,"DA5, DA14",51.48582,-0.08026
5,Aldborough Hatch,0.0,0.0,1,Restaurant,Construction & Landscaping,Music Venue,History Museum,Zoo Exhibit,Filipino Restaurant,Event Service,Event Space,Exhibit,Fabric Shop,Redbridge,IG2,54.09199,-1.38166
6,Aldgate,0.0,0.0,1,Hotel,Coffee Shop,Cocktail Bar,French Restaurant,Gym / Fitness Center,Restaurant,Wine Bar,Argentinian Restaurant,Asian Restaurant,Garden,City,EC3,51.513312,-0.077765
7,Aldwych,0.0,0.0,1,Pub,Theater,Restaurant,Coffee Shop,Dessert Shop,Sandwich Place,Clothing Store,Burger Joint,Historic Site,Bar,Westminster,WC2,51.512653,-0.118607
8,Alperton,0.0,0.0,1,Lebanese Restaurant,Grocery Store,Middle Eastern Restaurant,French Restaurant,Pizza Place,Fast Food Restaurant,Gym,Café,Garden,Furniture / Home Store,Brent,HA0,51.526869,-0.20644
10,Angel,0.01,0.0,1,Coffee Shop,Food Truck,Italian Restaurant,Gym / Fitness Center,Hotel,Bar,Pub,Vietnamese Restaurant,Cocktail Bar,Yoga Studio,Islington,"EC1, N1",51.52473,-0.08754
11,Aperfield,0.0,0.0,1,French Restaurant,Pizza Place,Coffee Shop,Café,Chinese Restaurant,Grocery Store,Indian Restaurant,Pub,Filipino Restaurant,Exhibit,Bromley,TN16,51.44192,-0.16711
12,Archway,0.0,0.0,1,Coffee Shop,Grocery Store,Pub,Pizza Place,Bar,Italian Restaurant,Farmers Market,Vegetarian / Vegan Restaurant,Sandwich Place,Gastropub,Islington,N19,51.565748,-0.134922
13,Ardleigh Green,0.0,0.0,1,Pub,Bus Stop,Café,Grocery Store,Deli / Bodega,Thai Restaurant,Gastropub,Cocktail Bar,Thrift / Vintage Store,Ethiopian Restaurant,Havering,RM11,51.544754,-0.083123


In [79]:
#Cluster 2
London.loc[London['Clusters'] == 2]

Unnamed: 0,Neighborhood,Supermarket,Train Station,Clusters,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,PostalCode,Latitude,Longitude
1,Acton,0.0,0.142857,2,Grocery Store,Breakfast Spot,Indian Restaurant,Bed & Breakfast,Park,Train Station,Zoo Exhibit,Filipino Restaurant,Exhibit,Fabric Shop,"Ealing, Hammersmith and Fulham","W3, W4",51.51324,-0.26746
2,Addington,0.0,0.142857,2,Café,Trail,Park,Coffee Shop,Tapas Restaurant,Train Station,Convenience Store,Food Court,Fast Food Restaurant,Event Service,Croydon,CR0,51.57581,-0.10934
63,Bromley,0.076923,0.076923,2,Sporting Goods Shop,Soccer Stadium,Hostel,Bar,Train Station,Chinese Restaurant,Sandwich Place,Supermarket,Mediterranean Restaurant,Stadium,Bromley,BR1,51.601511,-0.066365
63,Bromley,0.076923,0.076923,2,Sporting Goods Shop,Soccer Stadium,Hostel,Bar,Train Station,Chinese Restaurant,Sandwich Place,Supermarket,Mediterranean Restaurant,Stadium,Tower Hamlets,E3,51.601511,-0.066365
64,Bromley Common,0.076923,0.076923,2,Sporting Goods Shop,Soccer Stadium,Hostel,Bar,Train Station,Chinese Restaurant,Sandwich Place,Supermarket,Mediterranean Restaurant,Stadium,Bromley,BR3,51.601511,-0.066365
67,Brunswick Park,0.0,0.066667,2,Coffee Shop,Pub,Cosmetics Shop,Food & Drink Shop,Fried Chicken Joint,South American Restaurant,Café,Tapas Restaurant,Park,Grocery Store,Barnet,N11,51.5836,-0.07607
77,Canonbury,0.0,0.055556,2,Pub,Trail,Fruit & Vegetable Store,Performing Arts Venue,Train Station,Thai Restaurant,Coffee Shop,Café,Pizza Place,Organic Grocery,Islington,N1,51.54868,-0.09175
93,Chinbrook,0.0,0.076923,2,Platform,Grocery Store,Park,Train Station,Chinese Restaurant,Indian Restaurant,Fried Chicken Joint,Pub,Coffee Shop,Zoo Exhibit,Lewisham,SE12,51.43155,0.022245
111,Cranford,0.0,0.142857,2,Tapas Restaurant,Coffee Shop,Convenience Store,Food & Drink Shop,Park,Train Station,Café,Event Space,Exhibit,Fabric Shop,Hounslow,TW5,51.580084,-0.10959
134,Downe,0.0,0.142857,2,Train Station,Bistro,Deli / Bodega,Middle Eastern Restaurant,Pub,Bar,Coffee Shop,Event Space,Exhibit,Fabric Shop,Bromley,BR6,51.632211,-0.10488


In [80]:
#Cluster 3
London.loc[London['Clusters'] == 3]

Unnamed: 0,Neighborhood,Supermarket,Train Station,Clusters,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,PostalCode,Latitude,Longitude
0,Abbey Wood,0.285714,0.142857,3,Supermarket,Train Station,Coffee Shop,Convenience Store,Platform,Historic Site,Ethiopian Restaurant,Event Service,Event Space,Exhibit,"Bexley, Greenwich",SE2,51.49245,0.12127
20,Barkingside,0.5,0.0,3,Supermarket,Grocery Store,Park,Zoo Exhibit,Film Studio,Exhibit,Fabric Shop,Factory,Falafel Restaurant,Farm,Redbridge,IG6,51.58511,0.07841
86,Chase Cross,0.333333,0.0,3,Park,Supermarket,Zoo Exhibit,Film Studio,Event Space,Exhibit,Fabric Shop,Factory,Falafel Restaurant,Farm,Havering,RM5,51.416607,-0.117325
116,Cricklewood,0.25,0.0,3,Supermarket,Bus Station,Pizza Place,Park,Zoo Exhibit,Filipino Restaurant,Event Space,Exhibit,Fabric Shop,Factory,"Barnet, Brent, Camden",NW2,51.56491,-0.21651
118,Crook Log,0.2,0.0,3,Grocery Store,Supermarket,Australian Restaurant,Park,Vietnamese Restaurant,Zoo Exhibit,Filipino Restaurant,Event Space,Exhibit,Fabric Shop,Bexley,DA6,51.48777,-0.0415
121,Croydon,0.25,0.0,3,Café,Supermarket,Park,Fast Food Restaurant,Zoo Exhibit,Exhibit,Fabric Shop,Factory,Falafel Restaurant,Farm,Croydon,CR0,51.59347,-0.08338
126,Dagenham,0.2,0.0,3,Auto Garage,Indian Restaurant,Supermarket,Pub,Gas Station,Zoo Exhibit,Film Studio,Event Space,Exhibit,Fabric Shop,Barking and Dagenham,"RM9, RM10",51.569085,-0.026985
149,Edgware,0.2,0.0,3,Supermarket,Pub,Sandwich Place,Auto Garage,Outdoor Supply Store,Dim Sum Restaurant,Furniture / Home Store,Gym / Fitness Center,Chinese Restaurant,Grocery Store,Barnet,HA8,51.594582,-0.2602
218,Harrow Weald,0.2,0.0,3,Grocery Store,Indian Restaurant,Supermarket,Thai Restaurant,Fast Food Restaurant,Zoo Exhibit,Filipino Restaurant,Event Space,Exhibit,Fabric Shop,Harrow,HA3,51.60643,-0.34024
329,Northolt,0.25,0.0,3,Café,Supermarket,Grocery Store,Fast Food Restaurant,Zoo Exhibit,Exhibit,Fabric Shop,Factory,Falafel Restaurant,Farm,Ealing,UB5,51.59499,-0.08089


In [81]:
#Cluster 4
London.loc[London['Clusters'] == 4]

Unnamed: 0,Neighborhood,Supermarket,Train Station,Clusters,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,PostalCode,Latitude,Longitude
9,Anerley,0.0,0.2,4,Hardware Store,Pub,Gas Station,Park,Train Station,Zoo Exhibit,Fast Food Restaurant,Event Service,Event Space,Exhibit,Bromley,SE20,51.41233,-0.06539
36,Bellingham,0.0,0.25,4,Brewery,Bus Stop,Metro Station,Train Station,Food Truck,Food Stand,Exhibit,Fabric Shop,Factory,Forest,Lewisham,SE6,51.60257,-0.05587
81,Catford,0.0,0.2,4,Discount Store,Furniture / Home Store,Turkish Restaurant,Pizza Place,Train Station,Food Court,Food Service,Ethiopian Restaurant,Event Service,Event Space,Lewisham,SE6,51.43722,-0.01868
197,Hadley Wood,0.0,0.333333,4,Tunnel,Golf Course,Train Station,Zoo Exhibit,Filipino Restaurant,Event Service,Event Space,Exhibit,Fabric Shop,Factory,Enfield,EN4,51.66669,-0.16981
249,Kensal Green,0.0,0.2,4,Bakery,Pub,Park,Portuguese Restaurant,Train Station,Filipino Restaurant,Event Service,Event Space,Exhibit,Fabric Shop,Brent,"NW10, NW6",51.53054,-0.22548
302,Morden Park,0.0,0.333333,4,Train Station,English Restaurant,Pool,Park,Hotel,Event Service,Event Space,Exhibit,Fabric Shop,Factory,Merton,SM4,51.39206,-0.20352
375,Romford,0.0,0.25,4,Bus Station,Grocery Store,Arcade,Train Station,Zoo Exhibit,Film Studio,Event Space,Exhibit,Fabric Shop,Factory,Havering,RM1,51.54859,0.04117
513,Woodford Green,0.0,0.222222,4,Café,Pub,Train Station,Restaurant,Chinese Restaurant,Grocery Store,Food Service,Ethiopian Restaurant,Event Service,Event Space,"Redbridge, Waltham Forest",IG8,51.553463,0.025281


__Result__:

In [82]:
print("Average in Cluster 0 for Supermarkets: %.2f" % London['Supermarket'].loc[London['Clusters'] == 0].mean(axis=0))
print("Average in Cluster 0 for Train Stations: %.2f" % London['Train Station'].loc[London['Clusters'] == 0].mean(axis=0))
print("Number of locations: ", London['Train Station'].loc[London['Clusters'] == 0].size)

Average in Cluster 0 for Supermarkets: 0.09
Average in Cluster 0 for Train Stations: 0.00
Number of locations:  57


In [83]:
print("Average in Cluster 1 for Supermarkets: %.2f" % London['Supermarket'].loc[London['Clusters'] == 1].mean(axis=0))
print("Average in Cluster 1 for Train Stations: %.2f" % London['Train Station'].loc[London['Clusters'] == 1].mean(axis=0))
print("Number of locations: ", London['Train Station'].loc[London['Clusters'] == 1].size)

Average in Cluster 1 for Supermarkets: 0.01
Average in Cluster 1 for Train Stations: 0.00
Number of locations:  411


In [84]:
print("Average in Cluster 2 for Supermarkets: %.2f" % London['Supermarket'].loc[London['Clusters'] == 2].mean(axis=0))
print("Average in Cluster 2 for Train Stations: %.2f" % London['Train Station'].loc[London['Clusters'] == 2].mean(axis=0))
print("Number of locations: ", London['Train Station'].loc[London['Clusters'] == 2].size)

Average in Cluster 2 for Supermarkets: 0.02
Average in Cluster 2 for Train Stations: 0.09
Number of locations:  38


In [85]:
print("Average in Cluster 3 for Supermarkets: %.2f" % London['Supermarket'].loc[London['Clusters'] == 3].mean(axis=0))
print("Average in Cluster 3 for Train Stations: %.2f" % London['Train Station'].loc[London['Clusters'] == 3].mean(axis=0))
print("Number of locations: ", London['Train Station'].loc[London['Clusters'] == 3].size)

Average in Cluster 3 for Supermarkets: 0.27
Average in Cluster 3 for Train Stations: 0.03
Number of locations:  15


In [86]:
print("Average in Cluster 4 for Supermarkets: %.2f" % London['Supermarket'].loc[London['Clusters'] == 4].mean(axis=0))
print("Average in Cluster 4 for Train Stations: %.2f" % London['Train Station'].loc[London['Clusters'] == 4].mean(axis=0))
print("Number of locations: ", London['Train Station'].loc[London['Clusters'] == 4].size)

Average in Cluster 4 for Supermarkets: 0.00
Average in Cluster 4 for Train Stations: 0.25
Number of locations:  8


Results in a table format:

Cluster | Size | Supermarkets (Avg) | Train Stations (Avg)
-|-|-|-
0 | 57  | 0.09 | 0.00
1 | 411 | 0.01 | 0.00
2 | 38  | 0.02 | 0.09
3 | 15  | 0.27 | 0.03
4 | 8   | 0.00 | 0.25
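
The same summary can be produced in one step with a grouped aggregation instead of one print cell per cluster. This is a sketch on invented data: `toy` stands in for the merged `London` dataframe, the output column names are hypothetical, and the named-aggregation syntax assumes pandas 0.25 or later.

```python
import pandas as pd

# Toy stand-in for the merged London dataframe (values are invented).
toy = pd.DataFrame({
    'Clusters': [0, 0, 1, 1, 1],
    'Supermarket': [0.10, 0.20, 0.00, 0.00, 0.03],
    'Train Station': [0.00, 0.00, 0.10, 0.20, 0.00],
})

# One row per cluster: row count plus the mean frequency of each venue type.
summary = toy.groupby('Clusters').agg(
    size=('Supermarket', 'size'),
    supermarket_avg=('Supermarket', 'mean'),
    train_station_avg=('Train Station', 'mean'),
).round(2)
print(summary)
```

As with the print cells above, `size` counts rows of the merged dataframe, so a neighborhood that the join duplicated is counted once per duplicate.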