# <center>The Battle of Neighborhoods</center>

# Introduction

Tourism is important to London. A visitor wants to know where best to book a hotel, ideally close to museums and the main attractions.
Public transportation around the hotel also matters.

**Target audience**
A company would like to open a new hotel in London, and the location of that hotel matters.

**Where is the best place to install a new hotel?**
This project addresses the decision of where to locate a new hotel in a city that already has many hotels. We decided not to look at the downtown area (too many hotels already) and to focus on Inner London.
We need to find a place:
- near some of the best attractions (such as museums, Big Ben, etc.)
- near public transportation
- near restaurants

# Data

To solve this problem I will use several data sources:
- The Wikipedia page listing boroughs and their coordinates: "https://en.wikipedia.org/wiki/List_of_London_boroughs"
From this page I will extract all the borough names. There are 32 boroughs in London, 33 with the City of London (examples: Greenwich, Bexley, Westminster).
It's important to keep all of them at this stage because some attractions are not downtown.
Neighborhoods generally share their names with public transportation stops.
To use Foursquare, we need coordinates. For example, we could center our searches on Big Ben.
- A CSV file with Inner/Outer London information: https://files.datapress.com/london/dataset/london-borough-profiles/2015-09-24T15:49:52/london-borough-profiles.csv
We will focus only on Inner London.
- Foursquare location data
This is needed to see where the best attractions are and to find the best borough for the hotel.
It's also important to see how many hotels are already nearby.

### 1. Obtain Neighborhoods

In [1]:
# import all libraries
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import requests # library to handle requests
import urllib.request



We will extract neighborhoods and coordinates from a Wikipedia page.

In [2]:
url="https://en.wikipedia.org/wiki/List_of_London_boroughs"
# query the page and store the HTTP response in page
page = urllib.request.urlopen(url)


In [3]:
df = pd.read_html(url, header=0)
df= df[0]
df


Unnamed: 0,Borough,Inner,Status,Local authority,Political control,Headquarters,Area (sq mi),Population (2019 est)[1],Co-ordinates,Nr. in map
0,Barking and Dagenham [note 1],,,Barking and Dagenham London Borough Council,Labour,"Town Hall, 1 Town Square",13.93,212906,".mw-parser-output .geo-default,.mw-parser-outp...",25
1,Barnet,,,Barnet London Borough Council,Conservative,"Barnet House, 2 Bristol Avenue, Colindale",33.49,395896,51°37′31″N 0°09′06″W﻿ / ﻿51.6252°N 0.1517°W,31
2,Bexley,,,Bexley London Borough Council,Conservative,"Civic Offices, 2 Watling Street",23.38,248287,51°27′18″N 0°09′02″E﻿ / ﻿51.4549°N 0.1505°E,23
3,Brent,,,Brent London Borough Council,Labour,"Brent Civic Centre, Engineers Way",16.7,329771,51°33′32″N 0°16′54″W﻿ / ﻿51.5588°N 0.2817°W,12
4,Bromley,,,Bromley London Borough Council,Conservative,"Civic Centre, Stockwell Close",57.97,332336,51°24′14″N 0°01′11″E﻿ / ﻿51.4039°N 0.0198°E,20
5,Camden,,,Camden London Borough Council,Labour,"Camden Town Hall, Judd Street",8.4,270029,51°31′44″N 0°07′32″W﻿ / ﻿51.5290°N 0.1255°W,11
6,Croydon,,,Croydon London Borough Council,Labour,"Bernard Weatherill House, Mint Walk",33.41,386710,51°22′17″N 0°05′52″W﻿ / ﻿51.3714°N 0.0977°W,19
7,Ealing,,,Ealing London Borough Council,Labour,"Perceval House, 14-16 Uxbridge Road",21.44,341806,51°30′47″N 0°18′32″W﻿ / ﻿51.5130°N 0.3089°W,13
8,Enfield,,,Enfield London Borough Council,Labour,"Civic Centre, Silver Street",31.74,333794,51°39′14″N 0°04′48″W﻿ / ﻿51.6538°N 0.0799°W,30
9,Greenwich [note 2],[note 3],Royal,Greenwich London Borough Council,Labour,"Woolwich Town Hall, Wellington Street",18.28,287942,51°29′21″N 0°03′53″E﻿ / ﻿51.4892°N 0.0648°E,22


We have to remove the [note N] markers and keep only two columns.

In [4]:
df = df[['Borough', 'Co-ordinates']].copy()  # explicit copy, so the assignment below does not act on a slice
# trim the brackets around footnote markers such as "[note 1]"
df['Borough'] = df['Borough'].astype(str).str.strip(' [').str.strip(']')
# drop what is left of the markers for notes 1, 2 and 4
df = df.replace(r'\[note [124]', '', regex=True)
df


Unnamed: 0,Borough,Co-ordinates
0,Barking and Dagenham,".mw-parser-output .geo-default,.mw-parser-outp..."
1,Barnet,51°37′31″N 0°09′06″W﻿ / ﻿51.6252°N 0.1517°W
2,Bexley,51°27′18″N 0°09′02″E﻿ / ﻿51.4549°N 0.1505°E
3,Brent,51°33′32″N 0°16′54″W﻿ / ﻿51.5588°N 0.2817°W
4,Bromley,51°24′14″N 0°01′11″E﻿ / ﻿51.4039°N 0.0198°E
5,Camden,51°31′44″N 0°07′32″W﻿ / ﻿51.5290°N 0.1255°W
6,Croydon,51°22′17″N 0°05′52″W﻿ / ﻿51.3714°N 0.0977°W
7,Ealing,51°30′47″N 0°18′32″W﻿ / ﻿51.5130°N 0.3089°W
8,Enfield,51°39′14″N 0°04′48″W﻿ / ﻿51.6538°N 0.0799°W
9,Greenwich,51°29′21″N 0°03′53″E﻿ / ﻿51.4892°N 0.0648°E
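
The separate replace calls above cover only the markers that appear in this table. As a rough sketch (the helper name `strip_footnotes` and the constant `NOTE_MARKER` are ours, not part of the notebook), a single regular expression can remove any `[note N]` marker together with the space in front of it:

```python
import re

# Matches a footnote marker such as "[note 1]" plus any whitespace before it.
NOTE_MARKER = re.compile(r"\s*\[note \d+\]")


def strip_footnotes(name: str) -> str:
    """Return the borough name without footnote markers or stray spaces."""
    return NOTE_MARKER.sub("", name).strip()


# Two names taken from the table above, plus one without a marker.
for raw in ["Barking and Dagenham [note 1]", "Greenwich [note 2]", "Barnet"]:
    print(repr(strip_footnotes(raw)))
```

With pandas, the same pattern could be passed to `Series.str.replace(pattern, '', regex=True)` followed by `.str.strip()`.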


### 2. Find Inner London


Let's find the names of the Inner London neighborhoods. For that, we will use a second file.

In [5]:
url2 = 'https://files.datapress.com/london/dataset/london-borough-profiles/2015-09-24T15:49:52/london-borough-profiles.csv'
inner = pd.read_csv(url2, encoding= 'unicode_escape')


inner.head()


Unnamed: 0,Code,Area_name,Inner/_Outer_London,GLA_Population_Estimate_2017,GLA_Household_Estimate_2017,Inland_Area_(Hectares),Population_density_(per_hectare)_2017,"Average_Age,_2017","Proportion_of_population_aged_0-15,_2015","Proportion_of_population_of_working-age,_2015","Proportion_of_population_aged_65_and_over,_2015",Net_internal_migration_(2015),Net_international_migration_(2015),Net_natural_change_(2015),%_of_resident_population_born_abroad_(2015),Largest_migrant_population_by_country_of_birth_(2011),%_of_largest_migrant_population_(2011),Second_largest_migrant_population_by_country_of_birth_(2011),%_of_second_largest_migrant_population_(2011),Third_largest_migrant_population_by_country_of_birth_(2011),%_of_third_largest_migrant_population_(2011),%_of_population_from_BAME_groups_(2016),%_people_aged_3+_whose_main_language_is_not_English_(2011_Census),"Overseas_nationals_entering_the_UK_(NINo),_(2015/16)","New_migrant_(NINo)_rates,_(2015/16)",Largest_migrant_population_arrived_during_2015/16,Second_largest_migrant_population_arrived_during_2015/16,Third_largest_migrant_population_arrived_during_2015/16,Employment_rate_(%)_(2015),Male_employment_rate_(2015),Female_employment_rate_(2015),Unemployment_rate_(2015),Youth_Unemployment_(claimant)_rate_18-24_(Dec-15),Proportion_of_16-18_year_olds_who_are_NEET_(%)_(2014),Proportion_of_the_working-age_population_who_claim_out-of-work_benefits_(%)_(May-2016),%_working-age_with_a_disability_(2015),Proportion_of_working_age_people_with_no_qualifications_(%)_2015,Proportion_of_working_age_with_degree_or_equivalent_and_above_(%)_2015,"Gross_Annual_Pay,_(2016)",Gross_Annual_Pay_-_Male_(2016),Gross_Annual_Pay_-_Female_(2016),Modelled_Household_median_income_estimates_2012/13,%_adults_that_volunteered_in_past_12_months_(2010/11_to_2012/13),Number_of_jobs_by_workplace_(2014),%_of_employment_that_is_in_public_sector_(2014),"Jobs_Density,_2015","Number_of_active_businesses,_2015",Two-year_business_survival_rates_(started_in_2013)
,Crime_rates_per_thousand_population_2014/15,Fires_per_thousand_population_(2014),Ambulance_incidents_per_hundred_population_(2014),"Median_House_Price,_2015","Average_Band_D_Council_Tax_charge_(£),_2015/16",New_Homes_(net)_2015/16_(provisional),"Homes_Owned_outright,_(2014)_%","Being_bought_with_mortgage_or_loan,_(2014)_%","Rented_from_Local_Authority_or_Housing_Association,_(2014)_%","Rented_from_Private_landlord,_(2014)_%","%_of_area_that_is_Greenspace,_2005",Total_carbon_emissions_(2014),"Household_Waste_Recycling_Rate,_2014/15","Number_of_cars,_(2011_Census)","Number_of_cars_per_household,_(2011_Census)","%_of_adults_who_cycle_at_least_once_per_month,_2014/15","Average_Public_Transport_Accessibility_score,_2014","Achievement_of_5_or_more_A*-_C_grades_at_GCSE_or_equivalent_including_English_and_Maths,_2013/14",Rates_of_Children_Looked_After_(2016),%_of_pupils_whose_first_language_is_not_English_(2015),%_children_living_in_out-of-work_households_(2015),"Male_life_expectancy,_(2012-14)","Female_life_expectancy,_(2012-14)",Teenage_conception_rate_(2014),Life_satisfaction_score_2011-14_(out_of_10),Worthwhileness_score_2011-14_(out_of_10),Happiness_score_2011-14_(out_of_10),Anxiety_score_2011-14_(out_of_10),Childhood_Obesity_Prevalance_(%)_2015/16,People_aged_17+_with_diabetes_(%),Mortality_rate_from_causes_considered_preventable_2012/14,Political_control_in_council,Proportion_of_seats_won_by_Conservatives_in_2014_election,Proportion_of_seats_won_by_Labour_in_2014_election,Proportion_of_seats_won_by_Lib_Dems_in_2014_election,Turnout_at_2014_local_elections
0,E09000001,City of London,Inner London,8800,5326,290,30.3,43.2,11.4,73.1,15.5,-7,665,30,.,United States,2.8,France,2.0,Australia,1.9,27.5,17.1,975,152.2,India,France,United States,64.6,.,.,.,1.6,.,3.4,.,.,.,.,.,.,"£63,620",.,500400,3.4,84.3,26130,64.3,.,12.3,.,799999,931.2,80,.,.,.,.,4.8,1036,34.4,1692,0.4,16.9,7.9,78.6,101,.,7.9,.,.,.,6.6,7.1,6.0,5.6,,2.6,129,.,.,.,.,.
1,E09000002,Barking and Dagenham,Outer London,209000,78188,3611,57.9,32.9,27.2,63.1,9.7,-1176,2509,2356,37.8,Nigeria,4.7,India,2.3,Pakistan,2.3,49.5,18.7,7538,59.1,Romania,Bulgaria,Lithuania,65.8,75.6,56.5,11,4.5,5.7,10.5,17.2,11.3,32.2,27886,30104,24602,"£29,420",20.5,58900,21.1,0.5,6560,73.0,83.4,3.0,13.7,243500,1354.03,730,16.4,27.4,35.9,20.3,33.6,644,23.4,56966,0.8,8.8,3.0,58.0,69,41.7,18.7,77.6,82.1,32.4,7.1,7.6,7.1,3.1,28.5,7.3,228,Lab,0,100,0,36.5
2,E09000003,Barnet,Outer London,389600,151423,8675,44.9,37.3,21.1,64.9,14.0,-3379,5407,2757,35.2,India,3.1,Poland,2.4,Iran,2.0,38.7,23.4,13094,53.1,Romania,Poland,Italy,68.5,74.5,62.9,8.5,1.9,2.5,6.2,14.9,5.2,49,33443,36475,31235,"£40,530",33.2,167300,18.7,0.7,26190,73.8,62.7,1.6,11.1,445000,1397.07,1460,32.4,25.2,11.1,31.1,41.3,1415,38.0,144717,1.1,7.4,3.0,67.3,35,46,9.3,82.1,85.1,12.8,7.5,7.8,7.4,2.8,20.7,6.0,134,Cons,50.8,.,1.6,40.5
3,E09000004,Bexley,Outer London,244300,97736,6058,40.3,39.0,20.6,62.9,16.6,413,760,1095,16.1,Nigeria,2.6,India,1.5,Ireland,0.9,21.4,6.0,2198,14.4,Romania,Poland,Nigeria,75.1,82.1,68.5,7.6,2.9,3.4,6.8,15.9,10.8,33.5,34350,37881,28924,"£36,990",22.1,80700,15.9,0.6,9075,73.5,51.8,2.3,11.8,275000,1472.43,-130,38.1,35.3,15.2,11.4,31.7,975,54.0,108507,1.2,10.6,2.6,60.3,46,32.6,12.6,80.4,84.4,19.5,7.4,7.7,7.2,3.3,22.7,6.9,164,Cons,71.4,23.8,0,39.6
4,E09000005,Brent,Outer London,332100,121048,4323,76.8,35.6,20.9,67.8,11.3,-7739,7640,3372,53.9,India,9.2,Poland,3.4,Ireland,2.9,64.9,37.2,22162,100.9,Romania,Italy,Portugal,69.5,76,62.6,7.5,3.1,2.6,8.3,17.7,6.2,45.1,29812,30129,29600,"£32,140",17.3,133600,17.6,0.6,15745,74.4,78.8,1.8,12.1,407250,1377.24,1050,22.2,22.6,20.4,34.8,21.9,1175,35.2,87802,0.8,7.9,3.7,60.1,45,37.6,13.7,80.1,85.1,18.5,7.3,7.4,7.2,2.9,24.3,7.9,169,Lab,9.5,88.9,1.6,36.3


Clean Data

In [6]:
inner= inner[ ['Code', 'Area_name',"Inner/_Outer_London"] ]
inner = inner.rename(columns = {'Area_name':'Borough'})
inner.head()

Unnamed: 0,Code,Borough,Inner/_Outer_London
0,E09000001,City of London,Inner London
1,E09000002,Barking and Dagenham,Outer London
2,E09000003,Barnet,Outer London
3,E09000004,Bexley,Outer London
4,E09000005,Brent,Outer London


Merge Data

In [7]:
df2 = pd.merge(df, inner, on='Borough', how='left')
df2.head()

Unnamed: 0,Borough,Co-ordinates,Code,Inner/_Outer_London
0,Barking and Dagenham,".mw-parser-output .geo-default,.mw-parser-outp...",,
1,Barnet,51°37′31″N 0°09′06″W﻿ / ﻿51.6252°N 0.1517°W,E09000003,Outer London
2,Bexley,51°27′18″N 0°09′02″E﻿ / ﻿51.4549°N 0.1505°E,E09000004,Outer London
3,Brent,51°33′32″N 0°16′54″W﻿ / ﻿51.5588°N 0.2817°W,E09000005,Outer London
4,Bromley,51°24′14″N 0°01′11″E﻿ / ﻿51.4039°N 0.0198°E,E09000006,Outer London
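
Row 0 of the merged table has no Code, although the CSV does contain Barking and Dagenham. One plausible cause, which we have not verified here, is whitespace left in the name after the footnote marker was removed. The sketch below uses two toy frames (not the real data) to show how the `indicator` option of `pd.merge` lists the rows that found no partner, and how stripping the key repairs the match:

```python
import pandas as pd

# Toy frames standing in for df and inner; the trailing space is deliberate.
left = pd.DataFrame({"Borough": ["Barking and Dagenham ", "Barnet"]})
right = pd.DataFrame({
    "Borough": ["Barking and Dagenham", "Barnet"],
    "Inner/_Outer_London": ["Outer London", "Outer London"],
})

# indicator=True adds a _merge column; in a left join it is "both" or "left_only".
check = pd.merge(left, right, on="Borough", how="left", indicator=True)
unmatched = check.loc[check["_merge"] == "left_only", "Borough"].tolist()
print(unmatched)

# Normalising whitespace on the key before merging removes the mismatch.
left["Borough"] = left["Borough"].str.strip()
fixed = pd.merge(left, right, on="Borough", how="left", indicator=True)
print((fixed["_merge"] == "both").all())
```

Running the same check on the real frames would show whether any Inner London borough is lost before the filter in the next step.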


Keep only Inner London

In [8]:
df2 = df2.loc[df2['Inner/_Outer_London']=="Inner London"]
df2 = df2.reset_index(drop=True)
df2.head()

Unnamed: 0,Borough,Co-ordinates,Code,Inner/_Outer_London
0,Camden,51°31′44″N 0°07′32″W﻿ / ﻿51.5290°N 0.1255°W,E09000007,Inner London
1,Hackney,51°32′42″N 0°03′19″W﻿ / ﻿51.5450°N 0.0553°W,E09000012,Inner London
2,Haringey,51°36′00″N 0°06′43″W﻿ / ﻿51.6000°N 0.1119°W,E09000014,Inner London
3,Islington,51°32′30″N 0°06′08″W﻿ / ﻿51.5416°N 0.1022°W,E09000019,Inner London
4,Kensington and Chelsea,51°30′07″N 0°11′41″W﻿ / ﻿51.5020°N 0.1947°W,E09000020,Inner London


### 3. Adding Lat Lon coordinates to separate columns

In [9]:
coordinates = df2['Co-ordinates'].str.strip('()')                               \
                   .str.split('/', expand=True)                   \
                   .rename(columns={0:'coordinate 1', 1:'coordinate 2'})
coordinates.head()

Unnamed: 0,coordinate 1,coordinate 2
0,51°31′44″N 0°07′32″W﻿,﻿51.5290°N 0.1255°W
1,51°32′42″N 0°03′19″W﻿,﻿51.5450°N 0.0553°W
2,51°36′00″N 0°06′43″W﻿,﻿51.6000°N 0.1119°W
3,51°32′30″N 0°06′08″W﻿,﻿51.5416°N 0.1022°W
4,51°30′07″N 0°11′41″W﻿,﻿51.5020°N 0.1947°W


In [10]:
coordinates = coordinates.drop(columns = ['coordinate 2'])
coordinates = coordinates.replace (' ',',', regex=True)
coordinates = coordinates['coordinate 1'].str.strip('()')                               \
                   .str.split(',', expand=True)                   \
                   .rename(columns={0:'Latitude',1:'Longitude', 2:'1'})
coordinates.head()

Unnamed: 0,Latitude,Longitude,1
0,51°31′44″N,0°07′32″W﻿,
1,51°32′42″N,0°03′19″W﻿,
2,51°36′00″N,0°06′43″W﻿,
3,51°32′30″N,0°06′08″W﻿,
4,51°30′07″N,0°11′41″W﻿,


Convert Lat/Lon format

In [11]:
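# capture degrees (d), minutes (m) and seconds (s) from a string such as 51°31′44″N
pattern = r'(?P<d>[\d\.]+).*?(?P<m>[\d\.]+).*?(?P<s>[\d\.]+)'

# latitude: the N/S letter is not read, so every point is treated as north
dms = coordinates['Latitude'].str.extract(pattern).astype(float)
coordinates['LATITUDE'] = dms['d'] + dms['m'].div(60) + dms['s'].div(3600)

# Same conversion for the longitude. The result is always negated, which
# treats every point as west of Greenwich; the E/W letter is not read.
dms = coordinates['Longitude'].str.extract(pattern).astype(float)
coordinates['LONGITUDE'] = (dms['d'] + dms['m'].div(60) + dms['s'].div(3600)) * -1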

coordinates

Unnamed: 0,Latitude,Longitude,1,LATITUDE,LONGITUDE
0,51°31′44″N,0°07′32″W﻿,,51.528889,-0.125556
1,51°32′42″N,0°03′19″W﻿,,51.545,-0.055278
2,51°36′00″N,0°06′43″W﻿,,51.6,-0.111944
3,51°32′30″N,0°06′08″W﻿,,51.541667,-0.102222
4,51°30′07″N,0°11′41″W﻿,,51.501944,-0.194722
5,51°27′39″N,0°06′59″W﻿,,51.460833,-0.116389
6,51°26′43″N,0°01′15″W﻿,,51.445278,-0.020833
7,51°30′28″N,0°02′49″E﻿,,51.507778,-0.046944
8,51°30′13″N,0°04′49″W﻿,,51.503611,-0.080278
9,51°30′36″N,0°00′21″W﻿,,51.51,-0.005833
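
In the table above, row 7 reads 0°02′49″E but its LONGITUDE is negative, because the conversion negates every longitude. A hemisphere-aware conversion could look like the sketch below. The function name `dms_to_decimal` is ours, and the pattern assumes the degree/minute/second layout shown in the table:

```python
import re

# degrees, minutes, seconds (optionally fractional) and a hemisphere letter,
# separated by any non-digit symbols such as ° ′ ″
DMS = re.compile(r"(\d+)\D+(\d+)\D+(\d+(?:\.\d+)?)\D*([NSEW])")


def dms_to_decimal(text: str) -> float:
    """Convert a string like 0°02′49″E to signed decimal degrees."""
    match = DMS.search(text)
    if match is None:
        raise ValueError(f"not a DMS coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and West are negative; North and East are positive.
    return -value if hemisphere in "SW" else value


for sample in ["51°31′44″N", "0°07′32″W", "0°02′49″E"]:
    print(sample, round(dms_to_decimal(sample), 6))
```

Applied column-wise, for example with `coordinates['Longitude'].map(dms_to_decimal)`, this would keep eastern points on the positive side.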


Let's build one table with neighborhoods and coordinates.

In [12]:
df2=pd.concat([df2, coordinates], axis=1)
df2.head()

Unnamed: 0,Borough,Co-ordinates,Code,Inner/_Outer_London,Latitude,Longitude,1,LATITUDE,LONGITUDE
0,Camden,51°31′44″N 0°07′32″W﻿ / ﻿51.5290°N 0.1255°W,E09000007,Inner London,51°31′44″N,0°07′32″W﻿,,51.528889,-0.125556
1,Hackney,51°32′42″N 0°03′19″W﻿ / ﻿51.5450°N 0.0553°W,E09000012,Inner London,51°32′42″N,0°03′19″W﻿,,51.545,-0.055278
2,Haringey,51°36′00″N 0°06′43″W﻿ / ﻿51.6000°N 0.1119°W,E09000014,Inner London,51°36′00″N,0°06′43″W﻿,,51.6,-0.111944
3,Islington,51°32′30″N 0°06′08″W﻿ / ﻿51.5416°N 0.1022°W,E09000019,Inner London,51°32′30″N,0°06′08″W﻿,,51.541667,-0.102222
4,Kensington and Chelsea,51°30′07″N 0°11′41″W﻿ / ﻿51.5020°N 0.1947°W,E09000020,Inner London,51°30′07″N,0°11′41″W﻿,,51.501944,-0.194722


In [13]:
df2 = df2.drop(columns = ['Co-ordinates','1', 'Latitude', 'Longitude'])
df2.head(12)

Unnamed: 0,Borough,Code,Inner/_Outer_London,LATITUDE,LONGITUDE
0,Camden,E09000007,Inner London,51.528889,-0.125556
1,Hackney,E09000012,Inner London,51.545,-0.055278
2,Haringey,E09000014,Inner London,51.6,-0.111944
3,Islington,E09000019,Inner London,51.541667,-0.102222
4,Kensington and Chelsea,E09000020,Inner London,51.501944,-0.194722
5,Lambeth,E09000022,Inner London,51.460833,-0.116389
6,Lewisham,E09000023,Inner London,51.445278,-0.020833
7,Newham,E09000025,Inner London,51.507778,-0.046944
8,Southwark,E09000028,Inner London,51.503611,-0.080278
9,Tower Hamlets,E09000030,Inner London,51.51,-0.005833


Let's rename the Borough column to Neighborhood.

In [14]:
df2 =df2.rename(columns= {'Borough': 'Neighborhood'})

### 4. Create a map of London

In [15]:
!pip install folium
import folium # map rendering library
!pip install geopy 
from geopy.geocoders import Nominatim 



#### Use geopy library to get the latitude and longitude values of London.

In [16]:
address = 'London, UK'

geolocator = Nominatim(user_agent="london_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of London are {}, {}.'.format(latitude, longitude))

The geographical coordinates of London are 51.5073219, -0.1276474.


In [17]:
# create map of London using latitude and longitude values
map_london = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, neighborhood in zip(df2['LATITUDE'], df2['LONGITUDE'], df2['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_london)  
    
map_london

In [18]:
# @hidden_cell
import os

# credentials are read from environment variables instead of being written in the notebook
# (the variable names are our choice, not a Foursquare requirement)
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

import json # library to handle JSON files
from pandas import json_normalize # transform JSON file into a pandas dataframe

Explore Inner London

In [19]:
# create a function to repeat the same process
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
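
getNearbyVenues indexes straight into the JSON response, so a location with no result groups or a venue without a category would raise an error. The sketch below moves the parsing step into a hypothetical helper, `extract_venues`, and runs it on a hand-written payload that only mimics the fields used above. It is not a real Foursquare response, and the second venue is invented to show the fallback:

```python
def extract_venues(payload: dict, name: str, lat: float, lng: float) -> list:
    """Flatten one explore-style payload into one tuple per venue."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        # fall back to an empty category when the list is missing or empty
        categories = venue.get("categories") or [{}]
        rows.append((
            name, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            categories[0].get("name", "Unknown"),
        ))
    return rows


# Hand-written stand-in for a response; the first venue is copied from the
# results table in this notebook, the second one is made up.
sample_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "St. Pancras Renaissance Hotel London",
               "location": {"lat": 51.529733, "lng": -0.125912},
               "categories": [{"name": "Hotel"}]}},
    {"venue": {"name": "Venue without category",
               "location": {"lat": 51.5290, "lng": -0.1260},
               "categories": []}},
]}]}}

for row in extract_venues(sample_payload, "Camden", 51.528889, -0.125556):
    print(row)
```

Inside the loop of getNearbyVenues, the list comprehension could then be replaced by a call to such a helper on `requests.get(url).json()`.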

In [20]:
London_venues = getNearbyVenues(names=df2['Neighborhood'],
                                   latitudes=df2['LATITUDE'],
                                   longitudes=df2['LONGITUDE']
                                  )

Camden
Hackney
Haringey
Islington
Kensington and Chelsea
Lambeth
Lewisham
Newham
Southwark
Tower Hamlets
Wandsworth
Westminster


#### Let's check the size of the resulting dataframe

In [21]:
print(London_venues.shape)
London_venues.head()

(625, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Camden,51.528889,-0.125556,The Sir John Ritblat Gallery: Treasures of the...,51.529666,-0.127541,Museum
1,Camden,51.528889,-0.125556,St. Pancras Renaissance Hotel London,51.529733,-0.125912,Hotel
2,Camden,51.528889,-0.125556,Pullman London St Pancras,51.528668,-0.128191,Hotel
3,Camden,51.528889,-0.125556,Origin Coffee Roasters,51.529133,-0.126618,Coffee Shop
4,Camden,51.528889,-0.125556,Pullman Hotel Breakfast Area,51.528484,-0.128126,Breakfast Spot


Let's count the number of venues by neighborhood.

In [22]:
London_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Camden,69,69,69,69,69,69
Hackney,59,59,59,59,59,59
Haringey,22,22,22,22,22,22
Islington,41,41,41,41,41,41
Kensington and Chelsea,54,54,54,54,54,54
Lambeth,100,100,100,100,100,100
Lewisham,31,31,31,31,31,31
Newham,8,8,8,8,8,8
Southwark,100,100,100,100,100,100
Tower Hamlets,22,22,22,22,22,22
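
Lambeth and Southwark return exactly 100 venues, which equals the LIMIT passed to the API, so their counts are probably truncated and should be read as lower bounds. A small sketch (the function name `capped_neighborhoods` is ours) that flags such neighborhoods from any iterable of neighborhood names:

```python
from collections import Counter

LIMIT = 100  # same value as the LIMIT constant used for the API calls


def capped_neighborhoods(neighborhoods, limit=LIMIT):
    """Return the names whose venue count reached the request limit."""
    counts = Counter(neighborhoods)
    return sorted(name for name, n in counts.items() if n >= limit)


# Stand-in for the Neighborhood column, using three counts from the table above.
sample = ["Lambeth"] * 100 + ["Southwark"] * 100 + ["Newham"] * 8
print(capped_neighborhoods(sample))
```

On the real data the same function could be called with `London_venues['Neighborhood']`.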


In [23]:
print('There are {} unique categories.'.format(London_venues['Venue Category'].nunique()))

There are 151 unique categories.


## 5. Analyze Each Neighborhood

In [24]:
# one hot encoding
London_onehot = pd.get_dummies(London_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
London_onehot['Neighborhood'] = London_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [London_onehot.columns[-1]] + list(London_onehot.columns[:-1])
London_onehot = London_onehot[fixed_columns]

London_grouped = London_onehot.groupby('Neighborhood').mean().reset_index()
London_grouped.shape

(12, 152)
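
Taking the mean of the one-hot rows per neighborhood gives the share of venues in each category, so every row of London_grouped sums to 1. The toy example below uses two made-up neighborhoods, A and B, to make that visible. The `dtype=float` argument is our addition so the dummy columns are numeric on every pandas version:

```python
import pandas as pd

# Made-up venues: four in neighborhood A, two in neighborhood B.
toy = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Pub", "Pub", "Hotel", "Park", "Hotel", "Hotel"],
})

# one column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(toy[["Venue Category"]], prefix="", prefix_sep="", dtype=float)
onehot.insert(0, "Neighborhood", toy["Neighborhood"])

# the mean of 0/1 values is the fraction of venues in each category
freq = onehot.groupby("Neighborhood").mean()
print(freq)
```

Because the values are shares rather than counts, a neighborhood with 8 venues and one with 100 weigh the same in the clustering step.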

In [25]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [26]:
num_top_venues = 15

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # only three suffixes are listed; later ranks fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = London_grouped['Neighborhood']

for ind in np.arange(London_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(London_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head(12)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
0,Camden,Café,Coffee Shop,Pub,Hotel,Burger Joint,Pizza Place,Train Station,Bakery,Breakfast Spot,Restaurant,Chocolate Shop,Deli / Bodega,Grocery Store,Park,Outdoor Sculpture
1,Hackney,Pub,Coffee Shop,Brewery,Bakery,Cocktail Bar,Modern European Restaurant,Café,Organic Grocery,Grocery Store,Clothing Store,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Hotel,Pizza Place,Bus Station
2,Haringey,Park,Playground,Italian Restaurant,Convenience Store,Portuguese Restaurant,Pub,Restaurant,Movie Theater,Fast Food Restaurant,Mediterranean Restaurant,Café,Supermarket,Bus Station,Grocery Store,Hotel
3,Islington,Pub,Bakery,Ice Cream Shop,Cocktail Bar,Theater,Boutique,Burger Joint,Indian Restaurant,Art Museum,Park,Organic Grocery,Music Venue,Mediterranean Restaurant,Liquor Store,Latin American Restaurant
4,Kensington and Chelsea,Juice Bar,Café,Hotel,Bakery,Spa,Burger Joint,French Restaurant,Restaurant,Pub,English Restaurant,Garden,Clothing Store,Italian Restaurant,Mediterranean Restaurant,Grocery Store
5,Lambeth,Caribbean Restaurant,Market,Pub,Coffee Shop,Gym / Fitness Center,Beer Bar,Pizza Place,Nightclub,Sandwich Place,Fried Chicken Joint,Food Court,Restaurant,Department Store,Cupcake Shop,Cocktail Bar
6,Lewisham,Supermarket,Grocery Store,Coffee Shop,Platform,Train Station,Italian Restaurant,Cocktail Bar,Pharmacy,Pizza Place,Portuguese Restaurant,Pub,Dessert Shop,Sandwich Place,Shopping Mall,Bus Stop
7,Newham,Pub,Bus Stop,Italian Restaurant,Grocery Store,Park,Hotel,Hostel,BBQ Joint,Event Space,Food Court,Fish Market,Fish & Chips Shop,Film Studio,Fast Food Restaurant,Farmers Market
8,Southwark,Coffee Shop,Bar,Pub,Cocktail Bar,French Restaurant,Scenic Lookout,Pizza Place,Hotel,Italian Restaurant,Museum,Park,English Restaurant,Sushi Restaurant,Indian Restaurant,Restaurant
9,Tower Hamlets,Coffee Shop,Hotel,Sandwich Place,Italian Restaurant,Boat or Ferry,Convenience Store,Outdoor Sculpture,Chinese Restaurant,Café,Bus Stop,Light Rail Station,Steakhouse,Pizza Place,Grocery Store,Harbor / Marina
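
The column labels above rely on an index error to switch from 'st', 'nd', 'rd' to 'th'. That gives correct labels for ranks 1 to 15, but it would produce "21th" if num_top_venues were raised past 20. A small helper, hypothetical and named `ordinal` here, handles the general case:

```python
def ordinal(n: int) -> str:
    """Return n with its English ordinal suffix, e.g. 1st, 2nd, 11th, 21st."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


num_top_venues = 15
columns = ["Neighborhood"] + [
    f"{ordinal(rank)} Most Common Venue" for rank in range(1, num_top_venues + 1)
]
print(columns[:4])
```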


## 6. Cluster Neighborhoods

In [27]:
# import k-means from clustering stage
from sklearn.cluster import KMeans
# set number of clusters
kclusters = 5

London_grouped_clustering = London_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(London_grouped_clustering)

In [28]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

London_merged = df2
# merge  to add latitude/longitude for each neighborhood
London_merged = London_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

London_merged.head()


Unnamed: 0,Neighborhood,Code,Inner/_Outer_London,LATITUDE,LONGITUDE,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
0,Camden,E09000007,Inner London,51.528889,-0.125556,1,Café,Coffee Shop,Pub,Hotel,Burger Joint,Pizza Place,Train Station,Bakery,Breakfast Spot,Restaurant,Chocolate Shop,Deli / Bodega,Grocery Store,Park,Outdoor Sculpture
1,Hackney,E09000012,Inner London,51.545,-0.055278,1,Pub,Coffee Shop,Brewery,Bakery,Cocktail Bar,Modern European Restaurant,Café,Organic Grocery,Grocery Store,Clothing Store,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Hotel,Pizza Place,Bus Station
2,Haringey,E09000014,Inner London,51.6,-0.111944,4,Park,Playground,Italian Restaurant,Convenience Store,Portuguese Restaurant,Pub,Restaurant,Movie Theater,Fast Food Restaurant,Mediterranean Restaurant,Café,Supermarket,Bus Station,Grocery Store,Hotel
3,Islington,E09000019,Inner London,51.541667,-0.102222,0,Pub,Bakery,Ice Cream Shop,Cocktail Bar,Theater,Boutique,Burger Joint,Indian Restaurant,Art Museum,Park,Organic Grocery,Music Venue,Mediterranean Restaurant,Liquor Store,Latin American Restaurant
4,Kensington and Chelsea,E09000020,Inner London,51.501944,-0.194722,1,Juice Bar,Café,Hotel,Bakery,Spa,Burger Joint,French Restaurant,Restaurant,Pub,English Restaurant,Garden,Clothing Store,Italian Restaurant,Mediterranean Restaurant,Grocery Store


Now we will create a map to see where each cluster is. Location is important for the final decision.

In [29]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one color per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
# (rainbow[cluster-1] maps cluster 0 to the last color; each cluster still gets its own color)
for lat, lon, poi, cluster in zip(London_merged['LATITUDE'], London_merged['LONGITUDE'], London_merged['Neighborhood'], London_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## 7. Examine Clusters

### Cluster 0:

In [30]:
London_merged.loc[London_merged['Cluster Labels'] == 0, London_merged.columns[[0] + list(range(5, London_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
3,Islington,0,Pub,Bakery,Ice Cream Shop,Cocktail Bar,Theater,Boutique,Burger Joint,Indian Restaurant,Art Museum,Park,Organic Grocery,Music Venue,Mediterranean Restaurant,Liquor Store,Latin American Restaurant


The table of venue counts by neighborhood showed that Islington has relatively few venues (41). An Art Museum appears here, but given the venue count we are not going to look deeper into this neighborhood.

### Cluster 1:

In [31]:
London_merged.loc[London_merged['Cluster Labels'] == 1, London_merged.columns[[0] + list(range(5, London_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
0,Camden,1,Café,Coffee Shop,Pub,Hotel,Burger Joint,Pizza Place,Train Station,Bakery,Breakfast Spot,Restaurant,Chocolate Shop,Deli / Bodega,Grocery Store,Park,Outdoor Sculpture
1,Hackney,1,Pub,Coffee Shop,Brewery,Bakery,Cocktail Bar,Modern European Restaurant,Café,Organic Grocery,Grocery Store,Clothing Store,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Hotel,Pizza Place,Bus Station
4,Kensington and Chelsea,1,Juice Bar,Café,Hotel,Bakery,Spa,Burger Joint,French Restaurant,Restaurant,Pub,English Restaurant,Garden,Clothing Store,Italian Restaurant,Mediterranean Restaurant,Grocery Store
5,Lambeth,1,Caribbean Restaurant,Market,Pub,Coffee Shop,Gym / Fitness Center,Beer Bar,Pizza Place,Nightclub,Sandwich Place,Fried Chicken Joint,Food Court,Restaurant,Department Store,Cupcake Shop,Cocktail Bar
8,Southwark,1,Coffee Shop,Bar,Pub,Cocktail Bar,French Restaurant,Scenic Lookout,Pizza Place,Hotel,Italian Restaurant,Museum,Park,English Restaurant,Sushi Restaurant,Indian Restaurant,Restaurant
9,Tower Hamlets,1,Coffee Shop,Hotel,Sandwich Place,Italian Restaurant,Boat or Ferry,Convenience Store,Outdoor Sculpture,Chinese Restaurant,Café,Bus Stop,Light Rail Station,Steakhouse,Pizza Place,Grocery Store,Harbor / Marina
10,Wandsworth,1,Coffee Shop,Pub,Pharmacy,Breakfast Spot,Clothing Store,Indian Restaurant,Supermarket,Asian Restaurant,Gym / Fitness Center,Café,Portuguese Restaurant,Sandwich Place,Sporting Goods Shop,Stationery Store,Chaat Place
11,Westminster,1,Coffee Shop,Hotel,Italian Restaurant,Sandwich Place,Pub,Theater,Restaurant,Sushi Restaurant,Sporting Goods Shop,Gym / Fitness Center,Hotel Bar,Juice Bar,Korean Restaurant,Café,Camera Store


This cluster is the one with the greatest number of neighborhoods. 
Categories we can find:
-	Hotel
-	Coffee Shop
-	Pub
-	Restaurants
-	Bus Stop / Train Station
-	Museum
-	Park

We will not look deeper into Hackney and Tower Hamlets, because they are too far from downtown.
Camden and Southwark could be a good fit.
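
The category summary above was read off the table by eye. As a rough sketch of how the same check could be scripted, the snippet below lists which neighborhoods mention a given category among their most common venues. The helper name `category_coverage` is hypothetical, and the small frame only copies three rows and the first five venue columns from the Cluster 1 table, so it is an illustration and not the full analysis.

```python
import pandas as pd

# Three rows and the first five venue columns, copied from the Cluster 1 table above.
cluster1_sample = pd.DataFrame(
    {
        "Neighborhood": ["Camden", "Southwark", "Wandsworth"],
        "1st Most Common Venue": ["Café", "Coffee Shop", "Coffee Shop"],
        "2nd Most Common Venue": ["Coffee Shop", "Bar", "Pub"],
        "3rd Most Common Venue": ["Pub", "Pub", "Pharmacy"],
        "4th Most Common Venue": ["Hotel", "Cocktail Bar", "Breakfast Spot"],
        "5th Most Common Venue": ["Burger Joint", "French Restaurant", "Clothing Store"],
    }
)


def category_coverage(frame, keyword):
    """Return the neighborhoods whose venue columns mention `keyword` (hypothetical helper)."""
    venue_cols = [c for c in frame.columns if c.endswith("Most Common Venue")]
    # One boolean column per venue rank, then "any" across the row.
    mask = frame[venue_cols].apply(
        lambda col: col.str.contains(keyword, case=False, regex=False)
    ).any(axis=1)
    return frame.loc[mask, "Neighborhood"].tolist()


print(category_coverage(cluster1_sample, "Hotel"))
print(category_coverage(cluster1_sample, "Pub"))
```

Applied to the full `London_merged` frame, the same function would cover all 15 venue columns.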


### Cluster 2:

In [32]:
London_merged.loc[London_merged['Cluster Labels'] == 2, London_merged.columns[[0] + list(range(5, London_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
7,Newham,2,Pub,Bus Stop,Italian Restaurant,Grocery Store,Park,Hotel,Hostel,BBQ Joint,Event Space,Food Court,Fish Market,Fish & Chips Shop,Film Studio,Fast Food Restaurant,Farmers Market


This cluster shows no attractions. In addition, the table with the count of venues by neighborhood shows that Newham had only 8 venues, so we will not look deeper into this neighborhood.

### Cluster 3:

In [33]:
London_merged.loc[London_merged['Cluster Labels'] == 3, London_merged.columns[[0] + list(range(5, London_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
6,Lewisham,3,Supermarket,Grocery Store,Coffee Shop,Platform,Train Station,Italian Restaurant,Cocktail Bar,Pharmacy,Pizza Place,Portuguese Restaurant,Pub,Dessert Shop,Sandwich Place,Shopping Mall,Bus Stop


This cluster contains only one neighborhood: Lewisham. We don’t see any attractions among the categories.
On the map, this neighborhood is too far from downtown for our hotel. For these reasons, we will not look deeper into it.


### Cluster 4

In [34]:
London_merged.loc[London_merged['Cluster Labels'] == 4, London_merged.columns[[0] + list(range(5, London_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
2,Haringey,4,Park,Playground,Italian Restaurant,Convenience Store,Portuguese Restaurant,Pub,Restaurant,Movie Theater,Fast Food Restaurant,Mediterranean Restaurant,Café,Supermarket,Bus Station,Grocery Store,Hotel


This cluster contains only one neighborhood: Haringey. We don’t see any attractions among the categories.
On the map, this neighborhood is too far from downtown for our hotel. For these reasons, we will not look deeper into it.
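
The "too far from downtown" judgement was made by looking at the map. A minimal sketch of how it could be quantified with a great-circle (haversine) distance is shown below. The reference point (approximate Big Ben coordinates) and the function name `haversine_km` are assumptions, not part of the original analysis. The two sample coordinates are the ones this notebook prints for Camden and Southwark in the sections that follow.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Assumption: approximate coordinates of Big Ben, used as the "downtown" reference.
REFERENCE_POINT = (51.5007, -0.1246)


def haversine_km(point_a, point_b):
    """Great-circle distance in kilometers between two (latitude, longitude) pairs."""
    lat1, lon1 = map(radians, point_a)
    lat2, lon2 = map(radians, point_b)
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Coordinates as printed by the notebook for the two candidate neighborhoods.
neighborhoods = {
    "Camden": (51.528888888888886, -0.12555555555555556),
    "Southwark": (51.50361111111111, -0.08027777777777778),
}

distances = {
    name: haversine_km(coords, REFERENCE_POINT) for name, coords in neighborhoods.items()
}
for name, km in distances.items():
    print(f"{name}: {km:.1f} km from the reference point")
```

The same function could be applied to every row of `df2` to rank all neighborhoods by distance. The cutoff for "too far" would still be a judgement call.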


**Two neighborhoods, Camden and Southwark, could be a good fit for our hotel. We will look deeper at the venues near them.**

## Camden Neighborhood

In [35]:
neighborhood_latitude = df2.loc[0, 'LATITUDE'] # neighborhood latitude value
neighborhood_longitude = df2.loc[0, 'LONGITUDE'] # neighborhood longitude value

neighborhood_name = df2.loc[0, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Camden are 51.528888888888886, -0.12555555555555556.


#### Now, let's get the top 100 venues that are in Camden within a radius of 500 meters.

In [40]:
LIMIT = 100
radius=500
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, neighborhood_latitude, neighborhood_longitude, VERSION, radius, LIMIT)

results = requests.get(url).json()
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head(10)



Unnamed: 0,name,categories,lat,lng
0,The Sir John Ritblat Gallery: Treasures of the...,Museum,51.529666,-0.127541
1,St. Pancras Renaissance Hotel London,Hotel,51.529733,-0.125912
2,Pullman London St Pancras,Hotel,51.528668,-0.128191
3,Origin Coffee Roasters,Coffee Shop,51.529133,-0.126618
4,Pullman Hotel Breakfast Area,Breakfast Spot,51.528484,-0.128126
5,Pitted Olive,Turkish Restaurant,51.526369,-0.125623
6,Half Cup,Café,51.527838,-0.124951
7,Ladurée,Dessert Shop,51.530268,-0.125734
8,Fortnum & Mason,Gift Shop,51.530541,-0.125525
9,Patisserie Deux Amis,Coffee Shop,51.526798,-0.124189


With Foursquare, I extracted the top 100 venues that are in Camden within a radius of 500 meters.
In the neighborhood, we can find:
-	Museums
-	Train stations
-	Garden
-	Art Gallery
-	Sculpture
-	Parks
-	Restaurants
-	Coffee Shops

This neighborhood has many tourist attractions, and its train stations give access to downtown.
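
The list above summarizes the categories by reading the venue table. A short sketch of how the categories could be tallied instead is given below. It uses only the ten Camden rows displayed above (not the full result set), and the variable names are illustrative.

```python
import pandas as pd

# Categories of the ten Camden venues displayed in the table above.
camden_sample = pd.DataFrame(
    {
        "categories": [
            "Museum", "Hotel", "Hotel", "Coffee Shop", "Breakfast Spot",
            "Turkish Restaurant", "Café", "Dessert Shop", "Gift Shop", "Coffee Shop",
        ]
    }
)

# Number of venues per category, most frequent first.
category_counts = camden_sample["categories"].value_counts()
print(category_counts)
```

Running the same `value_counts()` call on the full `nearby_venues` frame would give the tally for all returned venues.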


## Southwark Neighborhood

In [37]:
neighborhood_latitude2 = df2.loc[8, 'LATITUDE'] # neighborhood latitude value
neighborhood_longitude2 = df2.loc[8, 'LONGITUDE'] # neighborhood longitude value

neighborhood_name2 = df2.loc[8, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name2,
                                                               neighborhood_latitude2,
                                                               neighborhood_longitude2))

Latitude and longitude values of Southwark are 51.50361111111111, -0.08027777777777778.


#### Now, let's get the top 100 venues that are in Southwark within a radius of 500 meters.

In [41]:
LIMIT = 100
radius=500
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, neighborhood_latitude2, neighborhood_longitude2, VERSION, radius, LIMIT)

results = requests.get(url).json()
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head(10)



Unnamed: 0,name,categories,lat,lng
0,The Queen's Walk,Scenic Lookout,51.505237,-0.079039
1,The LaLit,Hotel,51.503306,-0.078623
2,Nine Lives,Cocktail Bar,51.503343,-0.082054
3,M&S Simply Food,Grocery Store,51.505117,-0.083103
4,Restaurant Story,Restaurant,51.502856,-0.077776
5,Bridge Theatre,Theater,51.504089,-0.077293
6,Unicorn Theatre,Theater,51.503995,-0.080991
7,Flat Iron,Steakhouse,51.504209,-0.082254
8,The Ivy Tower Bridge,English Restaurant,51.503491,-0.076883
9,London Riviera,Bar,51.505329,-0.079312
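
The Camden and Southwark cells repeat the same flatten-and-clean steps. Below is a sketch of how those steps could be shared in one function, assuming a pandas version that provides `pd.json_normalize` (1.0 or later). The name `venues_to_frame` is hypothetical, and the two sample items are invented placeholders shaped like the `results['response']['groups'][0]['items']` list used above. They are not real Foursquare data.

```python
import pandas as pd


def venues_to_frame(items):
    """Flatten Foursquare 'explore' items into name, categories, lat, lng columns.

    Hypothetical helper; expects a non-empty list shaped like
    results['response']['groups'][0]['items'].
    """
    frame = pd.json_normalize(items)  # nested dicts become dotted column names
    columns = ["venue.name", "venue.categories", "venue.location.lat", "venue.location.lng"]
    frame = frame.loc[:, columns].copy()
    # Keep only the name of the first category, or None when the list is empty.
    frame["venue.categories"] = frame["venue.categories"].apply(
        lambda cats: cats[0]["name"] if cats else None
    )
    # 'venue.location.lat' -> 'lat', and so on.
    frame.columns = [col.split(".")[-1] for col in frame.columns]
    return frame


# Invented placeholder items, only to show the expected input shape.
sample_items = [
    {"venue": {"name": "Example Hotel", "categories": [{"name": "Hotel"}],
               "location": {"lat": 51.5, "lng": -0.1}}},
    {"venue": {"name": "Uncategorized Place", "categories": [],
               "location": {"lat": 51.6, "lng": -0.2}}},
]

sample_frame = venues_to_frame(sample_items)
print(sample_frame)
```

With such a helper, each neighborhood section would only need to build its own URL and pass the returned items to the function.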


With Foursquare, I extracted the top 100 venues that are in Southwark within a radius of 500 meters.
In the neighborhood, we can find:
-	Scenic Lookout (The Queen’s Walk, Tower of London Riverside Walk, etc.)
-	Tower Bridge Piazza
-	Theaters
-	Tour provider
-	Art Gallery
-	Art Museum
-	Restaurants
-	Parks
-	Garden

This neighborhood has many tourist attractions, but no bus or train station appears among the venues. At most, we can see a tour provider.
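
The transport check above was done by scanning the venue list. A minimal sketch of how it could be automated is shown below. The keyword set is an assumption built from the transport categories that appear elsewhere in this notebook (Bus Stop, Bus Station, Train Station, Light Rail Station, Platform), the function name `transport_categories` is hypothetical, and the input is only the ten Southwark rows displayed above.

```python
import re

# Assumption: words that mark a category as public transportation.
TRANSPORT_WORDS = {"bus", "train", "rail", "platform"}


def transport_categories(categories, words=TRANSPORT_WORDS):
    """Return the sorted, de-duplicated categories that contain a transport word."""
    matches = set()
    for category in categories:
        if not category:
            continue  # skip venues without a category
        # Compare whole words so that e.g. "Business Center" does not match "bus".
        category_words = set(re.findall(r"[a-z]+", category.lower()))
        if category_words & words:
            matches.add(category)
    return sorted(matches)


# Categories of the ten Southwark venues displayed in the table above.
southwark_sample = [
    "Scenic Lookout", "Hotel", "Cocktail Bar", "Grocery Store", "Restaurant",
    "Theater", "Theater", "Steakhouse", "English Restaurant", "Bar",
]

print(transport_categories(southwark_sample))
print(transport_categories(["Train Station", "Pub", "Bus Stop", None]))
```

Run on the complete `nearby_venues['categories']` column, an empty result would support the observation that no station was returned for this location.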


# Results and Discussions
Our analysis shows that Inner London is composed of 12 neighborhoods, which have many similarities.
After clustering, we realized that all neighborhoods had many restaurants, so the number of restaurants could not be a selection criterion for the location of our new hotel.
For a hotel to attract tourists, it must be close to tourist attractions and to bus or train stations. Driving and parking in London are difficult, so trains and buses are the best options for guests whose hotel is not downtown.
Clustering alone did not give enough information to answer the question: where should the new hotel be installed?
So I looked deeper into the two neighborhoods that could be a good place: Camden and Southwark.
Because no bus or train station appeared among the Southwark venues, I would recommend Camden as the place for the new hotel.


# Conclusions
The purpose of this project was to identify the best place to install a new hotel in London.
At the beginning, we decided to look at Inner London for two reasons:
-	There are too many hotels downtown.
-	Buildings there are too expensive.

By counting the venues in the Foursquare data, we first identified the most active neighborhoods. These neighborhoods were then clustered to create a map showing which ones satisfied our criteria. Two neighborhoods were then selected.
After further analysis of these neighborhoods within a radius of 500 m, only one met all the necessary criteria.
Based on these analyses, I recommend Camden for the installation of the new hotel.
More analysis will be necessary to determine exactly where in the neighborhood to install it.
