# Capstone Project - The Battle of Neighborhoods (Week 1-2)

## Business Problem section 

#### Background

According to Bloomberg News, the London Housing Market is in a rut. It is now facing a number of different headwinds, including the prospect of higher taxes and a warning from the Bank of England that U.K. home values could fall as much as 30 percent in the event of a disorderly exit from the European Union. More specifically, four overlooked cracks suggest that the London market may be in worse shape than many realize: hidden price falls, record-low sales, homebuilder exodus and tax hikes addressing overseas buyers of homes in England and Wales. 

#### Business Problem

In this scenario, it is urgent to adopt machine learning tools in order to assist homebuyers clientele in London to make wise and effective decisions. As a result, the business problem we are currently posing is: how could we provide support to homebuyers clientele in to purchase a suitable real estate in London in this uncertain economic and financial scenario? 

To solve this business problem, we are going to cluster London neighborhoods in order to recommend venues and the current average price of real estate where homebuyers can make a real estate investment. We will recommend profitable venues according to amenities and essential facilities surrounding such venues i.e. elementary schools, high schools, hospitals & grocery stores.




##  Data section

Data on London properties and the relative price paid data were extracted from the HM Land Registry (http://landregistry.data.gov.uk/). The following fields comprise the address data included in Price Paid Data: Postcode; PAON Primary Addressable Object Name. Typically the house number or name; SAON Secondary Addressable Object Name. If there is a sub-building, for example, the building is divided into flats, there will be a SAON; Street; Locality; Town/City; District; County.

To explore and target recommended locations across different venues according to the presence of amenities and essential facilities, we will access data through FourSquare API interface and arrange them as a dataframe for visualization. By merging data on London properties and the relative price paid data from the HM Land Registry and data on amenities and essential facilities surrounding such properties from FourSquare API interface, we will be able to recommend profitable real estate investments.


## Methodology section

The Methodology section will describe the main components of our analysis and predication system.
The Methodology section comprises four stages: 
    1. Collect Inspection Data
    2. Explore and Understand Data
    3. Data preparation and preprocessing 
    4. Modeling

#### 1. Collect Inspection Data

After importing the necessary libraries, we download the data from the HM Land Registry website as follows:

In [1]:
import os # Operating System
import numpy as np
import pandas as pd
import datetime as dt # Datetime
import json # library to handle JSON files

!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

!conda install -c conda-forge folium=0.5.0 --yes
import folium #import folium # map rendering library

print('Libraries imported.')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    certifi-2020.4.5.1         |   py36h9f0ad1d_0         151 KB  conda-forge
    geographiclib-1.50         |             py_0          34 KB  conda-forge
    openssl-1.1.1g             |       h516909a_0         2.1 MB  conda-forge
    geopy-1.22.0               |     pyh9f0ad1d_0          63 KB  conda-forge
    python_abi-3.6             |          1_cp36m           4 KB  conda-forge
    ca-certificates-2020.4.5.1 |       hecc5488_0         146 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.5 MB

The following NEW packages will be INSTALLED:

    geographiclib:   1.50-py_0           conda-forge
    geopy:          

In [2]:
#Read the data for examination (Source: http://landregistry.data.gov.uk/)
df_ppd = pd.read_csv("http://prod2.publicdata.landregistry.gov.uk.s3-website-eu-west-1.amazonaws.com/pp-2018.csv")

Before using data, we will have to explore and understand it.

#### 2. Explore and Understand Data

We read the dataset that we collected from the HM Land Registry website into a pandas' data frame and display the first five rows of it as follows: 

In [3]:
df_ppd.head(5) 

Unnamed: 0,{666758D7-43A9-3363-E053-6B04A8C0D74E},405000,2018-01-25 00:00,WR15 8LH,D,N,F,RAMBLERS WAY,Unnamed: 8,Unnamed: 9,BORASTON,TENBURY WELLS,SHROPSHIRE,SHROPSHIRE.1,A,A.1
0,{666758D7-43AA-3363-E053-6B04A8C0D74E},315000,2018-01-23 00:00,SY7 8QA,D,N,F,MONT CENISE,,,CLUN,CRAVEN ARMS,SHROPSHIRE,SHROPSHIRE,A,A
1,{666758D7-43AD-3363-E053-6B04A8C0D74E},165000,2018-01-19 00:00,SY1 2BF,T,Y,F,42,,PENSON WAY,,SHREWSBURY,SHROPSHIRE,SHROPSHIRE,A,A
2,{666758D7-43B0-3363-E053-6B04A8C0D74E},370000,2018-01-22 00:00,SY8 4DF,D,N,F,WILLOW HEY,,,ASHFORD CARBONEL,LUDLOW,SHROPSHIRE,SHROPSHIRE,A,A
3,{666758D7-43B3-3363-E053-6B04A8C0D74E},320000,2018-01-19 00:00,TF10 7ET,D,N,F,3,,PRINCESS GARDENS,,NEWPORT,WREKIN,WREKIN,A,A
4,{666758D7-43B4-3363-E053-6B04A8C0D74E},180000,2018-01-31 00:00,SY3 0NQ,S,N,F,79,,LYTHWOOD ROAD,BAYSTON HILL,SHREWSBURY,SHROPSHIRE,SHROPSHIRE,A,A


In [4]:
df_ppd.shape

(1029799, 16)

Our dataset consists of over 700000 rows and 16 columns. We will now prepare and preprocess data accordingly. 

#### 3. Data preparation and preprocessing

At this stage, we prepare our dataset for the modeling process, opting for the most suitable machine learning algorithm for our scope. Accordingly, we perform the following steps: 
- Rename the column names 
- Format the date column 
- Sort data by date of sale 
- Select data only for the city of London 
- Make a list of street names in London 
- Calculate the street-wise average price of the property
- Read the street-wise coordinates into a data frame, eliminating recurring word London from individual names 
- Join the data to find the coordinates of locations which fit into client's budget
- Plot recommended locations on London map along with current market prices






In [5]:
# Assign meaningful column names
df_ppd.columns = ['TUID', 'Price', 'Date_Transfer', 'Postcode', 'Prop_Type', 'Old_New', 'Duration', 'PAON', \
                  'SAON', 'Street', 'Locality', 'Town_City', 'District', 'County', 'PPD_Cat_Type', 'Record_Status']

In [6]:
# Format the date column
df_ppd['Date_Transfer'] = df_ppd['Date_Transfer'].apply(pd.to_datetime)

# Delete all obsolete transactions which were done before 2016
df_ppd.drop(df_ppd[df_ppd.Date_Transfer.dt.year < 2016].index, inplace=True)

# Sort by Date of Sale
df_ppd.sort_values(by=['Date_Transfer'],ascending=[False],inplace=True)

In [7]:
df_ppd_london = df_ppd.query("Town_City == 'LONDON'")

# Make a list of street names in LONDON
streets = df_ppd_london['Street'].unique().tolist()

In [8]:
df_grp_price = df_ppd_london.groupby(['Street'])['Price'].mean().reset_index()

# Give meaningful names to the columns
df_grp_price.columns = ['Street', 'Avg_Price']

In [9]:
#Input your Budget's Upper Limit and Lower Limit - Find the locations df_grp_price which fits your budget
df_affordable = df_grp_price.query("(Avg_Price >= 2200000) & (Avg_Price <= 2500000)")

In [10]:
# Display the dataframe
df_affordable

Unnamed: 0,Street,Avg_Price
196,ALBION SQUARE,2.450000e+06
390,ANHALT ROAD,2.435000e+06
405,ANSDELL TERRACE,2.250000e+06
422,APPLEGARTH ROAD,2.400000e+06
855,BARONSMEAD ROAD,2.375000e+06
981,BEAUCLERC ROAD,2.480000e+06
1102,BELVEDERE DRIVE,2.340000e+06
1215,BICKENHALL STREET,2.208500e+06
1253,BIRCHLANDS AVENUE,2.217000e+06
1553,BRAMPTON GROVE,2.456875e+06


In [11]:
import pandas as pd
import numpy as np
import datetime as DT
import hmac
from geopy.geocoders import Nominatim
from geopy.distance import vincenty
# import k-means from clustering stage
from sklearn.cluster import KMeans

In [12]:
for index, item in df_affordable.iterrows():
    print(f"index: {index}")
    print(f"item: {item}")
    print(f"item.Street only: {item.Street}")

index: 196
item: Street       ALBION SQUARE
Avg_Price         2.45e+06
Name: 196, dtype: object
item.Street only: ALBION SQUARE
index: 390
item: Street       ANHALT ROAD
Avg_Price      2.435e+06
Name: 390, dtype: object
item.Street only: ANHALT ROAD
index: 405
item: Street       ANSDELL TERRACE
Avg_Price           2.25e+06
Name: 405, dtype: object
item.Street only: ANSDELL TERRACE
index: 422
item: Street       APPLEGARTH ROAD
Avg_Price            2.4e+06
Name: 422, dtype: object
item.Street only: APPLEGARTH ROAD
index: 855
item: Street       BARONSMEAD ROAD
Avg_Price          2.375e+06
Name: 855, dtype: object
item.Street only: BARONSMEAD ROAD
index: 981
item: Street       BEAUCLERC ROAD
Avg_Price          2.48e+06
Name: 981, dtype: object
item.Street only: BEAUCLERC ROAD
index: 1102
item: Street       BELVEDERE DRIVE
Avg_Price           2.34e+06
Name: 1102, dtype: object
item.Street only: BELVEDERE DRIVE
index: 1215
item: Street       BICKENHALL STREET
Avg_Price           2.2085e+06
N

In [13]:
geolocator = Nominatim()


  if __name__ == '__main__':


In [14]:
df_affordable['city_coord'] = df_affordable['Street'].apply(geolocator.geocode).apply(lambda x: (x.latitude, x.longitude))

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  if __name__ == '__main__':


In [15]:
df_affordable

Unnamed: 0,Street,Avg_Price,city_coord
196,ALBION SQUARE,2.450000e+06,"(-41.27375755, 173.28939323910353)"
390,ANHALT ROAD,2.435000e+06,"(51.4803164, -0.1668011)"
405,ANSDELL TERRACE,2.250000e+06,"(51.4998899, -0.1891027)"
422,APPLEGARTH ROAD,2.400000e+06,"(53.7486539, -0.3266704)"
855,BARONSMEAD ROAD,2.375000e+06,"(51.4773147, -0.239457)"
981,BEAUCLERC ROAD,2.480000e+06,"(30.2114523, -81.6179807)"
1102,BELVEDERE DRIVE,2.340000e+06,"(38.0728178, -78.4587964)"
1215,BICKENHALL STREET,2.208500e+06,"(51.5212014, -0.1589082)"
1253,BIRCHLANDS AVENUE,2.217000e+06,"(51.4483941, -0.1604676)"
1553,BRAMPTON GROVE,2.456875e+06,"(51.5899607, -0.3185249)"


In [16]:
df_affordable[['Latitude', 'Longitude']] = df_affordable['city_coord'].apply(pd.Series)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self[k1] = value[k2]


In [17]:
df_affordable

Unnamed: 0,Street,Avg_Price,city_coord,Latitude,Longitude
196,ALBION SQUARE,2.450000e+06,"(-41.27375755, 173.28939323910353)",-41.273758,173.289393
390,ANHALT ROAD,2.435000e+06,"(51.4803164, -0.1668011)",51.480316,-0.166801
405,ANSDELL TERRACE,2.250000e+06,"(51.4998899, -0.1891027)",51.499890,-0.189103
422,APPLEGARTH ROAD,2.400000e+06,"(53.7486539, -0.3266704)",53.748654,-0.326670
855,BARONSMEAD ROAD,2.375000e+06,"(51.4773147, -0.239457)",51.477315,-0.239457
981,BEAUCLERC ROAD,2.480000e+06,"(30.2114523, -81.6179807)",30.211452,-81.617981
1102,BELVEDERE DRIVE,2.340000e+06,"(38.0728178, -78.4587964)",38.072818,-78.458796
1215,BICKENHALL STREET,2.208500e+06,"(51.5212014, -0.1589082)",51.521201,-0.158908
1253,BIRCHLANDS AVENUE,2.217000e+06,"(51.4483941, -0.1604676)",51.448394,-0.160468
1553,BRAMPTON GROVE,2.456875e+06,"(51.5899607, -0.3185249)",51.589961,-0.318525


In [18]:
df = df_affordable.drop(columns=['city_coord'])


In [19]:
df

Unnamed: 0,Street,Avg_Price,Latitude,Longitude
196,ALBION SQUARE,2.450000e+06,-41.273758,173.289393
390,ANHALT ROAD,2.435000e+06,51.480316,-0.166801
405,ANSDELL TERRACE,2.250000e+06,51.499890,-0.189103
422,APPLEGARTH ROAD,2.400000e+06,53.748654,-0.326670
855,BARONSMEAD ROAD,2.375000e+06,51.477315,-0.239457
981,BEAUCLERC ROAD,2.480000e+06,30.211452,-81.617981
1102,BELVEDERE DRIVE,2.340000e+06,38.072818,-78.458796
1215,BICKENHALL STREET,2.208500e+06,51.521201,-0.158908
1253,BIRCHLANDS AVENUE,2.217000e+06,51.448394,-0.160468
1553,BRAMPTON GROVE,2.456875e+06,51.589961,-0.318525


In [20]:
address = 'London, UK'

geolocator = Nominatim()
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of London City are {}, {}.'.format(latitude, longitude))

  app.launch_new_instance()


The geograpical coordinate of London City are 51.5073219, -0.1276474.


In [21]:
# create map of London using latitude and longitude values
map_london = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, price, street in zip(df['Latitude'], df['Longitude'], df['Avg_Price'], df['Street']):
    label = '{}, {}'.format(street, price)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_london)  
    
map_london

In [22]:
#Define Foursquare Credentials and Version

CLIENT_ID = 'MCMJBQ0KCOAEJW4BI0Q4LKMGTQRBWK1ST0VA1IZ5SYM1J4IF' # Foursquare ID
CLIENT_SECRET = 'GAKKFOHY3I4G5LDQU1PTOA0Y55D2DGWG21FNOF0ERTHPWQMR' # Foursquare Secret
VERSION = '20180604' # Foursquare API version

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: KI3TR0QO4JOKMFELOMF3WSOOI3HFNBF5YLW354MYWBKDHEX3
CLIENT_SECRET:QF4ZBLJRBV4BQX52DVWUPEHJ14A2UJABPCZARZQZYTKIISUD


We can now proceed to the Modeling phase. We will analyze neighborhoods to recommend real estates where home buyers can make a real estate investment. We will then recommend profitable venues according to amenities and essential facilities surrounding such venues i.e. elementary schools, high schools, hospitals & grocery stores.

#### 4. Modeling

After exploring the dataset and gaining insights into it, we are ready to use the clustering methodology to analyze real estates. We will use the k-means clustering technique as it is fast and efficient in terms of computational cost, is highly flexible to account for mutations in real estate market in London and is accurate.

In [23]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Street', 
                  'Street Latitude', 
                  'Street Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [24]:
# Run the above function on each location and create a new dataframe called location_venues and display it.
location_venues = getNearbyVenues(names=df['Street'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )

ALBION SQUARE
ANHALT ROAD
ANSDELL TERRACE
APPLEGARTH ROAD
BARONSMEAD ROAD
BEAUCLERC ROAD
BELVEDERE DRIVE
BICKENHALL STREET
BIRCHLANDS AVENUE
BRAMPTON GROVE
BRIARDALE GARDENS
BROOKWAY
BURBAGE ROAD
BURY WALK
CALLCOTT STREET
CAMPDEN HILL ROAD
CAMPION ROAD
CANNING PLACE
CARLISLE ROAD
CARLTON GARDENS
CARLYLE COURT
CHALCOT SQUARE
CHARLES LANE
CHELSEA CRESCENT
CHESTER CLOSE NORTH
CHEYNE COURT
CHEYNE ROW
CHISWICK MALL
CITY ROAD
CLARENDON STREET
CLONCURRY STREET
COLBECK MEWS
COLLEGE CRESCENT
CORNWALL TERRACE MEWS
COURT LANE GARDENS
CRESCENT GROVE
DALEBURY ROAD
DEWHURST ROAD
DORIA ROAD
DOWNSHIRE HILL
DUCHESS WALK
ECCLESTON SQUARE MEWS
EGBERT STREET
EGERTON PLACE
ELM PARK ROAD
FLORAL STREET
FRANK DIXON WAY
FULTON MEWS
GERARD ROAD
GERRARD ROAD
GIRDLERS ROAD
GLOUCESTER CRESCENT
GORDON PLACE
GRAFTON SQUARE
GRAHAM TERRACE
HARMAN DRIVE
HARRIS STREET
HAVANNAH STREET
HAZLEWELL ROAD
HEREFORD MEWS
HERONDALE AVENUE
HIGHGATE HIGH STREET
HIGHWOOD HILL
HILLGATE PLACE
HOLLYCROFT AVENUE
HOLLYWOOD MEWS
HONEYWELL

In [25]:
location_venues

Unnamed: 0,Street,Street Latitude,Street Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,ALBION SQUARE,-41.273758,173.289393,The Free House,-41.273340,173.287364,Bar
1,ALBION SQUARE,-41.273758,173.289393,The Indian Cafe,-41.273308,173.286530,Indian Restaurant
2,ALBION SQUARE,-41.273758,173.289393,Queen's Gardens,-41.273671,173.291383,Park
3,ALBION SQUARE,-41.273758,173.289393,Urban,-41.274355,173.286317,New American Restaurant
4,ALBION SQUARE,-41.273758,173.289393,Fish Stop,-41.276010,173.289592,Fish & Chips Shop
5,ALBION SQUARE,-41.273758,173.289393,Deville Cafe,-41.271941,173.285535,Beer Garden
6,ALBION SQUARE,-41.273758,173.289393,Fresh Choice,-41.272194,173.287218,Supermarket
7,ALBION SQUARE,-41.273758,173.289393,The Bridge Street Collective,-41.272520,173.285517,Café
8,ALBION SQUARE,-41.273758,173.289393,Mango,-41.274460,173.285345,Indian Restaurant
9,ALBION SQUARE,-41.273758,173.289393,Hopgood's,-41.274749,173.283831,Restaurant


In [26]:
location_venues.groupby('Street').count()


Unnamed: 0_level_0,Street Latitude,Street Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Street,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
ALBION SQUARE,28,28,28,28,28,28
ANHALT ROAD,16,16,16,16,16,16
ANSDELL TERRACE,47,47,47,47,47,47
APPLEGARTH ROAD,4,4,4,4,4,4
BARONSMEAD ROAD,15,15,15,15,15,15
BEAUCLERC ROAD,4,4,4,4,4,4
BELVEDERE DRIVE,3,3,3,3,3,3
BICKENHALL STREET,58,58,58,58,58,58
BIRCHLANDS AVENUE,12,12,12,12,12,12
BRAMPTON GROVE,2,2,2,2,2,2


In [27]:
# get the List of Unique Categories
print('There are {} uniques categories.'.format(len(location_venues['Venue Category'].unique())))


There are 335 uniques categories.


In [28]:
location_venues.shape

(4401, 7)

In [29]:
# one hot encoding
venues_onehot = pd.get_dummies(location_venues[['Venue Category']], prefix="", prefix_sep="")

# add street column back to dataframe
venues_onehot['Street'] = location_venues['Street'] 

# move street column to the first column
fixed_columns = [venues_onehot.columns[-1]] + list(venues_onehot.columns[:-1])

#fixed_columns
venues_onehot = venues_onehot[fixed_columns]

venues_onehot.head()

Unnamed: 0,Street,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,American Restaurant,Antique Shop,Arcade,Argentinian Restaurant,Art Gallery,...,Vietnamese Restaurant,Warehouse Store,Waterfront,Weight Loss Center,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,ALBION SQUARE,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,ALBION SQUARE,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,ALBION SQUARE,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,ALBION SQUARE,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,ALBION SQUARE,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [30]:
london_grouped = venues_onehot.groupby('Street').mean().reset_index()
london_grouped

Unnamed: 0,Street,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,American Restaurant,Antique Shop,Arcade,Argentinian Restaurant,Art Gallery,...,Vietnamese Restaurant,Warehouse Store,Waterfront,Weight Loss Center,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,ALBION SQUARE,0.000000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.035714,...,0.000000,0.00,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0
1,ANHALT ROAD,0.000000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,...,0.000000,0.00,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0
2,ANSDELL TERRACE,0.000000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.021277,...,0.000000,0.00,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0
3,APPLEGARTH ROAD,0.000000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,...,0.000000,0.00,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0
4,BARONSMEAD ROAD,0.000000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,...,0.000000,0.00,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0
5,BEAUCLERC ROAD,0.000000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,...,0.000000,0.00,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0
6,BELVEDERE DRIVE,0.000000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,...,0.000000,0.00,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0
7,BICKENHALL STREET,0.000000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,...,0.000000,0.00,0.000000,0.0,0.017241,0.0,0.017241,0.017241,0.000000,0.0
8,BIRCHLANDS AVENUE,0.000000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,...,0.000000,0.00,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0
9,BRAMPTON GROVE,0.000000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,...,0.000000,0.00,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0


In [31]:
london_grouped.shape


(152, 336)

In [32]:
# What are the top 5 venues/facilities nearby profitable real estate investments?#

num_top_venues = 5

for hood in london_grouped['Street']:
    print("----"+hood+"----")
    temp = london_grouped[london_grouped['Street'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----ALBION SQUARE----
               venue  freq
0               Café  0.21
1  Indian Restaurant  0.07
2                Bar  0.07
3         Restaurant  0.07
4        Coffee Shop  0.07


----ANHALT ROAD----
               venue  freq
0                Pub  0.25
1  French Restaurant  0.12
2      Grocery Store  0.12
3              Diner  0.06
4             Garden  0.06


----ANSDELL TERRACE----
            venue  freq
0      Restaurant  0.09
1  Clothing Store  0.09
2       Juice Bar  0.06
3           Hotel  0.06
4             Pub  0.06


----APPLEGARTH ROAD----
               venue  freq
0          Nightclub  0.25
1             Casino  0.25
2     Sandwich Place  0.25
3                Bar  0.25
4  Accessories Store  0.00


----BARONSMEAD ROAD----
             venue  freq
0             Park  0.07
1      Pizza Place  0.07
2      Coffee Shop  0.07
3    Movie Theater  0.07
4  Thai Restaurant  0.07


----BEAUCLERC ROAD----
               venue  freq
0    Harbor / Marina  0.25
1    Automotive Sho

               venue  freq
0  Indian Restaurant  0.25
1        Coffee Shop  0.25
2  Convenience Store  0.25
3  Fish & Chips Shop  0.25
4   Pedestrian Plaza  0.00


----GERRARD ROAD----
                        venue  freq
0                         Pub  0.33
1  Construction & Landscaping  0.33
2        Fast Food Restaurant  0.33
3           Outdoor Sculpture  0.00
4                 Pastry Shop  0.00


----GIRDLERS ROAD----
                venue  freq
0                 Pub  0.15
1      Sandwich Place  0.06
2               Hotel  0.06
3  Italian Restaurant  0.06
4   Convention Center  0.06


----GORDON PLACE----
                 venue  freq
0    Mobile Phone Shop   0.5
1             Bus Stop   0.5
2  Outdoor Event Space   0.0
3          Pastry Shop   0.0
4                 Park   0.0


----GRAFTON SQUARE----
          venue  freq
0           Pub  0.10
1          Café  0.05
2    Restaurant  0.05
3           Bar  0.05
4  Burger Joint  0.05


----GRAHAM TERRACE----
                        venu

               venue  freq
0             Lounge  0.33
1               Pool  0.33
2                Spa  0.33
3  Outdoor Sculpture  0.00
4        Pastry Shop  0.00


----OAKWOOD COURT----
                  venue  freq
0           Video Store  0.25
1                  Food  0.25
2           Golf Course  0.25
3           Pizza Place  0.25
4  Other Great Outdoors  0.00


----OBSERVATORY GARDENS----
            venue  freq
0            Café  0.09
1             Pub  0.05
2  Clothing Store  0.05
3       Juice Bar  0.04
4      Restaurant  0.04


----OLD COURT PLACE----
            venue  freq
0           Hotel  0.11
1          Garden  0.06
2  Clothing Store  0.06
3       Juice Bar  0.05
4         Exhibit  0.05


----ONSLOW MEWS WEST----
                venue  freq
0               Hotel  0.11
1  Italian Restaurant  0.06
2              Bakery  0.05
3      Sandwich Place  0.04
4              Garden  0.04


----PALACE PLACE----
               venue  freq
0  Electronics Store   1.0
1  Outdoor Sculptu

                  venue  freq
0           Supermarket  0.43
1        Discount Store  0.14
2   Rental Car Location  0.14
3           Pizza Place  0.14
4  Fast Food Restaurant  0.14


----SUTHERLAND PLACE----
                           venue  freq
0                           Park   1.0
1              Accessories Store   0.0
2          Performing Arts Venue   0.0
3                    Pastry Shop   0.0
4  Paper / Office Supplies Store   0.0


----SYDNEY STREET----
                venue  freq
0                 Spa  0.25
1  Italian Restaurant  0.25
2         Men's Store  0.25
3                 Gym  0.25
4   Accessories Store  0.00


----THAMES BANK----
                  venue  freq
0         Grocery Store  0.25
1          Burger Joint  0.25
2           Pizza Place  0.25
3  Gym / Fitness Center  0.25
4     Accessories Store  0.00


----THE HEXAGON----
                venue  freq
0      Clothing Store  0.10
1                 Pub  0.08
2  Italian Restaurant  0.08
3         Coffee Shop  0.08
4  

In [33]:
# Define a function to return the most common venues/facilities nearby real estate investments#

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [34]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Street']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))



In [35]:
# create a new dataframe
venues_sorted = pd.DataFrame(columns=columns)
venues_sorted['Street'] = london_grouped['Street']

for ind in np.arange(london_grouped.shape[0]):
    venues_sorted.iloc[ind, 1:] = return_most_common_venues(london_grouped.iloc[ind, :], num_top_venues)



In [36]:
venues_sorted.head()

Unnamed: 0,Street,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,ALBION SQUARE,Café,Indian Restaurant,Bar,Pub,Coffee Shop,Restaurant,Department Store,Park,Seafood Restaurant,New American Restaurant
1,ANHALT ROAD,Pub,Grocery Store,French Restaurant,Cocktail Bar,Diner,Plaza,English Restaurant,Japanese Restaurant,Gym / Fitness Center,Farmers Market
2,ANSDELL TERRACE,Clothing Store,Restaurant,Hotel,Pub,Juice Bar,Indian Restaurant,Bakery,Italian Restaurant,Burger Joint,Supermarket
3,APPLEGARTH ROAD,Bar,Sandwich Place,Casino,Nightclub,Farm,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Factory
4,BARONSMEAD ROAD,Breakfast Spot,Thai Restaurant,Community Center,Pub,Coffee Shop,Restaurant,Park,Farmers Market,Café,Nature Preserve


In [37]:
venues_sorted.shape

(152, 11)

In [38]:
london_grouped.shape

(152, 336)

In [39]:
london_grouped=df

After our inspection of venues/facilities/amenities nearby the most profitable real estate investments in London, we could begin by clustering properties by venues/facilities/amenities nearby. 

In [40]:
#Distribute in 5 Clusters

# set number of clusters
kclusters = 5

london_grouped_clustering = london_grouped.drop('Street', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(london_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:50]

array([1, 3, 0, 3, 2, 1, 2, 0, 0, 1, 3, 3, 3, 1, 2, 2, 1, 3, 0, 1, 4, 4,
       3, 1, 1, 0, 3, 4, 1, 0, 3, 2, 3, 2, 2, 4, 3, 3, 2, 0, 1, 2, 4, 0,
       4, 0, 0, 4, 0, 0], dtype=int32)

In [41]:
#Dataframe to include Clusters

london_grouped_clustering=df
london_grouped_clustering.head()

Unnamed: 0,Street,Avg_Price,Latitude,Longitude
196,ALBION SQUARE,2450000.0,-41.273758,173.289393
390,ANHALT ROAD,2435000.0,51.480316,-0.166801
405,ANSDELL TERRACE,2250000.0,51.49989,-0.189103
422,APPLEGARTH ROAD,2400000.0,53.748654,-0.32667
855,BARONSMEAD ROAD,2375000.0,51.477315,-0.239457


In [42]:
london_grouped_clustering.shape

(162, 4)

In [43]:
df.shape

(162, 4)

In [44]:
london_grouped_clustering.dtypes

Street        object
Avg_Price    float64
Latitude     float64
Longitude    float64
dtype: object

In [45]:
df.dtypes

Street        object
Avg_Price    float64
Latitude     float64
Longitude    float64
dtype: object

In [46]:
# add clustering labels
london_grouped_clustering['Cluster Labels'] = kmeans.labels_

# merge london_grouped with london_data to add latitude/longitude for each neighborhood
london_grouped_clustering = london_grouped_clustering.join(venues_sorted.set_index('Street'), on='Street')

london_grouped_clustering.head(30) # check the last columns!

Unnamed: 0,Street,Avg_Price,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
196,ALBION SQUARE,2450000.0,-41.273758,173.289393,1,Café,Indian Restaurant,Bar,Pub,Coffee Shop,Restaurant,Department Store,Park,Seafood Restaurant,New American Restaurant
390,ANHALT ROAD,2435000.0,51.480316,-0.166801,3,Pub,Grocery Store,French Restaurant,Cocktail Bar,Diner,Plaza,English Restaurant,Japanese Restaurant,Gym / Fitness Center,Farmers Market
405,ANSDELL TERRACE,2250000.0,51.49989,-0.189103,0,Clothing Store,Restaurant,Hotel,Pub,Juice Bar,Indian Restaurant,Bakery,Italian Restaurant,Burger Joint,Supermarket
422,APPLEGARTH ROAD,2400000.0,53.748654,-0.32667,3,Bar,Sandwich Place,Casino,Nightclub,Farm,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Factory
855,BARONSMEAD ROAD,2375000.0,51.477315,-0.239457,2,Breakfast Spot,Thai Restaurant,Community Center,Pub,Coffee Shop,Restaurant,Park,Farmers Market,Café,Nature Preserve
981,BEAUCLERC ROAD,2480000.0,30.211452,-81.617981,1,Pizza Place,Automotive Shop,Harbor / Marina,Doctor's Office,Zoo Exhibit,Farmers Market,Ethiopian Restaurant,Event Space,Exhibit,Factory
1102,BELVEDERE DRIVE,2340000.0,38.072818,-78.458796,2,Playground,Pool,Zoo Exhibit,Farmers Market,Electronics Store,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Factory
1215,BICKENHALL STREET,2208500.0,51.521201,-0.158908,0,Gastropub,Café,Hotel,Restaurant,Italian Restaurant,Pizza Place,Movie Theater,Pub,Garden,Greek Restaurant
1253,BIRCHLANDS AVENUE,2217000.0,51.448394,-0.160468,0,French Restaurant,Pub,Breakfast Spot,Coffee Shop,Chinese Restaurant,Bakery,Lake,Train Station,Brewery,Pizza Place
1553,BRAMPTON GROVE,2456875.0,51.589961,-0.318525,1,Construction & Landscaping,Home Service,Zoo Exhibit,Farmers Market,Electronics Store,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Factory


In [47]:
# Create Map

map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(london_grouped_clustering['Latitude'], london_grouped_clustering['Longitude'], london_grouped_clustering['Street'], london_grouped_clustering['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [48]:
london_grouped_clustering.loc[london_grouped_clustering['Cluster Labels'] == 0, london_grouped_clustering.columns[[1] + list(range(5, london_grouped_clustering.shape[1]))]].head()


Unnamed: 0,Avg_Price,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
405,2250000.0,Clothing Store,Restaurant,Hotel,Pub,Juice Bar,Indian Restaurant,Bakery,Italian Restaurant,Burger Joint,Supermarket
1215,2208500.0,Gastropub,Café,Hotel,Restaurant,Italian Restaurant,Pizza Place,Movie Theater,Pub,Garden,Greek Restaurant
1253,2217000.0,French Restaurant,Pub,Breakfast Spot,Coffee Shop,Chinese Restaurant,Bakery,Lake,Train Station,Brewery,Pizza Place
2225,2200000.0,,,,,,,,,,
2638,2250000.0,Cosmetics Shop,Bookstore,Bakery,Grocery Store,Supermarket,Clothing Store,Pharmacy,Coffee Shop,Gift Shop,Pizza Place


In [49]:
london_grouped_clustering.loc[london_grouped_clustering['Cluster Labels'] == 1, london_grouped_clustering.columns[[1] + list(range(5, london_grouped_clustering.shape[1]))]].head()


Unnamed: 0,Avg_Price,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
196,2450000.0,Café,Indian Restaurant,Bar,Pub,Coffee Shop,Restaurant,Department Store,Park,Seafood Restaurant,New American Restaurant
981,2480000.0,Pizza Place,Automotive Shop,Harbor / Marina,Doctor's Office,Zoo Exhibit,Farmers Market,Ethiopian Restaurant,Event Space,Exhibit,Factory
1553,2456875.0,Construction & Landscaping,Home Service,Zoo Exhibit,Farmers Market,Electronics Store,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Factory
1980,2492500.0,Supermarket,English Restaurant,Coffee Shop,Park,Dry Cleaner,American Restaurant,Pub,Fast Food Restaurant,Rental Car Location,Café
2136,2461000.0,Pub,Trail,Zoo Exhibit,Farm,Electronics Store,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Factory


In [50]:
london_grouped_clustering.loc[london_grouped_clustering['Cluster Labels'] == 2, london_grouped_clustering.columns[[1] + list(range(5, london_grouped_clustering.shape[1]))]].head()


Unnamed: 0,Avg_Price,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
855,2375000.0,Breakfast Spot,Thai Restaurant,Community Center,Pub,Coffee Shop,Restaurant,Park,Farmers Market,Café,Nature Preserve
1102,2340000.0,Playground,Pool,Zoo Exhibit,Farmers Market,Electronics Store,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Factory
2068,2375000.0,Pub,Park,Indian Restaurant,Yoga Studio,Bakery,Hotel,Grocery Store,Ice Cream Shop,Bookstore,Farmers Market
2129,2379652.7,Pub,Grocery Store,Bakery,Indian Restaurant,Yoga Studio,Hotel,Ice Cream Shop,Coffee Shop,Park,Pizza Place
2944,2367500.0,Hotel,Pub,Garden,Café,Italian Restaurant,Coffee Shop,Mediterranean Restaurant,Bar,Chinese Restaurant,Bakery


In [51]:
london_grouped_clustering.loc[london_grouped_clustering['Cluster Labels'] == 3, london_grouped_clustering.columns[[1] + list(range(5, london_grouped_clustering.shape[1]))]].head()


Unnamed: 0,Avg_Price,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
390,2435000.0,Pub,Grocery Store,French Restaurant,Cocktail Bar,Diner,Plaza,English Restaurant,Japanese Restaurant,Gym / Fitness Center,Farmers Market
422,2400000.0,Bar,Sandwich Place,Casino,Nightclub,Farm,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Factory
1632,2397132.0,Sporting Goods Shop,Coffee Shop,Gym / Fitness Center,Grocery Store,Creperie,Farmers Market,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit
1797,2400000.0,Art Gallery,Zoo Exhibit,Fast Food Restaurant,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Factory,Falafel Restaurant,Farm
1914,2445000.0,Grocery Store,Dance Studio,Gym,Athletics & Sports,Bar,Fishing Store,Falafel Restaurant,Electronics Store,Food,English Restaurant


In [52]:
london_grouped_clustering.loc[london_grouped_clustering['Cluster Labels'] == 4, london_grouped_clustering.columns[[1] + list(range(5, london_grouped_clustering.shape[1]))]].head()


Unnamed: 0,Avg_Price,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2242,2300000.0,Construction & Landscaping,Farm,Zoo Exhibit,Fast Food Restaurant,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Factory,Falafel Restaurant
2406,2286679.0,Café,Bar,Italian Restaurant,Pub,Coffee Shop,Park,Convenience Store,French Restaurant,Optical Shop,Cupcake Shop
2686,2287500.0,Pub,Brewery,Gym / Fitness Center,Gift Shop,Art Museum,Zoo Exhibit,Farmers Market,Ethiopian Restaurant,Event Space,Exhibit
3377,2298000.0,Hotel,Zoo Exhibit,Fast Food Restaurant,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Factory,Falafel Restaurant,Farm
4285,2265000.0,American Restaurant,Gym,Trail,Farmers Market,Electronics Store,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Factory


## Results and Discussion section

First of all, even though the London Housing Market may be in a rut, it is still an "ever-green" for business affairs.

We may discuss our results under two main perspectives. 

First, we may examine them according to neighborhoods/London areas. It is interesting to note that, although West London (Notting Hill, Kensington, Chelsea, Marylebone) and North-West London (Hampsted) might be considered highly profitable venues to purchase a real estate according to amenities and essential facilities surrounding such venues i.e. elementary schools, high schools, hospitals & grocery stores, South-West London (Wandsworth, Balham) and North-West London (Isliington) are arising as next future elite venues with a wide range of amenities and facilities. Accordingly, one might target under-priced real estates in these areas of London in order to make a business affair.

Second, we may analyze our results according to the five clusters we have produced. Even though, all clusters could praise an optimal range of facilities and amenities, we have found two main patterns. The first pattern we are referring to, i.e. Clusters 0, 2 and 4, may target home buyers prone to live in 'green' areas with parks, waterfronts. Instead, the second pattern we are referring to, i.e. Clusters 1 and 3, may target individuals who love pubs, theatres and soccer. 


## Conclusion

To sum up, according to Bloomberg News, the London Housing Market is in a rut. It is now facing a number of different headwinds, including the prospect of higher taxes and a warning from the Bank of England that U.K. home values could fall as much as 30 percent in the event of a disorderly exit from the European Union. In this scenario, it is urgent to adopt machine learning tools in order to assist homebuyers clientele in London to make wise and effective decisions. As a result, the business problem we were posing was: how could we provide support to homebuyers clientele in to purchase a suitable real estate in London in this uncertain economic and financial scenario? 

To solve this business problem, we clustered London neighborhoods in order to recommend venues and the current average price of real estate where homebuyers can make a real estate investment. We recommended profitable venues according to amenities and essential facilities surrounding such venues i.e. elementary schools, high schools, hospitals & grocery stores.

First, we gathered data on London properties and the relative price paid data were extracted from the HM Land Registry (http://landregistry.data.gov.uk/). Moreover, to explore and target recommended locations across different venues according to the presence of amenities and essential facilities, we accessed data through FourSquare API interface and arranged them as a data frame for visualization. By merging data on London properties and the relative price paid data from the HM Land Registry and data on amenities and essential facilities surrounding such properties from FourSquare API interface, we were able to recommend profitable real estate investments. 

Second, The Methodology section comprised four stages: 1. Collect Inspection Data; 2. Explore and Understand Data; 3. Data preparation and preprocessing; 4. Modeling. In particular, in the modeling section, we used the k-means clustering technique as it is fast and efficient in terms of computational cost, is highly flexible to account for mutations in real estate market in London and is accurate.

Finally, we drew the conclusion that even though the London Housing Market may be in a rut, it is still an "ever-green" for business affairs. We discussed our results under two main perspectives. First, we examined them according to neighborhoods/London areas. although West London (Notting Hill, Kensington, Chelsea, Marylebone) and North-West London (Hampsted) might be considered highly profitable venues to purchase a real estate according to amenities and essential facilities surrounding such venues i.e. elementary schools, high schools, hospitals & grocery stores, South-West London (Wandsworth, Balham) and North-West London (Isliington) are arising as next future elite venues with a wide range of amenities and facilities. Accordingly, one might target under-priced real estates in these areas of London in order to make a business affair. Second, we analyzed our results according to the five clusters we produced. While Clusters 0, 2 and 4 may target home buyers prone to live in 'green' areas with parks, waterfronts, Clusters 1 and 3 may target individuals who love pubs, theatres and soccer. 

