#### Problem Statement: Find a house at a convenient distance from desired amenities (e.g. pubs, cafes, supermarkets etc)

**Stakeholders**: People looking to move house within the Lewisham council area in London, UK.

**WHY**: Finding a house in London is difficult, especially when you have specific requirements for the surroundings. This program should help stakeholders narrow down their search. The approach can be extended to any other London council, or to any other city.

**Assumptions**: The data is downloaded from Doogal and comprises the postcodes in the Lewisham council area in London, UK. This data is limited and does not represent the actual availability of houses, but that is acceptable because the analysis focuses on areas rather than individual listings. I focus only on the Forest Hill ward of Lewisham, because the Foursquare explore API has an aggressive rate limit of 500 calls/day.

In [2]:
import pandas as pd # Pandas is a data wrangling library

# Link to the input dataset with geographical location information about the Lewisham council area
link = 'https://www.doogal.co.uk/AdministrativeAreasCSV.ashx?district=E09000023' 


__Read the data into a data frame__

In [3]:
Lewisham_Data = pd.read_csv(link)
Lewisham_Data.head()

Unnamed: 0,Postcode,In Use?,Latitude,Longitude,Easting,Northing,Grid Ref,Ward,Parish,Introduced,Terminated,Altitude,Country,Last Updated,Quality
0,BR1 4BY,Yes,51.417289,-0.001741,539050,170591,TQ390705,Downham,"Lewisham, unparished area",1980-01-01,,35,England,2018-11-15,Within the building of the matched address clo...
1,BR1 4DN,Yes,51.418996,-0.002156,539016,170780,TQ390707,Downham,"Lewisham, unparished area",1980-01-01,,35,England,2018-11-15,Within the building of the matched address clo...
2,BR1 4EY,Yes,51.418477,0.005042,539518,170736,TQ395707,Downham,"Lewisham, unparished area",1980-01-01,,50,England,2018-11-15,Within the building of the matched address clo...
3,BR1 4FD,Yes,51.421083,-0.002194,539007,171012,TQ390710,Downham,"Lewisham, unparished area",2010-01-01,,33,England,2018-11-15,Within the building of the matched address clo...
4,BR1 4JG,Yes,51.419403,-0.000728,539114,170828,TQ391708,Downham,"Lewisham, unparished area",1980-01-01,,40,England,2018-11-15,Within the building of the matched address clo...


__Extract the useful columns: Postcode, Latitude, Longitude and Ward__

In [165]:
Lewisham_Data_Cleaned = Lewisham_Data[['Postcode','Latitude','Longitude','Ward']]
Lewisham_Data_Cleaned.head()

Unnamed: 0,Postcode,Latitude,Longitude,Ward
0,BR1 4BY,51.417289,-0.001741,Downham
1,BR1 4DN,51.418996,-0.002156,Downham
2,BR1 4EY,51.418477,0.005042,Downham
3,BR1 4FD,51.421083,-0.002194,Downham
4,BR1 4JG,51.419403,-0.000728,Downham


__Find the number of unique wards in the dataset__

In [166]:
print('There are {} wards in the Lewisham Council.'.format(
        len(Lewisham_Data_Cleaned['Ward'].unique())
    )
)

There are 18 wards in the Lewisham Council.


__Let's concentrate on the rows in Forest Hill to minimize the calls to the Foursquare API__

In [168]:
ForestHill_Data = Lewisham_Data_Cleaned[Lewisham_Data_Cleaned['Ward'] == 'Forest Hill']
print(ForestHill_Data.shape)
ForestHill_Data.head()

(369, 4)


Unnamed: 0,Postcode,Latitude,Longitude,Ward
2913,SE23 1HS,51.442611,-0.058915,Forest Hill
2914,SE23 1HT,51.442611,-0.058915,Forest Hill
3198,SE23 2LE,51.438921,-0.053316,Forest Hill
3424,SE23 3AA,51.445738,-0.052767,Forest Hill
3425,SE23 3AB,51.445143,-0.052677,Forest Hill


__I split the Postcode into two parts, district and sector. The sector (the part after the space, e.g. `1HS`) is assumed to be unique within Forest Hill and is used as the key in the rest of the analysis__

In [170]:
def Assign_sector(row):
    row['Sector'] = row.Postcode.split(' ',1)[1]
    return row
ForestHill_Data = ForestHill_Data.apply(Assign_sector,axis = 1)

In [171]:
ForestHill_Data.head()

Unnamed: 0,Postcode,Latitude,Longitude,Ward,Sector
2913,SE23 1HS,51.442611,-0.058915,Forest Hill,1HS
2914,SE23 1HT,51.442611,-0.058915,Forest Hill,1HT
3198,SE23 2LE,51.438921,-0.053316,Forest Hill,2LE
3424,SE23 3AA,51.445738,-0.052767,Forest Hill,3AA
3425,SE23 3AB,51.445143,-0.052677,Forest Hill,3AB
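
For larger frames, the same split can be done without a row-wise `apply`. The sketch below uses pandas string methods on a small toy frame built from the postcodes shown above; the `District` column name is an assumption for illustration and is not part of the original notebook.

```python
import pandas as pd

# Toy frame standing in for ForestHill_Data (postcodes taken from the table above)
toy = pd.DataFrame({'Postcode': ['SE23 1HS', 'SE23 1HT', 'SE23 2LE']})

# Split once on the first space: left part is the district, right part is the sector
parts = toy['Postcode'].str.split(' ', n=1, expand=True)
toy = toy.assign(District=parts[0], Sector=parts[1])

print(toy)
```

Using `assign` returns a new frame, which avoids modifying a slice of `Lewisham_Data_Cleaned` in place.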


__Import the necessary libraries: pandas and numpy for data manipulation, geopy for geocoding, matplotlib and folium for visualization, requests for HTTP requests, and scikit-learn for clustering__

In [173]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means for the clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if folium is not installed
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


__Find the latitude and longitude of Forest Hill, to use as the map centre when plotting all the sectors found in it__

In [175]:
address = 'Forest Hill, London'

geolocator = Nominatim(user_agent = 'chrome')
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Forest Hill, London are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Forest Hill, London are 51.4392494, -0.0530179.


__Map Forest Hill and all the sectors extracted from it__

In [None]:
map_Lewisham = folium.Map(location = [latitude, longitude], zoom_start=15)

# add Forest Hill points to the map
for lat, lng, ward, sector in zip(ForestHill_Data['Latitude'], ForestHill_Data['Longitude'],
                                  ForestHill_Data['Ward'], ForestHill_Data['Sector']):
    label = '{}, {}'.format(ward, sector)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_Lewisham)

map_Lewisham

__Insert your client ID and secret for accessing the Foursquare API__

In [5]:
CLIENT_ID = 'Put your client ID here' # your Foursquare ID
CLIENT_SECRET = 'Put your client secret here' # your Foursquare Secret
VERSION = '20180811' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: Put your client ID here
CLIENT_SECRET: Put your client secret here


__This function explores the chosen sectors (Forest Hill) and finds the nearby venues for each of them, capped at 100 venues per sector__

In [227]:
LIMIT = 100

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    """Query the Foursquare explore endpoint once per sector and collect the venues."""
    url = 'https://api.foursquare.com/v2/venues/explore'
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        # query parameters for the API request
        params = {
            'client_id': CLIENT_ID,
            'client_secret': CLIENT_SECRET,
            'v': VERSION,
            'll': '{},{}'.format(lat, lng),
            'radius': radius,
            'limit': LIMIT,
        }

        # make the GET request and fail loudly on HTTP errors
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
        results = response.json()['response']['groups'][0]['items']

        # keep only the relevant information for each nearby venue,
        # skipping venues that have no category
        venues_list.extend(
            (name,
             lat,
             lng,
             v['venue']['name'],
             v['venue']['location']['lat'],
             v['venue']['location']['lng'],
             v['venue']['categories'][0]['name'])
            for v in results if v['venue']['categories'])

    nearby_venues = pd.DataFrame(venues_list, columns=[
        'Sector',
        'Neighborhood Latitude',
        'Neighborhood Longitude',
        'Venue',
        'Venue Latitude',
        'Venue Longitude',
        'Venue Category'])

    return nearby_venues

__Fetch the nearby venues for every sector in Forest Hill__

In [None]:
ForestHill_venues = getNearbyVenues(names=ForestHill_Data['Sector'],
                                   latitudes=ForestHill_Data['Latitude'],
                                   longitudes=ForestHill_Data['Longitude']
                                  )
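
Because the explore endpoint is rate limited, it can help to cache each response on disk so that re-running the notebook does not spend quota again. The following is a minimal sketch under assumptions: `cached_fetch`, the `fetch` callable and the temporary cache directory are all hypothetical and not part of the original notebook.

```python
import json
import tempfile
from pathlib import Path


def cached_fetch(key, fetch, cache_dir):
    """Return the cached JSON for `key`, calling `fetch(key)` only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / '{}.json'.format(key)
    if path.exists():
        return json.loads(path.read_text(encoding='utf-8'))
    data = fetch(key)
    path.write_text(json.dumps(data), encoding='utf-8')
    return data


# Demo with a fake fetcher that records how often it is called
calls = []

def fake_fetch(key):
    calls.append(key)
    return {'sector': key, 'items': []}

demo_dir = tempfile.mkdtemp()
first = cached_fetch('1HS', fake_fetch, demo_dir)
second = cached_fetch('1HS', fake_fetch, demo_dir)  # served from disk, no second call
print(len(calls))
```

In the notebook, `fetch` would wrap the `requests.get(...)` call inside `getNearbyVenues`, keyed by sector.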

__Print basic information such as the shape, head and a few counts to inspect the data returned by the Foursquare API__

In [212]:
print(ForestHill_venues.shape)


(5141, 7)
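
Each row of `ForestHill_venues` pairs a sector's coordinates with a venue's coordinates, so the straight-line distance between the two can be computed and used to judge how convenient a venue is. A minimal sketch using the haversine formula follows; `haversine_m` is a hypothetical helper, and the mean Earth radius of 6,371 km is an approximation.

```python
import numpy as np

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres; accepts scalars or NumPy arrays/Series."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))


# Distance between two Forest Hill postcodes from the tables above
d = haversine_m(51.442611, -0.058915, 51.438921, -0.053316)
print(round(float(d)))
```

Applied to the notebook's frame, the four arguments would be the `Neighborhood Latitude`, `Neighborhood Longitude`, `Venue Latitude` and `Venue Longitude` columns.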


__I checked the counts of the venues in each sector__

In [213]:
ForestHill_venues.groupby('Sector').count()

__Let's see how many unique venue categories were found__

In [214]:
print('There are {} unique categories.'.format(len(ForestHill_venues['Venue Category'].unique())))

There are 45 unique categories.


#### Analyze the Sectors

__One-hot encode the venue categories: each row is a venue, each column is a venue category, and the Sector column is kept so rows can later be grouped by sector__

In [215]:
# one hot encoding
ForestHill_onehot = pd.get_dummies(ForestHill_venues[['Venue Category']], prefix="", prefix_sep="")

# add sector column back to dataframe
ForestHill_onehot['Sector'] = ForestHill_venues['Sector'] 

# move Sector column to the first column
fixed_columns = [ForestHill_onehot.columns[-1]] + list(ForestHill_onehot.columns[:-1])
ForestHill_onehot = ForestHill_onehot[fixed_columns]

ForestHill_onehot.head()

Unnamed: 0,Sector,Aquarium,Arts & Crafts Store,Beer Store,Bookstore,Bus Stop,Café,Chinese Restaurant,Coffee Shop,Convenience Store,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Food,Food Truck,Forest,Garden,Garden Center,Gas Station,Gastropub,Gift Shop,Grocery Store,Gym / Fitness Center,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Mediterranean Restaurant,Mobile Phone Shop,Museum,Nature Preserve,Park,Pharmacy,Pizza Place,Platform,Playground,Plaza,Post Office,Pub,Record Shop,Scenic Lookout,Street Art,Supermarket,Thai Restaurant,Thrift / Vintage Store,Trail,Train Station
0,1HS,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,1HS,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,1HS,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
3,1HS,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,1HS,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


__Next, let's group rows by sector, taking the mean of each category column to get its frequency of occurrence__

In [217]:
ForestHill_grouped = ForestHill_onehot.groupby('Sector').mean().reset_index()
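
To see why the mean of one-hot rows gives a frequency, consider a toy example with made-up venues: each column's mean within a sector is the share of that sector's venues that fall in that category.

```python
import pandas as pd

# Made-up venues: sector 1HS has two cafés and one pub, sector 2LE has one pub
toy_venues = pd.DataFrame({
    'Sector': ['1HS', '1HS', '1HS', '2LE'],
    'Venue Category': ['Café', 'Café', 'Pub', 'Pub'],
})

# One-hot encode the category, then average the indicator columns per sector
onehot = pd.get_dummies(toy_venues['Venue Category'], dtype=float)
onehot.insert(0, 'Sector', toy_venues['Sector'])
freq = onehot.groupby('Sector').mean()

print(freq)
```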

__Define a helper that returns the most common venue categories for a sector__

In [219]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [221]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Sector']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
Sector_venues_sorted = pd.DataFrame(columns=columns)
Sector_venues_sorted['Sector'] = ForestHill_grouped['Sector']

for ind in np.arange(ForestHill_grouped.shape[0]):
    Sector_venues_sorted.iloc[ind, 1:] = return_most_common_venues(ForestHill_grouped.iloc[ind, :], num_top_venues)



#### K Means Clustering

In [222]:
# set number of clusters
kclusters = 5

ForestHill_grouped_clustering = ForestHill_grouped.drop(columns='Sector')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(ForestHill_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 1, 2, 2, 2, 2, 2, 2, 2], dtype=int32)
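
The choice of `kclusters = 5` is not justified in the notebook. One common way to sanity-check it is to compare silhouette scores across several values of k. The sketch below does this on synthetic data, since the real frequency table is not reproduced here; the blob centres, noise level and range of k are arbitrary assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for ForestHill_grouped_clustering: three well-separated blobs
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((30, 2)) for c in centres])

# Fit k-means for each candidate k and record the mean silhouette score
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The k with the highest score gives the most compact, best-separated clusters
best_k = max(scores, key=scores.get)
print(best_k)
```

On the real data, `X` would be `ForestHill_grouped_clustering`; a higher score suggests a better fit, though it is only one heuristic.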

__Add cluster labels to the actual data__

In [223]:
# add clustering labels to the per-sector table, whose rows are in the same order as ForestHill_grouped
Sector_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

# join on Sector to add the cluster label and top venues to each Forest Hill postcode
ForestHill_merged = ForestHill_Data.join(Sector_venues_sorted.set_index('Sector'), on='Sector')

ForestHill_merged.head() # check the last columns!

Unnamed: 0,Postcode,Latitude,Longitude,Ward,Sector,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2913,SE23 1HS,51.442611,-0.058915,Forest Hill,1HS,1,Café,Coffee Shop,Pub,Indian Restaurant,Playground,Bus Stop,Farmers Market,Grocery Store,Gym / Fitness Center,Trail
2914,SE23 1HT,51.442611,-0.058915,Forest Hill,1HT,1,Café,Coffee Shop,Pub,Indian Restaurant,Playground,Bus Stop,Farmers Market,Grocery Store,Gym / Fitness Center,Trail
3198,SE23 2LE,51.438921,-0.053316,Forest Hill,2LE,1,Coffee Shop,Pub,Café,Gym / Fitness Center,Supermarket,Train Station,Fish & Chips Shop,Pizza Place,Gastropub,Grocery Store
3424,SE23 3AA,51.445738,-0.052767,Forest Hill,3AA,2,Thrift / Vintage Store,Garden Center,Scenic Lookout,Nature Preserve,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden,Forest
3425,SE23 3AB,51.445143,-0.052677,Forest Hill,3AB,2,Garden Center,Scenic Lookout,Pub,Nature Preserve,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden,Forest
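
`kmeans.labels_` follows the row order of `ForestHill_grouped`, which `groupby` sorts by sector, and not necessarily the row order of `ForestHill_Data`. Attaching labels by key rather than by position avoids any mismatch. A toy illustration with made-up labels:

```python
import pandas as pd

# Postcode-level frame in its original (unsorted) order
data = pd.DataFrame({'Sector': ['3AA', '1HS', '2LE']})

# Per-sector table in sorted order, as groupby('Sector') would produce it
per_sector = pd.DataFrame({
    'Sector': ['1HS', '2LE', '3AA'],
    'Cluster Labels': [1, 1, 2],  # made-up labels aligned with per_sector rows
})

# Join on the Sector key so each postcode receives its own sector's label
merged = data.join(per_sector.set_index('Sector'), on='Sector')
print(merged)
```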


__Finally visualize the clusters__

In [225]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=15)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(ForestHill_merged['Latitude'], ForestHill_merged['Longitude'], ForestHill_merged['Sector'], ForestHill_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
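
The colour lookup can also be made explicit so that each cluster label maps directly to one colour. A small sketch, assuming the same `rainbow` colormap as above; the `color_of` dictionary is a hypothetical name introduced for illustration.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5

# One evenly spaced colour from the rainbow colormap per cluster
rainbow = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]

# Map each cluster label (0 .. kclusters-1) to its own hex colour
color_of = {label: rainbow[label] for label in range(kclusters)}
print(color_of)
```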

__Let's Examine the clusters__

__Cluster 0__

In [207]:
ForestHill_merged.loc[ForestHill_merged['Cluster Labels'] == 0, ForestHill_merged.columns[[1] + list(range(5, ForestHill_merged.shape[1]))]]

Unnamed: 0,Latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3451,51.444666,0,Farmers Market,Café,Plaza,Convenience Store,Grocery Store,Bus Stop,Bookstore,Gift Shop,Gastropub,Gas Station
3452,51.445104,0,Convenience Store,Scenic Lookout,Plaza,Farmers Market,Fast Food Restaurant,Gastropub,Gas Station,Garden Center,Garden,Forest
3453,51.445948,0,Thrift / Vintage Store,Scenic Lookout,Food Truck,Plaza,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden Center,Garden
3454,51.444147,0,Aquarium,Trail,Bus Stop,Café,Museum,Plaza,Farmers Market,Grocery Store,Food Truck,Gastropub
3456,51.443746,0,Indian Restaurant,Plaza,Bus Stop,Café,Coffee Shop,Convenience Store,Farmers Market,Grocery Store,Trail,Museum
3458,51.44282,0,Indian Restaurant,Playground,Bus Stop,Café,Coffee Shop,Convenience Store,Farmers Market,Grocery Store,Trail,Museum
3459,51.445475,0,Thrift / Vintage Store,Food Truck,Plaza,Farmers Market,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden Center,Garden
3460,51.44291,0,Indian Restaurant,Plaza,Bus Stop,Café,Coffee Shop,Farmers Market,Grocery Store,Trail,Museum,Playground
3461,51.443312,0,Indian Restaurant,Playground,Bus Stop,Café,Coffee Shop,Convenience Store,Farmers Market,Grocery Store,Trail,Museum
3462,51.443155,0,Indian Restaurant,Playground,Bus Stop,Café,Coffee Shop,Convenience Store,Farmers Market,Grocery Store,Trail,Museum


__Cluster 1__

In [208]:
ForestHill_merged.loc[ForestHill_merged['Cluster Labels'] == 1, ForestHill_merged.columns[[1] + list(range(5, ForestHill_merged.shape[1]))]]

Unnamed: 0,Latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2913,51.442611,1,Café,Coffee Shop,Pub,Indian Restaurant,Playground,Bus Stop,Farmers Market,Grocery Store,Gym / Fitness Center,Trail
2914,51.442611,1,Café,Coffee Shop,Pub,Indian Restaurant,Playground,Bus Stop,Farmers Market,Grocery Store,Gym / Fitness Center,Trail
3198,51.438921,1,Coffee Shop,Pub,Café,Gym / Fitness Center,Supermarket,Train Station,Fish & Chips Shop,Pizza Place,Gastropub,Grocery Store
3432,51.448257,1,Pub,Indian Restaurant,Scenic Lookout,Café,Convenience Store,Bus Stop,Bookstore,Gastropub,Gas Station,Garden Center
3433,51.448531,1,Pub,Indian Restaurant,Scenic Lookout,Café,Convenience Store,Bus Stop,Bookstore,Gastropub,Gas Station,Garden Center
3434,51.448764,1,Pub,Indian Restaurant,Scenic Lookout,Café,Convenience Store,Bus Stop,Bookstore,Gastropub,Gas Station,Garden Center
3437,51.436446,1,Pub,Coffee Shop,Café,Gym / Fitness Center,Supermarket,Pharmacy,Pizza Place,Gastropub,Train Station,Fish & Chips Shop
3439,51.436552,1,Pub,Coffee Shop,Café,Gym / Fitness Center,Supermarket,Pharmacy,Pizza Place,Gastropub,Train Station,Fish & Chips Shop
3440,51.448701,1,Pub,Indian Restaurant,Scenic Lookout,Café,Convenience Store,Bus Stop,Bookstore,Gastropub,Gas Station,Garden Center
3442,51.448668,1,Pub,Indian Restaurant,Scenic Lookout,Café,Fast Food Restaurant,Gastropub,Gas Station,Garden Center,Garden,Forest


__Cluster 2__

In [209]:
ForestHill_merged.loc[ForestHill_merged['Cluster Labels'] == 2, ForestHill_merged.columns[[1] + list(range(5, ForestHill_merged.shape[1]))]]

Unnamed: 0,Latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3424,51.445738,2,Thrift / Vintage Store,Garden Center,Scenic Lookout,Nature Preserve,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden,Forest
3425,51.445143,2,Garden Center,Scenic Lookout,Pub,Nature Preserve,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden,Forest
3426,51.445271,2,Thrift / Vintage Store,Garden Center,Scenic Lookout,Nature Preserve,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden,Forest
3427,51.445199,2,Thrift / Vintage Store,Garden Center,Scenic Lookout,Nature Preserve,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden,Forest
3428,51.445227,2,Thrift / Vintage Store,Garden Center,Scenic Lookout,Food Truck,Nature Preserve,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden
3429,51.446739,2,Scenic Lookout,Food Truck,Train Station,Fast Food Restaurant,Gift Shop,Gastropub,Gas Station,Garden Center,Garden,Forest
3430,51.44596,2,Thrift / Vintage Store,Scenic Lookout,Food Truck,Nature Preserve,Convenience Store,Train Station,Fish & Chips Shop,Gastropub,Gas Station,Garden Center
3431,51.447505,2,Scenic Lookout,Café,Chinese Restaurant,Convenience Store,Nature Preserve,Train Station,Food,Gastropub,Gas Station,Garden Center
3435,51.446698,2,Scenic Lookout,Nature Preserve,Food Truck,Convenience Store,Train Station,Fish & Chips Shop,Gastropub,Gas Station,Garden Center,Garden
3436,51.446182,2,Scenic Lookout,Food Truck,Train Station,Fast Food Restaurant,Gift Shop,Gastropub,Gas Station,Garden Center,Garden,Forest


__Cluster 3__

In [210]:
ForestHill_merged.loc[ForestHill_merged['Cluster Labels'] == 3, ForestHill_merged.columns[[1] + list(range(5, ForestHill_merged.shape[1]))]]

Unnamed: 0,Latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3829,51.431397,3,Café,Grocery Store,Pharmacy,Gym / Fitness Center,Japanese Restaurant,Mobile Phone Shop,Convenience Store,Park,Gas Station,Garden Center
3838,51.431158,3,Café,Grocery Store,Pharmacy,Gym / Fitness Center,Japanese Restaurant,Mobile Phone Shop,Convenience Store,Park,Gas Station,Garden Center
3840,51.431652,3,Café,Grocery Store,Pharmacy,Japanese Restaurant,Mobile Phone Shop,Convenience Store,Park,Gastropub,Gas Station,Garden Center
3842,51.432143,3,Café,Grocery Store,Pharmacy,Japanese Restaurant,Mobile Phone Shop,Convenience Store,Park,Gastropub,Gas Station,Garden Center
3843,51.432309,3,Japanese Restaurant,Café,Mobile Phone Shop,Park,Pharmacy,Train Station,Food,Gas Station,Garden Center,Garden
3844,51.433482,3,Japanese Restaurant,Café,Forest,Park,Pharmacy,Train Station,Food,Gastropub,Gas Station,Garden Center
3845,51.432876,3,Japanese Restaurant,Café,Mobile Phone Shop,Park,Pharmacy,Train Station,Food,Gas Station,Garden Center,Garden
3846,51.433212,3,Japanese Restaurant,Café,Park,Pharmacy,Train Station,Food,Gastropub,Gas Station,Garden Center,Garden
3847,51.43426,3,Japanese Restaurant,Café,Forest,Park,Pharmacy,Train Station,Food,Gastropub,Gas Station,Garden Center
3848,51.431851,3,Café,Grocery Store,Pharmacy,Japanese Restaurant,Mobile Phone Shop,Convenience Store,Park,Gastropub,Gas Station,Garden Center


__Cluster 4__

In [211]:
ForestHill_merged.loc[ForestHill_merged['Cluster Labels'] == 4, ForestHill_merged.columns[[1] + list(range(5, ForestHill_merged.shape[1]))]]

Unnamed: 0,Latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3950,51.435478,4,Forest,Record Shop,Pub,Park,Pharmacy,Train Station,Fast Food Restaurant,Gas Station,Garden Center,Garden
4325,51.431653,4,Street Art,Forest,Pub,Convenience Store,Park,Train Station,Fish & Chips Shop,Gastropub,Gas Station,Garden Center
4327,51.431868,4,Café,Forest,Pub,Convenience Store,Park,Train Station,Fish & Chips Shop,Gastropub,Gas Station,Garden Center
4328,51.431439,4,Café,Forest,Pub,Convenience Store,Park,Train Station,Fish & Chips Shop,Gastropub,Gas Station,Garden Center
4331,51.432025,4,Street Art,Forest,Pub,Convenience Store,Park,Train Station,Fish & Chips Shop,Gastropub,Gas Station,Garden Center
4406,51.433427,4,Park,Forest,Pub,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden Center,Garden,Food Truck
4408,51.433668,4,Pub,Park,Forest,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden Center,Garden,Food Truck
4409,51.434268,4,Pub,Park,Forest,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden Center,Garden,Food Truck
4410,51.435338,4,Park,Forest,Pub,Train Station,Fast Food Restaurant,Gastropub,Gas Station,Garden Center,Garden,Food Truck
4411,51.436603,4,Playground,Bus Stop,Forest,Park,Fast Food Restaurant,Gastropub,Gas Station,Garden Center,Garden,Food Truck


### By analysing the clusters we can see that there are 5 quite different kinds of area in Forest Hill.

  * Cluster 0. Contains houses close to Indian restaurants, bus stops, train stations etc.
  * Cluster 1. For youngsters: highlights houses close to cafes, pubs, nightlife etc.
  * Cluster 2. Contains houses close to vintage stores, nature reserves etc.
  * Cluster 3. Contains houses close to Japanese restaurants, grocery stores, mobile phone shops etc.
  * Cluster 4. Suitable for those who like greenery, with forests, parks etc. nearby.
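
To turn the clusters into a shortlist, the merged table can be filtered for sectors whose top venues include the amenities a house hunter wants. A minimal sketch on made-up rows; `sectors_with` is a hypothetical helper, and the toy frame only mimics the layout of `ForestHill_merged`.

```python
import pandas as pd

# Toy rows mimicking ForestHill_merged (values are illustrative only)
toy = pd.DataFrame({
    'Sector': ['1HS', '3AA', '4XX'],
    'Cluster Labels': [1, 2, 4],
    '1st Most Common Venue': ['Café', 'Garden Center', 'Park'],
    '2nd Most Common Venue': ['Pub', 'Scenic Lookout', 'Forest'],
})


def sectors_with(df, wanted):
    """Return rows whose 'Most Common Venue' columns contain every wanted category."""
    venue_cols = [c for c in df.columns if c.endswith('Most Common Venue')]
    has_all = df[venue_cols].apply(lambda row: set(wanted) <= set(row), axis=1)
    return df[has_all]


print(sectors_with(toy, ['Café', 'Pub'])['Sector'].tolist())
```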

__By looking at this map, one can easily decide which part of Forest Hill they would like to live in. I hope this helps with the important problem of narrowing down the search for a house.__
