# Introduction

## Problem statement

Finding a place for a traveller who is migrating to London based on the average price of the houses and the venues at proximity to the streets of London

### Approach
A high level approach is as follows:

1) The travellers decides to travel/migrate to London and wants to explore more about the localities of London.
2) The ForeSquare website is scrapped for the top venues in the London Town
3) From this list of top venues the list is augmented with additional grographical data
4) Using this additional geographical data the top nearby venues are selects
5) A map is presented to the traveller showing the streets on town of london and based on the venues which streets are similar. 

### Steps followed
The aspect of the project includes:
1)Data Acquisition
2)Data Cleansing
3)Data Analysis
4)Machine Learning

# Data Section 

## Data sources 
The following data sources are used for the project 

1)The street level data from the public land registry data of London - http://prod.publicdata.landregistry.gov.uk.s3-website-eu-west-1.amazonaws.com/ 
2) The Geographic information of the streets of London using the geocoder 
3) Four Square API to get the nearby venues of the given street /address

# Methodology

## Data Acquisition:

The property registry data from UK for the streets are downloaded which has the price of the property etc. For the address from the public property registry data the Latitidues and Longitudes are fetched using the geocoder. 
Then we use the Four Square API to get the Venues nearby the latitudes or Longitudes or the streets. 

## Data Cleansing
The data from the UK property registry is cleansed to have only the requried variables and latitides are longitudes are merged and sent to four square API to get the near by venues. 

## Data Analysis 
The average household price from the UK property registry is set to be an affordable price and any property below the affordable price is used for further data analysis. 
For those streets which has affordable homes, the near by venues are got from the Four Square APi and sorted based on the number of venues and the count of venues. The Venues are categorized and pivoted and then merged back with the neighbourhood column dataset. The top 2 venues for each street is identified 

## Machine Learning
K mean clustering is applied to the data and seperated into 5 clusters based on the venue. The clustered data labels are then assigned back to the original dataset to identify the actual street or address. 

# Resuls 
The map shows the clusters in different colors and we can identify the streets using the Label name assigned in the dataset or the map. 

# Discussion and Conclusion 
In current world, migration is a common thing and people move from places to place and from time to time. This analysis gives us an idea of average price of house in London, their locality and what there surroundings look like which can be used by peope who wants to move to London. 

In [2]:
import numpy as np # data in a vectorized manner manipulation
import pandas as pd # data analsysis
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe
import json # JSON files manipulation
import requests # HTTP library
from bs4 import BeautifulSoup # scraping library

from sklearn.cluster import KMeans # clustering algorithm

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# !conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


In [2]:
# Download csv
# !wget -O landregistry.csv http://prod.publicdata.landregistry.gov.uk.s3-website-eu-west-1.amazonaws.com/pp-complete.csv

In [4]:
# Read the data for examination (Source: http://landregistry.data.gov.uk/)
df_ppd = pd.read_csv("./pp-2018.csv")


# Assign meaningful column names
df_ppd.columns = ['TUID', 'Price', 'Date_Transfer', 'Postcode', 'Prop_Type', 'Old_New', 'Duration', 'PAON', \
                  'SAON', 'Street', 'Locality', 'Town_City', 'District', 'County', 'PPD_Cat_Type', 'Record_Status']

df_ppd.head()

Unnamed: 0,TUID,Price,Date_Transfer,Postcode,Prop_Type,Old_New,Duration,PAON,SAON,Street,Locality,Town_City,District,County,PPD_Cat_Type,Record_Status
0,{7E86B6FA-70B8-458C-E053-6B04A8C0C84C},355000,2018-10-19 00:00,UB3 1DZ,F,Y,L,"BOILER HOUSE, 2",FLAT 54,MATERIAL WALK,,HAYES,HILLINGDON,GREATER LONDON,A,A
1,{7E86B6FA-70B9-458C-E053-6B04A8C0C84C},465000,2018-09-14 00:00,EN5 2FQ,F,Y,L,"DELPHI HOUSE, 4",FLAT 5,HERA AVENUE,,BARNET,BARNET,GREATER LONDON,A,A
2,{7E86B6FA-70BA-458C-E053-6B04A8C0C84C},540000,2018-09-14 00:00,EN5 2FQ,F,Y,L,"DELPHI HOUSE, 4",FLAT 17,HERA AVENUE,,BARNET,BARNET,GREATER LONDON,A,A
3,{7E86B6FA-70BB-458C-E053-6B04A8C0C84C},415000,2018-10-02 00:00,N13 5EX,F,Y,L,"HAZELTREE LODGE, 16 - 18",FLAT 9,HAZELWOOD LANE,,LONDON,ENFIELD,GREATER LONDON,A,A
4,{7E86B6FA-70BC-458C-E053-6B04A8C0C84C},470000,2018-09-17 00:00,EN5 2FQ,F,Y,L,"DELPHI HOUSE, 4",FLAT 21,HERA AVENUE,,BARNET,BARNET,GREATER LONDON,A,A


In [5]:
# Format the date column
print(df_ppd.dtypes)

TUID             object
Price             int64
Date_Transfer    object
Postcode         object
Prop_Type        object
Old_New          object
Duration         object
PAON             object
SAON             object
Street           object
Locality         object
Town_City        object
District         object
County           object
PPD_Cat_Type     object
Record_Status    object
dtype: object


In [6]:
df_ppd_city = df_ppd.query("Town_City == 'LONDON'")

# Make a list of street names in LONDON
streets = df_ppd_city['Street'].unique().tolist()

In [7]:
df_grp_price = df_ppd_city.groupby(['Street'])['Price'].mean().reset_index()

# Give meaningful names to the columns
df_grp_price.columns = ['Street', 'Avg_Price']

In [12]:
df_affordable = df_grp_price.query("(Avg_Price >= 2100000) & (Avg_Price <= 2200000)")
df_affordable.head()

Unnamed: 0,Street,Avg_Price
547,ASHCHURCH PARK VILLAS,2150000.0
665,AVENUE ROAD,2143471.0
753,BALLINGDON ROAD,2105000.0
1123,BERESFORD TERRACE,2100000.0
1422,BOSTON PLACE,2167500.0


In [10]:
from geopy.geocoders import Nominatim

nom = Nominatim(user_agent="my-application")

In [13]:
df_temp = pd.DataFrame()
df_temp['Full Address'] = df_affordable['Street'] + ', London'
df_temp['Full Address coords'] = df_temp['Full Address'].apply(nom.geocode)
df_temp.head()

Unnamed: 0,Full Address,Full Address coords
547,"ASHCHURCH PARK VILLAS, London","(Ashchurch Park Villas, Brook Green, London Bo..."
665,"AVENUE ROAD, London","(Avenue Road, Mackenzie Road, Penge, Bromley, ..."
753,"BALLINGDON ROAD, London","(Ballingdon Road, Balham, London Borough of Wa..."
1123,"BERESFORD TERRACE, London","(Beresford Terrace, Mildmay Park, Canonbury, L..."
1422,"BOSTON PLACE, London","(Boston Place, Marylebone, City of Westminster..."


In [14]:

df_temp['Latitude'] = df_temp['Full Address coords'].apply(lambda x : x.latitude if x != None else None)
df_temp['Longitude'] = df_temp['Full Address coords'].apply(lambda x : x.longitude if x != None else None)
df_temp.head()

Unnamed: 0,Full Address,Full Address coords,Latitude,Longitude
547,"ASHCHURCH PARK VILLAS, London","(Ashchurch Park Villas, Brook Green, London Bo...",51.500051,-0.242173
665,"AVENUE ROAD, London","(Avenue Road, Mackenzie Road, Penge, Bromley, ...",51.406797,-0.049519
753,"BALLINGDON ROAD, London","(Ballingdon Road, Balham, London Borough of Wa...",51.454189,-0.158856
1123,"BERESFORD TERRACE, London","(Beresford Terrace, Mildmay Park, Canonbury, L...",51.550294,-0.091434
1422,"BOSTON PLACE, London","(Boston Place, Marylebone, City of Westminster...",51.523997,-0.162951


In [15]:
# df_affordable.join(df_temp)
final = pd.concat([df_affordable, df_temp], axis=1, sort=False)
final = final.dropna()
print(final.dtypes)
final.head()

Street                  object
Avg_Price              float64
Full Address            object
Full Address coords     object
Latitude               float64
Longitude              float64
dtype: object


Unnamed: 0,Street,Avg_Price,Full Address,Full Address coords,Latitude,Longitude
547,ASHCHURCH PARK VILLAS,2150000.0,"ASHCHURCH PARK VILLAS, London","(Ashchurch Park Villas, Brook Green, London Bo...",51.500051,-0.242173
665,AVENUE ROAD,2143471.0,"AVENUE ROAD, London","(Avenue Road, Mackenzie Road, Penge, Bromley, ...",51.406797,-0.049519
753,BALLINGDON ROAD,2105000.0,"BALLINGDON ROAD, London","(Ballingdon Road, Balham, London Borough of Wa...",51.454189,-0.158856
1123,BERESFORD TERRACE,2100000.0,"BERESFORD TERRACE, London","(Beresford Terrace, Mildmay Park, Canonbury, L...",51.550294,-0.091434
1422,BOSTON PLACE,2167500.0,"BOSTON PLACE, London","(Boston Place, Marylebone, City of Westminster...",51.523997,-0.162951


In [16]:
city_loc = nom.geocode('London, UK')
print(city_loc.latitude, city_loc.longitude)

51.5073219 -0.1276474


In [17]:
map_london = folium.Map(location=[51.5073219, -0.1276474], zoom_start=11)

for location in final.itertuples(): #iterate each row of the dataframe
    label = 'Street: {};  Average Price: {};'.format(location[1], location[2])
    label = folium.Popup(label, parse_html=True)    
    folium.CircleMarker(
        [location[-2], location[-1]],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_london)
    
    
    
map_london

In [18]:
CLIENT_ID = 'Y2SDCM5ONE2XYX5RQPDJSYNJCNM3BQVGHF0Z0OCRDCWXKNZC' # your Foursquare ID
CLIENT_SECRET = 'QA4IVQZIZ4TLH4BZTZKWE3Y1JKGVL12SHMDE2IBP43VM45ZS' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)



Your credentails:
CLIENT_ID: Y2SDCM5ONE2XYX5RQPDJSYNJCNM3BQVGHF0Z0OCRDCWXKNZC
CLIENT_SECRET:QA4IVQZIZ4TLH4BZTZKWE3Y1JKGVL12SHMDE2IBP43VM45ZS


In [None]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Street', 
                  'Street Latitude', 
                  'Street Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [20]:
location_venues = getNearbyVenues(names=final['Street'],
                                   latitudes=final['Latitude'],
                                   longitudes=final['Longitude']
                                  )



ASHCHURCH PARK VILLAS
AVENUE ROAD
BALLINGDON ROAD
BERESFORD TERRACE
BOSTON PLACE
BRACKENBURY GARDENS
BRAMSHOT AVENUE
BROOKFIELD PARK
BROWNING CLOSE
BRYANSTON SQUARE
CANFIELD GARDENS
CARLISLE ROAD
CARLYLE CLOSE
CHANCE STREET
CLEVELAND SQUARE
COLINETTE ROAD
COLLEGE CROSS
COTSWOLD MEWS
CRANLEY MEWS
CUMBERLAND TERRACE
DIANA ROAD
DRYBURGH ROAD
DUCHESS WALK
ECCLESTON MEWS
EGERTON PLACE
FLORAL STREET
GREAT RUSSELL STREET
GREEN CLOSE
GROSVENOR GARDENS
HAMBLEDON PLACE
HENDERSON ROAD
HEWER STREET
HIGHLEVER ROAD
HILLGATE PLACE
HOBHOUSE COURT
IVOR PLACE
LEINSTER MEWS
LONG LANE
MANSON MEWS
MARGIN DRIVE
MAUNSEL STREET
MEADOWBANK
MILFORD LANE
MOLYNEUX STREET
MOORGATE
MUSEUM STREET
ONSLOW CRESCENT
PARKFIELDS
PAVILION ROAD
PENCOMBE MEWS
PLAYHOUSE YARD
PUTNEY HEATH LANE
QUEENS GATE GARDENS
RACTON ROAD
RANDOLPH MEWS
SHELDON SQUARE
ST JAMES'S STREET
STAFFORD STREET
TANNER STREET
TIERNEY LANE
TITE STREET
WINTERBROOK ROAD
WOODBOROUGH ROAD


In [23]:
location_venues.head()

Unnamed: 0,Street,Street Latitude,Street Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,ASHCHURCH PARK VILLAS,51.500051,-0.242173,Detour Café,51.502391,-0.242705,Coffee Shop
1,ASHCHURCH PARK VILLAS,51.500051,-0.242173,The Eagle,51.5006,-0.239431,Pub
2,ASHCHURCH PARK VILLAS,51.500051,-0.242173,Som Tam House,51.502508,-0.242832,Thai Restaurant
3,ASHCHURCH PARK VILLAS,51.500051,-0.242173,Laveli Bakery,51.503074,-0.24316,Bakery
4,ASHCHURCH PARK VILLAS,51.500051,-0.242173,Ravenscourt Park,51.496614,-0.238652,Park


In [21]:
print('{} venues were returned.'.format(location_venues.shape[0]))
print('there were {} unique venue categories'.format(len(location_venues['Venue Category'].unique())))

3056 venues were returned.
there were 269 unique venue categories


In [22]:
venues_in_each = location_venues.groupby('Street').count()
venues_in_each = pd.DataFrame(list(zip(venues_in_each.index, venues_in_each['Venue'])), columns=['Street', 'num of Venues']).set_index('Street').join(final[['Street']].set_index('Street'), on='Street').reset_index()
# venues_in_each['Distance'] = np.int64(venues_in_each['Distance'])

print(venues_in_each.head())

venues_in_each[['num of Venues']].describe()

                  Street  num of Venues
0  ASHCHURCH PARK VILLAS             28
1            AVENUE ROAD              4
2        BALLINGDON ROAD             12
3      BERESFORD TERRACE             28
4           BOSTON PLACE             90


Unnamed: 0,num of Venues
count,63.0
mean,48.507937
std,37.351371
min,3.0
25%,12.5
50%,39.0
75%,91.5
max,100.0


In [23]:
# one hot encoding
streets_onehot = pd.get_dummies(location_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
streets_onehot['Street'] = location_venues['Street'] 

# move neighborhood column to the first column
fixed_columns = [streets_onehot.columns[-1]] + list(streets_onehot.columns[:-1])
streets_onehot = streets_onehot[fixed_columns]

streets_onehot.head()

Unnamed: 0,Street,Accessories Store,African Restaurant,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Austrian Restaurant,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Boxing Gym,Brasserie,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Café,Camera Store,Canal,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Casino,Castle,Champagne Bar,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Quad,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Cricket Ground,Cultural Center,Cupcake Shop,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distillery,Donut Shop,Dry Cleaner,Electronics Store,English Restaurant,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,General College & University,General Entertainment,German Restaurant,Gift Shop,Gourmet Shop,Greek Restaurant,Grilled Meat Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Harbor / Marina,Hardware Store,Health & Beauty Service,Herbs & Spices Store,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Iraqi Restaurant,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Kurdish Restaurant,Lake,Latin American Restaurant,Lebanese Restaurant,Light Rail Station,Liquor Store,Lounge,Malay Restaurant,Market,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Monument / Landmark,Moroccan Restaurant,Motorcycle Shop,Movie Theater,Multiplex,Museum,Music Store,Music Venue,Nature Preserve,Neighborhood,Nightclub,Noodle House,North Indian Restaurant,Okonomiyaki Restaurant,Opera House,Organic Grocery,Outdoor Event Space,Outdoor Sculpture,Outdoor Supply Store,Outdoors & Recreation,Pakistani Restaurant,Palace,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Perfume Shop,Persian Restaurant,Peruvian Restaurant,Pet Store,Pharmacy,Photography Lab,Pie Shop,Pier,Pilates Studio,Pizza Place,Platform,Playground,Plaza,Poke Place,Polish Restaurant,Portuguese Restaurant,Print Shop,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Residential Building (Apartment / Condo),Restaurant,Road,Roof Deck,Rugby Pitch,Russian Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Science Museum,Scottish Restaurant,Seafood Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skating Rink,Snack Place,Soba Restaurant,Soccer Field,Social Club,South American Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Stationery Store,Steakhouse,Street Food Gathering,Supermarket,Sushi Restaurant,Szechuan Restaurant,Tailor Shop,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Tiki Bar,Tour Provider,Toy / Game Store,Trail,Train Station,Tram Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Warehouse Store,Watch Shop,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo Exhibit
0,ASHCHURCH PARK VILLAS,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,ASHCHURCH PARK VILLAS,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,ASHCHURCH PARK VILLAS,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,ASHCHURCH PARK VILLAS,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,ASHCHURCH PARK VILLAS,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [24]:
streets_grouped = streets_onehot.groupby('Street').mean().reset_index()
print(streets_grouped.shape)
streets_grouped

(63, 270)


Unnamed: 0,Street,Accessories Store,African Restaurant,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Austrian Restaurant,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Boxing Gym,Brasserie,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Café,Camera Store,Canal,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Casino,Castle,Champagne Bar,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Quad,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Cricket Ground,Cultural Center,Cupcake Shop,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distillery,Donut Shop,Dry Cleaner,Electronics Store,English Restaurant,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,General College & University,General Entertainment,German Restaurant,Gift Shop,Gourmet Shop,Greek Restaurant,Grilled Meat Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Harbor / Marina,Hardware Store,Health & Beauty Service,Herbs & Spices Store,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Iraqi Restaurant,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Kurdish Restaurant,Lake,Latin American Restaurant,Lebanese Restaurant,Light Rail Station,Liquor Store,Lounge,Malay Restaurant,Market,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Monument / Landmark,Moroccan Restaurant,Motorcycle Shop,Movie Theater,Multiplex,Museum,Music Store,Music Venue,Nature Preserve,Neighborhood,Nightclub,Noodle House,North Indian Restaurant,Okonomiyaki Restaurant,Opera House,Organic Grocery,Outdoor Event Space,Outdoor Sculpture,Outdoor Supply Store,Outdoors & Recreation,Pakistani Restaurant,Palace,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Perfume Shop,Persian Restaurant,Peruvian Restaurant,Pet Store,Pharmacy,Photography Lab,Pie Shop,Pier,Pilates Studio,Pizza Place,Platform,Playground,Plaza,Poke Place,Polish Restaurant,Portuguese Restaurant,Print Shop,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Residential Building (Apartment / Condo),Restaurant,Road,Roof Deck,Rugby Pitch,Russian Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Science Museum,Scottish Restaurant,Seafood Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skating Rink,Snack Place,Soba Restaurant,Soccer Field,Social Club,South American Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Stationery Store,Steakhouse,Street Food Gathering,Supermarket,Sushi Restaurant,Szechuan Restaurant,Tailor Shop,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Tiki Bar,Tour Provider,Toy / Game Store,Trail,Train Station,Tram Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Warehouse Store,Watch Shop,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo Exhibit
0,ASHCHURCH PARK VILLAS,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.071429,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.107143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.035714,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.107143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.107143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0
1,AVENUE ROAD,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,BALLINGDON ROAD,0.083333,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,BERESFORD TERRACE,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.107143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.178571,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,BOSTON PLACE,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.011111,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.011111,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.011111,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.011111,0.011111,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.044444,0.0,0.011111,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.022222,0.011111,0.055556,0.011111,0.011111,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.033333,0.022222,0.0,0.0,0.0,0.0,0.011111,0.0,0.055556,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.011111,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.011111,0.0,0.022222,0.022222,0.011111,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0
5,BRACKENBURY GARDENS,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.117647,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.117647,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.058824,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.117647,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,BRAMSHOT AVENUE,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,BROOKFIELD PARK,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,BROWNING CLOSE,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,BRYANSTON SQUARE,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.01,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.11,0.02,0.01,0.01,0.0,0.01,0.0,0.0,0.04,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.02,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.0


In [25]:
num_top_venues = 2

for hood in streets_grouped['Street']:
    print("----"+hood+"----")
    temp = streets_grouped[streets_grouped['Street'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----ASHCHURCH PARK VILLAS----
           venue  freq
0  Grocery Store  0.14
1            Pub  0.11


----AVENUE ROAD----
           venue  freq
0           Park  0.25
1  Grocery Store  0.25


----BALLINGDON ROAD----
  venue  freq
0   Pub  0.25
1  Café  0.25


----BERESFORD TERRACE----
        venue  freq
0         Pub  0.18
1  Restaurant  0.14


----BOSTON PLACE----
         venue  freq
0  Coffee Shop  0.07
1         Café  0.07


----BRACKENBURY GARDENS----
           venue  freq
0    Coffee Shop  0.12
1  Grocery Store  0.12


----BRAMSHOT AVENUE----
                    venue  freq
0                Bus Stop  0.29
1  Furniture / Home Store  0.14


----BROOKFIELD PARK----
                      venue  freq
0                       Pub  0.20
1  Mediterranean Restaurant  0.13


----BROWNING CLOSE----
                        venue  freq
0                Home Service  0.17
1  Construction & Landscaping  0.17


----BRYANSTON SQUARE----
                 venue  freq
0                Hotel  0.11
1

In [26]:
# set number of clusters
kclusters = 5

streets_grouped_clustering = streets_grouped.drop('Street', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(streets_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:38] 

array([0, 0, 0, 4, 4, 0, 4, 0, 4, 4, 4, 0, 4, 4, 4, 0, 4, 4, 4, 0, 0, 0,
       4, 4, 4, 4, 4, 1, 4, 0, 3, 0, 0, 4, 4, 4, 4, 4], dtype=int32)

In [27]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [28]:
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Street']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Street'] = streets_grouped['Street']

for ind in np.arange(streets_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(streets_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Street,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,ASHCHURCH PARK VILLAS,Grocery Store,Pub,Coffee Shop,Park,Indian Restaurant
1,AVENUE ROAD,Park,Tapas Restaurant,Tram Station,Grocery Store,Zoo Exhibit
2,BALLINGDON ROAD,Café,Pub,Accessories Store,Antique Shop,Italian Restaurant
3,BERESFORD TERRACE,Pub,Restaurant,Café,Turkish Restaurant,Pizza Place
4,BOSTON PLACE,Café,Coffee Shop,Hotel,Pub,Grocery Store


In [29]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

streets_merged = final

# merge toronto_grouped with toronto_data to add latitude/longitude for each neighborhood
streets_merged = streets_merged.join(neighborhoods_venues_sorted.set_index('Street'), on='Street')

streets_merged

Unnamed: 0,Street,Avg_Price,Full Address,Full Address coords,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
547,ASHCHURCH PARK VILLAS,2150000.0,"ASHCHURCH PARK VILLAS, London","(Ashchurch Park Villas, Brook Green, London Bo...",51.500051,-0.242173,0,Grocery Store,Pub,Coffee Shop,Park,Indian Restaurant
665,AVENUE ROAD,2143471.0,"AVENUE ROAD, London","(Avenue Road, Mackenzie Road, Penge, Bromley, ...",51.406797,-0.049519,0,Park,Tapas Restaurant,Tram Station,Grocery Store,Zoo Exhibit
753,BALLINGDON ROAD,2105000.0,"BALLINGDON ROAD, London","(Ballingdon Road, Balham, London Borough of Wa...",51.454189,-0.158856,0,Café,Pub,Accessories Store,Antique Shop,Italian Restaurant
1123,BERESFORD TERRACE,2100000.0,"BERESFORD TERRACE, London","(Beresford Terrace, Mildmay Park, Canonbury, L...",51.550294,-0.091434,4,Pub,Restaurant,Café,Turkish Restaurant,Pizza Place
1422,BOSTON PLACE,2167500.0,"BOSTON PLACE, London","(Boston Place, Marylebone, City of Westminster...",51.523997,-0.162951,4,Café,Coffee Shop,Hotel,Pub,Grocery Store
1502,BRACKENBURY GARDENS,2150000.0,"BRACKENBURY GARDENS, London","(Brackenbury Gardens, Brook Green, London Boro...",51.500623,-0.230729,0,Grocery Store,Coffee Shop,Pub,Fish & Chips Shop,Gastropub
1546,BRAMSHOT AVENUE,2177900.0,"BRAMSHOT AVENUE, London","(Bramshot Avenue, East Greenwich, Greenwich, L...",51.48116,0.022652,4,Bus Stop,Spa,Rugby Pitch,Furniture / Home Store,Train Station
1762,BROOKFIELD PARK,2150000.0,"BROOKFIELD PARK, London","(Brookfield Park, Tufnell Park, London Borough...",51.561811,-0.146356,0,Pub,Mediterranean Restaurant,Grocery Store,Coffee Shop,Café
1810,BROWNING CLOSE,2160000.0,"BROWNING CLOSE, London","(Browning Close, Collier Row, London Borough o...",51.599607,0.14913,4,Gym,Construction & Landscaping,Home Service,Flea Market,Print Shop
1854,BRYANSTON SQUARE,2197583.0,"BRYANSTON SQUARE, London","(Bryanston Square, Marylebone, City of Westmin...",51.517067,-0.160365,4,Hotel,Middle Eastern Restaurant,Lebanese Restaurant,Coffee Shop,Italian Restaurant


In [30]:
streets_merged.loc[streets_merged['Cluster Labels'] == 0, streets_merged.columns[[1] + list(range(5, streets_merged.shape[1]))]]

Unnamed: 0,Avg_Price,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
547,2150000.0,-0.242173,0,Grocery Store,Pub,Coffee Shop,Park,Indian Restaurant
665,2143471.0,-0.049519,0,Park,Tapas Restaurant,Tram Station,Grocery Store,Zoo Exhibit
753,2105000.0,-0.158856,0,Café,Pub,Accessories Store,Antique Shop,Italian Restaurant
1502,2150000.0,-0.230729,0,Grocery Store,Coffee Shop,Pub,Fish & Chips Shop,Gastropub
1762,2150000.0,-0.146356,0,Pub,Mediterranean Restaurant,Grocery Store,Coffee Shop,Café
2211,2200000.0,-0.209667,0,Café,Coffee Shop,Pub,Park,Tennis Court
2970,2124375.0,-0.229843,0,Grocery Store,Convenience Store,Tennis Court,Gym / Fitness Center,Coffee Shop
3505,2108333.0,-0.146098,0,Zoo Exhibit,Pub,Fountain,Beer Bar,Park
3832,2125000.0,-0.023588,0,Grocery Store,Pub,Café,Liquor Store,Vegetarian / Vegan Restaurant
4007,2165625.0,-0.230559,0,Convenience Store,Grocery Store,Pub,Gastropub,Gym / Fitness Center


In [31]:
streets_merged.loc[streets_merged['Cluster Labels'] == 1, streets_merged.columns[[1] + list(range(5, streets_merged.shape[1]))]]

Unnamed: 0,Avg_Price,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
5565,2102667.0,-0.182441,1,Photography Lab,Fruit & Vegetable Store,Cosmetics Shop,Field,Flea Market


In [32]:
streets_merged.loc[streets_merged['Cluster Labels'] == 2, streets_merged.columns[[1] + list(range(5, streets_merged.shape[1]))]]

Unnamed: 0,Avg_Price,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
12552,2120500.0,-73.323612,2,Food Truck,Botanical Garden,Flower Shop,Farm,Farmers Market


In [33]:
streets_merged.loc[streets_merged['Cluster Labels'] == 3, streets_merged.columns[[1] + list(range(5, streets_merged.shape[1]))]]

Unnamed: 0,Avg_Price,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
6148,2146000.0,0.031668,3,Grocery Store,Fast Food Restaurant,Ice Cream Shop,Bus Stop,Pub
9589,2200000.0,-0.037249,3,Indian Restaurant,Grocery Store,Antique Shop,Fast Food Restaurant,Flea Market


In [34]:
streets_merged.loc[streets_merged['Cluster Labels'] == 4, streets_merged.columns[[1] + list(range(5, streets_merged.shape[1]))]]

Unnamed: 0,Avg_Price,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
1123,2100000.0,-0.091434,4,Pub,Restaurant,Café,Turkish Restaurant,Pizza Place
1422,2167500.0,-0.162951,4,Café,Coffee Shop,Hotel,Pub,Grocery Store
1546,2177900.0,0.022652,4,Bus Stop,Spa,Rugby Pitch,Furniture / Home Store,Train Station
1810,2160000.0,0.14913,4,Gym,Construction & Landscaping,Home Service,Flea Market,Print Shop
1854,2197583.0,-0.160365,4,Hotel,Middle Eastern Restaurant,Lebanese Restaurant,Coffee Shop,Italian Restaurant
2137,2188333.0,-0.179709,4,Coffee Shop,Café,Italian Restaurant,Pizza Place,Grocery Store
2227,2175000.0,-0.258852,4,Bed & Breakfast,Pub,Outdoors & Recreation,Grocery Store,Train Station
2421,2177000.0,-0.075103,4,Pizza Place,Café,Restaurant,Coffee Shop,Cocktail Bar
2839,2150000.0,-0.183167,4,Hotel,Pub,Café,Coffee Shop,Grocery Store
2978,2185000.0,-0.105443,4,Pub,Park,Mediterranean Restaurant,Cocktail Bar,Bakery


In [35]:
import math

# create map
map_clusters = folium.Map(location=[city_loc.latitude, city_loc.longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, .2, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

markers_colors = []
for lat, lon, poi, cluster in zip(streets_merged['Latitude'], streets_merged['Longitude'], streets_merged['Street'], streets_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
            [lat, lon],
            radius=5,
            popup=label,
            color=rainbow[int(cluster)-1],
            fill=True,
            fill_color=rainbow[int(cluster)-1],
            fill_opacity=1).add_to(map_clusters)


map_clusters