# 1. Introduction

### 1.1 Introduction to the project
Singapore is a sovereign city-state and island country in Southeast Asia. Toronto, on the other hand, is the most populous city in Canada, in North America. One might say that the two cities are more different than they are similar, from their cultures to their locations. However, the sizes of both cities and their populations are in the same range. Furthermore, their quality of living is comparable, and there is a sizable number of Singaporeans living in Toronto and vice versa.


### 1.2 Aim

The aim of this project is to compare the most common venues that people like to visit in the City of Toronto and in Singapore. Location data was used to explore the geography of Singapore, and the key insights gathered are presented in this report. The report is mainly targeted at Canadians currently living in Toronto who are thinking of relocating to Singapore, and at Singaporeans who want to relocate to Toronto. It will hopefully provide the key insights and information that these people need to make an informed decision.

# 2. Methodology

In this project, the Foursquare location data API was used to explore the geographical locations of Singapore. First, data was scraped from the Wikipedia page https://en.wikipedia.org/wiki/Postal_codes_in_Singapore.

In [1]:
#import all libraries needed for project

import numpy as np #numpy library to vectorise data

import pandas as pd #pandas library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json #json library to handle the json files that will be received

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests #to handle requests

from pandas import json_normalize #transform JSON file into a pandas dataframe (pandas.io.json.json_normalize is deprecated since pandas 1.0)

#import matplotlib
import matplotlib.cm as cm
import matplotlib.colors as colors
import matplotlib.pyplot as plt
%matplotlib inline 

#import kmeans from sklearn
from sklearn.cluster import KMeans
import folium #map library to render maps
from bs4 import BeautifulSoup

print ('Libraries imported!')

Libraries imported!


### 2.1 Data retrieval

The Beautiful Soup Python package was used to retrieve the data from Wikipedia, and the pandas library was used to process the retrieved data, separating Singapore into locations by postal sector. The geocoder library was also used to retrieve the latitudes and longitudes of the various postal sectors; in this notebook they are loaded from a CSV file (`sg_geodata.csv`). The folium map package was then used to plot the locations on a map of Singapore for visualisation purposes. Lastly, the Foursquare API was used to retrieve the categories of the venues near each location, and pandas was used again to combine the two tables and sort the venues by category.

In [2]:
#use requests and BeautifulSoup to download and parse the Wikipedia page
data = requests.get('https://en.wikipedia.org/wiki/Postal_codes_in_Singapore').text
soup = BeautifulSoup(data,'html.parser') #the page is HTML, so an HTML parser is used rather than 'xml'
table = soup.find('table')

df = pd.DataFrame(columns = ['Postal district','Postal sector','Generallocation'])

for tr in table.find_all('tr'): #search through entire data to find only the wanted data from table
    row=[]
    for td in tr.find_all('td'):
        row.append(td.text.strip())
    if len(row)==3: #keep only rows with exactly 3 cells (district, sector, location)
        df.loc[len(df)] = row

df #checked with website

Unnamed: 0,Postal district,Postal sector,Generallocation
0,1,"01, 02, 03, 04, 05, 06","Raffles Place, Cecil, Marina, People's Park"
1,2,"07, 08","Anson, Tanjong Pagar"
2,3,"14, 15, 16","Bukit Merah, Queenstown, Tiong Bahru"
3,4,"09, 10","Telok Blangah, Harbourfront"
4,5,"11, 12, 13","Pasir Panjang, Hong Leong Garden, Clementi New..."
5,6,17,"High Street, Beach Road (part)"
6,7,"18, 19","Middle Road, Golden Mile"
7,8,"20, 21","Little India, Farrer Park, Jalan Besar, Lavender"
8,9,"22, 23","Orchard, Cairnhill, River Valley"
9,10,"24, 25, 26, 27","Ardmore, Bukit Timah, Holland Road, Tanglin"


In [3]:
df.shape

(28, 3)
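
The row filter used above (keep a table row only when it has exactly three `<td>` cells) can be tried offline on a small inline table. The sketch below is only an illustration: it swaps BeautifulSoup for the standard library's `html.parser` so that it runs without any download, and the class name `ThreeCellRowParser` and the sample HTML are made up for this example.

```python
from html.parser import HTMLParser


class ThreeCellRowParser(HTMLParser):
    """Collect table rows that contain exactly three <td> cells."""

    def __init__(self):
        super().__init__()
        self.rows = []     # accepted rows, each a list of three strings
        self._row = None   # cells of the row being read (None outside <tr>)
        self._cell = None  # text pieces of the cell being read (None outside <td>)

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if len(self._row) == 3:  # same rule as the notebook cell
                self.rows.append(self._row)
            self._row = None


# Hypothetical sample: a header row (<th> cells), two data rows, one footnote row.
SAMPLE_HTML = """
<table>
  <tr><th>Postal district</th><th>Postal sector</th><th>General location</th></tr>
  <tr><td>2</td><td>07, 08</td><td>Anson, Tanjong Pagar</td></tr>
  <tr><td>6</td><td>17</td><td>High Street, Beach Road (part)</td></tr>
  <tr><td colspan="3">footnote row</td></tr>
</table>
"""

parser = ThreeCellRowParser()
parser.feed(SAMPLE_HTML)
rows = parser.rows
print(rows)
```

The header row is skipped because it has no `<td>` cells, and the footnote row is skipped because it has only one, which is why the rule is a convenient way to isolate the data rows.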

In [4]:
sg_geodata = pd.read_csv('sg_geodata.csv')
sg_geodata.head()

Unnamed: 0,PostalCode,Latitude,Longitude
0,"01, 02, 03, 04, 05, 06",1.28372,103.851013
1,"07, 08",1.27372,103.84361
2,"14, 15, 16",1.28531,103.832619
3,"09, 10",1.26535,103.818848
4,"11, 12, 13",1.2898,103.785332
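
The notebook loads the coordinates from `sg_geodata.csv` without showing how that file was built. A minimal sketch of one way such a table could be produced is given below. Everything in it is an assumption rather than the author's method: the helper name `geocode_sectors`, the choice of geocoding the general location name with a ", Singapore" suffix, and the stub geocoder that stands in for a real service so the example runs offline.

```python
from types import SimpleNamespace


def geocode_sectors(pairs, geocode, suffix=", Singapore"):
    """Return one record per (postal_sector, place_name) pair.

    `geocode` is any callable that maps a query string to an object with
    `.latitude` and `.longitude` attributes, or to None when nothing is
    found (geopy's `Nominatim(...).geocode` behaves this way).
    """
    records = []
    for sector, place in pairs:
        location = geocode(place + suffix)
        records.append({
            "PostalCode": sector,
            "Latitude": None if location is None else location.latitude,
            "Longitude": None if location is None else location.longitude,
        })
    return records


def fake_geocode(query):
    """Stub geocoder for the demo; the coordinates are the ones shown above."""
    known = {
        "Anson, Tanjong Pagar, Singapore": SimpleNamespace(latitude=1.27372, longitude=103.84361),
    }
    return known.get(query)


records = geocode_sectors(
    [("07, 08", "Anson, Tanjong Pagar"), ("99", "Nowhere")],  # second pair is deliberately unknown
    fake_geocode,
)
print(records)
```

With geopy, the stub could be replaced by `Nominatim(user_agent=...).geocode`, ideally wrapped in `geopy.extra.rate_limiter.RateLimiter` to space out requests. Rows whose coordinates come back as `None` should be checked by hand before merging.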


In [5]:
sg_geodata.rename(columns = {'PostalCode':'Postal sector'},inplace = True)
sg_geodata_2 = pd.merge(df, sg_geodata, on = 'Postal sector')
sg_geodata_2.head()

Unnamed: 0,Postal district,Postal sector,Generallocation,Latitude,Longitude
0,1,"01, 02, 03, 04, 05, 06","Raffles Place, Cecil, Marina, People's Park",1.28372,103.851013
1,2,"07, 08","Anson, Tanjong Pagar",1.27372,103.84361
2,3,"14, 15, 16","Bukit Merah, Queenstown, Tiong Bahru",1.28531,103.832619
3,4,"09, 10","Telok Blangah, Harbourfront",1.26535,103.818848
4,5,"11, 12, 13","Pasir Panjang, Hong Leong Garden, Clementi New...",1.2898,103.785332


In [6]:
address = 'Singapore'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude_sg = location.latitude
longitude_sg = location.longitude
print('The geographical coordinates of Singapore are {}, {}.'.format(latitude_sg, longitude_sg))

The geographical coordinates of Singapore are 1.357107, 103.8194992.


In [7]:
#create a map of Singapore using latitude and longitude values
map_singapore = folium.Map(location=[latitude_sg, longitude_sg], zoom_start=11)

# add markers to map
for lat, lng, genloc in zip(sg_geodata_2['Latitude'], sg_geodata_2['Longitude'], sg_geodata_2['Generallocation']):
    label = '{}'.format(genloc)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_singapore)  
    
map_singapore

In [8]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID (redacted; never publish real credentials)
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret (redacted)
VERSION = '20180605' # Foursquare API version 
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_CLIENT_ID
CLIENT_SECRET:YOUR_CLIENT_SECRET


In [9]:
LIMIT = 100 # limit of number of venues returned by Foursquare API 
radius = 200 # search radius in metres; note it is not passed to getNearbyVenues below, which falls back to its default of 500

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
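
Two details of `getNearbyVenues` can be made a little more defensive: building the query string from a dictionary instead of a long format string, and guarding against a venue that comes back with an empty `categories` list (which would raise an `IndexError` in the list comprehension above). The sketch below shows both under stated assumptions: the helper names `build_explore_url` and `first_category_name` and the `"Uncategorised"` default are hypothetical, and only the endpoint and parameter names are taken from the cell above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint copied from the notebook cell above.
EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the explore URL; urlencode takes care of escaping each value."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    return EXPLORE_ENDPOINT + "?" + urlencode(params)


def first_category_name(venue, default="Uncategorised"):
    """Return the first category name, or a default if the venue has none."""
    categories = venue.get("categories") or []
    return categories[0]["name"] if categories else default


# Placeholder credentials; real ones should come from a private config.
url = build_explore_url("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", "20180605",
                        1.28372, 103.851013, radius=500, limit=100)
query = parse_qs(urlsplit(url).query)
print(query["ll"], query["radius"])
print(first_category_name({"categories": []}))
```

Passing `radius` explicitly at the call site also avoids silently relying on the function's default value.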

In [10]:
sg_venues = getNearbyVenues(names = sg_geodata_2['Generallocation'],
                                   latitudes = sg_geodata_2['Latitude'],
                                   longitudes = sg_geodata_2['Longitude']
                                  )

Raffles Place, Cecil, Marina, People's Park
Anson, Tanjong Pagar
Bukit Merah, Queenstown, Tiong Bahru
Telok Blangah, Harbourfront
Pasir Panjang, Hong Leong Garden, Clementi New Town
High Street, Beach Road (part)
Middle Road, Golden Mile
Little India, Farrer Park, Jalan Besar, Lavender
Orchard, Cairnhill, River Valley
Ardmore, Bukit Timah, Holland Road, Tanglin
Watten Estate, Novena, Thomson
Balestier, Toa Payoh, Serangoon
Macpherson, Braddell
Geylang, Eunos
Katong, Joo Chiat, Amber Road
Bedok, Upper East Coast, Eastwood, Kew Drive
Loyang, Changi
Simei, Tampines, Pasir Ris
Serangoon Garden, Hougang, Punggol
Bishan, Ang Mo Kio
Upper Bukit Timah, Clementi Park, Ulu Pandan
Jurong, Tuas
Hillview, Dairy Farm, Bukit Panjang, Choa Chu Kang
Lim Chu Kang, Tengah
Kranji, Woodgrove, Woodlands
Upper Thomson, Springleaf
Yishun, Sembawang
Seletar


In [11]:
print(sg_venues.shape)
sg_venues.head()

(1146, 7)


Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Raffles Place, Cecil, Marina, People's Park",1.28372,103.851013,CITY Hot Pot Shabu shabu,1.284173,103.851585,Hotpot Restaurant
1,"Raffles Place, Cecil, Marina, People's Park",1.28372,103.851013,Virgin Active,1.284608,103.850815,Gym / Fitness Center
2,"Raffles Place, Cecil, Marina, People's Park",1.28372,103.851013,The Salad Shop,1.285523,103.851177,Salad Place
3,"Raffles Place, Cecil, Marina, People's Park",1.28372,103.851013,CULINARYON,1.284876,103.850933,Comfort Food Restaurant
4,"Raffles Place, Cecil, Marina, People's Park",1.28372,103.851013,The Fullerton Bay Hotel,1.283878,103.853314,Hotel


In [12]:
sg_venues.groupby('Neighbourhood').count()

Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"Anson, Tanjong Pagar",100,100,100,100,100,100
"Ardmore, Bukit Timah, Holland Road, Tanglin",5,5,5,5,5,5
"Balestier, Toa Payoh, Serangoon",25,25,25,25,25,25
"Bedok, Upper East Coast, Eastwood, Kew Drive",58,58,58,58,58,58
"Bishan, Ang Mo Kio",13,13,13,13,13,13
"Bukit Merah, Queenstown, Tiong Bahru",97,97,97,97,97,97
"Geylang, Eunos",75,75,75,75,75,75
"High Street, Beach Road (part)",68,68,68,68,68,68
"Hillview, Dairy Farm, Bukit Panjang, Choa Chu Kang",24,24,24,24,24,24
"Jurong, Tuas",18,18,18,18,18,18


In [13]:
#count the unique venue categories
print('There are {} unique categories.'.format(len(sg_venues['Venue Category'].unique())))

There are 197 unique categories.


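### 2.2 Analysing each area

One-hot encoding was used to turn each venue category into a 0/1 indicator column for ease of analysis. Averaging these indicators per location gives the relative frequency of each category, and the top 10 venue categories within each location were then listed.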
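
To make the idea concrete, here is a small self-contained sketch on made-up data (the neighbourhood names `A` and `B` and their venues are invented for illustration). It shows that the mean of the one-hot indicators within a neighbourhood is simply the share of that neighbourhood's venues falling in each category.

```python
import pandas as pd

# Toy data: four venues in neighbourhood A, two in neighbourhood B.
venues = pd.DataFrame({
    "Neighbourhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Bakery", "Bakery", "Hotel", "Park", "Park", "Park"],
})

# One indicator column per category (1.0 where the venue has that category).
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighbourhood", venues["Neighbourhood"])

# The per-neighbourhood mean of each indicator is the category's relative frequency.
grouped = onehot.groupby("Neighbourhood").mean().reset_index()
print(grouped)
```

Each row of `grouped` sums to 1 across the category columns, so neighbourhoods with very different venue counts become directly comparable. The flip side is that a neighbourhood with only a handful of venues gets coarse frequencies.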

In [14]:
#analyse neighbourhoods using one-hot encoding
sg_onehot = pd.get_dummies(sg_venues[['Venue Category']], prefix="", prefix_sep="") 
# add the neighbourhood column back to the dataframe (it becomes the last column)
sg_onehot['Neighbourhood'] = sg_venues['Neighbourhood']  
# column order with Neighbourhood first; this list is not applied, so the displayed dataframe keeps Neighbourhood last
fixed_columns = [sg_onehot.columns[-1]] + list(sg_onehot.columns[:-1]) 
sg_onehot.head()

Unnamed: 0,ATM,Accessories Store,Airport,Airport Lounge,Airport Terminal,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Beer Bar,Beer Garden,Bistro,Bookstore,Border Crossing,Boutique,Bowling Green,Breakfast Spot,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Cafeteria,Café,Canal,Candy Store,Cantonese Restaurant,Chinese Restaurant,Chocolate Shop,Clothing Store,Club House,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Discount Store,Dog Run,Dumpling Restaurant,Duty-free Shop,Electronics Store,Event Space,Fast Food Restaurant,Field,Filipino Restaurant,Flea Market,Food & Drink Shop,Food Court,Food Stand,Football Stadium,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Gaming Cafe,Garden,Gastropub,General Entertainment,German Restaurant,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hainan Restaurant,Halal Restaurant,Harbor / Marina,Health Food Store,History Museum,Hobby Shop,Hong Kong Restaurant,Hospital,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indonesian Restaurant,Indoor Play Area,Italian Restaurant,Japanese Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Korean Restaurant,Lake,Latin American Restaurant,Lighthouse,Lighting Store,Lounge,Malay Restaurant,Massage Studio,Medical Center,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Mosque,Motel,Movie Theater,Multiplex,Museum,Music Venue,Neighborhood,Nightclub,Noodle House,Office,Other Repair Shop,Paper / Office Supplies Store,Park,Peking Duck Restaurant,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Pool,Pool Hall,Pop-Up Shop,Portuguese Restaurant,Pub,Ramen Restaurant,Resort,Restaurant,River,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shabu-Shabu Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skating Rink,Snack Place,Soba Restaurant,Soccer Field,Soup Place,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Stationery Store,Steakhouse,Street Food Gathering,Supermarket,Sushi Restaurant,Swiss Restaurant,Tea Room,Temple,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Trail,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zhejiang Restaurant,Zoo Exhibit,Neighbourhood
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Raffles Place, Cecil, Marina, People's Park"
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Raffles Place, Cecil, Marina, People's Park"
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Raffles Place, Cecil, Marina, People's Park"
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Raffles Place, Cecil, Marina, People's Park"
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Raffles Place, Cecil, Marina, People's Park"


In [15]:
sg_onehot.shape

(1146, 198)

In [16]:
#group rows by neighbourhood, taking the mean frequency of occurrence of each category
sg_grouped = sg_onehot.groupby('Neighbourhood').mean().reset_index()
sg_grouped

Unnamed: 0,Neighbourhood,ATM,Accessories Store,Airport,Airport Lounge,Airport Terminal,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Beer Bar,Beer Garden,Bistro,Bookstore,Border Crossing,Boutique,Bowling Green,Breakfast Spot,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Cafeteria,Café,Canal,Candy Store,Cantonese Restaurant,Chinese Restaurant,Chocolate Shop,Clothing Store,Club House,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Discount Store,Dog Run,Dumpling Restaurant,Duty-free Shop,Electronics Store,Event Space,Fast Food Restaurant,Field,Filipino Restaurant,Flea Market,Food & Drink Shop,Food Court,Food Stand,Football Stadium,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Gaming Cafe,Garden,Gastropub,General Entertainment,German Restaurant,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hainan Restaurant,Halal Restaurant,Harbor / Marina,Health Food Store,History Museum,Hobby Shop,Hong Kong Restaurant,Hospital,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indonesian Restaurant,Indoor Play Area,Italian Restaurant,Japanese Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Korean Restaurant,Lake,Latin American Restaurant,Lighthouse,Lighting Store,Lounge,Malay Restaurant,Massage Studio,Medical Center,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Mosque,Motel,Movie Theater,Multiplex,Museum,Music Venue,Neighborhood,Nightclub,Noodle House,Office,Other Repair Shop,Paper / Office Supplies Store,Park,Peking Duck Restaurant,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Pool,Pool Hall,Pop-Up Shop,Portuguese Restaurant,Pub,Ramen Restaurant,Resort,Restaurant,River,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shabu-Shabu Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skating Rink,Snack Place,Soba Restaurant,Soccer Field,Soup Place,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Stationery Store,Steakhouse,Street Food Gathering,Supermarket,Sushi Restaurant,Swiss Restaurant,Tea Room,Temple,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Trail,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zhejiang Restaurant,Zoo Exhibit
0,"Anson, Tanjong Pagar",0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.03,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.11,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.01,0.0,0.0,0.02,0.01,0.0,0.02,0.13,0.01,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.02,0.0,0.01,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.02,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Ardmore, Bukit Timah, Holland Road, Tanglin",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Balestier, Toa Payoh, Serangoon",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.12,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.04,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Bedok, Upper East Coast, Eastwood, Kew Drive",0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.051724,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.0,0.017241,0.0,0.0,0.017241,0.017241,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.068966,0.0,0.017241,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.051724,0.0,0.0,0.017241,0.017241,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.017241,0.017241,0.017241,0.0,0.017241,0.051724,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.051724,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.017241,0.0,0.0,0.0,0.0,0.0,0.034483,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0
4,"Bishan, Ang Mo Kio",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.230769,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,"Bukit Merah, Queenstown, Tiong Bahru",0.0,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.0,0.092784,0.0,0.0,0.0,0.0,0.030928,0.020619,0.0,0.0,0.0,0.030928,0.0,0.010309,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.092784,0.0,0.0,0.010309,0.092784,0.0,0.0,0.0,0.0,0.030928,0.010309,0.0,0.010309,0.0,0.010309,0.0,0.0,0.020619,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030928,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.010309,0.010309,0.0,0.0,0.010309,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.010309,0.051546,0.010309,0.0,0.010309,0.0,0.0,0.010309,0.010309,0.020619,0.0,0.010309,0.0,0.0,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.061856,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.051546,0.0,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.020619,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030928,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0
6,"Geylang, Eunos",0.0,0.0,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.053333,0.0,0.0,0.013333,0.0,0.026667,0.0,0.0,0.0,0.0,0.0,0.0,0.013333,0.0,0.013333,0.0,0.026667,0.0,0.0,0.0,0.013333,0.0,0.0,0.013333,0.0,0.0,0.026667,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.013333,0.013333,0.0,0.053333,0.0,0.0,0.0,0.026667,0.0,0.0,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013333,0.0,0.026667,0.013333,0.0,0.026667,0.0,0.013333,0.013333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.026667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026667,0.0,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.013333,0.0,0.08,0.0,0.0,0.026667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026667,0.0,0.04,0.013333,0.0,0.0,0.0,0.013333,0.0,0.013333,0.0,0.0,0.0,0.013333,0.013333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,"High Street, Beach Road (part)",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.029412,0.044118,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.014706,0.014706,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.014706,0.014706,0.0,0.014706,0.0,0.0,0.0,0.014706,0.044118,0.014706,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.014706,0.044118,0.0,0.014706,0.0,0.014706,0.0,0.0,0.044118,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.014706,0.0,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.029412,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.014706,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.029412,0.014706,0.0,0.0,0.0,0.0,0.014706,0.0,0.014706,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.014706,0.014706,0.0,0.0,0.0,0.0,0.014706,0.014706,0.014706,0.0,0.0,0.014706,0.0,0.0,0.029412,0.0,0.0
8,"Hillview, Dairy Farm, Bukit Panjang, Choa Chu ...",0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.041667,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,"Jurong, Tuas",0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556


In [17]:
num_top_venues = 10

for area in sg_grouped['Neighbourhood']:
    print("----"+area+"----")
    temp = sg_grouped[sg_grouped['Neighbourhood'] == area].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Anson, Tanjong Pagar----
                 venue  freq
0  Japanese Restaurant  0.13
1          Coffee Shop  0.11
2                 Café  0.06
3                Hotel  0.05
4     Ramen Restaurant  0.04
5           Food Court  0.04
6               Bakery  0.03
7    Korean Restaurant  0.03
8   Italian Restaurant  0.02
9    Indian Restaurant  0.02


----Ardmore, Bukit Timah, Holland Road, Tanglin----
              venue  freq
0       Bus Station   0.2
1              Pool   0.2
2  Football Stadium   0.2
3               Gym   0.2
4              Café   0.2
5               ATM   0.0
6              Park   0.0
7      Neighborhood   0.0
8         Nightclub   0.0
9      Noodle House   0.0


----Balestier, Toa Payoh, Serangoon----
                 venue  freq
0   Chinese Restaurant  0.12
1          Snack Place  0.08
2         Dessert Shop  0.08
3               Bakery  0.08
4          Coffee Shop  0.08
5            Bookstore  0.04
6  Monument / Landmark  0.04
7          Supermarket  0.04
8   Froze

### 2.3 Sorting venues in descending order

In [18]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [19]:
indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError: # no 'st'/'nd'/'rd' suffix beyond the third position
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighbourhoods_venues_sorted = pd.DataFrame(columns=columns)
neighbourhoods_venues_sorted['Neighbourhood'] = sg_grouped['Neighbourhood']

for ind in np.arange(sg_grouped.shape[0]):
    neighbourhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(sg_grouped.iloc[ind, :], num_top_venues)

neighbourhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Anson, Tanjong Pagar",Japanese Restaurant,Coffee Shop,Café,Hotel,Ramen Restaurant,Food Court,Bakery,Korean Restaurant,Italian Restaurant,Indian Restaurant
1,"Ardmore, Bukit Timah, Holland Road, Tanglin",Café,Football Stadium,Pool,Bus Station,Gym,Zoo Exhibit,French Restaurant,Food Stand,Food Court,Food & Drink Shop
2,"Balestier, Toa Payoh, Serangoon",Chinese Restaurant,Bakery,Dessert Shop,Snack Place,Coffee Shop,Grocery Store,Monument / Landmark,Frozen Yogurt Shop,Pool,Café
3,"Bedok, Upper East Coast, Eastwood, Kew Drive",Coffee Shop,Chinese Restaurant,Asian Restaurant,Food Court,Japanese Restaurant,Sandwich Place,Supermarket,Bakery,Fast Food Restaurant,Dessert Shop
4,"Bishan, Ang Mo Kio",Chinese Restaurant,Park,Dessert Shop,Food Court,Japanese Restaurant,Dog Run,Bus Station,General Entertainment,Asian Restaurant,Skating Rink


### 2.4 Clustering neighbourhoods using k-means
The neighbourhoods were grouped into clusters using the k-means method. The number of clusters was set to 5 because Singapore is commonly divided into five regions: North, South, East, West, and Central.

In [59]:
# set number of clusters
kclusters = 5

sg_grouped_clustering = sg_grouped.drop('Neighbourhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, n_init = 30).fit(sg_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([0, 1, 0, 0, 2, 0, 0, 0, 0, 2, 0, 2, 3, 0, 2, 0, 0, 0, 0, 2, 2, 2,
       0, 0, 2, 0, 4])
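
As a quick sanity check on the result, the sketch below tallies how many neighbourhoods fall under each label. The `labels` array is copied from the output above; since the cell does not fix a random seed, a rerun of k-means may number or split the clusters differently.

```python
import numpy as np

# labels copied from the kmeans.labels_ output shown above
labels = np.array([0, 1, 0, 0, 2, 0, 0, 0, 0, 2, 0, 2, 3, 0, 2, 0, 0, 0, 0, 2, 2, 2,
                   0, 0, 2, 0, 4])

# count how many neighbourhoods were assigned to each of the 5 labels
cluster_sizes = np.bincount(labels, minlength=5)
for label, size in enumerate(cluster_sizes):
    print(f"label {label}: {size} neighbourhoods")
```

For this run the counts are 16, 1, 8, 1 and 1, so two labels hold most of the neighbourhoods and three labels hold a single neighbourhood each.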

In [60]:
# add clustering labels
neighbourhoods_venues_sorted.insert(0, 'Cluster_Labels', kmeans.labels_)

sg_merged = sg_geodata_2

# join sg_geodata_2 with the sorted venues to add the cluster label and top venues for each neighbourhood
sg_merged = sg_merged.join(neighbourhoods_venues_sorted.set_index('Neighbourhood'), on='Generallocation')

sg_merged # check the last columns!

Unnamed: 0,Postal district,Postal sector,Generallocation,Latitude,Longitude,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,1,"01, 02, 03, 04, 05, 06","Raffles Place, Cecil, Marina, People's Park",1.28372,103.851013,0.0,Hotel,Salad Place,Cocktail Bar,Coffee Shop,Food Court,French Restaurant,Japanese Restaurant,Italian Restaurant,Nightclub,Gym / Fitness Center
1,2,"07, 08","Anson, Tanjong Pagar",1.27372,103.84361,0.0,Japanese Restaurant,Coffee Shop,Café,Hotel,Ramen Restaurant,Food Court,Bakery,Korean Restaurant,Italian Restaurant,Indian Restaurant
2,3,"14, 15, 16","Bukit Merah, Queenstown, Tiong Bahru",1.28531,103.832619,0.0,Café,Chinese Restaurant,Asian Restaurant,Noodle House,Hotel,Seafood Restaurant,Bookstore,Bakery,Coffee Shop,Thai Restaurant
3,4,"09, 10","Telok Blangah, Harbourfront",1.26535,103.818848,0.0,Clothing Store,Chinese Restaurant,Toy / Game Store,Bakery,Fast Food Restaurant,Food Court,Asian Restaurant,Spa,Office,Coffee Shop
4,5,"11, 12, 13","Pasir Panjang, Hong Leong Garden, Clementi New...",1.2898,103.785332,0.0,Coffee Shop,Office,Asian Restaurant,Restaurant,Medical Center,Kebab Restaurant,Pharmacy,Beer Bar,Vietnamese Restaurant,Hong Kong Restaurant
5,6,17,"High Street, Beach Road (part)",1.29056,103.849564,0.0,Japanese Restaurant,Hotel,Italian Restaurant,Bar,Cocktail Bar,Concert Hall,Bistro,Shopping Mall,Nightclub,Bakery
6,7,"18, 19","Middle Road, Golden Mile",1.30012,103.85199,0.0,Café,Hotel,Japanese Restaurant,Chinese Restaurant,Gaming Cafe,Sandwich Place,Bakery,Ice Cream Shop,Art Museum,Art Gallery
7,8,"20, 21","Little India, Farrer Park, Jalan Besar, Lavender",1.30668,103.849407,3.0,Indian Restaurant,Vegetarian / Vegan Restaurant,Sporting Goods Shop,Restaurant,Bakery,Museum,Rock Club,Motel,Coffee Shop,Hotel
8,9,"22, 23","Orchard, Cairnhill, River Valley",1.304464,103.832353,0.0,Boutique,Sushi Restaurant,Japanese Restaurant,Hotel,Bakery,Cosmetics Shop,Shopping Mall,Coffee Shop,Asian Restaurant,Department Store
9,10,"24, 25, 26, 27","Ardmore, Bukit Timah, Holland Road, Tanglin",1.329488,103.802053,1.0,Café,Football Stadium,Pool,Bus Station,Gym,Zoo Exhibit,French Restaurant,Food Stand,Food Court,Food & Drink Shop


In [62]:
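# remove rows with missing values after the join (the Lim Chu Kang, Tengah area)
sg_merged = sg_merged.dropna()
# the join turned the labels into floats; cast them back to integers
sg_merged['Cluster_Labels'] = sg_merged.Cluster_Labels.astype(int)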
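
The labels appear as `0.0`, `3.0` and so on in the joined table because a row without a match receives `NaN`, which forces the column to a float type. A minimal sketch of that behaviour, using small hypothetical stand-ins (`geo`, `venues`) for `sg_geodata_2` and `neighbourhoods_venues_sorted`:

```python
import pandas as pd

# hypothetical stand-ins; names and values are illustrative only
geo = pd.DataFrame({"Generallocation": ["A", "B", "C"], "Latitude": [1.28, 1.30, 1.40]})
venues = pd.DataFrame({"Neighbourhood": ["A", "B"], "Cluster_Labels": [0, 3]})

# left join on the location name; "C" has no venue row, so its label is NaN
merged = geo.join(venues.set_index("Neighbourhood"), on="Generallocation")
print(merged["Cluster_Labels"].tolist())

# drop the unmatched row, then restore integer labels
cleaned = merged.dropna().copy()
cleaned["Cluster_Labels"] = cleaned["Cluster_Labels"].astype(int)
print(cleaned["Cluster_Labels"].tolist())
```

Calling `.copy()` after `dropna()` is optional here; it is included in this sketch so the later column assignment clearly acts on an independent frame.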

In [63]:
#check again
sg_merged

Unnamed: 0,Postal district,Postal sector,Generallocation,Latitude,Longitude,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,1,"01, 02, 03, 04, 05, 06","Raffles Place, Cecil, Marina, People's Park",1.28372,103.851013,0,Hotel,Salad Place,Cocktail Bar,Coffee Shop,Food Court,French Restaurant,Japanese Restaurant,Italian Restaurant,Nightclub,Gym / Fitness Center
1,2,"07, 08","Anson, Tanjong Pagar",1.27372,103.84361,0,Japanese Restaurant,Coffee Shop,Café,Hotel,Ramen Restaurant,Food Court,Bakery,Korean Restaurant,Italian Restaurant,Indian Restaurant
2,3,"14, 15, 16","Bukit Merah, Queenstown, Tiong Bahru",1.28531,103.832619,0,Café,Chinese Restaurant,Asian Restaurant,Noodle House,Hotel,Seafood Restaurant,Bookstore,Bakery,Coffee Shop,Thai Restaurant
3,4,"09, 10","Telok Blangah, Harbourfront",1.26535,103.818848,0,Clothing Store,Chinese Restaurant,Toy / Game Store,Bakery,Fast Food Restaurant,Food Court,Asian Restaurant,Spa,Office,Coffee Shop
4,5,"11, 12, 13","Pasir Panjang, Hong Leong Garden, Clementi New...",1.2898,103.785332,0,Coffee Shop,Office,Asian Restaurant,Restaurant,Medical Center,Kebab Restaurant,Pharmacy,Beer Bar,Vietnamese Restaurant,Hong Kong Restaurant
5,6,17,"High Street, Beach Road (part)",1.29056,103.849564,0,Japanese Restaurant,Hotel,Italian Restaurant,Bar,Cocktail Bar,Concert Hall,Bistro,Shopping Mall,Nightclub,Bakery
6,7,"18, 19","Middle Road, Golden Mile",1.30012,103.85199,0,Café,Hotel,Japanese Restaurant,Chinese Restaurant,Gaming Cafe,Sandwich Place,Bakery,Ice Cream Shop,Art Museum,Art Gallery
7,8,"20, 21","Little India, Farrer Park, Jalan Besar, Lavender",1.30668,103.849407,3,Indian Restaurant,Vegetarian / Vegan Restaurant,Sporting Goods Shop,Restaurant,Bakery,Museum,Rock Club,Motel,Coffee Shop,Hotel
8,9,"22, 23","Orchard, Cairnhill, River Valley",1.304464,103.832353,0,Boutique,Sushi Restaurant,Japanese Restaurant,Hotel,Bakery,Cosmetics Shop,Shopping Mall,Coffee Shop,Asian Restaurant,Department Store
9,10,"24, 25, 26, 27","Ardmore, Bukit Timah, Holland Road, Tanglin",1.329488,103.802053,1,Café,Football Stadium,Pool,Bus Station,Gym,Zoo Exhibit,French Restaurant,Food Stand,Food Court,Food & Drink Shop


In [64]:
# create map
map_clusters = folium.Map(location=[latitude_sg, longitude_sg], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for la, lo, p, cluster in zip(sg_merged['Latitude'], sg_merged['Longitude'], sg_merged['Generallocation'], sg_merged['Cluster_Labels']):
    label = folium.Popup(str(p) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [la, lo],
        radius=5,
        popup=label,
        # labels run from 0 to kclusters-1, so index the palette directly
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### 2.5 Investigating Clusters

#### 2.5.1 Cluster 1 (label 0)

In [65]:
sg_merged.loc[sg_merged['Cluster_Labels'] == 0, sg_merged.columns[[1] + list(range(5, sg_merged.shape[1]))]]

Unnamed: 0,Postal sector,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"01, 02, 03, 04, 05, 06",0,Hotel,Salad Place,Cocktail Bar,Coffee Shop,Food Court,French Restaurant,Japanese Restaurant,Italian Restaurant,Nightclub,Gym / Fitness Center
1,"07, 08",0,Japanese Restaurant,Coffee Shop,Café,Hotel,Ramen Restaurant,Food Court,Bakery,Korean Restaurant,Italian Restaurant,Indian Restaurant
2,"14, 15, 16",0,Café,Chinese Restaurant,Asian Restaurant,Noodle House,Hotel,Seafood Restaurant,Bookstore,Bakery,Coffee Shop,Thai Restaurant
3,"09, 10",0,Clothing Store,Chinese Restaurant,Toy / Game Store,Bakery,Fast Food Restaurant,Food Court,Asian Restaurant,Spa,Office,Coffee Shop
4,"11, 12, 13",0,Coffee Shop,Office,Asian Restaurant,Restaurant,Medical Center,Kebab Restaurant,Pharmacy,Beer Bar,Vietnamese Restaurant,Hong Kong Restaurant
5,17,0,Japanese Restaurant,Hotel,Italian Restaurant,Bar,Cocktail Bar,Concert Hall,Bistro,Shopping Mall,Nightclub,Bakery
6,"18, 19",0,Café,Hotel,Japanese Restaurant,Chinese Restaurant,Gaming Cafe,Sandwich Place,Bakery,Ice Cream Shop,Art Museum,Art Gallery
8,"22, 23",0,Boutique,Sushi Restaurant,Japanese Restaurant,Hotel,Bakery,Cosmetics Shop,Shopping Mall,Coffee Shop,Asian Restaurant,Department Store
10,"28, 29, 30",0,Café,Coffee Shop,Hotel,Italian Restaurant,Japanese Restaurant,Ramen Restaurant,Chinese Restaurant,Bakery,Thai Restaurant,Asian Restaurant
11,"31, 32, 33",0,Chinese Restaurant,Bakery,Dessert Shop,Snack Place,Coffee Shop,Grocery Store,Monument / Landmark,Frozen Yogurt Shop,Pool,Café


#### 2.5.2 Cluster 2 (label 1)

In [66]:
sg_merged.loc[sg_merged['Cluster_Labels'] == 1, sg_merged.columns[[1] + list(range(5, sg_merged.shape[1]))]]

Unnamed: 0,Postal sector,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,"24, 25, 26, 27",1,Café,Football Stadium,Pool,Bus Station,Gym,Zoo Exhibit,French Restaurant,Food Stand,Food Court,Food & Drink Shop


#### 2.5.3 Cluster 3 (label 2)

In [67]:
sg_merged.loc[sg_merged['Cluster_Labels'] == 2, sg_merged.columns[[1] + list(range(5, sg_merged.shape[1]))]]

Unnamed: 0,Postal sector,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
12,"34, 35, 36, 37",2,Food Court,Asian Restaurant,Bistro,Bus Station,French Restaurant,Chinese Restaurant,Vegetarian / Vegan Restaurant,Noodle House,Coffee Shop,Soccer Field
17,"51, 52",2,Coffee Shop,Park,Convenience Store,Stadium,Café,Supermarket,Asian Restaurant,Trail,Lighting Store,Restaurant
18,"53, 54, 55, 82",2,Chinese Restaurant,Café,Coffee Shop,Bus Station,Convenience Store,Dim Sum Restaurant,Bakery,Asian Restaurant,Playground,Restaurant
19,"56, 57",2,Chinese Restaurant,Park,Dessert Shop,Food Court,Japanese Restaurant,Dog Run,Bus Station,General Entertainment,Asian Restaurant,Skating Rink
21,"60, 61, 62, 63, 64",2,Bus Station,Zoo Exhibit,Sandwich Place,Bakery,Café,Chinese Restaurant,Coffee Shop,Discount Store,Food Court,Garden
24,"72, 73",2,Fast Food Restaurant,Food Court,Restaurant,Pizza Place,Supermarket,Soccer Field,Park,Bus Station,Zoo Exhibit,Food Stand
25,"77, 78",2,Chinese Restaurant,Asian Restaurant,Park,Japanese Restaurant,Bus Station,Seafood Restaurant,Breakfast Spot,Café,Restaurant,Food Court
27,"79, 80",2,Resort,Gastropub,Airport Terminal,Food Court,Asian Restaurant,Café,Field,Fried Chicken Joint,French Restaurant,Football Stadium


#### 2.5.4 Cluster 4 (label 3)

In [68]:
sg_merged.loc[sg_merged['Cluster_Labels'] == 3, sg_merged.columns[[1] + list(range(5, sg_merged.shape[1]))]]

Unnamed: 0,Postal sector,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,"20, 21",3,Indian Restaurant,Vegetarian / Vegan Restaurant,Sporting Goods Shop,Restaurant,Bakery,Museum,Rock Club,Motel,Coffee Shop,Hotel


#### 2.5.5 Cluster 5 (label 4)

In [69]:
sg_merged.loc[sg_merged['Cluster_Labels'] == 4, sg_merged.columns[[1] + list(range(5, sg_merged.shape[1]))]]

Unnamed: 0,Postal sector,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
26,"75, 76",4,Office,Restaurant,Playground,Food & Drink Shop,Chinese Restaurant,Bar,Zoo Exhibit,French Restaurant,Football Stadium,Food Stand
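
The five filters above can also be condensed into one summary per label. The sketch below is one possible way to do this, not part of the original analysis: `summarise_clusters` is a hypothetical helper, and `sample` holds only a handful of rows copied from the tables above, so its output illustrates the mechanics rather than a finding about the full data.

```python
import pandas as pd

def summarise_clusters(df, label_col="Cluster_Labels", venue_col="1st Most Common Venue"):
    # hypothetical helper: per cluster label, count rows and find the most frequent top venue
    grouped = df.groupby(label_col)[venue_col]
    return pd.DataFrame({
        "size": grouped.size(),
        "top_venue": grouped.agg(lambda s: s.value_counts().idxmax()),
    })

# a few rows copied from the cluster tables above (not the full sg_merged frame)
sample = pd.DataFrame({
    "Cluster_Labels": [0, 0, 0, 2, 2, 2, 3],
    "1st Most Common Venue": [
        "Hotel", "Japanese Restaurant", "Japanese Restaurant",
        "Food Court", "Chinese Restaurant", "Chinese Restaurant",
        "Indian Restaurant",
    ],
})

summary = summarise_clusters(sample)
print(summary)
```

Applied to the full `sg_merged` frame, the same call would return one row per cluster label, which may be easier to compare than five separate tables.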
