# Capstone Project - The Battle of Neighborhoods

### Singapore Visitors and Expats Venue Recommendation

## Introduction

Singapore is a small country and one of the most visited in Asia. Many websites let travelers look up recommendations on where to stay or what to visit.  
However, most of these websites base their recommendations on the usual tourist attractions or on key residential areas that are mostly expensive or already well known to travelers, matched on  
keywords such as "Hotel" or "Backpackers". The intention of this project is to build a data-driven recommendation that can supplement these websites, using data  
retrieved from Singapore open data sources and FourSquare API venue recommendations.

To limit the scope of this sample recommender, this notebook uses a scenario in which we compare the cost of one month's HDB unit rental across Singapore towns and then look at the  
venue recommendations from the FourSquare API.

For this demo, this notebook will look at the following data:
   1. Singapore Median Rental Prices by town.
   2. Popular Food venues in the vicinity.

With this data, a sample website recommendation can give visitors on a limited budget a tool for deciding where to stay, or for selecting a location better suited to their places of interest.
Other categories that a user could check with this same notebook include Outdoors and Recreation or Nightlife.

###  Data Acquisition
This part of the notebook is in charge of retrieving and preparing the data for this project.<br>
The project gets the needed information from the following sources:
 1. Singapore towns and median residential rental prices
    - Data is retrieved from the Singapore open dataset "<a href='https://data.gov.sg/dataset/b35046dc-7428-4cff-968d-ef4c3e9e6c99'>median rent by town and flat type</a>" on the https://data.gov.sg website.
 2. Singapore town coordinates
    - Retrieved using the Google Maps API.
 3. Singapore top venue recommendations
    - Retrieved using the FourSquare API; we retrieve venues for a sample user's places of interest.

#### Importing Python Libraries
This section imports the Python libraries required for processing the data. <br>
While this first part of the notebook is for data acquisition, some of the libraries are also used for data visualization.

In [1]:
!conda install -c conda-forge folium=0.5.0 --yes # comment/uncomment if not yet installed.
!conda install -c conda-forge geopy --yes        # comment/uncomment if not yet installed

import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis

# Show all columns and rows when displaying dataframes.
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
# import k-means from clustering stage
from sklearn.cluster import KMeans
import folium # map rendering library

import requests # library to handle requests
import lxml.html as lh
import bs4 as bs
import urllib.request

print('Libraries imported.')

Fetching package metadata .............
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /opt/conda/envs/DSX-Python35:
#
folium                    0.5.0                      py_0    conda-forge
Fetching package metadata .............
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /opt/conda/envs/DSX-Python35:
#
geopy                     1.18.1                     py_0    conda-forge
Libraries imported.


In [2]:
from IPython.display import HTML
import base64

# Extra Helper scripts to generate download links for saved dataframes in csv format.
def create_download_link( df, title = "Download CSV file", filename = "data.csv"):  
    csv = df.to_csv()
    b64 = base64.b64encode(csv.encode())
    payload = b64.decode()
    html = '<a download="{filename}" href="data:text/csv;base64,{payload}" target="_blank">{title}</a>'
    html = html.format(payload=payload,title=title,filename=filename)
    return HTML(html)

#### 1. Downloading the Singapore towns list with median residential rental prices

In [3]:
import zipfile
import os
!wget -q -O 'median-rent-by-town-and-flat-type.zip' "https://data.gov.sg/dataset/b35046dc-7428-4cff-968d-ef4c3e9e6c99/download"
zf = zipfile.ZipFile('./median-rent-by-town-and-flat-type.zip')
sgp_median_rent_by_town_data = pd.read_csv(zf.open("median-rent-by-town-and-flat-type.csv"))
sgp_median_rent_by_town_data.rename(columns = {'town':'Town'}, inplace = True)
sgp_median_rent_by_town_data.head()

Unnamed: 0,quarter,Town,flat_type,median_rent
0,2005-Q2,ANG MO KIO,1-RM,na
1,2005-Q2,ANG MO KIO,2-RM,na
2,2005-Q2,ANG MO KIO,3-RM,800
3,2005-Q2,ANG MO KIO,4-RM,950
4,2005-Q2,ANG MO KIO,5-RM,-


#### Data Cleanup and re-grouping.
The retrieved table contains some unwanted entries and needs cleanup.
The following tasks will be performed:
* Drop rows with missing rental prices.
* Use the most recent data record.
* Fix data types.

In [4]:
# Drop rows whose rental price is missing ('na' or '-').
sgp_median_rent_by_town_data_filter=sgp_median_rent_by_town_data[~sgp_median_rent_by_town_data['median_rent'].isin(['-','na'])]

# Take the most recent report which is "2018-Q2"
sgp_median_rent_by_town_data_filter=sgp_median_rent_by_town_data_filter[sgp_median_rent_by_town_data_filter['quarter'] == "2018-Q2"]

# Now that all remaining rows are from "2018-Q2", we don't need this column anymore.
sgp_median_rent_by_town_data_filter=sgp_median_rent_by_town_data_filter.drop(['quarter'], axis=1)

# Ensure that median_rent column is float64.
sgp_median_rent_by_town_data_filter['median_rent']=sgp_median_rent_by_town_data_filter['median_rent'].astype(np.float64)
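The cleanup steps above can also be written so that the latest quarter is read from the data rather than hard-coded. The following is only a minimal sketch, assuming the same column names as the downloaded CSV; the helper name `clean_rent_table` and the toy rows are made up for illustration.

```python
import pandas as pd

def clean_rent_table(raw: pd.DataFrame) -> pd.DataFrame:
    """Keep numeric rents from the most recent quarter (hypothetical helper)."""
    df = raw.copy()
    # Non-numeric placeholders such as 'na' and '-' become NaN.
    df['median_rent'] = pd.to_numeric(df['median_rent'], errors='coerce')
    df = df.dropna(subset=['median_rent'])
    # Quarter labels like '2018-Q2' sort correctly as plain strings.
    latest = df['quarter'].max()
    return df[df['quarter'] == latest].drop(columns=['quarter']).reset_index(drop=True)

# Toy rows for illustration only (not real rental data).
toy = pd.DataFrame({
    'quarter': ['2018-Q1', '2018-Q2', '2018-Q2', '2018-Q2'],
    'Town': ['BEDOK', 'BEDOK', 'BEDOK', 'BISHAN'],
    'flat_type': ['3-RM', '3-RM', '1-RM', '4-RM'],
    'median_rent': ['1800', '1850', 'na', '-'],
})
cleaned = clean_rent_table(toy)
```

Using `pd.to_numeric(..., errors='coerce')` avoids having to list every placeholder string by hand, at the cost of silently dropping any other malformed value.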

* Note: The analysis could be split by HDB flat type to be more accurate. For this demonstration, however, we do a simpler analysis and use the mean of the median rents across all available flat types in each town, regardless of size.
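To make the trade-off in this note concrete, here is a small sketch that contrasts a per-flat-type view with the single averaged figure used in this notebook. The numbers are invented for illustration and the variable names are assumptions, not part of the original analysis.

```python
import pandas as pd

# Invented rents for two towns and two flat types (illustration only).
rents = pd.DataFrame({
    'Town': ['BEDOK', 'BEDOK', 'BISHAN', 'BISHAN'],
    'flat_type': ['3-RM', '4-RM', '3-RM', '4-RM'],
    'median_rent': [1800.0, 2200.0, 2000.0, 2400.0],
})

# More detailed view: one row per town, one column per flat type.
rent_by_type = rents.pivot_table(index='Town', columns='flat_type', values='median_rent')

# Simplified view used in this notebook: the mean of the per-type medians.
rent_mean = rents.groupby('Town')['median_rent'].mean()
```

The pivoted table would let a user filter by the flat size they actually need, while the averaged series gives one comparable number per town.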

In [5]:
singapore_average_rental_prices_by_town = sgp_median_rent_by_town_data_filter.groupby(['Town'])['median_rent'].mean().reset_index()
singapore_average_rental_prices_by_town

Unnamed: 0,Town,median_rent
0,ANG MO KIO,2033.333333
1,BEDOK,2087.5
2,BISHAN,2233.333333
3,BUKIT BATOK,1962.5
4,BUKIT MERAH,2162.5
5,BUKIT PANJANG,1737.5
6,CENTRAL,2450.0
7,CHOA CHU KANG,1933.333333
8,CLEMENTI,2263.333333
9,GEYLANG,2166.666667


* Adding geographical coordinates of each town location.

In [6]:
# The code was removed by Watson Studio for sharing.

google_key=hidden_from_view


#### 2. Retrieve town coordinates.
The Google API is used to retrieve the coordinates (latitude and longitude) of each town center. For this exercise, I use the MRT station as the center point of each evaluated town.
The town coordinates are then used when retrieving location data from the Foursquare API.

In [7]:
singapore_average_rental_prices_by_town['Latitude'] = 0.0
singapore_average_rental_prices_by_town['Longitude'] = 0.0

for idx, town in singapore_average_rental_prices_by_town['Town'].items():
    # I prefer to use MRT stations as the more important central location of each town.
    address = town + " MRT station, Singapore"
    # Let requests URL-encode the address, and call the API only once per town.
    response = requests.get(
        'https://maps.googleapis.com/maps/api/geocode/json',
        params={'address': address, 'key': google_key}).json()
    location = response["results"][0]["geometry"]["location"]
    singapore_average_rental_prices_by_town.loc[idx, 'Latitude'] = location['lat']
    singapore_average_rental_prices_by_town.loc[idx, 'Longitude'] = location['lng']
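The loop above indexes straight into `results[0]`, which raises an error if the geocoder finds no match for a town. A defensive variant could look like the sketch below. It assumes the usual Geocoding API response shape (a `status` string and a `results` list); the helper name `extract_lat_lng` and the sample payloads are hypothetical.

```python
from typing import Optional, Tuple

def extract_lat_lng(payload: dict) -> Optional[Tuple[float, float]]:
    """Return (lat, lng) from a geocoding JSON payload, or None if there is no match."""
    if payload.get('status') != 'OK' or not payload.get('results'):
        return None
    location = payload['results'][0]['geometry']['location']
    return location['lat'], location['lng']

# Hand-written payloads that mimic a successful and an empty response.
ok_payload = {
    'status': 'OK',
    'results': [{'geometry': {'location': {'lat': 1.369972, 'lng': 103.849588}}}],
}
empty_payload = {'status': 'ZERO_RESULTS', 'results': []}

found = extract_lat_lng(ok_payload)
missing = extract_lat_lng(empty_payload)
```

Returning `None` lets the caller decide whether to skip the town, retry, or fall back to another geocoder.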

In [8]:
# Alternative geocoder (Nominatim) if the above does not work.
# CODE IS DISABLED by the `if (0):` guard.
if (0):
    geo = Nominatim(user_agent='Mypythonapi')
    for idx,town in singapore_average_rental_prices_by_town['Town'].items():
        coord = geo.geocode(town + ' ' + "Singapore", timeout = 10)
        if coord:
            singapore_average_rental_prices_by_town.loc[idx,'Latitude'] = coord.latitude
            singapore_average_rental_prices_by_town.loc[idx,'Longitude'] = coord.longitude
        else:
            # No match: mark the coordinates as missing (NULL is not defined in Python).
            singapore_average_rental_prices_by_town.loc[idx,'Latitude'] = np.nan
            singapore_average_rental_prices_by_town.loc[idx,'Longitude'] = np.nan

In [9]:
singapore_average_rental_prices_by_town.set_index("Town")

Unnamed: 0_level_0,median_rent,Latitude,Longitude
Town,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
ANG MO KIO,2033.333333,1.369972,103.849588
BEDOK,2087.5,1.324011,103.930172
BISHAN,2233.333333,1.351042,103.84993
BUKIT BATOK,1962.5,1.348506,103.749222
BUKIT MERAH,2162.5,1.289642,103.816798
BUKIT PANJANG,1737.5,1.276068,103.791904
CENTRAL,2450.0,1.288155,103.846718
CHOA CHU KANG,1933.333333,1.385385,103.744337
CLEMENTI,2263.333333,1.31507,103.765246
GEYLANG,2166.666667,1.316367,103.882772


#### Generate Singapore basemap.

In [10]:
geo = Nominatim(user_agent='My-IBMNotebook')
address = 'Singapore'
location = geo.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinate of Singapore {}, {}.'.format(latitude, longitude))

# create map of Singapore using latitude and longitude values
map_singapore = folium.Map(location=[latitude, longitude],tiles="OpenStreetMap", zoom_start=10)

# add markers to map
for lat, lng, town in zip(
    singapore_average_rental_prices_by_town['Latitude'],
    singapore_average_rental_prices_by_town['Longitude'],
    singapore_average_rental_prices_by_town['Town']):
    label = town
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=4,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#87cefa',
        fill_opacity=0.5,
        parse_html=False).add_to(map_singapore)
map_singapore

The geographical coordinate of Singapore 1.2904753, 103.8520359.


In [11]:
fileName = "singapore_average_rpbt.csv"
linkName = "Singapore Average Rental Prices"
create_download_link(singapore_average_rental_prices_by_town,linkName,fileName)

#### 3. Retrieving FourSquare Places of interest.

<h1 align=center><font size = 5>Segmenting and Clustering Towns in Singapore</font></h1>

## Introduction

I will be using the Foursquare API to explore neighborhoods in selected towns in Singapore.
The Foursquare **explore** function will be used to get the most common venue categories in each neighborhood, and this feature will then be used to group the neighborhoods into clusters. The *k*-means clustering algorithm will be used for the analysis.
Finally, the Folium library will be used to visualize the neighborhoods in Singapore and their emerging clusters.

In [12]:
# The code was removed by Watson Studio for sharing.

Hidden Foursqure API Keyset


In [13]:
# The code was removed by Watson Studio for sharing.

CLIENT_ID     = hidden
CLIENT_SECRET = hidden
VERSION       = 20190102
LIMIT         = 80


## 1. Exploring Neighbourhoods in Singapore
#### Using the following Foursquare API query URL, search for venues in the selected Singapore towns.
> `https://api.foursquare.com/v2/venues/`**search**`?client_id=`**CLIENT_ID**`&client_secret=`**CLIENT_SECRET**`&ll=`**LATITUDE**`,`**LONGITUDE**`&v=`**VERSION**`&query=`**QUERY**`&radius=`**RADIUS**`&limit=`**LIMIT**
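The URL template above can be filled in with the standard library instead of manual string formatting, which also takes care of escaping. This is a sketch only: `build_search_url` is a hypothetical helper, and `'MY_ID'` and `'MY_SECRET'` are placeholders rather than real credentials.

```python
from urllib.parse import urlencode, urlparse, parse_qs

SEARCH_ENDPOINT = 'https://api.foursquare.com/v2/venues/search'

def build_search_url(client_id, client_secret, version, lat, lng,
                     radius, limit, category_id=None):
    """Assemble a venues/search query URL from its parameters (hypothetical helper)."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    if category_id is not None:
        params['categoryId'] = category_id
    # urlencode escapes special characters such as the comma in 'll'.
    return SEARCH_ENDPOINT + '?' + urlencode(params)

example_url = build_search_url('MY_ID', 'MY_SECRET', 20190102,
                               1.369972, 103.849588, 500, 80,
                               category_id='4d4b7105d754a06374d81259')

# Round trip: decode the query string again to check what was sent.
parsed = parse_qs(urlparse(example_url).query)
```

Keeping the parameters in a dictionary makes it easy to add or drop optional ones, such as `categoryId`, without touching the format string.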

Retrieving data from the FourSquare API is not entirely straightforward. A query returns a JSON list of top venues to visit near the given location. The venue scores, however, are retrieved with a separate query to the FourSquare Venue API, which is limited to 50 queries per day on a free FourSquare subscription.<br/> The following functions generate the query URLs and process the returned JSON data into a dataframe.
<br/><br/>
The function **getNearbyVenues** extracts the following information for the dataframe it generates:
* Venue ID
* Venue Name
* Coordinates : Latitude and Longitude
* Category Name

The function **getVenuesByCategory** performs the following:
  1. A **category**-based venue search, to simulate user venue searches for certain places of interest. This search extracts the following information:
   * Venue ID
   * Venue Name
   * Coordinates : Latitude and Longitude
   * Category Name
  2. For each retrieved **venueID**, the venue rating is then retrieved (in the code below, this is done separately by **getVenuesIDScore**).

The dataframe generated by the second function contains the following columns:
<TABLE align='left'>
    <tr>
        <th>Column Name</th><th>Description</th>
    </tr>
<tr><td>Town</td><td>Town Name</td></tr>
<tr><td>Town Latitude</td><td>Town MRT station latitude</td></tr>
<tr><td>Town Longitude</td><td>Town MRT station longitude</td></tr>
<tr><td>VenueID</td><td>FourSquare Venue ID</td></tr>
<tr><td>VenueName</td><td>Venue Name</td></tr>
<tr><td>score</td><td>FourSquare Venue user rating</td></tr>
<tr><td>category</td><td>Category group name</td></tr>
<tr><td>catID</td><td>Category ID</td></tr>
<tr><td>latitude</td><td>Venue Location - latitude</td></tr>
<tr><td>longitude</td><td>Venue Location - longitude</td></tr>
</TABLE>
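
The column layout in the table above can be produced from a search response by collecting plain rows first and building the dataframe once at the end. The sketch below assumes a response shaped like `{'response': {'venues': [...]}}`; the helper name `venues_to_frame` and the sample payload are invented for illustration.

```python
import numpy as np
import pandas as pd

VENUE_COLUMNS = ['Town', 'Town Latitude', 'Town Longitude', 'VenueID', 'VenueName',
                 'score', 'category', 'catID', 'latitude', 'longitude']

def venues_to_frame(town, town_lat, town_lng, payload):
    """Flatten a venues/search style payload into the column layout above."""
    rows = []
    for venue in payload.get('response', {}).get('venues', []):
        try:
            category = venue['categories'][0]
            rows.append([town, town_lat, town_lng, venue['id'], venue['name'],
                         np.nan,  # score is filled in later by a separate query
                         category['pluralName'], category['id'],
                         venue['location']['lat'], venue['location']['lng']])
        except (KeyError, IndexError):
            continue  # skip venues with missing fields
    return pd.DataFrame(rows, columns=VENUE_COLUMNS)

# Invented payload: one complete venue and one without a category.
sample_payload = {'response': {'venues': [
    {'id': 'abc123', 'name': 'Sample Food Court',
     'categories': [{'id': 'cat1', 'pluralName': 'Food Courts'}],
     'location': {'lat': 1.3697, 'lng': 103.8489}},
    {'id': 'def456', 'name': 'No Category Venue', 'categories': [],
     'location': {'lat': 1.3690, 'lng': 103.8490}},
]}}
frame = venues_to_frame('ANG MO KIO', 1.369972, 103.849588, sample_payload)
```

Building the dataframe in one step avoids growing it row by row, which is slow in pandas for larger result sets.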


In [14]:
import time
# ---------------------------------------------
# The following function retrieves the venues given the names and coordinates and stores it into dataframe.
FOURSQUARE_EXPLORE_URL = 'https://api.foursquare.com/v2/venues/explore?'
FOURSQUARE_SEARCH_URL = 'https://api.foursquare.com/v2/venues/search?'

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    global CLIENT_ID
    global CLIENT_SECRET
    global FOURSQUARE_EXPLORE_URL
    global FOURSQUARE_SEARCH_URL
    global VERSION
    global LIMIT
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
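        print('getNearbyVenues', name)
        cyclefsk2()  # not defined in the visible cells; presumably set up in a cell removed for sharing
        # create the API request URL (the base URL already ends with '?')
        url = '{}client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            FOURSQUARE_EXPLORE_URL,CLIENT_ID,CLIENT_SECRET,VERSION,
            lat,lng,radius,LIMIT)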
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name,lat,lng, 
            v['venue']['id'],v['venue']['name'], 
            v['venue']['location']['lat'],v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
        time.sleep(2)

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    # Eight fields are collected per venue above, so eight column names are needed.
    nearby_venues.columns = ['Town','Town Latitude','Town Longitude','VenueID','Venue','Venue Latitude','Venue Longitude','Venue Category']
    
    return(nearby_venues)

In [15]:
FOURSQUARE_SEARCH_URL = 'https://api.foursquare.com/v2/venues/search?'
# SEARCH VENUES BY CATEGORY

# Dataframe : venue_id_recover 
# - store venue id to recover failed venues id score retrieval later if foursquare limit is exceeded when getting score.
venue_id_rcols = ['VenueID']
venue_id_recover = pd.DataFrame(columns=venue_id_rcols)

def getVenuesByCategory(names, latitudes, longitudes, categoryID, radius=500):
    global CLIENT_ID
    global CLIENT_SECRET
    global FOURSQUARE_EXPLORE_URL
    global FOURSQUARE_SEARCH_URL
    global VERSION
    global LIMIT
    venue_columns = ['Town','Town Latitude','Town Longitude','VenueID','VenueName','score','category','catID','latitude','longitude']
    venue_DF = pd.DataFrame(columns=venue_columns)
    print("[#Start getVenuesByCategory]")
    for name, lat, lng in zip(names, latitudes, longitudes):
        cyclefsk2()
        print(name,",",end='')
        #print('getVenuesByCategory',categoryID,name) ; # DEBUG: be quiet
        # create the API request URL
        url = '{}client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId={}'.format(
            FOURSQUARE_SEARCH_URL,CLIENT_ID,CLIENT_SECRET,VERSION,lat,lng,radius,LIMIT,categoryID)
        # make the GET request
        results = requests.get(url).json()
        # Populate dataframe with the category venue results
        # Extracting JSON  data values
        
        for jsonSub in results['response']['venues']:
            # The JSON may be incomplete or not in the expected format; in that case, skip the venue.
            try:
                # Get location details
                ven_id   = jsonSub['id']
                ven_cat  = jsonSub['categories'][0]['pluralName']
                ven_CID  = jsonSub['categories'][0]['id']
                ven_name = jsonSub['name']
                ven_lat  = jsonSub['location']['lat']
                ven_lng  = jsonSub['location']['lng']
            except (KeyError, IndexError):
                continue
            # Add one row, in venue_columns order. The score starts as NaN (not the
            # string 'nan') so that isnull() can find unscored venues later.
            venue_DF.loc[len(venue_DF)] = [
                name, lat, lng, ven_id, ven_name, np.nan,
                ven_cat, ven_CID, ven_lat, ven_lng]
    # END OF LOOP, return.
    print("\n[#Done getVenuesByCategory]")
    return(venue_DF)

In [16]:
FOURSQUARE_SEARCH_URL = 'https://api.foursquare.com/v2/venues/search?'
# RETRIEVE THE SCORE (RATING) OF A SINGLE VENUE

# Dataframe : venue_id_recover 
# - stores venue ids whose score retrieval failed, so they can be retried later if the Foursquare limit is exceeded.
venue_id_rcols = ['VenueID','Score']
venue_id_recover = pd.DataFrame(columns=venue_id_rcols)

def getVenuesIDScore(venueID):
    global CLIENT_ID
    global CLIENT_SECRET
    global FOURSQUARE_EXPLORE_URL
    global FOURSQUARE_SEARCH_URL
    global VERSION
    global LIMIT
    global venue_id_recover
    print("[#getVenuesIDScore]", venueID)
    venID_URL = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venueID,CLIENT_ID,CLIENT_SECRET,VERSION)
    # The URL itself is not printed because it contains the client secret.
    venID_score = 0.00
    # Process results
    try:
        venID_result = requests.get(venID_URL).json()
        venID_score  = venID_result['response']['venue']['rating']
    except (KeyError, ValueError, requests.RequestException):
        # Remember the venue so that its score can be retried later.
        venue_id_recover.loc[len(venue_id_recover)] = [venueID, 0.0]
        cyclefsk2()
        return ["error",0.0]
    return ["success",venID_score]

In [17]:
singapore_average_rental_prices_by_town.dtypes

Town            object
median_rent    float64
Latitude       float64
Longitude      float64
dtype: object

In [18]:
venue_columns = ['Town','Town Latitude','Town Longitude','VenueID','VenueName','score','category','catID','latitude','longitude']
singapore_town_venues = pd.DataFrame(columns=venue_columns)

#### Search Venues with recommendations on  : Food Venues (Restaurants,Fastfoods, etc.)

To demonstrate a user's selection of places of interest, we use the Food Venues category in the further analysis.
* This Foursquare search is expected to collect venues in categories such as:
 * Food Courts
 * Coffee Shops
 * Restaurants
 * Cafés
 * Other food venues

In [19]:
# Food Venues : Restaurants, Fastfoods, Etc
# For testing
if (0):
    categoryID = "4d4b7105d754a06374d81259"  # Food category ID, matching the full run in the next cell
    town_names = ['ANG MO KIO']
    lat_list   = [1.3699718]
    lng_list   = [103.8495876]
    tmp = getVenuesByCategory(names=town_names,latitudes=lat_list,longitudes=lng_list,categoryID=categoryID)
    singapore_town_venues = pd.concat([singapore_town_venues,tmp], ignore_index=True)

In [20]:
# Food Venues : Restaurants, Fastfoods, Etc
categoryID = "4d4b7105d754a06374d81259"
town_names = singapore_average_rental_prices_by_town['Town']
lat_list   = singapore_average_rental_prices_by_town['Latitude']
lng_list   = singapore_average_rental_prices_by_town['Longitude']
singapore_food_venues = getVenuesByCategory(names=town_names,latitudes=lat_list,longitudes=lng_list,categoryID=categoryID)

[#Start getVenuesByCategory]
ANG MO KIO ,BEDOK ,BISHAN ,BUKIT BATOK ,BUKIT MERAH ,BUKIT PANJANG ,CENTRAL ,CHOA CHU KANG ,CLEMENTI ,GEYLANG ,HOUGANG ,JURONG EAST ,JURONG WEST ,KALLANG/WHAMPOA ,MARINE PARADE ,PASIR RIS ,PUNGGOL ,QUEENSTOWN ,SEMBAWANG ,SENGKANG ,SERANGOON ,TAMPINES ,TOA PAYOH ,WOODLANDS ,YISHUN ,
[#Done getVenuesByCategory]


* Save collected Singapore food venues by town into csv for future use.

In [21]:
# Save collected Singapore food venues by town into csv for future use.
fileName = "singapore_food_venues.Category.csv"
linkName = "IBM Storage Link:singapore_food_venues.Category.csv"
create_download_link(singapore_food_venues,linkName,fileName)

#### Search Venues with recommendations on  : Outdoors and Recreation
Note: 
* 2nd test: retrieve venues for Outdoors and Recreation.
* This section can be run separately because of the query limit encountered with the free version of the Foursquare API. I have saved similar results on GitHub so that the same analysis can be run.

In [22]:
# Disable for this run demo.
if (0):
    # Outdoors & Recreation, 
    categoryID = "4d4b7105d754a06377d81259"
    town_names = singapore_average_rental_prices_by_town['Town']
    lat_list   = singapore_average_rental_prices_by_town['Latitude']
    lng_list   = singapore_average_rental_prices_by_town['Longitude']
    singapore_outdoor_venues_by_town = getVenuesByCategory(names=town_names,latitudes=lat_list,longitudes=lng_list,categoryID=categoryID)
    # Save collected Singapore Outdoors & Recreation venues by town into csv for future use.
    # singapore_outdoor_venues_by_town.to_csv('singapore_outdoorAndRecration.Category.csv',index=False)
    fileName = "singapore_outdoorAndRecration.Category.csv"
    linkName = "IBM Storage Link:singapore_outdoorAndRecration.Category.csv"
    create_download_link(singapore_outdoor_venues_by_town,linkName,fileName)

#### Search Venues with recommendations on  : Singapore NightLife
Note: 
* 3rd test: retrieve Nightlife venues, i.e. venues that are accessible at night. This includes places like nightclubs, bars and places of interest operating 24 hours.
* This section can be run separately because of the query limit encountered with the free version of the Foursquare API. I have saved similar results on GitHub so that the same analysis can be run.

In [23]:
# Disable for this run demo.
if (0):
    #Nightlife Spot = 4d4b7105d754a06376d81259
    categoryID = "4d4b7105d754a06376d81259"
    town_names = singapore_average_rental_prices_by_town['Town']
    lat_list   = singapore_average_rental_prices_by_town['Latitude']
    lng_list   = singapore_average_rental_prices_by_town['Longitude']
    singapore_Nightlife_by_town = getVenuesByCategory(names=town_names,latitudes=lat_list,longitudes=lng_list,categoryID=categoryID) 
    
    # Save collected Singapore Nightlife venues by town into csv for future use.
    # singapore_Nightlife_by_town.to_csv('singapore_Nightlife_by_town.Category.csv',index=False)
    fileName = "singapore_Nightlife_by_town.Category.csv"
    linkName = "IBM Storage Link:singapore_Nightlife_by_town.Category.csv"
    create_download_link(singapore_Nightlife_by_town,linkName,fileName)

In [24]:
# The code was removed by Watson Studio for sharing.

#### In this section, we use the FourSquare API to retrieve the venue scores. Note that the FourSquare API has a maximum of 50 such queries for a free subscription, so query carefully.

In [25]:
score_is_NAN = len(singapore_food_venues[singapore_food_venues['score'].isnull()].index.tolist())
print("Current score=NaN count=",score_is_NAN)
for idx in singapore_food_venues[singapore_food_venues['score'].isnull()].index.tolist():
    venueID = singapore_food_venues.loc[idx,'VenueID']
    status,score = getVenuesIDScore(venueID)
    if status == "success":
        singapore_food_venues.loc[idx,'score'] = score
score_is_NAN = len(singapore_food_venues[singapore_food_venues['score'].isnull()].index.tolist())
print("PostRun score=NaN count=",score_is_NAN)
print('Done',end='')

Current score=NaN count= 0
PostRun score=NaN count= 0
Done
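
Because the score lookups are rate limited, it can help to cap the number of calls made in one run and leave the remaining venues for a later run. The following sketch shows one way to do that. `fill_scores` is a hypothetical helper, and the lookup is passed in as a plain function so that the example runs without any API access.

```python
import numpy as np
import pandas as pd

def fill_scores(venues, fetch_score, max_calls=50):
    """Fill missing 'score' values in place, stopping after max_calls lookups."""
    calls = 0
    for idx in venues[venues['score'].isnull()].index.tolist():
        if calls >= max_calls:
            break  # budget used up; the rest stay NaN for a later run
        calls += 1
        score = fetch_score(venues.loc[idx, 'VenueID'])
        if score is not None:
            venues.loc[idx, 'score'] = score
    return calls

# Toy data: two venues without a score, and a fake lookup table instead of the API.
toy_venues = pd.DataFrame({'VenueID': ['a', 'b', 'c'], 'score': [np.nan, 7.5, np.nan]})
fake_scores = {'a': 8.1, 'c': 6.4}
calls_made = fill_scores(toy_venues, fake_scores.get, max_calls=1)
```

With `max_calls=1`, only the first unscored venue is looked up. Running the function again later would continue with the venues that are still `NaN`.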

* Note: When continuing from an earlier run, reload the saved csv file to avoid re-running the FourSquare API queries.

In [26]:
# The code was removed by Watson Studio for sharing.

* Combine venues collection into one dataframe : singapore_town_venues

In [27]:
# Enable the first branch only if all three categories were retrieved above.
if (0):
    singapore_town_venues = pd.concat([singapore_food_venues,singapore_outdoor_venues_by_town,singapore_Nightlife_by_town], ignore_index=True)
else:
    singapore_town_venues = singapore_food_venues
singapore_town_venues.shape

(1249, 10)

#### Data cleanup: remove unneeded entries
* Keep only the expected venue columns (note that the code below does not itself drop duplicate venues).
* Improve the quality of our venue selection by removing venues with no rating or a rating of 0.0.

In [28]:
# Keep only the expected venue columns, in a fixed order.
singapore_town_venues = singapore_town_venues[venue_columns]
# Drop rows with score == 0
singapore_town_venues = singapore_town_venues[singapore_town_venues.score > 0.0]
# Drop rows with missing elements
singapore_town_venues = singapore_town_venues.dropna(axis='index')
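
If duplicate venues do need to be removed, one possible approach is to drop repeated town and venue ID pairs before filtering on the score. This is a sketch on invented rows, not part of the original run; whether a venue that appears under two different towns should count as a duplicate is a design decision left open here.

```python
import pandas as pd

# Invented rows: venue v1 appears twice for BEDOK, v2 has no usable rating.
toy = pd.DataFrame({
    'Town': ['BEDOK', 'BEDOK', 'BISHAN', 'BISHAN'],
    'VenueID': ['v1', 'v1', 'v2', 'v3'],
    'score': [7.5, 7.5, 0.0, 6.0],
})

# Keep the first occurrence of each (Town, VenueID) pair.
deduped = toy.drop_duplicates(subset=['Town', 'VenueID'])
# Then keep only venues with a positive rating and no missing values.
rated = deduped[deduped['score'] > 0.0].dropna(axis='index')
```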

In [29]:
singapore_town_venues.shape

(644, 10)

In [30]:
singapore_town_venues.head()

Unnamed: 0,Town,Town Latitude,Town Longitude,VenueID,VenueName,score,category,catID,latitude,longitude
2,ANG MO KIO,1.369972,103.849588,50138eaee4b05d9dc80ae5b0,Hong Kong Sheng Kee Dessert ??????,5.8,Dessert Shops,4bf58dd8d48988d1d0941735,1.369473,103.849241
4,ANG MO KIO,1.369972,103.849588,5be2d3831af8520039a38da2,Malaysia Boleh!,7.5,Food Courts,4bf58dd8d48988d120951735,1.369669,103.8489
5,ANG MO KIO,1.369972,103.849588,4b7cdf36f964a520fda72fe3,BreadTalk / Toast Box,5.5,Breakfast Spots,4bf58dd8d48988d143941735,1.369177,103.848874
7,ANG MO KIO,1.369972,103.849588,4b1e9fc6f964a520d21c24e3,Ichiban Sushi,5.6,Sushi Restaurants,4bf58dd8d48988d1d2941735,1.369156,103.849109
10,ANG MO KIO,1.369972,103.849588,5340222411d247b11bb27bb8,Eighteen Chefs,5.5,Diners,4bf58dd8d48988d147941735,1.369265,103.848706


In [31]:
# Save town venues collection. 
# This list is already interesting data for display in different webpages.
fileName = "recommended.singapore_town_venues.csv"
linkName = "IBM Storage Link:recommended_singapore_town_venues.csv"
create_download_link(singapore_town_venues,linkName,fileName)

#### Check venue count per town.

In [32]:
singapore_town_venues.groupby('Town').count()

Unnamed: 0_level_0,Town Latitude,Town Longitude,VenueID,VenueName,score,category,catID,latitude,longitude
Town,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
ANG MO KIO,34,34,34,34,34,34,34,34,34
BEDOK,29,29,29,29,29,29,29,29,29
BISHAN,36,36,36,36,36,36,36,36,36
BUKIT BATOK,22,22,22,22,22,22,22,22,22
BUKIT MERAH,9,9,9,9,9,9,9,9,9
BUKIT PANJANG,15,15,15,15,15,15,15,15,15
CENTRAL,46,46,46,46,46,46,46,46,46
CHOA CHU KANG,27,27,27,27,27,27,27,27,27
CLEMENTI,34,34,34,34,34,34,34,34,34
GEYLANG,25,25,25,25,25,25,25,25,25


In [33]:
# Verify the dtypes 
singapore_town_venues.dtypes

Town               object
Town Latitude     float64
Town Longitude    float64
VenueID            object
VenueName          object
score             float64
category           object
catID              object
latitude          float64
longitude         float64
dtype: object

#### How many unique categories can be curated from all the returned venues?

In [34]:
# Count number of categories that can be curated.
print('There are {} unique categories.'.format(len(singapore_town_venues['category'].unique())))

There are 67 unique categories.


#### What are the top 20 most common venue types?

In [35]:
# Check the top 20 most frequently occurring venue types
singapore_town_venues.groupby('category')['VenueName'].count().sort_values(ascending=False)[:20]

category
Food Courts              92
Coffee Shops             62
Fast Food Restaurants    61
Chinese Restaurants      58
Cafés                    35
Japanese Restaurants     34
Asian Restaurants        22
Noodle Houses            20
Thai Restaurants         19
Sushi Restaurants        16
Seafood Restaurants      15
Italian Restaurants      13
Sandwich Places          13
Indian Restaurants       11
Bubble Tea Shops         11
American Restaurants      8
Burger Joints             8
Dim Sum Restaurants       8
Dessert Shops             8
Fried Chicken Joints      7
Name: VenueName, dtype: int64

#### Which town and category combinations have the highest mean score rating (top 20)?

In [36]:
# Top 20 town/category pairs by mean score rating
singapore_town_venues.groupby(['Town','category'])['score'].mean().sort_values(ascending=False)[:20]

Town           category            
MARINE PARADE  Food Courts             8.70
CENTRAL        Asian Restaurants       8.50
               Seafood Restaurants     8.35
YISHUN         Noodle Houses           8.30
CENTRAL        Udon Restaurants        8.30
GEYLANG        BBQ Joints              8.30
CENTRAL        Breweries               8.20
               Dumpling Restaurants    8.10
YISHUN         Bubble Tea Shops        8.10
               Chinese Restaurants     8.00
BEDOK          Chinese Restaurants     8.00
JURONG WEST    Wings Joints            8.00
CENTRAL        Bubble Tea Shops        8.00
HOUGANG        Thai Restaurants        7.90
JURONG EAST    Hotpot Restaurants      7.90
BUKIT PANJANG  Cafés                   7.90
CENTRAL        Bistros                 7.90
BUKIT PANJANG  Thai Restaurants        7.85
GEYLANG        Cafés                   7.80
TAMPINES       Fried Chicken Joints    7.80
Name: score, dtype: float64

## Analyze each Singapore town's nearby recommended venues

In [37]:
# one hot encoding
sg_onehot = pd.get_dummies(singapore_town_venues[['category']], prefix="", prefix_sep="")

# add Town column back to dataframe
sg_onehot['Town'] = singapore_town_venues['Town'] 

# move the Town column to the first column
fixed_columns = [sg_onehot.columns[-1]] + list(sg_onehot.columns[:-1])
sg_onehot = sg_onehot[fixed_columns]

# Check returned one hot encoding data:
print('One hot encoding returned "{}" rows.'.format(sg_onehot.shape[0]))

# Regroup rows by town and mean of frequency occurrence per category.
sg_grouped = sg_onehot.groupby('Town').mean().reset_index()

print('One hot encoding re-group returned "{}" rows.'.format(sg_grouped.shape[0]))
sg_grouped.head()

One hot encoding returned "644" rows.
One hot encoding re-group returned "25" rows.


Unnamed: 0,Town,American Restaurants,Asian Restaurants,BBQ Joints,Bakeries,Bars,Bistros,Breakfast Spots,Breweries,Bubble Tea Shops,Buffets,Burger Joints,Cafeterias,Cafés,Cantonese Restaurants,Cha Chaan Tengs,Chinese Breakfast Places,Chinese Restaurants,Coffee Shops,Comfort Food Restaurants,Dessert Shops,Dim Sum Restaurants,Diners,Dongbei Restaurants,Dumpling Restaurants,English Restaurants,Fast Food Restaurants,Fish & Chips Shops,Food Courts,French Restaurants,Fried Chicken Joints,Frozen Yogurt Shops,Gastropubs,Grocery Stores,Hainan Restaurants,Halal Restaurants,Hong Kong Restaurants,Hotpot Restaurants,Ice Cream Shops,Indian Restaurants,Indonesian Restaurants,Italian Restaurants,Japanese Curry Restaurants,Japanese Restaurants,Korean Restaurants,Macanese Restaurants,Malay Restaurants,Mexican Restaurants,Miscellaneous Shops,Modern European Restaurants,Noodle Houses,Pizza Places,Portuguese Restaurants,Ramen Restaurants,Restaurants,Sandwich Places,Seafood Restaurants,Shopping Malls,Snack Places,Soup Places,Steakhouses,Sushi Restaurants,Taiwanese Restaurants,Thai Restaurants,Udon Restaurants,Vegetarian / Vegan Restaurants,Vietnamese Restaurants,Wings Joints
0,ANG MO KIO,0.0,0.0,0.0,0.029412,0.0,0.0,0.029412,0.0,0.029412,0.0,0.029412,0.0,0.058824,0.0,0.0,0.0,0.0,0.029412,0.0,0.058824,0.0,0.029412,0.0,0.0,0.0,0.117647,0.0,0.176471,0.0,0.029412,0.0,0.0,0.0,0.0,0.029412,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.029412,0.029412,0.029412,0.0,0.0,0.029412,0.029412,0.029412,0.0,0.0,0.029412,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0
1,BEDOK,0.034483,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.034483,0.0,0.0,0.0,0.034483,0.206897,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.068966,0.0,0.103448,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.034483,0.034483,0.034483,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.034483
2,BISHAN,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.111111,0.138889,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.083333,0.0,0.083333,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.027778,0.0,0.027778,0.0,0.111111,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.027778,0.0,0.027778,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0
3,BUKIT BATOK,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.136364,0.136364,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.136364,0.0,0.272727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.045455,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0
4,BUKIT MERAH,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.444444,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
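
The regrouping step above can be illustrated on a toy example. This is a minimal sketch with made-up towns and venues (the `venues` frame and its values are assumptions, not data from this notebook): averaging the 0/1 one-hot columns within a town gives the share of that town's venues in each category, so each row of the grouped table sums to 1. A value such as 0.029412 in the ANG MO KIO row would be consistent with one venue out of 34, although the per-town counts are not shown here.

```python
import pandas as pd

# Toy venue list (hypothetical data, not taken from the notebook).
venues = pd.DataFrame({
    "Town": ["A", "A", "A", "A", "B", "B"],
    "Category": ["Food Courts", "Food Courts", "Cafés", "Coffee Shops",
                 "Coffee Shops", "Coffee Shops"],
})

# One row per venue, one 0/1 column per category.
onehot = pd.get_dummies(venues["Category"], dtype=float)
onehot.insert(0, "Town", venues["Town"])

# The mean of a 0/1 column within a town is the fraction of that
# town's venues that fall in the category.
grouped = onehot.groupby("Town").mean().reset_index()

# Each town's frequencies should add up to 1.
row_sums = grouped.drop(columns="Town").sum(axis=1)
print(grouped)
```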


## Analyze the most common venue categories in each Singapore town

In [38]:
num_top_venues = 10
for town in sg_grouped['Town']:
    print("# Town=< "+town+" >")
    temp = sg_grouped[sg_grouped['Town'] == town].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

# Town=< ANG MO KIO >
                   venue  freq
0            Food Courts  0.18
1  Fast Food Restaurants  0.12
2   Japanese Restaurants  0.06
3          Dessert Shops  0.06
4      Sushi Restaurants  0.06
5                  Cafés  0.06
6           Snack Places  0.03
7          Noodle Houses  0.03
8      Ramen Restaurants  0.03
9            Restaurants  0.03


# Town=< BEDOK >
                    venue  freq
0            Coffee Shops  0.21
1             Food Courts  0.10
2       Sushi Restaurants  0.07
3    Japanese Restaurants  0.07
4   Fast Food Restaurants  0.07
5    American Restaurants  0.03
6     Chinese Restaurants  0.03
7         Sandwich Places  0.03
8  Indonesian Restaurants  0.03
9      Indian Restaurants  0.03


# Town=< BISHAN >
                   venue  freq
0           Coffee Shops  0.14
1   Japanese Restaurants  0.11
2    Chinese Restaurants  0.11
3            Food Courts  0.08
4  Fast Food Restaurants  0.08
5                  Cafés  0.08
6       Bubble Tea Shops  0.0

First, let's write a function to sort the venues in descending order.

In [39]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]
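
Many categories share the same frequency within a town, so the order among tied categories depends on the sorting algorithm. That may be why the printed top-10 lists and the sorted table can rank tied categories differently. The variant below is a sketch only (the name `most_common_venues_stable` and the sample row are assumptions): it requests a stable sort so that tied categories keep their original column order.

```python
import pandas as pd

def most_common_venues_stable(row, num_top_venues):
    """Like return_most_common_venues, but ties keep their column order."""
    # Skip the first entry (the town name) and make the values numeric.
    row_categories = row.iloc[1:].astype(float)
    # 'mergesort' is a stable algorithm, so equal frequencies are not reshuffled.
    ordered = row_categories.sort_values(ascending=False, kind="mergesort")
    return ordered.index.values[:num_top_venues]

# Hypothetical row: 'Bakeries' and 'Diners' are tied at 0.1.
row = pd.Series({"Town": "X", "Bakeries": 0.1, "Cafés": 0.3,
                 "Diners": 0.1, "Food Courts": 0.5})
top3 = most_common_venues_stable(row, 3)
print(list(top3))
```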

In [40]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Town']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
town_venues_sorted = pd.DataFrame(columns=columns)
town_venues_sorted['Town'] = sg_grouped['Town']

for ind in np.arange(sg_grouped.shape[0]):
    town_venues_sorted.iloc[ind, 1:] = return_most_common_venues(sg_grouped.iloc[ind, :], num_top_venues)

print(town_venues_sorted.shape)
town_venues_sorted.head()

(25, 11)


Unnamed: 0,Town,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,ANG MO KIO,Food Courts,Fast Food Restaurants,Dessert Shops,Japanese Restaurants,Sushi Restaurants,Cafés,Ramen Restaurants,Hong Kong Restaurants,Halal Restaurants,Miscellaneous Shops
1,BEDOK,Coffee Shops,Food Courts,Sushi Restaurants,Japanese Restaurants,Fast Food Restaurants,Wings Joints,Fried Chicken Joints,Indian Restaurants,Ice Cream Shops,Hotpot Restaurants
2,BISHAN,Coffee Shops,Japanese Restaurants,Chinese Restaurants,Fast Food Restaurants,Food Courts,Cafés,Bubble Tea Shops,American Restaurants,Portuguese Restaurants,Dumpling Restaurants
3,BUKIT BATOK,Food Courts,Coffee Shops,Fast Food Restaurants,Chinese Restaurants,Asian Restaurants,Thai Restaurants,Pizza Places,Ice Cream Shops,Japanese Restaurants,Sandwich Places
4,BUKIT MERAH,Chinese Restaurants,Coffee Shops,Food Courts,Bistros,Cafés,Dongbei Restaurants,Comfort Food Restaurants,Dessert Shops,Dim Sum Restaurants,Diners
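
The column labels above rely on a three-item suffix list and fall back to "th" for every later position. That works for ten venues, but would label position 21 as "21th". A small helper can handle the general case; this is a sketch, and the function name `ordinal` is an assumption rather than part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # 11, 12 and 13 (and 111, 112, ...) take 'th' despite ending in 1, 2, 3.
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

num_top_venues = 10
columns = ["Town"] + [f"{ordinal(i)} Most Common Venue"
                      for i in range(1, num_top_venues + 1)]
print(columns)
```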


## Clustering Neighborhoods
Run *k*-means to cluster the Towns into 5 clusters.

In [41]:
# set number of clusters
kclusters = 5
sg_grouped_clustering = sg_grouped.drop(columns='Town')
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=1).fit(sg_grouped_clustering)

# check cluster labels generated for each row in the dataframe
print(kmeans.labels_[0:10])
print(len(kmeans.labels_))

[1 3 3 1 2 4 3 1 1 4]
25
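
The notebook fixes the number of clusters at 5. One common way to sanity-check such a choice is to compare the inertia and silhouette score across several values of *k*. The sketch below uses synthetic data as a stand-in for `sg_grouped_clustering` (the array `X` and the range of *k* are assumptions), so its numbers say nothing about the Singapore results.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Synthetic stand-in: 25 rows of category frequencies that each sum to 1.
X = rng.dirichlet(np.ones(8), size=25)

scores = {}
for k in range(2, 8):
    model = KMeans(n_clusters=k, random_state=1, n_init=10).fit(X)
    # Lower inertia means tighter clusters; a higher silhouette means
    # better-separated clusters.
    scores[k] = (model.inertia_, silhouette_score(X, model.labels_))

for k, (inertia, sil) in scores.items():
    print(f"k={k}  inertia={inertia:.3f}  silhouette={sil:.3f}")
```

On the real data, the same loop could be run with `sg_grouped_clustering` in place of `X`.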


In [42]:
town_venues_sorted.head()

Unnamed: 0,Town,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,ANG MO KIO,Food Courts,Fast Food Restaurants,Dessert Shops,Japanese Restaurants,Sushi Restaurants,Cafés,Ramen Restaurants,Hong Kong Restaurants,Halal Restaurants,Miscellaneous Shops
1,BEDOK,Coffee Shops,Food Courts,Sushi Restaurants,Japanese Restaurants,Fast Food Restaurants,Wings Joints,Fried Chicken Joints,Indian Restaurants,Ice Cream Shops,Hotpot Restaurants
2,BISHAN,Coffee Shops,Japanese Restaurants,Chinese Restaurants,Fast Food Restaurants,Food Courts,Cafés,Bubble Tea Shops,American Restaurants,Portuguese Restaurants,Dumpling Restaurants
3,BUKIT BATOK,Food Courts,Coffee Shops,Fast Food Restaurants,Chinese Restaurants,Asian Restaurants,Thai Restaurants,Pizza Places,Ice Cream Shops,Japanese Restaurants,Sandwich Places
4,BUKIT MERAH,Chinese Restaurants,Coffee Shops,Food Courts,Bistros,Cafés,Dongbei Restaurants,Comfort Food Restaurants,Dessert Shops,Dim Sum Restaurants,Diners


In [43]:
town_venues_sorted = town_venues_sorted.set_index("Town")
sg_merged = singapore_average_rental_prices_by_town.set_index("Town")
# add clustering labels (this assumes the rows are in the same town order as sg_grouped)
sg_merged['Cluster Labels'] = kmeans.labels_
# join the top-venue columns onto the rental prices and coordinates of each town
sg_merged = sg_merged.join(town_venues_sorted)
sg_merged

Town,median_rent,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
ANG MO KIO,2033.333333,1.369972,103.849588,1,Food Courts,Fast Food Restaurants,Dessert Shops,Japanese Restaurants,Sushi Restaurants,Cafés,Ramen Restaurants,Hong Kong Restaurants,Halal Restaurants,Miscellaneous Shops
BEDOK,2087.5,1.324011,103.930172,3,Coffee Shops,Food Courts,Sushi Restaurants,Japanese Restaurants,Fast Food Restaurants,Wings Joints,Fried Chicken Joints,Indian Restaurants,Ice Cream Shops,Hotpot Restaurants
BISHAN,2233.333333,1.351042,103.84993,3,Coffee Shops,Japanese Restaurants,Chinese Restaurants,Fast Food Restaurants,Food Courts,Cafés,Bubble Tea Shops,American Restaurants,Portuguese Restaurants,Dumpling Restaurants
BUKIT BATOK,1962.5,1.348506,103.749222,1,Food Courts,Coffee Shops,Fast Food Restaurants,Chinese Restaurants,Asian Restaurants,Thai Restaurants,Pizza Places,Ice Cream Shops,Japanese Restaurants,Sandwich Places
BUKIT MERAH,2162.5,1.289642,103.816798,2,Chinese Restaurants,Coffee Shops,Food Courts,Bistros,Cafés,Dongbei Restaurants,Comfort Food Restaurants,Dessert Shops,Dim Sum Restaurants,Diners
BUKIT PANJANG,1737.5,1.276068,103.791904,4,Chinese Restaurants,Thai Restaurants,Food Courts,Indian Restaurants,Cafés,Asian Restaurants,Noodle Houses,Seafood Restaurants,Vietnamese Restaurants,Burger Joints
CENTRAL,2450.0,1.288155,103.846718,3,Cafés,Chinese Restaurants,Food Courts,Coffee Shops,Ramen Restaurants,Japanese Restaurants,Fast Food Restaurants,Noodle Houses,Diners,Hotpot Restaurants
CHOA CHU KANG,1933.333333,1.385385,103.744337,1,Fast Food Restaurants,Food Courts,Noodle Houses,Coffee Shops,Asian Restaurants,Thai Restaurants,Bakeries,Italian Restaurants,Dessert Shops,Cafés
CLEMENTI,2263.333333,1.31507,103.765246,1,Food Courts,Fast Food Restaurants,Coffee Shops,Fried Chicken Joints,Asian Restaurants,Thai Restaurants,Dim Sum Restaurants,Japanese Restaurants,Chinese Restaurants,Cafés
GEYLANG,2166.666667,1.316367,103.882772,4,Chinese Restaurants,Dim Sum Restaurants,Food Courts,Noodle Houses,Coffee Shops,Seafood Restaurants,Vegetarian / Vegan Restaurants,Steakhouses,Fast Food Restaurants,BBQ Joints
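
Assigning `kmeans.labels_` directly to a column is only correct when both frames list the towns in the same order. A more defensive option is to attach the labels by town name. The sketch below uses toy frames (`grouped`, `rents` and their values are assumptions) whose rows are deliberately in different orders.

```python
import pandas as pd

# Toy stand-ins (hypothetical values) for the notebook's frames.
grouped = pd.DataFrame({"Town": ["A", "B", "C"]})
labels = [2, 0, 1]  # as if taken from kmeans.labels_, in grouped row order
rents = pd.DataFrame({"Town": ["C", "A", "B"],
                      "median_rent": [1900.0, 2100.0, 2300.0]})

# Index the labels by town name so that the join aligns on names,
# not on row position.
cluster_by_town = pd.Series(labels, index=grouped["Town"], name="Cluster Labels")
merged = rents.set_index("Town").join(cluster_by_town)
print(merged)
```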


* Save csv copy of merged data

In [44]:
# Save town cluster collection. 
# This list is already interesting data for display in different webpages.
fileName = "sg_top_common_venues.csv"
linkName = "IBM Storage Link:" + fileName
# pass the merged town/cluster table described above
create_download_link(sg_merged, linkName, fileName)
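
The `create_download_link` helper is defined elsewhere in the notebook and is not shown in this section. As a rough idea of how such a helper might work, the sketch below embeds a DataFrame as a base64-encoded CSV inside an HTML link. The function name `make_csv_download_link` and the demo frame are assumptions, not the notebook's actual implementation.

```python
import base64
import pandas as pd

def make_csv_download_link(df, title, filename):
    """Return an HTML anchor embedding df as a base64-encoded CSV (hypothetical helper)."""
    payload = base64.b64encode(df.to_csv().encode("utf-8")).decode("ascii")
    return (f'<a download="{filename}" '
            f'href="data:text/csv;base64,{payload}" target="_blank">{title}</a>')

# Small demo frame using two rows of rent values from the merged table.
demo = pd.DataFrame({"median_rent": [2033.333333, 2087.5]},
                    index=pd.Index(["ANG MO KIO", "BEDOK"], name="Town"))
link_html = make_csv_download_link(demo, "Download sg_top_common_venues.csv",
                                   "sg_top_common_venues.csv")
```

In a notebook, the returned string could be wrapped in `IPython.display.HTML` to render a clickable link.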

In [45]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], tiles="Openstreetmap", zoom_start=11)

# set color scheme for the clusters: one rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(sg_merged['Latitude'], sg_merged['Longitude'], sg_merged.index.values,kmeans.labels_):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=10,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=1).add_to(map_clusters)
       
map_clusters


<img src="https://raw.githubusercontent.com/crismag/Coursera_Capstone/master/saved_data/clustered_vicimap.png" width="800" align="center">
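
The map code indexes the palette with `rainbow[cluster-1]`, so cluster 0 wraps around to the last color. Every cluster still gets a distinct color, but the mapping is easier to read when stated explicitly. The sketch below builds a label-to-color dictionary, assuming `kclusters = 5` as in the notebook. The name `color_for_cluster` is an assumption, and the colors it assigns to each label differ from those in the map above.

```python
import numpy as np
from matplotlib import cm, colors

kclusters = 5  # same value as in the notebook

# Sample kclusters evenly spaced colors from the rainbow colormap
# and convert each RGBA tuple to a hex string.
palette = [colors.rgb2hex(rgba)
           for rgba in cm.rainbow(np.linspace(0, 1, kclusters))]

# Explicit mapping from cluster label (0 .. kclusters-1) to a hex color.
color_for_cluster = {label: palette[label] for label in range(kclusters)}
print(color_for_cluster)
```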

## Discussion and Conclusion

This notebook presented an analysis of town venue recommendations based on the Food venue category. The same approach can produce recommendations for other user searches, such as outdoor and recreation areas. Because Singapore is a small country with many interesting venues scattered across its towns, the town-level information extracted here can supplement web-based recommendations, helping visitors find nearby venues of interest and decide where to stay or where to go during their visit.

Using the Foursquare API, we collected a good number of venue recommendations for Singapore towns. Sourcing recommendations from Foursquare has limitations: the list is not an exhaustive record of all the venues in an area, and not every venue found has a stored rating. For this reason, the analyzed venues amount to only about 50% of the venues initially collected. The results may therefore change significantly once more information is collected for the venues with missing data.

The generated clusters show that there are very good and interesting places in areas where median rents are lower, which may be of particular interest to travelers on a budget. The results also yielded some interesting findings. For instance, the initial assumption among websites providing recommendations is that the Central Area, which has the highest median rent, also has the better food venues. The results show, however, that Marine Parade, a cheaper location, has better-rated food courts. The most popular food venues among Singaporeans, residents, and visitors are **Food Courts, Coffee Shops and Fast Food Restaurants**. The highest-rated Food Courts are located in __Marine Parade__ and the __Central Area__.
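
One way to examine the relationship between clusters and rent is to summarize the median rent within each cluster. The sketch below uses only the ten rows displayed in the `sg_merged` output earlier in this notebook, not the full set of 25 towns, so it illustrates the method rather than the complete result. The frame name `shown` is an assumption.

```python
import pandas as pd

# The ten towns displayed in the sg_merged output (25 towns exist in total).
shown = pd.DataFrame({
    "Town": ["ANG MO KIO", "BEDOK", "BISHAN", "BUKIT BATOK", "BUKIT MERAH",
             "BUKIT PANJANG", "CENTRAL", "CHOA CHU KANG", "CLEMENTI", "GEYLANG"],
    "median_rent": [2033.333333, 2087.5, 2233.333333, 1962.5, 2162.5,
                    1737.5, 2450.0, 1933.333333, 2263.333333, 2166.666667],
    "Cluster Labels": [1, 3, 3, 1, 2, 4, 3, 1, 1, 4],
})

# Count the towns and summarize the rent within each cluster.
summary = (shown.groupby("Cluster Labels")["median_rent"]
                .agg(["count", "mean", "min", "max"])
                .round(1))
print(summary)
```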


I will provide separate, supplementary inferential statistics on the collected data, and will extend the analysis to other categories in a new notebook. For now, this completes the requirements for this task.

Thank you.

Cris Magalang  
email: crism.dev@gmail.com  
linkedin: www.linkedin.com/in/crismagalang  


Created For: COURSERA __**IBM Applied Data Science Capstone** Project__