# The Kopi Latte Ratio Project: Data Collection

The objective of this notebook is to collect location and review data for coffee shops and kopitiams across Singapore by querying the Google Maps Places API.

## Get details of planning areas in Singapore using OneMap API

The Urban Redevelopment Authority (URA) has delineated a total of 55 planning areas in Singapore, which can be used in the Google Maps queries. Here, I am using the OneMap API to pull the names and geographic data of the planning areas to use in the analysis.

### Load Credentials & Get OneMap Access Token

In [1]:
import os

import requests
from dotenv import load_dotenv

# Load OneMap credentials from the .env file in the root folder
load_dotenv()
onemap_email = os.getenv("ONEMAP_ACCOUNT_EMAIL")
onemap_password = os.getenv("ONEMAP_ACCOUNT_PASSWORD")

# Get an auth token with a POST request
url = "https://www.onemap.gov.sg/api/auth/post/getToken"

payload = {
    "email": onemap_email,
    "password": onemap_password,
}

response = requests.post(url, json=payload)

access_token = response.json().get("access_token")
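
If either credential is missing from the `.env` file, `os.getenv` returns `None` and the token request fails in a less obvious way. A minimal sketch of a guard that fails fast instead, assuming a hypothetical helper `require_env` (not part of the original notebook) and a placeholder variable name so that it runs without real credentials:

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty.

    `require_env` is a hypothetical helper for illustration.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value


# Demo with a placeholder variable name so the example runs without real credentials
os.environ["EXAMPLE_ACCOUNT_EMAIL"] = "user@example.com"
example_email = require_env("EXAMPLE_ACCOUNT_EMAIL")
```

In the notebook, the same helper could wrap the lookups for `ONEMAP_ACCOUNT_EMAIL` and `ONEMAP_ACCOUNT_PASSWORD`.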

### Request List of Planning Areas and GeoJSON Polygons

In [2]:
import pandas as pd

url = "https://www.onemap.gov.sg/api/public/popapi/getAllPlanningarea?year=2019"

headers = {"Authorization": access_token}

response = requests.get(url, headers=headers)

# The response is a list of planning area names and their respective GeoJSON polygons.
# Parse the response and save it as a pandas DataFrame.
planning_areas = pd.DataFrame(response.json()["SearchResults"])
planning_areas.head()

Unnamed: 0,pln_area_n,geojson
0,BEDOK,"{""type"":""MultiPolygon"",""coordinates"":[[[[103.9..."
1,BUKIT TIMAH,"{""type"":""MultiPolygon"",""coordinates"":[[[[103.7..."
2,BUKIT BATOK,"{""type"":""MultiPolygon"",""coordinates"":[[[[103.7..."
3,BUKIT MERAH,"{""type"":""MultiPolygon"",""coordinates"":[[[[103.8..."
4,CENTRAL WATER CATCHMENT,"{""type"":""MultiPolygon"",""coordinates"":[[[[103.8..."
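
The `geojson` column holds each polygon as a JSON string rather than a parsed object. As a rough sketch of how it might be decoded later, the snippet below parses a MultiPolygon string with `json.loads` and computes its bounding box. The sample coordinates are made up for illustration, and `bounding_box` is a hypothetical helper rather than part of the notebook.

```python
import json


def bounding_box(geojson_str):
    """Return (min_lng, min_lat, max_lng, max_lat) for a GeoJSON MultiPolygon string."""
    geometry = json.loads(geojson_str)
    # MultiPolygon nesting: polygons -> rings -> [lng, lat] positions
    points = [
        position
        for polygon in geometry["coordinates"]
        for ring in polygon
        for position in ring
    ]
    lngs = [p[0] for p in points]
    lats = [p[1] for p in points]
    return min(lngs), min(lats), max(lngs), max(lats)


# Made-up square, not a real planning area
sample = json.dumps({
    "type": "MultiPolygon",
    "coordinates": [[[[103.90, 1.30], [103.95, 1.30], [103.95, 1.35], [103.90, 1.35], [103.90, 1.30]]]],
})
bbox = bounding_box(sample)
```

Applied to the DataFrame, this could look like `planning_areas['geojson'].apply(bounding_box)`.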


## Query Google Maps Places API for Location & Review Data

With the list of planning areas in hand, the next step is to prepare the queries to send to the Google Maps Places API. My approach is to concatenate each location type (e.g. 'cafe') with each planning area (e.g. 'BUKIT TIMAH'). This gives us a comprehensive list of queries (e.g. 'cafe in BUKIT TIMAH') for extracting the location data.

### Prepare location queries for Google Maps Places API

In [3]:
def prepare_queries(queries, locations):
    """
    Prepare a list of queries for Place Search

    Parameters:
        queries (list): A list of queries to search for. Eg ["coffee", "coffee shops", "kopi"]
        locations (list): A list of locations to search in for each query in queries. Eg ["bishan", "orchard", "ang mo kio"]

    Returns: A list of queries after concatenating queries and locations. Eg 'coffee in bishan'
    """

    # Use the function's own parameters rather than the global lists
    return [f"{query} in {location}" for query in queries for location in locations]

In [4]:
query_list = ['coffee', 'coffee shop', 'kopi', 'kopitiam', 'cafe', 'latte']
location_list = planning_areas['pln_area_n']

queries = prepare_queries(query_list, location_list)
queries[0:2] + queries[-2:]

['coffee in BEDOK',
 'coffee in BUKIT TIMAH',
 'latte in SUNGEI KADUT',
 'latte in YISHUN']
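
Since every query may span several result pages, it helps to estimate the request volume before running the full list. A back-of-the-envelope sketch, assuming 55 planning areas (the URA count quoted earlier; the number returned by the API may differ) and at most 3 pages per query (a 60-result cap at 20 results per page). `max_requests` is a hypothetical helper:

```python
def max_requests(n_terms, n_locations, pages_per_query=3):
    """Upper bound on Text Search requests: one per page of every term/location pair."""
    return n_terms * n_locations * pages_per_query


n_queries = 6 * 55              # 6 search terms x 55 planning areas
upper_bound = max_requests(6, 55)
print(n_queries, upper_bound)   # 330 990
```

The real number of requests is likely lower, because many queries return fewer than three pages.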

### Extract location data from Google Maps Places API

Each Place Search query returns at most 60 results, paginated at 20 results per response. Hence, I defined a function that follows the pagination and returns all available results (up to 60) for a query.

In [17]:
import time


def get_all_query_results(query, gmaps_client):
    """
    Get the maximum number of results (60) for each Google Places API Text Search query
    """
    results = []

    # Search for places using the query and save the value of next_page_token
    print("Requesting results for", query)
    response = gmaps_client.places(query=query)
    results.extend(response.get("results", []))

    next_page_token = response.get("next_page_token")

    # Keep requesting pages until the response no longer includes a next_page_token
    while next_page_token is not None:
        # The token takes a short while to become valid after it is issued
        time.sleep(2)
        response = gmaps_client.places(page_token=next_page_token)
        results.extend(response.get("results", []))
        next_page_token = response.get("next_page_token")

    print("next_page_token not found. Returning list of jsons for", query)

    return results
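
To check the pagination logic without spending API quota, the same loop can be run against a stand-in client. The sketch below is only an illustration: `FakeClient` is a hypothetical stub that mimics just the `results` and `next_page_token` keys of a response, and `collect_pages` mirrors the loop above with a configurable delay.

```python
import time


class FakeClient:
    """Hypothetical stub that serves three pages of fake results."""

    def __init__(self):
        self.pages = {
            None: {"results": [{"place_id": "a"}, {"place_id": "b"}], "next_page_token": "t1"},
            "t1": {"results": [{"place_id": "c"}], "next_page_token": "t2"},
            "t2": {"results": [{"place_id": "d"}]},  # last page: no token
        }

    def places(self, query=None, page_token=None):
        # The first call has no page_token, so it returns the page stored under None
        return self.pages[page_token]


def collect_pages(query, client, delay=2):
    """Follow next_page_token until it disappears and return all results."""
    response = client.places(query=query)
    results = list(response.get("results", []))
    token = response.get("next_page_token")
    while token is not None:
        time.sleep(delay)  # wait before using the token
        response = client.places(page_token=token)
        results.extend(response.get("results", []))
        token = response.get("next_page_token")
    return results


fake_results = collect_pages("coffee in BEDOK", FakeClient(), delay=0)
place_ids = [r["place_id"] for r in fake_results]
```

Passing `delay=0` keeps the offline check fast; against the real API the pause should stay in place.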


In [20]:
for query in queries[0:5]:
    print(query)

coffee in BEDOK
coffee in BUKIT TIMAH
coffee in BUKIT BATOK
coffee in BUKIT MERAH
coffee in CENTRAL WATER CATCHMENT


In [21]:
import googlemaps
import time

load_dotenv()
places_api_key = os.getenv("PLACES_API_KEY")

# Initialise the Google Maps API client
gmaps = googlemaps.Client(key=places_api_key)

# Loop through the first two queries and add their results to the results list
results = []
for query in queries[0:2]:
    results.extend(get_all_query_results(query, gmaps))

Requesting results for coffee in BEDOK
next_page_token not found. Returning list of jsons for coffee in BEDOK
Requesting results for coffee in BUKIT TIMAH
next_page_token not found. Returning list of jsons for coffee in BUKIT TIMAH


In [31]:
results[0]

{'business_status': 'OPERATIONAL',
 'formatted_address': '136 Bedok North Ave 3, #01-152, Singapore 460136',
 'geometry': {'location': {'lat': 1.3282621, 'lng': 103.935252},
  'viewport': {'northeast': {'lat': 1.329290979892722,
    'lng': 103.9365587798927},
   'southwest': {'lat': 1.326591320107278, 'lng': 103.9338591201073}}},
 'icon': 'https://maps.gstatic.com/mapfiles/place_api/icons/v1/png_71/cafe-71.png',
 'icon_background_color': '#FF9E67',
 'icon_mask_base_uri': 'https://maps.gstatic.com/mapfiles/place_api/icons/v2/cafe_pinlet',
 'name': 'Percolate',
 'opening_hours': {'open_now': True},
 'photos': [{'height': 4032,
   'html_attributions': ['<a href="https://maps.google.com/maps/contrib/104988888938292694772">Dennis Neo</a>'],
   'photo_reference': 'AelY_CsXan0Ut4ztd9zxKtideH0P3RoLUIWHkpqRVxRS032KzuP6J3n2uId2URR9MuaO4b7nzejKDJvNxbgqTr9KDM4_G3ZfE2p9nRJGHVp5JF6YIBkLo8TREV3qAVwEg9KSBwq3BeH0WQHUrnfbwbejINm0vfm7-SChoXMJEpG61XmtX66x',
   'width': 3024}],
 'place_id': 'ChIJV6HD-Eo92j

In [32]:
# There may be duplicate places across queries. Save response as pandas df and remove duplicates
places_df = pd.json_normalize(results).drop_duplicates(['place_id'])
places_df.head()

Unnamed: 0,business_status,formatted_address,icon,icon_background_color,icon_mask_base_uri,name,photos,place_id,price_level,rating,...,user_ratings_total,geometry.location.lat,geometry.location.lng,geometry.viewport.northeast.lat,geometry.viewport.northeast.lng,geometry.viewport.southwest.lat,geometry.viewport.southwest.lng,opening_hours.open_now,plus_code.compound_code,plus_code.global_code
0,OPERATIONAL,"136 Bedok North Ave 3, #01-152, Singapore 460136",https://maps.gstatic.com/mapfiles/place_api/ic...,#FF9E67,https://maps.gstatic.com/mapfiles/place_api/ic...,Percolate,"[{'height': 4032, 'html_attributions': ['<a hr...",ChIJV6HD-Eo92jERjhfY7NEDrOM,2.0,4.4,...,1001,1.328262,103.935252,1.329291,103.936559,1.326591,103.933859,True,8WHP+84 Singapore,6PH58WHP+84
1,OPERATIONAL,"216 Bedok North Street 1, #01-32, Singapore 46...",https://maps.gstatic.com/mapfiles/place_api/ic...,#FF9E67,https://maps.gstatic.com/mapfiles/place_api/ic...,Generation Coffee Roasters (Bedok),"[{'height': 4000, 'html_attributions': ['<a hr...",ChIJhbwWY-I92jERxtB-gF22sL0,,4.6,...,187,1.327248,103.933039,1.32873,103.934367,1.32603,103.931668,False,8WGM+V6 Singapore,6PH58WGM+V6
2,OPERATIONAL,"744 Bedok Reservoir Rd, #01-3029 Reservoir Vil...",https://maps.gstatic.com/mapfiles/place_api/ic...,#FF9E67,https://maps.gstatic.com/mapfiles/place_api/ic...,Refuel Cafe,"[{'height': 2268, 'html_attributions': ['<a hr...",ChIJcf_SpPk82jERM28p3SYNBnI,2.0,4.2,...,1128,1.337519,103.921323,1.33891,103.922575,1.336211,103.919875,True,8WQC+2G Singapore,6PH58WQC+2G
3,OPERATIONAL,"537 Bedok North Street 3, #01-575, Singapore 4...",https://maps.gstatic.com/mapfiles/place_api/ic...,#FF9E67,https://maps.gstatic.com/mapfiles/place_api/ic...,Marie's Lapis Cafe,"[{'height': 3072, 'html_attributions': ['<a hr...",ChIJ3Vc6OY092jERhObI1bZ_4Sk,,4.7,...,367,1.331827,103.924498,1.333259,103.925757,1.330559,103.923058,True,8WJF+PQ Singapore,6PH58WJF+PQ
4,OPERATIONAL,"311 New Upper Changi Rd #01-78 Bedok Mall, Sin...",https://maps.gstatic.com/mapfiles/place_api/ic...,#FF9E67,https://maps.gstatic.com/mapfiles/place_api/ic...,COFFEESARANG,"[{'height': 526, 'html_attributions': ['<a hre...",ChIJOb_8OAwj2jERPK-QVelr5Vk,,4.1,...,184,1.325154,103.929854,1.326789,103.931245,1.324089,103.928545,True,8WGH+3W Singapore,6PH58WGH+3W
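
The same deduplication can also be done before building the DataFrame, which keeps the intermediate list small when many queries overlap. A minimal pure-Python sketch, assuming each result is a dict with a `place_id` key as in the output above; `dedupe_by_place_id` is a hypothetical helper and the sample records are made up:

```python
def dedupe_by_place_id(records):
    """Keep the first record seen for each place_id, preserving order."""
    seen = {}
    for record in records:
        # setdefault only stores the record if the place_id is new
        seen.setdefault(record["place_id"], record)
    return list(seen.values())


# Made-up records: "p1" appears under two different queries
sample_results = [
    {"place_id": "p1", "name": "Cafe A"},
    {"place_id": "p2", "name": "Cafe B"},
    {"place_id": "p1", "name": "Cafe A"},
]
unique_results = dedupe_by_place_id(sample_results)
```

Like `drop_duplicates` with its default settings, this keeps the first occurrence of each place.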


In [None]:
#places_df.to_csv('places.csv', index=False)

### Get reviews for places

In [24]:
# Get a deduplicated list of place IDs for all places in places_df with at least one user rating
review_place_ids = places_df.loc[places_df["user_ratings_total"] > 0, "place_id"].drop_duplicates().tolist()

# Request place details (which include the reviews) for each place ID in review_place_ids
reviews = []
for place_id in review_place_ids:
    response = gmaps.place(place_id)
    reviews.append(response)

reviews[0]

{'html_attributions': [],
 'result': {'address_components': [{'long_name': '#01-152',
    'short_name': '#01-152',
    'types': ['subpremise']},
   {'long_name': '136', 'short_name': '136', 'types': ['street_number']},
   {'long_name': 'Bedok North Avenue 3',
    'short_name': 'Bedok North Ave 3',
    'types': ['route']},
   {'long_name': 'Bedok',
    'short_name': 'Bedok',
    'types': ['neighborhood', 'political']},
   {'long_name': 'Singapore',
    'short_name': 'Singapore',
    'types': ['locality', 'political']},
   {'long_name': 'Singapore',
    'short_name': 'SG',
    'types': ['country', 'political']},
   {'long_name': '460136', 'short_name': '460136', 'types': ['postal_code']}],
  'adr_address': '<span class="street-address">136 Bedok North Ave 3</span>, #01-152, <span class="country-name">Singapore</span> <span class="postal-code">460136</span>',
  'business_status': 'OPERATIONAL',
  'curbside_pickup': True,
  'current_opening_hours': {'open_now': True,
   'periods': [{'close

In [26]:
reviews_df = pd.concat([pd.json_normalize(place['result']).explode('reviews') for place in reviews])
reviews_df.head()

Unnamed: 0,address_components,adr_address,business_status,curbside_pickup,delivery,dine_in,formatted_address,formatted_phone_number,icon,icon_background_color,...,geometry.viewport.southwest.lat,geometry.viewport.southwest.lng,opening_hours.open_now,opening_hours.periods,opening_hours.weekday_text,plus_code.compound_code,plus_code.global_code,serves_wine,serves_vegetarian_food,secondary_opening_hours
0,"[{'long_name': '#01-152', 'short_name': '#01-1...","<span class=""street-address"">136 Bedok North A...",OPERATIONAL,True,True,True,"136 Bedok North Ave 3, #01-152, Singapore 460136",8259 0316,https://maps.gstatic.com/mapfiles/place_api/ic...,#FF9E67,...,1.326592,103.93386,True,"[{'close': {'day': 0, 'time': '1900'}, 'open':...","[Monday: 9:00 AM – 7:00 PM, Tuesday: 9:00 AM –...",8WHP+84 Singapore,6PH58WHP+84,,,
0,"[{'long_name': '#01-152', 'short_name': '#01-1...","<span class=""street-address"">136 Bedok North A...",OPERATIONAL,True,True,True,"136 Bedok North Ave 3, #01-152, Singapore 460136",8259 0316,https://maps.gstatic.com/mapfiles/place_api/ic...,#FF9E67,...,1.326592,103.93386,True,"[{'close': {'day': 0, 'time': '1900'}, 'open':...","[Monday: 9:00 AM – 7:00 PM, Tuesday: 9:00 AM –...",8WHP+84 Singapore,6PH58WHP+84,,,
0,"[{'long_name': '#01-152', 'short_name': '#01-1...","<span class=""street-address"">136 Bedok North A...",OPERATIONAL,True,True,True,"136 Bedok North Ave 3, #01-152, Singapore 460136",8259 0316,https://maps.gstatic.com/mapfiles/place_api/ic...,#FF9E67,...,1.326592,103.93386,True,"[{'close': {'day': 0, 'time': '1900'}, 'open':...","[Monday: 9:00 AM – 7:00 PM, Tuesday: 9:00 AM –...",8WHP+84 Singapore,6PH58WHP+84,,,
0,"[{'long_name': '#01-152', 'short_name': '#01-1...","<span class=""street-address"">136 Bedok North A...",OPERATIONAL,True,True,True,"136 Bedok North Ave 3, #01-152, Singapore 460136",8259 0316,https://maps.gstatic.com/mapfiles/place_api/ic...,#FF9E67,...,1.326592,103.93386,True,"[{'close': {'day': 0, 'time': '1900'}, 'open':...","[Monday: 9:00 AM – 7:00 PM, Tuesday: 9:00 AM –...",8WHP+84 Singapore,6PH58WHP+84,,,
0,"[{'long_name': '#01-152', 'short_name': '#01-1...","<span class=""street-address"">136 Bedok North A...",OPERATIONAL,True,True,True,"136 Bedok North Ave 3, #01-152, Singapore 460136",8259 0316,https://maps.gstatic.com/mapfiles/place_api/ic...,#FF9E67,...,1.326592,103.93386,True,"[{'close': {'day': 0, 'time': '1900'}, 'open':...","[Monday: 9:00 AM – 7:00 PM, Tuesday: 9:00 AM –...",8WHP+84 Singapore,6PH58WHP+84,,,
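
Because `explode` repeats every place-level column for each review, the rows above look identical until the `reviews` column itself is inspected. As a sketch of a narrower alternative, the snippet below builds one row per review with only a few fields. It assumes each review is a dict with `author_name`, `rating` and `text` keys; the sample response is made up and `flatten_reviews` is a hypothetical helper.

```python
def flatten_reviews(place_details):
    """Return one dict per review across a list of Place Details responses."""
    rows = []
    for details in place_details:
        result = details.get("result", {})
        # Some places may come back without a "reviews" key
        for review in result.get("reviews", []):
            rows.append({
                "place_id": result.get("place_id"),
                "name": result.get("name"),
                "author_name": review.get("author_name"),
                "rating": review.get("rating"),
                "text": review.get("text"),
            })
    return rows


# Made-up responses shaped like the Place Details output
sample_details = [
    {"result": {"place_id": "p1", "name": "Cafe A", "reviews": [
        {"author_name": "X", "rating": 5, "text": "Great kopi"},
        {"author_name": "Y", "rating": 3, "text": "Average latte"},
    ]}},
    {"result": {"place_id": "p2", "name": "Cafe B"}},  # no reviews key
]
review_rows = flatten_reviews(sample_details)
```

Calling `pd.DataFrame(review_rows)` would then give a compact reviews table with one row per review.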


In [None]:
#reviews_df.to_csv("reviews.csv", index=False)