This notebook experiments with the Google Maps API, as well as other services, to query the distance between HDB flats and potential points of interest.

In [1]:
import requests
import googlemaps
import pandas as pd
from dotenv import load_dotenv
import os
import geopandas as gpd
from shapely.geometry import Point

load_dotenv()
API_KEY = os.getenv("GOOGLE_MAPS_API_KEY")
gmaps = googlemaps.Client(key=API_KEY)

Since querying the coordinates of HDB flats using Google Maps can incur high costs, we experiment with free services such as the Nominatim API, a geocoding tool built on OpenStreetMap data.

In [2]:
def query_address(street):
    url = "https://nominatim.openstreetmap.org/search"
    params = {
        "street": street,
        "format": "json",
        "country": "Singapore",
        "addressdetails": 1
    }

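    # Nominatim's usage policy asks for an identifying User-Agent; this value is a placeholder
    headers = {"User-Agent": "hdb-distance-notebook"}
    response = requests.get(url, params=params, headers=headers, timeout=10)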
    
    if response.status_code == 200:
        results = response.json()
        if results:
            # Extract latitude and longitude from the first result
            lat = results[0]['lat']
            lon = results[0]['lon']
            return lat, lon
        else:
            return None, None
    else:
        print("API request failed:", response.status_code)
        return None, None

print(query_address("647 PUNGGOL CTRL")) 

('1.3980551', '103.91549385189649')
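Nominatim's public server is free, but its usage policy caps clients at one request per second, which matters once `query_address` is applied to the full dataset. A minimal sketch of a throttling wrapper follows; `make_throttled` and `min_interval` are hypothetical names introduced for illustration, not part of the notebook.

```python
import time

def make_throttled(geocode, min_interval=1.0):
    """Wrap `geocode` so successive calls start at least `min_interval` seconds apart."""
    last_call = None  # monotonic timestamp of the previous call, if any

    def throttled(address):
        nonlocal last_call
        if last_call is not None:
            wait = min_interval - (time.monotonic() - last_call)
            if wait > 0:
                time.sleep(wait)
        last_call = time.monotonic()
        return geocode(address)

    return throttled

# Hypothetical usage with the function defined above:
# polite_query = make_throttled(query_address, min_interval=1.0)
# print(polite_query("647 PUNGGOL CTRL"))
```

The wrapper only spaces out calls made through it, so it assumes a single process is doing the geocoding.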


Testing out the function on the first 5 rows of data in the HDB resale prices dataset:

In [4]:
data = pd.read_csv("../data/ResaleFlatPrices/ResaleFlatPricesBasedonRegistrationDateFromJan2015toDec2016.csv")

# combine block and street name with space in between
data['address'] = data['block'] + " " + data['street_name']
test_data = data.head(5).copy()

# create new columns for coordinates using query_address function
test_data['coordinates'] = test_data['address'].apply(query_address)
test_data

Unnamed: 0,month,town,flat_type,block,street_name,storey_range,floor_area_sqm,flat_model,lease_commence_date,remaining_lease,resale_price,address,coordinates
0,2015-01,ANG MO KIO,3 ROOM,174,ANG MO KIO AVE 4,07 TO 09,60.0,Improved,1986,70,255000.0,174 ANG MO KIO AVE 4,"(1.380906, 103.8395363)"
1,2015-01,ANG MO KIO,3 ROOM,541,ANG MO KIO AVE 10,01 TO 03,68.0,New Generation,1981,65,275000.0,541 ANG MO KIO AVE 10,"(1.3739835500000002, 103.85559074965985)"
2,2015-01,ANG MO KIO,3 ROOM,163,ANG MO KIO AVE 4,01 TO 03,69.0,New Generation,1980,64,285000.0,163 ANG MO KIO AVE 4,"(1.3738461499999999, 103.83858854339786)"
3,2015-01,ANG MO KIO,3 ROOM,446,ANG MO KIO AVE 10,01 TO 03,68.0,New Generation,1979,63,290000.0,446 ANG MO KIO AVE 10,"(1.3677793999999999, 103.85533443841078)"
4,2015-01,ANG MO KIO,3 ROOM,557,ANG MO KIO AVE 10,07 TO 09,68.0,New Generation,1980,64,290000.0,557 ANG MO KIO AVE 10,"(1.37165465, 103.85775588826617)"
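Applying `query_address` row by row sends one request per transaction, even when several rows share the same block and street. Assuming the full dataset contains repeated addresses, the sketch below geocodes each distinct address once and reuses the result; `geocode_unique` is a hypothetical helper, not part of the notebook.

```python
def geocode_unique(addresses, geocode):
    """Return {address: geocode(address)}, calling `geocode` once per distinct address."""
    lookup = {}
    for address in addresses:
        if address not in lookup:
            lookup[address] = geocode(address)
    return lookup

# Hypothetical usage with the DataFrame above:
# lookup = geocode_unique(data['address'], query_address)
# data['coordinates'] = data['address'].map(lookup)
```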


Next, we use the first row of data to get the distance between that HDB flat and an MRT station using the Google Maps Distance Matrix API. We use Ang Mo Kio Station, since we know that the flat is in Ang Mo Kio.

In [5]:
# create function to get coordinates of MRT stations
def get_station_coordinates(station_name):
    # Use the Nominatim API to search for the address
    url = "https://nominatim.openstreetmap.org/search"
    params = {
        'q': station_name,
        'format': 'json',
        'limit': 1
    }
    
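    # Nominatim's usage policy asks for an identifying User-Agent; this value is a placeholder
    headers = {"User-Agent": "hdb-distance-notebook"}
    response = requests.get(url, params=params, headers=headers, timeout=10)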
    
    if response.status_code == 200:
        results = response.json()
        if results:
            # Extract latitude and longitude from the first result
            lat = results[0]['lat']
            lon = results[0]['lon']
            return lat, lon
        else:
            return None, None
    else:
        print("API request failed:", response.status_code)
        return None, None

# Example usage
station_name = "Ang Mo Kio Station"
lat, lon = get_station_coordinates(station_name)
if lat and lon:
    print(f"The coordinates for {station_name} are latitude {lat}, longitude {lon}.")
else:
    print("Coordinates could not be found.")

The coordinates for Ang Mo Kio Station are latitude 1.3698269, longitude 103.8494387.


In [6]:
# get first row of data
row = test_data.iloc[0]

# create function to get distance from HDB flat to MRT using Google Maps Distance Matrix API
def get_distance(row, station_name):
    lat, lon = row['coordinates']
    mrt_lat, mrt_lon = get_station_coordinates(station_name)
    result = gmaps.distance_matrix((lat, lon), (mrt_lat , mrt_lon), mode="transit")
    return result

print(get_distance(row, 'Ang Mo Kio Station'))

{'destination_addresses': ['2795 Ang Mo Kio Ave 8, Singapore 569812'], 'origin_addresses': ['626 Ang Mo Kio Ave 4, Block 626, Singapore 560626'], 'rows': [{'elements': [{'distance': {'text': '2.6 km', 'value': 2606}, 'duration': {'text': '23 mins', 'value': 1357}, 'status': 'OK'}]}], 'status': 'OK'}
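The response above is a nested dictionary, with the distance in meters and the duration in seconds under `rows[0]['elements'][0]`. A minimal sketch of pulling those two values out, checking both status fields first, could look like this; `parse_distance_matrix` is a hypothetical helper, and the sample dictionary is copied from the output above.

```python
def parse_distance_matrix(result):
    """Return (distance_km, duration_min) for the first origin/destination pair, or None."""
    if result.get("status") != "OK":
        return None
    element = result["rows"][0]["elements"][0]
    if element.get("status") != "OK":
        return None
    distance_km = element["distance"]["value"] / 1000  # meters -> kilometers
    duration_min = element["duration"]["value"] / 60   # seconds -> minutes
    return distance_km, duration_min

sample = {
    "destination_addresses": ["2795 Ang Mo Kio Ave 8, Singapore 569812"],
    "origin_addresses": ["626 Ang Mo Kio Ave 4, Block 626, Singapore 560626"],
    "rows": [{"elements": [{
        "distance": {"text": "2.6 km", "value": 2606},
        "duration": {"text": "23 mins", "value": 1357},
        "status": "OK",
    }]}],
    "status": "OK",
}

print(parse_distance_matrix(sample))
```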


Next, we test other services (libraries and APIs) to see whether they can achieve the same effect as the Google Maps Distance Matrix API.

In [7]:
# trying out Haversine Library
from haversine import haversine, Unit

start_lat, start_lon = row['coordinates']
end_lat, end_lon = get_station_coordinates('Ang Mo Kio Station')
# convert the above coordinates to float
start_lat, start_lon = float(start_lat), float(start_lon)
end_lat, end_lon = float(end_lat), float(end_lon)
distance = haversine((start_lat, start_lon), (end_lat, end_lon), unit=Unit.KILOMETERS)

print(distance) 

1.6520890761964901
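For reference, the haversine formula behind this result can be written out with the standard library alone. This is a sketch of the formula rather than the library's own code: `haversine_km` is a hypothetical name, and the mean Earth radius of 6371.0088 km is an assumption about the value the library uses.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius

def haversine_km(p1, p2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(radians, p1)
    lat2, lon2 = map(radians, p2)
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# Coordinates of the first flat and of Ang Mo Kio Station from the outputs above
flat = (1.380906, 103.8395363)
station = (1.3698269, 103.8494387)
print(haversine_km(flat, station))
```

The result is a straight-line distance over a sphere, so it ignores roads, buildings, and walking paths.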


In [8]:
from pyproj import Geod

geod = Geod(ellps='WGS84')
# inv() returns the forward azimuth, back azimuth, and geodesic distance in meters
angle1, angle2, distance = geod.inv(start_lon, start_lat, end_lon, end_lat)

print(f"Distance: {distance / 1000} kilometers")

Distance: 1.6477967640452793 kilometers


Using GeoPandas and Shapely (for geometric objects):

In [11]:
def calculate_geospatial_distance(lat1, lon1, lat2, lon2):
    # Create GeoSeries from the points
    point1 = gpd.GeoSeries([Point(lon1, lat1)], crs="EPSG:4326")
    point2 = gpd.GeoSeries([Point(lon2, lat2)], crs="EPSG:4326")
    
    # Reproject to a CRS that uses meters as distance units (e.g., World Mercator)
    point1 = point1.to_crs("EPSG:3395")
    point2 = point2.to_crs("EPSG:3395")
    
    # Calculate the distance between the points
    distance = point1.distance(point2).iloc[0]  # distance in meters
    
    return distance/1000

print(calculate_geospatial_distance(start_lat, start_lon, end_lat, end_lon))

1.6482684524315863
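The GeoPandas figure is slightly larger than the `pyproj` geodesic distance because a Mercator projection stretches lengths by roughly 1 / cos(latitude). The sketch below checks this with a spherical approximation, which is an assumption made here for simplicity; `mercator_scale` is a hypothetical helper.

```python
from math import cos, radians

def mercator_scale(lat_deg):
    """Approximate Mercator length scale factor at a latitude (spherical approximation)."""
    return 1 / cos(radians(lat_deg))

geodesic_km = 1.6477967640452793      # pyproj result from the earlier cell
mid_lat = (1.380906 + 1.3698269) / 2  # midpoint latitude of the flat and the station

# Estimated planar distance after projecting to a Mercator CRS
print(geodesic_km * mercator_scale(mid_lat))
```

Near the equator the factor is very close to 1, so the distortion is small for Singapore. At higher latitudes it grows quickly, and a local projected CRS (for Singapore, SVY21 / EPSG:3414) would be a safer choice.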


Using OpenRouteService API:

In [9]:
load_dotenv()
API_KEY_2 = os.getenv("OPEN_ROUTE_SERVICE_API_KEY")

# Create the URL for the Directions API
url = 'https://api.openrouteservice.org/v2/directions/driving-car'

params = {
    'api_key': API_KEY_2,
    'start': f"{start_lon},{start_lat}",
    'end': f"{end_lon},{end_lat}"
}

# Make the request
response = requests.get(url, params=params)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON response (named route_data to avoid overwriting the `data` DataFrame)
    route_data = response.json()
    
    # Extract the distance from the response (in meters)
    distance = route_data['features'][0]['properties']['segments'][0]['distance']
    
    # Print the distance
    print(f"Distance: {distance / 1000} km")  # Convert meters to kilometers
else:
    print("Failed to retrieve data", response.status_code)

Distance: 2.1437 km


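From the above, OpenRouteService gives the result closest to the Google Maps Distance Matrix API (2.14 km versus 2.6 km), whereas `haversine`, `pyproj`, and GeoPandas all return about 1.65 km. This is because OpenRouteService measures distance along the road network, while the other three measure the straight-line distance between the two points. The two routed figures are not strictly comparable, since the Google query used transit mode and the OpenRouteService query used the driving-car profile.

However, because the OpenRouteService API limits the number of requests, we can use GeoPandas and Shapely as potential alternatives when a straight-line distance is sufficient.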
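To put numbers on the comparison, the sketch below computes how far each estimate falls short of the Google Maps figure. The values are copied from the outputs above; `relative_gap` is a hypothetical helper, and the comparison covers only this single flat-station pair.

```python
google_km = 2.606  # Distance Matrix result (transit mode)

estimates_km = {
    "haversine": 1.6520890761964901,
    "pyproj": 1.6477967640452793,
    "geopandas": 1.6482684524315863,
    "openrouteservice": 2.1437,
}

def relative_gap(estimate, reference):
    """Fraction by which `estimate` falls short of `reference`."""
    return (reference - estimate) / reference

for name, km in estimates_km.items():
    print(f"{name}: {relative_gap(km, google_km):.1%} shorter than Google Maps")

# Method whose estimate is closest to the Google Maps distance
closest = min(estimates_km, key=lambda name: abs(google_km - estimates_km[name]))
print("Closest:", closest)
```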