# PPHS flats considerations

## Objective
The purpose of this notebook is to generate sufficient data to rank the Parenthood Provisional Housing Scheme (PPHS) flats available as of August 2024 from best to worst, based on the following criteria:
1. Travel time to nearest MRT (lower = better)
2. Travel time to Raffles Place MRT (lower = better)
3. Cost of rental (lower = better)
4. Months to expiry (higher = better)

## Method
The source data comes from HDB's website: [flats available](https://www.hdb.gov.sg/cs/infoweb/residential/renting-a-flat/renting-from-hdb/parenthood-provisional-housing-schemepphs/application-procedure/flats-available-for-application-) and [rental rates](https://www.hdb.gov.sg/cs/infoweb/residential/renting-a-flat/renting-from-hdb/parenthood-provisional-housing-schemepphs/rents-and-deposits).
The data is tabulated in an Excel sheet and processed with this notebook.

### 1. Importing the necessary data

In [1]:
import pandas as pd

df = pd.read_excel("/Users/yeelongsiah/Desktop/Personal/PythonLearning/PPHS/flats_available.xlsx")
df

Unnamed: 0,town,address,2room,3room,4room,site_expiry,months_to_expiry,site_exp_date,postal,latitude,longitude
0,Ang Mo Kio,Blk 475 Ang Mo Kio Avenue 10,0,1,0,1Q 2030,64,2030-01-01,560475,1.362663,103.856516
1,Bedok,Blk 13 Bedok South Road,1,0,0,4Q 2027,37,2027-10-01,460013,1.320405,103.936741
2,Bukit Batok,Blk 182 Bukit Batok West Avenue 8,0,1,0,1Q 2030,64,2030-01-01,650182,1.345564,103.743822
3,Bukit Merah,Blk 1 Tiong Bahru Road,0,2,1,4Q 2026,25,2026-10-01,162001,1.286137,103.833028
4,Bukit Merah,Blk 3 Tiong Bahru Road,0,3,2,4Q 2027,37,2027-10-01,162003,1.286179,103.83231
5,Bukit Merah,Blk 5 Tiong Bahru Road,0,5,3,4Q 2027,37,2027-10-01,162005,1.2862,103.831688
6,Bukit Merah,Blk 9 Tiong Bahru Road,0,2,0,4Q 2027,37,2027-10-01,161009,1.28616,103.830448
7,Bukit Merah,Blk 55 Lengkok Bahru,0,5,0,1Q 2030,64,2030-01-01,151055,1.28745,103.813445
8,Bukit Merah,Blk 115 Jalan Bukit Merah,0,1,0,1Q 2030,64,2030-01-01,160115,1.281081,103.826489
9,Bukit Merah,Blk 117 Jalan Bukit Merah,0,1,0,1Q 2030,64,2030-01-01,160117,1.281707,103.82684


In [2]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 38 entries, 0 to 37
Data columns (total 11 columns):
 #   Column            Non-Null Count  Dtype         
---  ------            --------------  -----         
 0   town              38 non-null     object        
 1   address           38 non-null     object        
 2   2room             38 non-null     int64         
 3   3room             38 non-null     int64         
 4   4room             38 non-null     int64         
 5   site_expiry       38 non-null     object        
 6   months_to_expiry  38 non-null     int64         
 7   site_exp_date     38 non-null     datetime64[ns]
 8   postal            38 non-null     int64         
 9   latitude          38 non-null     float64       
 10  longitude         38 non-null     float64       
dtypes: datetime64[ns](1), float64(2), int64(5), object(3)
memory usage: 3.4+ KB
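
The `site_exp_date` and `months_to_expiry` columns arrive precomputed in the spreadsheet. As a rough sketch, they could also be derived from the `site_expiry` string. The helper names and the reference date of 15 August 2024 are assumptions for illustration (that date happens to reproduce the values shown in the table), not something taken from the workbook.

```python
from datetime import date

def quarter_to_date(site_expiry: str) -> date:
    """Convert a string such as '1Q 2030' to the first day of that quarter."""
    quarter_part, year_part = site_expiry.split()
    quarter = int(quarter_part[0])          # '1Q' -> 1
    month = 3 * (quarter - 1) + 1           # Q1 -> Jan, Q2 -> Apr, Q3 -> Jul, Q4 -> Oct
    return date(int(year_part), month, 1)

def whole_months_between(start: date, end: date) -> int:
    """Count complete calendar months from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:                 # the last month is not yet complete
        months -= 1
    return months

# Assumed reference date; not stated in the source data
reference_date = date(2024, 8, 15)

for label in ["1Q 2030", "4Q 2027", "4Q 2026"]:
    expiry = quarter_to_date(label)
    print(label, expiry, whole_months_between(reference_date, expiry))
```

With a different reference date the month counts shift accordingly, so the date is worth recording next to the data.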


In [3]:
import requests

def get_geodata(address: str) -> tuple:
    '''
    Looks up an address on OneMap and returns (postal, lat, long).
    Returns (0,0,0) if the address is missing or the lookup fails.
    '''
    if address is None:
        return (0,0,0)

    # the search fails if the address starts with "blk ", so strip it
    if address[:4].lower() == "blk ":
        address = address[4:]
    url = f"https://www.onemap.gov.sg/api/common/elastic/search?searchVal={address}&returnGeom=Y&getAddrDetails=Y"
    try:
        response = requests.request("GET", url).json()
        # use the first (best) match
        result = response['results'][0]
        return (result['POSTAL'], result['LATITUDE'], result['LONGITUDE'])
    except (requests.RequestException, KeyError, IndexError, ValueError):
        # network error, invalid JSON, or no match for this address
        print(f"error {address}")
        return (0,0,0)

get_geodata(df['address'][5])

('162005', '1.28620043157708', '103.831687678938')

In [4]:
df[['postal', 'latitude', 'longitude']] = df['address'].apply(lambda x: pd.Series(get_geodata(x)))
df

Unnamed: 0,town,address,2room,3room,4room,site_expiry,months_to_expiry,site_exp_date,postal,latitude,longitude
0,Ang Mo Kio,Blk 475 Ang Mo Kio Avenue 10,0,1,0,1Q 2030,64,2030-01-01,560475,1.36266297827625,103.85651592319
1,Bedok,Blk 13 Bedok South Road,1,0,0,4Q 2027,37,2027-10-01,460013,1.32040512272504,103.936741204279
2,Bukit Batok,Blk 182 Bukit Batok West Avenue 8,0,1,0,1Q 2030,64,2030-01-01,650182,1.34556402157015,103.743821531821
3,Bukit Merah,Blk 1 Tiong Bahru Road,0,2,1,4Q 2026,25,2026-10-01,162001,1.28613693561234,103.833027874167
4,Bukit Merah,Blk 3 Tiong Bahru Road,0,3,2,4Q 2027,37,2027-10-01,162003,1.28617878318091,103.832309554646
5,Bukit Merah,Blk 5 Tiong Bahru Road,0,5,3,4Q 2027,37,2027-10-01,162005,1.28620043157708,103.831687678938
6,Bukit Merah,Blk 9 Tiong Bahru Road,0,2,0,4Q 2027,37,2027-10-01,161009,1.28615970912446,103.830448149376
7,Bukit Merah,Blk 55 Lengkok Bahru,0,5,0,1Q 2030,64,2030-01-01,151055,1.28744981772847,103.81344506122
8,Bukit Merah,Blk 115 Jalan Bukit Merah,0,1,0,1Q 2030,64,2030-01-01,160115,1.28108139632864,103.826489308768
9,Bukit Merah,Blk 117 Jalan Bukit Merah,0,1,0,1Q 2030,64,2030-01-01,160117,1.28170700587327,103.82683952932


In [6]:
mrt_df = pd.read_csv("mrt_lrt_data.csv")
mrt_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 157 entries, 0 to 156
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   station_name  157 non-null    object 
 1   type          157 non-null    object 
 2   lat           157 non-null    float64
 3   lng           157 non-null    float64
dtypes: float64(2), object(2)
memory usage: 5.0+ KB


In [11]:
mrt_df = mrt_df[mrt_df['type'] == "MRT"]
mrt_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 119 entries, 0 to 118
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   station_name  119 non-null    object 
 1   type          119 non-null    object 
 2   lat           119 non-null    float64
 3   lng           119 non-null    float64
dtypes: float64(2), object(2)
memory usage: 4.6+ KB


In [12]:
from geopy.distance import geodesic

# Find the nearest MRT station to a flat by straight-line (geodesic) distance in km
def find_nearest_mrt(row):
    min_distance = float('inf')
    nearest_mrt = ''
    
    for idx, mrt in mrt_df.iterrows():
        distance = geodesic((row['latitude'], row['longitude']), (mrt['lat'], mrt['lng'])).km
        
        if distance < min_distance:
            min_distance = distance
            nearest_mrt = mrt['station_name']
    
    return pd.Series({'nearest_MRT': nearest_mrt, 'distance_to_nearest_MRT': min_distance})

df[['nearest_MRT', 'distance_to_nearest_MRT']] = df.apply(find_nearest_mrt, axis=1)
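
The row-by-row loop is fine for 38 flats and 119 stations. For larger tables, one alternative is to compute all flat-to-station distances at once with NumPy. The sketch below swaps the geodesic distance for the haversine (great-circle) formula on a spherical Earth of radius 6371 km, so results can differ slightly from `geopy`. The function name and the sample station coordinates are illustrative and approximate, not taken from `mrt_lrt_data.csv`.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation

def nearest_station_haversine(flats: pd.DataFrame, stations: pd.DataFrame) -> pd.DataFrame:
    """Return the nearest station name and great-circle distance (km) for each flat."""
    # Flats as column vectors and stations as row vectors, so broadcasting
    # produces an (n_flats, n_stations) grid of distances
    flat_lat = np.radians(flats['latitude'].astype(float).to_numpy())[:, None]
    flat_lng = np.radians(flats['longitude'].astype(float).to_numpy())[:, None]
    stn_lat = np.radians(stations['lat'].to_numpy())[None, :]
    stn_lng = np.radians(stations['lng'].to_numpy())[None, :]

    a = (np.sin((stn_lat - flat_lat) / 2) ** 2
         + np.cos(flat_lat) * np.cos(stn_lat) * np.sin((stn_lng - flat_lng) / 2) ** 2)
    distance_km = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

    nearest_idx = distance_km.argmin(axis=1)
    return pd.DataFrame({
        'nearest_MRT': stations['station_name'].to_numpy()[nearest_idx],
        'distance_to_nearest_MRT': distance_km[np.arange(len(flats)), nearest_idx],
    }, index=flats.index)

# Tiny illustrative inputs (station coordinates are approximate, for demonstration only)
sample_flats = pd.DataFrame({'latitude': ['1.28615970912446'],
                             'longitude': ['103.830448149376']})
sample_stations = pd.DataFrame({
    'station_name': ['Tiong Bahru', 'Redhill'],
    'lat': [1.2862, 1.2896],
    'lng': [103.8270, 103.8168],
})
nearest = nearest_station_haversine(sample_flats, sample_stations)
print(nearest)
```

The `astype(float)` call is there because the coordinates returned by `get_geodata` are strings.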

In [41]:
from keys import onemap_token
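def get_mrt_travel_time(start_lat, start_long, nearest_mrt) -> float:
    '''
    returns the public transport travel time in minutes, or None if no route is returned
    '''
    date = '06-06-2024'
    # only the station coordinates are needed; the postal code is discarded
    _, end_lat, end_long = get_geodata(f"{nearest_mrt} MRT Station")
    
    url = f"https://www.onemap.gov.sg/api/public/routingsvc/route?start={start_lat}%2C{start_long}&end={end_lat}%2C{end_long}&routeType=pt&date={date}&time=12%3A35%3A00&mode=TRANSIT&maxWalkDistance=400&numItineraries=1"
        
    headers = {"Authorization": onemap_token}
        
    try:
        # the request and the lookups are the steps that can fail, so they go inside the try
        response = requests.request("GET", url, headers=headers).json()
        # divide by 60 to convert the reported duration to minutes
        return response['plan']['itineraries'][0]['duration']/60
    except (requests.RequestException, KeyError, IndexError, ValueError):
        return None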
    
i=6
get_mrt_travel_time(df['latitude'][i], df['longitude'][i], df['nearest_MRT'][i])

5.583333333333333
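
The `%2C` and `%3A` sequences in the routing URL are percent-encoded commas and colons. As a small sketch, the same query string can be assembled from a dictionary with the standard library, which handles the encoding. The parameter names are copied from the URL in the cell above; the helper name `build_route_query` is hypothetical.

```python
from urllib.parse import urlencode

def build_route_query(start_lat, start_long, end_lat, end_long,
                      date='06-06-2024', time='12:35:00') -> str:
    """Build the query string for the routing request; values are percent-encoded."""
    params = {
        'start': f"{start_lat},{start_long}",
        'end': f"{end_lat},{end_long}",
        'routeType': 'pt',
        'date': date,
        'time': time,
        'mode': 'TRANSIT',
        'maxWalkDistance': 400,
        'numItineraries': 1,
    }
    return urlencode(params)

query = build_route_query('1.2861', '103.8304', '1.2862', '103.8270')
print(query)
```

Keeping the date and time as arguments also makes it easier to compare travel times at different hours.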

In [30]:
df['time_to_MRT'] = df.apply(lambda row: get_mrt_travel_time(row['latitude'], row['longitude'], f"{row['nearest_MRT']}"), axis=1)
df

Unnamed: 0,town,address,2room,3room,4room,site_expiry,months_to_expiry,site_exp_date,postal,latitude,longitude,nearest_MRT,distance_to_nearest_MRT,time_to_MRT,time_to_CH
0,Ang Mo Kio,Blk 475 Ang Mo Kio Avenue 10,0,1,0,1Q 2030,64,2030-01-01,560475,1.36266297827625,103.85651592319,Ang Mo Kio,1.121213,18.016667,25.633333
1,Bedok,Blk 13 Bedok South Road,1,0,0,4Q 2027,37,2027-10-01,460013,1.32040512272504,103.936741204279,Bedok,0.83123,20.2,34.266667
2,Bukit Batok,Blk 182 Bukit Batok West Avenue 8,0,1,0,1Q 2030,64,2030-01-01,650182,1.34556402157015,103.743821531821,Bukit Batok,0.750456,14.316667,47.733333
3,Bukit Merah,Blk 1 Tiong Bahru Road,0,2,1,4Q 2026,25,2026-10-01,162001,1.28613693561234,103.833027874167,Tiong Bahru,0.677328,9.916667,16.5
4,Bukit Merah,Blk 3 Tiong Bahru Road,0,3,2,4Q 2027,37,2027-10-01,162003,1.28617878318091,103.832309554646,Tiong Bahru,0.597256,9.333333,15.916667
5,Bukit Merah,Blk 5 Tiong Bahru Road,0,5,3,4Q 2027,37,2027-10-01,162005,1.28620043157708,103.831687678938,Tiong Bahru,0.528054,7.633333,16.333333
6,Bukit Merah,Blk 9 Tiong Bahru Road,0,2,0,4Q 2027,37,2027-10-01,161009,1.28615970912446,103.830448149376,Tiong Bahru,0.391097,5.583333,14.283333
7,Bukit Merah,Blk 55 Lengkok Bahru,0,5,0,1Q 2030,64,2030-01-01,151055,1.28744981772847,103.81344506122,Redhill,0.445889,8.166667,17.866667
8,Bukit Merah,Blk 115 Jalan Bukit Merah,0,1,0,1Q 2030,64,2030-01-01,160115,1.28108139632864,103.826489308768,Tiong Bahru,0.607467,10.65,56.55
9,Bukit Merah,Blk 117 Jalan Bukit Merah,0,1,0,1Q 2030,64,2030-01-01,160117,1.28170700587327,103.82683952932,Tiong Bahru,0.536223,10.033333,55.516667


In [42]:
df['time_to_RP'] = df.apply(lambda row: get_mrt_travel_time(row['latitude'], row['longitude'], "Raffles Place"), axis=1)
df

Unnamed: 0,town,address,2room,3room,4room,site_expiry,months_to_expiry,site_exp_date,postal,latitude,longitude,nearest_MRT,distance_to_nearest_MRT,time_to_MRT,time_to_RP
0,Ang Mo Kio,Blk 475 Ang Mo Kio Avenue 10,0,1,0,1Q 2030,64,2030-01-01,560475,1.36266297827625,103.85651592319,Ang Mo Kio,1.121213,18.016667,27.033333
1,Bedok,Blk 13 Bedok South Road,1,0,0,4Q 2027,37,2027-10-01,460013,1.32040512272504,103.936741204279,Bedok,0.83123,20.2,36.15
2,Bukit Batok,Blk 182 Bukit Batok West Avenue 8,0,1,0,1Q 2030,64,2030-01-01,650182,1.34556402157015,103.743821531821,Bukit Batok,0.750456,14.316667,43.25
3,Bukit Merah,Blk 1 Tiong Bahru Road,0,2,1,4Q 2026,25,2026-10-01,162001,1.28613693561234,103.833027874167,Tiong Bahru,0.677328,9.916667,10.95
4,Bukit Merah,Blk 3 Tiong Bahru Road,0,3,2,4Q 2027,37,2027-10-01,162003,1.28617878318091,103.832309554646,Tiong Bahru,0.597256,9.333333,10.566667
5,Bukit Merah,Blk 5 Tiong Bahru Road,0,5,3,4Q 2027,37,2027-10-01,162005,1.28620043157708,103.831687678938,Tiong Bahru,0.528054,7.633333,11.616667
6,Bukit Merah,Blk 9 Tiong Bahru Road,0,2,0,4Q 2027,37,2027-10-01,161009,1.28615970912446,103.830448149376,Tiong Bahru,0.391097,5.583333,12.25
7,Bukit Merah,Blk 55 Lengkok Bahru,0,5,0,1Q 2030,64,2030-01-01,151055,1.28744981772847,103.81344506122,Redhill,0.445889,8.166667,15.833333
8,Bukit Merah,Blk 115 Jalan Bukit Merah,0,1,0,1Q 2030,64,2030-01-01,160115,1.28108139632864,103.826489308768,Tiong Bahru,0.607467,10.65,16.583333
9,Bukit Merah,Blk 117 Jalan Bukit Merah,0,1,0,1Q 2030,64,2030-01-01,160117,1.28170700587327,103.82683952932,Tiong Bahru,0.536223,10.033333,15.866667


In [90]:
df['town'] = df['town'].apply(lambda x : x.split('/')[0])
df

Unnamed: 0,town,address,2room,3room,4room,site_expiry,months_to_expiry,site_exp_date,postal,latitude,longitude,nearest_MRT,distance_to_nearest_MRT,time_to_MRT,time_to_RP
0,Ang Mo Kio,Blk 475 Ang Mo Kio Avenue 10,0,1,0,1Q 2030,64,2030-01-01,560475,1.36266297827625,103.85651592319,Ang Mo Kio,1.121213,18.016667,27.033333
1,Bedok,Blk 13 Bedok South Road,1,0,0,4Q 2027,37,2027-10-01,460013,1.32040512272504,103.936741204279,Bedok,0.83123,20.2,36.15
2,Bukit Batok,Blk 182 Bukit Batok West Avenue 8,0,1,0,1Q 2030,64,2030-01-01,650182,1.34556402157015,103.743821531821,Bukit Batok,0.750456,14.316667,43.25
3,Bukit Merah,Blk 1 Tiong Bahru Road,0,2,1,4Q 2026,25,2026-10-01,162001,1.28613693561234,103.833027874167,Tiong Bahru,0.677328,9.916667,10.95
4,Bukit Merah,Blk 3 Tiong Bahru Road,0,3,2,4Q 2027,37,2027-10-01,162003,1.28617878318091,103.832309554646,Tiong Bahru,0.597256,9.333333,10.566667
5,Bukit Merah,Blk 5 Tiong Bahru Road,0,5,3,4Q 2027,37,2027-10-01,162005,1.28620043157708,103.831687678938,Tiong Bahru,0.528054,7.633333,11.616667
6,Bukit Merah,Blk 9 Tiong Bahru Road,0,2,0,4Q 2027,37,2027-10-01,161009,1.28615970912446,103.830448149376,Tiong Bahru,0.391097,5.583333,12.25
7,Bukit Merah,Blk 55 Lengkok Bahru,0,5,0,1Q 2030,64,2030-01-01,151055,1.28744981772847,103.81344506122,Redhill,0.445889,8.166667,15.833333
8,Bukit Merah,Blk 115 Jalan Bukit Merah,0,1,0,1Q 2030,64,2030-01-01,160115,1.28108139632864,103.826489308768,Tiong Bahru,0.607467,10.65,16.583333
9,Bukit Merah,Blk 117 Jalan Bukit Merah,0,1,0,1Q 2030,64,2030-01-01,160117,1.28170700587327,103.82683952932,Tiong Bahru,0.536223,10.033333,15.866667


In [100]:
# Get the rental rates data from csv
rental_rates_df = pd.read_csv('processed data/rental_rates.csv')

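# Look up the rent for a town and room count; returns 0 if no rate is found
def get_rental_rate(town, rooms):
    # accept either a number (2) or a column name ("2room")
    if isinstance(rooms,str) and "room" in rooms:
        rooms = int(rooms[0])
    try:
        return rental_rates_df[rental_rates_df['town']==town][f'{rooms}room'].values[0]
    except (KeyError, IndexError):
        # unknown room type column or town not in the rates table
        return 0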
get_rental_rate('Ang Mo Kio', 2)

500

In [105]:
# Step 1: Create an empty list to store the rows
rows = []

# Step 2: Iterate through each row in the DataFrame
for index, row in df.iterrows():
    town = row['town']
    for room_type in ['2room', '3room', '4room']:
        if row[room_type] != 0:  # Skip room types with no flats available
            rows.append({'town': town, 'rooms': int(room_type.replace('room', '')),
                         'address': row['address'],
                         'nearest_MRT': row['nearest_MRT'],
                         'time_to_MRT': int(round(row['time_to_MRT'], 0)),   # criterion 1
                         'time_to_RP': int(round(row['time_to_RP'], 0)),     # criterion 2
                         'rent': get_rental_rate(town, room_type),           # criterion 3
                         'site_expiry': row['site_expiry'],
                         'months_to_expiry': row['months_to_expiry'],        # criterion 4
                         'availability': row[room_type]
                         })

# Step 3: Create a new DataFrame from the list of rows
PPHS_summary_df = pd.DataFrame(rows)
PPHS_summary_df

Unnamed: 0,town,rooms,address,nearest_MRT,time_to_MRT,time_to_RP,rent,site_expiry,months_to_expiry,availability
0,Ang Mo Kio,3,Blk 475 Ang Mo Kio Avenue 10,Ang Mo Kio,18,27,800,1Q 2030,64,1
1,Bedok,2,Blk 13 Bedok South Road,Bedok,20,36,500,4Q 2027,37,1
2,Bukit Batok,3,Blk 182 Bukit Batok West Avenue 8,Bukit Batok,14,43,700,1Q 2030,64,1
3,Bukit Merah,3,Blk 1 Tiong Bahru Road,Tiong Bahru,10,11,700,4Q 2026,25,2
4,Bukit Merah,4,Blk 1 Tiong Bahru Road,Tiong Bahru,10,11,1500,4Q 2026,25,1
5,Bukit Merah,3,Blk 3 Tiong Bahru Road,Tiong Bahru,9,11,700,4Q 2027,37,3
6,Bukit Merah,4,Blk 3 Tiong Bahru Road,Tiong Bahru,9,11,1500,4Q 2027,37,2
7,Bukit Merah,3,Blk 5 Tiong Bahru Road,Tiong Bahru,8,12,700,4Q 2027,37,5
8,Bukit Merah,4,Blk 5 Tiong Bahru Road,Tiong Bahru,8,12,1500,4Q 2027,37,3
9,Bukit Merah,3,Blk 9 Tiong Bahru Road,Tiong Bahru,6,12,700,4Q 2027,37,2
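
The nested loop reshapes the table from one row per block to one row per block and flat type. A possible alternative, shown here on a two-row sample with only a few of the columns, is `DataFrame.melt`. The variable names are illustrative.

```python
import pandas as pd

sample = pd.DataFrame({
    'town': ['Bukit Merah', 'Bedok'],
    'address': ['Blk 1 Tiong Bahru Road', 'Blk 13 Bedok South Road'],
    '2room': [0, 1],
    '3room': [2, 0],
    '4room': [1, 0],
})

long_df = sample.melt(
    id_vars=['town', 'address'],              # columns repeated on every output row
    value_vars=['2room', '3room', '4room'],   # columns unpivoted into rows
    var_name='rooms',
    value_name='availability',
)
# Keep only flat types that are actually available
long_df = long_df[long_df['availability'] != 0].copy()
# '3room' -> 3
long_df['rooms'] = long_df['rooms'].str.replace('room', '', regex=False).astype(int)
long_df = long_df.sort_values(['address', 'rooms']).reset_index(drop=True)
print(long_df)
```

Note that the rows come out sorted by address rather than in the original block order, and the rent lookup would still need to be applied afterwards.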


In [109]:
PPHS_summary_df.to_excel('processed data/PPHS Summary.xlsx', index=False)
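
The notebook stops at the summary table. One possible way to turn the four criteria into a single ranking, sketched below, is to min-max scale each criterion to the 0 to 1 range, flip the three where lower is better, and average the results. The equal weights and the function name `rank_flats` are assumptions, not part of the original analysis; the sample rows are copied from the summary output.

```python
import pandas as pd

def rank_flats(summary: pd.DataFrame) -> pd.DataFrame:
    """Add a 0-1 score (higher = better) and sort from best to worst."""
    lower_is_better = ['time_to_MRT', 'time_to_RP', 'rent']
    higher_is_better = ['months_to_expiry']

    scores = pd.DataFrame(index=summary.index)
    for col in lower_is_better + higher_is_better:
        span = summary[col].max() - summary[col].min()
        # Min-max scaling; a constant column gives every row the same score
        scaled = (summary[col] - summary[col].min()) / span if span else summary[col] * 0.0
        scores[col] = 1 - scaled if col in lower_is_better else scaled

    ranked = summary.copy()
    ranked['score'] = scores.mean(axis=1)   # equal weights (an assumption)
    return ranked.sort_values('score', ascending=False).reset_index(drop=True)

sample = pd.DataFrame({
    'address': ['Blk 475 Ang Mo Kio Avenue 10', 'Blk 13 Bedok South Road',
                'Blk 9 Tiong Bahru Road'],
    'rooms': [3, 2, 3],
    'time_to_MRT': [18, 20, 6],
    'time_to_RP': [27, 36, 12],
    'rent': [800, 500, 700],
    'months_to_expiry': [64, 37, 37],
})
ranked_sample = rank_flats(sample)
print(ranked_sample[['address', 'score']])
```

Changing the weights, for example to favour rent over travel time, only requires replacing the plain mean with a weighted one.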