# **Predictive Modelling of HDB Resale Prices in Relation to Urban Accessibility and Macroeconomic Trends**
## **SC3021 Project - AY25/26 Semester 2**
## **BACF1 Team 1**
Kwok Weng Jian (U2510454J)  
Chew En Yu (U2510555H)   (U2510950H)

# **Introduction**
The public housing market in Singapore serves as a critical pillar of national stability, housing roughly 80% of the resident population. However, the HDB resale market operates on open market principles, where transaction prices fluctuate significantly based on intrinsic property attributes, locational convenience, and macroeconomic conditions. Understanding the valuation dynamics of resale flats is essential for prospective homeowners, urban planners, and policymakers to navigate housing affordability and asset progression.

# **Project Goal**
This project aims to build a predictive model for HDB resale prices by analyzing how resale prices relate to three primary dimensions:

* Structural Attributes (Age, floor area, remaining lease)
* Locational Utility (Proximity to transport and top-tier education)
* Macroeconomic Trends (Inflation and purchasing power)

## **Our Hypotheses**
* Locational accessibility, such as proximity to MRT nodes and primary schools, is positively correlated with resale valuation.

* Remaining lease tenure is positively correlated with price.

* Inflationary pressure (CPI) significantly distorts historical pricing, requiring normalization to compare value over time accurately.

To validate these hypotheses and construct a robust pricing model, we have curated a comprehensive data architecture comprising six distinct components. These datasets were selected to ensure our model captures both the physical reality of the housing units and the economic context in which they are traded.

1. The Core Dataset (Target Variable)
To establish a baseline for prediction, we require comprehensive historical transaction records.

Dataset: Resale Flat Prices

Source: Data.gov.sg (HDB)

Relevance: This dataset forms the foundation of our regression analysis. It contains granular transaction data from 1990 to the present, including Price, Floor Area, Flat Type, and Lease Commencement Date.

Strategic Utility: It provides the Dependent Variable (Resale Price) essential for training our predictive model.

2. Physical Building Details (Enrichment)
The core transaction data lacks specific structural metadata necessary for a granular assessment.

Dataset: HDB Property Information

Source: Data.gov.sg (HDB)

Relevance: This dataset offers a structural profile for every HDB block in Singapore, including critical features such as "Year Completed," "Max Floor Level," and "Total Dwelling Units."

Strategic Utility: By merging this with the core dataset, we can engineer high-value features such as Remaining Lease and relative Storey Range (a unit's storey range compared with the block's Max Floor Level, e.g., to differentiate high-rise premiums), which we expect to improve prediction accuracy.
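
The merge and feature engineering described here could look roughly like the sketch below. It is illustrative only: the column names (`month`, `block`, `street_name`, `storey_range`, `lease_commence_date`, `blk_no`, `street`, `max_floor_lvl`) are assumptions about the two Data.gov.sg tables, the rows are made-up toy data, and the helper `add_structural_features` is a hypothetical name. Remaining lease is computed under the assumption of a standard 99-year lease.

```python
import pandas as pd

# Toy rows standing in for the two tables (values are invented for illustration).
resale = pd.DataFrame({
    "month": ["2015-03", "2023-07"],
    "block": ["105", "220"],
    "street_name": ["STREET A", "STREET B"],
    "storey_range": ["04 TO 06", "10 TO 12"],
    "lease_commence_date": [1978, 1992],
})
blocks = pd.DataFrame({
    "blk_no": ["105", "220"],
    "street": ["STREET A", "STREET B"],
    "max_floor_lvl": [12, 25],
})


def add_structural_features(resale, blocks, lease_years=99):
    """Left-join block metadata onto transactions and derive structural features."""
    merged = resale.merge(
        blocks,
        how="left",
        left_on=["block", "street_name"],
        right_on=["blk_no", "street"],
    )
    # Remaining lease at the time of sale, assuming a 99-year lease.
    sale_year = merged["month"].str[:4].astype(int)
    merged["remaining_lease_years"] = lease_years - (
        sale_year - merged["lease_commence_date"]
    )
    # Midpoint of the storey range, e.g. "04 TO 06" -> 5.0.
    bounds = merged["storey_range"].str.extract(r"(\d+) TO (\d+)").astype(int)
    merged["storey_mid"] = (bounds[0] + bounds[1]) / 2
    # Height of the unit relative to the tallest floor in its block.
    merged["relative_height"] = merged["storey_mid"] / merged["max_floor_lvl"]
    return merged


features = add_structural_features(resale, blocks)
```

A left join keeps every transaction even when a block has no match in the property table, so unmatched rows show up as missing values that can be inspected rather than silently dropped.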

3. Transport Accessibility (Locational Feature)
In Singapore's dense urban fabric, connectivity is a primary driver of real estate value.

Dataset: LTA MRT Station Exit (GeoJSON)

Source: Data.gov.sg (LTA)

Relevance: This provides the precise geospatial coordinates (Latitude/Longitude) of all MRT station exits across the island.

Strategic Utility: It allows us to calculate the straight-line (great-circle) or walking distance (in km) between a target HDB block and the nearest transport node, quantifying the "Accessibility Premium."
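
As a minimal sketch of the straight-line variant, the haversine formula gives the great-circle distance between two latitude/longitude points. The function names `haversine_km` and `nearest_exit` are hypothetical, and the coordinates below are placeholders rather than real station exits. Walking distance would need a routing service and is not covered here.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_exit(block_lat, block_lon, exits):
    """Return (name, distance_km) for the closest exit.

    `exits` is an iterable of (name, lat, lon) tuples.
    """
    return min(
        ((name, haversine_km(block_lat, block_lon, lat, lon)) for name, lat, lon in exits),
        key=lambda pair: pair[1],
    )


# Placeholder coordinates for illustration only.
exits = [("Exit A", 1.3700, 103.8490), ("Exit B", 1.3800, 103.8400)]
name, dist_km = nearest_exit(1.3700, 103.8400, exits)
```

When reading the GeoJSON file, note that GeoJSON stores each position as `[longitude, latitude]`, so the pair has to be swapped before calling a function that expects latitude first.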

4. Education Accessibility (Locational Feature)
Proximity to educational institutions, particularly the "1km radius" rule for primary school registration, exerts strong pressure on housing demand.

Dataset: General Information of Schools

Source: Data.gov.sg (MOE)

Relevance: Contains the official addresses and postal codes for all Primary, Secondary, and Junior College institutions.

Strategic Utility: This enables the creation of a "School Proximity" feature, quantifying the distance to the nearest primary school or determining if a unit falls within the catchment area of a top-tier institution.
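
One possible shape for the "School Proximity" feature is sketched below: the distance to the nearest primary school plus a count of primary schools within 1 km. The field name `mainlevel_code`, the `lat`/`lon` keys (which would come from geocoding the school addresses), and the helper `school_proximity` are assumptions for illustration. A point-to-point distance only approximates the official home-school distance used for registration.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0088):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    a = (
        sin((phi2 - phi1) / 2) ** 2
        + cos(phi1) * cos(phi2) * sin(radians(lon2 - lon1) / 2) ** 2
    )
    return 2 * radius_km * asin(sqrt(a))


def school_proximity(block_lat, block_lon, schools, catchment_km=1.0):
    """Summarise primary-school access for one block.

    `schools` is a list of dicts with keys "mainlevel_code", "lat", "lon".
    """
    dists = [
        haversine_km(block_lat, block_lon, s["lat"], s["lon"])
        for s in schools
        if s["mainlevel_code"] == "PRIMARY"
    ]
    return {
        "nearest_primary_km": min(dists) if dists else None,
        "primary_within_catchment": sum(d <= catchment_km for d in dists),
    }


# Placeholder schools for illustration only.
schools = [
    {"mainlevel_code": "PRIMARY", "lat": 1.3750, "lon": 103.8400},
    {"mainlevel_code": "PRIMARY", "lat": 1.3900, "lon": 103.8400},
    {"mainlevel_code": "SECONDARY", "lat": 1.3701, "lon": 103.8400},
]
result = school_proximity(1.3700, 103.8400, schools)
```

Restricting the count to a curated list of top-tier schools would only require filtering `schools` before the call.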

5. Economic Context (Macro Feature)
Comparing a flat sold in 2010 to one sold in 2024 requires adjusting for the changing value of money.

Dataset: Consumer Price Index (CPI)

Source: SingStat (Department of Statistics)

Relevance: This index measures the weighted average change in prices of a basket of consumer goods and services, serving as a proxy for inflation.

Strategic Utility: We use this to convert nominal housing prices into "Real Dollars" (prices expressed in the money of a common base period), removing the effect of general inflation and allowing for a fair comparison of value across different economic eras.
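
The adjustment itself is a simple rescaling: real price = nominal price × (CPI in the base period ÷ CPI at the time of sale). A minimal sketch follows. The function name `to_real_price` is hypothetical, and the CPI values are placeholders chosen for easy arithmetic, not actual SingStat figures.

```python
def to_real_price(nominal_price, cpi_at_sale, cpi_base):
    """Express a nominal price in base-period dollars using the CPI ratio."""
    if cpi_at_sale <= 0 or cpi_base <= 0:
        raise ValueError("CPI values must be positive")
    return nominal_price * cpi_base / cpi_at_sale


# Placeholder index values, NOT real CPI data.
cpi_by_year = {2010: 80.0, 2024: 100.0}

# A flat sold for 400,000 in 2010, expressed in 2024 dollars.
real_2010_sale = to_real_price(400_000, cpi_by_year[2010], cpi_by_year[2024])  # 500000.0
```

Because the CPI tracks general consumer prices rather than housing specifically, this removes economy-wide inflation but leaves housing-market cycles in the adjusted series.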

6. Geospatial Mapping (Technical Implementation)
To bridge the gap between textual addresses and mathematical distance calculations, we leverage dynamic API integration.

Tool: OneMap API

Source: OneMap.gov.sg (Singapore Land Authority)

Relevance: As the authoritative national map of Singapore, maintained by the Singapore Land Authority, this API provides official geocoding for local addresses.

Strategic Utility: This satisfies the requirement for advanced data engineering skills. We utilize the Search API to programmatically convert HDB street addresses (e.g., "Block 105 Ang Mo Kio") into Latitude/Longitude coordinates, enabling the precise distance calculations required for the transport and education features.
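
A rough sketch of the geocoding step is shown below. The endpoint URL, the query parameters (`searchVal`, `returnGeom`, `getAddrDetails`, `pageNum`), the response fields (`results`, `LATITUDE`, `LONGITUDE`), and the optional `Authorization` header are assumptions about the OneMap Search API and should be checked against the current OneMap documentation. The function names are hypothetical. Parsing is kept separate from the network call so it can be tested offline.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint; verify against the current OneMap API documentation.
SEARCH_URL = "https://www.onemap.gov.sg/api/common/elastic/search"


def build_search_url(address, page=1):
    """Build the search request URL for one address string."""
    params = {
        "searchVal": address,
        "returnGeom": "Y",
        "getAddrDetails": "Y",
        "pageNum": page,
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


def parse_first_result(payload):
    """Return (lat, lon) of the top search hit, or None if there are no hits."""
    results = payload.get("results") or []
    if not results:
        return None
    top = results[0]
    return float(top["LATITUDE"]), float(top["LONGITUDE"])


def geocode(address, token=None, timeout=10):
    """Query the API for one address; `token` is sent as-is if provided."""
    headers = {"Authorization": token} if token else {}
    request = Request(build_search_url(address), headers=headers)
    with urlopen(request, timeout=timeout) as response:
        return parse_first_result(json.load(response))
```

Since many transactions share the same block, geocoding each unique block address once and caching the result keeps the number of API calls far below the number of transaction rows.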