# Housing Price Analysis and Prediction in New South Wales (NSW), Australia

## 1. Problem Introduction

The real estate market in New South Wales (NSW) is one of the most dynamic and valuable markets in Australia. Accurately estimating property prices is crucial for buyers, sellers, investors, and financial institutions to make informed decisions.
This project aims to **analyze the key factors that influence housing prices** and **build a predictive model** using both historical transaction data (NSW Property Sales) and current listing data (Domain listings).

---

## 2. Analysis Objectives

* Identify the most important factors that influence property prices (location, area, rooms, property type, zoning, etc.).
* Build a model to predict the price (or price per m²) of properties in NSW.
* Compare **listing prices** against **model-predicted values** and quantify the gap between them.
* Develop an interactive dashboard to explore price trends by region, price range, and property type.

---

## 3. Factors Influencing Housing Prices

The analysis draws on two datasets (Domain listings and NSW Property Sales) and may include the following features:

* **Location:** Suburb, postcode, distance to the city center, proximity to schools, transport, and amenities.
* **Size and structure:** Area, number of bedrooms (Beds), bathrooms (Baths), parking spaces, lot number.
* **Property type:** House, apartment, land (Type, Nature of property, Primary purpose, Zoning).
* **Time information:** Contract date and listing date, used to capture seasonal or temporal effects.
* **Legal and strata details:** Strata lot number, legal description (may affect price and marketability).
* **Historical data:** Previous sale price, price_per_m2 trends by suburb.
* **Additional factors:** Market conditions, local development trends.

---

## 4. Analytic Approach

Below is the step-by-step process from raw data to predictive model and dashboard visualization.

### Step 1. Business Understanding and Analytic Approach

#### Business Goal

The project aims to **analyze and predict house prices** in **New South Wales (NSW)** using both **Domain listings** and **NSW Property Sales** data.
Main objectives:

* Identify key factors affecting property prices.
* Build a predictive model for price estimation.
* Support data-driven decisions for buyers, sellers, and investors.

### Step 2. Data Collection, Understanding, Preparation

#### Data Sources

* **Domain Listings:** price, address, rooms, size, type.
* **NSW Sales Data:** transaction records with legal and zoning info.

#### Preparation Steps

* Clean missing/duplicate data.
* Standardize addresses and merge the two datasets on the standardized address.
* Create new features: `price_per_m2`, `distance_to_center`, `year_sold`.
* Encode categories (property type, suburb).
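
The preparation steps above can be sketched with pandas. This is a minimal illustration, not the project's actual pipeline: the column names (`address`, `price`, `area_m2`, `contract_date`, `lat`, `lon`), the sample rows, and the use of the Sydney CBD as the "city center" are all assumptions made for the example.

```python
import math

import pandas as pd

# Hypothetical sample rows; the real Domain / NSW Property Sales schemas may differ.
listings = pd.DataFrame({
    "address": ["12 High St, Parramatta ", "12 high st,  parramatta",
                "5 Ocean Rd, Manly", "8 King St, Newtown"],
    "price": [950000, 950000, 2100000, None],
    "area_m2": [450.0, 450.0, 300.0, 120.0],
    "contract_date": ["2023-03-14", "2023-03-14", "2022-11-02", "2023-07-21"],
    "lat": [-33.8150, -33.8150, -33.7970, -33.8980],
    "lon": [151.0011, 151.0011, 151.2880, 151.1790],
})

# Assumption: "city center" means the Sydney CBD (approximate coordinates).
SYDNEY_CBD = (-33.8688, 151.2093)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def prepare(df):
    out = df.copy()
    # Standardize the address so it can serve as a merge key.
    out["address_key"] = (
        out["address"].str.strip().str.lower().str.replace(r"\s+", " ", regex=True)
    )
    # Drop rows without a price, then drop repeated address/date records.
    out = out.dropna(subset=["price"])
    out = out.drop_duplicates(subset=["address_key", "contract_date"])
    # Engineered features named in the list above.
    out["price_per_m2"] = out["price"] / out["area_m2"]
    out["year_sold"] = pd.to_datetime(out["contract_date"]).dt.year
    out["distance_to_center"] = [
        haversine_km(lat, lon, *SYDNEY_CBD) for lat, lon in zip(out["lat"], out["lon"])
    ]
    return out.reset_index(drop=True)


clean = prepare(listings)
```

Merging the two sources would then be a `DataFrame.merge` on `address_key`; real addresses usually need more normalization (unit numbers, street-type abbreviations) than this sketch performs.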

### Step 3. Data Analysis with SQL

Use SQL to:

* Summarize prices and sales by suburb.
* Detect duplicates or invalid records.
* Explore average prices over time.
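
As a rough sketch of these queries, the example below runs SQL against an in-memory SQLite database from Python. The `sales` table, its columns, and the sample rows are hypothetical; the project's actual database engine and schema are not specified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (address TEXT, suburb TEXT, contract_date TEXT, price REAL)"
)
# Hypothetical rows, including one duplicate and one invalid price.
rows = [
    ("12 High St", "Parramatta", "2022-05-10", 900000),
    ("12 High St", "Parramatta", "2022-05-10", 900000),  # duplicate record
    ("3 Park Ave", "Parramatta", "2023-02-01", 1100000),
    ("5 Ocean Rd", "Manly", "2023-06-15", 2100000),
    ("9 Beach St", "Manly", "2023-08-20", -1),  # invalid price
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)

# Number of sales and average price per suburb (valid prices only).
suburb_summary = conn.execute("""
    SELECT suburb, COUNT(*) AS n_sales, AVG(price) AS avg_price
    FROM sales
    WHERE price > 0
    GROUP BY suburb
    ORDER BY suburb
""").fetchall()

# Records that appear more than once with the same address, date, and price.
duplicates = conn.execute("""
    SELECT address, contract_date, COUNT(*) AS n
    FROM sales
    GROUP BY address, contract_date, price
    HAVING COUNT(*) > 1
""").fetchall()

# Records with a missing or non-positive price.
invalid = conn.execute(
    "SELECT address FROM sales WHERE price IS NULL OR price <= 0"
).fetchall()

# Average price per contract year.
yearly = conn.execute("""
    SELECT strftime('%Y', contract_date) AS year, AVG(price) AS avg_price
    FROM sales
    WHERE price > 0
    GROUP BY year
    ORDER BY year
""").fetchall()
```

Note that the suburb summary here still counts the duplicated Parramatta record; in practice duplicates would be removed before aggregating.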

### Step 4. Data Analysis with Python (pandas)

* Compute descriptive stats and correlations.
* Detect and handle outliers.
* Engineer new variables (price ratios, log(price)).
* Analyze patterns using pandas and numpy.
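
A minimal sketch of this step is shown below. The data is made up, and the 1.5 × IQR rule is only one common way to flag outliers; the project may choose a different rule (for example, percentile clipping per suburb).

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the last price is deliberately extreme.
df = pd.DataFrame({
    "price": [650000, 720000, 810000, 880000, 950000, 1020000, 1100000, 9500000],
    "area_m2": [300, 320, 400, 410, 450, 480, 520, 500],
    "beds": [2, 2, 3, 3, 3, 4, 4, 4],
})

# Descriptive statistics and pairwise Pearson correlations.
summary = df.describe()
corr = df.corr()

# Flag outliers with the 1.5 * IQR rule on price.
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
df["is_outlier"] = ~df["price"].between(lower, upper)

# Engineered variables: log(price) reduces right skew; price per bedroom is a simple ratio.
df["log_price"] = np.log(df["price"])
df["price_per_bed"] = df["price"] / df["beds"]

# Keep flagged rows aside for inspection instead of silently dropping them.
inliers = df[~df["is_outlier"]]
```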

---

### Step 5. Data Visualization

Tools: **matplotlib**, **seaborn**

Key visuals:

* Price distribution (histogram).
* Area vs. price (scatter plot).
* Suburb price comparison (boxplot).
* Correlation heatmap.
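
The four visuals could be laid out on one figure roughly as follows. This sketch uses matplotlib only (seaborn's `histplot`, `boxplot`, and `heatmap` would be natural substitutes), and the sample data and column names are assumptions for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "suburb": ["Manly", "Manly", "Manly", "Parramatta", "Parramatta", "Parramatta"],
    "price": [2100000, 1850000, 2400000, 950000, 1100000, 870000],
    "area_m2": [300, 260, 380, 450, 520, 400],
    "beds": [3, 2, 4, 3, 4, 3],
})


def plot_overview(data):
    fig, axes = plt.subplots(2, 2, figsize=(10, 8))

    # Price distribution (histogram).
    axes[0, 0].hist(data["price"], bins=5)
    axes[0, 0].set_title("Price distribution")
    axes[0, 0].set_xlabel("Price (AUD)")

    # Area vs. price (scatter plot).
    axes[0, 1].scatter(data["area_m2"], data["price"])
    axes[0, 1].set_title("Area vs. price")
    axes[0, 1].set_xlabel("Area (m²)")
    axes[0, 1].set_ylabel("Price (AUD)")

    # Suburb price comparison (boxplot), one box per suburb.
    suburbs = sorted(data["suburb"].unique())
    groups = [data.loc[data["suburb"] == s, "price"].to_numpy() for s in suburbs]
    axes[1, 0].boxplot(groups)
    axes[1, 0].set_xticks(list(range(1, len(suburbs) + 1)))
    axes[1, 0].set_xticklabels(suburbs)
    axes[1, 0].set_title("Price by suburb")

    # Correlation heatmap of the numeric columns.
    corr = data[["price", "area_m2", "beds"]].corr()
    axes[1, 1].imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
    ticks = list(range(len(corr.columns)))
    axes[1, 1].set_xticks(ticks)
    axes[1, 1].set_xticklabels(corr.columns, rotation=45)
    axes[1, 1].set_yticks(ticks)
    axes[1, 1].set_yticklabels(corr.columns)
    axes[1, 1].set_title("Correlation heatmap")

    fig.tight_layout()
    return fig


fig = plot_overview(df)
# fig.savefig("price_overview.png")  # uncomment to write the figure to disk
```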

---

### Step 6. Regression Analysis

Models:

* **Baseline:** Linear Regression.
* **Advanced:** Ridge, Lasso, Random Forest, XGBoost.

Metrics: MAE, RMSE, R²

Goal: Accurate price prediction and feature importance analysis.
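
A possible shape for the model comparison, using scikit-learn, is sketched below. The data is synthetic (generated from an assumed linear relationship plus noise), so the scores say nothing about real NSW prices. Lasso and XGBoost are left out to keep the example short; they would slot into the same `models` dictionary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: the coefficients below are arbitrary assumptions, not estimates.
rng = np.random.default_rng(42)
n = 400
area = rng.uniform(50, 800, n)          # m²
beds = rng.integers(1, 6, n)            # 1 to 5 bedrooms
distance = rng.uniform(1, 60, n)        # km to city center
price = 600000 + 1500 * area + 80000 * beds - 5000 * distance + rng.normal(0, 50000, n)

X = np.column_stack([area, beds, distance])
feature_names = ["area_m2", "beds", "distance_to_center"]

X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.25, random_state=42
)

models = {
    "linear": LinearRegression(),  # baseline
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "MAE": mean_absolute_error(y_test, pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
        "R2": r2_score(y_test, pred),
    }

# Impurity-based feature importances from the random forest.
importances = dict(zip(feature_names, models["random_forest"].feature_importances_))
```

All metrics are computed on the held-out test split only; with real data, cross-validation or a time-based split (train on earlier sales, test on later ones) may give a more honest estimate.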

---

### Step 7. Data Visualization with Power BI

Interactive dashboard includes:

* **Price heatmap** by suburb.
* **Filters** by type, region, year.
* **Trend charts** and **actual vs. predicted** views.

**Business value:** Easy insights for pricing strategy and investment analysis.

---

## 5. Expected Results

* A predictive model with **acceptable accuracy** (average absolute error within 10% of the sale price, i.e. MAPE ≤ 10%).
* A list of properties tagged as **Overpriced / Fair / Underpriced**.
* An interactive dashboard for visual market exploration.
* Insights into **key drivers** of housing prices across NSW.
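
To make the accuracy target and the price tags concrete, the sketch below shows one way to compute a percentage error and to label a listing from the gap between its listing price and the model's prediction. The function names and the ±10% tolerance band are assumptions chosen for illustration; the project does not fix a threshold.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)


def tag_listing(listing_price, predicted_price, tolerance=0.10):
    """Label a listing by its relative gap to the predicted price.

    tolerance is a hypothetical band: gaps within +/-10% count as "Fair".
    """
    if predicted_price <= 0:
        raise ValueError("predicted_price must be positive")
    gap = (listing_price - predicted_price) / predicted_price
    if gap > tolerance:
        return "Overpriced"
    if gap < -tolerance:
        return "Underpriced"
    return "Fair"


# Hypothetical listings: (listing price, predicted price).
examples = [(1200000, 1000000), (1050000, 1000000), (850000, 1000000)]
tags = [tag_listing(listed, predicted) for listed, predicted in examples]
```

A tolerance tied to the model's own error (for example, its MAPE on held-out sales) would keep the tags from claiming more precision than the model has.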
