# House Price Prediction with XGBoost

##  What is This Project About?

In this project, we will use an **XGBoost regression model** to **predict house prices**.

We’ll use the California housing sample dataset, where each row describes a group of homes: its location, median house age, total rooms and bedrooms, population, number of households, and median income.  
Our goal is to predict the median house value from these details.

---

##  What Are We Trying to Do?

We will:

- Load and understand the dataset  
- Clean the data (fix missing values, convert text to numbers)  
- Build a regression model using XGBoost (`XGBRegressor`)  
- Train the model to learn patterns from the data  
- Check how well our model predicts house prices using metrics like:
  - Mean Absolute Error (MAE)  
  - Mean Squared Error (MSE)  
  - Root Mean Squared Error (RMSE)  
  - R² score  

---

##  Why Is This Useful?

Knowing how much a house is worth helps:

- Buyers and sellers make smarter decisions  
- Investors know where to put money  
- Governments and real estate agents plan better  

Gradient-boosted decision trees, the kind of model XGBoost trains, are good at finding complex patterns in tabular data. That’s why we’ll use one for this project.



# Step 1: Import Libraries

In [28]:
# Import Libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from xgboost import XGBRegressor

# Step 2: Load the Dataset

In [29]:
# Load the dataset
df = pd.read_csv("/content/sample_data/california_housing_test.csv")

# Step 3: Understand and Pre-process Sample Data

In [30]:
# Show shape of the dataset
print("Dataset shape:", df.shape)

# Show basic statistics
print("\nSummary statistics:")
print(df.describe())


Dataset shape: (3000, 9)

Summary statistics:
         longitude    latitude  housing_median_age   total_rooms  \
count  3000.000000  3000.00000         3000.000000   3000.000000   
mean   -119.589200    35.63539           28.845333   2599.578667   
std       1.994936     2.12967           12.555396   2155.593332   
min    -124.180000    32.56000            1.000000      6.000000   
25%    -121.810000    33.93000           18.000000   1401.000000   
50%    -118.485000    34.27000           29.000000   2106.000000   
75%    -118.020000    37.69000           37.000000   3129.000000   
max    -114.490000    41.92000           52.000000  30450.000000   

       total_bedrooms    population  households  median_income  \
count     3000.000000   3000.000000  3000.00000    3000.000000   
mean       529.950667   1402.798667   489.91200       3.807272   
std        415.654368   1030.543012   365.42271       1.854512   
min          2.000000      5.000000     2.00000       0.499900   
25%        
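
The columns visible in the summary above all show a count of 3000, which suggests no missing values in them. For datasets that do need cleaning, the sketch below shows one possible way to count missing values, fill them, and convert text columns to numbers. The toy DataFrame and the `ocean_proximity` column are hypothetical and are not part of the file loaded above; median filling and one-hot encoding are just one reasonable choice.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not the notebook's dataset)
demo = pd.DataFrame({
    "total_rooms": [1200.0, np.nan, 3100.0],
    "median_income": [3.5, 4.1, 2.8],
    "ocean_proximity": ["INLAND", "NEAR BAY", "INLAND"],
})

# Count missing values per column
missing_counts = demo.isna().sum()
print(missing_counts)

# Find non-numeric (text) columns
text_columns = demo.select_dtypes(exclude="number").columns.tolist()
print("Text columns:", text_columns)

# Fill missing numeric values with the column median
demo["total_rooms"] = demo["total_rooms"].fillna(demo["total_rooms"].median())

# Convert text columns to numbers with one-hot encoding
demo_encoded = pd.get_dummies(demo, columns=text_columns)
print(demo_encoded.columns.tolist())
```

On the real dataset, the same `df.isna().sum()` check would confirm whether any cleaning is needed before splitting.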

# Step 4: Feature Scaling and Train-Test Split

In [31]:
# Separate features and target
X = df.drop("median_house_value", axis=1)
y = df["median_house_value"]

# Split data into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    random_state=0
)

# Scale features: fit the scaler on the training set only,
# then apply the same transformation to the test set
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("Data Splitting Done")
print("X_train shape:", X_train.shape)
print("X_test shape:", X_test.shape)

Data Splitting Done
X_train shape: (2400, 8)
X_test shape: (600, 8)
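
Step 5 trains on `X_train_scaled`, so the features need to be scaled after the split. A common approach, sketched here on hypothetical toy numbers rather than the real data, is to fit `StandardScaler` on the training rows only and reuse its statistics for the test rows, so that no information from the test set leaks into training.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy features: [total_rooms, median_income]
X_train_demo = np.array([[1000.0, 2.0], [2000.0, 4.0], [3000.0, 6.0]])
X_test_demo = np.array([[2500.0, 5.0]])

scaler_demo = StandardScaler()
# Learn the mean and standard deviation from the training rows only
X_train_demo_scaled = scaler_demo.fit_transform(X_train_demo)
# Apply the training statistics to the test rows
X_test_demo_scaled = scaler_demo.transform(X_test_demo)

print("Train column means:", X_train_demo_scaled.mean(axis=0))
print("Train column stds:", X_train_demo_scaled.std(axis=0))
print("Scaled test row:", X_test_demo_scaled)
```

After scaling, each training column has mean 0 and standard deviation 1. Tree-based models such as XGBoost are largely insensitive to this kind of rescaling, so the step matters more for models like linear regression or neural networks.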


# Step 5: Build and Train the XGBoost Regressor

In [32]:
# Initialize the XGBoost model
xgb_model = XGBRegressor(
    objective='reg:squarederror',
    n_estimators=100,
    learning_rate=0.1,
    max_depth=4,
    random_state=0
)

# Train the model on the scaled training features
xgb_model.fit(X_train_scaled, y_train)
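
To give a feel for what `n_estimators`, `learning_rate`, and `max_depth` control, here is a minimal sketch of gradient boosting for squared-error loss, built from scikit-learn decision trees on synthetic data. It is a simplified illustration of the general idea, not XGBoost's actual implementation (which adds regularization and other refinements); the helper names `fit_boosted_trees` and `predict_boosted_trees` are made up for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_boosted_trees(X, y, n_estimators=100, learning_rate=0.1, max_depth=4):
    """Fit a simple boosted ensemble: each tree learns the current residuals."""
    base = float(np.mean(y))              # start from the mean target
    pred = np.full(len(y), base)
    trees = []
    for _ in range(n_estimators):
        residual = y - pred               # negative gradient of squared error
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residual)
        pred += learning_rate * tree.predict(X)  # take a small step
        trees.append(tree)
    return base, trees

def predict_boosted_trees(base, trees, X, learning_rate=0.1):
    """Sum the base value and the scaled contribution of every tree."""
    pred = np.full(X.shape[0], base)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred

# Synthetic demo data: a noisy sine curve
rng = np.random.default_rng(0)
X_demo = rng.uniform(-3, 3, size=(200, 1))
y_demo = np.sin(X_demo[:, 0]) + rng.normal(0, 0.1, size=200)

base, trees = fit_boosted_trees(X_demo, y_demo, n_estimators=50)
baseline_mse = np.mean((y_demo - base) ** 2)
boosted_mse = np.mean((y_demo - predict_boosted_trees(base, trees, X_demo)) ** 2)
print("Training MSE predicting the mean:", round(baseline_mse, 4))
print("Training MSE after boosting:", round(boosted_mse, 4))
```

More trees or a larger learning rate fit the training data more closely, while deeper trees let each step capture more complex interactions; all three raise the risk of overfitting.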


# Step 6: Model Evaluation and Predictions

In [33]:
# Make predictions
y_pred = xgb_model.predict(X_test_scaled)

# Evaluate the model
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, y_pred)

print("\nXGBoost Model Evaluation:")
print("Mean Absolute Error (MAE):", round(mae, 2))
print("Mean Squared Error (MSE):", round(mse, 2))
print("Root Mean Squared Error (RMSE):", round(rmse, 2))
print("R² Score:", round(r2, 4))



XGBoost Model Evaluation:
Mean Absolute Error (MAE): 39535.32
Mean Squared Error (MSE): 3218170665.63
Root Mean Squared Error (RMSE): 56728.92
R² Score: 0.7617
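
To make these metrics concrete, the following sketch computes them by hand with NumPy on four made-up prices. The numbers are purely illustrative and unrelated to the results above.

```python
import numpy as np

# Made-up true and predicted prices (illustrative only)
y_true_demo = np.array([200_000.0, 150_000.0, 300_000.0, 250_000.0])
y_pred_demo = np.array([210_000.0, 140_000.0, 280_000.0, 250_000.0])

errors = y_true_demo - y_pred_demo

mae_demo = np.mean(np.abs(errors))    # average size of the error
mse_demo = np.mean(errors ** 2)       # average squared error
rmse_demo = np.sqrt(mse_demo)         # back in the same units as the price

# R²: share of the target's variance explained by the predictions
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true_demo - y_true_demo.mean()) ** 2)
r2_demo = 1 - ss_res / ss_tot

print("MAE:", mae_demo)               # 10000.0
print("MSE:", mse_demo)               # 150000000.0
print("RMSE:", round(rmse_demo, 2))   # 12247.45
print("R²:", round(r2_demo, 3))       # 0.952
```

MAE and RMSE are both in the same units as the target, so the MAE reported above means the model's predictions are off by about 39,500 on average. RMSE is larger than MAE because squaring gives big errors more weight.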
