California Housing Price Prediction - Linear Regression, Support Vector Regression, Decision Trees, and Random Forest Regression

dilne/CaliforniaHousing

🏡California Housing Price Prediction

This repository walks through regression on the California housing dataset using four ML models:
Linear Regression, Support Vector Regression, Decision Trees, and Random Forest Regression.

Notebook

Open In Colab

Dataset

The dataset contains 20,640 entries and the following 10 variables:

  • Longitude
  • Latitude
  • Housing Median Age
  • Total Rooms
  • Total Bedrooms
  • Population
  • Households
  • Median Income
  • Median House Value
  • Ocean Proximity

Notebook

In the notebook, I perform:

  • Data investigation
  • Data cleaning
  • Removing outliers
  • Exploratory data analysis
  • Feature engineering
  • Dimensionality reduction
  • Feature encoding
  • Correlation and multicollinearity assessment
  • Feature scaling
  • Model training (including grid search)
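The encoding, scaling, and grid-search steps above can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the notebook's actual code: the data is synthetic stand-in data, the snake_case column names are assumed, and the hyperparameter grid is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (NOT the real dataset); column names are assumed.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "median_income": rng.uniform(0.5, 15.0, n),
    "housing_median_age": rng.integers(1, 53, n).astype(float),
    "ocean_proximity": rng.choice(["INLAND", "NEAR BAY", "NEAR OCEAN"], n),
})
df["median_house_value"] = 40_000 * df["median_income"] + rng.normal(0, 20_000, n)

X = df.drop(columns="median_house_value")
y = df["median_house_value"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Feature scaling for numeric columns, one-hot encoding for the categorical one.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["median_income", "housing_median_age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["ocean_proximity"]),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("forest", RandomForestRegressor(random_state=0)),
])

# Hypothetical grid; the values searched in the notebook are not stated here.
param_grid = {
    "forest__n_estimators": [10, 30],
    "forest__max_depth": [4, None],
}
search = GridSearchCV(
    model, param_grid, cv=3, scoring="neg_root_mean_squared_error"
)
search.fit(X_train, y_train)

# With this scoring string, score() returns the negated RMSE, so flip the sign.
test_rmse = -search.score(X_test, y_test)
print(search.best_params_, round(test_rmse, 2))
```

Wrapping preprocessing and the model in one `Pipeline` means the scaler and encoder are refit inside each cross-validation fold, which avoids leaking validation data into the preprocessing step.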

Results

The Random Forest Regression model was the best performer among the trained models, with a root mean squared error (RMSE) of about $43,658. Its metrics were:

  • R^2 Score: 0.7933309926525507
  • Mean Absolute Error: 29580.49344298964
  • Mean Squared Error: 1906039202.1731477
  • Root Mean Squared Error: 43658.208875000215
  • Mean Absolute Percentage Error: 17.003087000720146%
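For reference, the metrics above follow the standard definitions, which can be sketched in plain Python. The function name `regression_metrics` and the three-point example are illustrative assumptions, not taken from the notebook. The last line checks that the reported RMSE is the square root of the reported MSE.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, MSE, RMSE, and MAPE (in percent) for two sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides by the true value, so y_true must not contain zeros.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "mae": mae, "mse": mse, "rmse": rmse, "mape": mape}


# Tiny illustrative example (made-up numbers, not model output).
metrics = regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 330.0])
print(metrics)

# Consistency check on the reported results: RMSE should equal sqrt(MSE).
reported_rmse = math.sqrt(1906039202.1731477)
print(reported_rmse)
```

Because RMSE and MAE are in the same units as the target, both read as dollar errors on the median house value.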
