
The House-Prices--Advanced-Regression-Techniques project tackles a regression problem: predicting house prices from 80 input features such as MSSubClass, LotFrontage, and LotArea. The target variable is SalePrice, and the model is built with XGBoost.






Data: The dataset contains 80 input features and one target variable, SalePrice. The task is to predict the sale price of each house from features such as MSSubClass, LotFrontage, and LotArea.

House Price Prediction

This is a regression problem. You can import the dataset from the following link to replicate the results and follow along with the experiment. We'll use XGBoost to solve this problem.

Instructions for Installation:

Dependencies: You'll need to install the dependencies below to run this project.

  • json: 2.0.9 (part of the Python standard library, so no separate installation is needed)
  • pandas: 1.0.1
  • numpy: 1.18.1
  • matplotlib: 3.5.3
  • seaborn: 0.10.0
  • scikit-learn (imported as sklearn): 0.22.1

The code has been tested on Windows. It should work on other operating systems but has not yet been tested there.

If you run into any issues with installation or anything else, please contact me on LinkedIn.

Important learnings:

  • Write a reusable function that determines the data type, null count, unique values, and null percentage of each variable and stores them in a dataframe.
  • Feature engineering.
  • An easy method to check null values across the different features in the dataset.
  • Encode rare categories using RareLabelEncoder.
  • Create a class for temporal transformation that is compatible with the scikit-learn pipeline.
  • Build the scikit-learn pre-processing pipeline for steps such as missing value imputation, feature engineering, and data encoding.
  • Calculate the feature importance.
  • Automatically select important features using SelectFromModel.
  • Compare different model versions: a model without preprocessed data, a model with processed data, and a model with important variables only.
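
To make the first and fifth learnings more concrete, here is a minimal sketch, not the project's actual code. The names `summarize_columns` and `ElapsedYearsTransformer` are hypothetical, the demo values are made up, and treating the temporal transformation as "years elapsed between YearBuilt and YrSold" is an assumption about what it does.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row per column with dtype, null and unique counts."""
    summary = pd.DataFrame({
        "Data_Type": df.dtypes.astype(str),
        "Null_Counts": df.isnull().sum(),
        "Unique_Values": df.nunique(),  # NaN is not counted as a value
        "Null_Percent": (df.isnull().mean() * 100).round(2),
    })
    # Put the columns with the most missing data at the top.
    return summary.sort_values("Null_Percent", ascending=False)


class ElapsedYearsTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: replace year columns with years elapsed
    relative to a reference year column (assumed behaviour)."""

    def __init__(self, variables, reference_variable):
        self.variables = variables
        self.reference_variable = reference_variable

    def fit(self, X, y=None):
        # Nothing to learn; fit exists so the class works inside a Pipeline.
        return self

    def transform(self, X):
        X = X.copy()  # leave the caller's dataframe untouched
        for var in self.variables:
            X[var] = X[self.reference_variable] - X[var]
        return X


# Made-up rows for illustration only.
demo = pd.DataFrame({
    "LotFrontage": [65.0, np.nan, 68.0, 60.0],
    "LotArea": [8450, 9600, 11250, 9550],
    "YearBuilt": [2003, 1976, 2001, 1915],
    "YrSold": [2008, 2007, 2008, 2006],
})

summary = summarize_columns(demo)

pipeline = Pipeline([
    ("elapsed_years", ElapsedYearsTransformer(["YearBuilt"], "YrSold")),
])
transformed = pipeline.fit_transform(demo)

print(summary)
print(transformed)
```

Because the transformer inherits from `BaseEstimator` and `TransformerMixin`, it can be chained with imputation, encoding, and model steps in the same `Pipeline`.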


If you have a Data Science mini-project that you'd like to share, please follow the guidelines in

Code of Conduct

Please adhere to our Code of Conduct in all your interactions with the project.


This project is licensed under the MIT License.


For questions or inquiries, feel free to contact me on LinkedIn.

About Me:

I’m a seasoned Data Scientist and founder of TowardsMachineLearning.Org. I've worked on various Machine Learning, NLP, and cutting-edge deep learning frameworks to solve numerous business problems.









