Bengaluru House Price Prediction using Python (Scikit-Learn, Pandas, NumPy, Matplotlib, Seaborn). A machine learning model predicts prices from features such as location, size, and number of bathrooms. The project covers data preprocessing, a Ridge Regression model, and evaluation metrics. Clone, install, and run the script to estimate Bengaluru house prices.

sudoTheArkKnight/BangalorePricePrediction

Bengaluru House Price Prediction

Python, Scikit-Learn, Pandas, NumPy, Matplotlib, Seaborn

Predicting house prices in Bengaluru using machine learning.

Overview

This project predicts house prices in Bengaluru, India, from features such as location, size, number of bathrooms, and number of balconies. It trains a regression model that estimates prices for residential properties; how accurate those estimates are is reported under Model Evaluation.

Data

The dataset used for this project is 'Bengaluru_House_Data.csv'. It contains information about properties in Bengaluru, including their attributes and corresponding prices.

Data Preprocessing

  • Handling Missing Values: Rows with missing values in the 'location', 'size', 'bath', or 'balcony' columns are removed.
  • Handling the 'total_sqft' column: The 'total_sqft' column is converted to numeric, handling the different formats in which values appear.
  • Encoding Categorical Variables: The 'location' column is label-encoded so the model can use it as a numeric feature.
  • Feature Selection: The features 'location_encoded', 'total_sqft', 'bath', and 'balcony' are selected as model inputs.
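
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the helper names `sqft_to_number` and `preprocess`, the target column name 'price', and the choice to turn ranges such as "1200 - 1400" into their midpoint are all assumptions, and the sample rows are made up.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder


def sqft_to_number(value):
    """Convert a 'total_sqft' entry to float (hypothetical helper).

    Assumption: a range like '1200 - 1400' is replaced by its midpoint;
    anything that cannot be parsed becomes NaN.
    """
    text = str(value).strip()
    if " - " in text:
        low, high = text.split(" - ", 1)
        try:
            return (float(low) + float(high)) / 2
        except ValueError:
            return np.nan
    try:
        return float(text)
    except ValueError:
        return np.nan


def preprocess(df):
    """Return (X, y) after the cleaning steps described above."""
    # Remove rows with missing values in the listed columns.
    df = df.dropna(subset=["location", "size", "bath", "balcony"]).copy()
    # Convert 'total_sqft' to numeric and drop rows that could not be parsed.
    df["total_sqft"] = df["total_sqft"].apply(sqft_to_number)
    df = df.dropna(subset=["total_sqft"]).copy()
    # Label-encode the location.
    df["location_encoded"] = LabelEncoder().fit_transform(df["location"])
    X = df[["location_encoded", "total_sqft", "bath", "balcony"]]
    y = df["price"]  # assumed name of the target column
    return X, y


# Made-up rows for illustration only.
sample = pd.DataFrame({
    "location": ["Whitefield", "Hebbal", None, "Whitefield", "Yelahanka"],
    "size": ["2 BHK", "3 BHK", "2 BHK", "4 Bedroom", "2 BHK"],
    "total_sqft": ["1056", "1200 - 1400", "900", "34.46Sq. Meter", "1100"],
    "bath": [2, 3, 2, 4, 2],
    "balcony": [1, 2, 1, 1, 1],
    "price": [39.07, 95.0, 40.0, 150.0, 51.0],
})
X, y = preprocess(sample)
print(X)
```

With the real data, `sample` would be replaced by `pd.read_csv("Bengaluru_House_Data.csv")`.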

Model Training

  • Data Splitting: The dataset is split into training and testing sets for model evaluation.
  • Polynomial Features: Polynomial features of degree 2 are added to capture complex relationships.
  • Ridge Regression: A Ridge Regression model with L2 regularization is trained to predict house prices.
  • Pipeline: A Scikit-Learn pipeline is used to streamline data preprocessing and model training.
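
A rough sketch of the training steps above, using synthetic data in place of the real dataset so that it runs on its own. The step name "ridge" matches the 'ridge__alpha' key reported under Model Evaluation; the step name "poly", the 80/20 split, the random seed, and `alpha=1.0` are assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the four selected features (illustration only).
rng = np.random.default_rng(42)
n = 200
X = np.column_stack([
    rng.integers(0, 50, n),      # location_encoded
    rng.uniform(500, 3000, n),   # total_sqft
    rng.integers(1, 5, n),       # bath
    rng.integers(0, 4, n),       # balcony
]).astype(float)
y = 0.05 * X[:, 1] + 10 * X[:, 2] + rng.normal(0, 5, n)

# Data splitting: hold out 20% of the rows for testing (assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Pipeline: degree-2 polynomial features followed by Ridge regression.
model = Pipeline([
    ("poly", PolynomialFeatures(degree=2)),
    ("ridge", Ridge(alpha=1.0)),
])
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print(predictions[:5])
```

Wrapping both steps in one pipeline means the polynomial expansion is fitted on the training data only and then reapplied unchanged at prediction time.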

Model Evaluation

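The trained model is evaluated using the metrics below, listed together with the best hyperparameters found during tuning. MAE and RMSE are expressed in the same units as the price column, and the R-squared value means the model explains roughly 48% of the variance in price.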

  • Mean Absolute Error (MAE): 42.83188201062303
  • Mean Squared Error (MSE): 9688.112980255195
  • Root Mean Squared Error (RMSE): 98.42821231870055
  • R-squared (R2): 0.48247268430488044
  • Cross-Validation RMSE: 97.99593594870805
  • Best Hyperparameters: {'ridge__alpha': 0.001}
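
For reference, metrics of this kind can be computed with `sklearn.metrics` as in the sketch below. The arrays are made-up values, not the project's data. RMSE is taken as the square root of MSE, which is consistent with the figures reported above.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up true and predicted prices, for illustration only.
y_true = np.array([50.0, 100.0, 75.0, 120.0])
y_pred = np.array([45.0, 110.0, 80.0, 100.0])

mae = mean_absolute_error(y_true, y_pred)   # mean of |error|
mse = mean_squared_error(y_true, y_pred)    # mean of error squared
rmse = np.sqrt(mse)                         # same units as the target
r2 = r2_score(y_true, y_pred)               # 1 - SS_res / SS_tot

print(f"MAE={mae:.2f}  MSE={mse:.2f}  RMSE={rmse:.2f}  R2={r2:.3f}")
```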

Cross-Validation

Cross-validation is performed to assess model performance across multiple folds. The root mean squared error (RMSE) is used as the evaluation metric.
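
One way to obtain a cross-validated RMSE is sketched below on synthetic data. The number of folds (5), the helper name `make_synthetic_data`, and the way per-fold scores are combined into a single number are assumptions; the source does not state them.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures


def make_synthetic_data(n=200, seed=0):
    """Synthetic stand-in for the four selected features (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([
        rng.integers(0, 50, n),      # location_encoded
        rng.uniform(500, 3000, n),   # total_sqft
        rng.integers(1, 5, n),       # bath
        rng.integers(0, 4, n),       # balcony
    ]).astype(float)
    y = 0.05 * X[:, 1] + 10 * X[:, 2] + rng.normal(0, 5, n)
    return X, y


X, y = make_synthetic_data()
model = Pipeline([
    ("poly", PolynomialFeatures(degree=2)),
    ("ridge", Ridge(alpha=1.0)),
])

# scikit-learn maximizes scores, so MSE is returned negated.
neg_mse_scores = cross_val_score(
    model, X, y, cv=5, scoring="neg_mean_squared_error"
)
fold_rmse = np.sqrt(-neg_mse_scores)   # RMSE of each fold
cv_rmse = fold_rmse.mean()             # averaged across folds
print(fold_rmse, cv_rmse)
```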

Hyperparameter Tuning

RandomizedSearchCV is used to tune the hyperparameters of the Ridge Regression model, in particular the regularization strength alpha (the 'ridge__alpha' pipeline parameter), to improve its predictive accuracy.
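
A minimal sketch of such a search on synthetic data follows. The candidate alpha values, `n_iter`, the number of folds, and the scoring choice are assumptions made for illustration; only the parameter name 'ridge__alpha' comes from the results reported above.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the four selected features (illustration only).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.integers(0, 50, n),      # location_encoded
    rng.uniform(500, 3000, n),   # total_sqft
    rng.integers(1, 5, n),       # bath
    rng.integers(0, 4, n),       # balcony
]).astype(float)
y = 0.05 * X[:, 1] + 10 * X[:, 2] + rng.normal(0, 5, n)

pipeline = Pipeline([
    ("poly", PolynomialFeatures(degree=2)),
    ("ridge", Ridge()),
])

# Candidate values are assumed; "ridge__alpha" targets the alpha
# parameter of the pipeline step named "ridge".
alphas = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
search = RandomizedSearchCV(
    pipeline,
    param_distributions={"ridge__alpha": alphas},
    n_iter=4,
    cv=3,
    scoring="neg_mean_squared_error",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

After fitting, `search.best_estimator_` holds the pipeline refitted on all supplied data with the selected alpha.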

Usage

  1. Clone this repository.
  2. Ensure you have the required libraries installed (pip install -r requirements.txt).
  3. Run the Python script to predict house prices.

Feedback and Contact

For any questions, feedback, or clarifications, please feel free to reach out via GitHub or LinkedIn.

License

This project is licensed under the MIT License - see the LICENSE file for details.
