harrisonjkane/data-science-projects

Project Title: Housing Prices Prediction

Overview

This repository contains a collection of data science projects focused on predicting housing prices based on various features such as location, square footage, and other property-related metrics.

In this project, I perform exploratory data analysis (EDA), data preprocessing, and build machine learning models to predict housing prices. The project aims to demonstrate key data science skills such as data cleaning, feature engineering, model selection, and evaluation.

Project Structure

  • Notebooks: Contains Jupyter notebooks for data exploration, preprocessing, and model building.
  • Data: Placeholder for datasets used in the project. (Note: If datasets are large, they are not stored in the repository.)
  • Scripts: Contains Python scripts for data loading, model training, and evaluation.

Installation

To run this project locally, follow these steps:

  1. Clone this repository:
    git clone https://github.com/harrkane/data-science-projects.git
  2. Create and activate a virtual environment or conda environment:
    conda create --name ds_env python=3.12
    conda activate ds_env
  3. Install required dependencies:
    pip install -r requirements.txt
  4. Launch Jupyter Lab:
    jupyter lab

Dataset

The dataset used in this project is sourced from Kaggle's Housing Prices competition. It contains various features related to houses, including:

  • SalePrice: The target variable we want to predict.
  • Predictor features such as LotArea, OverallQual, and YearBuilt, among others.

Models Used

The following machine learning models are used in this project:

  • Linear Regression
  • Decision Tree Regressor
  • Random Forest Regressor
  • XGBoost
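As a rough illustration of how these models can be compared, here is a minimal sketch using scikit-learn. It is not the code from the notebooks. The data is synthetic and generated only for this example, and the three columns merely stand in for LotArea, OverallQual, and YearBuilt. The split ratio, random seeds, and the `scores` dictionary are assumptions. XGBoost is left out to keep the sketch dependency-light; `xgboost.XGBRegressor` follows the same `fit`/`predict` interface and could be added to the dictionary in the same way.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the Kaggle dataset): three numeric features
# and a noisy linear target, generated only to make the sketch runnable.
rng = np.random.default_rng(0)
n_rows = 500
lot_area = rng.uniform(2000, 15000, n_rows)
overall_qual = rng.integers(1, 11, n_rows)
year_built = rng.integers(1900, 2011, n_rows)
X = np.column_stack([lot_area, overall_qual, year_built])
y = (
    5 * lot_area
    + 20000 * overall_qual
    + 300 * (year_built - 1900)
    + rng.normal(0, 10000, n_rows)
)

# Hold out 20% of the rows for validation (an assumed split ratio).
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree Regressor": DecisionTreeRegressor(random_state=0),
    "Random Forest Regressor": RandomForestRegressor(random_state=0),
}

# Fit each model on the training split and record its validation MAE.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = mean_absolute_error(y_valid, model.predict(X_valid))

for name, mae_value in scores.items():
    print(f"{name}: validation MAE = {mae_value:,.0f}")
```

Because every model exposes the same `fit` and `predict` methods, adding or swapping a model only requires changing the `models` dictionary.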

Results

Model performance is evaluated using Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and R² score. Key findings and insights are documented in the notebooks.
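To make the three metrics concrete, here is a minimal sketch that computes them with only the Python standard library. The function names `mae`, `rmse`, and `r2` and the sample prices are made up for illustration and are not results from this project. In practice, `sklearn.metrics` offers equivalent functions.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: like MAE, but penalizes large errors more."""
    mean_squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mean_squared)


def r2(y_true, y_pred):
    """R² score: share of the target's variance explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical sale prices and predictions, for illustration only.
actual = [200000, 150000, 300000]
predicted = [210000, 140000, 290000]

print(f"MAE:  {mae(actual, predicted):.2f}")
print(f"RMSE: {rmse(actual, predicted):.2f}")
print(f"R²:   {r2(actual, predicted):.4f}")
```

MAE and RMSE are expressed in the same units as SalePrice, so lower is better. R² is unitless, and values closer to 1 indicate a better fit.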

Contributions

Feel free to submit issues or pull requests if you'd like to contribute to this project. Feedback and improvements are welcome!

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

If you have any questions or feedback, feel free to reach out:
