Predicting Housing Prices

What is this repository?

As an aspiring ML engineer, I put my machine learning skills to the test by tackling the Housing Prices Challenge on Kaggle. The goal of this challenge is to predict the sale prices of houses in Ames, Iowa, from a given set of 79 features. The project let me practice core Data Science & Machine Learning techniques.

This repository is organized into self-explanatory folders: Code, Data, Models, Submissions, and Text Files.

The model is evaluated using the Root Mean Squared Error (RMSE), which Kaggle computes between the logarithms of the predicted and actual sale prices. This is the metric I am trying to minimize. My best model has an RMSE of 0.13757, which currently ranks in the top 43%. In practice, my solution would rank higher for two reasons:

Some solutions have an infeasible RMSE of 0.0. No Machine Learning model can predict with such accuracy, so I suspect cheating occurred here.
Some solutions have an RMSE of 0.00044. After inspecting these solutions, I found that they are invalid because the competitors simply submitted the answers to a similar challenge (Boston Housing Prices). Once again, I consider this cheating, since no real Machine Learning methodology is being applied.
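
To make the metric concrete, here is a minimal sketch of how such a score can be computed. It assumes the error is taken between the logarithms of the predicted and actual prices; the function name rmse_log and the sample prices are hypothetical and are not taken from this repository or the competition data.

```python
import math


def rmse_log(y_true, y_pred):
    """Root mean squared error between log(actual) and log(predicted) prices."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [
        (math.log(pred) - math.log(actual)) ** 2
        for actual, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sale prices, for illustration only.
actual = [200000.0, 150000.0, 320000.0]
predicted = [210000.0, 140000.0, 300000.0]
score = rmse_log(actual, predicted)
print(f"RMSE on log prices: {score:.5f}")
```

Because the error is measured on logarithms, a score of exactly 0.0 requires every predicted price to match the actual price exactly, which is why such leaderboard entries look suspicious.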

Final Model: My best model is a tuned CatBoost Model.

Note: you may use my solution as a reference; however, I strongly advise you to tackle this challenge on your own. The only way to get better at machine learning is to practice it yourself. I do not condone, nor am I responsible for, any cheating that may occur as a result of this repository.

Machine Learning Project Checklist:

This is the checklist I use for every ML project. It goes through every major step & ensures that I have done everything correctly.

Framing the Problem - Complete
Getting the Data - Complete
Exploring the Data - Complete
Data Preprocessing - Complete
Model Development - Complete
Model Tuning/Ensemble Learning - Complete
Deploying Model on Test Set & Presentation of Solution - Complete

What tools are used in this project?

References

Future Adjustments

There are countless adjustments I could make to improve my score; however, here are a few promising ones:

Combine the Tuned-CatBoost model with some other models (Linear Regression & Support Vector Machines seem promising)
Feature Engineering: I could reduce the number of categories for certain features.
Feature Importance: Perform further feature selection, using my model to guide which features to keep.
Incorporate outside data, as many credible top-ranked solutions do.
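
The first adjustment, combining the tuned CatBoost model with other models, could be sketched as a simple weighted average of each model's predictions. This is only one possible way to ensemble, not something this repository implements; the function blend_predictions, the model names, the weights, and the prices below are all hypothetical.

```python
def blend_predictions(predictions_by_model, weights):
    """Weighted average of per-model predictions, row by row."""
    if set(predictions_by_model) != set(weights):
        raise ValueError("every model needs exactly one weight")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    n_rows = len(next(iter(predictions_by_model.values())))
    if any(len(p) != n_rows for p in predictions_by_model.values()):
        raise ValueError("all models must predict the same number of rows")

    blended = []
    for row in range(n_rows):
        weighted_sum = sum(
            weights[name] * preds[row]
            for name, preds in predictions_by_model.items()
        )
        # Dividing by the total lets the weights be given on any scale.
        blended.append(weighted_sum / total_weight)
    return blended


# Hypothetical predictions from three already-trained models.
predictions = {
    "catboost": [200000.0, 150000.0],
    "linear_regression": [190000.0, 160000.0],
    "svm": [210000.0, 155000.0],
}
# Hypothetical weights; in practice they would be chosen on a validation set.
weights = {"catboost": 0.6, "linear_regression": 0.2, "svm": 0.2}
blended = blend_predictions(predictions, weights)
print(blended)
```

Giving the strongest single model the largest weight is a common starting point, but whether the blend actually beats CatBoost alone would need to be checked on held-out data.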

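Cutting down the categories of a feature can be sketched as merging rare values into a single catch-all label. This is an assumed approach for illustration; the function collapse_rare_categories, the threshold, the "Other" label, and the sample column are hypothetical.

```python
from collections import Counter


def collapse_rare_categories(values, min_count, other_label="Other"):
    """Replace categories seen fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


# Hypothetical categorical column, for illustration only.
neighborhoods = [
    "NAmes", "NAmes", "NAmes",
    "CollgCr", "CollgCr",
    "Blueste", "Veenker",
]
collapsed = collapse_rare_categories(neighborhoods, min_count=2)
print(collapsed)
```

The counts should come from the training set only, and the same mapping should then be applied to the test set so that both share the same categories.
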
Closing Remarks

This project was very enjoyable, and I learned a lot along the way! I would recommend this challenge to anyone looking to dive into Machine Learning & Data Science. It is fairly simple, and the dataset is relatively small & not overwhelming.

About the author

I am an undergraduate student at Rutgers University New Brunswick, pursuing bachelor's degrees in Computer Science and Cognitive Science. I am also pursuing a certificate in Data Science. I have a passion for AI, and I am always intrigued by its power. Feel free to contact me via LinkedIn.
Enjoy!
Jinal Shah

JinalShah2002/House-Prices-Challenge-Solution