# House Prices - Advanced Regression Techniques
[Michael DiSanto](https://www.michaelpdisanto.com) - 2023

## Project Objective

Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's dataset proves that much more influences price negotiations than the number of bedrooms or a white-picket fence.

With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, this competition challenges you to predict the final price of each home.

In this project, I aim to predict sales prices of homes using advanced regression techniques, including feature engineering, random forests, and gradient boosting. For each Id in the test set, I will predict the value of the SalePrice variable. The output will be evaluated on Root-Mean-Squared-Error (RMSE) between the logarithm of the predicted value and the logarithm of the observed sales price. (Taking logs means that errors in predicting expensive houses and cheap houses will affect the result equally.)
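
The metric described above can be sketched in a few lines. This is a minimal illustration rather than the competition's actual scoring code: the function name `rmse_log` and the sample prices are made up for the example.

```python
import math


def rmse_log(y_true, y_pred):
    """RMSE between log(predicted price) and log(observed price)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [
        (math.log(pred) - math.log(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative prices only: a 10% overestimate on a cheap and an expensive home.
cheap_error = rmse_log([100_000], [110_000])
expensive_error = rmse_log([1_000_000], [1_100_000])
print(cheap_error, expensive_error)
```

Both calls give the same score, because the log difference depends only on the ratio of predicted to observed price. That is the sense in which expensive and cheap houses affect the result equally.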

## Understanding the Data
* Import the necessary libraries.
* Load and inspect the training and test datasets.
* Display basic statistics, data types, and initial observations.
* Clean and preprocess the data, handling missing values where necessary.
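
The steps above might look roughly like the following sketch with pandas. The rows are invented toy values standing in for the real data, the file name `train.csv` is an assumption about how the competition data is stored locally, and median imputation is just one simple option for missing values.

```python
import io

import pandas as pd

# Toy rows standing in for the real training file; all values are invented.
toy_csv = io.StringIO(
    "Id,LotArea,LotFrontage,Neighborhood,SalePrice\n"
    "1,8000,60,CollgCr,200000\n"
    "2,9000,,Veenker,180000\n"
    "3,11000,70,CollgCr,220000\n"
    "4,9500,,Crawfor,140000\n"
)
# With the real data this would be something like pd.read_csv("train.csv")
# (assumed file name).
df = pd.read_csv(toy_csv)

# Basic inspection: size, column types, summary statistics
print(df.shape)
print(df.dtypes)
print(df.describe())

# Count missing values per column and show only columns that have any
missing_counts = df.isna().sum()
print(missing_counts[missing_counts > 0])

# One simple cleaning option: fill numeric gaps with the column median
df["LotFrontage"] = df["LotFrontage"].fillna(df["LotFrontage"].median())
```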

### Importing Libraries

### Loading Data

### Data Summary

### Data Cleaning + Preprocessing

## Exploratory Data Analysis (EDA)
* Visualizations and statistical summaries to gain insights into the data.
* Histograms, scatter plots, box plots, correlation matrices, etc.
* Identify patterns, trends, and potential outliers.
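
As a sketch of the statistical side of this step, the snippet below ranks features by their correlation with `SalePrice` and flags potential outliers with the 1.5 × IQR rule. The data frame holds invented toy values, and the IQR rule is one common convention for flagging outliers, not necessarily the one this project settles on.

```python
import pandas as pd

# Invented toy values; the column names follow the Ames dataset.
df = pd.DataFrame(
    {
        "GrLivArea": [1000, 1200, 1500, 1800, 2000, 5000],
        "OverallQual": [5, 5, 6, 7, 8, 6],
        "SalePrice": [120000, 140000, 170000, 210000, 250000, 180000],
    }
)

# Pearson correlation of each feature with the target, strongest first
corr_with_target = df.corr()["SalePrice"].drop("SalePrice")
print(corr_with_target.sort_values(ascending=False))

# Flag rows whose living area falls outside 1.5 * IQR of the quartiles
q1, q3 = df["GrLivArea"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["GrLivArea"] < q1 - 1.5 * iqr) | (df["GrLivArea"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]
print(outliers)
```

In the toy data, the 5000 sq ft home is flagged: it is far larger than the rest but sells for a mid-range price, which is the kind of point worth inspecting before modeling.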

### Visualizations

### Statistical Summaries

## Data Preparation
* Feature engineering: Create new features if needed.
* Data scaling, normalization, or encoding for machine learning models.
* Train-test split: Divide the data into training and testing sets.
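
A minimal sketch of these three steps, assuming pandas and scikit-learn. The values are invented, `TotalSF` is a hypothetical engineered feature chosen for illustration, and the split here creates a hold-out validation set from labeled rows (the competition's own test set has no `SalePrice`).

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Invented toy values; the column names follow the Ames dataset.
df = pd.DataFrame(
    {
        "TotalBsmtSF": [800, 0, 1000, 1200, 900, 700, 1100, 950],
        "1stFlrSF": [1000, 900, 1100, 1300, 950, 800, 1200, 1000],
        "2ndFlrSF": [0, 700, 0, 600, 500, 0, 800, 0],
        "Neighborhood": ["CollgCr", "Veenker", "CollgCr", "Crawfor",
                         "Veenker", "Crawfor", "CollgCr", "Veenker"],
        "SalePrice": [150000, 170000, 180000, 260000,
                      200000, 120000, 280000, 165000],
    }
)

# Feature engineering: total square footage (hypothetical example feature)
df["TotalSF"] = df["TotalBsmtSF"] + df["1stFlrSF"] + df["2ndFlrSF"]

# Encoding: one-hot encode the categorical column
X = pd.get_dummies(df.drop(columns="SalePrice"), columns=["Neighborhood"])

# Model the log of the price so the training target matches the metric
y = np.log(df["SalePrice"])

# Hold out 25% of the labeled rows for validation
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, random_state=42
)
print(X_train.shape, X_valid.shape)
```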

### Feature Engineering

### Data Normalization/Encoding (for ML models)

### Splitting the Data

## Modeling
* Select machine learning algorithms that are appropriate for your problem.
* Train and evaluate models.
* Hyperparameter tuning.
* Cross-validation if applicable.
* Performance metrics suited to regression (e.g., RMSE, MAE, R²).
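
The loop below sketches how two candidate models could be compared with 5-fold cross-validation in scikit-learn. The data is synthetic (randomly generated, not the Ames dataset), and the small `n_estimators` values are placeholders chosen to keep the example fast, not tuned settings.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: three numeric features and a log-price target.
rng = np.random.default_rng(0)
X = rng.uniform(500, 3000, size=(200, 3))
log_price = 11.0 + 0.0003 * X[:, 0] + 0.0001 * X[:, 1] + rng.normal(0, 0.05, 200)

models = {
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(n_estimators=50, random_state=0),
}

cv_rmse = {}
for name, model in models.items():
    # scikit-learn reports negated RMSE so that higher is always better
    scores = cross_val_score(
        model, X, log_price, cv=5, scoring="neg_root_mean_squared_error"
    )
    cv_rmse[name] = -scores.mean()
    print(f"{name}: mean CV RMSE (log scale) = {cv_rmse[name]:.4f}")
```

Because the target is already on the log scale, the cross-validated RMSE here is directly comparable to the competition metric.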

### Model Selection

### Model Training and Evaluation

### Hyperparameter Tuning

### Cross-Validation (if applicable)

### Performance Metrics

## Results and Discussion
* Present the results of your analysis and modeling.
* Interpret the model's performance and what it means for the project's goals.
* Discuss any challenges encountered and potential improvements.

### Analysis and Modeling Results

xxxxxx

### Performance Interpretation

xxxxxx

### Challenges and Potential Improvements

xxxxxx

## Conclusion
* Summarize the key findings and outcomes.
* Reiterate the project's objectives and whether they were achieved.
* Suggest possible extensions or future steps for the project.
* Highlight areas that could benefit from additional data or research.

### Key Findings and Outcomes

xxxxxx

### Future Work

xxxxxx

## References
* https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques
* http://jse.amstat.org/v19n3/decock.pdf
* https://www.kaggle.com/code/skirmer/fun-with-real-estate-data

## License

MIT License

Copyright (c) 2023 Michael DiSanto

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.