
pchadrow/dsc-mod-2-project-v2-1-onl01-dtsc-ft-041320

 
 


Realty Kings

Housing data investigation and modeling

Presentables

Presentation, Presentation Video, Relevant Blog

Purpose

Given housing data for King County, Washington (the Seattle area), our goal was to find price predictors and create a linear regression model capable of predicting potential prices for new homes. This README also serves as a general outline and explanation of the analysis and development process.

Synopsis

An exploration into distances found some interesting trends that warrant further investigation. After numerous failed attempts at using recursive feature elimination, we settled on standard feature selection for our final model. While the model initially seemed promising, further investigation revealed our perceived model score to be artificially high due to the lack of a constant in the model.

Analysis

Primary data used

First

Our initial investigation looked at the relationship between house grade and house price here.

[Figure: grade_comparison]

As we can see, there does appear to be a linear relationship between house grade and sale price, with higher grades going along with higher prices.
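To put a number on a trend like this, one option is to compute the Pearson correlation and a least-squares line. The sketch below is illustrative only: the `grade` and `price` arrays are made-up values, not the project's data, and the variable names are assumptions.

```python
import numpy as np

# Made-up illustrative values, NOT taken from the King County data.
grade = np.array([5, 6, 7, 8, 9, 10, 11], dtype=float)
price = np.array([250, 300, 400, 540, 770, 1070, 1500], dtype=float) * 1000

# Pearson correlation: values near +1 suggest a strong positive linear trend.
r = np.corrcoef(grade, price)[0, 1]

# Least-squares line price ~ slope * grade + intercept.
slope, intercept = np.polyfit(grade, price, deg=1)

print(f"correlation: {r:.3f}")
print(f"estimated price change per grade step: {slope:,.0f}")
```

A high correlation alone does not prove the relationship is linear, so it is still worth looking at the scatter plot alongside the number.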


Second

We then looked into the effects of renovation, and whether more recent renovations had a stronger effect on the sale price of the house.

[Figure: renovation]

While the relationship in our data was not strong, it was interesting to see a slight trend: more recently renovated homes tended to have a slightly higher possible selling price than those renovated longer ago.


Third

Due to the lack of information on waterfront properties, we were curious whether a house's proximity to water, as well as its distance from downtown Seattle, would have an effect on the sale price. We investigated this here.

[Figure: seattle]

There did indeed appear to be a correlation between the price of a house and its distance from Seattle, with houses closer to the city tending to have higher possible prices.

Next we chose a few hotspot locations in the water to get a better idea of how house prices relate to proximity to water.

[Figure: map]

Using these hotspots, we then compared each home's price with its distance to the closest hotspot.

[Figure: hotspot]

This shows a much stronger relationship between distance from water and house price than between distance from Seattle and house price.
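As a rough sketch of how both distance features could be computed, the snippet below uses the haversine great-circle formula. This is an assumption about the approach, not necessarily what the notebooks do. The downtown coordinates are approximate, and the hotspot list and example house are hypothetical placeholders.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_hotspot_km(lat, lon, hotspots):
    """Distance in km from a house to the closest of several (lat, lon) hotspots."""
    return min(haversine_km(lat, lon, h_lat, h_lon) for h_lat, h_lon in hotspots)


# Approximate downtown Seattle coordinates (assumption for illustration).
DOWNTOWN_SEATTLE = (47.6062, -122.3321)
# Hypothetical water hotspots; the project's actual hotspots are not listed here.
HOTSPOTS = [(47.60, -122.25), (47.65, -122.42)]

house = (47.5112, -122.2570)  # hypothetical house location
dist_downtown = haversine_km(*house, *DOWNTOWN_SEATTLE)
dist_water = nearest_hotspot_km(*house, HOTSPOTS)
print(f"downtown: {dist_downtown:.1f} km, nearest hotspot: {dist_water:.1f} km")
```

Applied row by row to the latitude and longitude columns, these two functions would yield one "distance from Seattle" feature and one "distance from water" feature per house.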

Modeling

We attempted numerous modeling approaches, typically with poor results. Here we made our first model using a blanket approach just to see what kind of results we'd get. In terms of R², our results weren't bad. However, nearly everything else about the model was poor, with numerous instances of multicollinearity and an unacceptable amount of kurtosis.
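One common way to flag multicollinearity is the variance inflation factor (VIF), where each feature is regressed on all the others and VIF = 1 / (1 - R²). The sketch below is a minimal NumPy version under that definition, not the check used in the notebooks. The feature names and the synthetic data are assumptions for illustration.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return one VIF per column of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress feature j on the remaining features plus an intercept.
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        ss_res = resid @ resid
        ss_tot = ((target - target.mean()) ** 2).sum()
        r2 = 1.0 - ss_res / ss_tot
        vifs.append(np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return np.array(vifs)


# Synthetic data: the second column is built to be nearly a multiple of the first.
rng = np.random.default_rng(0)
sqft_living = rng.normal(2000, 500, 200)
sqft_above = 0.8 * sqft_living + rng.normal(0, 50, 200)
bedrooms = rng.integers(1, 6, 200).astype(float)

vifs = variance_inflation_factors(np.column_stack([sqft_living, sqft_above, bedrooms]))
for name, vif in zip(["sqft_living", "sqft_above", "bedrooms"], vifs):
    print(f"{name}: VIF = {vif:.1f}")
```

A frequently quoted rule of thumb treats VIF values above roughly 5 to 10 as a sign that a feature is largely explained by the others and is a candidate for removal.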

Next we attempted to use Recursive Feature Elimination (RFE) here and here.

While not all models were saved, we made numerous attempts with various adjustments, all of which gave similar or diminishing results. High amounts of multicollinearity seemed unavoidable with this method.
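For readers unfamiliar with the idea, the sketch below is a simplified stand-in for recursive feature elimination: fit a linear model on standardized features, drop the feature with the smallest absolute coefficient, and repeat. It is not the project's code and not scikit-learn's `RFE` class. The function name, feature names, and synthetic data are all hypothetical.

```python
import numpy as np


def recursive_feature_elimination(X, y, names, n_keep):
    """Repeatedly drop the weakest feature until n_keep features remain."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize so coefficient magnitudes are comparable across features.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        design = np.column_stack([np.ones(len(y)), Xs[:, remaining]])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        # coef[0] is the intercept; rank only the feature coefficients.
        weakest = int(np.argmin(np.abs(coef[1:])))
        remaining.pop(weakest)
    return [names[i] for i in remaining]


# Synthetic data: only the first two columns actually drive y.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = 5 * X[:, 0] + 3 * X[:, 1] + rng.normal(scale=0.5, size=300)
names = ["grade", "sqft_living", "noise_a", "noise_b"]

selected = recursive_feature_elimination(X, y, names, n_keep=2)
print(selected)
```

Because the ranking depends only on coefficient size, two strongly correlated features can both survive elimination, which is consistent with the multicollinearity we kept running into.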

Eventually we settled on our final model. At first glance, it looked like we had finally made a great model for our provided data!

[Figure: model]

However, further investigation showed that our test results weren't matching up with each other. We had forgotten to include a constant (intercept) in our model, which is needed because a house will never sell for $0. After including the constant we got a much different result, one that confirmed what our initial test results were trying to tell us.

[Figure: actual]
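One plausible mechanism for the inflated score, offered here as an assumption rather than a confirmed diagnosis, is how R² is defined when the constant is omitted. Some libraries (statsmodels documents this for its OLS `rsquared`) then compare residuals against the uncentered total sum of squares, sum(y²), instead of the centered one, sum((y - mean(y))²). The NumPy sketch below reproduces the effect on made-up data. The `ols_r2` helper and all values are hypothetical.

```python
import numpy as np


def ols_r2(X, y, add_constant):
    """Fit OLS and return R^2: centered if a constant is included, uncentered otherwise."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    design = np.column_stack([np.ones(len(y)), X]) if add_constant else X
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    ss_res = ((y - design @ coef) ** 2).sum()
    if add_constant:
        ss_tot = ((y - y.mean()) ** 2).sum()  # centered total sum of squares
    else:
        ss_tot = (y ** 2).sum()  # uncentered total sum of squares
    return 1.0 - ss_res / ss_tot


# Made-up data: a large base price, a weak size effect, and a lot of noise.
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 4000, 200)
price = 400_000 + 20 * sqft + rng.normal(0, 100_000, 200)

r2_without = ols_r2(sqft, price, add_constant=False)
r2_with = ols_r2(sqft, price, add_constant=True)
print(f"R^2 without constant: {r2_without:.2f}")
print(f"R^2 with constant:    {r2_with:.2f}")
```

In this setup the no-constant score looks far better even though the feature explains little of the variation in price, because the uncentered baseline gives the model credit for merely matching the overall price level.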

Conclusion

Our investigation showed that numerous factors can influence the price of a house. Based on our main investigations, we recommend renovating the house before selling, which may provide a small improvement in price. If that renovation offers an opportunity to add square footage or additional rooms, it would be wise to do so in order to maximize the potential selling price. As for predicting the selling price, we do not recommend using our current model at this time, as it has proven to be wildly inaccurate.
