California Housing Price Predictor

This repository explores predicting California housing prices from 1990 census data using a linear regression model. The project uses scikit-learn, and the required dataset ships with scikit-learn.

Data

Data Set Characteristics
Number of instances	20640
Number of attributes	8

Attribute	Description
MedInc	median income in block group
HouseAge	median house age in block group
AveRooms	average number of rooms per household
AveBedrms	average number of bedrooms per household
Population	block group population
AveOccup	average number of household members
Latitude	block group latitude
Longitude	block group longitude

Target: median house value for California districts, expressed in hundreds of thousands of dollars ($100,000).

This dataset was obtained from the StatLib repository. https://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html

This dataset was derived from the 1990 U.S. census, using one row per census block group. A block group is the smallest geographical unit for which the U.S. Census Bureau publishes sample data (a block group typically has a population of 600 to 3,000 people).

A household is a group of people residing within a home. Since the average number of rooms and bedrooms in this dataset are provided per household, these columns may take surprisingly large values for block groups with few households and many empty houses, such as vacation resorts.

Implementation Details 📜

Model: the standard LinearRegression model available in scikit-learn
Input: the 1990 census data described above
Output: median house value of the block group, in hundreds of thousands of dollars

Steps

Split the data into training and test datasets in the ratio 80:20.
Create a pipeline containing a StandardScaler and a LinearRegression model.
Fit the pipeline on the training data to create the model.
Predict the results using the model on the training data.
Compare the predictions to the given housing prices and calculate the r2_score and the mean squared error.
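
The steps above can be sketched as follows. This is a minimal sketch, not the notebook's actual code: the function name `evaluate_linear_pipeline`, the `random_state` value, and the synthetic stand-in data are all assumptions, chosen so the example runs without downloading anything. It reports metrics for both the training split and the held-out test split.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def evaluate_linear_pipeline(X, y, random_state=0):
    """Hypothetical helper: 80:20 split, scale, fit, and score both splits."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=random_state
    )

    # StandardScaler is fitted on the training split only, inside the pipeline.
    model = make_pipeline(StandardScaler(), LinearRegression())
    model.fit(X_train, y_train)

    scores = {}
    for name, X_part, y_part in (
        ("train", X_train, y_train),
        ("test", X_test, y_test),
    ):
        predictions = model.predict(X_part)
        scores[name] = {
            "r2": r2_score(y_part, predictions),
            "mse": mean_squared_error(y_part, predictions),
        }
    return scores


# Synthetic stand-in data (assumption): 8 features, linear target plus noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 8))
y_demo = X_demo @ np.arange(1.0, 9.0) + rng.normal(scale=0.5, size=500)
scores = evaluate_linear_pipeline(X_demo, y_demo)
```

To try it on the real data, `sklearn.datasets.fetch_california_housing(return_X_y=True)` returns the feature matrix and the target; it downloads the dataset on first use. The numbers it produces depend on the split and need not match the Results table.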

Results

Metric	Value
R2 Score	0.5891435539852219
MSE	0.5472825858911409

An R2 score of about 0.59 means the model explains roughly 59% of the variance in the median house values, so a large share of the variability in the true values remains unexplained by the model's predictions.
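
To make the two metrics concrete, here is a small sketch that computes them by hand. The helper names `mse_by_hand` and `r2_by_hand` and the sample values are made up for illustration; the project itself relies on scikit-learn's metric functions.

```python
def mse_by_hand(y_true, y_pred):
    """Mean of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_by_hand(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values in units of $100,000, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
mse = mse_by_hand(y_true, y_pred)
r2 = r2_by_hand(y_true, y_pred)
```

Because the target is expressed in units of $100,000, the MSE is in squared units. Taking the square root of the reported MSE of 0.547 gives about 0.74, which corresponds to a typical error of roughly $74,000.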

Future Explorations

Evaluating whether to normalize the Longitude and Latitude attributes of the dataset. Is there a better way to represent latitude and longitude?
Evaluating whether external data can be added to improve the model and its accuracy.
Evaluating other feature engineering techniques to get more representative features.
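
As one possible direction for the latitude and longitude question, and not something this repository implements, the raw coordinates could be replaced or supplemented by the great-circle distance from each block group to a reference point such as a city center. The sketch below uses the haversine formula; the function name, the choice of reference cities, and their approximate coordinates are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate city-center coordinates (assumed reference points).
LOS_ANGELES = (34.05, -118.24)
SAN_FRANCISCO = (37.77, -122.42)

distance_la_sf = haversine_km(*LOS_ANGELES, *SAN_FRANCISCO)
```

A derived feature could then be, for example, the distance from each block group's (Latitude, Longitude) to the nearest reference city. Whether this helps the model would need to be checked against the current results.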

Libraries and Languages

Language:

Packages:

FAQs

What is Linear Regression?

A linear regression model describes the relationship between a dependent variable, y, and one or more independent variables, X. The model is linear in its coefficients. With n independent variables, fitting the model means finding the n-dimensional hyperplane that best represents the given data.
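
Written out, and with symbols chosen here for illustration ($\beta_j$ for the coefficients, $x_j$ for the features, $m$ for the number of rows), the model and the ordinary least squares objective that scikit-learn's LinearRegression minimizes look like this:

```latex
\hat{y} = \beta_0 + \sum_{j=1}^{n} \beta_j x_j
\qquad
\min_{\beta_0, \dots, \beta_n} \; \sum_{i=1}^{m} \left( y_i - \hat{y}_i \right)^2
```

For this dataset n = 8, one coefficient per attribute, plus the intercept $\beta_0$.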

For more details, see MathWorks.

Acknowledgements

Contact

If you have any feedback/are interested in collaborating, please reach out to me via 📧

rajamal/california-housing-price-predictor
