Model Evaluation & Validation Project [Udacity Machine Learning Engineer Nanodegree]
Develop a model to predict Boston housing prices. Utilize sklearn techniques for training, testing, evaluating and optimizing models.
- Explore the data
- Develop a model using decision tree algorithm
- Optimize model parameters using grid search
- Predicting selling prices for 3 clients

| Client | Home price |
| --- | --- |
| Client 1 | $409,100.00 |
| Client 2 | $285,600.00 |
| Client 3 | $957,218.18 |

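The steps above (fit a decision tree, tune it with grid search, predict for three clients) can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, the price formula, the `max_depth` search range, the cross-validation settings, and the three client feature rows are all assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, ShuffleSplit, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the housing data (assumption: NOT the real dataset).
# Columns mimic RM, LSTAT, PTRATIO; the price formula is invented for the demo.
rng = np.random.default_rng(0)
n = 200
rm = rng.uniform(3.5, 8.5, n)
lstat = rng.uniform(2.0, 38.0, n)
ptratio = rng.uniform(12.0, 22.0, n)
X = np.column_stack([rm, lstat, ptratio])
y = 200_000 + 80_000 * rm - 5_000 * lstat - 8_000 * ptratio + rng.normal(0, 20_000, n)

# Hold out a test set so the tuned model is evaluated on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Grid search over tree depth, scored by R^2 on shuffled validation splits.
cv = ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)
grid = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid={"max_depth": list(range(1, 11))},
    scoring="r2",
    cv=cv,
)
grid.fit(X_train, y_train)

best_depth = grid.best_params_["max_depth"]
test_r2 = grid.score(X_test, y_test)  # R^2 of the refit best model on the test set

# Hypothetical clients as [RM, LSTAT, PTRATIO] rows (illustrative values only).
clients = np.array([[5, 17, 15], [4, 32, 22], [8, 3, 12]], dtype=float)
predictions = grid.predict(clients)

print(f"best max_depth: {best_depth}, test R^2: {test_r2:.3f}")
for i, price in enumerate(predictions, start=1):
    print(f"client {i}: ${price:,.2f}")
```

Because the data here is synthetic, the printed prices will not match the table above; the point is only the shape of the workflow.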
Applicability of the constructed model
The model still has considerable room for improvement before it can be used in the real world, for the following reasons:
- The model learned from an old dataset in which the value of money differs considerably from today's, so it cannot accurately predict current house prices.
- More features relevant to house prices should be considered, such as location, economic growth, and interest rates.
This project requires Python and the following Python libraries installed:
You will also need to have software installed to run and execute a Jupyter Notebook.
If you do not have Python installed yet, it is highly recommended that you install the Anaconda distribution of Python, which already has the above packages and more included.
In a terminal or command window, run one of the following commands:
ipython notebook boston_housing.ipynb
or
jupyter notebook boston_housing.ipynb
This will open the Jupyter Notebook software and project file in your browser.
The modified Boston housing dataset consists of 489 data points, each with 3 features. It is a modified version of the Boston Housing dataset found on the UCI Machine Learning Repository.
Features
1. RM: average number of rooms per dwelling
2. LSTAT: percentage of population considered lower status
3. PTRATIO: pupil-teacher ratio by town

Target Variable
4. MEDV: median value of owner-occupied homes
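
As a rough sketch of the "explore the data" step, the features can be separated from the target and the target summarized with a few descriptive statistics. The four rows below are made-up values for illustration only; in the project the table would presumably be read from the dataset file instead (for example with `pandas.read_csv`; the file name is not given here).

```python
import pandas as pd

# Made-up rows for illustration only (assumption: NOT taken from the dataset).
data = pd.DataFrame(
    {
        "RM": [6.5, 5.9, 7.2, 6.1],
        "LSTAT": [5.0, 14.2, 3.1, 9.8],
        "PTRATIO": [15.3, 18.7, 13.0, 17.4],
        "MEDV": [500000.0, 380000.0, 720000.0, 450000.0],
    }
)

# Split the three features from the target variable.
features = data.drop(columns="MEDV")
prices = data["MEDV"]

# Descriptive statistics of the target.
summary = prices.agg(["min", "max", "mean", "median", "std"])
print(summary)
```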