In [4]:
## Load House Prices Data
import pandas
import sklearn.model_selection

url = "https://goo.gl/sXleFv"
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO',
'B', 'LSTAT', 'MEDV']

# sep=r"\s+" splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
data = pandas.read_csv(url, sep=r"\s+", names=names)
array = data.values

print(data.shape)

X = array[:,0:13]
Y = array[:,13]
num_folds = 10
num_instances = len(X)
seed = 7
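# KFold takes n_splits in sklearn.model_selection; random_state is only accepted
# together with shuffle=True, so the folds are left unshuffled here.
kfold = sklearn.model_selection.KFold(n_splits=num_folds)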

URLError: <urlopen error [Errno 11001] getaddrinfo failed>

## Linear Regression

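Linear regression is often described as assuming Gaussian-distributed input variables; strictly, the classical assumption concerns the residuals (errors), which are taken to be normally distributed.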

It is also assumed that input variables are relevant to the output variable and that they are not highly
correlated with each other (a problem called collinearity).
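
Collinearity can be checked before fitting by looking at the pairwise correlations between input columns. The following is a minimal sketch on synthetic data; the arrays `x1`, `x2`, `x3` and the 0.9 threshold are assumptions chosen for illustration and are not part of the original notebook.

```python
import numpy as np

# Synthetic inputs (assumed for illustration): x2 is almost a copy of x1.
rng = np.random.default_rng(7)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.05, size=200)   # nearly collinear with x1
x3 = rng.normal(size=200)                    # independent of the others
X_demo = np.column_stack([x1, x2, x3])

# Pairwise Pearson correlations (rowvar=False: each column is a variable).
corr = np.corrcoef(X_demo, rowvar=False)

# Collect column pairs whose absolute correlation exceeds an arbitrary threshold.
threshold = 0.9
collinear_pairs = [
    (i, j)
    for i in range(corr.shape[0])
    for j in range(i + 1, corr.shape[1])
    if abs(corr[i, j]) > threshold
]
print(collinear_pairs)
```

On the real dataset, the same check could be run on `X`, with the threshold adjusted to taste.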

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html

In [3]:
from sklearn.linear_model import LinearRegression
model = LinearRegression()
scoring = 'neg_mean_squared_error'
results = sklearn.model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print(results.mean())

-34.7052559445
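
The score above is negative because scikit-learn's `neg_mean_squared_error` scorer reports the negated mean squared error, so that larger values are always better. The sketch below shows one way to convert such scores back to a positive MSE and an RMSE. It uses synthetic data from `make_regression` as a stand-in for the house price arrays, and the `_demo` names are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic data stands in for the house price arrays (assumed for illustration).
X_demo, y_demo = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=7)

kfold_demo = KFold(n_splits=10)
scores = cross_val_score(LinearRegression(), X_demo, y_demo,
                         cv=kfold_demo, scoring='neg_mean_squared_error')

mse_per_fold = -scores                # flip the sign to recover the usual MSE
rmse = np.sqrt(mse_per_fold.mean())   # same units as the target variable
print(mse_per_fold.mean(), rmse)
```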


## Ridge Regression

Ridge regression is an extension of linear regression in which the loss function is modified to
penalize the complexity of the model, measured as the sum of the squared coefficient
values (the squared L2-norm).
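
To illustrate the effect of the L2 penalty, here is a minimal sketch of the closed-form ridge solution in NumPy. It assumes a model without an intercept and uses synthetic data; the helper `ridge_coefficients` is a hypothetical name, not scikit-learn's implementation.

```python
import numpy as np

# Synthetic regression problem (assumed for illustration).
rng = np.random.default_rng(7)
X_demo = rng.normal(size=(50, 3))
true_coef = np.array([2.0, -1.0, 0.5])
y_demo = X_demo @ true_coef + rng.normal(scale=0.1, size=50)

def ridge_coefficients(X, y, alpha):
    """Minimize ||y - Xw||^2 + alpha * ||w||^2 (no intercept term)."""
    n_features = X.shape[1]
    # Closed-form solution of (X^T X + alpha * I) w = X^T y
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

ols_coef = ridge_coefficients(X_demo, y_demo, alpha=0.0)     # plain least squares
ridge_coef = ridge_coefficients(X_demo, y_demo, alpha=10.0)  # penalized fit

# The L2 penalty shrinks the coefficient vector toward zero.
print(np.sum(ols_coef ** 2), np.sum(ridge_coef ** 2))
```

Larger values of `alpha` shrink the coefficients further, trading some bias for lower variance.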

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html

In [4]:
from sklearn.linear_model import Ridge
model = Ridge(alpha=1.0)
scoring = 'neg_mean_squared_error'
results = sklearn.model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print(results.mean(), results.std())


-33.9093958127 45.7022257637


## LASSO Regression

The Least Absolute Shrinkage and Selection Operator (LASSO for short) is, like ridge regression,
a modification of linear regression in which the loss function is modified to penalize the
complexity of the model, here measured as the sum of the absolute coefficient values
(the L1-norm).
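
A practical consequence of the L1 penalty is that it can set coefficients to exactly zero. In the simplified case of orthonormal input columns, the LASSO solution amounts to soft-thresholding the least-squares coefficients. The sketch below shows that operation; `soft_threshold` and the coefficient values are hypothetical and chosen for illustration.

```python
import numpy as np

def soft_threshold(coef, penalty):
    """Shrink each coefficient toward zero by `penalty`, clipping at zero."""
    return np.sign(coef) * np.maximum(np.abs(coef) - penalty, 0.0)

ols_coef = np.array([3.0, -0.5, 1.0, -2.0])   # hypothetical least-squares coefficients
lasso_coef = soft_threshold(ols_coef, penalty=1.0)

print(lasso_coef)                    # coefficients within the penalty become zero
print(np.sum(np.abs(lasso_coef)))    # L1-norm of the shrunken coefficients
```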

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html

In [16]:
from sklearn.linear_model import Lasso
model = Lasso()
scoring = 'neg_mean_squared_error'
results = sklearn.model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print(results.mean())

-34.4640845883


## ElasticNet Regression

ElasticNet is a form of regularized regression that combines the properties of both ridge
regression and LASSO regression.

It seeks to minimize the complexity of the regression model (the magnitude and number of
regression coefficients) by penalizing the model using both the L2-norm (sum of squared
coefficient values) and the L1-norm (sum of absolute coefficient values).
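
The combined penalty can be written out directly. The sketch below follows the form of the penalty term in scikit-learn's documented ElasticNet objective, with the default values `alpha=1.0` and `l1_ratio=0.5`; the function name `elastic_net_penalty` and the coefficient vector are hypothetical.

```python
import numpy as np

def elastic_net_penalty(coef, alpha=1.0, l1_ratio=0.5):
    """Penalty term in the form used by scikit-learn's ElasticNet objective."""
    l1_term = np.sum(np.abs(coef))   # L1-norm
    l2_term = np.sum(coef ** 2)      # squared L2-norm
    return alpha * l1_ratio * l1_term + 0.5 * alpha * (1.0 - l1_ratio) * l2_term

coef = np.array([1.0, -2.0, 0.0])    # hypothetical coefficient vector
print(elastic_net_penalty(coef))                 # mix of both penalties
print(elastic_net_penalty(coef, l1_ratio=1.0))   # pure L1 (LASSO-style) penalty
print(elastic_net_penalty(coef, l1_ratio=0.0))   # pure L2 (ridge-style) penalty
```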

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html

In [18]:
from sklearn.linear_model import ElasticNet

model = ElasticNet()
scoring = 'neg_mean_squared_error'
results = sklearn.model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print(results.mean())


-31.1645737142


## K-Nearest Neighbors

K-Nearest Neighbors (or KNN) locates the K most similar instances in the training dataset for a
new data instance. 

The mean or median of the output variable across those K neighbors is taken as the
prediction. Of note is the distance metric used (the metric argument).

The Minkowski distance is used by default; it generalizes both the Euclidean distance
(used when all inputs have the same scale) and the Manhattan distance (for when the scales of the input variables differ). With scikit-learn's default setting of p=2, it reduces to the Euclidean distance.
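
The two ideas above, the Minkowski distance and averaging over the K nearest neighbors, can be sketched in a few lines of NumPy. This is an illustrative brute-force version, not scikit-learn's implementation; the function names and the tiny training set are assumptions.

```python
import numpy as np

def minkowski_distance(a, b, p=2):
    """Minkowski distance; p=1 gives Manhattan, p=2 gives Euclidean."""
    return np.sum(np.abs(a - b) ** p, axis=-1) ** (1.0 / p)

def knn_predict(X_train, y_train, x_new, k=3, p=2):
    """Predict by averaging the targets of the k nearest training rows."""
    distances = minkowski_distance(X_train, x_new, p=p)
    nearest = np.argsort(distances)[:k]
    return y_train[nearest].mean()

# Tiny made-up training set (assumed for illustration).
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [10.0, 10.0]])
y_train = np.array([1.0, 2.0, 3.0, 50.0])

a, b = np.array([0.0, 0.0]), np.array([3.0, 4.0])
print(minkowski_distance(a, b, p=1))   # Manhattan distance
print(minkowski_distance(a, b, p=2))   # Euclidean distance
print(knn_predict(X_train, y_train, np.array([1.0, 1.0]), k=3))
```

Because the distances are computed on raw feature values, inputs on very different scales can dominate the neighbor search, which is one reason rescaling is often tried before KNN.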

http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsRegressor.html

In [5]:
from sklearn.neighbors import KNeighborsRegressor

model = KNeighborsRegressor()
scoring = 'neg_mean_squared_error'
results = sklearn.model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print(results.mean())

-107.28683898


## Classification and Regression Trees

Decision trees, also known as Classification and Regression Trees (CART),
use the training data to select the best points at which to split the data in order to minimize a cost metric.

The default cost metric for regression decision trees is the mean squared error, specified in the criterion
parameter (named squared_error in recent scikit-learn releases).
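
The split search can be illustrated for a single input variable. The sketch below tries every midpoint between consecutive sorted values and keeps the one with the lowest total squared error; `best_split` is a hypothetical helper written for illustration, not scikit-learn's tree builder, and the data are made up.

```python
import numpy as np

def best_split(x, y):
    """Return (threshold, cost) minimizing the summed squared error of a split."""
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], y[order]
    best_threshold, best_cost = None, np.inf
    for i in range(1, len(x_sorted)):
        if x_sorted[i] == x_sorted[i - 1]:
            continue  # no threshold separates identical values
        left, right = y_sorted[:i], y_sorted[i:]
        # Each side predicts its own mean; the cost is the total squared error.
        cost = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if cost < best_cost:
            best_threshold = (x_sorted[i - 1] + x_sorted[i]) / 2.0
            best_cost = cost
    return best_threshold, best_cost

x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
threshold, cost = best_split(x, y)
print(threshold, cost)
```

A full tree repeats this search over every input variable and then recursively on each resulting partition.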

http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html
    

In [13]:
from sklearn.tree import DecisionTreeRegressor

model = DecisionTreeRegressor()
scoring = 'neg_mean_squared_error'
results = sklearn.model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print(results.mean())

-34.0101431373


## Support Vector Machines

Support Vector Machines (SVM) were developed for binary classification.

The technique has been extended to the prediction of real-valued outputs, where it is called Support Vector Regression (SVR).

Like the classification example, SVR is built upon the LIBSVM library.
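
One ingredient in adapting the SVM idea to real-valued targets is the epsilon-insensitive loss, which ignores errors smaller than a tolerance `epsilon` (scikit-learn's `SVR` exposes this as its `epsilon` parameter, 0.1 by default). The sketch below computes only that loss on made-up values; it is not a full SVR fit, and the function name is hypothetical.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Errors within epsilon cost nothing; larger errors grow linearly."""
    return np.maximum(np.abs(y_true - y_pred) - epsilon, 0.0)

y_true = np.array([1.0, 2.0, 3.0])     # made-up targets
y_pred = np.array([1.05, 2.5, 1.0])    # made-up predictions
losses = epsilon_insensitive_loss(y_true, y_pred)
print(losses)
```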

http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html

In [26]:
from sklearn.svm import SVR

model = SVR()
scoring = 'neg_mean_squared_error'
results = sklearn.model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print(results.mean())

-91.0478243332
