## Boston Housing Assignment

In this assignment you'll use linear regression to estimate house prices in Boston, using a well-known dataset.

Goals:
+  Measure the performance of the linear regression model using $R^{2}$ and MSE
> Learn how to use sklearn.metrics.r2_score and sklearn.metrics.mean_squared_error
+  Implement a new model using regularization
> Use sklearn.linear_model.Ridge (L2 penalty) or sklearn.linear_model.Lasso (L1 penalty)
+  Get the best model you can by optimizing the regularization parameter.   

In [51]:
from sklearn import datasets
import math
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression
from sklearn import linear_model

In [52]:
bean = datasets.load_boston()
print(bean.DESCR)

Boston House Prices dataset

Notes
------
Data Set Characteristics:  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive
    
    :Median Value (attribute 14) is usually the target

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
      

In [53]:
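def load_boston():
    """Return a standardized train/test split of the Boston data."""
    scaler = StandardScaler()
    boston = datasets.load_boston()
    X = boston.data      # 13 predictive features
    y = boston.target    # median home value
    # Note: the scaler is fit on every row before the split,
    # so the test rows contribute to the mean and standard deviation.
    X = scaler.fit_transform(X)
    # Default split: 75% train, 25% test, shuffled with no fixed seed.
    return train_test_split(X, y)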
    

In [54]:
X_train, X_test, y_train, y_test = load_boston()

In [55]:
X_train.shape

(379L, 13L)
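
The loader above standardizes all rows before splitting, so statistics from the test rows leak into the training features. A minimal sketch of one alternative is to split first and fit the scaler on the training rows only. It assumes synthetic data from `make_regression` as a stand-in for the Boston set (`load_boston` was removed in scikit-learn 1.2), and the `random_state` values are illustrative choices, not part of the original notebook.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Boston data (assumption): 506 rows, 13 features.
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)

# Split first so the test rows never influence the scaler.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # learn mean/std from training rows only
X_test = scaler.transform(X_test)        # reuse the training mean/std

print(X_train.shape, X_test.shape)
```

With the default 75/25 split, the training set keeps the same 379 rows reported above.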

### Fitting a Linear Regression

Fitting takes two steps: instantiate a new regression object (line 1), then give that object your training data
(line 2) by calling .fit(independent variables, dependent variable).



In [56]:

clf = LinearRegression()
clf.fit(X_train, y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)
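
To see what `.fit` actually learns, here is a small sketch on toy data (an assumption, not the Boston set) where the true relationship is known to be y = 2x + 1. After fitting, the learned slope and intercept are available as `coef_` and `intercept_`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data (assumption): y = 2*x + 1 exactly, with a single feature.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X[:, 0] + 1.0

model = LinearRegression()
model.fit(X, y)  # fit(independent variables, dependent variable)

# The fitted slope and intercept should recover 2 and 1.
print(model.coef_, model.intercept_)
# Predict for an unseen input x = 4.
print(model.predict(np.array([[4.0]])))
```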

### Making a Prediction
X_test is our holdout set of data. We know the answers (y_test), but the model has never seen them.

Using the command below, I create a tuple for each observation, pairing the real value (y_test) with
the value our regressor predicts (clf.predict(X_test)).

Use a similar format to get your r2 and mse metrics working. Use the [scikit learn api](http://scikit-learn.org/stable/modules/model_evaluation.html) if you need help!

In [57]:
list(zip (y_test, clf.predict(X_test)))

[(16.100000000000001, 18.910277590031171),
 (25.199999999999999, 26.985604922992216),
 (19.600000000000001, 17.389154965992525),
 (15.199999999999999, 11.219730392861166),
 (20.600000000000001, 19.376604505202863),
 (21.199999999999999, 21.445787198385478),
 (21.800000000000001, 19.81717055550093),
 (14.300000000000001, 17.008801136281114),
 (13.1, 14.031100243934723),
 (22.600000000000001, 25.633196873352286),
 (22.300000000000001, 27.343859074179036),
 (21.699999999999999, 23.270633160503312),
 (22.5, 18.144932714436916),
 (22.399999999999999, 23.478019959311709),
 (22.199999999999999, 24.499630435880444),
 (20.199999999999999, 15.163720753708816),
 (20.800000000000001, 17.734096760911296),
 (13.4, 13.434171228417179),
 (13.300000000000001, 21.324770644080338),
 (22.399999999999999, 23.442296323790508),
 (14.0, 14.510112577946909),
 (15.1, 16.888524200743966),
 (24.300000000000001, 28.229504359731077),
 (50.0, 21.27899333401885),
 (13.4, 15.722241691251019),
 (14.4, 4.798266417240103

### $R^{2}$

In [58]:
r2_score(y_test, clf.predict(X_test), sample_weight=None, multioutput=None)

0.68989593952709827

### Root Mean Squared Error

In [59]:
math.sqrt(mean_squared_error(y_test, clf.predict(X_test), sample_weight=None, multioutput='uniform_average'))

4.757584043002894
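
Both metrics are simple enough to write out by hand, which can help when checking what the sklearn calls return. The sketch below uses the standard definitions $R^{2} = 1 - SS_{res}/SS_{tot}$ and RMSE = $\sqrt{MSE}$. The helper names `r2_by_hand` and `rmse_by_hand` are hypothetical, and the inputs are the first five (actual, predicted) pairs from the linear regression output, rounded to two decimals.

```python
import math

def r2_by_hand(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse_by_hand(y_true, y_pred):
    """RMSE = square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

# First five (actual, predicted) pairs from the output, rounded.
y_true = [16.1, 25.2, 19.6, 15.2, 20.6]
y_pred = [18.91, 26.99, 17.39, 11.22, 19.38]

print(r2_by_hand(y_true, y_pred))
print(rmse_by_hand(y_true, y_pred))
```

Because this covers only five rows, the numbers will not match the full test-set scores. RMSE is in the same units as the target, which makes it easier to interpret than raw MSE.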

### SciKit Ridge Optimization

In [68]:
clf2 = linear_model.RidgeCV(alphas=[0.1, 10.0, 100.0])
clf2.fit(X_train, y_train)

RidgeCV(alphas=[0.1, 10.0, 100.0], cv=None, fit_intercept=True, gcv_mode=None,
    normalize=False, scoring=None, store_cv_values=False)
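
A three-value grid is coarse, and the fitted object does not print which value it picked. As a hedged sketch, `RidgeCV` stores the selected penalty in its `alpha_` attribute, and a denser log-spaced grid gives the cross-validation more candidates to choose from. The data here are a synthetic stand-in (an assumption), and the grid bounds are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption), not the Boston set.
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 30 candidate penalties spaced evenly on a log scale from 0.001 to 1000.
alphas = np.logspace(-3, 3, 30)
ridge = RidgeCV(alphas=alphas)
ridge.fit(X_train, y_train)

# alpha_ is the candidate with the best cross-validated score.
print("selected alpha:", ridge.alpha_)
print("test R^2:", ridge.score(X_test, y_test))
```

If the selected value sits at either end of the grid, that is usually a sign the grid should be extended in that direction.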

In [65]:
list(zip (y_test, clf2.predict(X_test)))

[(16.100000000000001, 18.882770384878537),
 (25.199999999999999, 26.658594425621786),
 (19.600000000000001, 17.663077479935449),
 (15.199999999999999, 12.357761719444506),
 (20.600000000000001, 19.154561613959519),
 (21.199999999999999, 21.638376388241205),
 (21.800000000000001, 20.048109676616932),
 (14.300000000000001, 17.046298761574846),
 (13.1, 14.339999176415022),
 (22.600000000000001, 25.416537397030304),
 (22.300000000000001, 27.014339771267348),
 (21.699999999999999, 23.251319651567606),
 (22.5, 18.046340487480911),
 (22.399999999999999, 23.465766954109252),
 (22.199999999999999, 24.433612625520567),
 (20.199999999999999, 15.461010587871277),
 (20.800000000000001, 17.39048296140194),
 (13.4, 13.309759955283951),
 (13.300000000000001, 21.192288834432681),
 (22.399999999999999, 23.580942572426778),
 (14.0, 14.884900724526915),
 (15.1, 16.729501077666335),
 (24.300000000000001, 27.770369362563979),
 (50.0, 20.559717204297677),
 (13.4, 16.363824281772416),
 (14.4, 5.22957676830353

### Ridge $R^{2}$

In [69]:
r2_score(y_test, clf2.predict(X_test), sample_weight=None, multioutput=None)

0.69125686470416747

### Ridge Root Mean Squared Error

In [70]:
math.sqrt(mean_squared_error(y_test, clf2.predict(X_test), sample_weight=None, multioutput='uniform_average'))

4.747132978165947

### Lasso

In [71]:
clf3 = linear_model.LassoCV(alphas=[0.1, 10.0, 100.0])
clf3.fit(X_train, y_train)

LassoCV(alphas=[0.1, 10.0, 100.0], copy_X=True, cv=None, eps=0.001,
    fit_intercept=True, max_iter=1000, n_alphas=100, n_jobs=1,
    normalize=False, positive=False, precompute='auto', random_state=None,
    selection='cyclic', tol=0.0001, verbose=False)
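
Unlike Ridge, the Lasso (L1) penalty can drive coefficients to exactly zero, so it is worth inspecting `coef_` as well as the selected `alpha_`. The sketch below assumes synthetic data in which only 5 of the 13 features carry signal. The grid, `cv=5`, and `random_state` are illustrative choices rather than settings from the original notebook.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Synthetic stand-in data (assumption): only 5 of the 13 features carry signal.
X, y = make_regression(n_samples=506, n_features=13, n_informative=5,
                       noise=10.0, random_state=0)

# Cross-validate over 30 log-spaced penalties from 0.001 to 100.
lasso = LassoCV(alphas=np.logspace(-3, 2, 30), cv=5, random_state=0)
lasso.fit(X, y)

print("selected alpha:", lasso.alpha_)
# Count coefficients that the L1 penalty set exactly to zero.
print("coefficients set to zero:", int(np.sum(lasso.coef_ == 0)), "of", lasso.coef_.size)
```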

In [72]:
list(zip (y_test, clf3.predict(X_test)))

[(16.100000000000001, 18.762106208400347),
 (25.199999999999999, 26.129530189301338),
 (19.600000000000001, 17.949981031817831),
 (15.199999999999999, 13.536328221025141),
 (20.600000000000001, 18.713813315909128),
 (21.199999999999999, 21.906902402864702),
 (21.800000000000001, 20.009803844045422),
 (14.300000000000001, 17.233160075207248),
 (13.1, 14.610221741856666),
 (22.600000000000001, 25.158834273533486),
 (22.300000000000001, 26.731674962914351),
 (21.699999999999999, 22.997156442607132),
 (22.5, 17.903431232151817),
 (22.399999999999999, 22.961120616242109),
 (22.199999999999999, 24.403142172616377),
 (20.199999999999999, 15.504655601598957),
 (20.800000000000001, 17.190968432589813),
 (13.4, 13.30364630688959),
 (13.300000000000001, 21.007387186316066),
 (22.399999999999999, 23.754742113706964),
 (14.0, 15.149293609282964),
 (15.1, 16.485334409739345),
 (24.300000000000001, 27.821404523970269),
 (50.0, 20.114318948843906),
 (13.4, 16.303878323786211),
 (14.4, 5.25413206315959

### Lasso $R^{2}$

In [73]:
r2_score(y_test, clf3.predict(X_test), sample_weight=None, multioutput=None)

0.69071997944216768

### Lasso Root Mean Squared Error

In [74]:
math.sqrt(mean_squared_error(y_test, clf3.predict(X_test), sample_weight=None, multioutput='uniform_average'))

4.751258671032764

### Conclusion

Ridge regression had only a negligible effect on the model's predictive accuracy: test $R^{2}$ moved from 0.690 to 0.691 and RMSE from 4.758 to 4.747. I adjusted the alpha values several times, but the overall change was very slight. The same goes for the Lasso model ($R^{2}$ of 0.691, RMSE of 4.751). Neither result differs meaningfully from the original, plain ol' linear regression model.