## Boston Housing Assignment

In this assignment you'll use linear regression to estimate the cost of houses in Boston, using a well-known dataset.

Goals:
+  Measure the performance of the model I created using $R^{2}$ and MSE
> Learn how to use sklearn.metrics.r2_score and sklearn.metrics.mean_squared_error
+  Implement a new model using regularization
> Use sklearn.linear_model.Ridge (L2 penalty) or sklearn.linear_model.Lasso (L1 penalty)
+  Get the best model you can by optimizing the regularization parameter.   
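
For reference, the two metrics and the Ridge and Lasso objectives mentioned in the goals can be written out as below. The notation is my own choice for this sketch: $y_i$ is an observed value, $\hat{y}_i$ the matching prediction, $\bar{y}$ the mean of the observed values, $w$ the coefficient vector, and $\alpha$ the regularization parameter. The intercept is left out for brevity.

$$
\begin{aligned}
\mathrm{MSE} &= \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \\
R^{2} &= 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \\
\text{Ridge:}\quad & \min_{w}\ \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2 \\
\text{Lasso:}\quad & \min_{w}\ \frac{1}{2n}\lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
\end{aligned}
$$

Setting $\alpha = 0$ in either objective recovers ordinary least squares, which is why very small values of $\alpha$ give results close to plain linear regression.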

In [1]:
from sklearn import datasets
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression

In [2]:
bean = datasets.load_boston()
print(bean.DESCR)

Boston House Prices dataset

Notes
------
Data Set Characteristics:  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive
    
    :Median Value (attribute 14) is usually the target

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
      

In [10]:
def load_boston():
    scaler = StandardScaler()
    boston = datasets.load_boston()
    X=boston.data
    y=boston.target
    X = scaler.fit_transform(X)
    return train_test_split(X,y)
    

In [11]:
X_train, X_test, y_train, y_test = load_boston()

In [12]:
X_train.shape

(379L, 13L)
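
One detail of the helper above is that the scaler is fitted on every row before the split, so the holdout rows influence the scaling statistics. A minimal sketch of the alternative (split first, then fit the scaler on the training rows only) might look like the following. It uses synthetic data from `make_regression` as a stand-in for the housing data, and `split_then_scale` and the `_demo` variable names are hypothetical, chosen so they do not overwrite the notebook's own variables.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with the same shape as the housing data (an assumption).
X_demo, y_demo = make_regression(n_samples=506, n_features=13,
                                 noise=10.0, random_state=0)


def split_then_scale(X, y, random_state=0):
    """Hypothetical helper: split first, then scale using training rows only."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=random_state)
    scaler = StandardScaler().fit(X_tr)  # statistics come from training rows
    return scaler.transform(X_tr), scaler.transform(X_te), y_tr, y_te


Xtr_demo, Xte_demo, ytr_demo, yte_demo = split_then_scale(X_demo, y_demo)
print(Xtr_demo.shape, Xte_demo.shape)
```

With this ordering the training columns have zero mean by construction, while the holdout columns are only approximately centered, which is what a model would see on genuinely new data.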

### Fitting a Linear Regression

Fitting takes two steps: instantiate a new regression object (line 1), then give that object your training data
(line 2) by calling .fit(independent variables, dependent variable).



In [13]:

clf = LinearRegression()
clf.fit(X_train, y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)
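
To see what `.fit` actually learns, here is a small sketch on toy data where the answer is known in advance. The data are made up for illustration (generated from y = 2x + 1), and the `toy` names are hypothetical so they do not clash with the notebook's variables.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data generated from y = 2*x + 1 (illustrative values only).
x_toy = np.array([[0.0], [1.0], [2.0], [3.0]])
y_toy = 2.0 * x_toy.ravel() + 1.0

toy_model = LinearRegression()
toy_model.fit(x_toy, y_toy)

# After fitting, the learned parameters live on the model object.
print(toy_model.coef_)       # one slope per feature, close to 2.0 here
print(toy_model.intercept_)  # close to 1.0 here
```

The same `coef_` and `intercept_` attributes are available on `clf` once it has been fitted on the housing data.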

### Making a Prediction
X_test is our holdout set of data.  We know the answer (y_test) but the computer does not.   

Using the command below, I create a tuple for each observation, where I'm combining the real value (y_test) with
the value our regressor predicts (clf.predict(X_test))

Use a similar format to get your r2 and mse metrics working.  Use the [scikit learn api](http://scikit-learn.org/stable/modules/model_evaluation.html) if you need help!

In [14]:
y_pred = clf.predict(X_test)
list(zip(y_test, y_pred))

[(28.199999999999999, 32.917755900274429),
 (24.5, 19.787853194159535),
 (7.2000000000000002, 9.6541248267801123),
 (13.1, 21.106936876906428),
 (22.399999999999999, 22.512797754951468),
 (22.899999999999999, 29.064677560983849),
 (27.5, 24.568950248712884),
 (14.1, 18.364685018889976),
 (23.600000000000001, 29.16128900523497),
 (20.800000000000001, 19.674596663657418),
 (34.899999999999999, 34.155713730092685),
 (27.5, 16.006615849961413),
 (18.5, 13.787947932429184),
 (12.6, 18.979370464387504),
 (18.899999999999999, 19.349877944501703),
 (25.0, 22.731658826904468),
 (39.799999999999997, 34.560723606969262),
 (13.300000000000001, 16.721385302871703),
 (16.399999999999999, 19.848047343063506),
 (33.0, 23.665813457877327),
 (30.800000000000001, 31.105362901778623),
 (29.800000000000001, 26.130714445807936),
 (20.600000000000001, 27.553298221190602),
 (23.899999999999999, 27.924759768988423),
 (28.100000000000001, 24.551634649747403),
 (7.2000000000000002, 18.36797222766813),
 (14.5, 18

In [15]:
r2Scoring = r2_score(y_test, y_pred)
r2Scoring

0.72608908619663992

In [16]:
mseResult = mean_squared_error(y_test, y_pred)
mseResult

22.06099795840581
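
As a sanity check on what these two numbers mean, both metrics can be computed by hand and compared with the scikit-learn functions. The sketch below uses a handful of made-up values (not taken from the housing data), and the `_demo` and `_manual` names are hypothetical.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Illustrative values only, not taken from the housing data.
y_true_demo = np.array([3.0, 5.0, 7.0, 9.0])
y_pred_demo = np.array([2.5, 5.0, 7.5, 10.0])

residuals = y_true_demo - y_pred_demo

# MSE: average of the squared residuals.
mse_manual = np.mean(residuals ** 2)

# R^2: 1 minus (residual sum of squares / total sum of squares).
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true_demo - y_true_demo.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print(mse_manual, mean_squared_error(y_true_demo, y_pred_demo))
print(r2_manual, r2_score(y_true_demo, y_pred_demo))
```

MSE is in squared units of the target, so it is easiest to compare between models fitted on the same data, whereas $R^{2}$ is unitless and equals 1 for perfect predictions.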

### Ridge Example

This section fits a Ridge model, which is linear regression with an L2 penalty on the coefficients.

In [37]:
from sklearn.linear_model import Ridge

ridgeModel = Ridge(alpha=0.001, copy_X=True, fit_intercept=True, max_iter=None, normalize=True, solver='auto', tol=0.0001)
ridgeModel.fit(X_train, y_train)

Ridge(alpha=0.001, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=True, solver='auto', tol=0.0001)

In [38]:
ridgePredict = ridgeModel.predict(X_test)
list(zip(y_test, ridgePredict))

[(28.199999999999999, 32.877214759545538),
 (24.5, 19.806361258484625),
 (7.2000000000000002, 9.665654625386134),
 (13.1, 21.099889430174759),
 (22.399999999999999, 22.517714993569502),
 (22.899999999999999, 29.031197165666399),
 (27.5, 24.582970279638122),
 (14.1, 18.362440245320474),
 (23.600000000000001, 29.149265519856677),
 (20.800000000000001, 19.657717605436584),
 (34.899999999999999, 34.162921164740332),
 (27.5, 15.963824571706887),
 (18.5, 13.798145632386404),
 (12.6, 18.978947075875762),
 (18.899999999999999, 19.358912536410859),
 (25.0, 22.734230044423001),
 (39.799999999999997, 34.547256668566121),
 (13.300000000000001, 16.709954255970544),
 (16.399999999999999, 19.848630820136982),
 (33.0, 23.684028007856011),
 (30.800000000000001, 31.085634103002121),
 (29.800000000000001, 26.120618027638884),
 (20.600000000000001, 27.515869784885439),
 (23.899999999999999, 27.932358909115248),
 (28.100000000000001, 24.549123372385584),
 (7.2000000000000002, 18.366191855897437),
 (14.5, 1

In [39]:
ridgeR2 = r2_score(y_test, ridgePredict)
ridgeR2

0.72640978203739137

In [40]:
ridgeMSE = mean_squared_error(y_test, ridgePredict)
ridgeMSE

22.035168866056587

With alpha=0.001, the Ridge model reaches an $R^{2}$ score of about 0.726 and an MSE of about 22.04, nearly identical to the LinearRegression model ($R^{2}$ of about 0.726, MSE of about 22.06). This is the best result I could reasonably get from Ridge by adjusting alpha.
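
Instead of trying values of alpha by hand, the search can be automated with cross-validation on the training data. The following is a minimal sketch, not the method used in this notebook: it runs on synthetic data from `make_regression` (an assumption, standing in for the housing data), and the alpha grid and the `_syn` variable names are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the housing data (an assumption).
X_syn, y_syn = make_regression(n_samples=506, n_features=13,
                               noise=15.0, random_state=0)
Xtr_syn, Xte_syn, ytr_syn, yte_syn = train_test_split(X_syn, y_syn,
                                                      random_state=0)

# Candidate alphas on a log scale from 0.001 to 1000 (arbitrary grid).
alphas = np.logspace(-3, 3, 13)

# 5-fold cross-validation on the training rows only; the holdout rows
# are not touched until the final evaluation.
search = GridSearchCV(Ridge(), {"alpha": alphas},
                      scoring="neg_mean_squared_error", cv=5)
search.fit(Xtr_syn, ytr_syn)

best_alpha = search.best_params_["alpha"]
holdout_r2 = r2_score(yte_syn, search.predict(Xte_syn))
print("best alpha:", best_alpha)
print("holdout R^2:", holdout_r2)
```

Choosing alpha by cross-validation rather than by the holdout score keeps the holdout set as an honest estimate of performance on unseen data.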

### Lasso Example

In [66]:
from sklearn.linear_model import Lasso

lassoModel = Lasso(alpha=0.0001, fit_intercept=True, normalize=True, precompute=False, copy_X=True, max_iter=55, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic')
lassoModel.fit(X_train, y_train)

Lasso(alpha=0.0001, copy_X=True, fit_intercept=True, max_iter=55,
   normalize=True, positive=False, precompute=False, random_state=None,
   selection='cyclic', tol=0.0001, warm_start=False)

In [67]:
lassoPredict = lassoModel.predict(X_test)
list(zip(y_test, lassoPredict))

[(28.199999999999999, 32.886631023123265),
 (24.5, 19.81183309709759),
 (7.2000000000000002, 9.6640726723796444),
 (13.1, 21.099525961021367),
 (22.399999999999999, 22.512340602941762),
 (22.899999999999999, 29.030564904265884),
 (27.5, 24.583185405832342),
 (14.1, 18.361509105554116),
 (23.600000000000001, 29.155830434467262),
 (20.800000000000001, 19.666966934154345),
 (34.899999999999999, 34.163897538077322),
 (27.5, 15.98389933190067),
 (18.5, 13.788001775708139),
 (12.6, 18.97987333757192),
 (18.899999999999999, 19.349347966372964),
 (25.0, 22.730618458454721),
 (39.799999999999997, 34.551526515949021),
 (13.300000000000001, 16.704810162058926),
 (16.399999999999999, 19.84920795559642),
 (33.0, 23.675918993496119),
 (30.800000000000001, 31.091727019142827),
 (29.800000000000001, 26.121639743142563),
 (20.600000000000001, 27.512336075145594),
 (23.899999999999999, 27.931001510141758),
 (28.100000000000001, 24.552754589129833),
 (7.2000000000000002, 18.364618858142144),
 (14.5, 18.7

In [68]:
lassoR2 = r2_score(y_test, lassoPredict)
lassoR2

0.72643506326062379

In [69]:
lassoMSE = mean_squared_error(y_test, lassoPredict)
lassoMSE

22.033132696681761

With alpha=0.0001 and max_iter=55, the Lasso model reaches an $R^{2}$ score of about 0.726 and an MSE of about 22.03, which appears to be as good as I can get with Lasso.
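
One practical difference between the two penalties is that Lasso's L1 penalty can set some coefficients exactly to zero, while Ridge only shrinks them. The sketch below illustrates this with `LassoCV`, which picks alpha by cross-validation. It is a hedged example on synthetic data (an assumption: 13 features, of which only 5 carry signal), not a rerun of the housing experiment, and the `_l1` names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

# Synthetic data where only 5 of the 13 features are informative (an assumption).
X_l1, y_l1 = make_regression(n_samples=506, n_features=13, n_informative=5,
                             noise=15.0, random_state=0)
X_l1 = StandardScaler().fit_transform(X_l1)

# LassoCV builds its own grid of alphas and selects one by 5-fold CV.
lasso_cv = LassoCV(cv=5, random_state=0)
lasso_cv.fit(X_l1, y_l1)

# Count the coefficients that the L1 penalty drove exactly to zero.
n_zero = int(np.sum(lasso_cv.coef_ == 0.0))
print("chosen alpha:", lasso_cv.alpha_)
print("coefficients set exactly to zero:", n_zero, "of", lasso_cv.coef_.size)
```

With a very small alpha such as the 0.0001 used above, the penalty is weak and the fit stays close to plain linear regression, which is consistent with the three models scoring almost the same.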