## Boston Housing Assignment

In this assignment you'll use linear regression to estimate the cost of houses in Boston, using a well-known dataset.

Goals:
+  Measure the performance of the model I created using $R^{2}$ and MSE
> Learn how to use sklearn.metrics.r2_score and sklearn.metrics.mean_squared_error
+  Implement a new model using regularization (L2 with Ridge, or L1 with Lasso)
> Use sklearn.linear_model.Ridge or sklearn.linear_model.Lasso 
+  Get the best model you can by optimizing the regularization parameter.   

In [44]:
from sklearn import datasets
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression

In [45]:
bean = datasets.load_boston()
print(bean.DESCR)

Boston House Prices dataset

Notes
------
Data Set Characteristics:  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive
    
    :Median Value (attribute 14) is usually the target

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
      

In [114]:
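def load_boston():
    """Load the Boston data, standardize the features, and return a train/test split."""
    scaler = StandardScaler()
    boston = datasets.load_boston()
    X = boston.data
    y = boston.target
    # Note: the scaler is fit on all rows before splitting, so test rows affect the scaling.
    X = scaler.fit_transform(X)
    # Default split: 75% train, 25% test, shuffled with no fixed random_state.
    return train_test_split(X, y)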
    

In [115]:
X_train, X_test, y_train, y_test = load_boston()

In [116]:
X_train.shape

(379L, 13L)
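
The shape above reflects the default behaviour of `train_test_split`, which holds out 25% of the 506 rows (379 for training, 127 for testing). Note that `load_boston()` above fits the `StandardScaler` on all rows before splitting, so the test rows influence the scaling statistics. Below is a minimal sketch of the alternative: split first, then compute the mean and standard deviation from the training rows only. It uses NumPy with synthetic data as a stand-in for the Boston features, and the helper names `split_rows` and `standardize_with_train_stats` are hypothetical, not part of scikit-learn.

```python
import numpy as np

def split_rows(X, y, test_fraction=0.25, seed=0):
    """Shuffle row indices and split them into train and test parts."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(len(X))
    n_test = int(np.ceil(test_fraction * len(X)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

def standardize_with_train_stats(X_train, X_test):
    """Scale both sets using the mean and std of the training rows only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0.0] = 1.0  # avoid dividing by zero for constant columns
    return (X_train - mean) / std, (X_test - mean) / std

# Synthetic stand-in for the 506 x 13 feature matrix (not the real Boston data).
rng = np.random.RandomState(42)
X_demo = rng.normal(loc=5.0, scale=3.0, size=(506, 13))
y_demo = rng.normal(size=506)

X_tr, X_te, y_tr, y_te = split_rows(X_demo, y_demo)
X_tr_s, X_te_s = standardize_with_train_stats(X_tr, X_te)
print(X_tr_s.shape, X_te_s.shape)
```

With scikit-learn itself, the same idea would be calling `scaler.fit_transform` on the training split and `scaler.transform` on the test split.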

### Fitting a Linear Regression

Fitting takes two steps: instantiate a new regression object (line 1), then give it your training data (line 2)
by calling .fit(independent variables, dependent variable).



In [117]:

clf = LinearRegression()
clf.fit(X_train, y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)
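
As a rough illustration of what `.fit` computes, ordinary least squares picks the intercept and coefficients that minimize the sum of squared residuals. The sketch below does this with `numpy.linalg.lstsq` on synthetic data with known parameters. It is not scikit-learn's actual implementation, and `fit_ols` is a hypothetical helper name.

```python
import numpy as np

def fit_ols(X, y):
    """Return (intercept, coefficients) minimizing the sum of squared residuals."""
    # Prepend a column of ones so the first solved weight acts as the intercept.
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
    weights, _, _, _ = np.linalg.lstsq(X_aug, y, rcond=None)
    return weights[0], weights[1:]

# Synthetic data with known parameters: y = 2 + 3*x0 - 1*x1 (no noise).
rng = np.random.RandomState(0)
X_ols = rng.normal(size=(50, 2))
y_ols = 2.0 + 3.0 * X_ols[:, 0] - 1.0 * X_ols[:, 1]

intercept, coef = fit_ols(X_ols, y_ols)
print(intercept)
print(coef)
```

On a fitted `LinearRegression`, the corresponding values are available as `clf.intercept_` and `clf.coef_`.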

### Making a Prediction
X_test is our holdout set of data.  We know the answer (y_test), but the model has not seen it.

Using the command below, I create a tuple for each observation, pairing the real value (y_test) with
the value our regressor predicts (clf.predict(X_test)).

Use a similar format to get your r2 and mse metrics working.  Use the [scikit learn api](http://scikit-learn.org/stable/modules/model_evaluation.html) if you need help!

In [118]:
y_pred = clf.predict(X_test)
list(zip(y_test, y_pred))

[(32.399999999999999, 35.940542335081886),
 (18.199999999999999, 18.250053923816921),
 (20.899999999999999, 21.73138921242716),
 (23.399999999999999, 24.472650552065573),
 (8.0999999999999996, 4.1748793583456205),
 (30.800000000000001, 31.072860573130917),
 (7.2000000000000002, 10.037021581435184),
 (18.5, 19.222628947432217),
 (16.800000000000001, 20.479420151914162),
 (20.899999999999999, 20.852062187125519),
 (27.5, 19.807868916445486),
 (14.1, 17.500339856200334),
 (24.600000000000001, 28.810319102230022),
 (36.100000000000001, 33.184434036595547),
 (29.899999999999999, 31.336860301531175),
 (29.600000000000001, 24.569564339874336),
 (19.5, 20.175009331591593),
 (17.800000000000001, 22.971730569139918),
 (12.1, 18.29284315860264),
 (19.199999999999999, 20.200882410177858),
 (30.5, 30.472822525987645),
 (33.299999999999997, 36.268530950375634),
 (19.100000000000001, 24.087235420036357),
 (31.199999999999999, 28.839869219285013),
 (24.5, 27.513722966826862),
 (17.800000000000001, 17.

In [119]:
r2Scoring = r2_score(y_test, y_pred)
r2Scoring

0.73182616696103064

In [120]:
mseResult = mean_squared_error(y_test, y_pred)
mseResult

21.147261695013441
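
To make the two metrics concrete, here is a small sketch that computes them from their usual definitions: MSE is the mean of the squared errors, and $R^{2}$ is one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the true values. The function names and the three-point example are made up for illustration; in the notebook itself, `mean_squared_error` and `r2_score` do this work.

```python
def mean_squared_error_manual(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / float(n)

def r2_score_manual(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares around the mean)."""
    mean_true = sum(y_true) / float(len(y_true))
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Tiny hand-checkable example (made-up numbers, not the Boston data).
y_true_demo = [1.0, 2.0, 3.0]
y_pred_demo = [1.0, 2.0, 4.0]
mse_demo = mean_squared_error_manual(y_true_demo, y_pred_demo)  # 1/3
r2_demo = r2_score_manual(y_true_demo, y_pred_demo)             # 0.5
print(mse_demo)
print(r2_demo)
```

Taking the square root of the MSE reported above (about 21.15) gives roughly 4.6, which is in the same units as the target and is often easier to interpret.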

### Ridge Example

The following uses the Ridge linear model, which is linear regression with an added L2 penalty on the coefficients.

In [213]:
from sklearn.linear_model import Ridge

ridgeModel = Ridge(alpha=0.000001, copy_X=True, fit_intercept=True, max_iter=None, normalize=True, solver='auto', tol=0.0001)
ridgeModel.fit(X_train, y_train)

Ridge(alpha=1e-06, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=True, solver='auto', tol=0.0001)
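
For context, the scikit-learn documentation describes `Ridge` as minimizing the least-squares loss plus an L2 penalty scaled by `alpha`. The symbols $X$, $y$, and $w$ (feature matrix, targets, and coefficient vector) are labels chosen here for illustration:

$$
\min_{w} \; \lVert y - Xw \rVert_2^2 + \alpha \, \lVert w \rVert_2^2
$$

With `alpha=0.000001` the penalty term is tiny compared with the squared-error term, so the fitted coefficients should be very close to those of plain `LinearRegression`. Larger values of `alpha` shrink the coefficients toward zero.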

In [214]:
ridgePredict = ridgeModel.predict(X_test)
list(zip(y_test, ridgePredict))

[(32.399999999999999, 35.940525216252745),
 (18.199999999999999, 18.250060004172763),
 (20.899999999999999, 21.731403435723376),
 (23.399999999999999, 24.472649541588005),
 (8.0999999999999996, 4.1749342370622244),
 (30.800000000000001, 31.07284349200447),
 (7.2000000000000002, 10.037030175829827),
 (18.5, 19.22262746830431),
 (16.800000000000001, 20.479430123582823),
 (20.899999999999999, 20.852075425269298),
 (27.5, 19.807860396955384),
 (14.1, 17.500339089920708),
 (24.600000000000001, 28.810313632605709),
 (36.100000000000001, 33.184410068535982),
 (29.899999999999999, 31.336855302788035),
 (29.600000000000001, 24.56955389597001),
 (19.5, 20.175006163480624),
 (17.800000000000001, 22.971734012652746),
 (12.1, 18.292840041576021),
 (19.199999999999999, 20.200891961919591),
 (30.5, 30.47283538097961),
 (33.299999999999997, 36.268534699320377),
 (19.100000000000001, 24.087229114110855),
 (31.199999999999999, 28.839881361434859),
 (24.5, 27.513722368861195),
 (17.800000000000001, 17.69

In [215]:
ridgeR2 = r2_score(y_test, ridgePredict)
ridgeR2

0.73182588051663799

In [216]:
ridgeMSE = mean_squared_error(y_test, ridgePredict)
ridgeMSE

21.147284283028327

With an alpha of 0.000001, the Ridge model reaches an $R^{2}$ of about 0.7318 and an MSE of about 21.147. That appears to be as close as I can tune the Ridge example to the LinearRegression model, whose scores are the same to four significant figures.
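
One way to go about optimizing the regularization parameter is to try a grid of `alpha` values and compare them on data that was not used for fitting. The sketch below does this with a closed-form ridge solution in NumPy on synthetic data. It is an illustration under stated assumptions, not scikit-learn's solver: `fit_ridge` and `mse` are hypothetical helpers, the data is random, and a single validation split stands in for proper cross-validation.

```python
import numpy as np

def fit_ridge(X, y, alpha):
    """Closed-form ridge fit; centering leaves the intercept unpenalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    n_features = X.shape[1]
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the coefficients.
    coef = np.linalg.solve(Xc.T.dot(Xc) + alpha * np.eye(n_features), Xc.T.dot(yc))
    intercept = y_mean - x_mean.dot(coef)
    return intercept, coef

def mse(y_true, y_pred):
    """Mean squared error between two arrays."""
    return float(np.mean((y_true - y_pred) ** 2))

# Synthetic regression problem (stand-in for the Boston data): few rows, noisy target.
rng = np.random.RandomState(1)
X_all = rng.normal(size=(60, 13))
true_coef = rng.normal(size=13)
y_all = X_all.dot(true_coef) + rng.normal(scale=5.0, size=60)

# Hold out a validation set that plays no part in fitting.
X_fit, y_fit = X_all[:40], y_all[:40]
X_val, y_val = X_all[40:], y_all[40:]

alphas = [1e-6, 1e-3, 1e-1, 1.0, 10.0, 100.0, 1000.0]
results = []
for alpha in alphas:
    intercept, coef = fit_ridge(X_fit, y_fit, alpha)
    val_mse = mse(y_val, intercept + X_val.dot(coef))
    results.append((alpha, val_mse, float(np.linalg.norm(coef))))
    print("alpha=%g  validation MSE=%.3f  ||coef||=%.3f" % results[-1])

# Pick the alpha with the lowest validation error.
best_alpha = min(results, key=lambda r: r[1])[0]
print("best alpha on this split: %g" % best_alpha)
```

Choosing `alpha` by its score on the test set would make the reported test metrics optimistic, which is why the sketch uses a separate validation split. scikit-learn also provides `RidgeCV`, which runs a similar search using cross-validation.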

### Lasso Example

In [271]:
from sklearn.linear_model import Lasso

lassoModel = Lasso(alpha=0.00001, fit_intercept=True, normalize=True, precompute=False, copy_X=True, max_iter=53, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic')
lassoModel.fit(X_train, y_train)

Lasso(alpha=1e-05, copy_X=True, fit_intercept=True, max_iter=53,
   normalize=True, positive=False, precompute=False, random_state=None,
   selection='cyclic', tol=0.0001, warm_start=False)

In [272]:
lassoPredict = lassoModel.predict(X_test)
list(zip(y_test, lassoPredict))

[(32.399999999999999, 35.938149656795012),
 (18.199999999999999, 18.251312203049764),
 (20.899999999999999, 21.732265063379508),
 (23.399999999999999, 24.471753367602911),
 (8.0999999999999996, 4.1811973663551463),
 (30.800000000000001, 31.070575794249507),
 (7.2000000000000002, 10.037891698813006),
 (18.5, 19.222327919738269),
 (16.800000000000001, 20.48125401290288),
 (20.899999999999999, 20.853665558398792),
 (27.5, 19.806935924375384),
 (14.1, 17.49995180106966),
 (24.600000000000001, 28.810832771046933),
 (36.100000000000001, 33.181414518506045),
 (29.899999999999999, 31.337404858868545),
 (29.600000000000001, 24.568728572576067),
 (19.5, 20.174469081948999),
 (17.800000000000001, 22.971810491054043),
 (12.1, 18.29211336723063),
 (19.199999999999999, 20.201986902582838),
 (30.5, 30.47481684717744),
 (33.299999999999997, 36.269049502245139),
 (19.100000000000001, 24.086414932504166),
 (31.199999999999999, 28.841569772059458),
 (24.5, 27.513034740535296),
 (17.800000000000001, 17.69

In [273]:
lassoR2 = r2_score(y_test, lassoPredict)
lassoR2

0.73179273262177369

In [274]:
lassoMSE = mean_squared_error(y_test, lassoPredict)
lassoMSE

21.149898211461945

With an alpha of 0.00001 and 53 iterations, the Lasso model reaches an $R^{2}$ of about 0.7318 and an MSE of about 21.150, which appears to be as good as I can get for the Lasso example.
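
The `max_iter` and `selection='cyclic'` arguments used above relate to how scikit-learn fits `Lasso`: by coordinate descent, updating one coefficient at a time while cycling through the features. Its documentation gives the objective as $\frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1$. The sketch below is a simplified NumPy version of that idea on synthetic, pre-centered data. It is not the library's implementation, and `soft_threshold` and `lasso_coordinate_descent` are hypothetical helper names.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clamping at zero."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha, max_iter=100, tol=1e-8):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic updates.

    Assumes X and y are already centered, so no intercept is fitted.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    col_sq = (X ** 2).sum(axis=0) / n_samples
    for _ in range(max_iter):
        largest_change = 0.0
        for j in range(n_features):
            if col_sq[j] == 0.0:
                continue  # all-zero column after centering: nothing to fit
            # Residual with feature j's current contribution removed.
            partial_residual = y - X.dot(w) + X[:, j] * w[j]
            rho = X[:, j].dot(partial_residual) / n_samples
            new_wj = soft_threshold(rho, alpha) / col_sq[j]
            largest_change = max(largest_change, abs(new_wj - w[j]))
            w[j] = new_wj
        if largest_change < tol:
            break  # a full pass changed no coefficient noticeably
    return w

# Synthetic centered data: only the first two features affect the target.
rng = np.random.RandomState(3)
X_cd = rng.normal(size=(200, 5))
X_cd -= X_cd.mean(axis=0)
y_cd = 4.0 * X_cd[:, 0] - 2.0 * X_cd[:, 1] + rng.normal(scale=0.5, size=200)
y_cd -= y_cd.mean()

w_small = lasso_coordinate_descent(X_cd, y_cd, alpha=1e-5)  # close to least squares
w_large = lasso_coordinate_descent(X_cd, y_cd, alpha=1.0)   # heavily shrunk
print(w_small)
print(w_large)
```

With a very small `alpha` such as the 0.00001 used above, the L1 penalty barely changes the least-squares solution, which is consistent with the Lasso scores being so close to the LinearRegression ones. A larger `alpha` shrinks the coefficients and can set some of them exactly to zero.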