## Boston Housing Assignment

In this assignment you'll use linear regression to estimate house prices in Boston, using a well-known dataset.

Goals:
+  Measure the performance of the model I created using $R^{2}$ and MSE
> Learn how to use sklearn.metrics.r2_score and sklearn.metrics.mean_squared_error
+  Implement a new model using regularization
> Use sklearn.linear_model.Ridge (L2 penalty) or sklearn.linear_model.Lasso (L1 penalty)
+  Get the best model you can by optimizing the regularization parameter.

In [1]:
from sklearn import datasets
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression

In [2]:
bean = datasets.load_boston()
print bean.DESCR

Boston House Prices dataset

Notes
------
Data Set Characteristics:  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive
    
    :Median Value (attribute 14) is usually the target

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
      

In [3]:
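def load_boston():
    """Load the Boston data, standardize the features, and return a train/test split."""
    scaler = StandardScaler()
    boston = datasets.load_boston()
    X = boston.data
    y = boston.target
    # Standardize each feature to zero mean and unit variance.
    # Note: the scaler is fit on all rows before splitting, so test-set
    # statistics influence the scaling of the training data.
    X = scaler.fit_transform(X)
    # Default split: 75% train, 25% test; no random_state, so it changes per run.
    return train_test_split(X, y)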
    

In [4]:
X_train, X_test, y_train, y_test = load_boston()

In [5]:
X_train.shape

(379, 13)
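
The 379 rows follow from `train_test_split`'s default of holding out 25% of the 506 observations. Below is a minimal sketch of a stricter variant that splits first and fits the scaler on the training rows only, so no test-set statistics reach the model. It uses random stand-in data with the same shape as the Boston features (an assumption, so the block runs without the dataset), imports from `sklearn.model_selection` (scikit-learn 0.18 or later), and fixes `random_state=0`, an arbitrary choice that makes the split repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Stand-in data with the same shape as the Boston features (506 rows, 13 columns).
rng = np.random.RandomState(0)
X = rng.normal(loc=5.0, scale=2.0, size=(506, 13))
y = rng.normal(size=506)

# Split first; the default test_size holds out 25% of the rows.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the scaler on the training rows only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape)
print(X_test_scaled.shape)
```

With this ordering the test features are scaled using the training mean and variance, which mirrors how the model would treat genuinely unseen data.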

### Fitting a Linear Regression

Fitting takes two steps: instantiate a new regression object (line 1), then give it your training data
(line 2) by calling .fit(independent variables, dependent variable).



In [6]:

clf = LinearRegression()
clf.fit(X_train, y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)
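
To make concrete what `.fit` estimates, here is a minimal NumPy sketch of ordinary least squares on a tiny made-up dataset (the numbers are illustrative, not taken from the Boston data). It solves the same least-squares problem that `LinearRegression` solves with `fit_intercept=True`, using `np.linalg.lstsq` with an appended column of ones for the intercept.

```python
import numpy as np

# Made-up data that follows y = 2*x + 1 exactly.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Design matrix: one column for the feature, one column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# The least-squares solution minimizes the sum of squared residuals ||A w - y||^2.
w, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = w

print("slope = %.3f, intercept = %.3f" % (slope, intercept))
```

After fitting, the scikit-learn object stores the analogous quantities in `clf.coef_` and `clf.intercept_`.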

### Making a Prediction
X_test is our holdout set of data.  We know the answer (y_test) but the computer does not.   

Using the command below, I create a tuple for each observation, where I'm combining the real value (y_test) with
the value our regressor predicts (clf.predict(X_test))

Use a similar format to get your r2 and mse metrics working. Use the [scikit-learn API](http://scikit-learn.org/stable/modules/model_evaluation.html) if you need help!

In [7]:
zip(y_test, clf.predict(X_test))

[(19.600000000000001, 21.061412249239297),
 (19.300000000000001, 20.726855850100907),
 (50.0, 40.399399352927702),
 (14.9, 18.308684184280906),
 (11.9, 22.970927466257081),
 (17.100000000000001, 20.087864716465059),
 (35.200000000000003, 35.17896716749928),
 (22.600000000000001, 22.836458259050648),
 (22.399999999999999, 22.511163546333901),
 (14.4, 7.8758383636102938),
 (21.0, 20.955462407655908),
 (32.700000000000003, 29.80048827021756),
 (22.699999999999999, 24.917714403063552),
 (20.199999999999999, 16.384836409884585),
 (16.800000000000001, 20.77692252451331),
 (23.699999999999999, 28.220834178442061),
 (19.399999999999999, 16.655858155181033),
 (14.6, 8.2327635775113617),
 (22.699999999999999, 23.24378278565927),
 (23.199999999999999, 23.266672693657046),
 (24.800000000000001, 26.168281832561291),
 (30.100000000000001, 29.709822868204149),
 (13.800000000000001, 6.5499344184981361),
 (33.200000000000003, 32.125852120555507),
 (16.699999999999999, 20.397549052203985),
 (21.69999999
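
The two metrics asked for above can be written out by hand to see what the scikit-learn functions compute. Below is a minimal pure-Python sketch using the standard definitions: MSE is the mean of the squared errors, and $R^{2} = 1 - SS_{res}/SS_{tot}$. The helper names `mse` and `r2` are illustrative, and the short lists are the first five pairs from the output above, rounded.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / float(len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / float(len(y_true))
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# First five (actual, predicted) pairs from the output above, rounded.
actual = [19.6, 19.3, 50.0, 14.9, 11.9]
predicted = [21.06, 20.73, 40.40, 18.31, 22.97]

print("MSE = %.3f" % mse(actual, predicted))
print("R2  = %.3f" % r2(actual, predicted))
```

Scores on five points will not match the full test-set scores; `mean_squared_error(y_true, y_pred)` and `r2_score(y_true, y_pred)` apply the same formulas to every row of the holdout set.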

### Homework
1. Implement scikit-learn's r2 and mse methods to measure the performance of my linear regressor.

2. Implement either sklearn.linear_model.Ridge or sklearn.linear_model.Lasso.

3. Optimize the regression model you pick by reviewing the r2 and mse scores and adjusting the regularization parameter.

Use http://scikit-learn.org/stable/modules/linear_model.html for reference.

In [8]:
import math
from sklearn.linear_model import Ridge
from sklearn.linear_model import RidgeCV
from sklearn.linear_model import Lasso
from sklearn.linear_model import LassoCV

In [9]:
print 'OLS r2: ', r2_score(y_test, clf.predict(X_test))
print 'OLS MSE: ', mean_squared_error(y_test, clf.predict(X_test))
print 'OLS RMSE: ', math.sqrt(mean_squared_error(y_test, clf.predict(X_test)))

OLS r2:  0.78370712426
OLS MSE:  18.9306460595
OLS RMSE:  4.35093622793


In [10]:
clf_ridge = Ridge(alpha=.5)
clf_ridge.fit(X_train, y_train)

Ridge(alpha=0.5, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, solver='auto', tol=0.001)

In [11]:
print 'Ridge r2: ', r2_score(y_test, clf_ridge.predict(X_test))
print 'Ridge MSE: ', mean_squared_error(y_test, clf_ridge.predict(X_test))
print 'Ridge RMSE: ', math.sqrt(mean_squared_error(y_test, clf_ridge.predict(X_test)))

Ridge r2:  0.783760916268
Ridge MSE:  18.9259380105
Ridge RMSE:  4.35039515567
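
To see what the `alpha` parameter does, here is a minimal NumPy sketch of the textbook ridge solution $w = (X^{T}X + \alpha I)^{-1}X^{T}y$ on synthetic, centered data. The data, the helper name `ridge_weights`, and the alpha values are assumptions for illustration, and this is not how scikit-learn's solvers are implemented. Larger `alpha` shrinks the coefficient vector toward zero.

```python
import numpy as np


def ridge_weights(X, y, alpha):
    """Closed-form ridge solution for centered X and y (no intercept term)."""
    n_features = X.shape[1]
    # Solve (X^T X + alpha * I) w = X^T y rather than forming an explicit inverse.
    return np.linalg.solve(X.T.dot(X) + alpha * np.eye(n_features), X.T.dot(y))


rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5))
true_w = np.array([3.0, -2.0, 0.5, 0.0, 1.0])
y = X.dot(true_w) + rng.normal(scale=0.5, size=100)

# Center the data so the intercept can be left out of the formula.
X = X - X.mean(axis=0)
y = y - y.mean()

for alpha in [0.0, 0.5, 10.0, 1000.0]:
    w = ridge_weights(X, y, alpha)
    print("alpha = %7.1f  ||w|| = %.4f" % (alpha, np.linalg.norm(w)))
```

With roughly 379 standardized training rows, the diagonal of $X^{T}X$ is on the order of several hundred, so `alpha=0.5` changes the solution very little. That is consistent with the Ridge scores above being nearly identical to the OLS scores.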


In [12]:
# Candidate alphas must be positive; a negative penalty would reward large coefficients.
clf_ridge_cv = RidgeCV(alphas=[.1, .5, .7], store_cv_values=True)
clf_ridge_cv.fit(X_train, y_train)

RidgeCV(alphas=[0.1, 0.5, 0.7], cv=None, fit_intercept=True,
    gcv_mode=None, normalize=False, scoring=None, store_cv_values=True)

In [13]:
clf_ridge_cv.alpha_

0.69999999999999996
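
The selected value, 0.7, is the largest candidate in the grid, which often means the best alpha lies outside the range tried. Below is a minimal sketch of searching a wider, log-spaced grid with `RidgeCV`. The synthetic data from `make_regression` and the grid bounds (1e-3 to 1e3) are assumptions chosen for illustration, so the alpha printed here says nothing about the Boston data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV

# Synthetic stand-in for the training data.
X_train, y_train = make_regression(n_samples=379, n_features=13, noise=10.0,
                                   random_state=0)

# Candidate alphas spread over several orders of magnitude (all positive).
alphas = np.logspace(-3, 3, 13)

ridge_cv = RidgeCV(alphas=alphas)
ridge_cv.fit(X_train, y_train)

print("selected alpha = %g" % ridge_cv.alpha_)
# If the selected alpha sits at either end of the grid, widen the grid and refit.
at_edge = ridge_cv.alpha_ in (alphas[0], alphas[-1])
print("at edge of grid: %s" % at_edge)
```

A log-spaced grid is a common starting point because the useful range of `alpha` depends on the scale of the data and can span several orders of magnitude.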

In [14]:
clf_ridge7 = Ridge(alpha=.7)
clf_ridge7.fit(X_train, y_train)

Ridge(alpha=0.7, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, solver='auto', tol=0.001)

In [15]:
print 'Ridge r2: ', r2_score(y_test, clf_ridge7.predict(X_test))
print 'Ridge MSE: ', mean_squared_error(y_test, clf_ridge7.predict(X_test))
print 'Ridge RMSE: ', math.sqrt(mean_squared_error(y_test, clf_ridge7.predict(X_test)))

Ridge r2:  0.783781374586
Ridge MSE:  18.924147433
Ridge RMSE:  4.35018935599


In [16]:
clf_lasso = Lasso(alpha=0.1)
clf_lasso.fit(X_train, y_train)

Lasso(alpha=0.1, copy_X=True, fit_intercept=True, max_iter=1000,
   normalize=False, positive=False, precompute=False, random_state=None,
   selection='cyclic', tol=0.0001, warm_start=False)

In [17]:
print 'Lasso r2: ', r2_score(y_test, clf_lasso.predict(X_test))
print 'Lasso MSE: ', mean_squared_error(y_test, clf_lasso.predict(X_test))
print 'Lasso RMSE: ', math.sqrt(mean_squared_error(y_test, clf_lasso.predict(X_test)))

Lasso r2:  0.777886882841
Lasso MSE:  19.440052252
Lasso RMSE:  4.4090874625
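
One practical difference between the two penalties: the L1 penalty in Lasso can set coefficients exactly to zero, while the L2 penalty in Ridge only shrinks them. Below is a minimal sketch on synthetic data in which only the first two of ten features affect the target (the data and the alpha values are assumptions for illustration).

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 10))
# Only the first two features carry signal; the other eight are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

# Count coefficients that are exactly zero under each penalty.
n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0))

print("Lasso zero coefficients: %d of %d" % (n_zero_lasso, X.shape[1]))
print("Ridge zero coefficients: %d of %d" % (n_zero_ridge, X.shape[1]))
```

Inspecting `clf_lasso.coef_` in the same way shows which Boston features, if any, the model dropped at `alpha=0.1`.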


In [18]:
clf_lasso_cv = LassoCV()
clf_lasso_cv.fit(X_train, y_train)

LassoCV(alphas=None, copy_X=True, cv=None, eps=0.001, fit_intercept=True,
    max_iter=1000, n_alphas=100, n_jobs=1, normalize=False, positive=False,
    precompute='auto', random_state=None, selection='cyclic', tol=0.0001,
    verbose=False)

In [19]:
clf_lasso_cv.alpha_

0.014415398783530188
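
Copying the printed alpha by hand invites transcription errors. The fitted `LassoCV` object exposes the value as `alpha_`, and it has already been refit on the full training data with that alpha, so it can also be used for prediction directly. Below is a minimal sketch on synthetic data (the data and `cv=5` are assumptions for illustration).

```python
import numpy as np
from sklearn.linear_model import Lasso, LassoCV

rng = np.random.RandomState(0)
X = rng.normal(size=(300, 8))
y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + rng.normal(scale=0.5, size=300)

lasso_cv = LassoCV(cv=5).fit(X, y)

# Reuse the selected alpha instead of retyping the printed number.
lasso_best = Lasso(alpha=lasso_cv.alpha_).fit(X, y)

print("selected alpha = %g" % lasso_cv.alpha_)
print("largest coefficient difference = %g"
      % np.max(np.abs(lasso_best.coef_ - lasso_cv.coef_)))
```

Because the split in this notebook is unseeded, the selected alpha changes from run to run, which is another reason to read it from the object rather than hard-code it.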

In [20]:
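# Note: this hard-coded alpha differs from the clf_lasso_cv.alpha_ printed above,
# presumably because it was copied from an earlier run with a different random split;
# passing alpha=clf_lasso_cv.alpha_ would avoid the mismatch.
clf_lasso0 = Lasso(alpha=0.017244599307909947)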
clf_lasso0.fit(X_train, y_train)

Lasso(alpha=0.0172445993079, copy_X=True, fit_intercept=True, max_iter=1000,
   normalize=False, positive=False, precompute=False, random_state=None,
   selection='cyclic', tol=0.0001, warm_start=False)

In [21]:
print 'Lasso r2: ', r2_score(y_test, clf_lasso0.predict(X_test))
print 'Lasso MSE: ', mean_squared_error(y_test, clf_lasso0.predict(X_test))
print 'Lasso RMSE: ', math.sqrt(mean_squared_error(y_test, clf_lasso0.predict(X_test)))

Lasso r2:  0.783502213602
Lasso MSE:  18.9485804974
Lasso RMSE:  4.35299672609
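
The metric cells above repeat the same three lines for every model. Below is a minimal sketch of a helper that computes all three scores in one place. The name `regression_report` is hypothetical, and the synthetic data stands in for the Boston split so the block runs on its own.

```python
import math

import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def regression_report(model, X_test, y_test):
    """Return r2, MSE and RMSE for a fitted model on held-out data."""
    y_pred = model.predict(X_test)  # predict once and reuse the result
    mse = mean_squared_error(y_test, y_pred)
    return {"r2": r2_score(y_test, y_pred), "mse": mse, "rmse": math.sqrt(mse)}


# Synthetic stand-in data: a linear signal plus noise.
rng = np.random.RandomState(0)
X = rng.normal(size=(506, 13))
y = X.dot(rng.normal(size=13)) + rng.normal(scale=0.5, size=506)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = [("OLS", LinearRegression()),
          ("Ridge", Ridge(alpha=0.7)),
          ("Lasso", Lasso(alpha=0.01))]
for name, model in models:
    scores = regression_report(model.fit(X_train, y_train), X_test, y_test)
    print("%-5s r2=%.4f  MSE=%.4f  RMSE=%.4f"
          % (name, scores["r2"], scores["mse"], scores["rmse"]))
```

Printing the models side by side makes small differences easier to compare. In the runs above, OLS, Ridge, and the tuned Lasso all land within about 0.0003 of each other in r2.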
