## Boston Housing Assignment

In this assignment you'll use linear regression to estimate house prices in Boston, using a well-known dataset.

Goals:
+  Measure the performance of the model I created using $R^{2}$ and MSE
> Learn how to use sklearn.metrics.r2_score and sklearn.metrics.mean_squared_error
+  Implement a new model using L2 regularization
> Use sklearn.linear_model.Ridge (L2 penalty); sklearn.linear_model.Lasso applies an L1 penalty instead
+  Get the best model you can by optimizing the regularization parameter.   
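
For reference, the two metrics and the L2-regularized objective named in the goals can be written out as follows. The notation is an assumption introduced here, not taken from the assignment: $y_i$ is a true value, $\hat{y}_i$ the model's prediction, $\bar{y}$ the mean of the true values, $n$ the number of observations, $w$ the coefficient vector, and $\alpha$ the regularization parameter (the intercept is left out for brevity).

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \qquad R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

$$\min_{w}\; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2$$

A lower MSE and an $R^{2}$ closer to 1 both indicate a better fit; a larger $\alpha$ shrinks the coefficients more strongly toward zero.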

In [109]:
from sklearn import datasets
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression

In [110]:
bean = datasets.load_boston()
print(bean.DESCR)

Boston House Prices dataset

Notes
------
Data Set Characteristics:  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive
    
    :Median Value (attribute 14) is usually the target

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
      

In [111]:
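def load_boston():
    # Standardize all 13 features, then split into train and test sets.
    # Note: the scaler is fit on the full dataset before splitting, and the
    # split is unseeded, so exact scores vary from run to run.
    scaler = StandardScaler()
    boston = datasets.load_boston()
    X = boston.data
    y = boston.target
    X = scaler.fit_transform(X)
    return train_test_split(X, y)  # default split: 75% train, 25% test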

In [112]:
X_train, X_test, y_train, y_test = load_boston()

In [113]:
X_train.shape

(379, 13)

### Fitting a Linear Regression

Fitting takes two steps: instantiate a new regression object (line 1), then give that object your training data
(line 2) by calling .fit(independent variables, dependent variable).



In [114]:

clf = LinearRegression()
clf.fit(X_train, y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

### Making a Prediction
X_test is our holdout set of data.  We know the answer (y_test) but the computer does not.   

Using the command below, I create a tuple for each observation, pairing the real value (y_test) with
the value our regressor predicts (clf.predict(X_test)). In Python 3, zip returns a lazy iterator, so wrap it in list(...) if you want to see the tuples themselves.

Use a similar format to get your r2 and mse metrics working.  Use the [scikit-learn API](http://scikit-learn.org/stable/modules/model_evaluation.html) if you need help!
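
As a rough illustration of what the two metrics compute, the sketch below pairs true and predicted values and works out MSE and $R^{2}$ by hand, then compares them with the scikit-learn functions. The arrays are hypothetical toy values standing in for y_test and clf.predict(X_test); they are not taken from the housing data.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical toy values standing in for y_test and clf.predict(X_test)
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 8.0, 9.5])

# MSE: the mean of the squared residuals
residuals = y_true - y_pred
mse_manual = np.mean(residuals ** 2)

# R^2: 1 minus (residual sum of squares / total sum of squares)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

# The library functions take (true values, predicted values), in that order
mse_sklearn = mean_squared_error(y_true, y_pred)
r2_sklearn = r2_score(y_true, y_pred)

print(mse_manual, mse_sklearn)
print(r2_manual, r2_sklearn)
```

Both functions expect the true values first and the predictions second. The order does not affect MSE, but it does change $R^{2}$.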

In [115]:
zip (y_test, clf.predict(X_test))

<generator object zip at 0x114ebc3b8>

In [116]:
print((y_test, clf.predict(X_test)))

(array([ 15.2,  12.6,  23.3,  16.8,  10.8,  31.2,  17.5,  21.5,  14.9,
        16.6,  18.9,  22.5,  24.5,  24. ,  45.4,  50. ,  19.6,  14.9,
        27.5,   8.4,  23.3,  22. ,  32.2,  16.4,  41.3,  20.3,   6.3,
        23.4,  20.4,  17.5,  21.9,  26.5,  25.1,  22.7,  16.7,  30.5,
        29.6,  36.2,  11.9,  21.4,  22.6,  20.6,  13. ,  33.2,  10.2,
         9.7,  11.8,  13.4,   8.3,  18.2,  13.8,  14.8,  23. ,  21.1,
        23.2,  21.2,   7.2,  18.5,  22.2,  10.2,  21.2,  22.2,   8.1,
        24.7,  20. ,  13.3,   8.5,  21.2,  18.4,  28.2,  19.8,  14.4,
        14.9,  22.8,   8.8,  19.4,  28.6,  25. ,  22.9,  20.5,  33.1,
        32.5,   9.6,  28.5,  31.5,  20.8,  23.1,  16.5,  26.6,  29.8,
        27.5,  19.6,  32. ,  21.4,  23.1,  19.8,  32. ,  10.9,  16.2,
        42.3,  23.3,  13.8,  50. ,  18.8,  15. ,  13.5,  33.1,  29.8,
        24.5,  10.5,  20.2,  18.4,  13.9,  21.7,  21. ,  14.6,  16.5,
        34.7,  23.8,  19.2,   8.7,  19.1,   7.5,  20.4,  33.2,  25. ,  22. ]), array([ 10

In [117]:
r2_score(y_test, (clf.predict(X_test)))

0.64296210216632077

In [118]:
# Pass the 1-D arrays directly; wrapping them in lists is unnecessary
y_true = y_test
y_pred = clf.predict(X_test)
mean_squared_error(y_true, y_pred)

25.72755146434708

In [173]:
mse_value = mean_squared_error(y_test, clf.predict(X_test))

In [174]:
import math

In [175]:
math.sqrt(mse_value)

5.07217714591999

In [176]:
from sklearn.linear_model import Ridge
import numpy as np
n_samples, n_features = 506, 13
np.random.seed(0)
y = np.random.randn(n_samples)
X = np.random.randn(n_samples, n_features)
clf = Ridge(alpha=1)
clf.fit(X, y)

Ridge(alpha=1, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)

In [177]:
from sklearn import linear_model
clf = linear_model.Ridge()
clf.fit(X_train, y_train)

Ridge(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)

In [178]:
clf.coef_

array([-1.0157296 ,  0.96221042,  0.0874656 ,  1.01507958, -1.77899531,
        3.58430445, -0.36307468, -3.06144946,  2.84412289, -2.03767158,
       -1.86910312,  0.85758612, -3.33046526])

In [179]:
clf.intercept_

22.540338544707261

In [180]:
clf = linear_model.Ridge(alpha=1)
clf.fit(X_train, y_train)

Ridge(alpha=1, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)

In [181]:
r2_score(y_test, (clf.predict(X_test)))

0.64371101435245759

In [183]:
mean_squared_error(y_test, clf.predict(X_test))

25.673586109610177

In [184]:
clf = linear_model.Ridge(alpha=0.05)
clf.fit(X_train, y_train)

Ridge(alpha=0.05, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)

In [185]:
r2_score(y_test, (clf.predict(X_test)))

0.64300159339768836

In [186]:
mean_squared_error(y_test, clf.predict(X_test))

25.724705792519057

In [187]:
clf = linear_model.Ridge(alpha=0.01)
clf.fit(X_train, y_train)

Ridge(alpha=0.01, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)

In [188]:
r2_score(y_test, (clf.predict(X_test)))

0.64297001887525629

In [189]:
mean_squared_error(y_test, clf.predict(X_test))

25.726980999593053

In [190]:
clf = linear_model.Ridge(alpha=10)
clf.fit(X_train, y_train)

Ridge(alpha=10, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)

In [191]:
r2_score(y_test, (clf.predict(X_test)))

0.64825168317105941

In [192]:
mean_squared_error(y_test, clf.predict(X_test))

25.346393138158295

In [193]:
clf = linear_model.Ridge(alpha=100)
clf.fit(X_train, y_train)

Ridge(alpha=100, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)

In [194]:
r2_score(y_test, (clf.predict(X_test)))

0.66623468802053121

In [195]:
mean_squared_error(y_test, clf.predict(X_test))

24.050568001511564

In [206]:
clf = linear_model.Ridge(alpha=150)
clf.fit(X_train, y_train)

Ridge(alpha=150, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)

In [207]:
r2_score(y_test, (clf.predict(X_test)))

0.66826348542787728

In [208]:
mean_squared_error(y_test, clf.predict(X_test))

23.904376266614712
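
The cells above try one alpha at a time and compare scores on the test set. One possible alternative, sketched below, is to loop over the same alpha values and pick the best one by cross-validation on the training set, so the test set is used only once for the final score. This is a sketch under stated assumptions, not the assignment's required method: the data comes from make_regression as a synthetic stand-in for the housing data (506 rows, 13 features, already roughly standardized), and the names cv_mse, best_alpha, and best_model are introduced here for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

# Assumption: synthetic stand-in for the housing data (506 rows, 13 features)
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Same alpha values tried by hand in the cells above
alphas = [0.01, 0.05, 1, 10, 100, 150]

# Mean 5-fold cross-validated MSE on the training set for each alpha
cv_mse = {}
for alpha in alphas:
    scores = cross_val_score(
        Ridge(alpha=alpha), X_train, y_train,
        scoring="neg_mean_squared_error", cv=5,
    )
    cv_mse[alpha] = -scores.mean()  # scores are negated MSE, so flip the sign

# Pick the alpha with the lowest cross-validated MSE and refit on all training data
best_alpha = min(cv_mse, key=cv_mse.get)
best_model = Ridge(alpha=best_alpha).fit(X_train, y_train)

# Score once on the held-out test set
test_pred = best_model.predict(X_test)
test_mse = mean_squared_error(y_test, test_pred)
test_r2 = r2_score(y_test, test_pred)

print("best alpha:", best_alpha)
print("test MSE:", test_mse, "test R^2:", test_r2)
```

To apply this to the notebook's data, the X_train, X_test, y_train, and y_test returned by load_boston() could replace the synthetic split. The best alpha found on synthetic data says nothing about the best alpha for the housing data.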