## Boston Housing Assignment

In this assignment you'll use linear regression to estimate the cost of houses in Boston, using a well-known dataset.

Goals:
+  Measure the performance of the model I created using $R^{2}$ and MSE
> Learn how to use sklearn.metrics.r2_score and sklearn.metrics.mean_squared_error
+  Implement a new model using L2 regularization
> Use sklearn.linear_model.Ridge (L2 penalty); sklearn.linear_model.Lasso applies an L1 penalty instead
+  Get the best model you can by optimizing the regularization parameter.   
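
For reference, the L2-regularized (ridge) objective can be written as below. This is the form given in the scikit-learn documentation for `Ridge`; the symbols are the conventional ones rather than names from the assignment text: $X$ is the feature matrix, $y$ the target vector, $w$ the coefficient vector, and $\alpha$ the regularization parameter to be optimized.

$$
\min_{w}\; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2, \qquad \alpha \ge 0
$$

With $\alpha = 0$ the penalty vanishes and the problem reduces to ordinary least squares; larger values of $\alpha$ shrink the coefficients more strongly.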

In [1]:
from sklearn import datasets
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Ridge

In [2]:
bean = datasets.load_boston()
print(bean.DESCR)

Boston House Prices dataset

Notes
------
Data Set Characteristics:  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive
    
    :Median Value (attribute 14) is usually the target

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
      

In [3]:
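def load_boston():
    """Return a standardized train/test split of the Boston data."""
    scaler = StandardScaler()
    boston = datasets.load_boston()
    X = boston.data
    y = boston.target
    # Standardize every feature to zero mean and unit variance.
    # Note: the scaler is fit on all rows, before the split.
    X = scaler.fit_transform(X)
    # Default split: 75% train, 25% test, shuffled with no fixed seed.
    return train_test_split(X, y)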
    

In [4]:
X_train, X_test, y_train, y_test = load_boston()

In [5]:
X_train.shape

(379, 13)
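
The helper above fits the scaler on every row before splitting, so the test rows influence the scaling statistics. A minimal sketch of the alternative, fitting the scaler on the training split only, is shown below. It uses synthetic data from `make_regression` as a stand-in for the Boston data (an assumption, made so the example also runs on scikit-learn versions that no longer ship `load_boston`). The names `X_demo`, `y_demo`, and `model` are illustrative only.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Boston data: 506 rows, 13 features (assumption).
X_demo, y_demo = make_regression(
    n_samples=506, n_features=13, noise=10.0, random_state=0
)

# Split first, so the test rows stay unseen while the scaler is fit.
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=0)

# The pipeline fits StandardScaler on the training rows only, then
# reuses those statistics to transform the test rows at predict time.
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_tr, y_tr)

print(X_tr.shape)                # same 75/25 split sizes as in the notebook
print(model.score(X_te, y_te))   # R^2 on the held-out rows
```

Wrapping the scaler and the regressor in one pipeline also means a single `fit` / `predict` pair handles both steps consistently.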

### Fitting a Linear Regression

It's as easy as instantiating a new regression object (line 1) and giving your regression object your training data
(line 2) by calling .fit(independent variables, dependent variable)



In [6]:

clf = LinearRegression()
clf.fit(X_train, y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

### Making a Prediction
X_test is our holdout set of data.  We know the answer (y_test) but the computer does not.   

Using the command below, I create a tuple for each observation, where I'm combining the real value (y_test) with
the value our regressor predicts (clf.predict(X_test))

Use a similar format to get your r2 and mse metrics working. Use the [scikit-learn API documentation](http://scikit-learn.org/stable/modules/model_evaluation.html) if you need help!

In [7]:
list(zip(y_test, clf.predict(X_test)))

[(24.600000000000001, 29.097939902070181),
 (18.600000000000001, 16.899197284772136),
 (24.5, 20.716861078738997),
 (15.6, 12.459996149902809),
 (27.100000000000001, 19.699333763613584),
 (37.600000000000001, 36.858256756628037),
 (32.0, 33.519682352018243),
 (14.300000000000001, 16.541934426982035),
 (24.100000000000001, 24.800847099053438),
 (21.699999999999999, 21.809379902762835),
 (19.5, 20.280002573570826),
 (23.0, 22.759218207764199),
 (31.600000000000001, 33.516963284001974),
 (32.200000000000003, 31.904912117909753),
 (50.0, 30.929090893520183),
 (14.9, 14.823532665067358),
 (16.600000000000001, 18.367733605278616),
 (7.0, 8.1745409655631249),
 (50.0, 43.36410662291388),
 (23.100000000000001, 25.550786234678036),
 (19.899999999999999, 19.759553527423588),
 (21.699999999999999, 21.197041595726482),
 (23.800000000000001, 25.336292947854613),
 (21.0, 20.944099529537528),
 (19.600000000000001, 19.154922985247772),
 (37.0, 30.363518940800873),
 (13.300000000000001, 16.1344226441327

## My work starts here

### Coefficient of Determination for the Linear Regression
Below is the calculated coefficient of determination for the
above linear regression

In [8]:
r2_score(y_test, clf.predict(X_test))

0.74992450248482678

### Root Mean Squared Error for the Linear Regression
Below is the calculated root mean squared error for the above linear regression

In [9]:
(mean_squared_error(y_test, clf.predict(X_test)))**0.5

4.7937978830992964
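
To make the two metrics concrete, here is a small dependency-free sketch of how $R^{2}$, MSE, and RMSE are defined. The helper names `mse`, `rmse`, and `r2` are illustrative, not scikit-learn functions, and the sample values are the first five (actual, predicted) pairs from the linear regression output shown earlier, so the numbers it prints describe only those five rows, not the full test set.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: MSE expressed in the units of the target."""
    return mse(y_true, y_pred) ** 0.5


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# First five (actual, predicted) pairs from the notebook output.
actual = [24.6, 18.6, 24.5, 15.6, 27.1]
predicted = [
    29.097939902070181,
    16.899197284772136,
    20.716861078738997,
    12.459996149902809,
    19.699333763613584,
]

print(mse(actual, predicted), rmse(actual, predicted), r2(actual, predicted))
```

A perfect prediction gives an MSE of 0 and an $R^{2}$ of 1, while always predicting the mean of the actual values gives an $R^{2}$ of 0.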

### Implementing scikit Ridge

In [10]:
clf_Ridge = Ridge(alpha=-10.0)
clf_Ridge.fit(X_train, y_train)

Ridge(alpha=-10.0, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)

In [11]:
list(zip(y_test, clf_Ridge.predict(X_test)))

[(24.600000000000001, 29.29418466318829),
 (18.600000000000001, 15.123747390952744),
 (24.5, 20.101478109163164),
 (15.6, 11.307574716319824),
 (27.100000000000001, 19.40650042311665),
 (37.600000000000001, 37.396775080214717),
 (32.0, 33.499745106322479),
 (14.300000000000001, 16.523632860685915),
 (24.100000000000001, 24.385032597196826),
 (21.699999999999999, 21.871086626664823),
 (19.5, 20.37845732780962),
 (23.0, 23.474197583823397),
 (31.600000000000001, 34.546408821840735),
 (32.200000000000003, 31.953995802213655),
 (50.0, 31.598008517399176),
 (14.9, 14.565901447298012),
 (16.600000000000001, 18.415933273960228),
 (7.0, 5.8614346783380178),
 (50.0, 44.948753060889914),
 (23.100000000000001, 25.224364436090251),
 (19.899999999999999, 20.331528176790947),
 (21.699999999999999, 20.728773702084318),
 (23.800000000000001, 25.311243107750101),
 (21.0, 20.411409578496752),
 (19.600000000000001, 18.632718552146859),
 (37.0, 29.610048284712263),
 (13.300000000000001, 16.677060332823771

### Measuring Performance of Ridge

In [12]:
r2_score(y_test, clf_Ridge.predict(X_test))

0.74549570527590892

In [13]:
(mean_squared_error(y_test, clf_Ridge.predict(X_test)))**0.5

4.8360602874614154

### Optimization
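I started with alpha = 1.0 and tried several values above 1.0 and between 0 and 1.0. Going below 0 appeared to decrease MSE and increase $R^{2}$, and I finally settled on -10 because lower values seemed to increase MSE. Two caveats apply. First, a negative alpha is outside the valid range for Ridge: the L2 penalty only acts as a regularizer when alpha >= 0, and a negative value rewards large coefficients instead of shrinking them. Second, with alpha = -10 the test-set $R^{2}$ (0.7455) and RMSE (4.836) reported above are slightly worse than those of the plain linear regression (0.7499 and 4.794), so this setting did not improve on the unregularized model.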
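
One way to choose alpha without consulting the held-out test set is cross-validation on the training split. The sketch below is only an illustration under stated assumptions: it uses synthetic data from `make_regression` in place of the Boston data, a log-spaced grid of non-negative candidates, and 5-fold cross-validation. The grid bounds and the names `alphas`, `search`, and `best_alpha` are arbitrary choices, not part of the assignment.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the Boston data (assumption).
X_demo, y_demo = make_regression(
    n_samples=506, n_features=13, noise=25.0, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=0)

# Candidate alphas on a log scale from 0.001 to 1000, all non-negative.
alphas = np.logspace(-3, 3, 13)

# 5-fold cross-validation on the training split only; the test split
# plays no part in choosing alpha.
search = GridSearchCV(
    Ridge(),
    {"alpha": alphas},
    scoring="neg_mean_squared_error",
    cv=5,
)
search.fit(X_tr, y_tr)

best_alpha = search.best_params_["alpha"]
# GridSearchCV refits the best model on the full training split by default.
test_r2 = r2_score(y_te, search.predict(X_te))
print(best_alpha, test_r2)
```

Because the test split is touched only once, after alpha has been chosen, the final $R^{2}$ is a fairer estimate of how the model performs on unseen data.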