## Boston Housing Assignment

In this assignment you'll use linear regression to estimate house prices in Boston, using a well-known dataset.

Goals:
+  Measure the performance of the model I created using $R^{2}$ and MSE
> Learn how to use sklearn.metrics.r2_score and sklearn.metrics.mean_squared_error
+  Implement a new model using L2 regularization
> Use sklearn.linear_model.Ridge (L2 penalty) or sklearn.linear_model.Lasso (L1 penalty)
+  Get the best model you can by optimizing the regularization parameter.   
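
For reference, the quantities named in these goals can be written out as below. The symbols are notation chosen here, not taken from the assignment: $y_i$ is an observed price, $\hat{y}_i$ the model's prediction, $\bar{y}$ the mean observed price, $n$ the number of observations, $w$ the coefficient vector, and $\alpha$ the regularization parameter.

$$
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2,
\qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}},
\qquad
R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}
$$

$$
\text{Ridge (L2 penalty):}\quad \min_{w}\ \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2
$$

With $\alpha = 0$ the ridge objective reduces to ordinary least squares; larger $\alpha$ pulls the coefficients toward zero.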

In [6]:
from sklearn import datasets
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in later releases
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression

In [7]:
bean = datasets.load_boston()
print(bean.DESCR)

Boston House Prices dataset

Notes
------
Data Set Characteristics:  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive
    
    :Median Value (attribute 14) is usually the target

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
      

In [8]:
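def load_boston():
    """Return a standardized train/test split of the Boston data."""
    scaler = StandardScaler()
    boston = datasets.load_boston()
    X = boston.data
    y = boston.target
    # Standardize every feature to zero mean and unit variance
    X = scaler.fit_transform(X)
    # Default split is 75% train / 25% test; pass random_state=42 for a reproducible split
    return train_test_split(X, y)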
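
The helper above fits the scaler on all 506 rows before splitting, so the test rows influence the scaling statistics. A common alternative is to learn the mean and standard deviation from the training rows only and reuse them for the test rows. The sketch below illustrates that idea in plain Python on a made-up column; `fit_standardizer` and `standardize` are hypothetical names introduced for illustration. In scikit-learn the same pattern is `scaler.fit(X_train)` followed by `scaler.transform(X_test)`.

```python
from statistics import mean, pstdev


def fit_standardizer(column):
    """Learn the mean and population standard deviation of one feature column."""
    return mean(column), pstdev(column)


def standardize(column, mu, sigma):
    """Apply previously learned statistics to a column."""
    return [(value - mu) / sigma for value in column]


# Hypothetical single feature, already split into train and test parts
train_col = [1.0, 2.0, 3.0, 4.0, 5.0]
test_col = [6.0, 7.0]

mu, sigma = fit_standardizer(train_col)           # statistics come from training data only
train_scaled = standardize(train_col, mu, sigma)
test_scaled = standardize(test_col, mu, sigma)    # test data reuses the training statistics

print(mu, sigma)
print(test_scaled)
```

The scaled test values are not centered on zero, which is expected: they are measured against the training distribution.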

In [9]:
X_train, X_test, y_train, y_test = load_boston()

In [10]:
X_train.shape

(379, 13)

### Fitting a Linear Regression

It's as easy as instantiating a new regression object (line 1) and passing it your training data
(line 2) by calling .fit(independent variables, dependent variable)



In [11]:
clf = LinearRegression()
clf.fit(X_train, y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)
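
As a rough picture of what `.fit` estimates, the sketch below computes the least-squares slope and intercept for a single feature in plain Python. It is a simplification under stated assumptions: `fit_line` is a hypothetical helper, the data are made up, and this is not how scikit-learn is implemented internally (it handles all 13 features at once).

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single feature."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Covariance-like and variance-like sums around the means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # made-up data lying exactly on y = 2x + 1

slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```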

### Making a Prediction
X_test is our holdout set of data.  We know the answer (y_test) but the computer does not.   

Using the command below, I create a tuple for each observation, where I'm combining the real value (y_test) with
the value our regressor predicts (clf.predict(X_test))

Use a similar format to get your r2 and mse metrics working.  Use the [scikit learn api](http://scikit-learn.org/stable/modules/model_evaluation.html) if you need help!

In [12]:
list(zip(y_test, clf.predict(X_test)))

[(7.4000000000000004, 6.1557545319801399),
 (31.5, 31.150789995953865),
 (8.5, 7.5866130855849327),
 (17.5, 17.056366458677431),
 (21.199999999999999, 23.361994027764652),
 (34.899999999999999, 33.611285763490699),
 (50.0, 40.820086435322352),
 (14.5, 18.235117530830486),
 (23.0, 30.419884366857101),
 (21.899999999999999, 24.311769424719667),
 (37.200000000000003, 32.317868235873874),
 (12.1, 18.106048443207513),
 (44.799999999999997, 36.937300017828179),
 (22.0, 21.680081256696258),
 (20.5, 24.479304512836059),
 (17.100000000000001, 20.337850657391545),
 (21.0, 20.545052173959331),
 (29.100000000000001, 30.16083165303413),
 (29.899999999999999, 30.872515371986445),
 (32.399999999999999, 36.499801090485732),
 (10.5, 7.0917403286575507),
 (19.5, 20.693845756579947),
 (50.0, 37.992888180706487),
 (16.699999999999999, 20.567127694302499),
 (19.699999999999999, 21.740195216867296),
 (8.0999999999999996, 3.6676709299855972),
 (25.199999999999999, 27.279218151385734),
 (30.100000000000001, 2

<hr style="height:3px">
## Eric Maxwell
## CSC 570R
<hr style="height:1px">
### RMSE and R-Squared
<hr style="height:1px">

RMSE

In [13]:
from math import sqrt

In [22]:
#Calculate RMSE
rmse = sqrt(mean_squared_error(y_test, clf.predict(X_test)))
print("RMSE =",rmse)

RMSE = 4.216309422438102
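
To make the metric concrete, here is a minimal plain-Python sketch of the same calculation on the first five (actual, predicted) pairs from the prediction listing above, rounded to two decimals. The `rmse` helper is a hypothetical name. Because it uses only five pairs, its result will differ from the full test-set value.

```python
from math import sqrt

# First five (actual, predicted) pairs from the listing, rounded to 2 decimals
pairs = [(7.4, 6.16), (31.5, 31.15), (8.5, 7.59), (17.5, 17.06), (21.2, 23.36)]


def rmse(pairs):
    """Square root of the mean squared difference between actual and predicted."""
    squared_errors = [(actual - predicted) ** 2 for actual, predicted in pairs]
    return sqrt(sum(squared_errors) / len(squared_errors))


print(rmse(pairs))
```

RMSE is in the same units as the target, so it reads directly as a typical prediction error in the target's units.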


<hr>
$R^{2}$

In [23]:
#Calculate R-squared
r2 = r2_score(y_test, clf.predict(X_test))
print("R2 =",r2)

R2 = 0.817466951368
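
$R^{2}$ compares the model's squared error with the squared error of always predicting the mean of the actual values. The sketch below shows that comparison in plain Python; the `r2` helper is a hypothetical name, and the five value pairs are the first five from the prediction listing, rounded to two decimals.

```python
def r2(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


actual = [7.4, 31.5, 8.5, 17.5, 21.2]
predicted = [6.16, 31.15, 7.59, 17.06, 23.36]
baseline = [sum(actual) / len(actual)] * len(actual)  # always predict the mean

print(r2(actual, predicted))  # score on the five sample pairs
print(r2(actual, baseline))   # 0.0: no better than predicting the mean
```

A score of 1.0 means perfect predictions, 0.0 matches the mean baseline, and negative values are possible when a model does worse than that baseline.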


<hr style="height:3px">
### Ridge Model
<hr style="height:1px">

In [16]:
from sklearn.linear_model import Ridge
from sklearn.linear_model import Lasso
import numpy as np

#### Create Base Model

In [26]:
#Create Ridge Learning Model
ridge = Ridge()
ridge.fit(X_train, y_train)

Ridge(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)
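
The effect of `alpha` is easiest to see with a single feature, where the ridge slope has a closed form: the least-squares numerator divided by a denominator inflated by `alpha`. The sketch below is an illustration under that one-feature assumption with an unpenalized intercept; `ridge_slope` is a hypothetical helper and the data are made up.

```python
def ridge_slope(xs, ys, alpha):
    """Slope minimizing sum((y - w*x - b)^2) + alpha * w^2 for one feature."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    # alpha enlarges the denominator, shrinking the slope toward zero
    return sxy / (sxx + alpha)


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # made-up data lying exactly on y = 2x + 1

for alpha in (0.0, 1.0, 5.0):
    print(alpha, ridge_slope(xs, ys, alpha))
```

With `alpha=0.0` the slope equals the ordinary least-squares value of 2.0, and it decreases as `alpha` grows.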

#### Display Base Model Scores

In [27]:
#Display RMSE and R2 for base model
print("RMSE =",sqrt(mean_squared_error(y_test, ridge.predict(X_test))))
print("R2 =",r2_score(y_test, ridge.predict(X_test)))


RMSE = 4.214918817393602
R2 = 0.817587336043


#### Optimize Alpha

In [28]:
#Try a grid of candidate alphas (0.000, 0.001, ..., 19.999)
a = np.arange(0.0, 20.0, 0.001)

#Save first alpha as best yet for rmse and r2 values
best_rmse_alpha_test = a[0]
best_r2_alpha_test = a[0]

#Create Ridge model with alpha = 0 and fit to training set
r = Ridge(alpha=a[0])
r.fit(X_train, y_train)

#Save RMSE and R2 scores for first alpha as best yet
p_test = r.predict(X_test)
best_rmse_test = sqrt(mean_squared_error(y_test, p_test))
best_r2_test = r2_score(y_test, p_test)

#Iterate through the remaining alphas, refitting the ridge model for each one
for alpha in a[1:]:
    r.set_params(alpha=alpha)
    r.fit(X_train, y_train)

    #Get predictions for the test set and calculate RMSE and R2 scores
    p_test = r.predict(X_test)
    current_rmse = sqrt(mean_squared_error(y_test, p_test))
    current_r2 = r2_score(y_test, p_test)

    #Update best alpha and scores if new model is an improvement
    if current_rmse < best_rmse_test:
        best_rmse_test = current_rmse
        best_rmse_alpha_test = alpha

    if current_r2 > best_r2_test:
        best_r2_test = current_r2
        best_r2_alpha_test = alpha

#Display results
print("Best alpha for RMSE on test set= ",best_rmse_alpha_test,", RMSE = ",best_rmse_test)
print("Best alpha for r2 on test set = ",best_r2_alpha_test,", R2 =",best_r2_test)

Best alpha for RMSE on test set=  2.472 , RMSE =  4.2142123411422645
Best alpha for r2 on test set =  2.472 , R2 = 0.817648480478
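
Because the search above scores every alpha on the test set, the winning score is tuned to that particular split and is likely to be optimistic. A common alternative is to choose alpha by cross-validation inside the training data and touch the test set only once at the end. The sketch below shows that idea in plain Python for a one-feature ridge model; `fit_ridge`, `mse`, and `cv_mse` are hypothetical helpers and the data are made up. scikit-learn provides `RidgeCV` and `GridSearchCV` for the same purpose.

```python
def fit_ridge(xs, ys, alpha):
    """One-feature ridge fit with an unpenalized intercept; returns (slope, intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    return slope, y_bar - slope * x_bar


def mse(xs, ys, slope, intercept):
    """Mean squared error of the line on the given points."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cv_mse(xs, ys, alpha, k=3):
    """Mean held-out MSE over k folds; fold j holds out every k-th point starting at j."""
    scores = []
    for j in range(k):
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k != j]
        held = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k == j]
        train_x, train_y = zip(*train)
        held_x, held_y = zip(*held)
        slope, intercept = fit_ridge(train_x, train_y, alpha)
        scores.append(mse(held_x, held_y, slope, intercept))
    return sum(scores) / k


# Made-up training data: roughly y = 2x + 1 with fixed perturbations
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.3, 2.6, 5.4, 6.8, 9.1, 11.2, 12.7, 15.3, 16.9]

alphas = [0.0, 0.1, 1.0, 10.0]
best_alpha = min(alphas, key=lambda a: cv_mse(xs, ys, a))
print("Best alpha by cross-validation:", best_alpha)
```

Only the training data enter `cv_mse`, so a final score computed afterwards on a separate test set remains an honest estimate.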


#### Display RMSE and $R^{2}$ Results compared to Base Model

In [32]:
#Compare RMSE and R2 scores for base model vs model with optimized alpha
r = Ridge()
r.fit(X_train, y_train)
print("Ridge Base Model :: R2 =",r2_score(y_test, r.predict(X_test)),"; RMSE=", sqrt(mean_squared_error(y_test, r.predict(X_test))))

r = Ridge(alpha=best_rmse_alpha_test)
r.fit(X_train, y_train)
print("Ridge Model with Alpha =",best_rmse_alpha_test,":: R2 =",r2_score(y_test, r.predict(X_test)),"; RMSE=",sqrt(mean_squared_error(y_test, r.predict(X_test))))

Ridge Base Model :: R2 = 0.817587336043 ; RMSE= 4.214918817393602
Ridge Model with Alpha = 2.472 :: R2 = 0.817648480478 ; RMSE= 4.2142123411422645
