# Things to do
* How does LASSO actually work?
* How do I make sure it has at most 3 nonzero entries, not counting the intercept?
* Read Piazza, notes, and books
* Cite the scikit-learn documentation
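
For the first two items, a minimal sketch on synthetic data (not the housing set). LASSO minimizes squared loss plus an L1 penalty `alpha * ||w||_1`, and a larger `alpha` pushes more coefficients to exactly zero. One possible way to get at most three nonzero entries, assuming it is acceptable to search along the regularization path, is to take the least-regularized point on the LASSO path that has three or fewer nonzero coefficients. The names `sparse_lasso_coefs` and `max_nonzero` are made up for this illustration.

```python
import numpy as np
from sklearn.linear_model import lars_path


def sparse_lasso_coefs(X, y, max_nonzero=3):
    """Pick the least-regularized LASSO path knot with <= max_nonzero nonzero coefficients."""
    # Center X and y so the intercept stays outside the L1 penalty.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # coefs has shape (n_features, n_knots); alphas decreases along the path.
    alphas, _, coefs = lars_path(Xc, yc, method="lasso")
    nonzero_counts = np.count_nonzero(coefs, axis=0)
    # The first knot is all zeros, so at least one knot always qualifies.
    k = np.flatnonzero(nonzero_counts <= max_nonzero)[-1]
    return alphas[k], coefs[:, k]


# Synthetic example: only features 1, 4 and 7 carry signal (an assumption of this demo).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
w_true = np.zeros(10)
w_true[[1, 4, 7]] = [3.0, -2.0, 1.5]
y = X @ w_true + 0.1 * rng.normal(size=200)

alpha, w = sparse_lasso_coefs(X, y)
print("alpha:", alpha)
print("nonzero indices:", np.flatnonzero(w))
```

Note that the notebook cell further down uses `Lars` with `n_nonzero_coefs=3`, which is least-angle regression rather than LASSO itself; the two are closely related but not identical.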

# Submission

What to submit in your write-up:
1. Test average squared loss of OLS estimator.
2. Test average squared loss of the sparse linear predictor.
3. Names of the variables with non-zero entries in the sparse linear predictor. Report the actual variable names (e.g., CRIM, ZN, INDUS).
4. Proper citations for any external code you use.

No need to submit any code.

In [1]:
from scipy.io import loadmat
from sklearn.linear_model import Lars, Lasso, LinearRegression
from sklearn.metrics import mean_squared_error

In [2]:
housing = loadmat('housing.mat')
data = housing['data']
labels = housing['labels']
tdata = housing['testdata']
tlabels = housing['testlabels']

In [3]:
# Compute the ordinary least squares (OLS) estimator based on the training data.
model = LinearRegression(fit_intercept=True)
estim = model.fit(data, labels)
preds = estim.predict(data)
avsqerror = mean_squared_error(labels, preds)
print("train average squared loss: ", avsqerror)

# Compute the average squared loss of the OLS estimator on the test data.
preds = estim.predict(tdata)
avsqerror = mean_squared_error(tlabels, preds)
print("test average squared loss: ", avsqerror)

train average squared loss:  22.1037987797
test average squared loss:  24.4065641284
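
As a cross-check on what `LinearRegression` and `mean_squared_error` compute, the same two quantities can be sketched with NumPy alone: OLS picks the weights that minimize the average of the squared residuals, and the reported loss is the mean of `(y - y_hat) ** 2`. This is a sketch on synthetic data, not the housing set; `ols_fit` and `avg_squared_loss` are hypothetical helper names.

```python
import numpy as np


def ols_fit(X, y):
    """Least-squares weights; entry 0 is the intercept."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend a constant column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w


def avg_squared_loss(X, y, w):
    """Mean of squared residuals for weights w (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    return float(np.mean((y - A @ w) ** 2))


# Synthetic train/test split drawn from the same linear model (demo assumption).
rng = np.random.default_rng(0)
w_gen = np.array([1.0, -1.0, 0.5])
X_train = rng.normal(size=(100, 3))
y_train = 2.0 + X_train @ w_gen + 0.1 * rng.normal(size=100)
X_test = rng.normal(size=(50, 3))
y_test = 2.0 + X_test @ w_gen + 0.1 * rng.normal(size=50)

w = ols_fit(X_train, y_train)
train_loss = avg_squared_loss(X_train, y_train, w)
test_loss = avg_squared_loss(X_test, y_test, w)
print("train average squared loss:", train_loss)
print("test average squared loss:", test_loss)
```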


In [4]:
# Compute a sparse weight vector with at most three nonzero entries (not including the "intercept").
# n_nonzero_coefs sets the target number of nonzero coefficients.
# normalize=True matches the run recorded below; newer scikit-learn releases no longer accept it.
model = Lars(fit_intercept=True, normalize=True, n_nonzero_coefs=3)
estim = model.fit(data, labels)
preds = estim.predict(data)
avsqerror = mean_squared_error(labels, preds)
print("train average squared loss: ", avsqerror)
print(estim.coef_)

# Compute the average squared loss of this sparse linear predictor on the test data.
preds = estim.predict(tdata)
avsqerror = mean_squared_error(tlabels, preds)
print("test average squared loss: ", avsqerror)

train average squared loss:  36.3524340166
[ 0.          0.          0.          0.          0.          0.
  1.55107735  0.          0.          0.          0.         -0.54287349
  0.         -3.36749643]
test average squared loss:  35.6791404934


In [7]:
Variables (an x marks those with a nonzero coefficient in the sparse predictor):

1. CRIM: per capita crime rate by town
2. ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
3. INDUS: proportion of non-retail business acres per town
4. CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
5. NOX: nitric oxides concentration (parts per 10 million)
6. RM: average number of rooms per dwelling (x)
7. AGE: proportion of owner-occupied units built prior to 1940
8. DIS: weighted distances to five Boston employment centres
9. RAD: index of accessibility to radial highways
10. TAX: full-value property-tax rate per $10,000
11. PTRATIO: pupil-teacher ratio by town (x)
12. B: 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
13. LSTAT: % lower status of the population (x)

[ 1.          1.95208352 -0.46895923  0.9572188  -0.28495014  1.45333934
 -0.36887913  0.93484526 -0.87874985  1.75146021  1.56308334  0.81875584
 -0.37002854  0.42594767]
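
To turn the nonzero entries of the sparse coefficient vector into variable names, the indices can be looked up in the list above. This is a small sketch under one assumption: the coefficient vector has 14 entries while only 13 variables are named, so column 0 is taken to be a constant column and coefficient `i` (for `i >= 1`) is matched to variable number `i`. The coefficient values are copied from the `Lars` output above.

```python
import numpy as np

# Variable names in the order of the list above.
names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
         "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

# Coefficients copied from the Lars output (14 entries).
coef = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.55107735,
                 0.0, 0.0, 0.0, 0.0, -0.54287349, 0.0, -3.36749643])

# Assumption: column 0 is a constant column, so coefficient i maps to names[i - 1].
selected = [names[i - 1] for i in np.flatnonzero(coef) if i >= 1]
print(selected)  # ['RM', 'PTRATIO', 'LSTAT']
```

Under that assumption the result agrees with the three variables marked with an x in the list.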
