# Regression Week 5: Feature Selection and LASSO (Interpretation)

In this notebook, you will use LASSO to select features, building on a pre-implemented solver for LASSO (using GraphLab Create, though you can use other solvers). You will:
* Run LASSO with different L1 penalties.
* Choose the best L1 penalty using a validation set.
* Choose the best L1 penalty using a validation set, with an additional constraint on the size of the selected subset.

In the second notebook, you will implement your own LASSO solver, using coordinate descent. 

# Fire up GraphLab Create

In [47]:
import graphlab
import numpy
graphlab.product_key.set_product_key('7361-BA94-D081-785E-2308-3F6E-9D04-0EB6')

# Load in house sales data

The dataset is from house sales in King County, the region where the city of Seattle, WA is located.

In [48]:
sales = graphlab.SFrame('kc_house_data.gl/')

# Create new features

As in Week 2, we consider features that are transformations of the inputs.

In [49]:
from math import log, sqrt
sales['sqft_living_sqrt'] = sales['sqft_living'].apply(sqrt)
sales['sqft_lot_sqrt'] = sales['sqft_lot'].apply(sqrt)
sales['bedrooms_square'] = sales['bedrooms']*sales['bedrooms']

# In the dataset, 'floors' was defined with type string, 
# so we'll convert them to float, before creating a new feature.
sales['floors'] = sales['floors'].astype(float) 
sales['floors_square'] = sales['floors']*sales['floors']

* Squaring bedrooms increases the separation between houses with few bedrooms (e.g. 1) and houses with many bedrooms (e.g. 4), since 1^2 = 1 but 4^2 = 16. Consequently, this feature mostly affects houses with many bedrooms.
* On the other hand, taking the square root of sqft_living decreases the separation between big and small houses. The owner may not be exactly twice as happy with a house that is twice as big.
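
A quick numeric check of the two bullets above, written with plain NumPy rather than GraphLab Create. The bedroom counts and square footages are made-up illustration values, not rows from the dataset.

```python
import numpy

# Hypothetical example values, not taken from the dataset.
bedrooms = numpy.array([1.0, 4.0])
sqft_living = numpy.array([1000.0, 2000.0])

# Squaring widens the gap between few and many bedrooms: 1 vs 16.
bedrooms_square = bedrooms * bedrooms
ratio_raw = bedrooms[1] / bedrooms[0]
ratio_squared = bedrooms_square[1] / bedrooms_square[0]

# The square root narrows the gap between a small and a big house.
sqft_living_sqrt = numpy.sqrt(sqft_living)
ratio_sqft = sqft_living[1] / sqft_living[0]
ratio_sqrt = sqft_living_sqrt[1] / sqft_living_sqrt[0]
```

Here the bedroom ratio grows from 4 to 16 after squaring, while the living-area ratio shrinks from 2 to about 1.41 after taking the square root.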

# Learn regression weights with L1 penalty

Let us fit a model with all the features available, plus the features we just created above.

In [50]:
all_features = ['bedrooms', 'bedrooms_square',
            'bathrooms',
            'sqft_living', 'sqft_living_sqrt',
            'sqft_lot', 'sqft_lot_sqrt',
            'floors', 'floors_square',
            'waterfront', 'view', 'condition', 'grade',
            'sqft_above',
            'sqft_basement',
            'yr_built', 'yr_renovated']

Applying L1 penalty requires adding an extra parameter (`l1_penalty`) to the linear regression call in GraphLab Create. (Other tools may have separate implementations of LASSO.)  Note that it's important to set `l2_penalty=0` to ensure we don't introduce an additional L2 penalty.
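
To make the role of `l1_penalty` concrete, here is a minimal sketch of the quantity an L1-penalized regression trades off. It assumes the textbook form of the LASSO cost (RSS plus the penalty times the sum of absolute weights, with the intercept left unpenalized); GraphLab Create may scale or parameterize these terms differently internally. `lasso_cost` is a hypothetical helper, not a GraphLab function, and the data are made up.

```python
import numpy

def lasso_cost(weights, intercept, X, y, l1_penalty):
    """Textbook LASSO cost: RSS + l1_penalty * sum(|w_j|), intercept not penalized."""
    predictions = intercept + X.dot(weights)
    rss = ((y - predictions) ** 2).sum()
    return rss + l1_penalty * numpy.abs(weights).sum()

# Tiny made-up data set: 3 houses, 2 features.
X = numpy.array([[1.0, 2.0],
                 [2.0, 0.0],
                 [3.0, 1.0]])
y = numpy.array([5.0, 4.0, 8.0])

dense = numpy.array([2.0, 1.0])    # both features used
sparse = numpy.array([2.0, 0.0])   # second feature dropped

cost_dense = lasso_cost(dense, 1.0, X, y, l1_penalty=10.0)
cost_sparse = lasso_cost(sparse, 1.0, X, y, l1_penalty=10.0)
cost_dense_unpenalized = lasso_cost(dense, 1.0, X, y, l1_penalty=0.0)
cost_sparse_unpenalized = lasso_cost(sparse, 1.0, X, y, l1_penalty=0.0)
```

With `l1_penalty=10.0` the sparser weight vector has the lower cost (26 versus 31) even though its RSS is higher. With `l1_penalty=0.0` the ordering flips (6 versus 1), which is why a large enough penalty pushes weights to exactly zero.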

In [51]:
model_all = graphlab.linear_regression.create(sales, target='price', features=all_features,
                                              validation_set=None, 
                                              l2_penalty=0., l1_penalty=1e10)

Find which features have nonzero weights.

In [52]:
model_all.get("coefficients").print_rows(num_rows = len(all_features)+1)

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None |  274873.05595 |  None  |
|     bedrooms     |  None |      0.0      |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 8468.53108691 |  None  |
|   sqft_living    |  None | 24.4207209824 |  None  |
| sqft_living_sqrt |  None | 350.060553386 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 842.068034898 |  None  |
|    sqft_above    |  None | 20.0247224171 |  None  |
|  sqft_basement   |  None |

Note that a majority of the weights have been set to zero. So by setting an L1 penalty that's large enough, we are performing a subset selection. 
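
Reading the selected subset off a coefficients table can also be done programmatically. The sketch below uses a plain Python list of hypothetical `(name, value)` pairs standing in for the table rows; the feature names and values are placeholders, not the model's actual output.

```python
# Hypothetical (name, value) pairs standing in for rows of the coefficients table.
coefficients = [
    ('(intercept)', 250000.0),
    ('feature_a', 0.0),
    ('feature_b', 12.5),
    ('feature_c', 0.0),
    ('feature_d', -3.0),
]

# Keep every feature whose weight is not exactly zero; the intercept is not a feature.
selected = [name for name, value in coefficients
            if value != 0.0 and name != '(intercept)']
print(selected)
```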

***QUIZ QUESTION***:
According to this list of weights, which of the features have been chosen? 

# Selecting an L1 penalty

To find a good L1 penalty, we will explore multiple values using a validation set. Let us do a three-way split into train, validation, and test sets:
* Split our sales data into 2 sets: training and test
* Further split our training data into two sets: train, validation

Be *very* careful that you use seed = 1 to ensure you get the same answer!

In [53]:
(training_and_validation, testing) = sales.random_split(.9,seed=1) # initial train/test split
(training, validation) = training_and_validation.random_split(0.5, seed=1) # split training into train and validate
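
The nesting of the two splits can be illustrated without GraphLab. This is only a sketch under stated assumptions: `two_way_split` is a hypothetical helper built on a seeded NumPy permutation, the row count is made up, and the result will not reproduce the partition that `random_split` produces (which samples rows at random, so its split sizes are only approximately the requested fractions).

```python
import numpy

def two_way_split(indices, fraction, seed):
    """Shuffle `indices` with a fixed seed and cut at `fraction`."""
    rng = numpy.random.RandomState(seed)
    shuffled = rng.permutation(indices)
    cut = int(round(fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

n_rows = 1000  # made-up row count
all_rows = numpy.arange(n_rows)

# First hold out 10% for testing, then halve the remainder.
train_valid_idx, test_idx = two_way_split(all_rows, 0.9, seed=1)
train_idx, valid_idx = two_way_split(train_valid_idx, 0.5, seed=1)
```

Every row ends up in exactly one of the three sets: roughly 45% train, 45% validation, and 10% test.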

Next, we write a loop that does the following:
* For `l1_penalty` in [10^1, 10^1.5, 10^2, 10^2.5, ..., 10^7] (to get this in Python, type `numpy.logspace(1, 7, num=13)`, since this notebook imports `numpy` without an alias):
    * Fit a regression model with a given `l1_penalty` on TRAIN data. Specify `l1_penalty=l1_penalty` and `l2_penalty=0.` in the parameter list.
    * Compute the RSS on VALIDATION data (here you will want to use `.predict()`) for that `l1_penalty`
* Report which `l1_penalty` produced the lowest RSS on validation data.

When you call `linear_regression.create()` make sure you set `validation_set = None`.

Note: you can turn off the print out of `linear_regression.create()` with `verbose = False`

In [54]:
l1_penalty_set = numpy.logspace(1, 7, num=13)
output_name = 'price'
print l1_penalty_set

[  1.00000000e+01   3.16227766e+01   1.00000000e+02   3.16227766e+02
   1.00000000e+03   3.16227766e+03   1.00000000e+04   3.16227766e+04
   1.00000000e+05   3.16227766e+05   1.00000000e+06   3.16227766e+06
   1.00000000e+07]


In [55]:
def rss(model, data, features_list, output_name):
    predictions = model.predict(data[features_list])
    errors = predictions - data[output_name]
    squared_errors = errors ** 2
    return squared_errors.sum()
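
As a sanity check on the RSS definition used above, here is the same computation on plain arrays. `rss_from_arrays` is a hypothetical stand-in that skips the model and takes predictions directly; the numbers are made up.

```python
import numpy

def rss_from_arrays(predictions, targets):
    """Residual sum of squares between two equal-length arrays."""
    errors = numpy.asarray(predictions) - numpy.asarray(targets)
    return float((errors ** 2).sum())

# Errors are -2, 1, -1, so the squares sum to 4 + 1 + 1.
value = rss_from_arrays([3.0, 5.0, 7.0], [5.0, 4.0, 8.0])
```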

In [56]:
def build_l1_penalty_validation_errors_sframe(training_data, validation_data, features_list, output_name, l1_penalty_set):
    l1_penalty_errors_sframe = graphlab.SFrame()
    l1_penalty_errors_sframe['penalty'] = graphlab.SArray(l1_penalty_set)
    validation_errors = []
    for l1_penalty in l1_penalty_set:
        model = graphlab.linear_regression.create(training_data, target=output_name, features=features_list,
                                              validation_set=None, 
                                              l2_penalty=0., l1_penalty=l1_penalty)
        validation_errors.append(rss(model, validation_data, features_list, output_name))
        model.get("coefficients").print_rows(num_rows = len(features_list)+1)
    l1_penalty_errors_sframe['errors'] = graphlab.SArray(validation_errors)
    return l1_penalty_errors_sframe

In [57]:
l1_penalty_errors_sframe = build_l1_penalty_validation_errors_sframe(training, validation, all_features, output_name, l1_penalty_set)

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  18993.4272128   |  None  |
|     bedrooms     |  None |  7936.96767903   |  None  |
| bedrooms_square  |  None |  936.993368193   |  None  |
|    bathrooms     |  None |  25409.5889341   |  None  |
|   sqft_living    |  None |  39.1151363797   |  None  |
| sqft_living_sqrt |  None |  1124.65021281   |  None  |
|     sqft_lot     |  None | 0.00348361822299 |  None  |
|  sqft_lot_sqrt   |  None |  148.258391011   |  None  |
|      floors      |  None |   21204.335467   |  None  |
|  floors_square   |  None |  12915.5243361   |  None  |
|    waterfront    |  None |  601905.594545   |  None  |
|       view       |  None |  93312.8573119   |  None  |
|    condition     |  None |  6609.03571245   |  None  |
|      grade       |  None |  6206.93999188   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  18993.4285345   |  None  |
|     bedrooms     |  None |  7936.96764801   |  None  |
| bedrooms_square  |  None |   936.99334899   |  None  |
|    bathrooms     |  None |  25409.5888977   |  None  |
|   sqft_living    |  None |  39.1151363649   |  None  |
| sqft_living_sqrt |  None |   1124.6502113   |  None  |
|     sqft_lot     |  None | 0.00348360549025 |  None  |
|  sqft_lot_sqrt   |  None |  148.258390148   |  None  |
|      floors      |  None |  21204.3353589   |  None  |
|  floors_square   |  None |  12915.5242399   |  None  |
|    waterfront    |  None |  601905.587264   |  None  |
|       view       |  None |  93312.8568285   |  None  |
|    condition     |  None |   6609.0356597   |  None  |
|      grade       |  None |  6206.93997768   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+-----------------+--------+
|       name       | index |      value      | stderr |
+------------------+-------+-----------------+--------+
|   (intercept)    |  None |  18993.4327144  |  None  |
|     bedrooms     |  None |   7936.9675499  |  None  |
| bedrooms_square  |  None |  936.993288262  |  None  |
|    bathrooms     |  None |  25409.5887825  |  None  |
|   sqft_living    |  None |  39.1151363182  |  None  |
| sqft_living_sqrt |  None |  1124.65020653  |  None  |
|     sqft_lot     |  None | 0.0034835652258 |  None  |
|  sqft_lot_sqrt   |  None |  148.258387417  |  None  |
|      floors      |  None |  21204.3350171  |  None  |
|  floors_square   |  None |  12915.5239357  |  None  |
|    waterfront    |  None |   601905.56424  |  None  |
|       view       |  None |  93312.8552999  |  None  |
|    condition     |  None |  6609.03549292  |  None  |
|      grade       |  None |  6206.93993278  |  None  |
|    sqft_above    |  None |   43.287053201  |  

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  18993.4459322   |  None  |
|     bedrooms     |  None |  7936.96723967   |  None  |
| bedrooms_square  |  None |  936.993096225   |  None  |
|    bathrooms     |  None |  25409.5884183   |  None  |
|   sqft_living    |  None |  39.1151361705   |  None  |
| sqft_living_sqrt |  None |  1124.65019144   |  None  |
|     sqft_lot     |  None | 0.00348343789842 |  None  |
|  sqft_lot_sqrt   |  None |  148.258378783   |  None  |
|      floors      |  None |  21204.3339364   |  None  |
|  floors_square   |  None |  12915.5229739   |  None  |
|    waterfront    |  None |  601905.491432   |  None  |
|       view       |  None |  93312.8504661   |  None  |
|    condition     |  None |  6609.03496549   |  None  |
|      grade       |  None |  6206.93979079   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  18993.4877305   |  None  |
|     bedrooms     |  None |  7936.96625863   |  None  |
| bedrooms_square  |  None |  936.992488952   |  None  |
|    bathrooms     |  None |  25409.5872664   |  None  |
|   sqft_living    |  None |  39.1151357034   |  None  |
| sqft_living_sqrt |  None |  1124.65014372   |  None  |
|     sqft_lot     |  None | 0.00348303525386 |  None  |
|  sqft_lot_sqrt   |  None |  148.258351477   |  None  |
|      floors      |  None |  21204.3305187   |  None  |
|  floors_square   |  None |  12915.5199324   |  None  |
|    waterfront    |  None |  601905.261191   |  None  |
|       view       |  None |  93312.8351802   |  None  |
|    condition     |  None |  6609.03329762   |  None  |
|      grade       |  None |  6206.93934177   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  18993.6199083   |  None  |
|     bedrooms     |  None |  7936.96315631   |  None  |
| bedrooms_square  |  None |  936.990568583   |  None  |
|    bathrooms     |  None |  25409.5836239   |  None  |
|   sqft_living    |  None |  39.1151342262   |  None  |
| sqft_living_sqrt |  None |  1124.64999282   |  None  |
|     sqft_lot     |  None | 0.00348176198003 |  None  |
|  sqft_lot_sqrt   |  None |  148.258265129   |  None  |
|      floors      |  None |  21204.3197113   |  None  |
|  floors_square   |  None |  12915.5103143   |  None  |
|    waterfront    |  None |  601904.533104   |  None  |
|       view       |  None |  93312.7868419   |  None  |
|    condition     |  None |  6609.02802336   |  None  |
|      grade       |  None |  6206.93792184   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  18994.0378912   |  None  |
|     bedrooms     |  None |  7936.95334592   |  None  |
| bedrooms_square  |  None |  936.984495845   |  None  |
|    bathrooms     |  None |  25409.5721055   |  None  |
|   sqft_living    |  None |   39.115129555   |  None  |
| sqft_living_sqrt |  None |  1124.64951563   |  None  |
|     sqft_lot     |  None | 0.00347773553446 |  None  |
|  sqft_lot_sqrt   |  None |  148.257992074   |  None  |
|      floors      |  None |   21204.285535   |  None  |
|  floors_square   |  None |  12915.4798991   |  None  |
|    waterfront    |  None |  601902.230693   |  None  |
|       view       |  None |  93312.6339828   |  None  |
|    condition     |  None |  6609.01134468   |  None  |
|      grade       |  None |  6206.93343165   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+-----------------+--------+
|       name       | index |      value      | stderr |
+------------------+-------+-----------------+--------+
|   (intercept)    |  None |  18995.3596694  |  None  |
|     bedrooms     |  None |  7936.92232272  |  None  |
| bedrooms_square  |  None |  936.965292161  |  None  |
|    bathrooms     |  None |  25409.5356808  |  None  |
|   sqft_living    |  None |  39.1151147834  |  None  |
| sqft_living_sqrt |  None |  1124.64800663  |  None  |
|     sqft_lot     |  None | 0.0034650027953 |  None  |
|  sqft_lot_sqrt   |  None |  148.257128596  |  None  |
|      floors      |  None |  21204.1774601  |  None  |
|  floors_square   |  None |  12915.3837177  |  None  |
|    waterfront    |  None |   601894.94983  |  None  |
|       view       |  None |  93312.1505999  |  None  |
|    condition     |  None |  6608.95860207  |  None  |
|      grade       |  None |   6206.9192324  |  None  |
|    sqft_above    |  None |  43.2869767421  |  

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |   18999.539499   |  None  |
|     bedrooms     |  None |  7936.82421875   |  None  |
| bedrooms_square  |  None |  936.904564783   |  None  |
|    bathrooms     |  None |  25409.4204959   |  None  |
|   sqft_living    |  None |  39.1150680714   |  None  |
| sqft_living_sqrt |  None |  1124.64323476   |  None  |
|     sqft_lot     |  None | 0.00342473834052 |  None  |
|  sqft_lot_sqrt   |  None |  148.254398041   |  None  |
|      floors      |  None |  21203.8356975   |  None  |
|  floors_square   |  None |  12915.0795655   |  None  |
|    waterfront    |  None |  601871.925719   |  None  |
|       view       |  None |  93310.6220089   |  None  |
|    condition     |  None |  6608.79181529   |  None  |
|      grade       |  None |  6206.87433044   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  19012.7572816   |  None  |
|     bedrooms     |  None |  7936.51398675   |  None  |
| bedrooms_square  |  None |  936.712527935   |  None  |
|    bathrooms     |  None |  25409.0562492   |  None  |
|   sqft_living    |  None |  39.1149203552   |  None  |
| sqft_living_sqrt |  None |  1124.62814478   |  None  |
|     sqft_lot     |  None | 0.00329741094431 |  None  |
|  sqft_lot_sqrt   |  None |  148.245763265   |  None  |
|      floors      |  None |   21202.754949   |  None  |
|  floors_square   |  None |  12914.1177518   |  None  |
|    waterfront    |  None |  601799.117081   |  None  |
|       view       |  None |  93305.7881796   |  None  |
|    condition     |  None |  6608.26438913   |  None  |
|      grade       |  None |  6206.73233795   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  19054.5555764   |  None  |
|     bedrooms     |  None |  7935.53294712   |  None  |
| bedrooms_square  |  None |  936.105254156   |  None  |
|    bathrooms     |  None |  25407.9044002   |  None  |
|   sqft_living    |  None |  39.1144532356   |  None  |
| sqft_living_sqrt |  None |  1124.58042608   |  None  |
|     sqft_lot     |  None | 0.00289476640108 |  None  |
|  sqft_lot_sqrt   |  None |  148.218457709   |  None  |
|      floors      |  None |  21199.3373226   |  None  |
|  floors_square   |  None |  12911.0762302   |  None  |
|    waterfront    |  None |  601568.875973   |  None  |
|       view       |  None |  93290.5022704   |  None  |
|    condition     |  None |  6606.59652132   |  None  |
|      grade       |  None |  6206.28331834   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  19186.7333988   |  None  |
|     bedrooms     |  None |   7932.4306272   |  None  |
| bedrooms_square  |  None |   934.18488573   |  None  |
|    bathrooms     |  None |  25404.2619334   |  None  |
|   sqft_living    |  None |  39.1129760734   |  None  |
| sqft_living_sqrt |  None |  1124.42952628   |  None  |
|     sqft_lot     |  None | 0.00162149247578 |  None  |
|  sqft_lot_sqrt   |  None |  148.132109954   |  None  |
|      floors      |  None |  21188.5298383   |  None  |
|  floors_square   |  None |  12901.4580936   |  None  |
|    waterfront    |  None |  600840.789617   |  None  |
|       view       |  None |  93242.1639781   |  None  |
|    condition     |  None |  6601.32225988   |  None  |
|      grade       |  None |  6204.86339356   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+--------------------+--------+
|       name       | index |       value        | stderr |
+------------------+-------+--------------------+--------+
|   (intercept)    |  None |   19604.7163508    |  None  |
|     bedrooms     |  None |   7922.62023075    |  None  |
| bedrooms_square  |  None |   928.112147889    |  None  |
|    bathrooms     |  None |    25392.743443    |  None  |
|   sqft_living    |  None |   39.1083048767    |  None  |
| sqft_living_sqrt |  None |   1123.95233925    |  None  |
|     sqft_lot     |  None | -0.000823987151992 |  None  |
|  sqft_lot_sqrt   |  None |   147.859054391    |  None  |
|      floors      |  None |    21154.353574    |  None  |
|  floors_square   |  None |    12871.042877    |  None  |
|    waterfront    |  None |   598538.378522    |  None  |
|       view       |  None |   93089.3048849    |  None  |
|    condition     |  None |   6584.64358167    |  None  |
|      grade       |  None |   6200.37319739    |  None 

In [58]:
l1_penalty_errors_sframe.print_rows(num_rows=len(l1_penalty_set))

+---------------+-------------------+
|    penalty    |       errors      |
+---------------+-------------------+
|      10.0     | 6.25766285142e+14 |
| 31.6227766017 | 6.25766285362e+14 |
|     100.0     | 6.25766286058e+14 |
| 316.227766017 | 6.25766288257e+14 |
|     1000.0    | 6.25766295212e+14 |
| 3162.27766017 | 6.25766317206e+14 |
|    10000.0    | 6.25766386761e+14 |
| 31622.7766017 | 6.25766606749e+14 |
|    100000.0   | 6.25767302792e+14 |
| 316227.766017 | 6.25769507644e+14 |
|   1000000.0   | 6.25776517727e+14 |
| 3162277.66017 | 6.25799062845e+14 |
|   10000000.0  | 6.25883719085e+14 |
+---------------+-------------------+
[13 rows x 2 columns]



In [59]:
def find_best_l1_penalty(l1_penalty_errors_sframe):
    min_validation_error = l1_penalty_errors_sframe['errors'].min()
    return l1_penalty_errors_sframe[l1_penalty_errors_sframe['errors'] == min_validation_error][0]['penalty']

***QUIZ QUESTION***
What was the best value for the `l1_penalty`?

In [60]:
best_l1_penalty = find_best_l1_penalty(l1_penalty_errors_sframe)
print best_l1_penalty

10.0
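
The same selection can be cross-checked with `numpy.argmin`, using the validation RSS values copied from the table printed above. This is an alternative sketch, not the notebook's own method.

```python
import numpy

penalties = numpy.logspace(1, 7, num=13)

# Validation RSS values copied from the printed penalty/errors table.
validation_rss = numpy.array([
    6.25766285142e+14, 6.25766285362e+14, 6.25766286058e+14,
    6.25766288257e+14, 6.25766295212e+14, 6.25766317206e+14,
    6.25766386761e+14, 6.25766606749e+14, 6.25767302792e+14,
    6.25769507644e+14, 6.25776517727e+14, 6.25799062845e+14,
    6.25883719085e+14,
])

best_index = int(numpy.argmin(validation_rss))
best_penalty = penalties[best_index]

# Relative gap between the worst and best validation RSS in this range.
relative_spread = (validation_rss.max() - validation_rss.min()) / validation_rss.min()
```

The validation RSS rises steadily with the penalty over this range, so the smallest penalty wins. The whole range differs by only about 0.02%, which is consistent with the coefficient tables above barely changing from one penalty to the next.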


***QUIZ QUESTION***
Also, using this value of L1 penalty, how many nonzero weights do you have?

In [61]:
model = graphlab.linear_regression.create(training, target=output_name, features=all_features,
                                              validation_set=None, 
                                              l2_penalty=0., l1_penalty=best_l1_penalty)
model.get("coefficients").print_rows(num_rows = len(all_features)+1)

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  18993.4272128   |  None  |
|     bedrooms     |  None |  7936.96767903   |  None  |
| bedrooms_square  |  None |  936.993368193   |  None  |
|    bathrooms     |  None |  25409.5889341   |  None  |
|   sqft_living    |  None |  39.1151363797   |  None  |
| sqft_living_sqrt |  None |  1124.65021281   |  None  |
|     sqft_lot     |  None | 0.00348361822299 |  None  |
|  sqft_lot_sqrt   |  None |  148.258391011   |  None  |
|      floors      |  None |   21204.335467   |  None  |
|  floors_square   |  None |  12915.5243361   |  None  |
|    waterfront    |  None |  601905.594545   |  None  |
|       view       |  None |  93312.8573119   |  None  |
|    condition     |  None |  6609.03571245   |  None  |
|      grade       |  None |  6206.93999188   |  None  |
|    sqft_above    |  None |  4

# Limit the number of nonzero weights

What if we absolutely wanted to limit ourselves to, say, 7 features? This may be important if we want to derive a "rule of thumb": an interpretable model that has only a few features in it.

In this section, you are going to implement a simple, two-phase procedure to achieve this goal:
1. Explore a large range of `l1_penalty` values to find a narrow region of `l1_penalty` values where models are likely to have the desired number of non-zero weights.
2. Further explore the narrow region you found to find a good value for `l1_penalty` that achieves the desired sparsity.  Here, we will again use a validation set to choose the best value for `l1_penalty`.
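
The two phases can be sketched on made-up numbers as follows. The penalty grid and nonzero counts below are hypothetical; in the notebook they would come from fitting one model per penalty. The sketch also assumes the nonzero count does not increase as the penalty grows, and the choice of a 20-point linear grid for the second phase is an illustration only.

```python
import numpy

max_nonzeros = 7

# Made-up (penalty, nonzero-count) pairs from a coarse sweep.
coarse_penalties = numpy.array([1e8, 1e9, 2e9, 3e9, 4e9, 1e10])
coarse_nnz = numpy.array([18, 12, 9, 6, 5, 2])

# Phase 1: bracket the target sparsity.
# Largest penalty that still leaves too many nonzeros ...
l1_penalty_min = coarse_penalties[coarse_nnz > max_nonzeros].max()
# ... and smallest penalty that already leaves too few.
l1_penalty_max = coarse_penalties[coarse_nnz < max_nonzeros].min()

# Phase 2: a finer grid inside the bracket, to be scored on validation data.
fine_penalties = numpy.linspace(l1_penalty_min, l1_penalty_max, 20)
```

With these counts the bracket is 2e9 to 3e9: somewhere between those two penalties the model passes through exactly 7 nonzero weights.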

In [62]:
max_nonzeros = 7

## Exploring the larger range of values to find a narrow range with the desired sparsity

Let's define a wide range of possible `l1_penalty_values`:

In [63]:
l1_penalty_values = numpy.logspace(8, 10, num=20)
print l1_penalty_values

[  1.00000000e+08   1.27427499e+08   1.62377674e+08   2.06913808e+08
   2.63665090e+08   3.35981829e+08   4.28133240e+08   5.45559478e+08
   6.95192796e+08   8.85866790e+08   1.12883789e+09   1.43844989e+09
   1.83298071e+09   2.33572147e+09   2.97635144e+09   3.79269019e+09
   4.83293024e+09   6.15848211e+09   7.84759970e+09   1.00000000e+10]


Now, implement a loop that searches through this space of possible `l1_penalty` values:

* For `l1_penalty` in `numpy.logspace(8, 10, num=20)`:
    * Fit a regression model with a given `l1_penalty` on TRAIN data. Specify `l1_penalty=l1_penalty` and `l2_penalty=0.` in the parameter list. When you call `linear_regression.create()` make sure you set `validation_set = None`
    * Extract the weights of the model and count the number of nonzeros. Save the number of nonzeros to a list.
        * *Hint: `model['coefficients']['value']` gives you an SArray with the parameters you learned.  If you call the method `.nnz()` on it, you will find the number of non-zero parameters!* 
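
One detail worth keeping in mind when counting: the coefficients tables printed in this notebook include an `(intercept)` row, so a count over the whole `value` column includes the intercept whenever it is nonzero. The NumPy sketch below illustrates the difference on a made-up weight vector; `numpy.count_nonzero` stands in for `.nnz()` here.

```python
import numpy

# Made-up weight vector: intercept first, then four feature weights.
weights = numpy.array([274873.0, 0.0, 8468.5, 0.0, 24.4])

nnz_all = int(numpy.count_nonzero(weights))           # counts the intercept too
nnz_features = int(numpy.count_nonzero(weights[1:]))  # features only
```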

In [64]:
def build_l1_penalty_coefficients_sframe(training_data, features_list, output_name, l1_penalty_set):
    l1_penalty_coefficients_sframe = graphlab.SFrame()
    l1_penalty_coefficients_sframe['penalty'] = graphlab.SArray(l1_penalty_set)
    coefficients_array = []
    nnz_array = []
    for l1_penalty in l1_penalty_set:
        model = graphlab.linear_regression.create(training_data, target=output_name, features=features_list,
                                              validation_set=None, 
                                              l2_penalty=0., l1_penalty=l1_penalty)
        coefficients = model['coefficients']['value']
        coefficients_array.append(coefficients)
        nnz_array.append(coefficients.nnz())
        model.get("coefficients").print_rows(num_rows = len(features_list)+1)
    l1_penalty_coefficients_sframe['coefficients'] = graphlab.SArray(coefficients_array)
    l1_penalty_coefficients_sframe['nnz'] = graphlab.SArray(nnz_array)
    return l1_penalty_coefficients_sframe

In [65]:
l1_penalty_coefficients_sframe = build_l1_penalty_coefficients_sframe(training, all_features, output_name, l1_penalty_values)

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  25090.9173672   |  None  |
|     bedrooms     |  None |   7789.1770611   |  None  |
| bedrooms_square  |  None |  847.559686943   |  None  |
|    bathrooms     |  None |  25234.2091945   |  None  |
|   sqft_living    |  None |  39.0394459636   |  None  |
| sqft_living_sqrt |  None |  1117.31189557   |  None  |
|     sqft_lot     |  None | -0.0256861182399 |  None  |
|  sqft_lot_sqrt   |  None |   143.98899197   |  None  |
|      floors      |  None |  20695.3592396   |  None  |
|  floors_square   |  None |  12466.6906503   |  None  |
|    waterfront    |  None |  568204.644584   |  None  |
|       view       |  None |  91066.9428088   |  None  |
|    condition     |  None |  6360.78092625   |  None  |
|      grade       |  None |  6139.21280565   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  26746.6619366   |  None  |
|     bedrooms     |  None |  7743.97904785   |  None  |
| bedrooms_square  |  None |  822.358945251   |  None  |
|    bathrooms     |  None |  25178.6259306   |  None  |
|   sqft_living    |  None |  39.0107181353   |  None  |
| sqft_living_sqrt |  None |  1114.91071592   |  None  |
|     sqft_lot     |  None | -0.0186630737228 |  None  |
|  sqft_lot_sqrt   |  None |  142.519797841   |  None  |
|      floors      |  None |  20545.8673047   |  None  |
|  floors_square   |  None |  12339.2452502   |  None  |
|    waterfront    |  None |  558930.247072   |  None  |
|       view       |  None |  90439.7218512   |  None  |
|    condition     |  None |  6288.00946554   |  None  |
|      grade       |  None |  6118.41232062   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  28873.1810166   |  None  |
|     bedrooms     |  None |  7691.04707569   |  None  |
| bedrooms_square  |  None |  790.917579684   |  None  |
|    bathrooms     |  None |  25115.2785345   |  None  |
|   sqft_living    |  None |  38.9820788132   |  None  |
| sqft_living_sqrt |  None |  1112.23941465   |  None  |
|     sqft_lot     |  None | -0.0247373605808 |  None  |
|  sqft_lot_sqrt   |  None |  140.945844751   |  None  |
|      floors      |  None |  20365.2658969   |  None  |
|  floors_square   |  None |  12181.1862577   |  None  |
|    waterfront    |  None |  547143.180179   |  None  |
|       view       |  None |  89651.6923916   |  None  |
|    condition     |  None |   6199.959966    |  None  |
|      grade       |  None |  6094.13138655   |  None  |
|    sqft_above    |  None |   

+------------------+-------+------------------+--------+
|       name       | index |      value       | stderr |
+------------------+-------+------------------+--------+
|   (intercept)    |  None |  31564.6064733   |  None  |
|     bedrooms     |  None |  7618.56776464   |  None  |
| bedrooms_square  |  None |  750.170954883   |  None  |
|    bathrooms     |  None |  25026.1774076   |  None  |
|   sqft_living    |  None |  38.9359152531   |  None  |
| sqft_living_sqrt |  None |  1108.38631937   |  None  |
|     sqft_lot     |  None | -0.0177447627514 |  None  |
|  sqft_lot_sqrt   |  None |  138.385325946   |  None  |
|      floors      |  None |   20124.74672    |  None  |
|  floors_square   |  None |  11975.4977617   |  None  |
|    waterfront    |  None |  532082.075627   |  None  |
|       view       |  None |  88631.0425555   |  None  |
|    condition     |  None |   6082.6260082   |  None  |
|      grade       |  None |  6060.67791122   |  None  |
|    sqft_above    |  None |  4

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 34954.6717432 |  None  |
|     bedrooms     |  None | 7515.37748215 |  None  |
| bedrooms_square  |  None | 696.785359477 |  None  |
|    bathrooms     |  None | 24894.5560924 |  None  |
|   sqft_living    |  None |  38.856140639 |  None  |
| sqft_living_sqrt |  None | 1102.50564365 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None | 133.900307194 |  None  |
|      floors      |  None | 19795.9303312 |  None  |
|  floors_square   |  None | 11704.2280548 |  None  |
|    waterfront    |  None | 512800.580139 |  None  |
|       view       |  None | 87294.4420808 |  None  |
|    condition     |  None | 5922.04189535 |  None  |
|      grade       |  None |  6012.6238341 |  None  |
|    sqft_above    |  None | 42.5532436917 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 39314.6150486 |  None  |
|     bedrooms     |  None | 7394.88288952 |  None  |
| bedrooms_square  |  None | 630.253031222 |  None  |
|    bathrooms     |  None | 24745.1217898 |  None  |
|   sqft_living    |  None | 38.7755060169 |  None  |
| sqft_living_sqrt |  None | 1095.99069713 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None | 129.371608166 |  None  |
|      floors      |  None | 19399.6670071 |  None  |
|  floors_square   |  None | 11367.9574719 |  None  |
|    waterfront    |  None | 488319.748029 |  None  |
|       view       |  None | 85626.7866561 |  None  |
|    condition     |  None | 5728.63216536 |  None  |
|      grade       |  None | 5956.87175569 |  None  |
|    sqft_above    |  None |  42.340283995 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 44909.3379183 |  None  |
|     bedrooms     |  None | 7253.29314819 |  None  |
| bedrooms_square  |  None | 547.592462374 |  None  |
|    bathrooms     |  None | 24570.6858953 |  None  |
|   sqft_living    |  None | 38.6822812723 |  None  |
| sqft_living_sqrt |  None | 1088.38186842 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None | 121.987113653 |  None  |
|      floors      |  None | 18922.8216139 |  None  |
|  floors_square   |  None | 10954.4718718 |  None  |
|    waterfront    |  None |  457135.25712 |  None  |
|       view       |  None | 83487.1278449 |  None  |
|    condition     |  None | 5493.39678143 |  None  |
|      grade       |  None | 5890.40390709 |  None  |
|    sqft_above    |  None | 42.0766862594 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 52030.0300806 |  None  |
|     bedrooms     |  None | 7070.21661567 |  None  |
| bedrooms_square  |  None | 441.775084091 |  None  |
|    bathrooms     |  None |  24344.882055 |  None  |
|   sqft_living    |  None |  38.561505028 |  None  |
| sqft_living_sqrt |  None |  1078.536814  |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None | 112.959925773 |  None  |
|      floors      |  None | 18308.8672364 |  None  |
|  floors_square   |  None | 10424.1285297 |  None  |
|    waterfront    |  None | 417398.763136 |  None  |
|       view       |  None | 80764.7024479 |  None  |
|    condition     |  None | 5191.18838206 |  None  |
|      grade       |  None | 5804.71090167 |  None  |
|    sqft_above    |  None | 41.7392680886 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 61121.2820244 |  None  |
|     bedrooms     |  None | 6842.36051556 |  None  |
| bedrooms_square  |  None | 307.903660162 |  None  |
|    bathrooms     |  None | 24063.6095875 |  None  |
|   sqft_living    |  None | 38.4096473907 |  None  |
| sqft_living_sqrt |  None | 1066.24937649 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None | 100.193559291 |  None  |
|      floors      |  None | 17539.4407831 |  None  |
|  floors_square   |  None | 9755.40223411 |  None  |
|    waterfront    |  None | 366774.084543 |  None  |
|       view       |  None | 77282.2988126 |  None  |
|    condition     |  None | 4811.23997193 |  None  |
|      grade       |  None | 5697.41313174 |  None  |
|    sqft_above    |  None | 41.3094875827 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 72653.7628357 |  None  |
|     bedrooms     |  None | 6537.49853878 |  None  |
| bedrooms_square  |  None | 135.203648875 |  None  |
|    bathrooms     |  None | 23681.2367227 |  None  |
|   sqft_living    |  None | 38.1918646961 |  None  |
| sqft_living_sqrt |  None | 1049.39149651 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None | 83.2210162526 |  None  |
|      floors      |  None | 16527.6347038 |  None  |
|  floors_square   |  None | 8889.54463876 |  None  |
|    waterfront    |  None | 301854.633614 |  None  |
|       view       |  None | 72780.6760936 |  None  |
|    condition     |  None | 4313.38209201 |  None  |
|      grade       |  None | 5553.89704035 |  None  |
|    sqft_above    |  None | 40.7353790102 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 87298.8276811 |  None  |
|     bedrooms     |  None | 6137.40172251 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None |  23168.695279 |  None  |
|   sqft_living    |  None |  37.888523607 |  None  |
| sqft_living_sqrt |  None | 1026.70969952 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None | 60.0719852593 |  None  |
|      floors      |  None | 15213.0294203 |  None  |
|  floors_square   |  None | 7778.83538767 |  None  |
|    waterfront    |  None |  217190.05178 |  None  |
|       view       |  None | 66808.7395328 |  None  |
|    condition     |  None | 3671.72833844 |  None  |
|      grade       |  None | 5365.15461244 |  None  |
|    sqft_above    |  None |  39.982281339 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 105843.800016 |  None  |
|     bedrooms     |  None | 5589.85410206 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 22460.6559313 |  None  |
|   sqft_living    |  None | 37.4484553711 |  None  |
| sqft_living_sqrt |  None | 995.139815925 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None | 29.0615219257 |  None  |
|      floors      |  None | 13469.2490939 |  None  |
|  floors_square   |  None | 6335.03673756 |  None  |
|    waterfront    |  None | 108231.302491 |  None  |
|       view       |  None | 59040.0488287 |  None  |
|    condition     |  None | 2826.45412255 |  None  |
|      grade       |  None | 5110.13478059 |  None  |
|    sqft_above    |  None | 38.9670974649 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 129178.714228 |  None  |
|     bedrooms     |  None | 4784.47090092 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 21417.8224739 |  None  |
|   sqft_living    |  None | 36.7460694226 |  None  |
| sqft_living_sqrt |  None |  947.9643281  |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None | 11067.1541719 |  None  |
|  floors_square   |  None | 4415.50904975 |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None | 48905.9159529 |  None  |
|    condition     |  None | 1667.84083599 |  None  |
|      grade       |  None | 4746.51594025 |  None  |
|    sqft_above    |  None | 37.5177051667 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 158796.200904 |  None  |
|     bedrooms     |  None | 3707.13962925 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 19985.9978368 |  None  |
|   sqft_living    |  None | 35.6973978709 |  None  |
| sqft_living_sqrt |  None | 882.788902805 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |  7920.1252361 |  None  |
|  floors_square   |  None | 1926.93051575 |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None | 32825.2978687 |  None  |
|    condition     |  None | 150.014330496 |  None  |
|      grade       |  None | 4258.66302235 |  None  |
|    sqft_above    |  None | 35.5274768688 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 196100.937806 |  None  |
|     bedrooms     |  None | 2181.57432107 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 17962.6966612 |  None  |
|   sqft_living    |  None | 34.1424656512 |  None  |
| sqft_living_sqrt |  None | 789.319789078 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |  3665.9308176 |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None | 11333.8410308 |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 3578.90040044 |  None  |
|    sqft_above    |  None | 32.7432013718 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None |  240309.75932 |  None  |
|     bedrooms     |  None |      0.0      |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 13840.6399577 |  None  |
|   sqft_living    |  None | 30.5583588298 |  None  |
| sqft_living_sqrt |  None | 592.199469213 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 2265.12052556 |  None  |
|    sqft_above    |  None | 27.4878726568 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 291783.678065 |  None  |
|     bedrooms     |  None |      0.0      |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 6104.32576546 |  None  |
|   sqft_living    |  None | 23.1701243021 |  None  |
| sqft_living_sqrt |  None | 215.030934365 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None |      0.0      |  None  |
|    sqft_above    |  None | 18.4074740154 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 352383.220392 |  None  |
|     bedrooms     |  None |      0.0      |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None |      0.0      |  None  |
|   sqft_living    |  None | 11.5088257813 |  None  |
| sqft_living_sqrt |  None |      0.0      |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None |      0.0      |  None  |
|    sqft_above    |  None | 4.66343503593 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 507987.962744 |  None  |
|     bedrooms     |  None |      0.0      |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None |      0.0      |  None  |
|   sqft_living    |  None |      0.0      |  None  |
| sqft_living_sqrt |  None |      0.0      |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None |      0.0      |  None  |
|    sqft_above    |  None |      0.0      |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 564482.136844 |  None  |
|     bedrooms     |  None |      0.0      |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None |      0.0      |  None  |
|   sqft_living    |  None |      0.0      |  None  |
| sqft_living_sqrt |  None |      0.0      |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None |      0.0      |  None  |
|    sqft_above    |  None |      0.0      |  None  |
|  sqft_basement   |  None |

Out of this large range, we want to find the two ends of our desired narrow range of `l1_penalty`. At one end, we will have `l1_penalty` values that produce too few non-zero weights, and at the other end, `l1_penalty` values that produce too many non-zero weights.

More formally, find:
* The largest `l1_penalty` that has more non-zeros than `max_nonzeros` (if we pick a penalty smaller than this value, we will definitely have too many non-zero weights)
    * Store this value in the variable `l1_penalty_min` (we will use it later)
* The smallest `l1_penalty` that has fewer non-zeros than `max_nonzeros` (if we pick a penalty larger than this value, we will definitely have too few non-zero weights)
    * Store this value in the variable `l1_penalty_max` (we will use it later)


*Hint: there are many ways to do this, e.g.:*
* Programmatically within the loop above
* Creating a list with the number of non-zeros for each value of `l1_penalty` and inspecting it to find the appropriate boundaries.
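
The second hint can be sketched in plain Python, independently of GraphLab Create. This is only an illustration: `find_penalty_bounds` is a hypothetical helper name, and the `(penalty, nnz)` pairs and the `max_nonzeros` value below are made-up numbers, not results from this notebook.

```python
def find_penalty_bounds(penalty_nnz_pairs, max_nonzeros):
    """Return (l1_penalty_min, l1_penalty_max) bracketing the target sparsity.

    penalty_nnz_pairs: iterable of (l1_penalty, number_of_nonzero_weights).
    Either bound is None when no penalty satisfies its condition.
    """
    pairs = list(penalty_nnz_pairs)  # allow generators to be scanned twice
    # Penalties that leave too many non-zero weights
    too_dense = [p for p, nnz in pairs if nnz > max_nonzeros]
    # Penalties that leave too few non-zero weights
    too_sparse = [p for p, nnz in pairs if nnz < max_nonzeros]
    l1_penalty_min = max(too_dense) if too_dense else None
    l1_penalty_max = min(too_sparse) if too_sparse else None
    return l1_penalty_min, l1_penalty_max


# Hypothetical values, for illustration only
example_pairs = [(1e8, 18), (1e9, 15), (3e9, 10), (4e9, 6), (1e10, 1)]
bounds = find_penalty_bounds(example_pairs, max_nonzeros=7)
```

With these made-up pairs, the bounds bracket the region between the last penalty that is still too dense and the first one that is already too sparse; the narrow search then happens inside that interval.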

In [66]:
l1_penalty_coefficients_sframe.print_rows(num_rows=len(l1_penalty_coefficients_sframe) + 1)

+---------------+-------------------------------+-----+
|    penalty    |          coefficients         | nnz |
+---------------+-------------------------------+-----+
|  100000000.0  | [25090.9173672, 7789.17706... |  18 |
|  127427498.57 | [26746.6619366, 7743.97904... |  18 |
| 162377673.919 | [28873.1810166, 7691.04707... |  18 |
| 206913808.111 | [31564.6064733, 7618.56776... |  18 |
| 263665089.873 | [34954.6717432, 7515.37748... |  17 |
| 335981828.628 | [39314.6150486, 7394.88288... |  17 |
| 428133239.872 | [44909.3379183, 7253.29314... |  17 |
| 545559478.117 | [52030.0300806, 7070.21661... |  17 |
| 695192796.178 | [61121.2820244, 6842.36051... |  17 |
|  885866790.41 | [72653.7628357, 6537.49853... |  16 |
| 1128837891.68 | [87298.8276811, 6137.40172... |  15 |
| 1438449888.29 | [105843.800016, 5589.85410... |  15 |
| 1832980710.83 | [129178.714228, 4784.47090... |  13 |
| 2335721469.09 | [158796.200904, 3707.13962... |  12 |
| 2976351441.63 | [196100.937806, 2181.57432... 

In [67]:
l1_penalty_min_sframe = l1_penalty_coefficients_sframe[l1_penalty_coefficients_sframe['nnz'] > max_nonzeros]
l1_penalty_min = l1_penalty_min_sframe['penalty'].max()
print l1_penalty_min

2976351441.63


In [68]:
l1_penalty_max_sframe = l1_penalty_coefficients_sframe[l1_penalty_coefficients_sframe['nnz'] < max_nonzeros]
l1_penalty_max = l1_penalty_max_sframe['penalty'].min()
print l1_penalty_max

3792690190.73


***QUIZ QUESTION.*** What values did you find for `l1_penalty_min` and `l1_penalty_max`, respectively? 

## Exploring the narrow range of values to find the solution with the right number of non-zeros that has lowest RSS on the validation set 

We will now explore the narrow region of `l1_penalty` values we found:

In [69]:
l1_penalty_values = numpy.linspace(l1_penalty_min,l1_penalty_max,20)
print l1_penalty_values

[  2.97635144e+09   3.01931664e+09   3.06228184e+09   3.10524703e+09
   3.14821223e+09   3.19117743e+09   3.23414263e+09   3.27710782e+09
   3.32007302e+09   3.36303822e+09   3.40600341e+09   3.44896861e+09
   3.49193381e+09   3.53489901e+09   3.57786420e+09   3.62082940e+09
   3.66379460e+09   3.70675980e+09   3.74972499e+09   3.79269019e+09]


* For `l1_penalty` in `numpy.linspace(l1_penalty_min,l1_penalty_max,20)`:
    * Fit a regression model with a given `l1_penalty` on TRAIN data. Specify `l1_penalty=l1_penalty` and `l2_penalty=0.` in the parameter list. When you call `linear_regression.create()`, make sure you set `validation_set = None`.
    * Measure the RSS of the learned model on the VALIDATION set

Find the model that has the lowest RSS on the VALIDATION set and has sparsity *equal* to `max_nonzeros`.
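
The selection rule can be sketched in plain Python as a filter followed by a minimum. This is only an illustration: `select_best_model` is a hypothetical helper, and the records and the `max_nonzeros` value are made-up numbers rather than results from this notebook.

```python
def select_best_model(records, max_nonzeros):
    """Return the record with the lowest validation RSS among those whose
    number of non-zero weights equals max_nonzeros, or None if there is none."""
    candidates = [r for r in records if r['nnz'] == max_nonzeros]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r['rss'])


# Hypothetical records, for illustration only
example_records = [
    {'penalty': 3.0e9, 'rss': 1.05e15, 'nnz': 10},
    {'penalty': 3.4e9, 'rss': 1.04e15, 'nnz': 7},
    {'penalty': 3.5e9, 'rss': 1.06e15, 'nnz': 7},
    {'penalty': 3.8e9, 'rss': 1.02e15, 'nnz': 6},
]
best = select_best_model(example_records, max_nonzeros=7)
```

Note that the last record has the lowest RSS overall but is excluded because its sparsity does not match; the sparsity constraint is applied first, and RSS only ranks the models that satisfy it.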

In [70]:
def build_lasso_model_sframe(training_data, validation_data, features_list, output_name, l1_penalty_set):
    # For each L1 penalty, record the validation RSS, the learned coefficients,
    # and the number of non-zero coefficients (the count includes the intercept).
    lasso_model_sframe = graphlab.SFrame()
    lasso_model_sframe['penalty'] = graphlab.SArray(l1_penalty_set)
    validation_errors = []
    coefficients_array = []
    nnz_array = []
    for l1_penalty in l1_penalty_set:
        # Fit a pure LASSO model (no L2 penalty) on the training data only.
        model = graphlab.linear_regression.create(training_data, target=output_name, features=features_list,
                                                  validation_set=None,
                                                  l2_penalty=0., l1_penalty=l1_penalty)

        # Print the full coefficient table (all features plus the intercept).
        model.get("coefficients").print_rows(num_rows=len(features_list) + 1)
        # rss() is a helper defined elsewhere in the notebook.
        validation_errors.append(rss(model, validation_data, features_list, output_name))
        coefficients = model['coefficients']['value']
        coefficients_array.append(coefficients)
        nnz_array.append(coefficients.nnz())
    # Assemble one row per penalty value.
    lasso_model_sframe['errors'] = graphlab.SArray(validation_errors)
    lasso_model_sframe['coefficients'] = graphlab.SArray(coefficients_array)
    lasso_model_sframe['nnz'] = graphlab.SArray(nnz_array)
    return lasso_model_sframe
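
The function above records two quantities per penalty: the validation RSS and the number of non-zero coefficients. The `rss` helper itself is not shown in this section, so the following is only a plain-Python sketch of what those two quantities mean, with hypothetical function names and made-up numbers.

```python
def residual_sum_of_squares(predictions, targets):
    """Sum of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for p, t in zip(predictions, targets))


def count_nonzeros(weights):
    """Number of weights that are not exactly zero."""
    return sum(1 for w in weights if w != 0.0)


# Made-up numbers, for illustration only
example_rss = residual_sum_of_squares([2.0, 4.0, 7.0], [1.0, 4.0, 5.0])
example_nnz = count_nonzeros([196100.9, 2181.6, 0.0, 17962.7, 0.0])
```

In the notebook's own code the non-zero count is taken over `model['coefficients']['value']`, which includes the `(intercept)` row, so the intercept counts toward `nnz`.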

In [71]:
lasso_model_sframe = build_lasso_model_sframe(training, validation, all_features, output_name, l1_penalty_values)

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 196100.937806 |  None  |
|     bedrooms     |  None | 2181.57432107 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 17962.6966612 |  None  |
|   sqft_living    |  None | 34.1424656512 |  None  |
| sqft_living_sqrt |  None | 789.319789078 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |  3665.9308176 |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None | 11333.8410308 |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 3578.90040044 |  None  |
|    sqft_above    |  None | 32.7432013718 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 198563.246218 |  None  |
|     bedrooms     |  None | 2067.01515556 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 17810.3875978 |  None  |
|   sqft_living    |  None | 34.0215103056 |  None  |
| sqft_living_sqrt |  None | 782.182317695 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None | 3358.20330522 |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None | 9876.73760812 |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 3528.25500887 |  None  |
|    sqft_above    |  None | 32.5372329212 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 201025.560405 |  None  |
|     bedrooms     |  None | 1952.47648961 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 17658.0894965 |  None  |
|   sqft_living    |  None | 33.9005653546 |  None  |
| sqft_living_sqrt |  None |  775.04542855 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None | 3050.47219404 |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None | 8418.98705902 |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 3477.61216617 |  None  |
|    sqft_above    |  None | 32.3312655985 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None |   203443.601  |  None  |
|     bedrooms     |  None | 1825.78864269 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 17487.4923511 |  None  |
|   sqft_living    |  None | 33.7619937958 |  None  |
| sqft_living_sqrt |  None | 766.963165072 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None | 2716.80762179 |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |  6944.3661608 |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 3421.35035252 |  None  |
|    sqft_above    |  None | 32.1053231575 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 205805.899501 |  None  |
|     bedrooms     |  None | 1683.39344119 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 17291.0360282 |  None  |
|   sqft_living    |  None | 33.5989952082 |  None  |
| sqft_living_sqrt |  None | 757.631735464 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |  2342.2083308 |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |  5448.0235087 |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 3357.67138391 |  None  |
|    sqft_above    |  None | 31.8500798625 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 208163.075463 |  None  |
|     bedrooms     |  None | 1539.55136897 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 17092.1819668 |  None  |
|   sqft_living    |  None | 33.4337343767 |  None  |
| sqft_living_sqrt |  None | 748.185048398 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None | 1963.78499353 |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None | 3949.68534862 |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 3293.30808031 |  None  |
|    sqft_above    |  None | 31.5921117487 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 210520.251425 |  None  |
|     bedrooms     |  None | 1395.70929675 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 16893.3279054 |  None  |
|   sqft_living    |  None | 33.2684735452 |  None  |
| sqft_living_sqrt |  None | 738.738361333 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None | 1585.36165627 |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None | 2451.34718854 |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 3228.94477672 |  None  |
|    sqft_above    |  None | 31.3341436349 |  None  |
|  sqft_basement   |  None |

+------------------+-------+----------------+--------+
|       name       | index |     value      | stderr |
+------------------+-------+----------------+--------+
|   (intercept)    |  None | 212877.413342  |  None  |
|     bedrooms     |  None | 1251.86808159  |  None  |
| bedrooms_square  |  None |      0.0       |  None  |
|    bathrooms     |  None | 16694.4750289  |  None  |
|   sqft_living    |  None | 33.1032136983  |  None  |
| sqft_living_sqrt |  None | 729.291730553  |  None  |
|     sqft_lot     |  None |      0.0       |  None  |
|  sqft_lot_sqrt   |  None |      0.0       |  None  |
|      floors      |  None | 1206.94057377  |  None  |
|  floors_square   |  None |      0.0       |  None  |
|    waterfront    |  None |      0.0       |  None  |
|       view       |  None | 953.017956011  |  None  |
|    condition     |  None |      0.0       |  None  |
|      grade       |  None | 3164.58185662  |  None  |
|    sqft_above    |  None | 31.0761770582  |  None  |
|  sqft_ba

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 215235.603644 |  None  |
|     bedrooms     |  None | 1108.36955956 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 16496.2360732 |  None  |
|   sqft_living    |  None | 32.9384118477 |  None  |
| sqft_living_sqrt |  None | 719.868441786 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None | 829.560064725 |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 3100.35992021 |  None  |
|    sqft_above    |  None | 30.8186652907 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 217598.303815 |  None  |
|     bedrooms     |  None | 966.398500274 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 16300.7319463 |  None  |
|   sqft_living    |  None | 32.7756506777 |  None  |
| sqft_living_sqrt |  None | 710.549184481 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |  456.8160872  |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 3036.76651824 |  None  |
|    sqft_above    |  None | 30.5631824898 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None |  219929.85697 |  None  |
|     bedrooms     |  None | 815.276192211 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 16089.4931893 |  None  |
|   sqft_living    |  None |  32.595397602 |  None  |
| sqft_living_sqrt |  None | 700.439894317 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None | 64.8212452205 |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 2968.69981004 |  None  |
|    sqft_above    |  None | 30.2898334847 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 222253.192544 |  None  |
|     bedrooms     |  None | 661.722717782 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 15873.9572593 |  None  |
|   sqft_living    |  None | 32.4102214513 |  None  |
| sqft_living_sqrt |  None | 690.114773313 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 2899.42026975 |  None  |
|    sqft_above    |  None | 30.0115753022 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 224545.136501 |  None  |
|     bedrooms     |  None | 496.983429977 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 15640.8229131 |  None  |
|   sqft_living    |  None | 32.2039341994 |  None  |
| sqft_living_sqrt |  None | 678.904419357 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None |  2825.4694254 |  None  |
|    sqft_above    |  None |  29.715599776 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 226807.047234 |  None  |
|     bedrooms     |  None | 322.098888436 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 15392.9565223 |  None  |
|   sqft_living    |  None | 31.9817554069 |  None  |
| sqft_living_sqrt |  None | 666.949904077 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 2747.52787565 |  None  |
|    sqft_above    |  None | 29.4061078857 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 229077.875448 |  None  |
|     bedrooms     |  None | 149.417508491 |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 15146.7392756 |  None  |
|   sqft_living    |  None | 31.7586377842 |  None  |
| sqft_living_sqrt |  None |  655.07824878 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 2670.25064395 |  None  |
|    sqft_above    |  None | 29.0979663052 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 231334.909221 |  None  |
|     bedrooms     |  None |      0.0      |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 14892.9994878 |  None  |
|   sqft_living    |  None | 31.5287758428 |  None  |
| sqft_living_sqrt |  None |  642.87941674 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 2590.98192449 |  None  |
|    sqft_above    |  None |  28.781057388 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 233587.842086 |  None  |
|     bedrooms     |  None |      0.0      |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 14637.1202819 |  None  |
|   sqft_living    |  None |  31.297068723 |  None  |
| sqft_living_sqrt |  None | 630.587199493 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None |  2511.1399552 |  None  |
|    sqft_above    |  None | 28.4616722759 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 235832.551542 |  None  |
|     bedrooms     |  None |      0.0      |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 14373.6503925 |  None  |
|   sqft_living    |  None | 31.0532265345 |  None  |
| sqft_living_sqrt |  None | 617.898021947 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 2429.68001481 |  None  |
|    sqft_above    |  None | 28.1385016491 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None | 238074.136362 |  None  |
|     bedrooms     |  None |      0.0      |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 14108.2262177 |  None  |
|   sqft_living    |  None | 30.8067775617 |  None  |
| sqft_living_sqrt |  None | 605.107123194 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 2347.76075058 |  None  |
|    sqft_above    |  None | 27.8143089513 |  None  |
|  sqft_basement   |  None |

+------------------+-------+---------------+--------+
|       name       | index |     value     | stderr |
+------------------+-------+---------------+--------+
|   (intercept)    |  None |  240309.75932 |  None  |
|     bedrooms     |  None |      0.0      |  None  |
| bedrooms_square  |  None |      0.0      |  None  |
|    bathrooms     |  None | 13840.6399577 |  None  |
|   sqft_living    |  None | 30.5583588298 |  None  |
| sqft_living_sqrt |  None | 592.199469213 |  None  |
|     sqft_lot     |  None |      0.0      |  None  |
|  sqft_lot_sqrt   |  None |      0.0      |  None  |
|      floors      |  None |      0.0      |  None  |
|  floors_square   |  None |      0.0      |  None  |
|    waterfront    |  None |      0.0      |  None  |
|       view       |  None |      0.0      |  None  |
|    condition     |  None |      0.0      |  None  |
|      grade       |  None | 2265.12052556 |  None  |
|    sqft_above    |  None | 27.4878726568 |  None  |
|  sqft_basement   |  None |

In [72]:
lasso_model_sframe.print_rows(num_rows=len(lasso_model_sframe)+1)

+---------------+-------------------+-------------------------------+-----+
|    penalty    |       errors      |          coefficients         | nnz |
+---------------+-------------------+-------------------------------+-----+
| 2976351441.63 | 9.66925692362e+14 | [196100.937806, 2181.57432... |  10 |
| 3019316638.95 | 9.74019450085e+14 | [198563.246218, 2067.01515... |  10 |
| 3062281836.27 | 9.81188367942e+14 | [201025.560405, 1952.47648... |  10 |
| 3105247033.59 | 9.89328342459e+14 | [203443.601, 1825.78864269... |  10 |
| 3148212230.92 | 9.98783211266e+14 | [205805.899501, 1683.39344... |  10 |
| 3191177428.24 | 1.00847716702e+15 | [208163.075463, 1539.55136... |  10 |
| 3234142625.56 | 1.01829878055e+15 | [210520.251425, 1395.70929... |  10 |
| 3277107822.88 | 1.02824799221e+15 | [212877.413342, 1251.86808... |  10 |
|  3320073020.2 | 1.03461690923e+15 | [215235.603644, 1108.36955... |  8  |
| 3363038217.52 | 1.03855473594e+15 | [217598.303815, 966.398500... |  8  |
| 3406003414

***QUIZ QUESTIONS***
1. Which value of `l1_penalty` in our narrow range has the lowest RSS on the VALIDATION set among the models whose sparsity is *equal* to `max_nonzeros`?
2. Which features in this model have non-zero coefficients?

In [73]:
# Keep only the models whose count of non-zero coefficients equals max_nonzeros.
print max_nonzeros
result_frame = lasso_model_sframe[lasso_model_sframe['nnz'] == max_nonzeros]
print result_frame

7
+---------------+-------------------+-------------------------------+-----+
|    penalty    |       errors      |          coefficients         | nnz |
+---------------+-------------------+-------------------------------+-----+
| 3448968612.16 | 1.04693748875e+15 | [222253.192544, 661.722717... |  7  |
| 3491933809.48 | 1.05114762561e+15 | [224545.136501, 496.983429... |  7  |
| 3534899006.81 | 1.05599273534e+15 | [226807.047234, 322.098888... |  7  |
| 3577864204.13 | 1.06079953176e+15 | [229077.875448, 149.417508... |  7  |
+---------------+-------------------+-------------------------------+-----+
[? rows x 4 columns]
Note: Only the head of the SFrame is printed. This SFrame is lazily evaluated.
You can use sf.materialize() to force materialization.


In [74]:
final_best_penalty = find_best_l1_penalty(result_frame)
print final_best_penalty

3448968612.16
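
The selection above amounts to filtering on `nnz` and then taking the row with the smallest validation RSS. Here is a minimal plain-Python sketch of that logic, using the four rows printed for In [73]. The helper name `pick_lowest_rss` is hypothetical; it is not the notebook's `find_best_l1_penalty`, whose definition is not shown in this section.

```python
# Rows copied from the printed result_frame: (penalty, validation RSS, nnz).
rows = [
    (3448968612.16, 1.04693748875e+15, 7),
    (3491933809.48, 1.05114762561e+15, 7),
    (3534899006.81, 1.05599273534e+15, 7),
    (3577864204.13, 1.06079953176e+15, 7),
]


def pick_lowest_rss(rows, max_nonzeros):
    """Return the penalty with the lowest RSS among rows with nnz == max_nonzeros.

    Hypothetical helper for illustration; returns None if no row qualifies.
    """
    candidates = [r for r in rows if r[2] == max_nonzeros]
    if not candidates:
        return None
    # min() compares on the RSS column; [0] extracts the matching penalty.
    return min(candidates, key=lambda r: r[1])[0]


best_penalty = pick_lowest_rss(rows, 7)
print(best_penalty)
```

Because the validation RSS grows with the penalty across these four rows, the smallest penalty in the group is the one selected, which agrees with the value printed above.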


In [75]:
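# Select the row for the chosen penalty and print its coefficient vector
# (intercept first, then one entry per feature).
result = result_frame[result_frame['penalty'] == final_best_penalty]
coefficients = result['coefficients']
print coefficients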

[array('d', [222253.19254432785, 661.7227177822587, 0.0, 15873.957259267981, 32.41022145125964, 690.1147733133256, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2899.4202697498786, 30.011575302201045, 0.0, 0.0, 0.0]), ... ]
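
To answer the second quiz question, the coefficient array can be paired with feature names. The following is a minimal sketch, assuming the array is ordered as in the printed coefficient tables above (intercept first, then the features in table order), which the matching values suggest. The names of the last two features are cut off in the printed output, so they get placeholder labels here; both of their coefficients are zero, so the answer is unaffected.

```python
# Names in the order shown in the printed coefficient tables.
names = ['(intercept)', 'bedrooms', 'bedrooms_square', 'bathrooms',
         'sqft_living', 'sqft_living_sqrt', 'sqft_lot', 'sqft_lot_sqrt',
         'floors', 'floors_square', 'waterfront', 'view', 'condition',
         'grade', 'sqft_above', 'sqft_basement']

# Coefficients copied from the output of In [75].
coefs = [222253.19254432785, 661.7227177822587, 0.0, 15873.957259267981,
         32.41022145125964, 690.1147733133256, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
         0.0, 2899.4202697498786, 30.011575302201045, 0.0, 0.0, 0.0]

# Placeholder labels (hypothetical) for trailing names not visible in the output.
names += ['unnamed_%d' % i for i in range(len(names), len(coefs))]

# Keep only the entries whose coefficient is exactly non-zero.
nonzero = [(n, c) for n, c in zip(names, coefs) if c != 0.0]
for n, c in nonzero:
    print('%-18s %.4f' % (n, c))
```

Seven entries are non-zero, which matches `max_nonzeros` = 7 once the intercept is counted. The features themselves are `bedrooms`, `bathrooms`, `sqft_living`, `sqft_living_sqrt`, `grade`, and `sqft_above`.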
