Comparison of the sparsity (percentage of zero coefficients) of solutions when L1 and L2 penalties are used for different values of C. C is the inverse of the regularization strength: large values of C give the model more freedom, while smaller values constrain it more. With the L1 penalty, this stronger constraint leads to sparser solutions.
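
For reference, the role of C can be read off the objectives that the scikit-learn user guide gives for penalized logistic regression. This is a sketch in that guide's notation, which is an assumption here and does not appear in this notebook: labels $y_i \in \{-1, 1\}$, weight vector $w$, intercept $c$.

$$
\min_{w, c} \; \|w\|_1 + C \sum_{i=1}^{n} \log\left(\exp\left(-y_i (x_i^T w + c)\right) + 1\right)
$$

$$
\min_{w, c} \; \frac{1}{2} w^T w + C \sum_{i=1}^{n} \log\left(\exp\left(-y_i (x_i^T w + c)\right) + 1\right)
$$

Because C multiplies the data-fit term, a small C lets the penalty dominate. With the L1 norm, that tends to push many coefficients exactly to zero, whereas the L2 norm only shrinks them.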

We classify 8x8 images of digits into two classes: 0-4 against 5-9. The visualization shows coefficients of the models for varying C.

#### New to Plotly?
Plotly's Python library is free and open source! [Get started](https://plot.ly/python/getting-started/) by downloading the client and [reading the primer](https://plot.ly/python/getting-started/).
<br>You can set up Plotly to work in [online](https://plot.ly/python/getting-started/#initialization-for-online-plotting) or [offline](https://plot.ly/python/getting-started/#initialization-for-offline-plotting) mode, or in [jupyter notebooks](https://plot.ly/python/getting-started/#start-plotting-online).
<br>We also have a quick-reference [cheatsheet](https://images.plot.ly/plotly-documentation/images/python_cheat_sheet.pdf) (new!) to help you get started!

### Version

In [1]:
import sklearn
sklearn.__version__

'0.18.1'

### Imports

This tutorial imports [LogisticRegression](http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression) and [StandardScaler](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html#sklearn.preprocessing.StandardScaler).

In [2]:
print(__doc__)

import plotly.plotly as py
import plotly.graph_objs as go
from plotly import tools

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn import datasets

import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler

Automatically created module for IPython interactive environment


### Calculations

In [3]:
digits = datasets.load_digits()

X, y = digits.data, digits.target
X = StandardScaler().fit_transform(X)

# classify small against large digits
# np.int was an alias of the built-in int and no longer exists in recent NumPy
y = (y > 4).astype(int)
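
As an illustration of what this preprocessing cell does, here is a minimal pure-Python sketch. It assumes that `StandardScaler` centers each feature and divides by its population standard deviation, leaving constant features at zero. The helper names `standardize_columns` and `binarize_digits` and the toy data are hypothetical and not part of the notebook.

```python
from statistics import fmean, pstdev


def standardize_columns(rows):
    """Scale each column to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A constant column has zero spread; divide by 1 instead so it maps to 0.
    scales = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, scales)]
        for row in rows
    ]


def binarize_digits(labels, threshold=4):
    """Map digits 0-4 to class 0 and digits 5-9 to class 1."""
    return [int(label > threshold) for label in labels]


# Hypothetical toy data: the first "pixel" is constant, the second varies.
toy_pixels = [[0.0, 2.0], [0.0, 4.0], [0.0, 6.0]]
toy_scaled = standardize_columns(toy_pixels)
toy_labels = binarize_digits([0, 4, 5, 9])
```

Standardizing first puts all pixels on a comparable scale, so a single value of C penalizes every coefficient in a comparable way.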


In [4]:
def matplotlib_to_plotly(cmap, pl_entries):
    h = 1.0/(pl_entries-1)
    pl_colorscale = []
    
    for k in range(pl_entries):
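        # Convert RGB floats in [0, 1] to plain integer channel values in [0, 255];
        # a list (not a lazy map object) is needed so it can be indexed in Python 3
        C = [int(v) for v in np.array(cmap(k*h)[:3])*255]
        pl_colorscale.append([k*h, 'rgb'+str((C[0], C[1], C[2]))])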
        
    return pl_colorscale
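
To make the target format concrete: a Plotly colorscale is a list of `[position, color]` pairs, with positions running from 0 to 1. The sketch below builds a linear white-to-black ramp by hand, roughly the kind of scale the conversion above is expected to yield for `plt.cm.binary`. The function name `gray_colorscale` is hypothetical, and the exact channel values may differ slightly from matplotlib's.

```python
def gray_colorscale(n_entries):
    """Build a Plotly-style colorscale running from white (0.0) to black (1.0)."""
    if n_entries < 2:
        raise ValueError("a colorscale needs at least two entries")
    step = 1.0 / (n_entries - 1)
    scale = []
    for k in range(n_entries):
        # 255 is white, 0 is black; the ramp darkens as the position grows.
        level = int(round(255 * (1.0 - k * step)))
        scale.append([k * step, "rgb({0}, {0}, {0})".format(level)])
    return scale


scale = gray_colorscale(3)
```

With such a scale, coefficients of larger magnitude appear as darker cells in the heatmaps, and zero coefficients appear white.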

Set regularization parameter

In [5]:
data = []

for i, C in enumerate((100, 1, 0.01)):
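    # turn down tolerance for short training time; liblinear (the default
    # solver in scikit-learn 0.18) is named explicitly because it supports
    # both the L1 and the L2 penalty
    clf_l1_LR = LogisticRegression(C=C, penalty='l1', tol=0.01,
                                   solver='liblinear')
    clf_l2_LR = LogisticRegression(C=C, penalty='l2', tol=0.01,
                                   solver='liblinear')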
    clf_l1_LR.fit(X, y)
    clf_l2_LR.fit(X, y)

    coef_l1_LR = clf_l1_LR.coef_.ravel()
    coef_l2_LR = clf_l2_LR.coef_.ravel()

    # coef_l1_LR contains zeros due to the
    # L1 sparsity inducing norm

    sparsity_l1_LR = np.mean(coef_l1_LR == 0) * 100
    sparsity_l2_LR = np.mean(coef_l2_LR == 0) * 100

    print("C=%.2f" % C)
    print("Sparsity with L1 penalty: %.2f%%" % sparsity_l1_LR)
    print("score with L1 penalty: %.4f" % clf_l1_LR.score(X, y))
    print("Sparsity with L2 penalty: %.2f%%" % sparsity_l2_LR)
    print("score with L2 penalty: %.4f" % clf_l2_LR.score(X, y))
    # Plot Results
    
    fig = tools.make_subplots(rows=1, cols=2, 
                              print_grid=False,
                              subplot_titles=("L1 penalty",
                                              "L2 penalty")
                              )
    
    trace1 = go.Heatmap(z=np.abs(coef_l1_LR.reshape(8, 8)),
                        colorscale=matplotlib_to_plotly(plt.cm.binary, 10),
                        showscale=False, name="L1 penalty"
                       )
    fig.append_trace(trace1, 1, 1)
    
    fig['layout']['yaxis1'].update(autorange='reversed',
                                  showticklabels=False, ticks='')
    fig['layout']['xaxis1'].update(showticklabels=False, ticks='')

    trace2 = go.Heatmap(z=np.abs(coef_l2_LR.reshape(8, 8)),
                        colorscale=matplotlib_to_plotly(plt.cm.binary, 10),
                        showscale=False, name="L2 penalty"
                       )
    fig.append_trace(trace2, 1, 2)
    
    fig['layout']['yaxis2'].update(autorange='reversed',
                                  showticklabels=False, ticks='')
    fig['layout']['xaxis2'].update(showticklabels=False, ticks='')
    
    data.append(fig)
    

C=100.00
Sparsity with L1 penalty: 6.25%
score with L1 penalty: 0.9093
Sparsity with L2 penalty: 4.69%
score with L2 penalty: 0.9098
C=1.00
Sparsity with L1 penalty: 9.38%
score with L1 penalty: 0.9093
Sparsity with L2 penalty: 4.69%
score with L2 penalty: 0.9093
C=0.01
Sparsity with L1 penalty: 85.94%
score with L1 penalty: 0.8609
Sparsity with L2 penalty: 4.69%
score with L2 penalty: 0.8915
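
The percentages above are fractions of the 64 coefficients, one per pixel. As a small worked check, the sketch below recomputes the sparsity measure in pure Python on hypothetical coefficient vectors. The helper `sparsity_percent` and the zero counts are illustrative assumptions inferred from the printed values, not output of the fitted models.

```python
def sparsity_percent(coefficients):
    """Percentage of coefficients that are exactly zero."""
    zeros = sum(1 for value in coefficients if value == 0)
    return 100.0 * zeros / len(coefficients)


N_COEFFICIENTS = 64  # one weight per pixel of an 8x8 image

# Hypothetical coefficient vectors with a chosen number of exact zeros.
for zero_count in (3, 4, 6, 55):
    coefs = [0.0] * zero_count + [0.5] * (N_COEFFICIENTS - zero_count)
    print("%2d zeros -> %.2f%%" % (zero_count, sparsity_percent(coefs)))
```

Read this way, 4.69% corresponds to 3 zero coefficients, 6.25% to 4, 9.38% to 6, and 85.94% to 55 out of 64. At C=0.01 the L1 model therefore keeps only 9 non-zero weights.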


### C=100.00

In [6]:
py.iplot(data[0])

### C=1.00

In [7]:
py.iplot(data[1])

### C=0.01

In [8]:
py.iplot(data[2])

### License

Authors: 

        Alexandre Gramfort <alexandre.gramfort@inria.fr>
        
        Mathieu Blondel <mathieu@mblondel.org>
        
        Andreas Mueller <amueller@ais.uni-bonn.de>

License:

        BSD 3 clause


