This example illustrates the differences between univariate F-test statistics and mutual information.

We consider 3 features x_1, x_2, x_3 distributed uniformly over [0, 1]. The target depends on them as follows:
y = x_1 + sin(6 * pi * x_2) + 0.1 * N(0, 1), that is, the third feature is completely irrelevant.
The code below plots y against each individual x_i, together with the normalized values of the univariate F-test statistics and mutual information.

Because the F-test captures only linear dependency, it rates x_1 as the most discriminative feature. Mutual information, on the other hand, can capture any kind of dependency between variables, and it rates x_2 as the most discriminative feature, which probably agrees better with our intuition for this example. Both methods correctly mark x_3 as irrelevant.
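
To make the "only linear dependency" point concrete, the sketch below recomputes the univariate F statistic directly from the Pearson correlation r between each feature and the target, as F = r^2 / (1 - r^2) * (n - 2). It uses the same synthetic data as this example. The helper name `pearson_f_statistic` is an illustrative assumption, not part of scikit-learn; the values should agree with `f_regression` under its default centering, but treat that as something to check rather than a guarantee.

```python
import numpy as np

# Same synthetic data as described above: three uniform features,
# a linear term in x_1, a sinusoidal term in x_2, and Gaussian noise.
np.random.seed(0)
X = np.random.rand(1000, 3)
y = X[:, 0] + np.sin(6 * np.pi * X[:, 1]) + 0.1 * np.random.randn(1000)


def pearson_f_statistic(x, y):
    """F statistic of a one-feature linear regression (with intercept) of y on x.

    Hypothetical helper for illustration only.
    """
    n = len(x)
    r = np.corrcoef(x, y)[0, 1]  # Pearson correlation coefficient
    # The statistic depends on the data only through r, so any dependency
    # that does not show up as linear correlation is invisible to it.
    return r ** 2 / (1.0 - r ** 2) * (n - 2)


f_manual = np.array([pearson_f_statistic(X[:, i], y) for i in range(3)])
f_manual /= f_manual.max()  # normalize so the best feature scores 1
print(np.round(f_manual, 2))
```

Because the score is a function of r alone, the strong but oscillating dependence on x_2 contributes only through its small residual linear correlation with y.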

### Version

In [1]:
import sklearn
sklearn.__version__

'0.18.1'

### Imports

This tutorial imports [f_regression](http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.f_regression.html#sklearn.feature_selection.f_regression) and [mutual_info_regression](http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.mutual_info_regression.html#sklearn.feature_selection.mutual_info_regression).

In [2]:
print(__doc__)

import plotly.plotly as py
import plotly.graph_objs as go
from plotly import tools

import numpy as np
from sklearn.feature_selection import f_regression, mutual_info_regression


Automatically created module for IPython interactive environment


### Calculations

In [3]:
np.random.seed(0)
X = np.random.rand(1000, 3)
y = X[:, 0] + np.sin(6 * np.pi * X[:, 1]) + 0.1 * np.random.randn(1000)

f_test, _ = f_regression(X, y)
f_test /= np.max(f_test)

mi = mutual_info_regression(X, y)
mi /= np.max(mi)
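
`mutual_info_regression` relies on a nearest-neighbor estimator (controlled by its `n_neighbors` parameter). The sketch below is not that estimator. It swaps in a much cruder plug-in estimate from a 2D histogram, only to illustrate why mutual information can pick up the non-linear dependence on x_2. The helper name `binned_mutual_info` and the choice of `bins=16` are assumptions made for this illustration, and histogram estimates are biased upward for finite samples, so the numbers will differ from the scikit-learn output.

```python
import numpy as np

# Same synthetic data as in the Calculations cell.
np.random.seed(0)
X = np.random.rand(1000, 3)
y = X[:, 0] + np.sin(6 * np.pi * X[:, 1]) + 0.1 * np.random.randn(1000)


def binned_mutual_info(x, y, bins=16):
    """Plug-in mutual information estimate (in nats) from a 2D histogram.

    Hypothetical helper for illustration; not the scikit-learn estimator.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()              # joint cell probabilities
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    mask = p_xy > 0                         # skip empty cells (0 * log 0 = 0)
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x * p_y)[mask])))


mi_binned = np.array([binned_mutual_info(X[:, i], y) for i in range(3)])
mi_binned /= mi_binned.max()  # normalize so the best feature scores 1
print(np.round(mi_binned, 2))
```

The estimate compares the joint distribution of (x_i, y) with the product of its marginals, so it responds to any departure from independence, linear or not.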

### Plot Results

In [4]:
titles = []
for i in range(3):
    titles.append("F-test={:.2f}, MI={:.2f}".format(f_test[i], mi[i]))
    
fig = tools.make_subplots(rows=1, cols=3,
                          print_grid=False,
                          subplot_titles=tuple(titles))

In [5]:
for i in range(3):
    trace = go.Scatter(x=X[:, i], y=y,
                       mode='markers',
                       marker=dict(color='blue', 
                                   line=dict(width=1, color='black')
                                  ),
                       showlegend=False
                      )
    fig.append_trace(trace, 1, i+1)
    
# Use distinct names for the axis keys so the target array `y` is not overwritten.
for i in range(1, 4):
    xaxis = 'xaxis{}'.format(i)
    yaxis = 'yaxis{}'.format(i)
    fig['layout'][xaxis].update(title="<i>x_{}</i>".format(i),
                                showgrid=False, zeroline=False)
    fig['layout'][yaxis].update(title='<i>y</i>', showgrid=False,
                                zeroline=False)

In [6]:
py.iplot(fig)

