In [1]:
from sklearn import datasets
from sklearn.feature_selection import SelectKBest, chi2

# Univariate feature selection

We select features based on how strongly each one relates to the target variable. Each feature is scored on its own: the other features are ignored when estimating the relationship between one feature and the target.

We then keep the K features that have the strongest relationship with the target variable.

### Chi2 (Chi-Squared) probability distribution

The chi-squared distribution is the distribution of a sum of squares of k independent standard normal random variables.

If Z1, Z2, ..., Zk are independent standard normal random variables, then the sum of their squares

![title](https://wikimedia.org/api/rest_v1/media/math/render/svg/613cc5993b3b61897dbc9a8d722487d86a34f406)

is distributed according to the chi-squared distribution with k degrees of freedom. This is denoted as follows:

![title](https://wikimedia.org/api/rest_v1/media/math/render/svg/965f31a974c81dfa46b88c272744f0c7029fb274)

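The chi-squared distribution has one parameter: a positive integer k, the degrees of freedom, which is the number of Zi's being summed.

Random variables drawn from a chi-squared distribution are always non-negative, because they are sums of squares.
So, this <b>distribution can be used only for features that are non-negative in value</b>. scikit-learn's `chi2` scoring function likewise requires non-negative feature values.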
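
The definition above can be checked numerically. The following is a minimal sketch, not part of the original notebook: it draws k standard normal variables many times, sums their squares, and compares the sample mean and variance with the known values k and 2k for a chi-squared distribution. The choice of `k = 3`, the sample size, and the seed are arbitrary assumptions made for illustration.

```python
import numpy as np

# Assumed values for illustration only
k = 3                 # degrees of freedom (number of Zi's)
n_samples = 100_000   # number of simulated draws
rng = np.random.default_rng(0)

# Each row holds k independent standard normal draws Z1..Zk
z = rng.standard_normal(size=(n_samples, k))

# Sum of squares per row: one chi-squared(k) sample per row
q = (z ** 2).sum(axis=1)

print("minimum value:", q.min())    # never negative, since it is a sum of squares
print("sample mean:", q.mean())     # should be close to k
print("sample variance:", q.var())  # should be close to 2 * k
```

The samples never fall below zero, which is the property the text relies on when it restricts the test to non-negative features.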


# Iris data features are positive, so we can use chi2

In [10]:
iris = datasets.load_iris()

In [11]:
x, y = iris.data, iris.target

In [12]:
x.shape

(150, 4)

In [13]:
x

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2],
       [5.4, 3.9, 1.7, 0.4],
       [4.6, 3.4, 1.4, 0.3],
       [5. , 3.4, 1.5, 0.2],
       [4.4, 2.9, 1.4, 0.2],
       [4.9, 3.1, 1.5, 0.1],
       [5.4, 3.7, 1.5, 0.2],
       [4.8, 3.4, 1.6, 0.2],
       [4.8, 3. , 1.4, 0.1],
       [4.3, 3. , 1.1, 0.1],
       [5.8, 4. , 1.2, 0.2],
       [5.7, 4.4, 1.5, 0.4],
       [5.4, 3.9, 1.3, 0.4],
       [5.1, 3.5, 1.4, 0.3],
       [5.7, 3.8, 1.7, 0.3],
       [5.1, 3.8, 1.5, 0.3],
       [5.4, 3.4, 1.7, 0.2],
       [5.1, 3.7, 1.5, 0.4],
       [4.6, 3.6, 1. , 0.2],
       [5.1, 3.3, 1.7, 0.5],
       [4.8, 3.4, 1.9, 0.2],
       [5. , 3. , 1.6, 0.2],
       [5. , 3.4, 1.6, 0.4],
       [5.2, 3.5, 1.5, 0.2],
       [5.2, 3.4, 1.4, 0.2],
       [4.7, 3.2, 1.6, 0.2],
       [4.8, 3.1, 1.6, 0.2],
       [5.4, 3.4, 1.5, 0.4],
       [5.2, 4.1, 1.5, 0.1],
       [5.5, 4.2, 1.4, 0.2],
       [4.9, 3

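Since all the features are non-negative, they satisfy the input requirement of the `chi2` scoring function, which computes a chi-squared statistic between each feature and the class labels. Being non-negative does not by itself mean the features follow a chi-squared distribution.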
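
The non-negativity requirement can be verified before running the selector. The sketch below is an illustrative addition: it checks the Iris features directly and then shows that `chi2` rejects input containing negative values. The mean-centered array `x_centered` is a hypothetical example introduced only to produce negative values.

```python
from sklearn import datasets
from sklearn.feature_selection import chi2

iris = datasets.load_iris()
x, y = iris.data, iris.target

# True if every measurement in the dataset is >= 0
all_non_negative = bool((x >= 0).all())
print("all features non-negative:", all_non_negative)

# Hypothetical counter-example: mean-centering introduces negative values
x_centered = x - x.mean(axis=0)
try:
    chi2(x_centered, y)
    raised = False
except ValueError as err:
    raised = True
    print("chi2 rejected the centered data:", err)
```

This is one reason to apply `chi2` before any standardization step that shifts features below zero.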

In [14]:
x_new = SelectKBest(chi2, k=2).fit_transform(x, y)

In [15]:
x_new.shape

(150, 2)
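
`fit_transform` returns only the reduced array, so it does not say which two columns were kept. A minimal sketch of one way to inspect this, assuming the same Iris data and `k=2` as above: fit the selector separately and read its `scores_` attribute and `get_support()` mask. The variable names `selector` and `selected` are introduced here for illustration.

```python
from sklearn import datasets
from sklearn.feature_selection import SelectKBest, chi2

iris = datasets.load_iris()
x, y = iris.data, iris.target

# Fit first so the fitted selector object can be inspected afterwards
selector = SelectKBest(chi2, k=2).fit(x, y)

# Boolean mask with one entry per original feature: True means "kept"
mask = selector.get_support()

for name, score, keep in zip(iris.feature_names, selector.scores_, mask):
    print(f"{name:20s} chi2 score = {score:8.2f}  selected = {keep}")

# Names of the features that survive the selection
selected = [name for name, keep in zip(iris.feature_names, mask) if keep]
print(selected)
```

The two petal measurements receive the highest scores, which matches the columns that appear in `x_new`.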

In [16]:
x_new

array([[1.4, 0.2],
       [1.4, 0.2],
       [1.3, 0.2],
       [1.5, 0.2],
       [1.4, 0.2],
       [1.7, 0.4],
       [1.4, 0.3],
       [1.5, 0.2],
       [1.4, 0.2],
       [1.5, 0.1],
       [1.5, 0.2],
       [1.6, 0.2],
       [1.4, 0.1],
       [1.1, 0.1],
       [1.2, 0.2],
       [1.5, 0.4],
       [1.3, 0.4],
       [1.4, 0.3],
       [1.7, 0.3],
       [1.5, 0.3],
       [1.7, 0.2],
       [1.5, 0.4],
       [1. , 0.2],
       [1.7, 0.5],
       [1.9, 0.2],
       [1.6, 0.2],
       [1.6, 0.4],
       [1.5, 0.2],
       [1.4, 0.2],
       [1.6, 0.2],
       [1.6, 0.2],
       [1.5, 0.4],
       [1.5, 0.1],
       [1.4, 0.2],
       [1.5, 0.2],
       [1.2, 0.2],
       [1.3, 0.2],
       [1.4, 0.1],
       [1.3, 0.2],
       [1.5, 0.2],
       [1.3, 0.3],
       [1.3, 0.3],
       [1.3, 0.2],
       [1.6, 0.6],
       [1.9, 0.4],
       [1.4, 0.3],
       [1.6, 0.2],
       [1.4, 0.2],
       [1.5, 0.2],
       [1.4, 0.2],
       [4.7, 1.4],
       [4.5, 1.5],
       [4.9,

# Disadvantage of univariate feature selection

A univariate method is <b>less comprehensive than a multivariate one</b>. In the real world, there is often more than one factor at play, and a univariate method cannot take this into account because it scores each feature in isolation. It therefore misses features that are informative only in combination with others.
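
This limitation can be illustrated with a small synthetic example, which is an assumption-laden sketch and not part of the original notebook. The target is the XOR of two binary features, so the two features together determine the target exactly, yet each feature on its own carries no information about it. The dataset below is constructed by hand for this purpose.

```python
import numpy as np
from sklearn.feature_selection import chi2

# Hand-built toy data: the four binary combinations, repeated 25 times each
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]] * 25)

# Target depends on both features jointly (exclusive OR)
y = X[:, 0] ^ X[:, 1]

# Univariate chi2 scores each column separately against y
scores, p_values = chi2(X, y)
print("chi2 scores:", scores)
print("p-values:", p_values)
```

Both features receive a score of zero, so a univariate selector would have no basis for keeping either of them, even though a model that uses both could predict the target perfectly.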