Feature Selection

Optimal Feature Selection

Given a dataset X = {x₁, ..., x_p} composed of p features and a target variable y, the miscoding of a feature x_j measures how difficult it is to reconstruct y given x_j, and vice versa. We are interested not only in how much information x_j contains about y, but also in whether x_j contains additional information unrelated to y, which is undesirable.
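To make the two-way nature of this definition concrete, here is a minimal sketch that uses Shannon entropy on discrete values as a stand-in for "difficulty of reconstruction". It is only an illustration: the function `toy_miscoding` and its normalization are assumptions made for this example, not the computation performed by the Miscoding class.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete symbols."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def conditional_entropy(target, given):
    """H(target | given) = H(target, given) - H(given)."""
    return entropy(list(zip(target, given))) - entropy(given)


def toy_miscoding(x, y):
    """Hypothetical score in [0, 1]: 0 when x and y fully determine each other.

    Takes the harder of the two reconstruction directions (y from x, x from y)
    and normalizes by the larger of the two entropies.
    """
    denom = max(entropy(x), entropy(y))
    if denom == 0:
        return 0.0
    return max(conditional_entropy(y, x), conditional_entropy(x, y)) / denom


y = [0, 0, 1, 1, 0, 0, 1, 1]
x_exact = list(y)                    # same information as y
x_extra = [0, 1, 2, 3, 0, 1, 2, 3]   # determines y, but carries extra detail
x_noise = [0, 1, 0, 1, 0, 1, 0, 1]   # independent of y in this sample

for name, x in [("exact", x_exact), ("extra", x_extra), ("noise", x_noise)]:
    print(name, round(toy_miscoding(x, y), 3))
```

In this toy setting, the feature that determines y but also carries unrelated detail scores worse than the exact copy, which is the penalty for extra information described above.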

The fastautoml.Miscoding class allows us to compute the relevance of features, assess the quality of a dataset, and select the optimal subset of features to include in a study.

Feature Relevance

Let's generate a synthetic dataset composed of 1000 random points belonging to 10 Gaussian blobs. Each sample is described by 20 features, of which only 4 are informative. The following figure shows the blobs projected onto the two-dimensional space defined by the features x₈ and x₁₀.

[Figure: Ten Gaussian Blobs]
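A dataset with the shape described above can be sketched with the Python standard library alone. The helper `make_blobs_dataset`, the range of the blob centers, the unit standard deviation, and the random choice of informative columns are all assumptions made for illustration. The original dataset may have been generated differently (for example with scikit-learn's `make_classification`), so the informative columns here will not necessarily be x₈ and x₁₀.

```python
import random


def make_blobs_dataset(n_samples=1000, n_blobs=10, n_features=20,
                       n_informative=4, seed=0):
    """Return (X, y, informative): samples, blob labels, informative column indices."""
    rng = random.Random(seed)
    # Pick which columns carry information about the blob label.
    informative = sorted(rng.sample(range(n_features), n_informative))
    # One center per blob, defined only over the informative columns.
    centers = [[rng.uniform(-10.0, 10.0) for _ in informative]
               for _ in range(n_blobs)]
    X, y = [], []
    for i in range(n_samples):
        label = i % n_blobs  # equal number of points per blob
        # Start with pure noise in every column.
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        # Overwrite the informative columns with draws around the blob center.
        for k, j in enumerate(informative):
            row[j] = rng.gauss(centers[label][k], 1.0)
        X.append(row)
        y.append(label)
    return X, y, informative


X, y, informative = make_blobs_dataset()
print(len(X), len(X[0]), len(set(y)), informative)
```

Only the columns listed in `informative` depend on the blob label; the remaining 16 columns are noise, so a relevance measure should rank them lowest.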

For more information on how to identify the relevance of features using the Miscoding class, see the following blog entries:

Miscoding of Random Distributions
Correlation vs Miscoding
