# Filter Methods

* Rely on the characteristics of the data (feature characteristics)
* Do not use ML algorithms at all
* Model agnostic
* Tend to be less computationally expensive
* Usually give lower prediction performance than wrapper methods
* Very well-suited for a quick screen and removal of irrelevant features

#### Looks at things like:
* Variance
* Correlation
* Univariate selection: scoring each variable on its own, for example by the distribution of its values or by a statistical test against the target

#### Two Step Procedure:
1) Rank features according to a certain criterion
    - Each feature is ranked independently of the rest of the feature space

2) Select the highest ranking features

**Note:** Methods that rank each feature independently may select redundant variables, because they do not consider the relationships between features.
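
The two steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the scoring criterion (absolute Pearson correlation with the target), the helper names `pearson` and `rank_and_select`, and the toy data are all assumptions made for the example.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (var_x * var_y) ** 0.5

def rank_and_select(features, target, k):
    """Step 1: score each feature on its own. Step 2: keep the top k."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Toy data, made up for illustration.
features = {
    "size":      [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "size_copy": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],  # exact duplicate of "size"
    "noise":     [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
    "constant":  [7.0, 7.0, 7.0, 7.0, 7.0, 7.0],
}
target = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]

selected, scores = rank_and_select(features, target, k=2)
print(selected)  # ['size', 'size_copy']
```

Both `size` and its duplicate are selected, which shows the redundancy problem: because every feature is scored in isolation, nothing stops two copies of the same signal from filling the top ranks.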

#### Ranking Criteria Examples:
* Feature scores on various statistical tests:
    * Chi-square, Fisher Score
    * Univariate parametric tests (ANOVA)
    * Mutual information
    * Variance
        * Constant features 
        * Quasi-constant features
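
As a rough sketch of the variance criterion, the snippet below flags constant and quasi-constant features using the population variance from the standard library. The function name `low_variance_features`, the cutoff of `0.01`, and the toy columns are assumptions for illustration; variance depends on the scale of each feature, so a sensible cutoff has to be chosen per dataset.

```python
from statistics import pvariance

def low_variance_features(features, threshold=0.0):
    """Return the names of features whose variance is at or below threshold.

    threshold=0.0 flags only constant features; a small positive value
    also flags quasi-constant ones.
    """
    return [name for name, col in features.items()
            if pvariance(col) <= threshold]

# Toy data, made up for illustration.
features = {
    "constant":       [5.0] * 100,
    "quasi_constant": [0] * 99 + [1],               # variance = 0.01 * 0.99
    "varied":         [i % 10 for i in range(100)],
}

constant = low_variance_features(features)                  # ['constant']
quasi = low_variance_features(features, threshold=0.01)     # adds 'quasi_constant'
```

In practice a library routine would normally be used instead; scikit-learn, for example, provides `VarianceThreshold` for this kind of screen.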

#### Multivariate Filter Methods
* Handle redundant features, such as:
    * Duplicated features
    * Correlated features
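
One way to handle duplicated and correlated features is sketched below. It is a simple greedy pass, not the only possible approach: the helper names, the `0.9` correlation cutoff, and the toy data are assumptions, and the result depends on the order in which features are visited.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def drop_duplicates(features):
    """Keep only the first feature among those with identical values."""
    seen, kept = set(), {}
    for name, col in features.items():
        key = tuple(col)
        if key not in seen:
            seen.add(key)
            kept[name] = col
    return kept

def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is below threshold."""
    kept = {}
    for name, col in features.items():
        if all(abs(pearson(col, other)) < threshold for other in kept.values()):
            kept[name] = col
    return kept

# Toy data, made up for illustration.
features = {
    "height_cm":   [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_copy": [150.0, 160.0, 170.0, 180.0, 190.0],  # duplicate
    "height_in":   [59.1, 63.0, 66.9, 70.9, 74.8],       # same signal, other unit
    "shoe_size":   [40.0, 37.0, 44.0, 39.0, 42.0],
}

deduplicated = drop_duplicates(features)
reduced = drop_correlated(deduplicated)
print(list(reduced))  # ['height_cm', 'shoe_size']
```

Unlike the univariate ranking, this step compares features with each other, which is what lets it remove the redundant copies.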

Overall, filter methods are:
* Simple yet powerful ways to quickly remove irrelevant and redundant features
* A useful first step in most feature selection pipelines
* Suited to quick dataset screening for irrelevant features
* Suited to quick removal of redundant features

# Wrapper Methods

* Use predictive ML models to score the feature subset
* Train a new model on each feature subset
* Tend to be very computationally expensive
* Usually provide the best performing feature subset for a given ML algorithm
* May not produce the best feature combination for a different ML model that was not used to select the features

#### Common search strategies:
* Forward selection
* Backward selection
* Exhaustive search: scans all the possible combinations of features to find the optimal feature combination
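
To see why exhaustive search becomes expensive, the sketch below enumerates every non-empty feature subset with `itertools.combinations`. The helper name `all_subsets` is made up for the example; a real wrapper would train and score a model on each subset it yields.

```python
from itertools import combinations

def all_subsets(features):
    """Yield every non-empty subset of the given feature names."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)

subsets = list(all_subsets(["a", "b", "c"]))
print(len(subsets))  # 7, i.e. 2**3 - 1
```

With n features there are 2^n - 1 non-empty subsets, so 20 features already mean 1,048,575 model fits.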

#### Overview:
* Evaluate the features in the light of a specific ML algorithm
* Evaluate subsets of variables
    * Detect interactions between variables
    * Find the optimal feature subset for the desired classifier
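
A minimal forward selection loop might look like the sketch below. The model used for scoring (a nearest-centroid classifier evaluated with leave-one-out accuracy), the function names, and the toy data are all assumptions chosen to keep the example dependency-free; any model and validation scheme could take their place.

```python
def loo_accuracy(rows, labels, subset):
    """Leave-one-out accuracy of a nearest-centroid classifier on `subset` columns."""
    projected = [[row[c] for c in subset] for row in rows]
    correct = 0
    for i, (point, label) in enumerate(zip(projected, labels)):
        # Group every row except the held-out one by class.
        by_class = {}
        for j, (other, other_label) in enumerate(zip(projected, labels)):
            if j != i:
                by_class.setdefault(other_label, []).append(other)
        centroids = {lbl: [sum(v) / len(pts) for v in zip(*pts)]
                     for lbl, pts in by_class.items()}
        # Predict the class whose centroid is closest to the held-out point.
        predicted = min(
            centroids,
            key=lambda lbl: sum((a - b) ** 2 for a, b in zip(point, centroids[lbl])),
        )
        correct += predicted == label
    return correct / len(rows)

def forward_selection(rows, labels, k):
    """Greedily add the feature that most improves the model score, k times."""
    selected = []
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda f: loo_accuracy(rows, labels, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data, made up for illustration: only column 0 separates the classes.
rows = [
    [1.0, 5.0, 0.2], [1.2, 1.0, 0.9], [0.8, 3.0, 0.4], [1.1, 4.0, 0.7],
    [3.0, 2.0, 0.5], [3.2, 5.0, 0.1], [2.9, 1.0, 0.8], [3.1, 3.0, 0.6],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

selected = forward_selection(rows, labels, k=2)
print(selected[0])  # 0
```

Every candidate subset triggers a fresh round of model fitting, which is where the computational cost of wrapper methods comes from. The selected subset is also tied to the scoring model: a different classifier could rank the same features differently.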

# Embedded Methods

* Perform feature selection as part of the model construction process
* Consider the interaction between features and models
* Less computationally expensive than wrapper methods because they fit the ML model only once

#### Examples:
* LASSO
* Tree importance
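
The LASSO case can be illustrated with a small coordinate descent solver. This is a sketch under stated assumptions: the objective is taken to be (1/2n)·‖y − Xw‖² + α·‖w‖₁ with no intercept (the toy data is already centered), and the function names, the value of `alpha`, and the data are made up for the example.

```python
def soft_threshold(value, alpha):
    """Shrink value toward zero by alpha, clipping to exactly 0.0 inside [-alpha, alpha]."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0

def lasso(rows, y, alpha, n_iter=100):
    """Coordinate descent for L1-penalized least squares (no intercept)."""
    n, p = len(rows), len(rows[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                row[j] * (yi - sum(w[k] * row[k] for k in range(p) if k != j))
                for row, yi in zip(rows, y)
            ) / n
            z = sum(row[j] ** 2 for row in rows) / n
            w[j] = soft_threshold(rho, alpha) / z
    return w

# Toy data, made up for illustration: y depends on column 0 only.
rows = [[-2.0, 1.0], [-1.0, -2.0], [0.0, 2.0], [1.0, -2.0], [2.0, 1.0]]
y = [-6.1, -2.9, 0.2, 2.9, 5.9]

w = lasso(rows, y, alpha=0.5)
selected = [j for j, coef in enumerate(w) if coef != 0.0]
print(selected)  # [0]
```

The selection happens inside the fit: the L1 penalty drives the coefficient of the uninformative column to exactly zero, so the features with non-zero coefficients are the selected ones and no separate search over subsets is needed.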