## Reducing Features by Maximizing Class Separability

### Problem:
**You want to reduce the number of features used by a classifier** (that is, you have labeled data and a classification problem).

### Solution:
**Try linear discriminant analysis (LDA) to project the features onto component axes that maximize the separation between classes.**


In [1]:
# Load libraries
from sklearn import datasets
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

In [2]:
# Load the Iris flower dataset:
iris = datasets.load_iris()
X = iris.data
y = iris.target

In [3]:
# Create an LDA that will reduce the data down to 2 features
lda = LinearDiscriminantAnalysis(n_components=2)

# Fit the LDA and use it to transform the features
X_lda = lda.fit(X, y).transform(X)

In [4]:
# Print the number of features
print('Original number of features:', X.shape[1])
print('Reduced number of features:', X_lda.shape[1])

Original number of features: 4
Reduced number of features: 2


In [5]:
# View the ratio of explained variance
lda.explained_variance_ratio_

array([0.9912126, 0.0087874])

In [6]:
# Create an LDA that will reduce the data down to 1 feature
lda = LinearDiscriminantAnalysis(n_components=1)

# Fit the LDA and use it to transform the features
X_lda1 = lda.fit(X, y).transform(X)

In [10]:
# Print the number of features
print('Original number of features:', X.shape[1])
print('Reduced number of features in X_lda:', X_lda.shape[1])
print('Reduced number of features in X_lda1:', X_lda1.shape[1])

Original number of features: 4
Reduced number of features in X_lda: 2
Reduced number of features in X_lda1: 1


In [9]:
# View the ratio of explained variance
lda.explained_variance_ratio_

array([0.9912126])

**Note:** For the concepts behind LDA, see the [Wikipedia article on linear discriminant analysis](https://en.wikipedia.org/wiki/Linear_discriminant_analysis).


For implementation details of LDA in scikit-learn, see the [`LinearDiscriminantAnalysis` documentation](https://scikit-learn.org/stable/modules/generated/sklearn.discriminant_analysis.LinearDiscriminantAnalysis.html).
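
Because the stated goal is to hand fewer features to a classifier, it can be useful to check how much accuracy the reduction costs. The following is a minimal sketch rather than part of the recipe above: the choice of `LogisticRegression`, the 5-fold cross-validation, and the names `reduced_model` and `baseline_model` are all assumptions made for illustration.

```python
from sklearn import datasets
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Load the Iris flower dataset
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Hypothetical setup: LDA (1 component) followed by a classifier.
# Inside a pipeline, LDA is refitted on each training fold only,
# so the held-out fold does not influence the projection.
reduced_model = make_pipeline(
    LinearDiscriminantAnalysis(n_components=1),
    LogisticRegression(max_iter=1000),
)

# Same classifier on all 4 original features, for comparison
baseline_model = LogisticRegression(max_iter=1000)

# 5-fold cross-validated accuracy for both models
reduced_scores = cross_val_score(reduced_model, X, y, cv=5)
baseline_scores = cross_val_score(baseline_model, X, y, cv=5)

print('Mean accuracy with 1 LDA component:', reduced_scores.mean())
print('Mean accuracy with all 4 features:', baseline_scores.mean())
```

Wrapping both steps in a pipeline is a design choice: fitting LDA on the full data before cross-validating would let the labels of the test folds leak into the projection.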

### Selecting The Best Number Of Components For LDA

In scikit-learn, LDA is implemented by `LinearDiscriminantAnalysis`, which takes a parameter, `n_components`, indicating the number of features we want returned. To figure out what value to pass to `n_components` (i.e., how many components to keep), we can take advantage of the fact that `explained_variance_ratio_` tells us the share of variance explained by each output feature and is sorted from largest to smallest.

Specifically, we can run `LinearDiscriminantAnalysis` with `n_components` set to `None` to get the ratio of variance explained by every component, then calculate how many components are required to reach some threshold of variance explained (often 0.95 or 0.99).

In [11]:
# Create and fit an LDA with n_components set to None
lda = LinearDiscriminantAnalysis(n_components=None)
lda.fit(X, y)

In [12]:
# Create array of explained variance ratios
lda_var_ratios = lda.explained_variance_ratio_
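
To see what `n_components=None` actually returns, the sketch below inspects the fitted ratios on the Iris data. It relies on the documented scikit-learn constraint that LDA can produce at most `min(n_classes - 1, n_features)` components; the variable names `max_components` and `cumulative` are introduced here for illustration only.

```python
import numpy as np
from sklearn import datasets
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Load the Iris flower dataset
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Fit an LDA that keeps every available component
lda = LinearDiscriminantAnalysis(n_components=None).fit(X, y)
var_ratios = lda.explained_variance_ratio_

# Upper bound on the number of discriminant components:
# 3 classes and 4 features give min(3 - 1, 4) = 2
max_components = min(len(np.unique(y)) - 1, X.shape[1])

# Running total of explained variance, one entry per component
cumulative = np.cumsum(var_ratios)

print('Components returned:', len(var_ratios))
print('Upper bound on components:', max_components)
print('Cumulative explained variance:', cumulative)
```

Because Iris has only three classes, there are at most two ratios to choose from, regardless of how many input features the data has.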

### Create Function Calculating Number Of Components Required To Pass Threshold

In [13]:
# Return the smallest number of leading components whose cumulative
# explained variance ratio reaches goal_var
def select_n_components(var_ratio, goal_var: float) -> int:
    # Set initial variance explained so far
    total_variance = 0.0
    
    # Set initial number of features
    n_components = 0
    
    # For the explained variance of each feature:
    for explained_variance in var_ratio:
        
        # Add the explained variance to the total
        total_variance += explained_variance
        
        # Add one to the number of components
        n_components += 1
        
        # If we reach our goal level of explained variance
        if total_variance >= goal_var:
            # End the loop
            break
            
    # Return the number of components
    return n_components

In [14]:
# Run function
select_n_components(lda_var_ratios, 0.95)

1
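
The same count can be computed without an explicit loop. The version below is a sketch of an equivalent approach using `np.cumsum`; the function name `select_n_components_vectorized` is hypothetical, and the ratios are copied from the output shown earlier so that the block runs on its own.

```python
import numpy as np

def select_n_components_vectorized(var_ratio, goal_var: float) -> int:
    # Running total of explained variance after each component
    cumulative = np.cumsum(var_ratio)

    # True wherever the running total has reached the goal
    reached = cumulative >= goal_var

    # If the goal is never reached, keep every component,
    # which matches the behavior of the loop-based function
    if not reached.any():
        return len(cumulative)

    # np.argmax returns the index of the first True value
    return int(np.argmax(reached)) + 1

# Explained variance ratios reported above for the Iris data
ratios = np.array([0.9912126, 0.0087874])

n_for_95 = select_n_components_vectorized(ratios, 0.95)
n_for_995 = select_n_components_vectorized(ratios, 0.995)

print('Components needed for 0.95:', n_for_95)
print('Components needed for 0.995:', n_for_995)
```

For two components the loop is just as fast; the vectorized form mainly pays off when the cumulative totals are also wanted for plotting or inspection.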