<font color="red" size="6">Filter Methods</font>
<p><font color="Yellow" size="5"><b>9_Dispersion_Ratio</b></font>

Mutual dependence measures how strongly two variables depend on each other. In feature selection, it is used to quantify how much information a feature carries about the target (or about another feature), so that uninformative features can be dropped.

A common way to quantify mutual dependence is Mutual Information (MI), which measures how much knowing one variable reduces uncertainty about the other. MI is non-negative and equals zero only when the two variables are independent. Unlike Pearson correlation, which captures only linear relationships, MI also captures non-linear dependencies.
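
The contrast with correlation can be illustrated with a small synthetic sketch. The data (a noisy quadratic relationship) and all variable names here are assumptions made for illustration and are not part of the Wine example; `mutual_info_regression` is used because this synthetic target is continuous.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

# Hypothetical data: y depends on x only through x**2 (a non-linear, symmetric relation)
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=1000)
y_quad = x ** 2 + rng.normal(scale=0.05, size=x.size)

# Pearson correlation only measures linear association
pearson_r = np.corrcoef(x, y_quad)[0, 1]

# MI estimate (in nats); the feature matrix must be 2-D, hence the reshape
mi_quad = mutual_info_regression(x.reshape(-1, 1), y_quad, random_state=0)[0]

print(f"Pearson correlation: {pearson_r:.3f}")
print(f"Mutual information:  {mi_quad:.3f}")
```

Under these assumptions the correlation should come out close to zero while the MI estimate is clearly positive, since `y_quad` is almost fully determined by `x` even though the relationship is not linear.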

In [1]:
import numpy as np
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.feature_selection import mutual_info_classif

In [2]:
# Load the Wine dataset
data = load_wine()
X = pd.DataFrame(data.data, columns=data.feature_names)  # Features as a DataFrame
y = data.target  # Target variable

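# Calculate Mutual Information (MI) between each feature and the target.
# The estimator is nearest-neighbor based and adds small random noise to
# continuous features, so scores vary slightly between runs unless
# random_state is set.
mutual_info = mutual_info_classif(X, y)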
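
Because the MI estimate can change slightly from run to run, one possible way to get a steadier ranking is to average the scores over several seeds. The following is a minimal sketch, not part of the original notebook; the number of seeds and the names `seeds`, `scores`, `mean_scores`, and `std_scores` are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.feature_selection import mutual_info_classif

X_demo, y_demo = load_wine(return_X_y=True)

# Compute MI once per seed; each row holds the 13 feature scores for one seed
seeds = range(5)
scores = np.array([
    mutual_info_classif(X_demo, y_demo, random_state=seed) for seed in seeds
])

# Average across seeds and report the spread as a rough stability check
mean_scores = scores.mean(axis=0)
std_scores = scores.std(axis=0)

for name, mean, std in zip(load_wine().feature_names, mean_scores, std_scores):
    print(f"{name:30s} {mean:.3f} +/- {std:.3f}")
```

Fixing a single `random_state` is enough for reproducibility; averaging is only worth the extra cost when features with similar scores swap places between runs.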

In [3]:
# Create a DataFrame for Mutual Information values
mi_df = pd.DataFrame({
    'Feature': X.columns,
    'Mutual Information': mutual_info
})
mi_df = mi_df.sort_values(by='Mutual Information', ascending=False).reset_index(drop=True)


In [4]:
# Display features ranked by Mutual Information
print("Features Ranked by Mutual Information:")
print(mi_df)

Features Ranked by Mutual Information:
                         Feature  Mutual Information
0                     flavanoids            0.671010
1                        proline            0.567389
2                color_intensity            0.555392
3   od280/od315_of_diluted_wines            0.502342
4                            hue            0.473096
5                        alcohol            0.462069
6                  total_phenols            0.416052
7              alcalinity_of_ash            0.292887
8                proanthocyanins            0.292169
9                     malic_acid            0.269178
10                     magnesium            0.227955
11          nonflavanoid_phenols            0.130595
12                           ash            0.070738


In [5]:
# Optional: keep only features whose MI exceeds a threshold
threshold = np.mean(mutual_info)  # Mean MI is a simple heuristic cutoff, not a principled one
selected_features = mi_df[mi_df['Mutual Information'] > threshold]['Feature'].tolist()

print("\nSelected Features Based on Mutual Information Threshold:")
print(selected_features)


Selected Features Based on Mutual Information Threshold:
['flavanoids', 'proline', 'color_intensity', 'od280/od315_of_diluted_wines', 'hue', 'alcohol', 'total_phenols']
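
As an alternative to a manual threshold, scikit-learn's `SelectKBest` can keep a fixed number of top-scoring features. The sketch below is an assumption-laden variant of the selection step rather than the notebook's own method: `k=7` is chosen only to mirror the seven features selected above, and `random_state=0` is fixed through `functools.partial` so the result is repeatable.

```python
from functools import partial

import pandas as pd
from sklearn.datasets import load_wine
from sklearn.feature_selection import SelectKBest, mutual_info_classif

wine = load_wine()
X_wine = pd.DataFrame(wine.data, columns=wine.feature_names)
y_wine = wine.target

# Wrap the scoring function so SelectKBest calls it with a fixed seed
score_func = partial(mutual_info_classif, random_state=0)

# Keep the k features with the highest MI scores (k=7 is an illustrative choice)
selector = SelectKBest(score_func=score_func, k=7)
X_kbest = selector.fit_transform(X_wine, y_wine)

# get_support() returns a boolean mask over the original columns
kbest_features = X_wine.columns[selector.get_support()].tolist()
print(kbest_features)
```

A fixed `k` makes the size of the reduced feature set predictable, whereas the mean-based threshold adapts to how the scores are distributed; which is preferable depends on the downstream model.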
