<a href="https://colab.research.google.com/github/MehrdadJalali-AI/BlackHole/blob/main/Day6.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

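**Filter Methods: Using Mutual Information**
Filter methods score each feature against the target independently of any model. The example below uses scikit-learn's `mutual_info_classif` to estimate the mutual information between each feature and the target, then keeps the features whose score is greater than zero.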

In [18]:
from sklearn.feature_selection import mutual_info_classif
import pandas as pd

# Example data
data = pd.DataFrame({
    'Feature1': [10, 20, 30, 40, 50],
    'Feature2': [5, 10, 15, 20, 25],
    'Feature3': [15, 25, 35, 45, 55],
    'Target': [1, 0, 1, 0, 1]
})

# Calculate mutual information
X = data.drop(columns='Target')
y = data['Target']
mutual_info = mutual_info_classif(X, y)

# Show mutual information values
mi_df = pd.DataFrame({'Feature': X.columns, 'Mutual Information': mutual_info})
selected_features = mi_df[mi_df['Mutual Information'] > 0]['Feature'].tolist()

print("Mutual Information values:\n", mi_df)
print("Selected Features:", selected_features)






Mutual Information values:
     Feature  Mutual Information
0  Feature1                   0
1  Feature2                   0
2  Feature3                   0
Selected Features: []
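A correlation-based filter can be sketched in the same way. The snippet below is a minimal illustration, not part of the original notebook: it computes the Pearson correlation of each feature with the target using pandas and keeps features above a cutoff. The `threshold` value of 0.1 and the name `corr_selected` are assumptions chosen for the example. On this toy data the three features are linear functions of one another and the target simply alternates, so the correlations should all be zero (up to floating-point error) and no feature is kept.

```python
import pandas as pd

# Same toy data as in the cell above
data = pd.DataFrame({
    'Feature1': [10, 20, 30, 40, 50],
    'Feature2': [5, 10, 15, 20, 25],
    'Feature3': [15, 25, 35, 45, 55],
    'Target': [1, 0, 1, 0, 1]
})

X = data.drop(columns='Target')
y = data['Target']

# Pearson correlation of each feature column with the target
correlations = X.corrwith(y)

# Keep features whose absolute correlation exceeds an assumed cutoff
threshold = 0.1
corr_selected = correlations[correlations.abs() > threshold].index.tolist()

print("Correlation with target:\n", correlations)
print("Selected Features:", corr_selected)
```

On real data the cutoff would normally be tuned, or replaced by keeping the top-k features ranked by absolute correlation.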


**Wrapper Method (Recursive Feature Elimination)**
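As shown earlier, Recursive Feature Elimination (RFE) searches for a good feature subset by repeatedly fitting a model, ranking the features by their coefficients or importances, and dropping the weakest one until the requested number of features remains.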

In [19]:
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
rfe = RFE(model, n_features_to_select=2)
rfe.fit(data.drop(columns='Target'), data['Target'])

selected_features = data.drop(columns='Target').columns[rfe.support_].tolist()
print("Selected Features with RFE:", selected_features)


Selected Features with RFE: ['Feature1', 'Feature3']
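To see the order in which features were eliminated, the fitted selector's `ranking_` attribute can be inspected alongside `support_`. The following is a small sketch on the same toy data; the `rfe_report` name is just an illustrative choice. Selected features get rank 1, and larger ranks mean the feature was dropped earlier.

```python
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Same toy data as in the cells above
data = pd.DataFrame({
    'Feature1': [10, 20, 30, 40, 50],
    'Feature2': [5, 10, 15, 20, 25],
    'Feature3': [15, 25, 35, 45, 55],
    'Target': [1, 0, 1, 0, 1]
})
X = data.drop(columns='Target')
y = data['Target']

# Fit RFE, keeping two of the three features
rfe = RFE(LogisticRegression(), n_features_to_select=2)
rfe.fit(X, y)

# support_ is a boolean mask of kept features; ranking_ is 1 for kept features
rfe_report = pd.DataFrame({
    'Feature': X.columns,
    'Selected': rfe.support_,
    'Rank': rfe.ranking_,
})
print(rfe_report)
```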


**Tree-Based Feature Importance**
Tree-based models like Random Forests and Decision Trees can provide feature importances, which can be useful for feature selection.

In [20]:
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier()
model.fit(data.drop(columns='Target'), data['Target'])

feature_importances = model.feature_importances_
fi_df = pd.DataFrame({'Feature': data.drop(columns='Target').columns, 'Importance': feature_importances})
selected_features = fi_df[fi_df['Importance'] > 0]['Feature'].tolist()

print("Feature Importances:\n", fi_df)
print("Selected Features:", selected_features)


Feature Importances:
     Feature  Importance
0  Feature1    0.342643
1  Feature2    0.340965
2  Feature3    0.316392
Selected Features: ['Feature1', 'Feature2', 'Feature3']
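A threshold of zero keeps almost every feature, as the output above shows. One possible alternative, sketched here under assumptions, is scikit-learn's `SelectFromModel` with `threshold="mean"`, which keeps only features whose importance is at least the average importance. Fixing `random_state=0` is an added choice to make the run repeatable; the importances will therefore differ from the values printed above.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

# Same toy data as in the cells above
data = pd.DataFrame({
    'Feature1': [10, 20, 30, 40, 50],
    'Feature2': [5, 10, 15, 20, 25],
    'Feature3': [15, 25, 35, 45, 55],
    'Target': [1, 0, 1, 0, 1]
})
X = data.drop(columns='Target')
y = data['Target']

# random_state is fixed only so that repeated runs give the same importances
forest = RandomForestClassifier(n_estimators=100, random_state=0)

# Keep features whose importance is at least the mean importance
selector = SelectFromModel(forest, threshold="mean")
selector.fit(X, y)

importances = selector.estimator_.feature_importances_
mask = selector.get_support()
sfm_selected = X.columns[mask].tolist()

print("Importances:", dict(zip(X.columns, importances)))
print("Selected Features:", sfm_selected)
```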


**Sample Dataset**
The embedded-method examples below use a small synthetic dataset with three features: the target is built from Feature1 and Feature2, while Feature3 is random noise that does not influence it.

**Embedded Methods for Feature Selection**

Lasso Regression (L1 Regularization)
• Description: Adds an L1 penalty to the cost function, which can shrink the coefficients of less important features to exactly zero, effectively selecting features.
• Use Case: Linear models where feature selection is required.

Ridge Regression (L2 Regularization) with Coefficient Thresholding
• Description: Ridge does not select features directly, but setting a threshold on the coefficient magnitude after fitting can act as a form of feature selection.
• Use Case: Linear models that need stable coefficient estimates, where features with small coefficients can be dropped afterwards.

Elastic Net (Combination of L1 and L2 Regularization)
• Description: Combines L1 (for feature selection) and L2 (for stability) penalties, allowing for both feature selection and shrinkage.
• Use Case: Linear models, especially when features are correlated.
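For reference, the three penalties can be written out as objective functions. The forms below follow the parameterization used by scikit-learn's `Lasso`, `Ridge`, and `ElasticNet` estimators, where $n$ is the number of samples, $w$ the coefficient vector, $\alpha$ the `alpha` parameter, and $\rho$ the `l1_ratio` parameter; other texts may scale the terms differently.

$$
\begin{aligned}
\text{Lasso:} \quad & \min_w \; \frac{1}{2n}\lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1 \\
\text{Ridge:} \quad & \min_w \; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2 \\
\text{Elastic Net:} \quad & \min_w \; \frac{1}{2n}\lVert y - Xw \rVert_2^2 + \alpha \rho \lVert w \rVert_1 + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
\end{aligned}
$$

Setting $\rho = 1$ in the Elastic Net objective recovers the Lasso objective. Note that the Ridge loss term is not divided by $2n$, so the same numeric `alpha` does not mean the same penalty strength across the three estimators.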

In [21]:
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso, Ridge, ElasticNet

# Generating a simple dataset
np.random.seed(0)
X = pd.DataFrame({
    'Feature1': np.random.rand(10),
    'Feature2': np.random.rand(10) * 2,
    'Feature3': np.random.rand(10) * 0.5  # Random noise
})
y = 3 * X['Feature1'] + 2 * X['Feature2'] + np.random.rand(10)  # Target with some noise

print("Dataset:\n", X)
print("\nTarget:\n", y)


Dataset:
    Feature1  Feature2  Feature3
0  0.548814  1.583450  0.489309
1  0.715189  1.057790  0.399579
2  0.602763  1.136089  0.230740
3  0.544883  1.851193  0.390265
4  0.423655  0.142072  0.059137
5  0.645894  0.174259  0.319961
6  0.437587  0.040437  0.071677
7  0.891773  1.665240  0.472334
8  0.963663  1.556314  0.260924
9  0.383442  1.740024  0.207331

Target:
 0    5.077896
1    5.035381
2    4.536619
3    5.905470
4    1.573898
5    2.903835
6    2.005731
7    6.622732
8    6.947363
9    5.312193
dtype: float64


**1. Lasso Regression (L1 Regularization)**

Lasso (L1) adds a penalty that can shrink some coefficients to zero, effectively performing feature selection.

In [22]:
lasso = Lasso(alpha=0.1)
lasso.fit(X, y)

print("Lasso Coefficients:", lasso.coef_)


Lasso Coefficients: [0.68446817 2.1121344  0.        ]
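How many coefficients Lasso zeroes out depends on `alpha`. The sketch below rebuilds the same synthetic dataset and refits Lasso over a small grid of `alpha` values, listing the features with nonzero coefficients each time. The grid and the `lasso_selected` name are assumptions made for illustration; with a large enough `alpha` every coefficient is driven to zero.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso

# Rebuild the synthetic dataset from the earlier cell
np.random.seed(0)
X = pd.DataFrame({
    'Feature1': np.random.rand(10),
    'Feature2': np.random.rand(10) * 2,
    'Feature3': np.random.rand(10) * 0.5  # Random noise
})
y = 3 * X['Feature1'] + 2 * X['Feature2'] + np.random.rand(10)

# Assumed grid of regularization strengths, from weak to very strong
alphas = [0.01, 0.1, 1.0, 10.0]

lasso_selected = {}
for alpha in alphas:
    model = Lasso(alpha=alpha).fit(X, y)
    # A feature counts as selected if its coefficient is not exactly zero
    lasso_selected[alpha] = X.columns[model.coef_ != 0].tolist()
    print(f"alpha={alpha}: coefficients={model.coef_}, selected={lasso_selected[alpha]}")
```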


**2. Ridge Regression (L2 Regularization)**

Ridge (L2) applies a penalty proportional to the sum of the squared coefficients, shrinking them but not setting them exactly to zero.

Explanation:

Expected Outcome: All coefficients are shrunk toward zero, but none are exactly zero. This regularization helps manage multicollinearity but keeps all features in the model.

In [23]:
ridge = Ridge(alpha=0.1)
ridge.fit(X, y)

print("Ridge Coefficients:", ridge.coef_)

Ridge Coefficients: [2.78092591 2.06444915 0.21161511]
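Because Ridge leaves every coefficient nonzero, using it for selection requires an explicit cutoff on the coefficient magnitude. The following is a minimal sketch of that thresholding step; the cutoff of 0.5 and the names `coef_threshold` and `ridge_selected` are assumptions for this example. Coefficient magnitudes depend on feature scale, so in practice the features would usually be standardized first.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge

# Rebuild the synthetic dataset from the earlier cell
np.random.seed(0)
X = pd.DataFrame({
    'Feature1': np.random.rand(10),
    'Feature2': np.random.rand(10) * 2,
    'Feature3': np.random.rand(10) * 0.5  # Random noise
})
y = 3 * X['Feature1'] + 2 * X['Feature2'] + np.random.rand(10)

ridge = Ridge(alpha=0.1)
ridge.fit(X, y)

# Assumed cutoff: keep features whose absolute coefficient exceeds it
coef_threshold = 0.5
ridge_selected = X.columns[np.abs(ridge.coef_) > coef_threshold].tolist()

print("Ridge Coefficients:", ridge.coef_)
print("Selected Features (|coef| > 0.5):", ridge_selected)
```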


**3. Elastic Net (Combination of L1 and L2 Regularization)**

Elastic Net combines both L1 and L2 penalties, allowing for feature selection (L1 effect) and shrinkage (L2 effect).

Explanation:

Expected Outcome: Some coefficients may be reduced to zero (like Lasso), while others are shrunk but kept (like Ridge). This combination is useful when you want both feature selection and stability.

In [24]:
elastic_net = ElasticNet(alpha=0.1, l1_ratio=0.5)  # l1_ratio balances L1 and L2
elastic_net.fit(X, y)

print("Elastic Net Coefficients:", elastic_net.coef_)


Elastic Net Coefficients: [0.93677728 1.98318448 0.        ]
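The `l1_ratio` parameter controls how Lasso-like or Ridge-like the Elastic Net behaves. The sketch below refits the model on the same synthetic dataset for a few assumed `l1_ratio` values and also checks that `l1_ratio=1.0` gives the same coefficients as `Lasso` with the same `alpha`, since the penalty then reduces to a pure L1 term. The grid of ratios and the `enet_coefs` name are illustrative choices.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import ElasticNet, Lasso

# Rebuild the synthetic dataset from the earlier cell
np.random.seed(0)
X = pd.DataFrame({
    'Feature1': np.random.rand(10),
    'Feature2': np.random.rand(10) * 2,
    'Feature3': np.random.rand(10) * 0.5  # Random noise
})
y = 3 * X['Feature1'] + 2 * X['Feature2'] + np.random.rand(10)

# Assumed grid: mostly L2, balanced, and pure L1
l1_ratios = [0.1, 0.5, 1.0]

enet_coefs = {}
for ratio in l1_ratios:
    model = ElasticNet(alpha=0.1, l1_ratio=ratio).fit(X, y)
    enet_coefs[ratio] = model.coef_
    print(f"l1_ratio={ratio}: coefficients={model.coef_}")

# With l1_ratio=1.0 the penalty is pure L1, so the fit should match Lasso
lasso_coefs = Lasso(alpha=0.1).fit(X, y).coef_
print("Matches Lasso at l1_ratio=1.0:", np.allclose(enet_coefs[1.0], lasso_coefs))
```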
