Q1. What is the Filter method in feature selection, and how does it work?


Filter Method in Feature Selection
The Filter Method is a feature selection technique used in machine learning to select relevant features (independent variables) before training a model. It works by evaluating each feature individually, based on statistical tests, without involving a machine learning model.

How It Works:
Rank Features: Each feature is scored using a statistical measure that evaluates its relevance to the target variable.
Set a Threshold: Features with scores above a certain threshold are selected, while less important features are removed.
Train the Model: The reduced feature set is used to train the machine learning model.
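
The three steps above can be sketched with scikit-learn as follows. This is a minimal illustration, not a prescribed recipe: the synthetic dataset, the ANOVA F-test score, and k=5 are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

# Synthetic data (assumption for illustration): 20 features, 5 informative.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5,
    n_redundant=0, random_state=0,
)

# Step 1: score every feature against the target (ANOVA F-test here).
selector = SelectKBest(score_func=f_classif, k=5)

# Step 2: keep the k best-scoring features (k acts as the threshold).
X_reduced = selector.fit_transform(X, y)
selected_idx = selector.get_support(indices=True)

# Step 3: train the model on the reduced feature set only.
model = LogisticRegression(max_iter=1000).fit(X_reduced, y)

print("Selected feature indices:", selected_idx)
print("Reduced shape:", X_reduced.shape)
```

Note that the scoring step never consults the final model; the logistic regression only sees the features that survived the filter.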


Q2. How does the Wrapper method differ from the Filter method in feature selection?


1. Filter Method (Example: Correlation, Chi-Square, Mutual Information)

Features are ranked based on a statistical measure.

A threshold is applied to select top-ranked features.

The reduced feature set is used for model training.



2. Wrapper Method (Example: Forward/Backward Selection, Recursive Feature Elimination)
A subset of features is selected.

A model is trained and evaluated on this subset.

Features are added/removed iteratively to optimize performance.

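The best-performing subset is selected.

Key difference: the Filter method scores features with statistics and never trains a model during selection, while the Wrapper method trains and evaluates a model for every candidate subset. Wrapper selection is therefore tuned to the chosen model but costs far more computation.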
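
As a rough sketch of the wrapper loop described above, scikit-learn's SequentialFeatureSelector performs forward selection by refitting the model with cross-validation for each candidate feature. The synthetic regression data and the choice of three features are assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Synthetic data (assumption): 8 features, 3 of them informative.
X, y = make_regression(
    n_samples=200, n_features=8, n_informative=3,
    noise=5.0, random_state=0,
)

# Forward selection: start from an empty set and, at each round, add the
# feature that gives the best cross-validated score for this model.
sfs = SequentialFeatureSelector(
    LinearRegression(),
    n_features_to_select=3,
    direction="forward",
    cv=5,
)
sfs.fit(X, y)

selected_idx = sfs.get_support(indices=True)
X_selected = sfs.transform(X)
print("Selected feature indices:", selected_idx)
```

Setting direction="backward" would instead start from all features and remove one per round.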

Q3. What are some common techniques used in Embedded feature selection methods?


Common Techniques in Embedded Feature Selection
Lasso (L1 Regularization)
Shrinks feature coefficients, setting some to zero (removing irrelevant features).
Used in Lasso Regression, Logistic Regression (L1 penalty).


Ridge (L2 Regularization) & Elastic Net
Ridge shrinks coefficients but does not remove features.
Elastic Net combines L1 (Lasso) & L2 (Ridge) for balanced selection.


Tree-Based Feature Importance
Random Forest, Decision Trees, XGBoost assign importance scores to features.
Features with low importance are removed.


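Recursive Feature Elimination (RFE)
Iteratively removes the least important features, ranked by a model's coefficients or importance scores, and refits the model after each removal.
Note: RFE is usually classed as a Wrapper method because it retrains the model repeatedly; it appears here only because it relies on the model's built-in importance scores.
Advantage of Embedded methods: selection happens during training, so it is cheaper than a Wrapper search.
Limitation: the result is tied to the chosen model, and RFE in particular is computationally expensive for large datasets.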
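
A minimal sketch of the Lasso case, assuming synthetic regression data and an arbitrary penalty strength alpha=1.0 (in practice alpha would be tuned, for example with LassoCV). Features are standardized first because the L1 penalty is sensitive to feature scale.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 10 features, 3 of them informative.
X, y = make_regression(
    n_samples=300, n_features=10, n_informative=3,
    noise=1.0, random_state=0,
)
X_scaled = StandardScaler().fit_transform(X)

# Fitting the model performs the selection: the L1 penalty drives the
# coefficients of unhelpful features to exactly zero.
lasso = Lasso(alpha=1.0).fit(X_scaled, y)

kept_idx = np.flatnonzero(lasso.coef_)
print("Coefficients:", np.round(lasso.coef_, 2))
print("Features with non-zero coefficients:", kept_idx)
```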



Q4. What are some drawbacks of using the Filter method for feature selection?


Ignores Feature Interactions – Evaluates features individually, missing important combinations.

May Select Irrelevant Features – A high univariate score (for example, a high correlation with the target) does not guarantee that the feature improves the final model.

Not Model-Specific – Doesn't optimize feature selection for a specific ML model.

Threshold Sensitivity – Arbitrary cutoffs may lead to removing useful features.

Redundant Features – Univariate scoring can keep several features that are highly correlated with each other, because each one scores well on its own.
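
The first drawback can be illustrated with a small constructed example (an assumption for illustration, not real data): the target is the XOR of two binary features, so each feature alone looks useless to a correlation filter even though the pair determines the target exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Two independent binary features; the target is their XOR.
x1 = rng.integers(0, 2, size=n)
x2 = rng.integers(0, 2, size=n)
y = x1 ^ x2

# Univariate filter score: correlation of each feature with the target.
corr_x1 = np.corrcoef(x1, y)[0, 1]
corr_x2 = np.corrcoef(x2, y)[0, 1]

print("corr(x1, y):", round(corr_x1, 3))
print("corr(x2, y):", round(corr_x2, 3))
# Both correlations are close to zero, so a correlation-based filter
# would discard both features, although together they define y.
```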

Q5. In which situations would you prefer using the Filter method over the Wrapper method for feature
selection?


Large Datasets 
The Filter method is faster and computationally efficient, making it ideal for datasets with many features (e.g., text data, genomics).

Preprocessing Step Before Advanced Methods 
Often used before Wrapper or Embedded methods to remove irrelevant features quickly, reducing computation time.

Avoiding Overfitting 
Since selection does not search over model performance, it is less prone than Wrapper methods to overfitting the chosen feature subset, especially on small datasets.

Interpretability & Simplicity 
Uses statistical tests like correlation, Chi-square, and mutual information, making it easy to understand and apply.

Limited Computational Resources 
Unlike Wrapper methods (which train models multiple times), the Filter method is lightweight and fast, making it suitable for low-resource environments.

Q6. In a telecom company, you are working on a project to develop a predictive model for customer churn.
You are unsure of which features to include in the model because the dataset contains several different
ones. Describe how you would choose the most pertinent attributes for the model using the Filter Method.


Feature Selection for Customer Churn Prediction Using the Filter Method


Remove Low-Variance Features – Drop features with little variation using VarianceThreshold().


Correlation Analysis – For each pair of highly correlated features (for example, absolute correlation > 0.85), drop one of the two to avoid redundancy.


Chi-Square Test (For Categorical Features) – Select top categorical features using SelectKBest(chi2). The features must be encoded as non-negative values (for example, one-hot or count encoding).


Mutual Information Score – Identify non-linear relationships between features and Churn, then keep the highest-scoring features for model training.
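
The four steps above might look like this in code. This is a sketch under stated assumptions: the churn data is synthetic, and every column name (tenure_months, monthly_charges, total_charges, country_code, contract_monthly, has_fiber) is hypothetical. The 0.85 cutoff comes from the text; the other settings are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import (
    SelectKBest, VarianceThreshold, chi2, mutual_info_classif,
)

rng = np.random.default_rng(42)
n = 1000

# Hypothetical telecom data; all column names are assumptions.
tenure = rng.integers(1, 72, size=n)
monthly = rng.normal(70, 10, size=n)
df = pd.DataFrame({
    "tenure_months": tenure,
    "monthly_charges": monthly,
    "total_charges": tenure * monthly,   # strongly tied to tenure
    "country_code": np.ones(n),          # constant column
    "contract_monthly": rng.integers(0, 2, size=n),
    "has_fiber": rng.integers(0, 2, size=n),
})
# Churn is made more likely for month-to-month contracts.
churn = (rng.random(n) < 0.1 + 0.4 * df["contract_monthly"]).astype(int)

# Step 1: drop features with zero variance.
vt = VarianceThreshold(threshold=0.0).fit(df)
df = df.loc[:, vt.get_support()]

# Step 2: for each pair with |correlation| > 0.85, drop the later column.
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.85).any()]
df = df.drop(columns=to_drop)

# Step 3: chi-square test on the (non-negative) categorical features.
categorical = ["contract_monthly", "has_fiber"]
chi_selector = SelectKBest(chi2, k=1).fit(df[categorical], churn)
top_categorical = list(np.array(categorical)[chi_selector.get_support()])

# Step 4: mutual information between each remaining feature and churn.
mi = mutual_info_classif(df, churn, random_state=0)
mi_scores = pd.Series(mi, index=df.columns).sort_values(ascending=False)

print("Dropped for correlation:", to_drop)
print("Top categorical feature:", top_categorical)
print(mi_scores)
```

In a real project the scores from steps 3 and 4 would be used to pick a final feature list, and the selection would be fitted on the training split only to avoid leaking information from the test set.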

Q7. You are working on a project to predict the outcome of a soccer match. You have a large dataset with
many features, including player statistics and team rankings. Explain how you would use the Embedded
method to select the most relevant features for the model.


Feature Selection for Soccer Match Prediction Using the Embedded Method
1. Train a Model with Built-in Feature Selection:
Use Lasso (L1 Regularization) to remove irrelevant features.
Use Tree-based models (Random Forest, XGBoost) to get feature importance scores.


2. Extract Important Features:
Keep features with non-zero coefficients (Lasso).
Select top-ranked features based on importance scores (Tree models).


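3. Recursive Feature Elimination (RFE) (Optional):
Iteratively remove the least important features using a model like RandomForestClassifier. Strictly speaking this is a Wrapper-style step, since the model is refitted after each removal, but it reuses the importance scores the model already provides.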
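
Steps 1 and 2 could be sketched for the tree-based case as below. The data is synthetic and stands in for player statistics and team rankings; the three classes (for example win, draw, loss), the number of trees, and the "mean" importance threshold are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in for match data (assumption): 15 features,
# 4 informative, 3 outcome classes.
X, y = make_classification(
    n_samples=600, n_features=15, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Step 1: train a model that produces importance scores as a by-product.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Step 2: keep features whose importance is at least the mean importance.
selector = SelectFromModel(forest, prefit=True, threshold="mean")
X_selected = selector.transform(X)
kept_idx = selector.get_support(indices=True)

print("Importances:", np.round(forest.feature_importances_, 3))
print("Kept feature indices:", kept_idx)
```

For the Lasso route, the same SelectFromModel wrapper can be applied to an L1-penalized LogisticRegression, in which case features with zero coefficients are dropped.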

Q8. You are working on a project to predict the price of a house based on its features, such as size, location,
and age. You have a limited number of features, and you want to ensure that you select the most important
ones for the model. Explain how you would use the Wrapper method to select the best set of features for the
predictor.

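from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Assumes X is a pandas DataFrame of numeric house features (size, encoded
# location, age, ...) and y is the sale price; neither is defined in this cell.
# RFE ranks features by coefficient size, so standardize X beforehand.
model = LinearRegression()

# Wrapper step: RFE refits the model repeatedly, dropping the weakest feature
# each round. n_features_to_select must not exceed the number of columns in X.
selector = RFE(model, n_features_to_select=5)  # keep the top 5 features
selector.fit(X, y)

print("Selected Features:", list(X.columns[selector.support_]))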
