QTS.1

**Filter Method in Feature Selection:**
- **Definition**: Filter methods evaluate the relevance of features based 
on statistical measures and rank or select them before the model training process.
- **How it works**: Features are assessed independently of the machine learning model.
Common measures include correlation coefficients, chi-squared tests, and mutual information
(also called information gain). Features are then ranked or selected based on these scores,
and only the top-ranked features are used for model training.
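
As a minimal sketch of the ranking step described above, the snippet below scores each feature by its absolute Pearson correlation with the target and keeps the top two. The data is synthetic, and the feature names (`signal`, `weak`, `noise`) and the helper `rank_features` are hypothetical, chosen only for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rank_features(columns, target):
    """Score every feature on its own, then sort from most to least relevant."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Synthetic data (assumption): the target depends strongly on "signal",
# moderately on "weak", and not at all on "noise".
rng = random.Random(0)
n = 500
columns = {
    "signal": [rng.gauss(0, 1) for _ in range(n)],
    "weak": [rng.gauss(0, 1) for _ in range(n)],
    "noise": [rng.gauss(0, 1) for _ in range(n)],
}
target = [3.0 * s + 1.0 * w + rng.gauss(0, 1)
          for s, w in zip(columns["signal"], columns["weak"])]

ranking = rank_features(columns, target)
top_k = [name for name, _ in ranking[:2]]  # keep the two highest-scoring features
print(ranking)
print(top_k)
```

No model is trained at any point: the scores depend only on the data, which is what makes the approach cheap.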


QTS.2

**Wrapper Method vs. Filter Method:**
- **Wrapper Method**: Evaluates subsets of features by using a specific 
machine learning model's performance as a criterion. 
It involves iterative model training with different feature subsets, and
the selection criterion is based on the model's performance (e.g., accuracy, F1 score).
- **Filter Method**: Assesses the relevance of features independently of the machine learning 
model. It relies on statistical measures to rank or select features before the model training process.

In summary, the main difference lies in how they assess feature relevance: Wrapper methods
use the model's performance, while filter methods use statistical measures.

QTS.3

**Common Techniques in Embedded Feature Selection:**
1. **Lasso Regression (L1 Regularization):**
   - **How it works**: Penalizes the absolute 
    values of the coefficients, which drives some coefficients to exactly zero and so removes those features from the model.
  
2. **Decision Trees and Random Forests:**
   - **How it works**: Feature importance scores are calculated during the 
    tree-building process, aiding in feature selection.

3. **Gradient Boosting Machines:**
   - **How it works**: Iterative model training where each new model corrects 
    errors of the previous ones; feature importance is derived during this process.

4. **Elastic Net Regression:**
   - **How it works**: Combines L1 and L2 regularization, providing a balance 
    between sparsity and grouping effects for feature selection.

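5. **Recursive Feature Elimination (RFE):**
   - **How it works**: Repeatedly fits a model and removes the least important features, as ranked by the 
    model's coefficients or importance scores, until the desired number of features is reached. RFE is usually 
    classed as a wrapper method, but it is listed here because it relies on these embedded importance scores.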

Embedded methods incorporate feature selection within the model training process,
optimizing both the model and feature subset simultaneously.
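
For reference, the penalties behind items 1 and 4 can be written out as below. The notation is an assumption introduced here: $X$ is the feature matrix, $y$ the target vector, $w$ the coefficient vector, $n$ the number of samples, and $\lambda$, $\lambda_1$, $\lambda_2$ are non-negative penalty strengths. Libraries may scale these terms slightly differently.

```latex
% Lasso: squared error plus an L1 penalty on the coefficients
\hat{w}_{\text{lasso}} = \arg\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_1

% Elastic Net: the same loss with both L1 and L2 penalties
\hat{w}_{\text{enet}} = \arg\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2
    + \lambda_1 \lVert w \rVert_1 + \lambda_2 \lVert w \rVert_2^2
```

The L1 term is what sets coefficients to exactly zero; the L2 term shrinks correlated features together rather than picking one arbitrarily.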

QTS.4

**Drawbacks of Filter Method for Feature Selection:**
1. **Independence Assumption:**
   - **Issue**: Filter methods assess features independently, ignoring potential 
    interactions or dependencies between features.
   
2. **Ignores Model's Performance:**
   - **Issue**: The selected features might not be the most relevant for the 
    specific machine learning model being used, as filter methods don't consider the model's performance.

3. **Doesn't Consider Feature Combinations:**
   - **Issue**: Filter methods don't evaluate the impact of feature combinations,
    potentially missing synergies that could contribute to better model performance.

4. **Sensitivity to Feature Scaling:**
   - **Issue**: Some measures, such as variance thresholds, depend on the scale of the features, so 
    unscaled data can distort the ranking. Pearson correlation, by contrast, is unaffected by linear rescaling.

5. **Limited in Handling Redundancy:**
   - **Issue**: Because each feature is scored separately, two highly correlated features receive 
    similar scores and may both be selected, even though the second adds little new information.

While filter methods are computationally efficient, these drawbacks highlight 
limitations in their ability to capture the full complexity of relationships between features in a dataset.
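
The first and third drawbacks can be illustrated with a small constructed example. In this sketch the label is the XOR of two binary features, so each feature alone has zero correlation with the label, yet the pair determines it exactly. The dataset and the helper `lookup_accuracy` (a majority-vote lookup table standing in for a model) are assumptions made for illustration.

```python
import itertools
import math
from collections import Counter, defaultdict

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def lookup_accuracy(keys, target):
    """Accuracy of a majority-vote lookup table keyed on the given feature values."""
    votes = defaultdict(Counter)
    for key, label in zip(keys, target):
        votes[key][label] += 1
    correct = sum(counter.most_common(1)[0][1] for counter in votes.values())
    return correct / len(target)

# All four combinations of two binary features, repeated 25 times (100 rows).
rows = list(itertools.product([0, 1], repeat=2)) * 25
feature_a = [a for a, _ in rows]
feature_b = [b for _, b in rows]
target = [a ^ b for a, b in rows]  # label is the XOR of the two features

# Univariate filter scores: both features look useless on their own.
score_a = abs(pearson(feature_a, target))
score_b = abs(pearson(feature_b, target))
print(score_a, score_b)  # 0.0 0.0

# Used alone, a feature does no better than chance; used together, the pair is perfect.
acc_a = lookup_accuracy(feature_a, target)
acc_pair = lookup_accuracy(rows, target)
print(acc_a, acc_pair)  # 0.5 1.0
```

A filter that ranks features one at a time would discard both features here, while a method that evaluates subsets would keep them.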

QTS.5

**Use Filter Method Over Wrapper Method When:**
1. **Large Dataset:**
   - **Situation**: Working with a large dataset where the computational cost of
    wrapper methods is prohibitive.

2. **Computational Efficiency is Critical:**
   - **Situation**: Need a quick and computationally efficient feature selection 
    process without the overhead of iterative model training.

3. **Exploratory Data Analysis:**
   - **Situation**: Conducting initial data exploration, where a fast way to identify 
    potentially relevant features is useful before diving into complex model-specific evaluations.

4. **Feature Independence is Reasonable:**
   - **Situation**: Features can reasonably be assumed to be independent, so 
    assessing them individually is sufficient for the task.

Filter methods are advantageous in scenarios where quick, independent feature 
assessment is needed, and computational efficiency is a priority.
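
To make the cost argument concrete, the sketch below counts how many evaluations each approach needs for `n` features: one score per feature for a filter, at most `n + (n - 1) + ... + 1` model fits for greedy forward selection, and one fit per non-empty subset for an exhaustive wrapper. The function names are hypothetical, and the counts ignore cross-validation folds, which would multiply the wrapper costs further.

```python
def filter_evaluations(n_features):
    """A filter computes one statistic per feature."""
    return n_features

def forward_selection_fits(n_features):
    """Worst case for greedy forward selection: n + (n - 1) + ... + 1 model fits."""
    return n_features * (n_features + 1) // 2

def exhaustive_wrapper_fits(n_features):
    """An exhaustive wrapper fits one model per non-empty feature subset."""
    return 2 ** n_features - 1

for n in (10, 20):
    print(n, filter_evaluations(n), forward_selection_fits(n), exhaustive_wrapper_fits(n))
# 10 10 55 1023
# 20 20 210 1048575
```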

QTS.6

**Filter Method for Selecting Features in Telecom Customer Churn Model:**
1. **Explore Data:**
   - **Step**: Conduct initial exploratory data analysis to understand the 
    characteristics of the dataset and the relationships between features.

2. **Calculate Relevance Metrics:**
   - **Step**: Use filter methods such as correlation analysis, chi-squared tests,
    or information gain to calculate relevance metrics for each feature in isolation.

3. **Rank Features:**
   - **Step**: Rank the features based on their relevance metrics. 
    Identify the top-ranked features that show the strongest correlation or highest information
    gain with respect to the target variable (churn).

4. **Select Features:**
   - **Step**: Choose a subset of the top-ranked features based on the desired number
    or a predetermined threshold for relevance.

5. **Train Model:**
   - **Step**: Train the predictive model using the selected subset of features and 
    evaluate its performance on a validation set.

By applying filter methods, you can quickly identify and select features that 
exhibit a strong statistical relationship with the target variable, potentially 
improving the model's ability to predict customer churn.
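
Steps 2 to 4 above could look roughly like the following for categorical features, using a chi-squared test of independence against the churn label. The records, the feature names (`contract`, `gender`), and the counts are made up for illustration and are not taken from any real telecom dataset. The threshold 3.84 is the chi-squared critical value at p = 0.05 with one degree of freedom, which applies here because every variable has two levels.

```python
from collections import Counter

def chi_squared(feature, target):
    """Chi-squared statistic of independence between a categorical feature and a label."""
    n = len(target)
    observed = Counter(zip(feature, target))
    feature_totals = Counter(feature)
    target_totals = Counter(target)
    stat = 0.0
    for f_value, f_count in feature_totals.items():
        for t_value, t_count in target_totals.items():
            expected = f_count * t_count / n
            stat += (observed[(f_value, t_value)] - expected) ** 2 / expected
    return stat

# Hypothetical toy records: (contract, gender, churned).
# Monthly contracts churn far more often; gender is unrelated to churn.
records = (
    [("monthly", "F", 1)] * 15 + [("monthly", "M", 1)] * 15
    + [("monthly", "F", 0)] * 5 + [("monthly", "M", 0)] * 5
    + [("yearly", "F", 1)] * 3 + [("yearly", "M", 1)] * 3
    + [("yearly", "F", 0)] * 27 + [("yearly", "M", 0)] * 27
)
features = {
    "contract": [r[0] for r in records],
    "gender": [r[1] for r in records],
}
churned = [r[2] for r in records]

# Step 2: score each feature in isolation.
scores = {name: chi_squared(values, churned) for name, values in features.items()}
# Step 3: rank by score.
ranking = sorted(scores, key=scores.get, reverse=True)
# Step 4: keep features above the threshold.
CRITICAL_VALUE = 3.84  # p = 0.05, 1 degree of freedom
selected = [name for name in ranking if scores[name] > CRITICAL_VALUE]
print(scores)
print(selected)
```

The selected columns would then be passed to whatever churn model is trained in step 5.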

QTS.7

**Embedded Feature Selection for Soccer Match Outcome Prediction:**
1. **Choose a Model with Embedded Feature Selection:**
   - **Selection**: Opt for a machine learning algorithm that inherently 
    incorporates feature selection within its 
    training process. Examples include Lasso Regression, Decision Trees, 
    Random Forests, Gradient Boosting Machines, or Elastic Net Regression.

2. **Train the Model:**
   - **Process**: Train the chosen model on the training portion of the dataset, 
    including all available features, and hold out a separate test set for step 5.

3. **Evaluate Feature Importance:**
   - **Process**: Utilize the model's built-in feature importance scores or 
    coefficients to identify the relevance of each feature in predicting soccer match outcomes.

4. **Select Top Features:**
   - **Process**: Choose a subset of the most important features based on their scores.
    Set a threshold or select a predetermined number of features.

5. **Refine and Validate:**
   - **Process**: Refine the model by training it again using only the selected 
    subset of features. Validate the model's performance on a separate test set to ensure generalization.

Embedded methods automatically assess feature importance during the model training process,
making them suitable for selecting relevant features in a soccer match outcome prediction project.
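
The sketch below walks through steps 2 to 4 with a small Lasso fitted by coordinate descent. It makes several simplifying assumptions that are not part of the answer above: the outcome is modelled as a numeric goal difference rather than a win/draw/loss class, the data is synthetic, and the feature names (`form_diff`, `shots_diff`, `kickoff_hour`, `kit_number`) are hypothetical. In practice a library implementation would normally be used instead of this hand-written loop.

```python
import math
import random

def standardize(column):
    """Rescale a column to zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / std for v in column]

def soft_threshold(value, threshold):
    """Shrink a value towards zero; values within the threshold become exactly zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(columns, y, alpha, n_sweeps=50):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 for standardized columns."""
    n = len(y)
    y_mean = sum(y) / n
    residual = [t - y_mean for t in y]  # residual of the all-zero model
    weights = [0.0] * len(columns)
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            # Correlation of feature j with the residual, adding back its own contribution.
            rho = sum(x * r for x, r in zip(col, residual)) / n + weights[j]
            new_weight = soft_threshold(rho, alpha)
            delta = new_weight - weights[j]
            if delta:
                residual = [r - delta * x for r, x in zip(residual, col)]
            weights[j] = new_weight
    return weights

# Synthetic matches (assumption): goal difference depends on the first two features only.
rng = random.Random(42)
n = 400
names = ["form_diff", "shots_diff", "kickoff_hour", "kit_number"]
raw = {name: [rng.gauss(0, 1) for _ in range(n)] for name in names}
goal_diff = [1.2 * f + 0.6 * s + rng.gauss(0, 1)
             for f, s in zip(raw["form_diff"], raw["shots_diff"])]

columns = [standardize(raw[name]) for name in names]
weights = lasso_coordinate_descent(columns, goal_diff, alpha=0.2)
selected = [name for name, w in zip(names, weights) if w != 0.0]
print(dict(zip(names, weights)))
print(selected)
```

Selection happens as a side effect of fitting: the features whose coefficients survive the L1 penalty are the ones kept for the refit in step 5.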

QTS.8

**Wrapper Method for House Price Prediction:**
1. **Define Evaluation Metric:**
   - **Step**: Choose an evaluation metric (e.g., mean squared error for regression)
    to assess the model's performance.

2. **Generate Feature Subsets:**
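   - **Step**: Create candidate subsets of features to be evaluated by the chosen
    machine learning model. Exhaustive search over all combinations is rarely feasible, so 
    greedy strategies such as forward selection or backward elimination are common.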

3. **Train and Evaluate Model:**
   - **Step**: Iteratively train the model using each subset of features and evaluate
    its performance using the defined metric, ideally with cross-validation or a validation 
    split so that the final test set stays untouched.

4. **Select Optimal Subset:**
   - **Step**: Choose the subset of features that results in the best model performance
    according to the evaluation metric.

5. **Validate Model:**
   - **Step**: Validate the final model's performance on a separate test set to ensure
    its generalization ability.

Wrapper methods involve the model itself in the feature selection process, assessing 
subsets of features based on their impact on the model's performance for the specific
prediction task, such as predicting house prices.
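
One way to realise these five steps is greedy forward selection around an ordinary least-squares model, sketched below. Everything specific is an assumption for illustration: the houses are synthetic, the feature names (`area`, `rooms`, `age`, `door_number`) are hypothetical, prices are in arbitrary units, and a single validation split stands in for cross-validation.

```python
import random

def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def holdout_mse(subset, train, holdout):
    """Step 1 and 3: fit on `train` using only `subset`, return MSE on `holdout`."""
    pick = lambda houses: [[x[name] for name in subset] for x, _ in houses]
    coefs = fit_ols(pick(train), [price for _, price in train])
    errors = [(coefs[0] + sum(c * v for c, v in zip(coefs[1:], row)) - price) ** 2
              for row, (_, price) in zip(pick(holdout), holdout)]
    return sum(errors) / len(errors)

def forward_selection(names, train, valid, min_gain=0.05):
    """Steps 2 to 4: add the best remaining feature until the gain falls below min_gain."""
    selected, best = [], float("inf")
    while len(selected) < len(names):
        candidates = [n for n in names if n not in selected]
        score, name = min((holdout_mse(selected + [n], train, valid), n) for n in candidates)
        if score >= best * (1 - min_gain):  # improvement too small, stop
            break
        selected.append(name)
        best = score
    return selected, best

def make_houses(rng, n):
    """Synthetic houses: price depends on area, rooms and age, not on door_number."""
    houses = []
    for _ in range(n):
        x = {"area": rng.uniform(50, 250), "rooms": rng.randint(1, 6),
             "age": rng.uniform(0, 50), "door_number": rng.uniform(0, 1)}
        price = 2.0 * x["area"] + 15.0 * x["rooms"] - 1.0 * x["age"] + rng.gauss(0, 10)
        houses.append((x, price))
    return houses

rng = random.Random(1)
train, valid, test = make_houses(rng, 300), make_houses(rng, 100), make_houses(rng, 100)
selected, valid_score = forward_selection(["area", "rooms", "age", "door_number"], train, valid)
test_score = holdout_mse(selected, train, test)  # step 5: final check on unseen data
print(selected, valid_score, test_score)
```

The search only ever looks at the training and validation splits; the test split is used once, after the subset is fixed, so the reported error is not biased by the selection itself.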