# Machine Learning Intro

Machine learning is the scientific study of algorithms and statistical models that perform a specific task effectively without explicit instructions. Machine learning algorithms fall into two broad groups: supervised and unsupervised.

## Supervised Learning 
In supervised learning, the target is already known for the training data and is used to fit the model that makes predictions.

- **Classification**: When the target variable is categorical
- **Regression**: When the target variable is continuous
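
To make the distinction concrete, here is a minimal sketch using only the standard library. The data (`hours`, `score`, `passed`) and both helper functions are made up for illustration; a real project would more likely use a library such as scikit-learn.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (regression)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def nearest_neighbour_label(x_new, xs, labels):
    """Return the label of the closest training point (classification)."""
    closest = min(range(len(xs)), key=lambda i: abs(xs[i] - x_new))
    return labels[closest]


# Hypothetical toy data: one feature, two different kinds of target.
hours = [1.0, 2.0, 3.0, 4.0]
score = [52.0, 54.0, 56.0, 58.0]       # continuous target -> regression
passed = ["no", "no", "yes", "yes"]    # categorical target -> classification

slope, intercept = fit_line(hours, score)
predicted_score = slope * 5.0 + intercept
predicted_class = nearest_neighbour_label(3.6, hours, passed)

print(predicted_score, predicted_class)
```

The same feature is used twice: once to predict a number and once to predict a category. Only the type of the target changes which kind of model applies.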


## Unsupervised Learning 
In unsupervised learning, the target is not known; the model has to discover structure in the data on its own.

- **Clustering**: Groups similar observations together, e.g. customer segmentation
- **Association**: Finds items that frequently occur together, e.g. market basket analysis
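
As an illustration of association analysis, the sketch below computes the usual support, confidence and lift measures for one rule. The `baskets` data and the function names are assumptions made up for this example.

```python
# Hypothetical transactions: each basket is the set of items bought together.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, baskets):
    """Share of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Share of baskets with `antecedent` that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


# Rule under test: bread -> milk
rule_support = support({"bread", "milk"}, baskets)
rule_confidence = confidence({"bread"}, {"milk"}, baskets)
rule_lift = rule_confidence / support({"milk"}, baskets)

print(rule_support, rule_confidence, rule_lift)
```

A lift below 1 suggests that, in this toy data, buying bread makes milk slightly less likely than its overall frequency would predict.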

![image.png](attachment:image.png)

# Feature Selection 

Feature selection is the process of selecting a subset of relevant features for use in machine learning model building. There are three main approaches:
- Filter Method
- Wrapper Method
- Embedded Method

## Filter Method

Filter methods rely on the characteristics of the data and are model agnostic. They tend to be less computationally expensive and are suitable for quick screening.



![image.png](attachment:image.png)

- **Constant feature**: Same value for all observations. Check for a standard deviation of zero or a single unique value.
- **Quasi-constant feature**: Same value for most of the observations. Check against a variance threshold or the share of observations taken by the most frequent value.
- **Duplicated feature**: Identical to another feature. Retain only one of the duplicates.
- **Correlation**: The degree to which a pair of variables is linearly related. Correlated predictor variables provide redundant information. A good feature subset contains features highly correlated with the target, yet uncorrelated with each other.
- **Fisher score (Chi-square)**: Statistical test of the difference between expected and observed frequencies, suited to categorical features and a categorical target. **The smaller the p-value, the greater the importance.**
- **Univariate (one-way ANOVA)**: Tests the hypothesis that two or more samples have the same mean. Samples should be **independent, normally distributed, and have homogeneous variance.** Features with a p-value > 0.05 are not considered important for predicting Y.
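
The first four checks can be sketched as a single screening pass. This is a minimal standard-library illustration, not a reference implementation: the column names, the toy values and the thresholds (`quasi_threshold`, `corr_threshold`) are all assumptions, and the correlation check here only compares features with each other, not with the target.

```python
import math
from collections import Counter

# Hypothetical toy dataset, stored as column name -> list of values.
data = {
    "const":    [1, 1, 1, 1, 1, 1],
    "quasi":    [0, 0, 0, 0, 0, 1],
    "height":   [150, 160, 170, 180, 190, 200],
    "height_2": [150, 160, 170, 180, 190, 200],  # exact copy of height
    "weight":   [50, 61, 69, 80, 91, 100],       # nearly linear in height
    "noise":    [3, 9, 1, 7, 2, 8],
}


def dominant_share(values):
    """Fraction of observations taken by the most frequent value."""
    return Counter(values).most_common(1)[0][1] / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two non-constant columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def screen(data, quasi_threshold=0.8, corr_threshold=0.9):
    """Return (kept column names, {dropped column name: reason})."""
    kept, dropped = [], {}
    for name, values in data.items():
        share = dominant_share(values)
        if share == 1.0:
            dropped[name] = "constant"
        elif share >= quasi_threshold:
            dropped[name] = "quasi-constant"
        elif any(values == data[k] for k in kept):
            dropped[name] = "duplicate"
        elif any(abs(pearson(values, data[k])) >= corr_threshold for k in kept):
            dropped[name] = "correlated"
        else:
            kept.append(name)
    return kept, dropped


kept, dropped = screen(data)
print(kept)
print(dropped)
```

Because columns are processed in order, the first of a duplicated or correlated pair is the one that survives. A fuller version might instead keep whichever feature is more strongly related to the target.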

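The two statistical tests can be illustrated by computing their test statistics by hand. This is a sketch under stated assumptions: the toy data and function names are invented, and the p-value helper is only valid for a 2x2 table (one degree of freedom). In practice, a statistics library such as SciPy would supply general p-values.

```python
import math
from collections import Counter
from statistics import mean


def chi_square_statistic(feature, target):
    """Sum of (observed - expected)^2 / expected over the contingency table."""
    n = len(feature)
    observed = Counter(zip(feature, target))
    feature_totals, target_totals = Counter(feature), Counter(target)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for t, t_count in target_totals.items():
            expected = f_count * t_count / n
            stat += (observed[(f, t)] - expected) ** 2 / expected
    return stat


def chi_square_p_value_1df(stat):
    """Upper-tail p-value, valid ONLY for 1 degree of freedom (2x2 table)."""
    return math.erfc(math.sqrt(stat / 2))


def anova_f_statistic(values, groups):
    """One-way ANOVA F: between-group variance over within-group variance."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    n, k = len(values), len(by_group)
    grand_mean = mean(values)
    ss_between = sum(len(vs) * (mean(vs) - grand_mean) ** 2 for vs in by_group.values())
    ss_within = sum((v - mean(vs)) ** 2 for vs in by_group.values() for v in vs)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical target and features.
target = ["a", "a", "a", "a", "b", "b", "b", "b"]
cat_informative = ["x", "x", "x", "y", "y", "y", "y", "x"]
cat_uninformative = ["x", "y", "x", "y", "x", "y", "x", "y"]
num_informative = [1, 2, 3, 2, 6, 7, 8, 7]
num_uninformative = [1, 5, 3, 7, 2, 6, 4, 4]

chi_inf = chi_square_statistic(cat_informative, target)
chi_unf = chi_square_statistic(cat_uninformative, target)
p_inf = chi_square_p_value_1df(chi_inf)
f_inf = anova_f_statistic(num_informative, target)
f_unf = anova_f_statistic(num_uninformative, target)

print(chi_inf, chi_unf, p_inf, f_inf, f_unf)
```

For a fixed number of degrees of freedom, a larger statistic corresponds to a smaller p-value, so ranking features by the statistic gives the same order as ranking by p-value.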
## Wrapper Method

- Wrapper methods use machine learning models to score a **feature subset**.
- A new model is trained on each feature subset, so these methods usually find the best performing subset, at a higher computational cost.

- **Step forward**: Begin with no features and add one feature at a time (mlxtend)
    - **Recursive feature addition**: If adding the feature increases performance by more than the threshold, keep the feature
    - **Condition**: Increase > Threshold
- **Step backward**: Begin with all the features and remove one feature at a time (mlxtend)
    - **Recursive feature elimination**: If removing the feature decreases performance by less than the threshold, drop the feature
    - **Condition**: Decrease < Threshold
- **Exhaustive**: Tries all possible feature combinations
- **Stop condition**: Stop when adding a feature no longer increases performance beyond a certain threshold, or when removing a feature would decrease it beyond a certain threshold
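
The step forward procedure can be sketched as follows. This is a minimal standard-library illustration and not the mlxtend implementation: the scoring model (a 1-nearest-neighbour classifier evaluated with leave-one-out accuracy), the toy `rows` and `labels`, and the default `threshold` are all assumptions chosen to keep the example small.

```python
def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the columns listed in `features`."""
    if not features:
        return 0.0
    correct = 0
    for i, row in enumerate(rows):
        others = [j for j in range(len(rows)) if j != i]
        nearest = min(
            others,
            key=lambda j: sum((row[f] - rows[j][f]) ** 2 for f in features),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)


def step_forward(rows, labels, threshold=0.01):
    """Greedily add the feature with the best score; stop when the
    increase is not larger than `threshold`."""
    remaining = list(range(len(rows[0])))
    selected, best_score = [], 0.0
    while remaining:
        scores = {f: loo_accuracy(rows, labels, selected + [f]) for f in remaining}
        candidate = max(scores, key=scores.get)
        if scores[candidate] - best_score <= threshold:  # Increase > Threshold fails
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = scores[candidate]
    return selected, best_score


# Hypothetical data: columns 0 and 1 are jointly informative, column 2 is noise.
rows = [
    (1, 3, 5),
    (2, 1, 0),
    (4, 0, 5),
    (5, 3, 0),
    (7, 2, 5),
    (8, 0, 0),
]
labels = [0, 0, 0, 1, 1, 1]

selected, score = step_forward(rows, labels)
print(selected, score)
```

Step backward mirrors this loop: start from all features, remove the one whose removal hurts the score least, and stop once the decrease exceeds the threshold.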


## Embedded Method 

- Embedded methods perform feature selection as part of the **model construction process** and consider the **interaction between the model and the features**.
- Embedded methods are faster than wrapper methods and more accurate than filter methods.

- **Regularization**: Adds a penalty on the parameters of the model to reduce the freedom of the model. Helps to **improve the generalization ability of the model**.
    - **Lasso (L1)**: Shrinks some **coefficients to exactly zero (feature elimination)**
    - **Ridge (L2)**: As the **penalty increases, the coefficients approach zero** but do not reach it (no feature is eliminated)
- **Tree**: Build a machine learning model (decision tree, random forest or gradient boosting) and calculate **feature importance**. Remove the least important feature and repeat until a stopping condition is met.
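
The difference between the L1 and L2 penalties can be seen in a small coordinate descent sketch. Everything here is an assumption for illustration: the objective scaling, the function names, and the toy data, which uses two centred, mutually orthogonal columns so that no intercept is needed. It is not how any particular library implements Lasso or Ridge.

```python
def soft_threshold(value, amount):
    """Shrink `value` towards zero by `amount`, clipping at zero."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_penalised(columns, y, l1=0.0, l2=0.0, sweeps=100):
    """Coordinate descent for
        0.5 * ||y - Xw||^2 + l1 * ||w||_1 + 0.5 * l2 * ||w||^2
    Assumes centred columns and target, so no intercept is fitted."""
    w = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Residual with feature j's own contribution left out.
            partial = [
                yi - sum(w[k] * columns[k][i] for k in range(len(columns)) if k != j)
                for i, yi in enumerate(y)
            ]
            rho = sum(c * r for c, r in zip(col, partial))
            z = sum(c * c for c in col)
            w[j] = soft_threshold(rho, l1) / (z + l2)
    return w


# Hypothetical data: y = 3 * strong + 0.1 * weak
strong = [1.0, -1.0, 1.0, -1.0]
weak = [1.0, 1.0, -1.0, -1.0]
y = [3.1, -2.9, 2.9, -3.1]

lasso_w = fit_penalised([strong, weak], y, l1=1.0)
ridge_w = fit_penalised([strong, weak], y, l2=1.0)

print("lasso:", lasso_w)
print("ridge:", ridge_w)
```

With the same penalty strength, the L1 fit sets the weak feature's coefficient to exactly zero, while the L2 fit shrinks both coefficients but keeps both features.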


# References

- https://analyticsindiamag.com/ai-trends/study-notes-on-machine-learning-pipeline-feature-engineering-feature-selection-and-hyper-parameters-optimization/

- https://analyticsindiamag.com/ai-trends/common-feature-engineering-techniques-to-tackle-real-world-data/