# **Feature Engineering**

Feature engineering is the process of transforming raw data into meaningful inputs (features) that improve a machine learning model’s performance. It’s one of the most important steps in building a successful ML model.  

Think of it like cooking:  
- **Raw ingredients** = Raw data  
- **Chopping, mixing, and seasoning** = Feature engineering  
- **Delicious dish** = A well-performing ML model  

---

### **Why is Feature Engineering Important?**  
A machine learning model is only as good as the data it learns from. Even if you have a powerful model, poor-quality features can lead to bad predictions. Good features help the model understand patterns better, leading to improved accuracy.  

---

## **Feature Transformation**

- #### **Handling Missing Values (Imputation)**

    - Replacing missing values with a meaningful estimate such as mean, median, or mode.
    - Example: Filling missing ages in a dataset with the average age.
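
As a minimal sketch of the age example, assuming pandas is available and using a made-up `age` column, mean and median imputation might look like this:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; NaN marks a missing age.
df = pd.DataFrame({"age": [22.0, np.nan, 35.0, np.nan, 41.0]})

# Mean imputation: fill each gap with the average of the observed ages.
mean_age = df["age"].mean()  # NaN values are skipped by default
df["age_mean_filled"] = df["age"].fillna(mean_age)

# Median imputation is often preferred when the column contains outliers.
df["age_median_filled"] = df["age"].fillna(df["age"].median())
```

In a real project the fill value would normally be computed on the training split only and then reused on validation and test data.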

- #### **Handling Categorical Variables**

    - Converting categorical variables (e.g., dog, cat, sheep) into numerical values.
    - **One-Hot Encoding**: Representing categories as binary columns.
    - Example:
      - Dog → [1, 0, 0]
      - Cat → [0, 1, 0]
      - Sheep → [0, 0, 1]
    - **Ordinal Encoding**: Assigning ordered numerical values to categories.
    - Example: Small (1), Medium (2), Large (3)
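
The two encodings above can be sketched with pandas. The column names and rows here are illustrative assumptions, not taken from a real dataset:

```python
import pandas as pd

df = pd.DataFrame({
    "animal": ["dog", "cat", "sheep", "cat"],
    "size": ["Small", "Large", "Medium", "Small"],
})

# One-hot encoding: one binary column per category, with no order implied.
one_hot = pd.get_dummies(df["animal"], prefix="animal", dtype=int)

# Ordinal encoding: an explicit mapping preserves Small < Medium < Large.
size_order = {"Small": 1, "Medium": 2, "Large": 3}
df["size_encoded"] = df["size"].map(size_order)
```

Note that `pd.get_dummies` orders the new columns alphabetically (cat, dog, sheep), so the position of the 1 differs from the hand-written example even though the idea is the same.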

- #### **Binning Continuous Features**

    - Converting numerical features into categorical ranges.
    - Example: Age → {0-12: Child, 13-19: Teen, 20+: Adult}
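
A possible way to reproduce the age ranges above is `pandas.cut`. The sample ages are made up, and the lower edge of -1 is an assumption chosen so that age 0 falls into the first bin:

```python
import pandas as pd

ages = pd.Series([5, 12, 13, 19, 20, 64])

# Bin edges are right-inclusive by default: (-1, 12], (12, 19], (19, inf].
age_group = pd.cut(
    ages,
    bins=[-1, 12, 19, float("inf")],
    labels=["Child", "Teen", "Adult"],
)
```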

- #### **Outlier Detection and Handling**

    - Detecting extreme values that can distort model training.
    - Detection methods: Z-score, IQR method.
    - Handling methods: Winsorization, Clipping.
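
The sketch below shows one of these combinations: detecting outliers with the IQR rule and then clipping them to the IQR fences. The values are invented, and the 1.5 multiplier is the conventional choice rather than anything the notes prescribe:

```python
import pandas as pd

values = pd.Series([10, 12, 11, 13, 12, 11, 95])

# IQR rule: flag points more than 1.5 * IQR outside the quartiles.
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (values < lower) | (values > upper)

# Clipping: cap extreme values at the fences instead of dropping rows.
clipped = values.clip(lower=lower, upper=upper)
```

Here only the value 95 is flagged, and clipping replaces it with the upper fence while leaving the other rows unchanged.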

- #### **Scaling & Normalization**

    - Different numerical features have different scales (e.g., Age vs. Salary).
    - **MinMax Scaling**: Rescales data to a fixed range, usually [0,1].
    - **Standardization**: Converts data to have zero mean and unit variance.
    - **Mean Normalization**: Subtracts the mean and divides by the range (max - min), so values fall between -1 and 1.
    - **Mean Absolute Scaling**: Scaling based on mean absolute deviation.
    - Example: Converting prices from dollars to a normalized scale between 0 and 1.
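
To make the formulas concrete, here is a small NumPy sketch on made-up prices. It treats the -1 to 1 variant as mean normalization, which is one common reading of the term; libraries such as scikit-learn provide ready-made scalers for the first two:

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# Min-max scaling: (x - min) / (max - min) maps the data onto [0, 1].
min_max = (prices - prices.min()) / (prices.max() - prices.min())

# Standardization: (x - mean) / std gives zero mean and unit variance.
standardized = (prices - prices.mean()) / prices.std()

# Mean normalization: (x - mean) / (max - min) centers the data within [-1, 1].
mean_normalized = (prices - prices.mean()) / (prices.max() - prices.min())
```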

---

## **Feature Construction**

- #### **Feature Splitting**

    - Splitting an existing feature into multiple meaningful features.
    - Example: Splitting a full name column into First Name and Last Name.
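
A minimal sketch of the name example with pandas, assuming names follow a simple "first last" pattern (real name data is usually messier):

```python
import pandas as pd

df = pd.DataFrame({"full_name": ["Ada Lovelace", "Alan Turing", "Grace Hopper"]})

# Split on the first space only, so multi-word last names stay together.
df[["first_name", "last_name"]] = df["full_name"].str.split(" ", n=1, expand=True)
```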

- #### **Feature Grouping**

    - Combining related features to create more meaningful representations.
    - Example: Grouping similar product categories together in a dataset.
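
One simple way to group categories is a lookup table. The category names and groups below are hypothetical and only illustrate the idea:

```python
import pandas as pd

df = pd.DataFrame({"category": ["laptop", "sofa", "phone", "chair", "tablet"]})

# Hypothetical mapping from fine-grained categories to broader groups.
category_groups = {
    "laptop": "electronics",
    "phone": "electronics",
    "tablet": "electronics",
    "sofa": "furniture",
    "chair": "furniture",
}
df["category_group"] = df["category"].map(category_groups)
```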

- #### **Creating New Features from Existing Ones**

    - Example: Instead of using "Date of Birth," deriving "Age".
    - **Titanic Dataset Example**:
      - Given columns: `SibSp` (Number of siblings/spouses) and `Parch` (Number of parents/children).
      - New Feature: `Family Size` = `SibSp + Parch` (relatives aboard, not counting the passenger).
      - Further Classification:
        - `0: Alone`
        - `1-4: Small Family`
        - `>4: Large Family`
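
The Titanic example could be written as follows. The rows are invented; only the column names `SibSp` and `Parch` come from the dataset, and `FamilySize` and `FamilyType` are names chosen for this sketch:

```python
import pandas as pd

# Hypothetical rows using the Titanic column names SibSp and Parch.
df = pd.DataFrame({"SibSp": [0, 1, 3, 0], "Parch": [0, 2, 2, 1]})

# Relatives aboard, not counting the passenger.
df["FamilySize"] = df["SibSp"] + df["Parch"]

def family_type(size: int) -> str:
    """Map a family size to the three classes listed above."""
    if size == 0:
        return "Alone"
    if size <= 4:
        return "Small Family"
    return "Large Family"

df["FamilyType"] = df["FamilySize"].apply(family_type)
```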

- #### **Feature Interaction**

    - Creating new features by combining two or more existing features.
    - Example: In an e-commerce dataset, multiplying `price` and `quantity_sold` to create a `total_revenue` feature.
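
As a short sketch of the revenue example, with made-up rows:

```python
import pandas as pd

df = pd.DataFrame({"price": [9.99, 20.0, 5.5], "quantity_sold": [3, 1, 10]})

# Interaction feature: the product of two existing columns.
df["total_revenue"] = df["price"] * df["quantity_sold"]
```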

- #### **Polynomial Features**

    - Creating higher-order features to capture non-linear relationships.
    - Example: Given `feature X`, generating `X²`, `X³`, etc., to enhance model expressiveness.
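
For a single feature, the powers can be generated directly with NumPy. This is a minimal sketch; scikit-learn's `PolynomialFeatures` additionally handles cross terms between several features:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
degree = 3

# Columns are x^1, x^2, ..., x^degree.
poly = np.column_stack([x ** d for d in range(1, degree + 1)])
```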

- #### **Time-Based Feature Engineering**

    - Extracting useful features from time-related data.
    - Example: Extracting `day_of_week`, `month`, or `hour` from a timestamp for a sales forecasting model.
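
With pandas, these parts can be read off the `.dt` accessor. The timestamps are arbitrary, and `is_weekend` is an extra illustrative feature not mentioned above:

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-01 09:30:00", "2024-03-16 18:05:00"])
})

df["day_of_week"] = df["timestamp"].dt.dayofweek  # Monday = 0, Sunday = 6
df["month"] = df["timestamp"].dt.month
df["hour"] = df["timestamp"].dt.hour
df["is_weekend"] = df["day_of_week"] >= 5
```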

---

## **Feature Selection**

Selecting the most relevant features to improve model efficiency and performance.

- #### **Methods of Feature Selection**

    - **Forward Selection**: Starting with no features and iteratively adding the one that improves model performance the most.
    - **Backward Elimination**: Starting with all features and removing the least important one at a time.
    - **Example**: In house price prediction, keeping features like `size` and `location` while removing `house color`.
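
Forward selection can be sketched in plain NumPy. This version assumes a linear regression model scored by mean squared error on a hold-out split, and uses synthetic data in which one column is pure noise (a stand-in for something like house color). The function names are illustrative:

```python
import numpy as np

def validation_mse(X_train, y_train, X_val, y_val):
    """Fit ordinary least squares on the training split, score on validation."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_val = np.column_stack([np.ones(len(X_val)), X_val])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((A_val @ coef - y_val) ** 2))

def forward_selection(X_train, y_train, X_val, y_val, n_features):
    """Greedily add the feature that lowers validation error the most."""
    selected = []
    remaining = list(range(X_train.shape[1]))
    while remaining and len(selected) < n_features:
        scores = {
            j: validation_mse(X_train[:, selected + [j]], y_train,
                              X_val[:, selected + [j]], y_val)
            for j in remaining
        }
        best = min(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic data: the target depends on columns 0 and 2; column 1 is noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 2.0 * X[:, 2] + rng.normal(scale=0.1, size=200)

chosen = forward_selection(X[:150], y[:150], X[150:], y[150:], n_features=2)
```

Backward elimination follows the same pattern in reverse: start from all columns and repeatedly drop the one whose removal hurts the validation score the least.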

- #### **Example: MNIST Dataset**

    - 70,000 images of handwritten digits (60,000 for training, 10,000 for testing).
    - Low-resolution images: 28 × 28 = 784 features.
    - Using **Feature Selection**, keeping only the central pixels where digit strokes appear and dropping border pixels that are almost always empty.
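
The idea can be illustrated without downloading MNIST. The sketch below builds synthetic 28 × 28 images in which only a central 20 × 20 region ever contains ink (an assumption made for the demonstration), then keeps only the pixels whose value varies across images:

```python
import numpy as np

# Synthetic stand-in for MNIST: 100 images of 28 x 28 pixels where only the
# central 20 x 20 region is ever non-zero. Real MNIST data is not loaded here.
rng = np.random.default_rng(0)
images = np.zeros((100, 28, 28))
images[:, 4:24, 4:24] = rng.random((100, 20, 20))

X = images.reshape(len(images), -1)  # flatten to 784 features per image

# Keep only pixels whose value actually varies across the dataset.
keep = X.var(axis=0) > 0.0
X_selected = X[:, keep]
```

On this toy data the 784 features shrink to the 400 central pixels. On real images a small positive variance threshold is usually more practical than exactly zero.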

---

## **Feature Extraction**

Deriving a smaller set of new features from the original ones while preserving as much relevant information as possible.

- #### **Example: Principal Component Analysis (PCA)**

    - Projects the data onto the directions of greatest variance, reducing dimensionality while retaining as much variance as possible.
    - Example: Converting 100 features into 10 principal components.
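
A minimal sketch of the 100-to-10 reduction, computing PCA through the singular value decomposition in NumPy. The data is random and only serves to show the shapes; in practice a library class such as scikit-learn's `PCA` would typically be used instead:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))  # synthetic data: 500 samples, 100 features

# 1. Center each feature so components describe variance around the mean.
X_centered = X - X.mean(axis=0)

# 2. The right singular vectors of the centered data are the principal axes,
#    ordered by decreasing singular value (and hence decreasing variance).
_, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)

# 3. Project onto the top 10 axes.
n_components = 10
X_reduced = X_centered @ Vt[:n_components].T

# Fraction of the total variance retained by the kept components.
explained = (singular_values[:n_components] ** 2).sum() / (singular_values ** 2).sum()
```

Because this toy data has no real structure, the retained fraction is small. On correlated real-world features a handful of components often captures most of the variance.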

- #### **Example: Real Estate Dataset**

    - Given features: `Rooms`, `Washrooms`, `Price`.
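    - Instead of using `Rooms` and `Washrooms` separately, combining them into a single size-related feature such as `Total Square Feet Area` (which requires knowing or estimating the area of each room), and then using that as a feature.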

- #### **Other Dimensionality Reduction Techniques**

    - **Linear Discriminant Analysis (LDA)**
    - **Autoencoders** (Neural Network-based dimensionality reduction)

---

### **Key Takeaways**

- ✅ Feature engineering can improve model accuracy and efficiency.
- ✅ Transforming, constructing, selecting, and extracting features are crucial steps.
- ✅ Choosing the right features helps reduce computational cost and the risk of overfitting.


