# Feature Scaling Notes

Feature scaling is a preprocessing step in machine learning that puts the independent variables (features) on a comparable range. Many algorithms are sensitive to the scale of their inputs, particularly those based on distance calculations (e.g., k-nearest neighbors, support vector machines) or trained with gradient descent (e.g., linear regression fit iteratively). Without scaling, features with larger ranges can dominate the learning process, leading to biased models or slow convergence.
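
To make the dominance effect concrete, here is a minimal sketch using made-up data. The age and income columns and the helper `age_share` are illustrative assumptions, not part of any library. It measures how much of the squared Euclidean distance between two rows comes from the age column, before and after per-column min-max scaling.

```python
import numpy as np

# Hypothetical data: each row is [age in years, income in dollars].
X = np.array([
    [25.0, 50_000.0],
    [60.0, 51_000.0],
    [26.0, 58_000.0],
])


def age_share(a: np.ndarray, b: np.ndarray) -> float:
    """Fraction of the squared Euclidean distance between a and b due to column 0 (age)."""
    squared_diffs = (a - b) ** 2
    return float(squared_diffs[0] / squared_diffs.sum())


# Unscaled: the 35-year age gap is swamped by the 1,000-dollar income gap.
raw_share = age_share(X[0], X[1])

# Per-column min-max scaling puts both features on a 0-to-1 range.
X_scaled = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
scaled_share = age_share(X_scaled[0], X_scaled[1])

print(f"age share of squared distance, raw:    {raw_share:.4f}")
print(f"age share of squared distance, scaled: {scaled_share:.4f}")
```

On this toy data, age accounts for well under 1% of the raw squared distance between the first two rows, and for most of it once both columns share a scale.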

Below are the three main methods of feature scaling, along with detailed explanations and examples.

---

## 1. Min-Max Scaling (Normalization)

**Overview:**
- **Purpose:** Rescale the feature values to a fixed range, typically between 0 and 1.
- **Application:** Useful when the data has a known minimum and maximum, or when the algorithm requires a bounded input.
- **Intuitive Analogy:** Think of adjusting the brightness on a TV so that the darkest part is black (0) and the brightest is white (1). Everything in between is scaled accordingly.

**Formula:**

$$
X_{\text{norm}} = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}}
$$

**Key Components:**
- **$X$:** The original feature value.
- **$X_{\text{min}}$:** The minimum value in the feature.
- **$X_{\text{max}}$:** The maximum value in the feature.

**Step-by-Step Breakdown:**
1. Identify the minimum ($X_{\text{min}}$) and maximum ($X_{\text{max}}$) values in the dataset.
2. Subtract the minimum value from each data point.
3. Divide the result by the range ($X_{\text{max}} - X_{\text{min}}$).

**Practical Example:**
Imagine you have a feature representing ages ranging from 10 to 80. For a person who is 45 years old:

$$
X_{\text{norm}} = \frac{45 - 10}{80 - 10} = \frac{35}{70} = 0.5
$$

This converts the age 45 into a normalized value of 0.5.
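
The worked example can be checked with a few lines of NumPy. This is a sketch rather than a reference implementation: the function name `min_max_scale` and its `low`/`high` parameters are assumptions introduced here to show how the same formula extends to a target range other than 0 to 1.

```python
import numpy as np


def min_max_scale(x: np.ndarray, low: float = 0.0, high: float = 1.0) -> np.ndarray:
    """Rescale a 1-D feature to [low, high]; assumes x is not constant."""
    x_min, x_max = x.min(), x.max()
    unit = (x - x_min) / (x_max - x_min)  # values in [0, 1]
    return low + unit * (high - low)


ages = np.array([10.0, 45.0, 80.0])  # illustrative ages spanning 10 to 80
scaled_ages = min_max_scale(ages)
print(scaled_ages)  # 10 maps to 0, 45 maps to 0.5, 80 maps to 1
```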

---

## 2. Standardization (Z-score Normalization)

**Overview:**
- **Purpose:** Rescale the data so that it has a mean of 0 and a standard deviation of 1.
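- **Application:** Particularly useful for models trained with gradient descent or regularization (e.g., logistic regression, neural networks) and for features without natural bounds. These models do not require normally distributed inputs; standardization simply puts every feature on a comparable scale, and the result is easiest to interpret as a z-score when the feature is roughly bell-shaped.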
- **Intuitive Analogy:** Think of standardization like converting different currencies to a single currency using an exchange rate; the conversion aligns the scale of different values.

**Formula:**

$$
X_{\text{std}} = \frac{X - \mu}{\sigma}
$$

**Key Components:**
- **$X$:** The original feature value.
- **$\mu$:** The mean of the feature.
- **$\sigma$:** The standard deviation of the feature.

**Step-by-Step Breakdown:**
1. Calculate the mean ($\mu$) of the feature.
2. Compute the standard deviation ($\sigma$) of the feature.
3. Subtract the mean from each feature value.
4. Divide by the standard deviation.

**Practical Example:**
If the mean age is 50 and the standard deviation is 15, then for an age of 65:

$$
X_{\text{std}} = \frac{65 - 50}{15} = \frac{15}{15} = 1
$$

This indicates that an age of 65 is 1 standard deviation above the mean.
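
The same calculation can be sketched in NumPy. The sample below is made up so that its mean is 50 and its population standard deviation is 15, matching the example; real data would of course have its own statistics.

```python
import numpy as np

# Illustrative ages chosen so the mean is 50 and the population standard deviation is 15.
ages = np.array([35.0, 65.0, 35.0, 65.0])

mu = ages.mean()
sigma = ages.std()  # NumPy's default is the population standard deviation (ddof=0)
z_scores = (ages - mu) / sigma

print(mu, sigma)
print(z_scores)  # 65 maps to 1.0, 35 maps to -1.0
```

Passing `ddof=1` to `std` would use the sample standard deviation instead, which gives slightly smaller z-scores on small datasets.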

---

## 3. Mean Normalization

**Overview:**
- **Purpose:** Adjusts the data so that its mean becomes 0, while also scaling the values based on the range of the feature.
- **Application:** Often used when you want the data centered around zero with every value falling between -1 and 1. Because the divisor is the range, the scaled values span a total width of exactly 1.
- **Intuitive Analogy:** Imagine centering a seesaw so that the balance point is at zero, and all the weights are measured relative to the ends of the seesaw.

**Formula:**

$$
X_{\text{norm}} = \frac{X - \mu}{X_{\text{max}} - X_{\text{min}}}
$$

**Key Components:**
- **$X$:** The original feature value.
- **$\mu$:** The mean of the feature.
- **$X_{\text{min}}$:** The minimum value of the feature.
- **$X_{\text{max}}$:** The maximum value of the feature.

**Step-by-Step Breakdown:**
1. Calculate the mean ($\mu$) of the feature.
2. Determine the range by subtracting $X_{\text{min}}$ from $X_{\text{max}}$.
3. Subtract the mean from each feature value.
4. Divide by the range to obtain the normalized value.

**Practical Example:**
For a feature where the minimum is 20, the maximum is 100, and the mean is 60, for a value of 80:

$$
X_{\text{norm}} = \frac{80 - 60}{100 - 20} = \frac{20}{80} = 0.25
$$

This means the value 80 lies above the mean by 25% of the feature's full range (20 out of 80).
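
A short NumPy sketch of the same example follows. The four values are an illustrative sample picked so that the minimum is 20, the maximum is 100, and the mean is 60.

```python
import numpy as np

values = np.array([20.0, 40.0, 80.0, 100.0])  # min 20, max 100, mean 60

mu = values.mean()
value_range = values.max() - values.min()
normalized = (values - mu) / value_range

print(normalized)  # 80 maps to 0.25
```

The result has a mean of 0, and the distance between its smallest and largest entries is 1.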

---

## Summary

- **Feature Scaling** is used to ensure that each feature contributes equally to the model by standardizing their ranges.
- **Min-Max Scaling** normalizes data to a fixed range (0 to 1) and is best when the bounds are known.
- **Standardization** transforms data to have a mean of 0 and a standard deviation of 1, making it a good default for unbounded features and especially easy to interpret when the data is roughly normally distributed.
- **Mean Normalization** centers the data around 0 and scales it based on the range, useful for certain applications requiring values in a symmetric interval.

These methods can be selected based on the requirements of your machine learning algorithm and the distribution of your data.
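
As a rough side-by-side comparison, the sketch below applies all three formulas to one made-up feature and prints summary statistics for each result. The data and the `results` dictionary are illustrative assumptions.

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0, 100.0])  # illustrative feature with one large value

x_range = x.max() - x.min()
results = {
    "min-max": (x - x.min()) / x_range,
    "standardization": (x - x.mean()) / x.std(),
    "mean normalization": (x - x.mean()) / x_range,
}

for name, scaled in results.items():
    print(f"{name:>18}: min={scaled.min():+.3f} max={scaled.max():+.3f} "
          f"mean={scaled.mean():+.3f} std={scaled.std():.3f}")
```

All three are linear rescalings of the same values, so the ordering and relative spacing of the points are unchanged; only the offset and the scale differ.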


In [2]:
import numpy as np
from numpy.typing import NDArray

def normalization(X: NDArray[np.float64]) -> NDArray[np.float64]:
    """
    Normalizes the input features using the simple min-max scaling method.

    Args:
        X: The input values (independent variables).

    Returns:
        The normalized input values.
    """

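    # Column-wise min-max scaling: (X - min) / (max - min), matching the formula above.
    min_vals = np.min(X, axis=0)
    max_vals = np.max(X, axis=0)
    return (X - min_vals) / (max_vals - min_vals)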

In [3]:
def standardization(X: NDArray[np.float64]) -> NDArray[np.float64]:
    """
    Standardizes the input features using the z-score method.

    Args:
        X: The input values (independent variables).

    Returns:
        The standardized input values.
    """

    # Column-wise statistics; np.std defaults to the population standard deviation (ddof=0).
    mean_vals = np.mean(X, axis=0)
    std_devs = np.std(X, axis=0)
    return (X - mean_vals) / std_devs

In [4]:
def mean_normalization(X: NDArray[np.float64]) -> NDArray[np.float64]:
    """
    Normalizes the input features using the mean normalization method.

    Args:
        X: The input values (independent variables).

    Returns:
        The normalized input values.
    """

    mean_vals = np.mean(X, axis=0)
    max_vals = np.max(X, axis=0)
    min_vals = np.min(X, axis=0)
    return (X - mean_vals) / (max_vals - min_vals)
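
Dividing by a per-column range or standard deviation, as the functions above do, breaks down for a constant column: the divisor is zero, so NumPy returns NaN for that column and emits a runtime warning. One possible guard is sketched below. It assumes that mapping a constant column to 0 is acceptable for your use case, and the name `safe_min_max` is hypothetical.

```python
import numpy as np


def safe_min_max(X: np.ndarray) -> np.ndarray:
    """Column-wise min-max scaling that maps constant columns to 0 instead of NaN."""
    min_vals = X.min(axis=0)
    ranges = X.max(axis=0) - min_vals
    # Replace zero ranges with 1 so the division is defined; the numerator is 0 there anyway.
    safe_ranges = np.where(ranges == 0, 1.0, ranges)
    return (X - min_vals) / safe_ranges


X = np.array([[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]])  # second column is constant
print(safe_min_max(X))
```

The same `np.where` pattern could be applied to the standard deviation in `standardization` and to the range in `mean_normalization`.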