# Feature Scaling

- Feature scaling is a preprocessing step in machine learning that transforms the features of a dataset to a common scale without distorting the relative differences within each feature. It keeps any one feature from dominating a model simply because of its units or magnitude, which matters most when features are measured in different units or span very different ranges.

## Why Use Feature Scaling?
### Improves Model Convergence:
- Algorithms trained with gradient descent converge faster when features are on a similar scale, because a single learning rate then suits every dimension.

### Prevents Dominance:
- Features with larger magnitudes can dominate those with smaller magnitudes in distance-based algorithms.

### Enhances Performance:
- For models that are sensitive to feature magnitude, scaling keeps a feature's influence from depending on the units it happens to be measured in.
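The dominance effect can be sketched with a few Euclidean distances. The four samples below, and the meaning of their columns (age in years, income in dollars), are made up purely for illustration.

```python
import numpy as np

# Hypothetical samples: [age in years, income in dollars]
X = np.array([
    [25.0, 50_000.0],   # A
    [60.0, 50_500.0],   # B: very different age, almost the same income
    [26.0, 51_000.0],   # C: almost the same age, slightly higher income
    [40.0, 60_000.0],   # D: only here to widen the income range
])

def euclidean(u, v):
    """Euclidean distance between two 1-D arrays."""
    return float(np.sqrt(np.sum((u - v) ** 2)))

# On raw data the income column decides almost everything
d_ab_raw = euclidean(X[0], X[1])
d_ac_raw = euclidean(X[0], X[2])

# Min-max scale each column to [0, 1], then measure again
X_scaled = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
d_ab_scaled = euclidean(X_scaled[0], X_scaled[1])
d_ac_scaled = euclidean(X_scaled[0], X_scaled[2])

print(f"raw:    d(A,B) = {d_ab_raw:.1f}, d(A,C) = {d_ac_raw:.1f}")
print(f"scaled: d(A,B) = {d_ab_scaled:.3f}, d(A,C) = {d_ac_scaled:.3f}")
```

On the raw values B looks closer to A than C does, even though B is 35 years older. After scaling, the ordering flips and C becomes A's nearer neighbor.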

## When to Use Feature Scaling
Feature scaling is especially important for machine learning algorithms that:

- Use distance metrics (e.g., K-Nearest Neighbors, K-Means, SVM).
- Involve gradient-based optimization (e.g., Neural Networks, Logistic Regression).
- Are sensitive to feature magnitudes (e.g., Principal Component Analysis).

Some algorithms, such as tree-based methods (e.g., Decision Trees, Random Forest, XGBoost), are generally insensitive to feature scaling.
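The insensitivity of tree-based methods can be checked empirically. The sketch below is only an illustration: the synthetic data, the column scales, and the labeling rule are all assumptions, and it compares one decision tree trained on raw features with one trained on standardized features.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic data: three features on very different scales (assumed for illustration)
X = rng.normal(size=(200, 3)) * np.array([1.0, 100.0, 10_000.0])
y = (X[:, 0] + X[:, 1] / 100.0 > 0).astype(int)

X_train, X_test = X[:150], X[150:]
y_train = y[:150]

# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Same model, same seed; only the feature scale differs
tree_raw = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
tree_scaled = DecisionTreeClassifier(random_state=0).fit(X_train_scaled, y_train)

agreement = np.mean(tree_raw.predict(X_test) == tree_scaled.predict(X_test_scaled))
print(f"Fraction of identical predictions: {agreement:.2f}")
```

Tree splits depend only on the ordering of values within a feature, and standardization preserves that ordering, so the two trees should agree on essentially every test point.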

## Common Scaling Techniques

### Min-Max Scaling (Normalization):

- Transforms features to a fixed range, usually $[0, 1]$.
- Formula: $X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$
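- Pros: Preserves the shape of the original distribution.
- Cons: Sensitive to outliers, because a single extreme value sets the minimum or maximum and compresses the remaining values.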
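A minimal NumPy sketch of the formula, showing how one outlier changes the result. The helper name `min_max_scale` and the sample values are illustrative assumptions.

```python
import numpy as np

def min_max_scale(x):
    """Scale a 1-D array to [0, 1] using (x - min) / (max - min)."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

clean = np.array([1, 2, 3, 4])
with_outlier = np.array([1, 2, 3, 4, 100])

print(min_max_scale(clean))         # values spread evenly over [0, 1]
print(min_max_scale(with_outlier))  # first four values squeezed close to 0
```

With the outlier present, the four original values all land below 0.05, which is why min-max scaling is usually paired with some form of outlier handling.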

### Standardization (Z-score Normalization):

- Centers each feature at zero and rescales it to unit variance.
- Formula: $X' = \frac{X - \mu}{\sigma}$
- Where $\mu$ is the mean and $\sigma$ is the standard deviation.
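- Pros: Well suited to approximately normally distributed data.
- Cons: Works best when the data are roughly Gaussian; the mean and standard deviation are themselves sensitive to outliers.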
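The formula can be applied by hand with NumPy as a quick check. This sketch uses the population standard deviation (`ddof=0`), which is the same convention scikit-learn's `StandardScaler` follows; the sample values are illustrative.

```python
import numpy as np

x = np.array([10.0, 15.0, 30.0, 50.0])

mu = x.mean()
sigma = x.std()  # population standard deviation (ddof=0)

z = (x - mu) / sigma

print("z-scores:", z)
print("mean:", z.mean(), "std:", z.std())
```

After the transformation the mean is zero (up to floating-point error) and the standard deviation is one.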

### Robust Scaling:

- Uses the median and interquartile range (IQR) to scale features.
- Formula: $X' = \frac{X - \text{median}(X)}{\text{IQR}(X)}$
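- Pros: Handles outliers well, because the median and IQR are barely affected by extreme values.
- Cons: Does not map values into a fixed range, so outliers remain extreme after scaling.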
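To see what the median and IQR buy you, the sketch below scales the same values with the robust formula and with the z-score formula. The data, including the outlier of 100, are made up for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # 100 is an outlier

# Robust scaling: center on the median, divide by the interquartile range
median = np.median(x)
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1
robust = (x - median) / iqr

# Z-score scaling for comparison
standard = (x - x.mean()) / x.std()

print("robust:  ", robust)
print("standard:", standard)
```

Under robust scaling the four ordinary values keep a spread of 1.5 units, while under standardization they collapse into a band less than 0.1 wide because the outlier inflates the mean and standard deviation.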

### MaxAbs Scaling:

- Scales data to the range $[-1, 1]$ by dividing each feature by its maximum absolute value.
- Useful for sparse data, because the data are not shifted and zeros stay zero.
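A short NumPy sketch of the idea, using an illustrative array with negative values and zeros:

```python
import numpy as np

x = np.array([-4.0, 0.0, 2.0, 0.0, 8.0])

# Divide by the largest absolute value; no centering is applied
scaled = x / np.max(np.abs(x))

print(scaled)
```

The signs are preserved and the zero entries remain exactly zero, which is what keeps a sparse matrix sparse.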

### Log Transformation:

- Reduces skewness by transforming data using $\log(X)$.
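- Best for positively skewed data or data that grows exponentially. The logarithm is only defined for positive values, so $\log(1 + X)$ is a common substitute when zeros are present.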
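As a rough illustration, the sketch below applies the natural logarithm to values that grow by a factor of ten at each step. The values are illustrative.

```python
import numpy as np

x = np.array([1.0, 10.0, 100.0, 1_000.0, 10_000.0])

log_x = np.log(x)

# Gaps between neighbors: wildly uneven before, constant after
print("gaps before:", np.diff(x))
print("gaps after: ", np.diff(log_x))
```

Multiplicative growth becomes additive on the log scale, so the long right tail is pulled in and the points end up evenly spaced.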

### Quantile Transformation:

- Maps data to a uniform or normal distribution using quantiles.
- Pros: Handles outliers effectively.

In [1]:
import numpy as np
import pandas as pd

## 1. Min-Max Scaling (Normalization)
Description:

Scales the data to a fixed range, typically $[0, 1]$. Formula: $X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$

In [1]:
from sklearn.preprocessing import MinMaxScaler
import numpy as np

# Example data
data = np.array([[1, 10], [2, 15], [3, 30], [4, 50]])

# Apply Min-Max Scaling
scaler = MinMaxScaler()
data_min_max_scaled = scaler.fit_transform(data)

print("Min-Max Scaled Data:\n", data_min_max_scaled)


Min-Max Scaled Data:
 [[0.         0.        ]
 [0.33333333 0.125     ]
 [0.66666667 0.5       ]
 [1.         1.        ]]


## 2. Standardization (Z-score Normalization)
Description:

Centers the data to have a mean of 0 and a standard deviation of 1. Formula: $X' = \frac{X - \mu}{\sigma}$

Where $\mu$ is the mean and $\sigma$ is the standard deviation.

In [2]:
from sklearn.preprocessing import StandardScaler

# Apply Standardization
scaler = StandardScaler()
data_standard_scaled = scaler.fit_transform(data)

print("Standardized Data:\n", data_standard_scaled)


Standardized Data:
 [[-1.34164079 -1.04418513]
 [-0.4472136  -0.7228974 ]
 [ 0.4472136   0.2409658 ]
 [ 1.34164079  1.52611672]]


## 3. Robust Scaling
Description:

Scales the data based on the median and interquartile range (IQR), making it robust to outliers. Formula: $X' = \frac{X - \text{median}(X)}{\text{IQR}(X)}$

In [3]:
from sklearn.preprocessing import RobustScaler

# Apply Robust Scaling
scaler = RobustScaler()
data_robust_scaled = scaler.fit_transform(data)

print("Robust Scaled Data:\n", data_robust_scaled)


Robust Scaled Data:
 [[-1.         -0.58823529]
 [-0.33333333 -0.35294118]
 [ 0.33333333  0.35294118]
 [ 1.          1.29411765]]


## 4. MaxAbs Scaling
Description:

Scales each feature by its maximum absolute value, resulting in a range of $[-1, 1]$. Formula: $X' = \frac{X}{\max(|X|)}$

In [4]:
from sklearn.preprocessing import MaxAbsScaler

# Apply MaxAbs Scaling
scaler = MaxAbsScaler()
data_maxabs_scaled = scaler.fit_transform(data)

print("MaxAbs Scaled Data:\n", data_maxabs_scaled)


MaxAbs Scaled Data:
 [[0.25 0.2 ]
 [0.5  0.3 ]
 [0.75 0.6 ]
 [1.   1.  ]]


## 5. Log Transformation
Description:

Reduces skewness by applying the logarithm function. Typically used for positively skewed data.

In [5]:
import numpy as np

# Apply Log Transformation
data_log_transformed = np.log1p(data)  # log1p computes log(1 + x), so a zero maps to 0 instead of -inf

print("Log Transformed Data:\n", data_log_transformed)


Log Transformed Data:
 [[0.69314718 2.39789527]
 [1.09861229 2.77258872]
 [1.38629436 3.4339872 ]
 [1.60943791 3.93182563]]


## 6. Quantile Transformation
Description:

Transforms features to follow a uniform or normal distribution using quantiles.
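A minimal sketch using scikit-learn's `QuantileTransformer` on the same example data as the earlier sections. The choices `n_quantiles=4` (matching the four samples), `output_distribution="uniform"`, and `random_state=0` are assumptions made for this illustration.

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

# Example data (same values as in the sections above)
data = np.array([[1, 10], [2, 15], [3, 30], [4, 50]])

# Apply Quantile Transformation
# n_quantiles should not exceed the number of samples, so it is set to 4 here
scaler = QuantileTransformer(n_quantiles=4, output_distribution="uniform", random_state=0)
data_quantile_transformed = scaler.fit_transform(data)

print("Quantile Transformed Data:\n", data_quantile_transformed)
```

Each value is replaced by its position in the empirical distribution of its column, so both columns end up spread evenly over $[0, 1]$ regardless of their original spacing. Passing `output_distribution="normal"` maps the same ranks onto a normal distribution instead.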