# ‚öñ ùêÉùêöùê≤ ùüó: ùêÖùêûùêöùê≠ùêÆùê´ùêû ùêíùêúùêöùê•ùê¢ùêßùê† ‚Äì ùêçùê®ùê´ùê¶ùêöùê•ùê¢ùê≥ùêöùê≠ùê¢ùê®ùêß & ùêíùê≠ùêöùêßùêùùêöùê´ùêùùê¢ùê≥ùêöùê≠ùê¢ùê®ùêß ùê¢ùêß ùêåùêöùêúùê°ùê¢ùêßùêû ùêãùêûùêöùê´ùêßùê¢ùêßùê† | ùüëùüé-ùêÉùêöùê≤ ùêåùêã ùêÇùê°ùêöùê•ùê•ùêûùêßùê†ùêû



Feature scaling is a crucial preprocessing step in machine learning. Many algorithms perform better when numerical features are on a similar scale. Today, we'll explore Normalization and Standardization, two widely used techniques.



## 🔍 Why Feature Scaling?

✅ Improves model performance: some ML algorithms, especially distance-based and gradient-based ones, are sensitive to differences in feature scale.

✅ Speeds up training: gradient descent usually converges faster when features are scaled.

✅ Enhances comparability: all features end up on a similar range, so no single feature dominates just because of its units.
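To make the first point concrete, here is a minimal sketch of how an unscaled feature can dominate a Euclidean distance. The sample points, the feature meanings (age, income), and the helper names `euclidean` and `min_max_columns` are illustrative assumptions, not part of the original notebook.

```python
import math

# Hypothetical samples: (age in years, income in dollars)
a = (25, 50_000)
b = (60, 50_500)   # very different age, almost the same income
c = (26, 52_000)   # almost the same age, slightly different income
d = (40, 90_000)   # extra point that widens the income range

def euclidean(p, q):
    """Plain Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def min_max_columns(rows):
    """Rescale each column of `rows` to [0, 1] using that column's min and max."""
    cols = list(zip(*rows))
    scaled = [[(v - min(col)) / (max(col) - min(col)) for v in col] for col in cols]
    return [tuple(r) for r in zip(*scaled)]

# On raw values, income differences swamp age differences.
raw_ab, raw_ac = euclidean(a, b), euclidean(a, c)

# After min-max scaling, both features contribute on the same range.
sa, sb, sc, sd = min_max_columns([a, b, c, d])
scaled_ab, scaled_ac = euclidean(sa, sb), euclidean(sa, sc)

print(raw_ab < raw_ac)        # True: on raw data, b looks closer to a
print(scaled_ab < scaled_ac)  # False: after scaling, c is closer to a
```

On the raw data the 35-year age gap between `a` and `b` is almost invisible next to the income column, which is the kind of effect that hurts distance-based models.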



## 📌 Normalization (Min-Max Scaling)

Normalization (also called Min-Max Scaling) transforms features to a fixed range, typically [0,1] or [-1,1].
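For reference, the usual min-max formula for a single feature $x$ is written out here, where $x_{\min}$ and $x_{\max}$ are the smallest and largest observed values of that feature. The second line is the common generalization to an arbitrary target range $[a, b]$; the symbols $a$ and $b$ are notation introduced for this sketch.

$$
x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}
\qquad\qquad
x'_{[a,b]} = a + (b - a)\,\frac{x - x_{\min}}{x_{\max} - x_{\min}}
$$

Setting $a = 0$, $b = 1$ recovers the first formula, and $a = -1$, $b = 1$ gives the $[-1, 1]$ variant.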



✅ Often a good fit for neural networks and distance-based models (e.g., KNN, K-Means).

🔹 By default, transforms values to lie between 0 and 1.

🔹 Sensitive to outliers: a single extreme value stretches the range and compresses all the other values.
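The outlier sensitivity mentioned above can be seen in a small sketch. The helper `min_max` and the list `with_outlier` are hypothetical, written in plain Python for illustration rather than taken from the notebook.

```python
def min_max(values):
    """Rescale a list of numbers to [0, 1] using its own min and max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

clean = [10, 20, 30, 40, 50]
with_outlier = [10, 20, 30, 40, 1000]  # hypothetical: last value replaced by an outlier

print(min_max(clean))
# [0.0, 0.25, 0.5, 0.75, 1.0]

print([round(v, 3) for v in min_max(with_outlier)])
# [0.0, 0.01, 0.02, 0.03, 1.0]
```

With the outlier present, the four ordinary values are squeezed into the bottom 3% of the range, so most of the [0, 1] interval carries no information.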



## 📌 Standardization (Z-Score Scaling)

Standardization (also called Z-score normalization) transforms features to have zero mean and unit variance.
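In symbols, the standard z-score transformation for a feature $x$ with mean $\mu$ and standard deviation $\sigma$ is shown here. The notation assumes the population standard deviation over the $n$ observed values $x_1, \dots, x_n$.

$$
z = \frac{x - \mu}{\sigma},
\qquad
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2}
$$

After this transformation the feature has mean 0 and variance 1, but its values are not confined to a fixed interval.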



✅ Often a good fit for algorithms like Logistic Regression, SVM, PCA, and Linear Regression.

🔹 Works well for approximately normally distributed data.

🔹 Less sensitive to outliers than Min-Max Scaling, although the mean and standard deviation are still pulled by extreme values.



## 🚀 When to Use Which?

🔹 Use Normalization if the data does not follow a Gaussian distribution, or when working with models such as KNN, K-Means, and neural networks.

🔹 Use Standardization if the data is roughly normally distributed, or when working with algorithms such as SVM, Linear Regression, or PCA.



## 📌 Summary & Key Takeaways

✅ Scaling matters for the performance of many models.

✅ Normalization (Min-Max) rescales data to the range [0, 1].

✅ Standardization (Z-score) rescales each feature to zero mean and unit variance.

✅ Different algorithms prefer different scaling techniques.

## 📌 Normalization (Min-Max Scaling)

![image.png](attachment:image.png)

In [None]:
import pandas as pd
import numpy as np

from sklearn.preprocessing import MinMaxScaler

In [2]:
# Sample dataset
data = {'feature1': [10, 20, 30, 40, 50], 'feature2': [200, 400, 600, 800, 1000]}
df = pd.DataFrame(data)

In [3]:
df

   feature1  feature2
0        10       200
1        20       400
2        30       600
3        40       800
4        50      1000


In [4]:
# Applying Min-Max Scaling
scaler = MinMaxScaler()
normalized_data = scaler.fit_transform(df)

In [5]:
# Creating a new DataFrame
df_normalized = pd.DataFrame(normalized_data, columns=df.columns)
print(df_normalized)

   feature1  feature2
0      0.00      0.00
1      0.25      0.25
2      0.50      0.50
3      0.75      0.75
4      1.00      1.00
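Both columns come out identical because each one is evenly spaced between its own minimum and maximum. The same numbers can be reproduced by hand, and the idea extends to the [-1, 1] range mentioned earlier. This is a plain-Python sketch; the helper `min_max` and its `new_min` / `new_max` parameters are hypothetical names (scikit-learn's `MinMaxScaler` exposes a comparable `feature_range` argument).

```python
def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale `values` linearly so that min -> new_min and max -> new_max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

feature2 = [200, 400, 600, 800, 1000]  # same values as the 'feature2' column

print(min_max(feature2))
# [0.0, 0.25, 0.5, 0.75, 1.0]

print(min_max(feature2, -1.0, 1.0))
# [-1.0, -0.5, 0.0, 0.5, 1.0]
```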


## 📌 Standardization (Z-Score Scaling)

![image.png](attachment:image.png)

In [6]:
from sklearn.preprocessing import StandardScaler

In [7]:
# Applying Standardization
scaler = StandardScaler()
standardized_data = scaler.fit_transform(df)


In [8]:
# Creating a new DataFrame
df_standardized = pd.DataFrame(standardized_data, columns=df.columns)
print(df_standardized)

   feature1  feature2
0 -1.414214 -1.414214
1 -0.707107 -0.707107
2  0.000000  0.000000
3  0.707107  0.707107
4  1.414214  1.414214
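The extreme values of ±1.414214 reflect the fact that `StandardScaler` divides by the population standard deviation (dividing by n), not the sample standard deviation (dividing by n - 1) that pandas' `std()` uses by default. A small standard-library sketch, with hypothetical variable names, shows the difference on the `feature1` values:

```python
import statistics

feature1 = [10, 20, 30, 40, 50]  # same values as the 'feature1' column

mean = statistics.mean(feature1)
pop_std = statistics.pstdev(feature1)    # population std: divides by n
sample_std = statistics.stdev(feature1)  # sample std: divides by n - 1

z_pop = [round((v - mean) / pop_std, 6) for v in feature1]
z_sample = [round((v - mean) / sample_std, 6) for v in feature1]

print(z_pop)     # matches the StandardScaler output above
print(z_sample)  # smaller magnitudes, because sample_std > pop_std
```

If hand-computed z-scores do not match the scaler's output, this choice of denominator is a likely cause.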
