In machine learning, normalization most often refers to rescaling each feature to the range [0, 1]. This technique is also known as min-max scaling.

The formula for normalization is:

\[ x_{\text{norm}} = \frac{x - x_{\text{min}}}{x_{\text{max}} - x_{\text{min}}} \]

where:
- \( x \) is the original feature value,
- \( x_{\text{min}} \) is the minimum value of the feature,
- \( x_{\text{max}} \) is the maximum value of the feature, and
- \( x_{\text{norm}} \) is the normalized feature value.

Normalization is useful when features have different ranges or units and the algorithm you are using requires them to be on a similar scale. It can also speed up the convergence of gradient-based learning algorithms, because no single feature dominates the size of the updates.
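As a minimal sketch, the min-max formula can be applied column by column with NumPy. The function name `min_max_normalize`, the toy `ages` array, and the choice to map a constant column to 0 (to avoid dividing by zero) are assumptions made for illustration, not part of the formula itself.

```python
import numpy as np

def min_max_normalize(X):
    """Rescale each column of X to [0, 1] with (x - x_min) / (x_max - x_min)."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    span = col_max - col_min
    # Assumption: a constant column (span == 0) is mapped to 0 instead of dividing by zero.
    safe_span = np.where(span == 0, 1.0, span)
    return (X - col_min) / safe_span

# Hypothetical single feature with values 20, 30, 40.
ages = np.array([[20.0], [30.0], [40.0]])
print(min_max_normalize(ages).ravel())  # values: 0.0, 0.5, 1.0
```

The smallest value maps to 0, the largest to 1, and everything else falls proportionally in between.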

Standardization, also known as z-score normalization, is a technique used in machine learning to rescale each feature so that it has a mean of 0 and a standard deviation of 1. It does not make the feature normally distributed: the shape of the distribution is unchanged, so a skewed feature remains skewed. This transformation is important because it keeps features with larger numeric scales from dominating the learning algorithm.

The formula for standardization is:

\[ x_{\text{std}} = \frac{x - \mu}{\sigma} \]

where:
- \( x \) is the original feature value,
- \( \mu \) is the mean of the feature values,
- \( \sigma \) is the standard deviation of the feature values, and
- \( x_{\text{std}} \) is the standardized feature value.
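The z-score formula can be sketched directly in NumPy. The function name `standardize` and the toy `scores` array are hypothetical. The sketch assumes the population standard deviation (`ddof=0`), which is the convention scikit-learn's `StandardScaler` uses, and it assumes a constant column should be left at 0 instead of dividing by zero.

```python
import numpy as np

def standardize(X):
    """Rescale each column of X to mean 0 and standard deviation 1."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)  # population standard deviation (ddof=0)
    # Assumption: a constant column (sigma == 0) is mapped to 0 instead of dividing by zero.
    safe_sigma = np.where(sigma == 0, 1.0, sigma)
    return (X - mu) / safe_sigma

# Hypothetical single feature with values 2, 4, 6 (mean 4, std about 1.633).
scores = np.array([[2.0], [4.0], [6.0]])
z = standardize(scores)
print(z.ravel())                 # roughly -1.2247, 0.0, 1.2247
print(z.mean(axis=0), z.std(axis=0))
```

Unlike min-max scaling, the result is not confined to a fixed interval; values simply express how many standard deviations each observation lies from the mean.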

Standardization is commonly applied to features before training a model, especially in algorithms that are sensitive to the scale of the input features, such as support vector machines (SVMs) or k-means clustering.
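One way to apply standardization before training, sketched here under stated assumptions, is to chain the scaler and the model with scikit-learn's `make_pipeline`, so the scaling learned from the training data is reused at prediction time. The toy arrays `X` and `y` are invented for illustration, and `SVC` with an RBF kernel is just one example of a scale-sensitive model.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical toy data: feature 0 is on a small scale, feature 1 on a much larger one.
X = np.array([
    [1.0, 1000.0], [2.0, 3000.0], [3.0, 2000.0],
    [8.0, 1500.0], [9.0, 2500.0], [10.0, 1200.0],
])
y = np.array([0, 0, 0, 1, 1, 1])

# The pipeline fits the scaler on X, then trains the SVM on the standardized features.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X, y)

# New samples are standardized with the training mean and standard deviation before prediction.
predictions = model.predict(np.array([[2.5, 2200.0], [9.5, 1800.0]]))
print(predictions)
```

Without the scaler, distances between samples would be driven almost entirely by the second feature, simply because its numbers are larger.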

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Create a dummy dataset: 5 samples, 3 features, integer values in [0, 100)
np.random.seed(0)
data = np.random.randint(0, 100, (5, 3)).astype(float)
print("Original Data:")
print(data)

# Standardization: each column is rescaled to mean 0 and standard deviation 1
scaler = StandardScaler()
data_standardized = scaler.fit_transform(data)
print("\nStandardized Data:")
print(data_standardized)

# Normalization: each column is rescaled to the range [0, 1]
min_max_scaler = MinMaxScaler()
data_normalized = min_max_scaler.fit_transform(data)
print("\nNormalized Data:")
print(data_normalized)
```


```text
Original Data:
[[44. 47. 64.]
 [67. 67.  9.]
 [83. 21. 36.]
 [87. 70. 88.]
 [88. 12. 58.]]

Standardized Data:
[[-1.78420724  0.15308204  0.48610445]
 [-0.40713454  1.00353779 -1.57049131]
 [ 0.55082908 -0.95251044 -0.56088975]
 [ 0.79031998  1.13110615  1.38352806]
 [ 0.85019271 -1.33521553  0.26174855]]

Normalized Data:
[[0.         0.60344828 0.69620253]
 [0.52272727 0.94827586 0.        ]
 [0.88636364 0.15517241 0.34177215]
 [0.97727273 1.         1.        ]
 [1.         0.         0.62025316]]
```
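
As a quick sanity check on output like this, the defining properties of each transformation can be verified per column. The sketch below rebuilds the same dataset from the same seed so that it runs on its own; the variable names `col_means`, `col_stds`, `col_mins`, and `col_maxs` are introduced here for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Rebuild the same dummy dataset so this check is self-contained.
np.random.seed(0)
data = np.random.randint(0, 100, (5, 3)).astype(float)

data_standardized = StandardScaler().fit_transform(data)
data_normalized = MinMaxScaler().fit_transform(data)

# Standardized columns should have mean 0 and standard deviation 1 (up to rounding).
col_means = data_standardized.mean(axis=0)
col_stds = data_standardized.std(axis=0)
print(np.allclose(col_means, 0.0), np.allclose(col_stds, 1.0))  # True True

# Normalized columns should have minimum 0 and maximum 1.
col_mins = data_normalized.min(axis=0)
col_maxs = data_normalized.max(axis=0)
print(np.allclose(col_mins, 0.0), np.allclose(col_maxs, 1.0))  # True True
```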
