# **Scaling in Machine Learning**
Data scaling is a common preprocessing step in machine learning that involves transforming the input variables to have
a similar scale or distribution. This can help improve the performance and stability of some machine learning
algorithms, particularly those that are sensitive to the scale of the input data, such as K-nearest neighbors,
support vector machines, and gradient descent-based optimization algorithms.
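
As a rough illustration of why scale matters for distance-based models, the sketch below compares a K-nearest neighbors classifier with and without standardization. The wine dataset (`load_wine`), the default number of neighbors, and the variable names are assumptions chosen for this example only; the sections that follow use the iris dataset.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: wine is used here because its features have very different ranges
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# K-nearest neighbors on the raw features
knn_raw = KNeighborsClassifier().fit(X_train, y_train)
acc_raw = knn_raw.score(X_test, y_test)

# Same model, but the pipeline standardizes the features first;
# the scaler is fitted on the training data only
knn_scaled = make_pipeline(StandardScaler(), KNeighborsClassifier()).fit(X_train, y_train)
acc_scaled = knn_scaled.score(X_test, y_test)

print("KNN accuracy without scaling:", acc_raw)
print("KNN accuracy with StandardScaler:", acc_scaled)
```

Wrapping the scaler and the model in one pipeline is a design choice that keeps test data out of the fitted scaling statistics.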


### **1_ Standard Scaler**
Standardization rescales each feature in the dataset to have a mean of zero and a standard deviation of one. This is
done by subtracting the feature's mean and dividing by its standard deviation. Standardization is applied using the
StandardScaler class in the sklearn.preprocessing module.

In [7]:
# Load the data
from sklearn.datasets import load_iris
data = load_iris()
# Define the predictor and response variables
X = data.data
y = data.target
# Split the data into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Perform feature scaling: fit on the training set only, then reuse the fitted scaler on the test set
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("X_train: ", X_train.min(), " - ", X_train.max())
print("X_train_scaled (StandardScaler): ", X_train_scaled.min(), " - ", X_train_scaled.max())

X_train:  0.1  -  7.7
X_train_scaled (StandardScaler):  -2.373777512810883  -  2.9923757343597126
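
The subtract-the-mean, divide-by-the-standard-deviation step described above can be reproduced by hand with NumPy. This is a minimal sketch for checking the idea, not a replacement for StandardScaler; the names `mean`, `std`, and `X_train_manual` are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Per-feature statistics, computed on the training set only
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)  # population standard deviation (ddof=0)

# Manual standardization
X_train_manual = (X_train - mean) / std

# The same transformation through scikit-learn
X_train_sklearn = StandardScaler().fit_transform(X_train)

print("Manual and StandardScaler agree:", np.allclose(X_train_manual, X_train_sklearn))
print("Per-feature mean after scaling:", X_train_sklearn.mean(axis=0))
print("Per-feature std after scaling:", X_train_sklearn.std(axis=0))
```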


### **2_ Min-Max Scaler**
The min-max scaler is a technique that is often used to scale the data to a fixed range, such as 0 to 1. This is done by
subtracting the minimum value and dividing by the range (maximum minus minimum) of each feature in the dataset. The
min-max scaling is done using the MinMaxScaler class in the sklearn.preprocessing module.

In [8]:
# Load the data
from sklearn.datasets import load_iris
data = load_iris()
# Define the predictor and response variables
X = data.data
y = data.target
# Split the data into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Perform feature scaling: fit on the training set only, then reuse the fitted scaler on the test set
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("X_train: ", X_train.min(), " - ", X_train.max())
print("X_train_scaled (Min-Max Scaler): ", X_train_scaled.min(), " - ", X_train_scaled.max())

X_train:  0.1  -  7.7
X_train_scaled (Min-Max Scaler):  0.0  -  1.0
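
The subtract-the-minimum, divide-by-the-range computation can be sketched by hand as follows. The names `X_min`, `X_max`, and the `*_manual` arrays are introduced for this example. Because the minimum and maximum come from the training set, scaled test values are not guaranteed to stay inside 0 to 1, so the sketch prints the test range rather than assuming it.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Per-feature minimum and maximum, taken from the training set only
X_min = X_train.min(axis=0)
X_max = X_train.max(axis=0)

# Manual min-max scaling of both splits with the training statistics
X_train_manual = (X_train - X_min) / (X_max - X_min)
X_test_manual = (X_test - X_min) / (X_max - X_min)

# The same transformation through scikit-learn
scaler = MinMaxScaler().fit(X_train)
X_train_sklearn = scaler.transform(X_train)
X_test_sklearn = scaler.transform(X_test)

print("Train agrees:", np.allclose(X_train_manual, X_train_sklearn))
print("Test agrees:", np.allclose(X_test_manual, X_test_sklearn))
print("Scaled test range:", X_test_sklearn.min(), "-", X_test_sklearn.max())
```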


### **3_ Robust Scaler**
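The RobustScaler method scales the data based on the median and interquartile range (IQR) of each feature. This can
be useful when the data contains outliers, because the median and the IQR are less affected by extreme values than
the mean and the standard deviation. The IQR is the difference between the 75th and 25th percentiles. The scaling is
applied using the RobustScaler class in the sklearn.preprocessing module.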


In [12]:
from sklearn.preprocessing import RobustScaler
scaler = RobustScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("X_train: ", X_train.min(), " - ", X_train.max())
print("X_train_scaled (RobustScaler): ", X_train_scaled.min(), " - ", X_train_scaled.max())

X_train:  0.1  -  7.7
X_train_scaled (RobustScaler):  -1.6666666666666665  -  2.3333333333333335
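
The median and IQR computation can be checked by hand with NumPy percentiles. This is a minimal sketch that assumes RobustScaler's default settings (centering on the median and scaling by the 25th to 75th percentile range); the names `median`, `iqr`, and `X_train_manual` are introduced for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import RobustScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Per-feature median and interquartile range from the training set
median = np.median(X_train, axis=0)
q25, q75 = np.percentile(X_train, [25, 75], axis=0)
iqr = q75 - q25

# Manual robust scaling
X_train_manual = (X_train - median) / iqr

# The same transformation through scikit-learn (default quantile_range is 25 to 75)
X_train_sklearn = RobustScaler().fit_transform(X_train)

print("Manual and RobustScaler agree:", np.allclose(X_train_manual, X_train_sklearn))
print("Per-feature median after scaling:", np.median(X_train_sklearn, axis=0))
```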


### **4_ MaxAbs Scaler**
The MaxAbsScaler method divides each feature by its maximum absolute value, so the scaled values lie between -1 and 1.
Because it does not shift or center the data, zero entries stay zero, which makes it useful for sparse data or data
that is already centered at zero. Like min-max scaling, it is sensitive to outliers. The scaling is applied
using the MaxAbsScaler class in the sklearn.preprocessing module.


In [13]:
from sklearn.preprocessing import MaxAbsScaler
scaler = MaxAbsScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("X_train: ", X_train.min(), " - ", X_train.max())
print("X_train_scaled (MaxAbsScaler): ", X_train_scaled.min(), " - ", X_train_scaled.max())

X_train:  0.1  -  7.7
X_train_scaled (MaxAbsScaler):  0.04  -  1.0
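
A small sketch of the maximum-absolute-value computation on a toy array may help here. The array `X_toy` is made up for this example so that it contains negative values and a zero, which the iris measurements do not.

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler

# Hypothetical toy data with negative values and a zero entry
X_toy = np.array([[-4.0, 2.0],
                  [2.0, 0.0],
                  [1.0, -1.0]])

# Per-feature maximum absolute value: [4.0, 2.0]
max_abs = np.abs(X_toy).max(axis=0)

# Manual scaling: divide each column by its maximum absolute value
X_manual = X_toy / max_abs

# The same transformation through scikit-learn
X_sklearn = MaxAbsScaler().fit_transform(X_toy)

print(X_sklearn)
# [[-1.    1.  ]
#  [ 0.5   0.  ]
#  [ 0.25 -0.5 ]]
print("Manual and MaxAbsScaler agree:", np.allclose(X_manual, X_sklearn))
```

Signs are preserved and the zero entry stays zero, since the data is only divided and never shifted.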


### **5_ Quantile Transformer**
The QuantileTransformer method transforms the features to follow a uniform or a normal distribution. It is a
non-linear transformation that maps each feature through its estimated quantiles, which spreads out the most frequent
values and reduces the impact of outliers. The transformation is applied using the QuantileTransformer class in the
sklearn.preprocessing module.


In [15]:
from sklearn.preprocessing import QuantileTransformer
scaler = QuantileTransformer(output_distribution='normal')
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("X_train: ", X_train.min(), " - ", X_train.max())
print("X_train_scaled (QuantileTransformer): ", X_train_scaled.min(), " - ", X_train_scaled.max())

X_train:  0.1  -  7.7
X_train_scaled (QuantileTransformer):  -5.199337582605575  -  5.19933758270342
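
The two target distributions can be compared side by side. In this sketch, setting `n_quantiles` to the number of training samples is an assumption made here so that the number of quantiles does not exceed the number of samples; the variable names are also introduced for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import QuantileTransformer

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

n_samples = X_train.shape[0]

# Map each feature to a uniform distribution on [0, 1]
qt_uniform = QuantileTransformer(n_quantiles=n_samples, output_distribution="uniform")
X_uniform = qt_uniform.fit_transform(X_train)

# Map each feature to a standard normal distribution
qt_normal = QuantileTransformer(n_quantiles=n_samples, output_distribution="normal")
X_normal = qt_normal.fit_transform(X_train)

print("Uniform output range:", X_uniform.min(), "-", X_uniform.max())
print("Normal output range:", X_normal.min(), "-", X_normal.max())
```

With the normal output, the extreme quantiles are clipped to a finite value, which is why the printed range is bounded and symmetric rather than infinite.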




### **6_ Power Transformer**

The PowerTransformer method applies a power transformation to each feature to make it more Gaussian-like. This can be
useful as a non-linear transformation when a model benefits from more normally distributed inputs.

The transformation is applied using the PowerTransformer class in the sklearn.preprocessing module.

The method parameter can be set to yeo-johnson or box-cox to control the type of power transformation used.
The default is yeo-johnson. The box-cox method is limited to strictly positive data.

In [16]:
from sklearn.preprocessing import PowerTransformer
scaler = PowerTransformer(method='yeo-johnson')
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("X_train: ", X_train.min(), " - ", X_train.max())
print("X_train_scaled (PowerTransformer): ", X_train_scaled.min(), " - ", X_train_scaled.max())

X_train:  0.1  -  7.7
X_train_scaled (PowerTransformer):  -2.68316691739846  -  2.6651156823636706
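
The difference between the two methods can be sketched as follows. The fitted power parameters are read from the `lambdas_` attribute, and the centered copy `X_centered` is constructed here only to produce non-positive values for the Box-Cox case.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PowerTransformer

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Yeo-Johnson (the default) accepts any real-valued data
pt_yj = PowerTransformer(method="yeo-johnson").fit(X_train)
print("Yeo-Johnson lambdas:", pt_yj.lambdas_)

# Box-Cox works here because every iris measurement is strictly positive
pt_bc = PowerTransformer(method="box-cox").fit(X_train)
print("Box-Cox lambdas:", pt_bc.lambdas_)

# Centering introduces negative values, which Box-Cox rejects
X_centered = X_train - X_train.mean(axis=0)
try:
    PowerTransformer(method="box-cox").fit(X_centered)
    box_cox_failed = False
except ValueError as err:
    box_cox_failed = True
    print("Box-Cox failed:", err)
```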


### **7_ Normalizer**

The Normalizer method transforms the data to have a unit norm.

This can be useful for sparse datasets (lots of zeros) with attributes of varying scales when using algorithms that
weight input values, such as neural networks, and algorithms that use distance measures, such as K-nearest neighbors.

The transformation is applied using the Normalizer class in the sklearn.preprocessing module.



In [17]:
from sklearn.preprocessing import Normalizer
scaler = Normalizer()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("X_train: ", X_train.min(), " - ", X_train.max())
print("X_train_scaled (Normalizer): ", X_train_scaled.min(), " - ", X_train_scaled.max())

X_train:  0.1  -  7.7
X_train_scaled (Normalizer):  0.014726598240177802  -  0.8609385732675535


The Normalizer method scales each sample (i.e., each row in the data matrix) to have unit norm. This can be useful
when you want to treat each sample as a vector with a certain magnitude and direction.
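
The row-wise behavior can be verified directly: after the transformation, every row has a Euclidean length of one. This is a minimal sketch assuming the default L2 norm; `row_norms` and `X_train_manual` are names introduced for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import Normalizer

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Manual normalization: divide each row by its own L2 norm
row_norms = np.linalg.norm(X_train, axis=1, keepdims=True)
X_train_manual = X_train / row_norms

# The same transformation through scikit-learn (default norm is "l2")
X_train_sklearn = Normalizer().fit_transform(X_train)

print("Manual and Normalizer agree:", np.allclose(X_train_manual, X_train_sklearn))
print("Row norms after scaling are all 1:",
      np.allclose(np.linalg.norm(X_train_sklearn, axis=1), 1.0))
```

Unlike the scalers above, Normalizer works on each sample independently, so it learns no statistics from the training set.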

### **8_ Binarizer**

The Binarizer method converts the data to binary values (i.e., 0 or 1) based on a threshold: values greater than the
threshold become 1 and all other values become 0. This can be useful when only whether a feature exceeds the
threshold matters, rather than its exact value.

These are just a few more examples of the scaling methods available in scikit-learn.

You can find more information on scaling techniques in the scikit-learn documentation.

It's important to choose the appropriate scaling method based on the specific problem and data at hand, and to
experiment with different techniques to find the best approach for your particular problem.


In [18]:
from sklearn.preprocessing import Binarizer
scaler = Binarizer(threshold=0.5)
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("X_train: ", X_train.min(), " - ", X_train.max())
print("X_train_scaled (Binarizer): ", X_train_scaled.min(), " - ", X_train_scaled.max())

X_train:  0.1  -  7.7
X_train_scaled (Binarizer):  0.0  -  1.0
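
The thresholding rule can be written out as a plain NumPy comparison. This sketch assumes the same threshold of 0.5 used above; the toy array `X_toy` is made up to show what happens to a value that sits exactly on the threshold.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import Binarizer

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

threshold = 0.5

# Manual binarization: 1 where the value is strictly greater than the threshold, else 0
X_train_manual = (X_train > threshold).astype(X_train.dtype)

# The same transformation through scikit-learn
X_train_sklearn = Binarizer(threshold=threshold).fit_transform(X_train)
print("Manual and Binarizer agree:", np.array_equal(X_train_manual, X_train_sklearn))

# Hypothetical toy data: a value equal to the threshold maps to 0
X_toy = np.array([[0.5, 0.6],
                  [0.1, 2.0]])
X_toy_binary = Binarizer(threshold=threshold).fit_transform(X_toy)
print(X_toy_binary)
# [[0. 1.]
#  [0. 1.]]
```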
