# __SCALING__

<hr>

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

<hr>

### E. _Maximum Absolute Scaler_

- $\displaystyle x_\textrm{scaler} = \frac {x} {|x|_\textrm{max}}$

- MaxAbsScaler results fall in the range __-1__ to __1__
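
The data used in this section is all positive, so its scaled values land between 0 and 1. As a small illustration, the array `x` below is made up (it is not part of the notebook's data) and includes negative values, so the full -1 to 1 range shows up:

```python
import numpy as np

# Hypothetical values, chosen only so that some are negative
x = np.array([-4.0, -1.0, 0.0, 2.0, 3.0])

# Divide every value by the largest absolute value, as in the formula above
x_scaled = x / np.max(np.abs(x))

print(x_scaled)  # values: -1, -0.25, 0, 0.5, 0.75
```

The sign of each value is kept, and the value with the largest magnitude maps to exactly -1 or 1.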

In [None]:
df = pd.DataFrame({
    'TB': [1.72, 1.81, 1.93, 1.67, 1.85, 1.66, 1.59, 1.76, 1.88, 1.78],
    'BB': [76, 65, 87, 55, 60, 78, 98, 77, 74, 64]
})
df

In [None]:
np.max(np.abs(df['TB']))

df['masTB'] = df['TB'] / np.max(np.abs(df['TB']))
df['masBB'] = df['BB'] / np.max(np.abs(df['BB']))
df

<hr>

### F. _Maximum Absolute Scaler on Sklearn_

#### F1. ```maxabs_scale()```
```from sklearn.preprocessing import maxabs_scale```

In [None]:
from sklearn.preprocessing import maxabs_scale

In [None]:
df['masTB 2'] = maxabs_scale(df['TB'])
df['masBB 2'] = maxabs_scale(df['BB'])
df

<hr>

#### F2. ```MaxAbsScaler()```

```from sklearn.preprocessing import MaxAbsScaler```

In [None]:
from sklearn.preprocessing import MaxAbsScaler

In [None]:
scaler = MaxAbsScaler()
scaler.fit(df[['TB', 'BB']])
mas = scaler.transform(df[['TB', 'BB']])

df['masTB 3'] = mas[:, 0]
df['masBB 3'] = mas[:, 1]
df

In [None]:
inmas = scaler.inverse_transform(df[['masTB 3', 'masBB 3']])
df['inverse TB'] = inmas[:,0]
df['inverse BB'] = inmas[:,1]
df
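
A fitted `MaxAbsScaler` keeps the per-column maximum absolute value in its `max_abs_` attribute, and `inverse_transform()` multiplies it back in. The sketch below rebuilds the same data under illustrative names (`X`, `scaled`, `restored` are not from the cells above) and checks the round trip:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MaxAbsScaler

X = pd.DataFrame({
    'TB': [1.72, 1.81, 1.93, 1.67, 1.85, 1.66, 1.59, 1.76, 1.88, 1.78],
    'BB': [76, 65, 87, 55, 60, 78, 98, 77, 74, 64]
})

scaler = MaxAbsScaler().fit(X)

# Per-column max |x| learned during fit: 1.93 for TB, 98 for BB
print(scaler.max_abs_)

scaled = scaler.transform(X)
restored = scaler.inverse_transform(scaled)

# The round trip should reproduce the original values up to float rounding
print(np.allclose(restored, X.to_numpy()))
```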

<hr>

### H. _Robust Scaler_

- $\displaystyle x_\textrm{robust} = \frac {x - \textrm{median}{(x)}} {\textrm{IQR}} = \frac {x - Q_2} {Q_3 - Q_1}$

- Works well on data with many outliers
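
To see why the median and IQR help with outliers, here is a minimal sketch on made-up numbers. The arrays `x` and `x_out` and the helper `robust_scale()` are hypothetical and written only for this illustration. Replacing the largest value with an extreme one leaves the quartiles, and therefore the scaled values of the other points, unchanged, whereas max-abs scaling squeezes them toward 0:

```python
import numpy as np

def robust_scale(v):
    """Apply (v - median) / IQR, following the formula above."""
    q1, q2, q3 = np.quantile(v, [0.25, 0.5, 0.75])
    return (v - q2) / (q3 - q1)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
x_out = np.append(x[:-1], 900.0)  # same data, last value replaced by an outlier

# Robust scaling: the first eight scaled values are identical in both cases
print(robust_scale(x)[:-1])
print(robust_scale(x_out)[:-1])

# Max-abs scaling: the outlier shrinks every other value
print(x[:-1] / np.max(np.abs(x)))
print(x_out[:-1] / np.max(np.abs(x_out)))
```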

In [None]:
df1 = pd.DataFrame({
    'TB': [1.72, 1.81, 1.93, 1.67, 1.85, 1.66, 1.59, 1.76, 1.88, 1.78],
    'BB': [76, 65, 87, 55, 60, 78, 98, 77, 74, 64]
})
df1

In [None]:
# find Q1, Q2, Q3
q1TB = np.quantile(df1['TB'], .25)
q2TB = np.quantile(df1['TB'], .5)
q3TB = np.quantile(df1['TB'], .75)
q1BB = np.quantile(df1['BB'], .25)
q2BB = np.quantile(df1['BB'], .5)
q3BB = np.quantile(df1['BB'], .75)

print(q1TB, q2TB, q3TB)
print(q1BB, q2BB, q3BB)

In [None]:
df1['robust TB'] = (df1['TB'] - q2TB) / (q3TB - q1TB)
df1['robust BB'] = (df1['BB'] - q2BB) / (q3BB - q1BB)
df1

<hr>

### I. _Robust Scaler Sklearn ```RobustScaler()```_

```from sklearn.preprocessing import RobustScaler```

In [None]:
from sklearn.preprocessing import RobustScaler

In [None]:
scaler = RobustScaler()
# Fit on df1 (this section's DataFrame), not df from the MaxAbsScaler section
scaler.fit(df1[['TB', 'BB']])

rs = scaler.transform(df1[['TB', 'BB']])

df1['Robust TB 2'] = rs[:, 0]
df1['Robust BB 2'] = rs[:, 1]
df1

In [None]:
inv = scaler.inverse_transform(df1[['Robust TB 2', 'Robust BB 2']])

df1['inverse TB'] = inv[:,0]
df1['inverse BB'] = inv[:,1]
df1
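
With its default settings, `RobustScaler` stores the column medians in `center_` and the interquartile ranges in `scale_`. The sketch below compares them with the quartiles from `np.quantile()`, as in the manual calculation of the previous section. The names `X`, `q1`, `q2` and `q3` are illustrative only:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import RobustScaler

X = pd.DataFrame({
    'TB': [1.72, 1.81, 1.93, 1.67, 1.85, 1.66, 1.59, 1.76, 1.88, 1.78],
    'BB': [76, 65, 87, 55, 60, 78, 98, 77, 74, 64]
})

scaler = RobustScaler().fit(X)

# Quartiles per column (axis=0); each of q1, q2, q3 has one entry per column
q1, q2, q3 = np.quantile(X.to_numpy(), [0.25, 0.5, 0.75], axis=0)

print(scaler.center_, q2)      # medians
print(scaler.scale_, q3 - q1)  # interquartile ranges
```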

In [None]:
#### Does robust scaling change the distribution of the data?

In [None]:
plt.subplot(121)
plt.plot(df1['TB'], df1['BB'], 'ro')
plt.subplot(122)
plt.plot(df1['robust TB'], df1['robust BB'], 'go')
plt.show()
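
On the question of whether the distribution changes: the scaler subtracts one constant per column and divides by a positive constant, so it is a linear transformation. The relative positions of the points should therefore be preserved, with only the axis values changing. One way to check this numerically, assuming the same TB/BB values and a hypothetical helper `robust_scale()`, is to compare the correlation before and after scaling:

```python
import numpy as np

tb = np.array([1.72, 1.81, 1.93, 1.67, 1.85, 1.66, 1.59, 1.76, 1.88, 1.78])
bb = np.array([76, 65, 87, 55, 60, 78, 98, 77, 74, 64], dtype=float)

def robust_scale(v):
    """Apply (v - median) / IQR to a 1-D array."""
    q1, q2, q3 = np.quantile(v, [0.25, 0.5, 0.75])
    return (v - q2) / (q3 - q1)

# Pearson correlation is unaffected by shifting and positive rescaling
r_before = np.corrcoef(tb, bb)[0, 1]
r_after = np.corrcoef(robust_scale(tb), robust_scale(bb))[0, 1]

print(np.isclose(r_before, r_after))
```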

<hr>

### J. _Binarizer_

- Converts data into binary values (0 or 1), i.e. boolean, using a given threshold
- For example, with threshold = 20:

```value <= 20 becomes 0```

```value > 20 becomes 1```
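
The rule above can be sketched in NumPy with a single comparison. The array `values` is made up for this example and includes the boundary value 20 itself, to show that it maps to 0:

```python
import numpy as np

# Hypothetical values around the threshold
values = np.array([5, 20, 21, 40])
threshold = 20

# Strictly greater than the threshold -> 1, otherwise -> 0
binary = (values > threshold).astype(int)

print(binary)  # [0 0 1 1]
```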

In [None]:
df2 = pd.DataFrame({
    'TB': [1.72, 1.81, 1.93, 1.67, 1.85, 1.66, 1.59, 1.76, 1.88, 1.78],
    'BB': [76, 65, 87, 55, 60, 78, 98, 77, 74, 64]
})
df2

In [23]:
df2['Binarizer TB'] = df2['TB'].apply(lambda x: 0 if x <= 1.8 else 1)
df2['Binarizer BB'] = df2['BB'].apply(lambda x: 0 if x <= 80 else 1)
df2

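| | TB | BB | Binarizer TB | Binarizer BB |
|---|---|---|---|---|
| 0 | 1.72 | 76 | 0 | 0 |
| 1 | 1.81 | 65 | 1 | 0 |
| 2 | 1.93 | 87 | 1 | 1 |
| 3 | 1.67 | 55 | 0 | 0 |
| 4 | 1.85 | 60 | 1 | 0 |
| 5 | 1.66 | 78 | 0 | 0 |
| 6 | 1.59 | 98 | 0 | 1 |
| 7 | 1.76 | 77 | 0 | 0 |
| 8 | 1.88 | 74 | 1 | 0 |
| 9 | 1.78 | 64 | 0 | 0 |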


<hr>

### K. _Binarizer Sklearn_

```from sklearn.preprocessing import Binarizer```

In [24]:
from sklearn.preprocessing import Binarizer

In [26]:
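transformer = Binarizer(threshold=1.8)
# fit() learns nothing for Binarizer; it is kept for API consistency
transformer.fit(df2[['TB']])
# transform() returns a 2-D array, so flatten it before storing it as a column
df2['Binarizer TB 2'] = transformer.transform(df2[['TB']]).astype('int32').ravel()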

In [27]:
transformer = Binarizer(threshold=80)
transformer.fit(df2[['BB']])
# Flatten the 2-D result before storing it as a column
df2['Binarizer BB 2'] = transformer.transform(df2[['BB']]).ravel()
df2

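| | TB | BB | Binarizer TB | Binarizer BB | Binarizer TB 2 | Binarizer BB 2 |
|---|---|---|---|---|---|---|
| 0 | 1.72 | 76 | 0 | 0 | 0 | 0 |
| 1 | 1.81 | 65 | 1 | 0 | 1 | 0 |
| 2 | 1.93 | 87 | 1 | 1 | 1 | 1 |
| 3 | 1.67 | 55 | 0 | 0 | 0 | 0 |
| 4 | 1.85 | 60 | 1 | 0 | 1 | 0 |
| 5 | 1.66 | 78 | 0 | 0 | 0 | 0 |
| 6 | 1.59 | 98 | 0 | 1 | 0 | 1 |
| 7 | 1.76 | 77 | 0 | 0 | 0 | 0 |
| 8 | 1.88 | 74 | 1 | 0 | 1 | 0 |
| 9 | 1.78 | 64 | 0 | 0 | 0 | 0 |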
