
## Scaling to maximum value - MaxAbsScaling

Maximum absolute scaling divides each variable by its maximum absolute value:

X_scaled = X / max(abs(X))

After this transformation, the values of each variable lie within the range -1 to 1. However, the mean is not centered at zero, and the standard deviation varies across variables.
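As a minimal sketch of the formula, here is the scaling done by hand with NumPy on a small made-up array (the values are illustrative and do not come from the dataset used below):

```python
import numpy as np

# Toy feature matrix (illustrative values): 4 observations, 2 variables.
X = np.array([
    [-4.0, 10.0],
    [2.0, 50.0],
    [1.0, 20.0],
    [-1.0, 40.0],
])

# Largest absolute value of each variable (column).
max_abs = np.abs(X).max(axis=0)

# Divide each column by its maximum absolute value.
X_scaled = X / max_abs

print(max_abs)
print(X_scaled)

# The scaled values stay within [-1, 1], but the means are not zero
# and the standard deviations differ between the two columns.
print(X_scaled.mean(axis=0))
print(X_scaled.std(axis=0))
```

Note that the first column is divided by 4 because of the value -4, even though its largest value is 2: it is the absolute value that matters.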

The scikit-learn documentation notes that this transformer is meant for data that is already centered at zero, or for sparse data: because it does not shift the data, it does not destroy sparsity.
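The point about sparse data can be illustrated with a small sketch. The matrix below is made up for the example; it assumes SciPy is installed alongside scikit-learn:

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import MaxAbsScaler

# Toy sparse matrix (illustrative values): most entries are zero.
X_sparse = sparse.csr_matrix(np.array([
    [0.0, 2.0, 0.0],
    [4.0, 0.0, 0.0],
    [0.0, 0.0, -5.0],
    [1.0, 0.0, 0.0],
]))

# MaxAbsScaler only divides, it never subtracts, so zeros stay zeros.
X_scaled = MaxAbsScaler().fit_transform(X_sparse)

print(sparse.issparse(X_scaled))   # the output is still a sparse matrix
print(X_sparse.nnz, X_scaled.nnz)  # same number of stored non-zero entries
print(X_scaled.toarray())
```

A scaler that subtracts the mean would turn most of the zeros into non-zero values, which is why centering is usually avoided on sparse inputs.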


In a nutshell, MaxAbsScaling:

- does not center the mean at 0 (but it might be a good idea to center it with another method)
- variance varies across variables
- preserves the shape of the original distribution, because each variable is only divided by a constant
- is sensitive to outliers
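The sensitivity to outliers can be seen in a small sketch with made-up numbers: a single extreme value becomes the divisor and compresses all the other observations towards zero.

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler

# One variable with five ordinary observations (illustrative values).
values = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])

# The same variable with a single extreme observation appended.
with_outlier = np.vstack([values, [[500.0]]])

scaled_clean = MaxAbsScaler().fit_transform(values)
scaled_outlier = MaxAbsScaler().fit_transform(with_outlier)

# Without the outlier the values spread over 0.2 to 1.0.
print(scaled_clean.ravel())

# With the outlier the first five values are squeezed into 0.002 to 0.01.
print(scaled_outlier.ravel())
```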

In [5]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MaxAbsScaler

In [2]:
data = pd.read_csv('../Data/mobile_dataset.csv')

In [3]:
data = data.select_dtypes('number')

In [4]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2000 entries, 0 to 1999
Data columns (total 21 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   battery_power  2000 non-null   int64  
 1   blue           2000 non-null   int64  
 2   clock_speed    2000 non-null   float64
 3   dual_sim       2000 non-null   int64  
 4   fc             2000 non-null   int64  
 5   four_g         2000 non-null   int64  
 6   int_memory     2000 non-null   int64  
 7   m_dep          2000 non-null   float64
 8   mobile_wt      2000 non-null   int64  
 9   n_cores        2000 non-null   int64  
 10  pc             2000 non-null   int64  
 11  px_height      2000 non-null   int64  
 12  px_width       2000 non-null   int64  
 13  ram            2000 non-null   int64  
 14  sc_h           2000 non-null   int64  
 15  sc_w           2000 non-null   int64  
 16  talk_time      2000 non-null   int64  
 17  three_g        2000 non-null   int64  
 18  touch_sc

In [9]:
X_train, X_test, y_train, y_test = train_test_split(data.drop(['price_range'], axis=1), data['price_range'], random_state=24)

In [13]:
X_train

Unnamed: 0,battery_power,blue,clock_speed,dual_sim,fc,four_g,int_memory,m_dep,mobile_wt,n_cores,pc,px_height,px_width,ram,sc_h,sc_w,talk_time,three_g,touch_screen,wifi
1594,586,1,0.6,0,16,1,42,0.3,121,7,17,785,1118,1869,12,2,7,1,1,1
843,1438,1,1.8,0,3,0,16,0.6,169,8,7,859,867,2246,14,11,15,0,0,0
1947,1788,0,0.5,0,0,1,6,0.2,141,6,16,149,1022,2321,7,5,20,1,1,0
1896,1799,0,1.0,0,1,1,30,0.6,123,3,9,617,1386,445,10,8,10,1,1,0
964,648,0,1.9,1,4,0,8,1.0,91,5,19,819,1997,2991,8,7,4,0,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1425,1600,0,2.5,1,1,0,19,0.6,88,6,9,831,1713,1179,10,3,18,0,0,1
343,1034,1,2.7,1,6,0,37,0.7,120,7,20,707,1199,3625,17,1,12,0,1,1
192,1490,1,0.5,1,4,1,64,0.3,150,8,8,1417,1464,3600,17,9,7,1,1,1
899,1112,0,0.5,0,0,1,12,0.9,190,4,6,777,1119,3302,11,0,20,1,1,1


In [11]:
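# fit the scaler on the train set only: it learns each variable's maximum absolute value
scale = MaxAbsScaler()
scale.fit(X_train)

# transform the train and the test data
train_scale = scale.transform(X_train)
test_scale = scale.transform(X_test)

# convert the resulting NumPy arrays back to DataFrames
# (the original row index is not kept, so rows are renumbered from 0)
train_scale = pd.DataFrame(train_scale, columns=X_train.columns)
test_scale = pd.DataFrame(test_scale, columns=X_test.columns)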

In [12]:
train_scale

Unnamed: 0,battery_power,blue,clock_speed,dual_sim,fc,four_g,int_memory,m_dep,mobile_wt,n_cores,pc,px_height,px_width,ram,sc_h,sc_w,talk_time,three_g,touch_screen,wifi
0,0.293293,1.0,0.200000,0.0,0.842105,1.0,0.656250,0.3,0.605,0.875,0.85,0.400510,0.559560,0.467484,0.631579,0.111111,0.35,1.0,1.0,1.0
1,0.719720,1.0,0.600000,0.0,0.157895,0.0,0.250000,0.6,0.845,1.000,0.35,0.438265,0.433934,0.561781,0.736842,0.611111,0.75,0.0,0.0,0.0
2,0.894895,0.0,0.166667,0.0,0.000000,1.0,0.093750,0.2,0.705,0.750,0.80,0.076020,0.511512,0.580540,0.368421,0.277778,1.00,1.0,1.0,0.0
3,0.900400,0.0,0.333333,0.0,0.052632,1.0,0.468750,0.6,0.615,0.375,0.45,0.314796,0.693694,0.111306,0.526316,0.444444,0.50,1.0,1.0,0.0
4,0.324324,0.0,0.633333,1.0,0.210526,0.0,0.125000,1.0,0.455,0.625,0.95,0.417857,0.999499,0.748124,0.421053,0.388889,0.20,0.0,0.0,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1495,0.800801,0.0,0.833333,1.0,0.052632,0.0,0.296875,0.6,0.440,0.750,0.45,0.423980,0.857357,0.294897,0.526316,0.166667,0.90,0.0,0.0,1.0
1496,0.517518,1.0,0.900000,1.0,0.315789,0.0,0.578125,0.7,0.600,0.875,1.00,0.360714,0.600100,0.906703,0.894737,0.055556,0.60,0.0,1.0,1.0
1497,0.745746,1.0,0.166667,1.0,0.210526,1.0,1.000000,0.3,0.750,1.000,0.40,0.722959,0.732733,0.900450,0.894737,0.500000,0.35,1.0,1.0,1.0
1498,0.556557,0.0,0.166667,0.0,0.000000,1.0,0.187500,0.9,0.950,0.500,0.30,0.396429,0.560060,0.825913,0.578947,0.000000,1.00,1.0,1.0,1.0
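To check what the scaler learned, the fitted `max_abs_` attribute holds the per-variable divisor. The sketch below uses two tiny made-up DataFrames (`toy_train` and `toy_test` are hypothetical stand-ins for `X_train` and `X_test`) and also shows that the -1 to 1 range is only guaranteed for the data the scaler was fit on:

```python
import pandas as pd
from sklearn.preprocessing import MaxAbsScaler

# Hypothetical stand-ins for X_train and X_test (illustrative values).
toy_train = pd.DataFrame({"ram": [512, 1024, 2048], "clock_speed": [0.5, 1.5, 3.0]})
toy_test = pd.DataFrame({"ram": [4096, 256], "clock_speed": [2.0, 1.0]})

# Fit on the train set only.
scaler = MaxAbsScaler().fit(toy_train)

# Maximum absolute value learned for each variable.
print(scaler.max_abs_)

toy_train_scaled = pd.DataFrame(scaler.transform(toy_train), columns=toy_train.columns)
toy_test_scaled = pd.DataFrame(scaler.transform(toy_test), columns=toy_test.columns)

# On the train set every variable peaks at exactly 1 in absolute value.
print(toy_train_scaled.abs().max())

# On the test set a value larger than the train maximum scales above 1.
print(toy_test_scaled.abs().max())
```

Here the test value 4096 for `ram` is twice the train maximum, so it is scaled to 2.0.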
