# Data Range

## Range scaling (Range Compression)
### Compressing data into a desired range.

Let $d_{min}$ and $d_{max}$ be the minimum and maximum values in the data, and let $x$ be a value to scale.

First, compute where $x$ falls between $d_{min}$ and $d_{max}$, expressed as a proportion:
$$x_{proportion}=\frac{x-d_{min}}{d_{max}-d_{min}}$$

Suppose the target range is $[r_{min},r_{max}]$.

Using the proportion, we can calculate the scaled values:
$$x_{scaled}=x_{proportion}\times(r_{max}-r_{min})+r_{min}$$
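
As a minimal sketch of the two formulas above, the computation for a single column can be written directly in NumPy. The helper name `rescale` and the sample values are assumptions made for illustration; they are not part of any library.

```python
import numpy as np

def rescale(x, r_min=0.0, r_max=1.0):
    """Map the values of x linearly onto [r_min, r_max] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    d_min, d_max = x.min(), x.max()
    # Assumes d_max > d_min; a constant column would divide by zero here.
    proportion = (x - d_min) / (d_max - d_min)
    return proportion * (r_max - r_min) + r_min

values = [1, 2, 3, 6]
print(rescale(values))          # default target range [0, 1]
print(rescale(values, -10, 4))  # custom target range [-10, 4]
```

The smallest input always maps to $r_{min}$ and the largest to $r_{max}$; every other value keeps its relative position between them.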

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import sklearn

In [2]:
from sklearn.preprocessing import MinMaxScaler

In [15]:
data = np.array([
    [1, 2, 3, 3, 2, 1, 4, 5, 6, 5, 5, 3, 4],
    [100, 200, 400, 200, 250, 350, 600, 350, 700, 650, 400, 470, 500]
])
data = data.transpose()  # rows are samples, columns are features
data

array([[  1, 100],
       [  2, 200],
       [  3, 400],
       [  3, 200],
       [  2, 250],
       [  1, 350],
       [  4, 600],
       [  5, 350],
       [  6, 700],
       [  5, 650],
       [  5, 400],
       [  3, 470],
       [  4, 500]])

In [17]:
default_scaler = MinMaxScaler() # Default range: [0, 1]
default_scaler.fit_transform(data)

array([[0.        , 0.        ],
       [0.2       , 0.16666667],
       [0.4       , 0.5       ],
       [0.4       , 0.16666667],
       [0.2       , 0.25      ],
       [0.        , 0.41666667],
       [0.6       , 0.83333333],
       [0.8       , 0.41666667],
       [1.        , 1.        ],
       [0.8       , 0.91666667],
       [0.8       , 0.5       ],
       [0.4       , 0.61666667],
       [0.6       , 0.66666667]])

In [32]:
# Scale only the first column; reshape(-1, 1) makes it 2D, (n_samples, 1), as the scaler expects
default_scaler.fit_transform(data[:, 0].reshape(-1, 1))

array([[0. ],
       [0.2],
       [0.4],
       [0.4],
       [0.2],
       [0. ],
       [0.6],
       [0.8],
       [1. ],
       [0.8],
       [0.8],
       [0.4],
       [0.6]])

In [33]:
# Scale each column into [-10, 4] instead of the default [0, 1]
custom_scaler = MinMaxScaler(feature_range=(-10, 4))
custom_scaler.fit_transform(data)

array([[-10.        , -10.        ],
       [ -7.2       ,  -7.66666667],
       [ -4.4       ,  -3.        ],
       [ -4.4       ,  -7.66666667],
       [ -7.2       ,  -6.5       ],
       [-10.        ,  -4.16666667],
       [ -1.6       ,   1.66666667],
       [  1.2       ,  -4.16666667],
       [  4.        ,   4.        ],
       [  1.2       ,   2.83333333],
       [  1.2       ,  -3.        ],
       [ -4.4       ,  -1.36666667],
       [ -1.6       ,  -0.66666667]])

In [42]:
data2 = np.array([
    [0.1, 0.2, 4.1, 4.4],
    [9000, 2000, 3000, 4000]
])
data2 = data2.transpose()
data, data2

(array([[  1, 100],
        [  2, 200],
        [  3, 400],
        [  3, 200],
        [  2, 250],
        [  1, 350],
        [  4, 600],
        [  5, 350],
        [  6, 700],
        [  5, 650],
        [  5, 400],
        [  3, 470],
        [  4, 500]]),
 array([[1.0e-01, 9.0e+03],
        [2.0e-01, 2.0e+03],
        [4.1e+00, 3.0e+03],
        [4.4e+00, 4.0e+03]]))

In [41]:
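new_scaler = MinMaxScaler()
new_scaler.fit(data)         # learn each column's min and max from data only
new_scaler.transform(data2)  # reuse them on data2; results can fall outside [0, 1]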

array([[-0.18      , 14.83333333],
       [-0.16      ,  3.16666667],
       [ 0.62      ,  4.83333333],
       [ 0.68      ,  6.5       ]])
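
The values above fall outside $[0, 1]$ because the scaler reuses the minimum and maximum it learned from `data`, and `data2` contains values below and above them. A small check, assuming the same arrays as above and the fitted attributes `data_min_` and `data_max_` that `MinMaxScaler` exposes, applies the proportion formula by hand and compares it with `transform`:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Same arrays as in the cells above: rows are samples, columns are features.
data = np.array([
    [1, 2, 3, 3, 2, 1, 4, 5, 6, 5, 5, 3, 4],
    [100, 200, 400, 200, 250, 350, 600, 350, 700, 650, 400, 470, 500],
]).transpose()
data2 = np.array([
    [0.1, 0.2, 4.1, 4.4],
    [9000, 2000, 3000, 4000],
]).transpose()

new_scaler = MinMaxScaler().fit(data)

# The fitted scaler stores the per-column min and max of `data`.
d_min, d_max = new_scaler.data_min_, new_scaler.data_max_

# Applying the proportion formula by hand with those stored values
# should reproduce transform(), including the values outside [0, 1].
manual = (data2 - d_min) / (d_max - d_min)
matches = np.allclose(manual, new_scaler.transform(data2))
print(matches)
```

If out-of-range results are unwanted, recent scikit-learn releases also accept `MinMaxScaler(clip=True)`, which clips transformed values to `feature_range`. Whether clipping is appropriate depends on the use case.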