Scaling of data using scikit-learn

[scikit-learn](https://scikit-learn.org/) is an open-source, commercially usable machine learning library. It provides the [`MinMaxScaler`](https://scikit-learn.org/stable/modules/preprocessing.html#scaling-features-to-a-range), which scales each feature (column) to a given range.

by Prof. Dr.-Ing. Jürgen Brauer, www.juergenbrauer.org

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Prepare-some-example-data-to-scale" data-toc-modified-id="Prepare-some-example-data-to-scale-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Prepare some example data to scale</a></span></li><li><span><a href="#Scale-the-data-to-a-certain-range" data-toc-modified-id="Scale-the-data-to-a-certain-range-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Scale the data to a certain range</a></span></li><li><span><a href="#Now-use-the-scaler-to-transform-new-data" data-toc-modified-id="Now-use-the-scaler-to-transform-new-data-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Now use the scaler to transform new data</a></span></li><li><span><a href="#Use-the-scaler-to-undo-a-transformation" data-toc-modified-id="Use-the-scaler-to-undo-a-transformation-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Use the scaler to undo a transformation</a></span></li></ul></div>

# Prepare some example data to scale

In [1]:
import numpy as np
data = np.array([ [10.0, 20.0],
                  [15.0, 40.0],
                  [20.0, 60.0]])
print(data)
print("Type of data is", type(data))

[[ 10.  20.]
 [ 15.  40.]
 [ 20.  60.]]
Type of data is <class 'numpy.ndarray'>


# Scale the data to a certain range

In [2]:
from sklearn.preprocessing import MinMaxScaler

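# Create a MinMaxScaler that maps each feature (column) to the range [0, 1]
scaler = MinMaxScaler(feature_range=(0, 1))
# Learn the per-column minimum and maximum from data, then scale data in one step
transformed_data = scaler.fit_transform(data)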
print(transformed_data)

[[ 0.   0. ]
 [ 0.5  0.5]
 [ 1.   1. ]]
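
For reference, with `feature_range=(0, 1)` the scaling corresponds to the per-column formula `(x - min) / (max - min)`. The following is a minimal sketch that reproduces the result above with plain NumPy. The names `col_min`, `col_max` and `manual_scaled` are illustrative and not part of scikit-learn.

```python
import numpy as np

data = np.array([[10.0, 20.0],
                 [15.0, 40.0],
                 [20.0, 60.0]])

# Per-column minimum and maximum (axis=0 runs over the rows)
col_min = data.min(axis=0)
col_max = data.max(axis=0)

# Min-max scaling to [0, 1], applied column by column via broadcasting
manual_scaled = (data - col_min) / (col_max - col_min)
print(manual_scaled)
```

Each column is scaled independently, which is why both columns end up as 0, 0.5 and 1 although their original ranges differ.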


# Now use the scaler to transform new data

In [3]:
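# New data: the last row lies outside the value range seen during fitting
data2 = np.array([[10.0, 20.0],
                  [20.0, 60.0],
                  [30.0, 100.0]])
# transform() reuses the min/max learned from data, so results can fall outside [0, 1]
transformed_data2 = scaler.transform(data2)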
print(transformed_data2)

[[ 0.  0.]
 [ 1.  1.]
 [ 2.  2.]]
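
The last row is mapped to 2 because 30 and 100 lie above the maxima (20 and 60) that the scaler learned during fitting. If values outside the target range are not wanted, one option is the `clip` parameter of `MinMaxScaler`. The sketch below assumes scikit-learn 0.24 or newer, where this parameter is available. The name `clipping_scaler` is illustrative.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

data = np.array([[10.0, 20.0],
                 [15.0, 40.0],
                 [20.0, 60.0]])
data2 = np.array([[10.0, 20.0],
                  [20.0, 60.0],
                  [30.0, 100.0]])

# clip=True limits transformed values to feature_range
# (assumption: scikit-learn >= 0.24)
clipping_scaler = MinMaxScaler(feature_range=(0, 1), clip=True)
clipping_scaler.fit(data)

clipped_data2 = clipping_scaler.transform(data2)
print(clipped_data2)
```

Note that clipping discards information: after clipping, the rows `[20, 60]` and `[30, 100]` can no longer be told apart.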


# Use the scaler to undo a transformation

In [4]:
print(transformed_data)
# inverse_transform() maps the scaled values back to the original units
original_data = scaler.inverse_transform(transformed_data)
print(original_data)

[[ 0.   0. ]
 [ 0.5  0.5]
 [ 1.   1. ]]
[[ 10.  20.]
 [ 15.  40.]
 [ 20.  60.]]
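
The inverse step can also be written out by hand: for the range [0, 1] it is `x_scaled * (max - min) + min` per column. The following minimal sketch uses plain NumPy to check that scaling followed by the inverse recovers the original data. The variable names are illustrative and not part of scikit-learn.

```python
import numpy as np

data = np.array([[10.0, 20.0],
                 [15.0, 40.0],
                 [20.0, 60.0]])

col_min = data.min(axis=0)
col_max = data.max(axis=0)

# Forward: scale each column to [0, 1]
scaled = (data - col_min) / (col_max - col_min)

# Inverse: map the scaled values back to the original units
restored = scaled * (col_max - col_min) + col_min

# Compare with a tolerance, since floating-point round trips need not be bit-exact
print(np.allclose(restored, data))
```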
