## Exporting scalers

This notebook contains examples of exporting scalers from the `sklearn.preprocessing` module.

### Loading common packages
Installing the `sklearn-export` package using pip.

In [None]:
import sys
!{sys.executable} -m pip install sklearn_export

Loading common packages.

In [39]:
import numpy as np
import json
from sklearn.datasets import load_iris
from sklearn_export import Export

### Exporting StandardScaler

The StandardScaler is one of the most common techniques to normalize data. It rescales each feature to have zero mean and unit variance. In sklearn it is in the module `sklearn.preprocessing`.

In [40]:
from sklearn.preprocessing import StandardScaler

Let us normalize the features of the iris dataset.

In [41]:
# Loading iris dataset
dataset = load_iris()
X = dataset['data']

# Normalizing features
scaler = StandardScaler()
Xz = scaler.fit_transform(X)

Let us save the scaler parameters using `sklearn-export`.

In [42]:
# A new instance of the class Export
export = Export(scaler)

# Exporting the result to a JSON file and returning the same content as a dict
result = export.to_json(filename='standard_scaler.json')

# Taking a look at the dict of the JSON file
result

{'scaler': 'ZscoreScaler',
 'mean': [5.843333333333335,
  3.057333333333334,
  3.7580000000000027,
  1.199333333333334],
 'std': [0.8253012917851409,
  0.43441096773549437,
  1.7594040657753032,
  0.7596926279021594]}

It is easy to load the file and have the model data in a dict again.

In [43]:
# Opening the JSON file (the with-block closes it afterwards)
with open('standard_scaler.json') as f:
    # Transforming in a dict (same as result above)
    model_data = json.load(f)
model_data

{'mean': [5.843333333333335,
  3.057333333333334,
  3.7580000000000027,
  1.199333333333334],
 'scaler': 'ZscoreScaler',
 'std': [0.8253012917851409,
  0.43441096773549437,
  1.7594040657753032,
  0.7596926279021594]}
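The keys are displayed in a different order than in `result`, but that does not matter: Python compares dicts by content, not by key order. A minimal sketch of that check, using a shortened stand-in dict (the values are placeholders, not the real export) and an in-memory JSON string instead of a file:

```python
import json

# Placeholder dict standing in for `result` (values are illustrative only)
result_demo = {'scaler': 'ZscoreScaler', 'mean': [1.0, 2.0], 'std': [0.5, 0.25]}

# Serialize with sorted keys so the key order in the text differs from the dict above
text = json.dumps(result_demo, sort_keys=True)
loaded = json.loads(text)

# Equality ignores key order, so the loaded dict matches the original
same = (loaded == result_demo)
```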

Since we have the JSON file, we only need to implement the StandardScaler transformation in whatever language we like (the formula can be found on [Wikipedia](https://en.wikipedia.org/wiki/Standard_score)). For example, in Python it is easy to implement the standard scaler using numpy.

In [44]:
# An example of a standard scaler implemented with model_data
def standard_scaler(X, model_data):
    mean = np.asarray(model_data['mean'])
    std = np.asarray(model_data['std'])
    Xz = (X-mean)/std
    return Xz

# Same as Xz
Xz_pred = standard_scaler(X, model_data)
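Because the transformation is affine, the same two exported vectors are enough to undo it. The following is a minimal sketch, assuming the `mean` and `std` values shown in the export above; `inverse_standard_scaler` is a hypothetical helper name, not part of `sklearn-export`, and the two sample rows are typed in by hand for illustration.

```python
import numpy as np

# Parameters copied from the exported JSON shown above
model_data = {
    'scaler': 'ZscoreScaler',
    'mean': [5.843333333333335, 3.057333333333334,
             3.7580000000000027, 1.199333333333334],
    'std': [0.8253012917851409, 0.43441096773549437,
            1.7594040657753032, 0.7596926279021594],
}

def standard_scaler(X, model_data):
    mean = np.asarray(model_data['mean'])
    std = np.asarray(model_data['std'])
    return (np.asarray(X) - mean) / std

# Hypothetical helper: maps z-scores back to the original units
def inverse_standard_scaler(Xz, model_data):
    mean = np.asarray(model_data['mean'])
    std = np.asarray(model_data['std'])
    return np.asarray(Xz) * std + mean

# Two hand-typed rows with four features each
X_sample = np.array([[5.1, 3.5, 1.4, 0.2],
                     [4.9, 3.0, 1.4, 0.2]])
Xz_sample = standard_scaler(X_sample, model_data)
X_back = inverse_standard_scaler(Xz_sample, model_data)

# The round trip should reproduce the input up to floating-point error
roundtrip_ok = np.allclose(X_back, X_sample)
```

In the notebook itself, `np.allclose(Xz_pred, Xz)` is a quick way to confirm that the hand-written scaler agrees with the output of `fit_transform`.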

It is also possible to store simpler versions of the StandardScaler, such as the "MeanScaler", which only removes the mean.

In [45]:
# Removing mean of features
scaler = StandardScaler(with_std=False)
Xz = scaler.fit_transform(X)

# Exporting again
export = Export(scaler)
export.to_json(filename='mean_scaler.json')

{'scaler': 'MeanScaler',
 'mean': [5.843333333333335,
  3.057333333333334,
  3.7580000000000027,
  1.199333333333334]}

Or the "StandardDeviationScaler", which only divides by the standard deviation.

In [46]:
# Scaling features to unit variance without removing the mean
scaler = StandardScaler(with_mean=False)
Xz = scaler.fit_transform(X)

# Exporting again
export = Export(scaler)
export.to_json(filename='std_scaler.json')

{'scaler': 'StandardDeviationScaler',
 'std': [0.8253012917851409,
  0.43441096773549437,
  1.7594040657753032,
  0.7596926279021594]}
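The three z-score variants differ only in which keys are present, so a single function can serve all of them. A minimal sketch, assuming the `scaler` names and keys shown in the exported dicts above; `apply_zscore_family` is a hypothetical name, and the input row is typed in by hand for illustration.

```python
import numpy as np

def apply_zscore_family(X, model_data):
    """Apply a ZscoreScaler, MeanScaler or StandardDeviationScaler export."""
    known = ('ZscoreScaler', 'MeanScaler', 'StandardDeviationScaler')
    if model_data['scaler'] not in known:
        raise ValueError('unsupported scaler: %s' % model_data['scaler'])
    Xz = np.asarray(X, dtype=float)
    if 'mean' in model_data:   # center only when a mean was exported
        Xz = Xz - np.asarray(model_data['mean'])
    if 'std' in model_data:    # rescale only when a std was exported
        Xz = Xz / np.asarray(model_data['std'])
    return Xz

# Dicts copied from the two exports shown above
mean_only = {'scaler': 'MeanScaler',
             'mean': [5.843333333333335, 3.057333333333334,
                      3.7580000000000027, 1.199333333333334]}
std_only = {'scaler': 'StandardDeviationScaler',
            'std': [0.8253012917851409, 0.43441096773549437,
                    1.7594040657753032, 0.7596926279021594]}

x = np.array([[5.1, 3.5, 1.4, 0.2]])
centered = apply_zscore_family(x, mean_only)   # x - mean
rescaled = apply_zscore_family(x, std_only)    # x / std
```

Branching on the presence of `mean` and `std` rather than on the scaler name keeps the arithmetic in one place; the name is only used to reject exports this function does not understand.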

### Exporting MinMaxScaler

Another common technique to normalize data is the MinMaxScaler. It rescales each feature to lie in the interval $[\text{lower},\text{upper}]$, which is $[0, 1]$ by default. It is also in the module `sklearn.preprocessing`.

In [47]:
from sklearn.preprocessing import MinMaxScaler

Let us normalize the features of the iris dataset and save it using `sklearn-export`.

In [48]:
# Normalizing data to be in the interval [lower, upper]
scaler = MinMaxScaler()
Xz = scaler.fit_transform(X)

# Exporting again
export = Export(scaler)
export.to_json(filename='minmax_scaler.json')

{'lower': 0,
 'upper': 1,
 'min': [4.3, 2.0, 1.0, 0.1],
 'max': [7.9, 4.4, 6.9, 2.5],
 'scaler': 'MinMaxScaler'}
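As with the StandardScaler, the exported values are enough to reproduce the transformation elsewhere. A minimal sketch, assuming that `min` and `max` are the per-feature minima and maxima seen during fitting and that `lower` and `upper` are the target range; `minmax_scaler` is a hypothetical helper name, and constant features (where `max` equals `min`) are not handled.

```python
import numpy as np

# Parameters copied from the exported dict shown above
model_data = {'lower': 0,
              'upper': 1,
              'min': [4.3, 2.0, 1.0, 0.1],
              'max': [7.9, 4.4, 6.9, 2.5],
              'scaler': 'MinMaxScaler'}

def minmax_scaler(X, model_data):
    x_min = np.asarray(model_data['min'])
    x_max = np.asarray(model_data['max'])
    lower, upper = model_data['lower'], model_data['upper']
    # Map [min, max] to [0, 1], then stretch and shift to [lower, upper]
    X01 = (np.asarray(X) - x_min) / (x_max - x_min)
    return X01 * (upper - lower) + lower

X_sample = np.array([[4.3, 2.0, 1.0, 0.1],   # per-feature minima
                     [7.9, 4.4, 6.9, 2.5]])  # per-feature maxima
Xz_sample = minmax_scaler(X_sample, model_data)
```

The row of minima maps to `lower` and the row of maxima maps to `upper`, which is a convenient sanity check for an implementation in any language.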