skrl.resources.preprocessors.rst

File metadata and controls

73 lines (47 loc) · 2.47 KB

Preprocessors

Implemented preprocessors

  • Running standard scaler

Basic usage

Preprocessor usage is defined in each agent's configuration dictionary.

The preprocessor class is set under the "<variable>_preprocessor" key, and its arguments are set under the "<variable>_preprocessor_kwargs" key as a keyword argument dictionary. The following example shows how to set the state and value preprocessors for an agent:

Running standard scaler

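# import the preprocessor class
from skrl.resources.preprocessors.torch import RunningStandardScaler

# DEFAULT_CONFIG stands for the agent's default configuration dictionary;
# env and device are assumed to be defined beforehand
cfg = DEFAULT_CONFIG.copy()
# preprocessor for the observations (states)
cfg["state_preprocessor"] = RunningStandardScaler
cfg["state_preprocessor_kwargs"] = {"size": env.observation_space, "device": device}
# preprocessor for the values (scalar, hence size 1)
cfg["value_preprocessor"] = RunningStandardScaler
cfg["value_preprocessor_kwargs"] = {"size": 1, "device": device}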

Running standard scaler

Algorithm implementation

Main notation/symbols:
  - mean ($\bar{x}$), standard deviation ($\sigma$), variance ($\sigma^2$)
  - running mean ($\bar{x}_t$), running variance ($\sigma^2_t$)

Standardization by centering and scaling

$\text{clip}((x - \bar{x}_t) / (\sqrt{\sigma^2_t} + \epsilon), -c, c) \qquad$ with c as clip_threshold and $\epsilon$ as epsilon
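
As a rough illustration of the standardization step, the following plain-Python sketch applies it to a list of scalars. The helper name `standardize` and the default values chosen for `epsilon` and `clip_threshold` are assumptions made for this example only; this is not the library's implementation, which operates on tensors.

```python
import math


def standardize(xs, running_mean, running_variance, epsilon=1e-8, clip_threshold=5.0):
    """Center, scale, and clip each value (hypothetical helper, not skrl's API)."""
    # denominator: running standard deviation plus epsilon to avoid division by zero
    scale = math.sqrt(running_variance) + epsilon
    standardized = []
    for x in xs:
        z = (x - running_mean) / scale
        # clip to the interval [-clip_threshold, clip_threshold]
        standardized.append(max(-clip_threshold, min(clip_threshold, z)))
    return standardized


# the last value lies far from the running mean, so it is clipped to the threshold
values = standardize([0.0, 2.0, 100.0], running_mean=2.0, running_variance=4.0)
```

Under these assumed statistics, an input equal to the running mean maps to zero, and any input more than `clip_threshold` standard deviations away is saturated at the threshold.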

Scale back the data to the original representation (inverse transform)

$\sqrt{\sigma^2_t} \; \text{clip}(x, -c, c) + \bar{x}_t \qquad$ with c as clip_threshold
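
The inverse transform can be sketched in the same plain-Python style. The helper name `inverse_standardize` and the default `clip_threshold` are assumptions for illustration, not the library's API.

```python
import math


def inverse_standardize(zs, running_mean, running_variance, clip_threshold=5.0):
    """Map standardized values back to the original scale (hypothetical helper)."""
    std = math.sqrt(running_variance)
    # clip first, then undo the scaling and the centering
    return [std * max(-clip_threshold, min(clip_threshold, z)) + running_mean for z in zs]


restored = inverse_standardize([-1.0, 0.0, 5.0], running_mean=2.0, running_variance=4.0)
```

Values that were clipped during standardization are not recovered by this step: a standardized value at the threshold maps back to the running mean plus `clip_threshold` standard deviations, whatever the original input was.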

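Update the running mean and variance (see parallel algorithm), where $\bar{x}$, $\sigma^2$, and $n$ are the mean, variance, and number of samples of the incoming batch, and $n_t$ is the running sample count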

$\delta \leftarrow \bar{x} - \bar{x}_t$
$n_T \leftarrow n_t + n$
$M2 \leftarrow (\sigma^2_t n_t) + (\sigma^2 n) + \delta^2 \dfrac{n_t n}{n_T}$
# update internal variables
$\bar{x}_t \leftarrow \bar{x}_t + \delta \dfrac{n}{n_T}$
$\sigma^2_t \leftarrow \dfrac{M2}{n_T}$
$n_t \leftarrow n_T$
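
The update rule above can be sketched in plain Python for scalar data. The function name `parallel_update` is hypothetical, and the sketch assumes the batch variance is the population (biased) variance, so that merging two batches reproduces the variance of the concatenated data; which estimator the library uses for the batch is not stated here.

```python
import statistics


def parallel_update(running_mean, running_variance, running_count, batch):
    """Merge a batch into running statistics (hypothetical helper, scalar data)."""
    n = len(batch)
    batch_mean = statistics.fmean(batch)
    batch_variance = statistics.pvariance(batch)  # assumption: population variance

    delta = batch_mean - running_mean
    total_count = running_count + n
    m2 = (running_variance * running_count
          + batch_variance * n
          + delta ** 2 * running_count * n / total_count)

    # return the updated running mean, running variance, and sample count
    return running_mean + delta * n / total_count, m2 / total_count, total_count


first = [1.0, 2.0, 3.0, 4.0]
second = [10.0, 20.0]

# start from the statistics of the first batch, then merge the second one
mean, variance, count = parallel_update(
    statistics.fmean(first), statistics.pvariance(first), len(first), second
)
```

With population variances, the merged `mean` and `variance` agree (up to floating-point error) with the statistics computed directly over `first + second`, which is a convenient sanity check for this kind of update.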

API

skrl.resources.preprocessors.torch.running_standard_scaler.RunningStandardScaler

__init__