# General-purpose environment and observation wrappers for SSP embeddings

This package defines a set of wrappers that transform observations and/or actions into spatial semantic pointer (SSP) / vector symbolic architecture (VSA) embeddings. Under the hood, the wrappers use the package's SSP spaces; for basic gym observation and action spaces, those objects can be created automatically.

The most general-purpose wrapper is SSPEnvWrapper. It can be initialized in one of the following ways:
- Provide a suitable ssp_obs_space and ssp_action_space (each can be an SSPBox, SSPDiscrete, SSPSequence, or even an SSPDict for more complex, custom VSA embeddings).
- Set auto_convert_obs_space=True and auto_convert_action_space=True and do not pass ssp_obs_space or ssp_action_space. In this case, as long as env.observation_space and env.action_space are Box or Discrete, ssp_obs_space and ssp_action_space are generated automatically.
- Set auto_convert_obs_space=True (or pass an ssp_obs_space) and set auto_convert_action_space=False. SSPs are then used for observations but not actions.
- Set auto_convert_action_space=True (or pass an ssp_action_space) and set auto_convert_obs_space=False. SSPs are then used for actions but not observations.

Note that this wrapper does **not** support learning of the SSP parameters. See the feature extractor networks if you would like to learn mapping parameters.
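
The four initialization options above can be summarized with a small, self-contained sketch. This is not vsagym code: resolve_mode and describe_wrapper are hypothetical helpers, the False defaults are chosen only for the illustration (the real defaults may differ), and the sketch assumes that an explicitly provided SSP space takes precedence over the matching auto_convert flag.

```python
def resolve_mode(ssp_space, auto_convert):
    """Classify how one channel (observations or actions) would be handled.

    Hypothetical helper for illustration only; not part of vsagym.
    """
    if ssp_space is not None:
        return "explicit"  # a user-supplied SSP space is used (assumed precedence)
    if auto_convert:
        return "auto"      # an SSP space is generated from the Box/Discrete space
    return "raw"           # the channel is left as a plain gym space


def describe_wrapper(ssp_obs_space=None, ssp_action_space=None,
                     auto_convert_obs_space=False,
                     auto_convert_action_space=False):
    """Return the (observation, action) handling implied by the arguments."""
    return (resolve_mode(ssp_obs_space, auto_convert_obs_space),
            resolve_mode(ssp_action_space, auto_convert_action_space))


# Placeholder standing in for a user-built SSP space (not a real SSPBox).
my_space = object()

# The four options from the list above, in order.
print(describe_wrapper(ssp_obs_space=my_space, ssp_action_space=my_space))
print(describe_wrapper(auto_convert_obs_space=True, auto_convert_action_space=True))
print(describe_wrapper(auto_convert_obs_space=True))
print(describe_wrapper(ssp_action_space=my_space))
```

A channel reported as "raw" corresponds to the cases where SSPs are not used for that channel.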


```python
import numpy as np
import gymnasium as gym
import sys
import os
sys.path.insert(1, os.path.dirname(os.getcwd()))
os.chdir("..")
import vsagym


# A base env
env = gym.make('CartPole-v1', render_mode='rgb_array')

# The general SSP wrapper
env = vsagym.wrappers.SSPEnvWrapper(env,
                                    auto_convert_obs_space=True,
                                    auto_convert_action_space=True,
                                    shape_out=251, decoder_method='from-set',
                                    length_scale=0.1)
observation, _ = env.reset()
assert observation.shape == (251,)
for t in range(5):
    action = env.action_space.sample()
    assert action.shape == (251,)
    _, _, terminated, truncated, _ = env.step(action)
    if terminated or truncated or t == 4:
        observation, _ = env.reset()
env.close()
```

There is also SSPObsWrapper (a subclass of gym.ObservationWrapper), which SSP-encodes observations only. It behaves the same as SSPEnvWrapper with auto_convert_action_space=False and no ssp_action_space provided.

```python
env = gym.make('CartPole-v1', render_mode='rgb_array')
env = vsagym.wrappers.SSPObsWrapper(env,
                                    shape_out=251, length_scale=0.1,
                                    decoder_method='from-set')
observation, _ = env.reset()
assert observation.shape == (251,)
for t in range(5):
    action = env.action_space.sample()
    _, _, terminated, truncated, _ = env.step(action)
    if terminated or truncated or t == 4:
        observation, _ = env.reset()
env.close()
```
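
To give a rough sense of what shape_out and length_scale control, here is a minimal NumPy sketch of one common SSP construction (fractional power encoding with random phases). It is not vsagym's implementation: make_phase_matrix and ssp_encode are hypothetical names, the Gaussian phase distribution is an assumption, and the sketch only handles an odd shape_out.

```python
import numpy as np


def make_phase_matrix(n_dims, shape_out, seed=0):
    """Sample random phases for the positive frequencies (hypothetical helper)."""
    if shape_out % 2 == 0:
        raise ValueError("this sketch assumes an odd shape_out")
    rng = np.random.default_rng(seed)
    n_freqs = (shape_out - 1) // 2
    # Assumption: Gaussian phases; other distributions are possible.
    return rng.normal(size=(n_freqs, n_dims))


def ssp_encode(x, phase_matrix, length_scale):
    """Map a continuous vector x to a real vector of length 2 * n_freqs + 1."""
    # Rescale each input dimension; length_scale may be a scalar or per-dimension.
    x = np.asarray(x, dtype=float) / np.asarray(length_scale, dtype=float)
    theta = phase_matrix @ x
    # Unit-magnitude spectrum: DC term of 1 followed by the positive frequencies.
    spectrum = np.concatenate(([1.0 + 0.0j], np.exp(1j * theta)))
    shape_out = 2 * phase_matrix.shape[0] + 1
    # irfft fills in the conjugate-symmetric half, so the result is real.
    return np.fft.irfft(spectrum, n=shape_out)


# Four observation dimensions (as in CartPole) with per-dimension length scales.
length_scale = np.array([0.96, 0.1, 0.084, 0.1])
A = make_phase_matrix(n_dims=4, shape_out=251)
obs = np.array([0.0, 0.5, 0.01, -0.3])

v = ssp_encode(obs, A, length_scale)
v_near = ssp_encode(obs + 0.1 * length_scale, A, length_scale)
v_far = ssp_encode(obs + 10.0 * length_scale, A, length_scale)

print(v.shape, np.linalg.norm(v))
print("similarity to nearby point: ", v @ v_near)
print("similarity to distant point:", v @ v_far)
```

Under these assumptions the embedding has unit norm, and the dot product between two embeddings falls off as the inputs move apart relative to length_scale, so a smaller length_scale gives finer resolution along that dimension.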

## With RLZoo3
You can use these wrappers with the RL Baselines3 Zoo (rl_zoo3) framework by listing them, with their arguments, under env_wrapper in a hyperparameters YAML file. For example:

```yaml
CartPole-v1:
  batch_size: 256
  clip_range: lin_0.2
  ent_coef: 0.0
  env_wrapper:
  - vsagym.wrappers.SSPObsWrapper:
      shape_out: 251
      length_scale: [9.6000004e-01, 1.0000000e-01, 8.3775806e-02, 1.0000000e-01]
  gae_lambda: 0.8
  gamma: 0.98
  learning_rate: lin_0.001
  n_envs: 8
  n_epochs: 20
  n_steps: 32
  n_timesteps: 100000.0
  policy: MlpPolicy
  policy_kwargs: "dict(net_arch=dict(pi=[64], vf=[64]),activation_fn=nn.ReLU)"
```

