# Reinforcement Learning Technologies

## Gymnasium

Gymnasium is an API standard for reinforcement learning that ships with a diverse collection of reference environments. It is a maintained fork of OpenAI’s Gym library.
It serves as a toolkit for developing and comparing RL algorithms: it provides a variety of prebuilt environments and a consistent API for interacting with them.

In [None]:
import gymnasium as gym

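# Initialize the environment (LunarLander needs the Box2D extra: pip install "gymnasium[box2d]")
# Note: Gymnasium 1.0+ registers this environment as "LunarLander-v3"
env = gym.make("LunarLander-v2", render_mode="human")

# Reset the environment to get the first observation
observation, info = env.reset(seed=42)
for _ in range(1000):
    # This is where you would insert your policy; for now, sample a random action
    action = env.action_space.sample()

    # Perform the action and receive the resulting transition
    observation, reward, terminated, truncated, info = env.step(action)

    # If the episode has ended (terminated or truncated), reset the environment
    if terminated or truncated:
        observation, info = env.reset()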

env.close()

## Action and Observation Spaces

Every environment exposes two attributes, **env.action_space** and **env.observation_space**, which define the valid actions and observations in *Gymnasium*.

They can be of different types:
* **Box** - n-dimensional continuous space bounded by upper and lower limits that define the valid values
* **Discrete** - discrete space where {0, 1, ..., n - 1} are the possible values; the range can be shifted via the optional `start` argument
* **Dict** - dictionary of simple spaces
* **Tuple** - tuple of simple spaces
* **MultiBinary** - binary space of shape n, where n can be a single number or a list of numbers
* **MultiDiscrete** - series of **Discrete** spaces, each of which can have a different number of values

## Modifying the environment via wrappers

Wrappers are a convenient way to modify an existing environment without altering its underlying code.

Common wrappers:
* **TimeLimit**: Issue a truncated signal if a maximum number of timesteps has been exceeded (or the base environment has issued a truncated signal).

* **ClipAction**: Clip the action such that it lies in the action space (of type Box).

* **RescaleAction**: Rescale actions to lie in a specified interval.

* **TimeAwareObservation**: Add information about the index of the timestep to the observation. This can help ensure that transitions are Markov.
