# Gymnasium RL training environment

- Successor of OpenAI Gym
- Has multiple environments available for training reinforcement learning agents
- Two famous problems: MountainCar (continuous assignment) and CartPole (final assignment)


[**Mountaincar**](https://gymnasium.farama.org/main/environments/classic_control/mountain_car/)

Drive the mountain car up a hill by accelerating left or right to build momentum
<img src="http://drive.google.com/uc?export=view&id=1XJj3Bju-mqZO8S9JT9QEfMAnZqAjj5X2" width=45%>

[**Cartpole**](https://gymnasium.farama.org/main/environments/classic_control/cart_pole/)

Balance a pole on a cart 

Maximise the time (max 10s) for which the pole stays upright

The RL agent moves the cart to balance the pole

<img src="https://drive.google.com/uc?export=download&id=1wiFksyB3-mcirfdZEvrT2DPD7SBEjye2" >

### How to tell Gymnasium to start the simulation for the RL task
* Find the RL task on Gymnasium's [webpage](https://gymnasium.farama.org/) in the Environments section
* Copy the name of the task, e.g. "MountainCar-v0"
* Pass the name to the *make()* function: gymnasium.make("MountainCar-v0")




#### Import gymnasium

In [None]:
!pip install gymnasium # if necessary

In [None]:
import gymnasium as gym

#### Use the make() function to set up the simulation
* The *make()* function takes the name of the RL task as its first argument, which is of type *str*
* The RL task together with its simulation is usually called the environment in RL


In [None]:
env = gym.make("MountainCar-v0")


#### Use the reset() function to reset the environment to its initial state
* This starts the simulation

In [None]:
state, info = env.reset(seed=1, options={})  # starts the environment; returns the initial state and info about the environment
state

#### An action can be passed into the environment and a new state obtained

state, reward, terminated, truncated, info = env.step(action)

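The *step()* function returns
* The new state/observation
* The reward
* Whether the episode has terminated or been truncated (both end the episode: terminated means the task itself has ended, e.g. the goal was reached, while truncated means an external limit such as a time limit was reached)
* Any additional info about the environment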

#### Schematic gameplay

import gymnasium as gym

env = gym.make("LunarLander-v2", render_mode="human")  # needs Box2D; newer Gymnasium releases name this task "LunarLander-v3"

observation, info = env.reset(seed=123, options={})

done = False

while not done:

    action = env.action_space.sample()  # agent policy that uses the observation and info

    observation, reward, terminated, truncated, info = env.step(action)

    done = terminated or truncated

env.close()

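#### After setup, you can visually inspect the environment at any time by calling the render() function
* The *render()* function only produces output if a render mode (e.g. "human" or "rgb_array") was passed to *make()* via the *render_mode* argument
* The *render()* function only works if *reset()* was called before. Otherwise you will get a black screen.
* Rendering in a window ("human" mode) does not work in Google Colab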

In [None]:
env.render()

#### To close a session, use the *close()* function

In [None]:
env.close()