<h1>Step 1: Introduction to OpenAI Gym and Gymnasium</h1>

OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. It provides a wide variety of environments in which agents can be trained to perform tasks, ranging from simple grid-world games to complex simulated physics environments. This notebook uses Gymnasium, the maintained fork of OpenAI Gym from the Farama Foundation, which is imported as <i>gym</i> in the code below.

<h3>a. Installation:</h3>
- The cells below install the <i>gymnasium</i> package and its optional extras using pip:<br/>

In [None]:
!pip install wheel setuptools pip --upgrade # make sure we have the latest pip

In [None]:
!pip install swig # needed to build the box2d bindings
!pip install gymnasium # install the core gymnasium package
!pip install 'gymnasium[all]' # install the dependencies for all environments
!pip install 'gymnasium[box2d]' # install the box2d dependencies (already covered by [all])

<b>!pip install swig:</b> Installs SWIG, a software development tool that connects programs written in C and C++ with a variety of high-level programming languages. It is needed to build the Box2D bindings used by the box2d environments.

<b>!pip install gymnasium:</b> Installs the core gymnasium library, which provides the environment API and the basic environments.

<b>!pip install 'gymnasium[all]':</b> Installs the optional dependencies for all environment families provided by the gymnasium library.

<b>!pip install 'gymnasium[box2d]':</b> Installs the dependencies for the box2d environments only. Box2D is a 2D physics engine for games.

In [None]:
!pip install matplotlib 
!pip install numpy
!pip install seaborn
!pip install tqdm

<h3>b. Environments:</h3>
Gymnasium provides a collection of environments, each representing a different task or problem.
Environments are instantiated using the <i>gym.make()</i> function, passing the environment ID (for example <i>FrozenLake-v1</i>) as the first argument. Additional keyword arguments configure the environment.

In [36]:
import gymnasium as gym
from tqdm import tqdm
from IPython.display import display, clear_output
import time

In [42]:
desc = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]

In [43]:
# desc supplies a custom map; when desc is given, map_name is not used to pick a preset map
env = gym.make('FrozenLake-v1', desc=desc, map_name="4x4", is_slippery=True, render_mode="human")

<h3>c. Basic Methods:</h3>
<b>reset():</b> This method resets the environment to its initial state and returns two values: the initial observation and an info dictionary.

<b>step():</b> This method takes an action as input, executes it in the environment, and returns five values: observation, reward, terminated, truncated and info. <br>
`observation:` The observation of the environment after taking the action.<br>
`reward:` The reward obtained from taking the action.<br>
`terminated:` A flag that is True when the episode has reached a terminal state of the task (in FrozenLake, falling into a hole or reaching the goal).<br>
`truncated:` A flag that is True when the episode has been cut short from outside the task, typically by a time limit.<br>
`info:` Additional information about the environment (usually used for debugging or analysis).<br>

In [44]:
observation, info = env.reset()

for _ in tqdm(range(100)):
    action = env.action_space.sample()  # random action; a trained agent would choose one based on the observation
    observation, reward, terminated, truncated, info = env.step(action)
    print(f'reward = {reward} terminated = {terminated} truncated = {truncated} {info}')
    if terminated or truncated:
        observation, info = env.reset()
    clear_output(wait=True)
    time.sleep(0.1)
env.close()

100%|█████████████████████████████████████████| 100/100 [00:28<00:00,  3.49it/s]


<h3>d. Environment Attributes:</h3>

`observation_space:` The observation space of the environment, describing the set of valid observations. In FrozenLake each observation is the index of the tile the agent is standing on.<br>
`action_space:` The action space of the environment, describing the set of actions the agent can take.<br>
`render_mode:` The mode that specifies how the environment is visualised (for example `"human"`, as passed to <i>gym.make()</i> above).

<h1>Step 2: Exploring Markov Decision Processes (MDP)</h1>

<h3>FrozenLake-v1 as an MDP:</h3>

![frozen_lake.gif](attachment:f8ff9260-fbb6-49ab-9aeb-0dc1c58defa9.gif)

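FrozenLake-v1 is a classic reinforcement learning environment provided by OpenAI Gym and Gymnasium. It simulates a grid world in which an agent navigates across a frozen lake to reach a goal tile while avoiding holes. In the map description, `S` is the start tile, `F` is a frozen (safe) tile, `H` is a hole and `G` is the goal.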
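
To make the MDP view concrete, the sketch below writes out a transition model for the 4x4 map used earlier in plain Python, without calling the library. It is an illustration and not the library's implementation: the helper names (`to_state`, `move`, `transitions`) are hypothetical, and it assumes that on slippery ice the agent moves in the intended direction or in one of the two perpendicular directions with probability 1/3 each, and that only reaching the goal gives a reward of 1.

```python
# Map from the earlier cell: S = start, F = frozen, H = hole, G = goal
DESC = ["SFFF", "FHFH", "FFFH", "HFFG"]
N_ROWS, N_COLS = len(DESC), len(DESC[0])

# Action encoding assumed to match FrozenLake: 0 = left, 1 = down, 2 = right, 3 = up
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3
MOVES = {LEFT: (0, -1), DOWN: (1, 0), RIGHT: (0, 1), UP: (-1, 0)}


def to_state(row, col):
    """Flatten a (row, col) grid position into a single state index."""
    return row * N_COLS + col


def move(row, col, action):
    """Apply one deterministic move, staying in place at the grid border."""
    d_row, d_col = MOVES[action]
    new_row = min(max(row + d_row, 0), N_ROWS - 1)
    new_col = min(max(col + d_col, 0), N_COLS - 1)
    return new_row, new_col


def transitions(state, action, slippery=True):
    """Return a list of (probability, next_state, reward, terminated) tuples."""
    row, col = divmod(state, N_COLS)
    if DESC[row][col] in "HG":
        # Holes and the goal are terminal: the agent stays there with no reward
        return [(1.0, state, 0.0, True)]
    # On slippery ice the executed action may be perpendicular to the intended one
    if slippery:
        candidates = [(action - 1) % 4, action, (action + 1) % 4]
    else:
        candidates = [action]
    outcomes = []
    for executed in candidates:
        new_row, new_col = move(row, col, executed)
        tile = DESC[new_row][new_col]
        reward = 1.0 if tile == "G" else 0.0
        outcomes.append((1.0 / len(candidates), to_state(new_row, new_col), reward, tile in "HG"))
    return outcomes


# Example: the tile directly left of the goal, trying to move right
example = transitions(to_state(3, 2), RIGHT)
for prob, next_state, reward, terminated in example:
    print(f"p={prob:.2f} next_state={next_state} reward={reward} terminated={terminated}")
```

Each entry corresponds to one term of the MDP's transition function P(s' | s, a) together with its reward, which is the information that planning methods such as value iteration operate on.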