# Stable Baselines 3

[Stable Baselines3 (SB3)](https://stable-baselines3.readthedocs.io/en/master/) is a set of reliable implementations of reinforcement learning algorithms in PyTorch.

In this notebook, you will use the SB3 library to train a PPO agent on the LunarLander environment.

Write code to define and train the agent.

Also include an evaluation of the agent (using SB3) and a visualization of the agent's performance in the form of a video.

In [None]:
%pip install -q swig
%pip install -q "gymnasium[box2d]"
%pip install -q gym[atari]
%pip install stable-baselines3[extra]

## Imports

Stable Baselines3 works with environments that follow the [Gymnasium interface](https://stable-baselines3.readthedocs.io/en/master/guide/custom_env.html).


In [None]:
import gymnasium as gym
import numpy as np

# stable_baselines3 imports
from stable_baselines3.common.env_util import make_vec_env

# Import remaining sb3 imports here



## Creating the Environment

In [None]:
# Parallel environments. A vectorized environment steps several copies of the environment together, so the agent collects experience from all of them at once.
vec_env = make_vec_env("LunarLander-v2", n_envs=4, wrapper_class=gym.wrappers.TimeLimit, wrapper_kwargs={"max_episode_steps":500})

### Solve here

Write the code to define, train, and evaluate the agent:

### Visualization

You are provided with some functions that will help you visualize the results as a video.
Feel free to write your own visualization code if you prefer.

In [None]:
# For visualization
from IPython.display import HTML
from IPython import display
import glob
import base64, io, os, shutil
from stable_baselines3.common.vec_env import VecVideoRecorder, DummyVecEnv

os.environ['SDL_VIDEODRIVER']='dummy'

In [None]:
shutil.rmtree('video', ignore_errors=True)
os.makedirs("video", exist_ok=True)

def show_video():
    mp4list = glob.glob('video/*.mp4')
    if len(mp4list) > 0:
        mp4 = mp4list[0]
        # Read the file in binary mode and close it afterwards
        with open(mp4, 'rb') as f:
            video = f.read()
        encoded = base64.b64encode(video)
        display.display(HTML(data='''<video alt="test" autoplay
                loop controls style="height: 400px;">
                <source src="data:video/mp4;base64,{0}" type="video/mp4" />
             </video>'''.format(encoded.decode('ascii'))))
    else:
        print("Could not find video")

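def show_video_of_model(model=None, video_length=500):
    """
    Record a video of an agent acting in LunarLander-v2.

    :param model: (RL model) trained SB3 model; if None, random actions are used
    :param video_length: (int) number of steps to record
    """
    eval_env = make_vec_env("LunarLander-v2", n_envs=1)
    # Start the video at step=0 and record video_length steps
    eval_env = VecVideoRecorder(
        eval_env,
        video_folder="video/",
        record_video_trigger=lambda step: step == 0,
        video_length=video_length,
        name_prefix="",
    )

    obs = eval_env.reset()
    for _ in range(video_length):
        # Write your code to choose an action here.
        # The vectorized env expects one action per sub-environment.
        if model is None:
            # Placeholder: random action so the function runs without a model
            action = [eval_env.action_space.sample()]
        else:
            action, _ = model.predict(obs, deterministic=True)

        obs, _, _, _ = eval_env.step(action)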

    # Close the video recorder
    eval_env.close()

In [None]:
show_video_of_model()

Saving video to /content/video/-step-0-to-step-500.mp4
Moviepy - Building video /content/video/-step-0-to-step-500.mp4.
Moviepy - Writing video /content/video/-step-0-to-step-500.mp4





Moviepy - Done !
Moviepy - video ready /content/video/-step-0-to-step-500.mp4


In [None]:
show_video()