Playing with a more complex reinforcement learning setting involving cars and racing.

# Playing with Stable Baselines 3 

In [10]:
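# First, train a DQN agent with Stable Baselines 3 on CartPole to serve as a baseline.

import gymnasium as gym

from stable_baselines3 import DQN

# "rgb_array" lets the vectorized env render frames on demand after training;
# without a render_mode, vec_env.render() only emits a warning.
env = gym.make("CartPole-v1", render_mode="rgb_array")

LEARN_TIMESTEPS = 100_000  # training budget, in environment steps
TEST_TIMESTEPS = 10_000    # steps to roll out the trained policy

model = DQN("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=LEARN_TIMESTEPS)

# SB3 wraps the env in a VecEnv, which resets itself whenever an episode ends.
vec_env = model.get_env()
obs = vec_env.reset()

for _ in range(TEST_TIMESTEPS):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, done, info = vec_env.step(action)
    vec_env.render("human")

env.close()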

Using cuda device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.
----------------------------------
| rollout/            |          |
|    ep_len_mean      | 15.5     |
|    ep_rew_mean      | 15.5     |
|    exploration_rate | 0.994    |
| time/               |          |
|    episodes         | 4        |
|    fps              | 9515     |
|    time_elapsed     | 0        |
|    total_timesteps  | 62       |
----------------------------------


----------------------------------
| rollout/            |          |
|    ep_len_mean      | 14.2     |
|    ep_rew_mean      | 14.2     |
|    exploration_rate | 0.989    |
| time/               |          |
|    episodes         | 8        |
|    fps              | 10841    |
|    time_elapsed     | 0        |
|    total_timesteps  | 114      |
----------------------------------
----------------------------------
| rollout/            |          |
|    ep_len_mean      | 16.8     |
|    ep_rew_mean      | 16.8     |
|    exploration_rate | 0.981    |
| time/               |          |
|    episodes         | 12       |
|    fps              | 10562    |
|    time_elapsed     | 0        |
|    total_timesteps  | 201      |
----------------------------------
----------------------------------
| rollout/            |          |
|    ep_len_mean      | 20       |
|    ep_rew_mean      | 20       |
|    exploration_rate | 0.97     |
| time/               |          |
|    episodes       
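
The `exploration_rate` column in the log above can be checked by hand. The sketch below assumes DQN's default exploration settings (`exploration_fraction=0.1`, `exploration_initial_eps=1.0`, `exploration_final_eps=0.05`), since the training cell does not override them. `linear_epsilon` is a hypothetical helper written for illustration, not an SB3 function.

```python
def linear_epsilon(step, total_timesteps, fraction=0.1, start=1.0, end=0.05):
    """Linearly anneal epsilon from `start` to `end` over the first
    `fraction` of training, then hold it at `end`."""
    progress = step / total_timesteps
    if progress >= fraction:
        return end
    return start + progress * (end - start) / fraction


LEARN_TIMESTEPS = 100_000

# Timesteps taken from the first three log tables above.
for step in (62, 114, 201):
    print(step, round(linear_epsilon(step, LEARN_TIMESTEPS), 3))
# 62 0.994
# 114 0.989
# 201 0.981
```

Under these assumptions the values match the logged 0.994, 0.989, and 0.981, and epsilon would reach its floor of 0.05 after 10,000 of the 100,000 training steps. Note also that `ep_len_mean` equals `ep_rew_mean` in every table because CartPole gives a reward of 1 per step.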



## Video Rendering

In [11]:
import usefulutils.gymvids as vids

In [12]:
vids.record_video("CartPole-v1", model, video_length=500, prefix="dqn-cartpole")

Saving video to c:\Users\Joaquin\Desktop\Playground\ML\Fiddle\videos\dqn-cartpole-step-0-to-step-500.mp4
Moviepy - Building video c:\Users\Joaquin\Desktop\Playground\ML\Fiddle\videos\dqn-cartpole-step-0-to-step-500.mp4.
Moviepy - Writing video c:\Users\Joaquin\Desktop\Playground\ML\Fiddle\videos\dqn-cartpole-step-0-to-step-500.mp4

Moviepy - Done !
Moviepy - video ready c:\Users\Joaquin\Desktop\Playground\ML\Fiddle\videos\dqn-cartpole-step-0-to-step-500.mp4




In [13]:
vids.show_videos("videos", prefix="dqn")
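
The `show_videos` helper is likewise not shown. One plausible approach, sketched below under that assumption, is to base64-encode every matching MP4 and embed it in an HTML `<video>` tag. `build_video_html` is a hypothetical name used for illustration; it only builds the HTML string, so it can be tried outside a notebook.

```python
import base64
from pathlib import Path


def build_video_html(video_folder, prefix=""):
    """Return an HTML snippet embedding every `<prefix>*.mp4` in `video_folder`."""
    tags = []
    for mp4 in sorted(Path(video_folder).glob(f"{prefix}*.mp4")):
        # Inline the file as a data URI so the notebook needs no file server.
        encoded = base64.b64encode(mp4.read_bytes()).decode("ascii")
        tags.append(
            f'<video alt="{mp4.name}" controls style="height: 400px;">'
            f'<source src="data:video/mp4;base64,{encoded}" type="video/mp4" />'
            "</video>"
        )
    return "<br>".join(tags)
```

In a notebook, the result could be displayed with `IPython.display.display(IPython.display.HTML(build_video_html("videos", prefix="dqn")))`. Inlining works well for short clips like the 500-step CartPole video; large files would inflate the notebook considerably.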