In [1]:
import gymnasium as gym

#### 1. Set up the environment with a render mode so it can be observed

In [None]:
env = gym.make('LunarLander-v2', render_mode='human')

#### 2. Reset the environment

In [None]:
env.reset()

#### 3. Sample Actions, Observation Space and Sample Observation Space

In [None]:
print('sample action', env.action_space.sample())
print('observation space shape', env.observation_space.shape)
# sample() must be called; without the parentheses the bound method is printed
print('sample observation', env.observation_space.sample())

#### 4. Close the environment

In [None]:
env.close()

#### 5. The final reward is -100 when the episode ends in a crash. With random actions, a final reward other than -100 was first observed around the 141st episode

In [None]:
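reward = -100
episode = 0
while reward == -100:
    episode += 1
    env = gym.make('LunarLander-v2', render_mode='human')
    env.reset()
    for step in range(200):
        # Gymnasium's step() returns (obs, reward, terminated, truncated, info);
        # with render_mode='human' the window is refreshed on every step
        obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
        if terminated or truncated:
            # Stop here so `reward` is the final reward of the episode
            break
    env.close()
    print(episode, reward)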

##### Key principles to consider in a game

##### Gains in RL algorithms (ranked)
- Altering algorithms
- Altering reward space parameters
- Hyperparameter tuning for the algorithm

Algorithm guide: https://stable-baselines3.readthedocs.io/en/master/guide/algos.html
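
Hyperparameter tuning, the third item in the ranking above, usually starts by enumerating candidate configurations. The following is a minimal sketch: `learning_rate`, `n_steps`, and `gamma` are keyword arguments accepted by Stable-Baselines3's `PPO`, but the candidate values and the `build_grid` helper are illustrative assumptions, not recommendations.

```python
from itertools import product


def build_grid(search_space):
    """Expand a dict of candidate values into a list of keyword-argument dicts."""
    names = list(search_space)
    combos = product(*(search_space[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


# Illustrative candidate values only (assumptions, not tuned settings)
search_space = {
    "learning_rate": [3e-4, 1e-4],
    "n_steps": [1024, 2048],
    "gamma": [0.99, 0.999],
}

configs = build_grid(search_space)
print(len(configs))  # 8
print(configs[0])    # {'learning_rate': 0.0003, 'n_steps': 1024, 'gamma': 0.99}
```

Each dictionary could then be unpacked into the constructor, for example `PPO('MlpPolicy', env, **config)`, and the resulting runs compared on `ep_rew_mean`.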

#### 6. Train via A2C

In [None]:
from stable_baselines3 import A2C

In [2]:
env = gym.make('LunarLander-v2')

In [None]:
model = A2C('MlpPolicy', env, verbose=1)

In [None]:
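model.learn(total_timesteps=100000)
episodes = 10
for ep in range(episodes):
    # Gymnasium's reset() returns (obs, info)
    obs, info = env.reset()
    done = False
    while not done:
        # Act with the trained policy instead of sampling random actions
        action, _states = model.predict(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        done = terminated or truncated
# To watch the agent, create the env with render_mode='human'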

In [None]:
env.close()

#### 7. Try PPO

In [6]:
from stable_baselines3 import PPO

In [None]:
model = PPO('MlpPolicy', env, verbose=1)

In [None]:
# model.learn(total_timesteps=10000000)
# episodes = 10
# for ep in range(episodes):
#     obs, info = env.reset()
#     done = False
#     while not done:
#         action, _states = model.predict(obs)
#         obs, reward, terminated, truncated, info = env.step(action)
#         done = terminated or truncated

#### 8. Saving interim models

In [3]:
import os

In [8]:
# Create directories for saving models and logs if they don't exist
models_dir = 'models/PPO'
logdir = 'logs'

os.makedirs(models_dir, exist_ok=True)
os.makedirs(logdir, exist_ok=True)

In [7]:
# logdir must already be defined (cell above) when it is passed to PPO
model = PPO('MlpPolicy', env, verbose=1, tensorboard_log=logdir)

Using cuda device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.


### Understanding the PPO Snippet Below

#### Initialization: `model = PPO('MlpPolicy', env, verbose=1, tensorboard_log=logdir)`

- **`'MlpPolicy'`**: This specifies that the policy network architecture will be a Multi-Layer Perceptron (MLP). MLPs suit low-dimensional vector observations, as opposed to image inputs, for which `CnnPolicy` is intended.
  
- **`env`**: This is the environment object where the agent will be trained. It should comply with the Gymnasium API, providing methods like `reset()` and `step()`.

- **`verbose=1`**: This sets the logging level to info, meaning that training progress will be printed to the console. This is useful for debugging and monitoring.

- **`tensorboard_log=logdir`**: This sets the directory where TensorBoard logs are written.
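
To make the `reset()` / `step()` contract concrete, here is a minimal sketch of a toy environment that mimics the Gymnasium call signatures in plain Python. `CountdownEnv` and `run_random_episode` are hypothetical names invented for this illustration. This is not a real `gymnasium.Env` subclass and has nothing to do with the LunarLander dynamics.

```python
import random


class CountdownEnv:
    """Toy stand-in that mimics the Gymnasium reset()/step() signatures.

    Hypothetical example: the counter starts at `start`; action 1 lowers it
    by one, action 0 leaves it unchanged. Reaching 0 ends the episode.
    """

    def __init__(self, start=5, max_steps=20):
        self.start = start
        self.max_steps = max_steps
        self.height = start
        self.steps = 0

    def reset(self):
        # Gymnasium-style reset returns (observation, info)
        self.height = self.start
        self.steps = 0
        return self.height, {}

    def step(self, action):
        # Gymnasium-style step returns
        # (observation, reward, terminated, truncated, info)
        self.steps += 1
        if action == 1:
            self.height -= 1
        terminated = self.height == 0  # the task itself is finished
        truncated = self.steps >= self.max_steps and not terminated  # time limit hit
        reward = 10.0 if terminated else -1.0
        return self.height, reward, terminated, truncated, {}


def run_random_episode(env, seed=0):
    """Roll out one episode with random actions and return the total reward."""
    rng = random.Random(seed)
    obs, info = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = rng.choice([0, 1])
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        done = terminated or truncated
    return total_reward
```

The loop ends when either flag is set: `terminated` means the task finished (for LunarLander, a landing or a crash), while `truncated` means a time limit cut the episode short.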

#### Training Loop: `for i in range(1, 500):`

- **`TIMESTEPS = 10000`**: This sets the number of timesteps for which the model will be trained in each iteration of the loop. 10,000 timesteps is a reasonable starting point for many problems but may need to be adjusted based on the complexity of the task.

- **`model.learn(...)`**: This is where the actual training happens.

  - **`total_timesteps=TIMESTEPS`**: Specifies the number of timesteps for this training iteration.
  
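  - **`reset_num_timesteps=False`**: This keeps the timestep counter running across calls instead of resetting it to zero, so every iteration is logged to the same run (`logs\PPO_0` in the output) and forms one continuous curve. The model's weights carry over between `learn()` calls either way.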
  
  - **`tb_log_name='PPO'`**: This sets the name of the TensorBoard log, useful for monitoring training metrics.

- **`model.save(f"{models_dir}/{TIMESTEPS*i}")`**: This saves the model after each training iteration. The filename is the nominal number of timesteps the model has been trained for (`TIMESTEPS * i`), which is useful for keeping track of training progress and for potential rollbacks to previous states.
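
How `total_timesteps` and `reset_num_timesteps=False` interact can be sketched with plain arithmetic. The snippet below is a simplified model, not Stable-Baselines3 code. It assumes PPO's default rollout size of `n_steps=2048` with a single environment, and that each `learn()` call keeps collecting whole rollouts until the running counter reaches the previous count plus the requested budget. `simulate_learn_calls` is a hypothetical helper.

```python
def simulate_learn_calls(requested, n_calls, n_steps=2048, n_envs=1):
    """Return the running timestep counter after each simulated learn() call."""
    num_timesteps = 0
    totals = []
    for _ in range(n_calls):
        # With reset_num_timesteps=False the budget is added to the current count
        target = num_timesteps + requested
        # Rollouts are collected in whole chunks of n_steps * n_envs
        while num_timesteps < target:
            num_timesteps += n_steps * n_envs
        totals.append(num_timesteps)
    return totals


print(simulate_learn_calls(10_000, 3))  # [10240, 20480, 30720]
```

Under these assumptions each call advances the counter by 10,240 steps rather than 10,000, which is consistent with the `total_timesteps` values in the training output further down (for example 92,160 at the end of the ninth call).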

The rationale behind this code is to incrementally train a PPO model for up to 499 iterations (`range(1, 500)`), each with 10,000 timesteps, while saving the model after each one. This allows for monitoring and potentially resuming training from a specific point. The verbose logging and TensorBoard support provide avenues for debugging and performance tracking.
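
Because each checkpoint is named after its step count, choosing one to resume from comes down to parsing file names. A minimal sketch, assuming checkpoints are stored as `<steps>.zip` inside `models_dir` (Stable-Baselines3 appends `.zip` when the path has no extension). `latest_checkpoint` is a hypothetical helper, not part of the library.

```python
from pathlib import Path


def latest_checkpoint(models_dir):
    """Return the path of the checkpoint with the highest step count, or None."""
    candidates = []
    for path in Path(models_dir).glob("*.zip"):
        # Keep only files whose stem is a pure step count, e.g. "20000.zip"
        if path.stem.isdigit():
            candidates.append((int(path.stem), path))
    if not candidates:
        return None
    # Compare numerically so that 100000 ranks above 20000
    return max(candidates, key=lambda item: item[0])[1]
```

The returned path could then be passed to `PPO.load(path, env=env)` to evaluate that snapshot or to continue training from it.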

In [9]:
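model = PPO('MlpPolicy', env, verbose=1, tensorboard_log=logdir)

TIMESTEPS = 10000
for i in range(1, 500):
    # Keep the timestep counter and TensorBoard run going across iterations
    model.learn(total_timesteps=TIMESTEPS, reset_num_timesteps=False, tb_log_name='PPO')
    # Save a checkpoint named after the nominal number of timesteps so far
    model.save(f"{models_dir}/{TIMESTEPS*i}")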



Using cuda device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 94.8     |
|    ep_rew_mean     | -172     |
| time/              |          |
|    fps             | 300      |
|    iterations      | 1        |
|    time_elapsed    | 6        |
|    total_timesteps | 2048     |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 97.8        |
|    ep_rew_mean          | -180        |
| time/                   |             |
|    fps                  | 388         |
|    iterations           | 2           |
|    time_elapsed         | 10          |
|    total_timesteps      | 4096        |
| train/                  |             |
|    approx_kl            | 0.008454743 |
|    clip_fraction        | 0.0135      |
|    clip_range           | 0.2         |
-----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 137         |
|    ep_rew_mean          | -147        |
| time/                   |             |
|    fps                  | 699         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 24576       |
| train/                  |             |
|    approx_kl            | 0.010449658 |
|    clip_fraction        | 0.0281      |
|    clip_range           | 0.2         |
|    entropy_loss         | -1.24       |
|    explained_variance   | 0.0065      |
|    learning_rate        | 0.0003      |
|    loss                 | 90          |
|    n_updates            | 110         |
|    policy_gradient_loss | -0.00791    |
|    value_loss           | 244         |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 260         |
|    ep_rew_mean          | -81.1       |
| time/                   |             |
|    fps                  | 641         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 47104       |
| train/                  |             |
|    approx_kl            | 0.007481703 |
|    clip_fraction        | 0.0417      |
|    clip_range           | 0.2         |
|    entropy_loss         | -1.12       |
|    explained_variance   | 0.826       |
|    learning_rate        | 0.0003      |
|    loss                 | 32.3        |
|    n_updates            | 220         |
|    policy_gradient_loss | -0.00495    |
|    value_loss           | 72.8        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 413          |
|    ep_rew_mean          | -50.2        |
| time/                   |              |
|    fps                  | 599          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 69632        |
| train/                  |              |
|    approx_kl            | 0.0066982806 |
|    clip_fraction        | 0.0438       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.836       |
|    explained_variance   | 0.782        |
|    learning_rate        | 0.0003       |
|    loss                 | 17.6         |
|    n_updates            | 330          |
|    policy_gradient_loss | -0.00374     |
|    value_loss           | 42.4         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 556          |
|    ep_rew_mean          | -9.75        |
| time/                   |              |
|    fps                  | 605          |
|    iterations           | 5            |
|    time_elapsed         | 16           |
|    total_timesteps      | 92160        |
| train/                  |              |
|    approx_kl            | 0.0034285428 |
|    clip_fraction        | 0.0127       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.906       |
|    explained_variance   | 0.317        |
|    learning_rate        | 0.0003       |
|    loss                 | 47.5         |
|    n_updates            | 440          |
|    policy_gradient_loss | -0.00173     |
|    value_loss           | 109          |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 655      |
|    ep_rew_mean     | 88.4     |
| time/              |          |
|    fps             | 906      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 114688   |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 646          |
|    ep_rew_mean          | 88.7         |
| time/                   |              |
|    fps                  | 691          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 116736       |
| train/                  |              |
|    approx_kl            | 0.0046108086 |
|    clip_fraction        | 0.066        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.875       |
|    explained_variance   | 0.636        |
------------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 410         |
|    ep_rew_mean          | 139         |
| time/                   |             |
|    fps                  | 685         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 137216      |
| train/                  |             |
|    approx_kl            | 0.008505649 |
|    clip_fraction        | 0.0701      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.78       |
|    explained_variance   | 0.531       |
|    learning_rate        | 0.0003      |
|    loss                 | 411         |
|    n_updates            | 660         |
|    policy_gradient_loss | -0.00486    |
|    value_loss           | 261         |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 392          |
|    ep_rew_mean          | 111          |
| time/                   |              |
|    fps                  | 643          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 159744       |
| train/                  |              |
|    approx_kl            | 0.0048869424 |
|    clip_fraction        | 0.0558       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.904       |
|    explained_variance   | 0.891        |
|    learning_rate        | 0.0003       |
|    loss                 | 12           |
|    n_updates            | 770          |
|    policy_gradient_loss | -0.00422     |
|    value_loss           | 65.8         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 466         |
|    ep_rew_mean          | 120         |
| time/                   |             |
|    fps                  | 621         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 182272      |
| train/                  |             |
|    approx_kl            | 0.006825731 |
|    clip_fraction        | 0.069       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.826      |
|    explained_variance   | 0.826       |
|    learning_rate        | 0.0003      |
|    loss                 | 19.3        |
|    n_updates            | 880         |
|    policy_gradient_loss | -0.00691    |
|    value_loss           | 77.5        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 408         |
|    ep_rew_mean          | 133         |
| time/                   |             |
|    fps                  | 612         |
|    iterations           | 5           |
|    time_elapsed         | 16          |
|    total_timesteps      | 204800      |
| train/                  |             |
|    approx_kl            | 0.005642321 |
|    clip_fraction        | 0.0427      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.628      |
|    explained_variance   | 0.672       |
|    learning_rate        | 0.0003      |
|    loss                 | 116         |
|    n_updates            | 990         |
|    policy_gradient_loss | -0.00478    |
|    value_loss           | 267         |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 347      |
|    ep_rew_mean     | 130      |
| time/              |          |
|    fps             | 905      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 227328   |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 353          |
|    ep_rew_mean          | 125          |
| time/                   |              |
|    fps                  | 696          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 229376       |
| train/                  |              |
|    approx_kl            | 0.0052898256 |
|    clip_fraction        | 0.0287       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.713       |
|    explained_variance   | 0.784        |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 367          |
|    ep_rew_mean          | 108          |
| time/                   |              |
|    fps                  | 694          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 249856       |
| train/                  |              |
|    approx_kl            | 0.0048708366 |
|    clip_fraction        | 0.0585       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.711       |
|    explained_variance   | 0.832        |
|    learning_rate        | 0.0003       |
|    loss                 | 7.65         |
|    n_updates            | 1210         |
|    policy_gradient_loss | -0.00399     |
|    value_loss           | 63           |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 384         |
|    ep_rew_mean          | 127         |
| time/                   |             |
|    fps                  | 648         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 272384      |
| train/                  |             |
|    approx_kl            | 0.002480038 |
|    clip_fraction        | 0.0295      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.744      |
|    explained_variance   | 0.672       |
|    learning_rate        | 0.0003      |
|    loss                 | 179         |
|    n_updates            | 1320        |
|    policy_gradient_loss | -0.00139    |
|    value_loss           | 236         |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 346          |
|    ep_rew_mean          | 145          |
| time/                   |              |
|    fps                  | 627          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 294912       |
| train/                  |              |
|    approx_kl            | 0.0067876973 |
|    clip_fraction        | 0.0621       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.71        |
|    explained_variance   | 0.937        |
|    learning_rate        | 0.0003       |
|    loss                 | 12.6         |
|    n_updates            | 1430         |
|    policy_gradient_loss | -0.00409     |
|    value_loss           | 52.6         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 301          |
|    ep_rew_mean          | 139          |
| time/                   |              |
|    fps                  | 608          |
|    iterations           | 5            |
|    time_elapsed         | 16           |
|    total_timesteps      | 317440       |
| train/                  |              |
|    approx_kl            | 0.0034789382 |
|    clip_fraction        | 0.0282       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.584       |
|    explained_variance   | 0.863        |
|    learning_rate        | 0.0003       |
|    loss                 | 30.8         |
|    n_updates            | 1540         |
|    policy_gradient_loss | -0.0034      |
|    value_loss           | 162          |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 261      |
|    ep_rew_mean     | 141      |
| time/              |          |
|    fps             | 863      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 339968   |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 261          |
|    ep_rew_mean          | 145          |
| time/                   |              |
|    fps                  | 684          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 342016       |
| train/                  |              |
|    approx_kl            | 0.0045902985 |
|    clip_fraction        | 0.0524       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.648       |
|    explained_variance   | 0.866        |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 286          |
|    ep_rew_mean          | 151          |
| time/                   |              |
|    fps                  | 681          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 362496       |
| train/                  |              |
|    approx_kl            | 0.0054167993 |
|    clip_fraction        | 0.0471       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.643       |
|    explained_variance   | 0.845        |
|    learning_rate        | 0.0003       |
|    loss                 | 31.8         |
|    n_updates            | 1760         |
|    policy_gradient_loss | -0.00382     |
|    value_loss           | 183          |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 329         |
|    ep_rew_mean          | 162         |
| time/                   |             |
|    fps                  | 638         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 385024      |
| train/                  |             |
|    approx_kl            | 0.004681733 |
|    clip_fraction        | 0.0292      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.682      |
|    explained_variance   | 0.844       |
|    learning_rate        | 0.0003      |
|    loss                 | 19.1        |
|    n_updates            | 1870        |
|    policy_gradient_loss | -0.00289    |
|    value_loss           | 125         |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 322          |
|    ep_rew_mean          | 195          |
| time/                   |              |
|    fps                  | 633          |
|    iterations           | 4            |
|    time_elapsed         | 12           |
|    total_timesteps      | 407552       |
| train/                  |              |
|    approx_kl            | 0.0040909173 |
|    clip_fraction        | 0.0487       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.579       |
|    explained_variance   | 0.747        |
|    learning_rate        | 0.0003       |
|    loss                 | 24.8         |
|    n_updates            | 1980         |
|    policy_gradient_loss | -0.00296     |
|    value_loss           | 63.8         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 296          |
|    ep_rew_mean          | 210          |
| time/                   |              |
|    fps                  | 612          |
|    iterations           | 5            |
|    time_elapsed         | 16           |
|    total_timesteps      | 430080       |
| train/                  |              |
|    approx_kl            | 0.0040967055 |
|    clip_fraction        | 0.0484       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.58        |
|    explained_variance   | 0.814        |
|    learning_rate        | 0.0003       |
|    loss                 | 26           |
|    n_updates            | 2090         |
|    policy_gradient_loss | -0.0018      |
|    value_loss           | 68.8         |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 268      |
|    ep_rew_mean     | 214      |
| time/              |          |
|    fps             | 925      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 452608   |
---------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 265        |
|    ep_rew_mean          | 215        |
| time/                   |            |
|    fps                  | 698        |
|    iterations           | 2          |
|    time_elapsed         | 5          |
|    total_timesteps      | 454656     |
| train/                  |            |
|    approx_kl            | 0.00294402 |
|    clip_fraction        | 0.0332     |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.588     |
|    explained_variance   | 0.804      |
----------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 256          |
|    ep_rew_mean          | 219          |
| time/                   |              |
|    fps                  | 685          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 475136       |
| train/                  |              |
|    approx_kl            | 0.0028534164 |
|    clip_fraction        | 0.0339       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.514       |
|    explained_variance   | 0.835        |
|    learning_rate        | 0.0003       |
|    loss                 | 7.45         |
|    n_updates            | 2310         |
|    policy_gradient_loss | -0.00356     |
|    value_loss           | 33.3         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 253          |
|    ep_rew_mean          | 237          |
| time/                   |              |
|    fps                  | 648          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 497664       |
| train/                  |              |
|    approx_kl            | 0.0042896387 |
|    clip_fraction        | 0.039        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.504       |
|    explained_variance   | 0.851        |
|    learning_rate        | 0.0003       |
|    loss                 | 72.1         |
|    n_updates            | 2420         |
|    policy_gradient_loss | -0.000411    |
|    value_loss           | 87.9         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 242         |
|    ep_rew_mean          | 242         |
| time/                   |             |
|    fps                  | 583         |
|    iterations           | 4           |
|    time_elapsed         | 14          |
|    total_timesteps      | 520192      |
| train/                  |             |
|    approx_kl            | 0.004425668 |
|    clip_fraction        | 0.0599      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.546      |
|    explained_variance   | 0.835       |
|    learning_rate        | 0.0003      |
|    loss                 | 5.85        |
|    n_updates            | 2530        |
|    policy_gradient_loss | -0.00271    |
|    value_loss           | 20.6        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 235          |
|    ep_rew_mean          | 236          |
| time/                   |              |
|    fps                  | 588          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 542720       |
| train/                  |              |
|    approx_kl            | 0.0035305815 |
|    clip_fraction        | 0.0085       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.538       |
|    explained_variance   | 0.759        |
|    learning_rate        | 0.0003       |
|    loss                 | 40.9         |
|    n_updates            | 2640         |
|    policy_gradient_loss | -0.00129     |
|    value_loss           | 139          |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 234      |
|    ep_rew_mean     | 248      |
| time/              |          |
|    fps             | 843      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 565248   |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 230         |
|    ep_rew_mean          | 250         |
| time/                   |             |
|    fps                  | 676         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 567296      |
| train/                  |             |
|    approx_kl            | 0.007722054 |
|    clip_fraction        | 0.0567      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.503      |
|    explained_variance   | 0.772       |
-----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 232        |
|    ep_rew_mean          | 262        |
| time/                   |            |
|    fps                  | 647        |
|    iterations           | 2          |
|    time_elapsed         | 6          |
|    total_timesteps      | 587776     |
| train/                  |            |
|    approx_kl            | 0.00409144 |
|    clip_fraction        | 0.038      |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.48      |
|    explained_variance   | 0.797      |
|    learning_rate        | 0.0003     |
|    loss                 | 4.14       |
|    n_updates            | 2860       |
|    policy_gradient_loss | -0.00101   |
|    value_loss           | 20         |
----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 233         |
|    ep_rew_mean          | 269         |
| time/                   |             |
|    fps                  | 613         |
|    iterations           | 3           |
|    time_elapsed         | 10          |
|    total_timesteps      | 610304      |
| train/                  |             |
|    approx_kl            | 0.005020002 |
|    clip_fraction        | 0.0548      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.507      |
|    explained_variance   | 0.932       |
|    learning_rate        | 0.0003      |
|    loss                 | 4.51        |
|    n_updates            | 2970        |
|    policy_gradient_loss | -0.0035     |
|    value_loss           | 11.8        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 234 

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 255        |
|    ep_rew_mean          | 252        |
| time/                   |            |
|    fps                  | 622        |
|    iterations           | 4          |
|    time_elapsed         | 13         |
|    total_timesteps      | 632832     |
| train/                  |            |
|    approx_kl            | 0.00963939 |
|    clip_fraction        | 0.045      |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.495     |
|    explained_variance   | 0.909      |
|    learning_rate        | 0.0003     |
|    loss                 | 7.91       |
|    n_updates            | 3080       |
|    policy_gradient_loss | -0.00476   |
|    value_loss           | 24.3       |
----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 249         |

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 225         |
|    ep_rew_mean          | 264         |
| time/                   |             |
|    fps                  | 574         |
|    iterations           | 5           |
|    time_elapsed         | 17          |
|    total_timesteps      | 655360      |
| train/                  |             |
|    approx_kl            | 0.009835326 |
|    clip_fraction        | 0.0963      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.512      |
|    explained_variance   | 0.926       |
|    learning_rate        | 0.0003      |
|    loss                 | 7.48        |
|    n_updates            | 3190        |
|    policy_gradient_loss | -0.0063     |
|    value_loss           | 15.4        |
-----------------------------------------
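
The tables in this output are printed by the stable-baselines3 console logger (`verbose=1`). To compare `ep_rew_mean` against `total_timesteps` across runs, it can help to pull the numbers out of the raw text. The following is a minimal sketch, assuming the log text is available as a string; `parse_sb3_tables` is a hypothetical helper written for this note, not part of stable-baselines3, and the sample table is copied from one of the rollout tables in this log.

```python
import re

# One complete data row: "|    key    | number    |".
# Section headers such as "| rollout/ |" and rows cut off by output
# truncation do not match and are skipped.
ROW = re.compile(r"^\|\s+(\w+)\s+\|\s+(-?\d+(?:\.\d+)?(?:e-?\d+)?)\s+\|$")


def parse_sb3_tables(text):
    """Return one dict of metric -> float per table found in `text`.

    A table that was truncated keeps whatever complete rows it has.
    """
    tables, current = [], None
    for line in text.splitlines():
        line = line.rstrip()
        if line and set(line) == {"-"}:  # border line opens or closes a table
            if current:
                tables.append(current)
            current = {}
            continue
        match = ROW.match(line)
        if match and current is not None:
            current[match.group(1)] = float(match.group(2))
    if current:
        tables.append(current)
    return tables


sample = """\
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 256      |
|    ep_rew_mean     | 253      |
| time/              |          |
|    fps             | 879      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 677888   |
---------------------------------
"""

tables = parse_sb3_tables(sample)
# Keep only tables that carry both values needed for a reward curve.
curve = [(t["total_timesteps"], t["ep_rew_mean"])
         for t in tables
         if "total_timesteps" in t and "ep_rew_mean" in t]
print(curve)
```

Filtering on both keys drops tables that were cut off before `total_timesteps` was printed, so truncated chunks do not produce incomplete points.
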
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 225  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 256      |
|    ep_rew_mean     | 253      |
| time/              |          |
|    fps             | 879      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 677888   |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 264         |
|    ep_rew_mean          | 251         |
| time/                   |             |
|    fps                  | 680         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 679936      |
| train/                  |             |
|    approx_kl            | 0.005275095 |
|    clip_fraction        | 0.0472      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.56       |
|    explained_variance   | 0.978       |

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 238          |
|    ep_rew_mean          | 252          |
| time/                   |              |
|    fps                  | 648          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 700416       |
| train/                  |              |
|    approx_kl            | 0.0060318066 |
|    clip_fraction        | 0.0593       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.589       |
|    explained_variance   | 0.904        |
|    learning_rate        | 0.0003       |
|    loss                 | 13.9         |
|    n_updates            | 3410         |
|    policy_gradient_loss | -0.00315     |
|    value_loss           | 19.7         |
------------------------------------------
-----------------------------------------
| rollout/                |             |

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 231         |
|    ep_rew_mean          | 264         |
| time/                   |             |
|    fps                  | 608         |
|    iterations           | 3           |
|    time_elapsed         | 10          |
|    total_timesteps      | 722944      |
| train/                  |             |
|    approx_kl            | 0.002886696 |
|    clip_fraction        | 0.0499      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.465      |
|    explained_variance   | 0.719       |
|    learning_rate        | 0.0003      |
|    loss                 | 14.4        |
|    n_updates            | 3520        |
|    policy_gradient_loss | -0.00363    |
|    value_loss           | 114         |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 231 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 237         |
|    ep_rew_mean          | 266         |
| time/                   |             |
|    fps                  | 622         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 745472      |
| train/                  |             |
|    approx_kl            | 0.005928012 |
|    clip_fraction        | 0.0602      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.487      |
|    explained_variance   | 0.873       |
|    learning_rate        | 0.0003      |
|    loss                 | 6.22        |
|    n_updates            | 3630        |
|    policy_gradient_loss | -0.00243    |
|    value_loss           | 18.3        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 237 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 275         |
|    ep_rew_mean          | 252         |
| time/                   |             |
|    fps                  | 623         |
|    iterations           | 5           |
|    time_elapsed         | 16          |
|    total_timesteps      | 768000      |
| train/                  |             |
|    approx_kl            | 0.006079867 |
|    clip_fraction        | 0.0685      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.482      |
|    explained_variance   | 0.936       |
|    learning_rate        | 0.0003      |
|    loss                 | 3.45        |
|    n_updates            | 3740        |
|    policy_gradient_loss | -0.00168    |
|    value_loss           | 31.8        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 278  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 274      |
|    ep_rew_mean     | 257      |
| time/              |          |
|    fps             | 851      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 790528   |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 261         |
|    ep_rew_mean          | 255         |
| time/                   |             |
|    fps                  | 673         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 792576      |
| train/                  |             |
|    approx_kl            | 0.007077243 |
|    clip_fraction        | 0.104       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.558      |
|    explained_variance   | 0.983       |

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 213         |
|    ep_rew_mean          | 270         |
| time/                   |             |
|    fps                  | 691         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 813056      |
| train/                  |             |
|    approx_kl            | 0.007459012 |
|    clip_fraction        | 0.0632      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.462      |
|    explained_variance   | 0.738       |
|    learning_rate        | 0.0003      |
|    loss                 | 39.7        |
|    n_updates            | 3960        |
|    policy_gradient_loss | -0.00484    |
|    value_loss           | 112         |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 211 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 264         |
|    ep_rew_mean          | 262         |
| time/                   |             |
|    fps                  | 620         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 835584      |
| train/                  |             |
|    approx_kl            | 0.013022109 |
|    clip_fraction        | 0.0751      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.614      |
|    explained_variance   | 0.957       |
|    learning_rate        | 0.0003      |
|    loss                 | 4.54        |
|    n_updates            | 4070        |
|    policy_gradient_loss | -0.00105    |
|    value_loss           | 20.9        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 274 

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 283          |
|    ep_rew_mean          | 262          |
| time/                   |              |
|    fps                  | 597          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 858112       |
| train/                  |              |
|    approx_kl            | 0.0064659957 |
|    clip_fraction        | 0.0511       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.403       |
|    explained_variance   | 0.963        |
|    learning_rate        | 0.0003       |
|    loss                 | 2.73         |
|    n_updates            | 4180         |
|    policy_gradient_loss | -0.00277     |
|    value_loss           | 14.3         |
------------------------------------------
------------------------------------------
| rollout/                |              |

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 228         |
|    ep_rew_mean          | 275         |
| time/                   |             |
|    fps                  | 598         |
|    iterations           | 5           |
|    time_elapsed         | 17          |
|    total_timesteps      | 880640      |
| train/                  |             |
|    approx_kl            | 0.004789005 |
|    clip_fraction        | 0.0372      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.403      |
|    explained_variance   | 0.865       |
|    learning_rate        | 0.0003      |
|    loss                 | 7.84        |
|    n_updates            | 4290        |
|    policy_gradient_loss | -0.00119    |
|    value_loss           | 20          |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 231  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 234      |
|    ep_rew_mean     | 265      |
| time/              |          |
|    fps             | 823      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 903168   |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 229          |
|    ep_rew_mean          | 267          |
| time/                   |              |
|    fps                  | 666          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 905216       |
| train/                  |              |
|    approx_kl            | 0.0029463954 |
|    clip_fraction        | 0.037        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.433       |
|    explained_variance   | 0.753   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 247         |
|    ep_rew_mean          | 263         |
| time/                   |             |
|    fps                  | 660         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 925696      |
| train/                  |             |
|    approx_kl            | 0.008161811 |
|    clip_fraction        | 0.0421      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.48       |
|    explained_variance   | 0.924       |
|    learning_rate        | 0.0003      |
|    loss                 | 4.61        |
|    n_updates            | 4510        |
|    policy_gradient_loss | -0.00522    |
|    value_loss           | 36          |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 252   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 217          |
|    ep_rew_mean          | 256          |
| time/                   |              |
|    fps                  | 629          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 948224       |
| train/                  |              |
|    approx_kl            | 0.0063666855 |
|    clip_fraction        | 0.0327       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.379       |
|    explained_variance   | 0.822        |
|    learning_rate        | 0.0003       |
|    loss                 | 13.5         |
|    n_updates            | 4620         |
|    policy_gradient_loss | -0.00123     |
|    value_loss           | 48.2         |
------------------------------------------
------------------------------------------
| rollout/                |              |

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 220          |
|    ep_rew_mean          | 251          |
| time/                   |              |
|    fps                  | 596          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 970752       |
| train/                  |              |
|    approx_kl            | 0.0025734631 |
|    clip_fraction        | 0.0242       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.372       |
|    explained_variance   | 0.871        |
|    learning_rate        | 0.0003       |
|    loss                 | 11.2         |
|    n_updates            | 4730         |
|    policy_gradient_loss | -0.000835    |
|    value_loss           | 32           |
------------------------------------------
------------------------------------------
| rollout/                |              |

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 227          |
|    ep_rew_mean          | 255          |
| time/                   |              |
|    fps                  | 577          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 993280       |
| train/                  |              |
|    approx_kl            | 0.0056849616 |
|    clip_fraction        | 0.0526       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.406       |
|    explained_variance   | 0.903        |
|    learning_rate        | 0.0003       |
|    loss                 | 5.07         |
|    n_updates            | 4840         |
|    policy_gradient_loss | -0.00319     |
|    value_loss           | 19.2         |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 218      |
|    ep_rew_mean     | 248      |
| time/              |          |
|    fps             | 808      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 1015808  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 229          |
|    ep_rew_mean          | 247          |
| time/                   |              |
|    fps                  | 624          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 1017856      |
| train/                  |              |
|    approx_kl            | 0.0048786886 |
|    clip_fraction        | 0.0525       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.49        |
|    explained_variance   | 0.948   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 224         |
|    ep_rew_mean          | 253         |
| time/                   |             |
|    fps                  | 590         |
|    iterations           | 3           |
|    time_elapsed         | 10          |
|    total_timesteps      | 1040384     |
| train/                  |             |
|    approx_kl            | 0.018155053 |
|    clip_fraction        | 0.0681      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.452      |
|    explained_variance   | 0.967       |
|    learning_rate        | 0.0003      |
|    loss                 | 24.3        |
|    n_updates            | 5070        |
|    policy_gradient_loss | -0.00345    |
|    value_loss           | 27.6        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 217 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 237         |
|    ep_rew_mean          | 259         |
| time/                   |             |
|    fps                  | 623         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 1062912     |
| train/                  |             |
|    approx_kl            | 0.009430107 |
|    clip_fraction        | 0.133       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.568      |
|    explained_variance   | 0.932       |
|    learning_rate        | 0.0003      |
|    loss                 | 46          |
|    n_updates            | 5180        |
|    policy_gradient_loss | -0.00207    |
|    value_loss           | 34.6        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 246 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 234         |
|    ep_rew_mean          | 265         |
| time/                   |             |
|    fps                  | 599         |
|    iterations           | 5           |
|    time_elapsed         | 17          |
|    total_timesteps      | 1085440     |
| train/                  |             |
|    approx_kl            | 0.002973344 |
|    clip_fraction        | 0.0329      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.43       |
|    explained_variance   | 0.887       |
|    learning_rate        | 0.0003      |
|    loss                 | 5.05        |
|    n_updates            | 5290        |
|    policy_gradient_loss | -0.000331   |
|    value_loss           | 94.7        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 227  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 232      |
|    ep_rew_mean     | 249      |
| time/              |          |
|    fps             | 893      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 1107968  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 231          |
|    ep_rew_mean          | 252          |
| time/                   |              |
|    fps                  | 693          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 1110016      |
| train/                  |              |
|    approx_kl            | 0.0028518443 |
|    clip_fraction        | 0.0186       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.39        |
|    explained_variance   | 0.655   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 237         |
|    ep_rew_mean          | 272         |
| time/                   |             |
|    fps                  | 638         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 1132544     |
| train/                  |             |
|    approx_kl            | 0.005736466 |
|    clip_fraction        | 0.0567      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.433      |
|    explained_variance   | 0.899       |
|    learning_rate        | 0.0003      |
|    loss                 | 15.2        |
|    n_updates            | 5520        |
|    policy_gradient_loss | -0.00326    |
|    value_loss           | 26.8        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 238   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 210          |
|    ep_rew_mean          | 258          |
| time/                   |              |
|    fps                  | 593          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 1155072      |
| train/                  |              |
|    approx_kl            | 0.0036793614 |
|    clip_fraction        | 0.051        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.4         |
|    explained_variance   | 0.768        |
|    learning_rate        | 0.0003       |
|    loss                 | 55.3         |
|    n_updates            | 5630         |
|    policy_gradient_loss | -0.000503    |
|    value_loss           | 75           |
------------------------------------------
-----------------------------------------
| rollout/                |             |

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 230         |
|    ep_rew_mean          | 275         |
| time/                   |             |
|    fps                  | 551         |
|    iterations           | 5           |
|    time_elapsed         | 18          |
|    total_timesteps      | 1177600     |
| train/                  |             |
|    approx_kl            | 0.017326735 |
|    clip_fraction        | 0.0949      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.427      |
|    explained_variance   | 0.957       |
|    learning_rate        | 0.0003      |
|    loss                 | 8.54        |
|    n_updates            | 5740        |
|    policy_gradient_loss | -0.00476    |
|    value_loss           | 13.6        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 214  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 228      |
|    ep_rew_mean     | 266      |
| time/              |          |
|    fps             | 808      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 1200128  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 235          |
|    ep_rew_mean          | 262          |
| time/                   |              |
|    fps                  | 622          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 1202176      |
| train/                  |              |
|    approx_kl            | 0.0053899325 |
|    clip_fraction        | 0.0545       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.401       |
|    explained_variance   | 0.915   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 224         |
|    ep_rew_mean          | 249         |
| time/                   |             |
|    fps                  | 691         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 1222656     |
| train/                  |             |
|    approx_kl            | 0.011598089 |
|    clip_fraction        | 0.0662      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.462      |
|    explained_variance   | 0.934       |
|    learning_rate        | 0.0003      |
|    loss                 | 33.2        |
|    n_updates            | 5960        |
|    policy_gradient_loss | -0.00146    |
|    value_loss           | 100         |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 214 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 221         |
|    ep_rew_mean          | 257         |
| time/                   |             |
|    fps                  | 645         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 1245184     |
| train/                  |             |
|    approx_kl            | 0.003278918 |
|    clip_fraction        | 0.0363      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.414      |
|    explained_variance   | 0.96        |
|    learning_rate        | 0.0003      |
|    loss                 | 6.66        |
|    n_updates            | 6070        |
|    policy_gradient_loss | -0.00194    |
|    value_loss           | 21.3        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 223 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 228         |
|    ep_rew_mean          | 247         |
| time/                   |             |
|    fps                  | 622         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 1267712     |
| train/                  |             |
|    approx_kl            | 0.006324742 |
|    clip_fraction        | 0.142       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.525      |
|    explained_variance   | 0.987       |
|    learning_rate        | 0.0003      |
|    loss                 | 4.21        |
|    n_updates            | 6180        |
|    policy_gradient_loss | -0.00375    |
|    value_loss           | 17.2        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 228 

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 247          |
|    ep_rew_mean          | 261          |
| time/                   |              |
|    fps                  | 605          |
|    iterations           | 5            |
|    time_elapsed         | 16           |
|    total_timesteps      | 1290240      |
| train/                  |              |
|    approx_kl            | 0.0063797375 |
|    clip_fraction        | 0.0519       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.434       |
|    explained_variance   | 0.938        |
|    learning_rate        | 0.0003       |
|    loss                 | 5.57         |
|    n_updates            | 6290         |
|    policy_gradient_loss | -0.00309     |
|    value_loss           | 19.8         |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 254      |
|    ep_rew_mean     | 261      |
| time/              |          |
|    fps             | 906      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 1312768  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 270          |
|    ep_rew_mean          | 258          |
| time/                   |              |
|    fps                  | 695          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 1314816      |
| train/                  |              |
|    approx_kl            | 0.0035793318 |
|    clip_fraction        | 0.0402       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.419       |
|    explained_variance   | 0.97    

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 263         |
|    ep_rew_mean          | 258         |
| time/                   |             |
|    fps                  | 666         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 1335296     |
| train/                  |             |
|    approx_kl            | 0.004504992 |
|    clip_fraction        | 0.0438      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.41       |
|    explained_variance   | 0.955       |
|    learning_rate        | 0.0003      |
|    loss                 | 6           |
|    n_updates            | 6510        |
|    policy_gradient_loss | -0.00289    |
|    value_loss           | 14.9        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 256 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 284         |
|    ep_rew_mean          | 251         |
| time/                   |             |
|    fps                  | 643         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 1357824     |
| train/                  |             |
|    approx_kl            | 0.020544957 |
|    clip_fraction        | 0.0724      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.357      |
|    explained_variance   | 0.921       |
|    learning_rate        | 0.0003      |
|    loss                 | 54          |
|    n_updates            | 6620        |
|    policy_gradient_loss | -0.0122     |
|    value_loss           | 72          |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 285   

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 234        |
|    ep_rew_mean          | 263        |
| time/                   |            |
|    fps                  | 617        |
|    iterations           | 4          |
|    time_elapsed         | 13         |
|    total_timesteps      | 1380352    |
| train/                  |            |
|    approx_kl            | 0.00402434 |
|    clip_fraction        | 0.0501     |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.494     |
|    explained_variance   | 0.975      |
|    learning_rate        | 0.0003     |
|    loss                 | 3.62       |
|    n_updates            | 6730       |
|    policy_gradient_loss | -0.000743  |
|    value_loss           | 17.8       |
----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 226         |
|    ep_rew_m

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 227         |
|    ep_rew_mean          | 250         |
| time/                   |             |
|    fps                  | 584         |
|    iterations           | 5           |
|    time_elapsed         | 17          |
|    total_timesteps      | 1402880     |
| train/                  |             |
|    approx_kl            | 0.012104489 |
|    clip_fraction        | 0.107       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.542      |
|    explained_variance   | 0.989       |
|    learning_rate        | 0.0003      |
|    loss                 | 2.15        |
|    n_updates            | 6840        |
|    policy_gradient_loss | -0.000472   |
|    value_loss           | 4.58        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 235  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 257      |
|    ep_rew_mean     | 259      |
| time/              |          |
|    fps             | 889      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 1425408  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 249         |
|    ep_rew_mean          | 260         |
| time/                   |             |
|    fps                  | 680         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 1427456     |
| train/                  |             |
|    approx_kl            | 0.006783161 |
|    clip_fraction        | 0.0265      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.492      |
|    explained_variance   | 0.937       |
|    lea

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 238          |
|    ep_rew_mean          | 262          |
| time/                   |              |
|    fps                  | 619          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 1449984      |
| train/                  |              |
|    approx_kl            | 0.0034196652 |
|    clip_fraction        | 0.0331       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.354       |
|    explained_variance   | 0.88         |
|    learning_rate        | 0.0003       |
|    loss                 | 8.99         |
|    n_updates            | 7070         |
|    policy_gradient_loss | -0.000846    |
|    value_loss           | 43.6         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 255          |
|    ep_rew_mean          | 253          |
| time/                   |              |
|    fps                  | 606          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 1472512      |
| train/                  |              |
|    approx_kl            | 0.0037790127 |
|    clip_fraction        | 0.0296       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.356       |
|    explained_variance   | 0.901        |
|    learning_rate        | 0.0003       |
|    loss                 | 28.6         |
|    n_updates            | 7180         |
|    policy_gradient_loss | -0.00198     |
|    value_loss           | 86.1         |
------------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 227         |
|    ep_rew_mean          | 265         |
| time/                   |             |
|    fps                  | 600         |
|    iterations           | 5           |
|    time_elapsed         | 17          |
|    total_timesteps      | 1495040     |
| train/                  |             |
|    approx_kl            | 0.009170684 |
|    clip_fraction        | 0.104       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.429      |
|    explained_variance   | 0.957       |
|    learning_rate        | 0.0003      |
|    loss                 | 8.62        |
|    n_updates            | 7290        |
|    policy_gradient_loss | -0.00381    |
|    value_loss           | 32.4        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 221  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 223      |
|    ep_rew_mean     | 253      |
| time/              |          |
|    fps             | 918      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 1517568  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 214          |
|    ep_rew_mean          | 251          |
| time/                   |              |
|    fps                  | 694          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 1519616      |
| train/                  |              |
|    approx_kl            | 0.0037424471 |
|    clip_fraction        | 0.0377       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.313       |
|    explained_variance   | 0.866   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 235         |
|    ep_rew_mean          | 262         |
| time/                   |             |
|    fps                  | 689         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 1540096     |
| train/                  |             |
|    approx_kl            | 0.011283263 |
|    clip_fraction        | 0.108       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.454      |
|    explained_variance   | 0.976       |
|    learning_rate        | 0.0003      |
|    loss                 | 2.54        |
|    n_updates            | 7510        |
|    policy_gradient_loss | -0.00213    |
|    value_loss           | 16.4        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 237   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 212          |
|    ep_rew_mean          | 267          |
| time/                   |              |
|    fps                  | 634          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 1562624      |
| train/                  |              |
|    approx_kl            | 0.0046147197 |
|    clip_fraction        | 0.0479       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.367       |
|    explained_variance   | 0.843        |
|    learning_rate        | 0.0003       |
|    loss                 | 13.3         |
|    n_updates            | 7620         |
|    policy_gradient_loss | -0.000458    |
|    value_loss           | 79.6         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 187          |
|    ep_rew_mean          | 272          |
| time/                   |              |
|    fps                  | 614          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 1585152      |
| train/                  |              |
|    approx_kl            | 0.0037388918 |
|    clip_fraction        | 0.0412       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.336       |
|    explained_variance   | 0.975        |
|    learning_rate        | 0.0003       |
|    loss                 | 4.19         |
|    n_updates            | 7730         |
|    policy_gradient_loss | -0.000963    |
|    value_loss           | 11.2         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 228          |
|    ep_rew_mean          | 241          |
| time/                   |              |
|    fps                  | 597          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 1607680      |
| train/                  |              |
|    approx_kl            | 0.0036924335 |
|    clip_fraction        | 0.0421       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.334       |
|    explained_variance   | 0.944        |
|    learning_rate        | 0.0003       |
|    loss                 | 4.93         |
|    n_updates            | 7840         |
|    policy_gradient_loss | -0.00353     |
|    value_loss           | 16.1         |
------------------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 220      |
|    ep_rew_mean     | 271      |
| time/              |          |
|    fps             | 891      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 1630208  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 227          |
|    ep_rew_mean          | 268          |
| time/                   |              |
|    fps                  | 684          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 1632256      |
| train/                  |              |
|    approx_kl            | 0.0026004564 |
|    clip_fraction        | 0.0417       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.352       |
|    explained_variance   | 0.963   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 211         |
|    ep_rew_mean          | 275         |
| time/                   |             |
|    fps                  | 668         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 1652736     |
| train/                  |             |
|    approx_kl            | 0.009041334 |
|    clip_fraction        | 0.0538      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.349      |
|    explained_variance   | 0.984       |
|    learning_rate        | 0.0003      |
|    loss                 | 5.98        |
|    n_updates            | 8060        |
|    policy_gradient_loss | -0.00166    |
|    value_loss           | 10.3        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 212 

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 216          |
|    ep_rew_mean          | 255          |
| time/                   |              |
|    fps                  | 643          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 1675264      |
| train/                  |              |
|    approx_kl            | 0.0068815914 |
|    clip_fraction        | 0.0937       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.384       |
|    explained_variance   | 0.976        |
|    learning_rate        | 0.0003       |
|    loss                 | 8.32         |
|    n_updates            | 8170         |
|    policy_gradient_loss | -0.00206     |
|    value_loss           | 20.9         |
------------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 216         |
|    ep_rew_mean          | 261         |
| time/                   |             |
|    fps                  | 558         |
|    iterations           | 4           |
|    time_elapsed         | 14          |
|    total_timesteps      | 1697792     |
| train/                  |             |
|    approx_kl            | 0.007942139 |
|    clip_fraction        | 0.107       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.483      |
|    explained_variance   | 0.983       |
|    learning_rate        | 0.0003      |
|    loss                 | 12.2        |
|    n_updates            | 8280        |
|    policy_gradient_loss | -0.000995   |
|    value_loss           | 20.3        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 212   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 189          |
|    ep_rew_mean          | 224          |
| time/                   |              |
|    fps                  | 584          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 1720320      |
| train/                  |              |
|    approx_kl            | 0.0060468256 |
|    clip_fraction        | 0.0872       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.418       |
|    explained_variance   | 0.966        |
|    learning_rate        | 0.0003       |
|    loss                 | 14.8         |
|    n_updates            | 8390         |
|    policy_gradient_loss | -0.00339     |
|    value_loss           | 49.4         |
------------------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 207      |
|    ep_rew_mean     | 248      |
| time/              |          |
|    fps             | 768      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 1742848  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 221          |
|    ep_rew_mean          | 252          |
| time/                   |              |
|    fps                  | 645          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 1744896      |
| train/                  |              |
|    approx_kl            | 0.0074077584 |
|    clip_fraction        | 0.0578       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.38        |
|    explained_variance   | 0.926   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 211          |
|    ep_rew_mean          | 269          |
| time/                   |              |
|    fps                  | 681          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 1765376      |
| train/                  |              |
|    approx_kl            | 0.0064790323 |
|    clip_fraction        | 0.0684       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.319       |
|    explained_variance   | 0.973        |
|    learning_rate        | 0.0003       |
|    loss                 | 5.08         |
|    n_updates            | 8610         |
|    policy_gradient_loss | -0.00312     |
|    value_loss           | 15.3         |
------------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 193         |
|    ep_rew_mean          | 258         |
| time/                   |             |
|    fps                  | 613         |
|    iterations           | 3           |
|    time_elapsed         | 10          |
|    total_timesteps      | 1787904     |
| train/                  |             |
|    approx_kl            | 0.004522311 |
|    clip_fraction        | 0.0446      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.323      |
|    explained_variance   | 0.915       |
|    learning_rate        | 0.0003      |
|    loss                 | 20.4        |
|    n_updates            | 8720        |
|    policy_gradient_loss | -0.00183    |
|    value_loss           | 34.8        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 195 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 195         |
|    ep_rew_mean          | 259         |
| time/                   |             |
|    fps                  | 578         |
|    iterations           | 4           |
|    time_elapsed         | 14          |
|    total_timesteps      | 1810432     |
| train/                  |             |
|    approx_kl            | 0.005560749 |
|    clip_fraction        | 0.0344      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.332      |
|    explained_variance   | 0.845       |
|    learning_rate        | 0.0003      |
|    loss                 | 76.2        |
|    n_updates            | 8830        |
|    policy_gradient_loss | -0.00138    |
|    value_loss           | 67.4        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 195   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 196          |
|    ep_rew_mean          | 270          |
| time/                   |              |
|    fps                  | 572          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 1832960      |
| train/                  |              |
|    approx_kl            | 0.0029766168 |
|    clip_fraction        | 0.095        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.515       |
|    explained_variance   | 0.968        |
|    learning_rate        | 0.0003       |
|    loss                 | 4.22         |
|    n_updates            | 8940         |
|    policy_gradient_loss | 0.00302      |
|    value_loss           | 9.93         |
------------------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 197      |
|    ep_rew_mean     | 260      |
| time/              |          |
|    fps             | 850      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 1855488  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 199          |
|    ep_rew_mean          | 260          |
| time/                   |              |
|    fps                  | 625          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 1857536      |
| train/                  |              |
|    approx_kl            | 0.0065304795 |
|    clip_fraction        | 0.0442       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.339       |
|    explained_variance   | 0.918   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 238          |
|    ep_rew_mean          | 256          |
| time/                   |              |
|    fps                  | 652          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 1878016      |
| train/                  |              |
|    approx_kl            | 0.0048282524 |
|    clip_fraction        | 0.0539       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.436       |
|    explained_variance   | 0.908        |
|    learning_rate        | 0.0003       |
|    loss                 | 22.4         |
|    n_updates            | 9160         |
|    policy_gradient_loss | -0.00413     |
|    value_loss           | 76.5         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 210          |
|    ep_rew_mean          | 268          |
| time/                   |              |
|    fps                  | 618          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 1900544      |
| train/                  |              |
|    approx_kl            | 0.0056251334 |
|    clip_fraction        | 0.054        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.328       |
|    explained_variance   | 0.928        |
|    learning_rate        | 0.0003       |
|    loss                 | 9.98         |
|    n_updates            | 9270         |
|    policy_gradient_loss | -0.00131     |
|    value_loss           | 21.2         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 225          |
|    ep_rew_mean          | 270          |
| time/                   |              |
|    fps                  | 594          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 1923072      |
| train/                  |              |
|    approx_kl            | 0.0064117536 |
|    clip_fraction        | 0.0705       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.411       |
|    explained_variance   | 0.984        |
|    learning_rate        | 0.0003       |
|    loss                 | 4.26         |
|    n_updates            | 9380         |
|    policy_gradient_loss | -0.000605    |
|    value_loss           | 13           |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 183          |
|    ep_rew_mean          | 261          |
| time/                   |              |
|    fps                  | 572          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 1945600      |
| train/                  |              |
|    approx_kl            | 0.0048133004 |
|    clip_fraction        | 0.0516       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.316       |
|    explained_variance   | 0.896        |
|    learning_rate        | 0.0003       |
|    loss                 | 13.9         |
|    n_updates            | 9490         |
|    policy_gradient_loss | -0.00232     |
|    value_loss           | 25.1         |
------------------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 201      |
|    ep_rew_mean     | 268      |
| time/              |          |
|    fps             | 856      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 1968128  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 203          |
|    ep_rew_mean          | 268          |
| time/                   |              |
|    fps                  | 667          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 1970176      |
| train/                  |              |
|    approx_kl            | 0.0062133493 |
|    clip_fraction        | 0.0917       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.38        |
|    explained_variance   | 0.98    

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 203         |
|    ep_rew_mean          | 268         |
| time/                   |             |
|    fps                  | 659         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 1990656     |
| train/                  |             |
|    approx_kl            | 0.007685822 |
|    clip_fraction        | 0.0886      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.533      |
|    explained_variance   | 0.936       |
|    learning_rate        | 0.0003      |
|    loss                 | 7.39        |
|    n_updates            | 9710        |
|    policy_gradient_loss | -0.00121    |
|    value_loss           | 14.6        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 202 

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 224          |
|    ep_rew_mean          | 261          |
| time/                   |              |
|    fps                  | 605          |
|    iterations           | 3            |
|    time_elapsed         | 10           |
|    total_timesteps      | 2013184      |
| train/                  |              |
|    approx_kl            | 0.0047453786 |
|    clip_fraction        | 0.04         |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.316       |
|    explained_variance   | 0.856        |
|    learning_rate        | 0.0003       |
|    loss                 | 6.07         |
|    n_updates            | 9820         |
|    policy_gradient_loss | -0.00193     |
|    value_loss           | 80.3         |
------------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 236         |
|    ep_rew_mean          | 262         |
| time/                   |             |
|    fps                  | 573         |
|    iterations           | 4           |
|    time_elapsed         | 14          |
|    total_timesteps      | 2035712     |
| train/                  |             |
|    approx_kl            | 0.005099626 |
|    clip_fraction        | 0.0633      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.439      |
|    explained_variance   | 0.973       |
|    learning_rate        | 0.0003      |
|    loss                 | 35.2        |
|    n_updates            | 9930        |
|    policy_gradient_loss | -0.00329    |
|    value_loss           | 47.2        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 244 

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 200          |
|    ep_rew_mean          | 265          |
| time/                   |              |
|    fps                  | 610          |
|    iterations           | 5            |
|    time_elapsed         | 16           |
|    total_timesteps      | 2058240      |
| train/                  |              |
|    approx_kl            | 0.0060130055 |
|    clip_fraction        | 0.0337       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.288       |
|    explained_variance   | 0.836        |
|    learning_rate        | 0.0003       |
|    loss                 | 18.9         |
|    n_updates            | 10040        |
|    policy_gradient_loss | -0.0016      |
|    value_loss           | 83.1         |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 238      |
|    ep_rew_mean     | 266      |
| time/              |          |
|    fps             | 863      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 2080768  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 238          |
|    ep_rew_mean          | 269          |
| time/                   |              |
|    fps                  | 684          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 2082816      |
| train/                  |              |
|    approx_kl            | 0.0037730015 |
|    clip_fraction        | 0.062        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.469       |
|    explained_variance   | 0.919   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 241          |
|    ep_rew_mean          | 263          |
| time/                   |              |
|    fps                  | 658          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 2105344      |
| train/                  |              |
|    approx_kl            | 0.0072244057 |
|    clip_fraction        | 0.0812       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.348       |
|    explained_variance   | 0.977        |
|    learning_rate        | 0.0003       |
|    loss                 | 3.09         |
|    n_updates            | 10270        |
|    policy_gradient_loss | -0.00304     |
|    value_loss           | 11.7         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 191          |
|    ep_rew_mean          | 281          |
| time/                   |              |
|    fps                  | 615          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 2127872      |
| train/                  |              |
|    approx_kl            | 0.0056393193 |
|    clip_fraction        | 0.0549       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.313       |
|    explained_variance   | 0.943        |
|    learning_rate        | 0.0003       |
|    loss                 | 5.54         |
|    n_updates            | 10380        |
|    policy_gradient_loss | -0.00109     |
|    value_loss           | 17.9         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 200         |
|    ep_rew_mean          | 270         |
| time/                   |             |
|    fps                  | 607         |
|    iterations           | 5           |
|    time_elapsed         | 16          |
|    total_timesteps      | 2150400     |
| train/                  |             |
|    approx_kl            | 0.002924608 |
|    clip_fraction        | 0.0273      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.302      |
|    explained_variance   | 0.897       |
|    learning_rate        | 0.0003      |
|    loss                 | 31.7        |
|    n_updates            | 10490       |
|    policy_gradient_loss | -0.00249    |
|    value_loss           | 53.3        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 207  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 245      |
|    ep_rew_mean     | 270      |
| time/              |          |
|    fps             | 912      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 2172928  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 252          |
|    ep_rew_mean          | 269          |
| time/                   |              |
|    fps                  | 691          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 2174976      |
| train/                  |              |
|    approx_kl            | 0.0039792145 |
|    clip_fraction        | 0.0355       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.296       |
|    explained_variance   | 0.961   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 223         |
|    ep_rew_mean          | 262         |
| time/                   |             |
|    fps                  | 691         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 2195456     |
| train/                  |             |
|    approx_kl            | 0.009730635 |
|    clip_fraction        | 0.0684      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.289      |
|    explained_variance   | 0.934       |
|    learning_rate        | 0.0003      |
|    loss                 | 15.1        |
|    n_updates            | 10710       |
|    policy_gradient_loss | -0.00247    |
|    value_loss           | 20.7        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 206   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 221         |
|    ep_rew_mean          | 271         |
| time/                   |             |
|    fps                  | 646         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 2217984     |
| train/                  |             |
|    approx_kl            | 0.009636866 |
|    clip_fraction        | 0.0957      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.369      |
|    explained_variance   | 0.982       |
|    learning_rate        | 0.0003      |
|    loss                 | 15.4        |
|    n_updates            | 10820       |
|    policy_gradient_loss | -0.0035     |
|    value_loss           | 21.2        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 213 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 199         |
|    ep_rew_mean          | 281         |
| time/                   |             |
|    fps                  | 622         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 2240512     |
| train/                  |             |
|    approx_kl            | 0.007056244 |
|    clip_fraction        | 0.0768      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.342      |
|    explained_variance   | 0.962       |
|    learning_rate        | 0.0003      |
|    loss                 | 8.72        |
|    n_updates            | 10930       |
|    policy_gradient_loss | -0.00539    |
|    value_loss           | 13.7        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 199          |
|    ep_rew_mean          | 279          |
| time/                   |              |
|    fps                  | 605          |
|    iterations           | 5            |
|    time_elapsed         | 16           |
|    total_timesteps      | 2263040      |
| train/                  |              |
|    approx_kl            | 0.0024632444 |
|    clip_fraction        | 0.0265       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.302       |
|    explained_variance   | 0.817        |
|    learning_rate        | 0.0003       |
|    loss                 | 20.2         |
|    n_updates            | 11040        |
|    policy_gradient_loss | -0.00103     |
|    value_loss           | 76.3         |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 190      |
|    ep_rew_mean     | 264      |
| time/              |          |
|    fps             | 902      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 2285568  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 191          |
|    ep_rew_mean          | 267          |
| time/                   |              |
|    fps                  | 665          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 2287616      |
| train/                  |              |
|    approx_kl            | 0.0045658327 |
|    clip_fraction        | 0.0531       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.335       |
|    explained_variance   | 0.947   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 186         |
|    ep_rew_mean          | 276         |
| time/                   |             |
|    fps                  | 624         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 2308096     |
| train/                  |             |
|    approx_kl            | 0.019026889 |
|    clip_fraction        | 0.0617      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.275      |
|    explained_variance   | 0.934       |
|    learning_rate        | 0.0003      |
|    loss                 | 5.21        |
|    n_updates            | 11260       |
|    policy_gradient_loss | -0.00332    |
|    value_loss           | 15.8        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 185 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 216         |
|    ep_rew_mean          | 254         |
| time/                   |             |
|    fps                  | 579         |
|    iterations           | 3           |
|    time_elapsed         | 10          |
|    total_timesteps      | 2330624     |
| train/                  |             |
|    approx_kl            | 0.004326316 |
|    clip_fraction        | 0.0471      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.266      |
|    explained_variance   | 0.902       |
|    learning_rate        | 0.0003      |
|    loss                 | 20.8        |
|    n_updates            | 11370       |
|    policy_gradient_loss | -0.00108    |
|    value_loss           | 23.3        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 214 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 194         |
|    ep_rew_mean          | 281         |
| time/                   |             |
|    fps                  | 641         |
|    iterations           | 4           |
|    time_elapsed         | 12          |
|    total_timesteps      | 2353152     |
| train/                  |             |
|    approx_kl            | 0.008141436 |
|    clip_fraction        | 0.0605      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.269      |
|    explained_variance   | 0.941       |
|    learning_rate        | 0.0003      |
|    loss                 | 4.54        |
|    n_updates            | 11480       |
|    policy_gradient_loss | -0.00207    |
|    value_loss           | 17          |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 201   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 219          |
|    ep_rew_mean          | 277          |
| time/                   |              |
|    fps                  | 549          |
|    iterations           | 5            |
|    time_elapsed         | 18           |
|    total_timesteps      | 2375680      |
| train/                  |              |
|    approx_kl            | 0.0032079476 |
|    clip_fraction        | 0.0343       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.294       |
|    explained_variance   | 0.832        |
|    learning_rate        | 0.0003       |
|    loss                 | 13.4         |
|    n_updates            | 11590        |
|    policy_gradient_loss | -0.00182     |
|    value_loss           | 55.9         |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 199      |
|    ep_rew_mean     | 278      |
| time/              |          |
|    fps             | 893      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 2398208  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 198          |
|    ep_rew_mean          | 277          |
| time/                   |              |
|    fps                  | 690          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 2400256      |
| train/                  |              |
|    approx_kl            | 0.0044953683 |
|    clip_fraction        | 0.0338       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.296       |
|    explained_variance   | 0.891   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 193          |
|    ep_rew_mean          | 263          |
| time/                   |              |
|    fps                  | 691          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 2420736      |
| train/                  |              |
|    approx_kl            | 0.0072378237 |
|    clip_fraction        | 0.0655       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.318       |
|    explained_variance   | 0.913        |
|    learning_rate        | 0.0003       |
|    loss                 | 9.47         |
|    n_updates            | 11810        |
|    policy_gradient_loss | -0.00359     |
|    value_loss           | 21.5         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 209          |
|    ep_rew_mean          | 275          |
| time/                   |              |
|    fps                  | 581          |
|    iterations           | 3            |
|    time_elapsed         | 10           |
|    total_timesteps      | 2443264      |
| train/                  |              |
|    approx_kl            | 0.0046828273 |
|    clip_fraction        | 0.0462       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.32        |
|    explained_variance   | 0.959        |
|    learning_rate        | 0.0003       |
|    loss                 | 5.82         |
|    n_updates            | 11920        |
|    policy_gradient_loss | -0.00291     |
|    value_loss           | 21.4         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 196          |
|    ep_rew_mean          | 277          |
| time/                   |              |
|    fps                  | 570          |
|    iterations           | 4            |
|    time_elapsed         | 14           |
|    total_timesteps      | 2465792      |
| train/                  |              |
|    approx_kl            | 0.0052064816 |
|    clip_fraction        | 0.0614       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.422       |
|    explained_variance   | 0.969        |
|    learning_rate        | 0.0003       |
|    loss                 | 2.55         |
|    n_updates            | 12030        |
|    policy_gradient_loss | 9.4e-05      |
|    value_loss           | 9.57         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 202         |
|    ep_rew_mean          | 244         |
| time/                   |             |
|    fps                  | 608         |
|    iterations           | 5           |
|    time_elapsed         | 16          |
|    total_timesteps      | 2488320     |
| train/                  |             |
|    approx_kl            | 0.013589035 |
|    clip_fraction        | 0.0607      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.467      |
|    explained_variance   | 0.99        |
|    learning_rate        | 0.0003      |
|    loss                 | 4.36        |
|    n_updates            | 12140       |
|    policy_gradient_loss | -0.0026     |
|    value_loss           | 11.4        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 193  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 200      |
|    ep_rew_mean     | 273      |
| time/              |          |
|    fps             | 901      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 2510848  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 201         |
|    ep_rew_mean          | 273         |
| time/                   |             |
|    fps                  | 685         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 2512896     |
| train/                  |             |
|    approx_kl            | 0.003257705 |
|    clip_fraction        | 0.0339      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.283      |
|    explained_variance   | 0.925       |
|    lea

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 191         |
|    ep_rew_mean          | 268         |
| time/                   |             |
|    fps                  | 698         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 2533376     |
| train/                  |             |
|    approx_kl            | 0.007664161 |
|    clip_fraction        | 0.063       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.276      |
|    explained_variance   | 0.976       |
|    learning_rate        | 0.0003      |
|    loss                 | 6.62        |
|    n_updates            | 12360       |
|    policy_gradient_loss | -0.00212    |
|    value_loss           | 13.8        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 190 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 210         |
|    ep_rew_mean          | 274         |
| time/                   |             |
|    fps                  | 646         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 2555904     |
| train/                  |             |
|    approx_kl            | 0.003951413 |
|    clip_fraction        | 0.0611      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.282      |
|    explained_variance   | 0.978       |
|    learning_rate        | 0.0003      |
|    loss                 | 5.45        |
|    n_updates            | 12470       |
|    policy_gradient_loss | -0.00251    |
|    value_loss           | 11.3        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 219 

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 203          |
|    ep_rew_mean          | 255          |
| time/                   |              |
|    fps                  | 623          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 2578432      |
| train/                  |              |
|    approx_kl            | 0.0017597559 |
|    clip_fraction        | 0.0203       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.308       |
|    explained_variance   | 0.805        |
|    learning_rate        | 0.0003       |
|    loss                 | 404          |
|    n_updates            | 12580        |
|    policy_gradient_loss | -0.000583    |
|    value_loss           | 661          |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 200          |
|    ep_rew_mean          | 269          |
| time/                   |              |
|    fps                  | 596          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 2600960      |
| train/                  |              |
|    approx_kl            | 0.0045052366 |
|    clip_fraction        | 0.0487       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.29        |
|    explained_variance   | 0.941        |
|    learning_rate        | 0.0003       |
|    loss                 | 6.12         |
|    n_updates            | 12690        |
|    policy_gradient_loss | -0.00185     |
|    value_loss           | 16.9         |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 238      |
|    ep_rew_mean     | 275      |
| time/              |          |
|    fps             | 834      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 2623488  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 236         |
|    ep_rew_mean          | 275         |
| time/                   |             |
|    fps                  | 666         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 2625536     |
| train/                  |             |
|    approx_kl            | 0.005740992 |
|    clip_fraction        | 0.106       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.533      |
|    explained_variance   | 0.983       |
|    lea

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 232          |
|    ep_rew_mean          | 263          |
| time/                   |              |
|    fps                  | 608          |
|    iterations           | 3            |
|    time_elapsed         | 10           |
|    total_timesteps      | 2648064      |
| train/                  |              |
|    approx_kl            | 0.0027215644 |
|    clip_fraction        | 0.0466       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.42        |
|    explained_variance   | 0.993        |
|    learning_rate        | 0.0003       |
|    loss                 | 4.1          |
|    n_updates            | 12920        |
|    policy_gradient_loss | -0.00136     |
|    value_loss           | 7.74         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 278          |
|    ep_rew_mean          | 259          |
| time/                   |              |
|    fps                  | 565          |
|    iterations           | 4            |
|    time_elapsed         | 14           |
|    total_timesteps      | 2670592      |
| train/                  |              |
|    approx_kl            | 0.0058450485 |
|    clip_fraction        | 0.0655       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.287       |
|    explained_variance   | 0.966        |
|    learning_rate        | 0.0003       |
|    loss                 | 5.4          |
|    n_updates            | 13030        |
|    policy_gradient_loss | -0.00321     |
|    value_loss           | 13.9         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 234          |
|    ep_rew_mean          | 267          |
| time/                   |              |
|    fps                  | 601          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 2693120      |
| train/                  |              |
|    approx_kl            | 0.0048108534 |
|    clip_fraction        | 0.126        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.429       |
|    explained_variance   | 0.991        |
|    learning_rate        | 0.0003       |
|    loss                 | 1.72         |
|    n_updates            | 13140        |
|    policy_gradient_loss | -0.00399     |
|    value_loss           | 8.51         |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 205      |
|    ep_rew_mean     | 258      |
| time/              |          |
|    fps             | 918      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 2715648  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 209          |
|    ep_rew_mean          | 258          |
| time/                   |              |
|    fps                  | 696          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 2717696      |
| train/                  |              |
|    approx_kl            | 0.0050494876 |
|    clip_fraction        | 0.0548       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.296       |
|    explained_variance   | 0.861   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 224         |
|    ep_rew_mean          | 269         |
| time/                   |             |
|    fps                  | 634         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 2740224     |
| train/                  |             |
|    approx_kl            | 0.006837239 |
|    clip_fraction        | 0.0488      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.421      |
|    explained_variance   | 0.98        |
|    learning_rate        | 0.0003      |
|    loss                 | 2.89        |
|    n_updates            | 13370       |
|    policy_gradient_loss | -0.00289    |
|    value_loss           | 13.3        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 233   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 229          |
|    ep_rew_mean          | 263          |
| time/                   |              |
|    fps                  | 609          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 2762752      |
| train/                  |              |
|    approx_kl            | 0.0050605657 |
|    clip_fraction        | 0.0448       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.296       |
|    explained_variance   | 0.846        |
|    learning_rate        | 0.0003       |
|    loss                 | 10           |
|    n_updates            | 13480        |
|    policy_gradient_loss | -0.0028      |
|    value_loss           | 41.1         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 216          |
|    ep_rew_mean          | 261          |
| time/                   |              |
|    fps                  | 601          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 2785280      |
| train/                  |              |
|    approx_kl            | 0.0031658784 |
|    clip_fraction        | 0.0589       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.397       |
|    explained_variance   | 0.897        |
|    learning_rate        | 0.0003       |
|    loss                 | 8.3          |
|    n_updates            | 13590        |
|    policy_gradient_loss | -0.00132     |
|    value_loss           | 63.3         |
------------------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 217      |
|    ep_rew_mean     | 275      |
| time/              |          |
|    fps             | 877      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 2807808  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 227         |
|    ep_rew_mean          | 274         |
| time/                   |             |
|    fps                  | 678         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 2809856     |
| train/                  |             |
|    approx_kl            | 0.004884159 |
|    clip_fraction        | 0.0538      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.274      |
|    explained_variance   | 0.971       |

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 250       |
|    ep_rew_mean          | 265       |
| time/                   |           |
|    fps                  | 625       |
|    iterations           | 3         |
|    time_elapsed         | 9         |
|    total_timesteps      | 2832384   |
| train/                  |           |
|    approx_kl            | 0.0059837 |
|    clip_fraction        | 0.0558    |
|    clip_range           | 0.2       |
|    entropy_loss         | -0.283    |
|    explained_variance   | 0.944     |
|    learning_rate        | 0.0003    |
|    loss                 | 6.47      |
|    n_updates            | 13820     |
|    policy_gradient_loss | -0.00274  |
|    value_loss           | 15.1      |
---------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 240          |
|    ep_rew_mean          | 265

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 221         |
|    ep_rew_mean          | 267         |
| time/                   |             |
|    fps                  | 607         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 2854912     |
| train/                  |             |
|    approx_kl            | 0.004106749 |
|    clip_fraction        | 0.0512      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.249      |
|    explained_variance   | 0.936       |
|    learning_rate        | 0.0003      |
|    loss                 | 8.17        |
|    n_updates            | 13930       |
|    policy_gradient_loss | -0.00324    |
|    value_loss           | 26.4        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 212 

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 197          |
|    ep_rew_mean          | 265          |
| time/                   |              |
|    fps                  | 605          |
|    iterations           | 5            |
|    time_elapsed         | 16           |
|    total_timesteps      | 2877440      |
| train/                  |              |
|    approx_kl            | 0.0050688307 |
|    clip_fraction        | 0.0492       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.269       |
|    explained_variance   | 0.945        |
|    learning_rate        | 0.0003       |
|    loss                 | 7.13         |
|    n_updates            | 14040        |
|    policy_gradient_loss | -0.00137     |
|    value_loss           | 15.7         |
------------------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 220      |
|    ep_rew_mean     | 273      |
| time/              |          |
|    fps             | 897      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 2899968  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 220         |
|    ep_rew_mean          | 272         |
| time/                   |             |
|    fps                  | 675         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 2902016     |
| train/                  |             |
|    approx_kl            | 0.010995816 |
|    clip_fraction        | 0.0799      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.282      |
|    explained_variance   | 0.95        |

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 218          |
|    ep_rew_mean          | 272          |
| time/                   |              |
|    fps                  | 636          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 2924544      |
| train/                  |              |
|    approx_kl            | 0.0048609055 |
|    clip_fraction        | 0.0353       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.241       |
|    explained_variance   | 0.816        |
|    learning_rate        | 0.0003       |
|    loss                 | 5.07         |
|    n_updates            | 14270        |
|    policy_gradient_loss | -0.00119     |
|    value_loss           | 32.3         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 246          |
|    ep_rew_mean          | 254          |
| time/                   |              |
|    fps                  | 614          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 2947072      |
| train/                  |              |
|    approx_kl            | 0.0046654018 |
|    clip_fraction        | 0.0409       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.382       |
|    explained_variance   | 0.977        |
|    learning_rate        | 0.0003       |
|    loss                 | 6.36         |
|    n_updates            | 14380        |
|    policy_gradient_loss | -0.000289    |
|    value_loss           | 19.3         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 228          |
|    ep_rew_mean          | 265          |
| time/                   |              |
|    fps                  | 597          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 2969600      |
| train/                  |              |
|    approx_kl            | 0.0062808315 |
|    clip_fraction        | 0.0542       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.452       |
|    explained_variance   | 0.987        |
|    learning_rate        | 0.0003       |
|    loss                 | 6.83         |
|    n_updates            | 14490        |
|    policy_gradient_loss | -0.00414     |
|    value_loss           | 10.9         |
------------------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 221      |
|    ep_rew_mean     | 267      |
| time/              |          |
|    fps             | 851      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 2992128  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 220          |
|    ep_rew_mean          | 270          |
| time/                   |              |
|    fps                  | 656          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 2994176      |
| train/                  |              |
|    approx_kl            | 0.0034908592 |
|    clip_fraction        | 0.0645       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.395       |
|    explained_variance   | 0.988   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 221          |
|    ep_rew_mean          | 267          |
| time/                   |              |
|    fps                  | 664          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 3014656      |
| train/                  |              |
|    approx_kl            | 0.0046475856 |
|    clip_fraction        | 0.0463       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.371       |
|    explained_variance   | 0.993        |
|    learning_rate        | 0.0003       |
|    loss                 | 3.48         |
|    n_updates            | 14710        |
|    policy_gradient_loss | -0.00122     |
|    value_loss           | 10.4         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 195          |
|    ep_rew_mean          | 267          |
| time/                   |              |
|    fps                  | 610          |
|    iterations           | 3            |
|    time_elapsed         | 10           |
|    total_timesteps      | 3037184      |
| train/                  |              |
|    approx_kl            | 0.0021449476 |
|    clip_fraction        | 0.0235       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.249       |
|    explained_variance   | 0.848        |
|    learning_rate        | 0.0003       |
|    loss                 | 15.9         |
|    n_updates            | 14820        |
|    policy_gradient_loss | -0.00167     |
|    value_loss           | 61.8         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 184          |
|    ep_rew_mean          | 265          |
| time/                   |              |
|    fps                  | 597          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 3059712      |
| train/                  |              |
|    approx_kl            | 0.0053509027 |
|    clip_fraction        | 0.059        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.237       |
|    explained_variance   | 0.932        |
|    learning_rate        | 0.0003       |
|    loss                 | 9.52         |
|    n_updates            | 14930        |
|    policy_gradient_loss | -0.00159     |
|    value_loss           | 20.3         |
------------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 189         |
|    ep_rew_mean          | 259         |
| time/                   |             |
|    fps                  | 581         |
|    iterations           | 5           |
|    time_elapsed         | 17          |
|    total_timesteps      | 3082240     |
| train/                  |             |
|    approx_kl            | 0.012633637 |
|    clip_fraction        | 0.0966      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.265      |
|    explained_variance   | 0.979       |
|    learning_rate        | 0.0003      |
|    loss                 | 6.68        |
|    n_updates            | 15040       |
|    policy_gradient_loss | -0.00308    |
|    value_loss           | 14          |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 193  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 207      |
|    ep_rew_mean     | 277      |
| time/              |          |
|    fps             | 903      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 3104768  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 204         |
|    ep_rew_mean          | 272         |
| time/                   |             |
|    fps                  | 719         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 3106816     |
| train/                  |             |
|    approx_kl            | 0.006478617 |
|    clip_fraction        | 0.0455      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.254      |
|    explained_variance   | 0.9         |

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 196         |
|    ep_rew_mean          | 272         |
| time/                   |             |
|    fps                  | 641         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 3129344     |
| train/                  |             |
|    approx_kl            | 0.006657758 |
|    clip_fraction        | 0.045       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.249      |
|    explained_variance   | 0.736       |
|    learning_rate        | 0.0003      |
|    loss                 | 6.14        |
|    n_updates            | 15270       |
|    policy_gradient_loss | -0.00054    |
|    value_loss           | 48.7        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 197 

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 256        |
|    ep_rew_mean          | 267        |
| time/                   |            |
|    fps                  | 608        |
|    iterations           | 4          |
|    time_elapsed         | 13         |
|    total_timesteps      | 3151872    |
| train/                  |            |
|    approx_kl            | 0.00886581 |
|    clip_fraction        | 0.0988     |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.423     |
|    explained_variance   | 0.983      |
|    learning_rate        | 0.0003     |
|    loss                 | 5.77       |
|    n_updates            | 15380      |
|    policy_gradient_loss | -0.000959  |
|    value_loss           | 13.6       |
----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 263         |

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 269         |
|    ep_rew_mean          | 259         |
| time/                   |             |
|    fps                  | 597         |
|    iterations           | 5           |
|    time_elapsed         | 17          |
|    total_timesteps      | 3174400     |
| train/                  |             |
|    approx_kl            | 0.014370231 |
|    clip_fraction        | 0.263       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.56       |
|    explained_variance   | 0.992       |
|    learning_rate        | 0.0003      |
|    loss                 | 1.11        |
|    n_updates            | 15490       |
|    policy_gradient_loss | 0.00311     |
|    value_loss           | 2.18        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 254  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 206      |
|    ep_rew_mean     | 255      |
| time/              |          |
|    fps             | 875      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 3196928  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 196          |
|    ep_rew_mean          | 251          |
| time/                   |              |
|    fps                  | 734          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 3198976      |
| train/                  |              |
|    approx_kl            | 0.0021219887 |
|    clip_fraction        | 0.027        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.285       |
|    explained_variance   | 0.765   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 233         |
|    ep_rew_mean          | 265         |
| time/                   |             |
|    fps                  | 644         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 3221504     |
| train/                  |             |
|    approx_kl            | 0.005284018 |
|    clip_fraction        | 0.0525      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.306      |
|    explained_variance   | 0.78        |
|    learning_rate        | 0.0003      |
|    loss                 | 13.7        |
|    n_updates            | 15720       |
|    policy_gradient_loss | -0.00427    |
|    value_loss           | 60.6        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 216   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 223         |
|    ep_rew_mean          | 258         |
| time/                   |             |
|    fps                  | 605         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 3244032     |
| train/                  |             |
|    approx_kl            | 0.003899405 |
|    clip_fraction        | 0.0407      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.412      |
|    explained_variance   | 0.963       |
|    learning_rate        | 0.0003      |
|    loss                 | 2.71        |
|    n_updates            | 15830       |
|    policy_gradient_loss | -0.00259    |
|    value_loss           | 20.4        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 224 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 241         |
|    ep_rew_mean          | 260         |
| time/                   |             |
|    fps                  | 572         |
|    iterations           | 5           |
|    time_elapsed         | 17          |
|    total_timesteps      | 3266560     |
| train/                  |             |
|    approx_kl            | 0.015574464 |
|    clip_fraction        | 0.182       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.424      |
|    explained_variance   | 0.962       |
|    learning_rate        | 0.0003      |
|    loss                 | 6.4         |
|    n_updates            | 15940       |
|    policy_gradient_loss | -0.00646    |
|    value_loss           | 47.2        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 251  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 242      |
|    ep_rew_mean     | 264      |
| time/              |          |
|    fps             | 879      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 3289088  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 242          |
|    ep_rew_mean          | 266          |
| time/                   |              |
|    fps                  | 674          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 3291136      |
| train/                  |              |
|    approx_kl            | 0.0064925556 |
|    clip_fraction        | 0.0662       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.376       |
|    explained_variance   | 0.989   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 230          |
|    ep_rew_mean          | 259          |
| time/                   |              |
|    fps                  | 677          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 3311616      |
| train/                  |              |
|    approx_kl            | 0.0042380644 |
|    clip_fraction        | 0.035        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.381       |
|    explained_variance   | 0.988        |
|    learning_rate        | 0.0003       |
|    loss                 | 4.88         |
|    n_updates            | 16160        |
|    policy_gradient_loss | 0.000158     |
|    value_loss           | 10.2         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 190          |
|    ep_rew_mean          | 272          |
| time/                   |              |
|    fps                  | 626          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 3334144      |
| train/                  |              |
|    approx_kl            | 0.0043810634 |
|    clip_fraction        | 0.0424       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.297       |
|    explained_variance   | 0.928        |
|    learning_rate        | 0.0003       |
|    loss                 | 6.02         |
|    n_updates            | 16270        |
|    policy_gradient_loss | -0.00426     |
|    value_loss           | 18.7         |
------------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 227        |
|    ep_rew_mean          | 251        |
| time/                   |            |
|    fps                  | 597        |
|    iterations           | 4          |
|    time_elapsed         | 13         |
|    total_timesteps      | 3356672    |
| train/                  |            |
|    approx_kl            | 0.05328373 |
|    clip_fraction        | 0.105      |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.254     |
|    explained_variance   | 0.955      |
|    learning_rate        | 0.0003     |
|    loss                 | 13.5       |
|    n_updates            | 16380      |
|    policy_gradient_loss | -0.00932   |
|    value_loss           | 68.1       |
----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 230         |

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 229          |
|    ep_rew_mean          | 275          |
| time/                   |              |
|    fps                  | 592          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 3379200      |
| train/                  |              |
|    approx_kl            | 0.0054666633 |
|    clip_fraction        | 0.0467       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.251       |
|    explained_variance   | 0.732        |
|    learning_rate        | 0.0003       |
|    loss                 | 10.8         |
|    n_updates            | 16490        |
|    policy_gradient_loss | -0.00187     |
|    value_loss           | 113          |
------------------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 192      |
|    ep_rew_mean     | 271      |
| time/              |          |
|    fps             | 883      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 3401728  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 199          |
|    ep_rew_mean          | 266          |
| time/                   |              |
|    fps                  | 672          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 3403776      |
| train/                  |              |
|    approx_kl            | 0.0149639975 |
|    clip_fraction        | 0.075        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.282       |
|    explained_variance   | 0.971   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 240         |
|    ep_rew_mean          | 262         |
| time/                   |             |
|    fps                  | 679         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 3424256     |
| train/                  |             |
|    approx_kl            | 0.008117301 |
|    clip_fraction        | 0.0746      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.35       |
|    explained_variance   | 0.985       |
|    learning_rate        | 0.0003      |
|    loss                 | 4.87        |
|    n_updates            | 16710       |
|    policy_gradient_loss | -0.00263    |
|    value_loss           | 15          |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 232 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 191         |
|    ep_rew_mean          | 268         |
| time/                   |             |
|    fps                  | 623         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 3446784     |
| train/                  |             |
|    approx_kl            | 0.005696102 |
|    clip_fraction        | 0.043       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.235      |
|    explained_variance   | 0.977       |
|    learning_rate        | 0.0003      |
|    loss                 | 9.56        |
|    n_updates            | 16820       |
|    policy_gradient_loss | -0.00144    |
|    value_loss           | 14.6        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 191 

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 199         |
|    ep_rew_mean          | 270         |
| time/                   |             |
|    fps                  | 605         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 3469312     |
| train/                  |             |
|    approx_kl            | 0.007999771 |
|    clip_fraction        | 0.0619      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.239      |
|    explained_variance   | 0.858       |
|    learning_rate        | 0.0003      |
|    loss                 | 5.76        |
|    n_updates            | 16930       |
|    policy_gradient_loss | -0.00249    |
|    value_loss           | 25.3        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 200 

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 205          |
|    ep_rew_mean          | 272          |
| time/                   |              |
|    fps                  | 592          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 3491840      |
| train/                  |              |
|    approx_kl            | 0.0053888317 |
|    clip_fraction        | 0.0502       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.266       |
|    explained_variance   | 0.939        |
|    learning_rate        | 0.0003       |
|    loss                 | 6.35         |
|    n_updates            | 17040        |
|    policy_gradient_loss | -0.00237     |
|    value_loss           | 14.9         |
------------------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 186      |
|    ep_rew_mean     | 271      |
| time/              |          |
|    fps             | 898      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 3514368  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 186          |
|    ep_rew_mean          | 275          |
| time/                   |              |
|    fps                  | 680          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 3516416      |
| train/                  |              |
|    approx_kl            | 0.0062764767 |
|    clip_fraction        | 0.0615       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.23        |
|    explained_variance   | 0.977        |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 195          |
|    ep_rew_mean          | 279          |
| time/                   |              |
|    fps                  | 673          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 3536896      |
| train/                  |              |
|    approx_kl            | 0.0062566274 |
|    clip_fraction        | 0.0463       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.234       |
|    explained_variance   | 0.883        |
|    learning_rate        | 0.0003       |
|    loss                 | 6.67         |
|    n_updates            | 17260        |
|    policy_gradient_loss | -0.00323     |
|    value_loss           | 23           |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 217         |
|    ep_rew_mean          | 275         |
| time/                   |             |
|    fps                  | 627         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 3559424     |
| train/                  |             |
|    approx_kl            | 0.009034982 |
|    clip_fraction        | 0.0893      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.355      |
|    explained_variance   | 0.978       |
|    learning_rate        | 0.0003      |
|    loss                 | 2.2         |
|    n_updates            | 17370       |
|    policy_gradient_loss | 0.00138     |
|    value_loss           | 8.5         |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 233          |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 202          |
|    ep_rew_mean          | 258          |
| time/                   |              |
|    fps                  | 601          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 3581952      |
| train/                  |              |
|    approx_kl            | 0.0030470204 |
|    clip_fraction        | 0.0247       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.352       |
|    explained_variance   | 0.866        |
|    learning_rate        | 0.0003       |
|    loss                 | 11.4         |
|    n_updates            | 17480        |
|    policy_gradient_loss | -0.00352     |
|    value_loss           | 161          |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 217         |
|    ep_rew_mean          | 279         |
| time/                   |             |
|    fps                  | 588         |
|    iterations           | 5           |
|    time_elapsed         | 17          |
|    total_timesteps      | 3604480     |
| train/                  |             |
|    approx_kl            | 0.007116205 |
|    clip_fraction        | 0.0586      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.269      |
|    explained_variance   | 0.972       |
|    learning_rate        | 0.0003      |
|    loss                 | 7.64        |
|    n_updates            | 17590       |
|    policy_gradient_loss | -0.00255    |
|    value_loss           | 12          |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 208      |
---------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 195      |
|    ep_rew_mean     | 273      |
| time/              |          |
|    fps             | 906      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 3627008  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 196         |
|    ep_rew_mean          | 274         |
| time/                   |             |
|    fps                  | 692         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 3629056     |
| train/                  |             |
|    approx_kl            | 0.006009842 |
|    clip_fraction        | 0.0557      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.249      |
|    explained_variance   | 0.979       |
-----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 225         |
|    ep_rew_mean          | 256         |
| time/                   |             |
|    fps                  | 643         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 3651584     |
| train/                  |             |
|    approx_kl            | 0.009444876 |
|    clip_fraction        | 0.0968      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.406      |
|    explained_variance   | 0.994       |
|    learning_rate        | 0.0003      |
|    loss                 | 2.47        |
|    n_updates            | 17820       |
|    policy_gradient_loss | -0.000806   |
|    value_loss           | 5.84        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 235          |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 214          |
|    ep_rew_mean          | 266          |
| time/                   |              |
|    fps                  | 623          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 3674112      |
| train/                  |              |
|    approx_kl            | 0.0071685864 |
|    clip_fraction        | 0.067        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.252       |
|    explained_variance   | 0.98         |
|    learning_rate        | 0.0003       |
|    loss                 | 5.11         |
|    n_updates            | 17930        |
|    policy_gradient_loss | -0.00109     |
|    value_loss           | 11.2         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 226         |
|    ep_rew_mean          | 266         |
| time/                   |             |
|    fps                  | 599         |
|    iterations           | 5           |
|    time_elapsed         | 17          |
|    total_timesteps      | 3696640     |
| train/                  |             |
|    approx_kl            | 0.060332578 |
|    clip_fraction        | 0.128       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.479      |
|    explained_variance   | 0.968       |
|    learning_rate        | 0.0003      |
|    loss                 | 6.14        |
|    n_updates            | 18040       |
|    policy_gradient_loss | -0.00547    |
|    value_loss           | 10.8        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 230      |
---------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 243      |
|    ep_rew_mean     | 260      |
| time/              |          |
|    fps             | 880      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 3719168  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 239         |
|    ep_rew_mean          | 258         |
| time/                   |             |
|    fps                  | 679         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 3721216     |
| train/                  |             |
|    approx_kl            | 0.004222528 |
|    clip_fraction        | 0.0426      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.241      |
|    explained_variance   | 0.895       |
-----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 213         |
|    ep_rew_mean          | 271         |
| time/                   |             |
|    fps                  | 676         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 3741696     |
| train/                  |             |
|    approx_kl            | 0.008902808 |
|    clip_fraction        | 0.0688      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.385      |
|    explained_variance   | 0.989       |
|    learning_rate        | 0.0003      |
|    loss                 | 0.964       |
|    n_updates            | 18260       |
|    policy_gradient_loss | 0.000914    |
|    value_loss           | 8.08        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 212          |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 213          |
|    ep_rew_mean          | 268          |
| time/                   |              |
|    fps                  | 630          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 3764224      |
| train/                  |              |
|    approx_kl            | 0.0024267263 |
|    clip_fraction        | 0.0313       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.235       |
|    explained_variance   | 0.955        |
|    learning_rate        | 0.0003       |
|    loss                 | 20.5         |
|    n_updates            | 18370        |
|    policy_gradient_loss | -0.00117     |
|    value_loss           | 27.9         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 221         |
|    ep_rew_mean          | 256         |
| time/                   |             |
|    fps                  | 607         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 3786752     |
| train/                  |             |
|    approx_kl            | 0.004229565 |
|    clip_fraction        | 0.0401      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.264      |
|    explained_variance   | 0.736       |
|    learning_rate        | 0.0003      |
|    loss                 | 24.1        |
|    n_updates            | 18480       |
|    policy_gradient_loss | -0.00247    |
|    value_loss           | 108         |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 221         |
-----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 250         |
|    ep_rew_mean          | 254         |
| time/                   |             |
|    fps                  | 610         |
|    iterations           | 5           |
|    time_elapsed         | 16          |
|    total_timesteps      | 3809280     |
| train/                  |             |
|    approx_kl            | 0.007188528 |
|    clip_fraction        | 0.0657      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.311      |
|    explained_variance   | 0.931       |
|    learning_rate        | 0.0003      |
|    loss                 | 8.33        |
|    n_updates            | 18590       |
|    policy_gradient_loss | -0.00222    |
|    value_loss           | 66.7        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 259      |
---------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 262      |
|    ep_rew_mean     | 255      |
| time/              |          |
|    fps             | 903      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 3831808  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 253         |
|    ep_rew_mean          | 257         |
| time/                   |             |
|    fps                  | 695         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 3833856     |
| train/                  |             |
|    approx_kl            | 0.004148267 |
|    clip_fraction        | 0.0295      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.23       |
|    explained_variance   | 0.855       |
-----------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 216          |
|    ep_rew_mean          | 270          |
| time/                   |              |
|    fps                  | 644          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 3856384      |
| train/                  |              |
|    approx_kl            | 0.0052547697 |
|    clip_fraction        | 0.0519       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.263       |
|    explained_variance   | 0.962        |
|    learning_rate        | 0.0003       |
|    loss                 | 5.61         |
|    n_updates            | 18820        |
|    policy_gradient_loss | -0.00295     |
|    value_loss           | 14.2         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 225          |
|    ep_rew_mean          | 273          |
| time/                   |              |
|    fps                  | 622          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 3878912      |
| train/                  |              |
|    approx_kl            | 0.0043014754 |
|    clip_fraction        | 0.0584       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.343       |
|    explained_variance   | 0.984        |
|    learning_rate        | 0.0003       |
|    loss                 | 5.67         |
|    n_updates            | 18930        |
|    policy_gradient_loss | 0.000736     |
|    value_loss           | 10.5         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 190          |
|    ep_rew_mean          | 268          |
| time/                   |              |
|    fps                  | 608          |
|    iterations           | 5            |
|    time_elapsed         | 16           |
|    total_timesteps      | 3901440      |
| train/                  |              |
|    approx_kl            | 0.0064878473 |
|    clip_fraction        | 0.0549       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.237       |
|    explained_variance   | 0.938        |
|    learning_rate        | 0.0003       |
|    loss                 | 7.46         |
|    n_updates            | 19040        |
|    policy_gradient_loss | -0.00171     |
|    value_loss           | 17.1         |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 194      |
|    ep_rew_mean     | 278      |
| time/              |          |
|    fps             | 909      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 3923968  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 190         |
|    ep_rew_mean          | 277         |
| time/                   |             |
|    fps                  | 697         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 3926016     |
| train/                  |             |
|    approx_kl            | 0.010624244 |
|    clip_fraction        | 0.111       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.401      |
|    explained_variance   | 0.949       |
-----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 207         |
|    ep_rew_mean          | 272         |
| time/                   |             |
|    fps                  | 646         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 3948544     |
| train/                  |             |
|    approx_kl            | 0.005836977 |
|    clip_fraction        | 0.0647      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.243      |
|    explained_variance   | 0.954       |
|    learning_rate        | 0.0003      |
|    loss                 | 5.4         |
|    n_updates            | 19270       |
|    policy_gradient_loss | -0.00558    |
|    value_loss           | 21.9        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 216          |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 196          |
|    ep_rew_mean          | 266          |
| time/                   |              |
|    fps                  | 625          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 3971072      |
| train/                  |              |
|    approx_kl            | 0.0041452884 |
|    clip_fraction        | 0.0275       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.198       |
|    explained_variance   | 0.919        |
|    learning_rate        | 0.0003       |
|    loss                 | 15.1         |
|    n_updates            | 19380        |
|    policy_gradient_loss | -0.00214     |
|    value_loss           | 34.1         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 203         |
|    ep_rew_mean          | 275         |
| time/                   |             |
|    fps                  | 610         |
|    iterations           | 5           |
|    time_elapsed         | 16          |
|    total_timesteps      | 3993600     |
| train/                  |             |
|    approx_kl            | 0.004304942 |
|    clip_fraction        | 0.0493      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.219      |
|    explained_variance   | 0.974       |
|    learning_rate        | 0.0003      |
|    loss                 | 7.03        |
|    n_updates            | 19490       |
|    policy_gradient_loss | -0.000251   |
|    value_loss           | 10          |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 205      |
---------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 196      |
|    ep_rew_mean     | 274      |
| time/              |          |
|    fps             | 922      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 4016128  |
---------------------------------
--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 197      |
|    ep_rew_mean          | 276      |
| time/                   |          |
|    fps                  | 698      |
|    iterations           | 2        |
|    time_elapsed         | 5        |
|    total_timesteps      | 4018176  |
| train/                  |          |
|    approx_kl            | 0.006312 |
|    clip_fraction        | 0.0473   |
|    clip_range           | 0.2      |
|    entropy_loss         | -0.225   |
|    explained_variance   | 0.97     |
|    learning_rate        | 0.0003   |
--------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 222         |
|    ep_rew_mean          | 262         |
| time/                   |             |
|    fps                  | 644         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 4040704     |
| train/                  |             |
|    approx_kl            | 0.003301883 |
|    clip_fraction        | 0.0267      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.224      |
|    explained_variance   | 0.765       |
|    learning_rate        | 0.0003      |
|    loss                 | 21.6        |
|    n_updates            | 19720       |
|    policy_gradient_loss | -0.000303   |
|    value_loss           | 136         |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 214         |
-----------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 203          |
|    ep_rew_mean          | 264          |
| time/                   |              |
|    fps                  | 623          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 4063232      |
| train/                  |              |
|    approx_kl            | 0.0053747846 |
|    clip_fraction        | 0.0905       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.421       |
|    explained_variance   | 0.931        |
|    learning_rate        | 0.0003       |
|    loss                 | 4.42         |
|    n_updates            | 19830        |
|    policy_gradient_loss | -0.0014      |
|    value_loss           | 24.3         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 189         |
|    ep_rew_mean          | 276         |
| time/                   |             |
|    fps                  | 611         |
|    iterations           | 5           |
|    time_elapsed         | 16          |
|    total_timesteps      | 4085760     |
| train/                  |             |
|    approx_kl            | 0.010938975 |
|    clip_fraction        | 0.0497      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.227      |
|    explained_variance   | 0.797       |
|    learning_rate        | 0.0003      |
|    loss                 | 21.8        |
|    n_updates            | 19940       |
|    policy_gradient_loss | -0.00537    |
|    value_loss           | 131         |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 188      |
---------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 191      |
|    ep_rew_mean     | 271      |
| time/              |          |
|    fps             | 911      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 4108288  |
---------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 193        |
|    ep_rew_mean          | 272        |
| time/                   |            |
|    fps                  | 697        |
|    iterations           | 2          |
|    time_elapsed         | 5          |
|    total_timesteps      | 4110336    |
| train/                  |            |
|    approx_kl            | 0.00630658 |
|    clip_fraction        | 0.0598     |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.293     |
|    explained_variance   | 0.956      |
----------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 192          |
|    ep_rew_mean          | 271          |
| time/                   |              |
|    fps                  | 693          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 4130816      |
| train/                  |              |
|    approx_kl            | 0.0043435153 |
|    clip_fraction        | 0.031        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.233       |
|    explained_variance   | 0.974        |
|    learning_rate        | 0.0003       |
|    loss                 | 4.81         |
|    n_updates            | 20160        |
|    policy_gradient_loss | -0.00253     |
|    value_loss           | 16.3         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 210         |
|    ep_rew_mean          | 270         |
| time/                   |             |
|    fps                  | 644         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 4153344     |
| train/                  |             |
|    approx_kl            | 0.008315055 |
|    clip_fraction        | 0.0503      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.427      |
|    explained_variance   | 0.984       |
|    learning_rate        | 0.0003      |
|    loss                 | 6.89        |
|    n_updates            | 20270       |
|    policy_gradient_loss | -0.00186    |
|    value_loss           | 10.7        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 210         |
-----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 210         |
|    ep_rew_mean          | 280         |
| time/                   |             |
|    fps                  | 622         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 4175872     |
| train/                  |             |
|    approx_kl            | 0.004910225 |
|    clip_fraction        | 0.0452      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.24       |
|    explained_variance   | 0.971       |
|    learning_rate        | 0.0003      |
|    loss                 | 4.66        |
|    n_updates            | 20380       |
|    policy_gradient_loss | -0.00255    |
|    value_loss           | 12.5        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 208          |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 206          |
|    ep_rew_mean          | 258          |
| time/                   |              |
|    fps                  | 612          |
|    iterations           | 5            |
|    time_elapsed         | 16           |
|    total_timesteps      | 4198400      |
| train/                  |              |
|    approx_kl            | 0.0024306932 |
|    clip_fraction        | 0.0318       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.206       |
|    explained_variance   | 0.829        |
|    learning_rate        | 0.0003       |
|    loss                 | 58.1         |
|    n_updates            | 20490        |
|    policy_gradient_loss | -0.00204     |
|    value_loss           | 113          |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 201      |
|    ep_rew_mean     | 287      |
| time/              |          |
|    fps             | 920      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 4220928  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 197         |
|    ep_rew_mean          | 286         |
| time/                   |             |
|    fps                  | 697         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 4222976     |
| train/                  |             |
|    approx_kl            | 0.004462202 |
|    clip_fraction        | 0.0538      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.211      |
|    explained_variance   | 0.983       |
|    lea

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 189         |
|    ep_rew_mean          | 257         |
| time/                   |             |
|    fps                  | 688         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 4243456     |
| train/                  |             |
|    approx_kl            | 0.011111973 |
|    clip_fraction        | 0.109       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.441      |
|    explained_variance   | 0.957       |
|    learning_rate        | 0.0003      |
|    loss                 | 3.34        |
|    n_updates            | 20710       |
|    policy_gradient_loss | 0.00445     |
|    value_loss           | 12.9        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 185          |
|    ep_rew_mean          | 279          |
| time/                   |              |
|    fps                  | 569          |
|    iterations           | 3            |
|    time_elapsed         | 10           |
|    total_timesteps      | 4265984      |
| train/                  |              |
|    approx_kl            | 0.0022733668 |
|    clip_fraction        | 0.0243       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.192       |
|    explained_variance   | 0.889        |
|    learning_rate        | 0.0003       |
|    loss                 | 16.8         |
|    n_updates            | 20820        |
|    policy_gradient_loss | -0.00132     |
|    value_loss           | 33.2         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 213         |
|    ep_rew_mean          | 279         |
| time/                   |             |
|    fps                  | 608         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 4288512     |
| train/                  |             |
|    approx_kl            | 0.016196057 |
|    clip_fraction        | 0.0697      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.348      |
|    explained_variance   | 0.974       |
|    learning_rate        | 0.0003      |
|    loss                 | 4.96        |
|    n_updates            | 20930       |
|    policy_gradient_loss | -0.000445   |
|    value_loss           | 11.8        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 188          |
|    ep_rew_mean          | 270          |
| time/                   |              |
|    fps                  | 552          |
|    iterations           | 5            |
|    time_elapsed         | 18           |
|    total_timesteps      | 4311040      |
| train/                  |              |
|    approx_kl            | 0.0062956056 |
|    clip_fraction        | 0.0416       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.23        |
|    explained_variance   | 0.836        |
|    learning_rate        | 0.0003       |
|    loss                 | 6.52         |
|    n_updates            | 21040        |
|    policy_gradient_loss | -0.00191     |
|    value_loss           | 77.6         |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 210      |
|    ep_rew_mean     | 258      |
| time/              |          |
|    fps             | 814      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 4333568  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 204          |
|    ep_rew_mean          | 261          |
| time/                   |              |
|    fps                  | 615          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 4335616      |
| train/                  |              |
|    approx_kl            | 0.0037503624 |
|    clip_fraction        | 0.0369       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.224       |
|    explained_variance   | 0.937   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 193         |
|    ep_rew_mean          | 278         |
| time/                   |             |
|    fps                  | 668         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 4356096     |
| train/                  |             |
|    approx_kl            | 0.020289969 |
|    clip_fraction        | 0.0893      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.296      |
|    explained_variance   | 0.968       |
|    learning_rate        | 0.0003      |
|    loss                 | 9.13        |
|    n_updates            | 21260       |
|    policy_gradient_loss | -0.014      |
|    value_loss           | 16          |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 185          |
|    ep_rew_mean          | 259          |
| time/                   |              |
|    fps                  | 604          |
|    iterations           | 3            |
|    time_elapsed         | 10           |
|    total_timesteps      | 4378624      |
| train/                  |              |
|    approx_kl            | 0.0029609958 |
|    clip_fraction        | 0.0257       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.199       |
|    explained_variance   | 0.895        |
|    learning_rate        | 0.0003       |
|    loss                 | 34.2         |
|    n_updates            | 21370        |
|    policy_gradient_loss | -0.00103     |
|    value_loss           | 49.6         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 186          |
|    ep_rew_mean          | 274          |
| time/                   |              |
|    fps                  | 587          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 4401152      |
| train/                  |              |
|    approx_kl            | 0.0049670795 |
|    clip_fraction        | 0.0465       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.212       |
|    explained_variance   | 0.929        |
|    learning_rate        | 0.0003       |
|    loss                 | 11.4         |
|    n_updates            | 21480        |
|    policy_gradient_loss | -0.00174     |
|    value_loss           | 26.3         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 216         |
|    ep_rew_mean          | 265         |
| time/                   |             |
|    fps                  | 599         |
|    iterations           | 5           |
|    time_elapsed         | 17          |
|    total_timesteps      | 4423680     |
| train/                  |             |
|    approx_kl            | 0.004357067 |
|    clip_fraction        | 0.0371      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.18       |
|    explained_variance   | 0.95        |
|    learning_rate        | 0.0003      |
|    loss                 | 6.54        |
|    n_updates            | 21590       |
|    policy_gradient_loss | -0.000542   |
|    value_loss           | 15.6        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 197      |
|    ep_rew_mean     | 272      |
| time/              |          |
|    fps             | 822      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 4446208  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 214         |
|    ep_rew_mean          | 268         |
| time/                   |             |
|    fps                  | 645         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 4448256     |
| train/                  |             |
|    approx_kl            | 0.009246084 |
|    clip_fraction        | 0.107       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.371      |
|    explained_variance   | 0.937       |
|    lea

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 201          |
|    ep_rew_mean          | 266          |
| time/                   |              |
|    fps                  | 677          |
|    iterations           | 2            |
|    time_elapsed         | 6            |
|    total_timesteps      | 4468736      |
| train/                  |              |
|    approx_kl            | 0.0049563306 |
|    clip_fraction        | 0.0373       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.207       |
|    explained_variance   | 0.962        |
|    learning_rate        | 0.0003       |
|    loss                 | 21.4         |
|    n_updates            | 21810        |
|    policy_gradient_loss | -0.00249     |
|    value_loss           | 19.7         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 251          |
|    ep_rew_mean          | 245          |
| time/                   |              |
|    fps                  | 629          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 4491264      |
| train/                  |              |
|    approx_kl            | 0.0025465945 |
|    clip_fraction        | 0.0469       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.392       |
|    explained_variance   | 0.946        |
|    learning_rate        | 0.0003       |
|    loss                 | 40.9         |
|    n_updates            | 21920        |
|    policy_gradient_loss | -0.00255     |
|    value_loss           | 67.4         |
------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 196          |
|    ep_rew_mean          | 272          |
| time/                   |              |
|    fps                  | 607          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 4513792      |
| train/                  |              |
|    approx_kl            | 0.0073207505 |
|    clip_fraction        | 0.0683       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.219       |
|    explained_variance   | 0.969        |
|    learning_rate        | 0.0003       |
|    loss                 | 6.02         |
|    n_updates            | 22030        |
|    policy_gradient_loss | -0.00327     |
|    value_loss           | 17           |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 254         |
|    ep_rew_mean          | 251         |
| time/                   |             |
|    fps                  | 596         |
|    iterations           | 5           |
|    time_elapsed         | 17          |
|    total_timesteps      | 4536320     |
| train/                  |             |
|    approx_kl            | 0.006938141 |
|    clip_fraction        | 0.0756      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.421      |
|    explained_variance   | 0.947       |
|    learning_rate        | 0.0003      |
|    loss                 | 20.7        |
|    n_updates            | 22140       |
|    policy_gradient_loss | -0.000145   |
|    value_loss           | 74          |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 211      |
|    ep_rew_mean     | 268      |
| time/              |          |
|    fps             | 871      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 4558848  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 218         |
|    ep_rew_mean          | 267         |
| time/                   |             |
|    fps                  | 676         |
|    iterations           | 2           |
|    time_elapsed         | 6           |
|    total_timesteps      | 4560896     |
| train/                  |             |
|    approx_kl            | 0.017033827 |
|    clip_fraction        | 0.23        |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.522      |
|    explained_variance   | 0.983       |
|    lea

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 201         |
|    ep_rew_mean          | 266         |
| time/                   |             |
|    fps                  | 630         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 4583424     |
| train/                  |             |
|    approx_kl            | 0.009402466 |
|    clip_fraction        | 0.0413      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.22       |
|    explained_variance   | 0.819       |
|    learning_rate        | 0.0003      |
|    loss                 | 10          |
|    n_updates            | 22370       |
|    policy_gradient_loss | -0.00316    |
|    value_loss           | 44.3        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 236         |
|    ep_rew_mean          | 249         |
| time/                   |             |
|    fps                  | 608         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 4605952     |
| train/                  |             |
|    approx_kl            | 0.006574074 |
|    clip_fraction        | 0.0769      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.386      |
|    explained_variance   | 0.944       |
|    learning_rate        | 0.0003      |
|    loss                 | 8.54        |
|    n_updates            | 22480       |
|    policy_gradient_loss | -0.00354    |
|    value_loss           | 90.6        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 185          |
|    ep_rew_mean          | 268          |
| time/                   |              |
|    fps                  | 597          |
|    iterations           | 5            |
|    time_elapsed         | 17           |
|    total_timesteps      | 4628480      |
| train/                  |              |
|    approx_kl            | 0.0029761107 |
|    clip_fraction        | 0.0344       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.212       |
|    explained_variance   | 0.844        |
|    learning_rate        | 0.0003       |
|    loss                 | 20.8         |
|    n_updates            | 22590        |
|    policy_gradient_loss | -0.000982    |
|    value_loss           | 108          |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 192      |
|    ep_rew_mean     | 266      |
| time/              |          |
|    fps             | 899      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 4651008  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 193          |
|    ep_rew_mean          | 266          |
| time/                   |              |
|    fps                  | 694          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 4653056      |
| train/                  |              |
|    approx_kl            | 0.0024829633 |
|    clip_fraction        | 0.029        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.198       |
|    explained_variance   | 0.834   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 195          |
|    ep_rew_mean          | 262          |
| time/                   |              |
|    fps                  | 692          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 4673536      |
| train/                  |              |
|    approx_kl            | 0.0036804774 |
|    clip_fraction        | 0.0385       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.334       |
|    explained_variance   | 0.956        |
|    learning_rate        | 0.0003       |
|    loss                 | 32.2         |
|    n_updates            | 22810        |
|    policy_gradient_loss | -0.00106     |
|    value_loss           | 84.4         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 201         |
|    ep_rew_mean          | 266         |
| time/                   |             |
|    fps                  | 639         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 4696064     |
| train/                  |             |
|    approx_kl            | 0.004892018 |
|    clip_fraction        | 0.0492      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.204      |
|    explained_variance   | 0.963       |
|    learning_rate        | 0.0003      |
|    loss                 | 3.17        |
|    n_updates            | 22920       |
|    policy_gradient_loss | -0.00169    |
|    value_loss           | 14.3        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 229         |
|    ep_rew_mean          | 270         |
| time/                   |             |
|    fps                  | 621         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 4718592     |
| train/                  |             |
|    approx_kl            | 0.012078384 |
|    clip_fraction        | 0.0604      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.22       |
|    explained_variance   | 0.972       |
|    learning_rate        | 0.0003      |
|    loss                 | 9.33        |
|    n_updates            | 23030       |
|    policy_gradient_loss | -0.00338    |
|    value_loss           | 13.6        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 185          |
|    ep_rew_mean          | 274          |
| time/                   |              |
|    fps                  | 610          |
|    iterations           | 5            |
|    time_elapsed         | 16           |
|    total_timesteps      | 4741120      |
| train/                  |              |
|    approx_kl            | 0.0042837122 |
|    clip_fraction        | 0.0458       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.235       |
|    explained_variance   | 0.849        |
|    learning_rate        | 0.0003       |
|    loss                 | 4.78         |
|    n_updates            | 23140        |
|    policy_gradient_loss | -0.00119     |
|    value_loss           | 44           |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 195      |
|    ep_rew_mean     | 262      |
| time/              |          |
|    fps             | 897      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 4763648  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 194          |
|    ep_rew_mean          | 261          |
| time/                   |              |
|    fps                  | 690          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 4765696      |
| train/                  |              |
|    approx_kl            | 0.0036097053 |
|    clip_fraction        | 0.0414       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.233       |
|    explained_variance   | 0.933   

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 195         |
|    ep_rew_mean          | 273         |
| time/                   |             |
|    fps                  | 695         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 4786176     |
| train/                  |             |
|    approx_kl            | 0.007863218 |
|    clip_fraction        | 0.0598      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.244      |
|    explained_variance   | 0.941       |
|    learning_rate        | 0.0003      |
|    loss                 | 6.8         |
|    n_updates            | 23360       |
|    policy_gradient_loss | -0.00239    |
|    value_loss           | 14.1        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 192          |
|    ep_rew_mean          | 278          |
| time/                   |              |
|    fps                  | 643          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 4808704      |
| train/                  |              |
|    approx_kl            | 0.0026953875 |
|    clip_fraction        | 0.0327       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.189       |
|    explained_variance   | 0.965        |
|    learning_rate        | 0.0003       |
|    loss                 | 7.94         |
|    n_updates            | 23470        |
|    policy_gradient_loss | -0.00244     |
|    value_loss           | 17.2         |
------------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 194         |
|    ep_rew_mean          | 266         |
| time/                   |             |
|    fps                  | 621         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 4831232     |
| train/                  |             |
|    approx_kl            | 0.009373893 |
|    clip_fraction        | 0.133       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.392      |
|    explained_variance   | 0.988       |
|    learning_rate        | 0.0003      |
|    loss                 | 2.75        |
|    n_updates            | 23580       |
|    policy_gradient_loss | -0.0014     |
|    value_loss           | 6.76        |
-----------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 212          |
|    ep_rew_mean          | 270          |
| time/                   |              |
|    fps                  | 609          |
|    iterations           | 5            |
|    time_elapsed         | 16           |
|    total_timesteps      | 4853760      |
| train/                  |              |
|    approx_kl            | 0.0033068033 |
|    clip_fraction        | 0.0296       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.24        |
|    explained_variance   | 0.737        |
|    learning_rate        | 0.0003       |
|    loss                 | 21.8         |
|    n_updates            | 23690        |
|    policy_gradient_loss | -0.00269     |
|    value_loss           | 115          |
------------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 204      |
|    ep_rew_mean     | 275      |
| time/              |          |
|    fps             | 910      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 4876288  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 203         |
|    ep_rew_mean          | 275         |
| time/                   |             |
|    fps                  | 698         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 4878336     |
| train/                  |             |
|    approx_kl            | 0.005205643 |
|    clip_fraction        | 0.0519      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.28       |
|    explained_variance   | 0.982       |
|    lea

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 225         |
|    ep_rew_mean          | 255         |
| time/                   |             |
|    fps                  | 643         |
|    iterations           | 3           |
|    time_elapsed         | 9           |
|    total_timesteps      | 4900864     |
| train/                  |             |
|    approx_kl            | 0.006290025 |
|    clip_fraction        | 0.0443      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.215      |
|    explained_variance   | 0.881       |
|    learning_rate        | 0.0003      |
|    loss                 | 11.9        |
|    n_updates            | 23920       |
|    policy_gradient_loss | -0.00105    |
|    value_loss           | 34.4        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 206         |
|    ep_rew_mean          | 263         |
| time/                   |             |
|    fps                  | 623         |
|    iterations           | 4           |
|    time_elapsed         | 13          |
|    total_timesteps      | 4923392     |
| train/                  |             |
|    approx_kl            | 0.004294006 |
|    clip_fraction        | 0.0583      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.269      |
|    explained_variance   | 0.974       |
|    learning_rate        | 0.0003      |
|    loss                 | 9.48        |
|    n_updates            | 24030       |
|    policy_gradient_loss | 0.00065     |
|    value_loss           | 15.7        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 202         |
|    ep_rew_mean          | 278         |
| time/                   |             |
|    fps                  | 610         |
|    iterations           | 5           |
|    time_elapsed         | 16          |
|    total_timesteps      | 4945920     |
| train/                  |             |
|    approx_kl            | 0.008342976 |
|    clip_fraction        | 0.0692      |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.228      |
|    explained_variance   | 0.978       |
|    learning_rate        | 0.0003      |
|    loss                 | 5.08        |
|    n_updates            | 24140       |
|    policy_gradient_loss | -0.00607    |
|    value_loss           | 10.2        |
-----------------------------------------
Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 210  

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 226      |
|    ep_rew_mean     | 268      |
| time/              |          |
|    fps             | 912      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 4968448  |
---------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 219          |
|    ep_rew_mean          | 270          |
| time/                   |              |
|    fps                  | 700          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 4970496      |
| train/                  |              |
|    approx_kl            | 0.0092020845 |
|    clip_fraction        | 0.102        |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.4         |
|    explained_variance   | 0.987   

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 241          |
|    ep_rew_mean          | 271          |
| time/                   |              |
|    fps                  | 695          |
|    iterations           | 2            |
|    time_elapsed         | 5            |
|    total_timesteps      | 4990976      |
| train/                  |              |
|    approx_kl            | 0.0027979596 |
|    clip_fraction        | 0.0443       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.285       |
|    explained_variance   | 0.666        |
|    learning_rate        | 0.0003       |
|    loss                 | 11.2         |
|    n_updates            | 24360        |
|    policy_gradient_loss | -0.000234    |
|    value_loss           | 90.1         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 236          |
|    ep_rew_mean          | 271          |
| time/                   |              |
|    fps                  | 644          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 5013504      |
| train/                  |              |
|    approx_kl            | 0.0050131045 |
|    clip_fraction        | 0.0417       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.257       |
|    explained_variance   | 0.947        |
|    learning_rate        | 0.0003       |
|    loss                 | 6.38         |
|    n_updates            | 24470        |
|    policy_gradient_loss | -0.00346     |
|    value_loss           | 34.2         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 235          |
|    ep_rew_mean          | 265          |
| time/                   |              |
|    fps                  | 623          |
|    iterations           | 4            |
|    time_elapsed         | 13           |
|    total_timesteps      | 5036032      |
| train/                  |              |
|    approx_kl            | 0.0038324941 |
|    clip_fraction        | 0.0403       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.208       |
|    explained_variance   | 0.913        |
|    learning_rate        | 0.0003       |
|    loss                 | 6.6          |
|    n_updates            | 24580        |
|    policy_gradient_loss | -0.0044      |
|    value_loss           | 19.2         |
------------------------------------------

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 215          |
|    ep_rew_mean          | 264          |
| time/                   |              |
|    fps                  | 609          |
|    iterations           | 5            |
|    time_elapsed         | 16           |
|    total_timesteps      | 5058560      |
| train/                  |              |
|    approx_kl            | 0.0061190673 |
|    clip_fraction        | 0.0425       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.263       |
|    explained_variance   | 0.969        |
|    learning_rate        | 0.0003       |
|    loss                 | 33.9         |
|    n_updates            | 24690        |
|    policy_gradient_loss | -0.00313     |
|    value_loss           | 42.4         |
------------------------------------------

Logging to logs\PPO_0
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 211      |
|    ep_rew_mean     | 260      |
| time/              |          |
|    fps             | 906      |
|    iterations      | 1        |
|    time_elapsed    | 2        |
|    total_timesteps | 5081088  |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 213         |
|    ep_rew_mean          | 261         |
| time/                   |             |
|    fps                  | 697         |
|    iterations           | 2           |
|    time_elapsed         | 5           |
|    total_timesteps      | 5083136     |
| train/                  |             |
|    approx_kl            | 0.010598378 |
|    clip_fraction        | 0.14        |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.379      |
|    explained_variance   | 0.979       |
|    lea

------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 242          |
|    ep_rew_mean          | 275          |
| time/                   |              |
|    fps                  | 644          |
|    iterations           | 3            |
|    time_elapsed         | 9            |
|    total_timesteps      | 5105664      |
| train/                  |              |
|    approx_kl            | 0.0044596205 |
|    clip_fraction        | 0.0415       |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.205       |
|    explained_variance   | 0.981        |
|    learning_rate        | 0.0003       |
|    loss                 | 4.11         |
|    n_updates            | 24920        |
|    policy_gradient_loss | -0.00127     |
|    value_loss           | 11.1         |
------------------------------------------

##### Also do this with A2C

In [None]:
# Create directories for saving models and logs if they don't exist
import os

models_dir = 'models/A2C'
logdir = 'logs'

# exist_ok=True makes each call a no-op when the directory is already there
os.makedirs(models_dir, exist_ok=True)
os.makedirs(logdir, exist_ok=True)

In [None]:
model = A2C('MlpPolicy', env, verbose=1, tensorboard_log=logdir)

In [None]:
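# Train in chunks of TIMESTEPS and save a checkpoint after every chunk.
# reset_num_timesteps=False keeps the step counter running across learn() calls.
TIMESTEPS = 10000
for i in range(1, 30):  # 29 chunks of 10,000 timesteps each
    model.learn(total_timesteps=TIMESTEPS, reset_num_timesteps=False, tb_log_name='A2C')
    model.save(f"{models_dir}/{TIMESTEPS*i}")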

In [None]:
from stable_baselines3 import DQN

In [None]:
# Create directories for saving models and logs if they don't exist
import os

models_dir = 'models/DQN'
logdir = 'logs'

# exist_ok=True makes each call a no-op when the directory is already there
os.makedirs(models_dir, exist_ok=True)
os.makedirs(logdir, exist_ok=True)

In [None]:
model = DQN('MlpPolicy', env, verbose=1, tensorboard_log=logdir)

In [None]:
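# Train in chunks of TIMESTEPS and save a checkpoint after every chunk.
# reset_num_timesteps=False keeps the step counter running across learn() calls.
TIMESTEPS = 10000
for i in range(1, 30):  # 29 chunks of 10,000 timesteps each
    model.learn(total_timesteps=TIMESTEPS, reset_num_timesteps=False, tb_log_name='DQN')
    model.save(f"{models_dir}/{TIMESTEPS*i}")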
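The loops above leave a series of checkpoints named after their step counts (`10000.zip`, `20000.zip`, and so on; `model.save` appends the `.zip` extension). As a minimal sketch, the helper below finds the most recent one. `list_checkpoints` and `latest_checkpoint` are hypothetical names, not part of Stable-Baselines3. Step counts are compared as integers because a plain string sort would place `100000.zip` before `20000.zip`.

```python
import os
import re


def list_checkpoints(models_dir):
    """Return (timesteps, path) pairs for files named '<timesteps>.zip', oldest first."""
    found = []
    for name in os.listdir(models_dir):
        match = re.fullmatch(r"(\d+)\.zip", name)
        if match:  # skip anything that is not a numbered checkpoint
            found.append((int(match.group(1)), os.path.join(models_dir, name)))
    return sorted(found)  # tuples sort by timesteps first


def latest_checkpoint(models_dir):
    """Return the (timesteps, path) pair with the highest step count, or None."""
    checkpoints = list_checkpoints(models_dir)
    return checkpoints[-1] if checkpoints else None
```

The returned path can then be handed to the matching loader, for example `DQN.load(path, env=env)`.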

# Understanding the output

Sample:
| rollout/            |          |
|    ep_len_mean      | 101      |
|    ep_rew_mean      | -283     |
|    exploration_rate | 0.05     |
| time/               |          |
|    episodes         | 552      |
|    fps              | 879      |
|    time_elapsed     | 3        |
|    total_timesteps  | 52787    |
| train/              |          |
|    learning_rate    | 0.0001   |
|    loss             | 1.35     |
|    n_updates        | 696      |
----------------------------------


Let's break down each field in this sample training log. It comes from a Stable-Baselines3 DQN run (the `exploration_rate` row is specific to DQN). PPO and A2C logs, like the ones above, share the `rollout/` and `time/` sections but report different `train/` fields.

### Rollout Parameters

- **`ep_len_mean` (Mean Episode Length)**: The average episode length, in timesteps, over the most recent episodes (Stable-Baselines3 averages over the last 100 by default). A value of 101 means recent episodes lasted about 101 timesteps on average. How to read it depends on the environment: in LunarLander, very short episodes usually end in a crash, while very long ones can mean the lander is hovering instead of landing.

- **`ep_rew_mean` (Mean Episode Reward)**: The average total reward per episode over the same window. A value of -283 means the agent is still performing poorly. For reference, LunarLander is generally considered solved at an average of about 200, which the PPO runs above exceed.

- **`exploration_rate`**: The current ε of the ε-greedy strategy, that is, the probability that the agent takes a random action instead of the one it currently rates best. A value of 0.05 means 5% of actions are random. Stable-Baselines3 anneals ε from a high initial value down to a final value during training, and 0.05 is the default final value for DQN.
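To make the `exploration_rate` row concrete, here is a minimal sketch of a linearly annealed ε-greedy rule. The default arguments (start at 1.0, end at 0.05, anneal over the first 10% of training) are assumed to match the Stable-Baselines3 DQN defaults; `linear_epsilon` and `epsilon_greedy` are hypothetical helper names, not library functions.

```python
import random


def linear_epsilon(progress, initial_eps=1.0, final_eps=0.05, exploration_fraction=0.1):
    """Epsilon after `progress` (0.0 to 1.0) of the training budget has been used."""
    if progress >= exploration_fraction:
        return final_eps  # annealing finished: stay at the final value
    # Linear interpolation between the initial and final value
    return initial_eps + (progress / exploration_fraction) * (final_eps - initial_eps)


def epsilon_greedy(q_values, eps, rng=random):
    """Pick a random action with probability eps, otherwise the highest-valued one."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


print(linear_epsilon(0.0))   # 1.0 at the very start
print(linear_epsilon(0.5))   # 0.05 once annealing is over
```

With these assumed settings, ε reaches 0.05 after the first tenth of training and stays there, which is consistent with the 0.05 shown in the sample.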

### Time Parameters

- **`episodes`**: The total number of episodes completed so far, here 552. The PPO logs above report `iterations` instead.

- **`fps` (Frames Per Second)**: Despite the name, this counts environment timesteps processed per second, including the time spent on gradient updates, not rendered frames. A value of 879 means training is advancing by about 879 timesteps per second.

- **`time_elapsed`**: The time in seconds since the current `learn()` call started, not since the very first one. Because training here runs as repeated `learn()` calls, the counter restarts with each call, which is why it reads 3 even though 52,787 timesteps have been processed.

- **`total_timesteps`**: The total number of environment timesteps processed so far. With `reset_num_timesteps=False` this counter carries over between `learn()` calls, so 52,787 is the cumulative total.
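The time fields in the sample can be cross-checked with a little arithmetic. The sketch below assumes the `learn()` call that produced the sample started at 50,000 timesteps (the chunk size is 10,000) and had been running for 3.17 seconds. The log only shows the truncated value 3, so 3.17 is an illustrative guess. `fps_for_call` is a hypothetical helper that mirrors how the logged value appears to be derived, not a Stable-Baselines3 function.

```python
def fps_for_call(total_timesteps, timesteps_at_call_start, seconds_elapsed):
    """Timesteps per second within the current learn() call, truncated to an int."""
    steps_this_call = total_timesteps - timesteps_at_call_start
    # Guard against division by zero right after the call starts
    return int(steps_this_call / max(seconds_elapsed, 1e-9))


# 2,787 timesteps in an assumed 3.17 s
print(fps_for_call(52787, 50000, 3.17))  # 879
```

Dividing the cumulative 52,787 timesteps by 3 seconds would instead give more than 17,000, which is why `fps` only makes sense relative to the current call.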

### Training Parameters

- **`learning_rate`**: The step size used by the optimizer. A value of 0.0001 is the Stable-Baselines3 default for DQN; the PPO logs above show 0.0003.

- **`loss`**: The most recent value of the loss the optimizer is minimizing. For DQN this measures the gap between predicted and target Q-values. Unlike in supervised learning, a falling loss does not by itself mean the agent is improving, because the targets shift as the policy changes, so judge progress by `ep_rew_mean`.

- **`n_updates`**: The number of gradient updates applied to the model so far, here 696. This is far smaller than `total_timesteps` because DQN only starts updating after a warm-up period and then updates once every few environment steps.
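The gap between `n_updates` and `total_timesteps` can be reproduced with a rough estimate. The sketch below assumes a warm-up of 50,000 steps (`learning_starts`) and one gradient update every 4 environment steps (`train_freq`). These are assumed settings, which I believe were the DQN defaults in older Stable-Baselines3 releases; check your installed version. `estimate_n_updates` is a hypothetical helper, not a library function.

```python
def estimate_n_updates(total_timesteps, learning_starts=50_000, train_freq=4):
    """Rough count of gradient updates: none during warm-up, then one per train_freq steps."""
    steps_after_warmup = max(0, total_timesteps - learning_starts)
    return steps_after_warmup // train_freq


print(estimate_n_updates(52787))  # 696
```

Under these assumptions the estimate matches the 696 in the sample, and it also explains why the early DQN logs show no `train/` section at all: no updates happen before the warm-up ends.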

Each of these metrics provides valuable insights into the training process, helping you understand the agent's performance, the efficiency of the training, and the stability of the learning.

In [None]:
from IPython.display import Image
Image(filename="tensorboard_pic.JPG")

The command `tensorboard --logdir=logs` launches TensorBoard, a visualization tool for machine learning experiments, and points it at the directory where the training logs are written (here `logs`, the value passed as `tensorboard_log`). While it runs, TensorBoard watches that directory and refreshes its plots as new data arrives, so you can follow metrics such as `ep_rew_mean` during training and compare the PPO, A2C, and DQN runs side by side. This makes it easier to debug issues, tune hyperparameters, and understand the learning dynamics.

To install TensorBoard, run `!pip install tensorboard`; installing `tensorflow` also brings it in. See https://stackoverflow.com/questions/33634008/how-do-i-install-tensorflows-tensorboard/33634101#33634101

tensorboard --logdir=logs
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.13.0 at http://localhost:6006/ (Press CTRL+C to quit)
