# <center>Table of Contents</center>

### 1. **Import Libraries**  
   - 1A. [Import Required Libraries](#1a-import-required-libraries)  
   - 1B. [Create Environment and Test](#1b-create-environment-and-test)  

### 2. **Train Model for Normal Version with PPO**  
   - 2A. [Preprocess Environment](#2a-preprocess-environment)  
   - 2B. [Train the Model](#2b-train-the-model)  
   - 2C. [Save the Model](#2c-save-the-model)  
   - 2D. [Evaluate the Model](#2d-evaluate-the-model)  

### 3. **Train Model for Hardcore Version with PPO**  
   - 3A. [Test the Environment](#3a-test-the-environment)  
   - 3B. [Preprocess Environment](#3b-preprocess-environment)  
   - 3C. [Train the Hardcore Model](#3c-train-the-hardcore-model)  
   - 3D. [Save the Hardcore Model](#3d-save-the-hardcore-model)  
   - 3E. [Evaluate the Hardcore Model](#3e-evaluate-the-hardcore-model)  

# <center>1. Import Libraries</center>

## 1A) Import Libraries

In [1]:
# Import the necessary libraries

# Gymnasium is the maintained successor to the gym library, used to create and interact with reinforcement learning environments
import gymnasium as gym

# Import PPO (Proximal Policy Optimization), a popular reinforcement learning algorithm, from stable-baselines3 (DQN and DDPG are imported alongside it)
from stable_baselines3 import PPO, DQN, DDPG

# Import the evaluation function to assess the performance of the trained policy
from stable_baselines3.common.evaluation import evaluate_policy

# Import Monitor to log training information such as rewards and episode lengths
from stable_baselines3.common.monitor import Monitor

# Import utility functions for vectorized environments, normalization, frame stacking, and video recording
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize, VecFrameStack, VecVideoRecorder

# Import os for handling directory creation and file paths
import os 

# Import pandas for handling and analyzing data (e.g., log files)
import pandas as pd

## 1B) Create Env and Test

In [None]:
# Create the BipedalWalker environment with human-rendering mode enabled
env = gym.make("BipedalWalker-v3", render_mode="human")

In [None]:
# Reset the environment (start a new episode) without a seed or options
# Gymnasium's reset() returns a tuple of (observation, info)
obs, info = env.reset()

# Let the agent take random actions for 1000 steps
for _ in range(1000):
    # Take a random action sampled from the environment's action space
    action = env.action_space.sample()
    
    # Step the environment forward using the chosen action
    # The environment returns the new observation (obs), the reward, 
    # whether the episode is done (done), if it was truncated (truncated), and additional info (info)
    obs, reward, done, truncated, info = env.step(action)
    
    # If the episode is finished (either done or truncated), reset the environment for a new episode
    if done or truncated:
        obs, info = env.reset()

# Close the environment when finished to clean up resources
env.close()

# <center>2) Train Model for Normal Version with PPO</center>

## 2A) Preprocess Environment

In [None]:
env = gym.make("BipedalWalker-v3")  # add render_mode="rgb_array" here if you use the VecVideoRecorder cell below

In [None]:
# Define the logs directory and create it if it doesn't exist
logs_dir = 'logs'
os.makedirs(logs_dir, exist_ok=True)

# Specify the log filename (change this if needed)
log_filename = ""  # Leave empty for the default 'monitor.csv'; any text you add
                   # gives '<text>.monitor.csv' instead

# Define the path for the monitor log
monitor_log_path = os.path.join(logs_dir, log_filename)

# Wrap the environment with Monitor and save logs to the defined path
env = Monitor(env, filename=monitor_log_path)
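
The naming rule described in the comment above can be sketched in plain Python. This is only an illustration of that convention, not the library's own code; the helper name `expected_monitor_path` and the run name `my_run` are hypothetical.

```python
import os

MONITOR_EXT = "monitor.csv"  # suffix named in the comment of the cell above


def expected_monitor_path(logs_dir: str, log_filename: str) -> str:
    """Return the CSV path the comment above describes (hypothetical helper)."""
    if log_filename == "":
        # No custom name: the log is <logs_dir>/monitor.csv
        return os.path.join(logs_dir, MONITOR_EXT)
    # Custom name: the log becomes <logs_dir>/<name>.monitor.csv
    return os.path.join(logs_dir, f"{log_filename}.{MONITOR_EXT}")


print(expected_monitor_path("logs", ""))
print(expected_monitor_path("logs", "my_run"))  # "my_run" is a made-up run name
```

Giving each training run its own `log_filename` keeps a later run from being written into the same CSV file as an earlier one.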

In [None]:
# Wrap the environment in a DummyVecEnv to enable vectorized operations
env = DummyVecEnv([lambda: env])

# Normalize observations and rewards in the environment
# norm_obs: Normalize observations
# norm_reward: Normalize rewards
# clip_obs: Clip the observation values to prevent outliers
env = VecNormalize(env, norm_obs=True, norm_reward=True, clip_obs=10.)

# Stack the last n_stack observations (here n_stack=4) to provide temporal information to the agent
env = VecFrameStack(env, n_stack=4)
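
To make the two wrappers above more concrete, here is a minimal pure-Python sketch of what observation normalization with clipping and frame stacking do conceptually. It is not the stable-baselines3 implementation: `normalize_obs` and `FrameStacker` are hypothetical names, the mean and variance are passed in rather than tracked as running statistics, and zero-filling the older slots after a reset is an assumption.

```python
import math
from collections import deque


def normalize_obs(obs, mean, var, clip_obs=10.0, epsilon=1e-8):
    """Standardize each component, then clip it to [-clip_obs, clip_obs]."""
    return [
        max(-clip_obs, min(clip_obs, (x - m) / math.sqrt(v + epsilon)))
        for x, m, v in zip(obs, mean, var)
    ]


class FrameStacker:
    """Keep the last n_stack observations and expose them as one flat vector."""

    def __init__(self, obs_dim, n_stack=4):
        self.obs_dim = obs_dim
        self.n_stack = n_stack
        self.frames = deque(maxlen=n_stack)

    def reset(self, first_obs):
        # Assumption: older slots are filled with zeros until real frames arrive
        self.frames.clear()
        for _ in range(self.n_stack - 1):
            self.frames.append([0.0] * self.obs_dim)
        self.frames.append(list(first_obs))
        return self.flat()

    def step(self, obs):
        # Appending to a full deque drops the oldest frame automatically
        self.frames.append(list(obs))
        return self.flat()

    def flat(self):
        return [x for frame in self.frames for x in frame]


# BipedalWalker observations have 24 components, so 4 stacked frames give 96
stacker = FrameStacker(obs_dim=24, n_stack=4)
stacked = stacker.reset([0.5] * 24)
print(len(stacked))  # 96
```

The practical consequence is that the policy is trained on 96-dimensional, normalized inputs, so any environment used later with the same model has to produce observations in that form.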

In [None]:
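# Define the video folder and create it if it doesn't exist
video_folder = 'videos'
os.makedirs(video_folder, exist_ok=True)

# Wrap the environment with VecVideoRecorder to record videos
# The recording is triggered every 1000 steps and each video will be 200 steps long
env = VecVideoRecorder(env, video_folder, record_video_trigger=lambda x: x % 1000 == 0, video_length=200)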

## 2B) Train Model

In [None]:
# Create the PPO model with a Multi-Layer Perceptron (MLP) policy
model = PPO("MlpPolicy", env, verbose=1)

In [None]:
model.learn(total_timesteps=1000000)

## 2C) Save Model

In [None]:
model.save("ppo_bipedalwalker_1M")

In [None]:
del model

## 2D) Evaluate Model

In [None]:
model = PPO.load("ppo_bipedalwalker_1M")

In [None]:
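# Note: the model was trained on VecNormalize + VecFrameStack observations, so the evaluation env needs the same wrappers for the observation shapes to match
env = gym.make("BipedalWalker-v3", render_mode="human")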

In [None]:
# Evaluate the model (e.g., over 10 episodes)
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)

print(f"Average reward: {mean_reward:.2f} ± {std_reward:.2f}")

- **Average Reward**: 248.39 ± 112.10
  - **Assessment**: The model performs well on average: a mean reward of 248.39 over 10 episodes shows that it has learned an effective walking policy, although it stays below the 300-point score at which BipedalWalker is considered solved. The high standard deviation (112.10) means that returns differ sharply between episodes, with some scoring well above the mean and others far below it. Per-episode rewards are worth inspecting to see which situations cause the low-scoring runs.
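
One way to look past the single "mean ± std" figure is to summarize the individual episode returns. The sketch below is a minimal example under stated assumptions: the helper `summarize_rewards` is hypothetical, the reward list is made-up illustrative data (not results from this notebook), the ± value is assumed to be a population standard deviation, and 300 is used as the "solved" threshold for BipedalWalker.

```python
import statistics


def summarize_rewards(episode_rewards, solved_threshold=300.0):
    """Summarize per-episode returns (hypothetical helper for illustration)."""
    return {
        "mean": statistics.fmean(episode_rewards),
        # Population standard deviation (assumed convention for the ± value)
        "std": statistics.pstdev(episode_rewards),
        "min": min(episode_rewards),
        "max": max(episode_rewards),
        # Share of episodes that reach the assumed "solved" score
        "solved_fraction": sum(r >= solved_threshold for r in episode_rewards)
        / len(episode_rewards),
    }


# Made-up rewards, only to show the call; replace with real per-episode returns
example_rewards = [310.0, 305.0, -20.0, 298.0]
for key, value in summarize_rewards(example_rewards).items():
    print(f"{key}: {value:.2f}")
```

A wide gap between `min` and `max` with a mean in between usually points to a few failed episodes rather than uniformly mediocre ones, which is a different problem to debug.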

# <center>3) Train Model for Hardcore Version with PPO</center>

## 3A) Test Environment

In [None]:
env = gym.make("BipedalWalker-v3", hardcore=True, render_mode="human")

In [None]:
# Reset the environment (start a new episode) without a seed or options
# Gymnasium's reset() returns a tuple of (observation, info)
obs, info = env.reset()

# Let the agent take random actions for 1000 steps
for _ in range(1000):
    # Take a random action sampled from the environment's action space
    action = env.action_space.sample()
    
    # Step the environment forward using the chosen action
    # The environment returns the new observation (obs), the reward, 
    # whether the episode is done (done), if it was truncated (truncated), and additional info (info)
    obs, reward, done, truncated, info = env.step(action)
    
    # If the episode is finished (either done or truncated), reset the environment for a new episode
    if done or truncated:
        obs, info = env.reset()

# Close the environment when finished to clean up resources
env.close()

## 3B) Preprocess Environment

In [11]:
env = gym.make("BipedalWalker-v3", hardcore=True)  # add render_mode="rgb_array" here if you use the VecVideoRecorder cell below

In [12]:
# Define the logs directory and create it if it doesn't exist
logs_dir = 'logs'
os.makedirs(logs_dir, exist_ok=True)

# Specify the log filename (change this if needed)
log_filename = "5m_hardcore"  # Leave empty for the default 'monitor.csv'; any text you add
                              # gives '<text>.monitor.csv' instead

# Define the path for the monitor log
monitor_log_path = os.path.join(logs_dir, log_filename)

# Wrap the environment with Monitor and save logs to the defined path
env = Monitor(env, filename=monitor_log_path)

In [13]:
# Wrap the environment in a DummyVecEnv to enable vectorized operations
env = DummyVecEnv([lambda: env])

# Normalize observations and rewards in the environment
# norm_obs: Normalize observations
# norm_reward: Normalize rewards
# clip_obs: Clip the observation values to prevent outliers
env = VecNormalize(env, norm_obs=True, norm_reward=True, clip_obs=10.)

# Stack the last n_stack observations (here n_stack=4) to provide temporal information to the agent
env = VecFrameStack(env, n_stack=4)

In [None]:
# Define the video folder and create it if it doesn't exist
video_folder = 'videos'
os.makedirs(video_folder, exist_ok=True)

# Wrap the environment with VecVideoRecorder to record videos
# The recording is triggered every 1000 steps and each video will be 200 steps long
env = VecVideoRecorder(env, video_folder, record_video_trigger=lambda x: x % 1000 == 0, video_length=200)
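
The trigger and length settings above can be reasoned about with a small simulation. This is a simplified model, not the wrapper's actual bookkeeping: it assumes a recording starts whenever the trigger fires on a step that is not already being recorded and then runs for `video_length` steps. The function name `recorded_ranges` is hypothetical.

```python
def recorded_ranges(total_steps, every=1000, video_length=200):
    """Return (start, end) step ranges covered by recordings (simplified model)."""
    trigger = lambda step: step % every == 0  # same rule as record_video_trigger
    ranges = []
    step = 0
    while step < total_steps:
        if trigger(step):
            # A recording covers video_length steps, or fewer if the run ends first
            end = min(step + video_length, total_steps)
            ranges.append((step, end))
            step = end
        else:
            step += 1
    return ranges


print(recorded_ranges(2500))  # [(0, 200), (1000, 1200), (2000, 2200)]
```

Under this model roughly one step in five is recorded, so over a multi-million-step run a sparser trigger (a larger `every`) keeps the number of video files manageable.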

## 3C) Train Model

In [14]:
# Create the PPO model with a Multi-Layer Perceptron (MLP) policy
model = PPO("MlpPolicy", env, verbose=1)

Using cpu device


In [15]:
model.learn(total_timesteps=5000000)

---------------------------------
| rollout/           |          |
|    ep_len_mean     | 818      |
|    ep_rew_mean     | -113     |
| time/              |          |
|    fps             | 3354     |
|    iterations      | 1        |
|    time_elapsed    | 0        |
|    total_timesteps | 2048     |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 1.08e+03    |
|    ep_rew_mean          | -110        |
| time/                   |             |
|    fps                  | 2780        |
|    iterations           | 2           |
|    time_elapsed         | 1           |
|    total_timesteps      | 4096        |
| train/                  |             |
|    approx_kl            | 0.011098673 |
|    clip_fraction        | 0.0916      |
|    clip_range           | 0.2         |
|    entropy_loss         | -5.67       |
|    explained_variance   | -0.0971     |
|    learning_rate        | 0.

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 703         |
|    ep_rew_mean          | -104        |
| time/                   |             |
|    fps                  | 2464        |
|    iterations           | 11          |
|    time_elapsed         | 9           |
|    total_timesteps      | 22528       |
| train/                  |             |
|    approx_kl            | 0.014261878 |
|    clip_fraction        | 0.147       |
|    clip_range           | 0.2         |
|    entropy_loss         | -5.31       |
|    explained_variance   | 0.926       |
|    learning_rate        | 0.0003      |
|    loss                 | 0.00343     |
|    n_updates            | 100         |
|    policy_gradient_loss | -0.0347     |
|    std                  | 0.909       |
|    value_loss           | 0.0715      |
-----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 867         |
|    ep_rew_mean          | -104        |
| time/                   |             |
|    fps                  | 2443        |
|    iterations           | 20          |
|    time_elapsed         | 16          |
|    total_timesteps      | 40960       |
| train/                  |             |
|    approx_kl            | 0.016412206 |
|    clip_fraction        | 0.151       |
|    clip_range           | 0.2         |
|    entropy_loss         | -4.98       |
|    explained_variance   | 0.832       |
|    learning_rate        | 0.0003      |
|    loss                 | -0.0697     |
|    n_updates            | 190         |
|    policy_gradient_loss | -0.0389     |
|    std                  | 0.835       |
|    value_loss           | 0.0281      |
-----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 979         |
|    ep_rew_mean          | -103        |
| time/                   |             |
|    fps                  | 2443        |
|    iterations           | 29          |
|    time_elapsed         | 24          |
|    total_timesteps      | 59392       |
| train/                  |             |
|    approx_kl            | 0.021307435 |
|    clip_fraction        | 0.214       |
|    clip_range           | 0.2         |
|    entropy_loss         | -4.58       |
|    explained_variance   | 0.753       |
|    learning_rate        | 0.0003      |
|    loss                 | -0.0448     |
|    n_updates            | 280         |
|    policy_gradient_loss | -0.051      |
|    std                  | 0.756       |
|    value_loss           | 0.0182      |
-----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 1.01e+03    |
|    ep_rew_mean          | -101        |
| time/                   |             |
|    fps                  | 2448        |
|    iterations           | 38          |
|    time_elapsed         | 31          |
|    total_timesteps      | 77824       |
| train/                  |             |
|    approx_kl            | 0.023527618 |
|    clip_fraction        | 0.24        |
|    clip_range           | 0.2         |
|    entropy_loss         | -4.16       |
|    explained_variance   | 0.812       |
|    learning_rate        | 0.0003      |
|    loss                 | -0.0826     |
|    n_updates            | 370         |
|    policy_gradient_loss | -0.0511     |
|    std                  | 0.681       |
|    value_loss           | 0.011       |
-----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 993       |
|    ep_rew_mean          | -103      |
| time/                   |           |
|    fps                  | 2450      |
|    iterations           | 47        |
|    time_elapsed         | 39        |
|    total_timesteps      | 96256     |
| train/                  |           |
|    approx_kl            | 0.0272046 |
|    clip_fraction        | 0.24      |
|    clip_range           | 0.2       |
|    entropy_loss         | -3.86     |
|    explained_variance   | 0.61      |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0556   |
|    n_updates            | 460       |
|    policy_gradient_loss | -0.0489   |
|    std                  | 0.632     |
|    value_loss           | 0.0238    |
---------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 1.07e+03    |
|    ep_rew_mean          | -102        |
| time/                   |             |
|    fps                  | 2452        |
|    iterations           | 56          |
|    time_elapsed         | 46          |
|    total_timesteps      | 114688      |
| train/                  |             |
|    approx_kl            | 0.028386265 |
|    clip_fraction        | 0.276       |
|    clip_range           | 0.2         |
|    entropy_loss         | -3.51       |
|    explained_variance   | 0.847       |
|    learning_rate        | 0.0003      |
|    loss                 | -0.083      |
|    n_updates            | 550         |
|    policy_gradient_loss | -0.0476     |
|    std                  | 0.581       |
|    value_loss           | 0.00979     |
-----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.14e+03   |
|    ep_rew_mean          | -99.2      |
| time/                   |            |
|    fps                  | 2450       |
|    iterations           | 65         |
|    time_elapsed         | 54         |
|    total_timesteps      | 133120     |
| train/                  |            |
|    approx_kl            | 0.03337264 |
|    clip_fraction        | 0.306      |
|    clip_range           | 0.2        |
|    entropy_loss         | -3.14      |
|    explained_variance   | 0.948      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0762    |
|    n_updates            | 640        |
|    policy_gradient_loss | -0.0557    |
|    std                  | 0.529      |
|    value_loss           | 0.00457    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.13e+03   |
|    ep_rew_mean          | -98        |
| time/                   |            |
|    fps                  | 2444       |
|    iterations           | 74         |
|    time_elapsed         | 62         |
|    total_timesteps      | 151552     |
| train/                  |            |
|    approx_kl            | 0.03827528 |
|    clip_fraction        | 0.316      |
|    clip_range           | 0.2        |
|    entropy_loss         | -2.75      |
|    explained_variance   | 0.868      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.064     |
|    n_updates            | 730        |
|    policy_gradient_loss | -0.0404    |
|    std                  | 0.478      |
|    value_loss           | 0.0162     |
----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 1.12e+03    |
|    ep_rew_mean          | -95.1       |
| time/                   |             |
|    fps                  | 2430        |
|    iterations           | 83          |
|    time_elapsed         | 69          |
|    total_timesteps      | 169984      |
| train/                  |             |
|    approx_kl            | 0.041690636 |
|    clip_fraction        | 0.345       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.29       |
|    explained_variance   | 0.919       |
|    learning_rate        | 0.0003      |
|    loss                 | -0.079      |
|    n_updates            | 820         |
|    policy_gradient_loss | -0.0553     |
|    std                  | 0.427       |
|    value_loss           | 0.00517     |
-----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 1.16e+03    |
|    ep_rew_mean          | -93         |
| time/                   |             |
|    fps                  | 2409        |
|    iterations           | 92          |
|    time_elapsed         | 78          |
|    total_timesteps      | 188416      |
| train/                  |             |
|    approx_kl            | 0.046933834 |
|    clip_fraction        | 0.343       |
|    clip_range           | 0.2         |
|    entropy_loss         | -1.9        |
|    explained_variance   | 0.871       |
|    learning_rate        | 0.0003      |
|    loss                 | -0.0676     |
|    n_updates            | 910         |
|    policy_gradient_loss | -0.0543     |
|    std                  | 0.388       |
|    value_loss           | 0.00544     |
-----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 1.21e+03    |
|    ep_rew_mean          | -88.9       |
| time/                   |             |
|    fps                  | 2379        |
|    iterations           | 101         |
|    time_elapsed         | 86          |
|    total_timesteps      | 206848      |
| train/                  |             |
|    approx_kl            | 0.051874746 |
|    clip_fraction        | 0.394       |
|    clip_range           | 0.2         |
|    entropy_loss         | -1.45       |
|    explained_variance   | 0.966       |
|    learning_rate        | 0.0003      |
|    loss                 | -0.0865     |
|    n_updates            | 1000        |
|    policy_gradient_loss | -0.0539     |
|    std                  | 0.347       |
|    value_loss           | 0.00319     |
-----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.2e+03    |
|    ep_rew_mean          | -83.2      |
| time/                   |            |
|    fps                  | 2338       |
|    iterations           | 110        |
|    time_elapsed         | 96         |
|    total_timesteps      | 225280     |
| train/                  |            |
|    approx_kl            | 0.08335616 |
|    clip_fraction        | 0.422      |
|    clip_range           | 0.2        |
|    entropy_loss         | -1.05      |
|    explained_variance   | 0.968      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0645    |
|    n_updates            | 1090       |
|    policy_gradient_loss | -0.0483    |
|    std                  | 0.314      |
|    value_loss           | 0.00999    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.17e+03   |
|    ep_rew_mean          | -80.6      |
| time/                   |            |
|    fps                  | 2275       |
|    iterations           | 120        |
|    time_elapsed         | 107        |
|    total_timesteps      | 245760     |
| train/                  |            |
|    approx_kl            | 0.07645377 |
|    clip_fraction        | 0.408      |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.64      |
|    explained_variance   | 0.954      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.05      |
|    n_updates            | 1190       |
|    policy_gradient_loss | -0.0532    |
|    std                  | 0.284      |
|    value_loss           | 0.0475     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.16e+03   |
|    ep_rew_mean          | -76.5      |
| time/                   |            |
|    fps                  | 2195       |
|    iterations           | 130        |
|    time_elapsed         | 121        |
|    total_timesteps      | 266240     |
| train/                  |            |
|    approx_kl            | 0.05320458 |
|    clip_fraction        | 0.393      |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.226     |
|    explained_variance   | 0.908      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0475    |
|    n_updates            | 1290       |
|    policy_gradient_loss | -0.028     |
|    std                  | 0.257      |
|    value_loss           | 0.007      |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.13e+03   |
|    ep_rew_mean          | -76.7      |
| time/                   |            |
|    fps                  | 2111       |
|    iterations           | 139        |
|    time_elapsed         | 134        |
|    total_timesteps      | 284672     |
| train/                  |            |
|    approx_kl            | 0.11086357 |
|    clip_fraction        | 0.469      |
|    clip_range           | 0.2        |
|    entropy_loss         | 0.132      |
|    explained_variance   | 0.733      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0809    |
|    n_updates            | 1380       |
|    policy_gradient_loss | -0.0477    |
|    std                  | 0.234      |
|    value_loss           | 0.0263     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.08e+03   |
|    ep_rew_mean          | -74.7      |
| time/                   |            |
|    fps                  | 2027       |
|    iterations           | 149        |
|    time_elapsed         | 150        |
|    total_timesteps      | 305152     |
| train/                  |            |
|    approx_kl            | 0.10715556 |
|    clip_fraction        | 0.388      |
|    clip_range           | 0.2        |
|    entropy_loss         | 0.578      |
|    explained_variance   | 0.983      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0581    |
|    n_updates            | 1480       |
|    policy_gradient_loss | -0.0334    |
|    std                  | 0.209      |
|    value_loss           | 0.00364    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.03e+03   |
|    ep_rew_mean          | -74.1      |
| time/                   |            |
|    fps                  | 1961       |
|    iterations           | 159        |
|    time_elapsed         | 166        |
|    total_timesteps      | 325632     |
| train/                  |            |
|    approx_kl            | 0.11474913 |
|    clip_fraction        | 0.524      |
|    clip_range           | 0.2        |
|    entropy_loss         | 1          |
|    explained_variance   | 0.95       |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0707    |
|    n_updates            | 1580       |
|    policy_gradient_loss | -0.0472    |
|    std                  | 0.188      |
|    value_loss           | 0.0238     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 991        |
|    ep_rew_mean          | -75.6      |
| time/                   |            |
|    fps                  | 1911       |
|    iterations           | 169        |
|    time_elapsed         | 181        |
|    total_timesteps      | 346112     |
| train/                  |            |
|    approx_kl            | 0.12168567 |
|    clip_fraction        | 0.496      |
|    clip_range           | 0.2        |
|    entropy_loss         | 1.38       |
|    explained_variance   | 0.946      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.092     |
|    n_updates            | 1680       |
|    policy_gradient_loss | -0.0458    |
|    std                  | 0.172      |
|    value_loss           | 0.00773    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 977        |
|    ep_rew_mean          | -75.1      |
| time/                   |            |
|    fps                  | 1873       |
|    iterations           | 179        |
|    time_elapsed         | 195        |
|    total_timesteps      | 366592     |
| train/                  |            |
|    approx_kl            | 0.12558392 |
|    clip_fraction        | 0.547      |
|    clip_range           | 0.2        |
|    entropy_loss         | 1.85       |
|    explained_variance   | 0.916      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0464    |
|    n_updates            | 1780       |
|    policy_gradient_loss | -0.0422    |
|    std                  | 0.153      |
|    value_loss           | 0.00475    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.02e+03   |
|    ep_rew_mean          | -72.4      |
| time/                   |            |
|    fps                  | 1845       |
|    iterations           | 189        |
|    time_elapsed         | 209        |
|    total_timesteps      | 387072     |
| train/                  |            |
|    approx_kl            | 0.18257445 |
|    clip_fraction        | 0.586      |
|    clip_range           | 0.2        |
|    entropy_loss         | 2.1        |
|    explained_variance   | 0.846      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.083     |
|    n_updates            | 1880       |
|    policy_gradient_loss | -0.0427    |
|    std                  | 0.144      |
|    value_loss           | 0.00537    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.06e+03   |
|    ep_rew_mean          | -70.1      |
| time/                   |            |
|    fps                  | 1819       |
|    iterations           | 199        |
|    time_elapsed         | 223        |
|    total_timesteps      | 407552     |
| train/                  |            |
|    approx_kl            | 0.18044524 |
|    clip_fraction        | 0.518      |
|    clip_range           | 0.2        |
|    entropy_loss         | 2.32       |
|    explained_variance   | 0.972      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0332     |
|    n_updates            | 1980       |
|    policy_gradient_loss | -0.0161    |
|    std                  | 0.136      |
|    value_loss           | 0.00175    |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.07e+03  |
|    ep_rew_mean          | -69.2     |
| time/                   |           |
|    fps                  | 1800      |
|    iterations           | 209       |
|    time_elapsed         | 237       |
|    total_timesteps      | 428032    |
| train/                  |           |
|    approx_kl            | 0.1165915 |
|    clip_fraction        | 0.514     |
|    clip_range           | 0.2       |
|    entropy_loss         | 2.71      |
|    explained_variance   | 0.969     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0275   |
|    n_updates            | 2080      |
|    policy_gradient_loss | -0.023    |
|    std                  | 0.124     |
|    value_loss           | 0.00189   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.11e+03  |
|    ep_rew_mean          | -67.2     |
| time/                   |           |
|    fps                  | 1784      |
|    iterations           | 219       |
|    time_elapsed         | 251       |
|    total_timesteps      | 448512    |
| train/                  |           |
|    approx_kl            | 2.4927702 |
|    clip_fraction        | 0.573     |
|    clip_range           | 0.2       |
|    entropy_loss         | 2.91      |
|    explained_variance   | 0.889     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0908   |
|    n_updates            | 2180      |
|    policy_gradient_loss | -0.0291   |
|    std                  | 0.117     |
|    value_loss           | 0.00232   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.09e+03   |
|    ep_rew_mean          | -68.6      |
| time/                   |            |
|    fps                  | 1776       |
|    iterations           | 229        |
|    time_elapsed         | 263        |
|    total_timesteps      | 468992     |
| train/                  |            |
|    approx_kl            | 0.23071493 |
|    clip_fraction        | 0.607      |
|    clip_range           | 0.2        |
|    entropy_loss         | 3.13       |
|    explained_variance   | 0.962      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0743    |
|    n_updates            | 2280       |
|    policy_gradient_loss | -0.0335    |
|    std                  | 0.11       |
|    value_loss           | 0.0131     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 986        |
|    ep_rew_mean          | -71.6      |
| time/                   |            |
|    fps                  | 1771       |
|    iterations           | 239        |
|    time_elapsed         | 276        |
|    total_timesteps      | 489472     |
| train/                  |            |
|    approx_kl            | 0.18521348 |
|    clip_fraction        | 0.584      |
|    clip_range           | 0.2        |
|    entropy_loss         | 3.42       |
|    explained_variance   | 0.989      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0308    |
|    n_updates            | 2380       |
|    policy_gradient_loss | -0.021     |
|    std                  | 0.103      |
|    value_loss           | 0.00348    |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 993       |
|    ep_rew_mean          | -72.9     |
| time/                   |           |
|    fps                  | 1768      |
|    iterations           | 249       |
|    time_elapsed         | 288       |
|    total_timesteps      | 509952    |
| train/                  |           |
|    approx_kl            | 0.3243636 |
|    clip_fraction        | 0.665     |
|    clip_range           | 0.2       |
|    entropy_loss         | 3.63      |
|    explained_variance   | 0.978     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0683   |
|    n_updates            | 2480      |
|    policy_gradient_loss | -0.0313   |
|    std                  | 0.0976    |
|    value_loss           | 0.00414   |
---------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 1.02e+03    |
|    ep_rew_mean          | -72.3       |
| time/                   |             |
|    fps                  | 1764        |
|    iterations           | 259         |
|    time_elapsed         | 300         |
|    total_timesteps      | 530432      |
| train/                  |             |
|    approx_kl            | 0.105735704 |
|    clip_fraction        | 0.532       |
|    clip_range           | 0.2         |
|    entropy_loss         | 3.7         |
|    explained_variance   | 0.98        |
|    learning_rate        | 0.0003      |
|    loss                 | 0.0769      |
|    n_updates            | 2580        |
|    policy_gradient_loss | -0.00462    |
|    std                  | 0.0958      |
|    value_loss           | 0.00167     |
-----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 998        |
|    ep_rew_mean          | -71.5      |
| time/                   |            |
|    fps                  | 1758       |
|    iterations           | 269        |
|    time_elapsed         | 313        |
|    total_timesteps      | 550912     |
| train/                  |            |
|    approx_kl            | 0.38808104 |
|    clip_fraction        | 0.528      |
|    clip_range           | 0.2        |
|    entropy_loss         | 4.01       |
|    explained_variance   | 0.988      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0359    |
|    n_updates            | 2680       |
|    policy_gradient_loss | -0.0337    |
|    std                  | 0.0884     |
|    value_loss           | 0.00641    |
----------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.04e+03 |
|    ep_rew_mean          | -69      |
| time/                   |          |
|    fps                  | 1753     |
|    iterations           | 279      |
|    time_elapsed         | 325      |
|    total_timesteps      | 571392   |
| train/                  |          |
|    approx_kl            | 0.357893 |
|    clip_fraction        | 0.706    |
|    clip_range           | 0.2      |
|    entropy_loss         | 4.15     |
|    explained_variance   | 0.953    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.104   |
|    n_updates            | 2780     |
|    policy_gradient_loss | -0.0315  |
|    std                  | 0.0858   |
|    value_loss           | 0.00326  |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.14e+03  |
|    ep_rew_mean          | -67.1     |
| time/                   |           |
|    fps                  | 1748      |
|    iterations           | 289       |
|    time_elapsed         | 338       |
|    total_timesteps      | 591872    |
| train/                  |           |
|    approx_kl            | 1.5161158 |
|    clip_fraction        | 0.665     |
|    clip_range           | 0.2       |
|    entropy_loss         | 4.25      |
|    explained_variance   | 0.96      |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0736   |
|    n_updates            | 2880      |
|    policy_gradient_loss | -0.0223   |
|    std                  | 0.0836    |
|    value_loss           | 0.00405   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.09e+03   |
|    ep_rew_mean          | -68.5      |
| time/                   |            |
|    fps                  | 1747       |
|    iterations           | 299        |
|    time_elapsed         | 350        |
|    total_timesteps      | 612352     |
| train/                  |            |
|    approx_kl            | 0.20981196 |
|    clip_fraction        | 0.575      |
|    clip_range           | 0.2        |
|    entropy_loss         | 4.47       |
|    explained_variance   | 0.972      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.047      |
|    n_updates            | 2980       |
|    policy_gradient_loss | 0.00199    |
|    std                  | 0.0797     |
|    value_loss           | 0.0016     |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.1e+03   |
|    ep_rew_mean          | -68       |
| time/                   |           |
|    fps                  | 1747      |
|    iterations           | 309       |
|    time_elapsed         | 362       |
|    total_timesteps      | 632832    |
| train/                  |           |
|    approx_kl            | 0.5824002 |
|    clip_fraction        | 0.727     |
|    clip_range           | 0.2       |
|    entropy_loss         | 4.66      |
|    explained_variance   | 0.983     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0615   |
|    n_updates            | 3080      |
|    policy_gradient_loss | -0.0184   |
|    std                  | 0.0758    |
|    value_loss           | 0.0124    |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.15e+03   |
|    ep_rew_mean          | -67.6      |
| time/                   |            |
|    fps                  | 1748       |
|    iterations           | 319        |
|    time_elapsed         | 373        |
|    total_timesteps      | 653312     |
| train/                  |            |
|    approx_kl            | 0.19375107 |
|    clip_fraction        | 0.599      |
|    clip_range           | 0.2        |
|    entropy_loss         | 4.79       |
|    explained_variance   | 0.958      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0327    |
|    n_updates            | 3180       |
|    policy_gradient_loss | 0.0104     |
|    std                  | 0.0734     |
|    value_loss           | 0.003      |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.19e+03   |
|    ep_rew_mean          | -67.9      |
| time/                   |            |
|    fps                  | 1749       |
|    iterations           | 329        |
|    time_elapsed         | 385        |
|    total_timesteps      | 673792     |
| train/                  |            |
|    approx_kl            | 0.34687597 |
|    clip_fraction        | 0.655      |
|    clip_range           | 0.2        |
|    entropy_loss         | 4.8        |
|    explained_variance   | 0.903      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0488    |
|    n_updates            | 3280       |
|    policy_gradient_loss | -0.00538   |
|    std                  | 0.0728     |
|    value_loss           | 0.00308    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.12e+03   |
|    ep_rew_mean          | -68.9      |
| time/                   |            |
|    fps                  | 1749       |
|    iterations           | 339        |
|    time_elapsed         | 396        |
|    total_timesteps      | 694272     |
| train/                  |            |
|    approx_kl            | 0.42326754 |
|    clip_fraction        | 0.608      |
|    clip_range           | 0.2        |
|    entropy_loss         | 4.88       |
|    explained_variance   | 0.937      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0343     |
|    n_updates            | 3380       |
|    policy_gradient_loss | 0.015      |
|    std                  | 0.0713     |
|    value_loss           | 0.11       |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.14e+03  |
|    ep_rew_mean          | -65.7     |
| time/                   |           |
|    fps                  | 1749      |
|    iterations           | 349       |
|    time_elapsed         | 408       |
|    total_timesteps      | 714752    |
| train/                  |           |
|    approx_kl            | 0.3299926 |
|    clip_fraction        | 0.647     |
|    clip_range           | 0.2       |
|    entropy_loss         | 5         |
|    explained_variance   | 0.945     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0202   |
|    n_updates            | 3480      |
|    policy_gradient_loss | -0.017    |
|    std                  | 0.0694    |
|    value_loss           | 0.011     |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.18e+03 |
|    ep_rew_mean          | -64.8    |
| time/                   |          |
|    fps                  | 1750     |
|    iterations           | 359      |
|    time_elapsed         | 419      |
|    total_timesteps      | 735232   |
| train/                  |          |
|    approx_kl            | 1.070081 |
|    clip_fraction        | 0.662    |
|    clip_range           | 0.2      |
|    entropy_loss         | 5.01     |
|    explained_variance   | 0.853    |
|    learning_rate        | 0.0003   |
|    loss                 | 0.0664   |
|    n_updates            | 3580     |
|    policy_gradient_loss | 0.00953  |
|    std                  | 0.0694   |
|    value_loss           | 0.00375  |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.11e+03  |
|    ep_rew_mean          | -66.1     |
| time/                   |           |
|    fps                  | 1753      |
|    iterations           | 369       |
|    time_elapsed         | 431       |
|    total_timesteps      | 755712    |
| train/                  |           |
|    approx_kl            | 0.2744383 |
|    clip_fraction        | 0.652     |
|    clip_range           | 0.2       |
|    entropy_loss         | 5.01      |
|    explained_variance   | 0.836     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0415   |
|    n_updates            | 3680      |
|    policy_gradient_loss | 0.00599   |
|    std                  | 0.0694    |
|    value_loss           | 0.0031    |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.16e+03   |
|    ep_rew_mean          | -64.1      |
| time/                   |            |
|    fps                  | 1754       |
|    iterations           | 379        |
|    time_elapsed         | 442        |
|    total_timesteps      | 776192     |
| train/                  |            |
|    approx_kl            | 0.17811762 |
|    clip_fraction        | 0.556      |
|    clip_range           | 0.2        |
|    entropy_loss         | 5.02       |
|    explained_variance   | 0.959      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.00913   |
|    n_updates            | 3780       |
|    policy_gradient_loss | 0.00393    |
|    std                  | 0.0691     |
|    value_loss           | 0.000577   |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.17e+03   |
|    ep_rew_mean          | -63        |
| time/                   |            |
|    fps                  | 1756       |
|    iterations           | 389        |
|    time_elapsed         | 453        |
|    total_timesteps      | 796672     |
| train/                  |            |
|    approx_kl            | 0.11471979 |
|    clip_fraction        | 0.49       |
|    clip_range           | 0.2        |
|    entropy_loss         | 5.08       |
|    explained_variance   | 0.936      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.00553    |
|    n_updates            | 3880       |
|    policy_gradient_loss | -0.0083    |
|    std                  | 0.0677     |
|    value_loss           | 0.000795   |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.24e+03  |
|    ep_rew_mean          | -59.9     |
| time/                   |           |
|    fps                  | 1756      |
|    iterations           | 399       |
|    time_elapsed         | 465       |
|    total_timesteps      | 817152    |
| train/                  |           |
|    approx_kl            | 0.0780935 |
|    clip_fraction        | 0.521     |
|    clip_range           | 0.2       |
|    entropy_loss         | 5.06      |
|    explained_variance   | 0.917     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0425   |
|    n_updates            | 3980      |
|    policy_gradient_loss | 0.000794  |
|    std                  | 0.0684    |
|    value_loss           | 0.00073   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.33e+03   |
|    ep_rew_mean          | -55.6      |
| time/                   |            |
|    fps                  | 1757       |
|    iterations           | 409        |
|    time_elapsed         | 476        |
|    total_timesteps      | 837632     |
| train/                  |            |
|    approx_kl            | 0.67319393 |
|    clip_fraction        | 0.53       |
|    clip_range           | 0.2        |
|    entropy_loss         | 4.9        |
|    explained_variance   | 0.983      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0477    |
|    n_updates            | 4080       |
|    policy_gradient_loss | 0.0182     |
|    std                  | 0.0714     |
|    value_loss           | 0.000274   |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.33e+03   |
|    ep_rew_mean          | -54.7      |
| time/                   |            |
|    fps                  | 1758       |
|    iterations           | 419        |
|    time_elapsed         | 487        |
|    total_timesteps      | 858112     |
| train/                  |            |
|    approx_kl            | 0.43874693 |
|    clip_fraction        | 0.536      |
|    clip_range           | 0.2        |
|    entropy_loss         | 4.92       |
|    explained_variance   | 0.98       |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0607    |
|    n_updates            | 4180       |
|    policy_gradient_loss | -0.0182    |
|    std                  | 0.0705     |
|    value_loss           | 0.00452    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.35e+03   |
|    ep_rew_mean          | -52        |
| time/                   |            |
|    fps                  | 1759       |
|    iterations           | 429        |
|    time_elapsed         | 499        |
|    total_timesteps      | 878592     |
| train/                  |            |
|    approx_kl            | 0.24561316 |
|    clip_fraction        | 0.51       |
|    clip_range           | 0.2        |
|    entropy_loss         | 4.95       |
|    explained_variance   | 0.926      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0537    |
|    n_updates            | 4280       |
|    policy_gradient_loss | -0.0122    |
|    std                  | 0.0702     |
|    value_loss           | 0.000886   |
----------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.46e+03 |
|    ep_rew_mean          | -45.8    |
| time/                   |          |
|    fps                  | 1761     |
|    iterations           | 439      |
|    time_elapsed         | 510      |
|    total_timesteps      | 899072   |
| train/                  |          |
|    approx_kl            | 8.66913  |
|    clip_fraction        | 0.562    |
|    clip_range           | 0.2      |
|    entropy_loss         | 5.25     |
|    explained_variance   | 0.909    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0483  |
|    n_updates            | 4380     |
|    policy_gradient_loss | -0.0136  |
|    std                  | 0.065    |
|    value_loss           | 0.000992 |
--------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.48e+03   |
|    ep_rew_mean          | -41.4      |
| time/                   |            |
|    fps                  | 1761       |
|    iterations           | 449        |
|    time_elapsed         | 522        |
|    total_timesteps      | 919552     |
| train/                  |            |
|    approx_kl            | 0.33576387 |
|    clip_fraction        | 0.511      |
|    clip_range           | 0.2        |
|    entropy_loss         | 5.35       |
|    explained_variance   | 0.937      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0269    |
|    n_updates            | 4480       |
|    policy_gradient_loss | -0.00961   |
|    std                  | 0.0637     |
|    value_loss           | 0.000258   |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.49e+03   |
|    ep_rew_mean          | -39.1      |
| time/                   |            |
|    fps                  | 1762       |
|    iterations           | 459        |
|    time_elapsed         | 533        |
|    total_timesteps      | 940032     |
| train/                  |            |
|    approx_kl            | 0.13420036 |
|    clip_fraction        | 0.462      |
|    clip_range           | 0.2        |
|    entropy_loss         | 5.38       |
|    explained_variance   | 0.984      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0137    |
|    n_updates            | 4580       |
|    policy_gradient_loss | 0.00553    |
|    std                  | 0.0628     |
|    value_loss           | 0.00017    |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.49e+03  |
|    ep_rew_mean          | -37.7     |
| time/                   |           |
|    fps                  | 1763      |
|    iterations           | 469       |
|    time_elapsed         | 544       |
|    total_timesteps      | 960512    |
| train/                  |           |
|    approx_kl            | 1.8845935 |
|    clip_fraction        | 0.52      |
|    clip_range           | 0.2       |
|    entropy_loss         | 5.47      |
|    explained_variance   | 0.968     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0383   |
|    n_updates            | 4680      |
|    policy_gradient_loss | -0.0165   |
|    std                  | 0.0613    |
|    value_loss           | 0.00236   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.47e+03   |
|    ep_rew_mean          | -38        |
| time/                   |            |
|    fps                  | 1764       |
|    iterations           | 479        |
|    time_elapsed         | 556        |
|    total_timesteps      | 980992     |
| train/                  |            |
|    approx_kl            | 0.17154974 |
|    clip_fraction        | 0.525      |
|    clip_range           | 0.2        |
|    entropy_loss         | 5.53       |
|    explained_variance   | 0.994      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.00441    |
|    n_updates            | 4780       |
|    policy_gradient_loss | 0.00959    |
|    std                  | 0.0606     |
|    value_loss           | 0.000187   |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.5e+03   |
|    ep_rew_mean          | -35       |
| time/                   |           |
|    fps                  | 1764      |
|    iterations           | 489       |
|    time_elapsed         | 567       |
|    total_timesteps      | 1001472   |
| train/                  |           |
|    approx_kl            | 0.2419262 |
|    clip_fraction        | 0.445     |
|    clip_range           | 0.2       |
|    entropy_loss         | 5.75      |
|    explained_variance   | 0.948     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0327   |
|    n_updates            | 4880      |
|    policy_gradient_loss | -0.00145  |
|    std                  | 0.0576    |
|    value_loss           | 0.00162   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.54e+03  |
|    ep_rew_mean          | -31.2     |
| time/                   |           |
|    fps                  | 1764      |
|    iterations           | 499       |
|    time_elapsed         | 579       |
|    total_timesteps      | 1021952   |
| train/                  |           |
|    approx_kl            | 1.1299725 |
|    clip_fraction        | 0.483     |
|    clip_range           | 0.2       |
|    entropy_loss         | 5.94      |
|    explained_variance   | 0.97      |
|    learning_rate        | 0.0003    |
|    loss                 | -0.052    |
|    n_updates            | 4980      |
|    policy_gradient_loss | -0.0132   |
|    std                  | 0.0547    |
|    value_loss           | 0.000784  |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.53e+03   |
|    ep_rew_mean          | -29.6      |
| time/                   |            |
|    fps                  | 1763       |
|    iterations           | 509        |
|    time_elapsed         | 590        |
|    total_timesteps      | 1042432    |
| train/                  |            |
|    approx_kl            | 0.28032914 |
|    clip_fraction        | 0.582      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.09       |
|    explained_variance   | 0.99       |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0338    |
|    n_updates            | 5080       |
|    policy_gradient_loss | 0.0165     |
|    std                  | 0.053      |
|    value_loss           | 0.000848   |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.53e+03   |
|    ep_rew_mean          | -28.6      |
| time/                   |            |
|    fps                  | 1763       |
|    iterations           | 519        |
|    time_elapsed         | 602        |
|    total_timesteps      | 1062912    |
| train/                  |            |
|    approx_kl            | 0.31011832 |
|    clip_fraction        | 0.562      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.3        |
|    explained_variance   | 0.981      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0207     |
|    n_updates            | 5180       |
|    policy_gradient_loss | -0.00461   |
|    std                  | 0.0499     |
|    value_loss           | 0.000399   |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.53e+03  |
|    ep_rew_mean          | -27.6     |
| time/                   |           |
|    fps                  | 1763      |
|    iterations           | 529       |
|    time_elapsed         | 614       |
|    total_timesteps      | 1083392   |
| train/                  |           |
|    approx_kl            | 1.4730115 |
|    clip_fraction        | 0.558     |
|    clip_range           | 0.2       |
|    entropy_loss         | 6.58      |
|    explained_variance   | 0.834     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0401   |
|    n_updates            | 5280      |
|    policy_gradient_loss | 0.0467    |
|    std                  | 0.0466    |
|    value_loss           | 0.0022    |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.54e+03   |
|    ep_rew_mean          | -26        |
| time/                   |            |
|    fps                  | 1763       |
|    iterations           | 539        |
|    time_elapsed         | 626        |
|    total_timesteps      | 1103872    |
| train/                  |            |
|    approx_kl            | 0.28626275 |
|    clip_fraction        | 0.509      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.66       |
|    explained_variance   | 0.936      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0153    |
|    n_updates            | 5380       |
|    policy_gradient_loss | 0.0486     |
|    std                  | 0.046      |
|    value_loss           | 0.0014     |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.57e+03  |
|    ep_rew_mean          | -24.1     |
| time/                   |           |
|    fps                  | 1761      |
|    iterations           | 549       |
|    time_elapsed         | 638       |
|    total_timesteps      | 1124352   |
| train/                  |           |
|    approx_kl            | 0.6760707 |
|    clip_fraction        | 0.613     |
|    clip_range           | 0.2       |
|    entropy_loss         | 6.57      |
|    explained_variance   | 0.837     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0129    |
|    n_updates            | 5480      |
|    policy_gradient_loss | 0.0309    |
|    std                  | 0.0471    |
|    value_loss           | 0.00559   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.56e+03   |
|    ep_rew_mean          | -23.8      |
| time/                   |            |
|    fps                  | 1746       |
|    iterations           | 559        |
|    time_elapsed         | 655        |
|    total_timesteps      | 1144832    |
| train/                  |            |
|    approx_kl            | 0.53124547 |
|    clip_fraction        | 0.532      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.66       |
|    explained_variance   | 0.911      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.056     |
|    n_updates            | 5580       |
|    policy_gradient_loss | 0.00627    |
|    std                  | 0.0457     |
|    value_loss           | 0.00164    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.56e+03   |
|    ep_rew_mean          | -25.2      |
| time/                   |            |
|    fps                  | 1741       |
|    iterations           | 569        |
|    time_elapsed         | 669        |
|    total_timesteps      | 1165312    |
| train/                  |            |
|    approx_kl            | 0.17864121 |
|    clip_fraction        | 0.544      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.82       |
|    explained_variance   | 0.981      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0274     |
|    n_updates            | 5680       |
|    policy_gradient_loss | 0.009      |
|    std                  | 0.0442     |
|    value_loss           | 0.000268   |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.57e+03  |
|    ep_rew_mean          | -27.4     |
| time/                   |           |
|    fps                  | 1738      |
|    iterations           | 579       |
|    time_elapsed         | 682       |
|    total_timesteps      | 1185792   |
| train/                  |           |
|    approx_kl            | 1.0094087 |
|    clip_fraction        | 0.764     |
|    clip_range           | 0.2       |
|    entropy_loss         | 6.88      |
|    explained_variance   | 0.966     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0475   |
|    n_updates            | 5780      |
|    policy_gradient_loss | 0.0612    |
|    std                  | 0.0435    |
|    value_loss           | 0.0189    |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.52e+03 |
|    ep_rew_mean          | -32.4    |
| time/                   |          |
|    fps                  | 1735     |
|    iterations           | 589      |
|    time_elapsed         | 694      |
|    total_timesteps      | 1206272  |
| train/                  |          |
|    approx_kl            | 5.945257 |
|    clip_fraction        | 0.721    |
|    clip_range           | 0.2      |
|    entropy_loss         | 7.02     |
|    explained_variance   | 0.964    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.109   |
|    n_updates            | 5880     |
|    policy_gradient_loss | -0.00534 |
|    std                  | 0.0418   |
|    value_loss           | 0.00151  |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.5e+03   |
|    ep_rew_mean          | -37.3     |
| time/                   |           |
|    fps                  | 1734      |
|    iterations           | 599       |
|    time_elapsed         | 707       |
|    total_timesteps      | 1226752   |
| train/                  |           |
|    approx_kl            | 0.5843425 |
|    clip_fraction        | 0.641     |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.09      |
|    explained_variance   | 0.93      |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0226   |
|    n_updates            | 5980      |
|    policy_gradient_loss | 0.00215   |
|    std                  | 0.041     |
|    value_loss           | 0.00119   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.48e+03   |
|    ep_rew_mean          | -40.6      |
| time/                   |            |
|    fps                  | 1734       |
|    iterations           | 609        |
|    time_elapsed         | 719        |
|    total_timesteps      | 1247232    |
| train/                  |            |
|    approx_kl            | 0.94627845 |
|    clip_fraction        | 0.629      |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.06       |
|    explained_variance   | 0.903      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.083     |
|    n_updates            | 6080       |
|    policy_gradient_loss | -0.00276   |
|    std                  | 0.0413     |
|    value_loss           | 0.0027     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.46e+03   |
|    ep_rew_mean          | -42.3      |
| time/                   |            |
|    fps                  | 1733       |
|    iterations           | 619        |
|    time_elapsed         | 731        |
|    total_timesteps      | 1267712    |
| train/                  |            |
|    approx_kl            | 0.34085327 |
|    clip_fraction        | 0.619      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.96       |
|    explained_variance   | 0.935      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0441    |
|    n_updates            | 6180       |
|    policy_gradient_loss | 0.00673    |
|    std                  | 0.0424     |
|    value_loss           | 0.00485    |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.44e+03  |
|    ep_rew_mean          | -46.1     |
| time/                   |           |
|    fps                  | 1733      |
|    iterations           | 629       |
|    time_elapsed         | 743       |
|    total_timesteps      | 1288192   |
| train/                  |           |
|    approx_kl            | 0.3184476 |
|    clip_fraction        | 0.619     |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.1       |
|    explained_variance   | 0.977     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0316    |
|    n_updates            | 6280      |
|    policy_gradient_loss | 0.0335    |
|    std                  | 0.0411    |
|    value_loss           | 0.000296  |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.43e+03   |
|    ep_rew_mean          | -46.2      |
| time/                   |            |
|    fps                  | 1733       |
|    iterations           | 639        |
|    time_elapsed         | 754        |
|    total_timesteps      | 1308672    |
| train/                  |            |
|    approx_kl            | 0.37010562 |
|    clip_fraction        | 0.604      |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.11       |
|    explained_variance   | 0.94       |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0591     |
|    n_updates            | 6380       |
|    policy_gradient_loss | 0.00939    |
|    std                  | 0.0409     |
|    value_loss           | 0.00119    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.34e+03   |
|    ep_rew_mean          | -48.4      |
| time/                   |            |
|    fps                  | 1733       |
|    iterations           | 649        |
|    time_elapsed         | 766        |
|    total_timesteps      | 1329152    |
| train/                  |            |
|    approx_kl            | 0.95556957 |
|    clip_fraction        | 0.697      |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.31       |
|    explained_variance   | 0.963      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.00249   |
|    n_updates            | 6480       |
|    policy_gradient_loss | -6.17e-06  |
|    std                  | 0.0388     |
|    value_loss           | 0.00952    |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.34e+03  |
|    ep_rew_mean          | -46.8     |
| time/                   |           |
|    fps                  | 1734      |
|    iterations           | 659       |
|    time_elapsed         | 778       |
|    total_timesteps      | 1349632   |
| train/                  |           |
|    approx_kl            | 0.5615537 |
|    clip_fraction        | 0.638     |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.41      |
|    explained_variance   | 0.964     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0726    |
|    n_updates            | 6580      |
|    policy_gradient_loss | 0.0516    |
|    std                  | 0.0382    |
|    value_loss           | 0.0208    |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.26e+03  |
|    ep_rew_mean          | -47.3     |
| time/                   |           |
|    fps                  | 1735      |
|    iterations           | 669       |
|    time_elapsed         | 789       |
|    total_timesteps      | 1370112   |
| train/                  |           |
|    approx_kl            | 0.2602123 |
|    clip_fraction        | 0.619     |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.4       |
|    explained_variance   | 0.912     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0396   |
|    n_updates            | 6680      |
|    policy_gradient_loss | 0.0046    |
|    std                  | 0.0381    |
|    value_loss           | 0.00115   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.24e+03   |
|    ep_rew_mean          | -47.8      |
| time/                   |            |
|    fps                  | 1735       |
|    iterations           | 679        |
|    time_elapsed         | 801        |
|    total_timesteps      | 1390592    |
| train/                  |            |
|    approx_kl            | 0.69248605 |
|    clip_fraction        | 0.703      |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.51       |
|    explained_variance   | 0.913      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.00796   |
|    n_updates            | 6780       |
|    policy_gradient_loss | 0.0228     |
|    std                  | 0.0373     |
|    value_loss           | 0.00355    |
----------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.16e+03 |
|    ep_rew_mean          | -49.2    |
| time/                   |          |
|    fps                  | 1733     |
|    iterations           | 689      |
|    time_elapsed         | 813      |
|    total_timesteps      | 1411072  |
| train/                  |          |
|    approx_kl            | 11.61386 |
|    clip_fraction        | 0.785    |
|    clip_range           | 0.2      |
|    entropy_loss         | 7.51     |
|    explained_variance   | 0.949    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0701  |
|    n_updates            | 6880     |
|    policy_gradient_loss | 0.0411   |
|    std                  | 0.0369   |
|    value_loss           | 0.00168  |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.12e+03  |
|    ep_rew_mean          | -49.7     |
| time/                   |           |
|    fps                  | 1731      |
|    iterations           | 699       |
|    time_elapsed         | 826       |
|    total_timesteps      | 1431552   |
| train/                  |           |
|    approx_kl            | 0.7488717 |
|    clip_fraction        | 0.692     |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.6       |
|    explained_variance   | 0.965     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0393   |
|    n_updates            | 6980      |
|    policy_gradient_loss | -0.00202  |
|    std                  | 0.0359    |
|    value_loss           | 0.045     |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.05e+03  |
|    ep_rew_mean          | -53.1     |
| time/                   |           |
|    fps                  | 1730      |
|    iterations           | 709       |
|    time_elapsed         | 839       |
|    total_timesteps      | 1452032   |
| train/                  |           |
|    approx_kl            | 0.4144821 |
|    clip_fraction        | 0.631     |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.58      |
|    explained_variance   | 0.946     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0159   |
|    n_updates            | 7080      |
|    policy_gradient_loss | 0.0014    |
|    std                  | 0.0364    |
|    value_loss           | 0.00152   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.08e+03  |
|    ep_rew_mean          | -51.9     |
| time/                   |           |
|    fps                  | 1729      |
|    iterations           | 719       |
|    time_elapsed         | 851       |
|    total_timesteps      | 1472512   |
| train/                  |           |
|    approx_kl            | 0.8357861 |
|    clip_fraction        | 0.606     |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.58      |
|    explained_variance   | 0.981     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0876    |
|    n_updates            | 7180      |
|    policy_gradient_loss | 0.0126    |
|    std                  | 0.0362    |
|    value_loss           | 0.0092    |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.1e+03   |
|    ep_rew_mean          | -52.1     |
| time/                   |           |
|    fps                  | 1730      |
|    iterations           | 729       |
|    time_elapsed         | 862       |
|    total_timesteps      | 1492992   |
| train/                  |           |
|    approx_kl            | 0.7321145 |
|    clip_fraction        | 0.652     |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.74      |
|    explained_variance   | 0.964     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0272   |
|    n_updates            | 7280      |
|    policy_gradient_loss | 0.0134    |
|    std                  | 0.0347    |
|    value_loss           | 0.00281   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.14e+03   |
|    ep_rew_mean          | -50.8      |
| time/                   |            |
|    fps                  | 1730       |
|    iterations           | 739        |
|    time_elapsed         | 874        |
|    total_timesteps      | 1513472    |
| train/                  |            |
|    approx_kl            | 0.74373615 |
|    clip_fraction        | 0.694      |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.97       |
|    explained_variance   | 0.975      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0391    |
|    n_updates            | 7380       |
|    policy_gradient_loss | 0.00771    |
|    std                  | 0.0329     |
|    value_loss           | 0.00775    |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.15e+03  |
|    ep_rew_mean          | -51.7     |
| time/                   |           |
|    fps                  | 1731      |
|    iterations           | 749       |
|    time_elapsed         | 885       |
|    total_timesteps      | 1533952   |
| train/                  |           |
|    approx_kl            | 3.9517379 |
|    clip_fraction        | 0.717     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.03      |
|    explained_variance   | 0.962     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.055    |
|    n_updates            | 7480      |
|    policy_gradient_loss | 0.0853    |
|    std                  | 0.0323    |
|    value_loss           | 0.00209   |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.22e+03 |
|    ep_rew_mean          | -50.8    |
| time/                   |          |
|    fps                  | 1732     |
|    iterations           | 759      |
|    time_elapsed         | 897      |
|    total_timesteps      | 1554432  |
| train/                  |          |
|    approx_kl            | 2.991765 |
|    clip_fraction        | 0.795    |
|    clip_range           | 0.2      |
|    entropy_loss         | 8.14     |
|    explained_variance   | 0.953    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0695  |
|    n_updates            | 7580     |
|    policy_gradient_loss | 0.0212   |
|    std                  | 0.0314   |
|    value_loss           | 0.00804  |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.31e+03  |
|    ep_rew_mean          | -46.6     |
| time/                   |           |
|    fps                  | 1731      |
|    iterations           | 769       |
|    time_elapsed         | 909       |
|    total_timesteps      | 1574912   |
| train/                  |           |
|    approx_kl            | 2.1771655 |
|    clip_fraction        | 0.863     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.27      |
|    explained_variance   | 0.973     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0131    |
|    n_updates            | 7680      |
|    policy_gradient_loss | 0.0282    |
|    std                  | 0.0306    |
|    value_loss           | 0.00352   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.31e+03  |
|    ep_rew_mean          | -47       |
| time/                   |           |
|    fps                  | 1731      |
|    iterations           | 779       |
|    time_elapsed         | 921       |
|    total_timesteps      | 1595392   |
| train/                  |           |
|    approx_kl            | 24.169104 |
|    clip_fraction        | 0.742     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.3       |
|    explained_variance   | 0.965     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0297    |
|    n_updates            | 7780      |
|    policy_gradient_loss | 0.439     |
|    std                  | 0.0304    |
|    value_loss           | 0.00534   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.32e+03  |
|    ep_rew_mean          | -46.5     |
| time/                   |           |
|    fps                  | 1731      |
|    iterations           | 789       |
|    time_elapsed         | 933       |
|    total_timesteps      | 1615872   |
| train/                  |           |
|    approx_kl            | 11.610739 |
|    clip_fraction        | 0.73      |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.3       |
|    explained_variance   | 0.965     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0519   |
|    n_updates            | 7880      |
|    policy_gradient_loss | 0.0226    |
|    std                  | 0.0304    |
|    value_loss           | 0.0165    |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.32e+03 |
|    ep_rew_mean          | -47.7    |
| time/                   |          |
|    fps                  | 1732     |
|    iterations           | 799      |
|    time_elapsed         | 944      |
|    total_timesteps      | 1636352  |
| train/                  |          |
|    approx_kl            | 1.51978  |
|    clip_fraction        | 0.788    |
|    clip_range           | 0.2      |
|    entropy_loss         | 8.37     |
|    explained_variance   | 0.959    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0238  |
|    n_updates            | 7980     |
|    policy_gradient_loss | 0.042    |
|    std                  | 0.0297   |
|    value_loss           | 0.00173  |
--------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.32e+03 |
|    ep_rew_mean          | -48.9    |
| time/                   |          |
|    fps                  | 1733     |
|    iterations           | 809      |
|    time_elapsed         | 956      |
|    total_timesteps      | 1656832  |
| train/                  |          |
|    approx_kl            | 4.456909 |
|    clip_fraction        | 0.809    |
|    clip_range           | 0.2      |
|    entropy_loss         | 8.54     |
|    explained_variance   | 0.981    |
|    learning_rate        | 0.0003   |
|    loss                 | 0.0121   |
|    n_updates            | 8080     |
|    policy_gradient_loss | 0.0185   |
|    std                  | 0.0286   |
|    value_loss           | 0.00232  |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.28e+03  |
|    ep_rew_mean          | -50.1     |
| time/                   |           |
|    fps                  | 1733      |
|    iterations           | 819       |
|    time_elapsed         | 967       |
|    total_timesteps      | 1677312   |
| train/                  |           |
|    approx_kl            | 0.8045436 |
|    clip_fraction        | 0.659     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.65      |
|    explained_variance   | 0.964     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0231   |
|    n_updates            | 8180      |
|    policy_gradient_loss | -0.00409  |
|    std                  | 0.0276    |
|    value_loss           | 0.0184    |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.27e+03  |
|    ep_rew_mean          | -49.9     |
| time/                   |           |
|    fps                  | 1734      |
|    iterations           | 829       |
|    time_elapsed         | 978       |
|    total_timesteps      | 1697792   |
| train/                  |           |
|    approx_kl            | 0.9097963 |
|    clip_fraction        | 0.769     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.77      |
|    explained_variance   | 0.94      |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0142    |
|    n_updates            | 8280      |
|    policy_gradient_loss | 0.0469    |
|    std                  | 0.027     |
|    value_loss           | 0.0116    |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.23e+03 |
|    ep_rew_mean          | -52.4    |
| time/                   |          |
|    fps                  | 1735     |
|    iterations           | 839      |
|    time_elapsed         | 989      |
|    total_timesteps      | 1718272  |
| train/                  |          |
|    approx_kl            | 8.679375 |
|    clip_fraction        | 0.67     |
|    clip_range           | 0.2      |
|    entropy_loss         | 8.99     |
|    explained_variance   | 0.994    |
|    learning_rate        | 0.0003   |
|    loss                 | 0.0719   |
|    n_updates            | 8380     |
|    policy_gradient_loss | 0.00369  |
|    std                  | 0.0255   |
|    value_loss           | 0.000288 |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.15e+03  |
|    ep_rew_mean          | -56.7     |
| time/                   |           |
|    fps                  | 1736      |
|    iterations           | 849       |
|    time_elapsed         | 1001      |
|    total_timesteps      | 1738752   |
| train/                  |           |
|    approx_kl            | 5.7827053 |
|    clip_fraction        | 0.709     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.98      |
|    explained_variance   | 0.936     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0728   |
|    n_updates            | 8480      |
|    policy_gradient_loss | 0.0134    |
|    std                  | 0.0256    |
|    value_loss           | 0.00338   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.18e+03   |
|    ep_rew_mean          | -55.1      |
| time/                   |            |
|    fps                  | 1737       |
|    iterations           | 859        |
|    time_elapsed         | 1012       |
|    total_timesteps      | 1759232    |
| train/                  |            |
|    approx_kl            | 14.7685795 |
|    clip_fraction        | 0.926      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.07       |
|    explained_variance   | 0.926      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0223     |
|    n_updates            | 8580       |
|    policy_gradient_loss | 0.0286     |
|    std                  | 0.0251     |
|    value_loss           | 0.0034     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.15e+03   |
|    ep_rew_mean          | -56        |
| time/                   |            |
|    fps                  | 1740       |
|    iterations           | 869        |
|    time_elapsed         | 1022       |
|    total_timesteps      | 1779712    |
| train/                  |            |
|    approx_kl            | 0.30591577 |
|    clip_fraction        | 0.688      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.16       |
|    explained_variance   | 0.973      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0333     |
|    n_updates            | 8680       |
|    policy_gradient_loss | 0.0683     |
|    std                  | 0.0246     |
|    value_loss           | 0.000686   |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.14e+03  |
|    ep_rew_mean          | -57.1     |
| time/                   |           |
|    fps                  | 1743      |
|    iterations           | 879       |
|    time_elapsed         | 1032      |
|    total_timesteps      | 1800192   |
| train/                  |           |
|    approx_kl            | 0.9202546 |
|    clip_fraction        | 0.731     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.13      |
|    explained_variance   | 0.986     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0365   |
|    n_updates            | 8780      |
|    policy_gradient_loss | 0.0218    |
|    std                  | 0.0246    |
|    value_loss           | 0.00475   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.14e+03  |
|    ep_rew_mean          | -58.1     |
| time/                   |           |
|    fps                  | 1747      |
|    iterations           | 889       |
|    time_elapsed         | 1042      |
|    total_timesteps      | 1820672   |
| train/                  |           |
|    approx_kl            | 5.2225056 |
|    clip_fraction        | 0.706     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.36      |
|    explained_variance   | 0.911     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0372   |
|    n_updates            | 8880      |
|    policy_gradient_loss | -0.025    |
|    std                  | 0.0232    |
|    value_loss           | 0.00247   |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.18e+03 |
|    ep_rew_mean          | -57.4    |
| time/                   |          |
|    fps                  | 1750     |
|    iterations           | 899      |
|    time_elapsed         | 1051     |
|    total_timesteps      | 1841152  |
| train/                  |          |
|    approx_kl            | 31.12532 |
|    clip_fraction        | 0.728    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.34     |
|    explained_variance   | 0.9      |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0512  |
|    n_updates            | 8980     |
|    policy_gradient_loss | 0.0844   |
|    std                  | 0.0235   |
|    value_loss           | 0.000763 |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.27e+03  |
|    ep_rew_mean          | -54.6     |
| time/                   |           |
|    fps                  | 1754      |
|    iterations           | 909       |
|    time_elapsed         | 1060      |
|    total_timesteps      | 1861632   |
| train/                  |           |
|    approx_kl            | 20.263397 |
|    clip_fraction        | 0.92      |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.29      |
|    explained_variance   | 0.776     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0103    |
|    n_updates            | 9080      |
|    policy_gradient_loss | 0.038     |
|    std                  | 0.0238    |
|    value_loss           | 0.0101    |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.31e+03  |
|    ep_rew_mean          | -55.4     |
| time/                   |           |
|    fps                  | 1759      |
|    iterations           | 919       |
|    time_elapsed         | 1069      |
|    total_timesteps      | 1882112   |
| train/                  |           |
|    approx_kl            | 10.038881 |
|    clip_fraction        | 0.625     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.24      |
|    explained_variance   | 0.174     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0407   |
|    n_updates            | 9180      |
|    policy_gradient_loss | -0.039    |
|    std                  | 0.024     |
|    value_loss           | 0.00137   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.31e+03  |
|    ep_rew_mean          | -57.3     |
| time/                   |           |
|    fps                  | 1763      |
|    iterations           | 929       |
|    time_elapsed         | 1078      |
|    total_timesteps      | 1902592   |
| train/                  |           |
|    approx_kl            | 12.926966 |
|    clip_fraction        | 0.927     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.25      |
|    explained_variance   | 0.882     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.122     |
|    n_updates            | 9280      |
|    policy_gradient_loss | 0.0585    |
|    std                  | 0.024     |
|    value_loss           | 0.00311   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.34e+03  |
|    ep_rew_mean          | -58       |
| time/                   |           |
|    fps                  | 1767      |
|    iterations           | 939       |
|    time_elapsed         | 1088      |
|    total_timesteps      | 1923072   |
| train/                  |           |
|    approx_kl            | 0.5939136 |
|    clip_fraction        | 0.625     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.21      |
|    explained_variance   | 0.968     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0321   |
|    n_updates            | 9380      |
|    policy_gradient_loss | 0.0147    |
|    std                  | 0.0241    |
|    value_loss           | 0.000999  |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.32e+03 |
|    ep_rew_mean          | -58.5    |
| time/                   |          |
|    fps                  | 1770     |
|    iterations           | 949      |
|    time_elapsed         | 1097     |
|    total_timesteps      | 1943552  |
| train/                  |          |
|    approx_kl            | 1.003649 |
|    clip_fraction        | 0.747    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.14     |
|    explained_variance   | 0.98     |
|    learning_rate        | 0.0003   |
|    loss                 | 0.0499   |
|    n_updates            | 9480     |
|    policy_gradient_loss | 0.0477   |
|    std                  | 0.0248   |
|    value_loss           | 0.00608  |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.34e+03  |
|    ep_rew_mean          | -58.6     |
| time/                   |           |
|    fps                  | 1775      |
|    iterations           | 959       |
|    time_elapsed         | 1106      |
|    total_timesteps      | 1964032   |
| train/                  |           |
|    approx_kl            | 18.298267 |
|    clip_fraction        | 0.91      |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.13      |
|    explained_variance   | 0.951     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.058    |
|    n_updates            | 9580      |
|    policy_gradient_loss | 0.18      |
|    std                  | 0.0247    |
|    value_loss           | 0.00302   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.35e+03   |
|    ep_rew_mean          | -57        |
| time/                   |            |
|    fps                  | 1778       |
|    iterations           | 969        |
|    time_elapsed         | 1115       |
|    total_timesteps      | 1984512    |
| train/                  |            |
|    approx_kl            | 13.8591385 |
|    clip_fraction        | 0.639      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.46       |
|    explained_variance   | 0.933      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0615    |
|    n_updates            | 9680       |
|    policy_gradient_loss | -0.0404    |
|    std                  | 0.0226     |
|    value_loss           | 0.000101   |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.32e+03  |
|    ep_rew_mean          | -57.2     |
| time/                   |           |
|    fps                  | 1782      |
|    iterations           | 979       |
|    time_elapsed         | 1125      |
|    total_timesteps      | 2004992   |
| train/                  |           |
|    approx_kl            | 11.159399 |
|    clip_fraction        | 0.935     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.45      |
|    explained_variance   | 0.851     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.124     |
|    n_updates            | 9780      |
|    policy_gradient_loss | 0.0648    |
|    std                  | 0.023     |
|    value_loss           | 0.00622   |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.32e+03 |
|    ep_rew_mean          | -55.3    |
| time/                   |          |
|    fps                  | 1785     |
|    iterations           | 989      |
|    time_elapsed         | 1134     |
|    total_timesteps      | 2025472  |
| train/                  |          |
|    approx_kl            | 5.131144 |
|    clip_fraction        | 0.639    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.41     |
|    explained_variance   | 0.759    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0746  |
|    n_updates            | 9880     |
|    policy_gradient_loss | -0.0251  |
|    std                  | 0.023    |
|    value_loss           | 0.000328 |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.36e+03  |
|    ep_rew_mean          | -52.1     |
| time/                   |           |
|    fps                  | 1788      |
|    iterations           | 999       |
|    time_elapsed         | 1143      |
|    total_timesteps      | 2045952   |
| train/                  |           |
|    approx_kl            | 19.234905 |
|    clip_fraction        | 0.758     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.25      |
|    explained_variance   | 0.965     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.12      |
|    n_updates            | 9980      |
|    policy_gradient_loss | 0.0634    |
|    std                  | 0.0241    |
|    value_loss           | 0.00177   |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.39e+03 |
|    ep_rew_mean          | -51.5    |
| time/                   |          |
|    fps                  | 1792     |
|    iterations           | 1009     |
|    time_elapsed         | 1153     |
|    total_timesteps      | 2066432  |
| train/                  |          |
|    approx_kl            | 6.948022 |
|    clip_fraction        | 0.746    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.26     |
|    explained_variance   | 0.96     |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0352  |
|    n_updates            | 10080    |
|    policy_gradient_loss | 0.0118   |
|    std                  | 0.0239   |
|    value_loss           | 0.000811 |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.43e+03  |
|    ep_rew_mean          | -51.1     |
| time/                   |           |
|    fps                  | 1795      |
|    iterations           | 1019      |
|    time_elapsed         | 1162      |
|    total_timesteps      | 2086912   |
| train/                  |           |
|    approx_kl            | 2.9688036 |
|    clip_fraction        | 0.732     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.28      |
|    explained_variance   | 0.936     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.02     |
|    n_updates            | 10180     |
|    policy_gradient_loss | 0.0116    |
|    std                  | 0.0238    |
|    value_loss           | 0.000349  |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.43e+03 |
|    ep_rew_mean          | -49.9    |
| time/                   |          |
|    fps                  | 1797     |
|    iterations           | 1029     |
|    time_elapsed         | 1172     |
|    total_timesteps      | 2107392  |
| train/                  |          |
|    approx_kl            | 2.729403 |
|    clip_fraction        | 0.795    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.24     |
|    explained_variance   | 0.944    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0227  |
|    n_updates            | 10280    |
|    policy_gradient_loss | 0.0766   |
|    std                  | 0.024    |
|    value_loss           | 0.00329  |
--------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.41e+03   |
|    ep_rew_mean          | -51.5      |
| time/                   |            |
|    fps                  | 1800       |
|    iterations           | 1039       |
|    time_elapsed         | 1181       |
|    total_timesteps      | 2127872    |
| train/                  |            |
|    approx_kl            | 0.22060362 |
|    clip_fraction        | 0.635      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.22       |
|    explained_variance   | 0.972      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.042     |
|    n_updates            | 10380      |
|    policy_gradient_loss | 0.0461     |
|    std                  | 0.0242     |
|    value_loss           | 0.00107    |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.43e+03  |
|    ep_rew_mean          | -52.5     |
| time/                   |           |
|    fps                  | 1803      |
|    iterations           | 1049      |
|    time_elapsed         | 1191      |
|    total_timesteps      | 2148352   |
| train/                  |           |
|    approx_kl            | 24.212883 |
|    clip_fraction        | 0.818     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.24      |
|    explained_variance   | 0.798     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0892   |
|    n_updates            | 10480     |
|    policy_gradient_loss | -0.0305   |
|    std                  | 0.0239    |
|    value_loss           | 0.00381   |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.42e+03 |
|    ep_rew_mean          | -53.4    |
| time/                   |          |
|    fps                  | 1805     |
|    iterations           | 1059     |
|    time_elapsed         | 1201     |
|    total_timesteps      | 2168832  |
| train/                  |          |
|    approx_kl            | 7.830497 |
|    clip_fraction        | 0.833    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.27     |
|    explained_variance   | 0.957    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0278  |
|    n_updates            | 10580    |
|    policy_gradient_loss | 0.0352   |
|    std                  | 0.0238   |
|    value_loss           | 0.0202   |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.38e+03  |
|    ep_rew_mean          | -55.4     |
| time/                   |           |
|    fps                  | 1808      |
|    iterations           | 1069      |
|    time_elapsed         | 1210      |
|    total_timesteps      | 2189312   |
| train/                  |           |
|    approx_kl            | 11.792691 |
|    clip_fraction        | 0.926     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.28      |
|    explained_variance   | 0.977     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0655    |
|    n_updates            | 10680     |
|    policy_gradient_loss | 0.036     |
|    std                  | 0.0237    |
|    value_loss           | 0.0108    |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.35e+03  |
|    ep_rew_mean          | -58.1     |
| time/                   |           |
|    fps                  | 1811      |
|    iterations           | 1079      |
|    time_elapsed         | 1220      |
|    total_timesteps      | 2209792   |
| train/                  |           |
|    approx_kl            | 7.7408323 |
|    clip_fraction        | 0.778     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.42      |
|    explained_variance   | 0.974     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0246   |
|    n_updates            | 10780     |
|    policy_gradient_loss | 0.031     |
|    std                  | 0.0228    |
|    value_loss           | 0.00105   |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.33e+03 |
|    ep_rew_mean          | -57.5    |
| time/                   |          |
|    fps                  | 1813     |
|    iterations           | 1089     |
|    time_elapsed         | 1229     |
|    total_timesteps      | 2230272  |
| train/                  |          |
|    approx_kl            | 0.774803 |
|    clip_fraction        | 0.704    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.22     |
|    explained_variance   | 0.975    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.00419 |
|    n_updates            | 10880    |
|    policy_gradient_loss | 0.0225   |
|    std                  | 0.024    |
|    value_loss           | 0.00796  |
--------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.3e+03  |
|    ep_rew_mean          | -57.8    |
| time/                   |          |
|    fps                  | 1815     |
|    iterations           | 1099     |
|    time_elapsed         | 1239     |
|    total_timesteps      | 2250752  |
| train/                  |          |
|    approx_kl            | 10.09962 |
|    clip_fraction        | 0.831    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.23     |
|    explained_variance   | 0.95     |
|    learning_rate        | 0.0003   |
|    loss                 | -0.00995 |
|    n_updates            | 10980    |
|    policy_gradient_loss | 0.0062   |
|    std                  | 0.0241   |
|    value_loss           | 0.0278   |
--------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.23e+03 |
|    ep_rew_mean          | -60.1    |
| time/                   |          |
|    fps                  | 1818     |
|    iterations           | 1109     |
|    time_elapsed         | 1249     |
|    total_timesteps      | 2271232  |
| train/                  |          |
|    approx_kl            | 8.861655 |
|    clip_fraction        | 0.75     |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.25     |
|    explained_variance   | 0.976    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.00972 |
|    n_updates            | 11080    |
|    policy_gradient_loss | -0.00504 |
|    std                  | 0.0238   |
|    value_loss           | 0.000915 |
--------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.19e+03 |
|    ep_rew_mean          | -60.4    |
| time/                   |          |
|    fps                  | 1820     |
|    iterations           | 1119     |
|    time_elapsed         | 1258     |
|    total_timesteps      | 2291712  |
| train/                  |          |
|    approx_kl            | 31.31847 |
|    clip_fraction        | 0.914    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.29     |
|    explained_variance   | 0.883    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0615  |
|    n_updates            | 11180    |
|    policy_gradient_loss | -0.0074  |
|    std                  | 0.0236   |
|    value_loss           | 0.00366  |
--------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.2e+03  |
|    ep_rew_mean          | -57      |
| time/                   |          |
|    fps                  | 1822     |
|    iterations           | 1129     |
|    time_elapsed         | 1268     |
|    total_timesteps      | 2312192  |
| train/                  |          |
|    approx_kl            | 4.825657 |
|    clip_fraction        | 0.798    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.44     |
|    explained_variance   | 0.957    |
|    learning_rate        | 0.0003   |
|    loss                 | 0.00982  |
|    n_updates            | 11280    |
|    policy_gradient_loss | 0.00785  |
|    std                  | 0.0228   |
|    value_loss           | 0.0174   |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.23e+03  |
|    ep_rew_mean          | -54.3     |
| time/                   |           |
|    fps                  | 1824      |
|    iterations           | 1139      |
|    time_elapsed         | 1278      |
|    total_timesteps      | 2332672   |
| train/                  |           |
|    approx_kl            | 4.0912514 |
|    clip_fraction        | 0.691     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.47      |
|    explained_variance   | 0.946     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0379   |
|    n_updates            | 11380     |
|    policy_gradient_loss | -0.00951  |
|    std                  | 0.0225    |
|    value_loss           | 0.00285   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.17e+03  |
|    ep_rew_mean          | -55.2     |
| time/                   |           |
|    fps                  | 1826      |
|    iterations           | 1149      |
|    time_elapsed         | 1288      |
|    total_timesteps      | 2353152   |
| train/                  |           |
|    approx_kl            | 14.998308 |
|    clip_fraction        | 0.924     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.43      |
|    explained_variance   | 0.983     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0112   |
|    n_updates            | 11480     |
|    policy_gradient_loss | 0.0551    |
|    std                  | 0.0228    |
|    value_loss           | 0.0122    |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.2e+03   |
|    ep_rew_mean          | -53.1     |
| time/                   |           |
|    fps                  | 1828      |
|    iterations           | 1159      |
|    time_elapsed         | 1298      |
|    total_timesteps      | 2373632   |
| train/                  |           |
|    approx_kl            | 4.2843037 |
|    clip_fraction        | 0.737     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.48      |
|    explained_variance   | 0.981     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0123   |
|    n_updates            | 11580     |
|    policy_gradient_loss | 0.0243    |
|    std                  | 0.0225    |
|    value_loss           | 0.00169   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.2e+03   |
|    ep_rew_mean          | -49.6     |
| time/                   |           |
|    fps                  | 1830      |
|    iterations           | 1169      |
|    time_elapsed         | 1307      |
|    total_timesteps      | 2394112   |
| train/                  |           |
|    approx_kl            | 13.129529 |
|    clip_fraction        | 0.928     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.53      |
|    explained_variance   | 0.98      |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0447   |
|    n_updates            | 11680     |
|    policy_gradient_loss | 0.0047    |
|    std                  | 0.0222    |
|    value_loss           | 0.0111    |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.25e+03  |
|    ep_rew_mean          | -47.5     |
| time/                   |           |
|    fps                  | 1832      |
|    iterations           | 1179      |
|    time_elapsed         | 1317      |
|    total_timesteps      | 2414592   |
| train/                  |           |
|    approx_kl            | 14.202967 |
|    clip_fraction        | 0.925     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.41      |
|    explained_variance   | 0.948     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0754    |
|    n_updates            | 11780     |
|    policy_gradient_loss | 0.0493    |
|    std                  | 0.023     |
|    value_loss           | 0.0094    |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.18e+03  |
|    ep_rew_mean          | -51.2     |
| time/                   |           |
|    fps                  | 1834      |
|    iterations           | 1189      |
|    time_elapsed         | 1327      |
|    total_timesteps      | 2435072   |
| train/                  |           |
|    approx_kl            | 1.2190701 |
|    clip_fraction        | 0.776     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.41      |
|    explained_variance   | 0.978     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0703    |
|    n_updates            | 11880     |
|    policy_gradient_loss | 0.0308    |
|    std                  | 0.023     |
|    value_loss           | 0.0045    |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.15e+03  |
|    ep_rew_mean          | -51.2     |
| time/                   |           |
|    fps                  | 1836      |
|    iterations           | 1199      |
|    time_elapsed         | 1336      |
|    total_timesteps      | 2455552   |
| train/                  |           |
|    approx_kl            | 2.1722622 |
|    clip_fraction        | 0.688     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.29      |
|    explained_variance   | 0.892     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0148    |
|    n_updates            | 11980     |
|    policy_gradient_loss | 0.0135    |
|    std                  | 0.0238    |
|    value_loss           | 0.00934   |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.21e+03  |
|    ep_rew_mean          | -48.1     |
| time/                   |           |
|    fps                  | 1838      |
|    iterations           | 1209      |
|    time_elapsed         | 1346      |
|    total_timesteps      | 2476032   |
| train/                  |           |
|    approx_kl            | 0.9333072 |
|    clip_fraction        | 0.688     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.31      |
|    explained_variance   | 0.967     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0133   |
|    n_updates            | 12080     |
|    policy_gradient_loss | 0.0415    |
|    std                  | 0.0237    |
|    value_loss           | 0.000959  |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.24e+03   |
|    ep_rew_mean          | -45.3      |
| time/                   |            |
|    fps                  | 1840       |
|    iterations           | 1219       |
|    time_elapsed         | 1356       |
|    total_timesteps      | 2496512    |
| train/                  |            |
|    approx_kl            | 0.32011062 |
|    clip_fraction        | 0.608      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.19       |
|    explained_variance   | 0.934      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0408     |
|    n_updates            | 12180      |
|    policy_gradient_loss | 0.0264     |
|    std                  | 0.0244     |
|    value_loss           | 0.000745   |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.25e+03  |
|    ep_rew_mean          | -44.5     |
| time/                   |           |
|    fps                  | 1842      |
|    iterations           | 1229      |
|    time_elapsed         | 1366      |
|    total_timesteps      | 2516992   |
| train/                  |           |
|    approx_kl            | 1.4093899 |
|    clip_fraction        | 0.682     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.21      |
|    explained_variance   | 0.909     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0185   |
|    n_updates            | 12280     |
|    policy_gradient_loss | 0.00335   |
|    std                  | 0.0242    |
|    value_loss           | 0.00313   |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.26e+03   |
|    ep_rew_mean          | -42.3      |
| time/                   |            |
|    fps                  | 1844       |
|    iterations           | 1239       |
|    time_elapsed         | 1376       |
|    total_timesteps      | 2537472    |
| train/                  |            |
|    approx_kl            | 0.30959707 |
|    clip_fraction        | 0.685      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.26       |
|    explained_variance   | 0.942      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0246    |
|    n_updates            | 12380      |
|    policy_gradient_loss | 0.0388     |
|    std                  | 0.024      |
|    value_loss           | 0.00126    |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.26e+03  |
|    ep_rew_mean          | -40.4     |
| time/                   |           |
|    fps                  | 1845      |
|    iterations           | 1249      |
|    time_elapsed         | 1385      |
|    total_timesteps      | 2557952   |
| train/                  |           |
|    approx_kl            | 1.0035017 |
|    clip_fraction        | 0.721     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.23      |
|    explained_variance   | 0.977     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0402   |
|    n_updates            | 12480     |
|    policy_gradient_loss | 0.0186    |
|    std                  | 0.024     |
|    value_loss           | 0.0133    |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.24e+03   |
|    ep_rew_mean          | -40.5      |
| time/                   |            |
|    fps                  | 1847       |
|    iterations           | 1259       |
|    time_elapsed         | 1395       |
|    total_timesteps      | 2578432    |
| train/                  |            |
|    approx_kl            | 0.48124102 |
|    clip_fraction        | 0.72       |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.33       |
|    explained_variance   | 0.981      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0348    |
|    n_updates            | 12580      |
|    policy_gradient_loss | 0.00909    |
|    std                  | 0.0235     |
|    value_loss           | 0.00527    |
----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.24e+03   |
|    ep_rew_mean          | -39.8      |
| time/                   |            |
|    fps                  | 1848       |
|    iterations           | 1269       |
|    time_elapsed         | 1405       |
|    total_timesteps      | 2598912    |
| train/                  |            |
|    approx_kl            | 0.77659535 |
|    clip_fraction        | 0.745      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.41       |
|    explained_variance   | 0.981      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0165    |
|    n_updates            | 12680      |
|    policy_gradient_loss | 0.0159     |
|    std                  | 0.023      |
|    value_loss           | 0.00418    |
----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.12e+03   |
|    ep_rew_mean          | -46.1      |
| time/                   |            |
|    fps                  | 1850       |
|    iterations           | 1279       |
|    time_elapsed         | 1415       |
|    total_timesteps      | 2619392    |
| train/                  |            |
|    approx_kl            | 0.89864534 |
|    clip_fraction        | 0.772      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.27       |
|    explained_variance   | 0.977      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0196    |
|    n_updates            | 12780      |
|    policy_gradient_loss | 0.0299     |
|    std                  | 0.0239     |
|    value_loss           | 0.00773    |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.12e+03  |
|    ep_rew_mean          | -47.7     |
| time/                   |           |
|    fps                  | 1851      |
|    iterations           | 1289      |
|    time_elapsed         | 1425      |
|    total_timesteps      | 2639872   |
| train/                  |           |
|    approx_kl            | 1.6592734 |
|    clip_fraction        | 0.75      |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.3       |
|    explained_variance   | 0.908     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.026    |
|    n_updates            | 12880     |
|    policy_gradient_loss | 0.00616   |
|    std                  | 0.0235    |
|    value_loss           | 0.0123    |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.17e+03   |
|    ep_rew_mean          | -46.3      |
| time/                   |            |
|    fps                  | 1853       |
|    iterations           | 1299       |
|    time_elapsed         | 1435       |
|    total_timesteps      | 2660352    |
| train/                  |            |
|    approx_kl            | 0.12849548 |
|    clip_fraction        | 0.575      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.11       |
|    explained_variance   | 0.978      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0108     |
|    n_updates            | 12980      |
|    policy_gradient_loss | 0.0536     |
|    std                  | 0.0247     |
|    value_loss           | 0.000645   |
----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 999        |
|    ep_rew_mean          | -54.5      |
| time/                   |            |
|    fps                  | 1855       |
|    iterations           | 1309       |
|    time_elapsed         | 1445       |
|    total_timesteps      | 2680832    |
| train/                  |            |
|    approx_kl            | 0.43025583 |
|    clip_fraction        | 0.73       |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.94       |
|    explained_variance   | 0.86       |
|    learning_rate        | 0.0003     |
|    loss                 | 0.037      |
|    n_updates            | 13080      |
|    policy_gradient_loss | 0.0342     |
|    std                  | 0.026      |
|    value_loss           | 0.00729    |
----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.03e+03   |
|    ep_rew_mean          | -53        |
| time/                   |            |
|    fps                  | 1856       |
|    iterations           | 1319       |
|    time_elapsed         | 1455       |
|    total_timesteps      | 2701312    |
| train/                  |            |
|    approx_kl            | 0.22351211 |
|    clip_fraction        | 0.723      |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.93       |
|    explained_variance   | 0.949      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0632    |
|    n_updates            | 13180      |
|    policy_gradient_loss | 0.0356     |
|    std                  | 0.0261     |
|    value_loss           | 0.00178    |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.12e+03  |
|    ep_rew_mean          | -48.5     |
| time/                   |           |
|    fps                  | 1857      |
|    iterations           | 1329      |
|    time_elapsed         | 1465      |
|    total_timesteps      | 2721792   |
| train/                  |           |
|    approx_kl            | 18.286499 |
|    clip_fraction        | 0.669     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.07      |
|    explained_variance   | 0.91      |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0344   |
|    n_updates            | 13280     |
|    policy_gradient_loss | -0.0307   |
|    std                  | 0.0249    |
|    value_loss           | 0.0016    |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.16e+03   |
|    ep_rew_mean          | -47        |
| time/                   |            |
|    fps                  | 1858       |
|    iterations           | 1339       |
|    time_elapsed         | 1475       |
|    total_timesteps      | 2742272    |
| train/                  |            |
|    approx_kl            | 0.61368847 |
|    clip_fraction        | 0.702      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.18       |
|    explained_variance   | 0.958      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0428     |
|    n_updates            | 13380      |
|    policy_gradient_loss | 0.0397     |
|    std                  | 0.0245     |
|    value_loss           | 0.00129    |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.15e+03  |
|    ep_rew_mean          | -45.4     |
| time/                   |           |
|    fps                  | 1860      |
|    iterations           | 1349      |
|    time_elapsed         | 1485      |
|    total_timesteps      | 2762752   |
| train/                  |           |
|    approx_kl            | 0.7195778 |
|    clip_fraction        | 0.702     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.07      |
|    explained_variance   | 0.957     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0461    |
|    n_updates            | 13480     |
|    policy_gradient_loss | 0.0374    |
|    std                  | 0.0251    |
|    value_loss           | 0.00203   |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.21e+03  |
|    ep_rew_mean          | -42.1     |
| time/                   |           |
|    fps                  | 1861      |
|    iterations           | 1359      |
|    time_elapsed         | 1494      |
|    total_timesteps      | 2783232   |
| train/                  |           |
|    approx_kl            | 0.4148562 |
|    clip_fraction        | 0.605     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.1       |
|    explained_variance   | 0.95      |
|    learning_rate        | 0.0003    |
|    loss                 | 0.167     |
|    n_updates            | 13580     |
|    policy_gradient_loss | 0.0514    |
|    std                  | 0.025     |
|    value_loss           | 0.000446  |
---------------------------------------
--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.35e+03 |
|    ep_rew_mean          | -36.6    |
| time/                   |          |
|    fps                  | 1863     |
|    iterations           | 1369     |
|    time_elapsed         | 1504     |
|    total_timesteps      | 2803712  |
| train/                  |          |
|    approx_kl            | 9.496169 |
|    clip_fraction        | 0.701    |
|    clip_range           | 0.2      |
|    entropy_loss         | 8.99     |
|    explained_variance   | 0.863    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0516  |
|    n_updates            | 13680    |
|    policy_gradient_loss | -0.0141  |
|    std                  | 0.0254   |
|    value_loss           | 0.000483 |
--------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.42e+03  |
|    ep_rew_mean          | -35.9     |
| time/                   |           |
|    fps                  | 1864      |
|    iterations           | 1379      |
|    time_elapsed         | 1514      |
|    total_timesteps      | 2824192   |
| train/                  |           |
|    approx_kl            | 0.5701853 |
|    clip_fraction        | 0.606     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.05      |
|    explained_variance   | 0.977     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0302   |
|    n_updates            | 13780     |
|    policy_gradient_loss | -0.00369  |
|    std                  | 0.0251    |
|    value_loss           | 0.000461  |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.41e+03  |
|    ep_rew_mean          | -37.6     |
| time/                   |           |
|    fps                  | 1865      |
|    iterations           | 1389      |
|    time_elapsed         | 1524      |
|    total_timesteps      | 2844672   |
| train/                  |           |
|    approx_kl            | 0.7564815 |
|    clip_fraction        | 0.646     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.96      |
|    explained_variance   | 0.98      |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0564    |
|    n_updates            | 13880     |
|    policy_gradient_loss | 0.0458    |
|    std                  | 0.0258    |
|    value_loss           | 0.00623   |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.44e+03   |
|    ep_rew_mean          | -38.2      |
| time/                   |            |
|    fps                  | 1866       |
|    iterations           | 1399       |
|    time_elapsed         | 1534       |
|    total_timesteps      | 2865152    |
| train/                  |            |
|    approx_kl            | 0.22575192 |
|    clip_fraction        | 0.574      |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.72       |
|    explained_variance   | 0.941      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0764     |
|    n_updates            | 13980      |
|    policy_gradient_loss | 0.0739     |
|    std                  | 0.0274     |
|    value_loss           | 0.000781   |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.28e+03  |
|    ep_rew_mean          | -47.5     |
| time/                   |           |
|    fps                  | 1868      |
|    iterations           | 1409      |
|    time_elapsed         | 1544      |
|    total_timesteps      | 2885632   |
| train/                  |           |
|    approx_kl            | 3.2538476 |
|    clip_fraction        | 0.872     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.88      |
|    explained_variance   | 0.84      |
|    learning_rate        | 0.0003    |
|    loss                 | 0.013     |
|    n_updates            | 14080     |
|    policy_gradient_loss | 0.134     |
|    std                  | 0.0263    |
|    value_loss           | 0.177     |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.2e+03    |
|    ep_rew_mean          | -52.3      |
| time/                   |            |
|    fps                  | 1869       |
|    iterations           | 1419       |
|    time_elapsed         | 1554       |
|    total_timesteps      | 2906112    |
| train/                  |            |
|    approx_kl            | 0.46829742 |
|    clip_fraction        | 0.698      |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.85       |
|    explained_variance   | 0.943      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0128    |
|    n_updates            | 14180      |
|    policy_gradient_loss | 0.0312     |
|    std                  | 0.0264     |
|    value_loss           | 0.00382    |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 69.4      |
|    ep_rew_mean          | -97.4     |
| time/                   |           |
|    fps                  | 1870      |
|    iterations           | 1429      |
|    time_elapsed         | 1564      |
|    total_timesteps      | 2926592   |
| train/                  |           |
|    approx_kl            | 3.3998308 |
|    clip_fraction        | 0.857     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.81      |
|    explained_variance   | 0.995     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0162    |
|    n_updates            | 14280     |
|    policy_gradient_loss | 0.0678    |
|    std                  | 0.0268    |
|    value_loss           | 0.00561   |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 73.8       |
|    ep_rew_mean          | -96.8      |
| time/                   |            |
|    fps                  | 1871       |
|    iterations           | 1439       |
|    time_elapsed         | 1575       |
|    total_timesteps      | 2947072    |
| train/                  |            |
|    approx_kl            | 0.77585936 |
|    clip_fraction        | 0.772      |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.6        |
|    explained_variance   | 0.977      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0515     |
|    n_updates            | 14380      |
|    policy_gradient_loss | 0.046      |
|    std                  | 0.0282     |
|    value_loss           | 0.0173     |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 157       |
|    ep_rew_mean          | -100      |
| time/                   |           |
|    fps                  | 1872      |
|    iterations           | 1449      |
|    time_elapsed         | 1585      |
|    total_timesteps      | 2967552   |
| train/                  |           |
|    approx_kl            | 1.7132434 |
|    clip_fraction        | 0.836     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.54      |
|    explained_variance   | 0.966     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0568    |
|    n_updates            | 14480     |
|    policy_gradient_loss | 0.0967    |
|    std                  | 0.029     |
|    value_loss           | 0.0366    |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 131       |
|    ep_rew_mean          | -98.7     |
| time/                   |           |
|    fps                  | 1873      |
|    iterations           | 1459      |
|    time_elapsed         | 1595      |
|    total_timesteps      | 2988032   |
| train/                  |           |
|    approx_kl            | 0.8255445 |
|    clip_fraction        | 0.811     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.34      |
|    explained_variance   | 0.991     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0615    |
|    n_updates            | 14580     |
|    policy_gradient_loss | 0.0591    |
|    std                  | 0.0304    |
|    value_loss           | 0.0113    |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 171       |
|    ep_rew_mean          | -103      |
| time/                   |           |
|    fps                  | 1874      |
|    iterations           | 1469      |
|    time_elapsed         | 1604      |
|    total_timesteps      | 3008512   |
| train/                  |           |
|    approx_kl            | 0.5474319 |
|    clip_fraction        | 0.743     |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.87      |
|    explained_variance   | 0.984     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0344    |
|    n_updates            | 14680     |
|    policy_gradient_loss | 0.0441    |
|    std                  | 0.034     |
|    value_loss           | 0.0246    |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 237        |
|    ep_rew_mean          | -111       |
| time/                   |            |
|    fps                  | 1875       |
|    iterations           | 1479       |
|    time_elapsed         | 1614       |
|    total_timesteps      | 3028992    |
| train/                  |            |
|    approx_kl            | 0.56970143 |
|    clip_fraction        | 0.765      |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.87       |
|    explained_variance   | 0.945      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0966     |
|    n_updates            | 14780      |
|    policy_gradient_loss | 0.06       |
|    std                  | 0.0342     |
|    value_loss           | 0.204      |
----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 256        |
|    ep_rew_mean          | -110       |
| time/                   |            |
|    fps                  | 1877       |
|    iterations           | 1489       |
|    time_elapsed         | 1624       |
|    total_timesteps      | 3049472    |
| train/                  |            |
|    approx_kl            | 0.12491115 |
|    clip_fraction        | 0.578      |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.73       |
|    explained_variance   | 0.0604     |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0471     |
|    n_updates            | 14880      |
|    policy_gradient_loss | 0.055      |
|    std                  | 0.0352     |
|    value_loss           | 0.0301     |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 218       |
|    ep_rew_mean          | -105      |
| time/                   |           |
|    fps                  | 1878      |
|    iterations           | 1499      |
|    time_elapsed         | 1634      |
|    total_timesteps      | 3069952   |
| train/                  |           |
|    approx_kl            | 3.8945518 |
|    clip_fraction        | 0.705     |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.44      |
|    explained_variance   | 0.815     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0515   |
|    n_updates            | 14980     |
|    policy_gradient_loss | 0.00917   |
|    std                  | 0.0377    |
|    value_loss           | 0.0471    |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 260        |
|    ep_rew_mean          | -105       |
| time/                   |            |
|    fps                  | 1879       |
|    iterations           | 1509       |
|    time_elapsed         | 1644       |
|    total_timesteps      | 3090432    |
| train/                  |            |
|    approx_kl            | 0.18739356 |
|    clip_fraction        | 0.605      |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.23       |
|    explained_variance   | 0.989      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0168    |
|    n_updates            | 15080      |
|    policy_gradient_loss | 0.0121     |
|    std                  | 0.0399     |
|    value_loss           | 0.0209     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 218        |
|    ep_rew_mean          | -106       |
| time/                   |            |
|    fps                  | 1881       |
|    iterations           | 1519       |
|    time_elapsed         | 1653       |
|    total_timesteps      | 3110912    |
| train/                  |            |
|    approx_kl            | 0.62434196 |
|    clip_fraction        | 0.649      |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.03       |
|    explained_variance   | 0.99       |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0241    |
|    n_updates            | 15180      |
|    policy_gradient_loss | 0.0239     |
|    std                  | 0.042      |
|    value_loss           | 0.0269     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 187        |
|    ep_rew_mean          | -108       |
| time/                   |            |
|    fps                  | 1882       |
|    iterations           | 1529       |
|    time_elapsed         | 1663       |
|    total_timesteps      | 3131392    |
| train/                  |            |
|    approx_kl            | 0.17493266 |
|    clip_fraction        | 0.627      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.8        |
|    explained_variance   | 0.977      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0997     |
|    n_updates            | 15280      |
|    policy_gradient_loss | 0.0229     |
|    std                  | 0.0445     |
|    value_loss           | 0.0622     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 184        |
|    ep_rew_mean          | -105       |
| time/                   |            |
|    fps                  | 1883       |
|    iterations           | 1539       |
|    time_elapsed         | 1673       |
|    total_timesteps      | 3151872    |
| train/                  |            |
|    approx_kl            | 0.16365436 |
|    clip_fraction        | 0.581      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.69       |
|    explained_variance   | 0.982      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.019      |
|    n_updates            | 15380      |
|    policy_gradient_loss | 0.0184     |
|    std                  | 0.0456     |
|    value_loss           | 0.055      |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 188        |
|    ep_rew_mean          | -105       |
| time/                   |            |
|    fps                  | 1884       |
|    iterations           | 1549       |
|    time_elapsed         | 1683       |
|    total_timesteps      | 3172352    |
| train/                  |            |
|    approx_kl            | 0.35434943 |
|    clip_fraction        | 0.492      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.52       |
|    explained_variance   | 0.988      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0294    |
|    n_updates            | 15480      |
|    policy_gradient_loss | 0.00835    |
|    std                  | 0.0474     |
|    value_loss           | 0.0104     |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 169       |
|    ep_rew_mean          | -106      |
| time/                   |           |
|    fps                  | 1885      |
|    iterations           | 1559      |
|    time_elapsed         | 1693      |
|    total_timesteps      | 3192832   |
| train/                  |           |
|    approx_kl            | 0.1579085 |
|    clip_fraction        | 0.595     |
|    clip_range           | 0.2       |
|    entropy_loss         | 6.37      |
|    explained_variance   | 0.976     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0069    |
|    n_updates            | 15580     |
|    policy_gradient_loss | -0.000266 |
|    std                  | 0.0493    |
|    value_loss           | 0.0564    |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 203        |
|    ep_rew_mean          | -105       |
| time/                   |            |
|    fps                  | 1886       |
|    iterations           | 1569       |
|    time_elapsed         | 1702       |
|    total_timesteps      | 3213312    |
| train/                  |            |
|    approx_kl            | 0.13003625 |
|    clip_fraction        | 0.549      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.17       |
|    explained_variance   | 0.984      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.00269   |
|    n_updates            | 15680      |
|    policy_gradient_loss | 0.00155    |
|    std                  | 0.0518     |
|    value_loss           | 0.0288     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 167        |
|    ep_rew_mean          | -105       |
| time/                   |            |
|    fps                  | 1887       |
|    iterations           | 1579       |
|    time_elapsed         | 1712       |
|    total_timesteps      | 3233792    |
| train/                  |            |
|    approx_kl            | 0.12562369 |
|    clip_fraction        | 0.571      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.03       |
|    explained_variance   | 0.981      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0269     |
|    n_updates            | 15780      |
|    policy_gradient_loss | 0.00476    |
|    std                  | 0.0539     |
|    value_loss           | 0.0552     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 162        |
|    ep_rew_mean          | -103       |
| time/                   |            |
|    fps                  | 1888       |
|    iterations           | 1589       |
|    time_elapsed         | 1722       |
|    total_timesteps      | 3254272    |
| train/                  |            |
|    approx_kl            | 0.11464169 |
|    clip_fraction        | 0.543      |
|    clip_range           | 0.2        |
|    entropy_loss         | 5.9        |
|    explained_variance   | 0.982      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0545    |
|    n_updates            | 15880      |
|    policy_gradient_loss | 0.00731    |
|    std                  | 0.0557     |
|    value_loss           | 0.0246     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 158        |
|    ep_rew_mean          | -106       |
| time/                   |            |
|    fps                  | 1889       |
|    iterations           | 1599       |
|    time_elapsed         | 1733       |
|    total_timesteps      | 3274752    |
| train/                  |            |
|    approx_kl            | 0.09328811 |
|    clip_fraction        | 0.506      |
|    clip_range           | 0.2        |
|    entropy_loss         | 5.77       |
|    explained_variance   | 0.983      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.00485    |
|    n_updates            | 15980      |
|    policy_gradient_loss | -0.00735   |
|    std                  | 0.0573     |
|    value_loss           | 0.0448     |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 222       |
|    ep_rew_mean          | -101      |
| time/                   |           |
|    fps                  | 1890      |
|    iterations           | 1609      |
|    time_elapsed         | 1743      |
|    total_timesteps      | 3295232   |
| train/                  |           |
|    approx_kl            | 0.1250302 |
|    clip_fraction        | 0.452     |
|    clip_range           | 0.2       |
|    entropy_loss         | 5.72      |
|    explained_variance   | 0.934     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0602   |
|    n_updates            | 16080     |
|    policy_gradient_loss | -0.0131   |
|    std                  | 0.0582    |
|    value_loss           | 0.0117    |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 346        |
|    ep_rew_mean          | -92.1      |
| time/                   |            |
|    fps                  | 1891       |
|    iterations           | 1619       |
|    time_elapsed         | 1753       |
|    total_timesteps      | 3315712    |
| train/                  |            |
|    approx_kl            | 0.21789446 |
|    clip_fraction        | 0.544      |
|    clip_range           | 0.2        |
|    entropy_loss         | 5.68       |
|    explained_variance   | 0.926      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0276    |
|    n_updates            | 16180      |
|    policy_gradient_loss | -0.00677   |
|    std                  | 0.0586     |
|    value_loss           | 0.033      |
----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 518         |
|    ep_rew_mean          | -83.3       |
| time/                   |             |
|    fps                  | 1892        |
|    iterations           | 1629        |
|    time_elapsed         | 1763        |
|    total_timesteps      | 3336192     |
| train/                  |             |
|    approx_kl            | 0.107807875 |
|    clip_fraction        | 0.512       |
|    clip_range           | 0.2         |
|    entropy_loss         | 5.75        |
|    explained_variance   | 0.214       |
|    learning_rate        | 0.0003      |
|    loss                 | -0.0627     |
|    n_updates            | 16280       |
|    policy_gradient_loss | -0.0208     |
|    std                  | 0.0574      |
|    value_loss           | 0.00309     |
-----------------------------------------

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 674         |
|    ep_rew_mean          | -73         |
| time/                   |             |
|    fps                  | 1892        |
|    iterations           | 1639        |
|    time_elapsed         | 1773        |
|    total_timesteps      | 3356672     |
| train/                  |             |
|    approx_kl            | 0.108608276 |
|    clip_fraction        | 0.518       |
|    clip_range           | 0.2         |
|    entropy_loss         | 5.78        |
|    explained_variance   | -1.25       |
|    learning_rate        | 0.0003      |
|    loss                 | 0.0458      |
|    n_updates            | 16380       |
|    policy_gradient_loss | -0.00749    |
|    std                  | 0.0571      |
|    value_loss           | 0.00175     |
-----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 819        |
|    ep_rew_mean          | -64.3      |
| time/                   |            |
|    fps                  | 1893       |
|    iterations           | 1649       |
|    time_elapsed         | 1783       |
|    total_timesteps      | 3377152    |
| train/                  |            |
|    approx_kl            | 0.19524994 |
|    clip_fraction        | 0.573      |
|    clip_range           | 0.2        |
|    entropy_loss         | 5.81       |
|    explained_variance   | 0.692      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.00101   |
|    n_updates            | 16480      |
|    policy_gradient_loss | -0.00487   |
|    std                  | 0.0568     |
|    value_loss           | 0.00599    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 992        |
|    ep_rew_mean          | -55.4      |
| time/                   |            |
|    fps                  | 1894       |
|    iterations           | 1659       |
|    time_elapsed         | 1793       |
|    total_timesteps      | 3397632    |
| train/                  |            |
|    approx_kl            | 0.08483419 |
|    clip_fraction        | 0.509      |
|    clip_range           | 0.2        |
|    entropy_loss         | 5.7        |
|    explained_variance   | 0.607      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0549    |
|    n_updates            | 16580      |
|    policy_gradient_loss | -0.00931   |
|    std                  | 0.0581     |
|    value_loss           | 0.00117    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.14e+03   |
|    ep_rew_mean          | -47.4      |
| time/                   |            |
|    fps                  | 1895       |
|    iterations           | 1669       |
|    time_elapsed         | 1803       |
|    total_timesteps      | 3418112    |
| train/                  |            |
|    approx_kl            | 0.28091365 |
|    clip_fraction        | 0.548      |
|    clip_range           | 0.2        |
|    entropy_loss         | 5.81       |
|    explained_variance   | 0.912      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0826    |
|    n_updates            | 16680      |
|    policy_gradient_loss | -0.034     |
|    std                  | 0.0564     |
|    value_loss           | 0.0134     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.23e+03   |
|    ep_rew_mean          | -41.5      |
| time/                   |            |
|    fps                  | 1895       |
|    iterations           | 1679       |
|    time_elapsed         | 1813       |
|    total_timesteps      | 3438592    |
| train/                  |            |
|    approx_kl            | 0.20311072 |
|    clip_fraction        | 0.579      |
|    clip_range           | 0.2        |
|    entropy_loss         | 5.95       |
|    explained_variance   | 0.882      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0413    |
|    n_updates            | 16780      |
|    policy_gradient_loss | -0.0272    |
|    std                  | 0.0547     |
|    value_loss           | 0.00149    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.33e+03   |
|    ep_rew_mean          | -34.9      |
| time/                   |            |
|    fps                  | 1896       |
|    iterations           | 1689       |
|    time_elapsed         | 1823       |
|    total_timesteps      | 3459072    |
| train/                  |            |
|    approx_kl            | 0.14206302 |
|    clip_fraction        | 0.539      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.07       |
|    explained_variance   | 0.949      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0552    |
|    n_updates            | 16880      |
|    policy_gradient_loss | -0.00952   |
|    std                  | 0.053      |
|    value_loss           | 0.00308    |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.27e+03  |
|    ep_rew_mean          | -38.3     |
| time/                   |           |
|    fps                  | 1897      |
|    iterations           | 1699      |
|    time_elapsed         | 1833      |
|    total_timesteps      | 3479552   |
| train/                  |           |
|    approx_kl            | 0.1383247 |
|    clip_fraction        | 0.575     |
|    clip_range           | 0.2       |
|    entropy_loss         | 6.17      |
|    explained_variance   | 0.937     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0637   |
|    n_updates            | 16980     |
|    policy_gradient_loss | -0.0125   |
|    std                  | 0.0516    |
|    value_loss           | 0.0177    |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.25e+03   |
|    ep_rew_mean          | -39        |
| time/                   |            |
|    fps                  | 1898       |
|    iterations           | 1709       |
|    time_elapsed         | 1843       |
|    total_timesteps      | 3500032    |
| train/                  |            |
|    approx_kl            | 0.29056838 |
|    clip_fraction        | 0.599      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.27       |
|    explained_variance   | 0.954      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0186    |
|    n_updates            | 17080      |
|    policy_gradient_loss | -0.008     |
|    std                  | 0.0505     |
|    value_loss           | 0.0418     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.24e+03   |
|    ep_rew_mean          | -39.6      |
| time/                   |            |
|    fps                  | 1899       |
|    iterations           | 1719       |
|    time_elapsed         | 1853       |
|    total_timesteps      | 3520512    |
| train/                  |            |
|    approx_kl            | 0.22751772 |
|    clip_fraction        | 0.523      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.25       |
|    explained_variance   | 0.97       |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0369    |
|    n_updates            | 17180      |
|    policy_gradient_loss | -0.00471   |
|    std                  | 0.0507     |
|    value_loss           | 0.00102    |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.31e+03  |
|    ep_rew_mean          | -34.3     |
| time/                   |           |
|    fps                  | 1899      |
|    iterations           | 1729      |
|    time_elapsed         | 1864      |
|    total_timesteps      | 3540992   |
| train/                  |           |
|    approx_kl            | 0.1287805 |
|    clip_fraction        | 0.512     |
|    clip_range           | 0.2       |
|    entropy_loss         | 6.29      |
|    explained_variance   | 0.91      |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0102    |
|    n_updates            | 17280     |
|    policy_gradient_loss | -0.00667  |
|    std                  | 0.0506    |
|    value_loss           | 0.000428  |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.35e+03   |
|    ep_rew_mean          | -31.3      |
| time/                   |            |
|    fps                  | 1900       |
|    iterations           | 1739       |
|    time_elapsed         | 1874       |
|    total_timesteps      | 3561472    |
| train/                  |            |
|    approx_kl            | 0.10169956 |
|    clip_fraction        | 0.524      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.29       |
|    explained_variance   | 0.958      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0452     |
|    n_updates            | 17380      |
|    policy_gradient_loss | 0.00672    |
|    std                  | 0.0505     |
|    value_loss           | 0.00089    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.35e+03   |
|    ep_rew_mean          | -30.9      |
| time/                   |            |
|    fps                  | 1900       |
|    iterations           | 1749       |
|    time_elapsed         | 1884       |
|    total_timesteps      | 3581952    |
| train/                  |            |
|    approx_kl            | 0.12726724 |
|    clip_fraction        | 0.543      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.36       |
|    explained_variance   | 0.912      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0522    |
|    n_updates            | 17480      |
|    policy_gradient_loss | -0.0197    |
|    std                  | 0.0495     |
|    value_loss           | 0.00101    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.34e+03   |
|    ep_rew_mean          | -30.9      |
| time/                   |            |
|    fps                  | 1901       |
|    iterations           | 1759       |
|    time_elapsed         | 1894       |
|    total_timesteps      | 3602432    |
| train/                  |            |
|    approx_kl            | 0.17240897 |
|    clip_fraction        | 0.554      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.48       |
|    explained_variance   | 0.891      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0393    |
|    n_updates            | 17580      |
|    policy_gradient_loss | -0.00951   |
|    std                  | 0.0482     |
|    value_loss           | 0.000908   |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.4e+03   |
|    ep_rew_mean          | -27.3     |
| time/                   |           |
|    fps                  | 1901      |
|    iterations           | 1769      |
|    time_elapsed         | 1905      |
|    total_timesteps      | 3622912   |
| train/                  |           |
|    approx_kl            | 0.4376525 |
|    clip_fraction        | 0.536     |
|    clip_range           | 0.2       |
|    entropy_loss         | 6.59      |
|    explained_variance   | 0.502     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0476   |
|    n_updates            | 17680     |
|    policy_gradient_loss | -0.0161   |
|    std                  | 0.0464    |
|    value_loss           | 0.0126    |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.45e+03   |
|    ep_rew_mean          | -25.3      |
| time/                   |            |
|    fps                  | 1901       |
|    iterations           | 1779       |
|    time_elapsed         | 1915       |
|    total_timesteps      | 3643392    |
| train/                  |            |
|    approx_kl            | 0.14264676 |
|    clip_fraction        | 0.597      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.64       |
|    explained_variance   | 0.939      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0155    |
|    n_updates            | 17780      |
|    policy_gradient_loss | 0.0267     |
|    std                  | 0.046      |
|    value_loss           | 0.00182    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.45e+03   |
|    ep_rew_mean          | -25.5      |
| time/                   |            |
|    fps                  | 1902       |
|    iterations           | 1789       |
|    time_elapsed         | 1925       |
|    total_timesteps      | 3663872    |
| train/                  |            |
|    approx_kl            | 0.30288363 |
|    clip_fraction        | 0.498      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.88       |
|    explained_variance   | 0.777      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0388    |
|    n_updates            | 17880      |
|    policy_gradient_loss | -0.0248    |
|    std                  | 0.043      |
|    value_loss           | 0.0246     |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.32e+03   |
|    ep_rew_mean          | -34.9      |
| time/                   |            |
|    fps                  | 1903       |
|    iterations           | 1799       |
|    time_elapsed         | 1935       |
|    total_timesteps      | 3684352    |
| train/                  |            |
|    approx_kl            | 0.07732274 |
|    clip_fraction        | 0.495      |
|    clip_range           | 0.2        |
|    entropy_loss         | 6.99       |
|    explained_variance   | 0.585      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0339     |
|    n_updates            | 17980      |
|    policy_gradient_loss | 0.00829    |
|    std                  | 0.0424     |
|    value_loss           | 0.00419    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.27e+03   |
|    ep_rew_mean          | -38.4      |
| time/                   |            |
|    fps                  | 1904       |
|    iterations           | 1809       |
|    time_elapsed         | 1945       |
|    total_timesteps      | 3704832    |
| train/                  |            |
|    approx_kl            | 0.08162853 |
|    clip_fraction        | 0.53       |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.12       |
|    explained_variance   | 0.927      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0272     |
|    n_updates            | 18080      |
|    policy_gradient_loss | 0.0526     |
|    std                  | 0.0411     |
|    value_loss           | 0.00116    |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.24e+03  |
|    ep_rew_mean          | -42       |
| time/                   |           |
|    fps                  | 1905      |
|    iterations           | 1819      |
|    time_elapsed         | 1954      |
|    total_timesteps      | 3725312   |
| train/                  |           |
|    approx_kl            | 1.8385155 |
|    clip_fraction        | 0.715     |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.2       |
|    explained_variance   | -1.89     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0579   |
|    n_updates            | 18180     |
|    policy_gradient_loss | 0.0225    |
|    std                  | 0.0403    |
|    value_loss           | 0.000919  |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.18e+03   |
|    ep_rew_mean          | -45.9      |
| time/                   |            |
|    fps                  | 1906       |
|    iterations           | 1829       |
|    time_elapsed         | 1964       |
|    total_timesteps      | 3745792    |
| train/                  |            |
|    approx_kl            | 0.10448574 |
|    clip_fraction        | 0.544      |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.33       |
|    explained_variance   | 0.795      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0161    |
|    n_updates            | 18280      |
|    policy_gradient_loss | 0.017      |
|    std                  | 0.039      |
|    value_loss           | 0.00185    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.17e+03   |
|    ep_rew_mean          | -45.4      |
| time/                   |            |
|    fps                  | 1908       |
|    iterations           | 1839       |
|    time_elapsed         | 1973       |
|    total_timesteps      | 3766272    |
| train/                  |            |
|    approx_kl            | 0.17103305 |
|    clip_fraction        | 0.516      |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.53       |
|    explained_variance   | 0.831      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.113      |
|    n_updates            | 18380      |
|    policy_gradient_loss | 0.029      |
|    std                  | 0.0371     |
|    value_loss           | 0.000869   |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.19e+03   |
|    ep_rew_mean          | -43.7      |
| time/                   |            |
|    fps                  | 1909       |
|    iterations           | 1849       |
|    time_elapsed         | 1982       |
|    total_timesteps      | 3786752    |
| train/                  |            |
|    approx_kl            | 0.35543144 |
|    clip_fraction        | 0.601      |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.41       |
|    explained_variance   | 0.975      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0476    |
|    n_updates            | 18480      |
|    policy_gradient_loss | 0.000695   |
|    std                  | 0.038      |
|    value_loss           | 0.000882   |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.27e+03  |
|    ep_rew_mean          | -38.6     |
| time/                   |           |
|    fps                  | 1910      |
|    iterations           | 1859      |
|    time_elapsed         | 1992      |
|    total_timesteps      | 3807232   |
| train/                  |           |
|    approx_kl            | 0.1113917 |
|    clip_fraction        | 0.57      |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.58      |
|    explained_variance   | 0.911     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.000241 |
|    n_updates            | 18580     |
|    policy_gradient_loss | 0.0153    |
|    std                  | 0.0366    |
|    value_loss           | 0.000983  |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.29e+03  |
|    ep_rew_mean          | -37.4     |
| time/                   |           |
|    fps                  | 1912      |
|    iterations           | 1869      |
|    time_elapsed         | 2001      |
|    total_timesteps      | 3827712   |
| train/                  |           |
|    approx_kl            | 0.4748819 |
|    clip_fraction        | 0.718     |
|    clip_range           | 0.2       |
|    entropy_loss         | 7.67      |
|    explained_variance   | 0.923     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0627   |
|    n_updates            | 18680     |
|    policy_gradient_loss | 0.0152    |
|    std                  | 0.0357    |
|    value_loss           | 0.00209   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.3e+03    |
|    ep_rew_mean          | -36.3      |
| time/                   |            |
|    fps                  | 1913       |
|    iterations           | 1879       |
|    time_elapsed         | 2011       |
|    total_timesteps      | 3848192    |
| train/                  |            |
|    approx_kl            | 0.41890857 |
|    clip_fraction        | 0.68       |
|    clip_range           | 0.2        |
|    entropy_loss         | 7.88       |
|    explained_variance   | 0.851      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0211    |
|    n_updates            | 18780      |
|    policy_gradient_loss | -0.000298  |
|    std                  | 0.0339     |
|    value_loss           | 0.00596    |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.29e+03  |
|    ep_rew_mean          | -37.5     |
| time/                   |           |
|    fps                  | 1914      |
|    iterations           | 1889      |
|    time_elapsed         | 2020      |
|    total_timesteps      | 3868672   |
| train/                  |           |
|    approx_kl            | 14.422428 |
|    clip_fraction        | 0.862     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.02      |
|    explained_variance   | 0.51      |
|    learning_rate        | 0.0003    |
|    loss                 | -0.000888 |
|    n_updates            | 18880     |
|    policy_gradient_loss | -0.00164  |
|    std                  | 0.0325    |
|    value_loss           | 0.00514   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.33e+03   |
|    ep_rew_mean          | -34.7      |
| time/                   |            |
|    fps                  | 1915       |
|    iterations           | 1899       |
|    time_elapsed         | 2030       |
|    total_timesteps      | 3889152    |
| train/                  |            |
|    approx_kl            | 0.24301888 |
|    clip_fraction        | 0.634      |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.05       |
|    explained_variance   | 0.958      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.031      |
|    n_updates            | 18980      |
|    policy_gradient_loss | 0.031      |
|    std                  | 0.0325     |
|    value_loss           | 0.00231    |
----------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.33e+03   |
|    ep_rew_mean          | -34.4      |
| time/                   |            |
|    fps                  | 1916       |
|    iterations           | 1909       |
|    time_elapsed         | 2039       |
|    total_timesteps      | 3909632    |
| train/                  |            |
|    approx_kl            | 0.14700219 |
|    clip_fraction        | 0.616      |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.04       |
|    explained_variance   | 0.974      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0076     |
|    n_updates            | 19080      |
|    policy_gradient_loss | 0.0337     |
|    std                  | 0.0324     |
|    value_loss           | 0.000141   |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.23e+03  |
|    ep_rew_mean          | -43       |
| time/                   |           |
|    fps                  | 1917      |
|    iterations           | 1919      |
|    time_elapsed         | 2049      |
|    total_timesteps      | 3930112   |
| train/                  |           |
|    approx_kl            | 10.099989 |
|    clip_fraction        | 0.909     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.12      |
|    explained_variance   | 0.91      |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0341   |
|    n_updates            | 19180     |
|    policy_gradient_loss | 0.0273    |
|    std                  | 0.0318    |
|    value_loss           | 0.00419   |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.21e+03 |
|    ep_rew_mean          | -43.6    |
| time/                   |          |
|    fps                  | 1918     |
|    iterations           | 1929     |
|    time_elapsed         | 2059     |
|    total_timesteps      | 3950592  |
| train/                  |          |
|    approx_kl            | 2.728717 |
|    clip_fraction        | 0.76     |
|    clip_range           | 0.2      |
|    entropy_loss         | 8.23     |
|    explained_variance   | 0.943    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.029   |
|    n_updates            | 19280    |
|    policy_gradient_loss | 0.0171   |
|    std                  | 0.0308   |
|    value_loss           | 0.00144  |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.18e+03  |
|    ep_rew_mean          | -46.2     |
| time/                   |           |
|    fps                  | 1919      |
|    iterations           | 1939      |
|    time_elapsed         | 2068      |
|    total_timesteps      | 3971072   |
| train/                  |           |
|    approx_kl            | 0.6849563 |
|    clip_fraction        | 0.646     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.32      |
|    explained_variance   | 0.853     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.000747  |
|    n_updates            | 19380     |
|    policy_gradient_loss | -0.0206   |
|    std                  | 0.0301    |
|    value_loss           | 0.0652    |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.02e+03   |
|    ep_rew_mean          | -57.4      |
| time/                   |            |
|    fps                  | 1920       |
|    iterations           | 1949       |
|    time_elapsed         | 2078       |
|    total_timesteps      | 3991552    |
| train/                  |            |
|    approx_kl            | 0.63551116 |
|    clip_fraction        | 0.751      |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.48       |
|    explained_variance   | 0.861      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0165    |
|    n_updates            | 19480      |
|    policy_gradient_loss | 0.00478    |
|    std                  | 0.0291     |
|    value_loss           | 0.0375     |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 955       |
|    ep_rew_mean          | -63.1     |
| time/                   |           |
|    fps                  | 1921      |
|    iterations           | 1959      |
|    time_elapsed         | 2088      |
|    total_timesteps      | 4012032   |
| train/                  |           |
|    approx_kl            | 0.6391956 |
|    clip_fraction        | 0.663     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.58      |
|    explained_variance   | 0.89      |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0355   |
|    n_updates            | 19580     |
|    policy_gradient_loss | 0.00658   |
|    std                  | 0.0282    |
|    value_loss           | 0.0344    |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 986       |
|    ep_rew_mean          | -61.6     |
| time/                   |           |
|    fps                  | 1921      |
|    iterations           | 1969      |
|    time_elapsed         | 2098      |
|    total_timesteps      | 4032512   |
| train/                  |           |
|    approx_kl            | 2.5590081 |
|    clip_fraction        | 0.735     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.66      |
|    explained_variance   | 0.878     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.00113   |
|    n_updates            | 19680     |
|    policy_gradient_loss | 0.0227    |
|    std                  | 0.0277    |
|    value_loss           | 0.00873   |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 982        |
|    ep_rew_mean          | -62.2      |
| time/                   |            |
|    fps                  | 1922       |
|    iterations           | 1979       |
|    time_elapsed         | 2108       |
|    total_timesteps      | 4052992    |
| train/                  |            |
|    approx_kl            | 0.74306667 |
|    clip_fraction        | 0.747      |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.81       |
|    explained_variance   | 0.562      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0421     |
|    n_updates            | 19780      |
|    policy_gradient_loss | 0.0213     |
|    std                  | 0.0266     |
|    value_loss           | 0.0224     |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 992       |
|    ep_rew_mean          | -62.7     |
| time/                   |           |
|    fps                  | 1922      |
|    iterations           | 1989      |
|    time_elapsed         | 2118      |
|    total_timesteps      | 4073472   |
| train/                  |           |
|    approx_kl            | 0.5084829 |
|    clip_fraction        | 0.656     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.95      |
|    explained_variance   | 0.937     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0306   |
|    n_updates            | 19880     |
|    policy_gradient_loss | 0.00744   |
|    std                  | 0.0257    |
|    value_loss           | 0.0122    |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.06e+03 |
|    ep_rew_mean          | -56.9    |
| time/                   |          |
|    fps                  | 1922     |
|    iterations           | 1999     |
|    time_elapsed         | 2128     |
|    total_timesteps      | 4093952  |
| train/                  |          |
|    approx_kl            | 9.60677  |
|    clip_fraction        | 0.878    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.01     |
|    explained_variance   | 0.845    |
|    learning_rate        | 0.0003   |
|    loss                 | 0.0115   |
|    n_updates            | 19980    |
|    policy_gradient_loss | 0.0605   |
|    std                  | 0.0255   |
|    value_loss           | 0.00634  |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.11e+03  |
|    ep_rew_mean          | -52.5     |
| time/                   |           |
|    fps                  | 1923      |
|    iterations           | 2009      |
|    time_elapsed         | 2139      |
|    total_timesteps      | 4114432   |
| train/                  |           |
|    approx_kl            | 2.2314878 |
|    clip_fraction        | 0.836     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.06      |
|    explained_variance   | 0.848     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0627   |
|    n_updates            | 20080     |
|    policy_gradient_loss | 0.0277    |
|    std                  | 0.0252    |
|    value_loss           | 0.0171    |
---------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.13e+03   |
|    ep_rew_mean          | -49.7      |
| time/                   |            |
|    fps                  | 1923       |
|    iterations           | 2019       |
|    time_elapsed         | 2149       |
|    total_timesteps      | 4134912    |
| train/                  |            |
|    approx_kl            | 0.66484565 |
|    clip_fraction        | 0.767      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.01       |
|    explained_variance   | 0.923      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.062     |
|    n_updates            | 20180      |
|    policy_gradient_loss | 0.0373     |
|    std                  | 0.0255     |
|    value_loss           | 0.021      |
----------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.14e+03  |
|    ep_rew_mean          | -48.2     |
| time/                   |           |
|    fps                  | 1924      |
|    iterations           | 2029      |
|    time_elapsed         | 2159      |
|    total_timesteps      | 4155392   |
| train/                  |           |
|    approx_kl            | 0.7781583 |
|    clip_fraction        | 0.705     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.07      |
|    explained_variance   | 0.817     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0167   |
|    n_updates            | 20280     |
|    policy_gradient_loss | 0.0226    |
|    std                  | 0.0251    |
|    value_loss           | 0.0077    |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.13e+03  |
|    ep_rew_mean          | -50.4     |
| time/                   |           |
|    fps                  | 1924      |
|    iterations           | 2039      |
|    time_elapsed         | 2169      |
|    total_timesteps      | 4175872   |
| train/                  |           |
|    approx_kl            | 12.630543 |
|    clip_fraction        | 0.893     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.08      |
|    explained_variance   | 0.956     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0406    |
|    n_updates            | 20380     |
|    policy_gradient_loss | 0.00888   |
|    std                  | 0.0248    |
|    value_loss           | 0.00975   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.16e+03  |
|    ep_rew_mean          | -48.7     |
| time/                   |           |
|    fps                  | 1924      |
|    iterations           | 2049      |
|    time_elapsed         | 2180      |
|    total_timesteps      | 4196352   |
| train/                  |           |
|    approx_kl            | 3.7158117 |
|    clip_fraction        | 0.814     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.23      |
|    explained_variance   | 0.809     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0664    |
|    n_updates            | 20480     |
|    policy_gradient_loss | 0.034     |
|    std                  | 0.0242    |
|    value_loss           | 0.00752   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.12e+03  |
|    ep_rew_mean          | -52.5     |
| time/                   |           |
|    fps                  | 1924      |
|    iterations           | 2059      |
|    time_elapsed         | 2190      |
|    total_timesteps      | 4216832   |
| train/                  |           |
|    approx_kl            | 1.0268096 |
|    clip_fraction        | 0.765     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.27      |
|    explained_variance   | 0.918     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0964    |
|    n_updates            | 20580     |
|    policy_gradient_loss | 0.0341    |
|    std                  | 0.0239    |
|    value_loss           | 0.00596   |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.2e+03  |
|    ep_rew_mean          | -48.6    |
| time/                   |          |
|    fps                  | 1925     |
|    iterations           | 2069     |
|    time_elapsed         | 2201     |
|    total_timesteps      | 4237312  |
| train/                  |          |
|    approx_kl            | 2.115304 |
|    clip_fraction        | 0.748    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.21     |
|    explained_variance   | 0.927    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0682  |
|    n_updates            | 20680    |
|    policy_gradient_loss | 0.0254   |
|    std                  | 0.0242   |
|    value_loss           | 0.00136  |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.25e+03  |
|    ep_rew_mean          | -45.5     |
| time/                   |           |
|    fps                  | 1925      |
|    iterations           | 2079      |
|    time_elapsed         | 2211      |
|    total_timesteps      | 4257792   |
| train/                  |           |
|    approx_kl            | 0.6160564 |
|    clip_fraction        | 0.781     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.26      |
|    explained_variance   | 0.966     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0058    |
|    n_updates            | 20780     |
|    policy_gradient_loss | 0.0257    |
|    std                  | 0.024     |
|    value_loss           | 0.00249   |
---------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.22e+03  |
|    ep_rew_mean          | -49.8     |
| time/                   |           |
|    fps                  | 1925      |
|    iterations           | 2089      |
|    time_elapsed         | 2221      |
|    total_timesteps      | 4278272   |
| train/                  |           |
|    approx_kl            | 1.1063147 |
|    clip_fraction        | 0.811     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.43      |
|    explained_variance   | 0.965     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0309    |
|    n_updates            | 20880     |
|    policy_gradient_loss | 0.0625    |
|    std                  | 0.023     |
|    value_loss           | 0.0064    |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.2e+03  |
|    ep_rew_mean          | -50.3    |
| time/                   |          |
|    fps                  | 1925     |
|    iterations           | 2099     |
|    time_elapsed         | 2232     |
|    total_timesteps      | 4298752  |
| train/                  |          |
|    approx_kl            | 7.676155 |
|    clip_fraction        | 0.852    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.38     |
|    explained_variance   | 0.942    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.042   |
|    n_updates            | 20980    |
|    policy_gradient_loss | 0.00702  |
|    std                  | 0.023    |
|    value_loss           | 0.00498  |
--------------------------------------

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.24e+03   |
|    ep_rew_mean          | -48.6      |
| time/                   |            |
|    fps                  | 1925       |
|    iterations           | 2109       |
|    time_elapsed         | 2242       |
|    total_timesteps      | 4319232    |
| train/                  |            |
|    approx_kl            | 0.72598803 |
|    clip_fraction        | 0.759      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.38       |
|    explained_variance   | 0.966      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0154    |
|    n_updates            | 21080      |
|    policy_gradient_loss | 0.024      |
|    std                  | 0.0232     |
|    value_loss           | 0.00563    |
----------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.2e+03  |
|    ep_rew_mean          | -51.7    |
| time/                   |          |
|    fps                  | 1926     |
|    iterations           | 2119     |
|    time_elapsed         | 2252     |
|    total_timesteps      | 4339712  |
| train/                  |          |
|    approx_kl            | 16.10523 |
|    clip_fraction        | 0.863    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.37     |
|    explained_variance   | 0.695    |
|    learning_rate        | 0.0003   |
|    loss                 | 0.164    |
|    n_updates            | 21180    |
|    policy_gradient_loss | 0.051    |
|    std                  | 0.0233   |
|    value_loss           | 0.0638   |
--------------------------------------

---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.13e+03  |
|    ep_rew_mean          | -57       |
| time/                   |           |
|    fps                  | 1925      |
|    iterations           | 2129      |
|    time_elapsed         | 2264      |
|    total_timesteps      | 4360192   |
| train/                  |           |
|    approx_kl            | 7.4950523 |
|    clip_fraction        | 0.781     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.42      |
|    explained_variance   | 0.915     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.00688   |
|    n_updates            | 21280     |
|    policy_gradient_loss | 0.0202    |
|    std                  | 0.0229    |
|    value_loss           | 0.00597   |
---------------------------------------

--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.15e+03 |
|    ep_rew_mean          | -54.8    |
| time/                   |          |
|    fps                  | 1925     |
|    iterations           | 2139     |
|    time_elapsed         | 2274     |
|    total_timesteps      | 4380672  |
| train/                  |          |
|    approx_kl            | 6.196265 |
|    clip_fraction        | 0.793    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.36     |
|    explained_variance   | 0.961    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0563  |
|    n_updates            | 21380    |
|    policy_gradient_loss | 0.035    |
|    std                  | 0.0234   |
|    value_loss           | 0.00157  |
--------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.15e+03  |
|    ep_rew_mean          | -54.3     |
| time/                   |           |
|    fps                  | 1925      |
|    iterations           | 2149      |
|    time_elapsed         | 2285      |
|    total_timesteps      | 4401152   |
| train/                  |           |
|    approx_kl            | 1.3120718 |
|    clip_fraction        | 0.791     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.29      |
|    explained_variance   | 0.96      |
|    learning_rate        | 0.0003    |
|    loss                 | 0.124     |
|    n_updates            | 21480     |
|    policy_gradient_loss | 0.0223    |
|    std                  | 0.0238    |
|    value_loss           | 0.0016    |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.16e+03  |
|    ep_rew_mean          | -53.3     |
| time/                   |           |
|    fps                  | 1925      |
|    iterations           | 2159      |
|    time_elapsed         | 2296      |
|    total_timesteps      | 4421632   |
| train/                  |           |
|    approx_kl            | 1.9828389 |
|    clip_fraction        | 0.779     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.23      |
|    explained_variance   | 0.982     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0415    |
|    n_updates            | 21580     |
|    policy_gradient_loss | 0.0186    |
|    std                  | 0.0242    |
|    value_loss           | 0.00129   |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.17e+03   |
|    ep_rew_mean          | -51.4      |
| time/                   |            |
|    fps                  | 1924       |
|    iterations           | 2169       |
|    time_elapsed         | 2307       |
|    total_timesteps      | 4442112    |
| train/                  |            |
|    approx_kl            | 0.49137488 |
|    clip_fraction        | 0.693      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.16       |
|    explained_variance   | 0.968      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0394     |
|    n_updates            | 21680      |
|    policy_gradient_loss | 0.0218     |
|    std                  | 0.0245     |
|    value_loss           | 0.0025     |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.2e+03   |
|    ep_rew_mean          | -48.9     |
| time/                   |           |
|    fps                  | 1924      |
|    iterations           | 2179      |
|    time_elapsed         | 2318      |
|    total_timesteps      | 4462592   |
| train/                  |           |
|    approx_kl            | 7.4725866 |
|    clip_fraction        | 0.823     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.28      |
|    explained_variance   | 0.984     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0505   |
|    n_updates            | 21780     |
|    policy_gradient_loss | 0.0489    |
|    std                  | 0.0238    |
|    value_loss           | 0.00084   |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.23e+03  |
|    ep_rew_mean          | -47.4     |
| time/                   |           |
|    fps                  | 1924      |
|    iterations           | 2189      |
|    time_elapsed         | 2330      |
|    total_timesteps      | 4483072   |
| train/                  |           |
|    approx_kl            | 0.5579603 |
|    clip_fraction        | 0.724     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.47      |
|    explained_variance   | 0.98      |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0095    |
|    n_updates            | 21880     |
|    policy_gradient_loss | 0.0108    |
|    std                  | 0.0226    |
|    value_loss           | 0.00192   |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.27e+03  |
|    ep_rew_mean          | -46.6     |
| time/                   |           |
|    fps                  | 1923      |
|    iterations           | 2199      |
|    time_elapsed         | 2341      |
|    total_timesteps      | 4503552   |
| train/                  |           |
|    approx_kl            | 2.2859085 |
|    clip_fraction        | 0.802     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.48      |
|    explained_variance   | 0.949     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0248   |
|    n_updates            | 21980     |
|    policy_gradient_loss | 0.0174    |
|    std                  | 0.0226    |
|    value_loss           | 0.00674   |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.29e+03  |
|    ep_rew_mean          | -45.4     |
| time/                   |           |
|    fps                  | 1923      |
|    iterations           | 2209      |
|    time_elapsed         | 2352      |
|    total_timesteps      | 4524032   |
| train/                  |           |
|    approx_kl            | 0.5693066 |
|    clip_fraction        | 0.687     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.46      |
|    explained_variance   | 0.972     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0124   |
|    n_updates            | 22080     |
|    policy_gradient_loss | 0.00754   |
|    std                  | 0.0228    |
|    value_loss           | 0.00992   |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.32e+03  |
|    ep_rew_mean          | -43.6     |
| time/                   |           |
|    fps                  | 1922      |
|    iterations           | 2219      |
|    time_elapsed         | 2363      |
|    total_timesteps      | 4544512   |
| train/                  |           |
|    approx_kl            | 5.5972333 |
|    clip_fraction        | 0.842     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.44      |
|    explained_variance   | 0.995     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.0133    |
|    n_updates            | 22180     |
|    policy_gradient_loss | 0.127     |
|    std                  | 0.0229    |
|    value_loss           | 0.000216  |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.27e+03  |
|    ep_rew_mean          | -46.1     |
| time/                   |           |
|    fps                  | 1922      |
|    iterations           | 2229      |
|    time_elapsed         | 2374      |
|    total_timesteps      | 4564992   |
| train/                  |           |
|    approx_kl            | 11.326557 |
|    clip_fraction        | 0.88      |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.43      |
|    explained_variance   | 0.988     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0314   |
|    n_updates            | 22280     |
|    policy_gradient_loss | 0.0164    |
|    std                  | 0.0229    |
|    value_loss           | 0.000614  |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.24e+03  |
|    ep_rew_mean          | -47.1     |
| time/                   |           |
|    fps                  | 1922      |
|    iterations           | 2239      |
|    time_elapsed         | 2385      |
|    total_timesteps      | 4585472   |
| train/                  |           |
|    approx_kl            | 12.252332 |
|    clip_fraction        | 0.925     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.56      |
|    explained_variance   | 0.951     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0311   |
|    n_updates            | 22380     |
|    policy_gradient_loss | 0.0625    |
|    std                  | 0.0222    |
|    value_loss           | 0.00363   |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.28e+03  |
|    ep_rew_mean          | -43.8     |
| time/                   |           |
|    fps                  | 1921      |
|    iterations           | 2249      |
|    time_elapsed         | 2396      |
|    total_timesteps      | 4605952   |
| train/                  |           |
|    approx_kl            | 11.623705 |
|    clip_fraction        | 0.927     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.6       |
|    explained_variance   | 0.95      |
|    learning_rate        | 0.0003    |
|    loss                 | 0.00953   |
|    n_updates            | 22480     |
|    policy_gradient_loss | 0.0554    |
|    std                  | 0.0219    |
|    value_loss           | 0.00396   |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.3e+03   |
|    ep_rew_mean          | -39.3     |
| time/                   |           |
|    fps                  | 1921      |
|    iterations           | 2259      |
|    time_elapsed         | 2407      |
|    total_timesteps      | 4626432   |
| train/                  |           |
|    approx_kl            | 4.7138634 |
|    clip_fraction        | 0.717     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.51      |
|    explained_variance   | 0.942     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0725   |
|    n_updates            | 22580     |
|    policy_gradient_loss | -0.018    |
|    std                  | 0.0224    |
|    value_loss           | 0.000738  |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.29e+03  |
|    ep_rew_mean          | -38.5     |
| time/                   |           |
|    fps                  | 1921      |
|    iterations           | 2269      |
|    time_elapsed         | 2418      |
|    total_timesteps      | 4646912   |
| train/                  |           |
|    approx_kl            | 2.5338264 |
|    clip_fraction        | 0.702     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.44      |
|    explained_variance   | 0.915     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0504   |
|    n_updates            | 22680     |
|    policy_gradient_loss | -0.0152   |
|    std                  | 0.0228    |
|    value_loss           | 0.00318   |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.28e+03  |
|    ep_rew_mean          | -37.1     |
| time/                   |           |
|    fps                  | 1921      |
|    iterations           | 2279      |
|    time_elapsed         | 2428      |
|    total_timesteps      | 4667392   |
| train/                  |           |
|    approx_kl            | 0.7707824 |
|    clip_fraction        | 0.776     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.25      |
|    explained_variance   | 0.956     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0015   |
|    n_updates            | 22780     |
|    policy_gradient_loss | 0.0115    |
|    std                  | 0.0239    |
|    value_loss           | 0.00275   |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.31e+03   |
|    ep_rew_mean          | -36.4      |
| time/                   |            |
|    fps                  | 1921       |
|    iterations           | 2289       |
|    time_elapsed         | 2439       |
|    total_timesteps      | 4687872    |
| train/                  |            |
|    approx_kl            | 0.49825636 |
|    clip_fraction        | 0.752      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.26       |
|    explained_variance   | 0.978      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.226      |
|    n_updates            | 22880      |
|    policy_gradient_loss | 0.0361     |
|    std                  | 0.0241     |
|    value_loss           | 0.000951   |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.36e+03  |
|    ep_rew_mean          | -33.1     |
| time/                   |           |
|    fps                  | 1922      |
|    iterations           | 2299      |
|    time_elapsed         | 2449      |
|    total_timesteps      | 4708352   |
| train/                  |           |
|    approx_kl            | 0.5327971 |
|    clip_fraction        | 0.691     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.29      |
|    explained_variance   | 0.917     |
|    learning_rate        | 0.0003    |
|    loss                 | 0.121     |
|    n_updates            | 22980     |
|    policy_gradient_loss | 0.00143   |
|    std                  | 0.0237    |
|    value_loss           | 0.00119   |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.38e+03   |
|    ep_rew_mean          | -32.2      |
| time/                   |            |
|    fps                  | 1922       |
|    iterations           | 2309       |
|    time_elapsed         | 2460       |
|    total_timesteps      | 4728832    |
| train/                  |            |
|    approx_kl            | 0.44721323 |
|    clip_fraction        | 0.672      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.06       |
|    explained_variance   | 0.977      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0024    |
|    n_updates            | 23080      |
|    policy_gradient_loss | 0.00966    |
|    std                  | 0.0252     |
|    value_loss           | 0.00106    |
----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.42e+03   |
|    ep_rew_mean          | -30.9      |
| time/                   |            |
|    fps                  | 1922       |
|    iterations           | 2319       |
|    time_elapsed         | 2470       |
|    total_timesteps      | 4749312    |
| train/                  |            |
|    approx_kl            | 0.56802166 |
|    clip_fraction        | 0.661      |
|    clip_range           | 0.2        |
|    entropy_loss         | 9.07       |
|    explained_variance   | 0.987      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0189    |
|    n_updates            | 23180      |
|    policy_gradient_loss | -0.0056    |
|    std                  | 0.0249     |
|    value_loss           | 0.00306    |
----------------------------------------
--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.38e+03 |
|    ep_rew_mean          | -34.8    |
| time/                   |          |
|    fps                  | 1922     |
|    iterations           | 2329     |
|    time_elapsed         | 2481     |
|    total_timesteps      | 4769792  |
| train/                  |          |
|    approx_kl            | 5.88457  |
|    clip_fraction        | 0.765    |
|    clip_range           | 0.2      |
|    entropy_loss         | 9.09     |
|    explained_variance   | 0.979    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0249  |
|    n_updates            | 23280    |
|    policy_gradient_loss | 0.03     |
|    std                  | 0.0249   |
|    value_loss           | 0.00141  |
--------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.41e+03  |
|    ep_rew_mean          | -34.1     |
| time/                   |           |
|    fps                  | 1922      |
|    iterations           | 2339      |
|    time_elapsed         | 2491      |
|    total_timesteps      | 4790272   |
| train/                  |           |
|    approx_kl            | 16.476368 |
|    clip_fraction        | 0.904     |
|    clip_range           | 0.2       |
|    entropy_loss         | 9.06      |
|    explained_variance   | 0.939     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0686   |
|    n_updates            | 23380     |
|    policy_gradient_loss | 0.0163    |
|    std                  | 0.025     |
|    value_loss           | 0.00226   |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.43e+03  |
|    ep_rew_mean          | -34.2     |
| time/                   |           |
|    fps                  | 1922      |
|    iterations           | 2349      |
|    time_elapsed         | 2502      |
|    total_timesteps      | 4810752   |
| train/                  |           |
|    approx_kl            | 0.7705461 |
|    clip_fraction        | 0.653     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.85      |
|    explained_variance   | 0.973     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0126   |
|    n_updates            | 23480     |
|    policy_gradient_loss | -0.00513  |
|    std                  | 0.0264    |
|    value_loss           | 0.00148   |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.44e+03  |
|    ep_rew_mean          | -34.3     |
| time/                   |           |
|    fps                  | 1922      |
|    iterations           | 2359      |
|    time_elapsed         | 2512      |
|    total_timesteps      | 4831232   |
| train/                  |           |
|    approx_kl            | 0.7306694 |
|    clip_fraction        | 0.615     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.93      |
|    explained_variance   | 0.992     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0363   |
|    n_updates            | 23580     |
|    policy_gradient_loss | 0.00834   |
|    std                  | 0.026     |
|    value_loss           | 0.000257  |
---------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.46e+03  |
|    ep_rew_mean          | -32.7     |
| time/                   |           |
|    fps                  | 1922      |
|    iterations           | 2369      |
|    time_elapsed         | 2523      |
|    total_timesteps      | 4851712   |
| train/                  |           |
|    approx_kl            | 2.7464952 |
|    clip_fraction        | 0.727     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.82      |
|    explained_variance   | 0.948     |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0111   |
|    n_updates            | 23680     |
|    policy_gradient_loss | 0.0146    |
|    std                  | 0.0266    |
|    value_loss           | 0.00121   |
---------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.48e+03   |
|    ep_rew_mean          | -32.6      |
| time/                   |            |
|    fps                  | 1922       |
|    iterations           | 2379       |
|    time_elapsed         | 2534       |
|    total_timesteps      | 4872192    |
| train/                  |            |
|    approx_kl            | 0.22941971 |
|    clip_fraction        | 0.657      |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.79       |
|    explained_variance   | 0.987      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0237    |
|    n_updates            | 23780      |
|    policy_gradient_loss | 0.0316     |
|    std                  | 0.0269     |
|    value_loss           | 0.00102    |
----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.45e+03   |
|    ep_rew_mean          | -33.4      |
| time/                   |            |
|    fps                  | 1922       |
|    iterations           | 2389       |
|    time_elapsed         | 2544       |
|    total_timesteps      | 4892672    |
| train/                  |            |
|    approx_kl            | 0.46998814 |
|    clip_fraction        | 0.781      |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.72       |
|    explained_variance   | 0.956      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.0189     |
|    n_updates            | 23880      |
|    policy_gradient_loss | 0.0312     |
|    std                  | 0.0276     |
|    value_loss           | 0.00463    |
----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.48e+03   |
|    ep_rew_mean          | -31        |
| time/                   |            |
|    fps                  | 1922       |
|    iterations           | 2399       |
|    time_elapsed         | 2555       |
|    total_timesteps      | 4913152    |
| train/                  |            |
|    approx_kl            | 0.33633065 |
|    clip_fraction        | 0.738      |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.65       |
|    explained_variance   | 0.986      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0399    |
|    n_updates            | 23980      |
|    policy_gradient_loss | 0.0423     |
|    std                  | 0.028      |
|    value_loss           | 0.0011     |
----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 1.47e+03   |
|    ep_rew_mean          | -33.6      |
| time/                   |            |
|    fps                  | 1922       |
|    iterations           | 2409       |
|    time_elapsed         | 2565       |
|    total_timesteps      | 4933632    |
| train/                  |            |
|    approx_kl            | 0.62288404 |
|    clip_fraction        | 0.6        |
|    clip_range           | 0.2        |
|    entropy_loss         | 8.69       |
|    explained_variance   | 0.84       |
|    learning_rate        | 0.0003     |
|    loss                 | -0.0485    |
|    n_updates            | 24080      |
|    policy_gradient_loss | -0.0246    |
|    std                  | 0.0273     |
|    value_loss           | 0.00956    |
----------------------------------------
--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.44e+03 |
|    ep_rew_mean          | -39.8    |
| time/                   |          |
|    fps                  | 1922     |
|    iterations           | 2419     |
|    time_elapsed         | 2576     |
|    total_timesteps      | 4954112  |
| train/                  |          |
|    approx_kl            | 10.57218 |
|    clip_fraction        | 0.872    |
|    clip_range           | 0.2      |
|    entropy_loss         | 8.81     |
|    explained_variance   | 0.979    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0541  |
|    n_updates            | 24180    |
|    policy_gradient_loss | -0.0151  |
|    std                  | 0.0264   |
|    value_loss           | 0.00453  |
--------------------------------------
--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 1.42e+03 |
|    ep_rew_mean          | -41.9    |
| time/                   |          |
|    fps                  | 1922     |
|    iterations           | 2429     |
|    time_elapsed         | 2587     |
|    total_timesteps      | 4974592  |
| train/                  |          |
|    approx_kl            | 2.275885 |
|    clip_fraction        | 0.769    |
|    clip_range           | 0.2      |
|    entropy_loss         | 8.89     |
|    explained_variance   | 0.988    |
|    learning_rate        | 0.0003   |
|    loss                 | -0.0242  |
|    n_updates            | 24280    |
|    policy_gradient_loss | -0.00612 |
|    std                  | 0.0261   |
|    value_loss           | 0.00348  |
--------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 1.38e+03  |
|    ep_rew_mean          | -43.8     |
| time/                   |           |
|    fps                  | 1922      |
|    iterations           | 2439      |
|    time_elapsed         | 2597      |
|    total_timesteps      | 4995072   |
| train/                  |           |
|    approx_kl            | 0.4745683 |
|    clip_fraction        | 0.709     |
|    clip_range           | 0.2       |
|    entropy_loss         | 8.69      |
|    explained_variance   | 0.98      |
|    learning_rate        | 0.0003    |
|    loss                 | -0.0234   |
|    n_updates            | 24380     |
|    policy_gradient_loss | 0.00435   |
|    std                  | 0.0276    |
|    value_loss           | 0.00162   |
---------------------------------------

<stable_baselines3.ppo.ppo.PPO at 0x30389f690>

## 3D) Save Model

In [16]:
# Save under models/ so the path matches the PPO.load() calls below
model.save("models/ppo_bipedalwalker_hardcore_3M")

In [6]:
del model

## 3E) Evaluate the 3M Model

In [7]:
model = PPO.load("models/ppo_bipedalwalker_hardcore_3M")

In [8]:
env = gym.make("BipedalWalker-v3", hardcore=True, render_mode="human")

In [None]:
# Evaluate the model (e.g., over 10 episodes)
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)

print(f"Average reward: {mean_reward} ± {std_reward}")

- **Average Reward**: -28.23 ± 24.82
  - **Assessment**: The model underperforms in the more challenging hardcore environment. A negative average reward means that penalties outweigh the reward earned for forward progress, so the walker struggles to reach the target. The relatively low standard deviation (24.82) over the 10 evaluation episodes suggests limited variability, i.e. the model performs consistently poorly under the difficult conditions. The model likely needs more training and possibly different hyperparameter settings.
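
To judge how much the average above can be trusted, it helps to convert the reported spread into a standard error of the mean. The sketch below is only a rough illustration: `standard_error` is a helper introduced here (not part of stable-baselines3), it treats the reported standard deviation as if it were the true per-episode spread, and the 1000-episode line assumes the same spread purely for comparison.

```python
import math


def standard_error(std_reward: float, n_episodes: int) -> float:
    """Approximate standard error of the mean reward over n_episodes."""
    if n_episodes <= 0:
        raise ValueError("n_episodes must be positive")
    return std_reward / math.sqrt(n_episodes)


# Values reported by the 10-episode evaluation above
mean_reward, std_reward = -28.23, 24.82

# Compare the uncertainty of the mean for 10 episodes versus 1000 episodes
# (the 1000-episode case reuses the same std as an assumption).
for n in (10, 1000):
    se = standard_error(std_reward, n)
    print(f"n={n}: mean {mean_reward} +/- {se:.2f} (standard error)")
```

With only 10 episodes the mean is uncertain by several reward points, which is one reason to prefer a larger `n_eval_episodes` when comparing checkpoints.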

## 3F) Evaluate the 5M Model

In [17]:
del model

In [18]:
model = PPO.load("models/ppo_bipedalwalker_hardcore_5M")

In [None]:
# Evaluate the model over 1000 episodes for a more reliable estimate
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=1000)

print(f"Average reward: {mean_reward} ± {std_reward}")