
# Stable Baselines3 - Hindsight Experience Replay on Highway Env

GitHub repo: [https://github.com/DLR-RM/stable-baselines3](https://github.com/DLR-RM/stable-baselines3)

Highway env: [https://github.com/eleurent/highway-env](https://github.com/eleurent/highway-env)

[RL Baselines3 Zoo](https://github.com/DLR-RM/rl-baselines3-zoo) is a training framework for Reinforcement Learning (RL), using Stable Baselines3.

It provides scripts for training, evaluating agents, tuning hyperparameters, plotting results and recording videos.

Documentation is available online: [https://stable-baselines3.readthedocs.io/](https://stable-baselines3.readthedocs.io/)

## Install Dependencies and Stable Baselines Using Pip


```
pip install stable-baselines3[extra]
```

In [1]:
# for autoformatting
# %load_ext jupyter_black

In [2]:
# Install the latest version of Stable Baselines3
!pip install "stable-baselines3[extra]>=2.0.0a4"

Collecting stable-baselines3>=2.0.0a4 (from stable-baselines3[extra]>=2.0.0a4)
  Downloading stable_baselines3-2.7.0-py3-none-any.whl.metadata (4.8 kB)
Downloading stable_baselines3-2.7.0-py3-none-any.whl (187 kB)
Installing collected packages: stable-baselines3
Successfully installed stable-baselines3-2.7.0


In [3]:
# Install highway-env
!pip install highway-env

Collecting highway-env
  Downloading highway_env-1.10.1-py3-none-any.whl.metadata (16 kB)
Downloading highway_env-1.10.1-py3-none-any.whl (104 kB)
Installing collected packages: highway-env
Successfully installed highway-env-1.10.1


## Import policy, RL agent, ...

In [4]:
import gymnasium as gym
import highway_env
import numpy as np

from stable_baselines3 import HerReplayBuffer, SAC, DDPG
from stable_baselines3.common.noise import NormalActionNoise

Gym has been unmaintained since 2022 and does not support NumPy 2.0 amongst other critical functionality.
Please upgrade to Gymnasium, the maintained drop-in replacement of Gym, or contact the authors of your software and request that they upgrade.
See the migration guide at https://gymnasium.farama.org/introduction/migration_guide/ for additional information.

## Create the Gym env and instantiate the agent

For this example, we will be using the parking environment from the [highway-env](https://github.com/Farama-Foundation/HighwayEnv) repo by @eleurent.

The parking env is a goal-conditioned continuous control task, in which the vehicle must park in a given space with the appropriate heading.


![parking-env](https://raw.githubusercontent.com/eleurent/highway-env/gh-media/docs/media/parking-env.gif)
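To make "goal-conditioned" concrete, the sketch below mimics the dictionary observation that such environments typically expose (`observation`, `achieved_goal`, `desired_goal`), together with a distance-based reward. The feature layout, the weights, and the success threshold are illustrative assumptions. They are not values taken from highway-env, and the helper names `make_obs`, `compute_reward`, and `is_success` are hypothetical.

```python
import math

# Hypothetical 6-feature state: x, y, vx, vy, cos(heading), sin(heading).
def make_obs(state, goal):
    """Build a goal-conditioned observation dict (illustrative layout)."""
    return {
        "observation": list(state),
        "achieved_goal": list(state),
        "desired_goal": list(goal),
    }

# Assumed weights: position matters most, heading a little, velocity not at all.
WEIGHTS = [1.0, 1.0, 0.0, 0.0, 0.1, 0.1]

def compute_reward(achieved_goal, desired_goal, weights=WEIGHTS):
    """Negative weighted L1 distance between achieved and desired goal."""
    return -sum(w * abs(a - d) for w, a, d in zip(weights, achieved_goal, desired_goal))

def is_success(achieved_goal, desired_goal, threshold=0.05):
    """Success when the weighted distance falls below an assumed threshold."""
    return -compute_reward(achieved_goal, desired_goal) < threshold

# The vehicle starts at the origin; the parking spot is at (0.5, 0.2), heading 0.
goal = [0.5, 0.2, 0.0, 0.0, math.cos(0.0), math.sin(0.0)]
obs = make_obs([0.0, 0.0, 0.1, 0.0, 1.0, 0.0], goal)
reward = compute_reward(obs["achieved_goal"], obs["desired_goal"])
print(reward, is_success(obs["achieved_goal"], obs["desired_goal"]))
```

Because the reward depends only on `achieved_goal` and `desired_goal`, it can be recomputed for any substituted goal, which is the property HER relies on.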



### Train Soft Actor-Critic (SAC) agent

Here, we use HER with the "future" goal selection strategy, which creates 4 artificial transitions per real transition (`n_sampled_goal=4`).

Note: the hyperparameters (network architecture, discount factor, ...) were tuned for this task.
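As a rough illustration of the "future" strategy, the toy sketch below relabels each transition of a 1-D episode with goals that were achieved at the same or a later step of that episode. It is a conceptual sketch only: the function `her_future_relabel`, the sparse 0/-1 reward, and the data layout are assumptions made for this example, and the library's `HerReplayBuffer` may instead perform the relabeling when batches are sampled.

```python
import random

def her_future_relabel(episode, n_sampled_goal=4, rng=None):
    """Return virtual transitions relabeled with goals achieved later in the episode.

    `episode` is a list of dicts with keys "achieved_goal" (reached after the step)
    and "desired_goal". Assumed sparse reward: 0.0 if the relabeled goal is reached,
    -1.0 otherwise.
    """
    rng = rng or random.Random(0)
    virtual = []
    for t, transition in enumerate(episode):
        for _ in range(n_sampled_goal):
            # "future": pick a step from the same episode at or after step t
            future_t = rng.randint(t, len(episode) - 1)
            new_goal = episode[future_t]["achieved_goal"]
            reward = 0.0 if transition["achieved_goal"] == new_goal else -1.0
            virtual.append({**transition, "desired_goal": new_goal, "reward": reward})
    return virtual

# Toy 1-D episode: the agent reaches positions 1, 2, 3 but the real goal was 10.
episode = [{"achieved_goal": p, "desired_goal": 10} for p in (1, 2, 3)]
virtual = her_future_relabel(episode)
print(len(virtual), "virtual transitions from", len(episode), "real ones")
```

Even though the episode never reaches the real goal, some relabeled transitions carry a success signal, which gives the agent something to learn from while the real reward is still uninformative.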

In [5]:
env = gym.make("parking-v0")


In [6]:
# SAC hyperparams:
model = SAC(
    "MultiInputPolicy",
    env,
    replay_buffer_class=HerReplayBuffer,
    replay_buffer_kwargs=dict(
        n_sampled_goal=4,
        goal_selection_strategy="future",
    ),
    verbose=1,
    buffer_size=int(1e6),
    learning_rate=1e-3,
    gamma=0.95,
    batch_size=256,
    policy_kwargs=dict(net_arch=[256, 256, 256]),
)

Using cuda device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.



In [8]:
# Train for 1e5 steps
model.learn(int(1e5))
# Save the trained agent
model.save('her_sac_highway')

---------------------------------
| rollout/           |          |
|    ep_len_mean     | 62.8     |
|    ep_rew_mean     | -31.6    |
|    success_rate    | 0        |
| time/              |          |
|    episodes        | 4        |
|    fps             | 44       |
|    time_elapsed    | 5        |
|    total_timesteps | 251      |
| train/             |          |
|    actor_loss      | -2.45    |
|    critic_loss     | 0.305    |
|    ent_coef        | 0.862    |
|    ent_coef_loss   | -0.506   |
|    learning_rate   | 0.001    |
|    n_updates       | 150      |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 77.2     |
|    ep_rew_mean     | -36.5    |
|    success_rate    | 0        |
| time/              |          |
|    episodes        | 8        |
|    fps             | 41       |
|    time_elapsed    | 14       |
|    total_timesteps | 618      |
| train/             |          |
|    actor_loss      | -2.51    |
|    critic_loss     | 0.0339   |
|    ent_coef        | 0.597    |
|    ent_coef_loss   | -1.74    |
|    learning_rate   | 0.001    |
|    n_updates       | 517      |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 73.5     |
|    ep_rew_mean     | -37      |
|    success_rate    | 0        |
| time/              |          |
|    episodes        | 12       |
|    fps             | 40       |
|    time_elapsed    | 21       |
|    total_timesteps | 882      |
| train/             |          |
|    actor_loss      | -2.3     |
|    critic_loss     | 0.171    |
|    ent_coef        | 0.459    |
|    ent_coef_loss   | -2.6     |
|    learning_rate   | 0.001    |
|    n_updates       | 781      |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 85.8     |
|    ep_rew_mean     | -45.5    |
|    success_rate    | 0        |
| time/              |          |
|    episodes        | 16       |
|    fps             | 39       |
|    time_elapsed    | 34       |
|    total_timesteps | 1373     |
| train/             |          |
|    actor_loss      | -2.1     |
|    critic_loss     | 0.0149   |
|    ent_coef        | 0.281    |
|    ent_coef_loss   | -4.22    |
|    learning_rate   | 0.001    |
|    n_updates       | 1272     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 86       |
|    ep_rew_mean     | -44.5    |
|    success_rate    | 0        |
| time/              |          |
|    episodes        | 20       |
|    fps             | 40       |
|    time_elapsed    | 42       |
|    total_timesteps | 1719     |
| train/             |          |
|    actor_loss      | -1.68    |
|    critic_loss     | 0.0288   |
|    ent_coef        | 0.2      |
|    ent_coef_loss   | -5.3     |
|    learning_rate   | 0.001    |
|    n_updates       | 1618     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 116      |
|    ep_rew_mean     | -59.5    |
|    success_rate    | 0        |
| time/              |          |
|    episodes        | 24       |
|    fps             | 39       |
|    time_elapsed    | 69       |
|    total_timesteps | 2787     |
| train/             |          |
|    actor_loss      | -0.671   |
|    critic_loss     | 0.00797  |
|    ent_coef        | 0.0706   |
|    ent_coef_loss   | -8.22    |
|    learning_rate   | 0.001    |
|    n_updates       | 2686     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 137      |
|    ep_rew_mean     | -65      |
|    success_rate    | 0        |
| time/              |          |
|    episodes        | 28       |
|    fps             | 39       |
|    time_elapsed    | 96       |
|    total_timesteps | 3843     |
| train/             |          |
|    actor_loss      | 0.463    |
|    critic_loss     | 0.00835  |
|    ent_coef        | 0.0266   |
|    ent_coef_loss   | -9.31    |
|    learning_rate   | 0.001    |
|    n_updates       | 3742     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 148      |
|    ep_rew_mean     | -69.5    |
|    success_rate    | 0        |
| time/              |          |
|    episodes        | 32       |
|    fps             | 39       |
|    time_elapsed    | 119      |
|    total_timesteps | 4743     |
| train/             |          |
|    actor_loss      | 1.09     |
|    critic_loss     | 0.00518  |
|    ent_coef        | 0.0127   |
|    ent_coef_loss   | -8.27    |
|    learning_rate   | 0.001    |
|    n_updates       | 4642     |
---------------------------------



---------------------------------
| rollout/           |          |
|    ep_len_mean     | 179      |
|    ep_rew_mean     | -80.2    |
|    success_rate    | 0        |
| time/              |          |
|    episodes        | 36       |
|    fps             | 39       |
|    time_elapsed    | 161      |
|    total_timesteps | 6427     |
| train/             |          |
|    actor_loss      | 1.89     |
|    critic_loss     | 0.00487  |
|    ent_coef        | 0.00445  |
|    ent_coef_loss   | -3.97    |
|    learning_rate   | 0.001    |
|    n_updates       | 6326     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 168      |
|    ep_rew_mean     | -74.6    |
|    success_rate    | 0.025    |
| time/              |          |
|    episodes        | 40       |
|    fps             | 39       |
|    time_elapsed    | 169      |
|    total_timesteps | 6716     |
| train/             |          |
|    actor_loss      | 1.86     |
|    critic_loss     | 0.00473  |
|    ent_coef        | 0.004    |
|    ent_coef_loss   | -2.27    |
|    learning_rate   | 0.001    |
|    n_updates       | 6615     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 174      |
|    ep_rew_mean     | -72.8    |
|    success_rate    | 0.0227   |
| time/              |          |
|    episodes        | 44       |
|    fps             | 39       |
|    time_elapsed    | 193      |
|    total_timesteps | 7667     |
| train/             |          |
|    actor_loss      | 2.15     |
|    critic_loss     | 0.0818   |
|    ent_coef        | 0.00322  |
|    ent_coef_loss   | -0.671   |
|    learning_rate   | 0.001    |
|    n_updates       | 7566     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 164      |
|    ep_rew_mean     | -68.5    |
|    success_rate    | 0.0208   |
| time/              |          |
|    episodes        | 48       |
|    fps             | 39       |
|    time_elapsed    | 198      |
|    total_timesteps | 7851     |
| train/             |          |
|    actor_loss      | 2.07     |
|    critic_loss     | 0.0059   |
|    ent_coef        | 0.00315  |
|    ent_coef_loss   | -2.38    |
|    learning_rate   | 0.001    |
|    n_updates       | 7750     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 156      |
|    ep_rew_mean     | -65.5    |
|    success_rate    | 0.0192   |
| time/              |          |
|    episodes        | 52       |
|    fps             | 39       |
|    time_elapsed    | 205      |
|    total_timesteps | 8106     |
| train/             |          |
|    actor_loss      | 2.23     |
|    critic_loss     | 0.00615  |
|    ent_coef        | 0.00311  |
|    ent_coef_loss   | 0.588    |
|    learning_rate   | 0.001    |
|    n_updates       | 8005     |
---------------------------------



---------------------------------
| rollout/           |          |
|    ep_len_mean     | 147      |
|    ep_rew_mean     | -61.6    |
|    success_rate    | 0.0357   |
| time/              |          |
|    episodes        | 56       |
|    fps             | 39       |
|    time_elapsed    | 207      |
|    total_timesteps | 8205     |
| train/             |          |
|    actor_loss      | 2.23     |
|    critic_loss     | 0.00699  |
|    ent_coef        | 0.00312  |
|    ent_coef_loss   | -0.338   |
|    learning_rate   | 0.001    |
|    n_updates       | 8104     |
---------------------------------



---------------------------------
| rollout/           |          |
|    ep_len_mean     | 138      |
|    ep_rew_mean     | -58.3    |
|    success_rate    | 0.0333   |
| time/              |          |
|    episodes        | 60       |
|    fps             | 39       |
|    time_elapsed    | 209      |
|    total_timesteps | 8283     |
| train/             |          |
|    actor_loss      | 2.26     |
|    critic_loss     | 0.011    |
|    ent_coef        | 0.00317  |
|    ent_coef_loss   | 1.29     |
|    learning_rate   | 0.001    |
|    n_updates       | 8182     |
---------------------------------



---------------------------------
| rollout/           |          |
|    ep_len_mean     | 131      |
|    ep_rew_mean     | -55.7    |
|    success_rate    | 0.0469   |
| time/              |          |
|    episodes        | 64       |
|    fps             | 39       |
|    time_elapsed    | 212      |
|    total_timesteps | 8408     |
| train/             |          |
|    actor_loss      | 2.05     |
|    critic_loss     | 0.0151   |
|    ent_coef        | 0.00325  |
|    ent_coef_loss   | 2.02     |
|    learning_rate   | 0.001    |
|    n_updates       | 8307     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 127      |
|    ep_rew_mean     | -53.3    |
|    success_rate    | 0.0588   |
| time/              |          |
|    episodes        | 68       |
|    fps             | 39       |
|    time_elapsed    | 218      |
|    total_timesteps | 8619     |
| train/             |          |
|    actor_loss      | 2.33     |
|    critic_loss     | 0.00692  |
|    ent_coef        | 0.00341  |
|    ent_coef_loss   | 0.51     |
|    learning_rate   | 0.001    |
|    n_updates       | 8518     |
---------------------------------



---------------------------------
| rollout/           |          |
|    ep_len_mean     | 122      |
|    ep_rew_mean     | -51.4    |
|    success_rate    | 0.0556   |
| time/              |          |
|    episodes        | 72       |
|    fps             | 39       |
|    time_elapsed    | 222      |
|    total_timesteps | 8764     |
| train/             |          |
|    actor_loss      | 2.22     |
|    critic_loss     | 0.00526  |
|    ent_coef        | 0.00352  |
|    ent_coef_loss   | 1.32     |
|    learning_rate   | 0.001    |
|    n_updates       | 8663     |
---------------------------------



---------------------------------
| rollout/           |          |
|    ep_len_mean     | 117      |
|    ep_rew_mean     | -49.4    |
|    success_rate    | 0.0658   |
| time/              |          |
|    episodes        | 76       |
|    fps             | 39       |
|    time_elapsed    | 224      |
|    total_timesteps | 8861     |
| train/             |          |
|    actor_loss      | 2.22     |
|    critic_loss     | 0.00695  |
|    ent_coef        | 0.00357  |
|    ent_coef_loss   | 1.46     |
|    learning_rate   | 0.001    |
|    n_updates       | 8760     |
---------------------------------



---------------------------------
| rollout/           |          |
|    ep_len_mean     | 112      |
|    ep_rew_mean     | -47.6    |
|    success_rate    | 0.075    |
| time/              |          |
|    episodes        | 80       |
|    fps             | 39       |
|    time_elapsed    | 227      |
|    total_timesteps | 8970     |
| train/             |          |
|    actor_loss      | 2.32     |
|    critic_loss     | 0.0311   |
|    ent_coef        | 0.0037   |
|    ent_coef_loss   | 1.04     |
|    learning_rate   | 0.001    |
|    n_updates       | 8869     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 111      |
|    ep_rew_mean     | -46.6    |
|    success_rate    | 0.0833   |
| time/              |          |
|    episodes        | 84       |
|    fps             | 39       |
|    time_elapsed    | 237      |
|    total_timesteps | 9356     |
| train/             |          |
|    actor_loss      | 2.31     |
|    critic_loss     | 0.029    |
|    ent_coef        | 0.00382  |
|    ent_coef_loss   | 0.825    |
|    learning_rate   | 0.001    |
|    n_updates       | 9255     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 108      |
|    ep_rew_mean     | -45.3    |
|    success_rate    | 0.0795   |
| time/              |          |
|    episodes        | 88       |
|    fps             | 39       |
|    time_elapsed    | 240      |
|    total_timesteps | 9480     |
| train/             |          |
|    actor_loss      | 2.43     |
|    critic_loss     | 0.0116   |
|    ent_coef        | 0.00385  |
|    ent_coef_loss   | 0.0774   |
|    learning_rate   | 0.001    |
|    n_updates       | 9379     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 104      |
|    ep_rew_mean     | -43.8    |
|    success_rate    | 0.087    |
| time/              |          |
|    episodes        | 92       |
|    fps             | 39       |
|    time_elapsed    | 243      |
|    total_timesteps | 9591     |
| train/             |          |
|    actor_loss      | 2.33     |
|    critic_loss     | 0.00752  |
|    ent_coef        | 0.00386  |
|    ent_coef_loss   | 0.113    |
|    learning_rate   | 0.001    |
|    n_updates       | 9490     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 102      |
|    ep_rew_mean     | -42.7    |
|    success_rate    | 0.0938   |
| time/              |          |
|    episodes        | 96       |
|    fps             | 39       |
|    time_elapsed    | 248      |
|    total_timesteps | 9797     |
| train/             |          |
|    actor_loss      | 2.47     |
|    critic_loss     | 0.00632  |
|    ent_coef        | 0.00403  |
|    ent_coef_loss   | 1.94     |
|    learning_rate   | 0.001    |
|    n_updates       | 9696     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 100      |
|    ep_rew_mean     | -41.7    |
|    success_rate    | 0.1      |
| time/              |          |
|    episodes        | 100      |
|    fps             | 39       |
|    time_elapsed    | 254      |
|    total_timesteps | 9998     |
| train/             |          |
|    actor_loss      | 2.48     |
|    critic_loss     | 0.011    |
|    ent_coef        | 0.00407  |
|    ent_coef_loss   | 1.02     |
|    learning_rate   | 0.001    |
|    n_updates       | 9897     |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 104      |
|    ep_rew_mean     | -42.1    |
|    success_rate    | 0.11     |
| time/              |          |
|    episodes        | 104      |
|    fps             | 39       |
|    time_elapsed    | 270      |
|    total_timesteps | 10609    |
| train/             |          |
|    actor_loss      | 2.46     |
|    critic_loss     | 0.00742  |
|    ent_coef        | 0.00429  |
|    ent_coef_loss   | -0.918   |
|    learning_rate   | 0.001    |
|    n_updates       | 10508    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 106      |
|    ep_rew_mean     | -42      |
|    success_rate    | 0.11     |
| time/              |          |
|    episodes        | 108      |
|    fps             | 39       |
|    time_elapsed    | 285      |
|    total_timesteps | 11187    |
| train/             |          |
|    actor_loss      | 2.33     |
|    critic_loss     | 0.00997  |
|    ent_coef        | 0.00417  |
|    ent_coef_loss   | -0.997   |
|    learning_rate   | 0.001    |
|    n_updates       | 11086    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 104      |
|    ep_rew_mean     | -41.1    |
|    success_rate    | 0.12     |
| time/              |          |
|    episodes        | 112      |
|    fps             | 39       |
|    time_elapsed    | 288      |
|    total_timesteps | 11311    |
| train/             |          |
|    actor_loss      | 2.37     |
|    critic_loss     | 0.00956  |
|    ent_coef        | 0.00403  |
|    ent_coef_loss   | 1.04     |
|    learning_rate   | 0.001    |
|    n_updates       | 11210    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 102      |
|    ep_rew_mean     | -39.1    |
|    success_rate    | 0.14     |
| time/              |          |
|    episodes        | 116      |
|    fps             | 39       |
|    time_elapsed    | 295      |
|    total_timesteps | 11571    |
| train/             |          |
|    actor_loss      | 2.32     |
|    critic_loss     | 0.012    |
|    ent_coef        | 0.00408  |
|    ent_coef_loss   | 0.322    |
|    learning_rate   | 0.001    |
|    n_updates       | 11470    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 100      |
|    ep_rew_mean     | -38.3    |
|    success_rate    | 0.15     |
| time/              |          |
|    episodes        | 120      |
|    fps             | 39       |
|    time_elapsed    | 299      |
|    total_timesteps | 11764    |
| train/             |          |
|    actor_loss      | 2.41     |
|    critic_loss     | 0.01     |
|    ent_coef        | 0.00437  |
|    ent_coef_loss   | 0.612    |
|    learning_rate   | 0.001    |
|    n_updates       | 11663    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 92.1     |
|    ep_rew_mean     | -33.5    |
|    success_rate    | 0.17     |
| time/              |          |
|    episodes        | 124      |
|    fps             | 39       |
|    time_elapsed    | 306      |
|    total_timesteps | 12000    |
| train/             |          |
|    actor_loss      | 2.47     |
|    critic_loss     | 0.00965  |
|    ent_coef        | 0.00493  |
|    ent_coef_loss   | 1.47     |
|    learning_rate   | 0.001    |
|    n_updates       | 11899    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 83       |
|    ep_rew_mean     | -30.1    |
|    success_rate    | 0.2      |
| time/              |          |
|    episodes        | 128      |
|    fps             | 39       |
|    time_elapsed    | 310      |
|    total_timesteps | 12143    |
| train/             |          |
|    actor_loss      | 2.28     |
|    critic_loss     | 0.0127   |
|    ent_coef        | 0.00483  |
|    ent_coef_loss   | 1.13     |
|    learning_rate   | 0.001    |
|    n_updates       | 12042    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 84.5     |
|    ep_rew_mean     | -28.2    |
|    success_rate    | 0.21     |
| time/              |          |
|    episodes        | 132      |
|    fps             | 39       |
|    time_elapsed    | 337      |
|    total_timesteps | 13188    |
| train/             |          |
|    actor_loss      | 2.29     |
|    critic_loss     | 0.00581  |
|    ent_coef        | 0.00456  |
|    ent_coef_loss   | 2.1      |
|    learning_rate   | 0.001    |
|    n_updates       | 13087    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 69.2     |
|    ep_rew_mean     | -22.2    |
|    success_rate    | 0.22     |
| time/              |          |
|    episodes        | 136      |
|    fps             | 38       |
|    time_elapsed    | 342      |
|    total_timesteps | 13347    |
| train/             |          |
|    actor_loss      | 2.27     |
|    critic_loss     | 0.0123   |
|    ent_coef        | 0.00442  |
|    ent_coef_loss   | -0.0334  |
|    learning_rate   | 0.001    |
|    n_updates       | 13246    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 69.6     |
|    ep_rew_mean     | -22      |
|    success_rate    | 0.24     |
| time/              |          |
|    episodes        | 140      |
|    fps             | 39       |
|    time_elapsed    | 350      |
|    total_timesteps | 13676    |
| train/             |          |
|    actor_loss      | 2.36     |
|    critic_loss     | 0.00673  |
|    ent_coef        | 0.00463  |
|    ent_coef_loss   | -0.591   |
|    learning_rate   | 0.001    |
|    n_updates       | 13575    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 64.5     |
|    ep_rew_mean     | -20.8    |
|    success_rate    | 0.26     |
| time/              |          |
|    episodes        | 144      |
|    fps             | 38       |
|    time_elapsed    | 362      |
|    total_timesteps | 14112    |
| train/             |          |
|    actor_loss      | 2.21     |
|    critic_loss     | 0.00857  |
|    ent_coef        | 0.0046   |
|    ent_coef_loss   | 0.948    |
|    learning_rate   | 0.001    |
|    n_updates       | 14011    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 63.6     |
|    ep_rew_mean     | -20.3    |
|    success_rate    | 0.29     |
| time/              |          |
|    episodes        | 148      |
|    fps             | 38       |
|    time_elapsed    | 364      |
|    total_timesteps | 14210    |
| train/             |          |
|    actor_loss      | 2.32     |
|    critic_loss     | 0.00597  |
|    ent_coef        | 0.00464  |
|    ent_coef_loss   | -0.155   |
|    learning_rate   | 0.001    |
|    n_updates       | 14109    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 71.7     |
|    ep_rew_mean     | -20.9    |
|    success_rate    | 0.31     |
| time/              |          |
|    episodes        | 152      |
|    fps             | 38       |
|    time_elapsed    | 392      |
|    total_timesteps | 15276    |
| train/             |          |
|    actor_loss      | 2.28     |
|    critic_loss     | 0.0269   |
|    ent_coef        | 0.00442  |
|    ent_coef_loss   | 1.21     |
|    learning_rate   | 0.001    |
|    n_updates       | 15175    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 76.4     |
|    ep_rew_mean     | -21.5    |
|    success_rate    | 0.32     |
| time/              |          |
|    episodes        | 156      |
|    fps             | 38       |
|    time_elapsed    | 407      |
|    total_timesteps | 15845    |
| train/             |          |
|    actor_loss      | 2.2      |
|    critic_loss     | 0.00919  |
|    ent_coef        | 0.00439  |
|    ent_coef_loss   | 1.18     |
|    learning_rate   | 0.001    |
|    n_updates       | 15744    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 81.2     |
|    ep_rew_mean     | -22      |
|    success_rate    | 0.35     |
| time/              |          |
|    episodes        | 160      |
|    fps             | 38       |
|    time_elapsed    | 421      |
|    total_timesteps | 16407    |
| train/             |          |
|    actor_loss      | 2.23     |
|    critic_loss     | 0.0117   |
|    ent_coef        | 0.00433  |
|    ent_coef_loss   | 0.238    |
|    learning_rate   | 0.001    |
|    n_updates       | 16306    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 86.9     |
|    ep_rew_mean     | -22.4    |
|    success_rate    | 0.36     |
| time/              |          |
|    episodes        | 164      |
|    fps             | 38       |
|    time_elapsed    | 439      |
|    total_timesteps | 17100    |
| train/             |          |
|    actor_loss      | 2.17     |
|    critic_loss     | 0.00515  |
|    ent_coef        | 0.0042   |
|    ent_coef_loss   | -0.138   |
|    learning_rate   | 0.001    |
|    n_updates       | 16999    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 87.3     |
|    ep_rew_mean     | -22.5    |
|    success_rate    | 0.38     |
| time/              |          |
|    episodes        | 168      |
|    fps             | 38       |
|    time_elapsed    | 446      |
|    total_timesteps | 17351    |
| train/             |          |
|    actor_loss      | 2.19     |
|    critic_loss     | 0.00544  |
|    ent_coef        | 0.00412  |
|    ent_coef_loss   | 0.00479  |
|    learning_rate   | 0.001    |
|    n_updates       | 17250    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 87       |
|    ep_rew_mean     | -22.2    |
|    success_rate    | 0.4      |
| time/              |          |
|    episodes        | 172      |
|    fps             | 38       |
|    time_elapsed    | 449      |
|    total_timesteps | 17468    |
| train/             |          |
|    actor_loss      | 2.2      |
|    critic_loss     | 0.0227   |
|    ent_coef        | 0.00422  |
|    ent_coef_loss   | -0.168   |
|    learning_rate   | 0.001    |
|    n_updates       | 17367    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 87.4     |
|    ep_rew_mean     | -22.2    |
|    success_rate    | 0.41     |
| time/              |          |
|    episodes        | 176      |
|    fps             | 38       |
|    time_elapsed    | 453      |
|    total_timesteps | 17602    |
| train/             |          |
|    actor_loss      | 2.11     |
|    critic_loss     | 0.00635  |
|    ent_coef        | 0.00421  |
|    ent_coef_loss   | 0.608    |
|    learning_rate   | 0.001    |
|    n_updates       | 17501    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 90       |
|    ep_rew_mean     | -22.3    |
|    success_rate    | 0.43     |
| time/              |          |
|    episodes        | 180      |
|    fps             | 38       |
|    time_elapsed    | 462      |
|    total_timesteps | 17972    |
| train/             |          |
|    actor_loss      | 2.09     |
|    critic_loss     | 0.00875  |
|    ent_coef        | 0.004    |
|    ent_coef_loss   | -0.162   |
|    learning_rate   | 0.001    |
|    n_updates       | 17871    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 95.1     |
|    ep_rew_mean     | -22.9    |
|    success_rate    | 0.44     |
| time/              |          |
|    episodes        | 184      |
|    fps             | 38       |
|    time_elapsed    | 486      |
|    total_timesteps | 18869    |
| train/             |          |
|    actor_loss      | 2.12     |
|    critic_loss     | 0.00643  |
|    ent_coef        | 0.00405  |
|    ent_coef_loss   | 0.437    |
|    learning_rate   | 0.001    |
|    n_updates       | 18768    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 95.2     |
|    ep_rew_mean     | -22.5    |
|    success_rate    | 0.48     |
| time/              |          |
|    episodes        | 188      |
|    fps             | 38       |
|    time_elapsed    | 490      |
|    total_timesteps | 19004    |
| train/             |          |
|    actor_loss      | 1.99     |
|    critic_loss     | 0.00577  |
|    ent_coef        | 0.0041   |
|    ent_coef_loss   | 0.976    |
|    learning_rate   | 0.001    |
|    n_updates       | 18903    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 105      |
|    ep_rew_mean     | -24.2    |
|    success_rate    | 0.49     |
| time/              |          |
|    episodes        | 192      |
|    fps             | 38       |
|    time_elapsed    | 518      |
|    total_timesteps | 20056    |
| train/             |          |
|    actor_loss      | 2.08     |
|    critic_loss     | 0.00887  |
|    ent_coef        | 0.00386  |
|    ent_coef_loss   | 1.58     |
|    learning_rate   | 0.001    |
|    n_updates       | 19955    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 104      |
|    ep_rew_mean     | -23.8    |
|    success_rate    | 0.52     |
| time/              |          |
|    episodes        | 196      |
|    fps             | 38       |
|    time_elapsed    | 520      |
|    total_timesteps | 20164    |
| train/             |          |
|    actor_loss      | 2.12     |
|    critic_loss     | 0.00533  |
|    ent_coef        | 0.00388  |
|    ent_coef_loss   | -0.659   |
|    learning_rate   | 0.001    |
|    n_updates       | 20063    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 103      |
|    ep_rew_mean     | -23.5    |
|    success_rate    | 0.54     |
| time/              |          |
|    episodes        | 200      |
|    fps             | 38       |
|    time_elapsed    | 523      |
|    total_timesteps | 20266    |
| train/             |          |
|    actor_loss      | 1.9      |
|    critic_loss     | 0.00742  |
|    ent_coef        | 0.00381  |
|    ent_coef_loss   | 0.847    |
|    learning_rate   | 0.001    |
|    n_updates       | 20165    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 103      |
|    ep_rew_mean     | -23.4    |
|    success_rate    | 0.55     |
| time/              |          |
|    episodes        | 204      |
|    fps             | 38       |
|    time_elapsed    | 539      |
|    total_timesteps | 20876    |
| train/             |          |
|    actor_loss      | 2        |
|    critic_loss     | 0.016    |
|    ent_coef        | 0.00387  |
|    ent_coef_loss   | 1.08     |
|    learning_rate   | 0.001    |
|    n_updates       | 20775    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 108      |
|    ep_rew_mean     | -23.7    |
|    success_rate    | 0.56     |
| time/              |          |
|    episodes        | 208      |
|    fps             | 38       |
|    time_elapsed    | 568      |
|    total_timesteps | 21937    |
| train/             |          |
|    actor_loss      | 1.93     |
|    critic_loss     | 0.00388  |
|    ent_coef        | 0.00383  |
|    ent_coef_loss   | 0.0767   |
|    learning_rate   | 0.001    |
|    n_updates       | 21836    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 112      |
|    ep_rew_mean     | -24.3    |
|    success_rate    | 0.56     |
| time/              |          |
|    episodes        | 212      |
|    fps             | 38       |
|    time_elapsed    | 583      |
|    total_timesteps | 22505    |
| train/             |          |
|    actor_loss      | 1.89     |
|    critic_loss     | 0.0042   |
|    ent_coef        | 0.00374  |
|    ent_coef_loss   | -1.21    |
|    learning_rate   | 0.001    |
|    n_updates       | 22404    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 110      |
|    ep_rew_mean     | -23.7    |
|    success_rate    | 0.58     |
| time/              |          |
|    episodes        | 216      |
|    fps             | 38       |
|    time_elapsed    | 585      |
|    total_timesteps | 22586    |
| train/             |          |
|    actor_loss      | 1.97     |
|    critic_loss     | 0.00469  |
|    ent_coef        | 0.00373  |
|    ent_coef_loss   | -0.102   |
|    learning_rate   | 0.001    |
|    n_updates       | 22485    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 119      |
|    ep_rew_mean     | -24.9    |
|    success_rate    | 0.59     |
| time/              |          |
|    episodes        | 220      |
|    fps             | 38       |
|    time_elapsed    | 613      |
|    total_timesteps | 23648    |
| train/             |          |
|    actor_loss      | 1.99     |
|    critic_loss     | 0.0095   |
|    ent_coef        | 0.0035   |
|    ent_coef_loss   | 1.85     |
|    learning_rate   | 0.001    |
|    n_updates       | 23547    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 122      |
|    ep_rew_mean     | -25.4    |
|    success_rate    | 0.6      |
| time/              |          |
|    episodes        | 224      |
|    fps             | 38       |
|    time_elapsed    | 629      |
|    total_timesteps | 24213    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00524  |
|    ent_coef        | 0.00341  |
|    ent_coef_loss   | -1.54    |
|    learning_rate   | 0.001    |
|    n_updates       | 24112    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 122      |
|    ep_rew_mean     | -25.2    |
|    success_rate    | 0.61     |
| time/              |          |
|    episodes        | 228      |
|    fps             | 38       |
|    time_elapsed    | 631      |
|    total_timesteps | 24298    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00573  |
|    ent_coef        | 0.00356  |
|    ent_coef_loss   | 0.95     |
|    learning_rate   | 0.001    |
|    n_updates       | 24197    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 120      |
|    ep_rew_mean     | -24.9    |
|    success_rate    | 0.61     |
| time/              |          |
|    episodes        | 232      |
|    fps             | 38       |
|    time_elapsed    | 655      |
|    total_timesteps | 25153    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.0132   |
|    ent_coef        | 0.00327  |
|    ent_coef_loss   | 0.489    |
|    learning_rate   | 0.001    |
|    n_updates       | 25052    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 124      |
|    ep_rew_mean     | -25.4    |
|    success_rate    | 0.62     |
| time/              |          |
|    episodes        | 236      |
|    fps             | 38       |
|    time_elapsed    | 671      |
|    total_timesteps | 25745    |
| train/             |          |
|    actor_loss      | 1.96     |
|    critic_loss     | 0.0218   |
|    ent_coef        | 0.00324  |
|    ent_coef_loss   | 1.01     |
|    learning_rate   | 0.001    |
|    n_updates       | 25644    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 126      |
|    ep_rew_mean     | -25.5    |
|    success_rate    | 0.63     |
| time/              |          |
|    episodes        | 240      |
|    fps             | 38       |
|    time_elapsed    | 685      |
|    total_timesteps | 26243    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00669  |
|    ent_coef        | 0.00309  |
|    ent_coef_loss   | -1.45    |
|    learning_rate   | 0.001    |
|    n_updates       | 26142    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 122      |
|    ep_rew_mean     | -24.8    |
|    success_rate    | 0.64     |
| time/              |          |
|    episodes        | 244      |
|    fps             | 38       |
|    time_elapsed    | 687      |
|    total_timesteps | 26344    |
| train/             |          |
|    actor_loss      | 1.93     |
|    critic_loss     | 0.00473  |
|    ent_coef        | 0.00307  |
|    ent_coef_loss   | -0.653   |
|    learning_rate   | 0.001    |
|    n_updates       | 26243    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 123      |
|    ep_rew_mean     | -24.9    |
|    success_rate    | 0.65     |
| time/              |          |
|    episodes        | 248      |
|    fps             | 38       |
|    time_elapsed    | 692      |
|    total_timesteps | 26490    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00704  |
|    ent_coef        | 0.003    |
|    ent_coef_loss   | -0.213   |
|    learning_rate   | 0.001    |
|    n_updates       | 26389    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 114      |
|    ep_rew_mean     | -23.8    |
|    success_rate    | 0.66     |
| time/              |          |
|    episodes        | 252      |
|    fps             | 38       |
|    time_elapsed    | 697      |
|    total_timesteps | 26712    |
| train/             |          |
|    actor_loss      | 1.88     |
|    critic_loss     | 0.0058   |
|    ent_coef        | 0.00302  |
|    ent_coef_loss   | -1.46    |
|    learning_rate   | 0.001    |
|    n_updates       | 26611    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 110      |
|    ep_rew_mean     | -23.2    |
|    success_rate    | 0.66     |
| time/              |          |
|    episodes        | 256      |
|    fps             | 38       |
|    time_elapsed    | 700      |
|    total_timesteps | 26809    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00312  |
|    ent_coef        | 0.00309  |
|    ent_coef_loss   | -1.72    |
|    learning_rate   | 0.001    |
|    n_updates       | 26708    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 105      |
|    ep_rew_mean     | -22.5    |
|    success_rate    | 0.67     |
| time/              |          |
|    episodes        | 260      |
|    fps             | 38       |
|    time_elapsed    | 703      |
|    total_timesteps | 26907    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00481  |
|    ent_coef        | 0.00311  |
|    ent_coef_loss   | 0.591    |
|    learning_rate   | 0.001    |
|    n_updates       | 26806    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 104      |
|    ep_rew_mean     | -22.6    |
|    success_rate    | 0.68     |
| time/              |          |
|    episodes        | 264      |
|    fps             | 38       |
|    time_elapsed    | 719      |
|    total_timesteps | 27506    |
| train/             |          |
|    actor_loss      | 2.05     |
|    critic_loss     | 0.0108   |
|    ent_coef        | 0.00323  |
|    ent_coef_loss   | 2.96     |
|    learning_rate   | 0.001    |
|    n_updates       | 27405    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 103      |
|    ep_rew_mean     | -22.3    |
|    success_rate    | 0.68     |
| time/              |          |
|    episodes        | 268      |
|    fps             | 38       |
|    time_elapsed    | 721      |
|    total_timesteps | 27611    |
| train/             |          |
|    actor_loss      | 1.9      |
|    critic_loss     | 0.00506  |
|    ent_coef        | 0.00339  |
|    ent_coef_loss   | -0.271   |
|    learning_rate   | 0.001    |
|    n_updates       | 27510    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 103      |
|    ep_rew_mean     | -22.1    |
|    success_rate    | 0.7      |
| time/              |          |
|    episodes        | 272      |
|    fps             | 38       |
|    time_elapsed    | 724      |
|    total_timesteps | 27721    |
| train/             |          |
|    actor_loss      | 1.91     |
|    critic_loss     | 0.00796  |
|    ent_coef        | 0.00334  |
|    ent_coef_loss   | 0.547    |
|    learning_rate   | 0.001    |
|    n_updates       | 27620    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 102      |
|    ep_rew_mean     | -21.9    |
|    success_rate    | 0.72     |
| time/              |          |
|    episodes        | 276      |
|    fps             | 38       |
|    time_elapsed    | 727      |
|    total_timesteps | 27838    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00974  |
|    ent_coef        | 0.00327  |
|    ent_coef_loss   | -1.32    |
|    learning_rate   | 0.001    |
|    n_updates       | 27737    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 99.6     |
|    ep_rew_mean     | -21.6    |
|    success_rate    | 0.73     |
| time/              |          |
|    episodes        | 280      |
|    fps             | 38       |
|    time_elapsed    | 730      |
|    total_timesteps | 27936    |
| train/             |          |
|    actor_loss      | 1.94     |
|    critic_loss     | 0.00491  |
|    ent_coef        | 0.00326  |
|    ent_coef_loss   | -0.13    |
|    learning_rate   | 0.001    |
|    n_updates       | 27835    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 91.4     |
|    ep_rew_mean     | -20.1    |
|    success_rate    | 0.75     |
| time/              |          |
|    episodes        | 284      |
|    fps             | 38       |
|    time_elapsed    | 732      |
|    total_timesteps | 28006    |
| train/             |          |
|    actor_loss      | 1.89     |
|    critic_loss     | 0.00677  |
|    ent_coef        | 0.00343  |
|    ent_coef_loss   | 2.45     |
|    learning_rate   | 0.001    |
|    n_updates       | 27905    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 92.2     |
|    ep_rew_mean     | -20.3    |
|    success_rate    | 0.74     |
| time/              |          |
|    episodes        | 288      |
|    fps             | 38       |
|    time_elapsed    | 738      |
|    total_timesteps | 28224    |
| train/             |          |
|    actor_loss      | 1.93     |
|    critic_loss     | 0.00561  |
|    ent_coef        | 0.0034   |
|    ent_coef_loss   | -0.597   |
|    learning_rate   | 0.001    |
|    n_updates       | 28123    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 87.6     |
|    ep_rew_mean     | -19.5    |
|    success_rate    | 0.75     |
| time/              |          |
|    episodes        | 292      |
|    fps             | 38       |
|    time_elapsed    | 753      |
|    total_timesteps | 28812    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00401  |
|    ent_coef        | 0.00336  |
|    ent_coef_loss   | -0.285   |
|    learning_rate   | 0.001    |
|    n_updates       | 28711    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 87.6     |
|    ep_rew_mean     | -19.5    |
|    success_rate    | 0.75     |
| time/              |          |
|    episodes        | 296      |
|    fps             | 38       |
|    time_elapsed    | 757      |
|    total_timesteps | 28927    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.00468  |
|    ent_coef        | 0.00342  |
|    ent_coef_loss   | -0.373   |
|    learning_rate   | 0.001    |
|    n_updates       | 28826    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 87.6     |
|    ep_rew_mean     | -19.4    |
|    success_rate    | 0.76     |
| time/              |          |
|    episodes        | 300      |
|    fps             | 38       |
|    time_elapsed    | 759      |
|    total_timesteps | 29024    |
| train/             |          |
|    actor_loss      | 1.86     |
|    critic_loss     | 0.00528  |
|    ent_coef        | 0.00339  |
|    ent_coef_loss   | -1.15    |
|    learning_rate   | 0.001    |
|    n_updates       | 28923    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 85.2     |
|    ep_rew_mean     | -18.6    |
|    success_rate    | 0.78     |
| time/              |          |
|    episodes        | 304      |
|    fps             | 38       |
|    time_elapsed    | 769      |
|    total_timesteps | 29392    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00616  |
|    ent_coef        | 0.00336  |
|    ent_coef_loss   | 0.385    |
|    learning_rate   | 0.001    |
|    n_updates       | 29291    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 75.3     |
|    ep_rew_mean     | -17      |
|    success_rate    | 0.81     |
| time/              |          |
|    episodes        | 308      |
|    fps             | 38       |
|    time_elapsed    | 771      |
|    total_timesteps | 29468    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0107   |
|    ent_coef        | 0.00342  |
|    ent_coef_loss   | -0.514   |
|    learning_rate   | 0.001    |
|    n_updates       | 29367    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 70.8     |
|    ep_rew_mean     | -16.1    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 312      |
|    fps             | 38       |
|    time_elapsed    | 774      |
|    total_timesteps | 29581    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.00703  |
|    ent_coef        | 0.00333  |
|    ent_coef_loss   | -0.0534  |
|    learning_rate   | 0.001    |
|    n_updates       | 29480    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 76.6     |
|    ep_rew_mean     | -17.2    |
|    success_rate    | 0.83     |
| time/              |          |
|    episodes        | 316      |
|    fps             | 38       |
|    time_elapsed    | 791      |
|    total_timesteps | 30246    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.0063   |
|    ent_coef        | 0.00323  |
|    ent_coef_loss   | 0.597    |
|    learning_rate   | 0.001    |
|    n_updates       | 30145    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 66.8     |
|    ep_rew_mean     | -15.5    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 320      |
|    fps             | 38       |
|    time_elapsed    | 793      |
|    total_timesteps | 30323    |
| train/             |          |
|    actor_loss      | 1.97     |
|    critic_loss     | 0.00637  |
|    ent_coef        | 0.00325  |
|    ent_coef_loss   | -0.213   |
|    learning_rate   | 0.001    |
|    n_updates       | 30222    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 66.8     |
|    ep_rew_mean     | -15.6    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 324      |
|    fps             | 38       |
|    time_elapsed    | 808      |
|    total_timesteps | 30895    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.00639  |
|    ent_coef        | 0.00342  |
|    ent_coef_loss   | -0.0428  |
|    learning_rate   | 0.001    |
|    n_updates       | 30794    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 68.7     |
|    ep_rew_mean     | -15.8    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 328      |
|    fps             | 38       |
|    time_elapsed    | 815      |
|    total_timesteps | 31164    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00475  |
|    ent_coef        | 0.0034   |
|    ent_coef_loss   | -0.531   |
|    learning_rate   | 0.001    |
|    n_updates       | 31063    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 61.3     |
|    ep_rew_mean     | -14.4    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 332      |
|    fps             | 38       |
|    time_elapsed    | 819      |
|    total_timesteps | 31281    |
| train/             |          |
|    actor_loss      | 2.04     |
|    critic_loss     | 0.00412  |
|    ent_coef        | 0.00334  |
|    ent_coef_loss   | 0.579    |
|    learning_rate   | 0.001    |
|    n_updates       | 31180    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 56.6     |
|    ep_rew_mean     | -13.7    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 336      |
|    fps             | 38       |
|    time_elapsed    | 822      |
|    total_timesteps | 31410    |
| train/             |          |
|    actor_loss      | 1.89     |
|    critic_loss     | 0.00758  |
|    ent_coef        | 0.00341  |
|    ent_coef_loss   | -1.23    |
|    learning_rate   | 0.001    |
|    n_updates       | 31309    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 52.5     |
|    ep_rew_mean     | -13.2    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 340      |
|    fps             | 38       |
|    time_elapsed    | 824      |
|    total_timesteps | 31490    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00797  |
|    ent_coef        | 0.00336  |
|    ent_coef_loss   | 0.0363   |
|    learning_rate   | 0.001    |
|    n_updates       | 31389    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 52.6     |
|    ep_rew_mean     | -13.2    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 344      |
|    fps             | 38       |
|    time_elapsed    | 827      |
|    total_timesteps | 31600    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00808  |
|    ent_coef        | 0.00332  |
|    ent_coef_loss   | -0.399   |
|    learning_rate   | 0.001    |
|    n_updates       | 31499    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 51.8     |
|    ep_rew_mean     | -13      |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 348      |
|    fps             | 38       |
|    time_elapsed    | 830      |
|    total_timesteps | 31672    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.00603  |
|    ent_coef        | 0.00335  |
|    ent_coef_loss   | 2.62     |
|    learning_rate   | 0.001    |
|    n_updates       | 31571    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 50.5     |
|    ep_rew_mean     | -12.6    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 352      |
|    fps             | 38       |
|    time_elapsed    | 832      |
|    total_timesteps | 31767    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00632  |
|    ent_coef        | 0.00341  |
|    ent_coef_loss   | -0.582   |
|    learning_rate   | 0.001    |
|    n_updates       | 31666    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 51       |
|    ep_rew_mean     | -12.5    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 356      |
|    fps             | 38       |
|    time_elapsed    | 835      |
|    total_timesteps | 31905    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00598  |
|    ent_coef        | 0.00349  |
|    ent_coef_loss   | 0.0348   |
|    learning_rate   | 0.001    |
|    n_updates       | 31804    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 51       |
|    ep_rew_mean     | -12.6    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 360      |
|    fps             | 38       |
|    time_elapsed    | 838      |
|    total_timesteps | 32011    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.00593  |
|    ent_coef        | 0.00336  |
|    ent_coef_loss   | -0.88    |
|    learning_rate   | 0.001    |
|    n_updates       | 31910    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 46.5     |
|    ep_rew_mean     | -11.9    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 364      |
|    fps             | 38       |
|    time_elapsed    | 843      |
|    total_timesteps | 32153    |
| train/             |          |
|    actor_loss      | 1.89     |
|    critic_loss     | 0.00811  |
|    ent_coef        | 0.0034   |
|    ent_coef_loss   | 2.8      |
|    learning_rate   | 0.001    |
|    n_updates       | 32052    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 47       |
|    ep_rew_mean     | -12      |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 368      |
|    fps             | 38       |
|    time_elapsed    | 847      |
|    total_timesteps | 32310    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.0104   |
|    ent_coef        | 0.00347  |
|    ent_coef_loss   | -0.0533  |
|    learning_rate   | 0.001    |
|    n_updates       | 32209    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 46.7     |
|    ep_rew_mean     | -11.9    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 372      |
|    fps             | 38       |
|    time_elapsed    | 849      |
|    total_timesteps | 32387    |
| train/             |          |
|    actor_loss      | 1.9      |
|    critic_loss     | 0.00763  |
|    ent_coef        | 0.00357  |
|    ent_coef_loss   | -0.629   |
|    learning_rate   | 0.001    |
|    n_updates       | 32286    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 46.3     |
|    ep_rew_mean     | -11.9    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 376      |
|    fps             | 38       |
|    time_elapsed    | 851      |
|    total_timesteps | 32471    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.017    |
|    ent_coef        | 0.00364  |
|    ent_coef_loss   | -0.0115  |
|    learning_rate   | 0.001    |
|    n_updates       | 32370    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 46.3     |
|    ep_rew_mean     | -11.9    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 380      |
|    fps             | 38       |
|    time_elapsed    | 854      |
|    total_timesteps | 32568    |
| train/             |          |
|    actor_loss      | 1.88     |
|    critic_loss     | 0.0105   |
|    ent_coef        | 0.00365  |
|    ent_coef_loss   | 0.344    |
|    learning_rate   | 0.001    |
|    n_updates       | 32467    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 46.9     |
|    ep_rew_mean     | -12.1    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 384      |
|    fps             | 38       |
|    time_elapsed    | 857      |
|    total_timesteps | 32696    |
| train/             |          |
|    actor_loss      | 1.88     |
|    critic_loss     | 0.0128   |
|    ent_coef        | 0.00354  |
|    ent_coef_loss   | 0.667    |
|    learning_rate   | 0.001    |
|    n_updates       | 32595    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 45.4     |
|    ep_rew_mean     | -11.8    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 388      |
|    fps             | 38       |
|    time_elapsed    | 859      |
|    total_timesteps | 32766    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.00824  |
|    ent_coef        | 0.00366  |
|    ent_coef_loss   | -0.173   |
|    learning_rate   | 0.001    |
|    n_updates       | 32665    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 43.5     |
|    ep_rew_mean     | -11.4    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 392      |
|    fps             | 38       |
|    time_elapsed    | 870      |
|    total_timesteps | 33165    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00558  |
|    ent_coef        | 0.00347  |
|    ent_coef_loss   | -1.19    |
|    learning_rate   | 0.001    |
|    n_updates       | 33064    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 43.4     |
|    ep_rew_mean     | -11.4    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 396      |
|    fps             | 38       |
|    time_elapsed    | 873      |
|    total_timesteps | 33271    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00774  |
|    ent_coef        | 0.0035   |
|    ent_coef_loss   | 0.2      |
|    learning_rate   | 0.001    |
|    n_updates       | 33170    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 43.4     |
|    ep_rew_mean     | -11.5    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 400      |
|    fps             | 38       |
|    time_elapsed    | 875      |
|    total_timesteps | 33360    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0106   |
|    ent_coef        | 0.0035   |
|    ent_coef_loss   | 0.421    |
|    learning_rate   | 0.001    |
|    n_updates       | 33259    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 40.7     |
|    ep_rew_mean     | -11      |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 404      |
|    fps             | 38       |
|    time_elapsed    | 877      |
|    total_timesteps | 33460    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.00789  |
|    ent_coef        | 0.00354  |
|    ent_coef_loss   | -0.201   |
|    learning_rate   | 0.001    |
|    n_updates       | 33359    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 41       |
|    ep_rew_mean     | -11.3    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 408      |
|    fps             | 38       |
|    time_elapsed    | 881      |
|    total_timesteps | 33566    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00683  |
|    ent_coef        | 0.00356  |
|    ent_coef_loss   | 0.0393   |
|    learning_rate   | 0.001    |
|    n_updates       | 33465    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 42.6     |
|    ep_rew_mean     | -11.4    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 412      |
|    fps             | 38       |
|    time_elapsed    | 888      |
|    total_timesteps | 33843    |
| train/             |          |
|    actor_loss      | 1.9      |
|    critic_loss     | 0.0124   |
|    ent_coef        | 0.00352  |
|    ent_coef_loss   | 0.606    |
|    learning_rate   | 0.001    |
|    n_updates       | 33742    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 37       |
|    ep_rew_mean     | -10.3    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 416      |
|    fps             | 38       |
|    time_elapsed    | 891      |
|    total_timesteps | 33949    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0426   |
|    ent_coef        | 0.00348  |
|    ent_coef_loss   | -1.25    |
|    learning_rate   | 0.001    |
|    n_updates       | 33848    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 37.2     |
|    ep_rew_mean     | -10.5    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 420      |
|    fps             | 38       |
|    time_elapsed    | 894      |
|    total_timesteps | 34041    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.00555  |
|    ent_coef        | 0.00333  |
|    ent_coef_loss   | -0.84    |
|    learning_rate   | 0.001    |
|    n_updates       | 33940    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 32.8     |
|    ep_rew_mean     | -9.73    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 424      |
|    fps             | 38       |
|    time_elapsed    | 897      |
|    total_timesteps | 34170    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.00476  |
|    ent_coef        | 0.00341  |
|    ent_coef_loss   | -1.21    |
|    learning_rate   | 0.001    |
|    n_updates       | 34069    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 33.2     |
|    ep_rew_mean     | -9.75    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 428      |
|    fps             | 38       |
|    time_elapsed    | 906      |
|    total_timesteps | 34482    |
| train/             |          |
|    actor_loss      | 1.9      |
|    critic_loss     | 0.00697  |
|    ent_coef        | 0.00348  |
|    ent_coef_loss   | 3.12     |
|    learning_rate   | 0.001    |
|    n_updates       | 34381    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 33       |
|    ep_rew_mean     | -9.63    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 432      |
|    fps             | 38       |
|    time_elapsed    | 909      |
|    total_timesteps | 34579    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.0154   |
|    ent_coef        | 0.00361  |
|    ent_coef_loss   | -1.76    |
|    learning_rate   | 0.001    |
|    n_updates       | 34478    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 32.5     |
|    ep_rew_mean     | -9.43    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 436      |
|    fps             | 38       |
|    time_elapsed    | 911      |
|    total_timesteps | 34663    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.00562  |
|    ent_coef        | 0.00361  |
|    ent_coef_loss   | -1       |
|    learning_rate   | 0.001    |
|    n_updates       | 34562    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 33       |
|    ep_rew_mean     | -9.65    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 440      |
|    fps             | 38       |
|    time_elapsed    | 914      |
|    total_timesteps | 34787    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.0137   |
|    ent_coef        | 0.00351  |
|    ent_coef_loss   | -1.82    |
|    learning_rate   | 0.001    |
|    n_updates       | 34686    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 32.8     |
|    ep_rew_mean     | -9.59    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 444      |
|    fps             | 38       |
|    time_elapsed    | 917      |
|    total_timesteps | 34880    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00814  |
|    ent_coef        | 0.00352  |
|    ent_coef_loss   | -0.652   |
|    learning_rate   | 0.001    |
|    n_updates       | 34779    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 33.1     |
|    ep_rew_mean     | -9.78    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 448      |
|    fps             | 38       |
|    time_elapsed    | 920      |
|    total_timesteps | 34982    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.0194   |
|    ent_coef        | 0.00354  |
|    ent_coef_loss   | -0.469   |
|    learning_rate   | 0.001    |
|    n_updates       | 34881    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 33.1     |
|    ep_rew_mean     | -9.88    |
|    success_rate    | 0.83     |
| time/              |          |
|    episodes        | 452      |
|    fps             | 38       |
|    time_elapsed    | 922      |
|    total_timesteps | 35080    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00693  |
|    ent_coef        | 0.00355  |
|    ent_coef_loss   | -0.0075  |
|    learning_rate   | 0.001    |
|    n_updates       | 34979    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 33       |
|    ep_rew_mean     | -10      |
|    success_rate    | 0.83     |
| time/              |          |
|    episodes        | 456      |
|    fps             | 38       |
|    time_elapsed    | 925      |
|    total_timesteps | 35202    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0135   |
|    ent_coef        | 0.00364  |
|    ent_coef_loss   | 0.196    |
|    learning_rate   | 0.001    |
|    n_updates       | 35101    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 32.7     |
|    ep_rew_mean     | -9.93    |
|    success_rate    | 0.82     |
| time/              |          |
|    episodes        | 460      |
|    fps             | 38       |
|    time_elapsed    | 927      |
|    total_timesteps | 35277    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00577  |
|    ent_coef        | 0.00361  |
|    ent_coef_loss   | -1.21    |
|    learning_rate   | 0.001    |
|    n_updates       | 35176    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 32       |
|    ep_rew_mean     | -9.72    |
|    success_rate    | 0.82     |
| time/              |          |
|    episodes        | 464      |
|    fps             | 37       |
|    time_elapsed    | 930      |
|    total_timesteps | 35351    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.00527  |
|    ent_coef        | 0.00348  |
|    ent_coef_loss   | -0.928   |
|    learning_rate   | 0.001    |
|    n_updates       | 35250    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 31.4     |
|    ep_rew_mean     | -9.66    |
|    success_rate    | 0.82     |
| time/              |          |
|    episodes        | 468      |
|    fps             | 37       |
|    time_elapsed    | 933      |
|    total_timesteps | 35450    |
| train/             |          |
|    actor_loss      | 1.88     |
|    critic_loss     | 0.0153   |
|    ent_coef        | 0.00357  |
|    ent_coef_loss   | 1.64     |
|    learning_rate   | 0.001    |
|    n_updates       | 35349    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 31.6     |
|    ep_rew_mean     | -9.77    |
|    success_rate    | 0.81     |
| time/              |          |
|    episodes        | 472      |
|    fps             | 38       |
|    time_elapsed    | 935      |
|    total_timesteps | 35544    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.00871  |
|    ent_coef        | 0.00368  |
|    ent_coef_loss   | -0.769   |
|    learning_rate   | 0.001    |
|    n_updates       | 35443    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 31.7     |
|    ep_rew_mean     | -9.74    |
|    success_rate    | 0.81     |
| time/              |          |
|    episodes        | 476      |
|    fps             | 38       |
|    time_elapsed    | 937      |
|    total_timesteps | 35642    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.00902  |
|    ent_coef        | 0.0037   |
|    ent_coef_loss   | 0.701    |
|    learning_rate   | 0.001    |
|    n_updates       | 35541    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 36.5     |
|    ep_rew_mean     | -10.5    |
|    success_rate    | 0.81     |
| time/              |          |
|    episodes        | 480      |
|    fps             | 37       |
|    time_elapsed    | 954      |
|    total_timesteps | 36213    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.0062   |
|    ent_coef        | 0.00374  |
|    ent_coef_loss   | -1.09    |
|    learning_rate   | 0.001    |
|    n_updates       | 36112    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 36.4     |
|    ep_rew_mean     | -10.5    |
|    success_rate    | 0.81     |
| time/              |          |
|    episodes        | 484      |
|    fps             | 37       |
|    time_elapsed    | 958      |
|    total_timesteps | 36338    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.007    |
|    ent_coef        | 0.00357  |
|    ent_coef_loss   | -1.81    |
|    learning_rate   | 0.001    |
|    n_updates       | 36237    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 36.6     |
|    ep_rew_mean     | -10.6    |
|    success_rate    | 0.8      |
| time/              |          |
|    episodes        | 488      |
|    fps             | 37       |
|    time_elapsed    | 960      |
|    total_timesteps | 36426    |
| train/             |          |
|    actor_loss      | 1.88     |
|    critic_loss     | 0.00662  |
|    ent_coef        | 0.00335  |
|    ent_coef_loss   | 0.612    |
|    learning_rate   | 0.001    |
|    n_updates       | 36325    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 33.7     |
|    ep_rew_mean     | -10.2    |
|    success_rate    | 0.8      |
| time/              |          |
|    episodes        | 492      |
|    fps             | 37       |
|    time_elapsed    | 963      |
|    total_timesteps | 36533    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.00936  |
|    ent_coef        | 0.00343  |
|    ent_coef_loss   | 1.32     |
|    learning_rate   | 0.001    |
|    n_updates       | 36432    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 35.4     |
|    ep_rew_mean     | -10.3    |
|    success_rate    | 0.8      |
| time/              |          |
|    episodes        | 496      |
|    fps             | 37       |
|    time_elapsed    | 971      |
|    total_timesteps | 36810    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0111   |
|    ent_coef        | 0.00356  |
|    ent_coef_loss   | 1.39     |
|    learning_rate   | 0.001    |
|    n_updates       | 36709    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 35.7     |
|    ep_rew_mean     | -10.4    |
|    success_rate    | 0.8      |
| time/              |          |
|    episodes        | 500      |
|    fps             | 37       |
|    time_elapsed    | 974      |
|    total_timesteps | 36934    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00497  |
|    ent_coef        | 0.00362  |
|    ent_coef_loss   | 1.94     |
|    learning_rate   | 0.001    |
|    n_updates       | 36833    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 35.8     |
|    ep_rew_mean     | -10.3    |
|    success_rate    | 0.81     |
| time/              |          |
|    episodes        | 504      |
|    fps             | 37       |
|    time_elapsed    | 977      |
|    total_timesteps | 37039    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.0109   |
|    ent_coef        | 0.00366  |
|    ent_coef_loss   | -1.19    |
|    learning_rate   | 0.001    |
|    n_updates       | 36938    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 35.9     |
|    ep_rew_mean     | -10.3    |
|    success_rate    | 0.82     |
| time/              |          |
|    episodes        | 508      |
|    fps             | 37       |
|    time_elapsed    | 980      |
|    total_timesteps | 37158    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.0157   |
|    ent_coef        | 0.00371  |
|    ent_coef_loss   | 0.966    |
|    learning_rate   | 0.001    |
|    n_updates       | 37057    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 34.2     |
|    ep_rew_mean     | -10.1    |
|    success_rate    | 0.82     |
| time/              |          |
|    episodes        | 512      |
|    fps             | 37       |
|    time_elapsed    | 983      |
|    total_timesteps | 37262    |
| train/             |          |
|    actor_loss      | 1.9      |
|    critic_loss     | 0.0122   |
|    ent_coef        | 0.00373  |
|    ent_coef_loss   | -0.263   |
|    learning_rate   | 0.001    |
|    n_updates       | 37161    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 34.1     |
|    ep_rew_mean     | -10.1    |
|    success_rate    | 0.82     |
| time/              |          |
|    episodes        | 516      |
|    fps             | 37       |
|    time_elapsed    | 986      |
|    total_timesteps | 37360    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.0205   |
|    ent_coef        | 0.00373  |
|    ent_coef_loss   | 0.836    |
|    learning_rate   | 0.001    |
|    n_updates       | 37259    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 34.4     |
|    ep_rew_mean     | -10.1    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 520      |
|    fps             | 37       |
|    time_elapsed    | 989      |
|    total_timesteps | 37482    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.00687  |
|    ent_coef        | 0.00358  |
|    ent_coef_loss   | -1.54    |
|    learning_rate   | 0.001    |
|    n_updates       | 37381    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 34.3     |
|    ep_rew_mean     | -10.1    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 524      |
|    fps             | 37       |
|    time_elapsed    | 992      |
|    total_timesteps | 37599    |
| train/             |          |
|    actor_loss      | 1.86     |
|    critic_loss     | 0.0223   |
|    ent_coef        | 0.0036   |
|    ent_coef_loss   | 0.416    |
|    learning_rate   | 0.001    |
|    n_updates       | 37498    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 32.2     |
|    ep_rew_mean     | -9.85    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 528      |
|    fps             | 37       |
|    time_elapsed    | 995      |
|    total_timesteps | 37700    |
| train/             |          |
|    actor_loss      | 1.88     |
|    critic_loss     | 0.0102   |
|    ent_coef        | 0.00355  |
|    ent_coef_loss   | 1.94     |
|    learning_rate   | 0.001    |
|    n_updates       | 37599    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 32.1     |
|    ep_rew_mean     | -9.83    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 532      |
|    fps             | 37       |
|    time_elapsed    | 998      |
|    total_timesteps | 37790    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00592  |
|    ent_coef        | 0.00376  |
|    ent_coef_loss   | 1.19     |
|    learning_rate   | 0.001    |
|    n_updates       | 37689    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 32.6     |
|    ep_rew_mean     | -9.94    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 536      |
|    fps             | 37       |
|    time_elapsed    | 1001     |
|    total_timesteps | 37925    |
| train/             |          |
|    actor_loss      | 1.9      |
|    critic_loss     | 0.00793  |
|    ent_coef        | 0.00385  |
|    ent_coef_loss   | -0.358   |
|    learning_rate   | 0.001    |
|    n_updates       | 37824    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 32.7     |
|    ep_rew_mean     | -9.76    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 540      |
|    fps             | 37       |
|    time_elapsed    | 1005     |
|    total_timesteps | 38060    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.011    |
|    ent_coef        | 0.00382  |
|    ent_coef_loss   | -0.742   |
|    learning_rate   | 0.001    |
|    n_updates       | 37959    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 32.8     |
|    ep_rew_mean     | -9.83    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 544      |
|    fps             | 37       |
|    time_elapsed    | 1008     |
|    total_timesteps | 38156    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00491  |
|    ent_coef        | 0.00386  |
|    ent_coef_loss   | 1.7      |
|    learning_rate   | 0.001    |
|    n_updates       | 38055    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 38.2     |
|    ep_rew_mean     | -10.6    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 548      |
|    fps             | 37       |
|    time_elapsed    | 1026     |
|    total_timesteps | 38804    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.0102   |
|    ent_coef        | 0.00356  |
|    ent_coef_loss   | 1.4      |
|    learning_rate   | 0.001    |
|    n_updates       | 38703    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 38.1     |
|    ep_rew_mean     | -10.5    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 552      |
|    fps             | 37       |
|    time_elapsed    | 1028     |
|    total_timesteps | 38892    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.0163   |
|    ent_coef        | 0.00371  |
|    ent_coef_loss   | 1.33     |
|    learning_rate   | 0.001    |
|    n_updates       | 38791    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 38.4     |
|    ep_rew_mean     | -10.5    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 556      |
|    fps             | 37       |
|    time_elapsed    | 1033     |
|    total_timesteps | 39044    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.00451  |
|    ent_coef        | 0.00383  |
|    ent_coef_loss   | 0.6      |
|    learning_rate   | 0.001    |
|    n_updates       | 38943    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 38.7     |
|    ep_rew_mean     | -10.5    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 560      |
|    fps             | 37       |
|    time_elapsed    | 1036     |
|    total_timesteps | 39143    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00545  |
|    ent_coef        | 0.00394  |
|    ent_coef_loss   | -0.236   |
|    learning_rate   | 0.001    |
|    n_updates       | 39042    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 39.1     |
|    ep_rew_mean     | -10.5    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 564      |
|    fps             | 37       |
|    time_elapsed    | 1039     |
|    total_timesteps | 39263    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.0143   |
|    ent_coef        | 0.0039   |
|    ent_coef_loss   | -1.16    |
|    learning_rate   | 0.001    |
|    n_updates       | 39162    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 39.1     |
|    ep_rew_mean     | -10.5    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 568      |
|    fps             | 37       |
|    time_elapsed    | 1041     |
|    total_timesteps | 39357    |
| train/             |          |
|    actor_loss      | 1.86     |
|    critic_loss     | 0.00471  |
|    ent_coef        | 0.00374  |
|    ent_coef_loss   | -0.395   |
|    learning_rate   | 0.001    |
|    n_updates       | 39256    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 38.8     |
|    ep_rew_mean     | -10.4    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 572      |
|    fps             | 37       |
|    time_elapsed    | 1044     |
|    total_timesteps | 39427    |
| train/             |          |
|    actor_loss      | 1.88     |
|    critic_loss     | 0.0139   |
|    ent_coef        | 0.00372  |
|    ent_coef_loss   | 0.916    |
|    learning_rate   | 0.001    |
|    n_updates       | 39326    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 38.9     |
|    ep_rew_mean     | -10.4    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 576      |
|    fps             | 37       |
|    time_elapsed    | 1046     |
|    total_timesteps | 39527    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.00594  |
|    ent_coef        | 0.0037   |
|    ent_coef_loss   | -1.18    |
|    learning_rate   | 0.001    |
|    n_updates       | 39426    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 33.8     |
|    ep_rew_mean     | -9.5     |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 580      |
|    fps             | 37       |
|    time_elapsed    | 1048     |
|    total_timesteps | 39591    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00582  |
|    ent_coef        | 0.00367  |
|    ent_coef_loss   | -0.638   |
|    learning_rate   | 0.001    |
|    n_updates       | 39490    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 33.3     |
|    ep_rew_mean     | -9.39    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 584      |
|    fps             | 37       |
|    time_elapsed    | 1050     |
|    total_timesteps | 39667    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0103   |
|    ent_coef        | 0.00375  |
|    ent_coef_loss   | -1.77    |
|    learning_rate   | 0.001    |
|    n_updates       | 39566    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 33.2     |
|    ep_rew_mean     | -9.4     |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 588      |
|    fps             | 37       |
|    time_elapsed    | 1052     |
|    total_timesteps | 39748    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.0109   |
|    ent_coef        | 0.00362  |
|    ent_coef_loss   | -0.0623  |
|    learning_rate   | 0.001    |
|    n_updates       | 39647    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 33       |
|    ep_rew_mean     | -9.23    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 592      |
|    fps             | 37       |
|    time_elapsed    | 1054     |
|    total_timesteps | 39838    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.00929  |
|    ent_coef        | 0.00367  |
|    ent_coef_loss   | 0.746    |
|    learning_rate   | 0.001    |
|    n_updates       | 39737    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 31.1     |
|    ep_rew_mean     | -9.02    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 596      |
|    fps             | 37       |
|    time_elapsed    | 1057     |
|    total_timesteps | 39918    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00962  |
|    ent_coef        | 0.00377  |
|    ent_coef_loss   | 0.629    |
|    learning_rate   | 0.001    |
|    n_updates       | 39817    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 30.6     |
|    ep_rew_mean     | -8.87    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 600      |
|    fps             | 37       |
|    time_elapsed    | 1059     |
|    total_timesteps | 39990    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00758  |
|    ent_coef        | 0.00372  |
|    ent_coef_loss   | -0.222   |
|    learning_rate   | 0.001    |
|    n_updates       | 39889    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 30.7     |
|    ep_rew_mean     | -8.97    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 604      |
|    fps             | 37       |
|    time_elapsed    | 1062     |
|    total_timesteps | 40113    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.0101   |
|    ent_coef        | 0.00381  |
|    ent_coef_loss   | 0.986    |
|    learning_rate   | 0.001    |
|    n_updates       | 40012    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 30.5     |
|    ep_rew_mean     | -8.93    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 608      |
|    fps             | 37       |
|    time_elapsed    | 1065     |
|    total_timesteps | 40212    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.0155   |
|    ent_coef        | 0.00372  |
|    ent_coef_loss   | -0.193   |
|    learning_rate   | 0.001    |
|    n_updates       | 40111    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 30.4     |
|    ep_rew_mean     | -8.87    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 612      |
|    fps             | 37       |
|    time_elapsed    | 1068     |
|    total_timesteps | 40302    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00867  |
|    ent_coef        | 0.00381  |
|    ent_coef_loss   | 1.73     |
|    learning_rate   | 0.001    |
|    n_updates       | 40201    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 30.3     |
|    ep_rew_mean     | -8.84    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 616      |
|    fps             | 37       |
|    time_elapsed    | 1071     |
|    total_timesteps | 40391    |
| train/             |          |
|    actor_loss      | 1.89     |
|    critic_loss     | 0.00715  |
|    ent_coef        | 0.0039   |
|    ent_coef_loss   | 2.1      |
|    learning_rate   | 0.001    |
|    n_updates       | 40290    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 30       |
|    ep_rew_mean     | -8.79    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 620      |
|    fps             | 37       |
|    time_elapsed    | 1073     |
|    total_timesteps | 40483    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.00769  |
|    ent_coef        | 0.00394  |
|    ent_coef_loss   | -0.978   |
|    learning_rate   | 0.001    |
|    n_updates       | 40382    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 29.5     |
|    ep_rew_mean     | -8.61    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 624      |
|    fps             | 37       |
|    time_elapsed    | 1075     |
|    total_timesteps | 40549    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.0103   |
|    ent_coef        | 0.0039   |
|    ent_coef_loss   | -1.07    |
|    learning_rate   | 0.001    |
|    n_updates       | 40448    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 29.9     |
|    ep_rew_mean     | -8.68    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 628      |
|    fps             | 37       |
|    time_elapsed    | 1078     |
|    total_timesteps | 40690    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.0309   |
|    ent_coef        | 0.00385  |
|    ent_coef_loss   | -0.0331  |
|    learning_rate   | 0.001    |
|    n_updates       | 40589    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 29.8     |
|    ep_rew_mean     | -8.6     |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 632      |
|    fps             | 37       |
|    time_elapsed    | 1081     |
|    total_timesteps | 40770    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.0177   |
|    ent_coef        | 0.00398  |
|    ent_coef_loss   | 0.33     |
|    learning_rate   | 0.001    |
|    n_updates       | 40669    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 30       |
|    ep_rew_mean     | -8.57    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 636      |
|    fps             | 37       |
|    time_elapsed    | 1085     |
|    total_timesteps | 40927    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.00983  |
|    ent_coef        | 0.00385  |
|    ent_coef_loss   | -1.13    |
|    learning_rate   | 0.001    |
|    n_updates       | 40826    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 29.5     |
|    ep_rew_mean     | -8.58    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 640      |
|    fps             | 37       |
|    time_elapsed    | 1087     |
|    total_timesteps | 41013    |
| train/             |          |
|    actor_loss      | 1.93     |
|    critic_loss     | 0.0246   |
|    ent_coef        | 0.00391  |
|    ent_coef_loss   | 0.991    |
|    learning_rate   | 0.001    |
|    n_updates       | 40912    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 29.5     |
|    ep_rew_mean     | -8.53    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 644      |
|    fps             | 37       |
|    time_elapsed    | 1090     |
|    total_timesteps | 41106    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.00729  |
|    ent_coef        | 0.00398  |
|    ent_coef_loss   | -0.0864  |
|    learning_rate   | 0.001    |
|    n_updates       | 41005    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.7     |
|    ep_rew_mean     | -7.57    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 648      |
|    fps             | 37       |
|    time_elapsed    | 1092     |
|    total_timesteps | 41177    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.00593  |
|    ent_coef        | 0.00392  |
|    ent_coef_loss   | -1.5     |
|    learning_rate   | 0.001    |
|    n_updates       | 41076    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.9     |
|    ep_rew_mean     | -7.52    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 652      |
|    fps             | 37       |
|    time_elapsed    | 1096     |
|    total_timesteps | 41285    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.00981  |
|    ent_coef        | 0.00383  |
|    ent_coef_loss   | -1.12    |
|    learning_rate   | 0.001    |
|    n_updates       | 41184    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.5     |
|    ep_rew_mean     | -7.49    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 656      |
|    fps             | 37       |
|    time_elapsed    | 1099     |
|    total_timesteps | 41397    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.0386   |
|    ent_coef        | 0.00383  |
|    ent_coef_loss   | -1.39    |
|    learning_rate   | 0.001    |
|    n_updates       | 41296    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.6     |
|    ep_rew_mean     | -7.55    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 660      |
|    fps             | 37       |
|    time_elapsed    | 1101     |
|    total_timesteps | 41503    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00785  |
|    ent_coef        | 0.00387  |
|    ent_coef_loss   | -1.5     |
|    learning_rate   | 0.001    |
|    n_updates       | 41402    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.9     |
|    ep_rew_mean     | -7.63    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 664      |
|    fps             | 37       |
|    time_elapsed    | 1105     |
|    total_timesteps | 41648    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.00908  |
|    ent_coef        | 0.00378  |
|    ent_coef_loss   | -0.438   |
|    learning_rate   | 0.001    |
|    n_updates       | 41547    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24       |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 668      |
|    fps             | 37       |
|    time_elapsed    | 1109     |
|    total_timesteps | 41759    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.0144   |
|    ent_coef        | 0.00388  |
|    ent_coef_loss   | -0.0991  |
|    learning_rate   | 0.001    |
|    n_updates       | 41658    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.6     |
|    ep_rew_mean     | -7.87    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 672      |
|    fps             | 37       |
|    time_elapsed    | 1112     |
|    total_timesteps | 41885    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00916  |
|    ent_coef        | 0.00379  |
|    ent_coef_loss   | -1.22    |
|    learning_rate   | 0.001    |
|    n_updates       | 41784    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.5     |
|    ep_rew_mean     | -7.9     |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 676      |
|    fps             | 37       |
|    time_elapsed    | 1114     |
|    total_timesteps | 41973    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00549  |
|    ent_coef        | 0.00374  |
|    ent_coef_loss   | -1.26    |
|    learning_rate   | 0.001    |
|    n_updates       | 41872    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.8     |
|    ep_rew_mean     | -8.03    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 680      |
|    fps             | 37       |
|    time_elapsed    | 1117     |
|    total_timesteps | 42075    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0116   |
|    ent_coef        | 0.00373  |
|    ent_coef_loss   | -1.12    |
|    learning_rate   | 0.001    |
|    n_updates       | 41974    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25       |
|    ep_rew_mean     | -8.04    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 684      |
|    fps             | 37       |
|    time_elapsed    | 1120     |
|    total_timesteps | 42171    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.00805  |
|    ent_coef        | 0.00381  |
|    ent_coef_loss   | 1.44     |
|    learning_rate   | 0.001    |
|    n_updates       | 42070    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -8.02    |
|    success_rate    | 0.99     |
| time/              |          |
|    episodes        | 688      |
|    fps             | 37       |
|    time_elapsed    | 1123     |
|    total_timesteps | 42256    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0107   |
|    ent_coef        | 0.00385  |
|    ent_coef_loss   | -0.394   |
|    learning_rate   | 0.001    |
|    n_updates       | 42155    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -7.99    |
|    success_rate    | 0.99     |
| time/              |          |
|    episodes        | 692      |
|    fps             | 37       |
|    time_elapsed    | 1125     |
|    total_timesteps | 42352    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.0081   |
|    ent_coef        | 0.0038   |
|    ent_coef_loss   | -0.184   |
|    learning_rate   | 0.001    |
|    n_updates       | 42251    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.3     |
|    ep_rew_mean     | -8.13    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 696      |
|    fps             | 37       |
|    time_elapsed    | 1128     |
|    total_timesteps | 42445    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00749  |
|    ent_coef        | 0.00376  |
|    ent_coef_loss   | -1.13    |
|    learning_rate   | 0.001    |
|    n_updates       | 42344    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.5     |
|    ep_rew_mean     | -8.24    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 700      |
|    fps             | 37       |
|    time_elapsed    | 1130     |
|    total_timesteps | 42537    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0129   |
|    ent_coef        | 0.00368  |
|    ent_coef_loss   | -1.1     |
|    learning_rate   | 0.001    |
|    n_updates       | 42436    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -8.12    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 704      |
|    fps             | 37       |
|    time_elapsed    | 1133     |
|    total_timesteps | 42626    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0147   |
|    ent_coef        | 0.00368  |
|    ent_coef_loss   | -0.262   |
|    learning_rate   | 0.001    |
|    n_updates       | 42525    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.5     |
|    ep_rew_mean     | -8.15    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 708      |
|    fps             | 37       |
|    time_elapsed    | 1137     |
|    total_timesteps | 42761    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0142   |
|    ent_coef        | 0.00375  |
|    ent_coef_loss   | -1.71    |
|    learning_rate   | 0.001    |
|    n_updates       | 42660    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.3     |
|    ep_rew_mean     | -8.07    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 712      |
|    fps             | 37       |
|    time_elapsed    | 1139     |
|    total_timesteps | 42832    |
| train/             |          |
|    actor_loss      | 1.91     |
|    critic_loss     | 0.00501  |
|    ent_coef        | 0.00379  |
|    ent_coef_loss   | -0.369   |
|    learning_rate   | 0.001    |
|    n_updates       | 42731    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.2     |
|    ep_rew_mean     | -8.07    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 716      |
|    fps             | 37       |
|    time_elapsed    | 1141     |
|    total_timesteps | 42907    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.0277   |
|    ent_coef        | 0.00374  |
|    ent_coef_loss   | -0.231   |
|    learning_rate   | 0.001    |
|    n_updates       | 42806    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.2     |
|    ep_rew_mean     | -8.1     |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 720      |
|    fps             | 37       |
|    time_elapsed    | 1143     |
|    total_timesteps | 43004    |
| train/             |          |
|    actor_loss      | 1.92     |
|    critic_loss     | 0.00953  |
|    ent_coef        | 0.00367  |
|    ent_coef_loss   | -0.862   |
|    learning_rate   | 0.001    |
|    n_updates       | 42903    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.6     |
|    ep_rew_mean     | -8.18    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 724      |
|    fps             | 37       |
|    time_elapsed    | 1147     |
|    total_timesteps | 43104    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00726  |
|    ent_coef        | 0.00377  |
|    ent_coef_loss   | 0.0153   |
|    learning_rate   | 0.001    |
|    n_updates       | 43003    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.9     |
|    ep_rew_mean     | -8.03    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 728      |
|    fps             | 37       |
|    time_elapsed    | 1149     |
|    total_timesteps | 43176    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.00787  |
|    ent_coef        | 0.00392  |
|    ent_coef_loss   | -0.308   |
|    learning_rate   | 0.001    |
|    n_updates       | 43075    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25       |
|    ep_rew_mean     | -8.12    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 732      |
|    fps             | 37       |
|    time_elapsed    | 1151     |
|    total_timesteps | 43269    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00516  |
|    ent_coef        | 0.00407  |
|    ent_coef_loss   | 0.733    |
|    learning_rate   | 0.001    |
|    n_updates       | 43168    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.4     |
|    ep_rew_mean     | -8.13    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 736      |
|    fps             | 37       |
|    time_elapsed    | 1154     |
|    total_timesteps | 43371    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.017    |
|    ent_coef        | 0.00434  |
|    ent_coef_loss   | 0.413    |
|    learning_rate   | 0.001    |
|    n_updates       | 43270    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.8     |
|    ep_rew_mean     | -8.27    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 740      |
|    fps             | 37       |
|    time_elapsed    | 1158     |
|    total_timesteps | 43494    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.0205   |
|    ent_coef        | 0.00429  |
|    ent_coef_loss   | -0.648   |
|    learning_rate   | 0.001    |
|    n_updates       | 43393    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.7     |
|    ep_rew_mean     | -8.29    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 744      |
|    fps             | 37       |
|    time_elapsed    | 1160     |
|    total_timesteps | 43575    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0253   |
|    ent_coef        | 0.00429  |
|    ent_coef_loss   | 0.202    |
|    learning_rate   | 0.001    |
|    n_updates       | 43474    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.8     |
|    ep_rew_mean     | -8.32    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 748      |
|    fps             | 37       |
|    time_elapsed    | 1162     |
|    total_timesteps | 43653    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.00717  |
|    ent_coef        | 0.00419  |
|    ent_coef_loss   | -1.08    |
|    learning_rate   | 0.001    |
|    n_updates       | 43552    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.5     |
|    ep_rew_mean     | -8.26    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 752      |
|    fps             | 37       |
|    time_elapsed    | 1164     |
|    total_timesteps | 43737    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.00802  |
|    ent_coef        | 0.00413  |
|    ent_coef_loss   | -0.946   |
|    learning_rate   | 0.001    |
|    n_updates       | 43636    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.4     |
|    ep_rew_mean     | -8.26    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 756      |
|    fps             | 37       |
|    time_elapsed    | 1167     |
|    total_timesteps | 43833    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.00991  |
|    ent_coef        | 0.0042   |
|    ent_coef_loss   | 0.232    |
|    learning_rate   | 0.001    |
|    n_updates       | 43732    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.3     |
|    ep_rew_mean     | -8.25    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 760      |
|    fps             | 37       |
|    time_elapsed    | 1170     |
|    total_timesteps | 43931    |
| train/             |          |
|    actor_loss      | 1.88     |
|    critic_loss     | 0.0247   |
|    ent_coef        | 0.00408  |
|    ent_coef_loss   | -1.14    |
|    learning_rate   | 0.001    |
|    n_updates       | 43830    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.6     |
|    ep_rew_mean     | -8.12    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 764      |
|    fps             | 37       |
|    time_elapsed    | 1172     |
|    total_timesteps | 44006    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0123   |
|    ent_coef        | 0.00413  |
|    ent_coef_loss   | -1.02    |
|    learning_rate   | 0.001    |
|    n_updates       | 43905    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.4     |
|    ep_rew_mean     | -8.06    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 768      |
|    fps             | 37       |
|    time_elapsed    | 1175     |
|    total_timesteps | 44101    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.013    |
|    ent_coef        | 0.00406  |
|    ent_coef_loss   | -0.353   |
|    learning_rate   | 0.001    |
|    n_updates       | 44000    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.1     |
|    ep_rew_mean     | -7.98    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 772      |
|    fps             | 37       |
|    time_elapsed    | 1177     |
|    total_timesteps | 44195    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.0181   |
|    ent_coef        | 0.00407  |
|    ent_coef_loss   | -1.11    |
|    learning_rate   | 0.001    |
|    n_updates       | 44094    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.4     |
|    ep_rew_mean     | -8.05    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 776      |
|    fps             | 37       |
|    time_elapsed    | 1180     |
|    total_timesteps | 44316    |
| train/             |          |
|    actor_loss      | 1.95     |
|    critic_loss     | 0.0141   |
|    ent_coef        | 0.00417  |
|    ent_coef_loss   | 0.376    |
|    learning_rate   | 0.001    |
|    n_updates       | 44215    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.5     |
|    ep_rew_mean     | -8.07    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 780      |
|    fps             | 37       |
|    time_elapsed    | 1184     |
|    total_timesteps | 44429    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0224   |
|    ent_coef        | 0.00407  |
|    ent_coef_loss   | 0.0538   |
|    learning_rate   | 0.001    |
|    n_updates       | 44328    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.7     |
|    ep_rew_mean     | -8.21    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 784      |
|    fps             | 37       |
|    time_elapsed    | 1187     |
|    total_timesteps | 44544    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00662  |
|    ent_coef        | 0.00401  |
|    ent_coef_loss   | -0.446   |
|    learning_rate   | 0.001    |
|    n_updates       | 44443    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.9     |
|    ep_rew_mean     | -8.26    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 788      |
|    fps             | 37       |
|    time_elapsed    | 1190     |
|    total_timesteps | 44646    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.0169   |
|    ent_coef        | 0.00386  |
|    ent_coef_loss   | -0.381   |
|    learning_rate   | 0.001    |
|    n_updates       | 44545    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.2     |
|    ep_rew_mean     | -8.36    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 792      |
|    fps             | 37       |
|    time_elapsed    | 1193     |
|    total_timesteps | 44776    |
| train/             |          |
|    actor_loss      | 1.92     |
|    critic_loss     | 0.0144   |
|    ent_coef        | 0.00384  |
|    ent_coef_loss   | 0.0591   |
|    learning_rate   | 0.001    |
|    n_updates       | 44675    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.1     |
|    ep_rew_mean     | -8.2     |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 796      |
|    fps             | 37       |
|    time_elapsed    | 1196     |
|    total_timesteps | 44857    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00595  |
|    ent_coef        | 0.00391  |
|    ent_coef_loss   | 0.438    |
|    learning_rate   | 0.001    |
|    n_updates       | 44756    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.1     |
|    ep_rew_mean     | -8.21    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 800      |
|    fps             | 37       |
|    time_elapsed    | 1199     |
|    total_timesteps | 44946    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.0084   |
|    ent_coef        | 0.00404  |
|    ent_coef_loss   | -1.28    |
|    learning_rate   | 0.001    |
|    n_updates       | 44845    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.2     |
|    ep_rew_mean     | -8.35    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 804      |
|    fps             | 37       |
|    time_elapsed    | 1201     |
|    total_timesteps | 45050    |
| train/             |          |
|    actor_loss      | 1.86     |
|    critic_loss     | 0.0095   |
|    ent_coef        | 0.00413  |
|    ent_coef_loss   | -0.49    |
|    learning_rate   | 0.001    |
|    n_updates       | 44949    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.9     |
|    ep_rew_mean     | -8.39    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 808      |
|    fps             | 37       |
|    time_elapsed    | 1204     |
|    total_timesteps | 45154    |
| train/             |          |
|    actor_loss      | 1.88     |
|    critic_loss     | 0.00925  |
|    ent_coef        | 0.00402  |
|    ent_coef_loss   | 0.951    |
|    learning_rate   | 0.001    |
|    n_updates       | 45053    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.9     |
|    ep_rew_mean     | -8.43    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 812      |
|    fps             | 37       |
|    time_elapsed    | 1206     |
|    total_timesteps | 45226    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00707  |
|    ent_coef        | 0.00418  |
|    ent_coef_loss   | -2.08    |
|    learning_rate   | 0.001    |
|    n_updates       | 45125    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 28.9     |
|    ep_rew_mean     | -9.16    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 816      |
|    fps             | 37       |
|    time_elapsed    | 1222     |
|    total_timesteps | 45794    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.0135   |
|    ent_coef        | 0.00415  |
|    ent_coef_loss   | -0.36    |
|    learning_rate   | 0.001    |
|    n_updates       | 45693    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 28.9     |
|    ep_rew_mean     | -9.17    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 820      |
|    fps             | 37       |
|    time_elapsed    | 1225     |
|    total_timesteps | 45894    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0186   |
|    ent_coef        | 0.00434  |
|    ent_coef_loss   | -0.529   |
|    learning_rate   | 0.001    |
|    n_updates       | 45793    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 28.8     |
|    ep_rew_mean     | -9.15    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 824      |
|    fps             | 37       |
|    time_elapsed    | 1227     |
|    total_timesteps | 45980    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.0197   |
|    ent_coef        | 0.00425  |
|    ent_coef_loss   | 1.05     |
|    learning_rate   | 0.001    |
|    n_updates       | 45879    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 29       |
|    ep_rew_mean     | -9.24    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 828      |
|    fps             | 37       |
|    time_elapsed    | 1230     |
|    total_timesteps | 46076    |
| train/             |          |
|    actor_loss      | 1.88     |
|    critic_loss     | 0.00568  |
|    ent_coef        | 0.00413  |
|    ent_coef_loss   | 1.14     |
|    learning_rate   | 0.001    |
|    n_updates       | 45975    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 29       |
|    ep_rew_mean     | -9.19    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 832      |
|    fps             | 37       |
|    time_elapsed    | 1232     |
|    total_timesteps | 46168    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0148   |
|    ent_coef        | 0.00418  |
|    ent_coef_loss   | 0.204    |
|    learning_rate   | 0.001    |
|    n_updates       | 46067    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 28.6     |
|    ep_rew_mean     | -9.01    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 836      |
|    fps             | 37       |
|    time_elapsed    | 1235     |
|    total_timesteps | 46236    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.0115   |
|    ent_coef        | 0.00428  |
|    ent_coef_loss   | 0.179    |
|    learning_rate   | 0.001    |
|    n_updates       | 46135    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 28.2     |
|    ep_rew_mean     | -8.77    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 840      |
|    fps             | 37       |
|    time_elapsed    | 1237     |
|    total_timesteps | 46312    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.00928  |
|    ent_coef        | 0.00418  |
|    ent_coef_loss   | -0.304   |
|    learning_rate   | 0.001    |
|    n_updates       | 46211    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 28.6     |
|    ep_rew_mean     | -8.77    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 844      |
|    fps             | 37       |
|    time_elapsed    | 1240     |
|    total_timesteps | 46431    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00846  |
|    ent_coef        | 0.00406  |
|    ent_coef_loss   | -1.01    |
|    learning_rate   | 0.001    |
|    n_updates       | 46330    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 28.9     |
|    ep_rew_mean     | -8.85    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 848      |
|    fps             | 37       |
|    time_elapsed    | 1243     |
|    total_timesteps | 46538    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.0105   |
|    ent_coef        | 0.00397  |
|    ent_coef_loss   | 0.145    |
|    learning_rate   | 0.001    |
|    n_updates       | 46437    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 29       |
|    ep_rew_mean     | -8.96    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 852      |
|    fps             | 37       |
|    time_elapsed    | 1246     |
|    total_timesteps | 46640    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00881  |
|    ent_coef        | 0.00398  |
|    ent_coef_loss   | -2.06    |
|    learning_rate   | 0.001    |
|    n_updates       | 46539    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 28.9     |
|    ep_rew_mean     | -8.9     |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 856      |
|    fps             | 37       |
|    time_elapsed    | 1249     |
|    total_timesteps | 46724    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00736  |
|    ent_coef        | 0.00398  |
|    ent_coef_loss   | -0.0544  |
|    learning_rate   | 0.001    |
|    n_updates       | 46623    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 28.9     |
|    ep_rew_mean     | -8.81    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 860      |
|    fps             | 37       |
|    time_elapsed    | 1251     |
|    total_timesteps | 46820    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.0124   |
|    ent_coef        | 0.00397  |
|    ent_coef_loss   | -1.37    |
|    learning_rate   | 0.001    |
|    n_updates       | 46719    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 29       |
|    ep_rew_mean     | -8.85    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 864      |
|    fps             | 37       |
|    time_elapsed    | 1254     |
|    total_timesteps | 46909    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.0129   |
|    ent_coef        | 0.00405  |
|    ent_coef_loss   | 0.399    |
|    learning_rate   | 0.001    |
|    n_updates       | 46808    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 29       |
|    ep_rew_mean     | -8.84    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 868      |
|    fps             | 37       |
|    time_elapsed    | 1256     |
|    total_timesteps | 47001    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00844  |
|    ent_coef        | 0.00415  |
|    ent_coef_loss   | -0.154   |
|    learning_rate   | 0.001    |
|    n_updates       | 46900    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 29       |
|    ep_rew_mean     | -8.79    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 872      |
|    fps             | 37       |
|    time_elapsed    | 1259     |
|    total_timesteps | 47096    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0134   |
|    ent_coef        | 0.00417  |
|    ent_coef_loss   | -0.62    |
|    learning_rate   | 0.001    |
|    n_updates       | 46995    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 28.5     |
|    ep_rew_mean     | -8.63    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 876      |
|    fps             | 37       |
|    time_elapsed    | 1262     |
|    total_timesteps | 47170    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00554  |
|    ent_coef        | 0.00414  |
|    ent_coef_loss   | -0.521   |
|    learning_rate   | 0.001    |
|    n_updates       | 47069    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 28.2     |
|    ep_rew_mean     | -8.54    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 880      |
|    fps             | 37       |
|    time_elapsed    | 1264     |
|    total_timesteps | 47251    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.0126   |
|    ent_coef        | 0.00418  |
|    ent_coef_loss   | -0.103   |
|    learning_rate   | 0.001    |
|    n_updates       | 47150    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 28       |
|    ep_rew_mean     | -8.46    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 884      |
|    fps             | 37       |
|    time_elapsed    | 1266     |
|    total_timesteps | 47341    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.017    |
|    ent_coef        | 0.00407  |
|    ent_coef_loss   | 0.00965  |
|    learning_rate   | 0.001    |
|    n_updates       | 47240    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 27.9     |
|    ep_rew_mean     | -8.42    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 888      |
|    fps             | 37       |
|    time_elapsed    | 1268     |
|    total_timesteps | 47432    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.0107   |
|    ent_coef        | 0.00408  |
|    ent_coef_loss   | -2.8     |
|    learning_rate   | 0.001    |
|    n_updates       | 47331    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 27.5     |
|    ep_rew_mean     | -8.29    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 892      |
|    fps             | 37       |
|    time_elapsed    | 1271     |
|    total_timesteps | 47523    |
| train/             |          |
|    actor_loss      | 1.95     |
|    critic_loss     | 0.00568  |
|    ent_coef        | 0.00412  |
|    ent_coef_loss   | 0.236    |
|    learning_rate   | 0.001    |
|    n_updates       | 47422    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 27.8     |
|    ep_rew_mean     | -8.39    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 896      |
|    fps             | 37       |
|    time_elapsed    | 1275     |
|    total_timesteps | 47635    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.0113   |
|    ent_coef        | 0.0042   |
|    ent_coef_loss   | 0.903    |
|    learning_rate   | 0.001    |
|    n_updates       | 47534    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 27.6     |
|    ep_rew_mean     | -8.29    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 900      |
|    fps             | 37       |
|    time_elapsed    | 1276     |
|    total_timesteps | 47701    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.0062   |
|    ent_coef        | 0.00416  |
|    ent_coef_loss   | -0.459   |
|    learning_rate   | 0.001    |
|    n_updates       | 47600    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 27.4     |
|    ep_rew_mean     | -8.14    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 904      |
|    fps             | 37       |
|    time_elapsed    | 1279     |
|    total_timesteps | 47792    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.00886  |
|    ent_coef        | 0.00409  |
|    ent_coef_loss   | 0.629    |
|    learning_rate   | 0.001    |
|    n_updates       | 47691    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 27.2     |
|    ep_rew_mean     | -7.98    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 908      |
|    fps             | 37       |
|    time_elapsed    | 1281     |
|    total_timesteps | 47872    |
| train/             |          |
|    actor_loss      | 1.91     |
|    critic_loss     | 0.00579  |
|    ent_coef        | 0.0042   |
|    ent_coef_loss   | -0.938   |
|    learning_rate   | 0.001    |
|    n_updates       | 47771    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 27.2     |
|    ep_rew_mean     | -7.98    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 912      |
|    fps             | 37       |
|    time_elapsed    | 1283     |
|    total_timesteps | 47949    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.00661  |
|    ent_coef        | 0.0042   |
|    ent_coef_loss   | -0.125   |
|    learning_rate   | 0.001    |
|    n_updates       | 47848    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -7.18    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 916      |
|    fps             | 37       |
|    time_elapsed    | 1285     |
|    total_timesteps | 48025    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00761  |
|    ent_coef        | 0.00427  |
|    ent_coef_loss   | -0.117   |
|    learning_rate   | 0.001    |
|    n_updates       | 47924    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.2     |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 920      |
|    fps             | 37       |
|    time_elapsed    | 1289     |
|    total_timesteps | 48129    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00628  |
|    ent_coef        | 0.0042   |
|    ent_coef_loss   | -1.13    |
|    learning_rate   | 0.001    |
|    n_updates       | 48028    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.22    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 924      |
|    fps             | 37       |
|    time_elapsed    | 1291     |
|    total_timesteps | 48217    |
| train/             |          |
|    actor_loss      | 1.91     |
|    critic_loss     | 0.0255   |
|    ent_coef        | 0.00424  |
|    ent_coef_loss   | 0.29     |
|    learning_rate   | 0.001    |
|    n_updates       | 48116    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.21    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 928      |
|    fps             | 37       |
|    time_elapsed    | 1293     |
|    total_timesteps | 48314    |
| train/             |          |
|    actor_loss      | 1.93     |
|    critic_loss     | 0.00777  |
|    ent_coef        | 0.00429  |
|    ent_coef_loss   | 0.131    |
|    learning_rate   | 0.001    |
|    n_updates       | 48213    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.1     |
|    ep_rew_mean     | -7.12    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 932      |
|    fps             | 37       |
|    time_elapsed    | 1295     |
|    total_timesteps | 48378    |
| train/             |          |
|    actor_loss      | 1.89     |
|    critic_loss     | 0.0144   |
|    ent_coef        | 0.00436  |
|    ent_coef_loss   | 0.692    |
|    learning_rate   | 0.001    |
|    n_updates       | 48277    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.24    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 936      |
|    fps             | 37       |
|    time_elapsed    | 1298     |
|    total_timesteps | 48473    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.025    |
|    ent_coef        | 0.00436  |
|    ent_coef_loss   | -0.403   |
|    learning_rate   | 0.001    |
|    n_updates       | 48372    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -7.22    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 940      |
|    fps             | 37       |
|    time_elapsed    | 1300     |
|    total_timesteps | 48543    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.013    |
|    ent_coef        | 0.00441  |
|    ent_coef_loss   | -1.19    |
|    learning_rate   | 0.001    |
|    n_updates       | 48442    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.13    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 944      |
|    fps             | 37       |
|    time_elapsed    | 1302     |
|    total_timesteps | 48618    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00671  |
|    ent_coef        | 0.00438  |
|    ent_coef_loss   | 1.29     |
|    learning_rate   | 0.001    |
|    n_updates       | 48517    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.11    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 948      |
|    fps             | 37       |
|    time_elapsed    | 1305     |
|    total_timesteps | 48709    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.00964  |
|    ent_coef        | 0.00453  |
|    ent_coef_loss   | 1.25     |
|    learning_rate   | 0.001    |
|    n_updates       | 48608    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.1     |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 952      |
|    fps             | 37       |
|    time_elapsed    | 1308     |
|    total_timesteps | 48811    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.00788  |
|    ent_coef        | 0.00441  |
|    ent_coef_loss   | 0.302    |
|    learning_rate   | 0.001    |
|    n_updates       | 48710    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.2     |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 956      |
|    fps             | 37       |
|    time_elapsed    | 1310     |
|    total_timesteps | 48910    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0111   |
|    ent_coef        | 0.00446  |
|    ent_coef_loss   | 1.46     |
|    learning_rate   | 0.001    |
|    n_updates       | 48809    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.23    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 960      |
|    fps             | 37       |
|    time_elapsed    | 1314     |
|    total_timesteps | 49012    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00848  |
|    ent_coef        | 0.00451  |
|    ent_coef_loss   | 0.397    |
|    learning_rate   | 0.001    |
|    n_updates       | 48911    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22       |
|    ep_rew_mean     | -7.22    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 964      |
|    fps             | 37       |
|    time_elapsed    | 1316     |
|    total_timesteps | 49106    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.0229   |
|    ent_coef        | 0.00452  |
|    ent_coef_loss   | -2.26    |
|    learning_rate   | 0.001    |
|    n_updates       | 49005    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.1     |
|    ep_rew_mean     | -7.25    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 968      |
|    fps             | 37       |
|    time_elapsed    | 1319     |
|    total_timesteps | 49207    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.00734  |
|    ent_coef        | 0.00441  |
|    ent_coef_loss   | -1.12    |
|    learning_rate   | 0.001    |
|    n_updates       | 49106    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -7.44    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 972      |
|    fps             | 37       |
|    time_elapsed    | 1322     |
|    total_timesteps | 49324    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00851  |
|    ent_coef        | 0.00436  |
|    ent_coef_loss   | 0.512    |
|    learning_rate   | 0.001    |
|    n_updates       | 49223    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.1     |
|    ep_rew_mean     | -7.61    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 976      |
|    fps             | 37       |
|    time_elapsed    | 1327     |
|    total_timesteps | 49484    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00794  |
|    ent_coef        | 0.00446  |
|    ent_coef_loss   | 0.114    |
|    learning_rate   | 0.001    |
|    n_updates       | 49383    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.3     |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 980      |
|    fps             | 37       |
|    time_elapsed    | 1329     |
|    total_timesteps | 49578    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00537  |
|    ent_coef        | 0.0043   |
|    ent_coef_loss   | -0.856   |
|    learning_rate   | 0.001    |
|    n_updates       | 49477    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.2     |
|    ep_rew_mean     | -7.6     |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 984      |
|    fps             | 37       |
|    time_elapsed    | 1332     |
|    total_timesteps | 49661    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.00454  |
|    ent_coef        | 0.00429  |
|    ent_coef_loss   | 0.374    |
|    learning_rate   | 0.001    |
|    n_updates       | 49560    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.1     |
|    ep_rew_mean     | -7.57    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 988      |
|    fps             | 37       |
|    time_elapsed    | 1334     |
|    total_timesteps | 49737    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.0408   |
|    ent_coef        | 0.00441  |
|    ent_coef_loss   | 0.303    |
|    learning_rate   | 0.001    |
|    n_updates       | 49636    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.9     |
|    ep_rew_mean     | -7.51    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 992      |
|    fps             | 37       |
|    time_elapsed    | 1336     |
|    total_timesteps | 49816    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0109   |
|    ent_coef        | 0.00446  |
|    ent_coef_loss   | -1.23    |
|    learning_rate   | 0.001    |
|    n_updates       | 49715    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.35    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 996      |
|    fps             | 37       |
|    time_elapsed    | 1338     |
|    total_timesteps | 49876    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.0073   |
|    ent_coef        | 0.00446  |
|    ent_coef_loss   | 1.4      |
|    learning_rate   | 0.001    |
|    n_updates       | 49775    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.7     |
|    ep_rew_mean     | -7.41    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1000     |
|    fps             | 37       |
|    time_elapsed    | 1340     |
|    total_timesteps | 49968    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.00494  |
|    ent_coef        | 0.00453  |
|    ent_coef_loss   | 0.0742   |
|    learning_rate   | 0.001    |
|    n_updates       | 49867    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.9     |
|    ep_rew_mean     | -7.47    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1004     |
|    fps             | 37       |
|    time_elapsed    | 1343     |
|    total_timesteps | 50084    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.00942  |
|    ent_coef        | 0.00448  |
|    ent_coef_loss   | 0.483    |
|    learning_rate   | 0.001    |
|    n_updates       | 49983    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.9     |
|    ep_rew_mean     | -7.56    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1008     |
|    fps             | 37       |
|    time_elapsed    | 1346     |
|    total_timesteps | 50162    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.011    |
|    ent_coef        | 0.0046   |
|    ent_coef_loss   | 1.41     |
|    learning_rate   | 0.001    |
|    n_updates       | 50061    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23       |
|    ep_rew_mean     | -7.57    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1012     |
|    fps             | 37       |
|    time_elapsed    | 1348     |
|    total_timesteps | 50248    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.0101   |
|    ent_coef        | 0.00452  |
|    ent_coef_loss   | 0.0683   |
|    learning_rate   | 0.001    |
|    n_updates       | 50147    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.1     |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1016     |
|    fps             | 37       |
|    time_elapsed    | 1351     |
|    total_timesteps | 50338    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.00857  |
|    ent_coef        | 0.00451  |
|    ent_coef_loss   | 1.51     |
|    learning_rate   | 0.001    |
|    n_updates       | 50237    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23       |
|    ep_rew_mean     | -7.63    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1020     |
|    fps             | 37       |
|    time_elapsed    | 1354     |
|    total_timesteps | 50432    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0122   |
|    ent_coef        | 0.00455  |
|    ent_coef_loss   | -0.274   |
|    learning_rate   | 0.001    |
|    n_updates       | 50331    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.3     |
|    ep_rew_mean     | -7.73    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1024     |
|    fps             | 37       |
|    time_elapsed    | 1357     |
|    total_timesteps | 50544    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.0114   |
|    ent_coef        | 0.00437  |
|    ent_coef_loss   | -0.0987  |
|    learning_rate   | 0.001    |
|    n_updates       | 50443    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.2     |
|    ep_rew_mean     | -7.7     |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1028     |
|    fps             | 37       |
|    time_elapsed    | 1359     |
|    total_timesteps | 50635    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.00931  |
|    ent_coef        | 0.00435  |
|    ent_coef_loss   | -1.65    |
|    learning_rate   | 0.001    |
|    n_updates       | 50534    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.7     |
|    ep_rew_mean     | -7.85    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1032     |
|    fps             | 37       |
|    time_elapsed    | 1363     |
|    total_timesteps | 50746    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00551  |
|    ent_coef        | 0.0044   |
|    ent_coef_loss   | -0.916   |
|    learning_rate   | 0.001    |
|    n_updates       | 50645    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24       |
|    ep_rew_mean     | -7.79    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1036     |
|    fps             | 37       |
|    time_elapsed    | 1366     |
|    total_timesteps | 50870    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.00552  |
|    ent_coef        | 0.00446  |
|    ent_coef_loss   | -1.09    |
|    learning_rate   | 0.001    |
|    n_updates       | 50769    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.1     |
|    ep_rew_mean     | -7.87    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1040     |
|    fps             | 37       |
|    time_elapsed    | 1369     |
|    total_timesteps | 50958    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0126   |
|    ent_coef        | 0.00442  |
|    ent_coef_loss   | 0.269    |
|    learning_rate   | 0.001    |
|    n_updates       | 50857    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.3     |
|    ep_rew_mean     | -7.94    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1044     |
|    fps             | 37       |
|    time_elapsed    | 1371     |
|    total_timesteps | 51045    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00667  |
|    ent_coef        | 0.00454  |
|    ent_coef_loss   | 0.0483   |
|    learning_rate   | 0.001    |
|    n_updates       | 50944    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.4     |
|    ep_rew_mean     | -7.95    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1048     |
|    fps             | 37       |
|    time_elapsed    | 1374     |
|    total_timesteps | 51144    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.00792  |
|    ent_coef        | 0.00443  |
|    ent_coef_loss   | -0.647   |
|    learning_rate   | 0.001    |
|    n_updates       | 51043    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.2     |
|    ep_rew_mean     | -7.88    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1052     |
|    fps             | 37       |
|    time_elapsed    | 1377     |
|    total_timesteps | 51236    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00627  |
|    ent_coef        | 0.00429  |
|    ent_coef_loss   | -1.31    |
|    learning_rate   | 0.001    |
|    n_updates       | 51135    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.3     |
|    ep_rew_mean     | -7.84    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1056     |
|    fps             | 37       |
|    time_elapsed    | 1380     |
|    total_timesteps | 51342    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.0131   |
|    ent_coef        | 0.00425  |
|    ent_coef_loss   | 0.687    |
|    learning_rate   | 0.001    |
|    n_updates       | 51241    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.1     |
|    ep_rew_mean     | -7.77    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1060     |
|    fps             | 37       |
|    time_elapsed    | 1382     |
|    total_timesteps | 51425    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.00652  |
|    ent_coef        | 0.00426  |
|    ent_coef_loss   | 0.224    |
|    learning_rate   | 0.001    |
|    n_updates       | 51324    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.1     |
|    ep_rew_mean     | -7.79    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1064     |
|    fps             | 37       |
|    time_elapsed    | 1384     |
|    total_timesteps | 51512    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.0182   |
|    ent_coef        | 0.00427  |
|    ent_coef_loss   | 1.1      |
|    learning_rate   | 0.001    |
|    n_updates       | 51411    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24       |
|    ep_rew_mean     | -7.75    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1068     |
|    fps             | 37       |
|    time_elapsed    | 1386     |
|    total_timesteps | 51606    |
| train/             |          |
|    actor_loss      | 1.93     |
|    critic_loss     | 0.0101   |
|    ent_coef        | 0.00433  |
|    ent_coef_loss   | 0.707    |
|    learning_rate   | 0.001    |
|    n_updates       | 51505    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.7     |
|    ep_rew_mean     | -7.56    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 1072     |
|    fps             | 37       |
|    time_elapsed    | 1390     |
|    total_timesteps | 51690    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00674  |
|    ent_coef        | 0.00437  |
|    ent_coef_loss   | 0.775    |
|    learning_rate   | 0.001    |
|    n_updates       | 51589    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.2     |
|    ep_rew_mean     | -7.48    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 1076     |
|    fps             | 37       |
|    time_elapsed    | 1393     |
|    total_timesteps | 51804    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.00722  |
|    ent_coef        | 0.00439  |
|    ent_coef_loss   | 0.868    |
|    learning_rate   | 0.001    |
|    n_updates       | 51703    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.3     |
|    ep_rew_mean     | -7.52    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1080     |
|    fps             | 37       |
|    time_elapsed    | 1395     |
|    total_timesteps | 51906    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00574  |
|    ent_coef        | 0.00441  |
|    ent_coef_loss   | -0.16    |
|    learning_rate   | 0.001    |
|    n_updates       | 51805    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.3     |
|    ep_rew_mean     | -7.53    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1084     |
|    fps             | 37       |
|    time_elapsed    | 1398     |
|    total_timesteps | 51990    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.00654  |
|    ent_coef        | 0.00441  |
|    ent_coef_loss   | 1.14     |
|    learning_rate   | 0.001    |
|    n_updates       | 51889    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.4     |
|    ep_rew_mean     | -7.51    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1088     |
|    fps             | 37       |
|    time_elapsed    | 1400     |
|    total_timesteps | 52081    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.00756  |
|    ent_coef        | 0.00439  |
|    ent_coef_loss   | 0.945    |
|    learning_rate   | 0.001    |
|    n_updates       | 51980    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.6     |
|    ep_rew_mean     | -7.62    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1092     |
|    fps             | 37       |
|    time_elapsed    | 1403     |
|    total_timesteps | 52173    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00748  |
|    ent_coef        | 0.00427  |
|    ent_coef_loss   | 1.02     |
|    learning_rate   | 0.001    |
|    n_updates       | 52072    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.9     |
|    ep_rew_mean     | -7.72    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1096     |
|    fps             | 37       |
|    time_elapsed    | 1406     |
|    total_timesteps | 52267    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.00455  |
|    ent_coef        | 0.00429  |
|    ent_coef_loss   | 0.0126   |
|    learning_rate   | 0.001    |
|    n_updates       | 52166    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.9     |
|    ep_rew_mean     | -7.73    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1100     |
|    fps             | 37       |
|    time_elapsed    | 1408     |
|    total_timesteps | 52353    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.0109   |
|    ent_coef        | 0.00423  |
|    ent_coef_loss   | 0.522    |
|    learning_rate   | 0.001    |
|    n_updates       | 52252    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.5     |
|    ep_rew_mean     | -7.66    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1104     |
|    fps             | 37       |
|    time_elapsed    | 1411     |
|    total_timesteps | 52432    |
| train/             |          |
|    actor_loss      | 1.86     |
|    critic_loss     | 0.0191   |
|    ent_coef        | 0.00425  |
|    ent_coef_loss   | 0.796    |
|    learning_rate   | 0.001    |
|    n_updates       | 52331    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.9     |
|    ep_rew_mean     | -7.74    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1108     |
|    fps             | 37       |
|    time_elapsed    | 1414     |
|    total_timesteps | 52548    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.013    |
|    ent_coef        | 0.00433  |
|    ent_coef_loss   | -0.968   |
|    learning_rate   | 0.001    |
|    n_updates       | 52447    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.8     |
|    ep_rew_mean     | -7.74    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1112     |
|    fps             | 37       |
|    time_elapsed    | 1417     |
|    total_timesteps | 52630    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.00508  |
|    ent_coef        | 0.00426  |
|    ent_coef_loss   | 0.404    |
|    learning_rate   | 0.001    |
|    n_updates       | 52529    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.1     |
|    ep_rew_mean     | -7.68    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 1116     |
|    fps             | 37       |
|    time_elapsed    | 1420     |
|    total_timesteps | 52749    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0142   |
|    ent_coef        | 0.00426  |
|    ent_coef_loss   | -1.48    |
|    learning_rate   | 0.001    |
|    n_updates       | 52648    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.2     |
|    ep_rew_mean     | -7.77    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1120     |
|    fps             | 37       |
|    time_elapsed    | 1423     |
|    total_timesteps | 52852    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.00983  |
|    ent_coef        | 0.00416  |
|    ent_coef_loss   | -0.083   |
|    learning_rate   | 0.001    |
|    n_updates       | 52751    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24       |
|    ep_rew_mean     | -7.73    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1124     |
|    fps             | 37       |
|    time_elapsed    | 1425     |
|    total_timesteps | 52940    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.0075   |
|    ent_coef        | 0.00419  |
|    ent_coef_loss   | 0.636    |
|    learning_rate   | 0.001    |
|    n_updates       | 52839    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24       |
|    ep_rew_mean     | -7.81    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1128     |
|    fps             | 37       |
|    time_elapsed    | 1428     |
|    total_timesteps | 53035    |
| train/             |          |
|    actor_loss      | 1.57     |
|    critic_loss     | 0.0122   |
|    ent_coef        | 0.00415  |
|    ent_coef_loss   | 0.773    |
|    learning_rate   | 0.001    |
|    n_updates       | 52934    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.6     |
|    ep_rew_mean     | -7.75    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1132     |
|    fps             | 37       |
|    time_elapsed    | 1430     |
|    total_timesteps | 53108    |
| train/             |          |
|    actor_loss      | 1.91     |
|    critic_loss     | 0.00718  |
|    ent_coef        | 0.00413  |
|    ent_coef_loss   | 1.74     |
|    learning_rate   | 0.001    |
|    n_updates       | 53007    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.4     |
|    ep_rew_mean     | -7.79    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1136     |
|    fps             | 37       |
|    time_elapsed    | 1433     |
|    total_timesteps | 53213    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.0153   |
|    ent_coef        | 0.00435  |
|    ent_coef_loss   | 0.99     |
|    learning_rate   | 0.001    |
|    n_updates       | 53112    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.2     |
|    ep_rew_mean     | -7.8     |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1140     |
|    fps             | 37       |
|    time_elapsed    | 1435     |
|    total_timesteps | 53283    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.0115   |
|    ent_coef        | 0.0044   |
|    ent_coef_loss   | 0.389    |
|    learning_rate   | 0.001    |
|    n_updates       | 53182    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23.2     |
|    ep_rew_mean     | -7.76    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1144     |
|    fps             | 37       |
|    time_elapsed    | 1437     |
|    total_timesteps | 53367    |
| train/             |          |
|    actor_loss      | 1.92     |
|    critic_loss     | 0.0115   |
|    ent_coef        | 0.00444  |
|    ent_coef_loss   | 1.71     |
|    learning_rate   | 0.001    |
|    n_updates       | 53266    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 23       |
|    ep_rew_mean     | -7.7     |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1148     |
|    fps             | 37       |
|    time_elapsed    | 1439     |
|    total_timesteps | 53443    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.00592  |
|    ent_coef        | 0.0046   |
|    ent_coef_loss   | -1.45    |
|    learning_rate   | 0.001    |
|    n_updates       | 53342    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.9     |
|    ep_rew_mean     | -7.77    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1152     |
|    fps             | 37       |
|    time_elapsed    | 1442     |
|    total_timesteps | 53525    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.00773  |
|    ent_coef        | 0.00442  |
|    ent_coef_loss   | -0.244   |
|    learning_rate   | 0.001    |
|    n_updates       | 53424    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.9     |
|    ep_rew_mean     | -7.76    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1156     |
|    fps             | 37       |
|    time_elapsed    | 1445     |
|    total_timesteps | 53632    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.00613  |
|    ent_coef        | 0.00449  |
|    ent_coef_loss   | -0.993   |
|    learning_rate   | 0.001    |
|    n_updates       | 53531    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.7     |
|    ep_rew_mean     | -7.72    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1160     |
|    fps             | 37       |
|    time_elapsed    | 1447     |
|    total_timesteps | 53697    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.0172   |
|    ent_coef        | 0.00443  |
|    ent_coef_loss   | 0.146    |
|    learning_rate   | 0.001    |
|    n_updates       | 53596    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.8     |
|    ep_rew_mean     | -7.72    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1164     |
|    fps             | 37       |
|    time_elapsed    | 1449     |
|    total_timesteps | 53790    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.00876  |
|    ent_coef        | 0.00454  |
|    ent_coef_loss   | 0.248    |
|    learning_rate   | 0.001    |
|    n_updates       | 53689    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.6     |
|    ep_rew_mean     | -7.73    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1168     |
|    fps             | 37       |
|    time_elapsed    | 1451     |
|    total_timesteps | 53870    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00584  |
|    ent_coef        | 0.00455  |
|    ent_coef_loss   | -0.2     |
|    learning_rate   | 0.001    |
|    n_updates       | 53769    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.9     |
|    ep_rew_mean     | -7.78    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1172     |
|    fps             | 37       |
|    time_elapsed    | 1455     |
|    total_timesteps | 53975    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.0146   |
|    ent_coef        | 0.00454  |
|    ent_coef_loss   | 0.809    |
|    learning_rate   | 0.001    |
|    n_updates       | 53874    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.5     |
|    ep_rew_mean     | -7.74    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1176     |
|    fps             | 37       |
|    time_elapsed    | 1457     |
|    total_timesteps | 54054    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0122   |
|    ent_coef        | 0.00458  |
|    ent_coef_loss   | 0.275    |
|    learning_rate   | 0.001    |
|    n_updates       | 53953    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.63    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1180     |
|    fps             | 37       |
|    time_elapsed    | 1459     |
|    total_timesteps | 54142    |
| train/             |          |
|    actor_loss      | 1.86     |
|    critic_loss     | 0.00658  |
|    ent_coef        | 0.00449  |
|    ent_coef_loss   | 0.837    |
|    learning_rate   | 0.001    |
|    n_updates       | 54041    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.63    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1184     |
|    fps             | 37       |
|    time_elapsed    | 1462     |
|    total_timesteps | 54226    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.00651  |
|    ent_coef        | 0.00444  |
|    ent_coef_loss   | -0.791   |
|    learning_rate   | 0.001    |
|    n_updates       | 54125    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -7.65    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1188     |
|    fps             | 37       |
|    time_elapsed    | 1464     |
|    total_timesteps | 54310    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.00967  |
|    ent_coef        | 0.00447  |
|    ent_coef_loss   | 1.45     |
|    learning_rate   | 0.001    |
|    n_updates       | 54209    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.62    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1192     |
|    fps             | 37       |
|    time_elapsed    | 1467     |
|    total_timesteps | 54394    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.0121   |
|    ent_coef        | 0.00454  |
|    ent_coef_loss   | 0.526    |
|    learning_rate   | 0.001    |
|    n_updates       | 54293    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1196     |
|    fps             | 37       |
|    time_elapsed    | 1470     |
|    total_timesteps | 54495    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00646  |
|    ent_coef        | 0.00467  |
|    ent_coef_loss   | 0.688    |
|    learning_rate   | 0.001    |
|    n_updates       | 54394    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.5     |
|    ep_rew_mean     | -7.81    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1200     |
|    fps             | 37       |
|    time_elapsed    | 1473     |
|    total_timesteps | 54599    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00664  |
|    ent_coef        | 0.00453  |
|    ent_coef_loss   | -1.24    |
|    learning_rate   | 0.001    |
|    n_updates       | 54498    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.76    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1204     |
|    fps             | 37       |
|    time_elapsed    | 1475     |
|    total_timesteps | 54676    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.00597  |
|    ent_coef        | 0.00448  |
|    ent_coef_loss   | -0.645   |
|    learning_rate   | 0.001    |
|    n_updates       | 54575    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1208     |
|    fps             | 37       |
|    time_elapsed    | 1477     |
|    total_timesteps | 54764    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.00878  |
|    ent_coef        | 0.00448  |
|    ent_coef_loss   | 0.575    |
|    learning_rate   | 0.001    |
|    n_updates       | 54663    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.67    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1212     |
|    fps             | 37       |
|    time_elapsed    | 1480     |
|    total_timesteps | 54847    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00892  |
|    ent_coef        | 0.00441  |
|    ent_coef_loss   | -1.52    |
|    learning_rate   | 0.001    |
|    n_updates       | 54746    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.62    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1216     |
|    fps             | 37       |
|    time_elapsed    | 1482     |
|    total_timesteps | 54925    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.0555   |
|    ent_coef        | 0.00467  |
|    ent_coef_loss   | -0.0812  |
|    learning_rate   | 0.001    |
|    n_updates       | 54824    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.5     |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1220     |
|    fps             | 37       |
|    time_elapsed    | 1485     |
|    total_timesteps | 55015    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.0124   |
|    ent_coef        | 0.00461  |
|    ent_coef_loss   | -1.28    |
|    learning_rate   | 0.001    |
|    n_updates       | 54914    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.42    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1224     |
|    fps             | 37       |
|    time_elapsed    | 1487     |
|    total_timesteps | 55091    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00913  |
|    ent_coef        | 0.00456  |
|    ent_coef_loss   | -1.04    |
|    learning_rate   | 0.001    |
|    n_updates       | 54990    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.24    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1228     |
|    fps             | 37       |
|    time_elapsed    | 1488     |
|    total_timesteps | 55154    |
| train/             |          |
|    actor_loss      | 1.9      |
|    critic_loss     | 0.0121   |
|    ent_coef        | 0.00446  |
|    ent_coef_loss   | 1.02     |
|    learning_rate   | 0.001    |
|    n_updates       | 55053    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.28    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1232     |
|    fps             | 37       |
|    time_elapsed    | 1491     |
|    total_timesteps | 55243    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.0075   |
|    ent_coef        | 0.00453  |
|    ent_coef_loss   | 0.0795   |
|    learning_rate   | 0.001    |
|    n_updates       | 55142    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.12    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1236     |
|    fps             | 37       |
|    time_elapsed    | 1493     |
|    total_timesteps | 55304    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.0124   |
|    ent_coef        | 0.00451  |
|    ent_coef_loss   | -0.147   |
|    learning_rate   | 0.001    |
|    n_updates       | 55203    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.12    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 1240     |
|    fps             | 37       |
|    time_elapsed    | 1496     |
|    total_timesteps | 55411    |
| train/             |          |
|    actor_loss      | 1.9      |
|    critic_loss     | 0.0103   |
|    ent_coef        | 0.00452  |
|    ent_coef_loss   | 0.59     |
|    learning_rate   | 0.001    |
|    n_updates       | 55310    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.09    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 1244     |
|    fps             | 37       |
|    time_elapsed    | 1498     |
|    total_timesteps | 55493    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.00863  |
|    ent_coef        | 0.0045   |
|    ent_coef_loss   | 1.15     |
|    learning_rate   | 0.001    |
|    n_updates       | 55392    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.17    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 1248     |
|    fps             | 37       |
|    time_elapsed    | 1501     |
|    total_timesteps | 55585    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.0121   |
|    ent_coef        | 0.0046   |
|    ent_coef_loss   | 2.06     |
|    learning_rate   | 0.001    |
|    n_updates       | 55484    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.07    |
|    success_rate    | 0.99     |
| time/              |          |
|    episodes        | 1252     |
|    fps             | 37       |
|    time_elapsed    | 1503     |
|    total_timesteps | 55670    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0176   |
|    ent_coef        | 0.00458  |
|    ent_coef_loss   | -0.0632  |
|    learning_rate   | 0.001    |
|    n_updates       | 55569    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.02    |
|    success_rate    | 0.99     |
| time/              |          |
|    episodes        | 1256     |
|    fps             | 36       |
|    time_elapsed    | 1506     |
|    total_timesteps | 55757    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.0185   |
|    ent_coef        | 0.0047   |
|    ent_coef_loss   | -1.31    |
|    learning_rate   | 0.001    |
|    n_updates       | 55656    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.19    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 1260     |
|    fps             | 37       |
|    time_elapsed    | 1509     |
|    total_timesteps | 55850    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0122   |
|    ent_coef        | 0.00466  |
|    ent_coef_loss   | -0.509   |
|    learning_rate   | 0.001    |
|    n_updates       | 55749    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.17    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 1264     |
|    fps             | 37       |
|    time_elapsed    | 1511     |
|    total_timesteps | 55940    |
| train/             |          |
|    actor_loss      | 1.98     |
|    critic_loss     | 0.00969  |
|    ent_coef        | 0.00474  |
|    ent_coef_loss   | 0.449    |
|    learning_rate   | 0.001    |
|    n_updates       | 55839    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.24    |
|    success_rate    | 0.98     |
| time/              |          |
|    episodes        | 1268     |
|    fps             | 36       |
|    time_elapsed    | 1514     |
|    total_timesteps | 56039    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0138   |
|    ent_coef        | 0.00463  |
|    ent_coef_loss   | 0.0168   |
|    learning_rate   | 0.001    |
|    n_updates       | 55938    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.22    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1272     |
|    fps             | 36       |
|    time_elapsed    | 1517     |
|    total_timesteps | 56128    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00466  |
|    ent_coef        | 0.00449  |
|    ent_coef_loss   | -0.8     |
|    learning_rate   | 0.001    |
|    n_updates       | 56027    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.27    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1276     |
|    fps             | 36       |
|    time_elapsed    | 1520     |
|    total_timesteps | 56227    |
| train/             |          |
|    actor_loss      | 1.93     |
|    critic_loss     | 0.00966  |
|    ent_coef        | 0.00442  |
|    ent_coef_loss   | 0.275    |
|    learning_rate   | 0.001    |
|    n_updates       | 56126    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.26    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1280     |
|    fps             | 36       |
|    time_elapsed    | 1522     |
|    total_timesteps | 56297    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.0254   |
|    ent_coef        | 0.00438  |
|    ent_coef_loss   | 0.133    |
|    learning_rate   | 0.001    |
|    n_updates       | 56196    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.23    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1284     |
|    fps             | 36       |
|    time_elapsed    | 1524     |
|    total_timesteps | 56373    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00731  |
|    ent_coef        | 0.00444  |
|    ent_coef_loss   | -0.235   |
|    learning_rate   | 0.001    |
|    n_updates       | 56272    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.26    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1288     |
|    fps             | 36       |
|    time_elapsed    | 1526     |
|    total_timesteps | 56463    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00539  |
|    ent_coef        | 0.0044   |
|    ent_coef_loss   | 1.5      |
|    learning_rate   | 0.001    |
|    n_updates       | 56362    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.19    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1292     |
|    fps             | 36       |
|    time_elapsed    | 1529     |
|    total_timesteps | 56545    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0126   |
|    ent_coef        | 0.00459  |
|    ent_coef_loss   | 0.682    |
|    learning_rate   | 0.001    |
|    n_updates       | 56444    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.19    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1296     |
|    fps             | 36       |
|    time_elapsed    | 1533     |
|    total_timesteps | 56661    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0101   |
|    ent_coef        | 0.00467  |
|    ent_coef_loss   | -0.422   |
|    learning_rate   | 0.001    |
|    n_updates       | 56560    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.06    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1300     |
|    fps             | 36       |
|    time_elapsed    | 1535     |
|    total_timesteps | 56742    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.00485  |
|    ent_coef        | 0.00464  |
|    ent_coef_loss   | -1.18    |
|    learning_rate   | 0.001    |
|    n_updates       | 56641    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.07    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1304     |
|    fps             | 36       |
|    time_elapsed    | 1537     |
|    total_timesteps | 56825    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.00927  |
|    ent_coef        | 0.00466  |
|    ent_coef_loss   | 0.0563   |
|    learning_rate   | 0.001    |
|    n_updates       | 56724    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.02    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 1308     |
|    fps             | 36       |
|    time_elapsed    | 1539     |
|    total_timesteps | 56901    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00811  |
|    ent_coef        | 0.00462  |
|    ent_coef_loss   | 0.357    |
|    learning_rate   | 0.001    |
|    n_updates       | 56800    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.13    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1312     |
|    fps             | 36       |
|    time_elapsed    | 1542     |
|    total_timesteps | 56997    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.00948  |
|    ent_coef        | 0.00472  |
|    ent_coef_loss   | -1.15    |
|    learning_rate   | 0.001    |
|    n_updates       | 56896    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.13    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1316     |
|    fps             | 36       |
|    time_elapsed    | 1544     |
|    total_timesteps | 57067    |
| train/             |          |
|    actor_loss      | 1.89     |
|    critic_loss     | 0.0143   |
|    ent_coef        | 0.00475  |
|    ent_coef_loss   | 1.26     |
|    learning_rate   | 0.001    |
|    n_updates       | 56966    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.15    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1320     |
|    fps             | 36       |
|    time_elapsed    | 1547     |
|    total_timesteps | 57150    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.00889  |
|    ent_coef        | 0.00481  |
|    ent_coef_loss   | 0.258    |
|    learning_rate   | 0.001    |
|    n_updates       | 57049    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.08    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1324     |
|    fps             | 36       |
|    time_elapsed    | 1548     |
|    total_timesteps | 57216    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.00484  |
|    ent_coef        | 0.00478  |
|    ent_coef_loss   | -0.329   |
|    learning_rate   | 0.001    |
|    n_updates       | 57115    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.19    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1328     |
|    fps             | 36       |
|    time_elapsed    | 1551     |
|    total_timesteps | 57313    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00722  |
|    ent_coef        | 0.00473  |
|    ent_coef_loss   | -0.441   |
|    learning_rate   | 0.001    |
|    n_updates       | 57212    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.15    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1332     |
|    fps             | 36       |
|    time_elapsed    | 1553     |
|    total_timesteps | 57395    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.0113   |
|    ent_coef        | 0.00472  |
|    ent_coef_loss   | -0.0795  |
|    learning_rate   | 0.001    |
|    n_updates       | 57294    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.32    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1336     |
|    fps             | 36       |
|    time_elapsed    | 1556     |
|    total_timesteps | 57479    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00627  |
|    ent_coef        | 0.00467  |
|    ent_coef_loss   | 1.67     |
|    learning_rate   | 0.001    |
|    n_updates       | 57378    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.35    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1340     |
|    fps             | 36       |
|    time_elapsed    | 1559     |
|    total_timesteps | 57570    |
| train/             |          |
|    actor_loss      | 1.88     |
|    critic_loss     | 0.00589  |
|    ent_coef        | 0.00471  |
|    ent_coef_loss   | 0.905    |
|    learning_rate   | 0.001    |
|    n_updates       | 57469    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.33    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1344     |
|    fps             | 36       |
|    time_elapsed    | 1561     |
|    total_timesteps | 57649    |
| train/             |          |
|    actor_loss      | 1.9      |
|    critic_loss     | 0.00806  |
|    ent_coef        | 0.00481  |
|    ent_coef_loss   | 0.548    |
|    learning_rate   | 0.001    |
|    n_updates       | 57548    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.28    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1348     |
|    fps             | 36       |
|    time_elapsed    | 1563     |
|    total_timesteps | 57722    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00627  |
|    ent_coef        | 0.00473  |
|    ent_coef_loss   | 0.341    |
|    learning_rate   | 0.001    |
|    n_updates       | 57621    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.28    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1352     |
|    fps             | 36       |
|    time_elapsed    | 1565     |
|    total_timesteps | 57799    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.0137   |
|    ent_coef        | 0.00468  |
|    ent_coef_loss   | 0.704    |
|    learning_rate   | 0.001    |
|    n_updates       | 57698    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.31    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1356     |
|    fps             | 36       |
|    time_elapsed    | 1568     |
|    total_timesteps | 57898    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.00678  |
|    ent_coef        | 0.00451  |
|    ent_coef_loss   | 0.561    |
|    learning_rate   | 0.001    |
|    n_updates       | 57797    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.13    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1360     |
|    fps             | 36       |
|    time_elapsed    | 1570     |
|    total_timesteps | 57963    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00639  |
|    ent_coef        | 0.00462  |
|    ent_coef_loss   | -0.877   |
|    learning_rate   | 0.001    |
|    n_updates       | 57862    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.24    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1364     |
|    fps             | 36       |
|    time_elapsed    | 1573     |
|    total_timesteps | 58059    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00694  |
|    ent_coef        | 0.00468  |
|    ent_coef_loss   | -1.29    |
|    learning_rate   | 0.001    |
|    n_updates       | 57958    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.1     |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1368     |
|    fps             | 36       |
|    time_elapsed    | 1574     |
|    total_timesteps | 58125    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00793  |
|    ent_coef        | 0.00469  |
|    ent_coef_loss   | 1.29     |
|    learning_rate   | 0.001    |
|    n_updates       | 58024    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.06    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1372     |
|    fps             | 36       |
|    time_elapsed    | 1577     |
|    total_timesteps | 58209    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.0247   |
|    ent_coef        | 0.00474  |
|    ent_coef_loss   | -1.85    |
|    learning_rate   | 0.001    |
|    n_updates       | 58108    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -6.93    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1376     |
|    fps             | 36       |
|    time_elapsed    | 1578     |
|    total_timesteps | 58271    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00966  |
|    ent_coef        | 0.00471  |
|    ent_coef_loss   | -0.463   |
|    learning_rate   | 0.001    |
|    n_updates       | 58170    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -6.84    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1380     |
|    fps             | 36       |
|    time_elapsed    | 1580     |
|    total_timesteps | 58330    |
| train/             |          |
|    actor_loss      | 1.95     |
|    critic_loss     | 0.00862  |
|    ent_coef        | 0.00474  |
|    ent_coef_loss   | 0.0758   |
|    learning_rate   | 0.001    |
|    n_updates       | 58229    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -6.87    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1384     |
|    fps             | 36       |
|    time_elapsed    | 1583     |
|    total_timesteps | 58420    |
| train/             |          |
|    actor_loss      | 1.89     |
|    critic_loss     | 0.0117   |
|    ent_coef        | 0.00479  |
|    ent_coef_loss   | -0.0524  |
|    learning_rate   | 0.001    |
|    n_updates       | 58319    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -6.91    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1388     |
|    fps             | 36       |
|    time_elapsed    | 1586     |
|    total_timesteps | 58509    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.0217   |
|    ent_coef        | 0.00465  |
|    ent_coef_loss   | -0.228   |
|    learning_rate   | 0.001    |
|    n_updates       | 58408    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.04    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1392     |
|    fps             | 36       |
|    time_elapsed    | 1588     |
|    total_timesteps | 58597    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00592  |
|    ent_coef        | 0.00462  |
|    ent_coef_loss   | -0.491   |
|    learning_rate   | 0.001    |
|    n_updates       | 58496    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -6.97    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1396     |
|    fps             | 36       |
|    time_elapsed    | 1590     |
|    total_timesteps | 58680    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.0081   |
|    ent_coef        | 0.00478  |
|    ent_coef_loss   | -1.96    |
|    learning_rate   | 0.001    |
|    n_updates       | 58579    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -7.02    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1400     |
|    fps             | 36       |
|    time_elapsed    | 1593     |
|    total_timesteps | 58765    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00594  |
|    ent_coef        | 0.00483  |
|    ent_coef_loss   | -1.95    |
|    learning_rate   | 0.001    |
|    n_updates       | 58664    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -6.98    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1404     |
|    fps             | 36       |
|    time_elapsed    | 1595     |
|    total_timesteps | 58840    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00719  |
|    ent_coef        | 0.00478  |
|    ent_coef_loss   | 0.609    |
|    learning_rate   | 0.001    |
|    n_updates       | 58739    |
---------------------------------



---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.9     |
|    ep_rew_mean     | -7.93    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1408     |
|    fps             | 36       |
|    time_elapsed    | 1611     |
|    total_timesteps | 59389    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00922  |
|    ent_coef        | 0.00495  |
|    ent_coef_loss   | 0.584    |
|    learning_rate   | 0.001    |
|    n_updates       | 59288    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.9     |
|    ep_rew_mean     | -7.89    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1412     |
|    fps             | 36       |
|    time_elapsed    | 1614     |
|    total_timesteps | 59488    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.00907  |
|    ent_coef        | 0.0049   |
|    ent_coef_loss   | 0.688    |
|    learning_rate   | 0.001    |
|    n_updates       | 59387    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25       |
|    ep_rew_mean     | -7.94    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1416     |
|    fps             | 36       |
|    time_elapsed    | 1616     |
|    total_timesteps | 59570    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00575  |
|    ent_coef        | 0.00503  |
|    ent_coef_loss   | 0.593    |
|    learning_rate   | 0.001    |
|    n_updates       | 59469    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.2     |
|    ep_rew_mean     | -8.02    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1420     |
|    fps             | 36       |
|    time_elapsed    | 1619     |
|    total_timesteps | 59672    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.0232   |
|    ent_coef        | 0.00511  |
|    ent_coef_loss   | 1.81     |
|    learning_rate   | 0.001    |
|    n_updates       | 59571    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.5     |
|    ep_rew_mean     | -8.1     |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1424     |
|    fps             | 36       |
|    time_elapsed    | 1622     |
|    total_timesteps | 59764    |
| train/             |          |
|    actor_loss      | 1.94     |
|    critic_loss     | 0.00601  |
|    ent_coef        | 0.00512  |
|    ent_coef_loss   | -0.386   |
|    learning_rate   | 0.001    |
|    n_updates       | 59663    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.4     |
|    ep_rew_mean     | -8.11    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1428     |
|    fps             | 36       |
|    time_elapsed    | 1625     |
|    total_timesteps | 59853    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00649  |
|    ent_coef        | 0.00508  |
|    ent_coef_loss   | 0.476    |
|    learning_rate   | 0.001    |
|    n_updates       | 59752    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.3     |
|    ep_rew_mean     | -8.08    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1432     |
|    fps             | 36       |
|    time_elapsed    | 1626     |
|    total_timesteps | 59923    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.0138   |
|    ent_coef        | 0.00504  |
|    ent_coef_loss   | -1.24    |
|    learning_rate   | 0.001    |
|    n_updates       | 59822    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -7.98    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1436     |
|    fps             | 36       |
|    time_elapsed    | 1628     |
|    total_timesteps | 59990    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.0111   |
|    ent_coef        | 0.0051   |
|    ent_coef_loss   | 0.612    |
|    learning_rate   | 0.001    |
|    n_updates       | 59889    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.3     |
|    ep_rew_mean     | -7.97    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1440     |
|    fps             | 36       |
|    time_elapsed    | 1631     |
|    total_timesteps | 60102    |
| train/             |          |
|    actor_loss      | 1.86     |
|    critic_loss     | 0.00762  |
|    ent_coef        | 0.00516  |
|    ent_coef_loss   | 0.674    |
|    learning_rate   | 0.001    |
|    n_updates       | 60001    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.3     |
|    ep_rew_mean     | -8.01    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1444     |
|    fps             | 36       |
|    time_elapsed    | 1634     |
|    total_timesteps | 60178    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00876  |
|    ent_coef        | 0.00528  |
|    ent_coef_loss   | 0.103    |
|    learning_rate   | 0.001    |
|    n_updates       | 60077    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.3     |
|    ep_rew_mean     | -7.99    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1448     |
|    fps             | 36       |
|    time_elapsed    | 1636     |
|    total_timesteps | 60248    |
| train/             |          |
|    actor_loss      | 1.51     |
|    critic_loss     | 0.0146   |
|    ent_coef        | 0.00507  |
|    ent_coef_loss   | -1.09    |
|    learning_rate   | 0.001    |
|    n_updates       | 60147    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.3     |
|    ep_rew_mean     | -8.07    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1452     |
|    fps             | 36       |
|    time_elapsed    | 1639     |
|    total_timesteps | 60330    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.00653  |
|    ent_coef        | 0.00504  |
|    ent_coef_loss   | 0.653    |
|    learning_rate   | 0.001    |
|    n_updates       | 60229    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -7.99    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1456     |
|    fps             | 36       |
|    time_elapsed    | 1641     |
|    total_timesteps | 60409    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.00566  |
|    ent_coef        | 0.00513  |
|    ent_coef_loss   | -0.505   |
|    learning_rate   | 0.001    |
|    n_updates       | 60308    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.4     |
|    ep_rew_mean     | -8.18    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1460     |
|    fps             | 36       |
|    time_elapsed    | 1643     |
|    total_timesteps | 60502    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.0103   |
|    ent_coef        | 0.00501  |
|    ent_coef_loss   | 0.287    |
|    learning_rate   | 0.001    |
|    n_updates       | 60401    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -8.08    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1464     |
|    fps             | 36       |
|    time_elapsed    | 1645     |
|    total_timesteps | 60569    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.00556  |
|    ent_coef        | 0.00505  |
|    ent_coef_loss   | 0.383    |
|    learning_rate   | 0.001    |
|    n_updates       | 60468    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -8.11    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1468     |
|    fps             | 36       |
|    time_elapsed    | 1647     |
|    total_timesteps | 60637    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.00792  |
|    ent_coef        | 0.00503  |
|    ent_coef_loss   | 0.777    |
|    learning_rate   | 0.001    |
|    n_updates       | 60536    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -8.06    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1472     |
|    fps             | 36       |
|    time_elapsed    | 1650     |
|    total_timesteps | 60724    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0104   |
|    ent_coef        | 0.00499  |
|    ent_coef_loss   | 0.839    |
|    learning_rate   | 0.001    |
|    n_updates       | 60623    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -8.06    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1476     |
|    fps             | 36       |
|    time_elapsed    | 1652     |
|    total_timesteps | 60785    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.00709  |
|    ent_coef        | 0.00506  |
|    ent_coef_loss   | -0.183   |
|    learning_rate   | 0.001    |
|    n_updates       | 60684    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.3     |
|    ep_rew_mean     | -8.12    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1480     |
|    fps             | 36       |
|    time_elapsed    | 1654     |
|    total_timesteps | 60863    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.00701  |
|    ent_coef        | 0.00509  |
|    ent_coef_loss   | 1.18     |
|    learning_rate   | 0.001    |
|    n_updates       | 60762    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.2     |
|    ep_rew_mean     | -8.12    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1484     |
|    fps             | 36       |
|    time_elapsed    | 1656     |
|    total_timesteps | 60944    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.0138   |
|    ent_coef        | 0.005    |
|    ent_coef_loss   | -0.748   |
|    learning_rate   | 0.001    |
|    n_updates       | 60843    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -8       |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1488     |
|    fps             | 36       |
|    time_elapsed    | 1658     |
|    total_timesteps | 61020    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.034    |
|    ent_coef        | 0.00516  |
|    ent_coef_loss   | -0.894   |
|    learning_rate   | 0.001    |
|    n_updates       | 60919    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -7.89    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1492     |
|    fps             | 36       |
|    time_elapsed    | 1661     |
|    total_timesteps | 61104    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.022    |
|    ent_coef        | 0.00505  |
|    ent_coef_loss   | 0.334    |
|    learning_rate   | 0.001    |
|    n_updates       | 61003    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -7.96    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1496     |
|    fps             | 36       |
|    time_elapsed    | 1663     |
|    total_timesteps | 61186    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.00567  |
|    ent_coef        | 0.00494  |
|    ent_coef_loss   | 0.68     |
|    learning_rate   | 0.001    |
|    n_updates       | 61085    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 24.9     |
|    ep_rew_mean     | -7.87    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1500     |
|    fps             | 36       |
|    time_elapsed    | 1665     |
|    total_timesteps | 61259    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.00783  |
|    ent_coef        | 0.00507  |
|    ent_coef_loss   | 1.55     |
|    learning_rate   | 0.001    |
|    n_updates       | 61158    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 25.1     |
|    ep_rew_mean     | -8.01    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1504     |
|    fps             | 36       |
|    time_elapsed    | 1668     |
|    total_timesteps | 61353    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00692  |
|    ent_coef        | 0.00502  |
|    ent_coef_loss   | -1.13    |
|    learning_rate   | 0.001    |
|    n_updates       | 61252    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -7.09    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1508     |
|    fps             | 36       |
|    time_elapsed    | 1670     |
|    total_timesteps | 61432    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.00563  |
|    ent_coef        | 0.00484  |
|    ent_coef_loss   | 0.143    |
|    learning_rate   | 0.001    |
|    n_updates       | 61331    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -7.04    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1512     |
|    fps             | 36       |
|    time_elapsed    | 1673     |
|    total_timesteps | 61518    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0153   |
|    ent_coef        | 0.00496  |
|    ent_coef_loss   | 1.22     |
|    learning_rate   | 0.001    |
|    n_updates       | 61417    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.1     |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1516     |
|    fps             | 36       |
|    time_elapsed    | 1676     |
|    total_timesteps | 61634    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.0115   |
|    ent_coef        | 0.00511  |
|    ent_coef_loss   | 0.604    |
|    learning_rate   | 0.001    |
|    n_updates       | 61533    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.01    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1520     |
|    fps             | 36       |
|    time_elapsed    | 1679     |
|    total_timesteps | 61727    |
| train/             |          |
|    actor_loss      | 1.94     |
|    critic_loss     | 0.0104   |
|    ent_coef        | 0.00522  |
|    ent_coef_loss   | 1.97     |
|    learning_rate   | 0.001    |
|    n_updates       | 61626    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.16    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 1524     |
|    fps             | 36       |
|    time_elapsed    | 1681     |
|    total_timesteps | 61810    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.0203   |
|    ent_coef        | 0.00525  |
|    ent_coef_loss   | -0.885   |
|    learning_rate   | 0.001    |
|    n_updates       | 61709    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.23    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 1528     |
|    fps             | 36       |
|    time_elapsed    | 1684     |
|    total_timesteps | 61923    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.00815  |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | 0.241    |
|    learning_rate   | 0.001    |
|    n_updates       | 61822    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.38    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1532     |
|    fps             | 36       |
|    time_elapsed    | 1688     |
|    total_timesteps | 62008    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00735  |
|    ent_coef        | 0.00532  |
|    ent_coef_loss   | 0.919    |
|    learning_rate   | 0.001    |
|    n_updates       | 61907    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.46    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1536     |
|    fps             | 36       |
|    time_elapsed    | 1690     |
|    total_timesteps | 62082    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.00832  |
|    ent_coef        | 0.00543  |
|    ent_coef_loss   | -0.734   |
|    learning_rate   | 0.001    |
|    n_updates       | 61981    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.45    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1540     |
|    fps             | 36       |
|    time_elapsed    | 1692     |
|    total_timesteps | 62171    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.0149   |
|    ent_coef        | 0.00541  |
|    ent_coef_loss   | -0.311   |
|    learning_rate   | 0.001    |
|    n_updates       | 62070    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.43    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1544     |
|    fps             | 36       |
|    time_elapsed    | 1694     |
|    total_timesteps | 62256    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.0142   |
|    ent_coef        | 0.00544  |
|    ent_coef_loss   | -0.168   |
|    learning_rate   | 0.001    |
|    n_updates       | 62155    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.44    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1548     |
|    fps             | 36       |
|    time_elapsed    | 1697     |
|    total_timesteps | 62332    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.0153   |
|    ent_coef        | 0.00539  |
|    ent_coef_loss   | -0.593   |
|    learning_rate   | 0.001    |
|    n_updates       | 62231    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.55    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1552     |
|    fps             | 36       |
|    time_elapsed    | 1701     |
|    total_timesteps | 62461    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00685  |
|    ent_coef        | 0.00526  |
|    ent_coef_loss   | 0.13     |
|    learning_rate   | 0.001    |
|    n_updates       | 62360    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.66    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 1556     |
|    fps             | 36       |
|    time_elapsed    | 1704     |
|    total_timesteps | 62558    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.0278   |
|    ent_coef        | 0.00517  |
|    ent_coef_loss   | 0.262    |
|    learning_rate   | 0.001    |
|    n_updates       | 62457    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.49    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1560     |
|    fps             | 36       |
|    time_elapsed    | 1706     |
|    total_timesteps | 62637    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.00576  |
|    ent_coef        | 0.00522  |
|    ent_coef_loss   | -1.75    |
|    learning_rate   | 0.001    |
|    n_updates       | 62536    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.56    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1564     |
|    fps             | 36       |
|    time_elapsed    | 1709     |
|    total_timesteps | 62741    |
| train/             |          |
|    actor_loss      | 1.48     |
|    critic_loss     | 0.0379   |
|    ent_coef        | 0.00522  |
|    ent_coef_loss   | 0.0929   |
|    learning_rate   | 0.001    |
|    n_updates       | 62640    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.53    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1568     |
|    fps             | 36       |
|    time_elapsed    | 1711     |
|    total_timesteps | 62804    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.015    |
|    ent_coef        | 0.00523  |
|    ent_coef_loss   | 0.777    |
|    learning_rate   | 0.001    |
|    n_updates       | 62703    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.5     |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1572     |
|    fps             | 36       |
|    time_elapsed    | 1713     |
|    total_timesteps | 62872    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.0083   |
|    ent_coef        | 0.00528  |
|    ent_coef_loss   | -0.142   |
|    learning_rate   | 0.001    |
|    n_updates       | 62771    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.62    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1576     |
|    fps             | 36       |
|    time_elapsed    | 1715     |
|    total_timesteps | 62951    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.0414   |
|    ent_coef        | 0.00531  |
|    ent_coef_loss   | -0.0772  |
|    learning_rate   | 0.001    |
|    n_updates       | 62850    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.76    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 1580     |
|    fps             | 36       |
|    time_elapsed    | 1718     |
|    total_timesteps | 63053    |
| train/             |          |
|    actor_loss      | 1.57     |
|    critic_loss     | 0.00609  |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | 1.07     |
|    learning_rate   | 0.001    |
|    n_updates       | 62952    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.66    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 1584     |
|    fps             | 36       |
|    time_elapsed    | 1720     |
|    total_timesteps | 63113    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0102   |
|    ent_coef        | 0.00529  |
|    ent_coef_loss   | 1        |
|    learning_rate   | 0.001    |
|    n_updates       | 63012    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.72    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 1588     |
|    fps             | 36       |
|    time_elapsed    | 1722     |
|    total_timesteps | 63189    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.0156   |
|    ent_coef        | 0.00531  |
|    ent_coef_loss   | -0.397   |
|    learning_rate   | 0.001    |
|    n_updates       | 63088    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.71    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 1592     |
|    fps             | 36       |
|    time_elapsed    | 1724     |
|    total_timesteps | 63272    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.0133   |
|    ent_coef        | 0.00517  |
|    ent_coef_loss   | -0.546   |
|    learning_rate   | 0.001    |
|    n_updates       | 63171    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.62    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 1596     |
|    fps             | 36       |
|    time_elapsed    | 1727     |
|    total_timesteps | 63352    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00456  |
|    ent_coef        | 0.00506  |
|    ent_coef_loss   | -0.00686 |
|    learning_rate   | 0.001    |
|    n_updates       | 63251    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.7     |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 1600     |
|    fps             | 36       |
|    time_elapsed    | 1730     |
|    total_timesteps | 63439    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00649  |
|    ent_coef        | 0.00503  |
|    ent_coef_loss   | 0.262    |
|    learning_rate   | 0.001    |
|    n_updates       | 63338    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.67    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1604     |
|    fps             | 36       |
|    time_elapsed    | 1733     |
|    total_timesteps | 63545    |
| train/             |          |
|    actor_loss      | 1.5      |
|    critic_loss     | 0.0112   |
|    ent_coef        | 0.00506  |
|    ent_coef_loss   | -0.425   |
|    learning_rate   | 0.001    |
|    n_updates       | 63444    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.7     |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 1608     |
|    fps             | 36       |
|    time_elapsed    | 1735     |
|    total_timesteps | 63620    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00654  |
|    ent_coef        | 0.00509  |
|    ent_coef_loss   | 1.2      |
|    learning_rate   | 0.001    |
|    n_updates       | 63519    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.68    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 1612     |
|    fps             | 36       |
|    time_elapsed    | 1738     |
|    total_timesteps | 63699    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.0141   |
|    ent_coef        | 0.00508  |
|    ent_coef_loss   | 0.601    |
|    learning_rate   | 0.001    |
|    n_updates       | 63598    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 1616     |
|    fps             | 36       |
|    time_elapsed    | 1741     |
|    total_timesteps | 63810    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.0201   |
|    ent_coef        | 0.00511  |
|    ent_coef_loss   | -0.909   |
|    learning_rate   | 0.001    |
|    n_updates       | 63709    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.77    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 1620     |
|    fps             | 36       |
|    time_elapsed    | 1744     |
|    total_timesteps | 63893    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00706  |
|    ent_coef        | 0.00504  |
|    ent_coef_loss   | 1.18     |
|    learning_rate   | 0.001    |
|    n_updates       | 63792    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.61    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 1624     |
|    fps             | 36       |
|    time_elapsed    | 1746     |
|    total_timesteps | 63965    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.0274   |
|    ent_coef        | 0.00505  |
|    ent_coef_loss   | -0.43    |
|    learning_rate   | 0.001    |
|    n_updates       | 63864    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.37    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 1628     |
|    fps             | 36       |
|    time_elapsed    | 1748     |
|    total_timesteps | 64021    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.0118   |
|    ent_coef        | 0.00508  |
|    ent_coef_loss   | -0.157   |
|    learning_rate   | 0.001    |
|    n_updates       | 63920    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.24    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1632     |
|    fps             | 36       |
|    time_elapsed    | 1750     |
|    total_timesteps | 64097    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0081   |
|    ent_coef        | 0.00517  |
|    ent_coef_loss   | -0.376   |
|    learning_rate   | 0.001    |
|    n_updates       | 63996    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.34    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1636     |
|    fps             | 36       |
|    time_elapsed    | 1754     |
|    total_timesteps | 64205    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00623  |
|    ent_coef        | 0.00509  |
|    ent_coef_loss   | 0.131    |
|    learning_rate   | 0.001    |
|    n_updates       | 64104    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.44    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1640     |
|    fps             | 36       |
|    time_elapsed    | 1756     |
|    total_timesteps | 64308    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0059   |
|    ent_coef        | 0.00514  |
|    ent_coef_loss   | -1.53    |
|    learning_rate   | 0.001    |
|    n_updates       | 64207    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.54    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1644     |
|    fps             | 36       |
|    time_elapsed    | 1760     |
|    total_timesteps | 64450    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00861  |
|    ent_coef        | 0.00494  |
|    ent_coef_loss   | -0.716   |
|    learning_rate   | 0.001    |
|    n_updates       | 64349    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.55    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1648     |
|    fps             | 36       |
|    time_elapsed    | 1762     |
|    total_timesteps | 64527    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.00626  |
|    ent_coef        | 0.00493  |
|    ent_coef_loss   | 0.61     |
|    learning_rate   | 0.001    |
|    n_updates       | 64426    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.5     |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1652     |
|    fps             | 36       |
|    time_elapsed    | 1765     |
|    total_timesteps | 64619    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.00979  |
|    ent_coef        | 0.00501  |
|    ent_coef_loss   | 0.596    |
|    learning_rate   | 0.001    |
|    n_updates       | 64518    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.46    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1656     |
|    fps             | 36       |
|    time_elapsed    | 1768     |
|    total_timesteps | 64704    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00652  |
|    ent_coef        | 0.00508  |
|    ent_coef_loss   | 1.36     |
|    learning_rate   | 0.001    |
|    n_updates       | 64603    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.61    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 1660     |
|    fps             | 36       |
|    time_elapsed    | 1770     |
|    total_timesteps | 64791    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00665  |
|    ent_coef        | 0.00502  |
|    ent_coef_loss   | 1.95     |
|    learning_rate   | 0.001    |
|    n_updates       | 64690    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.53    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 1664     |
|    fps             | 36       |
|    time_elapsed    | 1773     |
|    total_timesteps | 64879    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.0307   |
|    ent_coef        | 0.00519  |
|    ent_coef_loss   | -0.373   |
|    learning_rate   | 0.001    |
|    n_updates       | 64778    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.72    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 1668     |
|    fps             | 36       |
|    time_elapsed    | 1776     |
|    total_timesteps | 65024    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.0179   |
|    ent_coef        | 0.00516  |
|    ent_coef_loss   | 0.885    |
|    learning_rate   | 0.001    |
|    n_updates       | 64923    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.79    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 1672     |
|    fps             | 36       |
|    time_elapsed    | 1779     |
|    total_timesteps | 65114    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.00573  |
|    ent_coef        | 0.00522  |
|    ent_coef_loss   | -0.0444  |
|    learning_rate   | 0.001    |
|    n_updates       | 65013    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -7.67    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1676     |
|    fps             | 36       |
|    time_elapsed    | 1781     |
|    total_timesteps | 65177    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.0103   |
|    ent_coef        | 0.00524  |
|    ent_coef_loss   | 0.274    |
|    learning_rate   | 0.001    |
|    n_updates       | 65076    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -7.72    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1680     |
|    fps             | 36       |
|    time_elapsed    | 1784     |
|    total_timesteps | 65282    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00675  |
|    ent_coef        | 0.00517  |
|    ent_coef_loss   | 1.27     |
|    learning_rate   | 0.001    |
|    n_updates       | 65181    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.6     |
|    ep_rew_mean     | -7.85    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1684     |
|    fps             | 36       |
|    time_elapsed    | 1787     |
|    total_timesteps | 65373    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.00615  |
|    ent_coef        | 0.00519  |
|    ent_coef_loss   | -0.96    |
|    learning_rate   | 0.001    |
|    n_updates       | 65272    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.73    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1688     |
|    fps             | 36       |
|    time_elapsed    | 1788     |
|    total_timesteps | 65433    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.00664  |
|    ent_coef        | 0.00508  |
|    ent_coef_loss   | 0.869    |
|    learning_rate   | 0.001    |
|    n_updates       | 65332    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.68    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 1692     |
|    fps             | 36       |
|    time_elapsed    | 1791     |
|    total_timesteps | 65511    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00624  |
|    ent_coef        | 0.00509  |
|    ent_coef_loss   | 1.94     |
|    learning_rate   | 0.001    |
|    n_updates       | 65410    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.6     |
|    ep_rew_mean     | -7.8     |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1696     |
|    fps             | 36       |
|    time_elapsed    | 1794     |
|    total_timesteps | 65613    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.0184   |
|    ent_coef        | 0.00519  |
|    ent_coef_loss   | -0.0452  |
|    learning_rate   | 0.001    |
|    n_updates       | 65512    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.7     |
|    ep_rew_mean     | -7.76    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1700     |
|    fps             | 36       |
|    time_elapsed    | 1797     |
|    total_timesteps | 65709    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00538  |
|    ent_coef        | 0.00524  |
|    ent_coef_loss   | -0.518   |
|    learning_rate   | 0.001    |
|    n_updates       | 65608    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.5     |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 1704     |
|    fps             | 36       |
|    time_elapsed    | 1799     |
|    total_timesteps | 65795    |
| train/             |          |
|    actor_loss      | 1.57     |
|    critic_loss     | 0.00614  |
|    ent_coef        | 0.00514  |
|    ent_coef_loss   | -0.965   |
|    learning_rate   | 0.001    |
|    n_updates       | 65694    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.6     |
|    ep_rew_mean     | -7.63    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 1708     |
|    fps             | 36       |
|    time_elapsed    | 1802     |
|    total_timesteps | 65881    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00554  |
|    ent_coef        | 0.00519  |
|    ent_coef_loss   | 0.251    |
|    learning_rate   | 0.001    |
|    n_updates       | 65780    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.5     |
|    ep_rew_mean     | -7.59    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 1712     |
|    fps             | 36       |
|    time_elapsed    | 1804     |
|    total_timesteps | 65952    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00587  |
|    ent_coef        | 0.00518  |
|    ent_coef_loss   | -0.15    |
|    learning_rate   | 0.001    |
|    n_updates       | 65851    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.48    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 1716     |
|    fps             | 36       |
|    time_elapsed    | 1807     |
|    total_timesteps | 66034    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.0143   |
|    ent_coef        | 0.00516  |
|    ent_coef_loss   | -1.18    |
|    learning_rate   | 0.001    |
|    n_updates       | 65933    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.38    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 1720     |
|    fps             | 36       |
|    time_elapsed    | 1809     |
|    total_timesteps | 66116    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.018    |
|    ent_coef        | 0.00509  |
|    ent_coef_loss   | 0.927    |
|    learning_rate   | 0.001    |
|    n_updates       | 66015    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -7.35    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1724     |
|    fps             | 36       |
|    time_elapsed    | 1811     |
|    total_timesteps | 66194    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00728  |
|    ent_coef        | 0.0051   |
|    ent_coef_loss   | -0.105   |
|    learning_rate   | 0.001    |
|    n_updates       | 66093    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.5     |
|    ep_rew_mean     | -7.46    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1728     |
|    fps             | 36       |
|    time_elapsed    | 1813     |
|    total_timesteps | 66270    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.0231   |
|    ent_coef        | 0.00499  |
|    ent_coef_loss   | -0.855   |
|    learning_rate   | 0.001    |
|    n_updates       | 66169    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.43    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1732     |
|    fps             | 36       |
|    time_elapsed    | 1816     |
|    total_timesteps | 66340    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.00736  |
|    ent_coef        | 0.00494  |
|    ent_coef_loss   | -0.379   |
|    learning_rate   | 0.001    |
|    n_updates       | 66239    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22       |
|    ep_rew_mean     | -7.19    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1736     |
|    fps             | 36       |
|    time_elapsed    | 1818     |
|    total_timesteps | 66408    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00772  |
|    ent_coef        | 0.00486  |
|    ent_coef_loss   | 0.33     |
|    learning_rate   | 0.001    |
|    n_updates       | 66307    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.15    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1740     |
|    fps             | 36       |
|    time_elapsed    | 1820     |
|    total_timesteps | 66497    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.00909  |
|    ent_coef        | 0.00514  |
|    ent_coef_loss   | 0.0159   |
|    learning_rate   | 0.001    |
|    n_updates       | 66396    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.03    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1744     |
|    fps             | 36       |
|    time_elapsed    | 1822     |
|    total_timesteps | 66576    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.011    |
|    ent_coef        | 0.00512  |
|    ent_coef_loss   | 0.499    |
|    learning_rate   | 0.001    |
|    n_updates       | 66475    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.07    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1748     |
|    fps             | 36       |
|    time_elapsed    | 1825     |
|    total_timesteps | 66663    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.00648  |
|    ent_coef        | 0.00513  |
|    ent_coef_loss   | 0.423    |
|    learning_rate   | 0.001    |
|    n_updates       | 66562    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -6.94    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1752     |
|    fps             | 36       |
|    time_elapsed    | 1827     |
|    total_timesteps | 66733    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.0167   |
|    ent_coef        | 0.00512  |
|    ent_coef_loss   | 0.616    |
|    learning_rate   | 0.001    |
|    n_updates       | 66632    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -6.93    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1756     |
|    fps             | 36       |
|    time_elapsed    | 1830     |
|    total_timesteps | 66817    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.0309   |
|    ent_coef        | 0.00502  |
|    ent_coef_loss   | -0.713   |
|    learning_rate   | 0.001    |
|    n_updates       | 66716    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -6.8     |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1760     |
|    fps             | 36       |
|    time_elapsed    | 1832     |
|    total_timesteps | 66889    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.0101   |
|    ent_coef        | 0.00511  |
|    ent_coef_loss   | 0.928    |
|    learning_rate   | 0.001    |
|    n_updates       | 66788    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -6.84    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1764     |
|    fps             | 36       |
|    time_elapsed    | 1834     |
|    total_timesteps | 66973    |
| train/             |          |
|    actor_loss      | 1.53     |
|    critic_loss     | 0.0324   |
|    ent_coef        | 0.00515  |
|    ent_coef_loss   | -1.45    |
|    learning_rate   | 0.001    |
|    n_updates       | 66872    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -6.7     |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1768     |
|    fps             | 36       |
|    time_elapsed    | 1836     |
|    total_timesteps | 67052    |
| train/             |          |
|    actor_loss      | 1.56     |
|    critic_loss     | 0.00504  |
|    ent_coef        | 0.00512  |
|    ent_coef_loss   | -0.158   |
|    learning_rate   | 0.001    |
|    n_updates       | 66951    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -6.65    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1772     |
|    fps             | 36       |
|    time_elapsed    | 1838     |
|    total_timesteps | 67119    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00733  |
|    ent_coef        | 0.00513  |
|    ent_coef_loss   | 0.111    |
|    learning_rate   | 0.001    |
|    n_updates       | 67018    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -6.79    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1776     |
|    fps             | 36       |
|    time_elapsed    | 1841     |
|    total_timesteps | 67211    |
| train/             |          |
|    actor_loss      | 1.57     |
|    critic_loss     | 0.0124   |
|    ent_coef        | 0.00516  |
|    ent_coef_loss   | 0.277    |
|    learning_rate   | 0.001    |
|    n_updates       | 67110    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -6.74    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1780     |
|    fps             | 36       |
|    time_elapsed    | 1844     |
|    total_timesteps | 67302    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.0108   |
|    ent_coef        | 0.00517  |
|    ent_coef_loss   | -1.24    |
|    learning_rate   | 0.001    |
|    n_updates       | 67201    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -6.72    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1784     |
|    fps             | 36       |
|    time_elapsed    | 1847     |
|    total_timesteps | 67399    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.0057   |
|    ent_coef        | 0.00516  |
|    ent_coef_loss   | 1.7      |
|    learning_rate   | 0.001    |
|    n_updates       | 67298    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -6.69    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1788     |
|    fps             | 36       |
|    time_elapsed    | 1848     |
|    total_timesteps | 67460    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0106   |
|    ent_coef        | 0.00513  |
|    ent_coef_loss   | 1.22     |
|    learning_rate   | 0.001    |
|    n_updates       | 67359    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -6.8     |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1792     |
|    fps             | 36       |
|    time_elapsed    | 1851     |
|    total_timesteps | 67534    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.0124   |
|    ent_coef        | 0.00516  |
|    ent_coef_loss   | -0.786   |
|    learning_rate   | 0.001    |
|    n_updates       | 67433    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -6.82    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1796     |
|    fps             | 36       |
|    time_elapsed    | 1854     |
|    total_timesteps | 67649    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.00545  |
|    ent_coef        | 0.00507  |
|    ent_coef_loss   | 0.372    |
|    learning_rate   | 0.001    |
|    n_updates       | 67548    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -6.77    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1800     |
|    fps             | 36       |
|    time_elapsed    | 1856     |
|    total_timesteps | 67722    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.013    |
|    ent_coef        | 0.00515  |
|    ent_coef_loss   | 1.93     |
|    learning_rate   | 0.001    |
|    n_updates       | 67621    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.9     |
|    ep_rew_mean     | -6.73    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1804     |
|    fps             | 36       |
|    time_elapsed    | 1858     |
|    total_timesteps | 67790    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.00695  |
|    ent_coef        | 0.00521  |
|    ent_coef_loss   | -0.548   |
|    learning_rate   | 0.001    |
|    n_updates       | 67689    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.8     |
|    ep_rew_mean     | -6.73    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1808     |
|    fps             | 36       |
|    time_elapsed    | 1860     |
|    total_timesteps | 67863    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.0159   |
|    ent_coef        | 0.00521  |
|    ent_coef_loss   | -0.0721  |
|    learning_rate   | 0.001    |
|    n_updates       | 67762    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.9     |
|    ep_rew_mean     | -6.75    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1812     |
|    fps             | 36       |
|    time_elapsed    | 1862     |
|    total_timesteps | 67937    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0227   |
|    ent_coef        | 0.00533  |
|    ent_coef_loss   | -0.278   |
|    learning_rate   | 0.001    |
|    n_updates       | 67836    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.7     |
|    ep_rew_mean     | -6.71    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1816     |
|    fps             | 36       |
|    time_elapsed    | 1864     |
|    total_timesteps | 68001    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.0127   |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | 0.283    |
|    learning_rate   | 0.001    |
|    n_updates       | 67900    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20       |
|    ep_rew_mean     | -6.79    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1820     |
|    fps             | 36       |
|    time_elapsed    | 1868     |
|    total_timesteps | 68116    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.0129   |
|    ent_coef        | 0.00526  |
|    ent_coef_loss   | 0.0391   |
|    learning_rate   | 0.001    |
|    n_updates       | 68015    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.9     |
|    ep_rew_mean     | -6.76    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1824     |
|    fps             | 36       |
|    time_elapsed    | 1870     |
|    total_timesteps | 68181    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0111   |
|    ent_coef        | 0.00515  |
|    ent_coef_loss   | 0.743    |
|    learning_rate   | 0.001    |
|    n_updates       | 68080    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -6.8     |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1828     |
|    fps             | 36       |
|    time_elapsed    | 1872     |
|    total_timesteps | 68275    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.02     |
|    ent_coef        | 0.00527  |
|    ent_coef_loss   | 0.502    |
|    learning_rate   | 0.001    |
|    n_updates       | 68174    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -6.94    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1832     |
|    fps             | 36       |
|    time_elapsed    | 1875     |
|    total_timesteps | 68375    |
| train/             |          |
|    actor_loss      | 1.56     |
|    critic_loss     | 0.00723  |
|    ent_coef        | 0.00529  |
|    ent_coef_loss   | -0.96    |
|    learning_rate   | 0.001    |
|    n_updates       | 68274    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.11    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1836     |
|    fps             | 36       |
|    time_elapsed    | 1877     |
|    total_timesteps | 68464    |
| train/             |          |
|    actor_loss      | 1.58     |
|    critic_loss     | 0.0242   |
|    ent_coef        | 0.00518  |
|    ent_coef_loss   | -0.704   |
|    learning_rate   | 0.001    |
|    n_updates       | 68363    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.04    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1840     |
|    fps             | 36       |
|    time_elapsed    | 1880     |
|    total_timesteps | 68544    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0129   |
|    ent_coef        | 0.00524  |
|    ent_coef_loss   | 1.26     |
|    learning_rate   | 0.001    |
|    n_updates       | 68443    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.08    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1844     |
|    fps             | 36       |
|    time_elapsed    | 1883     |
|    total_timesteps | 68638    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.00575  |
|    ent_coef        | 0.00514  |
|    ent_coef_loss   | 0.213    |
|    learning_rate   | 0.001    |
|    n_updates       | 68537    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.11    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1848     |
|    fps             | 36       |
|    time_elapsed    | 1886     |
|    total_timesteps | 68727    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.00568  |
|    ent_coef        | 0.00521  |
|    ent_coef_loss   | 1.62     |
|    learning_rate   | 0.001    |
|    n_updates       | 68626    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.18    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1852     |
|    fps             | 36       |
|    time_elapsed    | 1888     |
|    total_timesteps | 68819    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00618  |
|    ent_coef        | 0.00529  |
|    ent_coef_loss   | 0.266    |
|    learning_rate   | 0.001    |
|    n_updates       | 68718    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.26    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1856     |
|    fps             | 36       |
|    time_elapsed    | 1891     |
|    total_timesteps | 68921    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00544  |
|    ent_coef        | 0.0054   |
|    ent_coef_loss   | -0.175   |
|    learning_rate   | 0.001    |
|    n_updates       | 68820    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.35    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1860     |
|    fps             | 36       |
|    time_elapsed    | 1894     |
|    total_timesteps | 69006    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00786  |
|    ent_coef        | 0.00523  |
|    ent_coef_loss   | -0.287   |
|    learning_rate   | 0.001    |
|    n_updates       | 68905    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.37    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1864     |
|    fps             | 36       |
|    time_elapsed    | 1897     |
|    total_timesteps | 69108    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.0164   |
|    ent_coef        | 0.00538  |
|    ent_coef_loss   | -1.94    |
|    learning_rate   | 0.001    |
|    n_updates       | 69007    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.33    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1868     |
|    fps             | 36       |
|    time_elapsed    | 1899     |
|    total_timesteps | 69171    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.0105   |
|    ent_coef        | 0.00532  |
|    ent_coef_loss   | -1.66    |
|    learning_rate   | 0.001    |
|    n_updates       | 69070    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.41    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1872     |
|    fps             | 36       |
|    time_elapsed    | 1901     |
|    total_timesteps | 69276    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.00744  |
|    ent_coef        | 0.00543  |
|    ent_coef_loss   | -0.433   |
|    learning_rate   | 0.001    |
|    n_updates       | 69175    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.33    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1876     |
|    fps             | 36       |
|    time_elapsed    | 1903     |
|    total_timesteps | 69346    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.00717  |
|    ent_coef        | 0.00549  |
|    ent_coef_loss   | 0.624    |
|    learning_rate   | 0.001    |
|    n_updates       | 69245    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.22    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 1880     |
|    fps             | 36       |
|    time_elapsed    | 1906     |
|    total_timesteps | 69429    |
| train/             |          |
|    actor_loss      | 1.84     |
|    critic_loss     | 0.0119   |
|    ent_coef        | 0.00549  |
|    ent_coef_loss   | -1.36    |
|    learning_rate   | 0.001    |
|    n_updates       | 69328    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.21    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1884     |
|    fps             | 36       |
|    time_elapsed    | 1909     |
|    total_timesteps | 69499    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00689  |
|    ent_coef        | 0.00541  |
|    ent_coef_loss   | -0.71    |
|    learning_rate   | 0.001    |
|    n_updates       | 69398    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.35    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1888     |
|    fps             | 36       |
|    time_elapsed    | 1911     |
|    total_timesteps | 69572    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.0428   |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | 1.25     |
|    learning_rate   | 0.001    |
|    n_updates       | 69471    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.3     |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1892     |
|    fps             | 36       |
|    time_elapsed    | 1913     |
|    total_timesteps | 69645    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0146   |
|    ent_coef        | 0.0052   |
|    ent_coef_loss   | 1.14     |
|    learning_rate   | 0.001    |
|    n_updates       | 69544    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.21    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1896     |
|    fps             | 36       |
|    time_elapsed    | 1915     |
|    total_timesteps | 69729    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0124   |
|    ent_coef        | 0.00526  |
|    ent_coef_loss   | -0.126   |
|    learning_rate   | 0.001    |
|    n_updates       | 69628    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.34    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1900     |
|    fps             | 36       |
|    time_elapsed    | 1918     |
|    total_timesteps | 69838    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00842  |
|    ent_coef        | 0.00522  |
|    ent_coef_loss   | -0.875   |
|    learning_rate   | 0.001    |
|    n_updates       | 69737    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.4     |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1904     |
|    fps             | 36       |
|    time_elapsed    | 1921     |
|    total_timesteps | 69914    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.0148   |
|    ent_coef        | 0.00516  |
|    ent_coef_loss   | 0.36     |
|    learning_rate   | 0.001    |
|    n_updates       | 69813    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.34    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1908     |
|    fps             | 36       |
|    time_elapsed    | 1923     |
|    total_timesteps | 69998    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.0569   |
|    ent_coef        | 0.00528  |
|    ent_coef_loss   | -0.924   |
|    learning_rate   | 0.001    |
|    n_updates       | 69897    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.39    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1912     |
|    fps             | 36       |
|    time_elapsed    | 1926     |
|    total_timesteps | 70083    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.0121   |
|    ent_coef        | 0.00543  |
|    ent_coef_loss   | -0.998   |
|    learning_rate   | 0.001    |
|    n_updates       | 69982    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.4     |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1916     |
|    fps             | 36       |
|    time_elapsed    | 1927     |
|    total_timesteps | 70155    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.00409  |
|    ent_coef        | 0.00534  |
|    ent_coef_loss   | 0.421    |
|    learning_rate   | 0.001    |
|    n_updates       | 70054    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.31    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1920     |
|    fps             | 36       |
|    time_elapsed    | 1929     |
|    total_timesteps | 70226    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0138   |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | 1.46     |
|    learning_rate   | 0.001    |
|    n_updates       | 70125    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.28    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1924     |
|    fps             | 36       |
|    time_elapsed    | 1931     |
|    total_timesteps | 70288    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.011    |
|    ent_coef        | 0.00526  |
|    ent_coef_loss   | 0.61     |
|    learning_rate   | 0.001    |
|    n_updates       | 70187    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.16    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1928     |
|    fps             | 36       |
|    time_elapsed    | 1933     |
|    total_timesteps | 70347    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.0148   |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | 0.101    |
|    learning_rate   | 0.001    |
|    n_updates       | 70246    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.09    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1932     |
|    fps             | 36       |
|    time_elapsed    | 1936     |
|    total_timesteps | 70424    |
| train/             |          |
|    actor_loss      | 1.56     |
|    critic_loss     | 0.016    |
|    ent_coef        | 0.00523  |
|    ent_coef_loss   | 1.01     |
|    learning_rate   | 0.001    |
|    n_updates       | 70323    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.08    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1936     |
|    fps             | 36       |
|    time_elapsed    | 1939     |
|    total_timesteps | 70538    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.00928  |
|    ent_coef        | 0.0054   |
|    ent_coef_loss   | -1.81    |
|    learning_rate   | 0.001    |
|    n_updates       | 70437    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.06    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1940     |
|    fps             | 36       |
|    time_elapsed    | 1941     |
|    total_timesteps | 70607    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0132   |
|    ent_coef        | 0.00534  |
|    ent_coef_loss   | -1.07    |
|    learning_rate   | 0.001    |
|    n_updates       | 70506    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.11    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1944     |
|    fps             | 36       |
|    time_elapsed    | 1943     |
|    total_timesteps | 70697    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.0144   |
|    ent_coef        | 0.00532  |
|    ent_coef_loss   | 0.0378   |
|    learning_rate   | 0.001    |
|    n_updates       | 70596    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.12    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1948     |
|    fps             | 36       |
|    time_elapsed    | 1946     |
|    total_timesteps | 70790    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.0106   |
|    ent_coef        | 0.00527  |
|    ent_coef_loss   | -0.373   |
|    learning_rate   | 0.001    |
|    n_updates       | 70689    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.13    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1952     |
|    fps             | 36       |
|    time_elapsed    | 1949     |
|    total_timesteps | 70886    |
| train/             |          |
|    actor_loss      | 1.55     |
|    critic_loss     | 0.0245   |
|    ent_coef        | 0.00517  |
|    ent_coef_loss   | -0.873   |
|    learning_rate   | 0.001    |
|    n_updates       | 70785    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -7.03    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1956     |
|    fps             | 36       |
|    time_elapsed    | 1951     |
|    total_timesteps | 70962    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.0176   |
|    ent_coef        | 0.00511  |
|    ent_coef_loss   | 1.66     |
|    learning_rate   | 0.001    |
|    n_updates       | 70861    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -6.99    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1960     |
|    fps             | 36       |
|    time_elapsed    | 1954     |
|    total_timesteps | 71052    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.0102   |
|    ent_coef        | 0.00521  |
|    ent_coef_loss   | -0.625   |
|    learning_rate   | 0.001    |
|    n_updates       | 70951    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -6.94    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 1964     |
|    fps             | 36       |
|    time_elapsed    | 1956     |
|    total_timesteps | 71141    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00774  |
|    ent_coef        | 0.00515  |
|    ent_coef_loss   | 0.135    |
|    learning_rate   | 0.001    |
|    n_updates       | 71040    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.12    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1968     |
|    fps             | 36       |
|    time_elapsed    | 1960     |
|    total_timesteps | 71255    |
| train/             |          |
|    actor_loss      | 1.48     |
|    critic_loss     | 0.0102   |
|    ent_coef        | 0.00521  |
|    ent_coef_loss   | -1.25    |
|    learning_rate   | 0.001    |
|    n_updates       | 71154    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.15    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1972     |
|    fps             | 36       |
|    time_elapsed    | 1963     |
|    total_timesteps | 71353    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00532  |
|    ent_coef        | 0.0052   |
|    ent_coef_loss   | 0.122    |
|    learning_rate   | 0.001    |
|    n_updates       | 71252    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.21    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1976     |
|    fps             | 36       |
|    time_elapsed    | 1966     |
|    total_timesteps | 71442    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00502  |
|    ent_coef        | 0.00519  |
|    ent_coef_loss   | -0.75    |
|    learning_rate   | 0.001    |
|    n_updates       | 71341    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.21    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 1980     |
|    fps             | 36       |
|    time_elapsed    | 1968     |
|    total_timesteps | 71529    |
| train/             |          |
|    actor_loss      | 1.55     |
|    critic_loss     | 0.00615  |
|    ent_coef        | 0.00511  |
|    ent_coef_loss   | -1.13    |
|    learning_rate   | 0.001    |
|    n_updates       | 71428    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.17    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1984     |
|    fps             | 36       |
|    time_elapsed    | 1970     |
|    total_timesteps | 71613    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0284   |
|    ent_coef        | 0.00502  |
|    ent_coef_loss   | 0.177    |
|    learning_rate   | 0.001    |
|    n_updates       | 71512    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.19    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1988     |
|    fps             | 36       |
|    time_elapsed    | 1973     |
|    total_timesteps | 71703    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.00567  |
|    ent_coef        | 0.00522  |
|    ent_coef_loss   | 0.0847   |
|    learning_rate   | 0.001    |
|    n_updates       | 71602    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.24    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1992     |
|    fps             | 36       |
|    time_elapsed    | 1976     |
|    total_timesteps | 71797    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.00619  |
|    ent_coef        | 0.00532  |
|    ent_coef_loss   | -0.925   |
|    learning_rate   | 0.001    |
|    n_updates       | 71696    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.22    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 1996     |
|    fps             | 36       |
|    time_elapsed    | 1979     |
|    total_timesteps | 71879    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0143   |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | 0.547    |
|    learning_rate   | 0.001    |
|    n_updates       | 71778    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.14    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2000     |
|    fps             | 36       |
|    time_elapsed    | 1981     |
|    total_timesteps | 71957    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.0291   |
|    ent_coef        | 0.00536  |
|    ent_coef_loss   | -0.914   |
|    learning_rate   | 0.001    |
|    n_updates       | 71856    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.07    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2004     |
|    fps             | 36       |
|    time_elapsed    | 1983     |
|    total_timesteps | 72038    |
| train/             |          |
|    actor_loss      | 1.57     |
|    critic_loss     | 0.021    |
|    ent_coef        | 0.00537  |
|    ent_coef_loss   | -0.42    |
|    learning_rate   | 0.001    |
|    n_updates       | 71937    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.2     |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2008     |
|    fps             | 36       |
|    time_elapsed    | 1986     |
|    total_timesteps | 72125    |
| train/             |          |
|    actor_loss      | 1.52     |
|    critic_loss     | 0.0333   |
|    ent_coef        | 0.00544  |
|    ent_coef_loss   | -1.45    |
|    learning_rate   | 0.001    |
|    n_updates       | 72024    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.31    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 2012     |
|    fps             | 36       |
|    time_elapsed    | 1990     |
|    total_timesteps | 72234    |
| train/             |          |
|    actor_loss      | 1.58     |
|    critic_loss     | 0.0194   |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | -0.978   |
|    learning_rate   | 0.001    |
|    n_updates       | 72133    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.32    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 2016     |
|    fps             | 36       |
|    time_elapsed    | 1991     |
|    total_timesteps | 72301    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.0074   |
|    ent_coef        | 0.00543  |
|    ent_coef_loss   | -0.518   |
|    learning_rate   | 0.001    |
|    n_updates       | 72200    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.35    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 2020     |
|    fps             | 36       |
|    time_elapsed    | 1994     |
|    total_timesteps | 72384    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00466  |
|    ent_coef        | 0.00538  |
|    ent_coef_loss   | 0.481    |
|    learning_rate   | 0.001    |
|    n_updates       | 72283    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.47    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 2024     |
|    fps             | 36       |
|    time_elapsed    | 1996     |
|    total_timesteps | 72467    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.00946  |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | -0.685   |
|    learning_rate   | 0.001    |
|    n_updates       | 72366    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22       |
|    ep_rew_mean     | -7.6     |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2028     |
|    fps             | 36       |
|    time_elapsed    | 1998     |
|    total_timesteps | 72548    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.0073   |
|    ent_coef        | 0.00531  |
|    ent_coef_loss   | -0.322   |
|    learning_rate   | 0.001    |
|    n_updates       | 72447    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -7.65    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2032     |
|    fps             | 36       |
|    time_elapsed    | 2002     |
|    total_timesteps | 72650    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0084   |
|    ent_coef        | 0.00554  |
|    ent_coef_loss   | -0.244   |
|    learning_rate   | 0.001    |
|    n_updates       | 72549    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.56    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2036     |
|    fps             | 36       |
|    time_elapsed    | 2004     |
|    total_timesteps | 72729    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.00837  |
|    ent_coef        | 0.00552  |
|    ent_coef_loss   | 0.838    |
|    learning_rate   | 0.001    |
|    n_updates       | 72628    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.47    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 2040     |
|    fps             | 36       |
|    time_elapsed    | 2006     |
|    total_timesteps | 72799    |
| train/             |          |
|    actor_loss      | 1.5      |
|    critic_loss     | 0.0057   |
|    ent_coef        | 0.00549  |
|    ent_coef_loss   | -0.0929  |
|    learning_rate   | 0.001    |
|    n_updates       | 72698    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22       |
|    ep_rew_mean     | -7.54    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 2044     |
|    fps             | 36       |
|    time_elapsed    | 2009     |
|    total_timesteps | 72900    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.00877  |
|    ent_coef        | 0.00555  |
|    ent_coef_loss   | 1.54     |
|    learning_rate   | 0.001    |
|    n_updates       | 72799    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.59    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2048     |
|    fps             | 36       |
|    time_elapsed    | 2012     |
|    total_timesteps | 73011    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0335   |
|    ent_coef        | 0.00543  |
|    ent_coef_loss   | -1.1     |
|    learning_rate   | 0.001    |
|    n_updates       | 72910    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.1     |
|    ep_rew_mean     | -7.63    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2052     |
|    fps             | 36       |
|    time_elapsed    | 2015     |
|    total_timesteps | 73098    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.00626  |
|    ent_coef        | 0.0053   |
|    ent_coef_loss   | 0.925    |
|    learning_rate   | 0.001    |
|    n_updates       | 72997    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.63    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2056     |
|    fps             | 36       |
|    time_elapsed    | 2017     |
|    total_timesteps | 73181    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00901  |
|    ent_coef        | 0.00536  |
|    ent_coef_loss   | 0.275    |
|    learning_rate   | 0.001    |
|    n_updates       | 73080    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.1     |
|    ep_rew_mean     | -7.68    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 2060     |
|    fps             | 36       |
|    time_elapsed    | 2020     |
|    total_timesteps | 73263    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.0162   |
|    ent_coef        | 0.00539  |
|    ent_coef_loss   | 0.0684   |
|    learning_rate   | 0.001    |
|    n_updates       | 73162    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.8     |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2064     |
|    fps             | 36       |
|    time_elapsed    | 2022     |
|    total_timesteps | 73363    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.0101   |
|    ent_coef        | 0.00543  |
|    ent_coef_loss   | -0.61    |
|    learning_rate   | 0.001    |
|    n_updates       | 73262    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.1     |
|    ep_rew_mean     | -7.79    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2068     |
|    fps             | 36       |
|    time_elapsed    | 2026     |
|    total_timesteps | 73463    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00811  |
|    ent_coef        | 0.0054   |
|    ent_coef_loss   | -0.615   |
|    learning_rate   | 0.001    |
|    n_updates       | 73362    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.74    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 2072     |
|    fps             | 36       |
|    time_elapsed    | 2028     |
|    total_timesteps | 73547    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.0122   |
|    ent_coef        | 0.0054   |
|    ent_coef_loss   | 1.26     |
|    learning_rate   | 0.001    |
|    n_updates       | 73446    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22       |
|    ep_rew_mean     | -7.76    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 2076     |
|    fps             | 36       |
|    time_elapsed    | 2031     |
|    total_timesteps | 73643    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.0115   |
|    ent_coef        | 0.00548  |
|    ent_coef_loss   | -1.01    |
|    learning_rate   | 0.001    |
|    n_updates       | 73542    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.91    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2080     |
|    fps             | 36       |
|    time_elapsed    | 2034     |
|    total_timesteps | 73752    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.0147   |
|    ent_coef        | 0.00537  |
|    ent_coef_loss   | -0.768   |
|    learning_rate   | 0.001    |
|    n_updates       | 73651    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.95    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2084     |
|    fps             | 36       |
|    time_elapsed    | 2036     |
|    total_timesteps | 73833    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0167   |
|    ent_coef        | 0.00534  |
|    ent_coef_loss   | 0.262    |
|    learning_rate   | 0.001    |
|    n_updates       | 73732    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -8.01    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2088     |
|    fps             | 36       |
|    time_elapsed    | 2039     |
|    total_timesteps | 73920    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0125   |
|    ent_coef        | 0.00529  |
|    ent_coef_loss   | -0.264   |
|    learning_rate   | 0.001    |
|    n_updates       | 73819    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -8.13    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2092     |
|    fps             | 36       |
|    time_elapsed    | 2043     |
|    total_timesteps | 74035    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.0109   |
|    ent_coef        | 0.00522  |
|    ent_coef_loss   | 0.606    |
|    learning_rate   | 0.001    |
|    n_updates       | 73934    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -8.15    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 2096     |
|    fps             | 36       |
|    time_elapsed    | 2045     |
|    total_timesteps | 74107    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00477  |
|    ent_coef        | 0.00527  |
|    ent_coef_loss   | -0.858   |
|    learning_rate   | 0.001    |
|    n_updates       | 74006    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -8.08    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2100     |
|    fps             | 36       |
|    time_elapsed    | 2047     |
|    total_timesteps | 74176    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.00478  |
|    ent_coef        | 0.00529  |
|    ent_coef_loss   | -1.72    |
|    learning_rate   | 0.001    |
|    n_updates       | 74075    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -8.2     |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 2104     |
|    fps             | 36       |
|    time_elapsed    | 2049     |
|    total_timesteps | 74257    |
| train/             |          |
|    actor_loss      | 1.55     |
|    critic_loss     | 0.0229   |
|    ent_coef        | 0.00538  |
|    ent_coef_loss   | -0.412   |
|    learning_rate   | 0.001    |
|    n_updates       | 74156    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -8.15    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 2108     |
|    fps             | 36       |
|    time_elapsed    | 2053     |
|    total_timesteps | 74356    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.0164   |
|    ent_coef        | 0.0054   |
|    ent_coef_loss   | 1.14     |
|    learning_rate   | 0.001    |
|    n_updates       | 74255    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.1     |
|    ep_rew_mean     | -8.04    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2112     |
|    fps             | 36       |
|    time_elapsed    | 2055     |
|    total_timesteps | 74444    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00766  |
|    ent_coef        | 0.00548  |
|    ent_coef_loss   | -0.289   |
|    learning_rate   | 0.001    |
|    n_updates       | 74343    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.1     |
|    ep_rew_mean     | -8.05    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2116     |
|    fps             | 36       |
|    time_elapsed    | 2057     |
|    total_timesteps | 74516    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.0294   |
|    ent_coef        | 0.00533  |
|    ent_coef_loss   | 0.784    |
|    learning_rate   | 0.001    |
|    n_updates       | 74415    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.1     |
|    ep_rew_mean     | -8.02    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2120     |
|    fps             | 36       |
|    time_elapsed    | 2060     |
|    total_timesteps | 74599    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0099   |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | -1.57    |
|    learning_rate   | 0.001    |
|    n_updates       | 74498    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.5     |
|    ep_rew_mean     | -8.17    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2124     |
|    fps             | 36       |
|    time_elapsed    | 2063     |
|    total_timesteps | 74714    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.00476  |
|    ent_coef        | 0.00533  |
|    ent_coef_loss   | -1.11    |
|    learning_rate   | 0.001    |
|    n_updates       | 74613    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -8.09    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2128     |
|    fps             | 36       |
|    time_elapsed    | 2066     |
|    total_timesteps | 74792    |
| train/             |          |
|    actor_loss      | 1.52     |
|    critic_loss     | 0.0159   |
|    ent_coef        | 0.00539  |
|    ent_coef_loss   | 0.661    |
|    learning_rate   | 0.001    |
|    n_updates       | 74691    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -8.04    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2132     |
|    fps             | 36       |
|    time_elapsed    | 2069     |
|    total_timesteps | 74882    |
| train/             |          |
|    actor_loss      | 1.9      |
|    critic_loss     | 0.00506  |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | 0.821    |
|    learning_rate   | 0.001    |
|    n_updates       | 74781    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -8.02    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2136     |
|    fps             | 36       |
|    time_elapsed    | 2071     |
|    total_timesteps | 74955    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0456   |
|    ent_coef        | 0.00534  |
|    ent_coef_loss   | 1.52     |
|    learning_rate   | 0.001    |
|    n_updates       | 74854    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -8.05    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2140     |
|    fps             | 36       |
|    time_elapsed    | 2073     |
|    total_timesteps | 75039    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.00771  |
|    ent_coef        | 0.00536  |
|    ent_coef_loss   | -0.115   |
|    learning_rate   | 0.001    |
|    n_updates       | 74938    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.1     |
|    ep_rew_mean     | -7.91    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 2144     |
|    fps             | 36       |
|    time_elapsed    | 2075     |
|    total_timesteps | 75110    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00796  |
|    ent_coef        | 0.00544  |
|    ent_coef_loss   | -1.3     |
|    learning_rate   | 0.001    |
|    n_updates       | 75009    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.1     |
|    ep_rew_mean     | -7.85    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2148     |
|    fps             | 36       |
|    time_elapsed    | 2078     |
|    total_timesteps | 75224    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.00809  |
|    ent_coef        | 0.00539  |
|    ent_coef_loss   | 0.0825   |
|    learning_rate   | 0.001    |
|    n_updates       | 75123    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22       |
|    ep_rew_mean     | -7.77    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2152     |
|    fps             | 36       |
|    time_elapsed    | 2081     |
|    total_timesteps | 75296    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.00485  |
|    ent_coef        | 0.0054   |
|    ent_coef_loss   | 0.544    |
|    learning_rate   | 0.001    |
|    n_updates       | 75195    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.1     |
|    ep_rew_mean     | -7.82    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2156     |
|    fps             | 36       |
|    time_elapsed    | 2084     |
|    total_timesteps | 75396    |
| train/             |          |
|    actor_loss      | 1.58     |
|    critic_loss     | 0.0116   |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | -0.241   |
|    learning_rate   | 0.001    |
|    n_updates       | 75295    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -7.76    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 2160     |
|    fps             | 36       |
|    time_elapsed    | 2086     |
|    total_timesteps | 75492    |
| train/             |          |
|    actor_loss      | 1.41     |
|    critic_loss     | 0.0193   |
|    ent_coef        | 0.00534  |
|    ent_coef_loss   | -1.11    |
|    learning_rate   | 0.001    |
|    n_updates       | 75391    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.58    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 2164     |
|    fps             | 36       |
|    time_elapsed    | 2088     |
|    total_timesteps | 75556    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.00555  |
|    ent_coef        | 0.00529  |
|    ent_coef_loss   | -0.477   |
|    learning_rate   | 0.001    |
|    n_updates       | 75455    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.45    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2168     |
|    fps             | 36       |
|    time_elapsed    | 2090     |
|    total_timesteps | 75626    |
| train/             |          |
|    actor_loss      | 1.55     |
|    critic_loss     | 0.0171   |
|    ent_coef        | 0.00525  |
|    ent_coef_loss   | -0.319   |
|    learning_rate   | 0.001    |
|    n_updates       | 75525    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.53    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2172     |
|    fps             | 36       |
|    time_elapsed    | 2093     |
|    total_timesteps | 75734    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.0341   |
|    ent_coef        | 0.00527  |
|    ent_coef_loss   | 0.354    |
|    learning_rate   | 0.001    |
|    n_updates       | 75633    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.46    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2176     |
|    fps             | 36       |
|    time_elapsed    | 2095     |
|    total_timesteps | 75811    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0127   |
|    ent_coef        | 0.00529  |
|    ent_coef_loss   | 0.845    |
|    learning_rate   | 0.001    |
|    n_updates       | 75710    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.27    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2180     |
|    fps             | 36       |
|    time_elapsed    | 2098     |
|    total_timesteps | 75892    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.015    |
|    ent_coef        | 0.00536  |
|    ent_coef_loss   | 1.15     |
|    learning_rate   | 0.001    |
|    n_updates       | 75791    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.25    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2184     |
|    fps             | 36       |
|    time_elapsed    | 2100     |
|    total_timesteps | 75986    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.0097   |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | 0.0294   |
|    learning_rate   | 0.001    |
|    n_updates       | 75885    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.19    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2188     |
|    fps             | 36       |
|    time_elapsed    | 2103     |
|    total_timesteps | 76072    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.00542  |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | 0.704    |
|    learning_rate   | 0.001    |
|    n_updates       | 75971    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.03    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2192     |
|    fps             | 36       |
|    time_elapsed    | 2106     |
|    total_timesteps | 76152    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.0082   |
|    ent_coef        | 0.00551  |
|    ent_coef_loss   | -0.286   |
|    learning_rate   | 0.001    |
|    n_updates       | 76051    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.07    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2196     |
|    fps             | 36       |
|    time_elapsed    | 2108     |
|    total_timesteps | 76227    |
| train/             |          |
|    actor_loss      | 1.5      |
|    critic_loss     | 0.00795  |
|    ent_coef        | 0.00548  |
|    ent_coef_loss   | 0.123    |
|    learning_rate   | 0.001    |
|    n_updates       | 76126    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.14    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2200     |
|    fps             | 36       |
|    time_elapsed    | 2110     |
|    total_timesteps | 76304    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.0121   |
|    ent_coef        | 0.0055   |
|    ent_coef_loss   | 0.00888  |
|    learning_rate   | 0.001    |
|    n_updates       | 76203    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.04    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 2204     |
|    fps             | 36       |
|    time_elapsed    | 2113     |
|    total_timesteps | 76387    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.007    |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | 1.18     |
|    learning_rate   | 0.001    |
|    n_updates       | 76286    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.05    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 2208     |
|    fps             | 36       |
|    time_elapsed    | 2115     |
|    total_timesteps | 76471    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0254   |
|    ent_coef        | 0.00539  |
|    ent_coef_loss   | -1.01    |
|    learning_rate   | 0.001    |
|    n_updates       | 76370    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.05    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 2212     |
|    fps             | 36       |
|    time_elapsed    | 2118     |
|    total_timesteps | 76560    |
| train/             |          |
|    actor_loss      | 1.58     |
|    critic_loss     | 0.00843  |
|    ent_coef        | 0.00546  |
|    ent_coef_loss   | -0.871   |
|    learning_rate   | 0.001    |
|    n_updates       | 76459    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.06    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 2216     |
|    fps             | 36       |
|    time_elapsed    | 2121     |
|    total_timesteps | 76643    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.00887  |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | 0.29     |
|    learning_rate   | 0.001    |
|    n_updates       | 76542    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.16    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2220     |
|    fps             | 36       |
|    time_elapsed    | 2123     |
|    total_timesteps | 76716    |
| train/             |          |
|    actor_loss      | 1.58     |
|    critic_loss     | 0.0436   |
|    ent_coef        | 0.00543  |
|    ent_coef_loss   | 0.157    |
|    learning_rate   | 0.001    |
|    n_updates       | 76615    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -6.95    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2224     |
|    fps             | 36       |
|    time_elapsed    | 2125     |
|    total_timesteps | 76807    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.00894  |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | 0.694    |
|    learning_rate   | 0.001    |
|    n_updates       | 76706    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7       |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2228     |
|    fps             | 36       |
|    time_elapsed    | 2127     |
|    total_timesteps | 76895    |
| train/             |          |
|    actor_loss      | 1.45     |
|    critic_loss     | 0.00889  |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | 0.0424   |
|    learning_rate   | 0.001    |
|    n_updates       | 76794    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -6.97    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2232     |
|    fps             | 36       |
|    time_elapsed    | 2130     |
|    total_timesteps | 76975    |
| train/             |          |
|    actor_loss      | 1.48     |
|    critic_loss     | 0.0154   |
|    ent_coef        | 0.00533  |
|    ent_coef_loss   | -1.83    |
|    learning_rate   | 0.001    |
|    n_updates       | 76874    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.04    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2236     |
|    fps             | 36       |
|    time_elapsed    | 2133     |
|    total_timesteps | 77068    |
| train/             |          |
|    actor_loss      | 1.57     |
|    critic_loss     | 0.00972  |
|    ent_coef        | 0.00539  |
|    ent_coef_loss   | 0.281    |
|    learning_rate   | 0.001    |
|    n_updates       | 76967    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.09    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2240     |
|    fps             | 36       |
|    time_elapsed    | 2135     |
|    total_timesteps | 77151    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00842  |
|    ent_coef        | 0.00546  |
|    ent_coef_loss   | 0.19     |
|    learning_rate   | 0.001    |
|    n_updates       | 77050    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.12    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2244     |
|    fps             | 36       |
|    time_elapsed    | 2138     |
|    total_timesteps | 77245    |
| train/             |          |
|    actor_loss      | 1.55     |
|    critic_loss     | 0.00579  |
|    ent_coef        | 0.00554  |
|    ent_coef_loss   | -0.623   |
|    learning_rate   | 0.001    |
|    n_updates       | 77144    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -6.99    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2248     |
|    fps             | 36       |
|    time_elapsed    | 2140     |
|    total_timesteps | 77307    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.00955  |
|    ent_coef        | 0.00557  |
|    ent_coef_loss   | 1.52     |
|    learning_rate   | 0.001    |
|    n_updates       | 77206    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.01    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2252     |
|    fps             | 36       |
|    time_elapsed    | 2142     |
|    total_timesteps | 77400    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.00907  |
|    ent_coef        | 0.0055   |
|    ent_coef_loss   | 0.964    |
|    learning_rate   | 0.001    |
|    n_updates       | 77299    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -6.94    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2256     |
|    fps             | 36       |
|    time_elapsed    | 2145     |
|    total_timesteps | 77477    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00738  |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | 0.0574   |
|    learning_rate   | 0.001    |
|    n_updates       | 77376    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -6.88    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2260     |
|    fps             | 36       |
|    time_elapsed    | 2148     |
|    total_timesteps | 77554    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.00797  |
|    ent_coef        | 0.00539  |
|    ent_coef_loss   | 0.418    |
|    learning_rate   | 0.001    |
|    n_updates       | 77453    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -6.96    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2264     |
|    fps             | 36       |
|    time_elapsed    | 2150     |
|    total_timesteps | 77636    |
| train/             |          |
|    actor_loss      | 1.93     |
|    critic_loss     | 0.0104   |
|    ent_coef        | 0.00539  |
|    ent_coef_loss   | 1.07     |
|    learning_rate   | 0.001    |
|    n_updates       | 77535    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -6.9     |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2268     |
|    fps             | 36       |
|    time_elapsed    | 2152     |
|    total_timesteps | 77706    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00742  |
|    ent_coef        | 0.00544  |
|    ent_coef_loss   | 0.615    |
|    learning_rate   | 0.001    |
|    n_updates       | 77605    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -6.82    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2272     |
|    fps             | 36       |
|    time_elapsed    | 2155     |
|    total_timesteps | 77806    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0128   |
|    ent_coef        | 0.00543  |
|    ent_coef_loss   | -0.802   |
|    learning_rate   | 0.001    |
|    n_updates       | 77705    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -6.87    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2276     |
|    fps             | 36       |
|    time_elapsed    | 2157     |
|    total_timesteps | 77886    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0337   |
|    ent_coef        | 0.00552  |
|    ent_coef_loss   | -0.493   |
|    learning_rate   | 0.001    |
|    n_updates       | 77785    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -6.97    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2280     |
|    fps             | 36       |
|    time_elapsed    | 2161     |
|    total_timesteps | 77974    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.0125   |
|    ent_coef        | 0.00557  |
|    ent_coef_loss   | 1.1      |
|    learning_rate   | 0.001    |
|    n_updates       | 77873    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7       |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2284     |
|    fps             | 36       |
|    time_elapsed    | 2163     |
|    total_timesteps | 78059    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.00795  |
|    ent_coef        | 0.00562  |
|    ent_coef_loss   | -0.245   |
|    learning_rate   | 0.001    |
|    n_updates       | 77958    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7       |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2288     |
|    fps             | 36       |
|    time_elapsed    | 2165     |
|    total_timesteps | 78133    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00828  |
|    ent_coef        | 0.00557  |
|    ent_coef_loss   | -0.248   |
|    learning_rate   | 0.001    |
|    n_updates       | 78032    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.07    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2292     |
|    fps             | 36       |
|    time_elapsed    | 2168     |
|    total_timesteps | 78232    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.0113   |
|    ent_coef        | 0.0055   |
|    ent_coef_loss   | 0.31     |
|    learning_rate   | 0.001    |
|    n_updates       | 78131    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.01    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2296     |
|    fps             | 36       |
|    time_elapsed    | 2170     |
|    total_timesteps | 78320    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00921  |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | -0.618   |
|    learning_rate   | 0.001    |
|    n_updates       | 78219    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.08    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2300     |
|    fps             | 36       |
|    time_elapsed    | 2173     |
|    total_timesteps | 78404    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.0107   |
|    ent_coef        | 0.00532  |
|    ent_coef_loss   | 0.621    |
|    learning_rate   | 0.001    |
|    n_updates       | 78303    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.13    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2304     |
|    fps             | 36       |
|    time_elapsed    | 2176     |
|    total_timesteps | 78491    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00749  |
|    ent_coef        | 0.00529  |
|    ent_coef_loss   | 0.657    |
|    learning_rate   | 0.001    |
|    n_updates       | 78390    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.08    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2308     |
|    fps             | 36       |
|    time_elapsed    | 2178     |
|    total_timesteps | 78562    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.00617  |
|    ent_coef        | 0.00528  |
|    ent_coef_loss   | 0.541    |
|    learning_rate   | 0.001    |
|    n_updates       | 78461    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.09    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2312     |
|    fps             | 36       |
|    time_elapsed    | 2180     |
|    total_timesteps | 78641    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00573  |
|    ent_coef        | 0.00537  |
|    ent_coef_loss   | -0.548   |
|    learning_rate   | 0.001    |
|    n_updates       | 78540    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.15    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2316     |
|    fps             | 36       |
|    time_elapsed    | 2182     |
|    total_timesteps | 78714    |
| train/             |          |
|    actor_loss      | 1.56     |
|    critic_loss     | 0.00925  |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | -0.243   |
|    learning_rate   | 0.001    |
|    n_updates       | 78613    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.1     |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2320     |
|    fps             | 36       |
|    time_elapsed    | 2185     |
|    total_timesteps | 78801    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00573  |
|    ent_coef        | 0.00522  |
|    ent_coef_loss   | -0.577   |
|    learning_rate   | 0.001    |
|    n_updates       | 78700    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.02    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2324     |
|    fps             | 36       |
|    time_elapsed    | 2187     |
|    total_timesteps | 78857    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.00777  |
|    ent_coef        | 0.00521  |
|    ent_coef_loss   | 0.884    |
|    learning_rate   | 0.001    |
|    n_updates       | 78756    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.01    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2328     |
|    fps             | 36       |
|    time_elapsed    | 2189     |
|    total_timesteps | 78947    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0242   |
|    ent_coef        | 0.00531  |
|    ent_coef_loss   | 0.236    |
|    learning_rate   | 0.001    |
|    n_updates       | 78846    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -7.02    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2332     |
|    fps             | 36       |
|    time_elapsed    | 2191     |
|    total_timesteps | 79016    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00867  |
|    ent_coef        | 0.00526  |
|    ent_coef_loss   | -0.578   |
|    learning_rate   | 0.001    |
|    n_updates       | 78915    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.07    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2336     |
|    fps             | 36       |
|    time_elapsed    | 2194     |
|    total_timesteps | 79125    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.00484  |
|    ent_coef        | 0.00527  |
|    ent_coef_loss   | -0.874   |
|    learning_rate   | 0.001    |
|    n_updates       | 79024    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.01    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2340     |
|    fps             | 36       |
|    time_elapsed    | 2196     |
|    total_timesteps | 79197    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0137   |
|    ent_coef        | 0.00521  |
|    ent_coef_loss   | 0.549    |
|    learning_rate   | 0.001    |
|    n_updates       | 79096    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -7.03    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2344     |
|    fps             | 36       |
|    time_elapsed    | 2200     |
|    total_timesteps | 79278    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00741  |
|    ent_coef        | 0.00523  |
|    ent_coef_loss   | 0.196    |
|    learning_rate   | 0.001    |
|    n_updates       | 79177    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.07    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2348     |
|    fps             | 36       |
|    time_elapsed    | 2202     |
|    total_timesteps | 79364    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00775  |
|    ent_coef        | 0.00525  |
|    ent_coef_loss   | 0.789    |
|    learning_rate   | 0.001    |
|    n_updates       | 79263    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.05    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2352     |
|    fps             | 36       |
|    time_elapsed    | 2205     |
|    total_timesteps | 79452    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.00991  |
|    ent_coef        | 0.00538  |
|    ent_coef_loss   | -0.933   |
|    learning_rate   | 0.001    |
|    n_updates       | 79351    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.11    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2356     |
|    fps             | 36       |
|    time_elapsed    | 2207     |
|    total_timesteps | 79535    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.0123   |
|    ent_coef        | 0.00549  |
|    ent_coef_loss   | 1.52     |
|    learning_rate   | 0.001    |
|    n_updates       | 79434    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.12    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2360     |
|    fps             | 36       |
|    time_elapsed    | 2209     |
|    total_timesteps | 79616    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.0131   |
|    ent_coef        | 0.00553  |
|    ent_coef_loss   | -0.403   |
|    learning_rate   | 0.001    |
|    n_updates       | 79515    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.09    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2364     |
|    fps             | 36       |
|    time_elapsed    | 2213     |
|    total_timesteps | 79703    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00878  |
|    ent_coef        | 0.00552  |
|    ent_coef_loss   | -0.117   |
|    learning_rate   | 0.001    |
|    n_updates       | 79602    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.15    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2368     |
|    fps             | 36       |
|    time_elapsed    | 2215     |
|    total_timesteps | 79778    |
| train/             |          |
|    actor_loss      | 1.55     |
|    critic_loss     | 0.00749  |
|    ent_coef        | 0.00541  |
|    ent_coef_loss   | -0.0958  |
|    learning_rate   | 0.001    |
|    n_updates       | 79677    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.17    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2372     |
|    fps             | 36       |
|    time_elapsed    | 2217     |
|    total_timesteps | 79867    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.0372   |
|    ent_coef        | 0.00536  |
|    ent_coef_loss   | 0.0702   |
|    learning_rate   | 0.001    |
|    n_updates       | 79766    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.22    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2376     |
|    fps             | 36       |
|    time_elapsed    | 2220     |
|    total_timesteps | 79958    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00572  |
|    ent_coef        | 0.00527  |
|    ent_coef_loss   | 0.2      |
|    learning_rate   | 0.001    |
|    n_updates       | 79857    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.25    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2380     |
|    fps             | 36       |
|    time_elapsed    | 2223     |
|    total_timesteps | 80060    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.0105   |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | -0.812   |
|    learning_rate   | 0.001    |
|    n_updates       | 79959    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.26    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2384     |
|    fps             | 36       |
|    time_elapsed    | 2225     |
|    total_timesteps | 80148    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00879  |
|    ent_coef        | 0.00536  |
|    ent_coef_loss   | 0.266    |
|    learning_rate   | 0.001    |
|    n_updates       | 80047    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.25    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2388     |
|    fps             | 36       |
|    time_elapsed    | 2228     |
|    total_timesteps | 80234    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.0176   |
|    ent_coef        | 0.00549  |
|    ent_coef_loss   | 0.205    |
|    learning_rate   | 0.001    |
|    n_updates       | 80133    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.27    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2392     |
|    fps             | 36       |
|    time_elapsed    | 2231     |
|    total_timesteps | 80346    |
| train/             |          |
|    actor_loss      | 1.54     |
|    critic_loss     | 0.00564  |
|    ent_coef        | 0.00558  |
|    ent_coef_loss   | 0.0128   |
|    learning_rate   | 0.001    |
|    n_updates       | 80245    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.26    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2396     |
|    fps             | 36       |
|    time_elapsed    | 2233     |
|    total_timesteps | 80425    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.00792  |
|    ent_coef        | 0.00545  |
|    ent_coef_loss   | 0.628    |
|    learning_rate   | 0.001    |
|    n_updates       | 80324    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.19    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2400     |
|    fps             | 36       |
|    time_elapsed    | 2236     |
|    total_timesteps | 80502    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00671  |
|    ent_coef        | 0.00554  |
|    ent_coef_loss   | -0.059   |
|    learning_rate   | 0.001    |
|    n_updates       | 80401    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.18    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2404     |
|    fps             | 35       |
|    time_elapsed    | 2239     |
|    total_timesteps | 80580    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0064   |
|    ent_coef        | 0.00548  |
|    ent_coef_loss   | 0.0225   |
|    learning_rate   | 0.001    |
|    n_updates       | 80479    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.19    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2408     |
|    fps             | 35       |
|    time_elapsed    | 2241     |
|    total_timesteps | 80656    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0108   |
|    ent_coef        | 0.00557  |
|    ent_coef_loss   | -0.259   |
|    learning_rate   | 0.001    |
|    n_updates       | 80555    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.25    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2412     |
|    fps             | 35       |
|    time_elapsed    | 2244     |
|    total_timesteps | 80748    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.00544  |
|    ent_coef        | 0.00569  |
|    ent_coef_loss   | 0.123    |
|    learning_rate   | 0.001    |
|    n_updates       | 80647    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.19    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2416     |
|    fps             | 35       |
|    time_elapsed    | 2246     |
|    total_timesteps | 80832    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.0139   |
|    ent_coef        | 0.00572  |
|    ent_coef_loss   | 0.0204   |
|    learning_rate   | 0.001    |
|    n_updates       | 80731    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.2     |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2420     |
|    fps             | 35       |
|    time_elapsed    | 2248     |
|    total_timesteps | 80919    |
| train/             |          |
|    actor_loss      | 1.51     |
|    critic_loss     | 0.0362   |
|    ent_coef        | 0.00572  |
|    ent_coef_loss   | -0.567   |
|    learning_rate   | 0.001    |
|    n_updates       | 80818    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.34    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2424     |
|    fps             | 35       |
|    time_elapsed    | 2251     |
|    total_timesteps | 81006    |
| train/             |          |
|    actor_loss      | 1.53     |
|    critic_loss     | 0.00692  |
|    ent_coef        | 0.0056   |
|    ent_coef_loss   | -1.17    |
|    learning_rate   | 0.001    |
|    n_updates       | 80905    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.3     |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2428     |
|    fps             | 35       |
|    time_elapsed    | 2254     |
|    total_timesteps | 81078    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.0121   |
|    ent_coef        | 0.00549  |
|    ent_coef_loss   | -0.349   |
|    learning_rate   | 0.001    |
|    n_updates       | 80977    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.37    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2432     |
|    fps             | 35       |
|    time_elapsed    | 2257     |
|    total_timesteps | 81180    |
| train/             |          |
|    actor_loss      | 1.5      |
|    critic_loss     | 0.016    |
|    ent_coef        | 0.00544  |
|    ent_coef_loss   | 0.354    |
|    learning_rate   | 0.001    |
|    n_updates       | 81079    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.23    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2436     |
|    fps             | 35       |
|    time_elapsed    | 2259     |
|    total_timesteps | 81245    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00411  |
|    ent_coef        | 0.00555  |
|    ent_coef_loss   | 0.13     |
|    learning_rate   | 0.001    |
|    n_updates       | 81144    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.37    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2440     |
|    fps             | 35       |
|    time_elapsed    | 2262     |
|    total_timesteps | 81355    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.00855  |
|    ent_coef        | 0.00568  |
|    ent_coef_loss   | 0.0656   |
|    learning_rate   | 0.001    |
|    n_updates       | 81254    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.32    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2444     |
|    fps             | 35       |
|    time_elapsed    | 2265     |
|    total_timesteps | 81443    |
| train/             |          |
|    actor_loss      | 1.47     |
|    critic_loss     | 0.00747  |
|    ent_coef        | 0.00561  |
|    ent_coef_loss   | -0.524   |
|    learning_rate   | 0.001    |
|    n_updates       | 81342    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.43    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2448     |
|    fps             | 35       |
|    time_elapsed    | 2268     |
|    total_timesteps | 81542    |
| train/             |          |
|    actor_loss      | 1.54     |
|    critic_loss     | 0.00835  |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | -1.05    |
|    learning_rate   | 0.001    |
|    n_updates       | 81441    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.36    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2452     |
|    fps             | 35       |
|    time_elapsed    | 2270     |
|    total_timesteps | 81613    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.0194   |
|    ent_coef        | 0.00544  |
|    ent_coef_loss   | 1.34     |
|    learning_rate   | 0.001    |
|    n_updates       | 81512    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.33    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2456     |
|    fps             | 35       |
|    time_elapsed    | 2273     |
|    total_timesteps | 81698    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0139   |
|    ent_coef        | 0.00556  |
|    ent_coef_loss   | -0.454   |
|    learning_rate   | 0.001    |
|    n_updates       | 81597    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.35    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2460     |
|    fps             | 35       |
|    time_elapsed    | 2275     |
|    total_timesteps | 81771    |
| train/             |          |
|    actor_loss      | 1.58     |
|    critic_loss     | 0.00872  |
|    ent_coef        | 0.00557  |
|    ent_coef_loss   | -1.11    |
|    learning_rate   | 0.001    |
|    n_updates       | 81670    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.32    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2464     |
|    fps             | 35       |
|    time_elapsed    | 2277     |
|    total_timesteps | 81838    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00716  |
|    ent_coef        | 0.00555  |
|    ent_coef_loss   | 0.954    |
|    learning_rate   | 0.001    |
|    n_updates       | 81737    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.41    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2468     |
|    fps             | 35       |
|    time_elapsed    | 2280     |
|    total_timesteps | 81933    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00503  |
|    ent_coef        | 0.00573  |
|    ent_coef_loss   | -0.776   |
|    learning_rate   | 0.001    |
|    n_updates       | 81832    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.33    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2472     |
|    fps             | 35       |
|    time_elapsed    | 2283     |
|    total_timesteps | 82012    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.017    |
|    ent_coef        | 0.00576  |
|    ent_coef_loss   | 0.502    |
|    learning_rate   | 0.001    |
|    n_updates       | 81911    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.34    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2476     |
|    fps             | 35       |
|    time_elapsed    | 2285     |
|    total_timesteps | 82114    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.00929  |
|    ent_coef        | 0.00583  |
|    ent_coef_loss   | -1.29    |
|    learning_rate   | 0.001    |
|    n_updates       | 82013    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.26    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2480     |
|    fps             | 35       |
|    time_elapsed    | 2287     |
|    total_timesteps | 82187    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.0213   |
|    ent_coef        | 0.00576  |
|    ent_coef_loss   | -1.21    |
|    learning_rate   | 0.001    |
|    n_updates       | 82086    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.18    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2484     |
|    fps             | 35       |
|    time_elapsed    | 2289     |
|    total_timesteps | 82257    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.0046   |
|    ent_coef        | 0.00571  |
|    ent_coef_loss   | -0.0387  |
|    learning_rate   | 0.001    |
|    n_updates       | 82156    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.1     |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2488     |
|    fps             | 35       |
|    time_elapsed    | 2291     |
|    total_timesteps | 82322    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.0342   |
|    ent_coef        | 0.00562  |
|    ent_coef_loss   | 0.48     |
|    learning_rate   | 0.001    |
|    n_updates       | 82221    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -6.98    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2492     |
|    fps             | 35       |
|    time_elapsed    | 2294     |
|    total_timesteps | 82403    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.00643  |
|    ent_coef        | 0.00552  |
|    ent_coef_loss   | 0.637    |
|    learning_rate   | 0.001    |
|    n_updates       | 82302    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.04    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2496     |
|    fps             | 35       |
|    time_elapsed    | 2296     |
|    total_timesteps | 82482    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00683  |
|    ent_coef        | 0.00557  |
|    ent_coef_loss   | -0.149   |
|    learning_rate   | 0.001    |
|    n_updates       | 82381    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.02    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2500     |
|    fps             | 35       |
|    time_elapsed    | 2299     |
|    total_timesteps | 82561    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00748  |
|    ent_coef        | 0.00574  |
|    ent_coef_loss   | 0.0166   |
|    learning_rate   | 0.001    |
|    n_updates       | 82460    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -6.89    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2504     |
|    fps             | 35       |
|    time_elapsed    | 2300     |
|    total_timesteps | 82620    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.0109   |
|    ent_coef        | 0.00584  |
|    ent_coef_loss   | 0.847    |
|    learning_rate   | 0.001    |
|    n_updates       | 82519    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -6.87    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2508     |
|    fps             | 35       |
|    time_elapsed    | 2302     |
|    total_timesteps | 82696    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.00794  |
|    ent_coef        | 0.00587  |
|    ent_coef_loss   | 0.333    |
|    learning_rate   | 0.001    |
|    n_updates       | 82595    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -6.73    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2512     |
|    fps             | 35       |
|    time_elapsed    | 2305     |
|    total_timesteps | 82763    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.0106   |
|    ent_coef        | 0.00588  |
|    ent_coef_loss   | 0.325    |
|    learning_rate   | 0.001    |
|    n_updates       | 82662    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20       |
|    ep_rew_mean     | -6.69    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2516     |
|    fps             | 35       |
|    time_elapsed    | 2307     |
|    total_timesteps | 82830    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.029    |
|    ent_coef        | 0.0059   |
|    ent_coef_loss   | -1.44    |
|    learning_rate   | 0.001    |
|    n_updates       | 82729    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.8     |
|    ep_rew_mean     | -6.66    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2520     |
|    fps             | 35       |
|    time_elapsed    | 2309     |
|    total_timesteps | 82903    |
| train/             |          |
|    actor_loss      | 1.54     |
|    critic_loss     | 0.00924  |
|    ent_coef        | 0.00575  |
|    ent_coef_loss   | -0.801   |
|    learning_rate   | 0.001    |
|    n_updates       | 82802    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.9     |
|    ep_rew_mean     | -6.66    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2524     |
|    fps             | 35       |
|    time_elapsed    | 2312     |
|    total_timesteps | 82997    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0115   |
|    ent_coef        | 0.00554  |
|    ent_coef_loss   | 0.123    |
|    learning_rate   | 0.001    |
|    n_updates       | 82896    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20       |
|    ep_rew_mean     | -6.72    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2528     |
|    fps             | 35       |
|    time_elapsed    | 2314     |
|    total_timesteps | 83076    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0114   |
|    ent_coef        | 0.00561  |
|    ent_coef_loss   | -0.00893 |
|    learning_rate   | 0.001    |
|    n_updates       | 82975    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -6.81    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2532     |
|    fps             | 35       |
|    time_elapsed    | 2318     |
|    total_timesteps | 83193    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.00967  |
|    ent_coef        | 0.00567  |
|    ent_coef_loss   | -0.501   |
|    learning_rate   | 0.001    |
|    n_updates       | 83092    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -6.78    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 2536     |
|    fps             | 35       |
|    time_elapsed    | 2320     |
|    total_timesteps | 83260    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0102   |
|    ent_coef        | 0.00555  |
|    ent_coef_loss   | 0.345    |
|    learning_rate   | 0.001    |
|    n_updates       | 83159    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.8     |
|    ep_rew_mean     | -6.77    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2540     |
|    fps             | 35       |
|    time_elapsed    | 2323     |
|    total_timesteps | 83338    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0429   |
|    ent_coef        | 0.00561  |
|    ent_coef_loss   | -0.559   |
|    learning_rate   | 0.001    |
|    n_updates       | 83237    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.6     |
|    ep_rew_mean     | -6.69    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2544     |
|    fps             | 35       |
|    time_elapsed    | 2324     |
|    total_timesteps | 83404    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.00934  |
|    ent_coef        | 0.00549  |
|    ent_coef_loss   | 0.439    |
|    learning_rate   | 0.001    |
|    n_updates       | 83303    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.4     |
|    ep_rew_mean     | -6.61    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 2548     |
|    fps             | 35       |
|    time_elapsed    | 2327     |
|    total_timesteps | 83481    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.00405  |
|    ent_coef        | 0.00558  |
|    ent_coef_loss   | 0.382    |
|    learning_rate   | 0.001    |
|    n_updates       | 83380    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.4     |
|    ep_rew_mean     | -6.67    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2552     |
|    fps             | 35       |
|    time_elapsed    | 2328     |
|    total_timesteps | 83550    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0067   |
|    ent_coef        | 0.0056   |
|    ent_coef_loss   | -1.31    |
|    learning_rate   | 0.001    |
|    n_updates       | 83449    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.2     |
|    ep_rew_mean     | -6.65    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2556     |
|    fps             | 35       |
|    time_elapsed    | 2331     |
|    total_timesteps | 83621    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0134   |
|    ent_coef        | 0.00548  |
|    ent_coef_loss   | 0.341    |
|    learning_rate   | 0.001    |
|    n_updates       | 83520    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.4     |
|    ep_rew_mean     | -6.76    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 2560     |
|    fps             | 35       |
|    time_elapsed    | 2334     |
|    total_timesteps | 83710    |
| train/             |          |
|    actor_loss      | 1.87     |
|    critic_loss     | 0.00525  |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | -0.268   |
|    learning_rate   | 0.001    |
|    n_updates       | 83609    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.4     |
|    ep_rew_mean     | -6.77    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2564     |
|    fps             | 35       |
|    time_elapsed    | 2336     |
|    total_timesteps | 83777    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00687  |
|    ent_coef        | 0.00544  |
|    ent_coef_loss   | -2.08    |
|    learning_rate   | 0.001    |
|    n_updates       | 83676    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.3     |
|    ep_rew_mean     | -6.73    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2568     |
|    fps             | 35       |
|    time_elapsed    | 2339     |
|    total_timesteps | 83864    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00681  |
|    ent_coef        | 0.00544  |
|    ent_coef_loss   | -0.942   |
|    learning_rate   | 0.001    |
|    n_updates       | 83763    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.3     |
|    ep_rew_mean     | -6.76    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 2572     |
|    fps             | 35       |
|    time_elapsed    | 2341     |
|    total_timesteps | 83938    |
| train/             |          |
|    actor_loss      | 1.86     |
|    critic_loss     | 0.0158   |
|    ent_coef        | 0.00559  |
|    ent_coef_loss   | 0.0459   |
|    learning_rate   | 0.001    |
|    n_updates       | 83837    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 18.9     |
|    ep_rew_mean     | -6.6     |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 2576     |
|    fps             | 35       |
|    time_elapsed    | 2342     |
|    total_timesteps | 84002    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0442   |
|    ent_coef        | 0.00552  |
|    ent_coef_loss   | -0.483   |
|    learning_rate   | 0.001    |
|    n_updates       | 83901    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19       |
|    ep_rew_mean     | -6.58    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 2580     |
|    fps             | 35       |
|    time_elapsed    | 2346     |
|    total_timesteps | 84084    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.0131   |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | 0.763    |
|    learning_rate   | 0.001    |
|    n_updates       | 83983    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.1     |
|    ep_rew_mean     | -6.68    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 2584     |
|    fps             | 35       |
|    time_elapsed    | 2348     |
|    total_timesteps | 84171    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0213   |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | -0.3     |
|    learning_rate   | 0.001    |
|    n_updates       | 84070    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.2     |
|    ep_rew_mean     | -6.74    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2588     |
|    fps             | 35       |
|    time_elapsed    | 2350     |
|    total_timesteps | 84239    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00665  |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | -1.51    |
|    learning_rate   | 0.001    |
|    n_updates       | 84138    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.4     |
|    ep_rew_mean     | -6.92    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2592     |
|    fps             | 35       |
|    time_elapsed    | 2353     |
|    total_timesteps | 84342    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.0518   |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | -0.553   |
|    learning_rate   | 0.001    |
|    n_updates       | 84241    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.3     |
|    ep_rew_mean     | -6.83    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2596     |
|    fps             | 35       |
|    time_elapsed    | 2355     |
|    total_timesteps | 84408    |
| train/             |          |
|    actor_loss      | 1.55     |
|    critic_loss     | 0.0069   |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | -0.366   |
|    learning_rate   | 0.001    |
|    n_updates       | 84307    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 19.8     |
|    ep_rew_mean     | -6.97    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2600     |
|    fps             | 35       |
|    time_elapsed    | 2360     |
|    total_timesteps | 84545    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0206   |
|    ent_coef        | 0.00519  |
|    ent_coef_loss   | 0.0993   |
|    learning_rate   | 0.001    |
|    n_updates       | 84444    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20       |
|    ep_rew_mean     | -7.06    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2604     |
|    fps             | 35       |
|    time_elapsed    | 2362     |
|    total_timesteps | 84621    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.0125   |
|    ent_coef        | 0.00531  |
|    ent_coef_loss   | -1.38    |
|    learning_rate   | 0.001    |
|    n_updates       | 84520    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -7.17    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2608     |
|    fps             | 35       |
|    time_elapsed    | 2364     |
|    total_timesteps | 84709    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00661  |
|    ent_coef        | 0.00525  |
|    ent_coef_loss   | -0.346   |
|    learning_rate   | 0.001    |
|    n_updates       | 84608    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -7.29    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2612     |
|    fps             | 35       |
|    time_elapsed    | 2367     |
|    total_timesteps | 84796    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.0101   |
|    ent_coef        | 0.00551  |
|    ent_coef_loss   | -1.09    |
|    learning_rate   | 0.001    |
|    n_updates       | 84695    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.37    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2616     |
|    fps             | 35       |
|    time_elapsed    | 2370     |
|    total_timesteps | 84898    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0144   |
|    ent_coef        | 0.00538  |
|    ent_coef_loss   | -0.52    |
|    learning_rate   | 0.001    |
|    n_updates       | 84797    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.42    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2620     |
|    fps             | 35       |
|    time_elapsed    | 2373     |
|    total_timesteps | 84991    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00401  |
|    ent_coef        | 0.00543  |
|    ent_coef_loss   | 0.219    |
|    learning_rate   | 0.001    |
|    n_updates       | 84890    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.41    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 2624     |
|    fps             | 35       |
|    time_elapsed    | 2375     |
|    total_timesteps | 85064    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00818  |
|    ent_coef        | 0.00546  |
|    ent_coef_loss   | 0.0282   |
|    learning_rate   | 0.001    |
|    n_updates       | 84963    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.43    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 2628     |
|    fps             | 35       |
|    time_elapsed    | 2378     |
|    total_timesteps | 85147    |
| train/             |          |
|    actor_loss      | 1.82     |
|    critic_loss     | 0.00591  |
|    ent_coef        | 0.00552  |
|    ent_coef_loss   | -1.45    |
|    learning_rate   | 0.001    |
|    n_updates       | 85046    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.47    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2632     |
|    fps             | 35       |
|    time_elapsed    | 2381     |
|    total_timesteps | 85244    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00563  |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | 0.501    |
|    learning_rate   | 0.001    |
|    n_updates       | 85143    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.56    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2636     |
|    fps             | 35       |
|    time_elapsed    | 2383     |
|    total_timesteps | 85328    |
| train/             |          |
|    actor_loss      | 1.86     |
|    critic_loss     | 0.0134   |
|    ent_coef        | 0.00531  |
|    ent_coef_loss   | -0.454   |
|    learning_rate   | 0.001    |
|    n_updates       | 85227    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.5     |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2640     |
|    fps             | 35       |
|    time_elapsed    | 2386     |
|    total_timesteps | 85415    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.0102   |
|    ent_coef        | 0.0053   |
|    ent_coef_loss   | 0.806    |
|    learning_rate   | 0.001    |
|    n_updates       | 85314    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.65    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2644     |
|    fps             | 35       |
|    time_elapsed    | 2389     |
|    total_timesteps | 85505    |
| train/             |          |
|    actor_loss      | 1.8      |
|    critic_loss     | 0.0138   |
|    ent_coef        | 0.0054   |
|    ent_coef_loss   | 1.16     |
|    learning_rate   | 0.001    |
|    n_updates       | 85404    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.72    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 2648     |
|    fps             | 35       |
|    time_elapsed    | 2391     |
|    total_timesteps | 85581    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.00831  |
|    ent_coef        | 0.00538  |
|    ent_coef_loss   | 0.504    |
|    learning_rate   | 0.001    |
|    n_updates       | 85480    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.74    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2652     |
|    fps             | 35       |
|    time_elapsed    | 2394     |
|    total_timesteps | 85671    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00881  |
|    ent_coef        | 0.00529  |
|    ent_coef_loss   | -0.726   |
|    learning_rate   | 0.001    |
|    n_updates       | 85570    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.66    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2656     |
|    fps             | 35       |
|    time_elapsed    | 2396     |
|    total_timesteps | 85732    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.00432  |
|    ent_coef        | 0.00534  |
|    ent_coef_loss   | -0.533   |
|    learning_rate   | 0.001    |
|    n_updates       | 85631    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.74    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2660     |
|    fps             | 35       |
|    time_elapsed    | 2400     |
|    total_timesteps | 85848    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00836  |
|    ent_coef        | 0.00545  |
|    ent_coef_loss   | 0.133    |
|    learning_rate   | 0.001    |
|    n_updates       | 85747    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.72    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2664     |
|    fps             | 35       |
|    time_elapsed    | 2402     |
|    total_timesteps | 85913    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.00854  |
|    ent_coef        | 0.00551  |
|    ent_coef_loss   | -0.961   |
|    learning_rate   | 0.001    |
|    n_updates       | 85812    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2668     |
|    fps             | 35       |
|    time_elapsed    | 2404     |
|    total_timesteps | 85986    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.00705  |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | 0.727    |
|    learning_rate   | 0.001    |
|    n_updates       | 85885    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.67    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2672     |
|    fps             | 35       |
|    time_elapsed    | 2406     |
|    total_timesteps | 86069    |
| train/             |          |
|    actor_loss      | 1.43     |
|    critic_loss     | 0.0043   |
|    ent_coef        | 0.00551  |
|    ent_coef_loss   | -2.08    |
|    learning_rate   | 0.001    |
|    n_updates       | 85968    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.79    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2676     |
|    fps             | 35       |
|    time_elapsed    | 2409     |
|    total_timesteps | 86154    |
| train/             |          |
|    actor_loss      | 1.55     |
|    critic_loss     | 0.00751  |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | 0.102    |
|    learning_rate   | 0.001    |
|    n_updates       | 86053    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.92    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2680     |
|    fps             | 35       |
|    time_elapsed    | 2412     |
|    total_timesteps | 86253    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.0124   |
|    ent_coef        | 0.0056   |
|    ent_coef_loss   | -0.788   |
|    learning_rate   | 0.001    |
|    n_updates       | 86152    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.82    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2684     |
|    fps             | 35       |
|    time_elapsed    | 2414     |
|    total_timesteps | 86320    |
| train/             |          |
|    actor_loss      | 1.97     |
|    critic_loss     | 0.0101   |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | 0.657    |
|    learning_rate   | 0.001    |
|    n_updates       | 86219    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.89    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2688     |
|    fps             | 35       |
|    time_elapsed    | 2417     |
|    total_timesteps | 86404    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.00908  |
|    ent_coef        | 0.00543  |
|    ent_coef_loss   | -0.504   |
|    learning_rate   | 0.001    |
|    n_updates       | 86303    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.72    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2692     |
|    fps             | 35       |
|    time_elapsed    | 2419     |
|    total_timesteps | 86480    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.0121   |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | -0.0894  |
|    learning_rate   | 0.001    |
|    n_updates       | 86379    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.73    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2696     |
|    fps             | 35       |
|    time_elapsed    | 2421     |
|    total_timesteps | 86554    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00427  |
|    ent_coef        | 0.00558  |
|    ent_coef_loss   | 0.617    |
|    learning_rate   | 0.001    |
|    n_updates       | 86453    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.67    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2700     |
|    fps             | 35       |
|    time_elapsed    | 2423     |
|    total_timesteps | 86640    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.0061   |
|    ent_coef        | 0.00569  |
|    ent_coef_loss   | -0.254   |
|    learning_rate   | 0.001    |
|    n_updates       | 86539    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.68    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2704     |
|    fps             | 35       |
|    time_elapsed    | 2426     |
|    total_timesteps | 86722    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0329   |
|    ent_coef        | 0.00568  |
|    ent_coef_loss   | 0.16     |
|    learning_rate   | 0.001    |
|    n_updates       | 86621    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.66    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2708     |
|    fps             | 35       |
|    time_elapsed    | 2429     |
|    total_timesteps | 86806    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.00958  |
|    ent_coef        | 0.00571  |
|    ent_coef_loss   | 0.241    |
|    learning_rate   | 0.001    |
|    n_updates       | 86705    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.65    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2712     |
|    fps             | 35       |
|    time_elapsed    | 2432     |
|    total_timesteps | 86895    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.00664  |
|    ent_coef        | 0.00578  |
|    ent_coef_loss   | -0.771   |
|    learning_rate   | 0.001    |
|    n_updates       | 86794    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.68    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 2716     |
|    fps             | 35       |
|    time_elapsed    | 2434     |
|    total_timesteps | 86977    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.00864  |
|    ent_coef        | 0.0056   |
|    ent_coef_loss   | -0.139   |
|    learning_rate   | 0.001    |
|    n_updates       | 86876    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.72    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 2720     |
|    fps             | 35       |
|    time_elapsed    | 2436     |
|    total_timesteps | 87063    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.0102   |
|    ent_coef        | 0.00558  |
|    ent_coef_loss   | 0.545    |
|    learning_rate   | 0.001    |
|    n_updates       | 86962    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.65    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 2724     |
|    fps             | 35       |
|    time_elapsed    | 2439     |
|    total_timesteps | 87134    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0105   |
|    ent_coef        | 0.00563  |
|    ent_coef_loss   | 0.331    |
|    learning_rate   | 0.001    |
|    n_updates       | 87033    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 2728     |
|    fps             | 35       |
|    time_elapsed    | 2442     |
|    total_timesteps | 87209    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00952  |
|    ent_coef        | 0.00558  |
|    ent_coef_loss   | 1.08     |
|    learning_rate   | 0.001    |
|    n_updates       | 87108    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -7.5     |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2732     |
|    fps             | 35       |
|    time_elapsed    | 2444     |
|    total_timesteps | 87285    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.00885  |
|    ent_coef        | 0.00557  |
|    ent_coef_loss   | 0.355    |
|    learning_rate   | 0.001    |
|    n_updates       | 87184    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -7.48    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2736     |
|    fps             | 35       |
|    time_elapsed    | 2446     |
|    total_timesteps | 87366    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0082   |
|    ent_coef        | 0.00564  |
|    ent_coef_loss   | 1.16     |
|    learning_rate   | 0.001    |
|    n_updates       | 87265    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.52    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2740     |
|    fps             | 35       |
|    time_elapsed    | 2449     |
|    total_timesteps | 87464    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00719  |
|    ent_coef        | 0.00566  |
|    ent_coef_loss   | -0.739   |
|    learning_rate   | 0.001    |
|    n_updates       | 87363    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.5     |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2744     |
|    fps             | 35       |
|    time_elapsed    | 2452     |
|    total_timesteps | 87551    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.0551   |
|    ent_coef        | 0.00562  |
|    ent_coef_loss   | 0.314    |
|    learning_rate   | 0.001    |
|    n_updates       | 87450    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.47    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2748     |
|    fps             | 35       |
|    time_elapsed    | 2455     |
|    total_timesteps | 87641    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.00656  |
|    ent_coef        | 0.00572  |
|    ent_coef_loss   | -0.293   |
|    learning_rate   | 0.001    |
|    n_updates       | 87540    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.5     |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2752     |
|    fps             | 35       |
|    time_elapsed    | 2458     |
|    total_timesteps | 87736    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.0249   |
|    ent_coef        | 0.00572  |
|    ent_coef_loss   | -0.536   |
|    learning_rate   | 0.001    |
|    n_updates       | 87635    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.74    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 2756     |
|    fps             | 35       |
|    time_elapsed    | 2461     |
|    total_timesteps | 87866    |
| train/             |          |
|    actor_loss      | 1.62     |
|    critic_loss     | 0.0139   |
|    ent_coef        | 0.00558  |
|    ent_coef_loss   | -0.315   |
|    learning_rate   | 0.001    |
|    n_updates       | 87765    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.56    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 2760     |
|    fps             | 35       |
|    time_elapsed    | 2463     |
|    total_timesteps | 87943    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.00836  |
|    ent_coef        | 0.00571  |
|    ent_coef_loss   | -0.649   |
|    learning_rate   | 0.001    |
|    n_updates       | 87842    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2764     |
|    fps             | 35       |
|    time_elapsed    | 2466     |
|    total_timesteps | 88023    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0148   |
|    ent_coef        | 0.00562  |
|    ent_coef_loss   | -0.127   |
|    learning_rate   | 0.001    |
|    n_updates       | 87922    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.72    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2768     |
|    fps             | 35       |
|    time_elapsed    | 2469     |
|    total_timesteps | 88104    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00724  |
|    ent_coef        | 0.00557  |
|    ent_coef_loss   | -0.901   |
|    learning_rate   | 0.001    |
|    n_updates       | 88003    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.79    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 2772     |
|    fps             | 35       |
|    time_elapsed    | 2471     |
|    total_timesteps | 88182    |
| train/             |          |
|    actor_loss      | 1.55     |
|    critic_loss     | 0.00372  |
|    ent_coef        | 0.00552  |
|    ent_coef_loss   | 0.684    |
|    learning_rate   | 0.001    |
|    n_updates       | 88081    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.81    |
|    success_rate    | 0.83     |
| time/              |          |
|    episodes        | 2776     |
|    fps             | 35       |
|    time_elapsed    | 2474     |
|    total_timesteps | 88267    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.00568  |
|    ent_coef        | 0.00541  |
|    ent_coef_loss   | -0.751   |
|    learning_rate   | 0.001    |
|    n_updates       | 88166    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.81    |
|    success_rate    | 0.83     |
| time/              |          |
|    episodes        | 2780     |
|    fps             | 35       |
|    time_elapsed    | 2476     |
|    total_timesteps | 88348    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.0111   |
|    ent_coef        | 0.00543  |
|    ent_coef_loss   | -0.565   |
|    learning_rate   | 0.001    |
|    n_updates       | 88247    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.85    |
|    success_rate    | 0.83     |
| time/              |          |
|    episodes        | 2784     |
|    fps             | 35       |
|    time_elapsed    | 2478     |
|    total_timesteps | 88430    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.00517  |
|    ent_coef        | 0.00553  |
|    ent_coef_loss   | -1.14    |
|    learning_rate   | 0.001    |
|    n_updates       | 88329    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.88    |
|    success_rate    | 0.83     |
| time/              |          |
|    episodes        | 2788     |
|    fps             | 35       |
|    time_elapsed    | 2482     |
|    total_timesteps | 88518    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00551  |
|    ent_coef        | 0.00565  |
|    ent_coef_loss   | 0.265    |
|    learning_rate   | 0.001    |
|    n_updates       | 88417    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.92    |
|    success_rate    | 0.83     |
| time/              |          |
|    episodes        | 2792     |
|    fps             | 35       |
|    time_elapsed    | 2484     |
|    total_timesteps | 88591    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00735  |
|    ent_coef        | 0.00566  |
|    ent_coef_loss   | -0.459   |
|    learning_rate   | 0.001    |
|    n_updates       | 88490    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -8       |
|    success_rate    | 0.81     |
| time/              |          |
|    episodes        | 2796     |
|    fps             | 35       |
|    time_elapsed    | 2486     |
|    total_timesteps | 88668    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0077   |
|    ent_coef        | 0.00577  |
|    ent_coef_loss   | -0.113   |
|    learning_rate   | 0.001    |
|    n_updates       | 88567    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.9     |
|    success_rate    | 0.82     |
| time/              |          |
|    episodes        | 2800     |
|    fps             | 35       |
|    time_elapsed    | 2488     |
|    total_timesteps | 88744    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00669  |
|    ent_coef        | 0.00553  |
|    ent_coef_loss   | -1.44    |
|    learning_rate   | 0.001    |
|    n_updates       | 88643    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.96    |
|    success_rate    | 0.82     |
| time/              |          |
|    episodes        | 2804     |
|    fps             | 35       |
|    time_elapsed    | 2491     |
|    total_timesteps | 88835    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.00772  |
|    ent_coef        | 0.00552  |
|    ent_coef_loss   | -0.683   |
|    learning_rate   | 0.001    |
|    n_updates       | 88734    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.89    |
|    success_rate    | 0.83     |
| time/              |          |
|    episodes        | 2808     |
|    fps             | 35       |
|    time_elapsed    | 2494     |
|    total_timesteps | 88905    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.00554  |
|    ent_coef        | 0.00562  |
|    ent_coef_loss   | 0.0586   |
|    learning_rate   | 0.001    |
|    n_updates       | 88804    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.84    |
|    success_rate    | 0.83     |
| time/              |          |
|    episodes        | 2812     |
|    fps             | 35       |
|    time_elapsed    | 2496     |
|    total_timesteps | 88982    |
| train/             |          |
|    actor_loss      | 1.56     |
|    critic_loss     | 0.00577  |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | -1.31    |
|    learning_rate   | 0.001    |
|    n_updates       | 88881    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.87    |
|    success_rate    | 0.82     |
| time/              |          |
|    episodes        | 2816     |
|    fps             | 35       |
|    time_elapsed    | 2499     |
|    total_timesteps | 89059    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0115   |
|    ent_coef        | 0.00529  |
|    ent_coef_loss   | 0.329    |
|    learning_rate   | 0.001    |
|    n_updates       | 88958    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.79    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 2820     |
|    fps             | 35       |
|    time_elapsed    | 2502     |
|    total_timesteps | 89165    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.00996  |
|    ent_coef        | 0.00536  |
|    ent_coef_loss   | 0.62     |
|    learning_rate   | 0.001    |
|    n_updates       | 89064    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.81    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 2824     |
|    fps             | 35       |
|    time_elapsed    | 2504     |
|    total_timesteps | 89238    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.0115   |
|    ent_coef        | 0.00549  |
|    ent_coef_loss   | -0.329   |
|    learning_rate   | 0.001    |
|    n_updates       | 89137    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2828     |
|    fps             | 35       |
|    time_elapsed    | 2507     |
|    total_timesteps | 89317    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.013    |
|    ent_coef        | 0.00554  |
|    ent_coef_loss   | 0.451    |
|    learning_rate   | 0.001    |
|    n_updates       | 89216    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.78    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2832     |
|    fps             | 35       |
|    time_elapsed    | 2510     |
|    total_timesteps | 89419    |
| train/             |          |
|    actor_loss      | 1.58     |
|    critic_loss     | 0.0533   |
|    ent_coef        | 0.00565  |
|    ent_coef_loss   | 0.0562   |
|    learning_rate   | 0.001    |
|    n_updates       | 89318    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2836     |
|    fps             | 35       |
|    time_elapsed    | 2512     |
|    total_timesteps | 89480    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0111   |
|    ent_coef        | 0.00565  |
|    ent_coef_loss   | -1.27    |
|    learning_rate   | 0.001    |
|    n_updates       | 89379    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.64    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2840     |
|    fps             | 35       |
|    time_elapsed    | 2515     |
|    total_timesteps | 89565    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00528  |
|    ent_coef        | 0.00554  |
|    ent_coef_loss   | -0.634   |
|    learning_rate   | 0.001    |
|    n_updates       | 89464    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.65    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2844     |
|    fps             | 35       |
|    time_elapsed    | 2517     |
|    total_timesteps | 89640    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.0178   |
|    ent_coef        | 0.00538  |
|    ent_coef_loss   | 0.699    |
|    learning_rate   | 0.001    |
|    n_updates       | 89539    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.64    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2848     |
|    fps             | 35       |
|    time_elapsed    | 2519     |
|    total_timesteps | 89721    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.0821   |
|    ent_coef        | 0.00532  |
|    ent_coef_loss   | -0.317   |
|    learning_rate   | 0.001    |
|    n_updates       | 89620    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.52    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2852     |
|    fps             | 35       |
|    time_elapsed    | 2522     |
|    total_timesteps | 89787    |
| train/             |          |
|    actor_loss      | 1.88     |
|    critic_loss     | 0.00708  |
|    ent_coef        | 0.0053   |
|    ent_coef_loss   | 1.4      |
|    learning_rate   | 0.001    |
|    n_updates       | 89686    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -7.42    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 2856     |
|    fps             | 35       |
|    time_elapsed    | 2525     |
|    total_timesteps | 89885    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00733  |
|    ent_coef        | 0.00528  |
|    ent_coef_loss   | 0.0241   |
|    learning_rate   | 0.001    |
|    n_updates       | 89784    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -7.54    |
|    success_rate    | 0.84     |
| time/              |          |
|    episodes        | 2860     |
|    fps             | 35       |
|    time_elapsed    | 2528     |
|    total_timesteps | 89966    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0137   |
|    ent_coef        | 0.00525  |
|    ent_coef_loss   | 0.571    |
|    learning_rate   | 0.001    |
|    n_updates       | 89865    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -7.38    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2864     |
|    fps             | 35       |
|    time_elapsed    | 2530     |
|    total_timesteps | 90031    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0112   |
|    ent_coef        | 0.00526  |
|    ent_coef_loss   | 0.617    |
|    learning_rate   | 0.001    |
|    n_updates       | 89930    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -7.4     |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 2868     |
|    fps             | 35       |
|    time_elapsed    | 2532     |
|    total_timesteps | 90111    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.00408  |
|    ent_coef        | 0.00527  |
|    ent_coef_loss   | 0.648    |
|    learning_rate   | 0.001    |
|    n_updates       | 90010    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -7.32    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 2872     |
|    fps             | 35       |
|    time_elapsed    | 2535     |
|    total_timesteps | 90201    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.00583  |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | 0.411    |
|    learning_rate   | 0.001    |
|    n_updates       | 90100    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -7.29    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 2876     |
|    fps             | 35       |
|    time_elapsed    | 2538     |
|    total_timesteps | 90283    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.00373  |
|    ent_coef        | 0.00548  |
|    ent_coef_loss   | -1.02    |
|    learning_rate   | 0.001    |
|    n_updates       | 90182    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -7.18    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2880     |
|    fps             | 35       |
|    time_elapsed    | 2540     |
|    total_timesteps | 90380    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.00716  |
|    ent_coef        | 0.00546  |
|    ent_coef_loss   | -0.52    |
|    learning_rate   | 0.001    |
|    n_updates       | 90279    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -7.21    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2884     |
|    fps             | 35       |
|    time_elapsed    | 2543     |
|    total_timesteps | 90465    |
| train/             |          |
|    actor_loss      | 1.52     |
|    critic_loss     | 0.00701  |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | -1.82    |
|    learning_rate   | 0.001    |
|    n_updates       | 90364    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -7.09    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2888     |
|    fps             | 35       |
|    time_elapsed    | 2545     |
|    total_timesteps | 90540    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.0068   |
|    ent_coef        | 0.00564  |
|    ent_coef_loss   | -0.366   |
|    learning_rate   | 0.001    |
|    n_updates       | 90439    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -7.04    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2892     |
|    fps             | 35       |
|    time_elapsed    | 2547     |
|    total_timesteps | 90613    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0063   |
|    ent_coef        | 0.00566  |
|    ent_coef_loss   | -0.211   |
|    learning_rate   | 0.001    |
|    n_updates       | 90512    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -7.04    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2896     |
|    fps             | 35       |
|    time_elapsed    | 2550     |
|    total_timesteps | 90686    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.00764  |
|    ent_coef        | 0.0056   |
|    ent_coef_loss   | 0.295    |
|    learning_rate   | 0.001    |
|    n_updates       | 90585    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -7.15    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2900     |
|    fps             | 35       |
|    time_elapsed    | 2553     |
|    total_timesteps | 90780    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00572  |
|    ent_coef        | 0.00552  |
|    ent_coef_loss   | -0.798   |
|    learning_rate   | 0.001    |
|    n_updates       | 90679    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -7.15    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2904     |
|    fps             | 35       |
|    time_elapsed    | 2555     |
|    total_timesteps | 90861    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.00822  |
|    ent_coef        | 0.00546  |
|    ent_coef_loss   | 1.34     |
|    learning_rate   | 0.001    |
|    n_updates       | 90760    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.2     |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2908     |
|    fps             | 35       |
|    time_elapsed    | 2558     |
|    total_timesteps | 90959    |
| train/             |          |
|    actor_loss      | 1.58     |
|    critic_loss     | 0.0349   |
|    ent_coef        | 0.00562  |
|    ent_coef_loss   | 0.171    |
|    learning_rate   | 0.001    |
|    n_updates       | 90858    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.19    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2912     |
|    fps             | 35       |
|    time_elapsed    | 2561     |
|    total_timesteps | 91053    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00667  |
|    ent_coef        | 0.00559  |
|    ent_coef_loss   | -0.201   |
|    learning_rate   | 0.001    |
|    n_updates       | 90952    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.03    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 2916     |
|    fps             | 35       |
|    time_elapsed    | 2563     |
|    total_timesteps | 91113    |
| train/             |          |
|    actor_loss      | 1.54     |
|    critic_loss     | 0.00656  |
|    ent_coef        | 0.00551  |
|    ent_coef_loss   | 0.654    |
|    learning_rate   | 0.001    |
|    n_updates       | 91012    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -7.02    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2920     |
|    fps             | 35       |
|    time_elapsed    | 2565     |
|    total_timesteps | 91189    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.0161   |
|    ent_coef        | 0.00554  |
|    ent_coef_loss   | 0.662    |
|    learning_rate   | 0.001    |
|    n_updates       | 91088    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -7.07    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2924     |
|    fps             | 35       |
|    time_elapsed    | 2568     |
|    total_timesteps | 91266    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.00766  |
|    ent_coef        | 0.0057   |
|    ent_coef_loss   | 0.645    |
|    learning_rate   | 0.001    |
|    n_updates       | 91165    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -7.07    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2928     |
|    fps             | 35       |
|    time_elapsed    | 2570     |
|    total_timesteps | 91346    |
| train/             |          |
|    actor_loss      | 1.53     |
|    critic_loss     | 0.00807  |
|    ent_coef        | 0.00563  |
|    ent_coef_loss   | -0.352   |
|    learning_rate   | 0.001    |
|    n_updates       | 91245    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -7.05    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2932     |
|    fps             | 35       |
|    time_elapsed    | 2573     |
|    total_timesteps | 91433    |
| train/             |          |
|    actor_loss      | 1.52     |
|    critic_loss     | 0.00803  |
|    ent_coef        | 0.00553  |
|    ent_coef_loss   | -0.757   |
|    learning_rate   | 0.001    |
|    n_updates       | 91332    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -7.17    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2936     |
|    fps             | 35       |
|    time_elapsed    | 2576     |
|    total_timesteps | 91518    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.0103   |
|    ent_coef        | 0.00561  |
|    ent_coef_loss   | 0.682    |
|    learning_rate   | 0.001    |
|    n_updates       | 91417    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -7.07    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 2940     |
|    fps             | 35       |
|    time_elapsed    | 2578     |
|    total_timesteps | 91580    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.00486  |
|    ent_coef        | 0.00558  |
|    ent_coef_loss   | 0.109    |
|    learning_rate   | 0.001    |
|    n_updates       | 91479    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -7.03    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2944     |
|    fps             | 35       |
|    time_elapsed    | 2581     |
|    total_timesteps | 91671    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.00865  |
|    ent_coef        | 0.00565  |
|    ent_coef_loss   | -0.353   |
|    learning_rate   | 0.001    |
|    n_updates       | 91570    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -6.99    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2948     |
|    fps             | 35       |
|    time_elapsed    | 2583     |
|    total_timesteps | 91740    |
| train/             |          |
|    actor_loss      | 1.58     |
|    critic_loss     | 0.0058   |
|    ent_coef        | 0.00555  |
|    ent_coef_loss   | -0.379   |
|    learning_rate   | 0.001    |
|    n_updates       | 91639    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -7.06    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 2952     |
|    fps             | 35       |
|    time_elapsed    | 2585     |
|    total_timesteps | 91818    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00846  |
|    ent_coef        | 0.00551  |
|    ent_coef_loss   | -0.0804  |
|    learning_rate   | 0.001    |
|    n_updates       | 91717    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -6.96    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 2956     |
|    fps             | 35       |
|    time_elapsed    | 2587     |
|    total_timesteps | 91890    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.0128   |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | 0.381    |
|    learning_rate   | 0.001    |
|    n_updates       | 91789    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20       |
|    ep_rew_mean     | -6.89    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 2960     |
|    fps             | 35       |
|    time_elapsed    | 2590     |
|    total_timesteps | 91967    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.00478  |
|    ent_coef        | 0.00549  |
|    ent_coef_loss   | -0.0405  |
|    learning_rate   | 0.001    |
|    n_updates       | 91866    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -6.98    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 2964     |
|    fps             | 35       |
|    time_elapsed    | 2593     |
|    total_timesteps | 92045    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.00762  |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | 0.602    |
|    learning_rate   | 0.001    |
|    n_updates       | 91944    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -6.96    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2968     |
|    fps             | 35       |
|    time_elapsed    | 2595     |
|    total_timesteps | 92133    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00947  |
|    ent_coef        | 0.00554  |
|    ent_coef_loss   | -0.477   |
|    learning_rate   | 0.001    |
|    n_updates       | 92032    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.1     |
|    ep_rew_mean     | -6.96    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 2972     |
|    fps             | 35       |
|    time_elapsed    | 2598     |
|    total_timesteps | 92209    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00723  |
|    ent_coef        | 0.0055   |
|    ent_coef_loss   | 1.03     |
|    learning_rate   | 0.001    |
|    n_updates       | 92108    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -7.03    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2976     |
|    fps             | 35       |
|    time_elapsed    | 2601     |
|    total_timesteps | 92317    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.022    |
|    ent_coef        | 0.00547  |
|    ent_coef_loss   | -0.604   |
|    learning_rate   | 0.001    |
|    n_updates       | 92216    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -7.04    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2980     |
|    fps             | 35       |
|    time_elapsed    | 2604     |
|    total_timesteps | 92399    |
| train/             |          |
|    actor_loss      | 1.6      |
|    critic_loss     | 0.0109   |
|    ent_coef        | 0.00538  |
|    ent_coef_loss   | -1.27    |
|    learning_rate   | 0.001    |
|    n_updates       | 92298    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.2     |
|    ep_rew_mean     | -7.05    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2984     |
|    fps             | 35       |
|    time_elapsed    | 2607     |
|    total_timesteps | 92490    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.0136   |
|    ent_coef        | 0.0054   |
|    ent_coef_loss   | 0.952    |
|    learning_rate   | 0.001    |
|    n_updates       | 92389    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -7.21    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 2988     |
|    fps             | 35       |
|    time_elapsed    | 2610     |
|    total_timesteps | 92579    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.0092   |
|    ent_coef        | 0.00539  |
|    ent_coef_loss   | -0.443   |
|    learning_rate   | 0.001    |
|    n_updates       | 92478    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.3     |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 2992     |
|    fps             | 35       |
|    time_elapsed    | 2612     |
|    total_timesteps | 92670    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00537  |
|    ent_coef        | 0.00548  |
|    ent_coef_loss   | -0.268   |
|    learning_rate   | 0.001    |
|    n_updates       | 92569    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.29    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 2996     |
|    fps             | 35       |
|    time_elapsed    | 2615     |
|    total_timesteps | 92758    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.00506  |
|    ent_coef        | 0.00548  |
|    ent_coef_loss   | -0.31    |
|    learning_rate   | 0.001    |
|    n_updates       | 92657    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -7.16    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 3000     |
|    fps             | 35       |
|    time_elapsed    | 2617     |
|    total_timesteps | 92824    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.018    |
|    ent_coef        | 0.00541  |
|    ent_coef_loss   | -0.0422  |
|    learning_rate   | 0.001    |
|    n_updates       | 92723    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -7.06    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 3004     |
|    fps             | 35       |
|    time_elapsed    | 2619     |
|    total_timesteps | 92899    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.0127   |
|    ent_coef        | 0.00546  |
|    ent_coef_loss   | 1.12     |
|    learning_rate   | 0.001    |
|    n_updates       | 92798    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -7.02    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 3008     |
|    fps             | 35       |
|    time_elapsed    | 2622     |
|    total_timesteps | 92986    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00813  |
|    ent_coef        | 0.00535  |
|    ent_coef_loss   | 0.537    |
|    learning_rate   | 0.001    |
|    n_updates       | 92885    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.3     |
|    ep_rew_mean     | -7.11    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 3012     |
|    fps             | 35       |
|    time_elapsed    | 2625     |
|    total_timesteps | 93086    |
| train/             |          |
|    actor_loss      | 1.77     |
|    critic_loss     | 0.0036   |
|    ent_coef        | 0.00526  |
|    ent_coef_loss   | -0.0772  |
|    learning_rate   | 0.001    |
|    n_updates       | 92985    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.26    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 3016     |
|    fps             | 35       |
|    time_elapsed    | 2627     |
|    total_timesteps | 93178    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00625  |
|    ent_coef        | 0.00537  |
|    ent_coef_loss   | -0.51    |
|    learning_rate   | 0.001    |
|    n_updates       | 93077    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.24    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 3020     |
|    fps             | 35       |
|    time_elapsed    | 2630     |
|    total_timesteps | 93265    |
| train/             |          |
|    actor_loss      | 1.57     |
|    critic_loss     | 0.0259   |
|    ent_coef        | 0.00537  |
|    ent_coef_loss   | 0.198    |
|    learning_rate   | 0.001    |
|    n_updates       | 93164    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.32    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 3024     |
|    fps             | 35       |
|    time_elapsed    | 2633     |
|    total_timesteps | 93349    |
| train/             |          |
|    actor_loss      | 1.54     |
|    critic_loss     | 0.00758  |
|    ent_coef        | 0.00536  |
|    ent_coef_loss   | -0.519   |
|    learning_rate   | 0.001    |
|    n_updates       | 93248    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.32    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 3028     |
|    fps             | 35       |
|    time_elapsed    | 2635     |
|    total_timesteps | 93422    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00843  |
|    ent_coef        | 0.0053   |
|    ent_coef_loss   | 1        |
|    learning_rate   | 0.001    |
|    n_updates       | 93321    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.27    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 3032     |
|    fps             | 35       |
|    time_elapsed    | 2637     |
|    total_timesteps | 93494    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.0272   |
|    ent_coef        | 0.00546  |
|    ent_coef_loss   | 0.437    |
|    learning_rate   | 0.001    |
|    n_updates       | 93393    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.24    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 3036     |
|    fps             | 35       |
|    time_elapsed    | 2640     |
|    total_timesteps | 93575    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00616  |
|    ent_coef        | 0.00534  |
|    ent_coef_loss   | 0.325    |
|    learning_rate   | 0.001    |
|    n_updates       | 93474    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.34    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 3040     |
|    fps             | 35       |
|    time_elapsed    | 2643     |
|    total_timesteps | 93665    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00631  |
|    ent_coef        | 0.00519  |
|    ent_coef_loss   | 0.04     |
|    learning_rate   | 0.001    |
|    n_updates       | 93564    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.44    |
|    success_rate    | 0.97     |
| time/              |          |
|    episodes        | 3044     |
|    fps             | 35       |
|    time_elapsed    | 2646     |
|    total_timesteps | 93782    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.00825  |
|    ent_coef        | 0.00511  |
|    ent_coef_loss   | 0.861    |
|    learning_rate   | 0.001    |
|    n_updates       | 93681    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.55    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 3048     |
|    fps             | 35       |
|    time_elapsed    | 2649     |
|    total_timesteps | 93872    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0182   |
|    ent_coef        | 0.00517  |
|    ent_coef_loss   | -0.236   |
|    learning_rate   | 0.001    |
|    n_updates       | 93771    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.48    |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 3052     |
|    fps             | 35       |
|    time_elapsed    | 2651     |
|    total_timesteps | 93945    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00639  |
|    ent_coef        | 0.0052   |
|    ent_coef_loss   | -0.46    |
|    learning_rate   | 0.001    |
|    n_updates       | 93844    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.5     |
|    success_rate    | 0.96     |
| time/              |          |
|    episodes        | 3056     |
|    fps             | 35       |
|    time_elapsed    | 2653     |
|    total_timesteps | 94019    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0069   |
|    ent_coef        | 0.00512  |
|    ent_coef_loss   | 0.114    |
|    learning_rate   | 0.001    |
|    n_updates       | 93918    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.62    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 3060     |
|    fps             | 35       |
|    time_elapsed    | 2657     |
|    total_timesteps | 94124    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00628  |
|    ent_coef        | 0.00511  |
|    ent_coef_loss   | 0.0386   |
|    learning_rate   | 0.001    |
|    n_updates       | 94023    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.66    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 3064     |
|    fps             | 35       |
|    time_elapsed    | 2660     |
|    total_timesteps | 94221    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.00809  |
|    ent_coef        | 0.00517  |
|    ent_coef_loss   | 0.337    |
|    learning_rate   | 0.001    |
|    n_updates       | 94120    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.66    |
|    success_rate    | 0.95     |
| time/              |          |
|    episodes        | 3068     |
|    fps             | 35       |
|    time_elapsed    | 2663     |
|    total_timesteps | 94306    |
| train/             |          |
|    actor_loss      | 1.57     |
|    critic_loss     | 0.00762  |
|    ent_coef        | 0.00523  |
|    ent_coef_loss   | -0.667   |
|    learning_rate   | 0.001    |
|    n_updates       | 94205    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.75    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 3072     |
|    fps             | 35       |
|    time_elapsed    | 2665     |
|    total_timesteps | 94394    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.00771  |
|    ent_coef        | 0.00521  |
|    ent_coef_loss   | -0.0605  |
|    learning_rate   | 0.001    |
|    n_updates       | 94293    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.77    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 3076     |
|    fps             | 35       |
|    time_elapsed    | 2668     |
|    total_timesteps | 94488    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.00686  |
|    ent_coef        | 0.00518  |
|    ent_coef_loss   | 0.0641   |
|    learning_rate   | 0.001    |
|    n_updates       | 94387    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.76    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 3080     |
|    fps             | 35       |
|    time_elapsed    | 2671     |
|    total_timesteps | 94571    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.0165   |
|    ent_coef        | 0.00511  |
|    ent_coef_loss   | -0.554   |
|    learning_rate   | 0.001    |
|    n_updates       | 94470    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.68    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 3084     |
|    fps             | 35       |
|    time_elapsed    | 2674     |
|    total_timesteps | 94651    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.0169   |
|    ent_coef        | 0.00516  |
|    ent_coef_loss   | -0.809   |
|    learning_rate   | 0.001    |
|    n_updates       | 94550    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.6     |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 3088     |
|    fps             | 35       |
|    time_elapsed    | 2676     |
|    total_timesteps | 94729    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0116   |
|    ent_coef        | 0.00523  |
|    ent_coef_loss   | -0.602   |
|    learning_rate   | 0.001    |
|    n_updates       | 94628    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.57    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 3092     |
|    fps             | 35       |
|    time_elapsed    | 2679     |
|    total_timesteps | 94808    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0125   |
|    ent_coef        | 0.00527  |
|    ent_coef_loss   | -0.146   |
|    learning_rate   | 0.001    |
|    n_updates       | 94707    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.55    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 3096     |
|    fps             | 35       |
|    time_elapsed    | 2681     |
|    total_timesteps | 94893    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.011    |
|    ent_coef        | 0.0052   |
|    ent_coef_loss   | 0.498    |
|    learning_rate   | 0.001    |
|    n_updates       | 94792    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.6     |
|    ep_rew_mean     | -7.59    |
|    success_rate    | 0.94     |
| time/              |          |
|    episodes        | 3100     |
|    fps             | 35       |
|    time_elapsed    | 2684     |
|    total_timesteps | 94983    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.0124   |
|    ent_coef        | 0.00523  |
|    ent_coef_loss   | -0.209   |
|    learning_rate   | 0.001    |
|    n_updates       | 94882    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.68    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 3104     |
|    fps             | 35       |
|    time_elapsed    | 2687     |
|    total_timesteps | 95065    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00973  |
|    ent_coef        | 0.00527  |
|    ent_coef_loss   | 0.625    |
|    learning_rate   | 0.001    |
|    n_updates       | 94964    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.71    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 3108     |
|    fps             | 35       |
|    time_elapsed    | 2690     |
|    total_timesteps | 95137    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.014    |
|    ent_coef        | 0.00538  |
|    ent_coef_loss   | -0.0863  |
|    learning_rate   | 0.001    |
|    n_updates       | 95036    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.73    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 3112     |
|    fps             | 35       |
|    time_elapsed    | 2692     |
|    total_timesteps | 95214    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.00921  |
|    ent_coef        | 0.00537  |
|    ent_coef_loss   | -0.235   |
|    learning_rate   | 0.001    |
|    n_updates       | 95113    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.3     |
|    ep_rew_mean     | -7.69    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 3116     |
|    fps             | 35       |
|    time_elapsed    | 2695     |
|    total_timesteps | 95305    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.0113   |
|    ent_coef        | 0.00532  |
|    ent_coef_loss   | 0.0596   |
|    learning_rate   | 0.001    |
|    n_updates       | 95204    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.66    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 3120     |
|    fps             | 35       |
|    time_elapsed    | 2697     |
|    total_timesteps | 95376    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.0188   |
|    ent_coef        | 0.00519  |
|    ent_coef_loss   | -0.846   |
|    learning_rate   | 0.001    |
|    n_updates       | 95275    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.59    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 3124     |
|    fps             | 35       |
|    time_elapsed    | 2700     |
|    total_timesteps | 95452    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00523  |
|    ent_coef        | 0.00507  |
|    ent_coef_loss   | 2.12     |
|    learning_rate   | 0.001    |
|    n_updates       | 95351    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.58    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 3128     |
|    fps             | 35       |
|    time_elapsed    | 2702     |
|    total_timesteps | 95518    |
| train/             |          |
|    actor_loss      | 1.83     |
|    critic_loss     | 0.013    |
|    ent_coef        | 0.00521  |
|    ent_coef_loss   | 0.561    |
|    learning_rate   | 0.001    |
|    n_updates       | 95417    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.67    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 3132     |
|    fps             | 35       |
|    time_elapsed    | 2705     |
|    total_timesteps | 95610    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.00713  |
|    ent_coef        | 0.00528  |
|    ent_coef_loss   | 0.632    |
|    learning_rate   | 0.001    |
|    n_updates       | 95509    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.78    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3136     |
|    fps             | 35       |
|    time_elapsed    | 2708     |
|    total_timesteps | 95741    |
| train/             |          |
|    actor_loss      | 1.7      |
|    critic_loss     | 0.0258   |
|    ent_coef        | 0.00532  |
|    ent_coef_loss   | -0.0934  |
|    learning_rate   | 0.001    |
|    n_updates       | 95640    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.76    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3140     |
|    fps             | 35       |
|    time_elapsed    | 2710     |
|    total_timesteps | 95812    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00987  |
|    ent_coef        | 0.00529  |
|    ent_coef_loss   | -0.325   |
|    learning_rate   | 0.001    |
|    n_updates       | 95711    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21       |
|    ep_rew_mean     | -7.66    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 3144     |
|    fps             | 35       |
|    time_elapsed    | 2713     |
|    total_timesteps | 95882    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.00606  |
|    ent_coef        | 0.00528  |
|    ent_coef_loss   | -0.746   |
|    learning_rate   | 0.001    |
|    n_updates       | 95781    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.56    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3148     |
|    fps             | 35       |
|    time_elapsed    | 2716     |
|    total_timesteps | 95957    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.0116   |
|    ent_coef        | 0.00526  |
|    ent_coef_loss   | 0.801    |
|    learning_rate   | 0.001    |
|    n_updates       | 95856    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.6     |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3152     |
|    fps             | 35       |
|    time_elapsed    | 2718     |
|    total_timesteps | 96040    |
| train/             |          |
|    actor_loss      | 1.86     |
|    critic_loss     | 0.00786  |
|    ent_coef        | 0.00527  |
|    ent_coef_loss   | 0.433    |
|    learning_rate   | 0.001    |
|    n_updates       | 95939    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.9     |
|    ep_rew_mean     | -7.6     |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3156     |
|    fps             | 35       |
|    time_elapsed    | 2720     |
|    total_timesteps | 96113    |
| train/             |          |
|    actor_loss      | 1.81     |
|    critic_loss     | 0.0108   |
|    ent_coef        | 0.00533  |
|    ent_coef_loss   | -0.318   |
|    learning_rate   | 0.001    |
|    n_updates       | 96012    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.45    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 3160     |
|    fps             | 35       |
|    time_elapsed    | 2723     |
|    total_timesteps | 96195    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0128   |
|    ent_coef        | 0.00542  |
|    ent_coef_loss   | -0.588   |
|    learning_rate   | 0.001    |
|    n_updates       | 96094    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.5     |
|    ep_rew_mean     | -7.33    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 3164     |
|    fps             | 35       |
|    time_elapsed    | 2726     |
|    total_timesteps | 96275    |
| train/             |          |
|    actor_loss      | 1.78     |
|    critic_loss     | 0.0069   |
|    ent_coef        | 0.00541  |
|    ent_coef_loss   | 1.18     |
|    learning_rate   | 0.001    |
|    n_updates       | 96174    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.37    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3168     |
|    fps             | 35       |
|    time_elapsed    | 2729     |
|    total_timesteps | 96378    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.00591  |
|    ent_coef        | 0.0054   |
|    ent_coef_loss   | 0.509    |
|    learning_rate   | 0.001    |
|    n_updates       | 96277    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.6     |
|    ep_rew_mean     | -7.25    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 3172     |
|    fps             | 35       |
|    time_elapsed    | 2731     |
|    total_timesteps | 96456    |
| train/             |          |
|    actor_loss      | 1.72     |
|    critic_loss     | 0.00785  |
|    ent_coef        | 0.00536  |
|    ent_coef_loss   | -1.62    |
|    learning_rate   | 0.001    |
|    n_updates       | 96355    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.4     |
|    ep_rew_mean     | -7.08    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 3176     |
|    fps             | 35       |
|    time_elapsed    | 2733     |
|    total_timesteps | 96533    |
| train/             |          |
|    actor_loss      | 1.76     |
|    critic_loss     | 0.00567  |
|    ent_coef        | 0.00532  |
|    ent_coef_loss   | 0.889    |
|    learning_rate   | 0.001    |
|    n_updates       | 96432    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.3     |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3180     |
|    fps             | 35       |
|    time_elapsed    | 2737     |
|    total_timesteps | 96637    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.0067   |
|    ent_coef        | 0.00546  |
|    ent_coef_loss   | -0.934   |
|    learning_rate   | 0.001    |
|    n_updates       | 96536    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.38    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3184     |
|    fps             | 35       |
|    time_elapsed    | 2739     |
|    total_timesteps | 96718    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.0122   |
|    ent_coef        | 0.00537  |
|    ent_coef_loss   | 0.0713   |
|    learning_rate   | 0.001    |
|    n_updates       | 96617    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.45    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3188     |
|    fps             | 35       |
|    time_elapsed    | 2742     |
|    total_timesteps | 96799    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0174   |
|    ent_coef        | 0.00546  |
|    ent_coef_loss   | -0.71    |
|    learning_rate   | 0.001    |
|    n_updates       | 96698    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.7     |
|    ep_rew_mean     | -7.4     |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3192     |
|    fps             | 35       |
|    time_elapsed    | 2744     |
|    total_timesteps | 96881    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.0066   |
|    ent_coef        | 0.00562  |
|    ent_coef_loss   | -0.836   |
|    learning_rate   | 0.001    |
|    n_updates       | 96780    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.38    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3196     |
|    fps             | 35       |
|    time_elapsed    | 2747     |
|    total_timesteps | 96972    |
| train/             |          |
|    actor_loss      | 1.89     |
|    critic_loss     | 0.0358   |
|    ent_coef        | 0.00572  |
|    ent_coef_loss   | -0.376   |
|    learning_rate   | 0.001    |
|    n_updates       | 96871    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.46    |
|    success_rate    | 0.85     |
| time/              |          |
|    episodes        | 3200     |
|    fps             | 35       |
|    time_elapsed    | 2750     |
|    total_timesteps | 97066    |
| train/             |          |
|    actor_loss      | 1.85     |
|    critic_loss     | 0.0772   |
|    ent_coef        | 0.00567  |
|    ent_coef_loss   | -0.675   |
|    learning_rate   | 0.001    |
|    n_updates       | 96965    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 20.8     |
|    ep_rew_mean     | -7.42    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3204     |
|    fps             | 35       |
|    time_elapsed    | 2752     |
|    total_timesteps | 97149    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.00721  |
|    ent_coef        | 0.00555  |
|    ent_coef_loss   | -0.543   |
|    learning_rate   | 0.001    |
|    n_updates       | 97048    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.1     |
|    ep_rew_mean     | -7.44    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 3208     |
|    fps             | 35       |
|    time_elapsed    | 2756     |
|    total_timesteps | 97245    |
| train/             |          |
|    actor_loss      | 1.79     |
|    critic_loss     | 0.007    |
|    ent_coef        | 0.00557  |
|    ent_coef_loss   | -0.887   |
|    learning_rate   | 0.001    |
|    n_updates       | 97144    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.41    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 3212     |
|    fps             | 35       |
|    time_elapsed    | 2758     |
|    total_timesteps | 97334    |
| train/             |          |
|    actor_loss      | 1.56     |
|    critic_loss     | 0.00713  |
|    ent_coef        | 0.00563  |
|    ent_coef_loss   | 0.121    |
|    learning_rate   | 0.001    |
|    n_updates       | 97233    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.2     |
|    ep_rew_mean     | -7.44    |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 3216     |
|    fps             | 35       |
|    time_elapsed    | 2761     |
|    total_timesteps | 97425    |
| train/             |          |
|    actor_loss      | 1.71     |
|    critic_loss     | 0.0117   |
|    ent_coef        | 0.00567  |
|    ent_coef_loss   | -0.188   |
|    learning_rate   | 0.001    |
|    n_updates       | 97324    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.5     |
|    ep_rew_mean     | -7.58    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3220     |
|    fps             | 35       |
|    time_elapsed    | 2764     |
|    total_timesteps | 97525    |
| train/             |          |
|    actor_loss      | 1.73     |
|    critic_loss     | 0.00615  |
|    ent_coef        | 0.0057   |
|    ent_coef_loss   | -0.36    |
|    learning_rate   | 0.001    |
|    n_updates       | 97424    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.71    |
|    success_rate    | 0.86     |
| time/              |          |
|    episodes        | 3224     |
|    fps             | 35       |
|    time_elapsed    | 2768     |
|    total_timesteps | 97626    |
| train/             |          |
|    actor_loss      | 1.74     |
|    critic_loss     | 0.00951  |
|    ent_coef        | 0.00564  |
|    ent_coef_loss   | 0.514    |
|    learning_rate   | 0.001    |
|    n_updates       | 97525    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.7     |
|    success_rate    | 0.87     |
| time/              |          |
|    episodes        | 3228     |
|    fps             | 35       |
|    time_elapsed    | 2771     |
|    total_timesteps | 97712    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0172   |
|    ent_coef        | 0.00565  |
|    ent_coef_loss   | 0.369    |
|    learning_rate   | 0.001    |
|    n_updates       | 97611    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.56    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 3232     |
|    fps             | 35       |
|    time_elapsed    | 2773     |
|    total_timesteps | 97790    |
| train/             |          |
|    actor_loss      | 1.63     |
|    critic_loss     | 0.0134   |
|    ent_coef        | 0.00581  |
|    ent_coef_loss   | 1.05     |
|    learning_rate   | 0.001    |
|    n_updates       | 97689    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.4     |
|    ep_rew_mean     | -7.48    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 3236     |
|    fps             | 35       |
|    time_elapsed    | 2775     |
|    total_timesteps | 97877    |
| train/             |          |
|    actor_loss      | 1.65     |
|    critic_loss     | 0.00674  |
|    ent_coef        | 0.00579  |
|    ent_coef_loss   | -0.485   |
|    learning_rate   | 0.001    |
|    n_updates       | 97776    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.7     |
|    ep_rew_mean     | -7.53    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 3240     |
|    fps             | 35       |
|    time_elapsed    | 2779     |
|    total_timesteps | 97978    |
| train/             |          |
|    actor_loss      | 1.66     |
|    critic_loss     | 0.00828  |
|    ent_coef        | 0.00587  |
|    ent_coef_loss   | 0.26     |
|    learning_rate   | 0.001    |
|    n_updates       | 97877    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.49    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 3244     |
|    fps             | 35       |
|    time_elapsed    | 2782     |
|    total_timesteps | 98063    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.0134   |
|    ent_coef        | 0.00588  |
|    ent_coef_loss   | -0.731   |
|    learning_rate   | 0.001    |
|    n_updates       | 97962    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22       |
|    ep_rew_mean     | -7.58    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 3248     |
|    fps             | 35       |
|    time_elapsed    | 2785     |
|    total_timesteps | 98156    |
| train/             |          |
|    actor_loss      | 1.51     |
|    critic_loss     | 0.0214   |
|    ent_coef        | 0.00592  |
|    ent_coef_loss   | -0.994   |
|    learning_rate   | 0.001    |
|    n_updates       | 98055    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.57    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 3252     |
|    fps             | 35       |
|    time_elapsed    | 2787     |
|    total_timesteps | 98235    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.0434   |
|    ent_coef        | 0.00576  |
|    ent_coef_loss   | 1.39     |
|    learning_rate   | 0.001    |
|    n_updates       | 98134    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.65    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 3256     |
|    fps             | 35       |
|    time_elapsed    | 2790     |
|    total_timesteps | 98329    |
| train/             |          |
|    actor_loss      | 1.51     |
|    critic_loss     | 0.0137   |
|    ent_coef        | 0.00606  |
|    ent_coef_loss   | -1.71    |
|    learning_rate   | 0.001    |
|    n_updates       | 98228    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.74    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 3260     |
|    fps             | 35       |
|    time_elapsed    | 2792     |
|    total_timesteps | 98415    |
| train/             |          |
|    actor_loss      | 1.51     |
|    critic_loss     | 0.00523  |
|    ent_coef        | 0.00596  |
|    ent_coef_loss   | -0.502   |
|    learning_rate   | 0.001    |
|    n_updates       | 98314    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.3     |
|    ep_rew_mean     | -7.83    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 3264     |
|    fps             | 35       |
|    time_elapsed    | 2796     |
|    total_timesteps | 98508    |
| train/             |          |
|    actor_loss      | 1.5      |
|    critic_loss     | 0.0185   |
|    ent_coef        | 0.00591  |
|    ent_coef_loss   | -0.306   |
|    learning_rate   | 0.001    |
|    n_updates       | 98407    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.68    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 3268     |
|    fps             | 35       |
|    time_elapsed    | 2797     |
|    total_timesteps | 98570    |
| train/             |          |
|    actor_loss      | 1.67     |
|    critic_loss     | 0.0106   |
|    ent_coef        | 0.00591  |
|    ent_coef_loss   | -0.879   |
|    learning_rate   | 0.001    |
|    n_updates       | 98469    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.7     |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 3272     |
|    fps             | 35       |
|    time_elapsed    | 2800     |
|    total_timesteps | 98646    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.00515  |
|    ent_coef        | 0.00599  |
|    ent_coef_loss   | 0.526    |
|    learning_rate   | 0.001    |
|    n_updates       | 98545    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.1     |
|    ep_rew_mean     | -7.86    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 3276     |
|    fps             | 35       |
|    time_elapsed    | 2803     |
|    total_timesteps | 98748    |
| train/             |          |
|    actor_loss      | 1.64     |
|    critic_loss     | 0.00643  |
|    ent_coef        | 0.00609  |
|    ent_coef_loss   | 1.47     |
|    learning_rate   | 0.001    |
|    n_updates       | 98647    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.7     |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 3280     |
|    fps             | 35       |
|    time_elapsed    | 2805     |
|    total_timesteps | 98831    |
| train/             |          |
|    actor_loss      | 1.48     |
|    critic_loss     | 0.00444  |
|    ent_coef        | 0.0062   |
|    ent_coef_loss   | 0.211    |
|    learning_rate   | 0.001    |
|    n_updates       | 98730    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.72    |
|    success_rate    | 0.88     |
| time/              |          |
|    episodes        | 3284     |
|    fps             | 35       |
|    time_elapsed    | 2809     |
|    total_timesteps | 98912    |
| train/             |          |
|    actor_loss      | 1.58     |
|    critic_loss     | 0.0076   |
|    ent_coef        | 0.00609  |
|    ent_coef_loss   | 1.01     |
|    learning_rate   | 0.001    |
|    n_updates       | 98811    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.48    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 3288     |
|    fps             | 35       |
|    time_elapsed    | 2810     |
|    total_timesteps | 98976    |
| train/             |          |
|    actor_loss      | 1.58     |
|    critic_loss     | 0.0114   |
|    ent_coef        | 0.00619  |
|    ent_coef_loss   | -0.14    |
|    learning_rate   | 0.001    |
|    n_updates       | 98875    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22       |
|    ep_rew_mean     | -7.56    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 3292     |
|    fps             | 35       |
|    time_elapsed    | 2813     |
|    total_timesteps | 99077    |
| train/             |          |
|    actor_loss      | 1.5      |
|    critic_loss     | 0.00498  |
|    ent_coef        | 0.00613  |
|    ent_coef_loss   | -1.42    |
|    learning_rate   | 0.001    |
|    n_updates       | 98976    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.56    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 3296     |
|    fps             | 35       |
|    time_elapsed    | 2816     |
|    total_timesteps | 99165    |
| train/             |          |
|    actor_loss      | 1.58     |
|    critic_loss     | 0.0283   |
|    ent_coef        | 0.00619  |
|    ent_coef_loss   | -1.27    |
|    learning_rate   | 0.001    |
|    n_updates       | 99064    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.54    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 3300     |
|    fps             | 35       |
|    time_elapsed    | 2819     |
|    total_timesteps | 99253    |
| train/             |          |
|    actor_loss      | 1.61     |
|    critic_loss     | 0.0116   |
|    ent_coef        | 0.00619  |
|    ent_coef_loss   | 0.374    |
|    learning_rate   | 0.001    |
|    n_updates       | 99152    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.64    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 3304     |
|    fps             | 35       |
|    time_elapsed    | 2821     |
|    total_timesteps | 99331    |
| train/             |          |
|    actor_loss      | 1.75     |
|    critic_loss     | 0.00695  |
|    ent_coef        | 0.00614  |
|    ent_coef_loss   | 0.372    |
|    learning_rate   | 0.001    |
|    n_updates       | 99230    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.64    |
|    success_rate    | 0.89     |
| time/              |          |
|    episodes        | 3308     |
|    fps             | 35       |
|    time_elapsed    | 2824     |
|    total_timesteps | 99436    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.0116   |
|    ent_coef        | 0.00627  |
|    ent_coef_loss   | -0.597   |
|    learning_rate   | 0.001    |
|    n_updates       | 99335    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.4     |
|    ep_rew_mean     | -7.67    |
|    success_rate    | 0.9      |
| time/              |          |
|    episodes        | 3312     |
|    fps             | 35       |
|    time_elapsed    | 2828     |
|    total_timesteps | 99570    |
| train/             |          |
|    actor_loss      | 1.59     |
|    critic_loss     | 0.00918  |
|    ent_coef        | 0.00616  |
|    ent_coef_loss   | -0.851   |
|    learning_rate   | 0.001    |
|    n_updates       | 99469    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.2     |
|    ep_rew_mean     | -7.57    |
|    success_rate    | 0.91     |
| time/              |          |
|    episodes        | 3316     |
|    fps             | 35       |
|    time_elapsed    | 2831     |
|    total_timesteps | 99650    |
| train/             |          |
|    actor_loss      | 1.48     |
|    critic_loss     | 0.0146   |
|    ent_coef        | 0.00601  |
|    ent_coef_loss   | 0.455    |
|    learning_rate   | 0.001    |
|    n_updates       | 99549    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.37    |
|    success_rate    | 0.92     |
| time/              |          |
|    episodes        | 3320     |
|    fps             | 35       |
|    time_elapsed    | 2833     |
|    total_timesteps | 99713    |
| train/             |          |
|    actor_loss      | 1.52     |
|    critic_loss     | 0.0198   |
|    ent_coef        | 0.00608  |
|    ent_coef_loss   | 0.534    |
|    learning_rate   | 0.001    |
|    n_updates       | 99612    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.8     |
|    ep_rew_mean     | -7.32    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 3324     |
|    fps             | 35       |
|    time_elapsed    | 2836     |
|    total_timesteps | 99809    |
| train/             |          |
|    actor_loss      | 1.68     |
|    critic_loss     | 0.00713  |
|    ent_coef        | 0.00606  |
|    ent_coef_loss   | -0.392   |
|    learning_rate   | 0.001    |
|    n_updates       | 99708    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 21.9     |
|    ep_rew_mean     | -7.4     |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 3328     |
|    fps             | 35       |
|    time_elapsed    | 2839     |
|    total_timesteps | 99903    |
| train/             |          |
|    actor_loss      | 1.69     |
|    critic_loss     | 0.0114   |
|    ent_coef        | 0.00596  |
|    ent_coef_loss   | -0.102   |
|    learning_rate   | 0.001    |
|    n_updates       | 99802    |
---------------------------------


---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22       |
|    ep_rew_mean     | -7.46    |
|    success_rate    | 0.93     |
| time/              |          |
|    episodes        | 3332     |
|    fps             | 35       |
|    time_elapsed    | 2842     |
|    total_timesteps | 99990    |
| train/             |          |
|    actor_loss      | 1.56     |
|    critic_loss     | 0.0135   |
|    ent_coef        | 0.00612  |
|    ent_coef_loss   | 1.11     |
|    learning_rate   | 0.001    |
|    n_updates       | 99889    |
---------------------------------
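
The `success_rate` and `ep_rew_mean` rows in the tables above move slowly because they are averages over a window of recent episodes, not over the whole run. The sketch below illustrates that idea with a fixed-size window. The class name `RollingStats` and the window size used in the demo are hypothetical; Stable Baselines3 is assumed here to use a window of 100 episodes by default.

```python
from collections import deque


class RollingStats:
    """Hypothetical helper: keeps the last `window` episodes and averages them."""

    def __init__(self, window=100):
        # deque(maxlen=...) silently drops the oldest entry once it is full
        self.returns = deque(maxlen=window)
        self.successes = deque(maxlen=window)

    def record(self, episode_return, is_success):
        self.returns.append(float(episode_return))
        self.successes.append(bool(is_success))

    @property
    def mean_return(self):
        return sum(self.returns) / len(self.returns) if self.returns else float("nan")

    @property
    def success_rate(self):
        return sum(self.successes) / len(self.successes) if self.successes else float("nan")


# Tiny demo with a window of 3 so the effect is visible after 4 episodes
stats = RollingStats(window=3)
for episode_return, is_success in [(-14.0, False), (-4.0, True), (-3.0, True), (-7.0, True)]:
    stats.record(episode_return, is_success)

print(stats.mean_return, stats.success_rate)
```

With a window of 3, the failed first episode has already dropped out of both averages by the time the fourth episode is recorded.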


In [9]:
# Load saved model
model = SAC.load('her_sac_highway', env=env)

Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.


#### Evaluate the agent

In [10]:
# We use the Gymnasium API (gym >= 0.26) here. Alternatively, you could wrap the env
# in a DummyVecEnv, which exposes a simplified API.
env = gym.make("parking-v0", render_mode='human')
obs, _ = env.reset()

# Evaluate the agent
episode_reward = 0
for _ in range(1000):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    done = truncated or terminated
    episode_reward += reward
    if done or info.get("is_success", False):
        print("Reward:", episode_reward, "Success?", info.get("is_success", False))
        episode_reward = 0.0
        obs, _ = env.reset()

Reward: -14.02933825195031 Success? False
Reward: -4.292026703223534 Success? True
Reward: -3.3561244454838484 Success? True
Reward: -7.547303828170518 Success? True
Reward: -7.472455876440916 Success? True
Reward: -4.465990762532999 Success? True
Reward: -9.080535199063654 Success? True
Reward: -14.035456180355112 Success? False
Reward: -3.3008733818315537 Success? True
Reward: -3.6229664996383733 Success? True
Reward: -5.453174626287506 Success? True
Reward: -8.509336181498218 Success? True
Reward: -11.272076396377935 Success? False
Reward: -6.458645907869372 Success? True
Reward: -5.677454030243229 Success? True
Reward: -9.755779395594539 Success? True
Reward: -10.655741777283897 Success? False
Reward: -6.037085171870146 Success? True
Reward: -5.837690563015087 Success? True
Reward: -6.083123881479254 Success? True
Reward: -6.871504676029186 Success? True
Reward: -4.659079128745899 Success? True
Reward: -7.600179101142373 Success? True
Reward: -4.314687384154712 Success? True
Reward: -7.445119107311646 Success? True
Reward: -11.026843136710543 Success? False
Reward: -4.597809732597206 Success? True
Reward: -7.24150509051351 Success? True
Reward: -9.065921772442312 Success? True
Reward: -6.037721566229804 Success? True
Reward: -5.315352451327499 Success? True
Reward: -9.03973155302633 Success? True
Reward: -11.119747609274633 Success? True
Reward: -5.869926318066487 Success? True
Reward: -7.683574227648486 Success? True
Reward: -12.443232152406154 Success? False
Reward: -8.505044117509549 Success? True
Reward: -4.286532527640324 Success? True
Reward: -7.35060218733332 Success? True
Reward: -10.622005543915572 Success? True
Reward: -5.368978538493507 Success? True
Reward: -5.351364012076045 Success? True
Reward: -14.939772157346418 Success? False
Reward: -8.554662582422718 Success? True
Reward: -5.77329522802678 Success? True
Reward: -4.386760650695649 Success? True
Reward: -12.14434853331851 Success? False
Reward: -7.5720899512340525 Success? True
Reward: -5.481267354746264 Success? True
Reward: -8.66498671176428 Success? True
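
The per-episode lines printed by the evaluation loop are easier to compare across agents once they are reduced to a mean reward and a success rate. The following is a minimal sketch that parses text in the `Reward: ... Success? ...` format printed above. The function name `summarize` and the three-line `sample` string are illustrative only; paste the full cell output in place of `sample` to summarize a real run.

```python
import re
from statistics import mean

# Matches lines such as "Reward: -4.292026703223534 Success? True"
LINE_RE = re.compile(r"Reward:\s*(-?\d+(?:\.\d+)?)\s+Success\?\s+(True|False)")


def summarize(log_text):
    """Return episode count, mean reward and success rate for printed eval lines."""
    episodes = [(float(r), s == "True") for r, s in LINE_RE.findall(log_text)]
    if not episodes:
        raise ValueError("no 'Reward: ... Success? ...' lines found")
    rewards = [r for r, _ in episodes]
    successes = [s for _, s in episodes]
    return {
        "episodes": len(episodes),
        "mean_reward": mean(rewards),
        "success_rate": sum(successes) / len(successes),
    }


# First three lines of the output above, used as a small sample
sample = """Reward: -14.02933825195031 Success? False
Reward: -4.292026703223534 Success? True
Reward: -3.3561244454838484 Success? True"""

summary = summarize(sample)
print(summary)
```

Lines that do not match the pattern (for example stray warnings) are ignored by the regular expression, so the raw cell output can be pasted in unchanged.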


### Train DDPG agent

In [None]:
# Create the action noise object that will be used for exploration
n_actions = env.action_space.shape[0]
noise_std = 0.2
action_noise = NormalActionNoise(
    mean=np.zeros(n_actions), sigma=noise_std * np.ones(n_actions)
)

model = DDPG(
    "MultiInputPolicy",
    env,
    replay_buffer_class=HerReplayBuffer,
    replay_buffer_kwargs=dict(
        n_sampled_goal=4,
        goal_selection_strategy="future",
    ),
    verbose=1,
    buffer_size=int(1e6),
    learning_rate=1e-3,
    action_noise=action_noise,
    gamma=0.95,
    batch_size=256,
    policy_kwargs=dict(net_arch=[256, 256, 256]),
)
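
The `goal_selection_strategy="future"` and `n_sampled_goal=4` settings above control how hindsight goals are chosen: for each stored transition, additional "virtual" transitions are created whose goal is a state that was actually achieved later in the same episode. The toy sketch below illustrates that idea only. It is not how `HerReplayBuffer` is implemented (the library may, for instance, relabel at sampling time), and the names `relabel_future`, `sparse_reward` and the 1-D goals are all hypothetical.

```python
import random


def sparse_reward(achieved, desired, tol=0.5):
    """Hypothetical reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if abs(achieved - desired) < tol else -1.0


def relabel_future(episode, n_sampled_goal=4, reward_fn=sparse_reward, rng=None):
    """Create n_sampled_goal virtual transitions per step using the 'future' strategy."""
    rng = rng or random.Random(0)
    last = len(episode) - 1
    relabeled = []
    for t, step in enumerate(episode):
        for _ in range(n_sampled_goal):
            # Pick a step from the same episode at or after the current one
            future_t = rng.randint(t, last)
            new_goal = episode[future_t]["achieved_goal"]
            relabeled.append(
                {
                    "t": t,
                    "achieved_goal": step["achieved_goal"],
                    "desired_goal": new_goal,  # goal replaced in hindsight
                    "reward": reward_fn(step["achieved_goal"], new_goal),
                }
            )
    return relabeled


# A 3-step toy episode that never reaches its original goal of 10.0
episode = [
    {"achieved_goal": 1.0, "desired_goal": 10.0},
    {"achieved_goal": 2.0, "desired_goal": 10.0},
    {"achieved_goal": 3.0, "desired_goal": 10.0},
]
virtual = relabel_future(episode)
print(len(virtual), "virtual transitions")
```

Even though the toy episode fails with respect to its original goal, some relabeled transitions carry a reward of 0, which is what gives the off-policy learner a useful signal under sparse rewards.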

In [None]:
# Train for 2e5 steps
model.learn(int(2e5))
# Save the trained agent
model.save('her_ddpg_highway')

In [None]:
# Load saved model
model = DDPG.load('her_ddpg_highway', env=env)

#### Evaluate the agent

In [None]:
# We use the Gymnasium API (gym >= 0.26) here. Alternatively, you could wrap the env
# in a DummyVecEnv, which exposes the older, simplified gym-style API.
obs, _ = env.reset()

# Evaluate the agent
episode_reward = 0
for _ in range(1000):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    done = truncated or terminated
    episode_reward += reward
    if done or info.get("is_success", False):
        print("Reward:", episode_reward, "Success?", info.get("is_success", False))
        episode_reward = 0.0
        obs, _ = env.reset()