# **Project Description**

This notebook presents a Deep Reinforcement Learning (DRL) solution for the continuous problem of dynamic portfolio optimization.

The goal is to train an autonomous agent to manage and rebalance a portfolio of assets over time, aiming to maximize risk-adjusted returns in a complex, non-stationary market environment.

 **Methodology**:

 **Proximal Policy Optimization** (PPO)

 The core of this project is an agent trained using the Proximal Policy Optimization (PPO) algorithm.
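As a rough illustration of what PPO optimizes, the clipped surrogate objective can be sketched in a few lines of NumPy. This is a minimal sketch of the standard formula, not Stable-Baselines3's internal implementation; the function name `ppo_clipped_objective` is hypothetical, and the default `clip_range=0.2` is chosen to match the `clip_range` value reported in the training log.

```python
import numpy as np

def ppo_clipped_objective(ratio, advantage, clip_range=0.2):
    """Mean clipped surrogate objective (the quantity PPO maximizes).

    ratio     : new_policy_prob / old_policy_prob for each sampled action
    advantage : advantage estimate for each sampled action
    """
    ratio = np.asarray(ratio, dtype=float)
    advantage = np.asarray(advantage, dtype=float)
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - clip_range, 1.0 + clip_range) * advantage
    # Taking the element-wise minimum removes the incentive to move the
    # policy far outside the [1 - clip_range, 1 + clip_range] band.
    return float(np.mean(np.minimum(unclipped, clipped)))

# Two samples: one with a positive advantage and a ratio above the band,
# one with a negative advantage and a ratio below the band.
example = ppo_clipped_objective([1.5, 0.5], [1.0, -1.0])
```

In the first sample the ratio is clipped to 1.2, so the gain is capped; in the second the clipped term (-0.8) is the smaller one, so the penalty is kept.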

 **Environment**:

 The financial market is modeled as a custom Reinforcement Learning Environment, where the agent interacts with a set of 20 European assets (from Germany and Italy) plus a cash holding.

 **Observation Space** (State):

 The agent's decision-making is informed by a 120-feature state vector: six standardized features for each of the 20 assets.

 This includes various technical indicators such as:

 **Momentum Metrics**: To capture price trends.

 **Relative Strength Indicators**: To measure an asset's performance against the broader market.

 **Market Volatility Metrics**: To assess market risk and uncertainty.
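The indicator families above can be sketched for a single asset as follows. The formulas mirror those in this notebook's `prepare_multi_asset_data` function, but the price and market series here are synthetic (an assumption made so the example runs without a data download), and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for one asset's closing prices and the benchmark's
# daily returns (assumption: real data would come from yfinance).
rng = np.random.default_rng(0)
close = pd.Series(100 * np.cumprod(1 + rng.normal(0, 0.01, 60)))
market_return = pd.Series(rng.normal(0, 0.008, 60))

features = pd.DataFrame({
    # Momentum: 5-day return, and 20-day return scaled by 20-day price std.
    "momentum_5d": close.pct_change(5),
    "vol_adjusted_momentum": close.pct_change(20) / close.rolling(20).std(),
    # Relative strength: asset return minus the previous day's market return.
    "relative_strength": close.pct_change(1) - market_return.shift(1),
    # Market volatility: 20-day change in 20-day rolling market volatility.
    "market_vol_change": market_return.rolling(20).std().pct_change(20),
    # Trend: 10-day EMA relative to the 20-day simple moving average.
    "ema_ratio": close.ewm(span=10, adjust=False).mean() / close.rolling(20).mean(),
}).dropna()
```

The rolling windows consume the first rows of the sample, which is why the full pipeline drops missing values before scaling.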

 **Action Space**

 At each time step, the agent outputs a 21-dimensional action vector, which a softmax converts into portfolio weights that allocate capital across the 20 assets and cash.
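The mapping from the agent's raw output to portfolio weights can be sketched as below. The notebook's environment applies `scipy.special.softmax` to the raw action; this sketch uses an equivalent NumPy implementation, and the helper name `action_to_weights` is hypothetical.

```python
import numpy as np

def action_to_weights(raw_action):
    """Convert a raw action vector into long-only weights that sum to 1."""
    raw_action = np.asarray(raw_action, dtype=np.float64)
    # Subtracting the maximum is a standard numerical-stability step;
    # it does not change the result.
    exp = np.exp(raw_action - raw_action.max())
    return exp / exp.sum()

# Toy action with two assets plus cash (the last entry is the cash slot).
raw = np.array([1.0, -1.0, 0.0])
weights = action_to_weights(raw)
asset_weights, cash_weight = weights[:-1], weights[-1]
```

Because softmax outputs are strictly positive, this design cannot short an asset or set a weight exactly to zero. If raw actions stay within the declared bounds of [-1, 1], no weight can exceed another by more than a factor of e².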

 **Reward Function**

 The agent is guided by a novel reward function based on the **Adjusted Sharpe Ratio**.

 This function is designed to penalize the agent not just for low returns, but also for taking on excessive risk or for underperforming the market benchmark, the **STOXX** index.
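One reading of the Adjusted Sharpe Ratio, transcribed from the `score_multi_asset` function in this notebook's code cell, is given below. The symbols are introduced here for illustration: $r_t$ is the strategy's daily return, $r_f$ the daily risk-free rate, $\sigma_s$ and $\sigma_b$ the annualized volatilities of the strategy and benchmark returns, and $\mu_s$ and $\mu_b$ their annualized mean excess returns.

$$
S = \frac{\operatorname{mean}(r_t - r_f)}{\operatorname{std}(r_t - r_f)}\,\sqrt{252}
$$

$$
P_{\text{vol}} = 1 + \max\!\left(0,\ \frac{\sigma_s}{\sigma_b} - 0.9\right), \qquad
P_{\text{ret}} = 1 + \frac{\max\!\left(0,\ 100\,(\mu_b - \mu_s)\right)^{2}}{100}
$$

$$
S_{\text{adj}} = \frac{S}{P_{\text{vol}}\,P_{\text{ret}}}
$$

In that function the benchmark series is the column `forward_returns_PORTFOLIO`, the equal-weighted mean of the 20 assets' next-day returns, and the score is fixed at -100 when either volatility is zero.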

 **WARNING**:

 This notebook does not provide financial advice.



In [1]:
pip install stable-baselines3 torch

Collecting stable-baselines3
  Downloading stable_baselines3-2.7.1-py3-none-any.whl.metadata (4.8 kB)
Downloading stable_baselines3-2.7.1-py3-none-any.whl (188 kB)
Installing collected packages: stable-baselines3
Successfully installed stable-baselines3-2.7.1


In [9]:
import numpy as np
import pandas as pd
import yfinance as yf
from sklearn.preprocessing import StandardScaler
from collections import deque
import random
import time
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import DummyVecEnv
from stable_baselines3.common.callbacks import BaseCallback
from scipy.special import softmax


In [10]:



TICKERS_BY_COUNTRY = {
    'Germany': ['ALV.DE', 'DBK.DE', 'CBK.DE', 'HAG.DE', 'DB1.DE', 'FPE.DE', 'DHER.DE', 'MUV2.DE', 'VNA.DE', 'SDF.DE'],
    'Italy': ['ISP.MI', 'UCG.MI', 'BAMI.MI', 'BMED.MI', 'FBK.MI', 'G.MI', 'AZM.MI', 'PST.MI', 'RACE.MI', 'IP.MI']
}
ALL_TICKERS = [ticker for country in TICKERS_BY_COUNTRY.values() for ticker in country]
TICKERS = ALL_TICKERS
N_ASSETS = len(TICKERS)
START_DATE = '2021-01-01'
END_DATE = '2025-09-30'

MARKET_TICKER = '^STOXX'
RISK_FREE_RATE_ANNUAL = 0.015
TRADING_DAYS_PER_YEAR = 252


class ParticipantVisibleError(Exception):
    pass

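def score_multi_asset(solution: pd.DataFrame, submission: pd.DataFrame) -> float:
    """Return the adjusted Sharpe ratio of `submission['prediction']`.

    The annualized Sharpe ratio is divided by two penalties: one for
    volatility above 0.9x the benchmark's, and one for trailing the
    benchmark's mean excess return. Returns -100.0 if either volatility is zero.
    """
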
    if not submission.index.equals(solution.index):
        solution = solution.reindex(submission.index)


    temp_df = solution.copy()
    temp_df['strategy_returns'] = submission['prediction']


    strategy_excess_returns = temp_df['strategy_returns'] - temp_df['risk_free_rate']
    strategy_std_excess = strategy_excess_returns.std()

    if strategy_std_excess == 0:
        return -100.0


    mean_excess_return_daily = strategy_excess_returns.mean()
    strategy_mean_excess_return = mean_excess_return_daily * TRADING_DAYS_PER_YEAR
    sharpe = mean_excess_return_daily / strategy_std_excess * np.sqrt(TRADING_DAYS_PER_YEAR)


    strategy_volatility = float(temp_df['strategy_returns'].std() * np.sqrt(TRADING_DAYS_PER_YEAR) * 100)


    market_excess_returns = temp_df['forward_returns_PORTFOLIO'] - temp_df['risk_free_rate']
    mean_market_excess_return_daily = market_excess_returns.mean()
    market_mean_excess_return = mean_market_excess_return_daily * TRADING_DAYS_PER_YEAR

    market_std = temp_df['forward_returns_PORTFOLIO'].std()
    market_volatility = float(market_std * np.sqrt(TRADING_DAYS_PER_YEAR) * 100)

    if market_volatility == 0:
        return -100.0


    excess_vol = max(0, strategy_volatility / market_volatility - 0.9)
    vol_penalty = 1 + excess_vol


    return_gap = max(
        0,
        (market_mean_excess_return - strategy_mean_excess_return) * 100,
    )
    return_penalty = 1 + (return_gap**2) / 100


    adjusted_sharpe = sharpe / (vol_penalty * return_penalty)
    return min(float(adjusted_sharpe), 1_000_000)



def prepare_multi_asset_data(tickers, market_ticker, start_date, end_date, risk_free_rate_annual):


    print(f"--- 1. Fetching and Preparing Data for {len(tickers)} Assets using {market_ticker} as Benchmark ---")


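    # Unwrap a list-valued benchmark before building the download list;
    # otherwise a nested list would be passed to yfinance.
    if isinstance(market_ticker, list):
        market_ticker = market_ticker[0]

    all_symbols = tickers + [market_ticker]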

    data_raw = yf.download(all_symbols, start=start_date, end=end_date, auto_adjust=False)['Close']


    market_close = data_raw[market_ticker]
    market_returns = market_close.pct_change()
    risk_free_rate = risk_free_rate_annual / TRADING_DAYS_PER_YEAR


    data_frames = []

    for ticker in tickers:
        df_asset = pd.DataFrame(index=data_raw.index)
        df_asset['close'] = data_raw[ticker]
        df_asset['market_return'] = market_returns
        df_asset['risk_free_rate'] = risk_free_rate


        df_asset['vol_adjusted_momentum'] = df_asset['close'].pct_change(20) / df_asset['close'].rolling(20).std()


        df_asset['relative_strength'] = df_asset['close'].pct_change(1) - df_asset['market_return'].shift(1)


        market_vol_20d = df_asset['market_return'].rolling(window=20).std()
        df_asset['market_vol_change'] = market_vol_20d.pct_change(20)


        df_asset['ema_10'] = df_asset['close'].ewm(span=10, adjust=False).mean()
        df_asset['ema_ratio'] = df_asset['ema_10'] / df_asset['close'].rolling(20).mean()


        df_asset['momentum_5d'] = df_asset['close'].pct_change(5)


        df_asset['forward_returns'] = df_asset['close'].pct_change().shift(-1)


        df_asset = df_asset.drop(columns=['close'])


        df_asset.columns = [f'{col}_{ticker}' for col in df_asset.columns]

        data_frames.append(df_asset)


    final_df = pd.concat(data_frames, axis=1)


    forward_return_cols = [f'forward_returns_{t}' for t in tickers]
    final_df['forward_returns_PORTFOLIO'] = final_df[forward_return_cols].mean(axis=1)


    final_df.dropna(subset=forward_return_cols + ['forward_returns_PORTFOLIO'], inplace=True)
    final_df.dropna(inplace=True)


    all_features = [col for col in final_df.columns if not col.startswith(('forward_returns', 'market_return', 'risk_free_rate'))]

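    # NOTE: the scaler is fit on the full sample, before the train/validation
    # split, so validation-period statistics leak into the training features.
    scaler = StandardScaler()
    final_df[all_features] = scaler.fit_transform(final_df[all_features])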

    print(f"Data Ready. Total days: {len(final_df)}. Total features: {len(all_features)}")
    return final_df, all_features, forward_return_cols



class MultiAssetPortfolioEnv(gym.Env):

    def __init__(self, data_df, features, forward_return_cols, n_assets):
        super().__init__()
        self.df = data_df.copy().reset_index(drop=True)
        self.features = features
        self.forward_return_cols = forward_return_cols
        self.n_assets = n_assets
        self.n_states = len(features)


        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(n_assets + 1,), dtype=np.float32)


        self.observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(self.n_states,), dtype=np.float32)


        self.strategy_returns = deque(maxlen=TRADING_DAYS_PER_YEAR)
        self.daily_returns_list = []

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.current_step = 0
        self.strategy_returns.clear()
        self.daily_returns_list = []
        return self._get_state(), {}

    def _get_state(self):

        return self.df.loc[self.current_step, self.features].values.astype(np.float32)

    def step(self, raw_action: np.ndarray):


        weights = softmax(raw_action)


        asset_weights = weights[:-1]
        cash_weight = weights[-1]


        row = self.df.loc[self.current_step]
        r_f = row[f'risk_free_rate_{TICKERS[0]}']


        asset_returns = row[self.forward_return_cols].values


        daily_return = (cash_weight * r_f) + np.dot(asset_weights, asset_returns)

        self.strategy_returns.append(daily_return)
        self.daily_returns_list.append(daily_return)


        reward = 0.0
        if len(self.strategy_returns) >= 20:
            rolling_returns = np.array(self.strategy_returns)
            excess_returns = rolling_returns - r_f
            mean_excess = np.mean(excess_returns)
            std = np.std(excess_returns)

            if std == 0:
                 sharpe_proxy = -1.0
            else:
                 sharpe_proxy = mean_excess / std * np.sqrt(TRADING_DAYS_PER_YEAR)


            reward = sharpe_proxy * 100


            if np.prod(1 + rolling_returns) - 1 < 0:
                reward += (np.prod(1 + rolling_returns) - 1) * 500

        else:

            reward = daily_return * 100


        self.current_step += 1
        done = self.current_step >= len(self.df) - 1

        next_state = self._get_state() if not done else np.zeros(self.observation_space.shape, dtype=np.float32)

        return next_state, reward, done, False, {}

    def get_final_score(self):



        solution_data = self.df.iloc[:len(self.daily_returns_list)][[f'risk_free_rate_{TICKERS[0]}', 'forward_returns_PORTFOLIO']].rename(
            columns={f'risk_free_rate_{TICKERS[0]}': 'risk_free_rate'}
        )


        submission_data = pd.DataFrame(
            {'prediction': self.daily_returns_list},
            index=solution_data.index
        )


        return score_multi_asset(solution_data, submission_data)



class TradingCallback(BaseCallback):
    def __init__(self, check_freq: int, verbose: int = 1):
        super().__init__(verbose)
        self.check_freq = check_freq

    def _on_step(self) -> bool:
        return True

def train_ppo_agent(env, total_timesteps):



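    # Wrap the single environment for Stable-Baselines3. A separate name avoids
    # rebinding `env`, which the lambda below closes over.
    vec_env = DummyVecEnv([lambda: env])


    model = PPO(
        "MlpPolicy",
        vec_env,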
        verbose=1,
        seed=42,

        learning_rate=0.0001,
        n_steps=2048,
        batch_size=64
    )

    print(f"\n--- 2. Training Multi-Asset DRL (PPO) Agent for {total_timesteps} timesteps ---")


    model.learn(
        total_timesteps=total_timesteps,
        callback=TradingCallback(check_freq=5000)
    )
    print("DRL Training Complete.")

    return model

def evaluate_drl_policy(model, env):

    print("\n--- 3. Evaluation (Policy Test on Validation Data) ---")

    obs, _ = env.reset()
    done = False

    for i in range(len(env.df) - 1):

        action, _states = model.predict(obs, deterministic=True)

        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated: break

    final_sharpe = env.get_final_score()
    print(f" Validation Complete. Total Days: {env.current_step}")
    return final_sharpe


if __name__ == "__main__":


    TOTAL_TIMESTEPS = 750000


    final_data, features, forward_return_cols = prepare_multi_asset_data(
        TICKERS, MARKET_TICKER, START_DATE, END_DATE, RISK_FREE_RATE_ANNUAL
    )


    train_size = int(len(final_data) * 0.7)
    train_data = final_data.iloc[:train_size]
    validation_data = final_data.iloc[train_size:]

    print(f"\nDataset split: Train ({len(train_data)} days), Validation ({len(validation_data)} days)")


    train_env = MultiAssetPortfolioEnv(
        data_df=train_data,
        features=features,
        forward_return_cols=forward_return_cols,
        n_assets=N_ASSETS
    )


    trained_policy = train_ppo_agent(train_env, total_timesteps=TOTAL_TIMESTEPS)


    validation_env = MultiAssetPortfolioEnv(
        data_df=validation_data,
        features=features,
        forward_return_cols=forward_return_cols,
        n_assets=N_ASSETS
    )

    final_validation_sharpe = evaluate_drl_policy(trained_policy, validation_env)

    print(f"\n--- DRL Strategy Pipeline Result (PPO, {TOTAL_TIMESTEPS} steps) ✨ ---")
    print(f"Number of Assets: {N_ASSETS} stocks + 1 cash position")
    print(f"Market Benchmark: {MARKET_TICKER}")
    print(f"Total Features in State Space: {len(features)}")
    print(f"Validation Period Adjusted Sharpe: **{final_validation_sharpe:.4f}**")

--- 1. Fetching and Preparing Data for 20 Assets using ^STOXX as Benchmark ---


Data Ready. Total days: 1071. Total features: 120

Dataset split: Train (749 days), Validation (322 days)
Using cpu device

--- 2. Training Multi-Asset DRL (PPO) Agent for 750000 timesteps ---


-----------------------------
| time/              |      |
|    fps             | 374  |
|    iterations      | 1    |
|    time_elapsed    | 5    |
|    total_timesteps | 2048 |
-----------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 350          |
|    iterations           | 2            |
|    time_elapsed         | 11           |
|    total_timesteps      | 4096         |
| train/                  |              |
|    approx_kl            | 0.0059371656 |
|    clip_fraction        | 0.0271       |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.8        |
|    explained_variance   | 1.43e-06     |
|    learning_rate        | 0.0001       |
|    loss                 | 1.77e+06     |
|    n_updates            | 10           |
|    policy_gradient_loss | -0.0246      |
|    std                  | 0.999        |
|    value_loss           | 3.08e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 339          |
|    iterations           | 3            |
|    time_elapsed         | 18           |
|    total_timesteps      | 6144         |
| train/                  |              |
|    approx_kl            | 0.0026955833 |
|    clip_fraction        | 0.00244      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.8        |
|    explained_variance   | 0.00236      |
|    learning_rate        | 0.0001       |
|    loss                 | 1.46e+06     |
|    n_updates            | 20           |
|    policy_gradient_loss | -0.0154      |
|    std                  | 0.998        |
|    value_loss           | 3.15e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 342          |
|    iterations           | 4            |
|    time_elapsed         | 23           |
|    total_timesteps      | 8192         |
| train/                  |              |
|    approx_kl            | 0.0015660097 |
|    clip_fraction        | 0.000391     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.8        |
|    explained_variance   | 0.00305      |
|    learning_rate        | 0.0001       |
|    loss                 | 2.17e+06     |
|    n_updates            | 30           |
|    policy_gradient_loss | -0.0109      |
|    std                  | 0.997        |
|    value_loss           | 4.16e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 336          |
|    iterations           | 5            |
|    time_elapsed         | 30           |
|    total_timesteps      | 10240        |
| train/                  |              |
|    approx_kl            | 0.0021542627 |
|    clip_fraction        | 0.000977     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.00612      |
|    learning_rate        | 0.0001       |
|    loss                 | 1.19e+06     |
|    n_updates            | 40           |
|    policy_gradient_loss | -0.013       |
|    std                  | 0.996        |
|    value_loss           | 2.46e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 337          |
|    iterations           | 6            |
|    time_elapsed         | 36           |
|    total_timesteps      | 12288        |
| train/                  |              |
|    approx_kl            | 0.0027087778 |
|    clip_fraction        | 0.00205      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.00629      |
|    learning_rate        | 0.0001       |
|    loss                 | 9.8e+05      |
|    n_updates            | 50           |
|    policy_gradient_loss | -0.014       |
|    std                  | 0.998        |
|    value_loss           | 2.09e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 330         |
|    iterations           | 7           |
|    time_elapsed         | 43          |
|    total_timesteps      | 14336       |
| train/                  |             |
|    approx_kl            | 0.001433321 |
|    clip_fraction        | 0.000342    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.7       |
|    explained_variance   | 0.014       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.26e+06    |
|    n_updates            | 60          |
|    policy_gradient_loss | -0.0102     |
|    std                  | 0.997       |
|    value_loss           | 2.76e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 329          |
|    iterations           | 8            |
|    time_elapsed         | 49           |
|    total_timesteps      | 16384        |
| train/                  |              |
|    approx_kl            | 0.0013494871 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.0117       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.81e+06     |
|    n_updates            | 70           |
|    policy_gradient_loss | -0.00993     |
|    std                  | 0.997        |
|    value_loss           | 4.1e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 325          |
|    iterations           | 9            |
|    time_elapsed         | 56           |
|    total_timesteps      | 18432        |
| train/                  |              |
|    approx_kl            | 0.0018238504 |
|    clip_fraction        | 0.000586     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.0133       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.49e+06     |
|    n_updates            | 80           |
|    policy_gradient_loss | -0.0113      |
|    std                  | 0.997        |
|    value_loss           | 3.76e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 328          |
|    iterations           | 10           |
|    time_elapsed         | 62           |
|    total_timesteps      | 20480        |
| train/                  |              |
|    approx_kl            | 0.0017579028 |
|    clip_fraction        | 0.000488     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.0209       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.24e+06     |
|    n_updates            | 90           |
|    policy_gradient_loss | -0.0106      |
|    std                  | 0.997        |
|    value_loss           | 2.36e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 328          |
|    iterations           | 11           |
|    time_elapsed         | 68           |
|    total_timesteps      | 22528        |
| train/                  |              |
|    approx_kl            | 0.0023548394 |
|    clip_fraction        | 0.0019       |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.0198       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.28e+06     |
|    n_updates            | 100          |
|    policy_gradient_loss | -0.0131      |
|    std                  | 0.997        |
|    value_loss           | 2.76e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 331          |
|    iterations           | 12           |
|    time_elapsed         | 74           |
|    total_timesteps      | 24576        |
| train/                  |              |
|    approx_kl            | 0.0017357799 |
|    clip_fraction        | 0.000244     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.0281       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.63e+06     |
|    n_updates            | 110          |
|    policy_gradient_loss | -0.0109      |
|    std                  | 0.996        |
|    value_loss           | 3.12e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 330          |
|    iterations           | 13           |
|    time_elapsed         | 80           |
|    total_timesteps      | 26624        |
| train/                  |              |
|    approx_kl            | 0.0027101971 |
|    clip_fraction        | 0.00176      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.0288       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.49e+06     |
|    n_updates            | 120          |
|    policy_gradient_loss | -0.0145      |
|    std                  | 0.995        |
|    value_loss           | 2.51e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 326          |
|    iterations           | 14           |
|    time_elapsed         | 87           |
|    total_timesteps      | 28672        |
| train/                  |              |
|    approx_kl            | 0.0018715996 |
|    clip_fraction        | 0.000879     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.0289       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.6e+06      |
|    n_updates            | 130          |
|    policy_gradient_loss | -0.0113      |
|    std                  | 0.994        |
|    value_loss           | 2.96e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 322          |
|    iterations           | 15           |
|    time_elapsed         | 95           |
|    total_timesteps      | 30720        |
| train/                  |              |
|    approx_kl            | 0.0013637207 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.022        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.79e+06     |
|    n_updates            | 140          |
|    policy_gradient_loss | -0.0101      |
|    std                  | 0.994        |
|    value_loss           | 3.23e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 319          |
|    iterations           | 16           |
|    time_elapsed         | 102          |
|    total_timesteps      | 32768        |
| train/                  |              |
|    approx_kl            | 0.0023585595 |
|    clip_fraction        | 0.00156      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.0314       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.59e+06     |
|    n_updates            | 150          |
|    policy_gradient_loss | -0.0134      |
|    std                  | 0.994        |
|    value_loss           | 3.02e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 17           |
|    time_elapsed         | 110          |
|    total_timesteps      | 34816        |
| train/                  |              |
|    approx_kl            | 0.0025773055 |
|    clip_fraction        | 0.00161      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.0366       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.56e+06     |
|    n_updates            | 160          |
|    policy_gradient_loss | -0.0138      |
|    std                  | 0.994        |
|    value_loss           | 2.77e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 18           |
|    time_elapsed         | 119          |
|    total_timesteps      | 36864        |
| train/                  |              |
|    approx_kl            | 0.0014629902 |
|    clip_fraction        | 0.000488     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.03         |
|    learning_rate        | 0.0001       |
|    loss                 | 2.15e+06     |
|    n_updates            | 170          |
|    policy_gradient_loss | -0.0102      |
|    std                  | 0.994        |
|    value_loss           | 3.74e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 306          |
|    iterations           | 19           |
|    time_elapsed         | 126          |
|    total_timesteps      | 38912        |
| train/                  |              |
|    approx_kl            | 0.0014588411 |
|    clip_fraction        | 0.000293     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.04         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.53e+06     |
|    n_updates            | 180          |
|    policy_gradient_loss | -0.0104      |
|    std                  | 0.994        |
|    value_loss           | 3.51e+06     |
------------------------------------------


----------------------------------------
| time/                   |            |
|    fps                  | 302        |
|    iterations           | 20         |
|    time_elapsed         | 135        |
|    total_timesteps      | 40960      |
| train/                  |            |
|    approx_kl            | 0.00183992 |
|    clip_fraction        | 0.000928   |
|    clip_range           | 0.2        |
|    entropy_loss         | -29.7      |
|    explained_variance   | 0.0364     |
|    learning_rate        | 0.0001     |
|    loss                 | 1.14e+06   |
|    n_updates            | 190        |
|    policy_gradient_loss | -0.0121    |
|    std                  | 0.993      |
|    value_loss           | 2.47e+06   |
----------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 296         |
|    iterations           | 21          |
|    time_elapsed         | 145         |
|    total_timesteps      | 43008       |
| train/                  |             |
|    approx_kl            | 0.002351501 |
|    clip_fraction        | 0.00137     |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.0457      |
|    learning_rate        | 0.0001      |
|    loss                 | 1.19e+06    |
|    n_updates            | 200         |
|    policy_gradient_loss | -0.0139     |
|    std                  | 0.992       |
|    value_loss           | 2.69e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 293          |
|    iterations           | 22           |
|    time_elapsed         | 153          |
|    total_timesteps      | 45056        |
| train/                  |              |
|    approx_kl            | 0.0017510484 |
|    clip_fraction        | 0.000586     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0391       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.83e+06     |
|    n_updates            | 210          |
|    policy_gradient_loss | -0.0115      |
|    std                  | 0.992        |
|    value_loss           | 3.83e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 293          |
|    iterations           | 23           |
|    time_elapsed         | 160          |
|    total_timesteps      | 47104        |
| train/                  |              |
|    approx_kl            | 0.0018607854 |
|    clip_fraction        | 0.000928     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0349       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.74e+06     |
|    n_updates            | 220          |
|    policy_gradient_loss | -0.0117      |
|    std                  | 0.993        |
|    value_loss           | 3.81e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 294         |
|    iterations           | 24          |
|    time_elapsed         | 166         |
|    total_timesteps      | 49152       |
| train/                  |             |
|    approx_kl            | 0.002207067 |
|    clip_fraction        | 0.00122     |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.7       |
|    explained_variance   | 0.0576      |
|    learning_rate        | 0.0001      |
|    loss                 | 8.6e+05     |
|    n_updates            | 230         |
|    policy_gradient_loss | -0.0127     |
|    std                  | 0.993       |
|    value_loss           | 2e+06       |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 293          |
|    iterations           | 25           |
|    time_elapsed         | 174          |
|    total_timesteps      | 51200        |
| train/                  |              |
|    approx_kl            | 0.0023371652 |
|    clip_fraction        | 0.00181      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.0472       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.5e+06      |
|    n_updates            | 240          |
|    policy_gradient_loss | -0.013       |
|    std                  | 0.994        |
|    value_loss           | 3.03e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 295          |
|    iterations           | 26           |
|    time_elapsed         | 180          |
|    total_timesteps      | 53248        |
| train/                  |              |
|    approx_kl            | 0.0016619407 |
|    clip_fraction        | 0.000244     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.0468       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.57e+06     |
|    n_updates            | 250          |
|    policy_gradient_loss | -0.0114      |
|    std                  | 0.994        |
|    value_loss           | 3.19e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 295         |
|    iterations           | 27          |
|    time_elapsed         | 187         |
|    total_timesteps      | 55296       |
| train/                  |             |
|    approx_kl            | 0.002172228 |
|    clip_fraction        | 0.00117     |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.7       |
|    explained_variance   | 0.037       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.17e+06    |
|    n_updates            | 260         |
|    policy_gradient_loss | -0.0127     |
|    std                  | 0.993       |
|    value_loss           | 2.51e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 296          |
|    iterations           | 28           |
|    time_elapsed         | 193          |
|    total_timesteps      | 57344        |
| train/                  |              |
|    approx_kl            | 0.0020100572 |
|    clip_fraction        | 0.000781     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.7        |
|    explained_variance   | 0.0417       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.28e+06     |
|    n_updates            | 270          |
|    policy_gradient_loss | -0.0113      |
|    std                  | 0.993        |
|    value_loss           | 2.86e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 296          |
|    iterations           | 29           |
|    time_elapsed         | 200          |
|    total_timesteps      | 59392        |
| train/                  |              |
|    approx_kl            | 0.0021038256 |
|    clip_fraction        | 0.000977     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0616       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.67e+06     |
|    n_updates            | 280          |
|    policy_gradient_loss | -0.0124      |
|    std                  | 0.992        |
|    value_loss           | 3.04e+06     |
------------------------------------------


----------------------------------------
| time/                   |            |
|    fps                  | 298        |
|    iterations           | 30         |
|    time_elapsed         | 206        |
|    total_timesteps      | 61440      |
| train/                  |            |
|    approx_kl            | 0.00189248 |
|    clip_fraction        | 0.000928   |
|    clip_range           | 0.2        |
|    entropy_loss         | -29.6      |
|    explained_variance   | 0.0553     |
|    learning_rate        | 0.0001     |
|    loss                 | 1.54e+06   |
|    n_updates            | 290        |
|    policy_gradient_loss | -0.0114    |
|    std                  | 0.992      |
|    value_loss           | 3.59e+06   |
----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 298          |
|    iterations           | 31           |
|    time_elapsed         | 212          |
|    total_timesteps      | 63488        |
| train/                  |              |
|    approx_kl            | 0.0020706211 |
|    clip_fraction        | 0.00083      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0627       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.13e+06     |
|    n_updates            | 300          |
|    policy_gradient_loss | -0.0121      |
|    std                  | 0.992        |
|    value_loss           | 2.57e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 299          |
|    iterations           | 32           |
|    time_elapsed         | 218          |
|    total_timesteps      | 65536        |
| train/                  |              |
|    approx_kl            | 0.0014377479 |
|    clip_fraction        | 0.000244     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0514       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.22e+06     |
|    n_updates            | 310          |
|    policy_gradient_loss | -0.00945     |
|    std                  | 0.992        |
|    value_loss           | 2.73e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 299          |
|    iterations           | 33           |
|    time_elapsed         | 225          |
|    total_timesteps      | 67584        |
| train/                  |              |
|    approx_kl            | 0.0018613937 |
|    clip_fraction        | 0.00103      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0554       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.38e+06     |
|    n_updates            | 320          |
|    policy_gradient_loss | -0.0118      |
|    std                  | 0.993        |
|    value_loss           | 3.36e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 300          |
|    iterations           | 34           |
|    time_elapsed         | 231          |
|    total_timesteps      | 69632        |
| train/                  |              |
|    approx_kl            | 0.0018190243 |
|    clip_fraction        | 0.00083      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0607       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.81e+06     |
|    n_updates            | 330          |
|    policy_gradient_loss | -0.0114      |
|    std                  | 0.993        |
|    value_loss           | 3.45e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 300          |
|    iterations           | 35           |
|    time_elapsed         | 238          |
|    total_timesteps      | 71680        |
| train/                  |              |
|    approx_kl            | 0.0018352473 |
|    clip_fraction        | 0.000586     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.064        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.14e+06     |
|    n_updates            | 340          |
|    policy_gradient_loss | -0.0117      |
|    std                  | 0.992        |
|    value_loss           | 2.87e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 301          |
|    iterations           | 36           |
|    time_elapsed         | 244          |
|    total_timesteps      | 73728        |
| train/                  |              |
|    approx_kl            | 0.0022105128 |
|    clip_fraction        | 0.00122      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0869       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.18e+06     |
|    n_updates            | 350          |
|    policy_gradient_loss | -0.0125      |
|    std                  | 0.992        |
|    value_loss           | 2.56e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 300          |
|    iterations           | 37           |
|    time_elapsed         | 252          |
|    total_timesteps      | 75776        |
| train/                  |              |
|    approx_kl            | 0.0019011012 |
|    clip_fraction        | 0.000586     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0665       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.56e+06     |
|    n_updates            | 360          |
|    policy_gradient_loss | -0.0118      |
|    std                  | 0.993        |
|    value_loss           | 3.01e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 300          |
|    iterations           | 38           |
|    time_elapsed         | 258          |
|    total_timesteps      | 77824        |
| train/                  |              |
|    approx_kl            | 0.0014688908 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.047        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.56e+06     |
|    n_updates            | 370          |
|    policy_gradient_loss | -0.00996     |
|    std                  | 0.992        |
|    value_loss           | 3.44e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 299          |
|    iterations           | 39           |
|    time_elapsed         | 266          |
|    total_timesteps      | 79872        |
| train/                  |              |
|    approx_kl            | 0.0020834287 |
|    clip_fraction        | 0.00083      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0831       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.23e+06     |
|    n_updates            | 380          |
|    policy_gradient_loss | -0.0124      |
|    std                  | 0.992        |
|    value_loss           | 2.26e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 298          |
|    iterations           | 40           |
|    time_elapsed         | 274          |
|    total_timesteps      | 81920        |
| train/                  |              |
|    approx_kl            | 0.0016584222 |
|    clip_fraction        | 0.000244     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0659       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.82e+06     |
|    n_updates            | 390          |
|    policy_gradient_loss | -0.0105      |
|    std                  | 0.992        |
|    value_loss           | 3.09e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 298          |
|    iterations           | 41           |
|    time_elapsed         | 281          |
|    total_timesteps      | 83968        |
| train/                  |              |
|    approx_kl            | 0.0019509927 |
|    clip_fraction        | 0.00205      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0679       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.34e+06     |
|    n_updates            | 400          |
|    policy_gradient_loss | -0.0126      |
|    std                  | 0.992        |
|    value_loss           | 3.2e+06      |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 298         |
|    iterations           | 42          |
|    time_elapsed         | 288         |
|    total_timesteps      | 86016       |
| train/                  |             |
|    approx_kl            | 0.001758653 |
|    clip_fraction        | 0.000391    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.0714      |
|    learning_rate        | 0.0001      |
|    loss                 | 1.35e+06    |
|    n_updates            | 410         |
|    policy_gradient_loss | -0.011      |
|    std                  | 0.992       |
|    value_loss           | 3.1e+06     |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 298          |
|    iterations           | 43           |
|    time_elapsed         | 294          |
|    total_timesteps      | 88064        |
| train/                  |              |
|    approx_kl            | 0.0012977101 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0619       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.43e+06     |
|    n_updates            | 420          |
|    policy_gradient_loss | -0.0092      |
|    std                  | 0.991        |
|    value_loss           | 3.38e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 297          |
|    iterations           | 44           |
|    time_elapsed         | 302          |
|    total_timesteps      | 90112        |
| train/                  |              |
|    approx_kl            | 0.0014256756 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0912       |
|    learning_rate        | 0.0001       |
|    loss                 | 2.13e+06     |
|    n_updates            | 430          |
|    policy_gradient_loss | -0.00975     |
|    std                  | 0.991        |
|    value_loss           | 3.82e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 298          |
|    iterations           | 45           |
|    time_elapsed         | 308          |
|    total_timesteps      | 92160        |
| train/                  |              |
|    approx_kl            | 0.0023080055 |
|    clip_fraction        | 0.00249      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0632       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.78e+06     |
|    n_updates            | 440          |
|    policy_gradient_loss | -0.0135      |
|    std                  | 0.991        |
|    value_loss           | 3.2e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 297          |
|    iterations           | 46           |
|    time_elapsed         | 316          |
|    total_timesteps      | 94208        |
| train/                  |              |
|    approx_kl            | 0.0022798753 |
|    clip_fraction        | 0.00156      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0702       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.62e+06     |
|    n_updates            | 450          |
|    policy_gradient_loss | -0.0136      |
|    std                  | 0.992        |
|    value_loss           | 3.08e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 298          |
|    iterations           | 47           |
|    time_elapsed         | 322          |
|    total_timesteps      | 96256        |
| train/                  |              |
|    approx_kl            | 0.0025312495 |
|    clip_fraction        | 0.00151      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.108        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.3e+06      |
|    n_updates            | 460          |
|    policy_gradient_loss | -0.0128      |
|    std                  | 0.992        |
|    value_loss           | 2.83e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 298          |
|    iterations           | 48           |
|    time_elapsed         | 329          |
|    total_timesteps      | 98304        |
| train/                  |              |
|    approx_kl            | 0.0023142109 |
|    clip_fraction        | 0.00293      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0712       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.65e+06     |
|    n_updates            | 470          |
|    policy_gradient_loss | -0.0136      |
|    std                  | 0.991        |
|    value_loss           | 3.16e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 298          |
|    iterations           | 49           |
|    time_elapsed         | 335          |
|    total_timesteps      | 100352       |
| train/                  |              |
|    approx_kl            | 0.0024299235 |
|    clip_fraction        | 0.00249      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.067        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.59e+06     |
|    n_updates            | 480          |
|    policy_gradient_loss | -0.0132      |
|    std                  | 0.991        |
|    value_loss           | 3.41e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 299          |
|    iterations           | 50           |
|    time_elapsed         | 342          |
|    total_timesteps      | 102400       |
| train/                  |              |
|    approx_kl            | 0.0015308587 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0667       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.65e+06     |
|    n_updates            | 490          |
|    policy_gradient_loss | -0.0115      |
|    std                  | 0.991        |
|    value_loss           | 3.5e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 299          |
|    iterations           | 51           |
|    time_elapsed         | 348          |
|    total_timesteps      | 104448       |
| train/                  |              |
|    approx_kl            | 0.0020834212 |
|    clip_fraction        | 0.000977     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.102        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.11e+06     |
|    n_updates            | 500          |
|    policy_gradient_loss | -0.0116      |
|    std                  | 0.991        |
|    value_loss           | 2.08e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 300          |
|    iterations           | 52           |
|    time_elapsed         | 354          |
|    total_timesteps      | 106496       |
| train/                  |              |
|    approx_kl            | 0.0021657199 |
|    clip_fraction        | 0.00146      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.111        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.39e+06     |
|    n_updates            | 510          |
|    policy_gradient_loss | -0.0122      |
|    std                  | 0.992        |
|    value_loss           | 2.58e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 300          |
|    iterations           | 53           |
|    time_elapsed         | 361          |
|    total_timesteps      | 108544       |
| train/                  |              |
|    approx_kl            | 0.0010891119 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.061        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.35e+06     |
|    n_updates            | 520          |
|    policy_gradient_loss | -0.00952     |
|    std                  | 0.992        |
|    value_loss           | 4.68e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 300          |
|    iterations           | 54           |
|    time_elapsed         | 367          |
|    total_timesteps      | 110592       |
| train/                  |              |
|    approx_kl            | 0.0024224436 |
|    clip_fraction        | 0.00127      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.102        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.09e+06     |
|    n_updates            | 530          |
|    policy_gradient_loss | -0.0137      |
|    std                  | 0.992        |
|    value_loss           | 3.06e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 301          |
|    iterations           | 55           |
|    time_elapsed         | 373          |
|    total_timesteps      | 112640       |
| train/                  |              |
|    approx_kl            | 0.0021059497 |
|    clip_fraction        | 0.000977     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0927       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.78e+06     |
|    n_updates            | 540          |
|    policy_gradient_loss | -0.013       |
|    std                  | 0.991        |
|    value_loss           | 3.13e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 301         |
|    iterations           | 56          |
|    time_elapsed         | 380         |
|    total_timesteps      | 114688      |
| train/                  |             |
|    approx_kl            | 0.001910574 |
|    clip_fraction        | 0.000732    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.0899      |
|    learning_rate        | 0.0001      |
|    loss                 | 1.95e+06    |
|    n_updates            | 550         |
|    policy_gradient_loss | -0.0115     |
|    std                  | 0.992       |
|    value_loss           | 3.55e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 302          |
|    iterations           | 57           |
|    time_elapsed         | 386          |
|    total_timesteps      | 116736       |
| train/                  |              |
|    approx_kl            | 0.0013627461 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0918       |
|    learning_rate        | 0.0001       |
|    loss                 | 2.1e+06      |
|    n_updates            | 560          |
|    policy_gradient_loss | -0.00966     |
|    std                  | 0.992        |
|    value_loss           | 3.68e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 302         |
|    iterations           | 58          |
|    time_elapsed         | 392         |
|    total_timesteps      | 118784      |
| train/                  |             |
|    approx_kl            | 0.002029682 |
|    clip_fraction        | 0.000732    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.0893      |
|    learning_rate        | 0.0001      |
|    loss                 | 1.73e+06    |
|    n_updates            | 570         |
|    policy_gradient_loss | -0.0119     |
|    std                  | 0.992       |
|    value_loss           | 3.31e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 302          |
|    iterations           | 59           |
|    time_elapsed         | 398          |
|    total_timesteps      | 120832       |
| train/                  |              |
|    approx_kl            | 0.0019326609 |
|    clip_fraction        | 0.000488     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0998       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.57e+06     |
|    n_updates            | 580          |
|    policy_gradient_loss | -0.012       |
|    std                  | 0.991        |
|    value_loss           | 2.85e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 303          |
|    iterations           | 60           |
|    time_elapsed         | 405          |
|    total_timesteps      | 122880       |
| train/                  |              |
|    approx_kl            | 0.0014981258 |
|    clip_fraction        | 0.000342     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.134        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.22e+06     |
|    n_updates            | 590          |
|    policy_gradient_loss | -0.0102      |
|    std                  | 0.991        |
|    value_loss           | 3.1e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 303          |
|    iterations           | 61           |
|    time_elapsed         | 411          |
|    total_timesteps      | 124928       |
| train/                  |              |
|    approx_kl            | 0.0016644404 |
|    clip_fraction        | 0.000684     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0883       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.66e+06     |
|    n_updates            | 600          |
|    policy_gradient_loss | -0.011       |
|    std                  | 0.991        |
|    value_loss           | 3.66e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 303          |
|    iterations           | 62           |
|    time_elapsed         | 418          |
|    total_timesteps      | 126976       |
| train/                  |              |
|    approx_kl            | 0.0012541233 |
|    clip_fraction        | 0.000293     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0749       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.71e+06     |
|    n_updates            | 610          |
|    policy_gradient_loss | -0.00975     |
|    std                  | 0.992        |
|    value_loss           | 3.81e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 303         |
|    iterations           | 63          |
|    time_elapsed         | 425         |
|    total_timesteps      | 129024      |
| train/                  |             |
|    approx_kl            | 0.002230025 |
|    clip_fraction        | 0.00122     |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.119       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.36e+06    |
|    n_updates            | 620         |
|    policy_gradient_loss | -0.0129     |
|    std                  | 0.991       |
|    value_loss           | 2.73e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 303          |
|    iterations           | 64           |
|    time_elapsed         | 431          |
|    total_timesteps      | 131072       |
| train/                  |              |
|    approx_kl            | 0.0015930361 |
|    clip_fraction        | 0.000439     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.11         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.73e+06     |
|    n_updates            | 630          |
|    policy_gradient_loss | -0.0102      |
|    std                  | 0.991        |
|    value_loss           | 3.65e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 304          |
|    iterations           | 65           |
|    time_elapsed         | 437          |
|    total_timesteps      | 133120       |
| train/                  |              |
|    approx_kl            | 0.0017176238 |
|    clip_fraction        | 0.000342     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.102        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.84e+06     |
|    n_updates            | 640          |
|    policy_gradient_loss | -0.0117      |
|    std                  | 0.991        |
|    value_loss           | 3.36e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 304          |
|    iterations           | 66           |
|    time_elapsed         | 443          |
|    total_timesteps      | 135168       |
| train/                  |              |
|    approx_kl            | 0.0023424614 |
|    clip_fraction        | 0.00151      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0972       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.44e+06     |
|    n_updates            | 650          |
|    policy_gradient_loss | -0.0129      |
|    std                  | 0.99         |
|    value_loss           | 2.84e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 304          |
|    iterations           | 67           |
|    time_elapsed         | 450          |
|    total_timesteps      | 137216       |
| train/                  |              |
|    approx_kl            | 0.0023202933 |
|    clip_fraction        | 0.000977     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.129        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.24e+06     |
|    n_updates            | 660          |
|    policy_gradient_loss | -0.0135      |
|    std                  | 0.99         |
|    value_loss           | 2.82e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 305          |
|    iterations           | 68           |
|    time_elapsed         | 456          |
|    total_timesteps      | 139264       |
| train/                  |              |
|    approx_kl            | 0.0013916637 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0816       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.6e+06      |
|    n_updates            | 670          |
|    policy_gradient_loss | -0.0095      |
|    std                  | 0.991        |
|    value_loss           | 3.28e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 305          |
|    iterations           | 69           |
|    time_elapsed         | 463          |
|    total_timesteps      | 141312       |
| train/                  |              |
|    approx_kl            | 0.0018479055 |
|    clip_fraction        | 0.000928     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.113        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.09e+06     |
|    n_updates            | 680          |
|    policy_gradient_loss | -0.0114      |
|    std                  | 0.991        |
|    value_loss           | 3.44e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 305         |
|    iterations           | 70          |
|    time_elapsed         | 469         |
|    total_timesteps      | 143360      |
| train/                  |             |
|    approx_kl            | 0.001365821 |
|    clip_fraction        | 0.000342    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.134       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.92e+06    |
|    n_updates            | 690         |
|    policy_gradient_loss | -0.00973    |
|    std                  | 0.991       |
|    value_loss           | 3.43e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 305          |
|    iterations           | 71           |
|    time_elapsed         | 475          |
|    total_timesteps      | 145408       |
| train/                  |              |
|    approx_kl            | 0.0013232002 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.113        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.88e+06     |
|    n_updates            | 700          |
|    policy_gradient_loss | -0.00919     |
|    std                  | 0.991        |
|    value_loss           | 3.36e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 305          |
|    iterations           | 72           |
|    time_elapsed         | 482          |
|    total_timesteps      | 147456       |
| train/                  |              |
|    approx_kl            | 0.0021062314 |
|    clip_fraction        | 0.00176      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.1          |
|    learning_rate        | 0.0001       |
|    loss                 | 1.43e+06     |
|    n_updates            | 710          |
|    policy_gradient_loss | -0.0124      |
|    std                  | 0.992        |
|    value_loss           | 3.12e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 306          |
|    iterations           | 73           |
|    time_elapsed         | 488          |
|    total_timesteps      | 149504       |
| train/                  |              |
|    approx_kl            | 0.0018261387 |
|    clip_fraction        | 0.000537     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.127        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.7e+06      |
|    n_updates            | 720          |
|    policy_gradient_loss | -0.0113      |
|    std                  | 0.991        |
|    value_loss           | 3.28e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 306          |
|    iterations           | 74           |
|    time_elapsed         | 494          |
|    total_timesteps      | 151552       |
| train/                  |              |
|    approx_kl            | 0.0017175304 |
|    clip_fraction        | 0.000488     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0979       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.38e+06     |
|    n_updates            | 730          |
|    policy_gradient_loss | -0.0102      |
|    std                  | 0.991        |
|    value_loss           | 2.68e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 306          |
|    iterations           | 75           |
|    time_elapsed         | 500          |
|    total_timesteps      | 153600       |
| train/                  |              |
|    approx_kl            | 0.0019910391 |
|    clip_fraction        | 0.000781     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.116        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.65e+06     |
|    n_updates            | 740          |
|    policy_gradient_loss | -0.0117      |
|    std                  | 0.991        |
|    value_loss           | 3.32e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 306          |
|    iterations           | 76           |
|    time_elapsed         | 507          |
|    total_timesteps      | 155648       |
| train/                  |              |
|    approx_kl            | 0.0020578306 |
|    clip_fraction        | 0.00117      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.137        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.52e+06     |
|    n_updates            | 750          |
|    policy_gradient_loss | -0.0123      |
|    std                  | 0.99         |
|    value_loss           | 2.86e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 77           |
|    time_elapsed         | 513          |
|    total_timesteps      | 157696       |
| train/                  |              |
|    approx_kl            | 0.0021312167 |
|    clip_fraction        | 0.00166      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.13         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.82e+06     |
|    n_updates            | 760          |
|    policy_gradient_loss | -0.0129      |
|    std                  | 0.991        |
|    value_loss           | 3.22e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 78           |
|    time_elapsed         | 520          |
|    total_timesteps      | 159744       |
| train/                  |              |
|    approx_kl            | 0.0022694594 |
|    clip_fraction        | 0.00146      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.135        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.63e+06     |
|    n_updates            | 770          |
|    policy_gradient_loss | -0.0124      |
|    std                  | 0.991        |
|    value_loss           | 2.98e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 79           |
|    time_elapsed         | 525          |
|    total_timesteps      | 161792       |
| train/                  |              |
|    approx_kl            | 0.0017076519 |
|    clip_fraction        | 0.000977     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.122        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.36e+06     |
|    n_updates            | 780          |
|    policy_gradient_loss | -0.0111      |
|    std                  | 0.991        |
|    value_loss           | 2.83e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 307         |
|    iterations           | 80          |
|    time_elapsed         | 532         |
|    total_timesteps      | 163840      |
| train/                  |             |
|    approx_kl            | 0.002352509 |
|    clip_fraction        | 0.00166     |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.126       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.71e+06    |
|    n_updates            | 790         |
|    policy_gradient_loss | -0.0125     |
|    std                  | 0.992       |
|    value_loss           | 2.58e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 81           |
|    time_elapsed         | 538          |
|    total_timesteps      | 165888       |
| train/                  |              |
|    approx_kl            | 0.0018482314 |
|    clip_fraction        | 0.00112      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.114        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.38e+06     |
|    n_updates            | 800          |
|    policy_gradient_loss | -0.0122      |
|    std                  | 0.992        |
|    value_loss           | 2.94e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 82           |
|    time_elapsed         | 545          |
|    total_timesteps      | 167936       |
| train/                  |              |
|    approx_kl            | 0.0017252251 |
|    clip_fraction        | 0.000684     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.113        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.68e+06     |
|    n_updates            | 810          |
|    policy_gradient_loss | -0.0104      |
|    std                  | 0.991        |
|    value_loss           | 3.35e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 308         |
|    iterations           | 83          |
|    time_elapsed         | 550         |
|    total_timesteps      | 169984      |
| train/                  |             |
|    approx_kl            | 0.001504316 |
|    clip_fraction        | 0.000293    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.126       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.55e+06    |
|    n_updates            | 820         |
|    policy_gradient_loss | -0.0104     |
|    std                  | 0.992       |
|    value_loss           | 3.71e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 84           |
|    time_elapsed         | 557          |
|    total_timesteps      | 172032       |
| train/                  |              |
|    approx_kl            | 0.0018217585 |
|    clip_fraction        | 0.000781     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.135        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.71e+06     |
|    n_updates            | 830          |
|    policy_gradient_loss | -0.0111      |
|    std                  | 0.992        |
|    value_loss           | 2.8e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 85           |
|    time_elapsed         | 563          |
|    total_timesteps      | 174080       |
| train/                  |              |
|    approx_kl            | 0.0016586229 |
|    clip_fraction        | 0.000293     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.117        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.08e+06     |
|    n_updates            | 840          |
|    policy_gradient_loss | -0.011       |
|    std                  | 0.992        |
|    value_loss           | 2.68e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 308           |
|    iterations           | 86            |
|    time_elapsed         | 570           |
|    total_timesteps      | 176128        |
| train/                  |               |
|    approx_kl            | 0.00080411974 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.6         |
|    explained_variance   | 0.125         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.55e+06      |
|    n_updates            | 850           |
|    policy_gradient_loss | -0.00777      |
|    std                  | 0.992         |
|    value_loss           | 2.87e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 87           |
|    time_elapsed         | 577          |
|    total_timesteps      | 178176       |
| train/                  |              |
|    approx_kl            | 0.0019589076 |
|    clip_fraction        | 0.000879     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.133        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.46e+06     |
|    n_updates            | 860          |
|    policy_gradient_loss | -0.0121      |
|    std                  | 0.992        |
|    value_loss           | 3.14e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 88           |
|    time_elapsed         | 584          |
|    total_timesteps      | 180224       |
| train/                  |              |
|    approx_kl            | 0.0016488151 |
|    clip_fraction        | 0.000439     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.141        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.19e+06     |
|    n_updates            | 870          |
|    policy_gradient_loss | -0.0109      |
|    std                  | 0.991        |
|    value_loss           | 2.65e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 89           |
|    time_elapsed         | 593          |
|    total_timesteps      | 182272       |
| train/                  |              |
|    approx_kl            | 0.0020549898 |
|    clip_fraction        | 0.000586     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.119        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.51e+06     |
|    n_updates            | 880          |
|    policy_gradient_loss | -0.0125      |
|    std                  | 0.99         |
|    value_loss           | 3.16e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 305          |
|    iterations           | 90           |
|    time_elapsed         | 602          |
|    total_timesteps      | 184320       |
| train/                  |              |
|    approx_kl            | 0.0019849483 |
|    clip_fraction        | 0.000879     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.144        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.75e+06     |
|    n_updates            | 890          |
|    policy_gradient_loss | -0.0114      |
|    std                  | 0.99         |
|    value_loss           | 2.92e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 305         |
|    iterations           | 91          |
|    time_elapsed         | 610         |
|    total_timesteps      | 186368      |
| train/                  |             |
|    approx_kl            | 0.002157776 |
|    clip_fraction        | 0.00137     |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.104       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.56e+06    |
|    n_updates            | 900         |
|    policy_gradient_loss | -0.0127     |
|    std                  | 0.99        |
|    value_loss           | 3.23e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 304          |
|    iterations           | 92           |
|    time_elapsed         | 619          |
|    total_timesteps      | 188416       |
| train/                  |              |
|    approx_kl            | 0.0011770839 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.142        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.18e+06     |
|    n_updates            | 910          |
|    policy_gradient_loss | -0.00862     |
|    std                  | 0.99         |
|    value_loss           | 2.93e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 302           |
|    iterations           | 93            |
|    time_elapsed         | 628           |
|    total_timesteps      | 190464        |
| train/                  |               |
|    approx_kl            | 0.00091619475 |
|    clip_fraction        | 4.88e-05      |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.6         |
|    explained_variance   | 0.117         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.66e+06      |
|    n_updates            | 920           |
|    policy_gradient_loss | -0.00838      |
|    std                  | 0.99          |
|    value_loss           | 3.86e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 302          |
|    iterations           | 94           |
|    time_elapsed         | 637          |
|    total_timesteps      | 192512       |
| train/                  |              |
|    approx_kl            | 0.0014135666 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.197        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.58e+06     |
|    n_updates            | 930          |
|    policy_gradient_loss | -0.00948     |
|    std                  | 0.99         |
|    value_loss           | 2.48e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 301          |
|    iterations           | 95           |
|    time_elapsed         | 644          |
|    total_timesteps      | 194560       |
| train/                  |              |
|    approx_kl            | 0.0018656799 |
|    clip_fraction        | 0.00083      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.115        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.63e+06     |
|    n_updates            | 940          |
|    policy_gradient_loss | -0.0113      |
|    std                  | 0.99         |
|    value_loss           | 3.19e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 302          |
|    iterations           | 96           |
|    time_elapsed         | 650          |
|    total_timesteps      | 196608       |
| train/                  |              |
|    approx_kl            | 0.0014215254 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.158        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.01e+06     |
|    n_updates            | 950          |
|    policy_gradient_loss | -0.00997     |
|    std                  | 0.99         |
|    value_loss           | 3.53e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 302          |
|    iterations           | 97           |
|    time_elapsed         | 657          |
|    total_timesteps      | 198656       |
| train/                  |              |
|    approx_kl            | 0.0021315548 |
|    clip_fraction        | 0.00083      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.189        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.25e+06     |
|    n_updates            | 960          |
|    policy_gradient_loss | -0.0116      |
|    std                  | 0.989        |
|    value_loss           | 2.55e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 302          |
|    iterations           | 98           |
|    time_elapsed         | 663          |
|    total_timesteps      | 200704       |
| train/                  |              |
|    approx_kl            | 0.0016876957 |
|    clip_fraction        | 0.000977     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.152        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.25e+06     |
|    n_updates            | 970          |
|    policy_gradient_loss | -0.0118      |
|    std                  | 0.989        |
|    value_loss           | 2.91e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 302          |
|    iterations           | 99           |
|    time_elapsed         | 670          |
|    total_timesteps      | 202752       |
| train/                  |              |
|    approx_kl            | 0.0017236478 |
|    clip_fraction        | 0.000439     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.154        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.32e+06     |
|    n_updates            | 980          |
|    policy_gradient_loss | -0.0103      |
|    std                  | 0.989        |
|    value_loss           | 3.21e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 302         |
|    iterations           | 100         |
|    time_elapsed         | 676         |
|    total_timesteps      | 204800      |
| train/                  |             |
|    approx_kl            | 0.002602241 |
|    clip_fraction        | 0.00269     |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.145       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.52e+06    |
|    n_updates            | 990         |
|    policy_gradient_loss | -0.014      |
|    std                  | 0.989       |
|    value_loss           | 2.64e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 303          |
|    iterations           | 101          |
|    time_elapsed         | 682          |
|    total_timesteps      | 206848       |
| train/                  |              |
|    approx_kl            | 0.0013370921 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.156        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.2e+06      |
|    n_updates            | 1000         |
|    policy_gradient_loss | -0.00973     |
|    std                  | 0.988        |
|    value_loss           | 2.55e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 303          |
|    iterations           | 102          |
|    time_elapsed         | 688          |
|    total_timesteps      | 208896       |
| train/                  |              |
|    approx_kl            | 0.0020955664 |
|    clip_fraction        | 0.000586     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.183        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.43e+06     |
|    n_updates            | 1010         |
|    policy_gradient_loss | -0.0124      |
|    std                  | 0.988        |
|    value_loss           | 2.65e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 303          |
|    iterations           | 103          |
|    time_elapsed         | 695          |
|    total_timesteps      | 210944       |
| train/                  |              |
|    approx_kl            | 0.0014330354 |
|    clip_fraction        | 0.000537     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.0952       |
|    learning_rate        | 0.0001       |
|    loss                 | 2.27e+06     |
|    n_updates            | 1020         |
|    policy_gradient_loss | -0.0102      |
|    std                  | 0.988        |
|    value_loss           | 3.91e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 303          |
|    iterations           | 104          |
|    time_elapsed         | 701          |
|    total_timesteps      | 212992       |
| train/                  |              |
|    approx_kl            | 0.0020123287 |
|    clip_fraction        | 0.000537     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.144        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.33e+06     |
|    n_updates            | 1030         |
|    policy_gradient_loss | -0.0137      |
|    std                  | 0.988        |
|    value_loss           | 2.76e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 303         |
|    iterations           | 105         |
|    time_elapsed         | 707         |
|    total_timesteps      | 215040      |
| train/                  |             |
|    approx_kl            | 0.001595141 |
|    clip_fraction        | 0.000537    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.139       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.15e+06    |
|    n_updates            | 1040        |
|    policy_gradient_loss | -0.00914    |
|    std                  | 0.989       |
|    value_loss           | 2.44e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 304          |
|    iterations           | 106          |
|    time_elapsed         | 713          |
|    total_timesteps      | 217088       |
| train/                  |              |
|    approx_kl            | 0.0014268567 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.161        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.66e+06     |
|    n_updates            | 1050         |
|    policy_gradient_loss | -0.00995     |
|    std                  | 0.988        |
|    value_loss           | 3.25e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 304          |
|    iterations           | 107          |
|    time_elapsed         | 720          |
|    total_timesteps      | 219136       |
| train/                  |              |
|    approx_kl            | 0.0016942576 |
|    clip_fraction        | 0.000781     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.138        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.78e+06     |
|    n_updates            | 1060         |
|    policy_gradient_loss | -0.0112      |
|    std                  | 0.989        |
|    value_loss           | 2.82e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 304         |
|    iterations           | 108         |
|    time_elapsed         | 726         |
|    total_timesteps      | 221184      |
| train/                  |             |
|    approx_kl            | 0.002536387 |
|    clip_fraction        | 0.0021      |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.177       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.28e+06    |
|    n_updates            | 1070        |
|    policy_gradient_loss | -0.0132     |
|    std                  | 0.989       |
|    value_loss           | 2.73e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 304          |
|    iterations           | 109          |
|    time_elapsed         | 733          |
|    total_timesteps      | 223232       |
| train/                  |              |
|    approx_kl            | 0.0014409603 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.156        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.14e+06     |
|    n_updates            | 1080         |
|    policy_gradient_loss | -0.01        |
|    std                  | 0.988        |
|    value_loss           | 2.93e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 304          |
|    iterations           | 110          |
|    time_elapsed         | 740          |
|    total_timesteps      | 225280       |
| train/                  |              |
|    approx_kl            | 0.0022225564 |
|    clip_fraction        | 0.000879     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.147        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.87e+06     |
|    n_updates            | 1090         |
|    policy_gradient_loss | -0.0137      |
|    std                  | 0.988        |
|    value_loss           | 3.27e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 304          |
|    iterations           | 111          |
|    time_elapsed         | 746          |
|    total_timesteps      | 227328       |
| train/                  |              |
|    approx_kl            | 0.0016145626 |
|    clip_fraction        | 0.000977     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.16         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.71e+06     |
|    n_updates            | 1100         |
|    policy_gradient_loss | -0.0112      |
|    std                  | 0.989        |
|    value_loss           | 3.37e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 304          |
|    iterations           | 112          |
|    time_elapsed         | 753          |
|    total_timesteps      | 229376       |
| train/                  |              |
|    approx_kl            | 0.0009086323 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.137        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.66e+06     |
|    n_updates            | 1110         |
|    policy_gradient_loss | -0.00757     |
|    std                  | 0.989        |
|    value_loss           | 3.02e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 304          |
|    iterations           | 113          |
|    time_elapsed         | 759          |
|    total_timesteps      | 231424       |
| train/                  |              |
|    approx_kl            | 0.0016817986 |
|    clip_fraction        | 0.000732     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.148        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.33e+06     |
|    n_updates            | 1120         |
|    policy_gradient_loss | -0.0107      |
|    std                  | 0.989        |
|    value_loss           | 3.59e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 305          |
|    iterations           | 114          |
|    time_elapsed         | 765          |
|    total_timesteps      | 233472       |
| train/                  |              |
|    approx_kl            | 0.0013627023 |
|    clip_fraction        | 0.000391     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.0856       |
|    learning_rate        | 0.0001       |
|    loss                 | 1.99e+06     |
|    n_updates            | 1130         |
|    policy_gradient_loss | -0.0101      |
|    std                  | 0.989        |
|    value_loss           | 3.79e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 305         |
|    iterations           | 115         |
|    time_elapsed         | 771         |
|    total_timesteps      | 235520      |
| train/                  |             |
|    approx_kl            | 0.001177925 |
|    clip_fraction        | 0.000195    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.157       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.38e+06    |
|    n_updates            | 1140        |
|    policy_gradient_loss | -0.0089     |
|    std                  | 0.989       |
|    value_loss           | 3e+06       |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 305          |
|    iterations           | 116          |
|    time_elapsed         | 777          |
|    total_timesteps      | 237568       |
| train/                  |              |
|    approx_kl            | 0.0015655737 |
|    clip_fraction        | 0.000244     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.165        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.63e+06     |
|    n_updates            | 1150         |
|    policy_gradient_loss | -0.0105      |
|    std                  | 0.989        |
|    value_loss           | 3.02e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 305          |
|    iterations           | 117          |
|    time_elapsed         | 784          |
|    total_timesteps      | 239616       |
| train/                  |              |
|    approx_kl            | 0.0020427168 |
|    clip_fraction        | 0.00107      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.188        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.34e+06     |
|    n_updates            | 1160         |
|    policy_gradient_loss | -0.0124      |
|    std                  | 0.989        |
|    value_loss           | 3.03e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 305         |
|    iterations           | 118         |
|    time_elapsed         | 789         |
|    total_timesteps      | 241664      |
| train/                  |             |
|    approx_kl            | 0.001740333 |
|    clip_fraction        | 0.000684    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.18        |
|    learning_rate        | 0.0001      |
|    loss                 | 1.33e+06    |
|    n_updates            | 1170        |
|    policy_gradient_loss | -0.0108     |
|    std                  | 0.988       |
|    value_loss           | 3.35e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 306          |
|    iterations           | 119          |
|    time_elapsed         | 796          |
|    total_timesteps      | 243712       |
| train/                  |              |
|    approx_kl            | 0.0020975112 |
|    clip_fraction        | 0.000586     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.216        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.13e+06     |
|    n_updates            | 1180         |
|    policy_gradient_loss | -0.0111      |
|    std                  | 0.987        |
|    value_loss           | 2.4e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 306          |
|    iterations           | 120          |
|    time_elapsed         | 802          |
|    total_timesteps      | 245760       |
| train/                  |              |
|    approx_kl            | 0.0014378726 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.142        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.24e+06     |
|    n_updates            | 1190         |
|    policy_gradient_loss | -0.0091      |
|    std                  | 0.987        |
|    value_loss           | 2.51e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 306          |
|    iterations           | 121          |
|    time_elapsed         | 809          |
|    total_timesteps      | 247808       |
| train/                  |              |
|    approx_kl            | 0.0013858757 |
|    clip_fraction        | 0.000244     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.173        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.72e+06     |
|    n_updates            | 1200         |
|    policy_gradient_loss | -0.01        |
|    std                  | 0.988        |
|    value_loss           | 3.23e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 306          |
|    iterations           | 122          |
|    time_elapsed         | 815          |
|    total_timesteps      | 249856       |
| train/                  |              |
|    approx_kl            | 0.0016002535 |
|    clip_fraction        | 0.000635     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.144        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.88e+06     |
|    n_updates            | 1210         |
|    policy_gradient_loss | -0.0115      |
|    std                  | 0.987        |
|    value_loss           | 3.71e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 306          |
|    iterations           | 123          |
|    time_elapsed         | 821          |
|    total_timesteps      | 251904       |
| train/                  |              |
|    approx_kl            | 0.0017004495 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.201        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.4e+06      |
|    n_updates            | 1220         |
|    policy_gradient_loss | -0.0107      |
|    std                  | 0.987        |
|    value_loss           | 3.39e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 306          |
|    iterations           | 124          |
|    time_elapsed         | 827          |
|    total_timesteps      | 253952       |
| train/                  |              |
|    approx_kl            | 0.0014592696 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.202        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.19e+06     |
|    n_updates            | 1230         |
|    policy_gradient_loss | -0.0104      |
|    std                  | 0.987        |
|    value_loss           | 3.13e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 306          |
|    iterations           | 125          |
|    time_elapsed         | 834          |
|    total_timesteps      | 256000       |
| train/                  |              |
|    approx_kl            | 0.0016541864 |
|    clip_fraction        | 0.000781     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.188        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.79e+06     |
|    n_updates            | 1240         |
|    policy_gradient_loss | -0.0109      |
|    std                  | 0.987        |
|    value_loss           | 3.59e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 126          |
|    time_elapsed         | 839          |
|    total_timesteps      | 258048       |
| train/                  |              |
|    approx_kl            | 0.0016453075 |
|    clip_fraction        | 0.000586     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.137        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.36e+06     |
|    n_updates            | 1250         |
|    policy_gradient_loss | -0.0106      |
|    std                  | 0.988        |
|    value_loss           | 3.94e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 127          |
|    time_elapsed         | 846          |
|    total_timesteps      | 260096       |
| train/                  |              |
|    approx_kl            | 0.0012239954 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.157        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.6e+06      |
|    n_updates            | 1260         |
|    policy_gradient_loss | -0.00975     |
|    std                  | 0.988        |
|    value_loss           | 2.96e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 128          |
|    time_elapsed         | 852          |
|    total_timesteps      | 262144       |
| train/                  |              |
|    approx_kl            | 0.0007456773 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.172        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.37e+06     |
|    n_updates            | 1270         |
|    policy_gradient_loss | -0.00696     |
|    std                  | 0.988        |
|    value_loss           | 2.43e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 129          |
|    time_elapsed         | 859          |
|    total_timesteps      | 264192       |
| train/                  |              |
|    approx_kl            | 0.0017986335 |
|    clip_fraction        | 0.000244     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.17         |
|    learning_rate        | 0.0001       |
|    loss                 | 2.11e+06     |
|    n_updates            | 1280         |
|    policy_gradient_loss | -0.0105      |
|    std                  | 0.988        |
|    value_loss           | 3.09e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 130          |
|    time_elapsed         | 865          |
|    total_timesteps      | 266240       |
| train/                  |              |
|    approx_kl            | 0.0013812943 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.21         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.41e+06     |
|    n_updates            | 1290         |
|    policy_gradient_loss | -0.00944     |
|    std                  | 0.988        |
|    value_loss           | 3.5e+06      |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 307         |
|    iterations           | 131         |
|    time_elapsed         | 872         |
|    total_timesteps      | 268288      |
| train/                  |             |
|    approx_kl            | 0.000886044 |
|    clip_fraction        | 9.77e-05    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.177       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.46e+06    |
|    n_updates            | 1300        |
|    policy_gradient_loss | -0.00794    |
|    std                  | 0.988       |
|    value_loss           | 3.13e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 132          |
|    time_elapsed         | 879          |
|    total_timesteps      | 270336       |
| train/                  |              |
|    approx_kl            | 0.0016845219 |
|    clip_fraction        | 0.000781     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.231        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.13e+06     |
|    n_updates            | 1310         |
|    policy_gradient_loss | -0.0115      |
|    std                  | 0.988        |
|    value_loss           | 2.14e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 133          |
|    time_elapsed         | 886          |
|    total_timesteps      | 272384       |
| train/                  |              |
|    approx_kl            | 0.0011209558 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.139        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.44e+06     |
|    n_updates            | 1320         |
|    policy_gradient_loss | -0.0092      |
|    std                  | 0.988        |
|    value_loss           | 4.86e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 307         |
|    iterations           | 134         |
|    time_elapsed         | 891         |
|    total_timesteps      | 274432      |
| train/                  |             |
|    approx_kl            | 0.002010017 |
|    clip_fraction        | 0.000488    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.217       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.65e+06    |
|    n_updates            | 1330        |
|    policy_gradient_loss | -0.0124     |
|    std                  | 0.988       |
|    value_loss           | 2.96e+06    |
-----------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 307         |
|    iterations           | 135         |
|    time_elapsed         | 898         |
|    total_timesteps      | 276480      |
| train/                  |             |
|    approx_kl            | 0.001566518 |
|    clip_fraction        | 0.000439    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.227       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.51e+06    |
|    n_updates            | 1340        |
|    policy_gradient_loss | -0.00964    |
|    std                  | 0.988       |
|    value_loss           | 2.91e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 136          |
|    time_elapsed         | 904          |
|    total_timesteps      | 278528       |
| train/                  |              |
|    approx_kl            | 0.0021237554 |
|    clip_fraction        | 0.00171      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.175        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.33e+06     |
|    n_updates            | 1350         |
|    policy_gradient_loss | -0.0116      |
|    std                  | 0.988        |
|    value_loss           | 2.51e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 307          |
|    iterations           | 137          |
|    time_elapsed         | 911          |
|    total_timesteps      | 280576       |
| train/                  |              |
|    approx_kl            | 0.0010425949 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.163        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.76e+06     |
|    n_updates            | 1360         |
|    policy_gradient_loss | -0.00875     |
|    std                  | 0.989        |
|    value_loss           | 3.48e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 138          |
|    time_elapsed         | 917          |
|    total_timesteps      | 282624       |
| train/                  |              |
|    approx_kl            | 0.0016209485 |
|    clip_fraction        | 0.000537     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.238        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.3e+06      |
|    n_updates            | 1370         |
|    policy_gradient_loss | -0.0111      |
|    std                  | 0.988        |
|    value_loss           | 2.81e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 308         |
|    iterations           | 139         |
|    time_elapsed         | 923         |
|    total_timesteps      | 284672      |
| train/                  |             |
|    approx_kl            | 0.001081387 |
|    clip_fraction        | 9.77e-05    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.183       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.94e+06    |
|    n_updates            | 1380        |
|    policy_gradient_loss | -0.00842    |
|    std                  | 0.988       |
|    value_loss           | 3.32e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 140          |
|    time_elapsed         | 930          |
|    total_timesteps      | 286720       |
| train/                  |              |
|    approx_kl            | 0.0016333571 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.213        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.4e+06      |
|    n_updates            | 1390         |
|    policy_gradient_loss | -0.0106      |
|    std                  | 0.988        |
|    value_loss           | 3.15e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 141          |
|    time_elapsed         | 937          |
|    total_timesteps      | 288768       |
| train/                  |              |
|    approx_kl            | 0.0017294332 |
|    clip_fraction        | 0.000684     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.183        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.54e+06     |
|    n_updates            | 1400         |
|    policy_gradient_loss | -0.0113      |
|    std                  | 0.988        |
|    value_loss           | 3.17e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 142          |
|    time_elapsed         | 943          |
|    total_timesteps      | 290816       |
| train/                  |              |
|    approx_kl            | 0.0010661653 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.184        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.07e+06     |
|    n_updates            | 1410         |
|    policy_gradient_loss | -0.00839     |
|    std                  | 0.989        |
|    value_loss           | 2.42e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 308         |
|    iterations           | 143         |
|    time_elapsed         | 949         |
|    total_timesteps      | 292864      |
| train/                  |             |
|    approx_kl            | 0.000721506 |
|    clip_fraction        | 0           |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.6       |
|    explained_variance   | 0.167       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.56e+06    |
|    n_updates            | 1420        |
|    policy_gradient_loss | -0.00704    |
|    std                  | 0.989       |
|    value_loss           | 3.07e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 144          |
|    time_elapsed         | 955          |
|    total_timesteps      | 294912       |
| train/                  |              |
|    approx_kl            | 0.0012321931 |
|    clip_fraction        | 0.000244     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.251        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.87e+06     |
|    n_updates            | 1430         |
|    policy_gradient_loss | -0.00841     |
|    std                  | 0.989        |
|    value_loss           | 3.73e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 145          |
|    time_elapsed         | 962          |
|    total_timesteps      | 296960       |
| train/                  |              |
|    approx_kl            | 0.0017111987 |
|    clip_fraction        | 0.000586     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.156        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.65e+06     |
|    n_updates            | 1440         |
|    policy_gradient_loss | -0.0112      |
|    std                  | 0.989        |
|    value_loss           | 3.65e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 146          |
|    time_elapsed         | 968          |
|    total_timesteps      | 299008       |
| train/                  |              |
|    approx_kl            | 0.0017772599 |
|    clip_fraction        | 0.00137      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.182        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.27e+06     |
|    n_updates            | 1450         |
|    policy_gradient_loss | -0.0109      |
|    std                  | 0.989        |
|    value_loss           | 2.49e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 308          |
|    iterations           | 147          |
|    time_elapsed         | 974          |
|    total_timesteps      | 301056       |
| train/                  |              |
|    approx_kl            | 0.0006393021 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.177        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.73e+06     |
|    n_updates            | 1460         |
|    policy_gradient_loss | -0.00667     |
|    std                  | 0.989        |
|    value_loss           | 3.66e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 309          |
|    iterations           | 148          |
|    time_elapsed         | 980          |
|    total_timesteps      | 303104       |
| train/                  |              |
|    approx_kl            | 0.0020328565 |
|    clip_fraction        | 0.000732     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.6        |
|    explained_variance   | 0.221        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.81e+06     |
|    n_updates            | 1470         |
|    policy_gradient_loss | -0.012       |
|    std                  | 0.988        |
|    value_loss           | 3.36e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 309          |
|    iterations           | 149          |
|    time_elapsed         | 986          |
|    total_timesteps      | 305152       |
| train/                  |              |
|    approx_kl            | 0.0012044599 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.215        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.32e+06     |
|    n_updates            | 1480         |
|    policy_gradient_loss | -0.00901     |
|    std                  | 0.988        |
|    value_loss           | 2.63e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 309          |
|    iterations           | 150          |
|    time_elapsed         | 992          |
|    total_timesteps      | 307200       |
| train/                  |              |
|    approx_kl            | 0.0023908322 |
|    clip_fraction        | 0.00112      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.255        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.06e+06     |
|    n_updates            | 1490         |
|    policy_gradient_loss | -0.0133      |
|    std                  | 0.988        |
|    value_loss           | 2.14e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 309           |
|    iterations           | 151           |
|    time_elapsed         | 998           |
|    total_timesteps      | 309248        |
| train/                  |               |
|    approx_kl            | 0.00084945175 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.219         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.65e+06      |
|    n_updates            | 1500          |
|    policy_gradient_loss | -0.00745      |
|    std                  | 0.987         |
|    value_loss           | 3.07e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 310          |
|    iterations           | 152          |
|    time_elapsed         | 1004         |
|    total_timesteps      | 311296       |
| train/                  |              |
|    approx_kl            | 0.0016953598 |
|    clip_fraction        | 0.000781     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.191        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.84e+06     |
|    n_updates            | 1510         |
|    policy_gradient_loss | -0.0115      |
|    std                  | 0.987        |
|    value_loss           | 3.33e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 310         |
|    iterations           | 153         |
|    time_elapsed         | 1010        |
|    total_timesteps      | 313344      |
| train/                  |             |
|    approx_kl            | 0.001386096 |
|    clip_fraction        | 0.000342    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.239       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.53e+06    |
|    n_updates            | 1520        |
|    policy_gradient_loss | -0.00963    |
|    std                  | 0.987       |
|    value_loss           | 3.18e+06    |
-----------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 310         |
|    iterations           | 154         |
|    time_elapsed         | 1016        |
|    total_timesteps      | 315392      |
| train/                  |             |
|    approx_kl            | 0.000924737 |
|    clip_fraction        | 0           |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.218       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.63e+06    |
|    n_updates            | 1530        |
|    policy_gradient_loss | -0.00832    |
|    std                  | 0.987       |
|    value_loss           | 3.01e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 310          |
|    iterations           | 155          |
|    time_elapsed         | 1022         |
|    total_timesteps      | 317440       |
| train/                  |              |
|    approx_kl            | 0.0009372878 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.182        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.65e+06     |
|    n_updates            | 1540         |
|    policy_gradient_loss | -0.00777     |
|    std                  | 0.987        |
|    value_loss           | 3.59e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 310          |
|    iterations           | 156          |
|    time_elapsed         | 1028         |
|    total_timesteps      | 319488       |
| train/                  |              |
|    approx_kl            | 0.0015567774 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.238        |
|    learning_rate        | 0.0001       |
|    loss                 | 9.86e+05     |
|    n_updates            | 1550         |
|    policy_gradient_loss | -0.0099      |
|    std                  | 0.986        |
|    value_loss           | 2.34e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 310         |
|    iterations           | 157         |
|    time_elapsed         | 1034        |
|    total_timesteps      | 321536      |
| train/                  |             |
|    approx_kl            | 0.001145498 |
|    clip_fraction        | 4.88e-05    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.177       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.68e+06    |
|    n_updates            | 1560        |
|    policy_gradient_loss | -0.00838    |
|    std                  | 0.985       |
|    value_loss           | 3.94e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 310          |
|    iterations           | 158          |
|    time_elapsed         | 1041         |
|    total_timesteps      | 323584       |
| train/                  |              |
|    approx_kl            | 0.0013910702 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.241        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.82e+06     |
|    n_updates            | 1570         |
|    policy_gradient_loss | -0.01        |
|    std                  | 0.985        |
|    value_loss           | 3.33e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 310          |
|    iterations           | 159          |
|    time_elapsed         | 1047         |
|    total_timesteps      | 325632       |
| train/                  |              |
|    approx_kl            | 0.0024880175 |
|    clip_fraction        | 0.00308      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.243        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.67e+06     |
|    n_updates            | 1580         |
|    policy_gradient_loss | -0.0131      |
|    std                  | 0.986        |
|    value_loss           | 3.35e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 310           |
|    iterations           | 160           |
|    time_elapsed         | 1054          |
|    total_timesteps      | 327680        |
| train/                  |               |
|    approx_kl            | 0.00091675564 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.209         |
|    learning_rate        | 0.0001        |
|    loss                 | 2.02e+06      |
|    n_updates            | 1590          |
|    policy_gradient_loss | -0.00771      |
|    std                  | 0.986         |
|    value_loss           | 3.68e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 310          |
|    iterations           | 161          |
|    time_elapsed         | 1060         |
|    total_timesteps      | 329728       |
| train/                  |              |
|    approx_kl            | 0.0018976425 |
|    clip_fraction        | 0.000977     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.156        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.28e+06     |
|    n_updates            | 1600         |
|    policy_gradient_loss | -0.0118      |
|    std                  | 0.986        |
|    value_loss           | 4.02e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 310         |
|    iterations           | 162         |
|    time_elapsed         | 1067        |
|    total_timesteps      | 331776      |
| train/                  |             |
|    approx_kl            | 0.002078765 |
|    clip_fraction        | 0.00107     |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.213       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.25e+06    |
|    n_updates            | 1610        |
|    policy_gradient_loss | -0.0119     |
|    std                  | 0.986       |
|    value_loss           | 2.57e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 310          |
|    iterations           | 163          |
|    time_elapsed         | 1073         |
|    total_timesteps      | 333824       |
| train/                  |              |
|    approx_kl            | 0.0013668068 |
|    clip_fraction        | 0.000391     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.278        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.22e+06     |
|    n_updates            | 1620         |
|    policy_gradient_loss | -0.00992     |
|    std                  | 0.986        |
|    value_loss           | 2.47e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 310          |
|    iterations           | 164          |
|    time_elapsed         | 1080         |
|    total_timesteps      | 335872       |
| train/                  |              |
|    approx_kl            | 0.0011474922 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.185        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.06e+06     |
|    n_updates            | 1630         |
|    policy_gradient_loss | -0.00944     |
|    std                  | 0.986        |
|    value_loss           | 3.69e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 310          |
|    iterations           | 165          |
|    time_elapsed         | 1086         |
|    total_timesteps      | 337920       |
| train/                  |              |
|    approx_kl            | 0.0021439148 |
|    clip_fraction        | 0.00156      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.165        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.58e+06     |
|    n_updates            | 1640         |
|    policy_gradient_loss | -0.0125      |
|    std                  | 0.985        |
|    value_loss           | 3.23e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 311           |
|    iterations           | 166           |
|    time_elapsed         | 1092          |
|    total_timesteps      | 339968        |
| train/                  |               |
|    approx_kl            | 0.00082262687 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.164         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.51e+06      |
|    n_updates            | 1650          |
|    policy_gradient_loss | -0.00704      |
|    std                  | 0.985         |
|    value_loss           | 3.38e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 311          |
|    iterations           | 167          |
|    time_elapsed         | 1098         |
|    total_timesteps      | 342016       |
| train/                  |              |
|    approx_kl            | 0.0007104565 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.204        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.94e+06     |
|    n_updates            | 1660         |
|    policy_gradient_loss | -0.00733     |
|    std                  | 0.985        |
|    value_loss           | 3.52e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 311          |
|    iterations           | 168          |
|    time_elapsed         | 1105         |
|    total_timesteps      | 344064       |
| train/                  |              |
|    approx_kl            | 0.0011397677 |
|    clip_fraction        | 0.000439     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.162        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.23e+06     |
|    n_updates            | 1670         |
|    policy_gradient_loss | -0.00792     |
|    std                  | 0.985        |
|    value_loss           | 4.99e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 311          |
|    iterations           | 169          |
|    time_elapsed         | 1111         |
|    total_timesteps      | 346112       |
| train/                  |              |
|    approx_kl            | 0.0009926399 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.173        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.92e+06     |
|    n_updates            | 1680         |
|    policy_gradient_loss | -0.00786     |
|    std                  | 0.985        |
|    value_loss           | 3.31e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 311          |
|    iterations           | 170          |
|    time_elapsed         | 1117         |
|    total_timesteps      | 348160       |
| train/                  |              |
|    approx_kl            | 0.0018059659 |
|    clip_fraction        | 0.000488     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.267        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.34e+06     |
|    n_updates            | 1690         |
|    policy_gradient_loss | -0.0116      |
|    std                  | 0.985        |
|    value_loss           | 2.67e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 311          |
|    iterations           | 171          |
|    time_elapsed         | 1124         |
|    total_timesteps      | 350208       |
| train/                  |              |
|    approx_kl            | 0.0015112835 |
|    clip_fraction        | 0.000342     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.196        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.33e+06     |
|    n_updates            | 1700         |
|    policy_gradient_loss | -0.0101      |
|    std                  | 0.984        |
|    value_loss           | 4.08e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 311          |
|    iterations           | 172          |
|    time_elapsed         | 1130         |
|    total_timesteps      | 352256       |
| train/                  |              |
|    approx_kl            | 0.0008027636 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.144        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.33e+06     |
|    n_updates            | 1710         |
|    policy_gradient_loss | -0.00718     |
|    std                  | 0.984        |
|    value_loss           | 4.26e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 311         |
|    iterations           | 173         |
|    time_elapsed         | 1136        |
|    total_timesteps      | 354304      |
| train/                  |             |
|    approx_kl            | 0.001454583 |
|    clip_fraction        | 0.000488    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.263       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.6e+06     |
|    n_updates            | 1720        |
|    policy_gradient_loss | -0.0101     |
|    std                  | 0.984       |
|    value_loss           | 3.32e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 311          |
|    iterations           | 174          |
|    time_elapsed         | 1142         |
|    total_timesteps      | 356352       |
| train/                  |              |
|    approx_kl            | 0.0011622758 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.155        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.45e+06     |
|    n_updates            | 1730         |
|    policy_gradient_loss | -0.00884     |
|    std                  | 0.985        |
|    value_loss           | 4.65e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 312           |
|    iterations           | 175           |
|    time_elapsed         | 1148          |
|    total_timesteps      | 358400        |
| train/                  |               |
|    approx_kl            | 0.00078406057 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.17          |
|    learning_rate        | 0.0001        |
|    loss                 | 1.9e+06       |
|    n_updates            | 1740          |
|    policy_gradient_loss | -0.00783      |
|    std                  | 0.985         |
|    value_loss           | 3.86e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 311          |
|    iterations           | 176          |
|    time_elapsed         | 1155         |
|    total_timesteps      | 360448       |
| train/                  |              |
|    approx_kl            | 0.0011811738 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.177        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.13e+06     |
|    n_updates            | 1750         |
|    policy_gradient_loss | -0.00832     |
|    std                  | 0.984        |
|    value_loss           | 3.83e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 177          |
|    time_elapsed         | 1161         |
|    total_timesteps      | 362496       |
| train/                  |              |
|    approx_kl            | 0.0019645686 |
|    clip_fraction        | 0.000488     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.252        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.07e+06     |
|    n_updates            | 1760         |
|    policy_gradient_loss | -0.0117      |
|    std                  | 0.985        |
|    value_loss           | 2.3e+06      |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 312         |
|    iterations           | 178         |
|    time_elapsed         | 1168        |
|    total_timesteps      | 364544      |
| train/                  |             |
|    approx_kl            | 0.001154317 |
|    clip_fraction        | 4.88e-05    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.198       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.46e+06    |
|    n_updates            | 1770        |
|    policy_gradient_loss | -0.00883    |
|    std                  | 0.985       |
|    value_loss           | 3.44e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 179          |
|    time_elapsed         | 1174         |
|    total_timesteps      | 366592       |
| train/                  |              |
|    approx_kl            | 0.0011241075 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.186        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.87e+06     |
|    n_updates            | 1780         |
|    policy_gradient_loss | -0.0088      |
|    std                  | 0.986        |
|    value_loss           | 3.77e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 180          |
|    time_elapsed         | 1180         |
|    total_timesteps      | 368640       |
| train/                  |              |
|    approx_kl            | 0.0006318046 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.228        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.77e+06     |
|    n_updates            | 1790         |
|    policy_gradient_loss | -0.00662     |
|    std                  | 0.986        |
|    value_loss           | 3.24e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 312           |
|    iterations           | 181           |
|    time_elapsed         | 1186          |
|    total_timesteps      | 370688        |
| train/                  |               |
|    approx_kl            | 0.00087832485 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.231         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.81e+06      |
|    n_updates            | 1800          |
|    policy_gradient_loss | -0.0077       |
|    std                  | 0.986         |
|    value_loss           | 3.3e+06       |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 182          |
|    time_elapsed         | 1193         |
|    total_timesteps      | 372736       |
| train/                  |              |
|    approx_kl            | 0.0013887262 |
|    clip_fraction        | 0.000391     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.223        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.27e+06     |
|    n_updates            | 1810         |
|    policy_gradient_loss | -0.00982     |
|    std                  | 0.987        |
|    value_loss           | 3.97e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 183          |
|    time_elapsed         | 1200         |
|    total_timesteps      | 374784       |
| train/                  |              |
|    approx_kl            | 0.0010344001 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.229        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.85e+06     |
|    n_updates            | 1820         |
|    policy_gradient_loss | -0.00959     |
|    std                  | 0.987        |
|    value_loss           | 3.37e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 184          |
|    time_elapsed         | 1206         |
|    total_timesteps      | 376832       |
| train/                  |              |
|    approx_kl            | 0.0014038127 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.19         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.5e+06      |
|    n_updates            | 1830         |
|    policy_gradient_loss | -0.0093      |
|    std                  | 0.987        |
|    value_loss           | 2.68e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 185          |
|    time_elapsed         | 1212         |
|    total_timesteps      | 378880       |
| train/                  |              |
|    approx_kl            | 0.0009735054 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.216        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.18e+06     |
|    n_updates            | 1840         |
|    policy_gradient_loss | -0.00658     |
|    std                  | 0.987        |
|    value_loss           | 2.43e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 186          |
|    time_elapsed         | 1219         |
|    total_timesteps      | 380928       |
| train/                  |              |
|    approx_kl            | 0.0012775946 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.273        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.15e+06     |
|    n_updates            | 1850         |
|    policy_gradient_loss | -0.01        |
|    std                  | 0.987        |
|    value_loss           | 2.52e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 187          |
|    time_elapsed         | 1225         |
|    total_timesteps      | 382976       |
| train/                  |              |
|    approx_kl            | 0.0010564949 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.233        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.21e+06     |
|    n_updates            | 1860         |
|    policy_gradient_loss | -0.00859     |
|    std                  | 0.986        |
|    value_loss           | 2.74e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 188          |
|    time_elapsed         | 1232         |
|    total_timesteps      | 385024       |
| train/                  |              |
|    approx_kl            | 0.0009969065 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.29         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.51e+06     |
|    n_updates            | 1870         |
|    policy_gradient_loss | -0.00845     |
|    std                  | 0.987        |
|    value_loss           | 2.8e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 189          |
|    time_elapsed         | 1238         |
|    total_timesteps      | 387072       |
| train/                  |              |
|    approx_kl            | 0.0021119774 |
|    clip_fraction        | 0.00122      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.223        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.66e+06     |
|    n_updates            | 1880         |
|    policy_gradient_loss | -0.0117      |
|    std                  | 0.986        |
|    value_loss           | 3.48e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 190          |
|    time_elapsed         | 1245         |
|    total_timesteps      | 389120       |
| train/                  |              |
|    approx_kl            | 0.0013202534 |
|    clip_fraction        | 0.000244     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.313        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.42e+06     |
|    n_updates            | 1890         |
|    policy_gradient_loss | -0.0102      |
|    std                  | 0.986        |
|    value_loss           | 2.51e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 191          |
|    time_elapsed         | 1250         |
|    total_timesteps      | 391168       |
| train/                  |              |
|    approx_kl            | 0.0013159061 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.169        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.01e+06     |
|    n_updates            | 1900         |
|    policy_gradient_loss | -0.00923     |
|    std                  | 0.985        |
|    value_loss           | 4.12e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 192          |
|    time_elapsed         | 1257         |
|    total_timesteps      | 393216       |
| train/                  |              |
|    approx_kl            | 0.0011581493 |
|    clip_fraction        | 0.000293     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.229        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.45e+06     |
|    n_updates            | 1910         |
|    policy_gradient_loss | -0.00934     |
|    std                  | 0.985        |
|    value_loss           | 2.91e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 312           |
|    iterations           | 193           |
|    time_elapsed         | 1263          |
|    total_timesteps      | 395264        |
| train/                  |               |
|    approx_kl            | 0.00067485054 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.229         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.47e+06      |
|    n_updates            | 1920          |
|    policy_gradient_loss | -0.00694      |
|    std                  | 0.985         |
|    value_loss           | 2.74e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 312          |
|    iterations           | 194          |
|    time_elapsed         | 1269         |
|    total_timesteps      | 397312       |
| train/                  |              |
|    approx_kl            | 0.0012400228 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.229        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.81e+06     |
|    n_updates            | 1930         |
|    policy_gradient_loss | -0.00955     |
|    std                  | 0.984        |
|    value_loss           | 3.74e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 195          |
|    time_elapsed         | 1275         |
|    total_timesteps      | 399360       |
| train/                  |              |
|    approx_kl            | 0.0017416998 |
|    clip_fraction        | 0.000586     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.219        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.38e+06     |
|    n_updates            | 1940         |
|    policy_gradient_loss | -0.0117      |
|    std                  | 0.984        |
|    value_loss           | 3.18e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 196          |
|    time_elapsed         | 1281         |
|    total_timesteps      | 401408       |
| train/                  |              |
|    approx_kl            | 0.0013451688 |
|    clip_fraction        | 0.000684     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.4        |
|    explained_variance   | 0.244        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.68e+06     |
|    n_updates            | 1950         |
|    policy_gradient_loss | -0.0097      |
|    std                  | 0.984        |
|    value_loss           | 2.89e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 197          |
|    time_elapsed         | 1288         |
|    total_timesteps      | 403456       |
| train/                  |              |
|    approx_kl            | 0.0013235996 |
|    clip_fraction        | 0.000342     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.187        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.54e+06     |
|    n_updates            | 1960         |
|    policy_gradient_loss | -0.00969     |
|    std                  | 0.984        |
|    value_loss           | 3.6e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 198          |
|    time_elapsed         | 1294         |
|    total_timesteps      | 405504       |
| train/                  |              |
|    approx_kl            | 0.0013329982 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.183        |
|    learning_rate        | 0.0001       |
|    loss                 | 2e+06        |
|    n_updates            | 1970         |
|    policy_gradient_loss | -0.00981     |
|    std                  | 0.985        |
|    value_loss           | 3.91e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 313         |
|    iterations           | 199         |
|    time_elapsed         | 1300        |
|    total_timesteps      | 407552      |
| train/                  |             |
|    approx_kl            | 0.001258944 |
|    clip_fraction        | 0.000391    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.254       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.29e+06    |
|    n_updates            | 1980        |
|    policy_gradient_loss | -0.00936    |
|    std                  | 0.985       |
|    value_loss           | 3e+06       |
-----------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 313           |
|    iterations           | 200           |
|    time_elapsed         | 1307          |
|    total_timesteps      | 409600        |
| train/                  |               |
|    approx_kl            | 0.00020174793 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.237         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.66e+06      |
|    n_updates            | 1990          |
|    policy_gradient_loss | -0.00378      |
|    std                  | 0.985         |
|    value_loss           | 3.07e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 201          |
|    time_elapsed         | 1313         |
|    total_timesteps      | 411648       |
| train/                  |              |
|    approx_kl            | 0.0021659052 |
|    clip_fraction        | 0.00127      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.359        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.08e+06     |
|    n_updates            | 2000         |
|    policy_gradient_loss | -0.0122      |
|    std                  | 0.986        |
|    value_loss           | 2.17e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 202          |
|    time_elapsed         | 1319         |
|    total_timesteps      | 413696       |
| train/                  |              |
|    approx_kl            | 0.0018675704 |
|    clip_fraction        | 0.00146      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.271        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.4e+06      |
|    n_updates            | 2010         |
|    policy_gradient_loss | -0.0116      |
|    std                  | 0.986        |
|    value_loss           | 2.86e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 203          |
|    time_elapsed         | 1325         |
|    total_timesteps      | 415744       |
| train/                  |              |
|    approx_kl            | 0.0008241443 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.243        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.02e+06     |
|    n_updates            | 2020         |
|    policy_gradient_loss | -0.00792     |
|    std                  | 0.987        |
|    value_loss           | 3.59e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 204          |
|    time_elapsed         | 1332         |
|    total_timesteps      | 417792       |
| train/                  |              |
|    approx_kl            | 0.0007293949 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.226        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.86e+06     |
|    n_updates            | 2030         |
|    policy_gradient_loss | -0.00706     |
|    std                  | 0.987        |
|    value_loss           | 3.28e+06     |
------------------------------------------


----------------------------------------
| time/                   |            |
|    fps                  | 313        |
|    iterations           | 205        |
|    time_elapsed         | 1337       |
|    total_timesteps      | 419840     |
| train/                  |            |
|    approx_kl            | 0.00151842 |
|    clip_fraction        | 0.000195   |
|    clip_range           | 0.2        |
|    entropy_loss         | -29.5      |
|    explained_variance   | 0.276      |
|    learning_rate        | 0.0001     |
|    loss                 | 1.62e+06   |
|    n_updates            | 2040       |
|    policy_gradient_loss | -0.00975   |
|    std                  | 0.987      |
|    value_loss           | 2.63e+06   |
----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 206          |
|    time_elapsed         | 1344         |
|    total_timesteps      | 421888       |
| train/                  |              |
|    approx_kl            | 0.0018762103 |
|    clip_fraction        | 0.000684     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.245        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.93e+06     |
|    n_updates            | 2050         |
|    policy_gradient_loss | -0.0119      |
|    std                  | 0.987        |
|    value_loss           | 3.58e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 313         |
|    iterations           | 207         |
|    time_elapsed         | 1350        |
|    total_timesteps      | 423936      |
| train/                  |             |
|    approx_kl            | 0.001970281 |
|    clip_fraction        | 0.000439    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.315       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.08e+06    |
|    n_updates            | 2060        |
|    policy_gradient_loss | -0.0113     |
|    std                  | 0.987       |
|    value_loss           | 2.12e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 208          |
|    time_elapsed         | 1357         |
|    total_timesteps      | 425984       |
| train/                  |              |
|    approx_kl            | 0.0004982614 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.301        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.3e+06      |
|    n_updates            | 2070         |
|    policy_gradient_loss | -0.00601     |
|    std                  | 0.987        |
|    value_loss           | 2.72e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 209          |
|    time_elapsed         | 1363         |
|    total_timesteps      | 428032       |
| train/                  |              |
|    approx_kl            | 0.0007832023 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.226        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.17e+06     |
|    n_updates            | 2080         |
|    policy_gradient_loss | -0.00737     |
|    std                  | 0.987        |
|    value_loss           | 4.69e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 210          |
|    time_elapsed         | 1370         |
|    total_timesteps      | 430080       |
| train/                  |              |
|    approx_kl            | 0.0010366746 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.232        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.15e+06     |
|    n_updates            | 2090         |
|    policy_gradient_loss | -0.00729     |
|    std                  | 0.987        |
|    value_loss           | 4.41e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 211          |
|    time_elapsed         | 1376         |
|    total_timesteps      | 432128       |
| train/                  |              |
|    approx_kl            | 0.0007348497 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.257        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.18e+06     |
|    n_updates            | 2100         |
|    policy_gradient_loss | -0.00702     |
|    std                  | 0.987        |
|    value_loss           | 3.09e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 313          |
|    iterations           | 212          |
|    time_elapsed         | 1383         |
|    total_timesteps      | 434176       |
| train/                  |              |
|    approx_kl            | 0.0011800467 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.256        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.04e+06     |
|    n_updates            | 2110         |
|    policy_gradient_loss | -0.00927     |
|    std                  | 0.987        |
|    value_loss           | 3.42e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 213          |
|    time_elapsed         | 1389         |
|    total_timesteps      | 436224       |
| train/                  |              |
|    approx_kl            | 0.0008017813 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.235        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.78e+06     |
|    n_updates            | 2120         |
|    policy_gradient_loss | -0.00772     |
|    std                  | 0.987        |
|    value_loss           | 3.27e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 313           |
|    iterations           | 214           |
|    time_elapsed         | 1395          |
|    total_timesteps      | 438272        |
| train/                  |               |
|    approx_kl            | 0.00084528734 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.193         |
|    learning_rate        | 0.0001        |
|    loss                 | 2.13e+06      |
|    n_updates            | 2130          |
|    policy_gradient_loss | -0.00798      |
|    std                  | 0.986         |
|    value_loss           | 4.07e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 314           |
|    iterations           | 215           |
|    time_elapsed         | 1401          |
|    total_timesteps      | 440320        |
| train/                  |               |
|    approx_kl            | 0.00046345883 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.299         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.8e+06       |
|    n_updates            | 2140          |
|    policy_gradient_loss | -0.00501      |
|    std                  | 0.987         |
|    value_loss           | 3.32e+06      |
-------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 314         |
|    iterations           | 216         |
|    time_elapsed         | 1408        |
|    total_timesteps      | 442368      |
| train/                  |             |
|    approx_kl            | 0.001683482 |
|    clip_fraction        | 0.000391    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.294       |
|    learning_rate        | 0.0001      |
|    loss                 | 2.26e+06    |
|    n_updates            | 2150        |
|    policy_gradient_loss | -0.0102     |
|    std                  | 0.986       |
|    value_loss           | 3.46e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 217          |
|    time_elapsed         | 1413         |
|    total_timesteps      | 444416       |
| train/                  |              |
|    approx_kl            | 0.0017159506 |
|    clip_fraction        | 0.000635     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.252        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.59e+06     |
|    n_updates            | 2160         |
|    policy_gradient_loss | -0.0108      |
|    std                  | 0.987        |
|    value_loss           | 2.66e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 218          |
|    time_elapsed         | 1420         |
|    total_timesteps      | 446464       |
| train/                  |              |
|    approx_kl            | 0.0013763822 |
|    clip_fraction        | 0.000391     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.283        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.17e+06     |
|    n_updates            | 2170         |
|    policy_gradient_loss | -0.00967     |
|    std                  | 0.987        |
|    value_loss           | 2.42e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 314         |
|    iterations           | 219         |
|    time_elapsed         | 1426        |
|    total_timesteps      | 448512      |
| train/                  |             |
|    approx_kl            | 0.001967331 |
|    clip_fraction        | 0.000439    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.257       |
|    learning_rate        | 0.0001      |
|    loss                 | 2.36e+06    |
|    n_updates            | 2180        |
|    policy_gradient_loss | -0.0111     |
|    std                  | 0.987       |
|    value_loss           | 3.9e+06     |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 220          |
|    time_elapsed         | 1432         |
|    total_timesteps      | 450560       |
| train/                  |              |
|    approx_kl            | 0.0014458374 |
|    clip_fraction        | 0.000391     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.267        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.25e+06     |
|    n_updates            | 2190         |
|    policy_gradient_loss | -0.01        |
|    std                  | 0.987        |
|    value_loss           | 3.76e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 221          |
|    time_elapsed         | 1438         |
|    total_timesteps      | 452608       |
| train/                  |              |
|    approx_kl            | 0.0014905782 |
|    clip_fraction        | 0.000488     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.214        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.58e+06     |
|    n_updates            | 2200         |
|    policy_gradient_loss | -0.0104      |
|    std                  | 0.987        |
|    value_loss           | 3.71e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 222          |
|    time_elapsed         | 1445         |
|    total_timesteps      | 454656       |
| train/                  |              |
|    approx_kl            | 0.0017376491 |
|    clip_fraction        | 0.000391     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.275        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.28e+06     |
|    n_updates            | 2210         |
|    policy_gradient_loss | -0.0104      |
|    std                  | 0.987        |
|    value_loss           | 3.32e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 223          |
|    time_elapsed         | 1451         |
|    total_timesteps      | 456704       |
| train/                  |              |
|    approx_kl            | 0.0008066747 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.277        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.28e+06     |
|    n_updates            | 2220         |
|    policy_gradient_loss | -0.0067      |
|    std                  | 0.987        |
|    value_loss           | 2.83e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 224          |
|    time_elapsed         | 1458         |
|    total_timesteps      | 458752       |
| train/                  |              |
|    approx_kl            | 0.0006004361 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.332        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.85e+06     |
|    n_updates            | 2230         |
|    policy_gradient_loss | -0.00695     |
|    std                  | 0.987        |
|    value_loss           | 3.43e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 225          |
|    time_elapsed         | 1463         |
|    total_timesteps      | 460800       |
| train/                  |              |
|    approx_kl            | 0.0019034395 |
|    clip_fraction        | 0.000732     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.174        |
|    learning_rate        | 0.0001       |
|    loss                 | 9.87e+05     |
|    n_updates            | 2240         |
|    policy_gradient_loss | -0.0106      |
|    std                  | 0.987        |
|    value_loss           | 2.86e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 314         |
|    iterations           | 226         |
|    time_elapsed         | 1470        |
|    total_timesteps      | 462848      |
| train/                  |             |
|    approx_kl            | 0.001127126 |
|    clip_fraction        | 0.000146    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.26        |
|    learning_rate        | 0.0001      |
|    loss                 | 1.75e+06    |
|    n_updates            | 2250        |
|    policy_gradient_loss | -0.00917    |
|    std                  | 0.987       |
|    value_loss           | 3.39e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 227          |
|    time_elapsed         | 1476         |
|    total_timesteps      | 464896       |
| train/                  |              |
|    approx_kl            | 0.0024921955 |
|    clip_fraction        | 0.00161      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.297        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.43e+06     |
|    n_updates            | 2260         |
|    policy_gradient_loss | -0.0138      |
|    std                  | 0.987        |
|    value_loss           | 2.6e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 228          |
|    time_elapsed         | 1483         |
|    total_timesteps      | 466944       |
| train/                  |              |
|    approx_kl            | 0.0010634242 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.328        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.18e+06     |
|    n_updates            | 2270         |
|    policy_gradient_loss | -0.00859     |
|    std                  | 0.987        |
|    value_loss           | 2.72e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 229          |
|    time_elapsed         | 1489         |
|    total_timesteps      | 468992       |
| train/                  |              |
|    approx_kl            | 0.0023367528 |
|    clip_fraction        | 0.00127      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.232        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.63e+06     |
|    n_updates            | 2280         |
|    policy_gradient_loss | -0.0135      |
|    std                  | 0.987        |
|    value_loss           | 3.02e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 230          |
|    time_elapsed         | 1496         |
|    total_timesteps      | 471040       |
| train/                  |              |
|    approx_kl            | 0.0011998784 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.316        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.05e+06     |
|    n_updates            | 2290         |
|    policy_gradient_loss | -0.00967     |
|    std                  | 0.986        |
|    value_loss           | 2.82e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 231          |
|    time_elapsed         | 1502         |
|    total_timesteps      | 473088       |
| train/                  |              |
|    approx_kl            | 0.0015557786 |
|    clip_fraction        | 0.000293     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.285        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.71e+06     |
|    n_updates            | 2300         |
|    policy_gradient_loss | -0.00995     |
|    std                  | 0.987        |
|    value_loss           | 2.91e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 232          |
|    time_elapsed         | 1509         |
|    total_timesteps      | 475136       |
| train/                  |              |
|    approx_kl            | 0.0016528394 |
|    clip_fraction        | 0.00171      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.231        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.35e+06     |
|    n_updates            | 2310         |
|    policy_gradient_loss | -0.0108      |
|    std                  | 0.987        |
|    value_loss           | 4.43e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 233          |
|    time_elapsed         | 1515         |
|    total_timesteps      | 477184       |
| train/                  |              |
|    approx_kl            | 0.0012586091 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.304        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.35e+06     |
|    n_updates            | 2320         |
|    policy_gradient_loss | -0.00965     |
|    std                  | 0.986        |
|    value_loss           | 2.8e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 234          |
|    time_elapsed         | 1522         |
|    total_timesteps      | 479232       |
| train/                  |              |
|    approx_kl            | 0.0012655347 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.299        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.23e+06     |
|    n_updates            | 2330         |
|    policy_gradient_loss | -0.00952     |
|    std                  | 0.986        |
|    value_loss           | 2.95e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 235          |
|    time_elapsed         | 1528         |
|    total_timesteps      | 481280       |
| train/                  |              |
|    approx_kl            | 0.0011496674 |
|    clip_fraction        | 0.000391     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.26         |
|    learning_rate        | 0.0001       |
|    loss                 | 2.08e+06     |
|    n_updates            | 2340         |
|    policy_gradient_loss | -0.00933     |
|    std                  | 0.986        |
|    value_loss           | 4.03e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 236          |
|    time_elapsed         | 1535         |
|    total_timesteps      | 483328       |
| train/                  |              |
|    approx_kl            | 0.0012624207 |
|    clip_fraction        | 0.000488     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.172        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.8e+06      |
|    n_updates            | 2350         |
|    policy_gradient_loss | -0.00963     |
|    std                  | 0.987        |
|    value_loss           | 4.37e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 237          |
|    time_elapsed         | 1543         |
|    total_timesteps      | 485376       |
| train/                  |              |
|    approx_kl            | 0.0015710116 |
|    clip_fraction        | 0.00083      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.224        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.34e+06     |
|    n_updates            | 2360         |
|    policy_gradient_loss | -0.00999     |
|    std                  | 0.987        |
|    value_loss           | 3.7e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 238          |
|    time_elapsed         | 1549         |
|    total_timesteps      | 487424       |
| train/                  |              |
|    approx_kl            | 0.0012105618 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.326        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.79e+06     |
|    n_updates            | 2370         |
|    policy_gradient_loss | -0.00866     |
|    std                  | 0.987        |
|    value_loss           | 3.22e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 239          |
|    time_elapsed         | 1556         |
|    total_timesteps      | 489472       |
| train/                  |              |
|    approx_kl            | 0.0004998834 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.259        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.76e+06     |
|    n_updates            | 2380         |
|    policy_gradient_loss | -0.00566     |
|    std                  | 0.987        |
|    value_loss           | 3.09e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 240          |
|    time_elapsed         | 1562         |
|    total_timesteps      | 491520       |
| train/                  |              |
|    approx_kl            | 0.0005988504 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.194        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.41e+06     |
|    n_updates            | 2390         |
|    policy_gradient_loss | -0.00649     |
|    std                  | 0.987        |
|    value_loss           | 4.62e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 241          |
|    time_elapsed         | 1569         |
|    total_timesteps      | 493568       |
| train/                  |              |
|    approx_kl            | 0.0012758274 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.291        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.55e+06     |
|    n_updates            | 2400         |
|    policy_gradient_loss | -0.00882     |
|    std                  | 0.986        |
|    value_loss           | 2.57e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 242          |
|    time_elapsed         | 1575         |
|    total_timesteps      | 495616       |
| train/                  |              |
|    approx_kl            | 0.0014110849 |
|    clip_fraction        | 0.000488     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.247        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.81e+06     |
|    n_updates            | 2410         |
|    policy_gradient_loss | -0.00919     |
|    std                  | 0.986        |
|    value_loss           | 4.26e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 314           |
|    iterations           | 243           |
|    time_elapsed         | 1581          |
|    total_timesteps      | 497664        |
| train/                  |               |
|    approx_kl            | 0.00045019086 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.234         |
|    learning_rate        | 0.0001        |
|    loss                 | 2.1e+06       |
|    n_updates            | 2420          |
|    policy_gradient_loss | -0.00549      |
|    std                  | 0.986         |
|    value_loss           | 3.44e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 244          |
|    time_elapsed         | 1587         |
|    total_timesteps      | 499712       |
| train/                  |              |
|    approx_kl            | 0.0010629585 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.295        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.59e+06     |
|    n_updates            | 2430         |
|    policy_gradient_loss | -0.00804     |
|    std                  | 0.986        |
|    value_loss           | 3.74e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 314         |
|    iterations           | 245         |
|    time_elapsed         | 1593        |
|    total_timesteps      | 501760      |
| train/                  |             |
|    approx_kl            | 0.002113568 |
|    clip_fraction        | 0.00132     |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.315       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.81e+06    |
|    n_updates            | 2440        |
|    policy_gradient_loss | -0.0129     |
|    std                  | 0.985       |
|    value_loss           | 2.74e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 246          |
|    time_elapsed         | 1600         |
|    total_timesteps      | 503808       |
| train/                  |              |
|    approx_kl            | 0.0009419643 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.272        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.2e+06      |
|    n_updates            | 2450         |
|    policy_gradient_loss | -0.0083      |
|    std                  | 0.986        |
|    value_loss           | 3.86e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 247          |
|    time_elapsed         | 1606         |
|    total_timesteps      | 505856       |
| train/                  |              |
|    approx_kl            | 0.0019011801 |
|    clip_fraction        | 0.000635     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.376        |
|    learning_rate        | 0.0001       |
|    loss                 | 8.98e+05     |
|    n_updates            | 2460         |
|    policy_gradient_loss | -0.0124      |
|    std                  | 0.986        |
|    value_loss           | 2.33e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 314         |
|    iterations           | 248         |
|    time_elapsed         | 1612        |
|    total_timesteps      | 507904      |
| train/                  |             |
|    approx_kl            | 0.001225711 |
|    clip_fraction        | 9.77e-05    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.312       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.09e+06    |
|    n_updates            | 2470        |
|    policy_gradient_loss | -0.00964    |
|    std                  | 0.986       |
|    value_loss           | 2.62e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 249          |
|    time_elapsed         | 1619         |
|    total_timesteps      | 509952       |
| train/                  |              |
|    approx_kl            | 0.0025548763 |
|    clip_fraction        | 0.0022       |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.388        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.36e+06     |
|    n_updates            | 2480         |
|    policy_gradient_loss | -0.0144      |
|    std                  | 0.987        |
|    value_loss           | 2.56e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 314          |
|    iterations           | 250          |
|    time_elapsed         | 1625         |
|    total_timesteps      | 512000       |
| train/                  |              |
|    approx_kl            | 0.0005739185 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.264        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.83e+06     |
|    n_updates            | 2490         |
|    policy_gradient_loss | -0.00597     |
|    std                  | 0.987        |
|    value_loss           | 3.32e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 251          |
|    time_elapsed         | 1631         |
|    total_timesteps      | 514048       |
| train/                  |              |
|    approx_kl            | 0.0010689507 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.29         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.48e+06     |
|    n_updates            | 2500         |
|    policy_gradient_loss | -0.00869     |
|    std                  | 0.987        |
|    value_loss           | 3.56e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 252          |
|    time_elapsed         | 1637         |
|    total_timesteps      | 516096       |
| train/                  |              |
|    approx_kl            | 0.0017507941 |
|    clip_fraction        | 0.000928     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.33         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.61e+06     |
|    n_updates            | 2510         |
|    policy_gradient_loss | -0.0114      |
|    std                  | 0.987        |
|    value_loss           | 2.87e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 253          |
|    time_elapsed         | 1643         |
|    total_timesteps      | 518144       |
| train/                  |              |
|    approx_kl            | 0.0005731174 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.254        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.97e+06     |
|    n_updates            | 2520         |
|    policy_gradient_loss | -0.00631     |
|    std                  | 0.987        |
|    value_loss           | 4.49e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 315           |
|    iterations           | 254           |
|    time_elapsed         | 1651          |
|    total_timesteps      | 520192        |
| train/                  |               |
|    approx_kl            | 0.00085694704 |
|    clip_fraction        | 9.77e-05      |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.274         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.56e+06      |
|    n_updates            | 2530          |
|    policy_gradient_loss | -0.00763      |
|    std                  | 0.987         |
|    value_loss           | 3.43e+06      |
-------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 315         |
|    iterations           | 255         |
|    time_elapsed         | 1657        |
|    total_timesteps      | 522240      |
| train/                  |             |
|    approx_kl            | 0.000643611 |
|    clip_fraction        | 0           |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.332       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.73e+06    |
|    n_updates            | 2540        |
|    policy_gradient_loss | -0.00729    |
|    std                  | 0.987       |
|    value_loss           | 3.15e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 256          |
|    time_elapsed         | 1663         |
|    total_timesteps      | 524288       |
| train/                  |              |
|    approx_kl            | 0.0011573718 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.221        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.26e+06     |
|    n_updates            | 2550         |
|    policy_gradient_loss | -0.00824     |
|    std                  | 0.987        |
|    value_loss           | 2.83e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 257          |
|    time_elapsed         | 1669         |
|    total_timesteps      | 526336       |
| train/                  |              |
|    approx_kl            | 0.0013815917 |
|    clip_fraction        | 0.000342     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.262        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.93e+06     |
|    n_updates            | 2560         |
|    policy_gradient_loss | -0.00949     |
|    std                  | 0.987        |
|    value_loss           | 3.84e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 258          |
|    time_elapsed         | 1676         |
|    total_timesteps      | 528384       |
| train/                  |              |
|    approx_kl            | 0.0011938065 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.319        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.05e+06     |
|    n_updates            | 2570         |
|    policy_gradient_loss | -0.00918     |
|    std                  | 0.987        |
|    value_loss           | 2.35e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 259          |
|    time_elapsed         | 1681         |
|    total_timesteps      | 530432       |
| train/                  |              |
|    approx_kl            | 0.0022401547 |
|    clip_fraction        | 0.002        |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.317        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.28e+06     |
|    n_updates            | 2580         |
|    policy_gradient_loss | -0.0136      |
|    std                  | 0.987        |
|    value_loss           | 2.62e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 315           |
|    iterations           | 260           |
|    time_elapsed         | 1687          |
|    total_timesteps      | 532480        |
| train/                  |               |
|    approx_kl            | 0.00069207966 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.312         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.59e+06      |
|    n_updates            | 2590          |
|    policy_gradient_loss | -0.0071       |
|    std                  | 0.987         |
|    value_loss           | 3.22e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 261          |
|    time_elapsed         | 1693         |
|    total_timesteps      | 534528       |
| train/                  |              |
|    approx_kl            | 0.0013249184 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.326        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.07e+06     |
|    n_updates            | 2600         |
|    policy_gradient_loss | -0.00894     |
|    std                  | 0.986        |
|    value_loss           | 2.42e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 262          |
|    time_elapsed         | 1700         |
|    total_timesteps      | 536576       |
| train/                  |              |
|    approx_kl            | 0.0010868388 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.3          |
|    learning_rate        | 0.0001       |
|    loss                 | 1.48e+06     |
|    n_updates            | 2610         |
|    policy_gradient_loss | -0.00819     |
|    std                  | 0.986        |
|    value_loss           | 2.72e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 263          |
|    time_elapsed         | 1705         |
|    total_timesteps      | 538624       |
| train/                  |              |
|    approx_kl            | 0.0011656006 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.307        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.48e+06     |
|    n_updates            | 2620         |
|    policy_gradient_loss | -0.00939     |
|    std                  | 0.987        |
|    value_loss           | 3.28e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 315         |
|    iterations           | 264         |
|    time_elapsed         | 1712        |
|    total_timesteps      | 540672      |
| train/                  |             |
|    approx_kl            | 0.000890602 |
|    clip_fraction        | 0.000146    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.288       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.73e+06    |
|    n_updates            | 2630        |
|    policy_gradient_loss | -0.00825    |
|    std                  | 0.987       |
|    value_loss           | 3.62e+06    |
-----------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 315           |
|    iterations           | 265           |
|    time_elapsed         | 1718          |
|    total_timesteps      | 542720        |
| train/                  |               |
|    approx_kl            | 0.00062598346 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.307         |
|    learning_rate        | 0.0001        |
|    loss                 | 7.86e+05      |
|    n_updates            | 2640          |
|    policy_gradient_loss | -0.00627      |
|    std                  | 0.987         |
|    value_loss           | 2.06e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 266          |
|    time_elapsed         | 1724         |
|    total_timesteps      | 544768       |
| train/                  |              |
|    approx_kl            | 0.0012437508 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.304        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.97e+06     |
|    n_updates            | 2650         |
|    policy_gradient_loss | -0.0095      |
|    std                  | 0.987        |
|    value_loss           | 3.67e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 267          |
|    time_elapsed         | 1730         |
|    total_timesteps      | 546816       |
| train/                  |              |
|    approx_kl            | 0.0014586101 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.32         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.44e+06     |
|    n_updates            | 2660         |
|    policy_gradient_loss | -0.0104      |
|    std                  | 0.988        |
|    value_loss           | 2.8e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 315          |
|    iterations           | 268          |
|    time_elapsed         | 1737         |
|    total_timesteps      | 548864       |
| train/                  |              |
|    approx_kl            | 0.0007116449 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.296        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.4e+06      |
|    n_updates            | 2670         |
|    policy_gradient_loss | -0.00698     |
|    std                  | 0.987        |
|    value_loss           | 3.14e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 269          |
|    time_elapsed         | 1743         |
|    total_timesteps      | 550912       |
| train/                  |              |
|    approx_kl            | 0.0005698174 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.298        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.24e+06     |
|    n_updates            | 2680         |
|    policy_gradient_loss | -0.00613     |
|    std                  | 0.987        |
|    value_loss           | 2.45e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 316         |
|    iterations           | 270         |
|    time_elapsed         | 1749        |
|    total_timesteps      | 552960      |
| train/                  |             |
|    approx_kl            | 0.001056744 |
|    clip_fraction        | 4.88e-05    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.279       |
|    learning_rate        | 0.0001      |
|    loss                 | 2.43e+06    |
|    n_updates            | 2690        |
|    policy_gradient_loss | -0.00796    |
|    std                  | 0.987       |
|    value_loss           | 3.56e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 271          |
|    time_elapsed         | 1755         |
|    total_timesteps      | 555008       |
| train/                  |              |
|    approx_kl            | 0.0012315516 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.298        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.46e+06     |
|    n_updates            | 2700         |
|    policy_gradient_loss | -0.00968     |
|    std                  | 0.988        |
|    value_loss           | 2.86e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 272          |
|    time_elapsed         | 1762         |
|    total_timesteps      | 557056       |
| train/                  |              |
|    approx_kl            | 0.0005342782 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.297        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.65e+06     |
|    n_updates            | 2710         |
|    policy_gradient_loss | -0.00627     |
|    std                  | 0.987        |
|    value_loss           | 3.53e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 273          |
|    time_elapsed         | 1768         |
|    total_timesteps      | 559104       |
| train/                  |              |
|    approx_kl            | 0.0009192454 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.287        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.29e+06     |
|    n_updates            | 2720         |
|    policy_gradient_loss | -0.00815     |
|    std                  | 0.987        |
|    value_loss           | 3.37e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 274           |
|    time_elapsed         | 1775          |
|    total_timesteps      | 561152        |
| train/                  |               |
|    approx_kl            | 0.00092800753 |
|    clip_fraction        | 0.000195      |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.328         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.91e+06      |
|    n_updates            | 2730          |
|    policy_gradient_loss | -0.0083       |
|    std                  | 0.987         |
|    value_loss           | 3.14e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 275           |
|    time_elapsed         | 1781          |
|    total_timesteps      | 563200        |
| train/                  |               |
|    approx_kl            | 0.00040431943 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.224         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.54e+06      |
|    n_updates            | 2740          |
|    policy_gradient_loss | -0.00482      |
|    std                  | 0.988         |
|    value_loss           | 3.17e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 276          |
|    time_elapsed         | 1787         |
|    total_timesteps      | 565248       |
| train/                  |              |
|    approx_kl            | 0.0009149031 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.336        |
|    learning_rate        | 0.0001       |
|    loss                 | 9.26e+05     |
|    n_updates            | 2750         |
|    policy_gradient_loss | -0.00851     |
|    std                  | 0.987        |
|    value_loss           | 2.46e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 277          |
|    time_elapsed         | 1793         |
|    total_timesteps      | 567296       |
| train/                  |              |
|    approx_kl            | 0.0007742493 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.329        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.64e+06     |
|    n_updates            | 2760         |
|    policy_gradient_loss | -0.00674     |
|    std                  | 0.988        |
|    value_loss           | 3.48e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 278          |
|    time_elapsed         | 1800         |
|    total_timesteps      | 569344       |
| train/                  |              |
|    approx_kl            | 0.0009314263 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.276        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.45e+06     |
|    n_updates            | 2770         |
|    policy_gradient_loss | -0.00836     |
|    std                  | 0.988        |
|    value_loss           | 3.41e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 279          |
|    time_elapsed         | 1806         |
|    total_timesteps      | 571392       |
| train/                  |              |
|    approx_kl            | 0.0012820603 |
|    clip_fraction        | 0.000342     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.309        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.01e+06     |
|    n_updates            | 2780         |
|    policy_gradient_loss | -0.00984     |
|    std                  | 0.988        |
|    value_loss           | 3.74e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 280          |
|    time_elapsed         | 1813         |
|    total_timesteps      | 573440       |
| train/                  |              |
|    approx_kl            | 0.0016789977 |
|    clip_fraction        | 0.00103      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.324        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.59e+06     |
|    n_updates            | 2790         |
|    policy_gradient_loss | -0.011       |
|    std                  | 0.988        |
|    value_loss           | 3.34e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 281          |
|    time_elapsed         | 1819         |
|    total_timesteps      | 575488       |
| train/                  |              |
|    approx_kl            | 0.0006223083 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.337        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.27e+06     |
|    n_updates            | 2800         |
|    policy_gradient_loss | -0.00669     |
|    std                  | 0.988        |
|    value_loss           | 2.46e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 282          |
|    time_elapsed         | 1826         |
|    total_timesteps      | 577536       |
| train/                  |              |
|    approx_kl            | 0.0010652853 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.295        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.94e+06     |
|    n_updates            | 2810         |
|    policy_gradient_loss | -0.00928     |
|    std                  | 0.988        |
|    value_loss           | 3.61e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 283          |
|    time_elapsed         | 1832         |
|    total_timesteps      | 579584       |
| train/                  |              |
|    approx_kl            | 0.0013075988 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.34         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.57e+06     |
|    n_updates            | 2820         |
|    policy_gradient_loss | -0.01        |
|    std                  | 0.988        |
|    value_loss           | 3.13e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 284          |
|    time_elapsed         | 1838         |
|    total_timesteps      | 581632       |
| train/                  |              |
|    approx_kl            | 0.0007769128 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.372        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.11e+06     |
|    n_updates            | 2830         |
|    policy_gradient_loss | -0.00728     |
|    std                  | 0.988        |
|    value_loss           | 2.34e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 285          |
|    time_elapsed         | 1844         |
|    total_timesteps      | 583680       |
| train/                  |              |
|    approx_kl            | 0.0004786087 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.309        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.5e+06      |
|    n_updates            | 2840         |
|    policy_gradient_loss | -0.0059      |
|    std                  | 0.988        |
|    value_loss           | 3.43e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 286           |
|    time_elapsed         | 1851          |
|    total_timesteps      | 585728        |
| train/                  |               |
|    approx_kl            | 0.00059283944 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.297         |
|    learning_rate        | 0.0001        |
|    loss                 | 2.2e+06       |
|    n_updates            | 2850          |
|    policy_gradient_loss | -0.0064       |
|    std                  | 0.988         |
|    value_loss           | 3.92e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 287          |
|    time_elapsed         | 1857         |
|    total_timesteps      | 587776       |
| train/                  |              |
|    approx_kl            | 0.0007834972 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.319        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.61e+06     |
|    n_updates            | 2860         |
|    policy_gradient_loss | -0.00777     |
|    std                  | 0.988        |
|    value_loss           | 3.75e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 288           |
|    time_elapsed         | 1863          |
|    total_timesteps      | 589824        |
| train/                  |               |
|    approx_kl            | 0.00032540373 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.307         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.4e+06       |
|    n_updates            | 2870          |
|    policy_gradient_loss | -0.00468      |
|    std                  | 0.988         |
|    value_loss           | 3.11e+06      |
-------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 316         |
|    iterations           | 289         |
|    time_elapsed         | 1869        |
|    total_timesteps      | 591872      |
| train/                  |             |
|    approx_kl            | 0.001316654 |
|    clip_fraction        | 0.000342    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.396       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.43e+06    |
|    n_updates            | 2880        |
|    policy_gradient_loss | -0.00972    |
|    std                  | 0.988       |
|    value_loss           | 2.86e+06    |
-----------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 290           |
|    time_elapsed         | 1876          |
|    total_timesteps      | 593920        |
| train/                  |               |
|    approx_kl            | 0.00060339057 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.29          |
|    learning_rate        | 0.0001        |
|    loss                 | 2.25e+06      |
|    n_updates            | 2890          |
|    policy_gradient_loss | -0.0064       |
|    std                  | 0.988         |
|    value_loss           | 4.05e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 291          |
|    time_elapsed         | 1881         |
|    total_timesteps      | 595968       |
| train/                  |              |
|    approx_kl            | 0.0009739449 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.391        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.24e+06     |
|    n_updates            | 2900         |
|    policy_gradient_loss | -0.00842     |
|    std                  | 0.988        |
|    value_loss           | 2.53e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 292          |
|    time_elapsed         | 1888         |
|    total_timesteps      | 598016       |
| train/                  |              |
|    approx_kl            | 0.0012280776 |
|    clip_fraction        | 0.000244     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.378        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.96e+06     |
|    n_updates            | 2910         |
|    policy_gradient_loss | -0.00879     |
|    std                  | 0.987        |
|    value_loss           | 2.85e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 293          |
|    time_elapsed         | 1894         |
|    total_timesteps      | 600064       |
| train/                  |              |
|    approx_kl            | 0.0016513333 |
|    clip_fraction        | 0.000244     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.363        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.71e+06     |
|    n_updates            | 2920         |
|    policy_gradient_loss | -0.0117      |
|    std                  | 0.988        |
|    value_loss           | 3.67e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 294          |
|    time_elapsed         | 1901         |
|    total_timesteps      | 602112       |
| train/                  |              |
|    approx_kl            | 0.0017894858 |
|    clip_fraction        | 0.000537     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.288        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.82e+06     |
|    n_updates            | 2930         |
|    policy_gradient_loss | -0.0115      |
|    std                  | 0.988        |
|    value_loss           | 3.6e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 295          |
|    time_elapsed         | 1907         |
|    total_timesteps      | 604160       |
| train/                  |              |
|    approx_kl            | 0.0020267046 |
|    clip_fraction        | 0.000781     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.451        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.03e+06     |
|    n_updates            | 2940         |
|    policy_gradient_loss | -0.0124      |
|    std                  | 0.987        |
|    value_loss           | 2.05e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 296           |
|    time_elapsed         | 1913          |
|    total_timesteps      | 606208        |
| train/                  |               |
|    approx_kl            | 0.00064957095 |
|    clip_fraction        | 4.88e-05      |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.343         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.27e+06      |
|    n_updates            | 2950          |
|    policy_gradient_loss | -0.00702      |
|    std                  | 0.987         |
|    value_loss           | 3.29e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 297          |
|    time_elapsed         | 1920         |
|    total_timesteps      | 608256       |
| train/                  |              |
|    approx_kl            | 0.0009749853 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.287        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.63e+06     |
|    n_updates            | 2960         |
|    policy_gradient_loss | -0.00803     |
|    std                  | 0.987        |
|    value_loss           | 3.14e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 298          |
|    time_elapsed         | 1926         |
|    total_timesteps      | 610304       |
| train/                  |              |
|    approx_kl            | 0.0017098729 |
|    clip_fraction        | 0.00146      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.355        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.59e+06     |
|    n_updates            | 2970         |
|    policy_gradient_loss | -0.0121      |
|    std                  | 0.987        |
|    value_loss           | 2.93e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 299          |
|    time_elapsed         | 1932         |
|    total_timesteps      | 612352       |
| train/                  |              |
|    approx_kl            | 0.0016595253 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.411        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.7e+06      |
|    n_updates            | 2980         |
|    policy_gradient_loss | -0.01        |
|    std                  | 0.987        |
|    value_loss           | 2.72e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 300          |
|    time_elapsed         | 1939         |
|    total_timesteps      | 614400       |
| train/                  |              |
|    approx_kl            | 0.0006143438 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.31         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.66e+06     |
|    n_updates            | 2990         |
|    policy_gradient_loss | -0.00641     |
|    std                  | 0.987        |
|    value_loss           | 3.74e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 301           |
|    time_elapsed         | 1945          |
|    total_timesteps      | 616448        |
| train/                  |               |
|    approx_kl            | 0.00042119273 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.249         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.83e+06      |
|    n_updates            | 3000          |
|    policy_gradient_loss | -0.00523      |
|    std                  | 0.987         |
|    value_loss           | 3.41e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 302          |
|    time_elapsed         | 1952         |
|    total_timesteps      | 618496       |
| train/                  |              |
|    approx_kl            | 0.0017735885 |
|    clip_fraction        | 0.000684     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.414        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.25e+06     |
|    n_updates            | 3010         |
|    policy_gradient_loss | -0.0114      |
|    std                  | 0.988        |
|    value_loss           | 2.37e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 303          |
|    time_elapsed         | 1958         |
|    total_timesteps      | 620544       |
| train/                  |              |
|    approx_kl            | 0.0013053371 |
|    clip_fraction        | 0.000439     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.313        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.81e+06     |
|    n_updates            | 3020         |
|    policy_gradient_loss | -0.0099      |
|    std                  | 0.988        |
|    value_loss           | 3.55e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 304          |
|    time_elapsed         | 1965         |
|    total_timesteps      | 622592       |
| train/                  |              |
|    approx_kl            | 0.0017871963 |
|    clip_fraction        | 0.000879     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.283        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.48e+06     |
|    n_updates            | 3030         |
|    policy_gradient_loss | -0.0109      |
|    std                  | 0.988        |
|    value_loss           | 3.68e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 305          |
|    time_elapsed         | 1971         |
|    total_timesteps      | 624640       |
| train/                  |              |
|    approx_kl            | 0.0020703021 |
|    clip_fraction        | 0.000781     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.288        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.42e+06     |
|    n_updates            | 3040         |
|    policy_gradient_loss | -0.0128      |
|    std                  | 0.988        |
|    value_loss           | 3.27e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 306          |
|    time_elapsed         | 1977         |
|    total_timesteps      | 626688       |
| train/                  |              |
|    approx_kl            | 0.0006104145 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.296        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.1e+06      |
|    n_updates            | 3050         |
|    policy_gradient_loss | -0.00652     |
|    std                  | 0.988        |
|    value_loss           | 2.92e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 307          |
|    time_elapsed         | 1983         |
|    total_timesteps      | 628736       |
| train/                  |              |
|    approx_kl            | 0.0009771439 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.336        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.89e+06     |
|    n_updates            | 3060         |
|    policy_gradient_loss | -0.0074      |
|    std                  | 0.988        |
|    value_loss           | 3.77e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 308           |
|    time_elapsed         | 1990          |
|    total_timesteps      | 630784        |
| train/                  |               |
|    approx_kl            | 0.00026278102 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.258         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.68e+06      |
|    n_updates            | 3070          |
|    policy_gradient_loss | -0.00384      |
|    std                  | 0.987         |
|    value_loss           | 3.15e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 317           |
|    iterations           | 309           |
|    time_elapsed         | 1995          |
|    total_timesteps      | 632832        |
| train/                  |               |
|    approx_kl            | 0.00043816038 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.267         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.69e+06      |
|    n_updates            | 3080          |
|    policy_gradient_loss | -0.00582      |
|    std                  | 0.988         |
|    value_loss           | 3.78e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 317           |
|    iterations           | 310           |
|    time_elapsed         | 2002          |
|    total_timesteps      | 634880        |
| train/                  |               |
|    approx_kl            | 0.00054735487 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.335         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.24e+06      |
|    n_updates            | 3090          |
|    policy_gradient_loss | -0.0065       |
|    std                  | 0.987         |
|    value_loss           | 3.03e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 317          |
|    iterations           | 311          |
|    time_elapsed         | 2008         |
|    total_timesteps      | 636928       |
| train/                  |              |
|    approx_kl            | 0.0015838686 |
|    clip_fraction        | 0.000537     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.387        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.74e+06     |
|    n_updates            | 3100         |
|    policy_gradient_loss | -0.0101      |
|    std                  | 0.987        |
|    value_loss           | 3.32e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 312          |
|    time_elapsed         | 2015         |
|    total_timesteps      | 638976       |
| train/                  |              |
|    approx_kl            | 0.0018307477 |
|    clip_fraction        | 0.00117      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.403        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.65e+06     |
|    n_updates            | 3110         |
|    policy_gradient_loss | -0.0115      |
|    std                  | 0.988        |
|    value_loss           | 3.11e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 317          |
|    iterations           | 313          |
|    time_elapsed         | 2022         |
|    total_timesteps      | 641024       |
| train/                  |              |
|    approx_kl            | 0.0014515953 |
|    clip_fraction        | 0.000342     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.324        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.49e+06     |
|    n_updates            | 3120         |
|    policy_gradient_loss | -0.00918     |
|    std                  | 0.988        |
|    value_loss           | 2.92e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 314          |
|    time_elapsed         | 2028         |
|    total_timesteps      | 643072       |
| train/                  |              |
|    approx_kl            | 0.0004612102 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.345        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.88e+06     |
|    n_updates            | 3130         |
|    policy_gradient_loss | -0.00572     |
|    std                  | 0.988        |
|    value_loss           | 3.49e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 317           |
|    iterations           | 315           |
|    time_elapsed         | 2034          |
|    total_timesteps      | 645120        |
| train/                  |               |
|    approx_kl            | 0.00089470646 |
|    clip_fraction        | 4.88e-05      |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.302         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.75e+06      |
|    n_updates            | 3140          |
|    policy_gradient_loss | -0.00701      |
|    std                  | 0.988         |
|    value_loss           | 3.5e+06       |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 316          |
|    time_elapsed         | 2041         |
|    total_timesteps      | 647168       |
| train/                  |              |
|    approx_kl            | 0.0010726044 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.34         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.5e+06      |
|    n_updates            | 3150         |
|    policy_gradient_loss | -0.00871     |
|    std                  | 0.988        |
|    value_loss           | 3.33e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 317          |
|    iterations           | 317          |
|    time_elapsed         | 2047         |
|    total_timesteps      | 649216       |
| train/                  |              |
|    approx_kl            | 0.0005940993 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.305        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.11e+06     |
|    n_updates            | 3160         |
|    policy_gradient_loss | -0.00629     |
|    std                  | 0.988        |
|    value_loss           | 3.49e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 317          |
|    iterations           | 318          |
|    time_elapsed         | 2054         |
|    total_timesteps      | 651264       |
| train/                  |              |
|    approx_kl            | 0.0019992262 |
|    clip_fraction        | 0.000537     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.456        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.54e+06     |
|    n_updates            | 3170         |
|    policy_gradient_loss | -0.0105      |
|    std                  | 0.987        |
|    value_loss           | 2.66e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 317          |
|    iterations           | 319          |
|    time_elapsed         | 2060         |
|    total_timesteps      | 653312       |
| train/                  |              |
|    approx_kl            | 0.0012691047 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.279        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.51e+06     |
|    n_updates            | 3180         |
|    policy_gradient_loss | -0.0094      |
|    std                  | 0.987        |
|    value_loss           | 4.57e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 317         |
|    iterations           | 320         |
|    time_elapsed         | 2066        |
|    total_timesteps      | 655360      |
| train/                  |             |
|    approx_kl            | 0.001230661 |
|    clip_fraction        | 4.88e-05    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.402       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.19e+06    |
|    n_updates            | 3190        |
|    policy_gradient_loss | -0.00883    |
|    std                  | 0.986       |
|    value_loss           | 2.81e+06    |
-----------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 317         |
|    iterations           | 321         |
|    time_elapsed         | 2072        |
|    total_timesteps      | 657408      |
| train/                  |             |
|    approx_kl            | 0.001087084 |
|    clip_fraction        | 4.88e-05    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.313       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.84e+06    |
|    n_updates            | 3200        |
|    policy_gradient_loss | -0.00863    |
|    std                  | 0.986       |
|    value_loss           | 3.78e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 317          |
|    iterations           | 322          |
|    time_elapsed         | 2079         |
|    total_timesteps      | 659456       |
| train/                  |              |
|    approx_kl            | 0.0005746136 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.323        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.05e+06     |
|    n_updates            | 3210         |
|    policy_gradient_loss | -0.00614     |
|    std                  | 0.987        |
|    value_loss           | 3.73e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 317          |
|    iterations           | 323          |
|    time_elapsed         | 2086         |
|    total_timesteps      | 661504       |
| train/                  |              |
|    approx_kl            | 0.0002868933 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.343        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.3e+06      |
|    n_updates            | 3220         |
|    policy_gradient_loss | -0.00427     |
|    std                  | 0.986        |
|    value_loss           | 2.84e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 324          |
|    time_elapsed         | 2093         |
|    total_timesteps      | 663552       |
| train/                  |              |
|    approx_kl            | 0.0010441317 |
|    clip_fraction        | 0.000195     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.279        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.85e+06     |
|    n_updates            | 3230         |
|    policy_gradient_loss | -0.0087      |
|    std                  | 0.987        |
|    value_loss           | 3.77e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 325          |
|    time_elapsed         | 2100         |
|    total_timesteps      | 665600       |
| train/                  |              |
|    approx_kl            | 0.0015658258 |
|    clip_fraction        | 0.000391     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.353        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.21e+06     |
|    n_updates            | 3240         |
|    policy_gradient_loss | -0.0101      |
|    std                  | 0.986        |
|    value_loss           | 3.65e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 316         |
|    iterations           | 326         |
|    time_elapsed         | 2107        |
|    total_timesteps      | 667648      |
| train/                  |             |
|    approx_kl            | 0.001398232 |
|    clip_fraction        | 0.000635    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.311       |
|    learning_rate        | 0.0001      |
|    loss                 | 2.7e+06     |
|    n_updates            | 3250        |
|    policy_gradient_loss | -0.0106     |
|    std                  | 0.986       |
|    value_loss           | 4.52e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 327          |
|    time_elapsed         | 2114         |
|    total_timesteps      | 669696       |
| train/                  |              |
|    approx_kl            | 0.0013775656 |
|    clip_fraction        | 0.000146     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.423        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.34e+06     |
|    n_updates            | 3260         |
|    policy_gradient_loss | -0.00908     |
|    std                  | 0.986        |
|    value_loss           | 2.89e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 328           |
|    time_elapsed         | 2120          |
|    total_timesteps      | 671744        |
| train/                  |               |
|    approx_kl            | 0.00051336514 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.292         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.81e+06      |
|    n_updates            | 3270          |
|    policy_gradient_loss | -0.00625      |
|    std                  | 0.986         |
|    value_loss           | 4.65e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 329           |
|    time_elapsed         | 2126          |
|    total_timesteps      | 673792        |
| train/                  |               |
|    approx_kl            | 0.00042477844 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.374         |
|    learning_rate        | 0.0001        |
|    loss                 | 9.12e+05      |
|    n_updates            | 3280          |
|    policy_gradient_loss | -0.00605      |
|    std                  | 0.985         |
|    value_loss           | 2.48e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 330          |
|    time_elapsed         | 2132         |
|    total_timesteps      | 675840       |
| train/                  |              |
|    approx_kl            | 0.0017275237 |
|    clip_fraction        | 0.00107      |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.261        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.45e+06     |
|    n_updates            | 3290         |
|    policy_gradient_loss | -0.0113      |
|    std                  | 0.985        |
|    value_loss           | 3.57e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 331          |
|    time_elapsed         | 2139         |
|    total_timesteps      | 677888       |
| train/                  |              |
|    approx_kl            | 0.0004079079 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.316        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.57e+06     |
|    n_updates            | 3300         |
|    policy_gradient_loss | -0.00489     |
|    std                  | 0.985        |
|    value_loss           | 3.44e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 332          |
|    time_elapsed         | 2145         |
|    total_timesteps      | 679936       |
| train/                  |              |
|    approx_kl            | 0.0018553283 |
|    clip_fraction        | 0.000879     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.286        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.78e+06     |
|    n_updates            | 3310         |
|    policy_gradient_loss | -0.012       |
|    std                  | 0.985        |
|    value_loss           | 3.55e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 333          |
|    time_elapsed         | 2152         |
|    total_timesteps      | 681984       |
| train/                  |              |
|    approx_kl            | 0.0015874263 |
|    clip_fraction        | 0.000537     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.314        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.27e+06     |
|    n_updates            | 3320         |
|    policy_gradient_loss | -0.00917     |
|    std                  | 0.985        |
|    value_loss           | 2.69e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 334          |
|    time_elapsed         | 2158         |
|    total_timesteps      | 684032       |
| train/                  |              |
|    approx_kl            | 0.0005025821 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.335        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.14e+06     |
|    n_updates            | 3330         |
|    policy_gradient_loss | -0.00564     |
|    std                  | 0.985        |
|    value_loss           | 4.43e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 335          |
|    time_elapsed         | 2165         |
|    total_timesteps      | 686080       |
| train/                  |              |
|    approx_kl            | 0.0010517571 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.347        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.83e+06     |
|    n_updates            | 3340         |
|    policy_gradient_loss | -0.00844     |
|    std                  | 0.985        |
|    value_loss           | 3.73e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 336          |
|    time_elapsed         | 2170         |
|    total_timesteps      | 688128       |
| train/                  |              |
|    approx_kl            | 0.0010783082 |
|    clip_fraction        | 0.000342     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.249        |
|    learning_rate        | 0.0001       |
|    loss                 | 2.25e+06     |
|    n_updates            | 3350         |
|    policy_gradient_loss | -0.00867     |
|    std                  | 0.985        |
|    value_loss           | 4.42e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 316         |
|    iterations           | 337         |
|    time_elapsed         | 2177        |
|    total_timesteps      | 690176      |
| train/                  |             |
|    approx_kl            | 0.002069775 |
|    clip_fraction        | 0.00166     |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.38        |
|    learning_rate        | 0.0001      |
|    loss                 | 1.35e+06    |
|    n_updates            | 3360        |
|    policy_gradient_loss | -0.0128     |
|    std                  | 0.986       |
|    value_loss           | 2.81e+06    |
-----------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 317           |
|    iterations           | 338           |
|    time_elapsed         | 2183          |
|    total_timesteps      | 692224        |
| train/                  |               |
|    approx_kl            | 0.00076556136 |
|    clip_fraction        | 9.77e-05      |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.346         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.92e+06      |
|    n_updates            | 3370          |
|    policy_gradient_loss | -0.00751      |
|    std                  | 0.987         |
|    value_loss           | 3.42e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 339          |
|    time_elapsed         | 2190         |
|    total_timesteps      | 694272       |
| train/                  |              |
|    approx_kl            | 0.0019286235 |
|    clip_fraction        | 0.000684     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.47         |
|    learning_rate        | 0.0001       |
|    loss                 | 1.22e+06     |
|    n_updates            | 3380         |
|    policy_gradient_loss | -0.0115      |
|    std                  | 0.987        |
|    value_loss           | 2.73e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 317         |
|    iterations           | 340         |
|    time_elapsed         | 2196        |
|    total_timesteps      | 696320      |
| train/                  |             |
|    approx_kl            | 0.002699078 |
|    clip_fraction        | 0.00308     |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.326       |
|    learning_rate        | 0.0001      |
|    loss                 | 2.35e+06    |
|    n_updates            | 3390        |
|    policy_gradient_loss | -0.0143     |
|    std                  | 0.987       |
|    value_loss           | 4.02e+06    |
-----------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 341          |
|    time_elapsed         | 2203         |
|    total_timesteps      | 698368       |
| train/                  |              |
|    approx_kl            | 0.0006414441 |
|    clip_fraction        | 4.88e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.384        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.41e+06     |
|    n_updates            | 3400         |
|    policy_gradient_loss | -0.0071      |
|    std                  | 0.988        |
|    value_loss           | 2.92e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 317          |
|    iterations           | 342          |
|    time_elapsed         | 2209         |
|    total_timesteps      | 700416       |
| train/                  |              |
|    approx_kl            | 0.0004848666 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.329        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.79e+06     |
|    n_updates            | 3410         |
|    policy_gradient_loss | -0.00625     |
|    std                  | 0.988        |
|    value_loss           | 3.68e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 343           |
|    time_elapsed         | 2216          |
|    total_timesteps      | 702464        |
| train/                  |               |
|    approx_kl            | 0.00048114103 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.25          |
|    learning_rate        | 0.0001        |
|    loss                 | 2.49e+06      |
|    n_updates            | 3420          |
|    policy_gradient_loss | -0.00569      |
|    std                  | 0.988         |
|    value_loss           | 4.27e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 344          |
|    time_elapsed         | 2222         |
|    total_timesteps      | 704512       |
| train/                  |              |
|    approx_kl            | 0.0007173938 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.304        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.77e+06     |
|    n_updates            | 3430         |
|    policy_gradient_loss | -0.00697     |
|    std                  | 0.987        |
|    value_loss           | 3.2e+06      |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 345          |
|    time_elapsed         | 2229         |
|    total_timesteps      | 706560       |
| train/                  |              |
|    approx_kl            | 0.0014084994 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.403        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.23e+06     |
|    n_updates            | 3440         |
|    policy_gradient_loss | -0.0105      |
|    std                  | 0.988        |
|    value_loss           | 2.55e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 317           |
|    iterations           | 346           |
|    time_elapsed         | 2235          |
|    total_timesteps      | 708608        |
| train/                  |               |
|    approx_kl            | 0.00058556965 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.434         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.7e+06       |
|    n_updates            | 3450          |
|    policy_gradient_loss | -0.00664      |
|    std                  | 0.988         |
|    value_loss           | 3.45e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 347          |
|    time_elapsed         | 2241         |
|    total_timesteps      | 710656       |
| train/                  |              |
|    approx_kl            | 0.0007252994 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.222        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.46e+06     |
|    n_updates            | 3460         |
|    policy_gradient_loss | -0.00597     |
|    std                  | 0.988        |
|    value_loss           | 3.43e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 317          |
|    iterations           | 348          |
|    time_elapsed         | 2247         |
|    total_timesteps      | 712704       |
| train/                  |              |
|    approx_kl            | 0.0012470764 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.334        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.03e+06     |
|    n_updates            | 3470         |
|    policy_gradient_loss | -0.00965     |
|    std                  | 0.987        |
|    value_loss           | 3.22e+06     |
------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 349          |
|    time_elapsed         | 2255         |
|    total_timesteps      | 714752       |
| train/                  |              |
|    approx_kl            | 0.0008691228 |
|    clip_fraction        | 9.77e-05     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.375        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.96e+06     |
|    n_updates            | 3480         |
|    policy_gradient_loss | -0.00787     |
|    std                  | 0.987        |
|    value_loss           | 3.75e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 350           |
|    time_elapsed         | 2261          |
|    total_timesteps      | 716800        |
| train/                  |               |
|    approx_kl            | 0.00020486274 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.401         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.33e+06      |
|    n_updates            | 3490          |
|    policy_gradient_loss | -0.00401      |
|    std                  | 0.987         |
|    value_loss           | 2.97e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 351           |
|    time_elapsed         | 2268          |
|    total_timesteps      | 718848        |
| train/                  |               |
|    approx_kl            | 0.00031203355 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.147         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.97e+06      |
|    n_updates            | 3500          |
|    policy_gradient_loss | -0.00462      |
|    std                  | 0.987         |
|    value_loss           | 4.11e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 352           |
|    time_elapsed         | 2274          |
|    total_timesteps      | 720896        |
| train/                  |               |
|    approx_kl            | 0.00038761128 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.245         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.68e+06      |
|    n_updates            | 3510          |
|    policy_gradient_loss | -0.00541      |
|    std                  | 0.986         |
|    value_loss           | 2.83e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 353           |
|    time_elapsed         | 2281          |
|    total_timesteps      | 722944        |
| train/                  |               |
|    approx_kl            | 0.00026083624 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.268         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.71e+06      |
|    n_updates            | 3520          |
|    policy_gradient_loss | -0.00426      |
|    std                  | 0.986         |
|    value_loss           | 3.27e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 354           |
|    time_elapsed         | 2287          |
|    total_timesteps      | 724992        |
| train/                  |               |
|    approx_kl            | 0.00028368874 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.326         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.5e+06       |
|    n_updates            | 3530          |
|    policy_gradient_loss | -0.00472      |
|    std                  | 0.986         |
|    value_loss           | 3.69e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 316          |
|    iterations           | 355          |
|    time_elapsed         | 2293         |
|    total_timesteps      | 727040       |
| train/                  |              |
|    approx_kl            | 0.0012811208 |
|    clip_fraction        | 0.000342     |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.287        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.88e+06     |
|    n_updates            | 3540         |
|    policy_gradient_loss | -0.00968     |
|    std                  | 0.985        |
|    value_loss           | 3.07e+06     |
------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 317           |
|    iterations           | 356           |
|    time_elapsed         | 2299          |
|    total_timesteps      | 729088        |
| train/                  |               |
|    approx_kl            | 0.00016928921 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.255         |
|    learning_rate        | 0.0001        |
|    loss                 | 2.06e+06      |
|    n_updates            | 3550          |
|    policy_gradient_loss | -0.00341      |
|    std                  | 0.985         |
|    value_loss           | 3.25e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 317           |
|    iterations           | 357           |
|    time_elapsed         | 2306          |
|    total_timesteps      | 731136        |
| train/                  |               |
|    approx_kl            | 6.3057145e-05 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.231         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.59e+06      |
|    n_updates            | 3560          |
|    policy_gradient_loss | -0.00199      |
|    std                  | 0.986         |
|    value_loss           | 3.17e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 317           |
|    iterations           | 358           |
|    time_elapsed         | 2312          |
|    total_timesteps      | 733184        |
| train/                  |               |
|    approx_kl            | 0.00011284757 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.179         |
|    learning_rate        | 0.0001        |
|    loss                 | 2.08e+06      |
|    n_updates            | 3570          |
|    policy_gradient_loss | -0.00296      |
|    std                  | 0.986         |
|    value_loss           | 4.11e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 316           |
|    iterations           | 359           |
|    time_elapsed         | 2319          |
|    total_timesteps      | 735232        |
| train/                  |               |
|    approx_kl            | 0.00021363358 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.177         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.86e+06      |
|    n_updates            | 3580          |
|    policy_gradient_loss | -0.00303      |
|    std                  | 0.986         |
|    value_loss           | 3.59e+06      |
-------------------------------------------


------------------------------------------
| time/                   |              |
|    fps                  | 317          |
|    iterations           | 360          |
|    time_elapsed         | 2325         |
|    total_timesteps      | 737280       |
| train/                  |              |
|    approx_kl            | 0.0009582648 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -29.5        |
|    explained_variance   | 0.438        |
|    learning_rate        | 0.0001       |
|    loss                 | 1.31e+06     |
|    n_updates            | 3590         |
|    policy_gradient_loss | -0.00809     |
|    std                  | 0.986        |
|    value_loss           | 2.65e+06     |
------------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 317         |
|    iterations           | 361         |
|    time_elapsed         | 2332        |
|    total_timesteps      | 739328      |
| train/                  |             |
|    approx_kl            | 0.000967026 |
|    clip_fraction        | 0.000293    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.343       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.47e+06    |
|    n_updates            | 3600        |
|    policy_gradient_loss | -0.00793    |
|    std                  | 0.986       |
|    value_loss           | 3.55e+06    |
-----------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 317         |
|    iterations           | 362         |
|    time_elapsed         | 2338        |
|    total_timesteps      | 741376      |
| train/                  |             |
|    approx_kl            | 0.001598729 |
|    clip_fraction        | 0.000586    |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.36        |
|    learning_rate        | 0.0001      |
|    loss                 | 2e+06       |
|    n_updates            | 3610        |
|    policy_gradient_loss | -0.0108     |
|    std                  | 0.986       |
|    value_loss           | 3.77e+06    |
-----------------------------------------


-----------------------------------------
| time/                   |             |
|    fps                  | 317         |
|    iterations           | 363         |
|    time_elapsed         | 2345        |
|    total_timesteps      | 743424      |
| train/                  |             |
|    approx_kl            | 0.001276155 |
|    clip_fraction        | 0           |
|    clip_range           | 0.2         |
|    entropy_loss         | -29.5       |
|    explained_variance   | 0.325       |
|    learning_rate        | 0.0001      |
|    loss                 | 1.42e+06    |
|    n_updates            | 3620        |
|    policy_gradient_loss | -0.00932    |
|    std                  | 0.985       |
|    value_loss           | 3.7e+06     |
-----------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 317           |
|    iterations           | 364           |
|    time_elapsed         | 2351          |
|    total_timesteps      | 745472        |
| train/                  |               |
|    approx_kl            | 0.00042246206 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.458         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.18e+06      |
|    n_updates            | 3630          |
|    policy_gradient_loss | -0.00557      |
|    std                  | 0.985         |
|    value_loss           | 2.4e+06       |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 317           |
|    iterations           | 365           |
|    time_elapsed         | 2357          |
|    total_timesteps      | 747520        |
| train/                  |               |
|    approx_kl            | 0.00046963056 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.334         |
|    learning_rate        | 0.0001        |
|    loss                 | 2.16e+06      |
|    n_updates            | 3640          |
|    policy_gradient_loss | -0.0053       |
|    std                  | 0.985         |
|    value_loss           | 4.52e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 317           |
|    iterations           | 366           |
|    time_elapsed         | 2363          |
|    total_timesteps      | 749568        |
| train/                  |               |
|    approx_kl            | 0.00030369192 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.241         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.73e+06      |
|    n_updates            | 3650          |
|    policy_gradient_loss | -0.00462      |
|    std                  | 0.986         |
|    value_loss           | 3.53e+06      |
-------------------------------------------


-------------------------------------------
| time/                   |               |
|    fps                  | 317           |
|    iterations           | 367           |
|    time_elapsed         | 2370          |
|    total_timesteps      | 751616        |
| train/                  |               |
|    approx_kl            | 0.00014743354 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -29.5         |
|    explained_variance   | 0.166         |
|    learning_rate        | 0.0001        |
|    loss                 | 1.39e+06      |
|    n_updates            | 3660          |
|    policy_gradient_loss | -0.00307      |
|    std                  | 0.986         |
|    value_loss           | 2.62e+06      |
-------------------------------------------


DRL Training Complete.

--- 5. Evaluation (Policy Test on Validation Data) ---


 Validation Complete. Total Days: 321

--- DRL Strategy Pipeline Result (PPO, 750000 steps) ✨ ---
Number of Assets: 20 (Stocks + Cash)
Market Benchmark: ^STOXX
Total Features in State Space: 120
Validation Period Adjusted Sharpe: **1.3645**


**Results**

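The DRL agent was trained for 750,000 steps (the final log block reports 751,616, because PPO stops only after completing a full 2,048-step rollout) and validated on 321 days of unseen (out-of-sample) data.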
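
The step count and a few diagnostics can be cross-checked against the last training log block. The sketch below is only a sanity check, not part of the original pipeline. It assumes 2,048 steps per rollout and 10 epochs per update (values inferred from the log increments, not stated in the notebook) and a 21-dimensional action (20 assets plus cash, inferred from the project description). It also treats the logged `std` as shared across all action dimensions.

```python
import math

# Values read from the final training log block (iteration 367).
iterations = 367
logged_timesteps = 751_616
logged_n_updates = 3_660
logged_std = 0.986
logged_entropy_loss = -29.5

# Assumptions inferred from the log, not stated in the notebook.
n_steps = 2_048    # timesteps collected per rollout
n_epochs = 10      # optimization epochs per rollout
action_dim = 21    # 20 assets + cash

# PPO finishes whole rollouts, so it overshoots the 750,000-step target.
target = 750_000
rollouts_needed = math.ceil(target / n_steps)   # 367
total_timesteps = rollouts_needed * n_steps     # 751616

# The train/ values printed at iteration k reflect the updates done
# after the previous k - 1 rollouts.
n_updates = (iterations - 1) * n_epochs         # 3660

# Differential entropy of a diagonal Gaussian with one shared std.
per_dim_entropy = 0.5 * math.log(2 * math.pi * math.e) + math.log(logged_std)
entropy = action_dim * per_dim_entropy          # about 29.5

print(total_timesteps, n_updates, round(entropy, 1))
```

Under these assumptions the recomputed entropy matches the logged `entropy_loss` of -29.5 in magnitude, which is what a policy whose `std` has stayed close to its initial value of 1.0 would produce.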

**Key Result**:

The strategy achieved a Validation Period **Adjusted Sharpe of 1.3645**.

An Adjusted Sharpe above 1 suggests that, over the 321-day validation period, the PPO agent learned a policy that delivers competitive returns with controlled risk and outperforms the ^STOXX benchmark on a risk-adjusted basis.
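
The exact Adjusted Sharpe formula is not shown in this section, so the following is a hypothetical stand-in rather than the notebook's implementation. It computes an annualized Sharpe ratio of daily returns in excess of a benchmark (essentially an information ratio), which captures the idea of penalizing both volatility and underperformance relative to the index. The function name `benchmark_relative_sharpe`, the 252-day annualization factor, and the synthetic return series are all assumptions made for illustration.

```python
import numpy as np


def benchmark_relative_sharpe(portfolio_returns, benchmark_returns,
                              periods_per_year=252):
    """Annualized mean/std ratio of returns in excess of a benchmark.

    Hypothetical helper: a stand-in for the notebook's Adjusted Sharpe,
    whose exact definition is not given here.
    """
    rp = np.asarray(portfolio_returns, dtype=float)
    rb = np.asarray(benchmark_returns, dtype=float)
    if rp.shape != rb.shape:
        raise ValueError("return series must have the same shape")

    active = rp - rb                 # daily return relative to the benchmark
    vol = active.std(ddof=1)         # sample standard deviation
    if vol == 0.0:
        return 0.0                   # no active risk: ratio is undefined
    return float(np.sqrt(periods_per_year) * active.mean() / vol)


# Synthetic data only, sized like the 321-day validation window.
rng = np.random.default_rng(0)
benchmark = rng.normal(0.0003, 0.01, size=321)
portfolio = benchmark + rng.normal(0.0005, 0.004, size=321)

score = benchmark_relative_sharpe(portfolio, benchmark)
print(f"Benchmark-relative Sharpe (synthetic data): {score:.4f}")
```

A policy that simply tracks the benchmark scores 0 under this definition, so a positive value requires returns above the index that are large relative to how much they fluctuate.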

**WARNING**:

This notebook does not constitute financial advice.