# Algorithm Library - Lab 03

In recent years, many RL libraries have been developed. These libraries are designed to provide all the tools needed to implement and test Reinforcement Learning agents.

Even so, they differ considerably. That is why it is important to choose a library that is fast, reliable, and relevant to your RL task. From a technical standpoint, there are a few things to keep in mind when evaluating an RL library.

- **Support for existing machine learning libraries:** Because RL typically uses gradient-based algorithms to learn and fit policy functions, you will want the library to support your preferred framework (TensorFlow, Keras, PyTorch, etc.).
- **Scalability:** RL is computationally intensive, and the option to run in a distributed fashion becomes important when tackling complex environments.
- **Composability:** RL algorithms typically involve simulations and many other components. You will want a library that lets you reuse components of RL algorithms and that is compatible with multiple deep learning frameworks.

[Here](https://docs.google.com/spreadsheets/d/1ZWhViAwCpRqupA5E_xFHSaBaaBZ1wAjO6PvmmEEpXGI/edit#gid=0) you can view a list of some existing libraries.

<img src="https://i1.wp.com/neptune.ai/wp-content/uploads/RL-tools.png?resize=1024%2C372&ssl=1" width=500>


## Ray RLlib

[Ray](https://docs.ray.io/en/latest/) is a distributed execution platform that provides simple-to-use primitives for parallelism and scalability, allowing Python programs to scale anywhere from a notebook to a large cluster. Built on top of Ray, [RLlib](https://docs.ray.io/en/latest/rllib.html) provides a unified API that can be used across a wide range of applications.

<br>

<img src="https://miro.medium.com/max/1838/1*_bomm09XtiZfQ52Kfz9Ciw.png" width=600>


RLlib is designed to support multiple deep learning frameworks (TensorFlow and PyTorch) and is accessed through a simple Python API. It currently ships with a [range of RL algorithms](https://docs.ray.io/en/latest/rllib-algorithms.html#available-algorithms-overview).

In particular, RLlib enables fast development because it makes it easier to build scalable RL algorithms by reusing and assembling existing implementations. RLlib also lets developers use neural networks built with different deep learning frameworks and integrates easily with third-party simulators.


## (Colab start) Setup

You will need to make a copy of this notebook in your Google Drive before editing. You can do that with **File → Save a copy in Drive**.

In [None]:
import os
from google.colab import drive
drive.mount("/content/gdrive")
isColab = True

In [None]:
# Your work is stored in a folder called `minicurso_rl` by default,
# so that a Colab instance timeout does not delete your edits

DRIVE_PATH = "/content/gdrive/MyDrive/minicurso_rl/lab03"
DRIVE_PYTHON_PATH = DRIVE_PATH.replace("\\", "")
if not os.path.exists(DRIVE_PYTHON_PATH):
  %mkdir -p $DRIVE_PATH

In [None]:
! wget http://www.atarimania.com/roms/Roms.rar
! mkdir /content/ROM/
! unrar e /content/Roms.rar /content/ROM/ -y
! python -m atari_py.import_roms /content/ROM/ > /dev/null 2>&1

## (Local start only, outside Colab) Setup

In [3]:
import os
isColab = False

In [4]:
import copy

# When running locally, your work is stored in `./content`
# and checkpoints are stored in `./ckpt`
CONTENT_PATH = "./content"
if not os.path.exists(CONTENT_PATH):
    %mkdir $CONTENT_PATH

CKPT_PATH = "./ckpt"
if not os.path.exists(CKPT_PATH):
    %mkdir $CKPT_PATH

if not isColab:
    DRIVE_PATH = copy.deepcopy(CONTENT_PATH)
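
The directory setup above relies on the `%mkdir` IPython magic, which only works inside a notebook. As a minimal sketch of a portable alternative, the standard-library call `os.makedirs(..., exist_ok=True)` creates a directory and does nothing if it already exists. The helper name `ensure_dirs` is hypothetical, and the demo runs in a temporary directory so it does not touch your working folder.

```python
import os
import tempfile

def ensure_dirs(base_dir, names=("content", "ckpt")):
    """Create each sub-directory of base_dir if it is missing and return their paths."""
    paths = []
    for name in names:
        path = os.path.join(base_dir, name)
        # exist_ok=True makes the call a no-op when the directory already exists
        os.makedirs(path, exist_ok=True)
        paths.append(path)
    return paths

# Demonstrated in a throwaway temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    created = ensure_dirs(tmp)
    all_exist = all(os.path.isdir(p) for p in created)
    print(all_exist)
```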

In [5]:
# ! wget http://www.atarimania.com/roms/Roms.rar
# ! mkdir ./content/ROM/
# ! mv ./Roms.rar ./content/
# ! unrar e ./content/Roms.rar ./content/ROM/ -y
# ! python -m atari_py.import_roms ./content/ROM/ > /dev/null 2>&1

## (Always) Other settings

In [None]:
# Competition environment
!pip install --upgrade ceia-soccer-twos > /dev/null 2>&1
# the ray version compatible with the provided agent implementations is 1.4.0
!pip install 'aioredis==1.3.1' > /dev/null 2>&1 
!pip install 'aiohttp==3.7.4' > /dev/null 2>&1 
!pip install 'ray==1.4.0' > /dev/null 2>&1 
!pip install 'ray[rllib]==1.4.0' > /dev/null 2>&1 
!pip install 'ray[tune]==1.4.0' > /dev/null 2>&1 
!pip install torch > /dev/null 2>&1 
!pip install lz4 > /dev/null 2>&1
!pip install gym[atari] > /dev/null 2>&1
!pip install GPUtil > /dev/null 2>&1

# Dependencies needed to record the videos
!apt-get install -y xvfb x11-utils > /dev/null 2>&1 
!pip install pyvirtualdisplay==0.2.* > /dev/null 2>&1

In [7]:
# Start a virtual display instance
from pyvirtualdisplay import Display
display = Display(visible=False, size=(1400, 900))
_ = display.start()

INFO:pyvirtualdisplay.abstractdisplay:Successfully started X with display ":1009".


In [8]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

In [9]:
def get_checkpoint_path(mode='best', env='soccer', algorithm='ppo', version='1.0'):
    if isColab:
        path = '/content/minicurso_rl/lab02/ckpt/'
    else:
        path = './ckpt/'

    return path + env + '_' + mode + '_' + algorithm + '_' + version + '_ckpt.pth'
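
To make the naming scheme concrete, here is a minimal sketch that reproduces the same path-building logic with the Colab flag passed as an explicit argument instead of read from the global `isColab`. The function name `build_checkpoint_path` and the `is_colab` parameter are assumptions introduced only for this illustration.

```python
def build_checkpoint_path(is_colab, mode="best", env="soccer", algorithm="ppo", version="1.0"):
    # Same naming scheme as get_checkpoint_path, with the Colab flag passed explicitly
    base = "/content/minicurso_rl/lab02/ckpt/" if is_colab else "./ckpt/"
    return f"{base}{env}_{mode}_{algorithm}_{version}_ckpt.pth"

local_path = build_checkpoint_path(is_colab=False)
print(local_path)  # ./ckpt/soccer_best_ppo_1.0_ckpt.pth

custom = build_checkpoint_path(False, mode="last", env="cartpole", algorithm="pg")
print(custom)  # ./ckpt/cartpole_last_pg_1.0_ckpt.pth
```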

## (Always) Environment

OpenAI Gym includes a VideoRecorder utility that can record a video of the environment in MP4 format. Below we interact with the [CartPole](https://gym.openai.com/envs/CartPole-v0/) environment by taking random actions and record the result.
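
The cells in this section follow the classic Gym interaction pattern: reset the environment, then call `step` with an action until the episode reports `done`. As a rough sketch of that loop that runs without Gym installed, the example below uses a hypothetical toy class, `ToyBalanceEnv`, which only mimics the `reset()`/`step()` interface. Its dynamics are random and purely illustrative, not CartPole's physics.

```python
import random

class ToyBalanceEnv:
    """Hypothetical stand-in with a Gym-style reset()/step() interface (not part of Gym)."""

    def __init__(self, max_steps=20, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0.0  # initial observation

    def sample_action(self):
        return self.rng.choice([0, 1])  # two discrete actions, as in CartPole

    def step(self, action):
        self.steps += 1
        observation = self.rng.uniform(-1.0, 1.0)
        reward = 1.0  # one point per step survived
        # The episode ends on a random "fall" or when the step limit is reached
        done = self.rng.random() < 0.1 or self.steps >= self.max_steps
        return observation, reward, done, {}

env = ToyBalanceEnv()
observation = env.reset()
done = False
total_reward = 0.0
while not done:
    # Random policy: pick an action without looking at the observation
    observation, reward, done, info = env.step(env.sample_action())
    total_reward += reward

print(f"episode finished after {env.steps} steps, return {total_reward}")
```

Because the reward is one point per step, the return of an episode equals its length, which is also why `episode_len_mean` and `episode_reward_mean` coincide in CartPole training logs.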

In [10]:
import gym
from gym.wrappers.monitoring.video_recorder import VideoRecorder

environment_id = "CartPole-v0"

In [11]:
import gym
from gym.wrappers.monitoring.video_recorder import VideoRecorder

env = gym.make(environment_id)
before_training = os.path.join(
    DRIVE_PATH, "{}_before_training.mp4".format(environment_id)
)
print(before_training)

video = VideoRecorder(env, before_training)
env.reset()
done = False
while not done:
    env.render()
    video.capture_frame()
    observation, reward, done, info = env.step(env.action_space.sample())

video.close()
env.close()

./content/CartPole-v0_before_training.mp4


The code above saved the video file to `DRIVE_PATH` (your Drive on Colab, `./content` when running locally). To display it in the notebook, you need a helper function.

In [12]:
from base64 import b64encode
def render_mp4(videopath: str) -> str:
    # Read the video and embed it as a base64 data URI inside an HTML <video> tag
    with open(videopath, 'rb') as video_file:
        mp4 = video_file.read()
    base64_encoded_mp4 = b64encode(mp4).decode()
    return f'<video width=400 controls><source src="data:video/mp4;' \
         f'base64,{base64_encoded_mp4}" type="video/mp4"></video>'
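
The helper works by embedding the file contents directly in the HTML as a base64 data URI, so the browser needs no separate file path. Below is a small sketch of that mechanism with a placeholder payload. The function name `to_video_tag` is an assumption for this example, and the bytes used are not a playable video.

```python
from base64 import b64encode, b64decode

def to_video_tag(raw_bytes: bytes, width: int = 400) -> str:
    """Wrap raw bytes in an HTML <video> tag using a base64 data URI."""
    encoded = b64encode(raw_bytes).decode()
    return (
        f'<video width={width} controls>'
        f'<source src="data:video/mp4;base64,{encoded}" type="video/mp4"></video>'
    )

# Placeholder payload: these bytes are NOT a valid MP4, they only show the encoding
fake_payload = b"not a real video"
tag = to_video_tag(fake_payload)

# The payload can be recovered from the tag, so nothing is lost in the embedding
embedded = tag.split("base64,")[1].split('"')[0]
round_trip = b64decode(embedded)
print(round_trip == fake_payload)
```

Base64 inflates the payload by roughly one third, so this approach suits short clips like the ones recorded here rather than long videos.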

The code below renders the result. You should see a video similar to the one shown below.

In [None]:
from IPython.display import HTML
html = render_mp4(before_training)
HTML(html)

## Training a Reinforcement Learning agent

First, let's start Ray in the background. Calling `ray.shutdown()` followed by `ray.init()` stops any Ray session left over from a previous run and starts a fresh one.

In [11]:
import ray

ray.shutdown()
ray.init(ignore_reinit_error=True, include_dashboard=False)

{'node_ip_address': '192.168.0.102',
 'raylet_ip_address': '192.168.0.102',
 'redis_address': '192.168.0.102:6379',
 'object_store_address': '/tmp/ray/session_2021-11-16_17-41-22_051049_6159/sockets/plasma_store',
 'raylet_socket_name': '/tmp/ray/session_2021-11-16_17-41-22_051049_6159/sockets/raylet',
 'webui_url': None,
 'session_dir': '/tmp/ray/session_2021-11-16_17-41-22_051049_6159',
 'metrics_export_port': 60883,
 'node_id': 'de7325ee89abb038669f710e177bba49ad8ea92740b57643b532c0a9'}

### Basic Python API

At a high level, RLlib provides a Trainer class that holds a policy for interacting with the environment. Through the Trainer interface, the policy can be trained, evaluated, or used to compute an action.

For each algorithm, we want to configure the parameters (learning rate, network size, batch size, etc.) to suit our application. Ray provides two levels of parameters that we can change. First, there are the parameters common to all algorithms. You can check the list of available parameters at this [link](https://docs.ray.io/en/latest/rllib-training.html#common-parameters).

Each [algorithm available in Ray](https://docs.ray.io/en/latest/rllib-algorithms.html#available-algorithms-overview) also has its own specific parameters. The image below shows the specific parameters for the [Policy Gradient](https://docs.ray.io/en/latest/rllib-algorithms.html#policy-gradients) algorithm.


<img src='https://drive.google.com/uc?id=1yKJDJViHE_F9JH7NTQMYtQL3KLBJoJyk' width="500" >
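
The training cell in this section starts from a copy of the algorithm's default configuration and overrides a few keys. One detail worth keeping in mind, assuming the defaults object behaves like a plain Python dictionary, is that `.copy()` is a shallow copy: overriding top-level keys is safe, but editing a nested dictionary in place also changes the original. The sketch below shows this with a made-up dictionary. The keys and values in `HYPOTHETICAL_DEFAULTS` are illustrative and are not RLlib's actual defaults.

```python
import copy

# Hypothetical defaults: keys and values are illustrative, not RLlib's actual defaults
HYPOTHETICAL_DEFAULTS = {
    "num_workers": 2,
    "lr": 0.001,
    "model": {"fcnet_hiddens": [256, 256]},
}

# dict.copy() is shallow: top-level keys are independent, nested dicts are shared
shallow = HYPOTHETICAL_DEFAULTS.copy()
shallow["lr"] = 0.0004                      # safe, only the copy changes
shallow["model"]["fcnet_hiddens"] = [64]    # also changes the defaults

print(HYPOTHETICAL_DEFAULTS["lr"])                      # 0.001
print(HYPOTHETICAL_DEFAULTS["model"]["fcnet_hiddens"])  # [64]

# copy.deepcopy() duplicates nested structures, so the defaults stay untouched
deep = copy.deepcopy(HYPOTHETICAL_DEFAULTS)
deep["model"]["fcnet_hiddens"] = [32, 32]
print(HYPOTHETICAL_DEFAULTS["model"]["fcnet_hiddens"])  # [64]
```

When only top-level keys such as the learning rate or the number of workers are overridden, a shallow copy is enough. A deep copy is the safer choice once nested settings are modified.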


In [12]:
import ray
import ray.rllib.agents.pg as pg
from ray.tune.logger import pretty_print

config = pg.DEFAULT_CONFIG.copy()
config["num_gpus"] = 0
config["num_workers"] = 1
config["lr"] = 0.0004
config["framework"] = "torch"

trainer = pg.PGTrainer(config=config, env=environment_id)
episodes = 1000  # number of training iterations; each train() call may sample several episodes

for i in range(episodes):
    # Run one training iteration of the policy with Policy Gradient (PG)
    result = trainer.train()
    print(pretty_print(result))

    if i % 100 == 0:
        checkpoint = trainer.save()
        print("checkpoint saved at", checkpoint)

last_checkpoint = trainer.save()

2021-11-16 17:41:58,945	INFO trainer.py:696 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.


agent_timesteps_total: 200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-06
done: false
episode_len_mean: 19.7
episode_media: {}
episode_reward_max: 41.0
episode_reward_mean: 19.7
episode_reward_min: 11.0
episodes_this_iter: 10
episodes_total: 10
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 7.660465240478516
  num_agent_steps_sampled: 200
  num_steps_sampled: 200
  num_steps_trained: 200
iterations_since_restore: 1
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 19.4
  gpu_util_percent0: 0.08
  ram_util_percent: 51.4
  vram_util_percent0: 0.07584731819677526
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049669351150740434
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.07061341508704036
  mean_inference_ms: 1.2711434815060438
  mean_r

agent_timesteps_total: 1600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-08
done: false
episode_len_mean: 23.514705882352942
episode_media: {}
episode_reward_max: 68.0
episode_reward_mean: 23.514705882352942
episode_reward_min: 9.0
episodes_this_iter: 9
episodes_total: 68
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 7.896124839782715
  num_agent_steps_sampled: 1600
  num_steps_sampled: 1600
  num_steps_trained: 1600
iterations_since_restore: 8
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 36.7
  gpu_util_percent0: 0.02
  ram_util_percent: 51.4
  vram_util_percent0: 0.07370845672918723
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04499006021706943
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.06582676441750579
  mean_inference_ms

agent_timesteps_total: 3000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-09
done: false
episode_len_mean: 28.91
episode_media: {}
episode_reward_max: 81.0
episode_reward_mean: 28.91
episode_reward_min: 9.0
episodes_this_iter: 5
episodes_total: 105
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 11.492469787597656
  num_agent_steps_sampled: 3000
  num_steps_sampled: 3000
  num_steps_trained: 3000
iterations_since_restore: 15
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 31.8
  gpu_util_percent0: 0.06
  ram_util_percent: 45.9
  vram_util_percent0: 0.05478775913129319
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.044048139170800374
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.064993412914786
  mean_inference_ms: 0.906225779704747
  me

agent_timesteps_total: 4400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-11
done: false
episode_len_mean: 37.58
episode_media: {}
episode_reward_max: 130.0
episode_reward_mean: 37.58
episode_reward_min: 11.0
episodes_this_iter: 3
episodes_total: 131
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 14.83127212524414
  num_agent_steps_sampled: 4400
  num_steps_sampled: 4400
  num_steps_trained: 4400
iterations_since_restore: 22
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 31.1
  gpu_util_percent0: 0.07
  ram_util_percent: 44.6
  vram_util_percent0: 0.05577492596248766
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04310529312963243
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.06407645308126694
  mean_inference_ms: 0.8506834197011175
 

agent_timesteps_total: 5800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-12
done: false
episode_len_mean: 47.74
episode_media: {}
episode_reward_max: 180.0
episode_reward_mean: 47.74
episode_reward_min: 12.0
episodes_this_iter: 2
episodes_total: 147
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.29290771484375
  num_agent_steps_sampled: 5800
  num_steps_sampled: 5800
  num_steps_trained: 5800
iterations_since_restore: 29
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 29.9
  gpu_util_percent0: 0.04
  ram_util_percent: 42.2
  vram_util_percent0: 0.054129647910496875
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04258464356093038
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.06336785219000934
  mean_inference_ms: 0.8341917730222197


agent_timesteps_total: 7200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-14
done: false
episode_len_mean: 57.22
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 57.22
episode_reward_min: 12.0
episodes_this_iter: 1
episodes_total: 161
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 29.45177459716797
  num_agent_steps_sampled: 7200
  num_steps_sampled: 7200
  num_steps_trained: 7200
iterations_since_restore: 36
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 27.7
  gpu_util_percent0: 0.03
  ram_util_percent: 42.2
  vram_util_percent0: 0.052977953274103325
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0421767800321116
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.06278815250647127
  mean_inference_ms: 0.8230541249113881
 

agent_timesteps_total: 8600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-15
done: false
episode_len_mean: 68.95
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 68.95
episode_reward_min: 12.0
episodes_this_iter: 2
episodes_total: 171
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 21.72616958618164
  num_agent_steps_sampled: 8600
  num_steps_sampled: 8600
  num_steps_trained: 8600
iterations_since_restore: 43
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 24.8
  gpu_util_percent0: 0.04
  ram_util_percent: 42.2
  vram_util_percent0: 0.052977953274103325
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.041832308685208534
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.06223868417889193
  mean_inference_ms: 0.8141408117484323

agent_timesteps_total: 10000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-17
done: false
episode_len_mean: 77.59
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 77.59
episode_reward_min: 16.0
episodes_this_iter: 1
episodes_total: 183
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 24.896936416625977
  num_agent_steps_sampled: 10000
  num_steps_sampled: 10000
  num_steps_trained: 10000
iterations_since_restore: 50
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 28.4
  gpu_util_percent0: 0.01
  ram_util_percent: 42.2
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04141703975900037
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.06156220494497115
  mean_inference_ms: 0.8040872057728

agent_timesteps_total: 11400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-18
done: false
episode_len_mean: 86.87
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 86.87
episode_reward_min: 17.0
episodes_this_iter: 1
episodes_total: 195
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 24.236371994018555
  num_agent_steps_sampled: 11400
  num_steps_sampled: 11400
  num_steps_trained: 11400
iterations_since_restore: 57
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 26.7
  gpu_util_percent0: 0.02
  ram_util_percent: 42.2
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04098668639540733
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.06083651268931318
  mean_inference_ms: 0.7939798140522

agent_timesteps_total: 13000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-20
done: false
episode_len_mean: 99.63
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 99.63
episode_reward_min: 17.0
episodes_this_iter: 1
episodes_total: 205
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 20.762653350830078
  num_agent_steps_sampled: 13000
  num_steps_sampled: 13000
  num_steps_trained: 13000
iterations_since_restore: 65
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 25.1
  gpu_util_percent0: 0.01
  ram_util_percent: 42.2
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04067776753265559
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.06029745253145501
  mean_inference_ms: 0.7867894181828

agent_timesteps_total: 14600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-21
done: false
episode_len_mean: 112.26
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 112.26
episode_reward_min: 17.0
episodes_this_iter: 1
episodes_total: 213
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.991817474365234
  num_agent_steps_sampled: 14600
  num_steps_sampled: 14600
  num_steps_trained: 14600
iterations_since_restore: 73
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04044299283096061
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05987703230628363
  mean_inference_ms: 0.7813554173879325
  mean_raw_obs_processing_ms: 0.06016987959784993
time_since_restore: 14.62137508392334
time_this_iter_s: 0.1

agent_timesteps_total: 16200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-23
done: false
episode_len_mean: 124.02
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 124.02
episode_reward_min: 25.0
episodes_this_iter: 1
episodes_total: 222
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 19.452699661254883
  num_agent_steps_sampled: 16200
  num_steps_sampled: 16200
  num_steps_trained: 16200
iterations_since_restore: 81
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04018371786652398
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05941076988341456
  mean_inference_ms: 0.7754492665515221
  mean_raw_obs_processing_ms: 0.05924044157878046
time_since_restore: 16.09265971183777
time_this_iter_s: 0.1

agent_timesteps_total: 17600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-24
done: false
episode_len_mean: 132.06
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 132.06
episode_reward_min: 25.0
episodes_this_iter: 1
episodes_total: 229
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 19.47826385498047
  num_agent_steps_sampled: 17600
  num_steps_sampled: 17600
  num_steps_trained: 17600
iterations_since_restore: 88
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039997922313760516
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05907606978111754
  mean_inference_ms: 0.7712963497454252
  mean_raw_obs_processing_ms: 0.058570063223563375
time_since_restore: 17.39540410041809
time_this_iter_s: 0.

agent_timesteps_total: 19200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-26
done: false
episode_len_mean: 142.31
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 142.31
episode_reward_min: 25.0
episodes_this_iter: 1
episodes_total: 238
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.605485916137695
  num_agent_steps_sampled: 19200
  num_steps_sampled: 19200
  num_steps_trained: 19200
iterations_since_restore: 96
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03978354722066436
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05868859596702883
  mean_inference_ms: 0.7665546844678814
  mean_raw_obs_processing_ms: 0.0577834371259538
time_since_restore: 18.885604858398438
time_this_iter_s: 0.1

agent_timesteps_total: 20800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-27
done: false
episode_len_mean: 150.01
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 150.01
episode_reward_min: 25.0
episodes_this_iter: 1
episodes_total: 247
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 23.767915725708008
  num_agent_steps_sampled: 20800
  num_steps_sampled: 20800
  num_steps_trained: 20800
iterations_since_restore: 104
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 26.6
  gpu_util_percent0: 0.02
  ram_util_percent: 42.2
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03958851549518993
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05833255374756775
  mean_inference_ms: 0.7622775020

agent_timesteps_total: 22200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-29
done: false
episode_len_mean: 156.03
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 156.03
episode_reward_min: 25.0
episodes_this_iter: 1
episodes_total: 254
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 16.968435287475586
  num_agent_steps_sampled: 22200
  num_steps_sampled: 22200
  num_steps_trained: 22200
iterations_since_restore: 111
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039459751402039084
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.058092168670371584
  mean_inference_ms: 0.7594322984113482
  mean_raw_obs_processing_ms: 0.05660396724656118
time_since_restore: 21.681379318237305
time_this_iter_s:

agent_timesteps_total: 23800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-30
done: false
episode_len_mean: 162.68
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 162.68
episode_reward_min: 28.0
episodes_this_iter: 1
episodes_total: 263
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 22.91423797607422
  num_agent_steps_sampled: 23800
  num_steps_sampled: 23800
  num_steps_trained: 23800
iterations_since_restore: 119
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 27.9
  gpu_util_percent0: 0.02
  ram_util_percent: 42.2
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039315288482793594
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05781973766821067
  mean_inference_ms: 0.7562182525

agent_timesteps_total: 25200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-32
done: false
episode_len_mean: 165.61
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 165.61
episode_reward_min: 28.0
episodes_this_iter: 1
episodes_total: 270
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.271949768066406
  num_agent_steps_sampled: 25200
  num_steps_sampled: 25200
  num_steps_trained: 25200
iterations_since_restore: 126
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 28.3
  gpu_util_percent0: 0.01
  ram_util_percent: 42.2
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03922586504644024
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.0576432279099005
  mean_inference_ms: 0.75417530228

agent_timesteps_total: 26600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-33
done: false
episode_len_mean: 172.3
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 172.3
episode_reward_min: 59.0
episodes_this_iter: 2
episodes_total: 278
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 20.551790237426758
  num_agent_steps_sampled: 26600
  num_steps_sampled: 26600
  num_steps_trained: 26600
iterations_since_restore: 133
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 27.3
  gpu_util_percent0: 0.02
  ram_util_percent: 42.2
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03913281480797959
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05745988035885788
  mean_inference_ms: 0.752022059323

agent_timesteps_total: 28000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-35
done: false
episode_len_mean: 176.52
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 176.52
episode_reward_min: 63.0
episodes_this_iter: 1
episodes_total: 285
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.5475492477417
  num_agent_steps_sampled: 28000
  num_steps_sampled: 28000
  num_steps_trained: 28000
iterations_since_restore: 140
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0390608928542091
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057318533151831386
  mean_inference_ms: 0.7503532129740457
  mean_raw_obs_processing_ms: 0.05502222344735274
time_since_restore: 27.082506895065308
time_this_iter_s: 0.1

agent_timesteps_total: 29600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-36
done: false
episode_len_mean: 184.12
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 184.12
episode_reward_min: 63.0
episodes_this_iter: 1
episodes_total: 294
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.80524444580078
  num_agent_steps_sampled: 29600
  num_steps_sampled: 29600
  num_steps_trained: 29600
iterations_since_restore: 148
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 26.9
  gpu_util_percent0: 0.05
  ram_util_percent: 42.2
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038980510051231404
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05715834490342544
  mean_inference_ms: 0.7484513245

agent_timesteps_total: 31000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-38
done: false
episode_len_mean: 186.39
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 186.39
episode_reward_min: 63.0
episodes_this_iter: 1
episodes_total: 301
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.41661834716797
  num_agent_steps_sampled: 31000
  num_steps_sampled: 31000
  num_steps_trained: 31000
iterations_since_restore: 155
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 29.8
  gpu_util_percent0: 0.0
  ram_util_percent: 42.2
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03892451612737362
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05704862148236174
  mean_inference_ms: 0.747141672910

agent_timesteps_total: 32400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-39
done: false
episode_len_mean: 187.13
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 187.13
episode_reward_min: 118.0
episodes_this_iter: 1
episodes_total: 308
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 16.57707977294922
  num_agent_steps_sampled: 32400
  num_steps_sampled: 32400
  num_steps_trained: 32400
iterations_since_restore: 162
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 26.5
  gpu_util_percent0: 0.08
  ram_util_percent: 42.3
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038875276021458974
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056951763753897836
  mean_inference_ms: 0.74598895

agent_timesteps_total: 33800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-40
done: false
episode_len_mean: 187.57
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 187.57
episode_reward_min: 118.0
episodes_this_iter: 1
episodes_total: 315
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 19.266801834106445
  num_agent_steps_sampled: 33800
  num_steps_sampled: 33800
  num_steps_trained: 33800
iterations_since_restore: 169
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 25.0
  gpu_util_percent0: 0.11
  ram_util_percent: 42.3
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038835492173846145
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05687232143439514
  mean_inference_ms: 0.74505689

agent_timesteps_total: 35400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-42
done: false
episode_len_mean: 188.63
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.63
episode_reward_min: 122.0
episodes_this_iter: 1
episodes_total: 324
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.069124221801758
  num_agent_steps_sampled: 35400
  num_steps_sampled: 35400
  num_steps_trained: 35400
iterations_since_restore: 177
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 26.0
  gpu_util_percent0: 0.0
  ram_util_percent: 42.3
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03879604632021702
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05679054929987961
  mean_inference_ms: 0.7441105018

agent_timesteps_total: 36800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-43
done: false
episode_len_mean: 188.26
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.26
episode_reward_min: 122.0
episodes_this_iter: 1
episodes_total: 331
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 20.52057647705078
  num_agent_steps_sampled: 36800
  num_steps_sampled: 36800
  num_steps_trained: 36800
iterations_since_restore: 184
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 28.6
  gpu_util_percent0: 0.04
  ram_util_percent: 42.3
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03876975520900325
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05673673510115875
  mean_inference_ms: 0.7434865957

agent_timesteps_total: 38200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-45
done: false
episode_len_mean: 188.97
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.97
episode_reward_min: 122.0
episodes_this_iter: 1
episodes_total: 339
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.423643112182617
  num_agent_steps_sampled: 38200
  num_steps_sampled: 38200
  num_steps_trained: 38200
iterations_since_restore: 191
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 26.0
  gpu_util_percent0: 0.04
  ram_util_percent: 42.3
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038743151439013926
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05668124950115539
  mean_inference_ms: 0.74283998

agent_timesteps_total: 39800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-46
done: false
episode_len_mean: 188.39
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.39
episode_reward_min: 121.0
episodes_this_iter: 1
episodes_total: 348
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 19.574115753173828
  num_agent_steps_sampled: 39800
  num_steps_sampled: 39800
  num_steps_trained: 39800
iterations_since_restore: 199
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03871725461215102
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.0566243576424175
  mean_inference_ms: 0.7421990318098213
  mean_raw_obs_processing_ms: 0.05342259757647816
time_since_restore: 38.08528232574463
time_this_iter_s: 0.

agent_timesteps_total: 41200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-48
done: false
episode_len_mean: 187.14
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 187.14
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 356
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.22860336303711
  num_agent_steps_sampled: 41200
  num_steps_sampled: 41200
  num_steps_trained: 41200
iterations_since_restore: 206
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038696205881243145
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056579491901081
  mean_inference_ms: 0.7416738050007929
  mean_raw_obs_processing_ms: 0.053319822598891224
time_since_restore: 39.39827871322632
time_this_iter_s: 0.1

agent_timesteps_total: 42600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-49
done: false
episode_len_mean: 187.14
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 187.14
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 363
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.650334358215332
  num_agent_steps_sampled: 42600
  num_steps_sampled: 42600
  num_steps_trained: 42600
iterations_since_restore: 213
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 28.3
  gpu_util_percent0: 0.04
  ram_util_percent: 42.3
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03867981551683854
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056544818819901546
  mean_inference_ms: 0.741259593

agent_timesteps_total: 44000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-51
done: false
episode_len_mean: 186.9
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 186.9
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 371
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 23.250822067260742
  num_agent_steps_sampled: 44000
  num_steps_sampled: 44000
  num_steps_trained: 44000
iterations_since_restore: 220
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 28.1
  gpu_util_percent0: 0.34
  ram_util_percent: 42.3
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03866268904467271
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056509227298614705
  mean_inference_ms: 0.74083173057

agent_timesteps_total: 45400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-52
done: false
episode_len_mean: 187.74
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 187.74
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 378
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 20.24587059020996
  num_agent_steps_sampled: 45400
  num_steps_sampled: 45400
  num_steps_trained: 45400
iterations_since_restore: 227
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03865453411325679
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05649050802751408
  mean_inference_ms: 0.7405957322222957
  mean_raw_obs_processing_ms: 0.05309903354283205
time_since_restore: 43.45992827415466
time_this_iter_s: 0.1

agent_timesteps_total: 46800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-54
done: false
episode_len_mean: 188.66
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.66
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 385
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 16.4409236907959
  num_agent_steps_sampled: 46800
  num_steps_sampled: 46800
  num_steps_trained: 46800
iterations_since_restore: 234
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 27.6
  gpu_util_percent0: 0.02
  ram_util_percent: 41.1
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03865199776976119
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056481574144034784
  mean_inference_ms: 0.74046756182

agent_timesteps_total: 48200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-55
done: false
episode_len_mean: 189.06
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 189.06
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 392
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.553577423095703
  num_agent_steps_sampled: 48200
  num_steps_sampled: 48200
  num_steps_trained: 48200
iterations_since_restore: 241
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 26.7
  gpu_util_percent0: 0.29
  ram_util_percent: 41.1
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038652088916974144
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05647856332424642
  mean_inference_ms: 0.740396486

agent_timesteps_total: 49600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-56
done: false
episode_len_mean: 188.1
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.1
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 400
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 20.263572692871094
  num_agent_steps_sampled: 49600
  num_steps_sampled: 49600
  num_steps_trained: 49600
iterations_since_restore: 248
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 26.6
  gpu_util_percent0: 0.12
  ram_util_percent: 41.1
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0386544882420603
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056480225838641006
  mean_inference_ms: 0.740374985733

agent_timesteps_total: 51000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-58
done: false
episode_len_mean: 188.93
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.93
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 407
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 14.351947784423828
  num_agent_steps_sampled: 51000
  num_steps_sampled: 51000
  num_steps_trained: 51000
iterations_since_restore: 255
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 30.8
  gpu_util_percent0: 0.0
  ram_util_percent: 41.1
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03865881322937611
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05648548383689134
  mean_inference_ms: 0.74040210661

agent_timesteps_total: 52400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-42-59
done: false
episode_len_mean: 188.3
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.3
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 414
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 14.23710823059082
  num_agent_steps_sampled: 52400
  num_steps_sampled: 52400
  num_steps_trained: 52400
iterations_since_restore: 262
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 28.4
  gpu_util_percent0: 0.08
  ram_util_percent: 41.1
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03866324589335775
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056491494245407135
  mean_inference_ms: 0.740431702590

agent_timesteps_total: 53800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-01
done: false
episode_len_mean: 188.19
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.19
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 422
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.987165451049805
  num_agent_steps_sampled: 53800
  num_steps_sampled: 53800
  num_steps_trained: 53800
iterations_since_restore: 269
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 25.5
  gpu_util_percent0: 0.02
  ram_util_percent: 41.1
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03866780011067791
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056497917371822394
  mean_inference_ms: 0.740459517

agent_timesteps_total: 55200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-02
done: false
episode_len_mean: 187.8
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 187.8
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 429
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.209070205688477
  num_agent_steps_sampled: 55200
  num_steps_sampled: 55200
  num_steps_trained: 55200
iterations_since_restore: 276
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 26.9
  gpu_util_percent0: 0.05
  ram_util_percent: 41.1
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03867172261505058
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05650343551931749
  mean_inference_ms: 0.740467710337

agent_timesteps_total: 56800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-04
done: false
episode_len_mean: 187.16
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 187.16
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 438
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 11.514081954956055
  num_agent_steps_sampled: 56800
  num_steps_sampled: 56800
  num_steps_trained: 56800
iterations_since_restore: 284
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 28.5
  gpu_util_percent0: 0.0
  ram_util_percent: 41.2
  vram_util_percent0: 0.05264889766370517
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03867699292119821
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056512340432524615
  mean_inference_ms: 0.7404848658

agent_timesteps_total: 58200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-05
done: false
episode_len_mean: 187.86
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 187.86
episode_reward_min: 93.0
episodes_this_iter: 1
episodes_total: 445
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 11.747426986694336
  num_agent_steps_sampled: 58200
  num_steps_sampled: 58200
  num_steps_trained: 58200
iterations_since_restore: 291
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03868847761329526
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.0565323567481665
  mean_inference_ms: 0.7406534693455533
  mean_raw_obs_processing_ms: 0.052836537200877325
time_since_restore: 55.618404388427734
time_this_iter_s: 0

agent_timesteps_total: 59600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-07
done: false
episode_len_mean: 189.01
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 189.01
episode_reward_min: 119.0
episodes_this_iter: 1
episodes_total: 453
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 11.449040412902832
  num_agent_steps_sampled: 59600
  num_steps_sampled: 59600
  num_steps_trained: 59600
iterations_since_restore: 298
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03870665410102511
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056563807419620867
  mean_inference_ms: 0.7409525911973921
  mean_raw_obs_processing_ms: 0.052835793880228916
time_since_restore: 57.06355834007263
time_this_iter_s:

agent_timesteps_total: 61000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-09
done: false
episode_len_mean: 187.47
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 187.47
episode_reward_min: 119.0
episodes_this_iter: 1
episodes_total: 461
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 16.86559295654297
  num_agent_steps_sampled: 61000
  num_steps_sampled: 61000
  num_steps_trained: 61000
iterations_since_restore: 305
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038738841434095964
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056617376791084986
  mean_inference_ms: 0.7415147836613916
  mean_raw_obs_processing_ms: 0.05285590935318126
time_since_restore: 58.68582248687744
time_this_iter_s: 

agent_timesteps_total: 62400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-10
done: false
episode_len_mean: 186.19
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 186.19
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 469
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 12.014383316040039
  num_agent_steps_sampled: 62400
  num_steps_sampled: 62400
  num_steps_trained: 62400
iterations_since_restore: 312
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 40.1
  gpu_util_percent0: 0.05
  ram_util_percent: 43.2
  vram_util_percent0: 0.051826258637709774
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038777415855692894
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056681870743334264
  mean_inference_ms: 0.742191

agent_timesteps_total: 63800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-12
done: false
episode_len_mean: 186.05
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 186.05
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 477
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 13.556168556213379
  num_agent_steps_sampled: 63800
  num_steps_sampled: 63800
  num_steps_trained: 63800
iterations_since_restore: 319
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 51.8
  gpu_util_percent0: 0.05
  ram_util_percent: 41.7
  vram_util_percent0: 0.051826258637709774
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03881523477920245
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05674496780134065
  mean_inference_ms: 0.74284051

agent_timesteps_total: 65200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-13
done: false
episode_len_mean: 186.41
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 186.41
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 484
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 13.38009262084961
  num_agent_steps_sampled: 65200
  num_steps_sampled: 65200
  num_steps_trained: 65200
iterations_since_restore: 326
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 49.7
  gpu_util_percent0: 0.06
  ram_util_percent: 41.5
  vram_util_percent0: 0.05199078644290885
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03884884340203643
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05679963339389705
  mean_inference_ms: 0.7434129594

agent_timesteps_total: 66600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-15
done: false
episode_len_mean: 186.66
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 186.66
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 491
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 11.558001518249512
  num_agent_steps_sampled: 66600
  num_steps_sampled: 66600
  num_steps_trained: 66600
iterations_since_restore: 333
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 32.3
  gpu_util_percent0: 0.05
  ram_util_percent: 41.6
  vram_util_percent0: 0.05199078644290885
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038883566581511125
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05685577324171504
  mean_inference_ms: 0.74401298

agent_timesteps_total: 68000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-16
done: false
episode_len_mean: 187.35
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 187.35
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 498
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 10.969280242919922
  num_agent_steps_sampled: 68000
  num_steps_sampled: 68000
  num_steps_trained: 68000
iterations_since_restore: 340
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 28.4
  gpu_util_percent0: 0.02
  ram_util_percent: 41.6
  vram_util_percent0: 0.05231984205330701
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038917734343858625
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05691154003745073
  mean_inference_ms: 0.74460894

agent_timesteps_total: 69400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-17
done: false
episode_len_mean: 187.78
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 187.78
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 505
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 12.208520889282227
  num_agent_steps_sampled: 69400
  num_steps_sampled: 69400
  num_steps_trained: 69400
iterations_since_restore: 347
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 29.9
  gpu_util_percent0: 0.08
  ram_util_percent: 41.6
  vram_util_percent0: 0.05248436985850609
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038951358138319646
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05696683072779167
  mean_inference_ms: 0.74520104

agent_timesteps_total: 70800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-19
done: false
episode_len_mean: 188.57
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.57
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 512
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.114179611206055
  num_agent_steps_sampled: 70800
  num_steps_sampled: 70800
  num_steps_trained: 70800
iterations_since_restore: 354
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 26.7
  gpu_util_percent0: 0.07
  ram_util_percent: 41.7
  vram_util_percent0: 0.05380059230009872
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03898729621910188
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05702700203396526
  mean_inference_ms: 0.745840537

agent_timesteps_total: 72200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-20
done: false
episode_len_mean: 188.95
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.95
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 519
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 9.823939323425293
  num_agent_steps_sampled: 72200
  num_steps_sampled: 72200
  num_steps_trained: 72200
iterations_since_restore: 361
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03902517432918862
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057094511833098445
  mean_inference_ms: 0.7465221183138169
  mean_raw_obs_processing_ms: 0.053114227387467426
time_since_restore: 69.7014627456665
time_this_iter_s: 0

agent_timesteps_total: 73600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-22
done: false
episode_len_mean: 190.95
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 190.95
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 526
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 12.542461395263672
  num_agent_steps_sampled: 73600
  num_steps_sampled: 73600
  num_steps_trained: 73600
iterations_since_restore: 368
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03906570161803813
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057169960754702924
  mean_inference_ms: 0.7472643045714841
  mean_raw_obs_processing_ms: 0.05315874187668806
time_since_restore: 71.05913424491882
time_this_iter_s: 

agent_timesteps_total: 75000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-23
done: false
episode_len_mean: 191.48
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 191.48
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 533
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.931709289550781
  num_agent_steps_sampled: 75000
  num_steps_sampled: 75000
  num_steps_trained: 75000
iterations_since_restore: 375
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039108160097339334
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057249736563523584
  mean_inference_ms: 0.7480519614749855
  mean_raw_obs_processing_ms: 0.05320748014631994
time_since_restore: 72.45618772506714
time_this_iter_s:

agent_timesteps_total: 76400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-25
done: false
episode_len_mean: 191.9
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 191.9
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 540
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 13.135366439819336
  num_agent_steps_sampled: 76400
  num_steps_sampled: 76400
  num_steps_trained: 76400
iterations_since_restore: 382
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039152599638113454
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05733304074971591
  mean_inference_ms: 0.7488772153856093
  mean_raw_obs_processing_ms: 0.05325977753385943
time_since_restore: 73.93620491027832
time_this_iter_s: 0.

agent_timesteps_total: 77800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-27
done: false
episode_len_mean: 192.14
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 192.14
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 547
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 9.89659595489502
  num_agent_steps_sampled: 77800
  num_steps_sampled: 77800
  num_steps_trained: 77800
iterations_since_restore: 389
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 42.9
  gpu_util_percent0: 0.12
  ram_util_percent: 38.0
  vram_util_percent0: 0.05100361961171438
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03920278767085875
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05742758421568098
  mean_inference_ms: 0.74983915669

agent_timesteps_total: 79200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-28
done: false
episode_len_mean: 193.76
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 193.76
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 554
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 11.895757675170898
  num_agent_steps_sampled: 79200
  num_steps_sampled: 79200
  num_steps_trained: 79200
iterations_since_restore: 396
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 23.6
  gpu_util_percent0: 0.12
  ram_util_percent: 36.8
  vram_util_percent0: 0.050510036196117145
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039254295602997126
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05752552238378375
  mean_inference_ms: 0.7508383

agent_timesteps_total: 80800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-30
done: false
episode_len_mean: 196.33
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 196.33
episode_reward_min: 105.0
episodes_this_iter: 1
episodes_total: 562
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 7.0598273277282715
  num_agent_steps_sampled: 80800
  num_steps_sampled: 80800
  num_steps_trained: 80800
iterations_since_restore: 404
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03929766546958076
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05761369061343686
  mean_inference_ms: 0.751702198260716
  mean_raw_obs_processing_ms: 0.053441103198803
time_since_restore: 78.51583552360535
time_this_iter_s: 0.17

agent_timesteps_total: 82400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-31
done: false
episode_len_mean: 198.26
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 198.26
episode_reward_min: 139.0
episodes_this_iter: 1
episodes_total: 570
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 9.675714492797852
  num_agent_steps_sampled: 82400
  num_steps_sampled: 82400
  num_steps_trained: 82400
iterations_since_restore: 412
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03933114003035678
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05768558847501769
  mean_inference_ms: 0.7523858953340999
  mean_raw_obs_processing_ms: 0.0534812124507074
time_since_restore: 79.95027709007263
time_this_iter_s: 0.1

agent_timesteps_total: 84000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-33
done: false
episode_len_mean: 196.87
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 196.87
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 579
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 9.919132232666016
  num_agent_steps_sampled: 84000
  num_steps_sampled: 84000
  num_steps_trained: 84000
iterations_since_restore: 420
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039362049171661036
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057754775019675714
  mean_inference_ms: 0.7530396317017599
  mean_raw_obs_processing_ms: 0.053516620783473405
time_since_restore: 81.41365718841553
time_this_iter_s:

agent_timesteps_total: 85400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-34
done: false
episode_len_mean: 197.48
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.48
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 586
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 10.679348945617676
  num_agent_steps_sampled: 85400
  num_steps_sampled: 85400
  num_steps_trained: 85400
iterations_since_restore: 427
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039379334233091416
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057797750697681755
  mean_inference_ms: 0.7534216546423919
  mean_raw_obs_processing_ms: 0.05353560097117775
time_since_restore: 82.65308475494385
time_this_iter_s:

agent_timesteps_total: 87000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-36
done: false
episode_len_mean: 197.48
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.48
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 594
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 10.043723106384277
  num_agent_steps_sampled: 87000
  num_steps_sampled: 87000
  num_steps_trained: 87000
iterations_since_restore: 435
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03938965947639574
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05783095048451337
  mean_inference_ms: 0.7536736752945948
  mean_raw_obs_processing_ms: 0.053543560801333816
time_since_restore: 83.9288239479065
time_this_iter_s: 0

agent_timesteps_total: 88600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-37
done: false
episode_len_mean: 197.53
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.53
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 602
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 9.565502166748047
  num_agent_steps_sampled: 88600
  num_steps_sampled: 88600
  num_steps_trained: 88600
iterations_since_restore: 443
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039388881826717764
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05784687718236487
  mean_inference_ms: 0.7537172369294817
  mean_raw_obs_processing_ms: 0.053537027987839804
time_since_restore: 85.15156316757202
time_this_iter_s: 

agent_timesteps_total: 90200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-38
done: false
episode_len_mean: 197.38
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.38
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 610
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 13.479522705078125
  num_agent_steps_sampled: 90200
  num_steps_sampled: 90200
  num_steps_trained: 90200
iterations_since_restore: 451
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 20.4
  gpu_util_percent0: 0.05
  ram_util_percent: 36.7
  vram_util_percent0: 0.04985192497532083
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039375939540739424
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057844448663039415
  mean_inference_ms: 0.7535313

agent_timesteps_total: 91800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-40
done: false
episode_len_mean: 197.38
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.38
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 618
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 12.352490425109863
  num_agent_steps_sampled: 91800
  num_steps_sampled: 91800
  num_steps_trained: 91800
iterations_since_restore: 459
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039358255835381085
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057831970298476705
  mean_inference_ms: 0.753254480266516
  mean_raw_obs_processing_ms: 0.053485737152729494
time_since_restore: 88.02349615097046
time_this_iter_s:

agent_timesteps_total: 93200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-42
done: false
episode_len_mean: 197.38
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.38
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 625
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 14.317028999328613
  num_agent_steps_sampled: 93200
  num_steps_sampled: 93200
  num_steps_trained: 93200
iterations_since_restore: 466
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03934169418634345
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05781513370273151
  mean_inference_ms: 0.7529865074325777
  mean_raw_obs_processing_ms: 0.05345851954104708
time_since_restore: 89.39774537086487
time_this_iter_s: 0

agent_timesteps_total: 94600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-43
done: false
episode_len_mean: 197.38
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.38
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 632
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 12.372252464294434
  num_agent_steps_sampled: 94600
  num_steps_sampled: 94600
  num_steps_trained: 94600
iterations_since_restore: 473
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03932493730125963
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05779714892277123
  mean_inference_ms: 0.7527136076435629
  mean_raw_obs_processing_ms: 0.05342999675975043
time_since_restore: 90.73676013946533
time_this_iter_s: 0

agent_timesteps_total: 96000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-44
done: false
episode_len_mean: 197.54
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.54
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 639
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.381251335144043
  num_agent_steps_sampled: 96000
  num_steps_sampled: 96000
  num_steps_trained: 96000
iterations_since_restore: 480
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03930610459471816
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.0577749600416049
  mean_inference_ms: 0.7523996101928452
  mean_raw_obs_processing_ms: 0.05339792259016898
time_since_restore: 92.1216082572937
time_this_iter_s: 0.1

agent_timesteps_total: 97800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-46
done: false
episode_len_mean: 197.14
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.14
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 649
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 14.196115493774414
  num_agent_steps_sampled: 97800
  num_steps_sampled: 97800
  num_steps_trained: 97800
iterations_since_restore: 489
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 25.3
  gpu_util_percent0: 0.05
  ram_util_percent: 37.3
  vram_util_percent0: 0.05495228693649227
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03926344470567833
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05771516149537077
  mean_inference_ms: 0.751608456

agent_timesteps_total: 99200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-48
done: false
episode_len_mean: 197.14
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.14
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 656
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.670028686523438
  num_agent_steps_sampled: 99200
  num_steps_sampled: 99200
  num_steps_trained: 99200
iterations_since_restore: 496
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 23.7
  gpu_util_percent0: 0.09
  ram_util_percent: 37.2
  vram_util_percent0: 0.05495228693649227
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03922848179679087
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05766372175255364
  mean_inference_ms: 0.750946545

agent_timesteps_total: 100600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-49
done: false
episode_len_mean: 197.08
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.08
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 663
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 16.701553344726562
  num_agent_steps_sampled: 100600
  num_steps_sampled: 100600
  num_steps_trained: 100600
iterations_since_restore: 503
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 23.9
  gpu_util_percent0: 0.04
  ram_util_percent: 37.3
  vram_util_percent0: 0.05593945376768674
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03919560315702371
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05761550172377934
  mean_inference_ms: 0.75032

agent_timesteps_total: 102000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-51
done: false
episode_len_mean: 197.08
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.08
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 670
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.4483060836792
  num_agent_steps_sampled: 102000
  num_steps_sampled: 102000
  num_steps_trained: 102000
iterations_since_restore: 510
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 20.1
  gpu_util_percent0: 0.03
  ram_util_percent: 37.2
  vram_util_percent0: 0.05593945376768674
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03916645270218146
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057573976734688316
  mean_inference_ms: 0.749774

agent_timesteps_total: 103400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-52
done: false
episode_len_mean: 198.35
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 198.35
episode_reward_min: 136.0
episodes_this_iter: 1
episodes_total: 677
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.909640312194824
  num_agent_steps_sampled: 103400
  num_steps_sampled: 103400
  num_steps_trained: 103400
iterations_since_restore: 517
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 18.5
  gpu_util_percent0: 0.08
  ram_util_percent: 37.1
  vram_util_percent0: 0.05511681474169135
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03913903666639363
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057535411957542436
  mean_inference_ms: 0.7492

agent_timesteps_total: 104800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-53
done: false
episode_len_mean: 198.97
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 198.97
episode_reward_min: 166.0
episodes_this_iter: 1
episodes_total: 684
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 19.064210891723633
  num_agent_steps_sampled: 104800
  num_steps_sampled: 104800
  num_steps_trained: 104800
iterations_since_restore: 524
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 19.2
  gpu_util_percent0: 0.05
  ram_util_percent: 37.1
  vram_util_percent0: 0.05478775913129319
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03911232681960738
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05749769140271763
  mean_inference_ms: 0.74875

agent_timesteps_total: 106200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-55
done: false
episode_len_mean: 198.97
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 198.97
episode_reward_min: 166.0
episodes_this_iter: 1
episodes_total: 691
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.591419219970703
  num_agent_steps_sampled: 106200
  num_steps_sampled: 106200
  num_steps_trained: 106200
iterations_since_restore: 531
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 20.9
  gpu_util_percent0: 0.02
  ram_util_percent: 37.3
  vram_util_percent0: 0.057913787430075685
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0390912750619536
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05746901213627836
  mean_inference_ms: 0.74836

agent_timesteps_total: 107600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-56
done: false
episode_len_mean: 198.97
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 198.97
episode_reward_min: 166.0
episodes_this_iter: 1
episodes_total: 698
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.548921585083008
  num_agent_steps_sampled: 107600
  num_steps_sampled: 107600
  num_steps_trained: 107600
iterations_since_restore: 538
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 23.2
  gpu_util_percent0: 0.1
  ram_util_percent: 37.1
  vram_util_percent0: 0.05725567620927937
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039079968963992
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05745569629094947
  mean_inference_ms: 0.74815936

agent_timesteps_total: 109000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-58
done: false
episode_len_mean: 198.3
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 198.3
episode_reward_min: 162.0
episodes_this_iter: 1
episodes_total: 705
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 14.99135971069336
  num_agent_steps_sampled: 109000
  num_steps_sampled: 109000
  num_steps_trained: 109000
iterations_since_restore: 545
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 19.0
  gpu_util_percent0: 0.1
  ram_util_percent: 37.1
  vram_util_percent0: 0.0562685093780849
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0390788898907293
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05745786371948813
  mean_inference_ms: 0.74814223221

agent_timesteps_total: 110400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-43-59
done: false
episode_len_mean: 196.74
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 196.74
episode_reward_min: 135.0
episodes_this_iter: 1
episodes_total: 713
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 14.029033660888672
  num_agent_steps_sampled: 110400
  num_steps_sampled: 110400
  num_steps_trained: 110400
iterations_since_restore: 552
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 20.6
  gpu_util_percent0: 0.0
  ram_util_percent: 37.1
  vram_util_percent0: 0.0562685093780849
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03908515811228326
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05747071032153717
  mean_inference_ms: 0.7482597

agent_timesteps_total: 111800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-01
done: false
episode_len_mean: 196.03
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 196.03
episode_reward_min: 135.0
episodes_this_iter: 1
episodes_total: 720
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 14.234722137451172
  num_agent_steps_sampled: 111800
  num_steps_sampled: 111800
  num_steps_trained: 111800
iterations_since_restore: 559
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039088326267462496
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057477089444798894
  mean_inference_ms: 0.7483169360555831
  mean_raw_obs_processing_ms: 0.053013540981027625
time_since_restore: 107.12695574760437
time_this_i

agent_timesteps_total: 113400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-02
done: false
episode_len_mean: 192.12
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 192.12
episode_reward_min: 116.0
episodes_this_iter: 1
episodes_total: 730
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.427305221557617
  num_agent_steps_sampled: 113400
  num_steps_sampled: 113400
  num_steps_trained: 113400
iterations_since_restore: 567
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 17.0
  gpu_util_percent0: 0.01
  ram_util_percent: 37.1
  vram_util_percent0: 0.055445870352089505
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03908836998704638
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05747756956148752
  mean_inference_ms: 0.7483

agent_timesteps_total: 115000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-04
done: false
episode_len_mean: 189.31
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 189.31
episode_reward_min: 116.0
episodes_this_iter: 2
episodes_total: 740
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 16.991971969604492
  num_agent_steps_sampled: 115000
  num_steps_sampled: 115000
  num_steps_trained: 115000
iterations_since_restore: 575
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 24.5
  gpu_util_percent0: 0.0
  ram_util_percent: 37.1
  vram_util_percent0: 0.055445870352089505
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03908454415058887
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05747073424957946
  mean_inference_ms: 0.74822

agent_timesteps_total: 116600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-05
done: false
episode_len_mean: 186.25
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 186.25
episode_reward_min: 103.0
episodes_this_iter: 1
episodes_total: 749
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 22.028337478637695
  num_agent_steps_sampled: 116600
  num_steps_sampled: 116600
  num_steps_trained: 116600
iterations_since_restore: 583
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039076886137347895
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05745702019565207
  mean_inference_ms: 0.7480736458384398
  mean_raw_obs_processing_ms: 0.05297137819826788
time_since_restore: 111.45229625701904
time_this_ite

agent_timesteps_total: 118200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-07
done: false
episode_len_mean: 182.52
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 182.52
episode_reward_min: 103.0
episodes_this_iter: 1
episodes_total: 759
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.844676971435547
  num_agent_steps_sampled: 118200
  num_steps_sampled: 118200
  num_steps_trained: 118200
iterations_since_restore: 591
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039065885522329535
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057436701556875455
  mean_inference_ms: 0.7478539841996493
  mean_raw_obs_processing_ms: 0.052947791483965695
time_since_restore: 112.8914122581482
time_this_it

agent_timesteps_total: 119600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-08
done: false
episode_len_mean: 179.43
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 179.43
episode_reward_min: 103.0
episodes_this_iter: 1
episodes_total: 768
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.565950393676758
  num_agent_steps_sampled: 119600
  num_steps_sampled: 119600
  num_steps_trained: 119600
iterations_since_restore: 598
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.039053282429257485
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05741287259483904
  mean_inference_ms: 0.7476010774787547
  mean_raw_obs_processing_ms: 0.05292275484621837
time_since_restore: 114.20476984977722
time_this_ite

agent_timesteps_total: 121400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-10
done: false
episode_len_mean: 176.7
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 176.7
episode_reward_min: 103.0
episodes_this_iter: 1
episodes_total: 778
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 19.488561630249023
  num_agent_steps_sampled: 121400
  num_steps_sampled: 121400
  num_steps_trained: 121400
iterations_since_restore: 607
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03903926195624584
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05738606425227528
  mean_inference_ms: 0.7473154774026203
  mean_raw_obs_processing_ms: 0.05289514317725609
time_since_restore: 115.86279249191284
time_this_iter_s

agent_timesteps_total: 122800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-11
done: false
episode_len_mean: 174.49
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 174.49
episode_reward_min: 103.0
episodes_this_iter: 1
episodes_total: 786
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 19.440750122070312
  num_agent_steps_sampled: 122800
  num_steps_sampled: 122800
  num_steps_trained: 122800
iterations_since_restore: 614
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03902830428332055
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05736541347786605
  mean_inference_ms: 0.7470936447305854
  mean_raw_obs_processing_ms: 0.05287431478178133
time_since_restore: 117.18302130699158
time_this_iter

agent_timesteps_total: 124200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-13
done: false
episode_len_mean: 169.74
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 169.74
episode_reward_min: 103.0
episodes_this_iter: 1
episodes_total: 796
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.662930488586426
  num_agent_steps_sampled: 124200
  num_steps_sampled: 124200
  num_steps_trained: 124200
iterations_since_restore: 621
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03901314653920666
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05733715643612017
  mean_inference_ms: 0.74678824091924
  mean_raw_obs_processing_ms: 0.05284662728682471
time_since_restore: 118.48921179771423
time_this_iter_s

agent_timesteps_total: 125600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-14
done: false
episode_len_mean: 165.47
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 165.47
episode_reward_min: 103.0
episodes_this_iter: 1
episodes_total: 805
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.36678123474121
  num_agent_steps_sampled: 125600
  num_steps_sampled: 125600
  num_steps_trained: 125600
iterations_since_restore: 628
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038996124587616435
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057305326341329475
  mean_inference_ms: 0.7464478338684326
  mean_raw_obs_processing_ms: 0.05281673393276172
time_since_restore: 119.78806829452515
time_this_ite

agent_timesteps_total: 127200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-16
done: false
episode_len_mean: 160.17
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 160.17
episode_reward_min: 102.0
episodes_this_iter: 2
episodes_total: 817
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 11.537388801574707
  num_agent_steps_sampled: 127200
  num_steps_sampled: 127200
  num_steps_trained: 127200
iterations_since_restore: 636
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03897415086859577
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057263837556343414
  mean_inference_ms: 0.7460065006533142
  mean_raw_obs_processing_ms: 0.05277844615221113
time_since_restore: 121.2596685886383
time_this_iter

agent_timesteps_total: 128600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-17
done: false
episode_len_mean: 158.95
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 158.95
episode_reward_min: 102.0
episodes_this_iter: 1
episodes_total: 826
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 13.905250549316406
  num_agent_steps_sampled: 128600
  num_steps_sampled: 128600
  num_steps_trained: 128600
iterations_since_restore: 643
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038959882121624864
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05723702608837307
  mean_inference_ms: 0.7457171925652029
  mean_raw_obs_processing_ms: 0.05275418018492875
time_since_restore: 122.56464958190918
time_this_ite

agent_timesteps_total: 130200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-19
done: false
episode_len_mean: 156.35
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 156.35
episode_reward_min: 102.0
episodes_this_iter: 1
episodes_total: 837
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 13.781294822692871
  num_agent_steps_sampled: 130200
  num_steps_sampled: 130200
  num_steps_trained: 130200
iterations_since_restore: 651
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 20.2
  gpu_util_percent0: 0.05
  ram_util_percent: 37.0
  vram_util_percent0: 0.055445870352089505
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03894519467071639
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057209489110328554
  mean_inference_ms: 0.745

agent_timesteps_total: 131800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-20
done: false
episode_len_mean: 157.23
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 157.23
episode_reward_min: 102.0
episodes_this_iter: 1
episodes_total: 847
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 16.907285690307617
  num_agent_steps_sampled: 131800
  num_steps_sampled: 131800
  num_steps_trained: 131800
iterations_since_restore: 659
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 19.2
  gpu_util_percent0: 0.07
  ram_util_percent: 37.0
  vram_util_percent0: 0.055445870352089505
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038934138735006514
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05718869265598973
  mean_inference_ms: 0.745

agent_timesteps_total: 133200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-22
done: false
episode_len_mean: 156.41
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 156.41
episode_reward_min: 102.0
episodes_this_iter: 1
episodes_total: 855
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 12.136480331420898
  num_agent_steps_sampled: 133200
  num_steps_sampled: 133200
  num_steps_trained: 133200
iterations_since_restore: 666
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 18.9
  gpu_util_percent0: 0.02
  ram_util_percent: 37.0
  vram_util_percent0: 0.055445870352089505
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038926659700969896
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057174671180893186
  mean_inference_ms: 0.74

agent_timesteps_total: 135000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-24
done: false
episode_len_mean: 157.94
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 157.94
episode_reward_min: 102.0
episodes_this_iter: 1
episodes_total: 865
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 13.096468925476074
  num_agent_steps_sampled: 135000
  num_steps_sampled: 135000
  num_steps_trained: 135000
iterations_since_restore: 675
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03891877720975603
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05716001617286248
  mean_inference_ms: 0.7448684938307324
  mean_raw_obs_processing_ms: 0.052683279382703015
time_since_restore: 128.55656814575195
time_this_ite

agent_timesteps_total: 136600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-25
done: false
episode_len_mean: 158.58
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 158.58
episode_reward_min: 102.0
episodes_this_iter: 1
episodes_total: 874
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.787761688232422
  num_agent_steps_sampled: 136600
  num_steps_sampled: 136600
  num_steps_trained: 136600
iterations_since_restore: 683
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03891194782659707
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05714762418490029
  mean_inference_ms: 0.7447250208697862
  mean_raw_obs_processing_ms: 0.05267132194170894
time_since_restore: 130.0417673587799
time_this_iter_

agent_timesteps_total: 138200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-27
done: false
episode_len_mean: 160.91
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 160.91
episode_reward_min: 102.0
episodes_this_iter: 1
episodes_total: 883
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 14.036994934082031
  num_agent_steps_sampled: 138200
  num_steps_sampled: 138200
  num_steps_trained: 138200
iterations_since_restore: 691
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03890489196746075
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057134970304054594
  mean_inference_ms: 0.7445751766923948
  mean_raw_obs_processing_ms: 0.05265877302162659
time_since_restore: 131.42011427879333
time_this_ite

agent_timesteps_total: 139800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-28
done: false
episode_len_mean: 161.9
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 161.9
episode_reward_min: 102.0
episodes_this_iter: 2
episodes_total: 892
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.697978973388672
  num_agent_steps_sampled: 139800
  num_steps_sampled: 139800
  num_steps_trained: 139800
iterations_since_restore: 699
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038891735196808434
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.0571131844420282
  mean_inference_ms: 0.7443015289968713
  mean_raw_obs_processing_ms: 0.05263807406782127
time_since_restore: 132.6785752773285
time_this_iter_s:

agent_timesteps_total: 141400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-29
done: false
episode_len_mean: 165.94
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 165.94
episode_reward_min: 102.0
episodes_this_iter: 1
episodes_total: 900
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 13.077068328857422
  num_agent_steps_sampled: 141400
  num_steps_sampled: 141400
  num_steps_trained: 141400
iterations_since_restore: 707
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038874123687533235
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05708469133549339
  mean_inference_ms: 0.7439409440599541
  mean_raw_obs_processing_ms: 0.05261130835116121
time_since_restore: 133.88854551315308
time_this_ite

agent_timesteps_total: 143000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-31
done: false
episode_len_mean: 169.95
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 169.95
episode_reward_min: 102.0
episodes_this_iter: 1
episodes_total: 908
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 14.757085800170898
  num_agent_steps_sampled: 143000
  num_steps_sampled: 143000
  num_steps_trained: 143000
iterations_since_restore: 715
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0388505369732123
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05704730246530173
  mean_inference_ms: 0.7434673197737031
  mean_raw_obs_processing_ms: 0.052576152391468706
time_since_restore: 135.1842176914215
time_this_iter_

agent_timesteps_total: 144600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-32
done: false
episode_len_mean: 174.27
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 174.27
episode_reward_min: 104.0
episodes_this_iter: 1
episodes_total: 916
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.240277290344238
  num_agent_steps_sampled: 144600
  num_steps_sampled: 144600
  num_steps_trained: 144600
iterations_since_restore: 723
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038825285820318974
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.057007317777670555
  mean_inference_ms: 0.7429649663159933
  mean_raw_obs_processing_ms: 0.05253817258870951
time_since_restore: 136.63293528556824
time_this_it

agent_timesteps_total: 146000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-34
done: false
episode_len_mean: 178.66
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 178.66
episode_reward_min: 104.0
episodes_this_iter: 1
episodes_total: 923
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 16.69879150390625
  num_agent_steps_sampled: 146000
  num_steps_sampled: 146000
  num_steps_trained: 146000
iterations_since_restore: 730
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03880255593584133
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.0569711062502293
  mean_inference_ms: 0.7425143132887206
  mean_raw_obs_processing_ms: 0.05250344526143864
time_since_restore: 137.90461993217468
time_this_iter_s

agent_timesteps_total: 147600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-35
done: false
episode_len_mean: 183.26
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 183.26
episode_reward_min: 115.0
episodes_this_iter: 1
episodes_total: 931
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 16.610116958618164
  num_agent_steps_sampled: 147600
  num_steps_sampled: 147600
  num_steps_trained: 147600
iterations_since_restore: 738
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03877577283812427
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05692798141936214
  mean_inference_ms: 0.7419854461428333
  mean_raw_obs_processing_ms: 0.05246211064441598
time_since_restore: 139.34832668304443
time_this_iter

agent_timesteps_total: 149200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-37
done: false
episode_len_mean: 186.33
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 186.33
episode_reward_min: 115.0
episodes_this_iter: 1
episodes_total: 939
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.3605899810791
  num_agent_steps_sampled: 149200
  num_steps_sampled: 149200
  num_steps_trained: 149200
iterations_since_restore: 746
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03874785247611773
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05688255466582512
  mean_inference_ms: 0.7414363356607528
  mean_raw_obs_processing_ms: 0.05241846348608569
time_since_restore: 140.79195833206177
time_this_iter_s

agent_timesteps_total: 150800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-38
done: false
episode_len_mean: 188.8
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.8
episode_reward_min: 128.0
episodes_this_iter: 1
episodes_total: 947
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.744121551513672
  num_agent_steps_sampled: 150800
  num_steps_sampled: 150800
  num_steps_trained: 150800
iterations_since_restore: 754
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03871895977394873
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05683529368355003
  mean_inference_ms: 0.7408686012399002
  mean_raw_obs_processing_ms: 0.05237304328831388
time_since_restore: 142.18613243103027
time_this_iter_s

agent_timesteps_total: 152400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-39
done: false
episode_len_mean: 191.65
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 191.65
episode_reward_min: 128.0
episodes_this_iter: 1
episodes_total: 955
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.605679512023926
  num_agent_steps_sampled: 152400
  num_steps_sampled: 152400
  num_steps_trained: 152400
iterations_since_restore: 762
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0386851353620038
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05678031531820695
  mean_inference_ms: 0.7402058797825539
  mean_raw_obs_processing_ms: 0.05232093132706131
time_since_restore: 143.35596013069153
time_this_iter_

agent_timesteps_total: 154000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-41
done: false
episode_len_mean: 193.34
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 193.34
episode_reward_min: 139.0
episodes_this_iter: 1
episodes_total: 963
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.75348663330078
  num_agent_steps_sampled: 154000
  num_steps_sampled: 154000
  num_steps_trained: 154000
iterations_since_restore: 770
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03864426622892353
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056715214797010936
  mean_inference_ms: 0.7394111635676388
  mean_raw_obs_processing_ms: 0.05225959874823889
time_since_restore: 144.5261266231537
time_this_iter_

agent_timesteps_total: 155600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-42
done: false
episode_len_mean: 195.66
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 195.66
episode_reward_min: 160.0
episodes_this_iter: 1
episodes_total: 971
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.335681915283203
  num_agent_steps_sampled: 155600
  num_steps_sampled: 155600
  num_steps_trained: 155600
iterations_since_restore: 778
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03859685074585683
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056640524379778255
  mean_inference_ms: 0.7384947742515994
  mean_raw_obs_processing_ms: 0.0521896599182264
time_since_restore: 145.70929193496704
time_this_iter

agent_timesteps_total: 157200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-44
done: false
episode_len_mean: 197.09
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 197.09
episode_reward_min: 165.0
episodes_this_iter: 1
episodes_total: 979
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 21.175029754638672
  num_agent_steps_sampled: 157200
  num_steps_sampled: 157200
  num_steps_trained: 157200
iterations_since_restore: 786
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03854638346314838
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05656105512996849
  mean_inference_ms: 0.737522435572464
  mean_raw_obs_processing_ms: 0.05211564048447641
time_since_restore: 147.15803694725037
time_this_iter_

agent_timesteps_total: 158800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-45
done: false
episode_len_mean: 196.54
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 196.54
episode_reward_min: 157.0
episodes_this_iter: 1
episodes_total: 988
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.982270240783691
  num_agent_steps_sampled: 158800
  num_steps_sampled: 158800
  num_steps_trained: 158800
iterations_since_restore: 794
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03849300067398584
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056476184393687186
  mean_inference_ms: 0.7364981044214721
  mean_raw_obs_processing_ms: 0.05203692844399666
time_since_restore: 148.6091136932373
time_this_iter

agent_timesteps_total: 160400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-47
done: false
episode_len_mean: 192.59
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 192.59
episode_reward_min: 115.0
episodes_this_iter: 2
episodes_total: 999
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 13.979209899902344
  num_agent_steps_sampled: 160400
  num_steps_sampled: 160400
  num_steps_trained: 160400
iterations_since_restore: 802
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038435482699801876
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05638350263340188
  mean_inference_ms: 0.7354013546080929
  mean_raw_obs_processing_ms: 0.05195066407353927
time_since_restore: 149.93569493293762
time_this_ite

agent_timesteps_total: 162000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-48
done: false
episode_len_mean: 188.32
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.32
episode_reward_min: 113.0
episodes_this_iter: 1
episodes_total: 1009
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.30821418762207
  num_agent_steps_sampled: 162000
  num_steps_sampled: 162000
  num_steps_trained: 162000
iterations_since_restore: 810
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03838641339501813
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.056304605608266674
  mean_inference_ms: 0.7344685906916191
  mean_raw_obs_processing_ms: 0.051877652440495206
time_since_restore: 151.10755372047424
time_this_it

agent_timesteps_total: 163600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-49
done: false
episode_len_mean: 184.06
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 184.06
episode_reward_min: 98.0
episodes_this_iter: 1
episodes_total: 1019
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 20.264921188354492
  num_agent_steps_sampled: 163600
  num_steps_sampled: 163600
  num_steps_trained: 163600
iterations_since_restore: 818
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 17.5
  gpu_util_percent0: 0.04
  ram_util_percent: 36.8
  vram_util_percent0: 0.05478775913129319
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03833291895723063
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05621970803949015
  mean_inference_ms: 0.73345

agent_timesteps_total: 165200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-50
done: false
episode_len_mean: 182.04
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 182.04
episode_reward_min: 98.0
episodes_this_iter: 1
episodes_total: 1028
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 20.105592727661133
  num_agent_steps_sampled: 165200
  num_steps_sampled: 165200
  num_steps_trained: 165200
iterations_since_restore: 826
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03827985484750917
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05613652522315779
  mean_inference_ms: 0.7324479908463292
  mean_raw_obs_processing_ms: 0.05172401301063316
time_since_restore: 153.5423092842102
time_this_iter_

agent_timesteps_total: 166800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-52
done: false
episode_len_mean: 180.35
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 180.35
episode_reward_min: 98.0
episodes_this_iter: 1
episodes_total: 1037
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.016376495361328
  num_agent_steps_sampled: 166800
  num_steps_sampled: 166800
  num_steps_trained: 166800
iterations_since_restore: 834
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 21.0
  gpu_util_percent0: 0.0
  ram_util_percent: 36.8
  vram_util_percent0: 0.05478775913129319
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038226054728185084
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05605256648616977
  mean_inference_ms: 0.73143

agent_timesteps_total: 168400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-53
done: false
episode_len_mean: 176.22
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 176.22
episode_reward_min: 98.0
episodes_this_iter: 1
episodes_total: 1047
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.668777465820312
  num_agent_steps_sampled: 168400
  num_steps_sampled: 168400
  num_steps_trained: 168400
iterations_since_restore: 842
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 19.3
  gpu_util_percent0: 0.0
  ram_util_percent: 36.8
  vram_util_percent0: 0.05478775913129319
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03816723027538689
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.055961027455380986
  mean_inference_ms: 0.73031

agent_timesteps_total: 170000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-55
done: false
episode_len_mean: 174.28
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 174.28
episode_reward_min: 98.0
episodes_this_iter: 1
episodes_total: 1056
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 16.596067428588867
  num_agent_steps_sampled: 170000
  num_steps_sampled: 170000
  num_steps_trained: 170000
iterations_since_restore: 850
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 19.5
  gpu_util_percent0: 0.0
  ram_util_percent: 36.8
  vram_util_percent0: 0.05478775913129319
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03811909863110072
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05588587713188341
  mean_inference_ms: 0.729408

agent_timesteps_total: 171600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-56
done: false
episode_len_mean: 170.41
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 170.41
episode_reward_min: 98.0
episodes_this_iter: 1
episodes_total: 1066
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 14.807747840881348
  num_agent_steps_sampled: 171600
  num_steps_sampled: 171600
  num_steps_trained: 171600
iterations_since_restore: 858
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.038069510607367144
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.055807748672186165
  mean_inference_ms: 0.7284672004298147
  mean_raw_obs_processing_ms: 0.05142531014588543
time_since_restore: 158.87015986442566
time_this_it

agent_timesteps_total: 173200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-57
done: false
episode_len_mean: 169.43
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 169.43
episode_reward_min: 98.0
episodes_this_iter: 1
episodes_total: 1075
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.437294006347656
  num_agent_steps_sampled: 173200
  num_steps_sampled: 173200
  num_steps_trained: 173200
iterations_since_restore: 866
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03802656861357126
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05573983522971128
  mean_inference_ms: 0.7276477268281396
  mean_raw_obs_processing_ms: 0.0513639370666572
time_since_restore: 160.05083107948303
time_this_iter_

agent_timesteps_total: 174800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-44-59
done: false
episode_len_mean: 169.52
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 169.52
episode_reward_min: 98.0
episodes_this_iter: 1
episodes_total: 1083
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.403395652770996
  num_agent_steps_sampled: 174800
  num_steps_sampled: 174800
  num_steps_trained: 174800
iterations_since_restore: 874
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03798526030922167
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.055674976388890185
  mean_inference_ms: 0.726859234218936
  mean_raw_obs_processing_ms: 0.05130568421786285
time_since_restore: 161.3686065673828
time_this_iter_

agent_timesteps_total: 176400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-00
done: false
episode_len_mean: 171.22
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 171.22
episode_reward_min: 98.0
episodes_this_iter: 1
episodes_total: 1091
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 13.243581771850586
  num_agent_steps_sampled: 176400
  num_steps_sampled: 176400
  num_steps_trained: 176400
iterations_since_restore: 882
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03794366050931372
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05560983048679751
  mean_inference_ms: 0.7260670375068154
  mean_raw_obs_processing_ms: 0.05124711053034621
time_since_restore: 162.8114058971405
time_this_iter_

agent_timesteps_total: 178000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-02
done: false
episode_len_mean: 175.46
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 175.46
episode_reward_min: 98.0
episodes_this_iter: 1
episodes_total: 1099
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.190797805786133
  num_agent_steps_sampled: 178000
  num_steps_sampled: 178000
  num_steps_trained: 178000
iterations_since_restore: 890
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.037903496723155426
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.055546617861216056
  mean_inference_ms: 0.7253016233605116
  mean_raw_obs_processing_ms: 0.051189984768099814
time_since_restore: 164.2615933418274
time_this_it

agent_timesteps_total: 179600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-03
done: false
episode_len_mean: 178.55
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 178.55
episode_reward_min: 98.0
episodes_this_iter: 1
episodes_total: 1107
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.442204475402832
  num_agent_steps_sampled: 179600
  num_steps_sampled: 179600
  num_steps_trained: 179600
iterations_since_restore: 898
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03786529975263326
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05548593091772711
  mean_inference_ms: 0.7245747064542185
  mean_raw_obs_processing_ms: 0.05113513606084626
time_since_restore: 165.45386743545532
time_this_iter

agent_timesteps_total: 181200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-05
done: false
episode_len_mean: 182.71
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 182.71
episode_reward_min: 118.0
episodes_this_iter: 1
episodes_total: 1115
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.241722106933594
  num_agent_steps_sampled: 181200
  num_steps_sampled: 181200
  num_steps_trained: 181200
iterations_since_restore: 906
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.037827834200745085
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05542680169438114
  mean_inference_ms: 0.7238632701483096
  mean_raw_obs_processing_ms: 0.05108129738579452
time_since_restore: 166.79573893547058
time_this_it

agent_timesteps_total: 182800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-06
done: false
episode_len_mean: 185.27
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 185.27
episode_reward_min: 118.0
episodes_this_iter: 1
episodes_total: 1123
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.823307037353516
  num_agent_steps_sampled: 182800
  num_steps_sampled: 182800
  num_steps_trained: 182800
iterations_since_restore: 914
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0377918330790501
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05537059729476259
  mean_inference_ms: 0.7231793809613546
  mean_raw_obs_processing_ms: 0.05102963721677762
time_since_restore: 167.9958372116089
time_this_iter_

agent_timesteps_total: 184400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-08
done: false
episode_len_mean: 187.87
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 187.87
episode_reward_min: 118.0
episodes_this_iter: 1
episodes_total: 1131
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.86542320251465
  num_agent_steps_sampled: 184400
  num_steps_sampled: 184400
  num_steps_trained: 184400
iterations_since_restore: 922
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03775902549441785
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05531893672205977
  mean_inference_ms: 0.7225534627037488
  mean_raw_obs_processing_ms: 0.05098195135590079
time_since_restore: 169.53516221046448
time_this_iter

agent_timesteps_total: 185800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-09
done: false
episode_len_mean: 188.18
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 188.18
episode_reward_min: 118.0
episodes_this_iter: 1
episodes_total: 1138
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 15.621776580810547
  num_agent_steps_sampled: 185800
  num_steps_sampled: 185800
  num_steps_trained: 185800
iterations_since_restore: 929
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.037731910962993276
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05527647786687837
  mean_inference_ms: 0.7220335278319665
  mean_raw_obs_processing_ms: 0.05094242094129562
time_since_restore: 170.85473346710205
time_this_it

agent_timesteps_total: 187200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-10
done: false
episode_len_mean: 191.6
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 191.6
episode_reward_min: 119.0
episodes_this_iter: 1
episodes_total: 1145
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.005958557128906
  num_agent_steps_sampled: 187200
  num_steps_sampled: 187200
  num_steps_trained: 187200
iterations_since_restore: 936
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03770591476342009
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05523582311865174
  mean_inference_ms: 0.7215344969683091
  mean_raw_obs_processing_ms: 0.050904300941480296
time_since_restore: 172.1592733860016
time_this_iter_

agent_timesteps_total: 188800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-12
done: false
episode_len_mean: 193.67
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 193.67
episode_reward_min: 119.0
episodes_this_iter: 1
episodes_total: 1153
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 19.67793083190918
  num_agent_steps_sampled: 188800
  num_steps_sampled: 188800
  num_steps_trained: 188800
iterations_since_restore: 944
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.037677271277250915
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05519115715053447
  mean_inference_ms: 0.720984258361955
  mean_raw_obs_processing_ms: 0.05086228906772464
time_since_restore: 173.63468408584595
time_this_iter

agent_timesteps_total: 190400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-14
done: false
episode_len_mean: 196.25
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 196.25
episode_reward_min: 119.0
episodes_this_iter: 1
episodes_total: 1161
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.820220947265625
  num_agent_steps_sampled: 190400
  num_steps_sampled: 190400
  num_steps_trained: 190400
iterations_since_restore: 952
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 54.4
  gpu_util_percent0: 0.0
  ram_util_percent: 37.3
  vram_util_percent0: 0.05495228693649227
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03765517340632194
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05515689296310526
  mean_inference_ms: 0.72055

agent_timesteps_total: 191800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-15
done: false
episode_len_mean: 198.61
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 198.61
episode_reward_min: 170.0
episodes_this_iter: 1
episodes_total: 1168
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 16.552637100219727
  num_agent_steps_sampled: 191800
  num_steps_sampled: 191800
  num_steps_trained: 191800
iterations_since_restore: 959
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.037642232550499116
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05513703497536529
  mean_inference_ms: 0.7203074278082704
  mean_raw_obs_processing_ms: 0.05080984144027305
time_since_restore: 176.69205808639526
time_this_it

agent_timesteps_total: 193400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-17
done: false
episode_len_mean: 199.3
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 199.3
episode_reward_min: 176.0
episodes_this_iter: 1
episodes_total: 1176
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 16.46129035949707
  num_agent_steps_sampled: 193400
  num_steps_sampled: 193400
  num_steps_trained: 193400
iterations_since_restore: 967
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03763238044348938
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05512157563147058
  mean_inference_ms: 0.720113814206088
  mean_raw_obs_processing_ms: 0.050793608338409985
time_since_restore: 178.13314700126648
time_this_iter_s

agent_timesteps_total: 195000
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-18
done: false
episode_len_mean: 199.66
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 199.66
episode_reward_min: 181.0
episodes_this_iter: 1
episodes_total: 1184
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 17.286087036132812
  num_agent_steps_sampled: 195000
  num_steps_sampled: 195000
  num_steps_trained: 195000
iterations_since_restore: 975
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0376260381362069
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05511118463294772
  mean_inference_ms: 0.7199863233358808
  mean_raw_obs_processing_ms: 0.05078155983722192
time_since_restore: 179.50381112098694
time_this_iter

agent_timesteps_total: 196600
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-20
done: false
episode_len_mean: 200.0
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 200.0
episode_reward_min: 200.0
episodes_this_iter: 1
episodes_total: 1192
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 18.51556396484375
  num_agent_steps_sampled: 196600
  num_steps_sampled: 196600
  num_steps_trained: 196600
iterations_since_restore: 983
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03761653397419119
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.05509635806272567
  mean_inference_ms: 0.7197977519837673
  mean_raw_obs_processing_ms: 0.0507652013171635
time_since_restore: 180.68912482261658
time_this_iter_s:

agent_timesteps_total: 198200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-21
done: false
episode_len_mean: 199.1
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 199.1
episode_reward_min: 110.0
episodes_this_iter: 1
episodes_total: 1200
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 28.19048309326172
  num_agent_steps_sampled: 198200
  num_steps_sampled: 198200
  num_steps_trained: 198200
iterations_since_restore: 991
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 18.3
  gpu_util_percent0: 0.0
  ram_util_percent: 36.9
  vram_util_percent0: 0.05495228693649227
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03760281129359328
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.055075454887749904
  mean_inference_ms: 0.7195266

agent_timesteps_total: 199800
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-22
done: false
episode_len_mean: 198.1
episode_media: {}
episode_reward_max: 200.0
episode_reward_mean: 198.1
episode_reward_min: 100.0
episodes_this_iter: 1
episodes_total: 1209
experiment_id: 9347da65e5334c0bba3d2e82feef9a20
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 19.154212951660156
  num_agent_steps_sampled: 199800
  num_steps_sampled: 199800
  num_steps_trained: 199800
iterations_since_restore: 999
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf: {}
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03758641124230447
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.0550505983194443
  mean_inference_ms: 0.7191982758735478
  mean_raw_obs_processing_ms: 0.050717373889883975
time_since_restore: 183.06478834152222
time_this_iter_

In [13]:
print("Last checkpoint saved at", last_checkpoint)

Last checkpoint saved at /home/bruno/ray_results/PG_CartPole-v0_2021-11-16_17-41-58o964uy7w/checkpoint_001000/checkpoint-1000


Now let's record another video, this time choosing the action recommended by the trained model instead of acting randomly.

In [None]:
# Rebuild the trainer and load the weights from the last checkpoint
trainer = pg.PGTrainer(config=config, env=environment_id)
trainer.restore(last_checkpoint)

after_training = os.path.join(
    DRIVE_PATH, "{}after_training_basic_api.mp4".format(environment_id)
)
after_video = VideoRecorder(env, after_training)
observation = env.reset()
done = False
# Run one episode, recording each frame and acting with the trained policy
while not done:
    env.render()
    after_video.capture_frame()
    action = trainer.compute_action(observation)
    observation, reward, done, info = env.step(action)
after_video.close()
env.close()
html = render_mp4(after_training)
HTML(html)

### Using custom environments or models

The Python API provides the flexibility needed to apply RLlib to new problems. You will need this API if you want to use custom environments or models with RLlib. Below is an example of a custom environment and a custom model.

<br>


For more information, see [Advanced Python APIs](https://docs.ray.io/en/latest/rllib-training.html#advanced-python-apis).

In [15]:
import gym
from gym.spaces import Discrete, Box
import numpy as np
import os
import random

import torch
import torch.nn as nn

import ray
from ray import tune
from ray.rllib.agents import pg
from ray.rllib.env.env_context import EnvContext
from ray.rllib.models import ModelCatalog
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2
from ray.rllib.models.torch.fcnet import FullyConnectedNetwork as TorchFC
from ray.tune.logger import pretty_print

In [16]:
class SimpleCorridor(gym.Env):
    """Example of a custom environment in which you have to walk down a
    corridor. You can configure the length of the corridor via the
    environment config."""

    def __init__(self, config: EnvContext):
        self.end_pos = config["corridor_length"]
        self.cur_pos = 0
        self.action_space = Discrete(2)
        self.observation_space = Box(
            0.0, self.end_pos, shape=(1, ), dtype=np.float32)
        # Set the seed. It is only used for the final reward.
        self.seed(config.worker_index * config.num_workers)

    def reset(self):
        self.cur_pos = 0
        return [self.cur_pos]

    def step(self, action):
        assert action in [0, 1], action
        if action == 0 and self.cur_pos > 0:
            self.cur_pos -= 1
        elif action == 1:
            self.cur_pos += 1
        done = self.cur_pos >= self.end_pos
        # Produce a random reward when we reach the goal.
        return [self.cur_pos], \
            random.random() * 2 if done else -0.1, done, {}

    def seed(self, seed=None):
        random.seed(seed)
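
To make the reward structure of this environment concrete, here is a minimal sketch that replays the same step logic in plain Python, without gym or RLlib. The helper `corridor_episode` and the `always_right` policy are hypothetical names introduced only for illustration. With a corridor of length 5, a policy that always moves right finishes in 5 steps: four steps cost -0.1 each and the last one pays a random reward in [0, 2), so the expected return is about 4 × (-0.1) + 1.0 = 0.6.

```python
import random


def corridor_episode(policy, corridor_length=5, max_steps=1000, rng=None):
    """Replay the corridor dynamics for one episode.

    Hypothetical helper: mirrors the step() logic of SimpleCorridor,
    but does not depend on gym. Returns (total_reward, steps_taken).
    """
    rng = rng or random.Random(0)
    pos, total, steps = 0, 0.0, 0
    while steps < max_steps:
        action = policy(pos)
        if action == 0 and pos > 0:
            pos -= 1  # move left, but never past the start
        elif action == 1:
            pos += 1  # move right
        steps += 1
        done = pos >= corridor_length
        # Random reward in [0, 2) at the goal, small penalty otherwise
        total += rng.random() * 2 if done else -0.1
        if done:
            break
    return total, steps


def always_right(pos):
    """Hypothetical policy: ignore the position and always move right."""
    return 1


total, steps = corridor_episode(always_right)
print(steps, total)
```

A policy that wanders back and forth collects more -0.1 penalties before reaching the goal, which is what pulls the mean episode reward below zero for an untrained agent.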

In [17]:
class TorchCustomModel(TorchModelV2, nn.Module):
    """Example of a custom PyTorch model that simply delegates to an
    fc-net."""

    def __init__(self, obs_space, action_space, num_outputs, model_config,
                 name):
        TorchModelV2.__init__(self, obs_space, action_space, num_outputs,
                              model_config, name)
        nn.Module.__init__(self)

        self.torch_sub_model = TorchFC(obs_space, action_space, num_outputs,
                                       model_config, name)

    def forward(self, input_dict, state, seq_lens):
        input_dict["obs"] = input_dict["obs"].float()
        fc_out, _ = self.torch_sub_model(input_dict, state, seq_lens)
        return fc_out, []

    def value_function(self):
        return torch.reshape(self.torch_sub_model.value_function(), [-1])

In [18]:
# You can also register the environment creator function explicitly with:
# register_env("corridor", lambda config: SimpleCorridor(config))

# Register the custom model
ModelCatalog.register_custom_model(
    "my_model", TorchCustomModel
)

config = {
    "env": SimpleCorridor,  # or "corridor" if registered
    "env_config": {
        "corridor_length": 5,
    },
    "model": {
        "custom_model": "my_model",
        "vf_share_layers": True,
    },
    "num_workers": 1,  
    "framework": "torch",
}

stop = {
    "training_iteration": 50,
    "timesteps_total": 100000,
    "episode_reward_mean": 0.1,
}

In [19]:
pg_config = pg.DEFAULT_CONFIG.copy()
pg_config.update(config)
pg_config["lr"] = 1e-3

trainer = pg.PGTrainer(config=pg_config, env=SimpleCorridor)
# run a manual training loop and print the results after each iteration
for _ in range(stop["training_iteration"]):
    result = trainer.train()
    print(pretty_print(result))
    
    # stop training once the desired number of timesteps has been reached
    # or the desired reward has been achieved
    if result["timesteps_total"] >= stop["timesteps_total"] or \
            result["episode_reward_mean"] >= stop["episode_reward_mean"]:
        break

agent_timesteps_total: 200
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-32
done: false
episode_len_mean: 26.571428571428573
episode_media: {}
episode_reward_max: 1.3866340851152703
episode_reward_mean: -1.6236628243102207
episode_reward_min: -6.37245076204676
episodes_this_iter: 7
episodes_total: 7
experiment_id: f7f428ff01b14bbcbcd7382e4dbeff1e
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.6858269572257996
  num_agent_steps_sampled: 200
  num_steps_sampled: 200
  num_steps_trained: 200
iterations_since_restore: 1
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 18.5
  gpu_util_percent0: 0.0
  ram_util_percent: 38.5
  vram_util_percent0: 0.054129647910496875
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04035560645867343
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.02896963660

agent_timesteps_total: 1400
custom_metrics:
  default_policy: {}
date: 2021-11-16_17-45-33
done: false
episode_len_mean: 9.17
episode_media: {}
episode_reward_max: 1.5475504723193834
episode_reward_mean: 0.24254277682776265
episode_reward_min: -2.371372001398453
episodes_this_iter: 31
episodes_total: 123
experiment_id: f7f428ff01b14bbcbcd7382e4dbeff1e
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: 0.26876452565193176
  num_agent_steps_sampled: 1400
  num_steps_sampled: 1400
  num_steps_trained: 1400
iterations_since_restore: 7
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 18.5
  gpu_util_percent0: 0.0
  ram_util_percent: 36.9
  vram_util_percent0: 0.054129647910496875
pid: 6159
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.03909706424923176
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 0.02893375425175383
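
The manual loop above stops as soon as one of the thresholds in `stop` is reached. A minimal sketch of that check as a reusable helper follows; `should_stop` is a hypothetical name, not part of RLlib, and the two result dictionaries are trimmed to a few keys with values taken from the output above.

```python
def should_stop(result, stop):
    """Return True when any metric listed in `stop` has reached its threshold."""
    return any(
        key in result and result[key] >= threshold
        for key, threshold in stop.items()
    )


stop = {
    "training_iteration": 50,
    "timesteps_total": 100000,
    "episode_reward_mean": 0.1,
}

# Trimmed copies of the first and the later result printed above
first = {
    "training_iteration": 1,
    "timesteps_total": 200,
    "episode_reward_mean": -1.6236628243102207,
}
later = {
    "training_iteration": 7,
    "timesteps_total": 1400,
    "episode_reward_mean": 0.24254277682776265,
}

print(should_stop(first, stop))  # False: no threshold reached yet
print(should_stop(later, stop))  # True: mean reward is above 0.1
```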

### Ray Tune

All RLlib Trainers are compatible with the [Ray Tune](https://docs.ray.io/en/master/tune/index.html) API, which makes them easy to use in Tune experiments. For example, the following code runs the same CartPole training with the PG algorithm.

In [20]:
import ray
config = {
    "env": environment_id,
    "framework": "torch",
}
stop = {"episode_reward_mean": 150, "timesteps_total": 100000}

# Run the training
analysis = ray.tune.run(
    "PG",
    config=config,
    stop=stop,
    checkpoint_freq=10,
    checkpoint_at_end=True,
    local_dir=os.path.join(DRIVE_PATH, "results")
)


Trial name,status,loc
PG_CartPole-v0_25270_00000,PENDING,


[2m[36m(pid=12779)[0m 2021-11-16 17:45:36,330	INFO trainer.py:696 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.


Result for PG_CartPole-v0_25270_00000:
  agent_timesteps_total: 200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-45-36
  done: false
  episode_len_mean: 22.571428571428573
  episode_media: {}
  episode_reward_max: 38.0
  episode_reward_mean: 22.571428571428573
  episode_reward_min: 10.0
  episodes_this_iter: 7
  episodes_total: 7
  experiment_id: a3f8cb6e5c254863b66e662df36490ce
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 9.701666831970215
    num_agent_steps_sampled: 200
    num_steps_sampled: 200
    num_steps_trained: 200
  iterations_since_restore: 1
  node_ip: 192.168.0.102
  num_healthy_workers: 0
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 24.0
    gpu_util_percent0: 0.0
    ram_util_percent: 38.1
    vram_util_percent0: 0.054129647910496875
  pid: 12779
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing

Trial name,status,loc,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_25270_00000,RUNNING,192.168.0.102:12779,17,2.70072,3400,31.48,77,9,31.48


Result for PG_CartPole-v0_25270_00000:
  agent_timesteps_total: 5800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-45-41
  done: false
  episode_len_mean: 47.44
  episode_media: {}
  episode_reward_max: 164.0
  episode_reward_mean: 47.44
  episode_reward_min: 10.0
  episodes_this_iter: 1
  episodes_total: 142
  experiment_id: a3f8cb6e5c254863b66e662df36490ce
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.879854202270508
    num_agent_steps_sampled: 5800
    num_steps_sampled: 5800
    num_steps_trained: 5800
  iterations_since_restore: 29
  node_ip: 192.168.0.102
  num_healthy_workers: 0
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 18.7
    gpu_util_percent0: 0.0
    ram_util_percent: 38.2
    vram_util_percent0: 0.054294175715695954
  pid: 12779
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0326227938

Trial name,status,loc,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_25270_00000,RUNNING,192.168.0.102:12779,45,7.27209,9000,73.69,200,11,73.69


Result for PG_CartPole-v0_25270_00000:
  agent_timesteps_total: 12000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-45-46
  done: false
  episode_len_mean: 97.02
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 97.02
  episode_reward_min: 13.0
  episodes_this_iter: 1
  episodes_total: 182
  experiment_id: a3f8cb6e5c254863b66e662df36490ce
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 17.509845733642578
    num_agent_steps_sampled: 12000
    num_steps_sampled: 12000
    num_steps_trained: 12000
  iterations_since_restore: 60
  node_ip: 192.168.0.102
  num_healthy_workers: 0
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 18.5
    gpu_util_percent0: 0.04
    ram_util_percent: 38.2
    vram_util_percent0: 0.054129647910496875
  pid: 12779
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.03389

Trial name,status,loc,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_25270_00000,RUNNING,192.168.0.102:12779,74,11.9721,14800,117.93,200,22,117.93


Result for PG_CartPole-v0_25270_00000:
  agent_timesteps_total: 17200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-45-51
  done: false
  episode_len_mean: 135.55
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 135.55
  episode_reward_min: 22.0
  episodes_this_iter: 1
  episodes_total: 212
  experiment_id: a3f8cb6e5c254863b66e662df36490ce
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 17.860557556152344
    num_agent_steps_sampled: 17200
    num_steps_sampled: 17200
    num_steps_trained: 17200
  iterations_since_restore: 86
  node_ip: 192.168.0.102
  num_healthy_workers: 0
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 18.1
    gpu_util_percent0: 0.31
    ram_util_percent: 38.1
    vram_util_percent0: 0.054129647910496875
  pid: 12779
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.034

Trial name,status,loc,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_25270_00000,TERMINATED,,97,16.1482,19400,150.07,200,22,150.07


2021-11-16 17:45:54,550	INFO tune.py:549 -- Total run time: 21.16 seconds (19.90 seconds for the tuning loop).


Although the analysis object returned by `ray.tune.run` above does not contain a Trainer instance, it has all the information needed to rebuild one from a saved checkpoint.

Ray Tune returns an [ExperimentAnalysis](https://docs.ray.io/en/latest/tune/api_docs/analysis.html?highlight=ExperimentAnalysis#experimentanalysis-tune-experimentanalysis) object, from which you can retrieve the best checkpoint of the training run.
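
Conceptually, picking the best trial amounts to comparing one metric across trials and taking the maximum or the minimum. The following plain-Python sketch illustrates the idea only; `pick_best` and the second trial entry are hypothetical, and this is not how `ExperimentAnalysis` is implemented internally.

```python
def pick_best(last_results, metric, mode):
    """Return the trial name whose last reported `metric` is best under `mode`."""
    if mode not in ("max", "min"):
        raise ValueError("mode must be 'max' or 'min'")
    choose = max if mode == "max" else min
    return choose(last_results, key=lambda name: last_results[name][metric])


# First entry uses the values reported for the run above;
# the second entry is made up purely for comparison.
last_results = {
    "PG_CartPole-v0_25270_00000": {
        "episode_reward_mean": 150.07,
        "training_iteration": 97,
    },
    "hypothetical_second_trial": {
        "episode_reward_mean": 120.0,
        "training_iteration": 110,
    },
}

best = pick_best(last_results, "episode_reward_mean", "max")
print(best)
```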

In [21]:
from ray.rllib.agents.pg import PGTrainer

# restore a Trainer
trial = analysis.get_best_logdir("episode_reward_mean", "max")
checkpoint = analysis.get_best_checkpoint(
    trial,
    "training_iteration",
    "max",
)
trainer = PGTrainer(config=config)
trainer.restore(checkpoint)

2021-11-16 17:45:55,270	INFO trainable.py:377 -- Restored on 192.168.0.102 from checkpoint: /home/bruno/Workspace/ceia-rl-curso/LAB_03/content/results/PG/PG_CartPole-v0_25270_00000_0_2021-11-16_17-45-34/checkpoint_000097/checkpoint-97
2021-11-16 17:45:55,271	INFO trainable.py:385 -- Current state after restoring: {'_iteration': 97, '_timesteps_total': None, '_time_total': 16.148224115371704, '_episodes_total': 223}


Now let's record another video, this time choosing the action recommended by the model trained with the Tune API.

In [None]:
after_training = os.path.join(
    DRIVE_PATH, "{}after_training_tune.mp4".format(environment_id)
)
after_video = VideoRecorder(env, after_training)
observation = env.reset()
done = False
while not done:
    env.render()
    after_video.capture_frame()
    action = trainer.compute_action(observation)
    observation, reward, done, info = env.step(action)
after_video.close()
env.close()
# You should get a video similar to the one below. 
html = render_mp4(after_training)
HTML(html)

Tune automatically generates [TensorBoard](https://www.tensorflow.org/tensorboard) files during `tune.run()`. To visualize the learning progress in TensorBoard, run the cell below:

In [None]:
if isColab:
    %tensorboard --logdir /content/gdrive/MyDrive/minicurso_rl/lab03/results/PG
else:
    %tensorboard --logdir ./content/results/PG

### Hyperparameter Tuning with Ray Tune

[Ray Tune](https://docs.ray.io/en/latest/tune/index.html) is a library for running experiments and tuning hyperparameters. Let's now try to find hyperparameters that solve the [CartPole](https://gym.openai.com/envs/CartPole-v1/) environment in the fewest timesteps. Be prepared for this to take a while to run.
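
As a rough mental model of how the search space in the cell below expands into trials: each `grid_search` entry contributes every one of its values, and `num_samples` repeats the whole grid, drawing a fresh value for sampled parameters such as `lr` each time. The sketch below imitates this in plain Python; `expand_trials` is a hypothetical helper, not Tune's implementation, and the order in which Tune schedules trials may differ.

```python
import itertools
import random


def expand_trials(grid, samplers, num_samples, seed=0):
    """Enumerate trial configs: the full grid, repeated `num_samples` times.

    Hypothetical helper for illustration only.
    grid:     dict mapping a parameter name to a list of values to try
    samplers: dict mapping a parameter name to a function rng -> value
    """
    rng = random.Random(seed)
    keys = list(grid)
    trials = []
    for _ in range(num_samples):
        # Cartesian product over all grid parameters
        for combo in itertools.product(*(grid[k] for k in keys)):
            trial = dict(zip(keys, combo))
            # Sampled parameters get a fresh draw for every trial
            for name, sample in samplers.items():
                trial[name] = sample(rng)
            trials.append(trial)
    return trials


grid = {
    "fcnet_hiddens": [[32], [64]],
    "fcnet_activation": ["linear", "relu"],
}
samplers = {"lr": lambda rng: rng.uniform(1e-7, 1e-2)}

trials = expand_trials(grid, samplers, num_samples=8)
print(len(trials))  # 2 * 2 * 8 = 32
```

With two grids of two values each and `num_samples=8`, the search produces 32 trials, which is why this run takes noticeably longer than the single-trial experiment above.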

In [24]:
parameter_search_config = {
    "env": environment_id,
    "framework": "torch",
    "num_gpus": 1,  # number of GPUs allocated to the trainer process (fractions are allowed)
    "num_workers": 7,  # number of workers besides the main process; on Colab use 1, since only 2 CPUs are available
#     "num_envs_per_worker": 2,
    # Hyperparameter tuning
    "model": {
      "fcnet_hiddens": ray.tune.grid_search([[32], [64]]),
      "fcnet_activation": ray.tune.grid_search(["linear", "relu"]),
    },
    "lr": ray.tune.uniform(1e-7, 1e-2)
}

# To explicitly stop or restart Ray, use the shutdown API.
ray.shutdown()

ray.init(
    num_cpus=8,
    include_dashboard=False,
    ignore_reinit_error=True,
    log_to_driver=False,
)

parameter_search_analysis = ray.tune.run(
    "PG",
    config=parameter_search_config,
    stop=stop,
    num_samples=8,
    metric="timesteps_total",
    mode="min",
)

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens
PG_CartPole-v0_385d8_00000,RUNNING,,0.00428141,linear,[32]
PG_CartPole-v0_385d8_00001,PENDING,,0.00357243,relu,[32]
PG_CartPole-v0_385d8_00002,PENDING,,0.00806013,linear,[64]
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64]
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32]
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32]
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64]
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64]
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32]
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32]


Result for PG_CartPole-v0_385d8_00000:
  agent_timesteps_total: 1400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-46-38
  done: false
  episode_len_mean: 19.723076923076924
  episode_media: {}
  episode_reward_max: 69.0
  episode_reward_mean: 19.723076923076924
  episode_reward_min: 9.0
  episodes_this_iter: 65
  episodes_total: 65
  experiment_id: f512bb73236f46f0af80a3d19dab2175
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 8.085700988769531
    num_agent_steps_sampled: 1400
    num_steps_sampled: 1400
    num_steps_trained: 1400
  iterations_since_restore: 1
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 33.4
    gpu_util_percent0: 0.01
    ram_util_percent: 59.0
    vram_util_percent0: 0.16617308325106944
  pid: 13931
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_proce

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00000,RUNNING,192.168.0.102:13931,0.00428141,linear,[32],1.0,0.449145,1400.0,19.7231,69.0,9.0,19.7231
PG_CartPole-v0_385d8_00001,PENDING,,0.00357243,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00002,PENDING,,0.00806013,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00000:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-46-44
  done: false
  episode_len_mean: 46.02
  episode_media: {}
  episode_reward_max: 107.0
  episode_reward_mean: 46.02
  episode_reward_min: 15.0
  episodes_this_iter: 31
  episodes_total: 635
  experiment_id: f512bb73236f46f0af80a3d19dab2175
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 12.6880521774292
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 90.5
    gpu_util_percent0: 0.12
    ram_util_percent: 59.2
    vram_util_percent0: 0.1686410003290556
  pid: 13931
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07294202

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00000,RUNNING,192.168.0.102:13931,0.00428141,linear,[32],15.0,5.29632,21000.0,46.02,107.0,15.0,46.02
PG_CartPole-v0_385d8_00001,PENDING,,0.00357243,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00002,PENDING,,0.00806013,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00000:
  agent_timesteps_total: 40600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-46-49
  done: false
  episode_len_mean: 54.33
  episode_media: {}
  episode_reward_max: 182.0
  episode_reward_mean: 54.33
  episode_reward_min: 19.0
  episodes_this_iter: 28
  episodes_total: 1022
  experiment_id: f512bb73236f46f0af80a3d19dab2175
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 12.191372871398926
    num_agent_steps_sampled: 40600
    num_steps_sampled: 40600
    num_steps_trained: 40600
  iterations_since_restore: 29
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 91.9
    gpu_util_percent0: 0.04
    ram_util_percent: 59.0
    vram_util_percent0: 0.1686410003290556
  pid: 13931
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07305

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00000,RUNNING,192.168.0.102:13931,0.00428141,linear,[32],29.0,10.2845,40600.0,54.33,182.0,19.0,54.33
PG_CartPole-v0_385d8_00001,PENDING,,0.00357243,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00002,PENDING,,0.00806013,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00000:
  agent_timesteps_total: 60200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-46-54
  done: false
  episode_len_mean: 69.32
  episode_media: {}
  episode_reward_max: 179.0
  episode_reward_mean: 69.32
  episode_reward_min: 23.0
  episodes_this_iter: 24
  episodes_total: 1335
  experiment_id: f512bb73236f46f0af80a3d19dab2175
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 13.868844985961914
    num_agent_steps_sampled: 60200
    num_steps_sampled: 60200
    num_steps_trained: 60200
  iterations_since_restore: 43
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 13931
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07373095336638234
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10520015107830613
    mean_inference_ms: 1.24159280882341

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00000,RUNNING,192.168.0.102:13931,0.00428141,linear,[32],43.0,15.1297,60200.0,69.32,179.0,23.0,69.32
PG_CartPole-v0_385d8_00001,PENDING,,0.00357243,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00002,PENDING,,0.00806013,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00000:
  agent_timesteps_total: 79800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-46-59
  done: false
  episode_len_mean: 96.21
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 96.21
  episode_reward_min: 33.0
  episodes_this_iter: 10
  episodes_total: 1566
  experiment_id: f512bb73236f46f0af80a3d19dab2175
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.132614135742188
    num_agent_steps_sampled: 79800
    num_steps_sampled: 79800
    num_steps_trained: 79800
  iterations_since_restore: 57
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 88.9
    gpu_util_percent0: 0.04
    ram_util_percent: 59.2
    vram_util_percent0: 0.16929911154985192
  pid: 13931
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0747

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00000,RUNNING,192.168.0.102:13931,0.00428141,linear,[32],57.0,20.2441,79800.0,96.21,200.0,33.0,96.21
PG_CartPole-v0_385d8_00001,PENDING,,0.00357243,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00002,PENDING,,0.00806013,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00000:
  agent_timesteps_total: 92400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-47-03
  done: true
  episode_len_mean: 153.74
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 153.74
  episode_reward_min: 33.0
  episodes_this_iter: 9
  episodes_total: 1639
  experiment_id: f512bb73236f46f0af80a3d19dab2175
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 23.051877975463867
    num_agent_steps_sampled: 92400
    num_steps_sampled: 92400
    num_steps_trained: 92400
  iterations_since_restore: 66
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 13931
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0749034213902397
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10681724785081778
    mean_inference_ms: 1.258724154480888

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00001,RUNNING,192.168.0.102:14490,0.00357243,relu,[32],1.0,0.405597,1400.0,22.7414,61.0,8.0,22.7414
PG_CartPole-v0_385d8_00002,PENDING,,0.00806013,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00001:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-47-27
  done: false
  episode_len_mean: 37.93
  episode_media: {}
  episode_reward_max: 126.0
  episode_reward_mean: 37.93
  episode_reward_min: 12.0
  episodes_this_iter: 30
  episodes_total: 714
  experiment_id: 6ce69d14b70a4d74b6394f48a81fd211
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 13.785178184509277
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 86.1
    gpu_util_percent0: 0.0
    ram_util_percent: 59.8
    vram_util_percent0: 0.16847647252385653
  pid: 14490
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.076757

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00001,RUNNING,192.168.0.102:14490,0.00357243,relu,[32],15.0,5.4331,21000.0,37.93,126.0,12.0,37.93
PG_CartPole-v0_385d8_00002,PENDING,,0.00806013,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00001:
  agent_timesteps_total: 40600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-47-32
  done: false
  episode_len_mean: 65.81
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 65.81
  episode_reward_min: 14.0
  episodes_this_iter: 23
  episodes_total: 1107
  experiment_id: 6ce69d14b70a4d74b6394f48a81fd211
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 15.868468284606934
    num_agent_steps_sampled: 40600
    num_steps_sampled: 40600
    num_steps_trained: 40600
  iterations_since_restore: 29
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 14490
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07369599877342851
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10517495514956757
    mean_inference_ms: 1.25664587255482

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00001,RUNNING,192.168.0.102:14490,0.00357243,relu,[32],29.0,10.1622,40600.0,65.81,200.0,14.0,65.81
PG_CartPole-v0_385d8_00002,PENDING,,0.00806013,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00001:
  agent_timesteps_total: 60200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-47-38
  done: false
  episode_len_mean: 105.53
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 105.53
  episode_reward_min: 25.0
  episodes_this_iter: 12
  episodes_total: 1329
  experiment_id: 6ce69d14b70a4d74b6394f48a81fd211
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 23.084613800048828
    num_agent_steps_sampled: 60200
    num_steps_sampled: 60200
    num_steps_trained: 60200
  iterations_since_restore: 43
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 14490
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07326129032055059
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10446497739475075
    mean_inference_ms: 1.253672840533

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00001,RUNNING,192.168.0.102:14490,0.00357243,relu,[32],43.0,15.0959,60200.0,105.53,200.0,25.0,105.53
PG_CartPole-v0_385d8_00002,PENDING,,0.00806013,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00001:
  agent_timesteps_total: 78400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-47-43
  done: false
  episode_len_mean: 147.49
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 147.49
  episode_reward_min: 29.0
  episodes_this_iter: 10
  episodes_total: 1457
  experiment_id: 6ce69d14b70a4d74b6394f48a81fd211
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.34724235534668
    num_agent_steps_sampled: 78400
    num_steps_sampled: 78400
    num_steps_trained: 78400
  iterations_since_restore: 56
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 14490
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07302948638650258
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10431578357551195
    mean_inference_ms: 1.2564058836214

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00001,RUNNING,192.168.0.102:14490,0.00357243,relu,[32],56.0,19.6244,78400.0,147.49,200.0,29.0,147.49
PG_CartPole-v0_385d8_00002,PENDING,,0.00806013,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00001:
  agent_timesteps_total: 85400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-47-44
  done: true
  episode_len_mean: 150.49
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 150.49
  episode_reward_min: 29.0
  episodes_this_iter: 8
  episodes_total: 1503
  experiment_id: 6ce69d14b70a4d74b6394f48a81fd211
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 21.796430587768555
    num_agent_steps_sampled: 85400
    num_steps_sampled: 85400
    num_steps_trained: 85400
  iterations_since_restore: 61
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 89.8
    gpu_util_percent0: 0.01
    ram_util_percent: 59.8
    vram_util_percent0: 0.16633761105626851
  pid: 14490
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0732

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00002,RUNNING,192.168.0.102:15040,0.00806013,linear,[64],1.0,0.343228,1400.0,21.6393,53.0,9.0,21.6393
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00002:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-48-01
  done: false
  episode_len_mean: 86.74
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 86.74
  episode_reward_min: 32.0
  episodes_this_iter: 12
  episodes_total: 400
  experiment_id: 0e4699ca957446d29e07a253ec79662a
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 20.037124633789062
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 15040
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07302529806322931
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10539950387813352
    mean_inference_ms: 1.213355969861079

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00002,RUNNING,192.168.0.102:15040,0.00806013,linear,[64],16.0,5.43687,22400.0,95.95,200.0,32.0,95.95
PG_CartPole-v0_385d8_00003,PENDING,,0.000611846,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00002:
  agent_timesteps_total: 32200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-48-04
  done: true
  episode_len_mean: 154.94
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 154.94
  episode_reward_min: 46.0
  episodes_this_iter: 10
  episodes_total: 462
  experiment_id: 0e4699ca957446d29e07a253ec79662a
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 21.686317443847656
    num_agent_steps_sampled: 32200
    num_steps_sampled: 32200
    num_steps_trained: 32200
  iterations_since_restore: 23
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 15040
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07291688538182636
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10417743393591573
    mean_inference_ms: 1.21387723671413

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00003,RUNNING,192.168.0.102:15426,0.000611846,relu,[64],1.0,0.362499,1400.0,20.2031,50.0,8.0,20.2031
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00003:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-48-21
  done: false
  episode_len_mean: 23.65
  episode_media: {}
  episode_reward_max: 61.0
  episode_reward_mean: 23.65
  episode_reward_min: 9.0
  episodes_this_iter: 61
  episodes_total: 888
  experiment_id: b05697c255a940d3960f41cd01f222ed
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 8.636327743530273
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 15426
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07283855086129708
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10494682137964646
    mean_inference_ms: 1.2418653787944935
 

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00003,RUNNING,192.168.0.102:15426,0.000611846,relu,[64],15.0,5.27828,21000.0,23.65,61.0,9.0,23.65
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,


Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00003,RUNNING,192.168.0.102:15426,0.000611846,relu,[64],28.0,9.95703,39200.0,31.93,124.0,9.0,31.93
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00003:
  agent_timesteps_total: 40600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-48-26
  done: false
  episode_len_mean: 32.3
  episode_media: {}
  episode_reward_max: 124.0
  episode_reward_mean: 32.3
  episode_reward_min: 9.0
  episodes_this_iter: 47
  episodes_total: 1613
  experiment_id: b05697c255a940d3960f41cd01f222ed
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 10.327242851257324
    num_agent_steps_sampled: 40600
    num_steps_sampled: 40600
    num_steps_trained: 40600
  iterations_since_restore: 29
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 15426
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07502150253463702
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.1049223684331449
    mean_inference_ms: 1.2620045347758526
 

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00003,RUNNING,192.168.0.102:15426,0.000611846,relu,[64],42.0,14.7518,58800.0,34.28,138.0,10.0,34.28
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00003:
  agent_timesteps_total: 60200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-48-31
  done: false
  episode_len_mean: 36.21
  episode_media: {}
  episode_reward_max: 138.0
  episode_reward_mean: 36.21
  episode_reward_min: 10.0
  episodes_this_iter: 34
  episodes_total: 2235
  experiment_id: b05697c255a940d3960f41cd01f222ed
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 13.013325691223145
    num_agent_steps_sampled: 60200
    num_steps_sampled: 60200
    num_steps_trained: 60200
  iterations_since_restore: 43
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 87.8
    gpu_util_percent0: 0.0
    ram_util_percent: 59.8
    vram_util_percent0: 0.16354063836788418
  pid: 15426
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07414

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00003,RUNNING,192.168.0.102:15426,0.000611846,relu,[64],55.0,19.5836,77000.0,42.34,166.0,12.0,42.34
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00003:
  agent_timesteps_total: 78400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-48-36
  done: false
  episode_len_mean: 39.92
  episode_media: {}
  episode_reward_max: 133.0
  episode_reward_mean: 39.92
  episode_reward_min: 12.0
  episodes_this_iter: 37
  episodes_total: 2705
  experiment_id: b05697c255a940d3960f41cd01f222ed
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 11.814464569091797
    num_agent_steps_sampled: 78400
    num_steps_sampled: 78400
    num_steps_trained: 78400
  iterations_since_restore: 56
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 15426
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07495742132941864
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10544273798613256
    mean_inference_ms: 1.26724245924541

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00003,RUNNING,192.168.0.102:15426,0.000611846,relu,[64],69.0,24.615,96600.0,46.05,171.0,10.0,46.05
PG_CartPole-v0_385d8_00004,PENDING,,0.00379117,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00003:
  agent_timesteps_total: 98000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-48-41
  done: false
  episode_len_mean: 48.12
  episode_media: {}
  episode_reward_max: 171.0
  episode_reward_mean: 48.12
  episode_reward_min: 10.0
  episodes_this_iter: 28
  episodes_total: 3156
  experiment_id: b05697c255a940d3960f41cd01f222ed
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 14.668052673339844
    num_agent_steps_sampled: 98000
    num_steps_sampled: 98000
    num_steps_trained: 98000
  iterations_since_restore: 70
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 15426
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07482226618298643
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10547132736165941
    mean_inference_ms: 1.27066499408008

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00004,RUNNING,192.168.0.102:15876,0.00379117,linear,[32],1.0,0.379255,1400.0,21.4237,54.0,10.0,21.4237
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00004:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-48-59
  done: false
  episode_len_mean: 44.2
  episode_media: {}
  episode_reward_max: 131.0
  episode_reward_mean: 44.2
  episode_reward_min: 17.0
  episodes_this_iter: 33
  episodes_total: 645
  experiment_id: 91de2d9992644c3d8e84ef5ad1804d98
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 12.70740795135498
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 15876
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07313490498092666
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10664067849521658
    mean_inference_ms: 1.2275225889378354

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00004,RUNNING,192.168.0.102:15876,0.00379117,linear,[32],16.0,5.51288,22400.0,43.82,121.0,17.0,43.82
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00004:
  agent_timesteps_total: 40600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-49-04
  done: false
  episode_len_mean: 64.07
  episode_media: {}
  episode_reward_max: 161.0
  episode_reward_mean: 64.07
  episode_reward_min: 25.0
  episodes_this_iter: 19
  episodes_total: 997
  experiment_id: 91de2d9992644c3d8e84ef5ad1804d98
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 15.82958698272705
    num_agent_steps_sampled: 40600
    num_steps_sampled: 40600
    num_steps_trained: 40600
  iterations_since_restore: 29
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 92.5
    gpu_util_percent0: 0.0
    ram_util_percent: 60.2
    vram_util_percent0: 0.16518591641987496
  pid: 15876
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0726923

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00004,RUNNING,192.168.0.102:15876,0.00379117,linear,[32],30.0,10.378,42000.0,67.56,161.0,25.0,67.56
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00004:
  agent_timesteps_total: 56000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-49-09
  done: false
  episode_len_mean: 100.88
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 100.88
  episode_reward_min: 31.0
  episodes_this_iter: 10
  episodes_total: 1161
  experiment_id: 91de2d9992644c3d8e84ef5ad1804d98
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 21.73491859436035
    num_agent_steps_sampled: 56000
    num_steps_sampled: 56000
    num_steps_trained: 56000
  iterations_since_restore: 40
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 91.4
    gpu_util_percent0: 0.0
    ram_util_percent: 59.8
    vram_util_percent0: 0.16354063836788418
  pid: 15876
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0746

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00004,RUNNING,192.168.0.102:15876,0.00379117,linear,[32],42.0,15.523,58800.0,114.54,200.0,35.0,114.54
PG_CartPole-v0_385d8_00005,PENDING,,0.00852355,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00004:
  agent_timesteps_total: 72800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-49-14
  done: true
  episode_len_mean: 151.24
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 151.24
  episode_reward_min: 30.0
  episodes_this_iter: 8
  episodes_total: 1273
  experiment_id: 91de2d9992644c3d8e84ef5ad1804d98
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 25.42530632019043
    num_agent_steps_sampled: 72800
    num_steps_sampled: 72800
    num_steps_trained: 72800
  iterations_since_restore: 52
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 87.9
    gpu_util_percent0: 0.0
    ram_util_percent: 59.8
    vram_util_percent0: 0.16354063836788418
  pid: 15876
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.076285

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00005,RUNNING,192.168.0.102:16253,0.00852355,relu,[32],1.0,0.355833,1400.0,18.6849,37.0,8.0,18.6849
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00005:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-49-31
  done: false
  episode_len_mean: 55.21
  episode_media: {}
  episode_reward_max: 130.0
  episode_reward_mean: 55.21
  episode_reward_min: 14.0
  episodes_this_iter: 25
  episodes_total: 582
  experiment_id: 7843fe17775641d9862862e33c95f26d
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 14.236824989318848
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 84.2
    gpu_util_percent0: 0.01
    ram_util_percent: 59.7
    vram_util_percent0: 0.16354063836788418
  pid: 16253
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07346

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00005,RUNNING,192.168.0.102:16253,0.00852355,relu,[32],15.0,5.31602,21000.0,55.21,130.0,14.0,55.21
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00005:
  agent_timesteps_total: 40600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-49-36
  done: false
  episode_len_mean: 117.49
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 117.49
  episode_reward_min: 12.0
  episodes_this_iter: 7
  episodes_total: 784
  experiment_id: 7843fe17775641d9862862e33c95f26d
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 23.983518600463867
    num_agent_steps_sampled: 40600
    num_steps_sampled: 40600
    num_steps_trained: 40600
  iterations_since_restore: 29
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 16253
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07369947403762199
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.1054758069922658
    mean_inference_ms: 1.266211047313409

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00005,RUNNING,192.168.0.102:16253,0.00852355,relu,[32],29.0,10.1007,40600.0,117.49,200.0,12.0,117.49
PG_CartPole-v0_385d8_00006,PENDING,,0.00338773,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00005:
  agent_timesteps_total: 50400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-49-39
  done: true
  episode_len_mean: 154.08
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 154.08
  episode_reward_min: 71.0
  episodes_this_iter: 7
  episodes_total: 845
  experiment_id: 7843fe17775641d9862862e33c95f26d
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.678874969482422
    num_agent_steps_sampled: 50400
    num_steps_sampled: 50400
    num_steps_trained: 50400
  iterations_since_restore: 36
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 87.5
    gpu_util_percent0: 0.0
    ram_util_percent: 59.6
    vram_util_percent0: 0.16354063836788418
  pid: 16253
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.073783

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00006,RUNNING,192.168.0.102:16601,0.00338773,linear,[64],1.0,0.330151,1400.0,21.0833,52.0,10.0,21.0833
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00006:
  agent_timesteps_total: 22400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-49-56
  done: false
  episode_len_mean: 62.19
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 62.19
  episode_reward_min: 17.0
  episodes_this_iter: 22
  episodes_total: 520
  experiment_id: 9f4afc47eb2342828cc82aac1153fde7
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 14.589232444763184
    num_agent_steps_sampled: 22400
    num_steps_sampled: 22400
    num_steps_trained: 22400
  iterations_since_restore: 16
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 90.3
    gpu_util_percent0: 0.0
    ram_util_percent: 59.4
    vram_util_percent0: 0.16354063836788418
  pid: 16601
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.072822

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00006,RUNNING,192.168.0.102:16601,0.00338773,linear,[64],16.0,5.43561,22400.0,62.19,200.0,17.0,62.19
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00006:
  agent_timesteps_total: 42000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-50-01
  done: false
  episode_len_mean: 119.54
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 119.54
  episode_reward_min: 28.0
  episodes_this_iter: 9
  episodes_total: 716
  experiment_id: 9f4afc47eb2342828cc82aac1153fde7
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 21.400957107543945
    num_agent_steps_sampled: 42000
    num_steps_sampled: 42000
    num_steps_trained: 42000
  iterations_since_restore: 30
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 16601
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07364985697542986
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10565809043132895
    mean_inference_ms: 1.23745147791147

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00006,RUNNING,192.168.0.102:16601,0.00338773,linear,[64],30.0,10.1777,42000.0,119.54,200.0,28.0,119.54
PG_CartPole-v0_385d8_00007,PENDING,,0.0079584,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00006:
  agent_timesteps_total: 49000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-50-03
  done: true
  episode_len_mean: 155.01
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 155.01
  episode_reward_min: 28.0
  episodes_this_iter: 8
  episodes_total: 755
  experiment_id: 9f4afc47eb2342828cc82aac1153fde7
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 23.627647399902344
    num_agent_steps_sampled: 49000
    num_steps_sampled: 49000
    num_steps_trained: 49000
  iterations_since_restore: 35
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 16601
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07363171173865397
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.1054296943219707
    mean_inference_ms: 1.2369209432342205

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00007,RUNNING,192.168.0.102:16914,0.0079584,relu,[64],1.0,0.334615,1400.0,21.5161,93.0,9.0,21.5161
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00007:
  agent_timesteps_total: 19600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-50-19
  done: false
  episode_len_mean: 68.58
  episode_media: {}
  episode_reward_max: 159.0
  episode_reward_mean: 68.58
  episode_reward_min: 20.0
  episodes_this_iter: 16
  episodes_total: 439
  experiment_id: a8179066f1024786af6d20df64548dc6
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 18.73122787475586
    num_agent_steps_sampled: 19600
    num_steps_sampled: 19600
    num_steps_trained: 19600
  iterations_since_restore: 14
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 87.2
    gpu_util_percent0: 0.0
    ram_util_percent: 59.6
    vram_util_percent0: 0.16370516617308326
  pid: 16914
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0753513

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00007,RUNNING,192.168.0.102:16914,0.0079584,relu,[64],15.0,5.42402,21000.0,76.26,193.0,20.0,76.26
PG_CartPole-v0_385d8_00008,PENDING,,0.00893173,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00007:
  agent_timesteps_total: 36400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-50-24
  done: true
  episode_len_mean: 151.55
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 151.55
  episode_reward_min: 27.0
  episodes_this_iter: 7
  episodes_total: 551
  experiment_id: a8179066f1024786af6d20df64548dc6
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 25.03240966796875
    num_agent_steps_sampled: 36400
    num_steps_sampled: 36400
    num_steps_trained: 36400
  iterations_since_restore: 26
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 85.3
    gpu_util_percent0: 0.0
    ram_util_percent: 59.6
    vram_util_percent0: 0.16370516617308326
  pid: 16914
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0750830

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00008,RUNNING,192.168.0.102:17332,0.00893173,linear,[32],1.0,0.342931,1400.0,22.9825,47.0,9.0,22.9825
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00008:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-50-41
  done: false
  episode_len_mean: 75.94
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 75.94
  episode_reward_min: 22.0
  episodes_this_iter: 15
  episodes_total: 469
  experiment_id: dfbbdd8a3a1d41e5a85b875b9a05d5c7
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 19.328386306762695
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 17332
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0731777102154947
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.1042316793161485
    mean_inference_ms: 1.2388927476979765

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00008,RUNNING,192.168.0.102:17332,0.00893173,linear,[32],16.0,5.47493,22400.0,80.21,200.0,22.0,80.21
PG_CartPole-v0_385d8_00009,PENDING,,0.00666936,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00008:
  agent_timesteps_total: 40600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-50-46
  done: true
  episode_len_mean: 150.78
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 150.78
  episode_reward_min: 48.0
  episodes_this_iter: 9
  episodes_total: 605
  experiment_id: dfbbdd8a3a1d41e5a85b875b9a05d5c7
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 21.100099563598633
    num_agent_steps_sampled: 40600
    num_steps_sampled: 40600
    num_steps_trained: 40600
  iterations_since_restore: 29
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 87.0
    gpu_util_percent0: 0.0
    ram_util_percent: 59.4
    vram_util_percent0: 0.16370516617308326
  pid: 17332
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.072951

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00009,RUNNING,192.168.0.102:17697,0.00666936,relu,[32],1.0,0.338014,1400.0,21.7759,63.0,9.0,21.7759
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00009:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-51-03
  done: false
  episode_len_mean: 52.23
  episode_media: {}
  episode_reward_max: 171.0
  episode_reward_mean: 52.23
  episode_reward_min: 13.0
  episodes_this_iter: 21
  episodes_total: 584
  experiment_id: 748bbd22c0294b22b63ef19de1c17a60
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 16.984968185424805
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 17697
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0741629062627979
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10501228155214211
    mean_inference_ms: 1.2563813086281272

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00009,RUNNING,192.168.0.102:17697,0.00666936,relu,[32],15.0,5.21495,21000.0,52.23,171.0,13.0,52.23
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00009:
  agent_timesteps_total: 40600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-51-08
  done: false
  episode_len_mean: 79.04
  episode_media: {}
  episode_reward_max: 199.0
  episode_reward_mean: 79.04
  episode_reward_min: 25.0
  episodes_this_iter: 15
  episodes_total: 882
  experiment_id: 748bbd22c0294b22b63ef19de1c17a60
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 19.047815322875977
    num_agent_steps_sampled: 40600
    num_steps_sampled: 40600
    num_steps_trained: 40600
  iterations_since_restore: 29
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 17697
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07364733902102989
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10441657001394739
    mean_inference_ms: 1.255433680688413

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00009,RUNNING,192.168.0.102:17697,0.00666936,relu,[32],29.0,10.1138,40600.0,79.04,199.0,25.0,79.04
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00009:
  agent_timesteps_total: 60200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-51-13
  done: false
  episode_len_mean: 140.31
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 140.31
  episode_reward_min: 26.0
  episodes_this_iter: 9
  episodes_total: 1030
  experiment_id: 748bbd22c0294b22b63ef19de1c17a60
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.79286003112793
    num_agent_steps_sampled: 60200
    num_steps_sampled: 60200
    num_steps_trained: 60200
  iterations_since_restore: 43
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 17697
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07377267417707066
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.1044332643221772
    mean_inference_ms: 1.260685038158001

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00009,RUNNING,192.168.0.102:17697,0.00666936,relu,[32],43.0,15.1194,60200.0,140.31,200.0,26.0,140.31
PG_CartPole-v0_385d8_00010,PENDING,,0.00550817,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00009:
  agent_timesteps_total: 64400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-51-14
  done: true
  episode_len_mean: 150.91
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 150.91
  episode_reward_min: 26.0
  episodes_this_iter: 7
  episodes_total: 1053
  experiment_id: 748bbd22c0294b22b63ef19de1c17a60
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.882028579711914
    num_agent_steps_sampled: 64400
    num_steps_sampled: 64400
    num_steps_trained: 64400
  iterations_since_restore: 46
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 86.0
    gpu_util_percent0: 0.01
    ram_util_percent: 58.9
    vram_util_percent0: 0.16370516617308326
  pid: 17697
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0738

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00010,RUNNING,192.168.0.102:18054,0.00550817,linear,[64],1.0,0.398497,1400.0,23.6909,74.0,10.0,23.6909
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00010:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-51-32
  done: false
  episode_len_mean: 83.38
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 83.38
  episode_reward_min: 28.0
  episodes_this_iter: 11
  episodes_total: 396
  experiment_id: cc05c9d227564218a04a92ac61b2d436
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 20.882659912109375
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 88.9
    gpu_util_percent0: 0.0
    ram_util_percent: 58.9
    vram_util_percent0: 0.16683119447186576
  pid: 18054
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.078802

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00010,RUNNING,192.168.0.102:18054,0.00550817,linear,[64],15.0,5.36514,21000.0,83.38,200.0,28.0,83.38
PG_CartPole-v0_385d8_00011,PENDING,,0.00128099,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00010:
  agent_timesteps_total: 39200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-51-37
  done: true
  episode_len_mean: 152.74
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 152.74
  episode_reward_min: 53.0
  episodes_this_iter: 7
  episodes_total: 522
  experiment_id: cc05c9d227564218a04a92ac61b2d436
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.103899002075195
    num_agent_steps_sampled: 39200
    num_steps_sampled: 39200
    num_steps_trained: 39200
  iterations_since_restore: 28
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 87.6
    gpu_util_percent0: 0.01
    ram_util_percent: 59.0
    vram_util_percent0: 0.1641987495886805
  pid: 18054
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.076818

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00011,RUNNING,192.168.0.102:18368,0.00128099,relu,[64],1.0,0.37769,1400.0,21.371,58.0,9.0,21.371
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00011:
  agent_timesteps_total: 19600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-51-54
  done: false
  episode_len_mean: 30.3
  episode_media: {}
  episode_reward_max: 132.0
  episode_reward_mean: 30.3
  episode_reward_min: 9.0
  episodes_this_iter: 46
  episodes_total: 781
  experiment_id: 23c79ce7e6af4291b3460e5614795d80
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 12.217696189880371
    num_agent_steps_sampled: 19600
    num_steps_sampled: 19600
    num_steps_trained: 19600
  iterations_since_restore: 14
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 85.5
    gpu_util_percent0: 0.0
    ram_util_percent: 58.9
    vram_util_percent0: 0.1641987495886805
  pid: 18368
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0735641124

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00011,RUNNING,192.168.0.102:18368,0.00128099,relu,[64],15.0,5.37245,21000.0,29.56,132.0,10.0,29.56
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00011:
  agent_timesteps_total: 37800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-52-00
  done: false
  episode_len_mean: 38.52
  episode_media: {}
  episode_reward_max: 100.0
  episode_reward_mean: 38.52
  episode_reward_min: 11.0
  episodes_this_iter: 32
  episodes_total: 1298
  experiment_id: 23c79ce7e6af4291b3460e5614795d80
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 14.408429145812988
    num_agent_steps_sampled: 37800
    num_steps_sampled: 37800
    num_steps_trained: 37800
  iterations_since_restore: 27
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 18368
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07539007137543466
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10849693774327063
    mean_inference_ms: 1.28965936652228

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00011,RUNNING,192.168.0.102:18368,0.00128099,relu,[64],28.0,10.3516,39200.0,39.72,200.0,12.0,39.72
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00011:
  agent_timesteps_total: 51800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-52-05
  done: false
  episode_len_mean: 49.59
  episode_media: {}
  episode_reward_max: 121.0
  episode_reward_mean: 49.59
  episode_reward_min: 12.0
  episodes_this_iter: 25
  episodes_total: 1614
  experiment_id: 23c79ce7e6af4291b3460e5614795d80
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 16.882808685302734
    num_agent_steps_sampled: 51800
    num_steps_sampled: 51800
    num_steps_trained: 51800
  iterations_since_restore: 37
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 18368
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07979251005865218
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11385377092624015
    mean_inference_ms: 1.38574850875994

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00011,RUNNING,192.168.0.102:18368,0.00128099,relu,[64],38.0,15.2779,53200.0,51.96,191.0,11.0,51.96
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00011:
  agent_timesteps_total: 67200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-52-10
  done: false
  episode_len_mean: 63.36
  episode_media: {}
  episode_reward_max: 192.0
  episode_reward_mean: 63.36
  episode_reward_min: 17.0
  episodes_this_iter: 24
  episodes_total: 1876
  experiment_id: 23c79ce7e6af4291b3460e5614795d80
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 15.564369201660156
    num_agent_steps_sampled: 67200
    num_steps_sampled: 67200
    num_steps_trained: 67200
  iterations_since_restore: 48
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 86.6
    gpu_util_percent0: 0.03
    ram_util_percent: 59.9
    vram_util_percent0: 0.1564659427443238
  pid: 18368
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.08066

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00011,RUNNING,192.168.0.102:18368,0.00128099,relu,[64],49.0,20.3156,68600.0,63.06,192.0,17.0,63.06
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00011:
  agent_timesteps_total: 81200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-52-15
  done: false
  episode_len_mean: 69.98
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 69.98
  episode_reward_min: 12.0
  episodes_this_iter: 19
  episodes_total: 2084
  experiment_id: 23c79ce7e6af4291b3460e5614795d80
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 17.949810028076172
    num_agent_steps_sampled: 81200
    num_steps_sampled: 81200
    num_steps_trained: 81200
  iterations_since_restore: 58
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 18368
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.08206277772034412
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11821674294263303
    mean_inference_ms: 1.47624375985364

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00011,RUNNING,192.168.0.102:18368,0.00128099,relu,[64],59.0,25.4203,82600.0,69.64,200.0,12.0,69.64
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00011:
  agent_timesteps_total: 98000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-52-21
  done: false
  episode_len_mean: 86.87
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 86.87
  episode_reward_min: 15.0
  episodes_this_iter: 14
  episodes_total: 2289
  experiment_id: 23c79ce7e6af4291b3460e5614795d80
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.420881271362305
    num_agent_steps_sampled: 98000
    num_steps_sampled: 98000
    num_steps_trained: 98000
  iterations_since_restore: 70
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 18368
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.08256059835884161
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11919282175701765
    mean_inference_ms: 1.48499499794935

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00011,RUNNING,192.168.0.102:18368,0.00128099,relu,[64],71.0,30.3343,99400.0,88.22,200.0,15.0,88.22
PG_CartPole-v0_385d8_00012,PENDING,,0.00378441,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00011:
  agent_timesteps_total: 100800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-52-21
  done: true
  episode_len_mean: 89.21
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 89.21
  episode_reward_min: 15.0
  episodes_this_iter: 15
  episodes_total: 2318
  experiment_id: 23c79ce7e6af4291b3460e5614795d80
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 19.511560440063477
    num_agent_steps_sampled: 100800
    num_steps_sampled: 100800
    num_steps_trained: 100800
  iterations_since_restore: 72
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 18368
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.08259194196031779
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11928794062695713
    mean_inference_ms: 1.48454779963

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00012,RUNNING,192.168.0.102:18981,0.00378441,linear,[32],1.0,0.333069,1400.0,21.4828,61.0,9.0,21.4828
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00012:
  agent_timesteps_total: 22400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-52-39
  done: false
  episode_len_mean: 45.49
  episode_media: {}
  episode_reward_max: 119.0
  episode_reward_mean: 45.49
  episode_reward_min: 16.0
  episodes_this_iter: 26
  episodes_total: 629
  experiment_id: 8c951e3e883642c0b1432a75f69da07c
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 15.000411033630371
    num_agent_steps_sampled: 22400
    num_steps_sampled: 22400
    num_steps_trained: 22400
  iterations_since_restore: 16
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 18981
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07094639341797042
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.1011527831348922
    mean_inference_ms: 1.184664061408847

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00012,RUNNING,192.168.0.102:18981,0.00378441,linear,[32],16.0,5.22636,22400.0,45.49,119.0,16.0,45.49
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00012:
  agent_timesteps_total: 44800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-52-44
  done: false
  episode_len_mean: 74.72
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 74.72
  episode_reward_min: 25.0
  episodes_this_iter: 21
  episodes_total: 990
  experiment_id: 8c951e3e883642c0b1432a75f69da07c
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 15.687926292419434
    num_agent_steps_sampled: 44800
    num_steps_sampled: 44800
    num_steps_trained: 44800
  iterations_since_restore: 32
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 18981
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.06966600496379786
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.09907365804554102
    mean_inference_ms: 1.165640844500250

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00012,RUNNING,192.168.0.102:18981,0.00378441,linear,[32],32.0,10.236,44800.0,74.72,200.0,25.0,74.72
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00012:
  agent_timesteps_total: 63000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-52-49
  done: false
  episode_len_mean: 126.24
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 126.24
  episode_reward_min: 27.0
  episodes_this_iter: 10
  episodes_total: 1145
  experiment_id: 8c951e3e883642c0b1432a75f69da07c
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 26.548398971557617
    num_agent_steps_sampled: 63000
    num_steps_sampled: 63000
    num_steps_trained: 63000
  iterations_since_restore: 45
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 71.3
    gpu_util_percent0: 0.05
    ram_util_percent: 59.7
    vram_util_percent0: 0.16732477788746297
  pid: 18981
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00012,RUNNING,192.168.0.102:18981,0.00378441,linear,[32],45.0,14.9469,63000.0,126.24,200.0,27.0,126.24
PG_CartPole-v0_385d8_00013,PENDING,,0.008528,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00012:
  agent_timesteps_total: 75600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-52-52
  done: true
  episode_len_mean: 150.72
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 150.72
  episode_reward_min: 21.0
  episodes_this_iter: 8
  episodes_total: 1226
  experiment_id: 8c951e3e883642c0b1432a75f69da07c
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.49219512939453
    num_agent_steps_sampled: 75600
    num_steps_sampled: 75600
    num_steps_trained: 75600
  iterations_since_restore: 54
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 84.9
    gpu_util_percent0: 0.0
    ram_util_percent: 59.8
    vram_util_percent0: 0.16732477788746297
  pid: 18981
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.071141

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00013,RUNNING,192.168.0.102:19428,0.008528,relu,[32],1.0,0.33371,1400.0,21.2581,39.0,8.0,21.2581
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00013:
  agent_timesteps_total: 22400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-53-09
  done: false
  episode_len_mean: 53.02
  episode_media: {}
  episode_reward_max: 120.0
  episode_reward_mean: 53.02
  episode_reward_min: 12.0
  episodes_this_iter: 24
  episodes_total: 639
  experiment_id: 9ec94662e5b3441f93b46868457eb77e
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 16.49169921875
    num_agent_steps_sampled: 22400
    num_steps_sampled: 22400
    num_steps_trained: 22400
  iterations_since_restore: 16
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 19428
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.06895713261156958
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.1000876331787314
    mean_inference_ms: 1.178144452363074

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00013,RUNNING,192.168.0.102:19428,0.008528,relu,[32],16.0,5.20706,22400.0,53.02,120.0,12.0,53.02
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00013:
  agent_timesteps_total: 43400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-53-14
  done: false
  episode_len_mean: 94.18
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 94.18
  episode_reward_min: 29.0
  episodes_this_iter: 13
  episodes_total: 918
  experiment_id: 9ec94662e5b3441f93b46868457eb77e
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 20.303022384643555
    num_agent_steps_sampled: 43400
    num_steps_sampled: 43400
    num_steps_trained: 43400
  iterations_since_restore: 31
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 19428
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0698500541600227
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10112152495572743
    mean_inference_ms: 1.190753485300298


Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00013,RUNNING,192.168.0.102:19428,0.008528,relu,[32],31.0,10.1269,43400.0,94.18,200.0,29.0,94.18
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00013:
  agent_timesteps_total: 64400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-53-19
  done: true
  episode_len_mean: 150.65
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 150.65
  episode_reward_min: 40.0
  episodes_this_iter: 9
  episodes_total: 1065
  experiment_id: 9ec94662e5b3441f93b46868457eb77e
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 21.392118453979492
    num_agent_steps_sampled: 64400
    num_steps_sampled: 64400
    num_steps_trained: 64400
  iterations_since_restore: 46
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 82.9
    gpu_util_percent0: 0.03
    ram_util_percent: 59.7
    vram_util_percent0: 0.16666666666666666
  pid: 19428
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0701

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00014,PENDING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,


Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00014,RUNNING,,0.00152422,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00014:
  agent_timesteps_total: 1400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-53-31
  done: false
  episode_len_mean: 25.653846153846153
  episode_media: {}
  episode_reward_max: 75.0
  episode_reward_mean: 25.653846153846153
  episode_reward_min: 11.0
  episodes_this_iter: 52
  episodes_total: 52
  experiment_id: f72577a192e7476ca54bc6ffc8d888a3
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 10.532560348510742
    num_agent_steps_sampled: 1400
    num_steps_sampled: 1400
    num_steps_trained: 1400
  iterations_since_restore: 1
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 60.4
    gpu_util_percent0: 0.07
    ram_util_percent: 59.6
    vram_util_percent0: 0.16535044422507403
  pid: 19784
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_pro

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00014,RUNNING,192.168.0.102:19784,0.00152422,linear,[64],11.0,3.65516,15400.0,33.47,104.0,12.0,33.47
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00014:
  agent_timesteps_total: 22400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-53-36
  done: false
  episode_len_mean: 40.13
  episode_media: {}
  episode_reward_max: 93.0
  episode_reward_mean: 40.13
  episode_reward_min: 13.0
  episodes_this_iter: 31
  episodes_total: 711
  experiment_id: f72577a192e7476ca54bc6ffc8d888a3
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 13.556086540222168
    num_agent_steps_sampled: 22400
    num_steps_sampled: 22400
    num_steps_trained: 22400
  iterations_since_restore: 16
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 87.9
    gpu_util_percent0: 0.07
    ram_util_percent: 59.6
    vram_util_percent0: 0.16288252714708787
  pid: 19784
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.072803

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00014,RUNNING,192.168.0.102:19784,0.00152422,linear,[64],26.0,8.59971,36400.0,54.65,147.0,15.0,54.65
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00014:
  agent_timesteps_total: 42000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-53-41
  done: false
  episode_len_mean: 52.75
  episode_media: {}
  episode_reward_max: 130.0
  episode_reward_mean: 52.75
  episode_reward_min: 16.0
  episodes_this_iter: 19
  episodes_total: 1093
  experiment_id: f72577a192e7476ca54bc6ffc8d888a3
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 17.59491539001465
    num_agent_steps_sampled: 42000
    num_steps_sampled: 42000
    num_steps_trained: 42000
  iterations_since_restore: 30
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 19784
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0738613726283726
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10244471240495862
    mean_inference_ms: 1.2207038267468673

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00014,RUNNING,192.168.0.102:19784,0.00152422,linear,[64],39.0,13.4002,54600.0,74.2,196.0,20.0,74.2
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00014:
  agent_timesteps_total: 60200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-53-46
  done: false
  episode_len_mean: 86.76
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 86.76
  episode_reward_min: 25.0
  episodes_this_iter: 16
  episodes_total: 1327
  experiment_id: f72577a192e7476ca54bc6ffc8d888a3
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 19.485010147094727
    num_agent_steps_sampled: 60200
    num_steps_sampled: 60200
    num_steps_trained: 60200
  iterations_since_restore: 43
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 89.1
    gpu_util_percent0: 0.0
    ram_util_percent: 59.8
    vram_util_percent0: 0.16337611056268508
  pid: 19784
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07473

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00014,RUNNING,192.168.0.102:19784,0.00152422,linear,[64],52.0,18.3758,72800.0,117.31,200.0,26.0,117.31
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00014:
  agent_timesteps_total: 78400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-53-51
  done: false
  episode_len_mean: 124.1
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 124.1
  episode_reward_min: 31.0
  episodes_this_iter: 10
  episodes_total: 1482
  experiment_id: f72577a192e7476ca54bc6ffc8d888a3
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 23.242055892944336
    num_agent_steps_sampled: 78400
    num_steps_sampled: 78400
    num_steps_trained: 78400
  iterations_since_restore: 56
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 19784
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07614566378275213
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10697164373790845
    mean_inference_ms: 1.27123541534505

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00014,RUNNING,192.168.0.102:19784,0.00152422,linear,[64],65.0,23.3817,91000.0,148.98,200.0,39.0,148.98
PG_CartPole-v0_385d8_00015,PENDING,,0.00961794,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00014:
  agent_timesteps_total: 92400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-53-55
  done: true
  episode_len_mean: 152.47
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 152.47
  episode_reward_min: 39.0
  episodes_this_iter: 9
  episodes_total: 1573
  experiment_id: f72577a192e7476ca54bc6ffc8d888a3
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 24.121702194213867
    num_agent_steps_sampled: 92400
    num_steps_sampled: 92400
    num_steps_trained: 92400
  iterations_since_restore: 66
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 19784
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07669998387347259
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.1077325641357343
    mean_inference_ms: 1.281535854894267

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00015,RUNNING,192.168.0.102:20216,0.00961794,relu,[64],1.0,0.336079,1400.0,21.3934,57.0,9.0,21.3934
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00015:
  agent_timesteps_total: 15400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-54-14
  done: false
  episode_len_mean: 60.04
  episode_media: {}
  episode_reward_max: 141.0
  episode_reward_mean: 60.04
  episode_reward_min: 18.0
  episodes_this_iter: 19
  episodes_total: 399
  experiment_id: b8b1cf33020c4789a5d91622011bf08f
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 16.91427993774414
    num_agent_steps_sampled: 15400
    num_steps_sampled: 15400
    num_steps_trained: 15400
  iterations_since_restore: 11
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 93.1
    gpu_util_percent0: 0.07
    ram_util_percent: 61.8
    vram_util_percent0: 0.16831194471865746
  pid: 20216
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.105445

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00015,RUNNING,192.168.0.102:20216,0.00961794,relu,[64],11.0,5.56171,15400.0,60.04,141.0,18.0,60.04
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00015:
  agent_timesteps_total: 30800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-54-19
  done: true
  episode_len_mean: 150.49
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 150.49
  episode_reward_min: 31.0
  episodes_this_iter: 7
  episodes_total: 496
  experiment_id: b8b1cf33020c4789a5d91622011bf08f
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 25.251127243041992
    num_agent_steps_sampled: 30800
    num_steps_sampled: 30800
    num_steps_trained: 30800
  iterations_since_restore: 22
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 20216
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.10286528222687939
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.13595301397758605
    mean_inference_ms: 1.713996263745491

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00016,PENDING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,


Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00016,RUNNING,,0.00064969,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00016:
  agent_timesteps_total: 1400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-54-31
  done: false
  episode_len_mean: 24.22641509433962
  episode_media: {}
  episode_reward_max: 70.0
  episode_reward_mean: 24.22641509433962
  episode_reward_min: 9.0
  episodes_this_iter: 53
  episodes_total: 53
  experiment_id: 9c35740c525a40e496949c5acdf9b978
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 9.895605087280273
    num_agent_steps_sampled: 1400
    num_steps_sampled: 1400
    num_steps_trained: 1400
  iterations_since_restore: 1
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 66.2
    gpu_util_percent0: 0.03
    ram_util_percent: 62.0
    vram_util_percent0: 0.16699572227706483
  pid: 20815
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_process

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00016,RUNNING,192.168.0.102:20815,0.00064969,linear,[32],9.0,3.28807,12600.0,21.83,50.0,9.0,21.83
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00016:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-54-36
  done: false
  episode_len_mean: 25.53
  episode_media: {}
  episode_reward_max: 74.0
  episode_reward_mean: 25.53
  episode_reward_min: 10.0
  episodes_this_iter: 49
  episodes_total: 863
  experiment_id: 9c35740c525a40e496949c5acdf9b978
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 10.64200210571289
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 20815
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07444697726795779
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10340929585199264
    mean_inference_ms: 1.2239447167087616


Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00016,RUNNING,192.168.0.102:20815,0.00064969,linear,[32],23.0,8.07132,32200.0,28.26,71.0,9.0,28.26
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00016:
  agent_timesteps_total: 42000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-54-42
  done: false
  episode_len_mean: 32.25
  episode_media: {}
  episode_reward_max: 105.0
  episode_reward_mean: 32.25
  episode_reward_min: 11.0
  episodes_this_iter: 45
  episodes_total: 1615
  experiment_id: 9c35740c525a40e496949c5acdf9b978
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 11.2669038772583
    num_agent_steps_sampled: 42000
    num_steps_sampled: 42000
    num_steps_trained: 42000
  iterations_since_restore: 30
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 20815
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07238495997136266
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10193938241189876
    mean_inference_ms: 1.2030681952950897

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00016,RUNNING,192.168.0.102:20815,0.00064969,linear,[32],37.0,12.7075,51800.0,32.73,111.0,12.0,32.73
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00016:
  agent_timesteps_total: 61600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-54-47
  done: false
  episode_len_mean: 33.12
  episode_media: {}
  episode_reward_max: 82.0
  episode_reward_mean: 33.12
  episode_reward_min: 10.0
  episodes_this_iter: 45
  episodes_total: 2205
  experiment_id: 9c35740c525a40e496949c5acdf9b978
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 11.125686645507812
    num_agent_steps_sampled: 61600
    num_steps_sampled: 61600
    num_steps_trained: 61600
  iterations_since_restore: 44
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 81.7
    gpu_util_percent0: 0.11
    ram_util_percent: 61.6
    vram_util_percent0: 0.16765383349786114
  pid: 20815
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07239

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00016,RUNNING,192.168.0.102:20815,0.00064969,linear,[32],50.0,17.7048,70000.0,36.1,122.0,11.0,36.1
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00016:
  agent_timesteps_total: 79800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-54-52
  done: false
  episode_len_mean: 36.55
  episode_media: {}
  episode_reward_max: 179.0
  episode_reward_mean: 36.55
  episode_reward_min: 10.0
  episodes_this_iter: 39
  episodes_total: 2701
  experiment_id: 9c35740c525a40e496949c5acdf9b978
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 11.301813125610352
    num_agent_steps_sampled: 79800
    num_steps_sampled: 79800
    num_steps_trained: 79800
  iterations_since_restore: 57
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 77.0
    gpu_util_percent0: 0.03
    ram_util_percent: 61.8
    vram_util_percent0: 0.16831194471865746
  pid: 20815
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0738

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00016,RUNNING,192.168.0.102:20815,0.00064969,linear,[32],64.0,22.3059,89600.0,42.44,137.0,13.0,42.44
PG_CartPole-v0_385d8_00017,PENDING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00016:
  agent_timesteps_total: 100800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-54-57
  done: true
  episode_len_mean: 44.27
  episode_media: {}
  episode_reward_max: 134.0
  episode_reward_mean: 44.27
  episode_reward_min: 14.0
  episodes_this_iter: 28
  episodes_total: 3189
  experiment_id: 9c35740c525a40e496949c5acdf9b978
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 14.395594596862793
    num_agent_steps_sampled: 100800
    num_steps_sampled: 100800
    num_steps_trained: 100800
  iterations_since_restore: 72
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 85.4
    gpu_util_percent0: 0.01
    ram_util_percent: 61.9
    vram_util_percent0: 0.16666666666666666
  pid: 20815
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00017,RUNNING,,0.00497376,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00017:
  agent_timesteps_total: 1400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-55-10
  done: false
  episode_len_mean: 19.705882352941178
  episode_media: {}
  episode_reward_max: 50.0
  episode_reward_mean: 19.705882352941178
  episode_reward_min: 8.0
  episodes_this_iter: 68
  episodes_total: 68
  experiment_id: d7e96a0687b04ce182e12ee4856a3473
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 7.6976799964904785
    num_agent_steps_sampled: 1400
    num_steps_sampled: 1400
    num_steps_trained: 1400
  iterations_since_restore: 1
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 64.5
    gpu_util_percent0: 0.08
    ram_util_percent: 62.3
    vram_util_percent0: 0.16436327739387957
  pid: 21344
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_proc

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00017,RUNNING,192.168.0.102:21344,0.00497376,relu,[32],1.0,0.353419,1400.0,19.7059,50.0,8.0,19.7059
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,


Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00017,RUNNING,192.168.0.102:21344,0.00497376,relu,[32],13.0,4.68847,18200.0,35.52,104.0,13.0,35.52
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00017:
  agent_timesteps_total: 19600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-55-15
  done: false
  episode_len_mean: 36.05
  episode_media: {}
  episode_reward_max: 91.0
  episode_reward_mean: 36.05
  episode_reward_min: 9.0
  episodes_this_iter: 37
  episodes_total: 671
  experiment_id: d7e96a0687b04ce182e12ee4856a3473
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 12.19843864440918
    num_agent_steps_sampled: 19600
    num_steps_sampled: 19600
    num_steps_trained: 19600
  iterations_since_restore: 14
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 80.2
    gpu_util_percent0: 0.01
    ram_util_percent: 62.3
    vram_util_percent0: 0.1646923330042777
  pid: 21344
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.076466048

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00017,RUNNING,192.168.0.102:21344,0.00497376,relu,[32],27.0,9.66738,37800.0,56.11,133.0,20.0,56.11
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00017:
  agent_timesteps_total: 39200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-55-20
  done: false
  episode_len_mean: 55.21
  episode_media: {}
  episode_reward_max: 133.0
  episode_reward_mean: 55.21
  episode_reward_min: 20.0
  episodes_this_iter: 24
  episodes_total: 1056
  experiment_id: d7e96a0687b04ce182e12ee4856a3473
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 14.94899845123291
    num_agent_steps_sampled: 39200
    num_steps_sampled: 39200
    num_steps_trained: 39200
  iterations_since_restore: 28
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 21344
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07560935213490068
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10610060716702252
    mean_inference_ms: 1.283475530642845

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00017,RUNNING,192.168.0.102:21344,0.00497376,relu,[32],42.0,14.7327,58800.0,80.75,168.0,26.0,80.75
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00017:
  agent_timesteps_total: 60200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-55-25
  done: false
  episode_len_mean: 81.57
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 81.57
  episode_reward_min: 28.0
  episodes_this_iter: 17
  episodes_total: 1340
  experiment_id: d7e96a0687b04ce182e12ee4856a3473
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 16.460750579833984
    num_agent_steps_sampled: 60200
    num_steps_sampled: 60200
    num_steps_trained: 60200
  iterations_since_restore: 43
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 86.9
    gpu_util_percent0: 0.11
    ram_util_percent: 62.3
    vram_util_percent0: 0.16551497203027313
  pid: 21344
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0748

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00017,RUNNING,192.168.0.102:21344,0.00497376,relu,[32],54.0,19.5337,75600.0,116.83,200.0,39.0,116.83
PG_CartPole-v0_385d8_00018,PENDING,,0.00947202,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00017:
  agent_timesteps_total: 77000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-55-30
  done: false
  episode_len_mean: 123.68
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 123.68
  episode_reward_min: 39.0
  episodes_this_iter: 11
  episodes_total: 1488
  experiment_id: d7e96a0687b04ce182e12ee4856a3473
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 21.02155113220215
    num_agent_steps_sampled: 77000
    num_steps_sampled: 77000
    num_steps_trained: 77000
  iterations_since_restore: 55
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 21344
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07447414119096693
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10491894038240603
    mean_inference_ms: 1.2625389703238

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00018,RUNNING,192.168.0.102:21842,0.00947202,linear,[64],1.0,0.454342,1400.0,19.6618,49.0,10.0,19.6618
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00018:
  agent_timesteps_total: 19600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-55-53
  done: false
  episode_len_mean: 94.18
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 94.18
  episode_reward_min: 24.0
  episodes_this_iter: 9
  episodes_total: 345
  experiment_id: b41d9d1064a547b3998aef262deef3f4
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 21.59575843811035
    num_agent_steps_sampled: 19600
    num_steps_sampled: 19600
    num_steps_trained: 19600
  iterations_since_restore: 14
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 21842
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.08075766354273389
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11381880796610584
    mean_inference_ms: 1.3807374540365094


Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00018,RUNNING,192.168.0.102:21842,0.00947202,linear,[64],15.0,5.62196,21000.0,104.44,200.0,24.0,104.44
PG_CartPole-v0_385d8_00019,PENDING,,0.00154977,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00018:
  agent_timesteps_total: 29400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-55-55
  done: true
  episode_len_mean: 153.4
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 153.4
  episode_reward_min: 32.0
  episodes_this_iter: 7
  episodes_total: 396
  experiment_id: b41d9d1064a547b3998aef262deef3f4
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.616050720214844
    num_agent_steps_sampled: 29400
    num_steps_sampled: 29400
    num_steps_trained: 29400
  iterations_since_restore: 21
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 21842
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07748955274238101
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10961352915712068
    mean_inference_ms: 1.3211797612846379


Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00019,RUNNING,192.168.0.102:22198,0.00154977,relu,[64],1.0,0.365185,1400.0,21.8689,62.0,9.0,21.8689
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00019:
  agent_timesteps_total: 19600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-56-13
  done: false
  episode_len_mean: 35.64
  episode_media: {}
  episode_reward_max: 134.0
  episode_reward_mean: 35.64
  episode_reward_min: 11.0
  episodes_this_iter: 38
  episodes_total: 711
  experiment_id: 5c5c3434f6254cd798d2285e0e0199ac
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 13.025162696838379
    num_agent_steps_sampled: 19600
    num_steps_sampled: 19600
    num_steps_trained: 19600
  iterations_since_restore: 14
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 22198
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0777935196407712
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11474226061679309
    mean_inference_ms: 1.3714647735506802

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00019,RUNNING,192.168.0.102:22198,0.00154977,relu,[64],14.0,5.44556,19600.0,35.64,134.0,11.0,35.64
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00019:
  agent_timesteps_total: 36400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-56-19
  done: false
  episode_len_mean: 43.09
  episode_media: {}
  episode_reward_max: 176.0
  episode_reward_mean: 43.09
  episode_reward_min: 10.0
  episodes_this_iter: 29
  episodes_total: 1166
  experiment_id: 5c5c3434f6254cd798d2285e0e0199ac
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 16.134716033935547
    num_agent_steps_sampled: 36400
    num_steps_sampled: 36400
    num_steps_trained: 36400
  iterations_since_restore: 26
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 22198
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07846055399298904
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11329285171664914
    mean_inference_ms: 1.38334659358568

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00019,RUNNING,192.168.0.102:22198,0.00154977,relu,[64],26.0,10.2975,36400.0,43.09,176.0,10.0,43.09
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00019:
  agent_timesteps_total: 54600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-56-24
  done: false
  episode_len_mean: 50.64
  episode_media: {}
  episode_reward_max: 145.0
  episode_reward_mean: 50.64
  episode_reward_min: 12.0
  episodes_this_iter: 30
  episodes_total: 1531
  experiment_id: 5c5c3434f6254cd798d2285e0e0199ac
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 14.007845878601074
    num_agent_steps_sampled: 54600
    num_steps_sampled: 54600
    num_steps_trained: 54600
  iterations_since_restore: 39
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 87.8
    gpu_util_percent0: 0.04
    ram_util_percent: 60.4
    vram_util_percent0: 0.16288252714708787
  pid: 22198
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0784

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00019,RUNNING,192.168.0.102:22198,0.00154977,relu,[64],39.0,15.3868,54600.0,50.64,145.0,12.0,50.64
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00019:
  agent_timesteps_total: 71400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-56-29
  done: false
  episode_len_mean: 62.79
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 62.79
  episode_reward_min: 23.0
  episodes_this_iter: 27
  episodes_total: 1826
  experiment_id: 5c5c3434f6254cd798d2285e0e0199ac
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 13.774116516113281
    num_agent_steps_sampled: 71400
    num_steps_sampled: 71400
    num_steps_trained: 71400
  iterations_since_restore: 51
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 88.8
    gpu_util_percent0: 0.04
    ram_util_percent: 60.7
    vram_util_percent0: 0.16238894373149063
  pid: 22198
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0785

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00019,RUNNING,192.168.0.102:22198,0.00154977,relu,[64],51.0,20.1674,71400.0,62.79,200.0,23.0,62.79
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00019:
  agent_timesteps_total: 86800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-56-34
  done: false
  episode_len_mean: 78.25
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 78.25
  episode_reward_min: 13.0
  episodes_this_iter: 19
  episodes_total: 2033
  experiment_id: 5c5c3434f6254cd798d2285e0e0199ac
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 18.754920959472656
    num_agent_steps_sampled: 86800
    num_steps_sampled: 86800
    num_steps_trained: 86800
  iterations_since_restore: 62
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 22198
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07930127879955692
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11384264137651849
    mean_inference_ms: 1.40210459816875

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00019,RUNNING,192.168.0.102:22198,0.00154977,relu,[64],63.0,25.3581,88200.0,81.24,200.0,13.0,81.24
PG_CartPole-v0_385d8_00020,PENDING,,0.00509507,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00019:
  agent_timesteps_total: 100800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-56-38
  done: true
  episode_len_mean: 97.69
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 97.69
  episode_reward_min: 20.0
  episodes_this_iter: 14
  episodes_total: 2184
  experiment_id: 5c5c3434f6254cd798d2285e0e0199ac
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 19.502826690673828
    num_agent_steps_sampled: 100800
    num_steps_sampled: 100800
    num_steps_trained: 100800
  iterations_since_restore: 72
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 22198
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07938579204288666
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11393450085524268
    mean_inference_ms: 1.39832025119

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00020,RUNNING,192.168.0.102:22801,0.00509507,linear,[32],1.0,0.358137,1400.0,22.4746,60.0,9.0,22.4746
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00020:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-56-56
  done: false
  episode_len_mean: 45.31
  episode_media: {}
  episode_reward_max: 110.0
  episode_reward_mean: 45.31
  episode_reward_min: 16.0
  episodes_this_iter: 33
  episodes_total: 613
  experiment_id: 821775545a5f454facc7cc47cc09490b
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 12.402629852294922
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 86.5
    gpu_util_percent0: 0.03
    ram_util_percent: 60.2
    vram_util_percent0: 0.1625534715366897
  pid: 22801
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.077113

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00020,RUNNING,192.168.0.102:22801,0.00509507,linear,[32],15.0,5.45462,21000.0,45.31,110.0,16.0,45.31
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00020:
  agent_timesteps_total: 39200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-57-01
  done: false
  episode_len_mean: 56.41
  episode_media: {}
  episode_reward_max: 129.0
  episode_reward_mean: 56.41
  episode_reward_min: 23.0
  episodes_this_iter: 27
  episodes_total: 963
  experiment_id: 821775545a5f454facc7cc47cc09490b
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 12.98719310760498
    num_agent_steps_sampled: 39200
    num_steps_sampled: 39200
    num_steps_trained: 39200
  iterations_since_restore: 28
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 22801
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07642678465381263
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10795556250216003
    mean_inference_ms: 1.286540128736856


Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00020,RUNNING,192.168.0.102:22801,0.00509507,linear,[32],28.0,10.2676,39200.0,56.41,129.0,23.0,56.41
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00020:
  agent_timesteps_total: 57400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-57-06
  done: false
  episode_len_mean: 82.76
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 82.76
  episode_reward_min: 21.0
  episodes_this_iter: 14
  episodes_total: 1209
  experiment_id: 821775545a5f454facc7cc47cc09490b
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 19.987239837646484
    num_agent_steps_sampled: 57400
    num_steps_sampled: 57400
    num_steps_trained: 57400
  iterations_since_restore: 41
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 86.6
    gpu_util_percent0: 0.01
    ram_util_percent: 60.3
    vram_util_percent0: 0.1641987495886805
  pid: 22801
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07864

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00020,RUNNING,192.168.0.102:22801,0.00509507,linear,[32],41.0,15.2819,57400.0,82.76,200.0,21.0,82.76
PG_CartPole-v0_385d8_00021,PENDING,,0.00251308,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,


Result for PG_CartPole-v0_385d8_00020:
  agent_timesteps_total: 74200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-57-11
  done: true
  episode_len_mean: 151.07
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 151.07
  episode_reward_min: 30.0
  episodes_this_iter: 7
  episodes_total: 1323
  experiment_id: 821775545a5f454facc7cc47cc09490b
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 23.72123908996582
    num_agent_steps_sampled: 74200
    num_steps_sampled: 74200
    num_steps_trained: 74200
  iterations_since_restore: 53
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 22801
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07956700288853458
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10991749473911593
    mean_inference_ms: 1.319073851459119

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00021,RUNNING,192.168.0.102:23197,0.00251308,relu,[32],1.0,0.354928,1400.0,19.619,45.0,9.0,19.619
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00021:
  agent_timesteps_total: 19600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-57-30
  done: false
  episode_len_mean: 29.26
  episode_media: {}
  episode_reward_max: 72.0
  episode_reward_mean: 29.26
  episode_reward_min: 11.0
  episodes_this_iter: 47
  episodes_total: 756
  experiment_id: 1a38b2a79b0145a1831667e7ecc329e0
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 10.575867652893066
    num_agent_steps_sampled: 19600
    num_steps_sampled: 19600
    num_steps_trained: 19600
  iterations_since_restore: 14
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 87.5
    gpu_util_percent0: 0.05
    ram_util_percent: 60.2
    vram_util_percent0: 0.15992102665350444
  pid: 23197
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.076873

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00021,RUNNING,192.168.0.102:23197,0.00251308,relu,[32],14.0,5.43301,19600.0,29.26,72.0,11.0,29.26
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,


Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00021,RUNNING,192.168.0.102:23197,0.00251308,relu,[32],26.0,10.2153,36400.0,40.67,147.0,11.0,40.67
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00021:
  agent_timesteps_total: 37800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-57-36
  done: false
  episode_len_mean: 41.3
  episode_media: {}
  episode_reward_max: 134.0
  episode_reward_mean: 41.3
  episode_reward_min: 11.0
  episodes_this_iter: 30
  episodes_total: 1237
  experiment_id: 1a38b2a79b0145a1831667e7ecc329e0
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 14.358119010925293
    num_agent_steps_sampled: 37800
    num_steps_sampled: 37800
    num_steps_trained: 37800
  iterations_since_restore: 27
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 23197
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07896003956313169
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11578264417467196
    mean_inference_ms: 1.4021875146921046

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00021,RUNNING,192.168.0.102:23197,0.00251308,relu,[32],40.0,15.0666,56000.0,53.16,171.0,15.0,53.16
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00021:
  agent_timesteps_total: 57400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-57-41
  done: false
  episode_len_mean: 54.25
  episode_media: {}
  episode_reward_max: 171.0
  episode_reward_mean: 54.25
  episode_reward_min: 15.0
  episodes_this_iter: 26
  episodes_total: 1636
  experiment_id: 1a38b2a79b0145a1831667e7ecc329e0
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 13.876832962036133
    num_agent_steps_sampled: 57400
    num_steps_sampled: 57400
    num_steps_trained: 57400
  iterations_since_restore: 41
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 23197
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07748035992218165
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.1117522439076514
    mean_inference_ms: 1.356165057642109

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00021,RUNNING,192.168.0.102:23197,0.00251308,relu,[32],55.0,20.0459,77000.0,64.27,166.0,20.0,64.27
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00021:
  agent_timesteps_total: 78400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-57-46
  done: false
  episode_len_mean: 66.18
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 66.18
  episode_reward_min: 20.0
  episodes_this_iter: 23
  episodes_total: 1973
  experiment_id: 1a38b2a79b0145a1831667e7ecc329e0
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 14.003336906433105
    num_agent_steps_sampled: 78400
    num_steps_sampled: 78400
    num_steps_trained: 78400
  iterations_since_restore: 56
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 23197
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07553092502852571
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10891455295432179
    mean_inference_ms: 1.31764172534669

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00021,RUNNING,192.168.0.102:23197,0.00251308,relu,[32],70.0,25.0481,98000.0,86.29,200.0,21.0,86.29
PG_CartPole-v0_385d8_00022,PENDING,,0.00877715,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00021:
  agent_timesteps_total: 99400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-57-51
  done: false
  episode_len_mean: 86.62
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 86.62
  episode_reward_min: 21.0
  episodes_this_iter: 19
  episodes_total: 2239
  experiment_id: 1a38b2a79b0145a1831667e7ecc329e0
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 17.039905548095703
    num_agent_steps_sampled: 99400
    num_steps_sampled: 99400
    num_steps_trained: 99400
  iterations_since_restore: 71
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 88.6
    gpu_util_percent0: 0.04
    ram_util_percent: 60.3
    vram_util_percent0: 0.1648568608094768
  pid: 23197
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07458

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00022,RUNNING,192.168.0.102:23796,0.00877715,linear,[64],1.0,0.480244,1400.0,20.3788,47.0,9.0,20.3788
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00022:
  agent_timesteps_total: 18200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-58-09
  done: false
  episode_len_mean: 92.3
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 92.3
  episode_reward_min: 30.0
  episodes_this_iter: 9
  episodes_total: 321
  experiment_id: 12ee87e1ee39457d8712d0eedab7f1f9
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 21.514997482299805
    num_agent_steps_sampled: 18200
    num_steps_sampled: 18200
    num_steps_trained: 18200
  iterations_since_restore: 13
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 94.9
    gpu_util_percent0: 0.09
    ram_util_percent: 63.7
    vram_util_percent0: 0.18624547548535703
  pid: 23796
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.08875678

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00022,RUNNING,192.168.0.102:23796,0.00877715,linear,[64],13.0,5.60573,18200.0,92.3,200.0,30.0,92.3
PG_CartPole-v0_385d8_00023,PENDING,,0.00932385,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,


Result for PG_CartPole-v0_385d8_00022:
  agent_timesteps_total: 28000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-58-12
  done: true
  episode_len_mean: 151.98
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 151.98
  episode_reward_min: 31.0
  episodes_this_iter: 7
  episodes_total: 373
  experiment_id: 12ee87e1ee39457d8712d0eedab7f1f9
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 20.676908493041992
    num_agent_steps_sampled: 28000
    num_steps_sampled: 28000
    num_steps_trained: 28000
  iterations_since_restore: 20
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 23796
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.08620443283953225
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.12427895044844198
    mean_inference_ms: 1.470828203923656

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00023,RUNNING,192.168.0.102:24386,0.00932385,relu,[64],1.0,0.392001,1400.0,20.5303,74.0,9.0,20.5303
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74


Result for PG_CartPole-v0_385d8_00023:
  agent_timesteps_total: 16800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-58-30
  done: false
  episode_len_mean: 66.35
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 66.35
  episode_reward_min: 18.0
  episodes_this_iter: 18
  episodes_total: 386
  experiment_id: 47567c2e724c4a31b4d727297b89e23a
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 17.597261428833008
    num_agent_steps_sampled: 16800
    num_steps_sampled: 16800
    num_steps_trained: 16800
  iterations_since_restore: 12
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 24386
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.08501733612462022
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.1270160474722839
    mean_inference_ms: 1.571759446086306

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00023,RUNNING,192.168.0.102:24386,0.00932385,relu,[64],12.0,5.28802,16800.0,66.35,200.0,18.0,66.35
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74


Result for PG_CartPole-v0_385d8_00023:
  agent_timesteps_total: 32200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-58-35
  done: false
  episode_len_mean: 136.65
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 136.65
  episode_reward_min: 39.0
  episodes_this_iter: 9
  episodes_total: 505
  experiment_id: 47567c2e724c4a31b4d727297b89e23a
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.54524040222168
    num_agent_steps_sampled: 32200
    num_steps_sampled: 32200
    num_steps_trained: 32200
  iterations_since_restore: 23
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 24386
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.08540276227286267
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.12251270477972465
    mean_inference_ms: 1.533715400936058

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00023,RUNNING,192.168.0.102:24386,0.00932385,relu,[64],23.0,10.1127,32200.0,136.65,200.0,39.0,136.65
PG_CartPole-v0_385d8_00024,PENDING,,0.00370631,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74


Result for PG_CartPole-v0_385d8_00023:
  agent_timesteps_total: 36400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-58-37
  done: true
  episode_len_mean: 156.78
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 156.78
  episode_reward_min: 50.0
  episodes_this_iter: 8
  episodes_total: 527
  experiment_id: 47567c2e724c4a31b4d727297b89e23a
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 24.028430938720703
    num_agent_steps_sampled: 36400
    num_steps_sampled: 36400
    num_steps_trained: 36400
  iterations_since_restore: 26
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 24386
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.08641750224753646
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.12252562554933051
    mean_inference_ms: 1.543523892037992

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00024,RUNNING,192.168.0.102:25196,0.00370631,linear,[32],1.0,0.455347,1400.0,23.7547,66.0,9.0,23.7547
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49


Result for PG_CartPole-v0_385d8_00024:
  agent_timesteps_total: 16800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-58-57
  done: false
  episode_len_mean: 36.89
  episode_media: {}
  episode_reward_max: 117.0
  episode_reward_mean: 36.89
  episode_reward_min: 11.0
  episodes_this_iter: 36
  episodes_total: 546
  experiment_id: 96054ed23e2f47d49e66b342a5596bf0
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 12.657791137695312
    num_agent_steps_sampled: 16800
    num_steps_sampled: 16800
    num_steps_trained: 16800
  iterations_since_restore: 12
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 25196
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.08951606717045094
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.12546280131907195
    mean_inference_ms: 1.485908734356918

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00024,RUNNING,192.168.0.102:25196,0.00370631,linear,[32],13.0,5.63849,18200.0,37.88,117.0,11.0,37.88
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49


Result for PG_CartPole-v0_385d8_00024:
  agent_timesteps_total: 36400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-59-03
  done: false
  episode_len_mean: 56.05
  episode_media: {}
  episode_reward_max: 120.0
  episode_reward_mean: 56.05
  episode_reward_min: 18.0
  episodes_this_iter: 27
  episodes_total: 950
  experiment_id: 96054ed23e2f47d49e66b342a5596bf0
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 13.796335220336914
    num_agent_steps_sampled: 36400
    num_steps_sampled: 36400
    num_steps_trained: 36400
  iterations_since_restore: 26
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 78.6
    gpu_util_percent0: 0.0
    ram_util_percent: 64.9
    vram_util_percent0: 0.17900625205659756
  pid: 25196
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.081268

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00024,RUNNING,192.168.0.102:25196,0.00370631,linear,[32],27.0,10.4869,37800.0,54.99,120.0,18.0,54.99
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49


Result for PG_CartPole-v0_385d8_00024:
  agent_timesteps_total: 56000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-59-08
  done: false
  episode_len_mean: 73.15
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 73.15
  episode_reward_min: 31.0
  episodes_this_iter: 19
  episodes_total: 1244
  experiment_id: 96054ed23e2f47d49e66b342a5596bf0
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 16.25788688659668
    num_agent_steps_sampled: 56000
    num_steps_sampled: 56000
    num_steps_trained: 56000
  iterations_since_restore: 40
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 80.8
    gpu_util_percent0: 0.03
    ram_util_percent: 64.4
    vram_util_percent0: 0.17752550180980586
  pid: 25196
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07952

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00024,RUNNING,192.168.0.102:25196,0.00370631,linear,[32],41.0,15.6037,57400.0,72.42,200.0,32.0,72.42
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49


Result for PG_CartPole-v0_385d8_00024:
  agent_timesteps_total: 75600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-59-13
  done: false
  episode_len_mean: 137.0
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 137.0
  episode_reward_min: 30.0
  episodes_this_iter: 9
  episodes_total: 1404
  experiment_id: 96054ed23e2f47d49e66b342a5596bf0
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 20.551788330078125
    num_agent_steps_sampled: 75600
    num_steps_sampled: 75600
    num_steps_trained: 75600
  iterations_since_restore: 54
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 25196
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07815043279452864
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11139519554598637
    mean_inference_ms: 1.311410093348209

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00024,RUNNING,192.168.0.102:25196,0.00370631,linear,[32],55.0,20.4823,77000.0,143.68,200.0,41.0,143.68
PG_CartPole-v0_385d8_00025,PENDING,,0.00539407,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49


Result for PG_CartPole-v0_385d8_00024:
  agent_timesteps_total: 82600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-59-15
  done: true
  episode_len_mean: 150.33
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 150.33
  episode_reward_min: 41.0
  episodes_this_iter: 9
  episodes_total: 1451
  experiment_id: 96054ed23e2f47d49e66b342a5596bf0
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 21.39032554626465
    num_agent_steps_sampled: 82600
    num_steps_sampled: 82600
    num_steps_trained: 82600
  iterations_since_restore: 59
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 25196
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07787023671695234
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11125422595905624
    mean_inference_ms: 1.308222929665781

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00025,RUNNING,192.168.0.102:25748,0.00539407,relu,[32],1.0,0.380033,1400.0,20.4844,64.0,9.0,20.4844
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94


Result for PG_CartPole-v0_385d8_00025:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-59-32
  done: false
  episode_len_mean: 41.45
  episode_media: {}
  episode_reward_max: 139.0
  episode_reward_mean: 41.45
  episode_reward_min: 11.0
  episodes_this_iter: 32
  episodes_total: 681
  experiment_id: 6b273bda25524a31810996b18559c0e6
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 13.526988983154297
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 86.5
    gpu_util_percent0: 0.06
    ram_util_percent: 64.5
    vram_util_percent0: 0.17653833497861138
  pid: 25748
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07269

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00025,RUNNING,192.168.0.102:25748,0.00539407,relu,[32],15.0,5.21075,21000.0,41.45,139.0,11.0,41.45
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94


Result for PG_CartPole-v0_385d8_00025:
  agent_timesteps_total: 39200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-59-38
  done: false
  episode_len_mean: 62.05
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 62.05
  episode_reward_min: 14.0
  episodes_this_iter: 18
  episodes_total: 1012
  experiment_id: 6b273bda25524a31810996b18559c0e6
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 18.09259796142578
    num_agent_steps_sampled: 39200
    num_steps_sampled: 39200
    num_steps_trained: 39200
  iterations_since_restore: 28
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 89.0
    gpu_util_percent0: 0.01
    ram_util_percent: 65.2
    vram_util_percent0: 0.17604475156301416
  pid: 25748
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07445

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00025,RUNNING,192.168.0.102:25748,0.00539407,relu,[32],28.0,10.1311,39200.0,62.05,200.0,14.0,62.05
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94


Result for PG_CartPole-v0_385d8_00025:
  agent_timesteps_total: 57400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-59-43
  done: false
  episode_len_mean: 111.79
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 111.79
  episode_reward_min: 22.0
  episodes_this_iter: 8
  episodes_total: 1203
  experiment_id: 6b273bda25524a31810996b18559c0e6
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.844261169433594
    num_agent_steps_sampled: 57400
    num_steps_sampled: 57400
    num_steps_trained: 57400
  iterations_since_restore: 41
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 25748
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07685077159444964
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10647087545722872
    mean_inference_ms: 1.3239515975343

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00025,RUNNING,192.168.0.102:25748,0.00539407,relu,[32],41.0,14.9919,57400.0,111.79,200.0,22.0,111.79
PG_CartPole-v0_385d8_00026,PENDING,,0.00872993,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94


Result for PG_CartPole-v0_385d8_00025:
  agent_timesteps_total: 71400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_17-59-46
  done: true
  episode_len_mean: 150.48
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 150.48
  episode_reward_min: 23.0
  episodes_this_iter: 9
  episodes_total: 1298
  experiment_id: 6b273bda25524a31810996b18559c0e6
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 22.260366439819336
    num_agent_steps_sampled: 71400
    num_steps_sampled: 71400
    num_steps_trained: 71400
  iterations_since_restore: 51
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 86.3
    gpu_util_percent0: 0.06
    ram_util_percent: 64.6
    vram_util_percent0: 0.17620927936821323
  pid: 25748
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0762

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00026,RUNNING,192.168.0.102:26206,0.00872993,linear,[64],1.0,0.347627,1400.0,20.1692,69.0,8.0,20.1692
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36


Result for PG_CartPole-v0_385d8_00026:
  agent_timesteps_total: 22400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-00-04
  done: false
  episode_len_mean: 101.78
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 101.78
  episode_reward_min: 29.0
  episodes_this_iter: 11
  episodes_total: 382
  experiment_id: 7dbd42f18f244491aec71b32ab5ff3e4
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 18.000667572021484
    num_agent_steps_sampled: 22400
    num_steps_sampled: 22400
    num_steps_trained: 22400
  iterations_since_restore: 16
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 84.0
    gpu_util_percent0: 0.02
    ram_util_percent: 64.2
    vram_util_percent0: 0.17588022375781506
  pid: 26206
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.075

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00026,RUNNING,192.168.0.102:26206,0.00872993,linear,[64],16.0,5.37287,22400.0,101.78,200.0,29.0,101.78
PG_CartPole-v0_385d8_00027,PENDING,,0.00724645,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36


Result for PG_CartPole-v0_385d8_00026:
  agent_timesteps_total: 35000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-00-07
  done: true
  episode_len_mean: 152.96
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 152.96
  episode_reward_min: 85.0
  episodes_this_iter: 8
  episodes_total: 463
  experiment_id: 7dbd42f18f244491aec71b32ab5ff3e4
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 19.141530990600586
    num_agent_steps_sampled: 35000
    num_steps_sampled: 35000
    num_steps_trained: 35000
  iterations_since_restore: 25
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 26206
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07551709254508795
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10294793943568688
    mean_inference_ms: 1.222058877242342

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00027,RUNNING,192.168.0.102:26595,0.00724645,relu,[64],1.0,0.350924,1400.0,21.4194,63.0,8.0,21.4194
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24


Result for PG_CartPole-v0_385d8_00027:
  agent_timesteps_total: 19600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-00-24
  done: false
  episode_len_mean: 59.17
  episode_media: {}
  episode_reward_max: 146.0
  episode_reward_mean: 59.17
  episode_reward_min: 20.0
  episodes_this_iter: 22
  episodes_total: 483
  experiment_id: 2793b4987ada41119101466500094407
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 14.498601913452148
    num_agent_steps_sampled: 19600
    num_steps_sampled: 19600
    num_steps_trained: 19600
  iterations_since_restore: 14
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 26595
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07762870812158948
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.11150648255541502
    mean_inference_ms: 1.343901374049521

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00027,RUNNING,192.168.0.102:26595,0.00724645,relu,[64],15.0,5.58895,21000.0,59.94,146.0,20.0,59.94
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24


Result for PG_CartPole-v0_385d8_00027:
  agent_timesteps_total: 40600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-00-30
  done: false
  episode_len_mean: 145.87
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 145.87
  episode_reward_min: 40.0
  episodes_this_iter: 7
  episodes_total: 661
  experiment_id: 2793b4987ada41119101466500094407
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 25.363353729248047
    num_agent_steps_sampled: 40600
    num_steps_sampled: 40600
    num_steps_trained: 40600
  iterations_since_restore: 29
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 86.2
    gpu_util_percent0: 0.02
    ram_util_percent: 64.5
    vram_util_percent0: 0.17752550180980586
  pid: 26595
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0749

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00027,RUNNING,192.168.0.102:26595,0.00724645,relu,[64],29.0,10.4137,40600.0,145.87,200.0,40.0,145.87
PG_CartPole-v0_385d8_00028,PENDING,,0.00992334,linear,[32],,,,,,,
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24


Result for PG_CartPole-v0_385d8_00027:
  agent_timesteps_total: 43400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-00-31
  done: true
  episode_len_mean: 152.82
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 152.82
  episode_reward_min: 41.0
  episodes_this_iter: 10
  episodes_total: 679
  experiment_id: 2793b4987ada41119101466500094407
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 21.481197357177734
    num_agent_steps_sampled: 43400
    num_steps_sampled: 43400
    num_steps_trained: 43400
  iterations_since_restore: 31
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 26595
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07506569256024254
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10875000131273276
    mean_inference_ms: 1.30732525240146

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00028,RUNNING,192.168.0.102:27046,0.00992334,linear,[32],1.0,0.365001,1400.0,21.459,69.0,10.0,21.459
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36.0,12.5987,50400.0,154.08,200.0,71.0,154.08


Result for PG_CartPole-v0_385d8_00028:
  agent_timesteps_total: 22400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-00-47
  done: false
  episode_len_mean: 103.89
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 103.89
  episode_reward_min: 22.0
  episodes_this_iter: 11
  episodes_total: 418
  experiment_id: 75c32fbda3714f1ba2b58e574b1bc37d
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 21.12861442565918
    num_agent_steps_sampled: 22400
    num_steps_sampled: 22400
    num_steps_trained: 22400
  iterations_since_restore: 16
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 27046
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07192424962163595
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10392228755829903
    mean_inference_ms: 1.19758157252289

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00028,RUNNING,192.168.0.102:27046,0.00992334,linear,[32],16.0,5.22464,22400.0,103.89,200.0,22.0,103.89
PG_CartPole-v0_385d8_00029,PENDING,,0.000872969,relu,[32],,,,,,,
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36.0,12.5987,50400.0,154.08,200.0,71.0,154.08


Result for PG_CartPole-v0_385d8_00028:
  agent_timesteps_total: 33600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-00-50
  done: true
  episode_len_mean: 153.94
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 153.94
  episode_reward_min: 37.0
  episodes_this_iter: 9
  episodes_total: 488
  experiment_id: 75c32fbda3714f1ba2b58e574b1bc37d
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 23.63840675354004
    num_agent_steps_sampled: 33600
    num_steps_sampled: 33600
    num_steps_trained: 33600
  iterations_since_restore: 24
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 80.1
    gpu_util_percent0: 0.0
    ram_util_percent: 64.4
    vram_util_percent0: 0.1763738071734123
  pid: 27046
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07141366

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00029,RUNNING,192.168.0.102:27365,0.000872969,relu,[32],1.0,0.370944,1400.0,21.9483,82.0,8.0,21.9483
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36.0,12.5987,50400.0,154.08,200.0,71.0,154.08
PG_CartPole-v0_385d8_00006,TERMINATED,,0.00338773,linear,[64],35.0,11.8572,49000.0,155.01,200.0,28.0,155.01


Result for PG_CartPole-v0_385d8_00029:
  agent_timesteps_total: 19600
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-01-08
  done: false
  episode_len_mean: 23.25
  episode_media: {}
  episode_reward_max: 100.0
  episode_reward_mean: 23.25
  episode_reward_min: 9.0
  episodes_this_iter: 59
  episodes_total: 845
  experiment_id: 2a92f7cc18914e69a99a3ae203416d1f
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 9.911925315856934
    num_agent_steps_sampled: 19600
    num_steps_sampled: 19600
    num_steps_trained: 19600
  iterations_since_restore: 14
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 82.7
    gpu_util_percent0: 0.04
    ram_util_percent: 64.9
    vram_util_percent0: 0.17834814083580125
  pid: 27365
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0747088

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00029,RUNNING,192.168.0.102:27365,0.000872969,relu,[32],15.0,5.48135,21000.0,23.36,100.0,9.0,23.36
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36.0,12.5987,50400.0,154.08,200.0,71.0,154.08
PG_CartPole-v0_385d8_00006,TERMINATED,,0.00338773,linear,[64],35.0,11.8572,49000.0,155.01,200.0,28.0,155.01


Result for PG_CartPole-v0_385d8_00029:
  agent_timesteps_total: 39200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-01-13
  done: false
  episode_len_mean: 27.33
  episode_media: {}
  episode_reward_max: 73.0
  episode_reward_mean: 27.33
  episode_reward_min: 9.0
  episodes_this_iter: 50
  episodes_total: 1610
  experiment_id: 2a92f7cc18914e69a99a3ae203416d1f
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 10.087470054626465
    num_agent_steps_sampled: 39200
    num_steps_sampled: 39200
    num_steps_trained: 39200
  iterations_since_restore: 28
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 27365
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07380204867602423
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.106832900194292
    mean_inference_ms: 1.2873888670239193

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00029,RUNNING,192.168.0.102:27365,0.000872969,relu,[32],28.0,10.0347,39200.0,27.33,73.0,9.0,27.33
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36.0,12.5987,50400.0,154.08,200.0,71.0,154.08
PG_CartPole-v0_385d8_00006,TERMINATED,,0.00338773,linear,[64],35.0,11.8572,49000.0,155.01,200.0,28.0,155.01


Result for PG_CartPole-v0_385d8_00029:
  agent_timesteps_total: 58800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-01-18
  done: false
  episode_len_mean: 30.1
  episode_media: {}
  episode_reward_max: 109.0
  episode_reward_mean: 30.1
  episode_reward_min: 11.0
  episodes_this_iter: 48
  episodes_total: 2295
  experiment_id: 2a92f7cc18914e69a99a3ae203416d1f
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 10.270766258239746
    num_agent_steps_sampled: 58800
    num_steps_sampled: 58800
    num_steps_trained: 58800
  iterations_since_restore: 42
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 86.6
    gpu_util_percent0: 0.05
    ram_util_percent: 64.9
    vram_util_percent0: 0.17670286278381048
  pid: 27365
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.073453

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00029,RUNNING,192.168.0.102:27365,0.000872969,relu,[32],43.0,15.2128,60200.0,30.68,86.0,10.0,30.68
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36.0,12.5987,50400.0,154.08,200.0,71.0,154.08
PG_CartPole-v0_385d8_00006,TERMINATED,,0.00338773,linear,[64],35.0,11.8572,49000.0,155.01,200.0,28.0,155.01


Result for PG_CartPole-v0_385d8_00029:
  agent_timesteps_total: 78400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-01-23
  done: false
  episode_len_mean: 35.88
  episode_media: {}
  episode_reward_max: 124.0
  episode_reward_mean: 35.88
  episode_reward_min: 10.0
  episodes_this_iter: 42
  episodes_total: 2864
  experiment_id: 2a92f7cc18914e69a99a3ae203416d1f
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 10.700798034667969
    num_agent_steps_sampled: 78400
    num_steps_sampled: 78400
    num_steps_trained: 78400
  iterations_since_restore: 56
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 27365
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07375672159902413
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10608701993493044
    mean_inference_ms: 1.27737515943487

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00029,RUNNING,192.168.0.102:27365,0.000872969,relu,[32],57.0,20.1876,79800.0,37.42,124.0,10.0,37.42
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36.0,12.5987,50400.0,154.08,200.0,71.0,154.08
PG_CartPole-v0_385d8_00006,TERMINATED,,0.00338773,linear,[64],35.0,11.8572,49000.0,155.01,200.0,28.0,155.01


Result for PG_CartPole-v0_385d8_00029:
  agent_timesteps_total: 98000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-01-28
  done: false
  episode_len_mean: 37.3
  episode_media: {}
  episode_reward_max: 116.0
  episode_reward_mean: 37.3
  episode_reward_min: 11.0
  episodes_this_iter: 33
  episodes_total: 3389
  experiment_id: 2a92f7cc18914e69a99a3ae203416d1f
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 13.186749458312988
    num_agent_steps_sampled: 98000
    num_steps_sampled: 98000
    num_steps_trained: 98000
  iterations_since_restore: 70
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 83.4
    gpu_util_percent0: 0.0
    ram_util_percent: 64.7
    vram_util_percent0: 0.17588022375781506
  pid: 27365
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.0734839

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00029,RUNNING,192.168.0.102:27365,0.000872969,relu,[32],71.0,24.9729,99400.0,39.53,118.0,9.0,39.53
PG_CartPole-v0_385d8_00030,PENDING,,0.00619586,linear,[64],,,,,,,
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36.0,12.5987,50400.0,154.08,200.0,71.0,154.08
PG_CartPole-v0_385d8_00006,TERMINATED,,0.00338773,linear,[64],35.0,11.8572,49000.0,155.01,200.0,28.0,155.01


Result for PG_CartPole-v0_385d8_00029:
  agent_timesteps_total: 100800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-01-29
  done: true
  episode_len_mean: 38.99
  episode_media: {}
  episode_reward_max: 118.0
  episode_reward_mean: 38.99
  episode_reward_min: 9.0
  episodes_this_iter: 34
  episodes_total: 3461
  experiment_id: 2a92f7cc18914e69a99a3ae203416d1f
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 13.06325912475586
    num_agent_steps_sampled: 100800
    num_steps_sampled: 100800
    num_steps_trained: 100800
  iterations_since_restore: 72
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 87.3
    gpu_util_percent0: 0.05
    ram_util_percent: 64.9
    vram_util_percent0: 0.17588022375781506
  pid: 27365
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.073

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00030,RUNNING,192.168.0.102:27947,0.00619586,linear,[64],1.0,0.31837,1400.0,21.0952,55.0,9.0,21.0952
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36.0,12.5987,50400.0,154.08,200.0,71.0,154.08
PG_CartPole-v0_385d8_00006,TERMINATED,,0.00338773,linear,[64],35.0,11.8572,49000.0,155.01,200.0,28.0,155.01
PG_CartPole-v0_385d8_00007,TERMINATED,,0.0079584,relu,[64],26.0,9.34593,36400.0,151.55,200.0,27.0,151.55


Result for PG_CartPole-v0_385d8_00030:
  agent_timesteps_total: 22400
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-01-47
  done: false
  episode_len_mean: 101.05
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 101.05
  episode_reward_min: 16.0
  episodes_this_iter: 7
  episodes_total: 397
  experiment_id: 5c2dcd48b905435a8b6a60115ec1f608
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 26.216829299926758
    num_agent_steps_sampled: 22400
    num_steps_sampled: 22400
    num_steps_trained: 22400
  iterations_since_restore: 16
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 83.6
    gpu_util_percent0: 0.01
    ram_util_percent: 65.1
    vram_util_percent0: 0.1771964461994077
  pid: 27947
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07008

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00030,RUNNING,192.168.0.102:27947,0.00619586,linear,[64],16.0,5.27579,22400.0,101.05,200.0,16.0,101.05
PG_CartPole-v0_385d8_00031,PENDING,,0.00915834,relu,[64],,,,,,,
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66.0,23.522,92400.0,153.74,200.0,33.0,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61.0,21.3376,85400.0,150.49,200.0,29.0,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23.0,7.8096,32200.0,154.94,200.0,46.0,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72.0,25.6645,100800.0,48.36,171.0,11.0,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52.0,19.5719,72800.0,151.24,200.0,30.0,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36.0,12.5987,50400.0,154.08,200.0,71.0,154.08
PG_CartPole-v0_385d8_00006,TERMINATED,,0.00338773,linear,[64],35.0,11.8572,49000.0,155.01,200.0,28.0,155.01
PG_CartPole-v0_385d8_00007,TERMINATED,,0.0079584,relu,[64],26.0,9.34593,36400.0,151.55,200.0,27.0,151.55


Result for PG_CartPole-v0_385d8_00030:
  agent_timesteps_total: 30800
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-01-50
  done: true
  episode_len_mean: 152.23
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 152.23
  episode_reward_min: 16.0
  episodes_this_iter: 7
  episodes_total: 445
  experiment_id: 5c2dcd48b905435a8b6a60115ec1f608
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 23.055614471435547
    num_agent_steps_sampled: 30800
    num_steps_sampled: 30800
    num_steps_trained: 30800
  iterations_since_restore: 22
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf: {}
  pid: 27947
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.07256182947180959
    mean_env_render_ms: 0.0
    mean_env_wait_ms: 0.10414332698032762
    mean_inference_ms: 1.230980384729126

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00031,RUNNING,192.168.0.102:28386,0.00915834,relu,[64],1,0.342407,1400,20.7213,63,9,20.7213
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66,23.522,92400,153.74,200,33,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61,21.3376,85400,150.49,200,29,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23,7.8096,32200,154.94,200,46,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72,25.6645,100800,48.36,171,11,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52,19.5719,72800,151.24,200,30,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36,12.5987,50400,154.08,200,71,154.08
PG_CartPole-v0_385d8_00006,TERMINATED,,0.00338773,linear,[64],35,11.8572,49000,155.01,200,28,155.01
PG_CartPole-v0_385d8_00007,TERMINATED,,0.0079584,relu,[64],26,9.34593,36400,151.55,200,27,151.55
PG_CartPole-v0_385d8_00008,TERMINATED,,0.00893173,linear,[32],29,9.90459,40600,150.78,200,48,150.78


Result for PG_CartPole-v0_385d8_00031:
  agent_timesteps_total: 21000
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-02-09
  done: false
  episode_len_mean: 83.94
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 83.94
  episode_reward_min: 24.0
  episodes_this_iter: 13
  episodes_total: 423
  experiment_id: 423f2a15e2654df485f11219f40daac0
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 18.95767593383789
    num_agent_steps_sampled: 21000
    num_steps_sampled: 21000
    num_steps_trained: 21000
  iterations_since_restore: 15
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 89.4
    gpu_util_percent0: 0.13
    ram_util_percent: 66.7
    vram_util_percent0: 0.18295491938137545
  pid: 28386
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.076005

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00031,RUNNING,192.168.0.102:28386,0.00915834,relu,[64],15,5.43445,21000,83.94,200,24,83.94
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66,23.522,92400,153.74,200,33,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61,21.3376,85400,150.49,200,29,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23,7.8096,32200,154.94,200,46,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72,25.6645,100800,48.36,171,11,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52,19.5719,72800,151.24,200,30,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36,12.5987,50400,154.08,200,71,154.08
PG_CartPole-v0_385d8_00006,TERMINATED,,0.00338773,linear,[64],35,11.8572,49000,155.01,200,28,155.01
PG_CartPole-v0_385d8_00007,TERMINATED,,0.0079584,relu,[64],26,9.34593,36400,151.55,200,27,151.55
PG_CartPole-v0_385d8_00008,TERMINATED,,0.00893173,linear,[32],29,9.90459,40600,150.78,200,48,150.78


Result for PG_CartPole-v0_385d8_00031:
  agent_timesteps_total: 39200
  custom_metrics:
    default_policy: {}
  date: 2021-11-16_18-02-14
  done: true
  episode_len_mean: 150.58
  episode_media: {}
  episode_reward_max: 200.0
  episode_reward_mean: 150.58
  episode_reward_min: 44.0
  episodes_this_iter: 8
  episodes_total: 550
  experiment_id: 423f2a15e2654df485f11219f40daac0
  hostname: bruno-odyssey-mint
  info:
    learner:
      default_policy:
        allreduce_latency: 0.0
        policy_loss: 24.39448356628418
    num_agent_steps_sampled: 39200
    num_steps_sampled: 39200
    num_steps_trained: 39200
  iterations_since_restore: 28
  node_ip: 192.168.0.102
  num_healthy_workers: 7
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 83.9
    gpu_util_percent0: 0.05
    ram_util_percent: 66.3
    vram_util_percent0: 0.17851266864100032
  pid: 28386
  policy_reward_max: {}
  policy_reward_mean: {}
  policy_reward_min: {}
  sampler_perf:
    mean_action_processing_ms: 0.076527

Trial name,status,loc,lr,model/fcnet_activation,model/fcnet_hiddens,iter,total time (s),ts,reward,episode_reward_max,episode_reward_min,episode_len_mean
PG_CartPole-v0_385d8_00000,TERMINATED,,0.00428141,linear,[32],66,23.522,92400,153.74,200,33,153.74
PG_CartPole-v0_385d8_00001,TERMINATED,,0.00357243,relu,[32],61,21.3376,85400,150.49,200,29,150.49
PG_CartPole-v0_385d8_00002,TERMINATED,,0.00806013,linear,[64],23,7.8096,32200,154.94,200,46,154.94
PG_CartPole-v0_385d8_00003,TERMINATED,,0.000611846,relu,[64],72,25.6645,100800,48.36,171,11,48.36
PG_CartPole-v0_385d8_00004,TERMINATED,,0.00379117,linear,[32],52,19.5719,72800,151.24,200,30,151.24
PG_CartPole-v0_385d8_00005,TERMINATED,,0.00852355,relu,[32],36,12.5987,50400,154.08,200,71,154.08
PG_CartPole-v0_385d8_00006,TERMINATED,,0.00338773,linear,[64],35,11.8572,49000,155.01,200,28,155.01
PG_CartPole-v0_385d8_00007,TERMINATED,,0.0079584,relu,[64],26,9.34593,36400,151.55,200,27,151.55
PG_CartPole-v0_385d8_00008,TERMINATED,,0.00893173,linear,[32],29,9.90459,40600,150.78,200,48,150.78
PG_CartPole-v0_385d8_00009,TERMINATED,,0.00666936,relu,[32],46,16.201,64400,150.91,200,26,150.91


2021-11-16 18:02:14,686	INFO tune.py:549 -- Total run time: 968.62 seconds (967.95 seconds for the tuning loop).


In [25]:
print(
    "Best hyperparameters found:",
    parameter_search_analysis.best_config,
)

Best hyperparameters found: {'env': 'CartPole-v0', 'framework': 'torch', 'num_gpus': 1, 'num_workers': 7, 'model': {'fcnet_hiddens': [64], 'fcnet_activation': 'linear'}, 'lr': 0.008777148144145926}


Specifying `num_samples = 5` means that Tune draws five random samples for the learning rate. Each sample is combined with the two values for the hidden-layer size and the two values for the activation function, so there are 5 * 2 * 2 = 20 trials, shown with their statuses in the cell output as the computation runs. (The log above lists trial IDs up to `00031`, that is, 32 trials, which would correspond to `num_samples = 8` with the same 2 * 2 grid.)
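The trial count can be checked with a small, library-free sketch. It only mimics how a search space combines random samples with a grid; it does not call Ray Tune. The hidden sizes and activations match the values seen in the trial table, while the log-uniform range for `lr` and the helper name `enumerate_trials` are assumptions made for illustration. Each trial draws its own learning rate, as the distinct `lr` values in the trial table suggest.

```python
import itertools
import random

# Hypothetical search space, mirroring the values that appear in the trial table.
HIDDEN_SIZES = [[32], [64]]       # grid over hidden-layer size
ACTIVATIONS = ["linear", "relu"]  # grid over activation function
NUM_SAMPLES = 5                   # how many times the whole grid is repeated


def enumerate_trials(num_samples, seed=0):
    """Return one config dict per trial: num_samples repetitions of the full grid."""
    rng = random.Random(seed)
    trials = []
    for _ in range(num_samples):
        for hiddens, activation in itertools.product(HIDDEN_SIZES, ACTIVATIONS):
            # Assumed sampling range: log-uniform between 1e-4 and 1e-2.
            lr = 10 ** rng.uniform(-4, -2)
            trials.append(
                {
                    "lr": lr,
                    "model": {
                        "fcnet_hiddens": hiddens,
                        "fcnet_activation": activation,
                    },
                }
            )
    return trials


trials = enumerate_trials(NUM_SAMPLES)
print(len(trials))  # prints 20
```

Changing `NUM_SAMPLES` scales the total linearly, since every repetition expands the full 2 * 2 grid.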

Note that Ray reports the current best configuration as the search progresses. The report includes every default value that was set, which makes it a good place to find other parameters worth tuning.
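One way to see what "best configuration" means is to rank finished trials by their mean reward. The sketch below does this by hand for a few rows copied from the status table above, trimmed to the columns it needs. It is not how Tune computes `best_config` internally, and the helper name `best_trial` is made up for this example.

```python
import csv
import io

# A few rows copied from the status table (only the columns used here).
STATUS_CSV = """\
Trial name,lr,model/fcnet_activation,model/fcnet_hiddens,reward
PG_CartPole-v0_385d8_00000,0.00428141,linear,[32],153.74
PG_CartPole-v0_385d8_00002,0.00806013,linear,[64],154.94
PG_CartPole-v0_385d8_00003,0.000611846,relu,[64],48.36
PG_CartPole-v0_385d8_00006,0.00338773,linear,[64],155.01
"""


def best_trial(csv_text, metric="reward"):
    """Return the row (as a dict) with the highest value of `metric`."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return max(rows, key=lambda row: float(row[metric]))


best = best_trial(STATUS_CSV)
print(best["Trial name"], best["reward"])
```

Among these four rows, the trial with the highest mean reward is `PG_CartPole-v0_385d8_00006`; the full experiment contains more trials, so its overall best configuration can differ.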


# Exercise

Now that you know the basic API of Ray Tune and RLlib, **use the `BreakoutNoFrameskip-v4` environment to train agents with the A3C, PPO, and SAC algorithms**. Remember to use TensorBoard as well to monitor and compare the learning curves of your runs.

Descriptions of the algorithms and their hyperparameters can be found [here](https://docs.ray.io/en/latest/rllib-algorithms.html#available-algorithms-overview).
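As a possible starting point for the exercise, the sketch below builds one shared config and hands it to `tune.run` once per algorithm. It assumes the Ray 1.x API used elsewhere in this notebook, where `"A3C"`, `"PPO"`, and `"SAC"` are registered trainer names. The stopping budget, worker count, and results directory are placeholder values, not tuned settings, and the helper names `build_config` and `launch` are hypothetical.

```python
ENVIRONMENT_ID = "BreakoutNoFrameskip-v4"
ALGORITHMS = ["A3C", "PPO", "SAC"]


def build_config(num_workers=1, num_gpus=0):
    """Shared settings; algorithm-specific hyperparameters keep their defaults."""
    return {
        "env": ENVIRONMENT_ID,
        "framework": "torch",
        "num_workers": num_workers,  # A3C needs at least one rollout worker
        "num_gpus": num_gpus,
    }


def launch(algorithm, timesteps=100_000, local_dir="~/ray_results"):
    """Train one algorithm with Ray Tune (needs Ray RLlib and the Atari ROMs)."""
    from ray import tune  # imported lazily so the helpers above work without Ray

    return tune.run(
        algorithm,
        config=build_config(),
        stop={"timesteps_total": timesteps},  # placeholder training budget
        local_dir=local_dir,                  # TensorBoard event files land here
        checkpoint_at_end=True,
    )


# Show what would be submitted, without starting any training.
for name in ALGORITHMS:
    print(name, build_config())
```

Calling `launch(name)` for each entry in `ALGORITHMS` would start the runs; pointing TensorBoard at the same directory (`tensorboard --logdir ~/ray_results`) should then show one set of curves per algorithm.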

#### 0. (Re)Imports + env

In [11]:
import gym
from gym.wrappers.monitoring.video_recorder import VideoRecorder
from gym.spaces import Discrete, Box

import ray
import ray.rllib.agents.pg as pg
from ray.tune.logger import pretty_print
from ray import tune
from ray.rllib.env.env_context import EnvContext
from ray.rllib.models import ModelCatalog
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2
from ray.rllib.models.torch.fcnet import FullyConnectedNetwork as TorchFC
from ray.rllib.agents.pg import PGTrainer

from ray.rllib.models.preprocessors import get_preprocessor

import numpy as np
import os
import random

import torch
import torch.nn as nn

In [12]:
environment_id = "BreakoutNoFrameskip-v4"
env = gym.make(environment_id)

action_size = env.action_space.n
# Atari observations are image frames, so shape[0] is only the frame height
observation_size = env.observation_space.shape[0]
print(f"Action size: {action_size}\nObservation size: {observation_size}")

Action size: 4
Observation size: 210
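
The value 210 is only the first dimension of the observation. A small sketch, assuming the usual Atari frame shape of (210, 160, 3) (height, width, RGB channels) instead of querying the environment, shows how many values a single observation holds.

```python
import math

# Assumed observation shape for BreakoutNoFrameskip-v4: height, width, RGB channels.
FRAME_SHAPE = (210, 160, 3)

height, width, channels = FRAME_SHAPE
flat_size = math.prod(FRAME_SHAPE)  # number of pixel values in one frame

print(f"height={height}, width={width}, channels={channels}")
print(f"values per observation: {flat_size}")
```

In the notebook itself, `env.observation_space.shape` gives the actual shape and is the safer value to rely on.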


In [13]:
def one_hot_encode(targets: np.ndarray, nb_classes: int) -> np.ndarray:
    """One-hot encode an array of integer class labels (e.g. actions).

    Thanks to: https://stackoverflow.com/a/42874726/5128626

    Args:
        targets (np.ndarray): Array of integer labels, of any shape.
        nb_classes (int): Number of classes.

    Returns:
        np.ndarray: Encoded targets, with one extra trailing axis of size nb_classes.
    """
    res = np.eye(nb_classes)[np.array(targets).reshape(-1)]
    return res.reshape(list(targets.shape) + [nb_classes])

In [14]:
ray.shutdown()
ray.init(ignore_reinit_error=True, include_dashboard=False)

{'node_ip_address': '192.168.0.102',
 'raylet_ip_address': '192.168.0.102',
 'redis_address': '192.168.0.102:6379',
 'object_store_address': '/tmp/ray/session_2021-11-17_21-29-14_906496_11466/sockets/plasma_store',
 'raylet_socket_name': '/tmp/ray/session_2021-11-17_21-29-14_906496_11466/sockets/raylet',
 'webui_url': None,
 'session_dir': '/tmp/ray/session_2021-11-17_21-29-14_906496_11466',
 'metrics_export_port': 62327,
 'node_id': 'e2243d24b8b161e3cae43f75ef1398bf95668ec16a34846348b7f9fd'}

#### 1. Visualize a random agent

In [None]:
# INSERT HERE THE CODE FOR TRAINING ON BreakoutNoFrameskip-v4
before_training = os.path.join(
    DRIVE_PATH, "{}_before_training.mp4".format(
        environment_id)
)
print(before_training)

video = VideoRecorder(env, before_training)
env.reset()
for i in range(200):
    env.render()
    video.capture_frame()
    # Take a random action; start a new episode whenever the current one ends
    observation, reward, done, info = env.step(env.action_space.sample())
    if done:
        env.reset()

video.close()
env.close()

html = render_mp4(before_training)
HTML(html)


#### 2. Train an agent using Ray RLlib

In [16]:
config = pg.DEFAULT_CONFIG.copy()
config["num_gpus"] = 0
config["num_workers"] = 1
config["lr"] = 0.0004
config["framework"] = "torch"


In [20]:
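trainer = PGTrainer(config=config, env=environment_id)
# Each call to trainer.train() runs one training iteration, not one episode
num_iterations = 1000
checkpoint_every = 1  # save a checkpoint and report every N iterations

for i in range(1, num_iterations + 1):
    result = trainer.train()

    if i % checkpoint_every == 0:
        checkpoint = trainer.save()
        print(pretty_print(result))
        print("checkpoint saved at", checkpoint)

last_checkpoint = trainer.save()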


2021-11-17 21:58:40,670	INFO trainable.py:101 -- Trainable.setup took 55.986 seconds. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.


agent_timesteps_total: 200
custom_metrics:
  default_policy: {}
date: 2021-11-17_21-58-41
done: false
episode_len_mean: 525.0
episode_media: {}
episode_reward_max: 0.0
episode_reward_mean: 0.0
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 1
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 200
  num_steps_sampled: 200
  num_steps_trained: 200
iterations_since_restore: 1
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 22.450000000000003
  gpu_util_percent0: 0.07
  ram_util_percent: 49.9
  vram_util_percent0: 0.059065482066469235
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04701709272849618
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.2608748763354856
  mean_inference_ms: 2.4366082243658416

agent_timesteps_total: 1400
custom_metrics:
  default_policy: {}
date: 2021-11-17_21-58-51
done: false
episode_len_mean: 652.1111111111111
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.0
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 9
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 1400
  num_steps_sampled: 1400
  num_steps_trained: 1400
iterations_since_restore: 7
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 30.15
  gpu_util_percent0: 0.02
  ram_util_percent: 50.2
  vram_util_percent0: 0.06638696939782823
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050732477963325376
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3985076727829207
  mean_inference_ms: 2.5705339794548503

agent_timesteps_total: 2600
custom_metrics:
  default_policy: {}
date: 2021-11-17_21-58-59
done: false
episode_len_mean: 658.2941176470588
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.0588235294117647
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 17
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 2600
  num_steps_sampled: 2600
  num_steps_trained: 2600
iterations_since_restore: 13
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 31.549999999999997
  gpu_util_percent0: 0.04
  ram_util_percent: 50.2
  vram_util_percent0: 0.06893715037841396
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05100677222749256
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.417535837542987
  mean_inference_ms:

agent_timesteps_total: 3800
custom_metrics:
  default_policy: {}
date: 2021-11-17_21-59-07
done: false
episode_len_mean: 609.0
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 0.6428571428571429
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 28
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 3800
  num_steps_sampled: 3800
  num_steps_trained: 3800
iterations_since_restore: 19
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 41.2
  gpu_util_percent0: 0.03
  ram_util_percent: 51.25
  vram_util_percent0: 0.06646923330042777
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05127401133112756
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4317542203552567
  mean_inference_ms: 2.6116558642475107

agent_timesteps_total: 5000
custom_metrics:
  default_policy: {}
date: 2021-11-17_21-59-14
done: false
episode_len_mean: 622.0555555555555
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 0.75
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 36
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 5000
  num_steps_sampled: 5000
  num_steps_trained: 5000
iterations_since_restore: 25
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 24.25
  gpu_util_percent0: 0.03
  ram_util_percent: 50.1
  vram_util_percent0: 0.06646923330042777
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0514439273415397
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4397953350768486
  mean_inference_ms: 2.6251437279435086

agent_timesteps_total: 6200
custom_metrics:
  default_policy: {}
date: 2021-11-17_21-59-22
done: false
episode_len_mean: 606.1860465116279
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 0.627906976744186
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 43
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 6200
  num_steps_sampled: 6200
  num_steps_trained: 6200
iterations_since_restore: 31
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 23.0
  gpu_util_percent0: 0.03
  ram_util_percent: 50.1
  vram_util_percent0: 0.06646923330042777
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05147697830039937
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4426152624601702
  mean_inference_ms: 2.63294017047

agent_timesteps_total: 7400
custom_metrics:
  default_policy: {}
date: 2021-11-17_21-59-29
done: false
episode_len_mean: 663.734693877551
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.1020408163265305
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 49
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 7400
  num_steps_sampled: 7400
  num_steps_trained: 7400
iterations_since_restore: 37
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 23.35
  gpu_util_percent0: 0.015
  ram_util_percent: 49.85
  vram_util_percent0: 0.06646923330042777
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05133873189226558
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4404687214376932
  mean_inference_ms: 2.62893310

agent_timesteps_total: 8600
custom_metrics:
  default_policy: {}
date: 2021-11-17_21-59-35
done: false
episode_len_mean: 668.6545454545454
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.1454545454545455
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 55
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 8600
  num_steps_sampled: 8600
  num_steps_trained: 8600
iterations_since_restore: 43
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 30.2
  gpu_util_percent0: 0.06
  ram_util_percent: 49.8
  vram_util_percent0: 0.06646923330042777
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.051168045685516826
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4372878554220259
  mean_inference_ms: 2.622031317

agent_timesteps_total: 9800
custom_metrics:
  default_policy: {}
date: 2021-11-17_21-59-42
done: false
episode_len_mean: 688.0
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.3064516129032258
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 62
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 9800
  num_steps_sampled: 9800
  num_steps_trained: 9800
iterations_since_restore: 49
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 29.85
  gpu_util_percent0: 0.03
  ram_util_percent: 49.5
  vram_util_percent0: 0.06646923330042777
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05094436473169259
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4328603855011257
  mean_inference_ms: 2.611462832282615

agent_timesteps_total: 11000
custom_metrics:
  default_policy: {}
date: 2021-11-17_21-59-49
done: false
episode_len_mean: 687.5507246376811
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.3043478260869565
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 69
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 11000
  num_steps_sampled: 11000
  num_steps_trained: 11000
iterations_since_restore: 55
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 39.05
  gpu_util_percent0: 0.03
  ram_util_percent: 49.849999999999994
  vram_util_percent0: 0.06646923330042777
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05074989369234224
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.429236580914466

agent_timesteps_total: 12200
custom_metrics:
  default_policy: {}
date: 2021-11-17_21-59-57
done: false
episode_len_mean: 727.0694444444445
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.625
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 72
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 12200
  num_steps_sampled: 12200
  num_steps_trained: 12200
iterations_since_restore: 61
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.900000000000006
  gpu_util_percent0: 0.18
  ram_util_percent: 50.2
  vram_util_percent0: 0.0665514972030273
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05068317890447065
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4280022309564846
  mean_inference_ms: 2.598328

agent_timesteps_total: 13400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-00-04
done: false
episode_len_mean: 754.1410256410256
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.8461538461538463
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 78
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 13400
  num_steps_sampled: 13400
  num_steps_trained: 13400
iterations_since_restore: 67
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 36.65
  gpu_util_percent0: 0.045
  ram_util_percent: 50.3
  vram_util_percent0: 0.06803224744981902
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050549541193929244
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4251648635316858
  mean_inference_ms: 2.591

agent_timesteps_total: 14600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-00-11
done: false
episode_len_mean: 728.75
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.6363636363636365
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 88
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 14600
  num_steps_sampled: 14600
  num_steps_trained: 14600
iterations_since_restore: 73
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 24.85
  gpu_util_percent0: 0.005
  ram_util_percent: 50.1
  vram_util_percent0: 0.06745640013162224
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05034219923798216
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4203742675259392
  mean_inference_ms: 2.581736704207994

agent_timesteps_total: 15800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-00-19
done: false
episode_len_mean: 743.8804347826087
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.7608695652173914
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 92
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 15800
  num_steps_sampled: 15800
  num_steps_trained: 15800
iterations_since_restore: 79
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 46.2
  gpu_util_percent0: 0.075
  ram_util_percent: 50.1
  vram_util_percent0: 0.06745640013162224
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050277545868990414
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4187966832573782
  mean_inference_ms: 2.5787

agent_timesteps_total: 17000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-00-27
done: false
episode_len_mean: 751.0909090909091
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.8181818181818181
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 99
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 17000
  num_steps_sampled: 17000
  num_steps_trained: 17000
iterations_since_restore: 85
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 56.4
  gpu_util_percent0: 0.02
  ram_util_percent: 49.95
  vram_util_percent0: 0.06745640013162224
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05019199709770858
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4166757068002371
  mean_inference_ms: 2.57500

agent_timesteps_total: 18200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-00-34
done: false
episode_len_mean: 748.71
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.8
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 105
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 18200
  num_steps_sampled: 18200
  num_steps_trained: 18200
iterations_since_restore: 91
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 45.45
  gpu_util_percent0: 0.13
  ram_util_percent: 50.05
  vram_util_percent0: 0.06745640013162224
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05014211360102024
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4176884675215917
  mean_inference_ms: 2.5745562901700896
  mean_raw_o

agent_timesteps_total: 19400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-00-42
done: false
episode_len_mean: 781.85
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.07
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 110
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 19400
  num_steps_sampled: 19400
  num_steps_trained: 19400
iterations_since_restore: 97
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 38.099999999999994
  gpu_util_percent0: 0.04
  ram_util_percent: 49.9
  vram_util_percent0: 0.06745640013162224
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05004090882666915
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.41580146205967
  mean_inference_ms: 2.5718411738076696

agent_timesteps_total: 20600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-00-50
done: false
episode_len_mean: 792.86
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.16
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 116
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 20600
  num_steps_sampled: 20600
  num_steps_trained: 20600
iterations_since_restore: 103
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.2
  gpu_util_percent0: 0.065
  ram_util_percent: 49.7
  vram_util_percent0: 0.06803224744981902
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04994430499549531
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.413634653803
  mean_inference_ms: 2.569316081002414
  mean_raw_obs_p

agent_timesteps_total: 21800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-00-58
done: false
episode_len_mean: 792.33
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.16
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 125
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 21800
  num_steps_sampled: 21800
  num_steps_trained: 21800
iterations_since_restore: 109
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 44.25
  gpu_util_percent0: 0.08
  ram_util_percent: 50.55
  vram_util_percent0: 0.06885488647581442
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049772487276522774
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4092648557599008
  mean_inference_ms: 2.562879084949959
  mean_raw

agent_timesteps_total: 23000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-01-07
done: false
episode_len_mean: 802.82
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 132
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 23000
  num_steps_sampled: 23000
  num_steps_trained: 23000
iterations_since_restore: 115
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 53.3
  gpu_util_percent0: 0.08
  ram_util_percent: 50.6
  vram_util_percent0: 0.07025337282000658
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04965156392346441
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4061516702699444
  mean_inference_ms: 2.558231058494867
  mean_raw_ob

agent_timesteps_total: 24200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-01-16
done: false
episode_len_mean: 813.63
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 139
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 24200
  num_steps_sampled: 24200
  num_steps_trained: 24200
iterations_since_restore: 121
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 44.4
  gpu_util_percent0: 0.06
  ram_util_percent: 52.03333333333333
  vram_util_percent0: 0.06855325216628277
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04952789635696735
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4027351110467565
  mean_inference_ms: 2.5529333841938713

agent_timesteps_total: 25400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-01-25
done: false
episode_len_mean: 835.99
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 144
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 25400
  num_steps_sampled: 25400
  num_steps_trained: 25400
iterations_since_restore: 127
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 45.2
  gpu_util_percent0: 0.03
  ram_util_percent: 52.400000000000006
  vram_util_percent0: 0.07074695623560381
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049500632457643585
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4020638232734783
  mean_inference_ms: 2.55177780790499

agent_timesteps_total: 26600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-01-33
done: false
episode_len_mean: 825.02
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 150
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 26600
  num_steps_sampled: 26600
  num_steps_trained: 26600
iterations_since_restore: 133
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.55
  gpu_util_percent0: 0.115
  ram_util_percent: 51.2
  vram_util_percent0: 0.06975978940440934
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04953417302255913
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4027732692570531
  mean_inference_ms: 2.5545746491574115
  mean_raw

agent_timesteps_total: 27800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-01-41
done: false
episode_len_mean: 825.35
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 156
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 27800
  num_steps_sampled: 27800
  num_steps_trained: 27800
iterations_since_restore: 139
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 32.8
  gpu_util_percent0: 0.0
  ram_util_percent: 51.3
  vram_util_percent0: 0.07058242843040474
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049608654569111704
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.404623101835822
  mean_inference_ms: 2.5597598884288284
  mean_raw_ob

agent_timesteps_total: 29000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-01-49
done: false
episode_len_mean: 847.16
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 160
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 29000
  num_steps_sampled: 29000
  num_steps_trained: 29000
iterations_since_restore: 145
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 29.76666666666667
  gpu_util_percent0: 0.04666666666666667
  ram_util_percent: 51.333333333333336
  vram_util_percent0: 0.07025337282000658
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04967736856581821
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4062912544098887
  mean_inf

agent_timesteps_total: 30200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-01-57
done: false
episode_len_mean: 858.6
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.7
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 166
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 30200
  num_steps_sampled: 30200
  num_steps_trained: 30200
iterations_since_restore: 151
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 31.5
  gpu_util_percent0: 0.0
  ram_util_percent: 51.3
  vram_util_percent0: 0.07025337282000658
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049790958480099894
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4088845418489344
  mean_inference_ms: 2.5723966120048334
  mean_raw_obs

agent_timesteps_total: 31400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-02-05
done: false
episode_len_mean: 858.0
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.7
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 171
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 31400
  num_steps_sampled: 31400
  num_steps_trained: 31400
iterations_since_restore: 157
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 63.849999999999994
  gpu_util_percent0: 0.055
  ram_util_percent: 51.2
  vram_util_percent0: 0.0700888450148075
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049889617117507355
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4110493435871272
  mean_inference_ms: 2.579399109875585

agent_timesteps_total: 32600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-02-13
done: false
episode_len_mean: 836.04
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 178
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 32600
  num_steps_sampled: 32600
  num_steps_trained: 32600
iterations_since_restore: 163
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 80.0
  gpu_util_percent0: 0.01
  ram_util_percent: 51.2
  vram_util_percent0: 0.0709114840408029
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05003471030356217
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4144596746868672
  mean_inference_ms: 2.589436589540978
  mean_raw_obs

agent_timesteps_total: 33800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-02-21
done: false
episode_len_mean: 847.14
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 186
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 33800
  num_steps_sampled: 33800
  num_steps_trained: 33800
iterations_since_restore: 169
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 33.35
  gpu_util_percent0: 0.005
  ram_util_percent: 51.1
  vram_util_percent0: 0.07272128989799276
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050226344444375286
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4191914031121382
  mean_inference_ms: 2.602179362119566
  mean_raw

agent_timesteps_total: 35000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-02-29
done: false
episode_len_mean: 825.65
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 194
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 35000
  num_steps_sampled: 35000
  num_steps_trained: 35000
iterations_since_restore: 175
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 56.8
  gpu_util_percent0: 0.005
  ram_util_percent: 51.2
  vram_util_percent0: 0.07272128989799276
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050410128247019965
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4239095465403095
  mean_inference_ms: 2.614295042200254
  mean_raw_

agent_timesteps_total: 36200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-02-36
done: false
episode_len_mean: 814.06
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 201
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 36200
  num_steps_sampled: 36200
  num_steps_trained: 36200
iterations_since_restore: 181
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 54.15
  gpu_util_percent0: 0.005
  ram_util_percent: 51.4
  vram_util_percent0: 0.07272128989799276
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050549869474833135
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4275555235473574
  mean_inference_ms: 2.623635144559847
  mean_raw

agent_timesteps_total: 37400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-02-45
done: false
episode_len_mean: 825.07
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 206
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 37400
  num_steps_sampled: 37400
  num_steps_trained: 37400
iterations_since_restore: 187
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 66.7
  gpu_util_percent0: 0.05
  ram_util_percent: 51.1
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0506425932131081
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4299877802955197
  mean_inference_ms: 2.629829659159742
  mean_raw_obs

agent_timesteps_total: 38600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-02-52
done: false
episode_len_mean: 814.29
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 212
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 38600
  num_steps_sampled: 38600
  num_steps_trained: 38600
iterations_since_restore: 193
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.085
  ram_util_percent: 51.2
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050740901679020387
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4326032241902968
  mean_inference_ms: 2.636423589586061
  mean_raw_

agent_timesteps_total: 39800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-03-00
done: false
episode_len_mean: 825.03
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 216
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 39800
  num_steps_sampled: 39800
  num_steps_trained: 39800
iterations_since_restore: 199
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 67.5
  gpu_util_percent0: 0.05
  ram_util_percent: 51.2
  vram_util_percent0: 0.07337940111878907
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05080099658728331
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4342526709008014
  mean_inference_ms: 2.640438821408454
  mean_raw_ob

agent_timesteps_total: 41000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-03-08
done: false
episode_len_mean: 847.41
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 222
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 41000
  num_steps_sampled: 41000
  num_steps_trained: 41000
iterations_since_restore: 205
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 70.85
  gpu_util_percent0: 0.2
  ram_util_percent: 51.3
  vram_util_percent0: 0.07387298453438632
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05089000766048293
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4366540470835576
  mean_inference_ms: 2.6464476425435333
  mean_raw_o

agent_timesteps_total: 42200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-03-16
done: false
episode_len_mean: 847.75
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 229
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 42200
  num_steps_sampled: 42200
  num_steps_trained: 42200
iterations_since_restore: 211
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 62.5
  gpu_util_percent0: 0.015
  ram_util_percent: 51.7
  vram_util_percent0: 0.07189865087199737
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050983729693175925
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4390983886420983
  mean_inference_ms: 2.652915804145826
  mean_raw_

agent_timesteps_total: 43400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-03-24
done: false
episode_len_mean: 847.87
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 238
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 43400
  num_steps_sampled: 43400
  num_steps_trained: 43400
iterations_since_restore: 217
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 77.75
  gpu_util_percent0: 0.125
  ram_util_percent: 51.8
  vram_util_percent0: 0.0746956235603817
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.051079719607156696
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.441481165707131
  mean_inference_ms: 2.659635080916702
  mean_raw_o

agent_timesteps_total: 44600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-03-31
done: false
episode_len_mean: 837.5
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 244
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 44600
  num_steps_sampled: 44600
  num_steps_trained: 44600
iterations_since_restore: 223
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.25
  gpu_util_percent0: 0.04
  ram_util_percent: 51.7
  vram_util_percent0: 0.07255676209279369
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05112589195713596
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4424919782777872
  mean_inference_ms: 2.662905883377046
  mean_raw_ob

agent_timesteps_total: 45800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-03-39
done: false
episode_len_mean: 815.62
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 252
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 45800
  num_steps_sampled: 45800
  num_steps_trained: 45800
iterations_since_restore: 229
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 56.35
  gpu_util_percent0: 0.05
  ram_util_percent: 51.650000000000006
  vram_util_percent0: 0.07486015136558077
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05117134637652982
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.443379870084331
  mean_inference_ms: 2.666204455476991

agent_timesteps_total: 47000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-03-48
done: false
episode_len_mean: 815.9
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 257
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 47000
  num_steps_sampled: 47000
  num_steps_trained: 47000
iterations_since_restore: 235
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 55.8
  gpu_util_percent0: 0.015
  ram_util_percent: 51.650000000000006
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.051197887235846444
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4437912210715098
  mean_inference_ms: 2.66818190501639

agent_timesteps_total: 48200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-03-55
done: false
episode_len_mean: 816.41
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 263
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 48200
  num_steps_sampled: 48200
  num_steps_trained: 48200
iterations_since_restore: 241
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 62.5
  gpu_util_percent0: 0.03
  ram_util_percent: 51.55
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.051225844456785685
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4441868467947074
  mean_inference_ms: 2.6702520607238864
  mean_raw

agent_timesteps_total: 49400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-04-03
done: false
episode_len_mean: 794.61
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.16
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 269
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 49400
  num_steps_sampled: 49400
  num_steps_trained: 49400
iterations_since_restore: 247
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.03
  ram_util_percent: 51.6
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05125019288552333
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4445374048950759
  mean_inference_ms: 2.6720136097527685
  mean_raw_o

agent_timesteps_total: 50600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-04-11
done: false
episode_len_mean: 795.09
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.16
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 276
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 50600
  num_steps_sampled: 50600
  num_steps_trained: 50600
iterations_since_restore: 253
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 70.0
  gpu_util_percent0: 0.03
  ram_util_percent: 52.4
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05127421996287592
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4448998741626837
  mean_inference_ms: 2.673734119230838
  mean_raw_ob

agent_timesteps_total: 51800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-04-19
done: false
episode_len_mean: 816.76
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 281
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 51800
  num_steps_sampled: 51800
  num_steps_trained: 51800
iterations_since_restore: 259
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 44.7
  gpu_util_percent0: 0.015
  ram_util_percent: 51.7
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0512899071113515
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.445134387282328
  mean_inference_ms: 2.6748793736639
  mean_raw_obs_p

agent_timesteps_total: 53000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-04-26
done: false
episode_len_mean: 827.45
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 286
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 53000
  num_steps_sampled: 53000
  num_steps_trained: 53000
iterations_since_restore: 265
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 40.2
  gpu_util_percent0: 0.03
  ram_util_percent: 51.75
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05130467448495331
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4453534307763731
  mean_inference_ms: 2.6759538853555243
  mean_raw_

agent_timesteps_total: 54200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-04-34
done: false
episode_len_mean: 838.29
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 292
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 54200
  num_steps_sampled: 54200
  num_steps_trained: 54200
iterations_since_restore: 271
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 41.45
  gpu_util_percent0: 0.035
  ram_util_percent: 51.7
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05132092986163565
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4455700061259824
  mean_inference_ms: 2.677152369031333
  mean_raw_

agent_timesteps_total: 55400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-04-42
done: false
episode_len_mean: 849.11
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 298
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 55400
  num_steps_sampled: 55400
  num_steps_trained: 55400
iterations_since_restore: 277
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 66.7
  gpu_util_percent0: 0.03
  ram_util_percent: 51.65
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05133501114577725
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4457034622898433
  mean_inference_ms: 2.6782763524626803
  mean_raw_

agent_timesteps_total: 56600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-04-49
done: false
episode_len_mean: 860.31
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.7
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 303
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 56600
  num_steps_sampled: 56600
  num_steps_trained: 56600
iterations_since_restore: 283
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 55.0
  gpu_util_percent0: 0.06
  ram_util_percent: 51.7
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.051345264507142795
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4457574806456366
  mean_inference_ms: 2.6791627871805974
  mean_raw_o

agent_timesteps_total: 57800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-04-57
done: false
episode_len_mean: 860.28
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.7
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 311
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 57800
  num_steps_sampled: 57800
  num_steps_trained: 57800
iterations_since_restore: 289
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 66.7
  gpu_util_percent0: 0.06
  ram_util_percent: 52.0
  vram_util_percent0: 0.07387298453438632
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05135384942481295
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4457206306715482
  mean_inference_ms: 2.6799922565041614
  mean_raw_ob

agent_timesteps_total: 59000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-05-04
done: false
episode_len_mean: 849.55
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 315
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 59000
  num_steps_sampled: 59000
  num_steps_trained: 59000
iterations_since_restore: 295
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.2
  gpu_util_percent0: 0.0
  ram_util_percent: 51.95
  vram_util_percent0: 0.07387298453438632
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.051357138242664906
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4456740951849236
  mean_inference_ms: 2.68033423176003
  mean_raw_ob

agent_timesteps_total: 60200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-05-12
done: false
episode_len_mean: 860.65
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.7
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 320
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 60200
  num_steps_sampled: 60200
  num_steps_trained: 60200
iterations_since_restore: 301
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 42.6
  gpu_util_percent0: 0.0
  ram_util_percent: 51.9
  vram_util_percent0: 0.07387298453438632
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05136083179935718
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.445609571031628
  mean_inference_ms: 2.68069167145818
  mean_raw_obs_pr

agent_timesteps_total: 61400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-05-20
done: false
episode_len_mean: 848.98
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 328
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 61400
  num_steps_sampled: 61400
  num_steps_trained: 61400
iterations_since_restore: 307
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.6
  gpu_util_percent0: 0.03
  ram_util_percent: 51.95
  vram_util_percent0: 0.07387298453438632
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.051359014047345435
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.44537481395822
  mean_inference_ms: 2.68066787818796
  mean_raw_obs

agent_timesteps_total: 62600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-05-27
done: false
episode_len_mean: 860.26
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.7
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 333
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 62600
  num_steps_sampled: 62600
  num_steps_trained: 62600
iterations_since_restore: 313
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 44.4
  gpu_util_percent0: 0.0
  ram_util_percent: 52.1
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.051351022929809594
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.445086718571973
  mean_inference_ms: 2.6801994366190005
  mean_raw_obs

agent_timesteps_total: 63800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-05-34
done: false
episode_len_mean: 859.89
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.7
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 341
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 63800
  num_steps_sampled: 63800
  num_steps_trained: 63800
iterations_since_restore: 319
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 58.35
  gpu_util_percent0: 0.01
  ram_util_percent: 51.8
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05132352488770763
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4443131690019297
  mean_inference_ms: 2.678586015222532
  mean_raw_ob

agent_timesteps_total: 65000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-05-43
done: false
episode_len_mean: 870.39
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.79
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 347
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 65000
  num_steps_sampled: 65000
  num_steps_trained: 65000
iterations_since_restore: 325
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 25.0
  gpu_util_percent0: 0.0
  ram_util_percent: 51.3
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05129555614344258
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4435870169931149
  mean_inference_ms: 2.676908456059264
  mean_raw_obs

agent_timesteps_total: 66200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-05-50
done: false
episode_len_mean: 870.55
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.79
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 354
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 66200
  num_steps_sampled: 66200
  num_steps_trained: 66200
iterations_since_restore: 331
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.1
  gpu_util_percent0: 0.03
  ram_util_percent: 51.7
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05125117469325028
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4424149650959364
  mean_inference_ms: 2.6743212341197604
  mean_raw_o

agent_timesteps_total: 67400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-05-57
done: false
episode_len_mean: 858.86
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.7
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 361
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 67400
  num_steps_sampled: 67400
  num_steps_trained: 67400
iterations_since_restore: 337
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 37.5
  gpu_util_percent0: 0.06
  ram_util_percent: 51.6
  vram_util_percent0: 0.07370845672918723
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05120165799858991
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4410686964427102
  mean_inference_ms: 2.6714612367640793
  mean_raw_ob

agent_timesteps_total: 68600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-06-05
done: false
episode_len_mean: 836.75
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 369
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 68600
  num_steps_sampled: 68600
  num_steps_trained: 68600
iterations_since_restore: 343
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 47.2
  gpu_util_percent0: 0.0
  ram_util_percent: 51.8
  vram_util_percent0: 0.07222770648239553
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.051141797931184625
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4394597813518504
  mean_inference_ms: 2.6680147485944246
  mean_raw_o

agent_timesteps_total: 69800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-06-13
done: false
episode_len_mean: 836.75
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 377
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 69800
  num_steps_sampled: 69800
  num_steps_trained: 69800
iterations_since_restore: 349
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 45.75
  gpu_util_percent0: 0.06
  ram_util_percent: 51.95
  vram_util_percent0: 0.07222770648239553
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.051078754846433656
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.437788698395889
  mean_inference_ms: 2.6643785149742496
  mean_raw

agent_timesteps_total: 71000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-06-20
done: false
episode_len_mean: 825.88
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 382
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 71000
  num_steps_sampled: 71000
  num_steps_trained: 71000
iterations_since_restore: 355
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 43.75
  gpu_util_percent0: 0.0
  ram_util_percent: 51.349999999999994
  vram_util_percent0: 0.07222770648239553
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0510332417357823
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4365377258075496
  mean_inference_ms: 2.661756135589992


agent_timesteps_total: 72200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-06-27
done: false
episode_len_mean: 803.88
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 388
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 72200
  num_steps_sampled: 72200
  num_steps_trained: 72200
iterations_since_restore: 361
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 53.55
  gpu_util_percent0: 0.03
  ram_util_percent: 51.75
  vram_util_percent0: 0.07222770648239553
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0509783962884681
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4350286149699079
  mean_inference_ms: 2.6585720446324563
  mean_raw_

agent_timesteps_total: 73400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-06-34
done: false
episode_len_mean: 825.93
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 394
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 73400
  num_steps_sampled: 73400
  num_steps_trained: 73400
iterations_since_restore: 367
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 41.65
  gpu_util_percent0: 0.03
  ram_util_percent: 51.6
  vram_util_percent0: 0.07222770648239553
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05092105705092848
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4334462968706712
  mean_inference_ms: 2.6552562069303285
  mean_raw_

agent_timesteps_total: 74600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-06-42
done: false
episode_len_mean: 793.16
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.16
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 403
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 74600
  num_steps_sampled: 74600
  num_steps_trained: 74600
iterations_since_restore: 373
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.03
  ram_util_percent: 51.7
  vram_util_percent0: 0.07222770648239553
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050842595815885044
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4313390703613211
  mean_inference_ms: 2.650584327786026
  mean_raw_o

agent_timesteps_total: 75800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-06-49
done: false
episode_len_mean: 815.41
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 407
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 75800
  num_steps_sampled: 75800
  num_steps_trained: 75800
iterations_since_restore: 379
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 70.0
  gpu_util_percent0: 0.0
  ram_util_percent: 52.7
  vram_util_percent0: 0.07222770648239553
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05080887862810737
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.430431883238011
  mean_inference_ms: 2.6485507536532236
  mean_raw_obs

agent_timesteps_total: 77000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-06-57
done: false
episode_len_mean: 815.28
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 413
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 77000
  num_steps_sampled: 77000
  num_steps_trained: 77000
iterations_since_restore: 385
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 74.45
  gpu_util_percent0: 0.03
  ram_util_percent: 52.2
  vram_util_percent0: 0.07222770648239553
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05076083031654966
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4291514272845616
  mean_inference_ms: 2.645628985645394
  mean_raw_o

agent_timesteps_total: 78200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-07-06
done: false
episode_len_mean: 760.1
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.89
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 423
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 78200
  num_steps_sampled: 78200
  num_steps_trained: 78200
iterations_since_restore: 391
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 61.5
  gpu_util_percent0: 0.0
  ram_util_percent: 51.3
  vram_util_percent0: 0.07222770648239553
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0506853191276733
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4271314786970306
  mean_inference_ms: 2.641134326681381
  mean_raw_obs_p

agent_timesteps_total: 79400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-07-14
done: false
episode_len_mean: 782.21
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.07
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 428
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 79400
  num_steps_sampled: 79400
  num_steps_trained: 79400
iterations_since_restore: 397
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 36.65
  gpu_util_percent0: 0.02
  ram_util_percent: 51.1
  vram_util_percent0: 0.07222770648239553
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05064763185170775
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4260644522210741
  mean_inference_ms: 2.6389698627711904
  mean_raw_

agent_timesteps_total: 80600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-07-22
done: false
episode_len_mean: 771.31
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.98
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 435
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 80600
  num_steps_sampled: 80600
  num_steps_trained: 80600
iterations_since_restore: 403
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.7
  gpu_util_percent0: 0.03
  ram_util_percent: 51.35
  vram_util_percent0: 0.0663047054952287
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05060219241292988
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4247217043430678
  mean_inference_ms: 2.6363694386685927
  mean_raw_o

agent_timesteps_total: 81800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-07-30
done: false
episode_len_mean: 771.38
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.98
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 443
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 81800
  num_steps_sampled: 81800
  num_steps_trained: 81800
iterations_since_restore: 409
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.8
  gpu_util_percent0: 0.06
  ram_util_percent: 51.5
  vram_util_percent0: 0.0663047054952287
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05056102082786948
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.423421173525395
  mean_inference_ms: 2.6340227536580456
  mean_raw_obs

agent_timesteps_total: 83000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-07-37
done: false
episode_len_mean: 771.19
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.98
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 450
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 83000
  num_steps_sampled: 83000
  num_steps_trained: 83000
iterations_since_restore: 415
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 45.0
  gpu_util_percent0: 0.034999999999999996
  ram_util_percent: 51.4
  vram_util_percent0: 0.0663047054952287
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050534888191887235
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4225584242391898
  mean_inference_ms: 2.6324979277030

agent_timesteps_total: 84200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-07-45
done: false
episode_len_mean: 771.69
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.98
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 456
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 84200
  num_steps_sampled: 84200
  num_steps_trained: 84200
iterations_since_restore: 421
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.25
  gpu_util_percent0: 0.0
  ram_util_percent: 51.55
  vram_util_percent0: 0.0663047054952287
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05052034580613588
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4220519114380648
  mean_inference_ms: 2.631587917469821
  mean_raw_ob

agent_timesteps_total: 85400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-07-52
done: false
episode_len_mean: 782.97
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.07
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 461
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 85400
  num_steps_sampled: 85400
  num_steps_trained: 85400
iterations_since_restore: 427
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 39.099999999999994
  gpu_util_percent0: 0.035
  ram_util_percent: 51.4
  vram_util_percent0: 0.0663047054952287
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0505106219398742
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4217151504321806
  mean_inference_ms: 2.630952310706236


agent_timesteps_total: 86600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-08-00
done: false
episode_len_mean: 804.86
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 468
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 86600
  num_steps_sampled: 86600
  num_steps_trained: 86600
iterations_since_restore: 433
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 34.85
  gpu_util_percent0: 0.06
  ram_util_percent: 51.6
  vram_util_percent0: 0.0663047054952287
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050497507871912324
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4212142773710366
  mean_inference_ms: 2.63012546615382
  mean_raw_ob

agent_timesteps_total: 87800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-08-07
done: false
episode_len_mean: 815.66
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 473
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 87800
  num_steps_sampled: 87800
  num_steps_trained: 87800
iterations_since_restore: 439
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 45.25
  gpu_util_percent0: 0.06
  ram_util_percent: 51.4
  vram_util_percent0: 0.0663047054952287
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050489737468599645
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4208773740757934
  mean_inference_ms: 2.629644362560747
  mean_raw_o

agent_timesteps_total: 89000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-08-15
done: false
episode_len_mean: 804.11
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 479
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 89000
  num_steps_sampled: 89000
  num_steps_trained: 89000
iterations_since_restore: 445
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 41.55
  gpu_util_percent0: 0.03
  ram_util_percent: 51.9
  vram_util_percent0: 0.05643303718328398
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05048451584033133
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4206126151811662
  mean_inference_ms: 2.6292580101331513
  mean_raw_

agent_timesteps_total: 90200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-08-23
done: false
episode_len_mean: 815.24
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 485
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 90200
  num_steps_sampled: 90200
  num_steps_trained: 90200
iterations_since_restore: 451
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 33.3
  gpu_util_percent0: 0.0
  ram_util_percent: 51.7
  vram_util_percent0: 0.0562685093780849
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05048240291001032
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.420485514301401
  mean_inference_ms: 2.629045886665192
  mean_raw_obs_p

agent_timesteps_total: 91400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-08-29
done: false
episode_len_mean: 793.01
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.16
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 491
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 91400
  num_steps_sampled: 91400
  num_steps_trained: 91400
iterations_since_restore: 457
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.3
  gpu_util_percent0: 0.065
  ram_util_percent: 51.6
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05047984637535038
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4203577572236818
  mean_inference_ms: 2.628834480037309
  mean_raw_o

agent_timesteps_total: 92600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-08-37
done: false
episode_len_mean: 815.52
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 498
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 92600
  num_steps_sampled: 92600
  num_steps_trained: 92600
iterations_since_restore: 463
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 53.85
  gpu_util_percent0: 0.03
  ram_util_percent: 51.5
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050476479488634166
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4202134741203245
  mean_inference_ms: 2.6285642502900237
  mean_raw

agent_timesteps_total: 93800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-08-45
done: false
episode_len_mean: 847.98
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 502
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 93800
  num_steps_sampled: 93800
  num_steps_trained: 93800
iterations_since_restore: 469
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.0
  ram_util_percent: 51.8
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05047534606028115
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4201440460273198
  mean_inference_ms: 2.628464417581808
  mean_raw_obs

agent_timesteps_total: 95000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-08-52
done: false
episode_len_mean: 803.91
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 509
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 95000
  num_steps_sampled: 95000
  num_steps_trained: 95000
iterations_since_restore: 475
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.06
  ram_util_percent: 51.75
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0504710818658755
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4199311371397274
  mean_inference_ms: 2.628273015470671
  mean_raw_ob

agent_timesteps_total: 96200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-08-59
done: false
episode_len_mean: 815.41
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 517
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 96200
  num_steps_sampled: 96200
  num_steps_trained: 96200
iterations_since_restore: 481
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 33.3
  gpu_util_percent0: 0.06
  ram_util_percent: 51.6
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05046189951012542
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4195487440360082
  mean_inference_ms: 2.627829017215208
  mean_raw_ob

agent_timesteps_total: 97400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-09-07
done: false
episode_len_mean: 825.94
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 523
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 97400
  num_steps_sampled: 97400
  num_steps_trained: 97400
iterations_since_restore: 487
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.02
  ram_util_percent: 51.833333333333336
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05045223078811257
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.41917734428753
  mean_inference_ms: 2.6273480813248202


agent_timesteps_total: 98600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-09-14
done: false
episode_len_mean: 848.06
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 526
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 98600
  num_steps_sampled: 98600
  num_steps_trained: 98600
iterations_since_restore: 493
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 44.05
  gpu_util_percent0: 0.03
  ram_util_percent: 51.55
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050448796268601936
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4190462173679321
  mean_inference_ms: 2.6271663859848893
  mean_ra

agent_timesteps_total: 99800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-09-21
done: false
episode_len_mean: 837.06
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 533
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 99800
  num_steps_sampled: 99800
  num_steps_trained: 99800
iterations_since_restore: 499
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 46.7
  gpu_util_percent0: 0.0
  ram_util_percent: 51.4
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050438140413501954
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4186819647371527
  mean_inference_ms: 2.6267045841898717
  mean_raw_o

agent_timesteps_total: 101000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-09-29
done: false
episode_len_mean: 869.53
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.79
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 538
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 101000
  num_steps_sampled: 101000
  num_steps_trained: 101000
iterations_since_restore: 505
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 48.55
  gpu_util_percent0: 0.03
  ram_util_percent: 52.55
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050428745016316424
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4183895033811178
  mean_inference_ms: 2.626292989194834
  mean

agent_timesteps_total: 102200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-09-38
done: false
episode_len_mean: 858.98
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.7
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 545
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 102200
  num_steps_sampled: 102200
  num_steps_trained: 102200
iterations_since_restore: 511
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 48.05
  gpu_util_percent0: 0.0
  ram_util_percent: 52.15
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05041495762033412
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4179743673483984
  mean_inference_ms: 2.625692820795892
  mean_ra

agent_timesteps_total: 103400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-09-46
done: false
episode_len_mean: 880.88
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.88
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 549
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 103400
  num_steps_sampled: 103400
  num_steps_trained: 103400
iterations_since_restore: 517
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.349999999999994
  gpu_util_percent0: 0.0
  ram_util_percent: 52.0
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050406716837694515
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4177264671225416
  mean_inference_ms: 2.62535763067

agent_timesteps_total: 104600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-09-53
done: false
episode_len_mean: 913.91
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 3.15
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 553
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 104600
  num_steps_sampled: 104600
  num_steps_trained: 104600
iterations_since_restore: 523
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 40.3
  gpu_util_percent0: 0.065
  ram_util_percent: 51.85
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050396450127770276
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4174316246335688
  mean_inference_ms: 2.624952653868931
  mean

agent_timesteps_total: 105800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-10-00
done: false
episode_len_mean: 891.64
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.97
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 561
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 105800
  num_steps_sampled: 105800
  num_steps_trained: 105800
iterations_since_restore: 529
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 31.3
  gpu_util_percent0: 0.07
  ram_util_percent: 52.3
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050374076981803045
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4168150413605682
  mean_inference_ms: 2.6240212179526337
  mean_

agent_timesteps_total: 107000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-10-07
done: false
episode_len_mean: 891.83
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.97
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 566
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 107000
  num_steps_sampled: 107000
  num_steps_trained: 107000
iterations_since_restore: 535
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 22.2
  gpu_util_percent0: 0.0
  ram_util_percent: 52.3
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050359673726750105
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4164331160556949
  mean_inference_ms: 2.6234222237326206
  mean_r

agent_timesteps_total: 108200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-10-15
done: false
episode_len_mean: 881.33
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.88
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 574
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 108200
  num_steps_sampled: 108200
  num_steps_trained: 108200
iterations_since_restore: 541
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.7
  gpu_util_percent0: 0.03
  ram_util_percent: 52.05
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05033353982302928
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.415739534152202
  mean_inference_ms: 2.622337444248787
  mean_ra

agent_timesteps_total: 109400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-10-22
done: false
episode_len_mean: 848.6
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 583
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 109400
  num_steps_sampled: 109400
  num_steps_trained: 109400
iterations_since_restore: 547
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 38.5
  gpu_util_percent0: 0.0
  ram_util_percent: 51.9
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05030282762363424
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4149450228535585
  mean_inference_ms: 2.621048654639056
  mean_raw_

agent_timesteps_total: 110600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-10-30
done: false
episode_len_mean: 848.5
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 589
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 110600
  num_steps_sampled: 110600
  num_steps_trained: 110600
iterations_since_restore: 553
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 55.6
  gpu_util_percent0: 0.0
  ram_util_percent: 51.6
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05028283380946385
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4144452280500224
  mean_inference_ms: 2.6201873206443667
  mean_raw

agent_timesteps_total: 111800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-10-38
done: false
episode_len_mean: 837.54
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 598
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 111800
  num_steps_sampled: 111800
  num_steps_trained: 111800
iterations_since_restore: 559
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 59.6
  gpu_util_percent0: 0.015
  ram_util_percent: 52.1
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05025600586978205
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4138217328245752
  mean_inference_ms: 2.619045519892565
  mean_r

agent_timesteps_total: 113000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-10-45
done: false
episode_len_mean: 815.79
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 604
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 113000
  num_steps_sampled: 113000
  num_steps_trained: 113000
iterations_since_restore: 565
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 61.3
  gpu_util_percent0: 0.06
  ram_util_percent: 51.95
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05023670138182795
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4133641872889156
  mean_inference_ms: 2.618245105533682
  mean_r

agent_timesteps_total: 114200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-10-52
done: false
episode_len_mean: 814.97
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 612
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 114200
  num_steps_sampled: 114200
  num_steps_trained: 114200
iterations_since_restore: 571
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 42.9
  gpu_util_percent0: 0.0
  ram_util_percent: 52.1
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050211136810282476
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4127462251980178
  mean_inference_ms: 2.617232333754821
  mean_ra

agent_timesteps_total: 115400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-11-00
done: false
episode_len_mean: 804.44
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 622
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 115400
  num_steps_sampled: 115400
  num_steps_trained: 115400
iterations_since_restore: 577
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 46.45
  gpu_util_percent0: 0.045
  ram_util_percent: 52.1
  vram_util_percent0: 0.05610398157288582
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05017758096049531
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4119422577575778
  mean_inference_ms: 2.6159220559711724
  mean

agent_timesteps_total: 116600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-11-07
done: false
episode_len_mean: 760.06
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.89
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 628
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 116600
  num_steps_sampled: 116600
  num_steps_trained: 116600
iterations_since_restore: 583
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 49.25
  gpu_util_percent0: 0.015
  ram_util_percent: 51.95
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050153080660297465
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4113461791963386
  mean_inference_ms: 2.614917351357349
  mean

agent_timesteps_total: 117800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-11-14
done: false
episode_len_mean: 759.69
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.89
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 636
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 117800
  num_steps_sampled: 117800
  num_steps_trained: 117800
iterations_since_restore: 589
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 54.8
  gpu_util_percent0: 0.03
  ram_util_percent: 52.1
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05011946292589318
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.410533320373063
  mean_inference_ms: 2.6134356337688347
  mean_raw

agent_timesteps_total: 119000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-11-22
done: false
episode_len_mean: 726.4
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.62
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 646
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 119000
  num_steps_sampled: 119000
  num_steps_trained: 119000
iterations_since_restore: 595
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 57.1
  gpu_util_percent0: 0.04
  ram_util_percent: 51.6
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050077832325065644
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.409495311798659
  mean_inference_ms: 2.6116274573184244
  mean_raw

agent_timesteps_total: 120200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-11-29
done: false
episode_len_mean: 737.41
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.71
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 649
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 120200
  num_steps_sampled: 120200
  num_steps_trained: 120200
iterations_since_restore: 601
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 56.2
  gpu_util_percent0: 0.07
  ram_util_percent: 51.7
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05006613921175096
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.409201074538827
  mean_inference_ms: 2.611128163780094
  mean_raw_

agent_timesteps_total: 121400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-11-36
done: false
episode_len_mean: 726.64
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.62
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 654
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 121400
  num_steps_sampled: 121400
  num_steps_trained: 121400
iterations_since_restore: 607
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 41.650000000000006
  gpu_util_percent0: 0.03
  ram_util_percent: 51.9
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050046810950012954
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4087110772807434
  mean_inference_ms: 2.61028523458

agent_timesteps_total: 122600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-11-44
done: false
episode_len_mean: 748.25
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.8
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 659
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 122600
  num_steps_sampled: 122600
  num_steps_trained: 122600
iterations_since_restore: 613
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 57.1
  gpu_util_percent0: 0.03
  ram_util_percent: 52.1
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.05002770408897213
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.408229979029796
  mean_inference_ms: 2.6094651120203474
  mean_raw_

agent_timesteps_total: 123800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-11-51
done: false
episode_len_mean: 759.08
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.89
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 664
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 123800
  num_steps_sampled: 123800
  num_steps_trained: 123800
iterations_since_restore: 619
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 33.3
  gpu_util_percent0: 0.05
  ram_util_percent: 52.1
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.050010089263538615
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4078035147200967
  mean_inference_ms: 2.6087044304404157
  mean_r

agent_timesteps_total: 125000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-11-58
done: false
episode_len_mean: 770.28
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.98
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 670
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 125000
  num_steps_sampled: 125000
  num_steps_trained: 125000
iterations_since_restore: 625
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 42.9
  gpu_util_percent0: 0.0
  ram_util_percent: 51.9
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04998881514159666
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4072980412431482
  mean_inference_ms: 2.6077857631949364
  mean_raw

agent_timesteps_total: 126200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-12-06
done: false
episode_len_mean: 769.49
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.98
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 678
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 126200
  num_steps_sampled: 126200
  num_steps_trained: 126200
iterations_since_restore: 631
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 57.1
  gpu_util_percent0: 0.0
  ram_util_percent: 52.1
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04996156221051293
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4066460644665628
  mean_inference_ms: 2.6066324207468337
  mean_raw

agent_timesteps_total: 127400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-12-13
done: false
episode_len_mean: 802.5
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 681
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 127400
  num_steps_sampled: 127400
  num_steps_trained: 127400
iterations_since_restore: 637
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 47.75
  gpu_util_percent0: 0.0
  ram_util_percent: 53.150000000000006
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049950699298855955
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4063803751387731
  mean_inference_ms: 2.606179206916

agent_timesteps_total: 128600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-12-21
done: false
episode_len_mean: 813.79
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 687
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 128600
  num_steps_sampled: 128600
  num_steps_trained: 128600
iterations_since_restore: 643
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 61.25
  gpu_util_percent0: 0.0
  ram_util_percent: 51.95
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049929349515409366
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4058654009624751
  mean_inference_ms: 2.6052950694605808
  mean_

agent_timesteps_total: 129800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-12-28
done: false
episode_len_mean: 802.18
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 695
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 129800
  num_steps_sampled: 129800
  num_steps_trained: 129800
iterations_since_restore: 649
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.045
  ram_util_percent: 52.1
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049901748999272064
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4051848226336383
  mean_inference_ms: 2.604155462809329
  mean_r

agent_timesteps_total: 131000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-12-35
done: false
episode_len_mean: 812.96
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 701
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 131000
  num_steps_sampled: 131000
  num_steps_trained: 131000
iterations_since_restore: 655
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 31.25
  gpu_util_percent0: 0.03
  ram_util_percent: 52.05
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049879525417631874
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.404637041174914
  mean_inference_ms: 2.603209580213918
  mean_r

agent_timesteps_total: 132200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-12-43
done: false
episode_len_mean: 790.77
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.16
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 709
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 132200
  num_steps_sampled: 132200
  num_steps_trained: 132200
iterations_since_restore: 661
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.0
  ram_util_percent: 52.1
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04985167730509721
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.40398405082526
  mean_inference_ms: 2.6019904608889637
  mean_raw_o

agent_timesteps_total: 133400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-12-50
done: false
episode_len_mean: 812.98
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 717
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 133400
  num_steps_sampled: 133400
  num_steps_trained: 133400
iterations_since_restore: 667
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.8
  gpu_util_percent0: 0.06
  ram_util_percent: 52.1
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04982547489940952
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4033688145803296
  mean_inference_ms: 2.6008548417698787
  mean_ra

agent_timesteps_total: 134600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-12-57
done: false
episode_len_mean: 813.04
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 724
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 134600
  num_steps_sampled: 134600
  num_steps_trained: 134600
iterations_since_restore: 673
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.0
  ram_util_percent: 52.2
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049804971364473774
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4028749412360622
  mean_inference_ms: 2.6000048418303368
  mean_ra

agent_timesteps_total: 135800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-13-04
done: false
episode_len_mean: 824.3
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 730
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 135800
  num_steps_sampled: 135800
  num_steps_trained: 135800
iterations_since_restore: 679
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.7
  gpu_util_percent0: 0.05
  ram_util_percent: 52.1
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04978985267657525
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.402500316527665
  mean_inference_ms: 2.5994069687865933
  mean_raw_

agent_timesteps_total: 137000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-13-12
done: false
episode_len_mean: 835.56
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 736
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 137000
  num_steps_sampled: 137000
  num_steps_trained: 137000
iterations_since_restore: 685
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 62.5
  gpu_util_percent0: 0.06
  ram_util_percent: 52.2
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04977724425348568
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4021897167960446
  mean_inference_ms: 2.5989191917354684
  mean_ra

agent_timesteps_total: 138200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-13-19
done: false
episode_len_mean: 858.13
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.7
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 742
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 138200
  num_steps_sampled: 138200
  num_steps_trained: 138200
iterations_since_restore: 691
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 47.9
  gpu_util_percent0: 0.005
  ram_util_percent: 52.1
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04976457043269094
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4018674323118847
  mean_inference_ms: 2.5984198485409022
  mean_ra

agent_timesteps_total: 139400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-13-26
done: false
episode_len_mean: 869.18
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.79
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 748
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 139400
  num_steps_sampled: 139400
  num_steps_trained: 139400
iterations_since_restore: 697
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 47.9
  gpu_util_percent0: 0.045
  ram_util_percent: 51.9
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04975062874479774
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4015036629699902
  mean_inference_ms: 2.5978439267721716
  mean_r

agent_timesteps_total: 140600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-13-34
done: false
episode_len_mean: 824.67
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 756
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 140600
  num_steps_sampled: 140600
  num_steps_trained: 140600
iterations_since_restore: 703
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 54.0
  gpu_util_percent0: 0.03
  ram_util_percent: 52.0
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049733681256254506
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4010420703855078
  mean_inference_ms: 2.597138939864113
  mean_ra

agent_timesteps_total: 141800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-13-41
done: false
episode_len_mean: 792.08
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.16
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 764
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 141800
  num_steps_sampled: 141800
  num_steps_trained: 141800
iterations_since_restore: 709
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 46.9
  gpu_util_percent0: 0.03
  ram_util_percent: 51.849999999999994
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04971595469173982
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4005317110891247
  mean_inference_ms: 2.596401955213

agent_timesteps_total: 143000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-13-48
done: false
episode_len_mean: 780.7
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.07
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 770
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 143000
  num_steps_sampled: 143000
  num_steps_trained: 143000
iterations_since_restore: 715
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.06
  ram_util_percent: 51.8
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04970384748405831
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.4001837551307321
  mean_inference_ms: 2.595880464102196
  mean_raw_

agent_timesteps_total: 144200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-13-56
done: false
episode_len_mean: 802.81
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 774
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 144200
  num_steps_sampled: 144200
  num_steps_trained: 144200
iterations_since_restore: 721
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 33.3
  gpu_util_percent0: 0.0
  ram_util_percent: 52.4
  vram_util_percent0: 0.0539651201052978
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04969574188071844
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3999496344088371
  mean_inference_ms: 2.5955301443092456
  mean_raw

agent_timesteps_total: 145400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-14-03
done: false
episode_len_mean: 836.24
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 778
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 145400
  num_steps_sampled: 145400
  num_steps_trained: 145400
iterations_since_restore: 727
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.1
  gpu_util_percent0: 0.0
  ram_util_percent: 52.45
  vram_util_percent0: 0.06103981572885818
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049686661554059625
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3996885776312606
  mean_inference_ms: 2.595145488384869
  mean_r

agent_timesteps_total: 146600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-14-11
done: false
episode_len_mean: 792.43
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.16
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 786
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 146600
  num_steps_sampled: 146600
  num_steps_trained: 146600
iterations_since_restore: 733
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 40.2
  gpu_util_percent0: 0.0
  ram_util_percent: 52.2
  vram_util_percent0: 0.06103981572885818
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04966823935934091
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3991367469281002
  mean_inference_ms: 2.594355492845133
  mean_raw

agent_timesteps_total: 147800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-14-17
done: false
episode_len_mean: 803.51
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 792
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 147800
  num_steps_sampled: 147800
  num_steps_trained: 147800
iterations_since_restore: 739
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.7
  gpu_util_percent0: 0.0
  ram_util_percent: 52.5
  vram_util_percent0: 0.060217176702862786
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04965321664681777
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3986796171082259
  mean_inference_ms: 2.5936965659041205
  mean_r

agent_timesteps_total: 149000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-14-25
done: false
episode_len_mean: 803.95
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 799
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 149000
  num_steps_sampled: 149000
  num_steps_trained: 149000
iterations_since_restore: 745
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 60.0
  gpu_util_percent0: 0.0
  ram_util_percent: 52.6
  vram_util_percent0: 0.05939453767686739
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049636272797471875
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3981517232418614
  mean_inference_ms: 2.5929772231306583
  mean_r

agent_timesteps_total: 150200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-14-33
done: false
episode_len_mean: 815.06
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 804
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 150200
  num_steps_sampled: 150200
  num_steps_trained: 150200
iterations_since_restore: 751
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 46.45
  gpu_util_percent0: 0.0
  ram_util_percent: 52.4
  vram_util_percent0: 0.05939453767686739
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.0496258430302754
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3978222286991622
  mean_inference_ms: 2.592543334819108
  mean_raw

agent_timesteps_total: 151400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-14-41
done: false
episode_len_mean: 825.94
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 811
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 151400
  num_steps_sampled: 151400
  num_steps_trained: 151400
iterations_since_restore: 757
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 43.65
  gpu_util_percent0: 0.0
  ram_util_percent: 52.3
  vram_util_percent0: 0.05939453767686739
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049612195023466185
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.397400002744936
  mean_inference_ms: 2.591981790414958
  mean_ra

agent_timesteps_total: 152600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-14-48
done: false
episode_len_mean: 836.97
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 817
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 152600
  num_steps_sampled: 152600
  num_steps_trained: 152600
iterations_since_restore: 763
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 40.0
  gpu_util_percent0: 0.0
  ram_util_percent: 52.4
  vram_util_percent0: 0.05939453767686739
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049601666451939884
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3970873068740712
  mean_inference_ms: 2.5915325993801646
  mean_r

agent_timesteps_total: 153800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-14-56
done: false
episode_len_mean: 814.38
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 825
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 153800
  num_steps_sampled: 153800
  num_steps_trained: 153800
iterations_since_restore: 769
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 67.8
  gpu_util_percent0: 0.11
  ram_util_percent: 52.75
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049588619746883784
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3967295318322166
  mean_inference_ms: 2.5909416300152013
  mea

agent_timesteps_total: 155000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-15-04
done: false
episode_len_mean: 825.34
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 831
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 155000
  num_steps_sampled: 155000
  num_steps_trained: 155000
iterations_since_restore: 775
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 45.0
  gpu_util_percent0: 0.0
  ram_util_percent: 53.150000000000006
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049581738953534925
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3965737073501108
  mean_inference_ms: 2.5906257308

agent_timesteps_total: 156200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-15-12
done: false
episode_len_mean: 803.37
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 839
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 156200
  num_steps_sampled: 156200
  num_steps_trained: 156200
iterations_since_restore: 781
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 34.699999999999996
  gpu_util_percent0: 0.0033333333333333335
  ram_util_percent: 52.56666666666666
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04957289793817124
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3963789701225833
  m

agent_timesteps_total: 157400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-15-19
done: false
episode_len_mean: 814.0
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 844
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 157400
  num_steps_sampled: 157400
  num_steps_trained: 157400
iterations_since_restore: 787
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 62.0
  gpu_util_percent0: 0.0
  ram_util_percent: 52.4
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04956852940793476
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3962994301272107
  mean_inference_ms: 2.5900373386654696
  mean_ra

agent_timesteps_total: 158600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-15-27
done: false
episode_len_mean: 813.57
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 852
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 158600
  num_steps_sampled: 158600
  num_steps_trained: 158600
iterations_since_restore: 793
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 62.5
  gpu_util_percent0: 0.0
  ram_util_percent: 52.6
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04956086001671868
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3961695474832985
  mean_inference_ms: 2.5896783091110342
  mean_r

agent_timesteps_total: 159800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-15-35
done: false
episode_len_mean: 813.8
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 858
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 159800
  num_steps_sampled: 159800
  num_steps_trained: 159800
iterations_since_restore: 799
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 51.45
  gpu_util_percent0: 0.0
  ram_util_percent: 53.2
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04955567436624096
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.396088594487743
  mean_inference_ms: 2.589447810878497
  mean_raw

agent_timesteps_total: 161000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-15-43
done: false
episode_len_mean: 824.42
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 865
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 161000
  num_steps_sampled: 161000
  num_steps_trained: 161000
iterations_since_restore: 805
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.0
  ram_util_percent: 52.3
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04955253136479783
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3960863102718049
  mean_inference_ms: 2.5893379157181857
  mean_r

agent_timesteps_total: 162200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-15-51
done: false
episode_len_mean: 803.2
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 874
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 162200
  num_steps_sampled: 162200
  num_steps_trained: 162200
iterations_since_restore: 811
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 51.45
  gpu_util_percent0: 0.0
  ram_util_percent: 52.45
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04955052974270962
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3961359385003047
  mean_inference_ms: 2.589324564807383
  mean_r

agent_timesteps_total: 163400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-15-58
done: false
episode_len_mean: 770.2
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.98
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 880
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 163400
  num_steps_sampled: 163400
  num_steps_trained: 163400
iterations_since_restore: 817
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.8
  gpu_util_percent0: 0.0
  ram_util_percent: 52.2
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04955199347582938
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3962415555499625
  mean_inference_ms: 2.5894354944839835
  mean_ra

agent_timesteps_total: 164600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-16-06
done: false
episode_len_mean: 780.82
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.07
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 887
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 164600
  num_steps_sampled: 164600
  num_steps_trained: 164600
iterations_since_restore: 823
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.2
  gpu_util_percent0: 0.0
  ram_util_percent: 52.4
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049554527082058236
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3963964878168529
  mean_inference_ms: 2.5895935486334607
  mean_

agent_timesteps_total: 165800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-16-14
done: false
episode_len_mean: 758.8
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.89
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 896
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 165800
  num_steps_sampled: 165800
  num_steps_trained: 165800
iterations_since_restore: 829
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 61.35
  gpu_util_percent0: 0.01
  ram_util_percent: 52.35
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049559910361979105
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.396669301147066
  mean_inference_ms: 2.589886334299825
  mean_

agent_timesteps_total: 167000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-16-21
done: false
episode_len_mean: 769.28
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.98
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 901
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 167000
  num_steps_sampled: 167000
  num_steps_trained: 167000
iterations_since_restore: 835
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 52.9
  gpu_util_percent0: 0.01
  ram_util_percent: 52.3
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04956280340163522
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3968227614103987
  mean_inference_ms: 2.590030087010638
  mean_r

agent_timesteps_total: 168200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-16-29
done: false
episode_len_mean: 747.7
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.8
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 909
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 168200
  num_steps_sampled: 168200
  num_steps_trained: 168200
iterations_since_restore: 841
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 40.0
  gpu_util_percent0: 0.21
  ram_util_percent: 52.1
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04956550398241956
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.396999381813087
  mean_inference_ms: 2.590160947984595
  mean_raw_

agent_timesteps_total: 169400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-16-36
done: false
episode_len_mean: 747.77
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.8
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 915
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 169400
  num_steps_sampled: 169400
  num_steps_trained: 169400
iterations_since_restore: 847
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 42.9
  gpu_util_percent0: 0.0
  ram_util_percent: 52.3
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04956716869696545
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.397119506760921
  mean_inference_ms: 2.590233415401228
  mean_raw_

agent_timesteps_total: 170600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-16-44
done: false
episode_len_mean: 758.85
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.89
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 923
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 170600
  num_steps_sampled: 170600
  num_steps_trained: 170600
iterations_since_restore: 853
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 42.85
  gpu_util_percent0: 0.005
  ram_util_percent: 52.5
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04956986923357874
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3972867770267015
  mean_inference_ms: 2.5903632025362806
  mea

agent_timesteps_total: 171800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-16-52
done: false
episode_len_mean: 736.44
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.71
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 931
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 171800
  num_steps_sampled: 171800
  num_steps_trained: 171800
iterations_since_restore: 859
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 42.9
  gpu_util_percent0: 0.005
  ram_util_percent: 52.35
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049572323187039344
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3974331609245516
  mean_inference_ms: 2.5904923160293807
  me

agent_timesteps_total: 173000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-16-59
done: false
episode_len_mean: 736.23
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.71
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 937
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 173000
  num_steps_sampled: 173000
  num_steps_trained: 173000
iterations_since_restore: 865
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 42.9
  gpu_util_percent0: 0.005
  ram_util_percent: 52.5
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04957200542790871
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3974778784093034
  mean_inference_ms: 2.590491931627353
  mean_

agent_timesteps_total: 174200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-17-06
done: false
episode_len_mean: 747.33
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.8
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 942
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 174200
  num_steps_sampled: 174200
  num_steps_trained: 174200
iterations_since_restore: 871
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 43.4
  gpu_util_percent0: 0.005
  ram_util_percent: 52.4
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04957053666417795
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3974798694656598
  mean_inference_ms: 2.5904334071386588
  mean_

agent_timesteps_total: 175400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-17-13
done: false
episode_len_mean: 758.52
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 1.89
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 946
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 175400
  num_steps_sampled: 175400
  num_steps_trained: 175400
iterations_since_restore: 877
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 40.2
  gpu_util_percent0: 0.0
  ram_util_percent: 52.2
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049568949652738094
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.397463805518193
  mean_inference_ms: 2.590376248112756
  mean_ra

agent_timesteps_total: 176600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-17-21
done: false
episode_len_mean: 792.08
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.16
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 951
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 176600
  num_steps_sampled: 176600
  num_steps_trained: 176600
iterations_since_restore: 883
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 46.0
  gpu_util_percent0: 0.0
  ram_util_percent: 52.5
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04956769178865844
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3974682882675293
  mean_inference_ms: 2.59033457865481
  mean_raw

agent_timesteps_total: 177800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-17-28
done: false
episode_len_mean: 803.35
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.25
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 957
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 177800
  num_steps_sampled: 177800
  num_steps_trained: 177800
iterations_since_restore: 889
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 41.7
  gpu_util_percent0: 0.0
  ram_util_percent: 52.3
  vram_util_percent0: 0.05939453767686739
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04956696292973106
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3975204665765006
  mean_inference_ms: 2.590294489485001
  mean_raw

agent_timesteps_total: 179000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-17-36
done: false
episode_len_mean: 814.33
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 962
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 179000
  num_steps_sampled: 179000
  num_steps_trained: 179000
iterations_since_restore: 895
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 31.55
  gpu_util_percent0: 0.0
  ram_util_percent: 53.8
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049566119748850695
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3975769374180471
  mean_inference_ms: 2.590232245427832
  mean_

agent_timesteps_total: 180200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-17-43
done: false
episode_len_mean: 815.16
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 968
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 180200
  num_steps_sampled: 180200
  num_steps_trained: 180200
iterations_since_restore: 901
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 66.7
  gpu_util_percent0: 0.01
  ram_util_percent: 52.4
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049563379209107195
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3976073750150229
  mean_inference_ms: 2.5900499060256403
  mean

agent_timesteps_total: 181400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-17-51
done: false
episode_len_mean: 836.51
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 973
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 181400
  num_steps_sampled: 181400
  num_steps_trained: 181400
iterations_since_restore: 907
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 48.1
  gpu_util_percent0: 0.005
  ram_util_percent: 52.650000000000006
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04956101874532821
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3976367582853044
  mean_inference_ms: 2.589888505

agent_timesteps_total: 182600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-17-58
done: false
episode_len_mean: 814.61
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 982
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 182600
  num_steps_sampled: 182600
  num_steps_trained: 182600
iterations_since_restore: 913
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.0
  ram_util_percent: 52.5
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04955672777551089
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3976854364128783
  mean_inference_ms: 2.5896120360183748
  mean_r

agent_timesteps_total: 183800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-18-05
done: false
episode_len_mean: 814.92
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.34
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 988
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 183800
  num_steps_sampled: 183800
  num_steps_trained: 183800
iterations_since_restore: 919
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 42.9
  gpu_util_percent0: 0.01
  ram_util_percent: 52.8
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04955322079626802
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3976817765440495
  mean_inference_ms: 2.589411141924274
  mean_r

agent_timesteps_total: 185000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-18-12
done: false
episode_len_mean: 836.89
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 993
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 185000
  num_steps_sampled: 185000
  num_steps_trained: 185000
iterations_since_restore: 925
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 66.7
  gpu_util_percent0: 0.0
  ram_util_percent: 52.8
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04954838689289001
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3976237234715503
  mean_inference_ms: 2.5891508179846237
  mean_r

agent_timesteps_total: 186200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-18-19
done: false
episode_len_mean: 825.63
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.43
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 1002
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 186200
  num_steps_sampled: 186200
  num_steps_trained: 186200
iterations_since_restore: 931
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 58.9
  gpu_util_percent0: 0.005
  ram_util_percent: 52.7
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04953799759718074
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3974871703932898
  mean_inference_ms: 2.5885740230749996
  mea

agent_timesteps_total: 187400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-18-27
done: false
episode_len_mean: 836.33
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 1008
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 187400
  num_steps_sampled: 187400
  num_steps_trained: 187400
iterations_since_restore: 937
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 56.25
  gpu_util_percent0: 0.005
  ram_util_percent: 52.6
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04953236612611185
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3974409713946678
  mean_inference_ms: 2.588244982551319
  mea

agent_timesteps_total: 188600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-18-35
done: false
episode_len_mean: 836.31
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 1015
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 188600
  num_steps_sampled: 188600
  num_steps_trained: 188600
iterations_since_restore: 943
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 62.5
  gpu_util_percent0: 0.0
  ram_util_percent: 52.7
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04952592116634265
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3973962409045524
  mean_inference_ms: 2.5878723939424297
  mean_

agent_timesteps_total: 189800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-18-42
done: false
episode_len_mean: 869.64
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.79
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 1018
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 189800
  num_steps_sampled: 189800
  num_steps_trained: 189800
iterations_since_restore: 949
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 55.05
  gpu_util_percent0: 0.005
  ram_util_percent: 52.6
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04952299391418282
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3973738128967315
  mean_inference_ms: 2.5877049928421383
  me

agent_timesteps_total: 191000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-18-49
done: false
episode_len_mean: 869.97
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.79
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 1023
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 191000
  num_steps_sampled: 191000
  num_steps_trained: 191000
iterations_since_restore: 955
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 40.95
  gpu_util_percent0: 0.005
  ram_util_percent: 52.7
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.049517265877587244
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.397305051231315
  mean_inference_ms: 2.5873946473184946
  me

agent_timesteps_total: 192200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-18-56
done: false
episode_len_mean: 892.48
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.97
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 1028
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 192200
  num_steps_sampled: 192200
  num_steps_trained: 192200
iterations_since_restore: 961
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.01
  ram_util_percent: 52.7
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04951100957150574
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3972168114435752
  mean_inference_ms: 2.587068836796492
  mean_

agent_timesteps_total: 193400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-19-04
done: false
episode_len_mean: 903.53
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 3.06
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 1035
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 193400
  num_steps_sampled: 193400
  num_steps_trained: 193400
iterations_since_restore: 967
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 42.1
  gpu_util_percent0: 0.005
  ram_util_percent: 52.650000000000006
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04950244266890138
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3970912750767202
  mean_inference_ms: 2.58663260

agent_timesteps_total: 194600
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-19-11
done: false
episode_len_mean: 881.87
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.88
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 1043
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 194600
  num_steps_sampled: 194600
  num_steps_trained: 194600
iterations_since_restore: 973
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 62.5
  gpu_util_percent0: 0.0
  ram_util_percent: 52.8
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04949398333488615
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3969825124101334
  mean_inference_ms: 2.5861893171367085
  mean_

agent_timesteps_total: 195800
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-19-18
done: false
episode_len_mean: 881.91
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.88
episode_reward_min: 0.0
episodes_this_iter: 2
episodes_total: 1047
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 195800
  num_steps_sampled: 195800
  num_steps_trained: 195800
iterations_since_restore: 979
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 55.6
  gpu_util_percent0: 0.0
  ram_util_percent: 52.599999999999994
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04949002292754558
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3969360184694866
  mean_inference_ms: 2.5859877479

agent_timesteps_total: 197000
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-19-26
done: false
episode_len_mean: 859.53
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.7
episode_reward_min: 0.0
episodes_this_iter: 0
episodes_total: 1052
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 197000
  num_steps_sampled: 197000
  num_steps_trained: 197000
iterations_since_restore: 985
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 47.2
  gpu_util_percent0: 0.0
  ram_util_percent: 52.7
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04948472429398816
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.396865664275963
  mean_inference_ms: 2.5857383295078886
  mean_ra

agent_timesteps_total: 198200
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-19-32
done: false
episode_len_mean: 848.25
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.61
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 1061
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 198200
  num_steps_sampled: 198200
  num_steps_trained: 198200
iterations_since_restore: 991
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 30.0
  gpu_util_percent0: 0.01
  ram_util_percent: 52.7
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04947106897790175
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3965911760910041
  mean_inference_ms: 2.5850748961729204
  mean

agent_timesteps_total: 199400
custom_metrics:
  default_policy: {}
date: 2021-11-17_22-19-38
done: false
episode_len_mean: 836.68
episode_media: {}
episode_reward_max: 9.0
episode_reward_mean: 2.52
episode_reward_min: 0.0
episodes_this_iter: 1
episodes_total: 1068
experiment_id: 63e36b8ff9cc480c88a578ecff5b5385
hostname: bruno-odyssey-mint
info:
  learner:
    default_policy:
      allreduce_latency: 0.0
      policy_loss: -0.0
  num_agent_steps_sampled: 199400
  num_steps_sampled: 199400
  num_steps_trained: 199400
iterations_since_restore: 997
node_ip: 192.168.0.102
num_healthy_workers: 1
off_policy_estimator: {}
perf:
  cpu_util_percent: 50.0
  gpu_util_percent0: 0.0
  ram_util_percent: 52.7
  vram_util_percent0: 0.059230009871668314
pid: 11466
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf:
  mean_action_processing_ms: 0.04945899996746829
  mean_env_render_ms: 0.0
  mean_env_wait_ms: 1.3963026440425546
  mean_inference_ms: 2.5845018433189306
  mean_

In [21]:
print("Last checkpoint saved at", last_checkpoint)

Last checkpoint saved at /home/bruno/ray_results/PG_BreakoutNoFrameskip-v4_2021-11-17_21-57-444l1pd2dt/checkpoint_001000/checkpoint-1000


In [22]:
# See: https://docs.ray.io/en/releases-1.4.0/rllib-training.html
preprocessor = get_preprocessor(env.observation_space)(env.observation_space)

In [24]:
trainer = pg.PGTrainer(config=config, env=environment_id)
trainer.restore(last_checkpoint)

after_training = os.path.join(
    DRIVE_PATH, "{}after_training_basic_api.mp4".format(environment_id)
)
after_video = VideoRecorder(env, after_training)
observation = env.reset()
done = False
while not done:
    env.render()
    after_video.capture_frame()
    action = trainer.compute_action(preprocessor.transform(observation))
    observation, reward, done, info = env.step(action)
after_video.close()
env.close()


2021-11-18 01:03:07,824	INFO trainable.py:377 -- Restored on 192.168.0.102 from checkpoint: /home/bruno/ray_results/PG_BreakoutNoFrameskip-v4_2021-11-17_21-57-444l1pd2dt/checkpoint_001000/checkpoint-1000
2021-11-18 01:03:07,825	INFO trainable.py:385 -- Current state after restoring: {'_iteration': 1000, '_timesteps_total': None, '_time_total': 1215.3582525253296, '_episodes_total': 1071}


RuntimeError: Given groups=1, weight of size [16, 4, 8, 8], expected input[1, 3, 88, 88] to have 4 channels, but got 3 channels instead
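
The error above is a shape check failing inside a convolution layer. The weight shape `[16, 4, 8, 8]` reads as (output channels, input channels, kernel height, kernel width), while the input shape `[1, 3, 88, 88]` reads as (batch, channels, height, width). The sketch below reproduces that comparison in plain Python. `conv_input_mismatch` is a hypothetical helper written for illustration, not part of PyTorch or RLlib. One plausible cause, which is an assumption and not confirmed by the notebook, is that the observations passed to `compute_action` were preprocessed differently from those seen during training.

```python
def conv_input_mismatch(weight_shape, input_shape, groups=1):
    """Return None if the shapes are compatible, else a short explanation.

    weight_shape: (out_channels, in_channels // groups, kernel_h, kernel_w)
    input_shape:  (batch, channels, height, width)
    """
    expected_channels = weight_shape[1] * groups
    actual_channels = input_shape[1]
    if expected_channels == actual_channels:
        return None
    return "expected {} channels, got {}".format(expected_channels, actual_channels)


# Shapes copied from the RuntimeError above.
message = conv_input_mismatch((16, 4, 8, 8), (1, 3, 88, 88))
print(message)
```

Printing the observation shape right before the call to `compute_action` and comparing it with the shape the model was built for is a quick way to locate this kind of mismatch.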

In [None]:
# Visualize
html = render_mp4(after_training)
HTML(html)

#### 3. Train an agent using a pre-trained model

In [None]:
class TorchCustomModel(TorchModelV2, nn.Module):
    """Example of a custom PyTorch model that simply delegates to a
    fully connected network (fc-net)."""

    def __init__(self, obs_space, action_space, num_outputs, model_config,
                 name):
        TorchModelV2.__init__(self, obs_space, action_space, num_outputs,
                              model_config, name)
        nn.Module.__init__(self)

        self.torch_sub_model = TorchFC(obs_space, action_space, num_outputs,
                                       model_config, name)

    def forward(self, input_dict, state, seq_lens):
        input_dict["obs"] = input_dict["obs"].float()
        fc_out, _ = self.torch_sub_model(input_dict, state, seq_lens)
        return fc_out, []

    def value_function(self):
        return torch.reshape(self.torch_sub_model.value_function(), [-1])

In [None]:
# You can also register an environment creator function explicitly with:
# register_env("corridor", lambda config: SimpleCorridor(config))

# Register the custom model
ModelCatalog.register_custom_model(
    "my_model", TorchCustomModel
)

config = {
    "env": environment_id, 
    "env_config": {},
    "model": {
        "custom_model": "my_model",
        "vf_share_layers": True,
    },
    "num_workers": 1,  
    "framework": "torch",
}

stop = {
    "training_iteration": 1,  #50
    "timesteps_total": 100000,
    "episode_reward_mean": 0.1,
}


In [None]:
pg_config = pg.DEFAULT_CONFIG.copy()
pg_config.update(config)
pg_config["lr"] = 1e-3

trainer = pg.PPOTrainer(config=pg_config, env=environment_id)
# run the manual training loop and print the results after each iteration
for _ in range(stop["training_iteration"]):
    result = trainer.train()
    print(pretty_print(result))
    
    # stop training once the desired number of steps has been reached
    # or once the desired reward has been reached
    if result["timesteps_total"] >= stop["timesteps_total"] or \
            result["episode_reward_mean"] >= stop["episode_reward_mean"]:
        break
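
The stopping rule used in the loop above can be factored into a small helper that ends training as soon as any threshold in the `stop` dict is reached. This is a minimal sketch in plain Python: `should_stop` is a hypothetical name, and the result dicts below are made-up values standing in for what `trainer.train()` returns.

```python
def should_stop(result, stop):
    """Return True once any threshold in `stop` is reached by `result`.

    Keys that are missing from `result` are ignored.
    """
    return any(
        key in result and result[key] >= threshold
        for key, threshold in stop.items()
    )


# Thresholds taken from the `stop` dict defined earlier in this section.
stop_criteria = {"timesteps_total": 100000, "episode_reward_mean": 0.1}

# Made-up results, for illustration only.
early_result = {"timesteps_total": 200, "episode_reward_mean": 0.0}
good_result = {"timesteps_total": 4000, "episode_reward_mean": 0.5}

print(should_stop(early_result, stop_criteria))
print(should_stop(good_result, stop_criteria))
```

Keeping the rule in one function makes it easy to reuse the same `stop` dict in a manual loop and in `ray.tune.run`, which also stops a trial when the first listed criterion is met.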


In [None]:
after_training = os.path.join(
    DRIVE_PATH, "{}after_training_basic_api.mp4".format(environment_id)
)
after_video = VideoRecorder(env, after_training)
observation = env.reset()
done = False
while not done:
    env.render()
    after_video.capture_frame()
    action = trainer.compute_action(preprocessor.transform(observation))
    observation, reward, done, info = env.step(action)
after_video.close()
env.close()

#### 4. Ray Tune

In [None]:
config = {
    "env": environment_id,
    "framework": "torch",
}
stop = {"timesteps_total": 10000}

# Run the training
analysis = ray.tune.run(
    "PPO",
    config=config,
    stop=stop,
    checkpoint_freq=100,
    checkpoint_at_end=True,
    local_dir=os.path.join(DRIVE_PATH, "results")
)


In [None]:
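from ray.rllib.agents.ppo import PPOTrainer

# restore a Trainer from the best checkpoint; the run above trained "PPO",
# so the checkpoint is loaded into a PPOTrainer
trial = analysis.get_best_logdir("episode_reward_mean", "max")
checkpoint = analysis.get_best_checkpoint(
  trial,
  "training_iteration",
  "max",
)
trainer = PPOTrainer(config=config)
trainer.restore(checkpoint)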


In [None]:
after_training = os.path.join(
    DRIVE_PATH, "{}after_training_tune.mp4".format(environment_id)
)
after_video = VideoRecorder(env, after_training)
observation = env.reset()
done = False
while not done:
    env.render()
    after_video.capture_frame()
    action = trainer.compute_action(preprocessor.transform(observation))
    observation, reward, done, info = env.step(action)
after_video.close()
env.close()
# You should get a video similar to the one below. 
html = render_mp4(after_training)
HTML(html)


In [None]:
if isColab:
    %tensorboard --logdir /content/gdrive/MyDrive/minicurso_rl/lab03/results/PPO
else:
    %tensorboard --logdir ./content/results/PPO

#### 5. Hyperparameter Tuning

In [None]:
parameter_search_config = {
    "env": environment_id,
    "framework": "torch",
    "num_gpus": 1,  # share of the GPU made available for training (1 = one full GPU)
    "num_workers": 1,  # number of workers besides the main process; on Colab this must be 1 because only 2 CPUs are available

    # Hyperparameter tuning
    "model": {
      "fcnet_hiddens": ray.tune.grid_search([[32], [64]]),
      "fcnet_activation": ray.tune.grid_search(["linear", "relu"]),
    },
    "lr": ray.tune.uniform(1e-7, 1e-2)
}

# To explicitly stop or restart Ray, use the shutdown API.
ray.shutdown()

ray.init(
    num_cpus=2,
    include_dashboard=False,
    ignore_reinit_error=True,
    log_to_driver=False,
)

parameter_search_analysis = ray.tune.run(
    "PPO",
    config=parameter_search_config,
    stop=stop,
    num_samples=5,
    metric="timesteps_total",
    mode="min",
)

In [None]:
print(
  "Best hyperparameters found:",
  parameter_search_analysis.best_config,
)
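
To get a feel for how many trials the search above launches: the two `grid_search` entries define 2 x 2 = 4 grid points, and in Ray Tune `num_samples` repeats the whole grid, so `num_samples=5` yields 20 trials, each with its own `lr` drawn from the uniform range. The sketch below mimics that expansion in plain Python. `expand_search_space` is a hypothetical helper, and it does not reproduce Tune's actual sampler or scheduling.

```python
import itertools
import random


def expand_search_space(grid, num_samples, lr_bounds, seed=0):
    """Enumerate trial configs: every grid point, repeated num_samples times,
    with a fresh uniformly sampled learning rate per trial."""
    rng = random.Random(seed)
    keys = sorted(grid)
    trials = []
    for _ in range(num_samples):
        for values in itertools.product(*(grid[key] for key in keys)):
            trial = dict(zip(keys, values))
            trial["lr"] = rng.uniform(*lr_bounds)
            trials.append(trial)
    return trials


# Same grid values and learning-rate range as parameter_search_config.
grid = {
    "fcnet_hiddens": [[32], [64]],
    "fcnet_activation": ["linear", "relu"],
}
trials = expand_search_space(grid, num_samples=5, lr_bounds=(1e-7, 1e-2))
print(len(trials))  # 4 grid points repeated 5 times
```

Counting trials this way before launching a search helps estimate the total training time, since every trial runs until its stop criterion is met.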

# Bonus

As a bonus task, experiment with the algorithms you have learned on the `soccer_twos` environment, which will be used in the final competition of this course*. To make things easier, use the `team_vs_policy` variation, as in the previous lab.

<img src="https://raw.githubusercontent.com/bryanoliveira/soccer-twos-env/master/images/screenshot.png" height="400">

> Environment visualization

This environment is a 2v2 car soccer game: the goal is to score against the opposing team as quickly as possible. In the `team_vs_policy` variation, your agent controls one player on the blue team and plays against a random team. More information about the environment can be found [in the repository](https://github.com/bryanoliveira/soccer-twos-env) and [in the Unity ml-agents documentation](https://github.com/Unity-Technologies/ml-agents/blob/main/docs/Learning-Environment-Examples.md#soccer-twos).


**Your task is to train an agent with the Ray interface presented above, experimenting with different algorithms and hyperparameters.**


<br>

*The variation used in the competition will be `multiagent_player`, but agents trained on `team_vs_policy` can be easily adapted. The `MyRaySoccerAgent` class in the section on exporting your trained agent does exactly that.

Use the environment instantiated below to run the training algorithm. By the end of training, your agent's reward per episode should approach +2.

In [14]:
import gym
from gym.wrappers.monitoring.video_recorder import VideoRecorder
from gym.spaces import Discrete, Box

import ray
import ray.rllib.agents.ppo as pg
from ray.tune.logger import pretty_print
from ray import tune
from ray.rllib.env.env_context import EnvContext
from ray.rllib.models import ModelCatalog
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2
from ray.rllib.models.torch.fcnet import FullyConnectedNetwork as TorchFC
from ray.rllib.agents.ppo import PPOTrainer

import numpy as np
import os
import random

import torch
import torch.nn as nn

In [15]:
import soccer_twos

# Close the environment if it was opened earlier
try:
    env.close()
except Exception:
    pass

env = soccer_twos.make(
    variation=soccer_twos.EnvType.team_vs_policy,
    flatten_branched=True,  # converts the action_space from MultiDiscrete to Discrete
    single_player=True,  # controls one of the players while the others stand still
    opponent_policy=lambda *_: 0,  # makes the opponents stand still
)

environment_id = "soccer-v0"

# Get the state and action sizes
state_size = env.observation_space.shape[0]
action_size = env.action_space.n

print("State size: {}, action size: {}".format(state_size, action_size))
env.close()

[INFO] Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0
[INFO] Connected new brain: SoccerTwos?team=1
[INFO] Connected new brain: SoccerTwos?team=0

State size: 336, action size: 27
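
The 27 actions reported above come from flattening a branched (`MultiDiscrete`) action space into a single `Discrete` one. A minimal sketch of that idea follows, assuming three branches with three options each (3 × 3 × 3 = 27); `build_action_lookup` is a hypothetical helper, not the library's own implementation.

```python
import itertools


def build_action_lookup(branch_sizes):
    """Map each flat action index to one choice per branch."""
    combos = itertools.product(*(range(size) for size in branch_sizes))
    return {index: list(combo) for index, combo in enumerate(combos)}


# Assumption: three branches of size three, consistent with the 27 actions above.
lookup = build_action_lookup([3, 3, 3])
print(len(lookup))  # 27
print(lookup[0])    # [0, 0, 0]
print(lookup[26])   # [2, 2, 2]
```

A policy trained on the flattened space outputs a single index, and a table like this one converts it back into the per-branch action the environment expects.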


In [24]:
ray.shutdown()
ray.init(num_cpus=4, ignore_reinit_error=True, include_dashboard=False)

{'node_ip_address': '192.168.0.102',
 'raylet_ip_address': '192.168.0.102',
 'redis_address': '192.168.0.102:12844',
 'object_store_address': '/tmp/ray/session_2021-11-18_15-03-27_313316_269868/sockets/plasma_store',
 'raylet_socket_name': '/tmp/ray/session_2021-11-18_15-03-27_313316_269868/sockets/raylet',
 'webui_url': None,
 'session_dir': '/tmp/ray/session_2021-11-18_15-03-27_313316_269868',
 'metrics_export_port': 61492,
 'node_id': '1a6b1fec0972210a0e500c5656b6e833cea2e29e62c329b0e949d334'}

In [25]:
def create_rllib_env(env_config: dict = {}):
    # support multiple instances of the environment on the same machine
    if hasattr(env_config, "worker_index"):
        env_config["worker_id"] = (
            env_config.worker_index * env_config.get("num_envs_per_worker", 1)
            + env_config.vector_index
        )
    return soccer_twos.make(**env_config)

# register the environment with Ray
tune.registry.register_env(environment_id, create_rllib_env)
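
The `worker_id` computed in `create_rllib_env` gives every environment copy its own identifier, presumably so that each Unity process started on the same machine ends up on a different port. The sketch below reuses the same formula in a standalone function (`unity_worker_id` is a name chosen here for illustration) and checks that the ids do not collide.

```python
def unity_worker_id(worker_index, vector_index, num_envs_per_worker=1):
    """Same formula as create_rllib_env: one distinct id per environment copy."""
    return worker_index * num_envs_per_worker + vector_index


# Rollout workers are numbered from 1; index 0 is the local (driver) worker.
ids = [
    unity_worker_id(worker, env, num_envs_per_worker=2)
    for worker in range(1, 4)  # "num_workers": 3
    for env in range(2)        # two environment copies per worker (illustrative)
]
print(ids)  # [2, 3, 4, 5, 6, 7]
```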

In [26]:
NUM_ENVS_PER_WORKER = 1

Use the configuration below as a starting point for your experiments.

The most important part is the `env_config` key, which configures the environment to be compatible with the agent class provided for exporting your trained agent. At this point in the course you should be able to try the other environment variations and use the Ray APIs to train an agent close to (or better than) the [ceia_baseline_agent](https://drive.google.com/file/d/1WEjr48D7QG9uVy1tf4GJAZTpimHtINzE/view). Examples of how to use the other variations can be found [here](https://github.com/dlb-rl/rl-tournament-starter/). When using those variations, you must also use other agent definitions to handle the different observation and action spaces (these are included in the examples as well).

In [27]:
analysis = tune.run(
    "PPO",
    num_samples=1,
    config={
        # system settings
        "num_gpus": 1,
        "num_workers": 3,
#         "num_envs_per_worker": NUM_ENVS_PER_WORKER,
        "log_level": "INFO",
        "framework": "torch",
        # RL setup
        "env": environment_id,
        "env_config": {
            "variation": soccer_twos.EnvType.team_vs_policy,
            "single_player": True,
            "flatten_branched": True,
            "opponent_policy": lambda *_: 0,
        },
    },
    stop={
        # around 10,000,000 (10M) steps may be needed to learn a useful policy
        "timesteps_total": int(15e6),
        # you can also cap the run by wall-clock time, e.g. to fit within the Colab time limit
#         "time_total_s": 14400, # 4h
        "time_total_s": 86400, # 24h
    },
    checkpoint_freq=100,
    checkpoint_at_end=True,
    local_dir=os.path.join(DRIVE_PATH, "results"),
)

Trial name,status,loc
PPO_soccer-v0_d6059_00000,PENDING,




[2m[36m(pid=324757)[0m [INFO] Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0
[2m[36m(pid=324757)[0m [INFO] Connected new brain: SoccerTwos?team=1
[2m[36m(pid=324757)[0m [INFO] Connected new brain: SoccerTwos?team=0


[2m[36m(pid=324757)[0m INFO:mlagents_envs.environment:Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0
[2m[36m(pid=324757)[0m INFO:mlagents_envs.environment:Connected new brain: SoccerTwos?team=1
[2m[36m(pid=324757)[0m INFO:mlagents_envs.environment:Connected new brain: SoccerTwos?team=0
[2m[36m(pid=324757)[0m 2021-11-18 15:03:38,180	INFO torch_policy.py:134 -- TorchPolicy (worker=1) running on CPU.
[2m[36m(pid=324756)[0m INFO:mlagents_envs.environment:Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0
[2m[36m(pid=324756)[0m INFO:mlagents_envs.environment:Connected new brain: SoccerTwos?team=1
[2m[36m(pid=324756)[0m INFO:mlagents_envs.environment:Connected new brain: SoccerTwos?team=0


[2m[36m(pid=324756)[0m [INFO] Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0
[2m[36m(pid=324756)[0m [INFO] Connected new brain: SoccerTwos?team=1
[2m[36m(pid=324756)[0m [INFO] Connected new brain: SoccerTwos?team=0


[2m[36m(pid=324759)[0m 2021-11-18 15:03:38,360	INFO torch_policy.py:148 -- TorchPolicy (worker=local) running on 1 GPU(s).
[2m[36m(pid=324756)[0m 2021-11-18 15:03:38,405	INFO torch_policy.py:134 -- TorchPolicy (worker=3) running on CPU.
[2m[36m(pid=324758)[0m INFO:mlagents_envs.environment:Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0


[2m[36m(pid=324758)[0m [INFO] Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0
[2m[36m(pid=324758)[0m [INFO] Connected new brain: SoccerTwos?team=1
[2m[36m(pid=324758)[0m [INFO] Connected new brain: SoccerTwos?team=0


[2m[36m(pid=324758)[0m INFO:mlagents_envs.environment:Connected new brain: SoccerTwos?team=1
[2m[36m(pid=324758)[0m INFO:mlagents_envs.environment:Connected new brain: SoccerTwos?team=0
[2m[36m(pid=324758)[0m 2021-11-18 15:03:38,732	INFO torch_policy.py:134 -- TorchPolicy (worker=2) running on CPU.
[2m[36m(pid=324759)[0m 2021-11-18 15:03:40,815	INFO rollout_worker.py:1199 -- Built policy map: {'default_policy': <ray.rllib.policy.policy_template.PPOTorchPolicy object at 0x7fd4e022a6d0>}
[2m[36m(pid=324759)[0m 2021-11-18 15:03:40,815	INFO rollout_worker.py:1200 -- Built preprocessor map: {'default_policy': <ray.rllib.models.preprocessors.NoPreprocessor object at 0x7fd4e022a520>}
[2m[36m(pid=324759)[0m 2021-11-18 15:03:40,815	INFO rollout_worker.py:583 -- Built filter map: {'default_policy': <ray.rllib.utils.filter.NoFilter object at 0x7fd4e0204730>}
2021-11-18 15:03:40,904	ERROR trial_runner.py:748 -- Trial PPO_soccer-v0_d6059_00000: Error processing event.
Traceback (m

Result for PPO_soccer-v0_d6059_00000:
  {}
  


Trial name,status,loc
PPO_soccer-v0_d6059_00000,ERROR,

Trial name,# failures,error file
PPO_soccer-v0_d6059_00000,1,/home/bruno/Workspace/ceia-rl-curso/LAB_03/content/results/PPO/PPO_soccer-v0_d6059_00000_0_2021-11-18_15-03-30/error.txt




[2m[36m(pid=324759)[0m 2021-11-18 15:03:40,899	INFO trainer.py:601 -- Worker crashed during call to train(). To attempt to continue training without the failed worker, set `'ignore_worker_failures': True`.
[2m[36m(pid=324757)[0m 2021-11-18 15:03:40,889	INFO rollout_worker.py:723 -- Generating sample batch of size 1333
[2m[36m(pid=324756)[0m 2021-11-18 15:03:42,001	ERROR worker.py:409 -- SystemExit was raised from the worker
[2m[36m(pid=324756)[0m Traceback (most recent call last):
[2m[36m(pid=324756)[0m   File "python/ray/_raylet.pyx", line 490, in ray._raylet.execute_task
[2m[36m(pid=324756)[0m   File "python/ray/_raylet.pyx", line 497, in ray._raylet.execute_task
[2m[36m(pid=324756)[0m   File "python/ray/_raylet.pyx", line 501, in ray._raylet.execute_task
[2m[36m(pid=324756)[0m   File "python/ray/_raylet.pyx", line 451, in ray._raylet.execute_task.function_executor
[2m[36m(pid=324756)[0m   File "/home/bruno/anaconda3/envs/soccer-twos/lib/python3.8/site-pack

TuneError: ('Trials did not complete', [PPO_soccer-v0_d6059_00000])
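
The `stop` dictionary passed to `tune.run` above lists two limits, and a trial ends when whichever of them is reached first. A minimal sketch of that rule follows; `should_stop` is a hypothetical helper for illustration, not a Ray function.

```python
def should_stop(result, stop):
    """Return True as soon as any reported metric reaches its threshold."""
    return any(
        metric in result and result[metric] >= threshold
        for metric, threshold in stop.items()
    )


stop = {"timesteps_total": int(15e6), "time_total_s": 86400}

print(should_stop({"timesteps_total": 4_000_000, "time_total_s": 3_600}, stop))   # False
print(should_stop({"timesteps_total": 4_000_000, "time_total_s": 90_000}, stop))  # True
```

On Colab, the time limit is usually the one that triggers first, which is why the shorter 4h value is left as a commented alternative in the configuration.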

## Exporting your trained agent

As in Lab 02, you can export your trained agent to run as a competitor in the competition environment, or simply to watch it play. To do so, we define an agent class that implements the interface and converts observations and actions to the competition format. Below, we choose which experiment and checkpoint to export and store the implementation in a variable so it can be saved to a file later.

In [98]:
ALGORITHM = "PPO"
TRIAL = analysis.get_best_logdir("episode_reward_mean", "max")
CHECKPOINT = analysis.get_best_checkpoint(
  TRIAL,
  "training_iteration",
  "max",
)
TRIAL, CHECKPOINT

('/home/bruno/Workspace/ceia-rl-curso/LAB_03/content/results/PPO/PPO_soccer-v0_3fbc3_00000_0_lr=9.7555e-05_2021-11-15_17-54-39',
 '/home/bruno/Workspace/ceia-rl-curso/LAB_03/content/results/PPO/PPO_soccer-v0_3fbc3_00000_0_lr=9.7555e-05_2021-11-15_17-54-39/checkpoint_001252/checkpoint-1252')

In [99]:
agent_file = f"""
import pickle
import os

import gym
from gym_unity.envs import ActionFlattener
import ray
from ray import tune
from ray.tune.registry import get_trainable_cls

from soccer_twos import AgentInterface, DummyEnv


ALGORITHM = "{ALGORITHM}"
CHECKPOINT_PATH = os.path.join(
    os.path.dirname(os.path.abspath(__file__)), 
    "{CHECKPOINT.split("LAB_03/")[1]}"
)


class MyRaySoccerAgent(AgentInterface):
    def __init__(self, env: gym.Env):
        super().__init__()
        ray.init(ignore_reinit_error=True)

        self.flattener = ActionFlattener(env.action_space.nvec)

        # Load configuration from checkpoint file.
        config_path = ""
        if CHECKPOINT_PATH:
            config_dir = os.path.dirname(CHECKPOINT_PATH)
            config_path = os.path.join(config_dir, "params.pkl")
            # Try parent directory.
            if not os.path.exists(config_path):
                config_path = os.path.join(config_dir, "../params.pkl")

        # Load the config from pickled.
        if os.path.exists(config_path):
            with open(config_path, "rb") as f:
                config = pickle.load(f)
        else:
            # If no config in given checkpoint -> Error.
            raise ValueError(
                "Could not find params.pkl in either the checkpoint dir or "
                "its parent directory!"
            )

        # no need for parallelism on evaluation
        config["num_workers"] = 0
        config["num_gpus"] = 0

        # create a dummy env since it's required but we only care about the policy
        obs_space = env.observation_space
        act_space = self.flattener.action_space
        tune.registry.register_env(
            "DummyEnv",
            lambda *_: DummyEnv(obs_space, act_space),
        )
        config["env"] = "DummyEnv"

        # create the Trainer from config
        cls = get_trainable_cls(ALGORITHM)
        agent = cls(env=config["env"], config=config)
        # load state from checkpoint
        agent.restore(CHECKPOINT_PATH)
        # get default policy for evaluation
        self.policy = agent.get_policy()

    def act(self, observation):
        actions = {{}}
        for player_id in observation:
            # compute_single_action returns a tuple of (action, action_info, ...)
            # as we only need the action, we discard the other elements
            actions[player_id] = self.flattener.lookup_action(
                self.policy.compute_single_action(observation[player_id])[0]
            )
        return actions
"""

In [100]:
import os
import shutil

agent_name = "my_ray_soccer_agent"
agent_path = (
    os.path.join(DRIVE_PATH, agent_name, agent_name)
    if isColab
    else os.path.join(DRIVE_PATH, agent_name)
)

# start from an empty directory so files from a previous export do not linger
shutil.rmtree(agent_path, ignore_errors=True)
os.makedirs(agent_path)

# save the agent class
with open(os.path.join(agent_path, "agent.py"), "w") as f:
    f.write(agent_file)

# save an __init__ so the directory becomes a Python module
with open(os.path.join(agent_path, "__init__.py"), "w") as f:
    f.write("from .agent import MyRaySoccerAgent")

# copy the whole trial, including the experiment's configuration files
shutil.copytree(TRIAL, os.path.join(agent_path, TRIAL.split("LAB_03/")[1]))

# pack everything into a .zip file
if isColab:
    shutil.make_archive(os.path.join(DRIVE_PATH, agent_name),
                        "zip", os.path.join(DRIVE_PATH, agent_name))
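
Both the agent file and the `copytree` call above depend on `.split("LAB_03/")[1]`, which raises an `IndexError` when the path does not contain that exact directory name (the Drive path used earlier in this lab, for example, contains `lab03` instead). The sketch below shows one possible, more explicit way to take the part of a path after a marker directory; `path_after_marker` is a hypothetical helper, not part of the course code.

```python
from pathlib import PurePosixPath


def path_after_marker(path, marker="LAB_03"):
    """Return the part of `path` that follows the `marker` directory."""
    parts = PurePosixPath(path).parts
    if marker not in parts:
        raise ValueError(f"{marker!r} is not a directory in {path!r}")
    return str(PurePosixPath(*parts[parts.index(marker) + 1:]))


# Checkpoint path copied from the output shown earlier in this section.
checkpoint = (
    "/home/bruno/Workspace/ceia-rl-curso/LAB_03/content/results/PPO/"
    "PPO_soccer-v0_3fbc3_00000_0_lr=9.7555e-05_2021-11-15_17-54-39/"
    "checkpoint_001252/checkpoint-1252"
)
print(path_after_marker(checkpoint))
```

Passing `marker="lab03"` would be the adjustment to try when running from the Drive folder, since the error message makes a mismatch easy to spot.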


After packaging all the files needed to run your agent, a `minicurso_rl/lab03/my_ray_soccer_agent.zip` file is created in the Colab files and in the corresponding Google Drive folder. Download the file and extract it to a folder on your computer.
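
Before extracting, it may be worth confirming that the archive holds the two files written by the export cell. The following is a small sketch using only the standard library; `missing_agent_files` is a hypothetical helper, and the demo archive is created on the fly purely for illustration.

```python
import os
import tempfile
import zipfile


def missing_agent_files(zip_path, required=("agent.py", "__init__.py")):
    """List required file names that do not appear anywhere in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        names = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return [name for name in required if name not in names]


# Demo with a throwaway archive that lacks __init__.py.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "my_ray_soccer_agent.zip")
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("my_ray_soccer_agent/agent.py", "")
    print(missing_agent_files(demo_zip))  # ['__init__.py']
```

An empty list means both files are present in the archive you downloaded.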

Assuming the Python environment is already set up (e.g. the packages in [requirements.txt](https://github.com/dlb-rl/rl-tournament-starter/blob/main/requirements.txt) are installed), run `python -m soccer_twos.watch -m my_ray_soccer_agent` to watch your agent play against itself.

You can also pit two different agents against each other with the following command: `python -m soccer_twos.watch -m1 my_ray_soccer_agent -m2 ceia_baseline_agent`. You can download the *ceia_baseline_agent* agent [here](https://drive.google.com/file/d/1WEjr48D7QG9uVy1tf4GJAZTpimHtINzE/view).