# Ray Tune Tutorial - Exercise Solutions

© 2019-2020, Anyscale. All Rights Reserved

![Anyscale Academy](../../images/AnyscaleAcademy_Logo_clearbanner_141x100.png)

## 01 Ray Tune Overview 

### Exercise - Try More Neural Network Sizes

Repeat the experiment above using the sizes `[20, 40, 60, 80, 100]` or some subset of these numbers, depending on how long you are willing to wait. What combination appears to be best, given the considerations we discussed above?

First, we set up everything we need from the lesson.

In [4]:
import ray
from ray import tune

In [5]:
!../../tools/start-ray.sh --check --verbose

INFO: Ray is already running.


In [6]:
ray.init(address='auto', ignore_reinit_error=True)

{'node_ip_address': '192.168.1.149',
 'raylet_ip_address': '192.168.1.149',
 'redis_address': '192.168.1.149:6379',
 'object_store_address': '/tmp/ray/session_2020-07-19_08-56-14_461147_28318/sockets/plasma_store',
 'raylet_socket_name': '/tmp/ray/session_2020-07-19_08-56-14_461147_28318/sockets/raylet',
 'webui_url': 'localhost:8265',
 'session_dir': '/tmp/ray/session_2020-07-19_08-56-14_461147_28318'}

In [7]:
sizes = [20, 40, 60, 80, 100]

The next cell will take around 20-30 minutes, even on a fast laptop.
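
As a rough check on that estimate, the sketch below enumerates the grid that the two `tune.grid_search(sizes)` entries in the next cell expand into, then multiplies the trial count by an assumed per-trial time. The 70-second figure is an assumption read off the timestamps of the first trials in the output further down, where trials run one after another; larger networks or a different machine may take more or less time. The ordering of `combos` mirrors the experiment tags seen in that output (first layer size varies fastest) and is shown only for illustration.

```python
from itertools import product

sizes = [20, 40, 60, 80, 100]

# Each grid_search entry is one axis of the grid; one trial runs per combination.
# Swapping the tuple order makes the first layer vary fastest, matching the
# experiment tags observed in the log (an observation, not a documented guarantee).
combos = [(first, second) for second, first in product(sizes, repeat=2)]
num_trials = len(combos)

# Assumption: about 70 seconds per trial, including start-up, with trials run
# sequentially (a rough reading of the first trials' timestamps).
seconds_per_trial = 70
estimated_minutes = num_trials * seconds_per_trial / 60

print(num_trials)                    # 25
print(combos[:3])                    # [(20, 20), (40, 20), (60, 20)]
print(round(estimated_minutes, 1))   # 29.2
```

Under these assumptions the grid has 25 trials and the estimate comes to roughly 29 minutes, consistent with the 20-30 minute figure above. Using a subset of `sizes` shrinks the grid quadratically; three sizes give only 9 trials.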

In [None]:
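analysis = tune.run(
    "PPO",
    # Stop each trial once the mean episode reward reaches 400.
    stop={"episode_reward_mean": 400},
    config={
        "env": "CartPole-v1",
        "num_gpus": 0,
        "num_workers": 6,
        "model": {
            # Two hidden layers. Each layer's size is grid-searched independently,
            # so Tune runs one trial per (first, second) pair: 5 x 5 = 25 trials.
            'fcnet_hiddens': [
                tune.grid_search(sizes),
                tune.grid_search(sizes)
            ]
        },
        "eager": False,
    },
    verbose=2,
    # Ray was already initialized with ray.init() above.
    ray_auto_init=False
)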
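
Once the run finishes, one way to approach the exercise question is to compare how much training each combination needed to reach the stopping reward. The sketch below is not part of the original solution: it hard-codes the final (`done: true`) values logged for the first two trials in the output that follows, and `rank_by_steps` is a hypothetical helper. In a full run, the same fields could instead be read from the returned `analysis` object, for example via `analysis.dataframe()`.

```python
# Final values copied from the "done: true" result blocks of the first two trials.
final_results = [
    {"hiddens": (20, 20), "reward": 402.71, "iterations": 22, "steps": 105600},
    {"hiddens": (40, 20), "reward": 409.63, "iterations": 21, "steps": 100800},
]


def rank_by_steps(results, threshold=400):
    """Keep trials that reached the reward threshold, fewest timesteps first."""
    reached = [r for r in results if r["reward"] >= threshold]
    return sorted(reached, key=lambda r: r["steps"])


ranking = rank_by_steps(final_results)
for r in ranking:
    print(r["hiddens"], r["steps"], r["iterations"], r["reward"])

# Combination that needed the fewest timesteps among these two trials.
best = ranking[0]["hiddens"]
print(best)
```

For these two trials the gap is a single training iteration (4,800 timesteps), which may well be within run-to-run variation, so a ranking like this is more informative once all 25 trials are included and wall-clock time per trial is considered as well.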

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1
PPO_CartPole-v1_6da34_00001,PENDING,,,
PPO_CartPole-v1_6da34_00002,PENDING,,,
PPO_CartPole-v1_6da34_00003,PENDING,,,
PPO_CartPole-v1_6da34_00004,PENDING,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,


[2m[36m(pid=28334)[0m 2020-07-19 08:56:41,840	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=28334)[0m 2020-07-19 08:56:41,840	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00000:
  custom_metrics: {}
  date: 2020-07-19_08-56-53
  done: false
  episode_len_mean: 23.224390243902437
  episode_reward_max: 123.0
  episode_reward_mean: 23.224390243902437
  episode_reward_min: 8.0
  episodes_this_iter: 205
  episodes_total: 205
  experiment_id: 36a71047b4424780b80338a588eccfa9
  experiment_tag: 0_fcnet_hiddens_0=20,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6709845066070557
        entropy_coeff: 0.0
        kl: 0.021955376490950584
        model: {}
 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00001,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00002,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00003,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00000:
  custom_metrics: {}
  date: 2020-07-19_08-57-00
  done: false
  episode_len_mean: 71.33
  episode_reward_max: 222.0
  episode_reward_mean: 71.33
  episode_reward_min: 16.0
  episodes_this_iter: 51
  episodes_total: 489
  experiment_id: 36a71047b4424780b80338a588eccfa9
  experiment_tag: 0_fcnet_hiddens_0=20,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5931600332260132
        entropy_coeff: 0.0
        kl: 0.006998282857239246
        model: {}
        policy_loss: -0.013401087373495102
        total_loss: 1259.3472900390625
        vf_explained_var: 0.006506545934826136
        vf_loss: 1259.358642578125
    num_steps_sampled: 19200
    num_steps_trained: 19200
  iterations_since_restore: 4
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 71.8

Result for PPO_CartPole-v1_6da34_00000:
  custom_metrics: {}
  date: 2020-07-19_08-57-06
  done: false
  episode_len_mean: 162.91
  episode_reward_max: 500.0
  episode_reward_mean: 162.91
  episode_reward_min: 11.0
  episodes_this_iter: 20
  episodes_total: 562
  experiment_id: 36a71047b4424780b80338a588eccfa9
  experiment_tag: 0_fcnet_hiddens_0=20,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5359815359115601
        entropy_coeff: 0.0
        kl: 0.004617937374860048
        model: {}
        policy_loss: -0.00648966059088707
        total_loss: 1975.390625
        vf_explained_var: 0.0015481371665373445
        vf_loss: 1975.3963623046875
    num_steps_sampled: 33600
    num_steps_trained: 33600
  iterations_since_restore: 7
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 71.19999

Result for PPO_CartPole-v1_6da34_00000:
  custom_metrics: {}
  date: 2020-07-19_08-57-13
  done: false
  episode_len_mean: 241.39
  episode_reward_max: 500.0
  episode_reward_mean: 241.39
  episode_reward_min: 74.0
  episodes_this_iter: 18
  episodes_total: 616
  experiment_id: 36a71047b4424780b80338a588eccfa9
  experiment_tag: 0_fcnet_hiddens_0=20,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.01875000074505806
        cur_lr: 4.999999873689376e-05
        entropy: 0.5408639907836914
        entropy_coeff: 0.0
        kl: 0.00545285502448678
        model: {}
        policy_loss: -0.0016197676304727793
        total_loss: 1936.75439453125
        vf_explained_var: 0.00019752979278564453
        vf_loss: 1936.756103515625
    num_steps_sampled: 48000
    num_steps_trained: 48000
  iterations_since_restore: 10
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 7

Result for PPO_CartPole-v1_6da34_00000:
  custom_metrics: {}
  date: 2020-07-19_08-57-20
  done: false
  episode_len_mean: 294.3
  episode_reward_max: 500.0
  episode_reward_mean: 294.3
  episode_reward_min: 107.0
  episodes_this_iter: 13
  episodes_total: 655
  experiment_id: 36a71047b4424780b80338a588eccfa9
  experiment_tag: 0_fcnet_hiddens_0=20,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.004687500186264515
        cur_lr: 4.999999873689376e-05
        entropy: 0.5000230073928833
        entropy_coeff: 0.0
        kl: 0.0006449955399148166
        model: {}
        policy_loss: -0.002165779937058687
        total_loss: 1927.8553466796875
        vf_explained_var: 6.803950964240357e-05
        vf_loss: 1927.857666015625
    num_steps_sampled: 62400
    num_steps_trained: 62400
  iterations_since_restore: 13
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent:

Result for PPO_CartPole-v1_6da34_00000:
  custom_metrics: {}
  date: 2020-07-19_08-57-26
  done: false
  episode_len_mean: 344.19
  episode_reward_max: 500.0
  episode_reward_mean: 344.19
  episode_reward_min: 153.0
  episodes_this_iter: 12
  episodes_total: 695
  experiment_id: 36a71047b4424780b80338a588eccfa9
  experiment_tag: 0_fcnet_hiddens_0=20,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.0005859375232830644
        cur_lr: 4.999999873689376e-05
        entropy: 0.47939738631248474
        entropy_coeff: 0.0
        kl: 0.005784088280051947
        model: {}
        policy_loss: -0.0035216067917644978
        total_loss: 1770.7283935546875
        vf_explained_var: 1.7187079720315523e-05
        vf_loss: 1770.7320556640625
    num_steps_sampled: 76800
    num_steps_trained: 76800
  iterations_since_restore: 16
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_pe

Result for PPO_CartPole-v1_6da34_00000:
  custom_metrics: {}
  date: 2020-07-19_08-57-33
  done: false
  episode_len_mean: 366.88
  episode_reward_max: 500.0
  episode_reward_mean: 366.88
  episode_reward_min: 169.0
  episodes_this_iter: 9
  episodes_total: 733
  experiment_id: 36a71047b4424780b80338a588eccfa9
  experiment_tag: 0_fcnet_hiddens_0=20,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.0005859375232830644
        cur_lr: 4.999999873689376e-05
        entropy: 0.477664589881897
        entropy_coeff: 0.0
        kl: 0.0017057254444807768
        model: {}
        policy_loss: 0.0016749865608289838
        total_loss: 1718.893798828125
        vf_explained_var: 2.625826311941637e-07
        vf_loss: 1718.892333984375
    num_steps_sampled: 91200
    num_steps_trained: 91200
  iterations_since_restore: 19
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent:

Result for PPO_CartPole-v1_6da34_00000:
  custom_metrics: {}
  date: 2020-07-19_08-57-40
  done: true
  episode_len_mean: 402.71
  episode_reward_max: 500.0
  episode_reward_mean: 402.71
  episode_reward_min: 180.0
  episodes_this_iter: 10
  episodes_total: 764
  experiment_id: 36a71047b4424780b80338a588eccfa9
  experiment_tag: 0_fcnet_hiddens_0=20,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.0001464843808207661
        cur_lr: 4.999999873689376e-05
        entropy: 0.470482736825943
        entropy_coeff: 0.0
        kl: 0.007052297703921795
        model: {}
        policy_loss: -0.003557320684194565
        total_loss: 1532.3072509765625
        vf_explained_var: 0.045018646866083145
        vf_loss: 1532.310791015625
    num_steps_sampled: 105600
    num_steps_trained: 105600
  iterations_since_restore: 22
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent

[2m[36m(pid=28330)[0m 2020-07-19 08:57:45,311	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=28330)[0m 2020-07-19 08:57:45,311	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00001:
  custom_metrics: {}
  date: 2020-07-19_08-57-59
  done: false
  episode_len_mean: 21.832558139534882
  episode_reward_max: 87.0
  episode_reward_mean: 21.832558139534882
  episode_reward_min: 9.0
  episodes_this_iter: 215
  episodes_total: 215
  experiment_id: 44f778a809fb40eca35357255b4a3da8
  experiment_tag: 1_fcnet_hiddens_0=40,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6702538728713989
        entropy_coeff: 0.0
        kl: 0.023810263723134995
        model: {}
  

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00002,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00003,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00001:
  custom_metrics: {}
  date: 2020-07-19_08-58-04
  done: false
  episode_len_mean: 50.89
  episode_reward_max: 138.0
  episode_reward_mean: 50.89
  episode_reward_min: 11.0
  episodes_this_iter: 86
  episodes_total: 441
  experiment_id: 44f778a809fb40eca35357255b4a3da8
  experiment_tag: 1_fcnet_hiddens_0=40,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5967952609062195
        entropy_coeff: 0.0
        kl: 0.011363429017364979
        model: {}
        policy_loss: -0.01925078220665455
        total_loss: 638.57421875
        vf_explained_var: 0.005123368930071592
        vf_loss: 638.590087890625
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 75.7
    ram

Result for PPO_CartPole-v1_6da34_00001:
  custom_metrics: {}
  date: 2020-07-19_08-58-11
  done: false
  episode_len_mean: 141.28
  episode_reward_max: 500.0
  episode_reward_mean: 141.28
  episode_reward_min: 15.0
  episodes_this_iter: 19
  episodes_total: 533
  experiment_id: 44f778a809fb40eca35357255b4a3da8
  experiment_tag: 1_fcnet_hiddens_0=40,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5485965013504028
        entropy_coeff: 0.0
        kl: 0.003564785933122039
        model: {}
        policy_loss: -0.002743388758972287
        total_loss: 2033.9683837890625
        vf_explained_var: 0.0013817226281389594
        vf_loss: 2033.9698486328125
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 

Result for PPO_CartPole-v1_6da34_00001:
  custom_metrics: {}
  date: 2020-07-19_08-58-19
  done: false
  episode_len_mean: 228.45
  episode_reward_max: 500.0
  episode_reward_mean: 228.45
  episode_reward_min: 15.0
  episodes_this_iter: 23
  episodes_total: 592
  experiment_id: 44f778a809fb40eca35357255b4a3da8
  experiment_tag: 1_fcnet_hiddens_0=40,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5067653059959412
        entropy_coeff: 0.0
        kl: 0.0028071149718016386
        model: {}
        policy_loss: 0.0006881076842546463
        total_loss: 1723.2452392578125
        vf_explained_var: 0.0001529793516965583
        vf_loss: 1723.2442626953125
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent:

Result for PPO_CartPole-v1_6da34_00001:
  custom_metrics: {}
  date: 2020-07-19_08-58-25
  done: false
  episode_len_mean: 242.62
  episode_reward_max: 500.0
  episode_reward_mean: 242.62
  episode_reward_min: 109.0
  episodes_this_iter: 21
  episodes_total: 633
  experiment_id: 44f778a809fb40eca35357255b4a3da8
  experiment_tag: 1_fcnet_hiddens_0=40,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.01875000074505806
        cur_lr: 4.999999873689376e-05
        entropy: 0.5401674509048462
        entropy_coeff: 0.0
        kl: 0.010654631070792675
        model: {}
        policy_loss: -0.002786755794659257
        total_loss: 1634.9063720703125
        vf_explained_var: 6.383012805599719e-05
        vf_loss: 1634.9088134765625
    num_steps_sampled: 52800
    num_steps_trained: 52800
  iterations_since_restore: 11
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent

Result for PPO_CartPole-v1_6da34_00001:
  custom_metrics: {}
  date: 2020-07-19_08-58-30
  done: false
  episode_len_mean: 257.11
  episode_reward_max: 500.0
  episode_reward_mean: 257.11
  episode_reward_min: 148.0
  episodes_this_iter: 10
  episodes_total: 661
  experiment_id: 44f778a809fb40eca35357255b4a3da8
  experiment_tag: 1_fcnet_hiddens_0=40,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.00937500037252903
        cur_lr: 4.999999873689376e-05
        entropy: 0.5463888049125671
        entropy_coeff: 0.0
        kl: 0.0010843303753063083
        model: {}
        policy_loss: 0.0006517936708405614
        total_loss: 2177.93505859375
        vf_explained_var: 3.067222905883682e-06
        vf_loss: 2177.93408203125
    num_steps_sampled: 62400
    num_steps_trained: 62400
  iterations_since_restore: 13
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 8

Result for PPO_CartPole-v1_6da34_00001:
  custom_metrics: {}
  date: 2020-07-19_08-58-36
  done: false
  episode_len_mean: 296.02
  episode_reward_max: 500.0
  episode_reward_mean: 296.02
  episode_reward_min: 133.0
  episodes_this_iter: 13
  episodes_total: 687
  experiment_id: 44f778a809fb40eca35357255b4a3da8
  experiment_tag: 1_fcnet_hiddens_0=40,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.0023437500931322575
        cur_lr: 4.999999873689376e-05
        entropy: 0.5414092540740967
        entropy_coeff: 0.0
        kl: 0.009934326633810997
        model: {}
        policy_loss: -0.004858800675719976
        total_loss: 1841.809814453125
        vf_explained_var: 5.267762048788427e-07
        vf_loss: 1841.8148193359375
    num_steps_sampled: 72000
    num_steps_trained: 72000
  iterations_since_restore: 15
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percen

Result for PPO_CartPole-v1_6da34_00001:
  custom_metrics: {}
  date: 2020-07-19_08-58-43
  done: false
  episode_len_mean: 352.1
  episode_reward_max: 500.0
  episode_reward_mean: 352.1
  episode_reward_min: 133.0
  episodes_this_iter: 12
  episodes_total: 723
  experiment_id: 44f778a809fb40eca35357255b4a3da8
  experiment_tag: 1_fcnet_hiddens_0=40,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.0023437500931322575
        cur_lr: 4.999999873689376e-05
        entropy: 0.5253211855888367
        entropy_coeff: 0.0
        kl: 0.009855245240032673
        model: {}
        policy_loss: -0.0020169576164335012
        total_loss: 1785.3980712890625
        vf_explained_var: -9.665617639598167e-09
        vf_loss: 1785.39990234375
    num_steps_sampled: 86400
    num_steps_trained: 86400
  iterations_since_restore: 18
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent

Result for PPO_CartPole-v1_6da34_00001:
  custom_metrics: {}
  date: 2020-07-19_08-58-50
  done: true
  episode_len_mean: 409.63
  episode_reward_max: 500.0
  episode_reward_mean: 409.63
  episode_reward_min: 133.0
  episodes_this_iter: 10
  episodes_total: 754
  experiment_id: 44f778a809fb40eca35357255b4a3da8
  experiment_tag: 1_fcnet_hiddens_0=40,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.0011718750465661287
        cur_lr: 4.999999873689376e-05
        entropy: 0.5329412221908569
        entropy_coeff: 0.0
        kl: 0.005241755861788988
        model: {}
        policy_loss: -0.003045935183763504
        total_loss: 1613.3494873046875
        vf_explained_var: -3.3829664403128845e-08
        vf_loss: 1613.3525390625
    num_steps_sampled: 100800
    num_steps_trained: 100800
  iterations_since_restore: 21
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_perce

[2m[36m(pid=28640)[0m 2020-07-19 08:58:55,240	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=28640)[0m 2020-07-19 08:58:55,240	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00002:
  custom_metrics: {}
  date: 2020-07-19_08-59-04
  done: false
  episode_len_mean: 21.96698113207547
  episode_reward_max: 63.0
  episode_reward_mean: 21.96698113207547
  episode_reward_min: 9.0
  episodes_this_iter: 212
  episodes_total: 212
  experiment_id: 86142fde4745400d81b2bace2a389f9d
  experiment_tag: 2_fcnet_hiddens_0=60,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6660190224647522
        entropy_coeff: 0.0
        kl: 0.027577368542551994
        model: {}
    

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00003,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00002:
  custom_metrics: {}
  date: 2020-07-19_08-59-11
  done: false
  episode_len_mean: 83.58
  episode_reward_max: 344.0
  episode_reward_mean: 83.58
  episode_reward_min: 12.0
  episodes_this_iter: 39
  episodes_total: 468
  experiment_id: 86142fde4745400d81b2bace2a389f9d
  experiment_tag: 2_fcnet_hiddens_0=60,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5797058939933777
        entropy_coeff: 0.0
        kl: 0.010305119678378105
        model: {}
        policy_loss: -0.01168621052056551
        total_loss: 1510.205078125
        vf_explained_var: 0.0018591075204312801
        vf_loss: 1510.2139892578125
    num_steps_sampled: 19200
    num_steps_trained: 19200
  iterations_since_restore: 4
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 76.7
  

Result for PPO_CartPole-v1_6da34_00002:
  custom_metrics: {}
  date: 2020-07-19_08-59-19
  done: false
  episode_len_mean: 189.53
  episode_reward_max: 500.0
  episode_reward_mean: 189.53
  episode_reward_min: 19.0
  episodes_this_iter: 20
  episodes_total: 530
  experiment_id: 86142fde4745400d81b2bace2a389f9d
  experiment_tag: 2_fcnet_hiddens_0=60,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5431830286979675
        entropy_coeff: 0.0
        kl: 0.003107875119894743
        model: {}
        policy_loss: -0.004871003329753876
        total_loss: 1960.3590087890625
        vf_explained_var: 0.0004553231119643897
        vf_loss: 1960.3631591796875
    num_steps_sampled: 33600
    num_steps_trained: 33600
  iterations_since_restore: 7
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00003,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00002:
  custom_metrics: {}
  date: 2020-07-19_08-59-25
  done: false
  episode_len_mean: 285.07
  episode_reward_max: 500.0
  episode_reward_mean: 285.07
  episode_reward_min: 22.0
  episodes_this_iter: 13
  episodes_total: 567
  experiment_id: 86142fde4745400d81b2bace2a389f9d
  experiment_tag: 2_fcnet_hiddens_0=60,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5338836908340454
        entropy_coeff: 0.0
        kl: 0.00856406707316637
        model: {}
        policy_loss: -0.008728088811039925
        total_loss: 2103.968994140625
        vf_explained_var: 6.735485658282414e-05
        vf_loss: 2103.976806640625
    num_steps_sampled: 48000
    num_steps_trained: 48000
  iterations_since_restore: 10
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 68

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00003,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00002:
  custom_metrics: {}
  date: 2020-07-19_08-59-32
  done: false
  episode_len_mean: 363.97
  episode_reward_max: 500.0
  episode_reward_mean: 363.97
  episode_reward_min: 111.0
  episodes_this_iter: 12
  episodes_total: 598
  experiment_id: 86142fde4745400d81b2bace2a389f9d
  experiment_tag: 2_fcnet_hiddens_0=60,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.4957581162452698
        entropy_coeff: 0.0
        kl: 0.010847947560250759
        model: {}
        policy_loss: -0.004457523114979267
        total_loss: 1998.124755859375
        vf_explained_var: 9.153340215561911e-06
        vf_loss: 1998.1290283203125
    num_steps_sampled: 62400
    num_steps_trained: 62400
  iterations_since_restore: 13
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent:

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00003,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00002:
  custom_metrics: {}
  date: 2020-07-19_08-59-39
  done: true
  episode_len_mean: 400.13
  episode_reward_max: 500.0
  episode_reward_mean: 400.13
  episode_reward_min: 231.0
  episodes_this_iter: 13
  episodes_total: 637
  experiment_id: 86142fde4745400d81b2bace2a389f9d
  experiment_tag: 2_fcnet_hiddens_0=60,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.01875000074505806
        cur_lr: 4.999999873689376e-05
        entropy: 0.5074348449707031
        entropy_coeff: 0.0
        kl: 0.008905192837119102
        model: {}
        policy_loss: -0.0056859697215259075
        total_loss: 1843.987548828125
        vf_explained_var: 7.088119957643357e-08
        vf_loss: 1843.9930419921875
    num_steps_sampled: 76800
    num_steps_trained: 76800
  iterations_since_restore: 16
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent:

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00003,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,


[2m[36m(pid=28803)[0m 2020-07-19 08:59:44,227	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=28803)[0m 2020-07-19 08:59:44,227	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00003:
  custom_metrics: {}
  date: 2020-07-19_08-59-57
  done: false
  episode_len_mean: 21.53181818181818
  episode_reward_max: 82.0
  episode_reward_mean: 21.53181818181818
  episode_reward_min: 9.0
  episodes_this_iter: 220
  episodes_total: 220
  experiment_id: f9fad4b5fdd74cc1b2a4f3309fce6320
  experiment_tag: 3_fcnet_hiddens_0=80,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6678789258003235
        entropy_coeff: 0.0
        kl: 0.02645065449178219
        model: {}
     

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00003:
  custom_metrics: {}
  date: 2020-07-19_09-00-02
  done: false
  episode_len_mean: 53.06
  episode_reward_max: 201.0
  episode_reward_mean: 53.06
  episode_reward_min: 11.0
  episodes_this_iter: 84
  episodes_total: 435
  experiment_id: f9fad4b5fdd74cc1b2a4f3309fce6320
  experiment_tag: 3_fcnet_hiddens_0=80,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.593519389629364
        entropy_coeff: 0.0
        kl: 0.012376002967357635
        model: {}
        policy_loss: -0.021163098514080048
        total_loss: 869.5513916015625
        vf_explained_var: 0.0016247453168034554
        vf_loss: 869.5687866210938
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 68.42

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00003:
  custom_metrics: {}
  date: 2020-07-19_09-00-08
  done: false
  episode_len_mean: 153.72
  episode_reward_max: 500.0
  episode_reward_mean: 153.72
  episode_reward_min: 13.0
  episodes_this_iter: 11
  episodes_total: 497
  experiment_id: f9fad4b5fdd74cc1b2a4f3309fce6320
  experiment_tag: 3_fcnet_hiddens_0=80,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5782045125961304
        entropy_coeff: 0.0
        kl: 0.0021998886950314045
        model: {}
        policy_loss: -0.0007727226475253701
        total_loss: 2664.34423828125
        vf_explained_var: 0.00018171200645156205
        vf_loss: 2664.344482421875
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00003:
  custom_metrics: {}
  date: 2020-07-19_09-00-16
  done: false
  episode_len_mean: 276.36
  episode_reward_max: 500.0
  episode_reward_mean: 276.36
  episode_reward_min: 14.0
  episodes_this_iter: 10
  episodes_total: 536
  experiment_id: f9fad4b5fdd74cc1b2a4f3309fce6320
  experiment_tag: 3_fcnet_hiddens_0=80,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5575742721557617
        entropy_coeff: 0.0
        kl: 0.006223721895366907
        model: {}
        policy_loss: -0.007315991912037134
        total_loss: 2389.61865234375
        vf_explained_var: 1.3266060705063865e-05
        vf_loss: 2389.625732421875
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 72

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00003:
  custom_metrics: {}
  date: 2020-07-19_09-00-23
  done: false
  episode_len_mean: 372.12
  episode_reward_max: 500.0
  episode_reward_mean: 372.12
  episode_reward_min: 61.0
  episodes_this_iter: 9
  episodes_total: 567
  experiment_id: f9fad4b5fdd74cc1b2a4f3309fce6320
  experiment_tag: 3_fcnet_hiddens_0=80,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.5280267596244812
        entropy_coeff: 0.0
        kl: 0.0035497047938406467
        model: {}
        policy_loss: -0.006150512490421534
        total_loss: 2207.228515625
        vf_explained_var: 8.699056053274035e-08
        vf_loss: 2207.23486328125
    num_steps_sampled: 57600
    num_steps_trained: 57600
  iterations_since_restore: 12
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 70.12

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00003:
  custom_metrics: {}
  date: 2020-07-19_09-00-28
  done: true
  episode_len_mean: 417.96
  episode_reward_max: 500.0
  episode_reward_mean: 417.96
  episode_reward_min: 61.0
  episodes_this_iter: 9
  episodes_total: 587
  experiment_id: f9fad4b5fdd74cc1b2a4f3309fce6320
  experiment_tag: 3_fcnet_hiddens_0=80,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.00937500037252903
        cur_lr: 4.999999873689376e-05
        entropy: 0.5559065937995911
        entropy_coeff: 0.0
        kl: 0.003971484024077654
        model: {}
        policy_loss: -0.0014445012202486396
        total_loss: 2124.797119140625
        vf_explained_var: -9.665617994869535e-08
        vf_loss: 2124.798583984375
    num_steps_sampled: 67200
    num_steps_trained: 67200
  iterations_since_restore: 14
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00004,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,


[2m[36m(pid=28832)[0m 2020-07-19 09:00:34,270	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=28832)[0m 2020-07-19 09:00:34,270	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00004:
  custom_metrics: {}
  date: 2020-07-19_09-00-44
  done: false
  episode_len_mean: 22.358851674641148
  episode_reward_max: 55.0
  episode_reward_mean: 22.358851674641148
  episode_reward_min: 9.0
  episodes_this_iter: 209
  episodes_total: 209
  experiment_id: 8700195ecdec4b30ac281940951ea99f
  experiment_tag: 4_fcnet_hiddens_0=100,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6646804809570312
        entropy_coeff: 0.0
        kl: 0.029441647231578827
        model: {}
 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00004:
  custom_metrics: {}
  date: 2020-07-19_09-00-49
  done: false
  episode_len_mean: 62.47
  episode_reward_max: 197.0
  episode_reward_mean: 62.47
  episode_reward_min: 12.0
  episodes_this_iter: 57
  episodes_total: 387
  experiment_id: 8700195ecdec4b30ac281940951ea99f
  experiment_tag: 4_fcnet_hiddens_0=100,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5906598567962646
        entropy_coeff: 0.0
        kl: 0.012457733042538166
        model: {}
        policy_loss: -0.018887832760810852
        total_loss: 1225.72607421875
        vf_explained_var: 0.0013216057559475303
        vf_loss: 1225.7412109375
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 62.95


Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00004:
  custom_metrics: {}
  date: 2020-07-19_09-00-56
  done: false
  episode_len_mean: 160.21
  episode_reward_max: 500.0
  episode_reward_mean: 160.21
  episode_reward_min: 14.0
  episodes_this_iter: 19
  episodes_total: 461
  experiment_id: 8700195ecdec4b30ac281940951ea99f
  experiment_tag: 4_fcnet_hiddens_0=100,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.557289719581604
        entropy_coeff: 0.0
        kl: 0.005479466635733843
        model: {}
        policy_loss: -0.0019569764845073223
        total_loss: 2102.773681640625
        vf_explained_var: 0.0004056965990457684
        vf_loss: 2102.77490234375
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 60

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00004:
  custom_metrics: {}
  date: 2020-07-19_09-01-02
  done: false
  episode_len_mean: 258.18
  episode_reward_max: 500.0
  episode_reward_mean: 258.18
  episode_reward_min: 14.0
  episodes_this_iter: 12
  episodes_total: 503
  experiment_id: 8700195ecdec4b30ac281940951ea99f
  experiment_tag: 4_fcnet_hiddens_0=100,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.526450514793396
        entropy_coeff: 0.0
        kl: 0.003630430204793811
        model: {}
        policy_loss: -0.002507850993424654
        total_loss: 2327.1474609375
        vf_explained_var: 1.7647807908360846e-05
        vf_loss: 2327.1494140625
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 64.3


Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00004:
  custom_metrics: {}
  date: 2020-07-19_09-01-09
  done: false
  episode_len_mean: 341.88
  episode_reward_max: 500.0
  episode_reward_mean: 341.88
  episode_reward_min: 14.0
  episodes_this_iter: 10
  episodes_total: 536
  experiment_id: 8700195ecdec4b30ac281940951ea99f
  experiment_tag: 4_fcnet_hiddens_0=100,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.5309068560600281
        entropy_coeff: 0.0
        kl: 0.0014336350141093135
        model: {}
        policy_loss: -0.004630922339856625
        total_loss: 2198.32177734375
        vf_explained_var: 9.330543434771243e-06
        vf_loss: 2198.326171875
    num_steps_sampled: 57600
    num_steps_trained: 57600
  iterations_since_restore: 12
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 62.

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00005,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00004:
  custom_metrics: {}
  date: 2020-07-19_09-01-16
  done: true
  episode_len_mean: 410.24
  episode_reward_max: 500.0
  episode_reward_mean: 410.24
  episode_reward_min: 132.0
  episodes_this_iter: 9
  episodes_total: 565
  experiment_id: 8700195ecdec4b30ac281940951ea99f
  experiment_tag: 4_fcnet_hiddens_0=100,fcnet_hiddens_1=20
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.00937500037252903
        cur_lr: 4.999999873689376e-05
        entropy: 0.5088074207305908
        entropy_coeff: 0.0
        kl: 0.0061948844231665134
        model: {}
        policy_loss: -0.0007102466188371181
        total_loss: 1952.3623046875
        vf_explained_var: 9.504524456360741e-08
        vf_loss: 1952.3629150390625
    num_steps_sampled: 72000
    num_steps_trained: 72000
  iterations_since_restore: 15
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 

2020-07-19 09:01:17,548	INFO (unknown file):0 -- gc.collect() freed 11 refs in 0.1217695029999959 seconds


[2m[36m(pid=28928)[0m 2020-07-19 09:01:20,784	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=28928)[0m 2020-07-19 09:01:20,784	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00005:
  custom_metrics: {}
  date: 2020-07-19_09-01-33
  done: false
  episode_len_mean: 21.761467889908257
  episode_reward_max: 68.0
  episode_reward_mean: 21.761467889908257
  episode_reward_min: 9.0
  episodes_this_iter: 218
  episodes_total: 218
  experiment_id: ebe8a89fd36948beb2b9e611c3bbfd7f
  experiment_tag: 5_fcnet_hiddens_0=20,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6710796356201172
        entropy_coeff: 0.0
        kl: 0.022238153964281082
        model: {}
  

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00005:
  custom_metrics: {}
  date: 2020-07-19_09-01-40
  done: false
  episode_len_mean: 60.58
  episode_reward_max: 160.0
  episode_reward_mean: 60.58
  episode_reward_min: 14.0
  episodes_this_iter: 75
  episodes_total: 546
  experiment_id: ebe8a89fd36948beb2b9e611c3bbfd7f
  experiment_tag: 5_fcnet_hiddens_0=20,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5948651432991028
        entropy_coeff: 0.0
        kl: 0.006660849321633577
        model: {}
        policy_loss: -0.010564912110567093
        total_loss: 504.98211669921875
        vf_explained_var: 0.013336948119103909
        vf_loss: 504.9906005859375
    num_steps_sampled: 19200
    num_steps_trained: 19200
  iterations_since_restore: 4
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 64.3

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00005:
  custom_metrics: {}
  date: 2020-07-19_09-01-47
  done: false
  episode_len_mean: 117.6
  episode_reward_max: 494.0
  episode_reward_mean: 117.6
  episode_reward_min: 11.0
  episodes_this_iter: 31
  episodes_total: 672
  experiment_id: ebe8a89fd36948beb2b9e611c3bbfd7f
  experiment_tag: 5_fcnet_hiddens_0=20,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5619719624519348
        entropy_coeff: 0.0
        kl: 0.00421889079734683
        model: {}
        policy_loss: -0.00810688454657793
        total_loss: 1166.2972412109375
        vf_explained_var: 0.007231350056827068
        vf_loss: 1166.30419921875
    num_steps_sampled: 33600
    num_steps_trained: 33600
  iterations_since_restore: 7
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 66.2
  

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00005:
  custom_metrics: {}
  date: 2020-07-19_09-01-54
  done: false
  episode_len_mean: 213.05
  episode_reward_max: 500.0
  episode_reward_mean: 213.05
  episode_reward_min: 14.0
  episodes_this_iter: 20
  episodes_total: 723
  experiment_id: ebe8a89fd36948beb2b9e611c3bbfd7f
  experiment_tag: 5_fcnet_hiddens_0=20,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5535653829574585
        entropy_coeff: 0.0
        kl: 0.0095695024356246
        model: {}
        policy_loss: -0.002659343648701906
        total_loss: 1170.2486572265625
        vf_explained_var: 0.09187152236700058
        vf_loss: 1170.2506103515625
    num_steps_sampled: 48000
    num_steps_trained: 48000
  iterations_since_restore: 10
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 67.

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00005:
  custom_metrics: {}
  date: 2020-07-19_09-02-01
  done: false
  episode_len_mean: 301.07
  episode_reward_max: 500.0
  episode_reward_mean: 301.07
  episode_reward_min: 28.0
  episodes_this_iter: 13
  episodes_total: 757
  experiment_id: ebe8a89fd36948beb2b9e611c3bbfd7f
  experiment_tag: 5_fcnet_hiddens_0=20,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.5266083478927612
        entropy_coeff: 0.0
        kl: 0.0033742785453796387
        model: {}
        policy_loss: -0.0009685732657089829
        total_loss: 1116.404541015625
        vf_explained_var: 0.10699250549077988
        vf_loss: 1116.4053955078125
    num_steps_sampled: 62400
    num_steps_trained: 62400
  iterations_since_restore: 13
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00005:
  custom_metrics: {}
  date: 2020-07-19_09-02-07
  done: false
  episode_len_mean: 381.73
  episode_reward_max: 500.0
  episode_reward_mean: 381.73
  episode_reward_min: 100.0
  episodes_this_iter: 11
  episodes_total: 790
  experiment_id: ebe8a89fd36948beb2b9e611c3bbfd7f
  experiment_tag: 5_fcnet_hiddens_0=20,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.00937500037252903
        cur_lr: 4.999999873689376e-05
        entropy: 0.5307157635688782
        entropy_coeff: 0.0
        kl: 0.004437300842255354
        model: {}
        policy_loss: 0.00030851445626467466
        total_loss: 996.6212158203125
        vf_explained_var: 0.11976678669452667
        vf_loss: 996.6207885742188
    num_steps_sampled: 76800
    num_steps_trained: 76800
  iterations_since_restore: 16
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00006,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00005:
  custom_metrics: {}
  date: 2020-07-19_09-02-11
  done: true
  episode_len_mean: 409.17
  episode_reward_max: 500.0
  episode_reward_mean: 409.17
  episode_reward_min: 127.0
  episodes_this_iter: 9
  episodes_total: 808
  experiment_id: ebe8a89fd36948beb2b9e611c3bbfd7f
  experiment_tag: 5_fcnet_hiddens_0=20,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.004687500186264515
        cur_lr: 4.999999873689376e-05
        entropy: 0.5079281330108643
        entropy_coeff: 0.0
        kl: 0.00710186455398798
        model: {}
        policy_loss: -0.0038281551096588373
        total_loss: 963.203125
        vf_explained_var: 0.00015835181693546474
        vf_loss: 963.2069091796875
    num_steps_sampled: 86400
    num_steps_trained: 86400
  iterations_since_restore: 18
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 63.225


Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00006:
  custom_metrics: {}
  date: 2020-07-19_09-02-31
  done: false
  episode_len_mean: 53.29
  episode_reward_max: 154.0
  episode_reward_mean: 53.29
  episode_reward_min: 10.0
  episodes_this_iter: 85
  episodes_total: 433
  experiment_id: 4d080bc16b9545aab7839e8cdab2eb07
  experiment_tag: 6_fcnet_hiddens_0=40,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5904853343963623
        entropy_coeff: 0.0
        kl: 0.01160881295800209
        model: {}
        policy_loss: -0.019516896456480026
        total_loss: 638.7928466796875
        vf_explained_var: 0.012190445326268673
        vf_loss: 638.808837890625
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 68.3
  

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00006:
  custom_metrics: {}
  date: 2020-07-19_09-02-38
  done: false
  episode_len_mean: 149.98
  episode_reward_max: 471.0
  episode_reward_mean: 149.98
  episode_reward_min: 11.0
  episodes_this_iter: 19
  episodes_total: 514
  experiment_id: 4d080bc16b9545aab7839e8cdab2eb07
  experiment_tag: 6_fcnet_hiddens_0=40,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5637285709381104
        entropy_coeff: 0.0
        kl: 0.0021010758355259895
        model: {}
        policy_loss: 0.00043340277625247836
        total_loss: 1633.4901123046875
        vf_explained_var: 0.005250368732959032
        vf_loss: 1633.489013671875
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00006:
  custom_metrics: {}
  date: 2020-07-19_09-02-44
  done: false
  episode_len_mean: 252.35
  episode_reward_max: 500.0
  episode_reward_mean: 252.35
  episode_reward_min: 14.0
  episodes_this_iter: 11
  episodes_total: 551
  experiment_id: 4d080bc16b9545aab7839e8cdab2eb07
  experiment_tag: 6_fcnet_hiddens_0=40,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5306363701820374
        entropy_coeff: 0.0
        kl: 0.006622670218348503
        model: {}
        policy_loss: -0.0029428009875118732
        total_loss: 1644.8323974609375
        vf_explained_var: 0.0007756513659842312
        vf_loss: 1644.8349609375
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 64

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00006:
  custom_metrics: {}
  date: 2020-07-19_09-02-51
  done: false
  episode_len_mean: 334.59
  episode_reward_max: 500.0
  episode_reward_mean: 334.59
  episode_reward_min: 49.0
  episodes_this_iter: 10
  episodes_total: 589
  experiment_id: 4d080bc16b9545aab7839e8cdab2eb07
  experiment_tag: 6_fcnet_hiddens_0=40,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5130739212036133
        entropy_coeff: 0.0
        kl: 0.0017609935021027923
        model: {}
        policy_loss: -0.0027927353512495756
        total_loss: 1433.4461669921875
        vf_explained_var: 0.0006892085075378418
        vf_loss: 1433.4488525390625
    num_steps_sampled: 57600
    num_steps_trained: 57600
  iterations_since_restore: 12
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percen

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00007,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00006:
  custom_metrics: {}
  date: 2020-07-19_09-02-57
  done: true
  episode_len_mean: 402.38
  episode_reward_max: 500.0
  episode_reward_mean: 402.38
  episode_reward_min: 49.0
  episodes_this_iter: 9
  episodes_total: 620
  experiment_id: 4d080bc16b9545aab7839e8cdab2eb07
  experiment_tag: 6_fcnet_hiddens_0=40,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.01875000074505806
        cur_lr: 4.999999873689376e-05
        entropy: 0.49369916319847107
        entropy_coeff: 0.0
        kl: 0.004900893662124872
        model: {}
        policy_loss: -0.0030332941096276045
        total_loss: 1177.2862548828125
        vf_explained_var: 8.6885855125729e-05
        vf_loss: 1177.2889404296875
    num_steps_sampled: 72000
    num_steps_trained: 72000
  iterations_since_restore: 15
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

2020-07-19 09:02:58,290	INFO (unknown file):0 -- gc.collect() freed 72 refs in 0.14371479800001907 seconds


[2m[36m(pid=29054)[0m 2020-07-19 09:03:02,191	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=29054)[0m 2020-07-19 09:03:02,191	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00007:
  custom_metrics: {}
  date: 2020-07-19_09-03-14
  done: false
  episode_len_mean: 21.990697674418605
  episode_reward_max: 71.0
  episode_reward_mean: 21.990697674418605
  episode_reward_min: 9.0
  episodes_this_iter: 215
  episodes_total: 215
  experiment_id: c3277816ab6d4d1b90e22f7a810dab17
  experiment_tag: 7_fcnet_hiddens_0=60,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6651923656463623
        entropy_coeff: 0.0
        kl: 0.028940584510564804
        model: {}
  

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00007:
  custom_metrics: {}
  date: 2020-07-19_09-03-19
  done: false
  episode_len_mean: 53.35
  episode_reward_max: 167.0
  episode_reward_mean: 53.35
  episode_reward_min: 11.0
  episodes_this_iter: 83
  episodes_total: 441
  experiment_id: c3277816ab6d4d1b90e22f7a810dab17
  experiment_tag: 7_fcnet_hiddens_0=60,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.6029510498046875
        entropy_coeff: 0.0
        kl: 0.010715698823332787
        model: {}
        policy_loss: -0.01955040730535984
        total_loss: 550.2619018554688
        vf_explained_var: 0.005963272415101528
        vf_loss: 550.2783203125
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 65.125
  

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00007:
  custom_metrics: {}
  date: 2020-07-19_09-03-26
  done: false
  episode_len_mean: 142.24
  episode_reward_max: 486.0
  episode_reward_mean: 142.24
  episode_reward_min: 20.0
  episodes_this_iter: 20
  episodes_total: 531
  experiment_id: c3277816ab6d4d1b90e22f7a810dab17
  experiment_tag: 7_fcnet_hiddens_0=60,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5757880210876465
        entropy_coeff: 0.0
        kl: 0.004560480825603008
        model: {}
        policy_loss: -0.00462718028575182
        total_loss: 1609.4254150390625
        vf_explained_var: 0.003993777092546225
        vf_loss: 1609.42919921875
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 64.7

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00007:
  custom_metrics: {}
  date: 2020-07-19_09-03-33
  done: false
  episode_len_mean: 245.14
  episode_reward_max: 500.0
  episode_reward_mean: 245.14
  episode_reward_min: 37.0
  episodes_this_iter: 12
  episodes_total: 572
  experiment_id: c3277816ab6d4d1b90e22f7a810dab17
  experiment_tag: 7_fcnet_hiddens_0=60,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.535930871963501
        entropy_coeff: 0.0
        kl: 0.005364434793591499
        model: {}
        policy_loss: -0.004621787462383509
        total_loss: 1645.1175537109375
        vf_explained_var: 0.01435407716780901
        vf_loss: 1645.1219482421875
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 67.

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00007:
  custom_metrics: {}
  date: 2020-07-19_09-03-40
  done: false
  episode_len_mean: 343.07
  episode_reward_max: 500.0
  episode_reward_mean: 343.07
  episode_reward_min: 42.0
  episodes_this_iter: 11
  episodes_total: 603
  experiment_id: c3277816ab6d4d1b90e22f7a810dab17
  experiment_tag: 7_fcnet_hiddens_0=60,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.5598181486129761
        entropy_coeff: 0.0
        kl: 0.0030202537309378386
        model: {}
        policy_loss: -0.0005228170193731785
        total_loss: 1409.273193359375
        vf_explained_var: -2.9351260764087783e-06
        vf_loss: 1409.273681640625
    num_steps_sampled: 57600
    num_steps_trained: 57600
  iterations_since_restore: 12
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percen

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00007:
  custom_metrics: {}
  date: 2020-07-19_09-03-47
  done: true
  episode_len_mean: 427.62
  episode_reward_max: 500.0
  episode_reward_mean: 427.62
  episode_reward_min: 42.0
  episodes_this_iter: 11
  episodes_total: 633
  experiment_id: c3277816ab6d4d1b90e22f7a810dab17
  experiment_tag: 7_fcnet_hiddens_0=60,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.004687500186264515
        cur_lr: 4.999999873689376e-05
        entropy: 0.5871568322181702
        entropy_coeff: 0.0
        kl: 0.0028638069052249193
        model: {}
        policy_loss: -0.0009025207255035639
        total_loss: 1119.2503662109375
        vf_explained_var: -2.1973171442368766e-06
        vf_loss: 1119.2513427734375
    num_steps_sampled: 72000
    num_steps_trained: 72000
  iterations_since_restore: 15
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_perc

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00008,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,


[2m[36m(pid=29078)[0m 2020-07-19 09:03:51,101	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=29078)[0m 2020-07-19 09:03:51,101	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00008:
  custom_metrics: {}
  date: 2020-07-19_09-04-01
  done: false
  episode_len_mean: 22.254716981132077
  episode_reward_max: 54.0
  episode_reward_mean: 22.254716981132077
  episode_reward_min: 9.0
  episodes_this_iter: 212
  episodes_total: 212
  experiment_id: 5a5f0ca211a1430eb21ea3a14bcabb3d
  experiment_tag: 8_fcnet_hiddens_0=80,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6634814739227295
        entropy_coeff: 0.0
        kl: 0.030201228335499763
        model: {}
  

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00008:
  custom_metrics: {}
  date: 2020-07-19_09-04-08
  done: false
  episode_len_mean: 81.21
  episode_reward_max: 203.0
  episode_reward_mean: 81.21
  episode_reward_min: 15.0
  episodes_this_iter: 43
  episodes_total: 462
  experiment_id: 5a5f0ca211a1430eb21ea3a14bcabb3d
  experiment_tag: 8_fcnet_hiddens_0=80,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5794376730918884
        entropy_coeff: 0.0
        kl: 0.011137197725474834
        model: {}
        policy_loss: -0.010917034931480885
        total_loss: 980.5659790039062
        vf_explained_var: 0.006884771399199963
        vf_loss: 980.5736694335938
    num_steps_sampled: 19200
    num_steps_trained: 19200
  iterations_since_restore: 4
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 68.10

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00008:
  custom_metrics: {}
  date: 2020-07-19_09-04-15
  done: false
  episode_len_mean: 183.18
  episode_reward_max: 500.0
  episode_reward_mean: 183.18
  episode_reward_min: 17.0
  episodes_this_iter: 12
  episodes_total: 510
  experiment_id: 5a5f0ca211a1430eb21ea3a14bcabb3d
  experiment_tag: 8_fcnet_hiddens_0=80,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.545344352722168
        entropy_coeff: 0.0
        kl: 0.004697732627391815
        model: {}
        policy_loss: -0.0031601220835000277
        total_loss: 1907.158203125
        vf_explained_var: 0.0005218161386437714
        vf_loss: 1907.160888671875
    num_steps_sampled: 33600
    num_steps_trained: 33600
  iterations_since_restore: 7
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 66.96

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00008:
  custom_metrics: {}
  date: 2020-07-19_09-04-22
  done: false
  episode_len_mean: 297.66
  episode_reward_max: 500.0
  episode_reward_mean: 297.66
  episode_reward_min: 22.0
  episodes_this_iter: 12
  episodes_total: 543
  experiment_id: 5a5f0ca211a1430eb21ea3a14bcabb3d
  experiment_tag: 8_fcnet_hiddens_0=80,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.4965367615222931
        entropy_coeff: 0.0
        kl: 0.0043554347939789295
        model: {}
        policy_loss: -0.004241461865603924
        total_loss: 1530.736572265625
        vf_explained_var: 0.007487648166716099
        vf_loss: 1530.740478515625
    num_steps_sampled: 48000
    num_steps_trained: 48000
  iterations_since_restore: 10
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00008:
  custom_metrics: {}
  date: 2020-07-19_09-04-28
  done: true
  episode_len_mean: 400.71
  episode_reward_max: 500.0
  episode_reward_mean: 400.71
  episode_reward_min: 76.0
  episodes_this_iter: 9
  episodes_total: 571
  experiment_id: 5a5f0ca211a1430eb21ea3a14bcabb3d
  experiment_tag: 8_fcnet_hiddens_0=80,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.01875000074505806
        cur_lr: 4.999999873689376e-05
        entropy: 0.48159074783325195
        entropy_coeff: 0.0
        kl: 0.004831274971365929
        model: {}
        policy_loss: -0.0017808357952162623
        total_loss: 1344.2681884765625
        vf_explained_var: 5.2241055527701974e-05
        vf_loss: 1344.2698974609375
    num_steps_sampled: 62400
    num_steps_trained: 62400
  iterations_since_restore: 13
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00009,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,


[2m[36m(pid=29173)[0m 2020-07-19 09:04:33,345	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=29173)[0m 2020-07-19 09:04:33,345	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00009:
  custom_metrics: {}
  date: 2020-07-19_09-04-46
  done: false
  episode_len_mean: 23.385
  episode_reward_max: 80.0
  episode_reward_mean: 23.385
  episode_reward_min: 9.0
  episodes_this_iter: 200
  episodes_total: 200
  experiment_id: 7592386b343a4e839e76569959302108
  experiment_tag: 9_fcnet_hiddens_0=100,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6637318730354309
        entropy_coeff: 0.0
        kl: 0.030820205807685852
        model: {}
        policy_loss: -0.0

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00009:
  custom_metrics: {}
  date: 2020-07-19_09-04-51
  done: false
  episode_len_mean: 61.89
  episode_reward_max: 178.0
  episode_reward_mean: 61.89
  episode_reward_min: 13.0
  episodes_this_iter: 66
  episodes_total: 389
  experiment_id: 7592386b343a4e839e76569959302108
  experiment_tag: 9_fcnet_hiddens_0=100,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5796043276786804
        entropy_coeff: 0.0
        kl: 0.01225506141781807
        model: {}
        policy_loss: -0.021052081137895584
        total_loss: 795.0252685546875
        vf_explained_var: 0.005091375671327114
        vf_loss: 795.042724609375
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 69.2
 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00009:
  custom_metrics: {}
  date: 2020-07-19_09-04-58
  done: false
  episode_len_mean: 159.99
  episode_reward_max: 500.0
  episode_reward_mean: 159.99
  episode_reward_min: 13.0
  episodes_this_iter: 17
  episodes_total: 456
  experiment_id: 7592386b343a4e839e76569959302108
  experiment_tag: 9_fcnet_hiddens_0=100,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5272051095962524
        entropy_coeff: 0.0
        kl: 0.004855133593082428
        model: {}
        policy_loss: -0.005584267899394035
        total_loss: 1738.5872802734375
        vf_explained_var: 0.004448209423571825
        vf_loss: 1738.59228515625
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 67

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00009:
  custom_metrics: {}
  date: 2020-07-19_09-05-05
  done: false
  episode_len_mean: 275.87
  episode_reward_max: 500.0
  episode_reward_mean: 275.87
  episode_reward_min: 16.0
  episodes_this_iter: 12
  episodes_total: 489
  experiment_id: 7592386b343a4e839e76569959302108
  experiment_tag: 9_fcnet_hiddens_0=100,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.4633627235889435
        entropy_coeff: 0.0
        kl: 0.005690160673111677
        model: {}
        policy_loss: -0.006666009314358234
        total_loss: 1680.7755126953125
        vf_explained_var: 2.7722602681024e-05
        vf_loss: 1680.7821044921875
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00010,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00009:
  custom_metrics: {}
  date: 2020-07-19_09-05-12
  done: false
  episode_len_mean: 378.84
  episode_reward_max: 500.0
  episode_reward_mean: 378.84
  episode_reward_min: 114.0
  episodes_this_iter: 10
  episodes_total: 519
  experiment_id: 7592386b343a4e839e76569959302108
  experiment_tag: 9_fcnet_hiddens_0=100,fcnet_hiddens_1=40
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.00937500037252903
        cur_lr: 4.999999873689376e-05
        entropy: 0.4430758059024811
        entropy_coeff: 0.0
        kl: 0.008564898744225502
        model: {}
        policy_loss: -0.002105983905494213
        total_loss: 1441.9239501953125
        vf_explained_var: -1.0309992859447448e-07
        vf_loss: 1441.926025390625
    num_steps_sampled: 57600
    num_steps_trained: 57600
  iterations_since_restore: 12
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_perce

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00010:
  custom_metrics: {}
  date: 2020-07-19_09-05-34
  done: false
  episode_len_mean: 55.05
  episode_reward_max: 161.0
  episode_reward_mean: 55.05
  episode_reward_min: 12.0
  episodes_this_iter: 75
  episodes_total: 423
  experiment_id: 6526928e62184ec2b58dbaa28670f115
  experiment_tag: 10_fcnet_hiddens_0=20,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5972269773483276
        entropy_coeff: 0.0
        kl: 0.01021580770611763
        model: {}
        policy_loss: -0.018484998494386673
        total_loss: 580.8726806640625
        vf_explained_var: 0.027915557846426964
        vf_loss: 580.8881225585938
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 67.67

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00010:
  custom_metrics: {}
  date: 2020-07-19_09-05-41
  done: false
  episode_len_mean: 144.37
  episode_reward_max: 366.0
  episode_reward_mean: 144.37
  episode_reward_min: 16.0
  episodes_this_iter: 23
  episodes_total: 520
  experiment_id: 6526928e62184ec2b58dbaa28670f115
  experiment_tag: 10_fcnet_hiddens_0=20,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5701967477798462
        entropy_coeff: 0.0
        kl: 0.0038552763871848583
        model: {}
        policy_loss: -0.0018138921586796641
        total_loss: 1101.360595703125
        vf_explained_var: 0.019832465797662735
        vf_loss: 1101.3612060546875
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent:

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00010:
  custom_metrics: {}
  date: 2020-07-19_09-05-48
  done: false
  episode_len_mean: 243.93
  episode_reward_max: 500.0
  episode_reward_mean: 243.93
  episode_reward_min: 16.0
  episodes_this_iter: 13
  episodes_total: 563
  experiment_id: 6526928e62184ec2b58dbaa28670f115
  experiment_tag: 10_fcnet_hiddens_0=20,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5224301815032959
        entropy_coeff: 0.0
        kl: 0.004090090747922659
        model: {}
        policy_loss: -0.003945420496165752
        total_loss: 1074.2159423828125
        vf_explained_var: 0.0027486439794301987
        vf_loss: 1074.219482421875
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00010:
  custom_metrics: {}
  date: 2020-07-19_09-05-54
  done: false
  episode_len_mean: 336.7
  episode_reward_max: 500.0
  episode_reward_mean: 336.7
  episode_reward_min: 16.0
  episodes_this_iter: 12
  episodes_total: 595
  experiment_id: 6526928e62184ec2b58dbaa28670f115
  experiment_tag: 10_fcnet_hiddens_0=20,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.01875000074505806
        cur_lr: 4.999999873689376e-05
        entropy: 0.5109713673591614
        entropy_coeff: 0.0
        kl: 0.009676586836576462
        model: {}
        policy_loss: -0.004511289298534393
        total_loss: 807.3935546875
        vf_explained_var: 0.024977311491966248
        vf_loss: 807.39794921875
    num_steps_sampled: 57600
    num_steps_trained: 57600
  iterations_since_restore: 12
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 67.56666

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00011,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00010:
  custom_metrics: {}
  date: 2020-07-19_09-06-01
  done: true
  episode_len_mean: 402.32
  episode_reward_max: 500.0
  episode_reward_mean: 402.32
  episode_reward_min: 150.0
  episodes_this_iter: 11
  episodes_total: 629
  experiment_id: 6526928e62184ec2b58dbaa28670f115
  experiment_tag: 10_fcnet_hiddens_0=20,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.004687500186264515
        cur_lr: 4.999999873689376e-05
        entropy: 0.4871262311935425
        entropy_coeff: 0.0
        kl: 0.005311083048582077
        model: {}
        policy_loss: -0.005007399711757898
        total_loss: 650.0125732421875
        vf_explained_var: 0.08484116196632385
        vf_loss: 650.0174560546875
    num_steps_sampled: 72000
    num_steps_trained: 72000
  iterations_since_restore: 15
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00011:
  custom_metrics: {}
  date: 2020-07-19_09-06-24
  done: false
  episode_len_mean: 59.46
  episode_reward_max: 278.0
  episode_reward_mean: 59.46
  episode_reward_min: 11.0
  episodes_this_iter: 59
  episodes_total: 423
  experiment_id: e8c86a4261c4470a9e91ce71605d1b61
  experiment_tag: 11_fcnet_hiddens_0=40,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.6017545461654663
        entropy_coeff: 0.0
        kl: 0.008226046338677406
        model: {}
        policy_loss: -0.012139515019953251
        total_loss: 858.5896606445312
        vf_explained_var: 0.061548858880996704
        vf_loss: 858.5994262695312
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 69.2

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00011:
  custom_metrics: {}
  date: 2020-07-19_09-06-31
  done: false
  episode_len_mean: 150.28
  episode_reward_max: 460.0
  episode_reward_mean: 150.28
  episode_reward_min: 12.0
  episodes_this_iter: 25
  episodes_total: 515
  experiment_id: e8c86a4261c4470a9e91ce71605d1b61
  experiment_tag: 11_fcnet_hiddens_0=40,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5693479180335999
        entropy_coeff: 0.0
        kl: 0.007663580123335123
        model: {}
        policy_loss: -0.005717838648706675
        total_loss: 982.7056884765625
        vf_explained_var: 0.01697850599884987
        vf_loss: 982.710205078125
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 67.7

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00011:
  custom_metrics: {}
  date: 2020-07-19_09-06-37
  done: false
  episode_len_mean: 237.03
  episode_reward_max: 500.0
  episode_reward_mean: 237.03
  episode_reward_min: 25.0
  episodes_this_iter: 16
  episodes_total: 564
  experiment_id: e8c86a4261c4470a9e91ce71605d1b61
  experiment_tag: 11_fcnet_hiddens_0=40,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5587447285652161
        entropy_coeff: 0.0
        kl: 0.004058286547660828
        model: {}
        policy_loss: -0.0006268470315262675
        total_loss: 1023.5531005859375
        vf_explained_var: 0.10502774268388748
        vf_loss: 1023.5534057617188
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent:

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00011:
  custom_metrics: {}
  date: 2020-07-19_09-06-45
  done: false
  episode_len_mean: 299.91
  episode_reward_max: 500.0
  episode_reward_mean: 299.91
  episode_reward_min: 86.0
  episodes_this_iter: 8
  episodes_total: 602
  experiment_id: e8c86a4261c4470a9e91ce71605d1b61
  experiment_tag: 11_fcnet_hiddens_0=40,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.01875000074505806
        cur_lr: 4.999999873689376e-05
        entropy: 0.5201814770698547
        entropy_coeff: 0.0
        kl: 0.0065205576829612255
        model: {}
        policy_loss: -0.002541219349950552
        total_loss: 886.0687255859375
        vf_explained_var: 0.0014771110145375133
        vf_loss: 886.0709228515625
    num_steps_sampled: 57600
    num_steps_trained: 57600
  iterations_since_restore: 12
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00011:
  custom_metrics: {}
  date: 2020-07-19_09-06-52
  done: false
  episode_len_mean: 371.38
  episode_reward_max: 500.0
  episode_reward_mean: 371.38
  episode_reward_min: 119.0
  episodes_this_iter: 12
  episodes_total: 635
  experiment_id: e8c86a4261c4470a9e91ce71605d1b61
  experiment_tag: 11_fcnet_hiddens_0=40,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.01875000074505806
        cur_lr: 4.999999873689376e-05
        entropy: 0.48205333948135376
        entropy_coeff: 0.0
        kl: 0.004809299949556589
        model: {}
        policy_loss: -0.0029199239797890186
        total_loss: 595.3821411132812
        vf_explained_var: 0.16126076877117157
        vf_loss: 595.385009765625
    num_steps_sampled: 72000
    num_steps_trained: 72000
  iterations_since_restore: 15
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00012,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00011:
  custom_metrics: {}
  date: 2020-07-19_09-06-56
  done: true
  episode_len_mean: 402.92
  episode_reward_max: 500.0
  episode_reward_mean: 402.92
  episode_reward_min: 119.0
  episodes_this_iter: 12
  episodes_total: 656
  experiment_id: e8c86a4261c4470a9e91ce71605d1b61
  experiment_tag: 11_fcnet_hiddens_0=40,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.00937500037252903
        cur_lr: 4.999999873689376e-05
        entropy: 0.4632655382156372
        entropy_coeff: 0.0
        kl: 0.007733934558928013
        model: {}
        policy_loss: -0.005746770650148392
        total_loss: 570.1273803710938
        vf_explained_var: 0.0008266036747954786
        vf_loss: 570.1329956054688
    num_steps_sampled: 81600
    num_steps_trained: 81600
  iterations_since_restore: 17
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00012:
  custom_metrics: {}
  date: 2020-07-19_09-07-16
  done: false
  episode_len_mean: 58.0
  episode_reward_max: 225.0
  episode_reward_mean: 58.0
  episode_reward_min: 12.0
  episodes_this_iter: 61
  episodes_total: 402
  experiment_id: f76d40e1e2e644ff9b900816b92a0030
  experiment_tag: 12_fcnet_hiddens_0=60,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.6020333766937256
        entropy_coeff: 0.0
        kl: 0.009768083691596985
        model: {}
        policy_loss: -0.0199267715215683
        total_loss: 836.4822387695312
        vf_explained_var: 0.004484194330871105
        vf_loss: 836.499267578125
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 67.7

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00012:
  custom_metrics: {}
  date: 2020-07-19_09-07-23
  done: false
  episode_len_mean: 159.86
  episode_reward_max: 500.0
  episode_reward_mean: 159.86
  episode_reward_min: 13.0
  episodes_this_iter: 12
  episodes_total: 470
  experiment_id: f76d40e1e2e644ff9b900816b92a0030
  experiment_tag: 12_fcnet_hiddens_0=60,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5550718903541565
        entropy_coeff: 0.0
        kl: 0.0039919288828969
        model: {}
        policy_loss: -0.0030691083520650864
        total_loss: 1533.1995849609375
        vf_explained_var: 0.0010395050048828125
        vf_loss: 1533.20166015625
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 68

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00012:
  custom_metrics: {}
  date: 2020-07-19_09-07-28
  done: false
  episode_len_mean: 232.66
  episode_reward_max: 500.0
  episode_reward_mean: 232.66
  episode_reward_min: 13.0
  episodes_this_iter: 14
  episodes_total: 496
  experiment_id: f76d40e1e2e644ff9b900816b92a0030
  experiment_tag: 12_fcnet_hiddens_0=60,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.535614550113678
        entropy_coeff: 0.0
        kl: 0.0071636526845395565
        model: {}
        policy_loss: -0.0027218603063374758
        total_loss: 1221.8731689453125
        vf_explained_var: 0.0689893513917923
        vf_loss: 1221.8751220703125
    num_steps_sampled: 38400
    num_steps_trained: 38400
  iterations_since_restore: 8
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00012:
  custom_metrics: {}
  date: 2020-07-19_09-07-36
  done: false
  episode_len_mean: 345.1
  episode_reward_max: 500.0
  episode_reward_mean: 345.1
  episode_reward_min: 13.0
  episodes_this_iter: 11
  episodes_total: 528
  experiment_id: f76d40e1e2e644ff9b900816b92a0030
  experiment_tag: 12_fcnet_hiddens_0=60,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5190042853355408
        entropy_coeff: 0.0
        kl: 0.0013915685703977942
        model: {}
        policy_loss: -0.0022497239988297224
        total_loss: 957.7533569335938
        vf_explained_var: 0.00047181910485960543
        vf_loss: 957.75537109375
    num_steps_sampled: 52800
    num_steps_trained: 52800
  iterations_since_restore: 11
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00013,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00012:
  custom_metrics: {}
  date: 2020-07-19_09-07-40
  done: true
  episode_len_mean: 403.25
  episode_reward_max: 500.0
  episode_reward_mean: 403.25
  episode_reward_min: 13.0
  episodes_this_iter: 10
  episodes_total: 546
  experiment_id: f76d40e1e2e644ff9b900816b92a0030
  experiment_tag: 12_fcnet_hiddens_0=60,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.5367062091827393
        entropy_coeff: 0.0
        kl: 0.0066146752797067165
        model: {}
        policy_loss: -0.004650297574698925
        total_loss: 754.4281005859375
        vf_explained_var: 0.08103358745574951
        vf_loss: 754.4324951171875
    num_steps_sampled: 62400
    num_steps_trained: 62400
  iterations_since_restore: 13
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 61

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00013:
  custom_metrics: {}
  date: 2020-07-19_09-08-03
  done: false
  episode_len_mean: 67.62
  episode_reward_max: 253.0
  episode_reward_mean: 67.62
  episode_reward_min: 13.0
  episodes_this_iter: 51
  episodes_total: 395
  experiment_id: 68b45565aa6c4f7ab0bfe67aa4e9423a
  experiment_tag: 13_fcnet_hiddens_0=80,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5855362415313721
        entropy_coeff: 0.0
        kl: 0.011940672062337399
        model: {}
        policy_loss: -0.014826527796685696
        total_loss: 845.8572387695312
        vf_explained_var: 0.010721741244196892
        vf_loss: 845.868408203125
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 62.47

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00013:
  custom_metrics: {}
  date: 2020-07-19_09-08-10
  done: false
  episode_len_mean: 164.82
  episode_reward_max: 500.0
  episode_reward_mean: 164.82
  episode_reward_min: 13.0
  episodes_this_iter: 15
  episodes_total: 464
  experiment_id: 68b45565aa6c4f7ab0bfe67aa4e9423a
  experiment_tag: 13_fcnet_hiddens_0=80,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5577540397644043
        entropy_coeff: 0.0
        kl: 0.006133140064775944
        model: {}
        policy_loss: -0.004231530241668224
        total_loss: 1477.2510986328125
        vf_explained_var: 0.03474333509802818
        vf_loss: 1477.2542724609375
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00013:
  custom_metrics: {}
  date: 2020-07-19_09-08-17
  done: false
  episode_len_mean: 268.4
  episode_reward_max: 500.0
  episode_reward_mean: 268.4
  episode_reward_min: 18.0
  episodes_this_iter: 10
  episodes_total: 498
  experiment_id: 68b45565aa6c4f7ab0bfe67aa4e9423a
  experiment_tag: 13_fcnet_hiddens_0=80,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5517488718032837
        entropy_coeff: 0.0
        kl: 0.005760296247899532
        model: {}
        policy_loss: -0.0032246022019535303
        total_loss: 1169.0482177734375
        vf_explained_var: 0.0031801043078303337
        vf_loss: 1169.051025390625
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00013:
  custom_metrics: {}
  date: 2020-07-19_09-08-22
  done: false
  episode_len_mean: 337.99
  episode_reward_max: 500.0
  episode_reward_mean: 337.99
  episode_reward_min: 18.0
  episodes_this_iter: 10
  episodes_total: 520
  experiment_id: 68b45565aa6c4f7ab0bfe67aa4e9423a
  experiment_tag: 13_fcnet_hiddens_0=80,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.559528648853302
        entropy_coeff: 0.0
        kl: 0.004642067942768335
        model: {}
        policy_loss: -0.004415214993059635
        total_loss: 979.4334716796875
        vf_explained_var: 0.03437583148479462
        vf_loss: 979.4375
    num_steps_sampled: 52800
    num_steps_trained: 52800
  iterations_since_restore: 11
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 70.966666666

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00013:
  custom_metrics: {}
  date: 2020-07-19_09-08-28
  done: false
  episode_len_mean: 390.89
  episode_reward_max: 500.0
  episode_reward_mean: 390.89
  episode_reward_min: 24.0
  episodes_this_iter: 10
  episodes_total: 540
  experiment_id: 68b45565aa6c4f7ab0bfe67aa4e9423a
  experiment_tag: 13_fcnet_hiddens_0=80,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.01875000074505806
        cur_lr: 4.999999873689376e-05
        entropy: 0.5384823679924011
        entropy_coeff: 0.0
        kl: 0.006660538259893656
        model: {}
        policy_loss: -0.004465852864086628
        total_loss: 779.651123046875
        vf_explained_var: 0.141123965382576
        vf_loss: 779.6554565429688
    num_steps_sampled: 62400
    num_steps_trained: 62400
  iterations_since_restore: 13
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 72.52

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00014,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,


[2m[36m(pid=29447)[0m 2020-07-19 09:08:35,847	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=29447)[0m 2020-07-19 09:08:35,847	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00014:
  custom_metrics: {}
  date: 2020-07-19_09-08-46
  done: false
  episode_len_mean: 21.160714285714285
  episode_reward_max: 78.0
  episode_reward_mean: 21.160714285714285
  episode_reward_min: 8.0
  episodes_this_iter: 224
  episodes_total: 224
  experiment_id: 18e41bb17502461eb5e5aa1ddb693a7d
  experiment_tag: 14_fcnet_hiddens_0=100,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6617387533187866
        entropy_coeff: 0.0
        kl: 0.03214733302593231
        model: {}
 



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00014:
  custom_metrics: {}
  date: 2020-07-19_09-08-51
  done: false
  episode_len_mean: 58.74
  episode_reward_max: 346.0
  episode_reward_mean: 58.74
  episode_reward_min: 12.0
  episodes_this_iter: 60
  episodes_total: 415
  experiment_id: 18e41bb17502461eb5e5aa1ddb693a7d
  experiment_tag: 14_fcnet_hiddens_0=100,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5968897342681885
        entropy_coeff: 0.0
        kl: 0.01288764737546444
        model: {}
        policy_loss: -0.017829900607466698
        total_loss: 804.0458984375
        vf_explained_var: 0.008214224129915237
        vf_loss: 804.0599365234375
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 61.3
  

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00014:
  custom_metrics: {}
  date: 2020-07-19_09-08-58
  done: false
  episode_len_mean: 152.34
  episode_reward_max: 500.0
  episode_reward_mean: 152.34
  episode_reward_min: 15.0
  episodes_this_iter: 14
  episodes_total: 494
  experiment_id: 18e41bb17502461eb5e5aa1ddb693a7d
  experiment_tag: 14_fcnet_hiddens_0=100,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5867331624031067
        entropy_coeff: 0.0
        kl: 0.00222884607501328
        model: {}
        policy_loss: -0.0008419876685366035
        total_loss: 1470.6785888671875
        vf_explained_var: 0.0760088711977005
        vf_loss: 1470.6788330078125
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00014:
  custom_metrics: {}
  date: 2020-07-19_09-09-05
  done: false
  episode_len_mean: 245.28
  episode_reward_max: 500.0
  episode_reward_mean: 245.28
  episode_reward_min: 28.0
  episodes_this_iter: 12
  episodes_total: 540
  experiment_id: 18e41bb17502461eb5e5aa1ddb693a7d
  experiment_tag: 14_fcnet_hiddens_0=100,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5338549017906189
        entropy_coeff: 0.0
        kl: 0.005612959153950214
        model: {}
        policy_loss: -0.004502077121287584
        total_loss: 1134.43701171875
        vf_explained_var: 0.18040646612644196
        vf_loss: 1134.4405517578125
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 64

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00014:
  custom_metrics: {}
  date: 2020-07-19_09-09-12
  done: false
  episode_len_mean: 333.9
  episode_reward_max: 500.0
  episode_reward_mean: 333.9
  episode_reward_min: 30.0
  episodes_this_iter: 9
  episodes_total: 573
  experiment_id: 18e41bb17502461eb5e5aa1ddb693a7d
  experiment_tag: 14_fcnet_hiddens_0=100,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.4863224923610687
        entropy_coeff: 0.0
        kl: 0.004221517127007246
        model: {}
        policy_loss: -0.003916044719517231
        total_loss: 844.2005004882812
        vf_explained_var: 0.24639995396137238
        vf_loss: 844.2041625976562
    num_steps_sampled: 57600
    num_steps_trained: 57600
  iterations_since_restore: 12
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 68.0

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,


Result for PPO_CartPole-v1_6da34_00014:
  custom_metrics: {}
  date: 2020-07-19_09-09-20
  done: false
  episode_len_mean: 397.87
  episode_reward_max: 500.0
  episode_reward_mean: 397.87
  episode_reward_min: 112.0
  episodes_this_iter: 13
  episodes_total: 608
  experiment_id: 18e41bb17502461eb5e5aa1ddb693a7d
  experiment_tag: 14_fcnet_hiddens_0=100,fcnet_hiddens_1=60
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.00937500037252903
        cur_lr: 4.999999873689376e-05
        entropy: 0.45854970812797546
        entropy_coeff: 0.0
        kl: 0.010136676952242851
        model: {}
        policy_loss: -0.005997608415782452
        total_loss: 518.4125366210938
        vf_explained_var: 0.42939820885658264
        vf_loss: 518.4183959960938
    num_steps_sampled: 72000
    num_steps_trained: 72000
  iterations_since_restore: 15
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent:



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00015,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,


[2m[36m(pid=29581)[0m 2020-07-19 09:09:27,033	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=29581)[0m 2020-07-19 09:09:27,033	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00015:
  custom_metrics: {}
  date: 2020-07-19_09-09-41
  done: false
  episode_len_mean: 22.633333333333333
  episode_reward_max: 104.0
  episode_reward_mean: 22.633333333333333
  episode_reward_min: 9.0
  episodes_this_iter: 210
  episodes_total: 210
  experiment_id: 47f69a51c61946d6915b2a2587ae2c08
  experiment_tag: 15_fcnet_hiddens_0=20,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6685973405838013
        entropy_coeff: 0.0
        kl: 0.02549297921359539
        model: {}
 



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,RUNNING,192.168.1.149:29581,,,1.0,10.6151,4800.0,22.6333


Result for PPO_CartPole-v1_6da34_00015:
  custom_metrics: {}
  date: 2020-07-19_09-09-47
  done: false
  episode_len_mean: 54.07
  episode_reward_max: 214.0
  episode_reward_mean: 54.07
  episode_reward_min: 13.0
  episodes_this_iter: 79
  episodes_total: 420
  experiment_id: 47f69a51c61946d6915b2a2587ae2c08
  experiment_tag: 15_fcnet_hiddens_0=20,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5955317616462708
        entropy_coeff: 0.0
        kl: 0.009413913823664188
        model: {}
        policy_loss: -0.016440575942397118
        total_loss: 516.6162109375
        vf_explained_var: 0.02183302491903305
        vf_loss: 516.6298828125
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 71.0
    ra

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,RUNNING,192.168.1.149:29581,,,3.0,16.4767,14400.0,54.07


Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,RUNNING,192.168.1.149:29581,,,5.0,21.012,24000.0,110.07


Result for PPO_CartPole-v1_6da34_00015:
  custom_metrics: {}
  date: 2020-07-19_09-09-54
  done: false
  episode_len_mean: 148.75
  episode_reward_max: 482.0
  episode_reward_mean: 148.75
  episode_reward_min: 13.0
  episodes_this_iter: 22
  episodes_total: 509
  experiment_id: 47f69a51c61946d6915b2a2587ae2c08
  experiment_tag: 15_fcnet_hiddens_0=20,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5809319019317627
        entropy_coeff: 0.0
        kl: 0.009815990924835205
        model: {}
        policy_loss: -0.006865849252790213
        total_loss: 937.7061767578125
        vf_explained_var: 0.08696248382329941
        vf_loss: 937.7114868164062
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 62.

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,RUNNING,192.168.1.149:29581,,,8.0,27.5008,38400.0,196.92


Result for PPO_CartPole-v1_6da34_00015:
  custom_metrics: {}
  date: 2020-07-19_09-10-00
  done: false
  episode_len_mean: 217.08
  episode_reward_max: 500.0
  episode_reward_mean: 217.08
  episode_reward_min: 31.0
  episodes_this_iter: 17
  episodes_total: 569
  experiment_id: 47f69a51c61946d6915b2a2587ae2c08
  experiment_tag: 15_fcnet_hiddens_0=20,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.536970853805542
        entropy_coeff: 0.0
        kl: 0.004183500073850155
        model: {}
        policy_loss: -0.004030575044453144
        total_loss: 640.9854736328125
        vf_explained_var: 0.2638966143131256
        vf_loss: 640.9891357421875
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 67.43



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,RUNNING,192.168.1.149:29581,,,10.0,32.41,48000.0,243.68


Result for PPO_CartPole-v1_6da34_00015:
  custom_metrics: {}
  date: 2020-07-19_09-10-06
  done: false
  episode_len_mean: 270.08
  episode_reward_max: 500.0
  episode_reward_mean: 270.08
  episode_reward_min: 52.0
  episodes_this_iter: 12
  episodes_total: 592
  experiment_id: 47f69a51c61946d6915b2a2587ae2c08
  experiment_tag: 15_fcnet_hiddens_0=20,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.5383052229881287
        entropy_coeff: 0.0
        kl: 0.004095465410500765
        model: {}
        policy_loss: -0.0006711723399348557
        total_loss: 603.916259765625
        vf_explained_var: 0.08657798171043396
        vf_loss: 603.9168090820312
    num_steps_sampled: 52800
    num_steps_trained: 52800
  iterations_since_restore: 11
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 73

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,RUNNING,192.168.1.149:29581,,,13.0,40.3556,62400.0,308.98




Result for PPO_CartPole-v1_6da34_00015:
  custom_metrics: {}
  date: 2020-07-19_09-10-16
  done: false
  episode_len_mean: 360.11
  episode_reward_max: 500.0
  episode_reward_mean: 360.11
  episode_reward_min: 94.0
  episodes_this_iter: 9
  episodes_total: 639
  experiment_id: 47f69a51c61946d6915b2a2587ae2c08
  experiment_tag: 15_fcnet_hiddens_0=20,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.00937500037252903
        cur_lr: 4.999999873689376e-05
        entropy: 0.5483006238937378
        entropy_coeff: 0.0
        kl: 0.00489670317620039
        model: {}
        policy_loss: -0.0005033733323216438
        total_loss: 338.72454833984375
        vf_explained_var: 0.30832311511039734
        vf_loss: 338.7249450683594
    num_steps_sampled: 72000
    num_steps_trained: 72000
  iterations_since_restore: 15
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 73

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00015,RUNNING,192.168.1.149:29581,,,15.0,45.4263,72000.0,360.11


Result for PPO_CartPole-v1_6da34_00015:
  custom_metrics: {}
  date: 2020-07-19_09-10-23
  done: true
  episode_len_mean: 417.28
  episode_reward_max: 500.0
  episode_reward_mean: 417.28
  episode_reward_min: 110.0
  episodes_this_iter: 8
  episodes_total: 671
  experiment_id: 47f69a51c61946d6915b2a2587ae2c08
  experiment_tag: 15_fcnet_hiddens_0=20,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.004687500186264515
        cur_lr: 4.999999873689376e-05
        entropy: 0.511905312538147
        entropy_coeff: 0.0
        kl: 0.0044898465275764465
        model: {}
        policy_loss: 8.400066144531593e-05
        total_loss: 347.4313659667969
        vf_explained_var: 0.39405831694602966
        vf_loss: 347.4312744140625
    num_steps_sampled: 86400
    num_steps_trained: 86400
  iterations_since_restore: 18
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 66

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00016,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71


[2m[36m(pid=29603)[0m 2020-07-19 09:10:28,400	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=29603)[0m 2020-07-19 09:10:28,400	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00016:
  custom_metrics: {}
  date: 2020-07-19_09-10-39
  done: false
  episode_len_mean: 21.962441314553992
  episode_reward_max: 84.0
  episode_reward_mean: 21.962441314553992
  episode_reward_min: 9.0
  episodes_this_iter: 213
  episodes_total: 213
  experiment_id: f335681b1249432586b65ab5d5b55cbd
  experiment_tag: 16_fcnet_hiddens_0=40,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.667738676071167
        entropy_coeff: 0.0
        kl: 0.026129575446248055
        model: {}
  



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,RUNNING,192.168.1.149:29603,,,1.0,7.62893,4800.0,21.9624
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71


Result for PPO_CartPole-v1_6da34_00016:
  custom_metrics: {}
  date: 2020-07-19_09-10-44
  done: false
  episode_len_mean: 58.04
  episode_reward_max: 175.0
  episode_reward_mean: 58.04
  episode_reward_min: 11.0
  episodes_this_iter: 63
  episodes_total: 403
  experiment_id: f335681b1249432586b65ab5d5b55cbd
  experiment_tag: 16_fcnet_hiddens_0=40,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5986922979354858
        entropy_coeff: 0.0
        kl: 0.010720140300691128
        model: {}
        policy_loss: -0.017537111416459084
        total_loss: 624.167724609375
        vf_explained_var: 0.04553915187716484
        vf_loss: 624.1820678710938
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 64.05


Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,RUNNING,192.168.1.149:29603,,,4.0,15.3349,19200.0,94.27
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71




Result for PPO_CartPole-v1_6da34_00016:
  custom_metrics: {}
  date: 2020-07-19_09-10-51
  done: false
  episode_len_mean: 152.47
  episode_reward_max: 500.0
  episode_reward_mean: 152.47
  episode_reward_min: 17.0
  episodes_this_iter: 19
  episodes_total: 481
  experiment_id: f335681b1249432586b65ab5d5b55cbd
  experiment_tag: 16_fcnet_hiddens_0=40,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5531034469604492
        entropy_coeff: 0.0
        kl: 0.006352023687213659
        model: {}
        policy_loss: -0.005283246748149395
        total_loss: 963.5640869140625
        vf_explained_var: 0.0816204845905304
        vf_loss: 963.5684204101562
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 71.2

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,RUNNING,192.168.1.149:29603,,,7.0,22.6892,33600.0,187.9
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71


Result for PPO_CartPole-v1_6da34_00016:
  custom_metrics: {}
  date: 2020-07-19_09-10-58
  done: false
  episode_len_mean: 260.2
  episode_reward_max: 500.0
  episode_reward_mean: 260.2
  episode_reward_min: 28.0
  episodes_this_iter: 10
  episodes_total: 517
  experiment_id: f335681b1249432586b65ab5d5b55cbd
  experiment_tag: 16_fcnet_hiddens_0=40,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5225740671157837
        entropy_coeff: 0.0
        kl: 0.003692624159157276
        model: {}
        policy_loss: -0.003052622778341174
        total_loss: 774.8634033203125
        vf_explained_var: 0.15915663540363312
        vf_loss: 774.8661499023438
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 62.47



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,RUNNING,192.168.1.149:29603,,,10.0,29.8895,48000.0,291.78
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71


Result for PPO_CartPole-v1_6da34_00016:
  custom_metrics: {}
  date: 2020-07-19_09-11-06
  done: false
  episode_len_mean: 358.46
  episode_reward_max: 500.0
  episode_reward_mean: 358.46
  episode_reward_min: 34.0
  episodes_this_iter: 8
  episodes_total: 546
  experiment_id: f335681b1249432586b65ab5d5b55cbd
  experiment_tag: 16_fcnet_hiddens_0=40,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.00937500037252903
        cur_lr: 4.999999873689376e-05
        entropy: 0.4918243885040283
        entropy_coeff: 0.0
        kl: 0.004019985906779766
        model: {}
        policy_loss: -0.0012988304952159524
        total_loss: 571.8005981445312
        vf_explained_var: 0.00014557871327269822
        vf_loss: 571.8018188476562
    num_steps_sampled: 57600
    num_steps_trained: 57600
  iterations_since_restore: 12
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent:

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00017,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00016,RUNNING,192.168.1.149:29603,,,13.0,36.9617,62400.0,390.73
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71


Result for PPO_CartPole-v1_6da34_00016:
  custom_metrics: {}
  date: 2020-07-19_09-11-10
  done: true
  episode_len_mean: 417.55
  episode_reward_max: 500.0
  episode_reward_mean: 417.55
  episode_reward_min: 39.0
  episodes_this_iter: 9
  episodes_total: 565
  experiment_id: f335681b1249432586b65ab5d5b55cbd
  experiment_tag: 16_fcnet_hiddens_0=40,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.0023437500931322575
        cur_lr: 4.999999873689376e-05
        entropy: 0.49588626623153687
        entropy_coeff: 0.0
        kl: 0.0033857657108455896
        model: {}
        policy_loss: -0.0013667646562680602
        total_loss: 486.8962097167969
        vf_explained_var: 0.00012013235391350463
        vf_loss: 486.8975830078125
    num_steps_sampled: 67200
    num_steps_trained: 67200
  iterations_since_restore: 14
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_perce

2020-07-19 09:11:11,557	INFO (unknown file):0 -- gc.collect() freed 64 refs in 0.13837642800001504 seconds


[2m[36m(pid=29727)[0m 2020-07-19 09:11:15,334	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=29727)[0m 2020-07-19 09:11:15,334	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00017:
  custom_metrics: {}
  date: 2020-07-19_09-11-28
  done: false
  episode_len_mean: 23.336633663366335
  episode_reward_max: 84.0
  episode_reward_mean: 23.336633663366335
  episode_reward_min: 8.0
  episodes_this_iter: 202
  episodes_total: 202
  experiment_id: b476b40f2e1a42c1810b39e6065540a7
  experiment_tag: 17_fcnet_hiddens_0=60,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6655235886573792
        entropy_coeff: 0.0
        kl: 0.028366759419441223
        model: {}
 



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,RUNNING,192.168.1.149:29727,,,1.0,9.49995,4800.0,23.3366
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63


Result for PPO_CartPole-v1_6da34_00017:
  custom_metrics: {}
  date: 2020-07-19_09-11-33
  done: false
  episode_len_mean: 61.73
  episode_reward_max: 234.0
  episode_reward_mean: 61.73
  episode_reward_min: 12.0
  episodes_this_iter: 57
  episodes_total: 379
  experiment_id: b476b40f2e1a42c1810b39e6065540a7
  experiment_tag: 17_fcnet_hiddens_0=60,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5959182977676392
        entropy_coeff: 0.0
        kl: 0.01191799994558096
        model: {}
        policy_loss: -0.014649197459220886
        total_loss: 623.2485961914062
        vf_explained_var: 0.047422848641872406
        vf_loss: 623.2596435546875
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 69.93

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,RUNNING,192.168.1.149:29727,,,4.0,17.0042,19200.0,92.85
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63


Result for PPO_CartPole-v1_6da34_00017:
  custom_metrics: {}
  date: 2020-07-19_09-11-40
  done: false
  episode_len_mean: 170.58
  episode_reward_max: 500.0
  episode_reward_mean: 170.58
  episode_reward_min: 18.0
  episodes_this_iter: 16
  episodes_total: 445
  experiment_id: b476b40f2e1a42c1810b39e6065540a7
  experiment_tag: 17_fcnet_hiddens_0=60,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5459039807319641
        entropy_coeff: 0.0
        kl: 0.0024614164140075445
        model: {}
        policy_loss: -0.004024569410830736
        total_loss: 1075.2457275390625
        vf_explained_var: 0.06663402915000916
        vf_loss: 1075.2489013671875
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,RUNNING,192.168.1.149:29727,,,6.0,21.7066,28800.0,170.58
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63


Result for PPO_CartPole-v1_6da34_00017:
  custom_metrics: {}
  date: 2020-07-19_09-11-48
  done: false
  episode_len_mean: 281.56
  episode_reward_max: 500.0
  episode_reward_mean: 281.56
  episode_reward_min: 19.0
  episodes_this_iter: 11
  episodes_total: 477
  experiment_id: b476b40f2e1a42c1810b39e6065540a7
  experiment_tag: 17_fcnet_hiddens_0=60,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5663930177688599
        entropy_coeff: 0.0
        kl: 0.002480332041159272
        model: {}
        policy_loss: -0.0027260505594313145
        total_loss: 797.3068237304688
        vf_explained_var: 0.12280404567718506
        vf_loss: 797.3094482421875
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 63

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,RUNNING,192.168.1.149:29727,,,9.0,28.9494,43200.0,281.56
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63




Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00017,RUNNING,192.168.1.149:29727,,,11.0,33.8747,52800.0,343.11
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63


Result for PPO_CartPole-v1_6da34_00017:
  custom_metrics: {}
  date: 2020-07-19_09-11-55
  done: false
  episode_len_mean: 368.45
  episode_reward_max: 500.0
  episode_reward_mean: 368.45
  episode_reward_min: 93.0
  episodes_this_iter: 13
  episodes_total: 513
  experiment_id: b476b40f2e1a42c1810b39e6065540a7
  experiment_tag: 17_fcnet_hiddens_0=60,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.5269111394882202
        entropy_coeff: 0.0
        kl: 0.006432550493627787
        model: {}
        policy_loss: -0.0028653740882873535
        total_loss: 489.1404113769531
        vf_explained_var: 0.26960185170173645
        vf_loss: 489.1430969238281
    num_steps_sampled: 57600
    num_steps_trained: 57600
  iterations_since_restore: 12
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00018,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13


[2m[36m(pid=29754)[0m 2020-07-19 09:12:04,088	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=29754)[0m 2020-07-19 09:12:04,088	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00018:
  custom_metrics: {}
  date: 2020-07-19_09-12-14
  done: false
  episode_len_mean: 21.61467889908257
  episode_reward_max: 105.0
  episode_reward_mean: 21.61467889908257
  episode_reward_min: 9.0
  episodes_this_iter: 218
  episodes_total: 218
  experiment_id: 65cce78a16af45bbab5a5ccb9ba135ec
  experiment_tag: 18_fcnet_hiddens_0=80,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6659562587738037
        entropy_coeff: 0.0
        kl: 0.027584275230765343
        model: {}
  



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,RUNNING,192.168.1.149:29754,,,1.0,7.60083,4800.0,21.6147
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13


Result for PPO_CartPole-v1_6da34_00018:
  custom_metrics: {}
  date: 2020-07-19_09-12-19
  done: false
  episode_len_mean: 58.42
  episode_reward_max: 184.0
  episode_reward_mean: 58.42
  episode_reward_min: 13.0
  episodes_this_iter: 60
  episodes_total: 410
  experiment_id: 65cce78a16af45bbab5a5ccb9ba135ec
  experiment_tag: 18_fcnet_hiddens_0=80,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5810070037841797
        entropy_coeff: 0.0
        kl: 0.011908426880836487
        model: {}
        policy_loss: -0.020052846521139145
        total_loss: 619.5546875
        vf_explained_var: 0.046552229672670364
        vf_loss: 619.5711669921875
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 63.2
    r

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,RUNNING,192.168.1.149:29754,,,4.0,15.0725,19200.0,99.84
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13


Result for PPO_CartPole-v1_6da34_00018:
  custom_metrics: {}
  date: 2020-07-19_09-12-26
  done: false
  episode_len_mean: 165.28
  episode_reward_max: 500.0
  episode_reward_mean: 165.28
  episode_reward_min: 13.0
  episodes_this_iter: 16
  episodes_total: 474
  experiment_id: 65cce78a16af45bbab5a5ccb9ba135ec
  experiment_tag: 18_fcnet_hiddens_0=80,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5522646307945251
        entropy_coeff: 0.0
        kl: 0.006963782012462616
        model: {}
        policy_loss: -0.005084616597741842
        total_loss: 1087.2034912109375
        vf_explained_var: 0.15777096152305603
        vf_loss: 1087.2073974609375
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,RUNNING,192.168.1.149:29754,,,6.0,19.6505,28800.0,165.28
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13


Result for PPO_CartPole-v1_6da34_00018:
  custom_metrics: {}
  date: 2020-07-19_09-12-34
  done: false
  episode_len_mean: 282.76
  episode_reward_max: 500.0
  episode_reward_mean: 282.76
  episode_reward_min: 13.0
  episodes_this_iter: 12
  episodes_total: 506
  experiment_id: 65cce78a16af45bbab5a5ccb9ba135ec
  experiment_tag: 18_fcnet_hiddens_0=80,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5370339751243591
        entropy_coeff: 0.0
        kl: 0.0008207617211155593
        model: {}
        policy_loss: -0.0009783880086615682
        total_loss: 776.3783569335938
        vf_explained_var: 0.07150978595018387
        vf_loss: 776.379150390625
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 66

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,RUNNING,192.168.1.149:29754,,,9.0,27.3694,43200.0,282.76
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13




Result for PPO_CartPole-v1_6da34_00018:
  custom_metrics: {}
  date: 2020-07-19_09-12-39
  done: false
  episode_len_mean: 344.96
  episode_reward_max: 500.0
  episode_reward_mean: 344.96
  episode_reward_min: 81.0
  episodes_this_iter: 10
  episodes_total: 527
  experiment_id: 65cce78a16af45bbab5a5ccb9ba135ec
  experiment_tag: 18_fcnet_hiddens_0=80,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.5229136347770691
        entropy_coeff: 0.0
        kl: 0.008044026792049408
        model: {}
        policy_loss: -0.0075463829562067986
        total_loss: 638.5568237304688
        vf_explained_var: 0.06672485172748566
        vf_loss: 638.5641479492188
    num_steps_sampled: 52800
    num_steps_trained: 52800
  iterations_since_restore: 11
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00018,RUNNING,192.168.1.149:29754,,,11.0,32.457,52800.0,344.96
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13


Result for PPO_CartPole-v1_6da34_00018:
  custom_metrics: {}
  date: 2020-07-19_09-12-46
  done: true
  episode_len_mean: 426.62
  episode_reward_max: 500.0
  episode_reward_mean: 426.62
  episode_reward_min: 111.0
  episodes_this_iter: 9
  episodes_total: 557
  experiment_id: 65cce78a16af45bbab5a5ccb9ba135ec
  experiment_tag: 18_fcnet_hiddens_0=80,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.00937500037252903
        cur_lr: 4.999999873689376e-05
        entropy: 0.4859611988067627
        entropy_coeff: 0.0
        kl: 0.0038959146477282047
        model: {}
        policy_loss: -0.0012531217653304338
        total_loss: 459.9927062988281
        vf_explained_var: 0.0923391580581665
        vf_loss: 459.9939270019531
    num_steps_sampled: 67200
    num_steps_trained: 67200
  iterations_since_restore: 14
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 66

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00019,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96


[2m[36m(pid=29851)[0m 2020-07-19 09:12:51,537	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=29851)[0m 2020-07-19 09:12:51,537	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00019:
  custom_metrics: {}
  date: 2020-07-19_09-13-04
  done: false
  episode_len_mean: 21.2152466367713
  episode_reward_max: 70.0
  episode_reward_mean: 21.2152466367713
  episode_reward_min: 8.0
  episodes_this_iter: 223
  episodes_total: 223
  experiment_id: eabdddbced3c4cb790df748cae62b7ac
  experiment_tag: 19_fcnet_hiddens_0=100,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6622926592826843
        entropy_coeff: 0.0
        kl: 0.03167973458766937
        model: {}
     



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,RUNNING,192.168.1.149:29851,,,1.0,9.16834,4800.0,21.2152
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96


Result for PPO_CartPole-v1_6da34_00019:
  custom_metrics: {}
  date: 2020-07-19_09-13-10
  done: false
  episode_len_mean: 63.25
  episode_reward_max: 209.0
  episode_reward_mean: 63.25
  episode_reward_min: 11.0
  episodes_this_iter: 60
  episodes_total: 405
  experiment_id: eabdddbced3c4cb790df748cae62b7ac
  experiment_tag: 19_fcnet_hiddens_0=100,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5789802074432373
        entropy_coeff: 0.0
        kl: 0.01077406108379364
        model: {}
        policy_loss: -0.01733315922319889
        total_loss: 594.7528076171875
        vf_explained_var: 0.016610242426395416
        vf_loss: 594.7669067382812
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 71.22

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,RUNNING,192.168.1.149:29851,,,4.0,17.1252,19200.0,94.59
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96




Result for PPO_CartPole-v1_6da34_00019:
  custom_metrics: {}
  date: 2020-07-19_09-13-17
  done: false
  episode_len_mean: 161.1
  episode_reward_max: 500.0
  episode_reward_mean: 161.1
  episode_reward_min: 11.0
  episodes_this_iter: 11
  episodes_total: 466
  experiment_id: eabdddbced3c4cb790df748cae62b7ac
  experiment_tag: 19_fcnet_hiddens_0=100,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5439649820327759
        entropy_coeff: 0.0
        kl: 0.005275224335491657
        model: {}
        policy_loss: -0.0026339292526245117
        total_loss: 1214.1019287109375
        vf_explained_var: 0.047647953033447266
        vf_loss: 1214.103515625
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 70.9

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,RUNNING,192.168.1.149:29851,,,7.0,24.6884,33600.0,201.83
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96


Result for PPO_CartPole-v1_6da34_00019:
  custom_metrics: {}
  date: 2020-07-19_09-13-22
  done: false
  episode_len_mean: 234.42
  episode_reward_max: 500.0
  episode_reward_mean: 234.42
  episode_reward_min: 15.0
  episodes_this_iter: 16
  episodes_total: 501
  experiment_id: eabdddbced3c4cb790df748cae62b7ac
  experiment_tag: 19_fcnet_hiddens_0=100,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5168917775154114
        entropy_coeff: 0.0
        kl: 0.008995833806693554
        model: {}
        policy_loss: -0.0064676604233682156
        total_loss: 777.9534301757812
        vf_explained_var: 0.3328872621059418
        vf_loss: 777.9585571289062
    num_steps_sampled: 38400
    num_steps_trained: 38400
  iterations_since_restore: 8
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 75



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,RUNNING,192.168.1.149:29851,,,9.0,29.9136,43200.0,267.12
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96


Result for PPO_CartPole-v1_6da34_00019:
  custom_metrics: {}
  date: 2020-07-19_09-13-28
  done: false
  episode_len_mean: 293.36
  episode_reward_max: 500.0
  episode_reward_mean: 293.36
  episode_reward_min: 15.0
  episodes_this_iter: 9
  episodes_total: 524
  experiment_id: eabdddbced3c4cb790df748cae62b7ac
  experiment_tag: 19_fcnet_hiddens_0=100,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5001192688941956
        entropy_coeff: 0.0
        kl: 0.0036632237024605274
        model: {}
        policy_loss: -0.004165520891547203
        total_loss: 709.5796508789062
        vf_explained_var: 0.10946696251630783
        vf_loss: 709.583251953125
    num_steps_sampled: 48000
    num_steps_trained: 48000
  iterations_since_restore: 10
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 70

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,RUNNING,192.168.1.149:29851,,,12.0,37.6685,57600.0,338.23
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96




Result for PPO_CartPole-v1_6da34_00019:
  custom_metrics: {}
  date: 2020-07-19_09-13-40
  done: false
  episode_len_mean: 377.45
  episode_reward_max: 500.0
  episode_reward_mean: 377.45
  episode_reward_min: 140.0
  episodes_this_iter: 10
  episodes_total: 585
  experiment_id: eabdddbced3c4cb790df748cae62b7ac
  experiment_tag: 19_fcnet_hiddens_0=100,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.01875000074505806
        cur_lr: 4.999999873689376e-05
        entropy: 0.4495605230331421
        entropy_coeff: 0.0
        kl: 0.0045609804801642895
        model: {}
        policy_loss: -0.0032472864259034395
        total_loss: 396.1687927246094
        vf_explained_var: 0.25510334968566895
        vf_loss: 396.17193603515625
    num_steps_sampled: 72000
    num_steps_trained: 72000
  iterations_since_restore: 15
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percen

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00019,RUNNING,192.168.1.149:29851,,,15.0,44.8302,72000.0,377.45
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96


Result for PPO_CartPole-v1_6da34_00019:
  custom_metrics: {}
  date: 2020-07-19_09-13-45
  done: true
  episode_len_mean: 416.0
  episode_reward_max: 500.0
  episode_reward_mean: 416.0
  episode_reward_min: 140.0
  episodes_this_iter: 11
  episodes_total: 607
  experiment_id: eabdddbced3c4cb790df748cae62b7ac
  experiment_tag: 19_fcnet_hiddens_0=100,fcnet_hiddens_1=80
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.00937500037252903
        cur_lr: 4.999999873689376e-05
        entropy: 0.3916558623313904
        entropy_coeff: 0.0
        kl: 0.0037040910683572292
        model: {}
        policy_loss: -0.005075641442090273
        total_loss: 483.6969909667969
        vf_explained_var: 0.04908422380685806
        vf_loss: 483.7020568847656
    num_steps_sampled: 81600
    num_steps_trained: 81600
  iterations_since_restore: 17
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 65

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00020,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24


[2m[36m(pid=29869)[0m 2020-07-19 09:13:51,202	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=29869)[0m 2020-07-19 09:13:51,202	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00020:
  custom_metrics: {}
  date: 2020-07-19_09-14-02
  done: false
  episode_len_mean: 22.516746411483254
  episode_reward_max: 79.0
  episode_reward_mean: 22.516746411483254
  episode_reward_min: 9.0
  episodes_this_iter: 209
  episodes_total: 209
  experiment_id: fe463f32be524f5892a2b381dbcdf769
  experiment_tag: 20_fcnet_hiddens_0=20,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6679457426071167
        entropy_coeff: 0.0
        kl: 0.02487972564995289
        model: {}
 



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,RUNNING,192.168.1.149:29869,,,1.0,8.18136,4800.0,22.5167
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24


Result for PPO_CartPole-v1_6da34_00020:
  custom_metrics: {}
  date: 2020-07-19_09-14-07
  done: false
  episode_len_mean: 55.19
  episode_reward_max: 181.0
  episode_reward_mean: 55.19
  episode_reward_min: 13.0
  episodes_this_iter: 77
  episodes_total: 412
  experiment_id: fe463f32be524f5892a2b381dbcdf769
  experiment_tag: 20_fcnet_hiddens_0=20,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.6011988520622253
        entropy_coeff: 0.0
        kl: 0.009058996103703976
        model: {}
        policy_loss: -0.015748614445328712
        total_loss: 421.72528076171875
        vf_explained_var: 0.04320693016052246
        vf_loss: 421.73828125
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 64.875
 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,RUNNING,192.168.1.149:29869,,,4.0,15.7151,19200.0,84.08
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24


Result for PPO_CartPole-v1_6da34_00020:
  custom_metrics: {}
  date: 2020-07-19_09-14-14
  done: false
  episode_len_mean: 140.31
  episode_reward_max: 419.0
  episode_reward_mean: 140.31
  episode_reward_min: 18.0
  episodes_this_iter: 21
  episodes_total: 504
  experiment_id: fe463f32be524f5892a2b381dbcdf769
  experiment_tag: 20_fcnet_hiddens_0=20,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5630952715873718
        entropy_coeff: 0.0
        kl: 0.0051293582655489445
        model: {}
        policy_loss: -0.005150581244379282
        total_loss: 669.7315673828125
        vf_explained_var: 0.18817812204360962
        vf_loss: 669.7352294921875
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,RUNNING,192.168.1.149:29869,,,6.0,20.3009,28800.0,140.31
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24


Result for PPO_CartPole-v1_6da34_00020:
  custom_metrics: {}
  date: 2020-07-19_09-14-21
  done: false
  episode_len_mean: 231.98
  episode_reward_max: 500.0
  episode_reward_mean: 231.98
  episode_reward_min: 59.0
  episodes_this_iter: 15
  episodes_total: 550
  experiment_id: fe463f32be524f5892a2b381dbcdf769
  experiment_tag: 20_fcnet_hiddens_0=20,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.07500000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.5277463793754578
        entropy_coeff: 0.0
        kl: 0.007554518990218639
        model: {}
        policy_loss: -0.005015526432543993
        total_loss: 442.058837890625
        vf_explained_var: 0.37583795189857483
        vf_loss: 442.0633239746094
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 63.

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,RUNNING,192.168.1.149:29869,,,9.0,27.3768,43200.0,231.98
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24




Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,RUNNING,192.168.1.149:29869,,,11.0,32.2986,52800.0,300.4
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24


Result for PPO_CartPole-v1_6da34_00020:
  custom_metrics: {}
  date: 2020-07-19_09-14-29
  done: false
  episode_len_mean: 316.3
  episode_reward_max: 500.0
  episode_reward_mean: 316.3
  episode_reward_min: 75.0
  episodes_this_iter: 15
  episodes_total: 588
  experiment_id: fe463f32be524f5892a2b381dbcdf769
  experiment_tag: 20_fcnet_hiddens_0=20,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.5266649127006531
        entropy_coeff: 0.0
        kl: 0.00878076907247305
        model: {}
        policy_loss: -0.006637557875365019
        total_loss: 262.58123779296875
        vf_explained_var: 0.6097952127456665
        vf_loss: 262.5875549316406
    num_steps_sampled: 57600
    num_steps_trained: 57600
  iterations_since_restore: 12
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 75.0

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00020,RUNNING,192.168.1.149:29869,,,14.0,40.1254,67200.0,364.15
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24




Result for PPO_CartPole-v1_6da34_00020:
  custom_metrics: {}
  date: 2020-07-19_09-14-41
  done: true
  episode_len_mean: 406.29
  episode_reward_max: 500.0
  episode_reward_mean: 406.29
  episode_reward_min: 125.0
  episodes_this_iter: 11
  episodes_total: 642
  experiment_id: fe463f32be524f5892a2b381dbcdf769
  experiment_tag: 20_fcnet_hiddens_0=20,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.00937500037252903
        cur_lr: 4.999999873689376e-05
        entropy: 0.5308876037597656
        entropy_coeff: 0.0
        kl: 0.0033708212431520224
        model: {}
        policy_loss: -0.0010916210012510419
        total_loss: 406.95294189453125
        vf_explained_var: 0.22857666015625
        vf_loss: 406.9539794921875
    num_steps_sampled: 81600
    num_steps_trained: 81600
  iterations_since_restore: 17
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00021,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24
PPO_CartPole-v1_6da34_00005,TERMINATED,,,,18.0,47.0836,86400.0,409.17


2020-07-19 09:14:42,898	INFO (unknown file):0 -- gc.collect() freed 64 refs in 0.08257269199998518 seconds


[2m[36m(pid=29980)[0m 2020-07-19 09:14:46,598	INFO trainer.py:585 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
[2m[36m(pid=29980)[0m 2020-07-19 09:14:46,598	INFO trainer.py:612 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
Result for PPO_CartPole-v1_6da34_00021:
  custom_metrics: {}
  date: 2020-07-19_09-15-00
  done: false
  episode_len_mean: 22.046296296296298
  episode_reward_max: 81.0
  episode_reward_mean: 22.046296296296298
  episode_reward_min: 8.0
  episodes_this_iter: 216
  episodes_total: 216
  experiment_id: 2e7b49591e044f5fb49dc05d7bfd50a1
  experiment_tag: 21_fcnet_hiddens_0=40,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.20000000298023224
        cur_lr: 4.999999873689376e-05
        entropy: 0.6664403676986694
        entropy_coeff: 0.0
        kl: 0.026889286935329437
        model: {}




Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,RUNNING,192.168.1.149:29980,,,1.0,9.61069,4800.0,22.0463
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24
PPO_CartPole-v1_6da34_00005,TERMINATED,,,,18.0,47.0836,86400.0,409.17


Result for PPO_CartPole-v1_6da34_00021:
  custom_metrics: {}
  date: 2020-07-19_09-15-05
  done: false
  episode_len_mean: 53.58
  episode_reward_max: 149.0
  episode_reward_mean: 53.58
  episode_reward_min: 16.0
  episodes_this_iter: 82
  episodes_total: 426
  experiment_id: 2e7b49591e044f5fb49dc05d7bfd50a1
  experiment_tag: 21_fcnet_hiddens_0=40,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5943393111228943
        entropy_coeff: 0.0
        kl: 0.009782293811440468
        model: {}
        policy_loss: -0.016886409372091293
        total_loss: 320.7001647949219
        vf_explained_var: 0.04333251342177391
        vf_loss: 320.7141418457031
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 73.1

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,RUNNING,192.168.1.149:29980,,,4.0,17.1493,19200.0,81.09
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24
PPO_CartPole-v1_6da34_00005,TERMINATED,,,,18.0,47.0836,86400.0,409.17


Result for PPO_CartPole-v1_6da34_00021:
  custom_metrics: {}
  date: 2020-07-19_09-15-12
  done: false
  episode_len_mean: 138.64
  episode_reward_max: 357.0
  episode_reward_mean: 138.64
  episode_reward_min: 14.0
  episodes_this_iter: 22
  episodes_total: 527
  experiment_id: 2e7b49591e044f5fb49dc05d7bfd50a1
  experiment_tag: 21_fcnet_hiddens_0=40,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5636154413223267
        entropy_coeff: 0.0
        kl: 0.0035216917749494314
        model: {}
        policy_loss: -0.003129619639366865
        total_loss: 669.7596435546875
        vf_explained_var: 0.18927162885665894
        vf_loss: 669.7618408203125
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 6



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,RUNNING,192.168.1.149:29980,,,6.0,21.7975,28800.0,138.64
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24
PPO_CartPole-v1_6da34_00005,TERMINATED,,,,18.0,47.0836,86400.0,409.17


Result for PPO_CartPole-v1_6da34_00021:
  custom_metrics: {}
  date: 2020-07-19_09-15-19
  done: false
  episode_len_mean: 208.85
  episode_reward_max: 500.0
  episode_reward_mean: 208.85
  episode_reward_min: 44.0
  episodes_this_iter: 14
  episodes_total: 585
  experiment_id: 2e7b49591e044f5fb49dc05d7bfd50a1
  experiment_tag: 21_fcnet_hiddens_0=40,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5221731662750244
        entropy_coeff: 0.0
        kl: 0.0023898642975836992
        model: {}
        policy_loss: -0.0048236362636089325
        total_loss: 529.7894287109375
        vf_explained_var: 0.21835587918758392
        vf_loss: 529.7939453125
    num_steps_sampled: 43200
    num_steps_trained: 43200
  iterations_since_restore: 9
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 68.

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,RUNNING,192.168.1.149:29980,,,9.0,29.1043,43200.0,208.85
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24
PPO_CartPole-v1_6da34_00005,TERMINATED,,,,18.0,47.0836,86400.0,409.17


Result for PPO_CartPole-v1_6da34_00021:
  custom_metrics: {}
  date: 2020-07-19_09-15-24
  done: false
  episode_len_mean: 253.21
  episode_reward_max: 500.0
  episode_reward_mean: 253.21
  episode_reward_min: 91.0
  episodes_this_iter: 15
  episodes_total: 619
  experiment_id: 2e7b49591e044f5fb49dc05d7bfd50a1
  experiment_tag: 21_fcnet_hiddens_0=40,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.5304079055786133
        entropy_coeff: 0.0
        kl: 0.0071663823910057545
        model: {}
        policy_loss: -0.001677929307334125
        total_loss: 219.4928741455078
        vf_explained_var: 0.7001882791519165
        vf_loss: 219.49429321289062
    num_steps_sampled: 52800
    num_steps_trained: 52800
  iterations_since_restore: 11
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,RUNNING,192.168.1.149:29980,,,11.0,34.134,52800.0,253.21
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24
PPO_CartPole-v1_6da34_00005,TERMINATED,,,,18.0,47.0836,86400.0,409.17


Result for PPO_CartPole-v1_6da34_00021:
  custom_metrics: {}
  date: 2020-07-19_09-15-32
  done: false
  episode_len_mean: 324.6
  episode_reward_max: 500.0
  episode_reward_mean: 324.6
  episode_reward_min: 91.0
  episodes_this_iter: 11
  episodes_total: 652
  experiment_id: 2e7b49591e044f5fb49dc05d7bfd50a1
  experiment_tag: 21_fcnet_hiddens_0=40,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.03750000149011612
        cur_lr: 4.999999873689376e-05
        entropy: 0.503437876701355
        entropy_coeff: 0.0
        kl: 0.007601656019687653
        model: {}
        policy_loss: -0.0055838986299932
        total_loss: 330.1593933105469
        vf_explained_var: 0.33793532848358154
        vf_loss: 330.1647033691406
    num_steps_sampled: 67200
    num_steps_trained: 67200
  iterations_since_restore: 14
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 63.599

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,RUNNING,192.168.1.149:29980,,,14.0,41.4196,67200.0,324.6
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24
PPO_CartPole-v1_6da34_00005,TERMINATED,,,,18.0,47.0836,86400.0,409.17




Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00022,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00021,RUNNING,192.168.1.149:29980,,,16.0,46.3225,76800.0,373.6
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24
PPO_CartPole-v1_6da34_00005,TERMINATED,,,,18.0,47.0836,86400.0,409.17


Result for PPO_CartPole-v1_6da34_00021:
  custom_metrics: {}
  date: 2020-07-19_09-15-39
  done: false
  episode_len_mean: 393.31
  episode_reward_max: 500.0
  episode_reward_mean: 393.31
  episode_reward_min: 122.0
  episodes_this_iter: 10
  episodes_total: 682
  experiment_id: 2e7b49591e044f5fb49dc05d7bfd50a1
  experiment_tag: 21_fcnet_hiddens_0=40,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.01875000074505806
        cur_lr: 4.999999873689376e-05
        entropy: 0.4591194987297058
        entropy_coeff: 0.0
        kl: 0.004352487623691559
        model: {}
        policy_loss: -0.002263543661683798
        total_loss: 510.2033386230469
        vf_explained_var: 0.09235598891973495
        vf_loss: 510.2054443359375
    num_steps_sampled: 81600
    num_steps_trained: 81600
  iterations_since_restore: 17
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,RUNNING,192.168.1.149:30003,,,1.0,7.70477,4800.0,22.9029
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24
PPO_CartPole-v1_6da34_00005,TERMINATED,,,,18.0,47.0836,86400.0,409.17
PPO_CartPole-v1_6da34_00006,TERMINATED,,,,15.0,38.6426,72000.0,402.38


Result for PPO_CartPole-v1_6da34_00022:
  custom_metrics: {}
  date: 2020-07-19_09-16-03
  done: false
  episode_len_mean: 66.5
  episode_reward_max: 225.0
  episode_reward_mean: 66.5
  episode_reward_min: 13.0
  episodes_this_iter: 54
  episodes_total: 369
  experiment_id: deb84c7dd33b46e286c1696ebd4f4f6c
  experiment_tag: 22_fcnet_hiddens_0=60,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5835725665092468
        entropy_coeff: 0.0
        kl: 0.009695125743746758
        model: {}
        policy_loss: -0.01155663188546896
        total_loss: 652.697509765625
        vf_explained_var: 0.032399676740169525
        vf_loss: 652.7062377929688
    num_steps_sampled: 14400
    num_steps_trained: 14400
  iterations_since_restore: 3
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 64.95
 

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,RUNNING,192.168.1.149:30003,,,4.0,15.5883,19200.0,100.17
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24
PPO_CartPole-v1_6da34_00005,TERMINATED,,,,18.0,47.0836,86400.0,409.17
PPO_CartPole-v1_6da34_00006,TERMINATED,,,,15.0,38.6426,72000.0,402.38




Result for PPO_CartPole-v1_6da34_00022:
  custom_metrics: {}
  date: 2020-07-19_09-16-10
  done: false
  episode_len_mean: 163.31
  episode_reward_max: 500.0
  episode_reward_mean: 163.31
  episode_reward_min: 13.0
  episodes_this_iter: 13
  episodes_total: 434
  experiment_id: deb84c7dd33b46e286c1696ebd4f4f6c
  experiment_tag: 22_fcnet_hiddens_0=60,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.30000001192092896
        cur_lr: 4.999999873689376e-05
        entropy: 0.5313628315925598
        entropy_coeff: 0.0
        kl: 0.00425554858520627
        model: {}
        policy_loss: -0.007148286793380976
        total_loss: 832.1736450195312
        vf_explained_var: 0.1399255394935608
        vf_loss: 832.1795043945312
    num_steps_sampled: 28800
    num_steps_trained: 28800
  iterations_since_restore: 6
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 65.5

Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,RUNNING,192.168.1.149:30003,,,7.0,22.9247,33600.0,198.38
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24
PPO_CartPole-v1_6da34_00005,TERMINATED,,,,18.0,47.0836,86400.0,409.17
PPO_CartPole-v1_6da34_00006,TERMINATED,,,,15.0,38.6426,72000.0,402.38


Result for PPO_CartPole-v1_6da34_00022:
  custom_metrics: {}
  date: 2020-07-19_09-16-15
  done: false
  episode_len_mean: 243.84
  episode_reward_max: 500.0
  episode_reward_mean: 243.84
  episode_reward_min: 13.0
  episodes_this_iter: 13
  episodes_total: 458
  experiment_id: deb84c7dd33b46e286c1696ebd4f4f6c
  experiment_tag: 22_fcnet_hiddens_0=60,fcnet_hiddens_1=100
  hostname: DWAnyscaleMBP.local
  info:
    learner:
      default_policy:
        cur_kl_coeff: 0.15000000596046448
        cur_lr: 4.999999873689376e-05
        entropy: 0.5670538544654846
        entropy_coeff: 0.0
        kl: 0.002951942617073655
        model: {}
        policy_loss: -0.005621629301458597
        total_loss: 632.5805053710938
        vf_explained_var: 0.15515446662902832
        vf_loss: 632.5857543945312
    num_steps_sampled: 38400
    num_steps_trained: 38400
  iterations_since_restore: 8
  node_ip: 192.168.1.149
  num_healthy_workers: 6
  off_policy_estimator: {}
  perf:
    cpu_util_percent: 63



Trial name,status,loc,model/fcnet_hiddens/0,model/fcnet_hiddens/1,iter,total time (s),ts,reward
PPO_CartPole-v1_6da34_00023,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00024,PENDING,,,,,,,
PPO_CartPole-v1_6da34_00022,RUNNING,192.168.1.149:30003,,,10.0,30.443,48000.0,307.15
PPO_CartPole-v1_6da34_00000,TERMINATED,,,,22.0,55.2572,105600.0,402.71
PPO_CartPole-v1_6da34_00001,TERMINATED,,,,21.0,60.2917,100800.0,409.63
PPO_CartPole-v1_6da34_00002,TERMINATED,,,,16.0,41.1459,76800.0,400.13
PPO_CartPole-v1_6da34_00003,TERMINATED,,,,14.0,40.2008,67200.0,417.96
PPO_CartPole-v1_6da34_00004,TERMINATED,,,,15.0,39.149,72000.0,410.24
PPO_CartPole-v1_6da34_00005,TERMINATED,,,,18.0,47.0836,86400.0,409.17
PPO_CartPole-v1_6da34_00006,TERMINATED,,,,15.0,38.6426,72000.0,402.38


## Understanding the Results

First, how long did this take?

In [None]:
stats = analysis.stats()
secs = stats["timestamp"] - stats["start_time"]
print(f'{secs:7.2f} seconds, {secs/60.0:7.2f} minutes')

Which configuration performed best, judged by the metric used in our stopping criterion?

In [None]:
# We want the highest mean reward, so state the mode explicitly.
analysis.get_best_config(metric="episode_reward_mean", mode="max")

Interestingly, the best result is for the smallest first layer combined with the largest second layer. Recall, though, what we said in the lesson: all of these values provide good results.

In [None]:
df = analysis.dataframe()
df

Let's sort by `timesteps_total` to see which configurations met the stopping criterion in the fewest timesteps.

In [None]:
df.sort_values('timesteps_total', ascending=True)

It appears that the largest networks trained the fastest: `[100,100]`, `[60,100]`, and `[80,100]`, followed closely by some smaller configurations. Larger networks have more parameters, which can make the problem easier to learn, but the larger parameter sets also increase the cost of each training step. Apparently that extra cost is not enough to tip the balance against them.

However, the differences are still relatively small compared to our previous pick of `[40,40]`. If you compare the timestamp values, training `[40,40]` takes about 20% longer.
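Note that "fastest" can mean either the fewest timesteps or the least wall-clock time, and the two rankings need not agree. The following is a minimal sketch of that comparison. It uses a small stand-in DataFrame whose values are copied from the trial status table printed during the run (the first seven finished trials only). The name `df_demo` is hypothetical, and the column names `timesteps_total` and `time_total_s` are assumed to match those in `analysis.dataframe()`.

```python
import pandas as pd

# Stand-in data copied from the trial status table in the run output above.
# This is NOT the full result set, only the first seven finished trials.
rows = [
    ("00000", 105600, 55.2572, 402.71),
    ("00001", 100800, 60.2917, 409.63),
    ("00002", 76800, 41.1459, 400.13),
    ("00003", 67200, 40.2008, 417.96),
    ("00004", 72000, 39.1490, 410.24),
    ("00005", 86400, 47.0836, 409.17),
    ("00006", 72000, 38.6426, 402.38),
]
# Column names are assumed to mirror those in analysis.dataframe().
df_demo = pd.DataFrame(
    rows,
    columns=["trial", "timesteps_total", "time_total_s", "episode_reward_mean"],
)

# Rank by sample efficiency; break ties in timesteps by wall-clock time.
by_steps = df_demo.sort_values(["timesteps_total", "time_total_s"])

# Rank by wall-clock time alone.
by_time = df_demo.sort_values("time_total_s")

cols = ["trial", "timesteps_total", "time_total_s"]
print(by_steps[cols].head(3))
print(by_time[cols].head(3))
```

In this subset, trial `00003` needs the fewest timesteps, while trial `00006` finishes in the least wall-clock time, so it is worth deciding which measure matters before picking a configuration.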

## 03 Search Algorithms and Schedulers

### Exercise - PopulationBasedTraining

In [None]:
from ray.tune.schedulers import PopulationBasedTraining

In [None]:
import sys
sys.path.append("..")
from mnist import ConvNet, TrainMNIST, EPOCH_SIZE, TEST_SIZE, DATA_ROOT

In [None]:
experiment_metrics = dict(metric="mean_accuracy", mode="max")

#search_algorithm = TuneBOHB(config_space, max_concurrent=4, **experiment_metrics)

In [None]:
pbt_scheduler = PopulationBasedTraining(
        time_attr='training_iteration',
        perturbation_interval=10,  # Every N time_attr units, "perturb" the parameters.
        hyperparam_mutations={
            "lr": [0.001, 0.01, 0.1],
            "momentum": [0.001, 0.01, 0.1, 0.9]
        },
        **experiment_metrics)
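
To build some intuition for what "perturbing" list-valued hyperparameters can look like, here is a simplified, standalone sketch of an explore step. It is not Ray's implementation. The function name `perturb`, its signature, and the default `resample_probability` are assumptions made for illustration. The idea is that each hyperparameter is either resampled from its list of allowed values or moved to a neighboring value in that list.

```python
import random


def perturb(config, mutations, resample_probability=0.25, rng=random):
    """Return a perturbed copy of `config` (illustrative sketch only).

    For each key in `mutations` (a dict of name -> list of allowed values):
    with probability `resample_probability`, or if the current value is not
    in the list, pick a random allowed value; otherwise step to an adjacent
    value in the list, clamped at the ends.
    """
    new_config = dict(config)
    for key, choices in mutations.items():
        current = config[key]
        if current not in choices or rng.random() < resample_probability:
            new_config[key] = rng.choice(choices)
        else:
            idx = choices.index(current) + rng.choice([-1, 1])
            idx = min(max(idx, 0), len(choices) - 1)  # stay inside the list
            new_config[key] = choices[idx]
    return new_config


# Same candidate values as in the scheduler definition above.
mutations = {
    "lr": [0.001, 0.01, 0.1],
    "momentum": [0.001, 0.01, 0.1, 0.9],
}
start = {"lr": 0.001, "momentum": 0.001}

rng = random.Random(0)  # seeded so repeated runs behave the same
candidate = perturb(start, mutations, rng=rng)
print(candidate)
```

The sketch returns a new dictionary rather than mutating its input, so the starting configuration remains available for comparison.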

In [None]:
config = {
    "lr": 0.001,            # Start from the lowest values in hyperparam_mutations (previous cell)
    "momentum": 0.001
}

In [None]:
analysis = tune.run(TrainMNIST, 
    scheduler=pbt_scheduler, 
    #search_alg=search_algorithm,
    config=config,
    #num_samples=12,
    verbose=1,     
    ray_auto_init=False
)

In [None]:
print("Best config: ", analysis.get_best_config(metric="mean_accuracy"))

In [None]:
analysis.dataframe().sort_values('mean_accuracy', ascending=False).head()

In [None]:
analysis.dataframe()[['mean_accuracy', 'config/lr', 'config/momentum']].sort_values('mean_accuracy', ascending=False)

In [None]:
stats = analysis.stats()
secs = stats["timestamp"] - stats["start_time"]
print(f'{secs:7.2f} seconds, {secs/60.0:7.2f} minutes')