# Distribution of influences

To gauge how much randomness affects the results, we run the same model many times (n=50 below) and compare the results.
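As a rough illustration of what comparing the results can mean in practice, the sketch below summarises one scalar outcome (for example an employment rate) collected from repeated runs. The helper name `summarize_runs` and the sample values are made up for illustration; they are not produced by `lifecycle_rl`.

```python
import statistics

def summarize_runs(values):
    """Summarise one scalar outcome collected from repeated runs."""
    values = [float(v) for v in values]
    return {
        'n': len(values),
        'mean': statistics.mean(values),
        'std': statistics.stdev(values),  # sample standard deviation (n - 1)
        'min': min(values),
        'max': max(values),
    }

# made-up employment rates from five hypothetical runs
example = [0.712, 0.705, 0.718, 0.709, 0.714]
summary = summarize_runs(example)
print(summary)
```

A small standard deviation relative to the effect being studied would suggest that run-to-run randomness is not driving the conclusions.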

In [1]:
# For Colab: install the required packages from GitHub
#!pip install -q git+https://github.com/ajtanskanen/benefits.git
#!pip install -q git+https://github.com/ajtanskanen/econogym.git
#!pip install -q git+https://github.com/ajtanskanen/lifecycle-rl.git

# To pin specific versions:
#!pip install tensorflow==1.15
#!pip install stable-baselines==2.8

# Restart the kernel after running the pip commands

Then load all modules and set parameters for simulations.

In [2]:
import numpy as np
import matplotlib.pyplot as plt
from lifecycle_rl import Lifecycle

%matplotlib inline
%pylab inline

# Hide warnings (Stable Baselines is not yet TensorFlow 2.0 compatible, and TensorFlow 1.15 complains a lot).
# Hiding them does not seem to work, though.
import warnings
warnings.filterwarnings('ignore')

# parameters for the simulation
# episode = 51 / 205 timesteps (1y/3m timestep)
pop_size=10_000 # size of the population to be simulated
size1=5_000_000 # number of timesteps in phase 1 training (callback not used)
size2=100 # number of timesteps in phase 2 training (callback is used to save the best results)
size3=100 # number of timesteps in phase 1 training (callback not used) for policy changes
batch1=1 # size of minibatch in phase 1 as number of episodes
batch2=900 # size of minibatch in phase 2 as number of episodes
callback_minsteps=batch2 # how many episodes the callback needs
deterministic=False # use deterministic prediction (True) or probabilistic prediction (False)
mortality=False # include mortality in computations
randomness=True # include externally given, random state transitions (parental leaves, disability, lay-offs)
pinkslip=True # include lay-offs at a 5 percent level each year
rlmodel='acktr' # use the ACKTR algorithm
twostage=False # whether to run the training in two stages
perusmalli='best/malli_perus3' # path of the saved baseline model

Populating the interactive namespace from numpy and matplotlib


# Baseline

We compute the employment rates in the current model.

In [None]:
cc1=Lifecycle(env='unemployment-v1',minimal=False,mortality=mortality,perustulo=False,
              randomness=randomness,pinkslip=pinkslip,plotdebug=False)
cc1.explain()
cc1.run_distrib(n=50,debug=False,steps1=size1,steps2=size2,pop=pop_size,deterministic=deterministic,
                train=True,predict=True,batch1=batch1,batch2=batch2,
                save=perusmalli,plot=True,cont=True,start_from=perusmalli,results='results/distrib_base',
                callback_minsteps=callback_minsteps,rlmodel=rlmodel,twostage=twostage)

No mortality included
Parameters of lifecycle:
timestep 0.25
gamma 0.9793703613355593 (0.9200000000000003 per anno)
min_age 20
max_age 70
min_retirementage 63.5
max_retirementage 68.5
ansiopvraha_kesto300 None
ansiopvraha_kesto400 None
ansiopvraha_toe None
perustulo False
karenssi_kesto 0.25
mortality False
randomness True
include_putki None
include_pinkslip True
step 0.25
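
The two values printed for `gamma` above appear to be linked through the timestep: raising the annual factor 0.92 to the power 0.25 reproduces the per-step value. The check below is only a sketch of that arithmetic; that `lifecycle_rl` derives `gamma` exactly this way is an assumption based on the printed numbers.

```python
annual_gamma = 0.92   # discount factor per year, from the output above
timestep = 0.25       # length of one step in years, from the output above

# per-step discount factor implied by the annual one
step_gamma = annual_gamma ** timestep
print(step_gamma)  # close to the printed 0.97937...

# compounding the per-step factor over one year recovers the annual factor
recovered_annual = step_gamma ** (1 / timestep)
print(recovered_annual)
```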

train...
phase 1
batch 1 learning rate 0.125 scaled 0.125
training...
---------------------------------
| explained_variance | 0.952    |
| fps                | 1053     |
| nupdates           | 1        |
| policy_entropy     | 0.824    |
| policy_loss        | -0.0616  |
| total_timesteps    | 0        |
| value_loss         | 0.664    |
---------------------------------

---------------------------------
| explained_variance | 0.696    |
| fps                | 1534     |
| nupdates           | 20       |
| policy_entropy     | 0.556    |
| policy_loss        | -0.204   |
| total_timesteps    | 46531    |
| value_loss         | 3.86     |
---------------------------------
---------------------------------
| explained_variance | 0.879    |
| fps                | 1476     |
| nupdates           | 30       |
| policy_entropy     | 0.705    |
| policy_loss        | -0.201   |
| total_timesteps    | 71021    |
| value_loss         | 1.83     |
---------------------------------
---------------------------------
| explained_variance | 0.893    |
| fps                | 1494     |
| nupdates           | 40       |
| policy_entropy     | 0.618    |
| policy_loss        | 0.154    |
| total_timesteps    | 95511    |
| value_loss         | 1.09     |
---------------------------------
---------------------------------
| explained_variance | 0.955    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.976    |
| fps                | 1610     |
| nupdates           | 290      |
| policy_entropy     | 0.804    |
| policy_loss        | -0.0698  |
| total_timesteps    | 707761   |
| value_loss         | 0.421    |
---------------------------------
---------------------------------
| explained_variance | 0.978    |
| fps                | 1611     |
| nupdates           | 300      |
| policy_entropy     | 0.783    |
| policy_loss        | 0.753    |
| total_timesteps    | 732251   |
| value_loss         | 1.29     |
---------------------------------
---------------------------------
| explained_variance | 0.974    |
| fps                | 1613     |
| nupdates           | 310      |
| policy_entropy     | 0.79     |
| policy_loss        | -0.433   |
| total_timesteps    | 756741   |
| value_loss         | 0.608    |
---------------------------------
---------------------------------
| explained_variance | 0.972    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.956    |
| fps                | 1629     |
| nupdates           | 560      |
| policy_entropy     | 0.774    |
| policy_loss        | 0.593    |
| total_timesteps    | 1368991  |
| value_loss         | 1.26     |
---------------------------------
---------------------------------
| explained_variance | 0.946    |
| fps                | 1630     |
| nupdates           | 570      |
| policy_entropy     | 0.786    |
| policy_loss        | -0.6     |
| total_timesteps    | 1393481  |
| value_loss         | 1.13     |
---------------------------------
---------------------------------
| explained_variance | 0.965    |
| fps                | 1630     |
| nupdates           | 580      |
| policy_entropy     | 0.803    |
| policy_loss        | -0.189   |
| total_timesteps    | 1417971  |
| value_loss         | 0.46     |
---------------------------------
---------------------------------
| explained_variance | 0.961    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.976    |
| fps                | 1641     |
| nupdates           | 830      |
| policy_entropy     | 0.824    |
| policy_loss        | -0.0563  |
| total_timesteps    | 2030221  |
| value_loss         | 0.258    |
---------------------------------
---------------------------------
| explained_variance | 0.948    |
| fps                | 1641     |
| nupdates           | 840      |
| policy_entropy     | 0.846    |
| policy_loss        | -0.136   |
| total_timesteps    | 2054711  |
| value_loss         | 0.777    |
---------------------------------
---------------------------------
| explained_variance | 0.978    |
| fps                | 1642     |
| nupdates           | 850      |
| policy_entropy     | 0.84     |
| policy_loss        | 0.014    |
| total_timesteps    | 2079201  |
| value_loss         | 0.415    |
---------------------------------
---------------------------------
| explained_variance | 0.956    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.92     |
| fps                | 1649     |
| nupdates           | 1100     |
| policy_entropy     | 0.806    |
| policy_loss        | -0.0349  |
| total_timesteps    | 2691451  |
| value_loss         | 0.544    |
---------------------------------
---------------------------------
| explained_variance | 0.973    |
| fps                | 1649     |
| nupdates           | 1110     |
| policy_entropy     | 0.763    |
| policy_loss        | -0.00253 |
| total_timesteps    | 2715941  |
| value_loss         | 0.361    |
---------------------------------
---------------------------------
| explained_variance | 0.959    |
| fps                | 1650     |
| nupdates           | 1120     |
| policy_entropy     | 0.745    |
| policy_loss        | -0.0781  |
| total_timesteps    | 2740431  |
| value_loss         | 0.446    |
---------------------------------
---------------------------------
| explained_variance | 0.95     |
[... output truncated ...]

---------------------------------
| explained_variance | 0.988    |
| fps                | 1656     |
| nupdates           | 1370     |
| policy_entropy     | 0.785    |
| policy_loss        | -0.00311 |
| total_timesteps    | 3352681  |
| value_loss         | 0.209    |
---------------------------------
---------------------------------
| explained_variance | 0.963    |
| fps                | 1657     |
| nupdates           | 1380     |
| policy_entropy     | 0.807    |
| policy_loss        | -0.05    |
| total_timesteps    | 3377171  |
| value_loss         | 0.418    |
---------------------------------
---------------------------------
| explained_variance | 0.973    |
| fps                | 1657     |
| nupdates           | 1390     |
| policy_entropy     | 0.746    |
| policy_loss        | 0.0864   |
| total_timesteps    | 3401661  |
| value_loss         | 0.25     |
---------------------------------
---------------------------------
| explained_variance | 0.976    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.967    |
| fps                | 1662     |
| nupdates           | 1640     |
| policy_entropy     | 0.759    |
| policy_loss        | 0.021    |
| total_timesteps    | 4013911  |
| value_loss         | 0.306    |
---------------------------------
---------------------------------
| explained_variance | 0.981    |
| fps                | 1663     |
| nupdates           | 1650     |
| policy_entropy     | 0.777    |
| policy_loss        | -0.0222  |
| total_timesteps    | 4038401  |
| value_loss         | 0.305    |
---------------------------------
---------------------------------
| explained_variance | 0.969    |
| fps                | 1662     |
| nupdates           | 1660     |
| policy_entropy     | 0.853    |
| policy_loss        | -0.0592  |
| total_timesteps    | 4062891  |
| value_loss         | 0.48     |
---------------------------------
---------------------------------
| explained_variance | 0.96     |
[... output truncated ...]

---------------------------------
| explained_variance | 0.951    |
| fps                | 1668     |
| nupdates           | 1910     |
| policy_entropy     | 0.774    |
| policy_loss        | -0.0777  |
| total_timesteps    | 4675141  |
| value_loss         | 0.48     |
---------------------------------
---------------------------------
| explained_variance | 0.922    |
| fps                | 1668     |
| nupdates           | 1920     |
| policy_entropy     | 0.785    |
| policy_loss        | -0.0754  |
| total_timesteps    | 4699631  |
| value_loss         | 0.654    |
---------------------------------
---------------------------------
| explained_variance | 0.967    |
| fps                | 1668     |
| nupdates           | 1930     |
| policy_entropy     | 0.767    |
| policy_loss        | 0.0711   |
| total_timesteps    | 4724121  |
| value_loss         | 0.33     |
---------------------------------
---------------------------------
| explained_variance | 0.972    |
[... output truncated ...]
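
The progress tables printed during training have a regular `| key | value |` layout, so they can be turned into records and compared across runs. The following is a minimal sketch, assuming the log text has been captured as a string; `parse_progress_tables` is a hypothetical helper and not part of `lifecycle_rl` or Stable Baselines.

```python
import re

# one table row: | key | value |
ROW = re.compile(r'^\|\s*(\w+)\s*\|\s*(\S+)\s*\|\s*$')

def parse_progress_tables(log_text):
    """Turn '| key | value |' tables into a list of dicts, one per table."""
    records, current = [], {}
    for line in log_text.splitlines():
        match = ROW.match(line)
        if match:
            key, value = match.groups()
            current[key] = float(value)
        elif current:
            # any non-row line (divider, blank, cut-off row) closes the table
            records.append(current)
            current = {}
    if current:
        records.append(current)
    return records

sample = """\
---------------------------------
| explained_variance | 0.976    |
| fps                | 1610     |
| nupdates           | 290      |
| policy_entropy     | 0.804    |
| policy_loss        | -0.0698  |
| total_timesteps    | 707761   |
| value_loss         | 0.421    |
---------------------------------
"""
records = parse_progress_tables(sample)
print(records[0]['total_timesteps'], records[0]['value_loss'])
```

Rows that were cut off in the captured output do not match the pattern, so a truncated table yields a partial record rather than an error.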


train...
phase 1
batch 1 learning rate 0.125 scaled 0.125
training...
---------------------------------
| explained_variance | 0.976    |
| fps                | 1313     |
| nupdates           | 1        |
| policy_entropy     | 0.792    |
| policy_loss        | 0.0376   |
| total_timesteps    | 0        |
| value_loss         | 0.246    |
---------------------------------
---------------------------------
| explained_variance | 0.574    |
| fps                | 2130     |
| nupdates           | 10       |
| policy_entropy     | 0.667    |
| policy_loss        | -0.916   |
| total_timesteps    | 22041    |
| value_loss         | 5.28     |
---------------------------------
---------------------------------
| explained_variance | 0.618    |
| fps                | 1863     |
| nupdates           | 20       |
| policy_entropy     | 0.703    |
| policy_loss        | 1.24     |
| total_timesteps    | 46531    |
| value_loss         | 6.86     |
---------------------------------
[... output truncated ...]

---------------------------------
| explained_variance | 0.943    |
| fps                | 1678     |
| nupdates           | 270      |
| policy_entropy     | 0.802    |
| policy_loss        | -0.798   |
| total_timesteps    | 658781   |
| value_loss         | 1.33     |
---------------------------------
---------------------------------
| explained_variance | 0.927    |
| fps                | 1678     |
| nupdates           | 280      |
| policy_entropy     | 0.824    |
| policy_loss        | -1.03    |
| total_timesteps    | 683271   |
| value_loss         | 2.25     |
---------------------------------
---------------------------------
| explained_variance | 0.963    |
| fps                | 1678     |
| nupdates           | 290      |
| policy_entropy     | 0.87     |
| policy_loss        | 0.764    |
| total_timesteps    | 707761   |
| value_loss         | 1.24     |
---------------------------------
---------------------------------
| explained_variance | 0.932    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.922    |
| fps                | 1672     |
| nupdates           | 540      |
| policy_entropy     | 0.913    |
| policy_loss        | -0.689   |
| total_timesteps    | 1320011  |
| value_loss         | 0.974    |
---------------------------------
---------------------------------
| explained_variance | 0.964    |
| fps                | 1670     |
| nupdates           | 550      |
| policy_entropy     | 0.905    |
| policy_loss        | -0.0253  |
| total_timesteps    | 1344501  |
| value_loss         | 0.375    |
---------------------------------
---------------------------------
| explained_variance | 0.98     |
| fps                | 1670     |
| nupdates           | 560      |
| policy_entropy     | 0.857    |
| policy_loss        | 0.596    |
| total_timesteps    | 1368991  |
| value_loss         | 0.786    |
---------------------------------
---------------------------------
| explained_variance | 0.938    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.981    |
| fps                | 1637     |
| nupdates           | 810      |
| policy_entropy     | 0.862    |
| policy_loss        | -0.00512 |
| total_timesteps    | 1981241  |
| value_loss         | 0.345    |
---------------------------------
---------------------------------
| explained_variance | 0.925    |
| fps                | 1635     |
| nupdates           | 820      |
| policy_entropy     | 0.826    |
| policy_loss        | 0.0143   |
| total_timesteps    | 2005731  |
| value_loss         | 0.555    |
---------------------------------
---------------------------------
| explained_variance | 0.978    |
| fps                | 1634     |
| nupdates           | 830      |
| policy_entropy     | 0.857    |
| policy_loss        | -0.0887  |
| total_timesteps    | 2030221  |
| value_loss         | 0.31     |
---------------------------------
---------------------------------
| explained_variance | 0.973    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.975    |
| fps                | 1614     |
| nupdates           | 1080     |
| policy_entropy     | 0.783    |
| policy_loss        | 0.126    |
| total_timesteps    | 2642471  |
| value_loss         | 0.356    |
---------------------------------
---------------------------------
| explained_variance | 0.922    |
| fps                | 1616     |
| nupdates           | 1090     |
| policy_entropy     | 0.748    |
| policy_loss        | 0.138    |
| total_timesteps    | 2666961  |
| value_loss         | 0.519    |
---------------------------------
---------------------------------
| explained_variance | 0.937    |
| fps                | 1618     |
| nupdates           | 1100     |
| policy_entropy     | 0.735    |
| policy_loss        | -0.0177  |
| total_timesteps    | 2691451  |
| value_loss         | 0.485    |
---------------------------------
---------------------------------
| explained_variance | 0.975    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.97     |
| fps                | 1629     |
| nupdates           | 1350     |
| policy_entropy     | 0.804    |
| policy_loss        | -0.0456  |
| total_timesteps    | 3303701  |
| value_loss         | 0.401    |
---------------------------------
---------------------------------
| explained_variance | 0.956    |
| fps                | 1629     |
| nupdates           | 1360     |
| policy_entropy     | 0.838    |
| policy_loss        | -0.136   |
| total_timesteps    | 3328191  |
| value_loss         | 0.479    |
---------------------------------
---------------------------------
| explained_variance | 0.973    |
| fps                | 1630     |
| nupdates           | 1370     |
| policy_entropy     | 0.8      |
| policy_loss        | 0.00769  |
| total_timesteps    | 3352681  |
| value_loss         | 0.408    |
---------------------------------
---------------------------------
| explained_variance | 0.974    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.966    |
| fps                | 1638     |
| nupdates           | 1620     |
| policy_entropy     | 0.771    |
| policy_loss        | 0.00841  |
| total_timesteps    | 3964931  |
| value_loss         | 0.279    |
---------------------------------
---------------------------------
| explained_variance | 0.935    |
| fps                | 1638     |
| nupdates           | 1630     |
| policy_entropy     | 0.812    |
| policy_loss        | -0.141   |
| total_timesteps    | 3989421  |
| value_loss         | 0.475    |
---------------------------------
---------------------------------
| explained_variance | 0.974    |
| fps                | 1639     |
| nupdates           | 1640     |
| policy_entropy     | 0.807    |
| policy_loss        | 0.0372   |
| total_timesteps    | 4013911  |
| value_loss         | 0.3      |
---------------------------------
---------------------------------
| explained_variance | 0.972    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.958    |
| fps                | 1644     |
| nupdates           | 1890     |
| policy_entropy     | 0.745    |
| policy_loss        | -0.182   |
| total_timesteps    | 4626161  |
| value_loss         | 0.479    |
---------------------------------
---------------------------------
| explained_variance | 0.973    |
| fps                | 1645     |
| nupdates           | 1900     |
| policy_entropy     | 0.794    |
| policy_loss        | -0.0504  |
| total_timesteps    | 4650651  |
| value_loss         | 0.349    |
---------------------------------
---------------------------------
| explained_variance | 0.965    |
| fps                | 1645     |
| nupdates           | 1910     |
| policy_entropy     | 0.8      |
| policy_loss        | 0.0621   |
| total_timesteps    | 4675141  |
| value_loss         | 0.348    |
---------------------------------
---------------------------------
| explained_variance | 0.954    |
[... output truncated ...]


train...
phase 1
batch 1 learning rate 0.125 scaled 0.125
training...
---------------------------------
| explained_variance | 0.965    |
| fps                | 1252     |
| nupdates           | 1        |
| policy_entropy     | 0.859    |
| policy_loss        | -0.0617  |
| total_timesteps    | 0        |
| value_loss         | 0.464    |
---------------------------------
---------------------------------
| explained_variance | 0.277    |
| fps                | 2115     |
| nupdates           | 10       |
| policy_entropy     | 0.835    |
| policy_loss        | -2.16    |
| total_timesteps    | 22041    |
| value_loss         | 15.1     |
---------------------------------
---------------------------------
| explained_variance | 0.669    |
| fps                | 1852     |
| nupdates           | 20       |
| policy_entropy     | 0.838    |
| policy_loss        | -1.06    |
| total_timesteps    | 46531    |
| value_loss         | 5.79     |
---------------------------------
[... output truncated ...]

---------------------------------
| explained_variance | 0.978    |
| fps                | 1679     |
| nupdates           | 270      |
| policy_entropy     | 0.859    |
| policy_loss        | 0.762    |
| total_timesteps    | 658781   |
| value_loss         | 1.11     |
---------------------------------
---------------------------------
| explained_variance | 0.958    |
| fps                | 1680     |
| nupdates           | 280      |
| policy_entropy     | 0.886    |
| policy_loss        | 0.828    |
| total_timesteps    | 683271   |
| value_loss         | 1.49     |
---------------------------------
---------------------------------
| explained_variance | 0.95     |
| fps                | 1681     |
| nupdates           | 290      |
| policy_entropy     | 0.909    |
| policy_loss        | -0.907   |
| total_timesteps    | 707761   |
| value_loss         | 1.62     |
---------------------------------
---------------------------------
| explained_variance | 0.965    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.909    |
| fps                | 1674     |
| nupdates           | 540      |
| policy_entropy     | 0.85     |
| policy_loss        | 0.6      |
| total_timesteps    | 1320011  |
| value_loss         | 1.17     |
---------------------------------
---------------------------------
| explained_variance | 0.93     |
| fps                | 1672     |
| nupdates           | 550      |
| policy_entropy     | 0.767    |
| policy_loss        | -0.347   |
| total_timesteps    | 1344501  |
| value_loss         | 0.591    |
---------------------------------
---------------------------------
| explained_variance | 0.926    |
| fps                | 1671     |
| nupdates           | 560      |
| policy_entropy     | 0.81     |
| policy_loss        | -0.247   |
| total_timesteps    | 1368991  |
| value_loss         | 0.482    |
---------------------------------
---------------------------------
| explained_variance | 0.92     |
[... output truncated ...]

---------------------------------
| explained_variance | 0.954    |
| fps                | 1639     |
| nupdates           | 810      |
| policy_entropy     | 0.802    |
| policy_loss        | 0.0139   |
| total_timesteps    | 1981241  |
| value_loss         | 0.429    |
---------------------------------
---------------------------------
| explained_variance | 0.949    |
| fps                | 1638     |
| nupdates           | 820      |
| policy_entropy     | 0.803    |
| policy_loss        | -0.0226  |
| total_timesteps    | 2005731  |
| value_loss         | 0.399    |
---------------------------------
---------------------------------
| explained_variance | 0.945    |
| fps                | 1637     |
| nupdates           | 830      |
| policy_entropy     | 0.751    |
| policy_loss        | 0.114    |
| total_timesteps    | 2030221  |
| value_loss         | 0.45     |
---------------------------------
---------------------------------
| explained_variance | 0.979    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.968    |
| fps                | 1621     |
| nupdates           | 1080     |
| policy_entropy     | 0.762    |
| policy_loss        | -0.0559  |
| total_timesteps    | 2642471  |
| value_loss         | 0.354    |
---------------------------------
---------------------------------
| explained_variance | 0.981    |
| fps                | 1623     |
| nupdates           | 1090     |
| policy_entropy     | 0.858    |
| policy_loss        | 0.0996   |
| total_timesteps    | 2666961  |
| value_loss         | 0.303    |
---------------------------------
---------------------------------
| explained_variance | 0.961    |
| fps                | 1624     |
| nupdates           | 1100     |
| policy_entropy     | 0.85     |
| policy_loss        | -0.19    |
| total_timesteps    | 2691451  |
| value_loss         | 0.51     |
---------------------------------
---------------------------------
| explained_variance | 0.973    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.957    |
| fps                | 1635     |
| nupdates           | 1350     |
| policy_entropy     | 0.84     |
| policy_loss        | 0.0534   |
| total_timesteps    | 3303701  |
| value_loss         | 0.543    |
---------------------------------
---------------------------------
| explained_variance | 0.984    |
| fps                | 1636     |
| nupdates           | 1360     |
| policy_entropy     | 0.83     |
| policy_loss        | -0.084   |
| total_timesteps    | 3328191  |
| value_loss         | 0.252    |
---------------------------------
---------------------------------
| explained_variance | 0.986    |
| fps                | 1637     |
| nupdates           | 1370     |
| policy_entropy     | 0.81     |
| policy_loss        | -0.00821 |
| total_timesteps    | 3352681  |
| value_loss         | 0.192    |
---------------------------------
---------------------------------
| explained_variance | 0.978    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.976    |
| fps                | 1645     |
| nupdates           | 1620     |
| policy_entropy     | 0.783    |
| policy_loss        | 0.05     |
| total_timesteps    | 3964931  |
| value_loss         | 0.311    |
---------------------------------
---------------------------------
| explained_variance | 0.946    |
| fps                | 1645     |
| nupdates           | 1630     |
| policy_entropy     | 0.785    |
| policy_loss        | 0.0639   |
| total_timesteps    | 3989421  |
| value_loss         | 0.479    |
---------------------------------
---------------------------------
| explained_variance | 0.961    |
| fps                | 1646     |
| nupdates           | 1640     |
| policy_entropy     | 0.755    |
| policy_loss        | -0.0975  |
| total_timesteps    | 4013911  |
| value_loss         | 0.422    |
---------------------------------
---------------------------------
| explained_variance | 0.969    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.972    |
| fps                | 1652     |
| nupdates           | 1890     |
| policy_entropy     | 0.795    |
| policy_loss        | -0.0662  |
| total_timesteps    | 4626161  |
| value_loss         | 0.496    |
---------------------------------
---------------------------------
| explained_variance | 0.956    |
| fps                | 1652     |
| nupdates           | 1900     |
| policy_entropy     | 0.72     |
| policy_loss        | -0.0506  |
| total_timesteps    | 4650651  |
| value_loss         | 0.487    |
---------------------------------
---------------------------------
| explained_variance | 0.941    |
| fps                | 1652     |
| nupdates           | 1910     |
| policy_entropy     | 0.783    |
| policy_loss        | 0.0113   |
| total_timesteps    | 4675141  |
| value_loss         | 0.467    |
---------------------------------
---------------------------------
| explained_variance | 0.973    |
[... output truncated ...]


train...
phase 1
batch 1 learning rate 0.125 scaled 0.125
training...
---------------------------------
| explained_variance | 0.94     |
| fps                | 1249     |
| nupdates           | 1        |
| policy_entropy     | 0.832    |
| policy_loss        | -0.0595  |
| total_timesteps    | 0        |
| value_loss         | 0.584    |
---------------------------------
---------------------------------
| explained_variance | 0.891    |
| fps                | 2119     |
| nupdates           | 10       |
| policy_entropy     | 0.512    |
| policy_loss        | 0.375    |
| total_timesteps    | 22041    |
| value_loss         | 2.09     |
---------------------------------
---------------------------------
| explained_variance | 0.671    |
| fps                | 1859     |
| nupdates           | 20       |
| policy_entropy     | 0.41     |
| policy_loss        | -1.05    |
| total_timesteps    | 46531    |
| value_loss         | 11.1     |
---------------------------------
[... output truncated ...]

---------------------------------
| explained_variance | 0.932    |
| fps                | 1675     |
| nupdates           | 270      |
| policy_entropy     | 0.772    |
| policy_loss        | -0.957   |
| total_timesteps    | 658781   |
| value_loss         | 2.33     |
---------------------------------
---------------------------------
| explained_variance | 0.943    |
| fps                | 1674     |
| nupdates           | 280      |
| policy_entropy     | 0.844    |
| policy_loss        | 0.79     |
| total_timesteps    | 683271   |
| value_loss         | 1.29     |
---------------------------------
---------------------------------
| explained_variance | 0.916    |
| fps                | 1674     |
| nupdates           | 290      |
| policy_entropy     | 0.782    |
| policy_loss        | -0.153   |
| total_timesteps    | 707761   |
| value_loss         | 0.761    |
---------------------------------
---------------------------------
| explained_variance | 0.961    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.948    |
| fps                | 1674     |
| nupdates           | 540      |
| policy_entropy     | 0.898    |
| policy_loss        | 0.219    |
| total_timesteps    | 1320011  |
| value_loss         | 0.61     |
---------------------------------
---------------------------------
| explained_variance | 0.97     |
| fps                | 1673     |
| nupdates           | 550      |
| policy_entropy     | 0.824    |
| policy_loss        | 0.245    |
| total_timesteps    | 1344501  |
| value_loss         | 0.531    |
---------------------------------
---------------------------------
| explained_variance | 0.937    |
| fps                | 1673     |
| nupdates           | 560      |
| policy_entropy     | 0.838    |
| policy_loss        | -0.326   |
| total_timesteps    | 1368991  |
| value_loss         | 0.675    |
---------------------------------
---------------------------------
| explained_variance | 0.925    |
[... output truncated ...]

---------------------------------
| explained_variance | 0.959    |
| fps                | 1672     |
| nupdates           | 810      |
| policy_entropy     | 0.764    |
| policy_loss        | -0.0423  |
| total_timesteps    | 1981241  |
| value_loss         | 0.403    |
---------------------------------
---------------------------------
| explained_variance | 0.981    |
| fps                | 1672     |
| nupdates           | 820      |
| policy_entropy     | 0.793    |
| policy_loss        | 0.0311   |
| total_timesteps    | 2005731  |
| value_loss         | 0.235    |
---------------------------------
---------------------------------
| explained_variance | 0.902    |
| fps                | 1673     |
| nupdates           | 830      |
| policy_entropy     | 0.834    |
| policy_loss        | -0.156   |
| total_timesteps    | 2030221  |
| value_loss         | 0.523    |
---------------------------------
---------------------------------
| explained_variance | 0.965    |
| fps         

---------------------------------
| explained_variance | 0.977    |
| fps                | 1675     |
| nupdates           | 1080     |
| policy_entropy     | 0.777    |
| policy_loss        | -0.0424  |
| total_timesteps    | 2642471  |
| value_loss         | 0.341    |
---------------------------------
---------------------------------
| explained_variance | 0.961    |
| fps                | 1675     |
| nupdates           | 1090     |
| policy_entropy     | 0.771    |
| policy_loss        | -0.09    |
| total_timesteps    | 2666961  |
| value_loss         | 0.597    |
---------------------------------
---------------------------------
| explained_variance | 0.984    |
| fps                | 1675     |
| nupdates           | 1100     |
| policy_entropy     | 0.792    |
| policy_loss        | -0.0418  |
| total_timesteps    | 2691451  |
| value_loss         | 0.225    |
---------------------------------
---------------------------------
| explained_variance | 0.953    |
| fps         

---------------------------------
| explained_variance | 0.953    |
| fps                | 1679     |
| nupdates           | 1350     |
| policy_entropy     | 0.796    |
| policy_loss        | 0.68     |
| total_timesteps    | 3303701  |
| value_loss         | 1.35     |
---------------------------------
---------------------------------
| explained_variance | 0.967    |
| fps                | 1679     |
| nupdates           | 1360     |
| policy_entropy     | 0.791    |
| policy_loss        | -0.556   |
| total_timesteps    | 3328191  |
| value_loss         | 0.756    |
---------------------------------
---------------------------------
| explained_variance | 0.962    |
| fps                | 1680     |
| nupdates           | 1370     |
| policy_entropy     | 0.755    |
| policy_loss        | 0.103    |
| total_timesteps    | 3352681  |
| value_loss         | 0.467    |
---------------------------------
---------------------------------
| explained_variance | 0.957    |
| fps         

---------------------------------
| explained_variance | 0.942    |
| fps                | 1682     |
| nupdates           | 1620     |
| policy_entropy     | 0.785    |
| policy_loss        | -0.15    |
| total_timesteps    | 3964931  |
| value_loss         | 0.723    |
---------------------------------
---------------------------------
| explained_variance | 0.953    |
| fps                | 1682     |
| nupdates           | 1630     |
| policy_entropy     | 0.828    |
| policy_loss        | -0.0405  |
| total_timesteps    | 3989421  |
| value_loss         | 0.343    |
---------------------------------
---------------------------------
| explained_variance | 0.964    |
| fps                | 1682     |
| nupdates           | 1640     |
| policy_entropy     | 0.814    |
| policy_loss        | 0.0129   |
| total_timesteps    | 4013911  |
| value_loss         | 0.345    |
---------------------------------
---------------------------------
| explained_variance | 0.967    |
| fps         

---------------------------------
| explained_variance | 0.941    |
| fps                | 1683     |
| nupdates           | 1890     |
| policy_entropy     | 0.774    |
| policy_loss        | -0.0491  |
| total_timesteps    | 4626161  |
| value_loss         | 0.429    |
---------------------------------
---------------------------------
| explained_variance | 0.973    |
| fps                | 1684     |
| nupdates           | 1900     |
| policy_entropy     | 0.788    |
| policy_loss        | 0.0633   |
| total_timesteps    | 4650651  |
| value_loss         | 0.246    |
---------------------------------
---------------------------------
| explained_variance | 0.976    |
| fps                | 1684     |
| nupdates           | 1910     |
| policy_entropy     | 0.761    |
| policy_loss        | 0.0563   |
| total_timesteps    | 4675141  |
| value_loss         | 0.218    |
---------------------------------
---------------------------------
| explained_variance | 0.971    |
| fps         

train...
phase 1
batch 1 learning rate 0.125 scaled 0.125
training...
---------------------------------
| explained_variance | 0.965    |
| fps                | 1273     |
| nupdates           | 1        |
| policy_entropy     | 0.81     |
| policy_loss        | 0.0734   |
| total_timesteps    | 0        |
| value_loss         | 0.403    |
---------------------------------
---------------------------------
| explained_variance | 0.851    |
| fps                | 2140     |
| nupdates           | 10       |
| policy_entropy     | 0.597    |
| policy_loss        | -1.53    |
| total_timesteps    | 22041    |
| value_loss         | 7.3      |
---------------------------------
---------------------------------
| explained_variance | 0.87     |
| fps                | 1860     |
| nupdates           | 20       |
| policy_entropy     | 0.527    |
| policy_loss        | 1.29     |
| total_timesteps    | 46531    |
| value_loss         | 11.7     |
---------------------------------
------------

---------------------------------
| explained_variance | 0.882    |
| fps                | 1597     |
| nupdates           | 270      |
| policy_entropy     | 0.865    |
| policy_loss        | -0.406   |
| total_timesteps    | 658781   |
| value_loss         | 0.885    |
---------------------------------
---------------------------------
| explained_variance | 0.953    |
| fps                | 1594     |
| nupdates           | 280      |
| policy_entropy     | 0.817    |
| policy_loss        | 0.536    |
| total_timesteps    | 683271   |
| value_loss         | 1.06     |
---------------------------------
---------------------------------
| explained_variance | 0.97     |
| fps                | 1592     |
| nupdates           | 290      |
| policy_entropy     | 0.849    |
| policy_loss        | -0.41    |
| total_timesteps    | 707761   |
| value_loss         | 0.622    |
---------------------------------
---------------------------------
| explained_variance | 0.962    |
| fps         

---------------------------------
| explained_variance | 0.966    |
| fps                | 1564     |
| nupdates           | 540      |
| policy_entropy     | 0.834    |
| policy_loss        | 0.0743   |
| total_timesteps    | 1320011  |
| value_loss         | 0.595    |
---------------------------------
---------------------------------
| explained_variance | 0.938    |
| fps                | 1568     |
| nupdates           | 550      |
| policy_entropy     | 0.865    |
| policy_loss        | -0.387   |
| total_timesteps    | 1344501  |
| value_loss         | 0.757    |
---------------------------------
---------------------------------
| explained_variance | 0.978    |
| fps                | 1572     |
| nupdates           | 560      |
| policy_entropy     | 0.851    |
| policy_loss        | 0.325    |
| total_timesteps    | 1368991  |
| value_loss         | 0.435    |
---------------------------------
---------------------------------
| explained_variance | 0.977    |
| fps         

---------------------------------
| explained_variance | 0.967    |
| fps                | 1606     |
| nupdates           | 810      |
| policy_entropy     | 0.803    |
| policy_loss        | -0.0971  |
| total_timesteps    | 1981241  |
| value_loss         | 0.357    |
---------------------------------
---------------------------------
| explained_variance | 0.964    |
| fps                | 1605     |
| nupdates           | 820      |
| policy_entropy     | 0.869    |
| policy_loss        | -0.0174  |
| total_timesteps    | 2005731  |
| value_loss         | 0.47     |
---------------------------------
---------------------------------
| explained_variance | 0.964    |
| fps                | 1606     |
| nupdates           | 830      |
| policy_entropy     | 0.787    |
| policy_loss        | 0.0762   |
| total_timesteps    | 2030221  |
| value_loss         | 0.411    |
---------------------------------
---------------------------------
| explained_variance | 0.966    |
| fps         

---------------------------------
| explained_variance | 0.981    |
| fps                | 1623     |
| nupdates           | 1080     |
| policy_entropy     | 0.747    |
| policy_loss        | 0.0912   |
| total_timesteps    | 2642471  |
| value_loss         | 0.265    |
---------------------------------
---------------------------------
| explained_variance | 0.964    |
| fps                | 1624     |
| nupdates           | 1090     |
| policy_entropy     | 0.725    |
| policy_loss        | -0.00649 |
| total_timesteps    | 2666961  |
| value_loss         | 0.551    |
---------------------------------
---------------------------------
| explained_variance | 0.96     |
| fps                | 1624     |
| nupdates           | 1100     |
| policy_entropy     | 0.821    |
| policy_loss        | 0.00775  |
| total_timesteps    | 2691451  |
| value_loss         | 0.764    |
---------------------------------
---------------------------------
| explained_variance | 0.977    |
| fps         

---------------------------------
| explained_variance | 0.977    |
| fps                | 1635     |
| nupdates           | 1350     |
| policy_entropy     | 0.783    |
| policy_loss        | -0.00105 |
| total_timesteps    | 3303701  |
| value_loss         | 0.281    |
---------------------------------
---------------------------------
| explained_variance | 0.919    |
| fps                | 1636     |
| nupdates           | 1360     |
| policy_entropy     | 0.76     |
| policy_loss        | -0.00781 |
| total_timesteps    | 3328191  |
| value_loss         | 0.426    |
---------------------------------
---------------------------------
| explained_variance | 0.975    |
| fps                | 1636     |
| nupdates           | 1370     |
| policy_entropy     | 0.783    |
| policy_loss        | -0.0702  |
| total_timesteps    | 3352681  |
| value_loss         | 0.304    |
---------------------------------
---------------------------------
| explained_variance | 0.957    |
| fps         

---------------------------------
| explained_variance | 0.977    |
| fps                | 1643     |
| nupdates           | 1620     |
| policy_entropy     | 0.726    |
| policy_loss        | 0.0311   |
| total_timesteps    | 3964931  |
| value_loss         | 0.314    |
---------------------------------
---------------------------------
| explained_variance | 0.95     |
| fps                | 1644     |
| nupdates           | 1630     |
| policy_entropy     | 0.784    |
| policy_loss        | -0.0156  |
| total_timesteps    | 3989421  |
| value_loss         | 0.452    |
---------------------------------
---------------------------------
| explained_variance | 0.941    |
| fps                | 1644     |
| nupdates           | 1640     |
| policy_entropy     | 0.786    |
| policy_loss        | -0.0259  |
| total_timesteps    | 4013911  |
| value_loss         | 0.476    |
---------------------------------
---------------------------------
| explained_variance | 0.975    |
| fps         

---------------------------------
| explained_variance | 0.914    |
| fps                | 1650     |
| nupdates           | 1890     |
| policy_entropy     | 0.742    |
| policy_loss        | 0.0823   |
| total_timesteps    | 4626161  |
| value_loss         | 0.523    |
---------------------------------
---------------------------------
| explained_variance | 0.984    |
| fps                | 1651     |
| nupdates           | 1900     |
| policy_entropy     | 0.78     |
| policy_loss        | 0.0532   |
| total_timesteps    | 4650651  |
| value_loss         | 0.195    |
---------------------------------
---------------------------------
| explained_variance | 0.956    |
| fps                | 1651     |
| nupdates           | 1910     |
| policy_entropy     | 0.79     |
| policy_loss        | -0.0349  |
| total_timesteps    | 4675141  |
| value_loss         | 0.481    |
---------------------------------
---------------------------------
| explained_variance | 0.971    |
| fps         

train...
phase 1
batch 1 learning rate 0.125 scaled 0.125
training...
---------------------------------
| explained_variance | 0.968    |
| fps                | 1282     |
| nupdates           | 1        |
| policy_entropy     | 0.823    |
| policy_loss        | 0.0119   |
| total_timesteps    | 0        |
| value_loss         | 0.433    |
---------------------------------
---------------------------------
| explained_variance | 0.449    |
| fps                | 2129     |
| nupdates           | 10       |
| policy_entropy     | 0.787    |
| policy_loss        | 0.832    |
| total_timesteps    | 22041    |
| value_loss         | 5.67     |
---------------------------------
---------------------------------
| explained_variance | 0.734    |
| fps                | 1863     |
| nupdates           | 20       |
| policy_entropy     | 0.834    |
| policy_loss        | -1.76    |
| total_timesteps    | 46531    |
| value_loss         | 7.86     |
---------------------------------
------------

---------------------------------
| explained_variance | 0.962    |
| fps                | 1680     |
| nupdates           | 270      |
| policy_entropy     | 0.827    |
| policy_loss        | 0.0531   |
| total_timesteps    | 658781   |
| value_loss         | 0.592    |
---------------------------------
---------------------------------
| explained_variance | 0.956    |
| fps                | 1681     |
| nupdates           | 280      |
| policy_entropy     | 0.871    |
| policy_loss        | 1.11     |
| total_timesteps    | 683271   |
| value_loss         | 2.38     |
---------------------------------
---------------------------------
| explained_variance | 0.933    |
| fps                | 1676     |
| nupdates           | 290      |
| policy_entropy     | 0.893    |
| policy_loss        | -0.758   |
| total_timesteps    | 707761   |
| value_loss         | 1.21     |
---------------------------------
---------------------------------
| explained_variance | 0.932    |
| fps         

---------------------------------
| explained_variance | 0.957    |
| fps                | 1673     |
| nupdates           | 540      |
| policy_entropy     | 0.882    |
| policy_loss        | 0.344    |
| total_timesteps    | 1320011  |
| value_loss         | 0.759    |
---------------------------------
---------------------------------
| explained_variance | 0.958    |
| fps                | 1673     |
| nupdates           | 550      |
| policy_entropy     | 0.832    |
| policy_loss        | -0.164   |
| total_timesteps    | 1344501  |
| value_loss         | 0.428    |
---------------------------------
---------------------------------
| explained_variance | 0.981    |
| fps                | 1673     |
| nupdates           | 560      |
| policy_entropy     | 0.831    |
| policy_loss        | 0.0177   |
| total_timesteps    | 1368991  |
| value_loss         | 0.264    |
---------------------------------
---------------------------------
| explained_variance | 0.98     |
| fps         

---------------------------------
| explained_variance | 0.958    |
| fps                | 1676     |
| nupdates           | 810      |
| policy_entropy     | 0.814    |
| policy_loss        | -0.117   |
| total_timesteps    | 1981241  |
| value_loss         | 0.497    |
---------------------------------
---------------------------------
| explained_variance | 0.907    |
| fps                | 1676     |
| nupdates           | 820      |
| policy_entropy     | 0.842    |
| policy_loss        | 0.0131   |
| total_timesteps    | 2005731  |
| value_loss         | 0.502    |
---------------------------------
---------------------------------
| explained_variance | 0.963    |
| fps                | 1676     |
| nupdates           | 830      |
| policy_entropy     | 0.779    |
| policy_loss        | -0.00226 |
| total_timesteps    | 2030221  |
| value_loss         | 0.412    |
---------------------------------
---------------------------------
| explained_variance | 0.974    |
| fps         

---------------------------------
| explained_variance | 0.954    |
| fps                | 1679     |
| nupdates           | 1080     |
| policy_entropy     | 0.815    |
| policy_loss        | -0.227   |
| total_timesteps    | 2642471  |
| value_loss         | 0.429    |
---------------------------------
---------------------------------
| explained_variance | 0.982    |
| fps                | 1679     |
| nupdates           | 1090     |
| policy_entropy     | 0.808    |
| policy_loss        | 0.0622   |
| total_timesteps    | 2666961  |
| value_loss         | 0.301    |
---------------------------------
---------------------------------
| explained_variance | 0.977    |
| fps                | 1680     |
| nupdates           | 1100     |
| policy_entropy     | 0.841    |
| policy_loss        | 0.0207   |
| total_timesteps    | 2691451  |
| value_loss         | 0.3      |
---------------------------------
---------------------------------
| explained_variance | 0.956    |
| fps         

---------------------------------
| explained_variance | 0.985    |
| fps                | 1681     |
| nupdates           | 1350     |
| policy_entropy     | 0.801    |
| policy_loss        | 0.0397   |
| total_timesteps    | 3303701  |
| value_loss         | 0.212    |
---------------------------------
---------------------------------
| explained_variance | 0.958    |
| fps                | 1681     |
| nupdates           | 1360     |
| policy_entropy     | 0.773    |
| policy_loss        | 0.00367  |
| total_timesteps    | 3328191  |
| value_loss         | 0.387    |
---------------------------------
---------------------------------
| explained_variance | 0.984    |
| fps                | 1681     |
| nupdates           | 1370     |
| policy_entropy     | 0.835    |
| policy_loss        | -0.0927  |
| total_timesteps    | 3352681  |
| value_loss         | 0.244    |
---------------------------------
---------------------------------
| explained_variance | 0.984    |
| fps         

---------------------------------
| explained_variance | 0.982    |
| fps                | 1685     |
| nupdates           | 1620     |
| policy_entropy     | 0.704    |
| policy_loss        | 0.199    |
| total_timesteps    | 3964931  |
| value_loss         | 0.307    |
---------------------------------
---------------------------------
| explained_variance | 0.968    |
| fps                | 1685     |
| nupdates           | 1630     |
| policy_entropy     | 0.766    |
| policy_loss        | 0.0683   |
| total_timesteps    | 3989421  |
| value_loss         | 0.305    |
---------------------------------
---------------------------------
| explained_variance | 0.971    |
| fps                | 1685     |
| nupdates           | 1640     |
| policy_entropy     | 0.812    |
| policy_loss        | -0.0715  |
| total_timesteps    | 4013911  |
| value_loss         | 0.292    |
---------------------------------
---------------------------------
| explained_variance | 0.969    |
| fps         

---------------------------------
| explained_variance | 0.96     |
| fps                | 1687     |
| nupdates           | 1890     |
| policy_entropy     | 0.761    |
| policy_loss        | 0.017    |
| total_timesteps    | 4626161  |
| value_loss         | 0.355    |
---------------------------------
---------------------------------
| explained_variance | 0.983    |
| fps                | 1687     |
| nupdates           | 1900     |
| policy_entropy     | 0.77     |
| policy_loss        | -0.0145  |
| total_timesteps    | 4650651  |
| value_loss         | 0.343    |
---------------------------------
---------------------------------
| explained_variance | 0.97     |
| fps                | 1687     |
| nupdates           | 1910     |
| policy_entropy     | 0.801    |
| policy_loss        | 0.0432   |
| total_timesteps    | 4675141  |
| value_loss         | 0.352    |
---------------------------------
---------------------------------
| explained_variance | 0.981    |
| fps         

train...
phase 1
batch 1 learning rate 0.125 scaled 0.125
training...
---------------------------------
| explained_variance | 0.946    |
| fps                | 1423     |
| nupdates           | 1        |
| policy_entropy     | 0.839    |
| policy_loss        | -0.0402  |
| total_timesteps    | 0        |
| value_loss         | 0.494    |
---------------------------------
---------------------------------
| explained_variance | 0.67     |
| fps                | 2408     |
| nupdates           | 10       |
| policy_entropy     | 0.476    |
| policy_loss        | 0.831    |
| total_timesteps    | 22041    |
| value_loss         | 4.72     |
---------------------------------
---------------------------------
| explained_variance | -0.258   |
| fps                | 2102     |
| nupdates           | 20       |
| policy_entropy     | 0.348    |
| policy_loss        | -3.78    |
| total_timesteps    | 46531    |
| value_loss         | 175      |
---------------------------------
------------

---------------------------------
| explained_variance | 0.935    |
| fps                | 1703     |
| nupdates           | 270      |
| policy_entropy     | 0.619    |
| policy_loss        | -0.749   |
| total_timesteps    | 658781   |
| value_loss         | 2.12     |
---------------------------------
---------------------------------
| explained_variance | 0.928    |
| fps                | 1702     |
| nupdates           | 280      |
| policy_entropy     | 0.604    |
| policy_loss        | -0.591   |
| total_timesteps    | 683271   |
| value_loss         | 1.94     |
---------------------------------
---------------------------------
| explained_variance | 0.951    |
| fps                | 1702     |
| nupdates           | 290      |
| policy_entropy     | 0.63     |
| policy_loss        | 0.546    |
| total_timesteps    | 707761   |
| value_loss         | 1.55     |
---------------------------------
---------------------------------
| explained_variance | 0.949    |
| fps         

---------------------------------
| explained_variance | 0.969    |
| fps                | 1690     |
| nupdates           | 540      |
| policy_entropy     | 0.844    |
| policy_loss        | 0.0211   |
| total_timesteps    | 1320011  |
| value_loss         | 0.382    |
---------------------------------
---------------------------------
| explained_variance | 0.978    |
| fps                | 1690     |
| nupdates           | 550      |
| policy_entropy     | 0.807    |
| policy_loss        | -0.134   |
| total_timesteps    | 1344501  |
| value_loss         | 0.318    |
---------------------------------
---------------------------------
| explained_variance | 0.917    |
| fps                | 1690     |
| nupdates           | 560      |
| policy_entropy     | 0.788    |
| policy_loss        | -0.105   |
| total_timesteps    | 1368991  |
| value_loss         | 0.436    |
---------------------------------
---------------------------------
| explained_variance | 0.96     |
| fps         

---------------------------------
| explained_variance | 0.956    |
| fps                | 1687     |
| nupdates           | 810      |
| policy_entropy     | 0.781    |
| policy_loss        | -0.155   |
| total_timesteps    | 1981241  |
| value_loss         | 0.654    |
---------------------------------
---------------------------------
| explained_variance | 0.967    |
| fps                | 1686     |
| nupdates           | 820      |
| policy_entropy     | 0.771    |
| policy_loss        | 0.0545   |
| total_timesteps    | 2005731  |
| value_loss         | 0.305    |
---------------------------------
---------------------------------
| explained_variance | 0.978    |
| fps                | 1686     |
| nupdates           | 830      |
| policy_entropy     | 0.754    |
| policy_loss        | -0.00664 |
| total_timesteps    | 2030221  |
| value_loss         | 0.283    |
---------------------------------
---------------------------------
| explained_variance | 0.971    |
| fps         

---------------------------------
| explained_variance | 0.973    |
| fps                | 1687     |
| nupdates           | 1080     |
| policy_entropy     | 0.811    |
| policy_loss        | -0.117   |
| total_timesteps    | 2642471  |
| value_loss         | 0.362    |
---------------------------------
---------------------------------
| explained_variance | 0.94     |
| fps                | 1687     |
| nupdates           | 1090     |
| policy_entropy     | 0.81     |
| policy_loss        | 0.0148   |
| total_timesteps    | 2666961  |
| value_loss         | 0.417    |
---------------------------------
---------------------------------
| explained_variance | 0.976    |
| fps                | 1687     |
| nupdates           | 1100     |
| policy_entropy     | 0.777    |
| policy_loss        | 0.0948   |
| total_timesteps    | 2691451  |
| value_loss         | 0.436    |
---------------------------------
---------------------------------
| explained_variance | 0.961    |
| fps         

---------------------------------
| explained_variance | 0.969    |
| fps                | 1689     |
| nupdates           | 1350     |
| policy_entropy     | 0.738    |
| policy_loss        | 0.00542  |
| total_timesteps    | 3303701  |
| value_loss         | 0.256    |
---------------------------------
---------------------------------
| explained_variance | 0.953    |
| fps                | 1689     |
| nupdates           | 1360     |
| policy_entropy     | 0.79     |
| policy_loss        | -0.0651  |
| total_timesteps    | 3328191  |
| value_loss         | 0.583    |
---------------------------------
---------------------------------
| explained_variance | 0.97     |
| fps                | 1689     |
| nupdates           | 1370     |
| policy_entropy     | 0.808    |
| policy_loss        | 0.135    |
| total_timesteps    | 3352681  |
| value_loss         | 0.362    |
---------------------------------
---------------------------------
| explained_variance | 0.953    |
| fps         

---------------------------------
| explained_variance | 0.963    |
| fps                | 1691     |
| nupdates           | 1620     |
| policy_entropy     | 0.721    |
| policy_loss        | 0.0529   |
| total_timesteps    | 3964931  |
| value_loss         | 0.394    |
---------------------------------
---------------------------------
| explained_variance | 0.967    |
| fps                | 1691     |
| nupdates           | 1630     |
| policy_entropy     | 0.855    |
| policy_loss        | -0.115   |
| total_timesteps    | 3989421  |
| value_loss         | 0.522    |
---------------------------------
---------------------------------
| explained_variance | 0.97     |
| fps                | 1691     |
| nupdates           | 1640     |
| policy_entropy     | 0.716    |
| policy_loss        | 0.0442   |
| total_timesteps    | 4013911  |
| value_loss         | 0.403    |
---------------------------------
---------------------------------
| explained_variance | 0.965    |
| fps         

---------------------------------
| explained_variance | 0.961    |
| fps                | 1692     |
| nupdates           | 1890     |
| policy_entropy     | 0.744    |
| policy_loss        | -0.0379  |
| total_timesteps    | 4626161  |
| value_loss         | 0.362    |
---------------------------------
---------------------------------
| explained_variance | 0.928    |
| fps                | 1692     |
| nupdates           | 1900     |
| policy_entropy     | 0.73     |
| policy_loss        | -0.0502  |
| total_timesteps    | 4650651  |
| value_loss         | 0.64     |
---------------------------------
---------------------------------
| explained_variance | 0.967    |
| fps                | 1692     |
| nupdates           | 1910     |
| policy_entropy     | 0.744    |
| policy_loss        | -0.0494  |
| total_timesteps    | 4675141  |
| value_loss         | 0.369    |
---------------------------------
---------------------------------
| explained_variance | 0.945    |
| fps         

train...
phase 1
batch 1 learning rate 0.125 scaled 0.125
training...
---------------------------------
| explained_variance | 0.983    |
| fps                | 1296     |
| nupdates           | 1        |
| policy_entropy     | 0.833    |
| policy_loss        | -0.00603 |
| total_timesteps    | 0        |
| value_loss         | 0.289    |
---------------------------------
---------------------------------
| explained_variance | 0.656    |
| fps                | 2177     |
| nupdates           | 10       |
| policy_entropy     | 0.662    |
| policy_loss        | -1.96    |
| total_timesteps    | 22041    |
| value_loss         | 11.8     |
---------------------------------
---------------------------------
| explained_variance | 0.864    |
| fps                | 1879     |
| nupdates           | 20       |
| policy_entropy     | 0.705    |
| policy_loss        | 0.971    |
| total_timesteps    | 46531    |
| value_loss         | 2.93     |
---------------------------------
------------

---------------------------------
| explained_variance | 0.955    |
| fps                | 1690     |
| nupdates           | 270      |
| policy_entropy     | 0.816    |
| policy_loss        | -0.907   |
| total_timesteps    | 658781   |
| value_loss         | 1.69     |
---------------------------------
---------------------------------
| explained_variance | 0.943    |
| fps                | 1689     |
| nupdates           | 280      |
| policy_entropy     | 0.803    |
| policy_loss        | -0.34    |
| total_timesteps    | 683271   |
| value_loss         | 0.659    |
---------------------------------
---------------------------------
| explained_variance | 0.964    |
| fps                | 1690     |
| nupdates           | 290      |
| policy_entropy     | 0.853    |
| policy_loss        | 0.557    |
| total_timesteps    | 707761   |
| value_loss         | 1.04     |
---------------------------------
---------------------------------
| explained_variance | 0.953    |
| fps         

---------------------------------
| explained_variance | 0.935    |
| fps                | 1686     |
| nupdates           | 540      |
| policy_entropy     | 0.849    |
| policy_loss        | -0.478   |
| total_timesteps    | 1320011  |
| value_loss         | 0.722    |
---------------------------------
---------------------------------
| explained_variance | 0.964    |
| fps                | 1685     |
| nupdates           | 550      |
| policy_entropy     | 0.808    |
| policy_loss        | 0.153    |
| total_timesteps    | 1344501  |
| value_loss         | 0.396    |
---------------------------------
---------------------------------
| explained_variance | 0.938    |
| fps                | 1686     |
| nupdates           | 560      |
| policy_entropy     | 0.864    |
| policy_loss        | -0.0269  |
| total_timesteps    | 1368991  |
| value_loss         | 0.529    |
---------------------------------
---------------------------------
| explained_variance | 0.953    |
| fps         

---------------------------------
| explained_variance | 0.981    |
| fps                | 1686     |
| nupdates           | 810      |
| policy_entropy     | 0.839    |
| policy_loss        | -0.0231  |
| total_timesteps    | 1981241  |
| value_loss         | 0.243    |
---------------------------------
---------------------------------
| explained_variance | 0.967    |
| fps                | 1686     |
| nupdates           | 820      |
| policy_entropy     | 0.808    |
| policy_loss        | -0.035   |
| total_timesteps    | 2005731  |
| value_loss         | 0.482    |
---------------------------------
---------------------------------
| explained_variance | 0.974    |
| fps                | 1686     |
| nupdates           | 830      |
| policy_entropy     | 0.807    |
| policy_loss        | 0.0355   |
| total_timesteps    | 2030221  |
| value_loss         | 0.393    |
---------------------------------
---------------------------------
| explained_variance | 0.969    |
| fps         

---------------------------------
| explained_variance | 0.986    |
| fps                | 1689     |
| nupdates           | 1080     |
| policy_entropy     | 0.796    |
| policy_loss        | -0.0324  |
| total_timesteps    | 2642471  |
| value_loss         | 0.284    |
---------------------------------
---------------------------------
| explained_variance | 0.936    |
| fps                | 1689     |
| nupdates           | 1090     |
| policy_entropy     | 0.774    |
| policy_loss        | -0.13    |
| total_timesteps    | 2666961  |
| value_loss         | 0.652    |
---------------------------------
---------------------------------
| explained_variance | 0.961    |
| fps                | 1690     |
| nupdates           | 1100     |
| policy_entropy     | 0.75     |
| policy_loss        | 0.165    |
| total_timesteps    | 2691451  |
| value_loss         | 0.478    |
---------------------------------
---------------------------------
| explained_variance | 0.944    |
| fps         

---------------------------------
| explained_variance | 0.974    |
| fps                | 1693     |
| nupdates           | 1350     |
| policy_entropy     | 0.759    |
| policy_loss        | -0.0398  |
| total_timesteps    | 3303701  |
| value_loss         | 0.33     |
---------------------------------
---------------------------------
| explained_variance | 0.949    |
| fps                | 1694     |
| nupdates           | 1360     |
| policy_entropy     | 0.772    |
| policy_loss        | 0.0367   |
| total_timesteps    | 3328191  |
| value_loss         | 0.555    |
---------------------------------
---------------------------------
| explained_variance | 0.918    |
| fps                | 1694     |
| nupdates           | 1370     |
| policy_entropy     | 0.752    |
| policy_loss        | -0.049   |
| total_timesteps    | 3352681  |
| value_loss         | 0.51     |
---------------------------------
---------------------------------
| explained_variance | 0.948    |
| fps         

---------------------------------
| explained_variance | 0.955    |
| fps                | 1696     |
| nupdates           | 1620     |
| policy_entropy     | 0.748    |
| policy_loss        | 0.0922   |
| total_timesteps    | 3964931  |
| value_loss         | 0.475    |
---------------------------------
---------------------------------
| explained_variance | 0.947    |
| fps                | 1694     |
| nupdates           | 1630     |
| policy_entropy     | 0.723    |
| policy_loss        | -0.0909  |
| total_timesteps    | 3989421  |
| value_loss         | 0.554    |
---------------------------------
---------------------------------
| explained_variance | 0.963    |
| fps                | 1693     |
| nupdates           | 1640     |
| policy_entropy     | 0.735    |
| policy_loss        | -0.184   |
| total_timesteps    | 4013911  |
| value_loss         | 0.407    |
---------------------------------
---------------------------------
| explained_variance | 0.968    |
| fps         

---------------------------------
| explained_variance | 0.983    |
| fps                | 1674     |
| nupdates           | 1890     |
| policy_entropy     | 0.79     |
| policy_loss        | -0.252   |
| total_timesteps    | 4626161  |
| value_loss         | 0.378    |
---------------------------------
---------------------------------
| explained_variance | 0.94     |
| fps                | 1673     |
| nupdates           | 1900     |
| policy_entropy     | 0.782    |
| policy_loss        | -0.106   |
| total_timesteps    | 4650651  |
| value_loss         | 0.506    |
---------------------------------
---------------------------------
| explained_variance | 0.972    |
| fps                | 1673     |
| nupdates           | 1910     |
| policy_entropy     | 0.785    |
| policy_loss        | 0.0588   |
| total_timesteps    | 4675141  |
| value_loss         | 0.386    |
---------------------------------
---------------------------------
| explained_variance | 0.978    |
| fps         

train...
phase 1
batch 1 learning rate 0.125 scaled 0.125
training...
---------------------------------
| explained_variance | 0.955    |
| fps                | 1263     |
| nupdates           | 1        |
| policy_entropy     | 0.876    |
| policy_loss        | 0.012    |
| total_timesteps    | 0        |
| value_loss         | 0.376    |
---------------------------------
---------------------------------
| explained_variance | 0.184    |
| fps                | 2079     |
| nupdates           | 10       |
| policy_entropy     | 0.715    |
| policy_loss        | 0.995    |
| total_timesteps    | 22041    |
| value_loss         | 8.95     |
---------------------------------
---------------------------------
| explained_variance | 0.893    |
| fps                | 1853     |
| nupdates           | 20       |
| policy_entropy     | 0.736    |
| policy_loss        | -0.899   |
| total_timesteps    | 46531    |
| value_loss         | 3.07     |
---------------------------------
------------

---------------------------------
| explained_variance | 0.911    |
| fps                | 1685     |
| nupdates           | 270      |
| policy_entropy     | 0.805    |
| policy_loss        | -1.12    |
| total_timesteps    | 658781   |
| value_loss         | 2.19     |
---------------------------------
---------------------------------
| explained_variance | 0.952    |
| fps                | 1686     |
| nupdates           | 280      |
| policy_entropy     | 0.8      |
| policy_loss        | -0.631   |
| total_timesteps    | 683271   |
| value_loss         | 1.31     |
---------------------------------
---------------------------------
| explained_variance | 0.932    |
| fps                | 1686     |
| nupdates           | 290      |
| policy_entropy     | 0.889    |
| policy_loss        | 0.657    |
| total_timesteps    | 707761   |
| value_loss         | 1.32     |
---------------------------------
---------------------------------
| explained_variance | 0.937    |
| fps         

---------------------------------
| explained_variance | 0.95     |
| fps                | 1683     |
| nupdates           | 540      |
| policy_entropy     | 0.81     |
| policy_loss        | -0.389   |
| total_timesteps    | 1320011  |
| value_loss         | 0.981    |
---------------------------------
---------------------------------
| explained_variance | 0.948    |
| fps                | 1683     |
| nupdates           | 550      |
| policy_entropy     | 0.807    |
| policy_loss        | 0.253    |
| total_timesteps    | 1344501  |
| value_loss         | 0.664    |
---------------------------------
---------------------------------
| explained_variance | 0.987    |
| fps                | 1683     |
| nupdates           | 560      |
| policy_entropy     | 0.78     |
| policy_loss        | 0.122    |
| total_timesteps    | 1368991  |
| value_loss         | 0.261    |
---------------------------------
---------------------------------
| explained_variance | 0.969    |
| fps         

---------------------------------
| explained_variance | 0.959    |
| fps                | 1683     |
| nupdates           | 810      |
| policy_entropy     | 0.865    |
| policy_loss        | 0.0835   |
| total_timesteps    | 1981241  |
| value_loss         | 0.488    |
---------------------------------
---------------------------------
| explained_variance | 0.987    |
| fps                | 1682     |
| nupdates           | 820      |
| policy_entropy     | 0.821    |
| policy_loss        | -0.179   |
| total_timesteps    | 2005731  |
| value_loss         | 0.243    |
---------------------------------
---------------------------------
| explained_variance | 0.979    |
| fps                | 1682     |
| nupdates           | 830      |
| policy_entropy     | 0.819    |
| policy_loss        | 0.0548   |
| total_timesteps    | 2030221  |
| value_loss         | 0.292    |
---------------------------------
---------------------------------
| explained_variance | 0.974    |
| fps         

---------------------------------
| explained_variance | 0.977    |
| fps                | 1683     |
| nupdates           | 1080     |
| policy_entropy     | 0.762    |
| policy_loss        | 0.00821  |
| total_timesteps    | 2642471  |
| value_loss         | 0.369    |
---------------------------------
---------------------------------
| explained_variance | 0.963    |
| fps                | 1683     |
| nupdates           | 1090     |
| policy_entropy     | 0.833    |
| policy_loss        | -0.0274  |
| total_timesteps    | 2666961  |
| value_loss         | 0.355    |
---------------------------------
---------------------------------
| explained_variance | 0.968    |
| fps                | 1684     |
| nupdates           | 1100     |
| policy_entropy     | 0.783    |
| policy_loss        | -0.175   |
| total_timesteps    | 2691451  |
| value_loss         | 0.544    |
---------------------------------
---------------------------------
| explained_variance | 0.971    |
| fps         

---------------------------------
| explained_variance | 0.974    |
| fps                | 1687     |
| nupdates           | 1350     |
| policy_entropy     | 0.761    |
| policy_loss        | 0.0134   |
| total_timesteps    | 3303701  |
| value_loss         | 0.452    |
---------------------------------
---------------------------------
| explained_variance | 0.955    |
| fps                | 1687     |
| nupdates           | 1360     |
| policy_entropy     | 0.841    |
| policy_loss        | -0.104   |
| total_timesteps    | 3328191  |
| value_loss         | 0.383    |
---------------------------------
---------------------------------
| explained_variance | 0.966    |
| fps                | 1688     |
| nupdates           | 1370     |
| policy_entropy     | 0.834    |
| policy_loss        | 0.0687   |
| total_timesteps    | 3352681  |
| value_loss         | 0.392    |
---------------------------------
---------------------------------
| explained_variance | 0.947    |
| fps         

---------------------------------
| explained_variance | 0.957    |
| fps                | 1691     |
| nupdates           | 1620     |
| policy_entropy     | 0.73     |
| policy_loss        | -0.0426  |
| total_timesteps    | 3964931  |
| value_loss         | 0.443    |
---------------------------------
---------------------------------
| explained_variance | 0.968    |
| fps                | 1691     |
| nupdates           | 1630     |
| policy_entropy     | 0.699    |
| policy_loss        | 0.0931   |
| total_timesteps    | 3989421  |
| value_loss         | 0.331    |
---------------------------------
---------------------------------
| explained_variance | 0.963    |
| fps                | 1691     |
| nupdates           | 1640     |
| policy_entropy     | 0.823    |
| policy_loss        | -0.103   |
| total_timesteps    | 4013911  |
| value_loss         | 0.581    |
---------------------------------
---------------------------------
| explained_variance | 0.974    |
| fps         

---------------------------------
| explained_variance | 0.972    |
| fps                | 1694     |
| nupdates           | 1890     |
| policy_entropy     | 0.81     |
| policy_loss        | -0.017   |
| total_timesteps    | 4626161  |
| value_loss         | 0.344    |
---------------------------------
---------------------------------
| explained_variance | 0.975    |
| fps                | 1694     |
| nupdates           | 1900     |
| policy_entropy     | 0.741    |
| policy_loss        | -0.0189  |
| total_timesteps    | 4650651  |
| value_loss         | 0.358    |
---------------------------------
----------------------------------
| explained_variance | 0.975     |
| fps                | 1694      |
| nupdates           | 1910      |
| policy_entropy     | 0.788     |
| policy_loss        | -0.000458 |
| total_timesteps    | 4675141   |
| value_loss         | 0.374     |
----------------------------------
---------------------------------
| explained_variance | 0.956    |
| fps

train...
phase 1
batch 1 learning rate 0.125 scaled 0.125
training...
---------------------------------
| explained_variance | 0.957    |
| fps                | 1277     |
| nupdates           | 1        |
| policy_entropy     | 0.916    |
| policy_loss        | -0.0638  |
| total_timesteps    | 0        |
| value_loss         | 0.345    |
---------------------------------
---------------------------------
| explained_variance | 0.409    |
| fps                | 2142     |
| nupdates           | 10       |
| policy_entropy     | 0.856    |
| policy_loss        | -3.52    |
| total_timesteps    | 22041    |
| value_loss         | 19.8     |
---------------------------------
---------------------------------
| explained_variance | 0.857    |
| fps                | 1890     |
| nupdates           | 20       |
| policy_entropy     | 0.834    |
| policy_loss        | 0.0851   |
| total_timesteps    | 46531    |
| value_loss         | 2.51     |
---------------------------------
------------

---------------------------------
| explained_variance | 0.925    |
| fps                | 1682     |
| nupdates           | 270      |
| policy_entropy     | 0.823    |
| policy_loss        | -0.906   |
| total_timesteps    | 658781   |
| value_loss         | 2.01     |
---------------------------------
---------------------------------
| explained_variance | 0.962    |
| fps                | 1681     |
| nupdates           | 280      |
| policy_entropy     | 0.85     |
| policy_loss        | 1.06     |
| total_timesteps    | 683271   |
| value_loss         | 2.08     |
---------------------------------
---------------------------------
| explained_variance | 0.958    |
| fps                | 1682     |
| nupdates           | 290      |
| policy_entropy     | 0.856    |
| policy_loss        | 0.207    |
| total_timesteps    | 707761   |
| value_loss         | 0.765    |
---------------------------------
---------------------------------
| explained_variance | 0.948    |
| fps         

---------------------------------
| explained_variance | 0.937    |
| fps                | 1682     |
| nupdates           | 540      |
| policy_entropy     | 0.835    |
| policy_loss        | 0.625    |
| total_timesteps    | 1320011  |
| value_loss         | 0.926    |
---------------------------------
---------------------------------
| explained_variance | 0.974    |
| fps                | 1682     |
| nupdates           | 550      |
| policy_entropy     | 0.877    |
| policy_loss        | 0.0779   |
| total_timesteps    | 1344501  |
| value_loss         | 0.375    |
---------------------------------
---------------------------------
| explained_variance | 0.98     |
| fps                | 1681     |
| nupdates           | 560      |
| policy_entropy     | 0.844    |
| policy_loss        | -0.382   |
| total_timesteps    | 1368991  |
| value_loss         | 0.527    |
---------------------------------
---------------------------------
| explained_variance | 0.935    |
| fps         

---------------------------------
| explained_variance | 0.982    |
| fps                | 1680     |
| nupdates           | 810      |
| policy_entropy     | 0.779    |
| policy_loss        | 0.312    |
| total_timesteps    | 1981241  |
| value_loss         | 0.51     |
---------------------------------
---------------------------------
| explained_variance | 0.975    |
| fps                | 1681     |
| nupdates           | 820      |
| policy_entropy     | 0.827    |
| policy_loss        | -0.123   |
| total_timesteps    | 2005731  |
| value_loss         | 0.241    |
---------------------------------
---------------------------------
| explained_variance | 0.966    |
| fps                | 1681     |
| nupdates           | 830      |
| policy_entropy     | 0.852    |
| policy_loss        | 0.0338   |
| total_timesteps    | 2030221  |
| value_loss         | 0.456    |
---------------------------------
---------------------------------
| explained_variance | 0.97     |
| fps         

---------------------------------
| explained_variance | 0.951    |
| fps                | 1681     |
| nupdates           | 1080     |
| policy_entropy     | 0.835    |
| policy_loss        | -0.232   |
| total_timesteps    | 2642471  |
| value_loss         | 0.634    |
---------------------------------
---------------------------------
| explained_variance | 0.978    |
| fps                | 1679     |
| nupdates           | 1090     |
| policy_entropy     | 0.827    |
| policy_loss        | 0.181    |
| total_timesteps    | 2666961  |
| value_loss         | 0.29     |
---------------------------------
---------------------------------
| explained_variance | 0.966    |
| fps                | 1678     |
| nupdates           | 1100     |
| policy_entropy     | 0.825    |
| policy_loss        | -0.0263  |
| total_timesteps    | 2691451  |
| value_loss         | 0.353    |
---------------------------------
---------------------------------
| explained_variance | 0.976    |
| fps         

---------------------------------
| explained_variance | 0.963    |
| fps                | 1649     |
| nupdates           | 1350     |
| policy_entropy     | 0.81     |
| policy_loss        | -0.103   |
| total_timesteps    | 3303701  |
| value_loss         | 0.432    |
---------------------------------
---------------------------------
| explained_variance | 0.981    |
| fps                | 1648     |
| nupdates           | 1360     |
| policy_entropy     | 0.773    |
| policy_loss        | 0.228    |
| total_timesteps    | 3328191  |
| value_loss         | 0.421    |
---------------------------------
---------------------------------
| explained_variance | 0.957    |
| fps                | 1647     |
| nupdates           | 1370     |
| policy_entropy     | 0.766    |
| policy_loss        | -0.0914  |
| total_timesteps    | 3352681  |
| value_loss         | 0.505    |
---------------------------------
---------------------------------
| explained_variance | 0.969    |
| fps         

---------------------------------
| explained_variance | 0.968    |
| fps                | 1650     |
| nupdates           | 1620     |
| policy_entropy     | 0.792    |
| policy_loss        | 0.0544   |
| total_timesteps    | 3964931  |
| value_loss         | 0.312    |
---------------------------------
---------------------------------
| explained_variance | 0.983    |
| fps                | 1650     |
| nupdates           | 1630     |
| policy_entropy     | 0.741    |
| policy_loss        | -0.0101  |
| total_timesteps    | 3989421  |
| value_loss         | 0.247    |
---------------------------------
---------------------------------
| explained_variance | 0.882    |
| fps                | 1650     |
| nupdates           | 1640     |
| policy_entropy     | 0.719    |
| policy_loss        | -0.0139  |
| total_timesteps    | 4013911  |
| value_loss         | 0.516    |
---------------------------------
---------------------------------
| explained_variance | 0.969    |
| fps         

---------------------------------
| explained_variance | 0.958    |
| fps                | 1658     |
| nupdates           | 1890     |
| policy_entropy     | 0.725    |
| policy_loss        | -0.0588  |
| total_timesteps    | 4626161  |
| value_loss         | 0.474    |
---------------------------------
---------------------------------
| explained_variance | 0.98     |
| fps                | 1658     |
| nupdates           | 1900     |
| policy_entropy     | 0.767    |
| policy_loss        | 0.00867  |
| total_timesteps    | 4650651  |
| value_loss         | 0.246    |
---------------------------------
---------------------------------
| explained_variance | 0.94     |
| fps                | 1659     |
| nupdates           | 1910     |
| policy_entropy     | 0.783    |
| policy_loss        | -0.0638  |
| total_timesteps    | 4675141  |
| value_loss         | 0.543    |
---------------------------------
---------------------------------
| explained_variance | 0.978    |
| fps         

train...
phase 1
batch 1 learning rate 0.125 scaled 0.125
training...
---------------------------------
| explained_variance | 0.947    |
| fps                | 1324     |
| nupdates           | 1        |
| policy_entropy     | 0.83     |
| policy_loss        | -0.151   |
| total_timesteps    | 0        |
| value_loss         | 0.518    |
---------------------------------
---------------------------------
| explained_variance | 0.907    |
| fps                | 2158     |
| nupdates           | 10       |
| policy_entropy     | 0.797    |
| policy_loss        | 1.49     |
| total_timesteps    | 22041    |
| value_loss         | 4.74     |
---------------------------------
---------------------------------
| explained_variance | 0.712    |
| fps                | 1902     |
| nupdates           | 20       |
| policy_entropy     | 0.803    |
| policy_loss        | -2.6     |
| total_timesteps    | 46531    |
| value_loss         | 12.6     |
---------------------------------
------------

---------------------------------
| explained_variance | 0.906    |
| fps                | 1696     |
| nupdates           | 270      |
| policy_entropy     | 0.868    |
| policy_loss        | -0.406   |
| total_timesteps    | 658781   |
| value_loss         | 0.914    |
---------------------------------
---------------------------------
| explained_variance | 0.939    |
| fps                | 1697     |
| nupdates           | 280      |
| policy_entropy     | 0.871    |
| policy_loss        | 1.31     |
| total_timesteps    | 683271   |
| value_loss         | 3.36     |
---------------------------------
---------------------------------
| explained_variance | 0.964    |
| fps                | 1697     |
| nupdates           | 290      |
| policy_entropy     | 0.853    |
| policy_loss        | -0.222   |
| total_timesteps    | 707761   |
| value_loss         | 0.537    |
---------------------------------
---------------------------------
| explained_variance | 0.928    |
| fps         

---------------------------------
| explained_variance | 0.975    |
| fps                | 1692     |
| nupdates           | 540      |
| policy_entropy     | 0.904    |
| policy_loss        | 0.497    |
| total_timesteps    | 1320011  |
| value_loss         | 0.785    |
---------------------------------
---------------------------------
| explained_variance | 0.97     |
| fps                | 1692     |
| nupdates           | 550      |
| policy_entropy     | 0.896    |
| policy_loss        | -0.522   |
| total_timesteps    | 1344501  |
| value_loss         | 0.738    |
---------------------------------
---------------------------------
| explained_variance | 0.957    |
| fps                | 1691     |
| nupdates           | 560      |
| policy_entropy     | 0.913    |
| policy_loss        | -0.21    |
| total_timesteps    | 1368991  |
| value_loss         | 0.424    |
---------------------------------
---------------------------------
| explained_variance | 0.959    |
| fps         

---------------------------------
| explained_variance | 0.91     |
| fps                | 1692     |
| nupdates           | 810      |
| policy_entropy     | 0.895    |
| policy_loss        | 0.0757   |
| total_timesteps    | 1981241  |
| value_loss         | 0.678    |
---------------------------------
---------------------------------
| explained_variance | 0.95     |
| fps                | 1692     |
| nupdates           | 820      |
| policy_entropy     | 0.876    |
| policy_loss        | -0.109   |
| total_timesteps    | 2005731  |
| value_loss         | 0.537    |
---------------------------------
---------------------------------
| explained_variance | 0.977    |
| fps                | 1692     |
| nupdates           | 830      |
| policy_entropy     | 0.883    |
| policy_loss        | -0.0455  |
| total_timesteps    | 2030221  |
| value_loss         | 0.294    |
---------------------------------
---------------------------------
| explained_variance | 0.98     |
| fps         

---------------------------------
| explained_variance | 0.967    |
| fps                | 1692     |
| nupdates           | 1080     |
| policy_entropy     | 0.937    |
| policy_loss        | 0.0694   |
| total_timesteps    | 2642471  |
| value_loss         | 0.4      |
---------------------------------
---------------------------------
| explained_variance | 0.957    |
| fps                | 1692     |
| nupdates           | 1090     |
| policy_entropy     | 0.95     |
| policy_loss        | -0.00929 |
| total_timesteps    | 2666961  |
| value_loss         | 0.403    |
---------------------------------
---------------------------------
| explained_variance | 0.977    |
| fps                | 1692     |
| nupdates           | 1100     |
| policy_entropy     | 0.885    |
| policy_loss        | -0.184   |
| total_timesteps    | 2691451  |
| value_loss         | 0.369    |
---------------------------------
---------------------------------
| explained_variance | 0.966    |
| fps         

---------------------------------
| explained_variance | 0.942    |
| fps                | 1694     |
| nupdates           | 1350     |
| policy_entropy     | 0.869    |
| policy_loss        | -0.0366  |
| total_timesteps    | 3303701  |
| value_loss         | 0.452    |
---------------------------------
---------------------------------
| explained_variance | 0.945    |
| fps                | 1694     |
| nupdates           | 1360     |
| policy_entropy     | 0.824    |
| policy_loss        | -0.15    |
| total_timesteps    | 3328191  |
| value_loss         | 0.598    |
---------------------------------
---------------------------------
| explained_variance | 0.936    |
| fps                | 1694     |
| nupdates           | 1370     |
| policy_entropy     | 0.789    |
| policy_loss        | -0.0136  |
| total_timesteps    | 3352681  |
| value_loss         | 0.552    |
---------------------------------
---------------------------------
| explained_variance | 0.983    |
| fps         

---------------------------------
| explained_variance | 0.968    |
| fps                | 1697     |
| nupdates           | 1620     |
| policy_entropy     | 0.784    |
| policy_loss        | 0.0833   |
| total_timesteps    | 3964931  |
| value_loss         | 0.322    |
---------------------------------
---------------------------------
| explained_variance | 0.977    |
| fps                | 1697     |
| nupdates           | 1630     |
| policy_entropy     | 0.845    |
| policy_loss        | 0.000249 |
| total_timesteps    | 3989421  |
| value_loss         | 0.325    |
---------------------------------
---------------------------------
| explained_variance | 0.973    |
| fps                | 1697     |
| nupdates           | 1640     |
| policy_entropy     | 0.82     |
| policy_loss        | -0.0507  |
| total_timesteps    | 4013911  |
| value_loss         | 0.304    |
---------------------------------
---------------------------------
| explained_variance | 0.952    |
| fps         

---------------------------------
| explained_variance | 0.986    |
| fps                | 1698     |
| nupdates           | 1890     |
| policy_entropy     | 0.775    |
| policy_loss        | 0.0749   |
| total_timesteps    | 4626161  |
| value_loss         | 0.239    |
---------------------------------
---------------------------------
| explained_variance | 0.977    |
| fps                | 1699     |
| nupdates           | 1900     |
| policy_entropy     | 0.809    |
| policy_loss        | 0.0113   |
| total_timesteps    | 4650651  |
| value_loss         | 0.326    |
---------------------------------
---------------------------------
| explained_variance | 0.978    |
| fps                | 1699     |
| nupdates           | 1910     |
| policy_entropy     | 0.827    |
| policy_loss        | -0.0999  |
| total_timesteps    | 4675141  |
| value_loss         | 0.348    |
---------------------------------
---------------------------------
| explained_variance | 0.986    |
| fps         

In [None]:

#cc1.render_distrib(load='results/perus_results2')


# Removing the unemployment pipeline

Entering the unemployment pipeline (työttömyysputki, the extended unemployment benefit for older workers) is often a very popular choice in life-cycle models. Here we examine the employment effect of removing the pipeline.

In [None]:
# Same setup as before, but with the unemployment pipeline switched off (include_putki=False)
cc1_putki=Lifecycle(env='unemployment-v1',minimal=False,include_putki=False,mortality=mortality,
                    perustulo=False,randomness=randomness)
# Train and simulate n=5 repeated runs, continuing from the baseline model in perusmalli
cc1_putki.run_distrib(n=5,debug=False,steps1=size1,steps2=size2,pop=pop_size,deterministic=deterministic,
                      train=True,predict=True,batch1=batch1,batch2=batch2,
                      save=perusmalli,plot=True,cont=True,start_from=perusmalli,results='results/distrib_poisto',
                      callback_minsteps=callback_minsteps,rlmodel=rlmodel,twostage=twostage)


In [None]:
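# NOTE: the run above writes to 'results/distrib_poisto'; check that the path loaded here matches it.
cc1_putki.render_distrib(load='results/putki_results')
# NOTE: both arguments point to the same file, so this compares the results with themselves;
# the first argument is presumably meant to be the baseline results.
cc1_putki.compare_simstats('results/putki_results','results/putki_results')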
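
The spread across repeated runs can be summarised with a mean, a sample standard deviation and a standard error, which gives a rough sense of whether a difference between two scenarios exceeds the run-to-run noise. The sketch below is only an illustration and is not part of `lifecycle_rl`: the helper `summarize_runs` and the per-run employment rates are hypothetical placeholders, and in practice the values would have to be read from the saved simulation results.

```python
import numpy as np

def summarize_runs(values):
    """Return mean, sample standard deviation and standard error of per-run results."""
    x = np.asarray(values, dtype=float)
    n = x.size
    mean = x.mean()
    std = x.std(ddof=1) if n > 1 else 0.0  # sample std; undefined for a single run
    sem = std / np.sqrt(n)                 # standard error of the mean
    return mean, std, sem

# Hypothetical employment rates from n=5 runs each (placeholders, not model output)
baseline = [0.70, 0.72, 0.71, 0.69, 0.73]
reform = [0.72, 0.73, 0.74, 0.71, 0.75]

m0, s0, e0 = summarize_runs(baseline)
m1, s1, e1 = summarize_runs(reform)

# Difference of means; the combined standard error assumes the two sets of runs are independent
diff = m1 - m0
diff_sem = np.sqrt(e0**2 + e1**2)

print(f"baseline {m0:.3f} +/- {e0:.3f}, reform {m1:.3f} +/- {e1:.3f}")
print(f"difference {diff:.3f} +/- {diff_sem:.3f}")
```

With only five runs per scenario the standard errors are themselves noisy, so a difference of about one or two standard errors should be read with caution.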