# Introduction

**Executive summary:** use `WalkForwardRunner` to run your walk-forward training, and use `WalkForwardResults` to restore the agent and visualize the results.

In [1]:
import ray
from ray import rllib, tune
import pandas as pd
from datetime import datetime
import trading_gym
from trading_gym.registry.gaia.v7.env import GAIAPredictorsContinuousV7
from trading_gym.registry.gaia.v8.env import GAIAPredictorsContinuousV8
from trading_gym.registry.gaia.v9.env import GAIAPredictorsContinuousV9
from trading_gym.ray.walkforward import WalkForwardRunner, WalkForwardResults
import os
%matplotlib inline
print(trading_gym.__package__, trading_gym.__version__)
print(ray.__package__, ray.__version__)

trading-gym 0.8.1
ray 0.7.2


In [2]:
ray.init()

2019-08-21 15:56:54,825	INFO node.py:498 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-08-21_15-56-54_825142_14506/logs.
2019-08-21 15:56:54,940	INFO services.py:409 -- Waiting for redis server at 127.0.0.1:60773 to respond...
2019-08-21 15:56:55,059	INFO services.py:409 -- Waiting for redis server at 127.0.0.1:49091 to respond...
2019-08-21 15:56:55,063	INFO services.py:806 -- Starting Redis shard with 10.0 GB max memory.
2019-08-21 15:56:55,098	INFO node.py:512 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-08-21_15-56-54_825142_14506/logs.
2019-08-21 15:56:55,103	INFO services.py:1446 -- Starting the Plasma object store with 20.0 GB memory using /dev/shm.


{'node_ip_address': '10.0.5.4',
 'redis_address': '10.0.5.4:60773',
 'object_store_address': '/tmp/ray/session_2019-08-21_15-56-54_825142_14506/sockets/plasma_store',
 'raylet_socket_name': '/tmp/ray/session_2019-08-21_15-56-54_825142_14506/sockets/raylet',
 'webui_url': None,
 'session_dir': '/tmp/ray/session_2019-08-21_15-56-54_825142_14506'}

# WalkForwardRunner
In pure `ray`, you would typically do something along these lines:

    from ray.rllib.agents.ppo import PPOTrainer

    # Start from the default PPO settings and plug in the environment.
    config = ray.rllib.agents.ppo.DEFAULT_CONFIG.copy()
    config['env'] = GAIAPredictorsContinuousV8
    config['env_config'] = {'cost_of_commissions': 0.00005, 'cost_of_spread': 0.0001}
    tune.run(
        PPOTrainer,
        config=config,
        stop={'timesteps_total': 25000},
        checkpoint_freq=1,
        verbose=1,
    )
    
In this section we see how to reproduce the same logic using `WalkForwardRunner`. The benefit of using `WalkForwardRunner` rather than pure `ray` is that only the former lets you use `WalkForwardResults` to restore and visualize agents on different folds.

## Create the walk-forward partitions
It is the user's responsibility to create the training/test (and optionally validation) partitions that drive the walk-forward training. Note that a 2-fold split is a particular case of walk-forward training, so you can still run a simple 2-fold split.
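As a minimal sketch of what such partitions look like, the snippet below builds expanding-window folds with the same `'training-set'` / `'test-set'` keys used in this notebook and checks that no test set starts before its training set ends. The helper names `make_expanding_partitions` and `check_partitions` are hypothetical and are not part of `trading_gym`.

```python
from datetime import datetime


def make_expanding_partitions(first_year, last_year):
    """One partition per year: train up to year-end, test on the following year."""
    partitions = []
    for year in range(first_year, last_year + 1):
        partitions.append({
            'training-set': [datetime.min, datetime(year, 12, 31)],
            'test-set': [datetime(year + 1, 1, 1), datetime(year + 1, 12, 31)],
        })
    return partitions


def check_partitions(partitions):
    """Raise if a test set starts on or before the last training date (look-ahead)."""
    for p in partitions:
        train_start, train_end = p['training-set']
        test_start, test_end = p['test-set']
        if not (train_start <= train_end < test_start <= test_end):
            raise ValueError('Overlapping or unordered partition: {}'.format(p))


walk_forward_partitions = make_expanding_partitions(2007, 2017)
check_partitions(walk_forward_partitions)

# A simple 2-fold split is just a walk-forward with a single partition.
two_fold = [{
    'training-set': [datetime.min, datetime(2008, 3, 18)],
    'test-set': [datetime(2008, 3, 19), datetime.max],
}]
check_partitions(two_fold)
```

Running the check before launching a long experiment is a cheap way to catch a fold whose test period leaks into its training period.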

In [3]:
partitions = list()
for year in range(2007, 2018):
    partition = {
        'training-set': [datetime.min, datetime(year, 12, 31)],
        'test-set': [datetime(year + 1, 1, 1), datetime(year + 1, 12, 31)],
    }
    partitions.append(partition)
partitions

[{'training-set': [datetime.datetime(1, 1, 1, 0, 0),
   datetime.datetime(2007, 12, 31, 0, 0)],
  'test-set': [datetime.datetime(2008, 1, 1, 0, 0),
   datetime.datetime(2008, 12, 31, 0, 0)]},
 {'training-set': [datetime.datetime(1, 1, 1, 0, 0),
   datetime.datetime(2008, 12, 31, 0, 0)],
  'test-set': [datetime.datetime(2009, 1, 1, 0, 0),
   datetime.datetime(2009, 12, 31, 0, 0)]},
 {'training-set': [datetime.datetime(1, 1, 1, 0, 0),
   datetime.datetime(2009, 12, 31, 0, 0)],
  'test-set': [datetime.datetime(2010, 1, 1, 0, 0),
   datetime.datetime(2010, 12, 31, 0, 0)]},
 {'training-set': [datetime.datetime(1, 1, 1, 0, 0),
   datetime.datetime(2010, 12, 31, 0, 0)],
  'test-set': [datetime.datetime(2011, 1, 1, 0, 0),
   datetime.datetime(2011, 12, 31, 0, 0)]},
 {'training-set': [datetime.datetime(1, 1, 1, 0, 0),
   datetime.datetime(2011, 12, 31, 0, 0)],
  'test-set': [datetime.datetime(2012, 1, 1, 0, 0),
   datetime.datetime(2012, 12, 31, 0, 0)]},
 {'training-set': [datetime.datetime(1, 

## Create the config dict


In [4]:
config = ray.rllib.agents.ppo.DEFAULT_CONFIG.copy()
config['env'] = GAIAPredictorsContinuousV9
config['env_config'] = {
    'cost_of_commissions': tune.grid_search([0.00005]),
    'cost_of_spread': 0.0001,
}
config['gamma'] = 0.82

In [5]:
env = GAIAPredictorsContinuousV9()
env.action_space.sample()

array([ 0.05468333, -0.02438399])

## Run your walk-forward experiment

In [6]:
walk_forward = WalkForwardRunner(
    env_partitions=partitions,
    trainable=ray.rllib.agents.ppo.PPOTrainer,
    config=config,
    stop={'timesteps_total': 50000},
    checkpoint_freq=1,
)

Note that `WalkForwardRunner` has constructed the implied ray `Experiment`(s) from your walk-forward settings.

In [6]:
trials = walk_forward.run(verbose=0)

2019-07-18 13:05:35,921	INFO tune.py:61 -- Tip: to resume incomplete experiments, pass resume='prompt' or resume=True to run()
2019-07-18 13:05:35,923	INFO tune.py:233 -- Starting a new experiment.


[2m[36m(pid=18974)[0m 2019-07-18 13:05:38,576	INFO rollout_worker.py:301 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
[2m[36m(pid=18974)[0m 2019-07-18 13:05:38.576880: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[2m[36m(pid=18967)[0m 2019-07-18 13:05:38,639	INFO rollout_worker.py:301 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
[2m[36m(pid=18967)[0m 2019-07-18 13:05:38.639484: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[2m[36m(pid=18974)[0m 2019-07-18 13:05:38,725	INFO dynamic_tf_policy.py:313 -- Initializing loss function with dummy input:
[2m[36m(pid=18974)[0m 
[2m[36m(pid=18974)[0m { 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
[2m[36m(pid=18974)

[2m[36m(pid=19079)[0m 2019-07-18 13:05:42,624	INFO dynamic_tf_policy.py:313 -- Initializing loss function with dummy input:
[2m[36m(pid=19079)[0m 
[2m[36m(pid=19079)[0m { 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
[2m[36m(pid=19079)[0m   'actions': <tf.Tensor 'default_policy/actions:0' shape=(?, 2) dtype=float32>,
[2m[36m(pid=19079)[0m   'advantages': <tf.Tensor 'default_policy/advantages:0' shape=(?,) dtype=float32>,
[2m[36m(pid=19079)[0m   'behaviour_logits': <tf.Tensor 'default_policy/behaviour_logits:0' shape=(?, 2) dtype=float32>,
[2m[36m(pid=19079)[0m   'dones': <tf.Tensor 'default_policy/dones:0' shape=(?,) dtype=bool>,
[2m[36m(pid=19079)[0m   'new_obs': <tf.Tensor 'default_policy/new_obs:0' shape=(?, 5) dtype=float32>,
[2m[36m(pid=19079)[0m   'obs': <tf.Tensor 'default_policy/observation:0' shape=(?, 5) dtype=float32>,
[2m[36m(pid=19079)[0m   'prev_actions': <tf.Tensor 'default_policy/action:0' shape=(?, 2) 

[2m[36m(pid=19077)[0m 2019-07-18 13:05:44,263	INFO rollout_worker.py:462 -- Completed sample batch:
[2m[36m(pid=19077)[0m 
[2m[36m(pid=19077)[0m { 'data': { 'action_prob': np.ndarray((200,), dtype=float32, min=0.967, max=1.023, mean=1.001),
[2m[36m(pid=19077)[0m             'actions': np.ndarray((200, 2), dtype=float32, min=0.003, max=0.997, mean=0.5),
[2m[36m(pid=19077)[0m             'advantages': np.ndarray((200,), dtype=float32, min=-0.023, max=0.019, mean=0.003),
[2m[36m(pid=19077)[0m             'agent_index': np.ndarray((200,), dtype=int64, min=0.0, max=0.0, mean=0.0),
[2m[36m(pid=19077)[0m             'behaviour_logits': np.ndarray((200, 2), dtype=float32, min=-0.009, max=0.014, mean=0.005),
[2m[36m(pid=19077)[0m             'dones': np.ndarray((200,), dtype=bool, min=0.0, max=1.0, mean=0.05),
[2m[36m(pid=19077)[0m             'eps_id': np.ndarray((200,), dtype=int64, min=231480974.0, max=1979631230.0, mean=1158329005.4),
[2m[36m(pid=19077)[0m     

[2m[36m(pid=18974)[0m 2019-07-18 13:05:52,178	INFO tf_run_builder.py:92 -- Executing TF run without tracing. To dump TF timeline traces to disk, set the TF_TIMELINE_DIR environment variable.
[2m[36m(pid=18967)[0m 2019-07-18 13:05:52,229	INFO tf_run_builder.py:92 -- Executing TF run without tracing. To dump TF timeline traces to disk, set the TF_TIMELINE_DIR environment variable.


2019-07-18 13:06:40,571	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPO_GAIAPredictorsContinuousV8_restoreID=-5810601706822191451_runID=SYtcQGjy_1_cost_of_commissions=0.0005. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-07-18 13:06:40,838	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPO_GAIAPredictorsContinuousV8_restoreID=-6139801031714166768_runID=SYtcQGjy_0_cost_of_commissions=5e-05. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-07-18 13:06:40,869	INFO tune.py:61 -- Tip: to resume incomplete experiments, pass resume='prompt' or resume=True to run()
2019-07-18 13:06:40,871	INFO tune.py:233 -- Starting a new experiment.


[2m[36m(pid=18973)[0m 2019-07-18 13:06:42,006	INFO rollout_worker.py:301 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
[2m[36m(pid=18973)[0m 2019-07-18 13:06:42.007045: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[2m[36m(pid=18968)[0m 2019-07-18 13:06:42,035	INFO rollout_worker.py:301 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
[2m[36m(pid=18968)[0m 2019-07-18 13:06:42.035507: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[2m[36m(pid=18973)[0m 2019-07-18 13:06:42,143	INFO dynamic_tf_policy.py:313 -- Initializing loss function with dummy input:
[2m[36m(pid=18973)[0m 
[2m[36m(pid=18973)[0m { 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
[2m[36m(pid=18973)

[2m[36m(pid=18972)[0m 
[2m[36m(pid=18972)[0m Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
[2m[36m(pid=18972)[0m 
[2m[36m(pid=18971)[0m 
[2m[36m(pid=18971)[0m Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
[2m[36m(pid=18971)[0m 
[2m[36m(pid=18969)[0m 
[2m[36m(pid=18969)[0m Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
[2m[36m(pid=18969)[0m 
[2m[36m(pid=19693)[0m 2019-07-18 13:06:46,040	INFO rollout_worker.py:301 -- Creating policy evaluation worker 2 on CPU (please ignore any CUDA init errors)
[2m[36m(pid=19693)[0m 2019-07-18 13:06:46.047842: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[2m[36m(pid=19693)[0m 
[2m[36m(pid=19693)[0m Converting sparse IndexedSl

[2m[36m(pid=18971)[0m 2019-07-18 13:06:46,863	INFO sample_batch_builder.py:161 -- Trajectory fragment after postprocess_trajectory():
[2m[36m(pid=18971)[0m 
[2m[36m(pid=18971)[0m { 'agent0': { 'data': { 'action_prob': np.ndarray((20,), dtype=float32, min=0.99, max=1.005, mean=1.0),
[2m[36m(pid=18971)[0m                         'actions': np.ndarray((20, 2), dtype=float32, min=0.001, max=0.999, mean=0.5),
[2m[36m(pid=18971)[0m                         'advantages': np.ndarray((20,), dtype=float32, min=-0.0, max=0.014, mean=0.006),
[2m[36m(pid=18971)[0m                         'agent_index': np.ndarray((20,), dtype=int64, min=0.0, max=0.0, mean=0.0),
[2m[36m(pid=18971)[0m                         'behaviour_logits': np.ndarray((20, 2), dtype=float32, min=-0.003, max=0.002, mean=-0.001),
[2m[36m(pid=18971)[0m                         'dones': np.ndarray((20,), dtype=bool, min=0.0, max=1.0, mean=0.05),
[2m[36m(pid=18971)[0m                         'eps_id': np.ndarr

[2m[36m(pid=18973)[0m 2019-07-18 13:06:50,959	INFO multi_gpu_impl.py:146 -- Training on concatenated sample batches:
[2m[36m(pid=18973)[0m 
[2m[36m(pid=18973)[0m { 'inputs': [ np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.475),
[2m[36m(pid=18973)[0m               np.ndarray((4000,), dtype=float32, min=-0.05, max=0.042, mean=0.0),
[2m[36m(pid=18973)[0m               np.ndarray((4000, 5), dtype=float32, min=-14.063, max=13.391, mean=0.278),
[2m[36m(pid=18973)[0m               np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.5),
[2m[36m(pid=18973)[0m               np.ndarray((4000,), dtype=float32, min=-7.279, max=6.801, mean=-0.0),
[2m[36m(pid=18973)[0m               np.ndarray((4000, 2), dtype=float32, min=-0.008, max=0.013, mean=0.001),
[2m[36m(pid=18973)[0m               np.ndarray((4000,), dtype=float32, min=-0.05, max=0.042, mean=0.0),
[2m[36m(pid=18973)[0m               np.ndarray((4000,), dtype=float32, min=-0.01, max=0.00

2019-07-18 13:07:44,357	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPO_GAIAPredictorsContinuousV8_restoreID=-5810601706822191451_runID=SYtcQGjy_1_cost_of_commissions=0.0005. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-07-18 13:07:44,511	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPO_GAIAPredictorsContinuousV8_restoreID=-6139801031714166768_runID=SYtcQGjy_0_cost_of_commissions=5e-05. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.


Note that each trial is associated with a `RestoreID`. This `ID` is all you need to restore an agent. The output shown here comes from a grid search over two values of `cost_of_commissions` (5e-05 and 0.0005) on two partitions, so there are 4 trials in total.

In [7]:
trials

[PPO_GAIAPredictorsContinuousV8_restoreID=-6139801031714166768_runID=SYtcQGjy_0_cost_of_commissions=5e-05,
 PPO_GAIAPredictorsContinuousV8_restoreID=-5810601706822191451_runID=SYtcQGjy_1_cost_of_commissions=0.0005,
 PPO_GAIAPredictorsContinuousV8_restoreID=-6139801031714166768_runID=SYtcQGjy_0_cost_of_commissions=5e-05,
 PPO_GAIAPredictorsContinuousV8_restoreID=-5810601706822191451_runID=SYtcQGjy_1_cost_of_commissions=0.0005]
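The trial count above is simply the number of partitions times the number of grid-search combinations. The snippet below is a plain-Python illustration of that arithmetic using `itertools.product`; it does not call `tune` and is not how `WalkForwardRunner` enumerates trials internally. The fold labels are made up for the example.

```python
from itertools import product

# Grid values seen in the trial names above.
grid = {'cost_of_commissions': [0.00005, 0.0005]}
# Hypothetical labels for the two walk-forward partitions.
folds = ['train to 2007-12-31', 'train to 2008-12-31']

# Every combination of grid values (here there is only one grid key).
keys = sorted(grid)
combos = [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]

# One trial per (fold, combination) pair.
expected_trials = [(fold, combo) for fold in folds for combo in combos]
print(len(expected_trials))  # 2 folds x 2 grid values = 4
```

Adding a second grid-searched key with three values would multiply the count again, giving 2 x 2 x 3 = 12 trials, so the grid should be kept small when there are many partitions.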

You can monitor the agents' training in TensorBoard.
![title](tensorboard.png)

# WalkForwardResults

`WalkForwardResults` is the 'controller' of your walk-forward results across all the environments you have solved using `WalkForwardRunner`. If you print the class instance, you will see the list of all the environments that you have solved.

In [7]:
import os

In [10]:
# results = WalkForwardResults(path=os.path.join(os.getcwd(), 'logs'))
# results
print(os.path.join(os.getcwd(), 'logs'))
path = os.path.join(os.getcwd(), 'logs')

/home/Nicholas/trading-gym/notebooks/trading-gym/walk-forward/logs


Let's select the results associated with a particular environment.

In [11]:
results = WalkForwardResults(path)

In [12]:
env_results = results['GAIAPredictorsContinuousV8']
env_results

EnvResults(GAIAPredictorsContinuousV8)

To assess an agent on an environment you need two things: an agent and an environment. The steps are:

1. `EnvResults.make_env`: restores an environment and lets you pass a new `env_config` (e.g. different transaction costs). This flexibility lets you train with one `env_config` (e.g. unrealistically high transaction costs) and assess with another (e.g. realistic transaction costs).
2. `EnvResults.make_policy`: builds an implementation of `AbstractPolicy`, which is needed to sample episodes from `TradingEnv` and thus render the results. The arguments of this method let you specify:
    1. Which of the agent's checkpoints to use (the last $n$). If left blank, the last checkpoint is used (i.e. the last training iteration). If $n>1$, a list of actions is returned (one for each agent).
    2. Whether or not to create an ensemble from the previously trained agents.
3. `TradingEnv.sample_episode(policy)`: returns an instance of `Episode`, which stores all the information you need to assess your agent. Charts and tables can be produced using `Episode.renderer`.
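To make the ensemble option in step 2 more concrete, here is a minimal sketch of one way several per-agent actions could be combined into a single action. This notebook does not show how `make_policy` actually combines them; simple averaging is an assumption made for illustration, and `ensemble_action` is a hypothetical helper.

```python
import numpy as np


def ensemble_action(actions):
    """Average a list of per-agent action vectors into one action (assumed rule)."""
    stacked = np.stack([np.asarray(a, dtype=float) for a in actions])
    return stacked.mean(axis=0)


# Hypothetical actions proposed by the last n=3 checkpoints for a 2-asset env.
actions = [
    np.array([0.2, 0.8]),
    np.array([0.4, 0.6]),
    np.array([0.3, 0.7]),
]
combined = ensemble_action(actions)
```

Averaging keeps the combined action inside the range spanned by the individual agents, which is one reason it is a common default for continuous action spaces.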

In [13]:
# Step 1.
env = env_results.make_env(
    env_config={
        'cost_of_commissions': 0.00005,
        'cost_of_spread': 0.0001,
        'folds': {
            'training-set': [datetime.min, datetime(2008, 3, 18)],
            'test-set': [datetime(2008, 3, 19), datetime.max],
        }
    },
)

Note that these are the same `RestoreID`s that we saw in the previous section.

In [14]:
env_results.restore_ids

{7598463039896949665: [AgentResults(GAIAPredictorsContinuousV8_1-01-01_to_2007-12-31/PPO_GAIAPredictorsContinuousV8_restoreID=7598463039896949665_runID=xZhYdK7w_3_clip_param=0.9,entropy_coeff=1e-05,cost_of_commissio_2019-07-18_13-38-13kwfwp3oo),
  AgentResults(GAIAPredictorsContinuousV8_1-01-01_to_2008-12-31/PPO_GAIAPredictorsContinuousV8_restoreID=7598463039896949665_runID=xZhYdK7w_3_clip_param=0.9,entropy_coeff=1e-05,cost_of_commissio_2019-07-18_14-57-20h4p6ma0t)],
 8837309464980678580: [AgentResults(GAIAPredictorsContinuousV8_1-01-01_to_2007-12-31/PPO_GAIAPredictorsContinuousV8_restoreID=8837309464980678580_runID=VQ6ZSlxh_3_clip_param=0.4,entropy_coeff=1e-05,cost_of_commissio_2019-07-18_17-08-438n1jeljd),
  AgentResults(GAIAPredictorsContinuousV8_1-01-01_to_2008-12-31/PPO_GAIAPredictorsContinuousV8_restoreID=8837309464980678580_runID=VQ6ZSlxh_3_clip_param=0.4,entropy_coeff=1e-05,cost_of_commissio_2019-07-18_18-23-120ssc446k)],
 7122607550531400491: [AgentResults(GAIAPredictorsContin

In [16]:
# We select an arbitrary id here (the first one).
# If you want to restore a particular run from tensorboard, just check its RestoreID and specify it here.
restore_id = list(env_results.restore_ids)[0]
restore_id

7598463039896949665

In [17]:
# Step 2.
policy = env_results.make_policy(
    env=env,
    restore_id=restore_id,
    checkpoint_nr=5,  # use None (or don't specify) to use last checkpoint available
)
policy

<trading_gym.ray.walkforward.policy.WalkForwardPolicy at 0x7f32eacea710>

In the previous sections we used `WalkForwardRunner` to run a walk-forward optimization that re-trains every year. In other circumstances, re-training might follow more complex patterns; for example, it might occur on an irregular basis, e.g. whenever there is a structural break in the markets. It can therefore be useful to visualize the "age" of the model used on a given day. The older the model, the higher the risk that the dynamics of the system have changed and that the model is outdated.

In [19]:
history = policy.history()
# history

In [20]:
history['AgeInDays'].iplot(
    title='Age in days of the most recent model that could be used in the date indicated in the x-axis<br>Whenever the count drops to zero, there has been a re-training of the model',
    yTitle='Nr of days',
    fill=True,
)
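To make the notion of model "age" concrete, the sketch below computes, for each date, the number of days since the most recent re-training on or before that date. It is a standalone `pandas` illustration with made-up dates; `model_age_in_days` is a hypothetical helper and not necessarily how `policy.history()` derives its `AgeInDays` column.

```python
import numpy as np
import pandas as pd


def model_age_in_days(dates, retrain_dates):
    """For each date, count the days since the latest re-training on or before it."""
    dates = pd.DatetimeIndex(dates)
    retrain = pd.DatetimeIndex(retrain_dates).sort_values()
    # Position of the latest re-training date that is <= each date.
    pos = np.searchsorted(retrain.values, dates.values, side='right') - 1
    if (pos < 0).any():
        raise ValueError('Some dates precede the first re-training date.')
    age = (dates - retrain[pos]).days
    return pd.Series(np.asarray(age), index=dates, name='AgeInDays')


# Made-up example: the model is re-trained on 1 January and again on 5 January.
dates = pd.date_range('2008-01-01', '2008-01-10', freq='D')
age = model_age_in_days(dates, ['2008-01-01', '2008-01-05'])
```

The series climbs by one each day and drops back to zero on every re-training date, which is the sawtooth shape the chart above is meant to show.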

In [21]:
# Step 3.
episode = env.sample_episode(fold='test-set', policy=policy, verbose=False)

2019-08-07 21:35:32,941	INFO rollout_worker.py:301 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
2019-08-07 21:35:33,520	INFO dynamic_tf_policy.py:313 -- Initializing loss function with dummy input:

{ 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
  'actions': <tf.Tensor 'default_policy/actions:0' shape=(?, 2) dtype=float32>,
  'advantages': <tf.Tensor 'default_policy/advantages:0' shape=(?,) dtype=float32>,
  'behaviour_logits': <tf.Tensor 'default_policy/behaviour_logits:0' shape=(?, 2) dtype=float32>,
  'dones': <tf.Tensor 'default_policy/dones:0' shape=(?,) dtype=bool>,
  'new_obs': <tf.Tensor 'default_policy/new_obs:0' shape=(?, 5) dtype=float32>,
  'obs': <tf.Tensor 'default_policy/observation:0' shape=(?, 5) dtype=float32>,
  'prev_actions': <tf.Tensor 'default_policy/action:0' shape=(?, 2) dtype=float32>,
  'prev_rewards': <tf.Tensor 'default_policy/prev_reward:0' shape=(?,) dtype=float32>,
  'rewards': 

2019-08-07 21:35:59,853	ERROR worker.py:1612 -- Possible unhandled error from worker: ray_RolloutWorker:__init__() (pid=11053, host=Nicholas)
  File "/home/Nicholas/.venv/lib/python3.6/site-packages/ray/memory_monitor.py", line 77, in raise_if_low_memory
    self.error_threshold))
ray.memory_monitor.RayOutOfMemoryError: More than 95% of the memory on node Nicholas is used (66.84 / 67.53 GB). The top 5 memory consumers are:

PID	MEM	COMMAND
27961	15.62GB	/home/Nicholas/.venv/bin/python3 /home/Nicholas/.venv/bin/tensorboard --logdir /home/Nicholas/tradin
96381	15.34GB	/home/Nicholas/.venv/bin/python3 -m ipykernel_launcher -f /home/Nicholas/.local/share/jupyter/runtim
48848	4.15GB	/home/Nicholas/.venv/bin/python3 -m ipykernel_launcher -f /home/Nicholas/.local/share/jupyter/runtim
50106	3.01GB	/home/Nicholas/Downloads/pycharm-community-2019.1.3/jre64/bin/java -classpath /home/Nicholas/Downloa
42895	1.99GB	/home/Nicholas/.venv/bin/python3 -m ipykernel_launcher -f /home/Nicholas/.l

## Render results over the combined test-folds
`episode.renderer` is probably the single most useful attribute of `Episode` to visualize results, but you are invited to explore other attributes such as `episode.states` or `episode.actions`.

In [22]:
episode.renderer.cumulative_performance.to_plotly()
episode.renderer.target_weights.to_plotly()
episode.renderer.annual_returns.to_plotly()
episode.renderer.tearsheet()

| | | Strategy | Index(Aric-Benchmark) | Index(USD 1M Deposit) | Cash(USD) | ETF(Russell 1000, SMART, USD) | ETF(7-10Y T-Bills, SMART, USD) |
|---|---|---|---|---|---|---|---|
| Context | From | 2008-03-19 | 2008-03-19 | 2008-03-19 | 2008-03-19 | 2008-03-19 | 2008-03-19 |
| Context | To | 2018-08-28 | 2018-08-28 | 2018-08-28 | 2018-08-28 | 2018-08-28 | 2018-08-28 |
| Context | Years | 10.4493 | 10.4493 | 10.4493 | 10.4493 | 10.4493 | 10.4493 |
| Context | Observations | 2725 | 2725 | 2725 | 2725 | 2725 | 2725 |
| Context | Risk-free asset | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) |
| Context | Risk-free CAGR | 0.00681294 | 0.00681294 | 0.00681294 | 0.00681294 | 0.00681294 | 0.00681294 |
| Return | CAGR | 0.0957706 | 0.158586 | 0.00681294 | 0 | 0.104507 | 0.0339243 |
| Return | CAGR over cash | 0.0889577 | 0.151773 | 0 | -0.00681294 | 0.0976941 | 0.0271113 |
| Return | Overall return | 1.60042 | 3.65592 | 0.0735266 | 0 | 1.82541 | 0.417089 |
| Risk | Volatility | 0.0746025 | 0.0970738 | 0.000598812 | 0 | 0.197859 | 0.0766871 |
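As a quick sanity check on the tearsheet, the `Strategy` figures can be reproduced from each other with the standard compound-annual-growth-rate formula, $\text{CAGR} = (1 + R)^{1/T} - 1$, where $R$ is the overall return and $T$ the number of years. That the tearsheet uses exactly this definition is inferred from the numbers agreeing, not from the library's code.

```python
# Values copied from the Strategy column of the tearsheet above.
overall_return = 1.60042
years = 10.4493
risk_free_cagr = 0.00681294

# Standard CAGR definition: annualize the total growth factor.
cagr = (1.0 + overall_return) ** (1.0 / years) - 1.0
# Excess annual growth over the risk-free asset.
cagr_over_cash = cagr - risk_free_cagr

print(round(cagr, 4), round(cagr_over_cash, 4))
```

Both values agree with the `CAGR` and `CAGR over cash` rows to about four decimal places, which suggests the second row is the plain difference between the strategy's CAGR and the risk-free CAGR.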


## Use case: interactive charts with widgets to visualize the agent during training
How do the historical test-set weights change as we train? By default, one call to `agent.train` runs the agent for 4000 timesteps. We previously set `checkpoint_freq=1`, so we will be able to restore agents every 4000 timesteps.
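The number of checkpoints to expect follows from simple arithmetic. The sketch below assumes that every training iteration collects exactly 4000 timesteps and that the run stops at the first iteration whose cumulative timesteps reach the `timesteps_total` threshold; `expected_checkpoints` is a hypothetical helper, not part of `ray` or `trading_gym`.

```python
import math


def expected_checkpoints(timesteps_total, timesteps_per_iteration=4000, checkpoint_freq=1):
    """Number of periodic checkpoints written before the stop criterion is met."""
    # The run ends at the first iteration whose cumulative timesteps reach the threshold.
    iterations = math.ceil(timesteps_total / timesteps_per_iteration)
    # One checkpoint every `checkpoint_freq` iterations.
    return iterations // checkpoint_freq


print(expected_checkpoints(25000))  # 7 iterations, i.e. 28000 timesteps
print(expected_checkpoints(50000))  # 13 iterations, i.e. 52000 timesteps
```

Under these assumptions a run stopped at 25000 timesteps yields checkpoints 1 to 7, and a run stopped at 50000 timesteps yields checkpoints 1 to 13.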

In [18]:
# This might take a long time, as it requires restoring (number of agents) x (number of checkpoints) agents
nr2episode = env_results.get_nr2episode(
    restore_id=restore_id,
    checkpoint_nrs=[1, 2, 3, 4, 5, 6, 7],
    fold='test-set',
    env_config={
        'folds': {
            'training-set': [datetime.min, datetime(2008, 3, 18)],
            'test-set': [datetime(2008, 3, 19), datetime.max],
        }
    }
)
nr2episode

{1: <trading_gym.env.Episode at 0x7f75932f0710>,
 2: <trading_gym.env.Episode at 0x7f75932f0e10>,
 3: <trading_gym.env.Episode at 0x7f7592c05fd0>,
 4: <trading_gym.env.Episode at 0x7f75928e2780>,
 5: <trading_gym.env.Episode at 0x7f759257b908>,
 6: <trading_gym.env.Episode at 0x7f75928d7470>,
 7: <trading_gym.env.Episode at 0x7f7591fa4080>}

In [19]:
nr2episode.plot_weights()

[Interactive output: an integer slider `nr` and a scatter-plot `FigureWidget` of the weights (first trace: 'Cash(USD)')]

In [20]:
nr2episode.plot_levels()

[Interactive output: an integer slider `nr` and a scatter-plot `FigureWidget` of the levels (first trace: 'Strategy')]

In [21]:
nr2episode.plot_metrics_as_we_train()