# Introduction

**Executive summary:** use `WalkForwardRunner` to run your walk-forward training, and `WalkForwardResults` to restore the agents and visualize the results.

In [1]:
import ray
from ray import rllib, tune
import pandas as pd
import numpy as np
from datetime import datetime
import trading_gym
from trading_gym.registry.gaia.v7.env import GAIAPredictorsContinuousV7
from trading_gym.registry.gaia.v8.env import GAIAPredictorsContinuousV8
from trading_gym.ray.walkforward import WalkForwardRunner, WalkForwardResults
%matplotlib inline
print(trading_gym.__package__, trading_gym.__version__)
print(ray.__package__, ray.__version__)

trading-gym 0.8.0
ray 0.7.2


In [2]:
ray.init()

2019-07-16 16:21:22,657	INFO node.py:498 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-07-16_16-21-22_657135_626/logs.
2019-07-16 16:21:22,796	INFO services.py:409 -- Waiting for redis server at 127.0.0.1:52584 to respond...
2019-07-16 16:21:22,910	INFO services.py:409 -- Waiting for redis server at 127.0.0.1:25807 to respond...
2019-07-16 16:21:22,916	INFO services.py:806 -- Starting Redis shard with 6.72 GB max memory.
2019-07-16 16:21:22,964	INFO node.py:512 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-07-16_16-21-22_657135_626/logs.
2019-07-16 16:21:22,966	INFO services.py:1446 -- Starting the Plasma object store with 10.08 GB memory using /dev/shm.


{'node_ip_address': '192.168.81.206',
 'redis_address': '192.168.81.206:52584',
 'object_store_address': '/tmp/ray/session_2019-07-16_16-21-22_657135_626/sockets/plasma_store',
 'raylet_socket_name': '/tmp/ray/session_2019-07-16_16-21-22_657135_626/sockets/raylet',
 'webui_url': None,
 'session_dir': '/tmp/ray/session_2019-07-16_16-21-22_657135_626'}

# WalkForwardRunner
In pure `ray`, you would typically write something along the lines of the following:

    config = ray.rllib.agents.ppo.DEFAULT_CONFIG.copy()
    config['env'] = GAIAPredictorsContinuousV8
    config['env_config'] = {'cost_of_commissions': 0.00005, 'cost_of_spread': 0.0001}
    tune.run(
        ray.rllib.agents.ppo.PPOTrainer,  # same trainable that is passed to WalkForwardRunner below
        config=config,
        stop={'timesteps_total': 25000},
        checkpoint_freq=1,
        verbose=1,
    )
    
In this section we see how to reproduce the same logic using `WalkForwardRunner`. The benefit of using `WalkForwardRunner` rather than pure `ray` is that only the former lets you use `WalkForwardResults` to restore and visualize agents on different folds.

## Create the walk-forward partitions
It is the user's responsibility to create the training/test (and optionally validation) partitions that drive the walk-forward training. Note that a single training/test split is a special case of walk-forward training, so you can still run a simple 2-fold split.

In [3]:
partitions = list()
for year in range(2007, 2018):
    partition = {
        'training-set': [datetime.min, datetime(year, 12, 31)],
        'test-set': [datetime(year + 1, 1, 1), datetime(year + 1, 12, 31)],
    }
    partitions.append(partition)
partitions

[{'training-set': [datetime.datetime(1, 1, 1, 0, 0),
   datetime.datetime(2007, 12, 31, 0, 0)],
  'test-set': [datetime.datetime(2008, 1, 1, 0, 0),
   datetime.datetime(2008, 12, 31, 0, 0)]},
 {'training-set': [datetime.datetime(1, 1, 1, 0, 0),
   datetime.datetime(2008, 12, 31, 0, 0)],
  'test-set': [datetime.datetime(2009, 1, 1, 0, 0),
   datetime.datetime(2009, 12, 31, 0, 0)]},
 {'training-set': [datetime.datetime(1, 1, 1, 0, 0),
   datetime.datetime(2009, 12, 31, 0, 0)],
  'test-set': [datetime.datetime(2010, 1, 1, 0, 0),
   datetime.datetime(2010, 12, 31, 0, 0)]},
 {'training-set': [datetime.datetime(1, 1, 1, 0, 0),
   datetime.datetime(2010, 12, 31, 0, 0)],
  'test-set': [datetime.datetime(2011, 1, 1, 0, 0),
   datetime.datetime(2011, 12, 31, 0, 0)]},
 {'training-set': [datetime.datetime(1, 1, 1, 0, 0),
   datetime.datetime(2011, 12, 31, 0, 0)],
  'test-set': [datetime.datetime(2012, 1, 1, 0, 0),
   datetime.datetime(2012, 12, 31, 0, 0)]},
 {'training-set': [datetime.datetime(1, 
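The partitions built in the cell above use an expanding window: every training set starts at `datetime.min`. The sketch below wraps the same logic in a helper and adds a rolling-window variant plus a basic sanity check. The names `make_partitions` and `check_partitions` are hypothetical (they are not part of `trading_gym`), and it is an assumption that `WalkForwardRunner` accepts training sets that do not start at `datetime.min`.

```python
from datetime import datetime


def make_partitions(first_year, last_year, window_years=None):
    """Build yearly walk-forward partitions in the format used above.

    window_years=None gives an expanding window (training starts at
    datetime.min); an integer gives a rolling window of that many years.
    """
    partitions = []
    for year in range(first_year, last_year + 1):
        if window_years is None:
            train_start = datetime.min
        else:
            train_start = datetime(year - window_years + 1, 1, 1)
        partitions.append({
            'training-set': [train_start, datetime(year, 12, 31)],
            'test-set': [datetime(year + 1, 1, 1), datetime(year + 1, 12, 31)],
        })
    return partitions


def check_partitions(partitions):
    """Raise ValueError if a test set overlaps its training set or an earlier test set."""
    previous_test_end = None
    for partition in partitions:
        train_start, train_end = partition['training-set']
        test_start, test_end = partition['test-set']
        if not (train_start <= train_end < test_start <= test_end):
            raise ValueError('test set must start after the training set ends')
        if previous_test_end is not None and test_start <= previous_test_end:
            raise ValueError('test sets must not overlap')
        previous_test_end = test_end


expanding = make_partitions(2007, 2017)                  # same partitions as above
rolling = make_partitions(2007, 2017, window_years=3)    # 3-year rolling window
check_partitions(expanding)
check_partitions(rolling)
```

Keeping the check separate from the construction makes it usable on hand-written partitions too, e.g. irregular re-training dates.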

## Create the config dict


In [4]:
config = ray.rllib.agents.ppo.DEFAULT_CONFIG.copy()
config['env'] = GAIAPredictorsContinuousV8
config['env_config'] = {
    'cost_of_commissions': tune.grid_search([0, 0.00001, 0.0001, 0.001, 0.01, 0.1]),
    'cost_of_spread': 0.0001,
}
config['gamma'] = 0.
config['clip_param'] = 0.8
config['entropy_coeff'] = 1e-5
config['use_gae'] = False
config['vf_share_layers'] = True
config['kl_coeff'] = 0.2
config['kl_target'] = 0.01
config['lambda'] = 0.
config['vf_loss_coeff'] = 0.
config['vf_clip_param'] = 0.
config['batch_mode'] = 'complete_episodes'

## Run your walk-forward experiment

In [5]:
walk_forward = WalkForwardRunner(
    env_partitions=partitions,
    trainable=ray.rllib.agents.ppo.PPOTrainer,
    config=config,
    stop={'timesteps_total': 500000},
    checkpoint_freq=1,
)

Note that `WalkForwardRunner` has constructed the implied ray `Experiment` objects from your walk-forward settings, one per partition.

In [6]:
walk_forward.experiments

[<ray.tune.experiment.Experiment at 0x7f35c8c706a0>,
 <ray.tune.experiment.Experiment at 0x7f35c8c706d8>,
 <ray.tune.experiment.Experiment at 0x7f35c8c70cf8>,
 <ray.tune.experiment.Experiment at 0x7f35c8c70b38>,
 <ray.tune.experiment.Experiment at 0x7f35c8c70a58>,
 <ray.tune.experiment.Experiment at 0x7f35c8c70978>,
 <ray.tune.experiment.Experiment at 0x7f35c8c707f0>,
 <ray.tune.experiment.Experiment at 0x7f35c8c70e10>,
 <ray.tune.experiment.Experiment at 0x7f35d83c4cf8>,
 <ray.tune.experiment.Experiment at 0x7f35d83c4fd0>,
 <ray.tune.experiment.Experiment at 0x7f35d83c4f98>]

Note that trials are associated with a `RestoreID`. This `ID` is all you need to restore an agent. Here we are using a grid search over six values of `cost_of_commissions` on 11 partitions, so we have 11 experiments, each of which expands into 6 trials (66 trials in total).
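The amount of training work is the product of the number of partitions and the number of grid-search values. The sketch below enumerates the combinations with plain Python; it assumes one `Experiment` per partition and one trial per grid value within it, and `enumerate_trials` is a hypothetical helper, not part of `trading_gym` or `ray`.

```python
from itertools import product


def enumerate_trials(partitions, grid_values):
    """Return every (partition, grid value) pair that has to be trained."""
    return list(product(partitions, grid_values))


# Partitions are identified here by the last year of their training set.
training_end_years = list(range(2007, 2018))
commission_grid = [0, 0.00001, 0.0001, 0.001, 0.01, 0.1]

# Two partitions with two grid values, and the full setup defined earlier.
small = enumerate_trials(training_end_years[:2], commission_grid[:2])
full = enumerate_trials(training_end_years, commission_grid)
print(len(small), len(full))
```

With two partitions and two grid values this gives 4 combinations; with the 11 partitions and 6 values defined earlier it gives 66.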

In [7]:
run = False
if run:
    trials = tune.run_experiments(walk_forward.experiments, verbose=0)
    print(trials)

# WalkForwardResults

`WalkForwardResults` is the 'controller' of all your walk-forward results across all the environments you have solved using `WalkForwardRunner`. If you print the class instance, you will see the list of all environments that you have solved.

In [8]:
results = WalkForwardResults(r'/home/federico/Desktop/repos/trading-gym/notebooks/registry/gaia/v8/logs')
results

WalkForwardResults(['GAIAPredictorsContinuousV8'])

Let's select the results associated with a particular environment.

In [9]:
env_results = results['GAIAPredictorsContinuousV8']
env_results

EnvResults(GAIAPredictorsContinuousV8)

To assess an agent on an env you need an agent and an env; that is all we are doing here. Steps:

1. `EnvResults.make_env`: restores an environment and lets you pass a new `env_config` (e.g. different transaction costs). This flexibility lets you train with one `env_config` (e.g. unrealistically high transaction costs) and assess with another (e.g. realistic transaction costs).
2. `EnvResults.make_policy`: builds an implementation of `AbstractPolicy`, which is needed to sample episodes from `TradingEnv` and thus render the results. The arguments of this method let you specify:
    1. Which of the agent's checkpoints to use (last $n$). If left blank, the last checkpoint (i.e. the last training iteration) is used. If $n>1$, a list of actions is returned (one for each agent).
    2. Whether or not to create an ensemble from the previously trained agents.
3. `episode.TradingEnv.sample_episode(policy)`: returns an instance of `Episode`. Charts and tables can be produced using `Episode.render`. `episode` stores all the information you need to assess your agent.
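To make the ensemble option in step 2 concrete, here is a minimal sketch of one way several restored agents could be combined: average their actions element-wise. This is an illustration only. How `make_policy` actually combines agents is not stated here, and `ensemble_policy` and the toy policies are hypothetical names.

```python
from typing import Callable, List, Sequence

Action = List[float]
Policy = Callable[[Sequence[float]], Action]


def ensemble_policy(policies: Sequence[Policy]) -> Policy:
    """Combine several policies by averaging their actions element-wise."""
    if not policies:
        raise ValueError('at least one policy is required')

    def act(observation: Sequence[float]) -> Action:
        # One action vector per underlying policy.
        actions = [policy(observation) for policy in policies]
        # zip(*actions) groups the i-th component of every action together.
        return [sum(component) / len(component) for component in zip(*actions)]

    return act


# Two toy policies standing in for agents restored from two checkpoints.
def all_in_first_asset(observation):
    return [1.0, 0.0]


def all_in_second_asset(observation):
    return [0.0, 1.0]


combined = ensemble_policy([all_in_first_asset, all_in_second_asset])
print(combined([0.0] * 5))
```

Averaging is only one possible choice; a median or a performance-weighted mean would fit the same interface.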

In [10]:
# Step 1.
env = env_results.make_env(
    env_config={
        'cost_of_commissions': 0,
        'cost_of_spread': 0,
        'folds': {
            'training-set': [datetime.min, datetime(2008, 3, 18)],
            'test-set': [datetime(2008, 3, 19), datetime.max],
        }
    },
)

Note that these are the same `RestoreID`s that we saw in the previous section.

In [11]:
env_results.restore_ids

{-8480896234107644715: [AgentResults(GAIAPredictorsContinuousV8_1-01-01_to_2017-12-31/PPO_GAIAPredictorsContinuousV8_2_cost_of_commissions=0.0001,restoreID=-8480896234107644715_2019-07-16_03-26-596eqq0x_b),
  AgentResults(GAIAPredictorsContinuousV8_1-01-01_to_2010-12-31/PPO_GAIAPredictorsContinuousV8_2_cost_of_commissions=0.0001,restoreID=-8480896234107644715_2019-07-15_22-22-52qxe0__3s),
  AgentResults(GAIAPredictorsContinuousV8_1-01-01_to_2007-12-31/PPO_GAIAPredictorsContinuousV8_2_cost_of_commissions=0.0001,restoreID=-8480896234107644715_2019-07-15_20-12-29kpscyyqe),
  AgentResults(GAIAPredictorsContinuousV8_1-01-01_to_2009-12-31/PPO_GAIAPredictorsContinuousV8_2_cost_of_commissions=0.0001,restoreID=-8480896234107644715_2019-07-15_21-39-337r81c0k0),
  AgentResults(GAIAPredictorsContinuousV8_1-01-01_to_2016-12-31/PPO_GAIAPredictorsContinuousV8_2_cost_of_commissions=0.0001,restoreID=-8480896234107644715_2019-07-16_02-43-367n3q58kx),
  AgentResults(GAIAPredictorsContinuousV8_1-01-01_to_

In [12]:
cost2restore_id = {
    0.:      2717662117527193638,
    0.1:     5510258102469832729,
    0.01:    5455102632216835301,
    0.001:  -8874778998028814031,
    0.0001: -8480896234107644715,
    0.00001: 155743597065357471,    
}

In [13]:
nr2episode = dict()
for cost_of_commissions, restore_id in cost2restore_id.items():
    nr2episode[cost_of_commissions] = env_results.get_nr2episode(
        restore_id=restore_id,
        checkpoint_nrs=np.arange(1, 126, 1),
        fold='test-set',
        env_config={
            'folds': {
                'training-set': [datetime.min, datetime(2008, 3, 18)],
                'test-set': [datetime(2008, 3, 19), datetime.max],
            }
        }
    )


## Results

### cost_of_commissions=0
The first thing to do is to establish a decent stopping point.

In [14]:
cost_of_commissions = 0.
nr2episode[cost_of_commissions].plot_weights()

[Interactive plot: an `nr` slider and a `FigureWidget`; traces include 'Cash(USD)']

In [15]:
nr2episode[cost_of_commissions].plot_levels()

[Interactive plot: an `nr` slider and a `FigureWidget`; traces include 'Strategy']

In [16]:
nr2episode[cost_of_commissions].plot_metrics_as_we_train()

In [17]:
nr_checkpoint = 101
#nr2episode[cost_of_commissions][nr_checkpoint].renderer.tearsheet()

### cost_of_commissions=0.00001

In [18]:
cost_of_commissions = 0.00001
nr2episode[cost_of_commissions].plot_weights()

[Interactive plot: an `nr` slider and a `FigureWidget`; traces include 'Cash(USD)']

In [19]:
nr2episode[cost_of_commissions].plot_levels()

[Interactive plot: an `nr` slider and a `FigureWidget`; traces include 'Strategy']

In [20]:
nr2episode[cost_of_commissions].plot_metrics_as_we_train()

### cost_of_commissions=0.0001

In [21]:
cost_of_commissions = 0.0001
nr2episode[cost_of_commissions].plot_weights()

[Interactive plot: an `nr` slider and a `FigureWidget`; traces include 'Cash(USD)']

In [22]:
nr2episode[cost_of_commissions].plot_levels()

[Interactive plot: an `nr` slider and a `FigureWidget`; traces include 'Strategy']

### cost_of_commissions=0.001

In [37]:
cost_of_commissions = 0.001
nr2episode[cost_of_commissions].plot_weights()

[Interactive plot: an `nr` slider and a `FigureWidget`; traces include 'Cash(USD)']

In [38]:
nr2episode[cost_of_commissions].plot_levels()

[Interactive plot: an `nr` slider and a `FigureWidget`; traces include 'Strategy']

In [39]:
nr2episode[cost_of_commissions].plot_metrics_as_we_train()

### cost_of_commissions=0.01

In [24]:
cost_of_commissions = 0.01
nr2episode[cost_of_commissions].plot_weights()

[Interactive plot: an `nr` slider and a `FigureWidget`; traces include 'Cash(USD)']

In [25]:
nr2episode[cost_of_commissions].plot_levels()

[Interactive plot: an `nr` slider and a `FigureWidget`; traces include 'Strategy']

In [26]:
nr2episode[cost_of_commissions].plot_metrics_as_we_train()

### cost_of_commissions=0.1

In [27]:
cost_of_commissions = 0.1
nr2episode[cost_of_commissions].plot_weights()

[Interactive plot: an `nr` slider and a `FigureWidget`; traces include 'Cash(USD)']

In [28]:
nr2episode[cost_of_commissions].plot_levels()

[Interactive plot: an `nr` slider and a `FigureWidget`; traces include 'Strategy']

In [29]:
nr2episode[cost_of_commissions].plot_metrics_as_we_train()

## Use case: render results over the combined test-folds
`episode.renderer` is probably the single most useful attribute of `Episode` to visualize results, but you are invited to explore other attributes such as `episode.states` or `episode.actions`.

In [46]:
# Step 2.
policy = env_results.make_policy(
    env=env,
    restore_id=5455102632216835301,
    checkpoint_nr=125,  # use None (or don't specify) to use last checkpoint available
)
policy

<trading_gym.ray.walkforward.policy.WalkForwardPolicy at 0x7f305179a668>

In [47]:
# Step 3.
episode = env.sample_episode(fold='test-set', policy=policy, verbose=False)

In [48]:
episode.renderer.cumulative_performance.to_plotly()
episode.renderer.target_weights.to_plotly()
episode.renderer.annual_returns.to_plotly()
episode.renderer.tearsheet()

| | | Strategy | Index(Aric-Benchmark) | Index(USD 1M Deposit) | Cash(USD) | ETF(Russell 1000, SMART, USD) | ETF(7-10Y T-Bills, SMART, USD) |
|---|---|---|---|---|---|---|---|
| Context | From | 2008-03-19 | 2008-03-19 | 2008-03-19 | 2008-03-19 | 2008-03-19 | 2008-03-19 |
| Context | To | 2018-08-28 | 2018-08-28 | 2018-08-28 | 2018-08-28 | 2018-08-28 | 2018-08-28 |
| Context | Years | 10.4493 | 10.4493 | 10.4493 | 10.4493 | 10.4493 | 10.4493 |
| Context | Observations | 2725 | 2725 | 2725 | 2725 | 2725 | 2725 |
| Context | Risk-free asset | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) |
| Context | Risk-free CAGR | 0.00681294 | 0.00681294 | 0.00681294 | 0.00681294 | 0.00681294 | 0.00681294 |
| Return | CAGR | 0.141268 | 0.158586 | 0.00681294 | 0 | 0.104507 | 0.0339243 |
| Return | CAGR over cash | 0.134455 | 0.151773 | 0 | -0.00681294 | 0.0976941 | 0.0271113 |
| Return | Overall return | 2.97796 | 3.65592 | 0.0735266 | 0 | 1.82541 | 0.417089 |
| Risk | Volatility | 0.109026 | 0.0970738 | 0.000598812 | 0 | 0.197859 | 0.0766871 |
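The return rows of the tearsheet can be cross-checked by hand. The sketch below assumes the standard compound annual growth rate formula and a 365-day year; both are assumptions about how the tearsheet is computed, although they appear consistent with the figures shown. The helper names are hypothetical.

```python
from datetime import date


def years_between(start, end, days_per_year=365):
    """Length of the period in years, assuming a fixed number of days per year."""
    return (end - start).days / days_per_year


def cagr(overall_return, years):
    """Compound annual growth rate implied by an overall return over `years` years."""
    return (1.0 + overall_return) ** (1.0 / years) - 1.0


# Inputs copied from the 'From', 'To' and 'Overall return' rows above.
years = years_between(date(2008, 3, 19), date(2018, 8, 28))
strategy_cagr = cagr(2.97796, years)      # 'Strategy' column
risk_free_cagr = cagr(0.0735266, years)   # 'Index(USD 1M Deposit)' column
cagr_over_cash = strategy_cagr - risk_free_cagr

print(round(years, 4), round(strategy_cagr, 4), round(cagr_over_cash, 4))
```

Under these assumptions the results should agree with the `Years`, `CAGR` and `CAGR over cash` rows up to rounding.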


## Use case: visualize time passed since last retraining
In the previous sections we used `WalkForwardRunner` to run a walk-forward optimization that re-trains every year. In other circumstances, re-training might follow more complex patterns. For example, it might occur on an irregular basis, e.g. whenever there is a structural break in the markets. It can therefore be useful to visualize the "age" of the model used on a given day. The older the model, the higher the risk that the dynamics of the system have changed and that the model is outdated.

In [49]:
history = policy.history()
history

| Date | AgeInDays | FoldEnd | RestoreID |
|---|---|---|---|
| 2008-01-01 | 1 | 2007-12-31 00:00:00 | 5455102632216835301 |
| 2008-01-02 | 2 | 2007-12-31 00:00:00 | 5455102632216835301 |
| 2008-01-03 | 3 | 2007-12-31 00:00:00 | 5455102632216835301 |
| 2008-01-04 | 4 | 2007-12-31 00:00:00 | 5455102632216835301 |
| 2008-01-07 | 7 | 2007-12-31 00:00:00 | 5455102632216835301 |
| 2008-01-08 | 8 | 2007-12-31 00:00:00 | 5455102632216835301 |
| 2008-01-09 | 9 | 2007-12-31 00:00:00 | 5455102632216835301 |
| 2008-01-10 | 10 | 2007-12-31 00:00:00 | 5455102632216835301 |
| 2008-01-11 | 11 | 2007-12-31 00:00:00 | 5455102632216835301 |
| 2008-01-14 | 14 | 2007-12-31 00:00:00 | 5455102632216835301 |


In [50]:
history['AgeInDays'].iplot(
    title='Age in days of the most recent model available on the date shown on the x-axis<br>Whenever the count resets, the model has been re-trained',
    yTitle='Nr of calendar days',
    fill=True,
)
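The `AgeInDays` column can be reproduced from the fold end dates alone. The sketch below is a minimal stand-alone version that assumes the model in use on a given day is the one whose training fold ended most recently before that day; `model_age_in_days` is a hypothetical helper, not the implementation behind `policy.history()`.

```python
from bisect import bisect_left
from datetime import date


def model_age_in_days(day, fold_ends):
    """Calendar days since the end of the latest training fold strictly before `day`."""
    ends = sorted(fold_ends)
    position = bisect_left(ends, day)  # number of fold ends strictly before `day`
    if position == 0:
        raise ValueError('no model has been trained before this date')
    return (day - ends[position - 1]).days


# Yearly re-training, as in the partitions used in this notebook.
fold_ends = [date(year, 12, 31) for year in range(2007, 2018)]

for day in (date(2008, 1, 1), date(2008, 1, 7), date(2008, 12, 31), date(2009, 1, 1)):
    print(day, model_age_in_days(day, fold_ends))
```

Because the function only needs a list of dates, it works equally well for irregular re-training schedules.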