In [None]:
# TODO: doctest this .ipynb with pytest.

# Introduction

With `trading-gym` you can:
1. Create custom trading environments (e.g. custom features, reward, assets)
2. Use previously created environments from the registry.

In this notebook we will solve a previously created environment with ray.

# Playground

## Instance the environment

Load the environment `GAIAPredictorsContinuousV7`, which is the latest version of what Aric wanted me to solve.

In [3]:
import trading_gym
from trading_gym.registry.gaia.v7.env import GAIAPredictorsContinuousV7
from datetime import datetime
from collections import namedtuple
import json
import os
import pandas as pd
import ray
print(datetime.now())
print(trading_gym.__name__, trading_gym.__version__)
print(ray.__name__, ray.__version__)

2019-08-20 22:29:02.287199
trading_gym 0.8.1
ray 0.7.3


`ray` requires all custom environments (such as those created with `trading-gym`) to accept the optional argument `env_config` in their initializer (e.g. see [this example](https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/custom_env.py) or [read the docs](https://ray.readthedocs.io/en/latest/rllib-env.html#rllib-environments)). `env_config` lets you pass custom configuration to the environment.
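
As a minimal sketch of that convention, the toy class below accepts an optional `env_config` dictionary and falls back to defaults when it is omitted. `ToyEnv` and its `episode_length` key are hypothetical names used only for illustration; a real environment would subclass `gym.Env` and define observation and action spaces.

```python
class ToyEnv:
    """Hypothetical stand-in for a custom environment (not part of trading-gym)."""

    def __init__(self, env_config=None):
        # ray passes a dict-like `env_config`; default to an empty one so the
        # environment can also be created by hand without arguments.
        env_config = dict(env_config or {})
        self.episode_length = env_config.get('episode_length', 5)
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        return [float(self.t)], 0.0, done, {}  # state, reward, done, info


env_default = ToyEnv()
env_custom = ToyEnv({'episode_length': 2})
```

Keeping every tunable option inside `env_config` means the same dictionary can be reused when the environment is created by hand and when it is created by `ray`.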

In our case, we want to split the env's data into two folds: in-sample and out-of-sample.

In [4]:
env_config = dict()
env_config['folds'] =  {
    'training-set': [datetime.min, datetime(2008, 3, 18)],
    'test-set': [datetime(2008, 3, 19), datetime.max],
}
env = GAIAPredictorsContinuousV7(env_config)
env

<trading_gym.registry.gaia.v7.env.GAIAPredictorsContinuousV7 at 0x7f355e13fd68>
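
How the environment applies the folds internally is not shown in this notebook. One plausible reading, sketched below with made-up data, is that each fold keeps the observations whose timestamp falls inside its `[start, end]` window (bounds assumed inclusive). `select_fold` is a hypothetical helper, not a `trading-gym` function.

```python
from datetime import datetime

folds = {
    'training-set': [datetime.min, datetime(2008, 3, 18)],
    'test-set': [datetime(2008, 3, 19), datetime.max],
}

# Made-up observations keyed by timestamp.
data = {
    datetime(2008, 3, 17): 1.0,
    datetime(2008, 3, 18): 2.0,
    datetime(2008, 3, 19): 3.0,
    datetime(2008, 3, 20): 4.0,
}


def select_fold(data, folds, name):
    """Keep the observations whose timestamp lies in the fold (inclusive bounds assumed)."""
    start, end = folds[name]
    return {t: v for t, v in data.items() if start <= t <= end}


train = select_fold(data, folds, 'training-set')
test = select_fold(data, folds, 'test-set')
```

Using `datetime.min` and `datetime.max` as open-ended bounds avoids hard-coding the first and last dates of the data set.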

## Observation space

The observation space (state) is a vector of 3 real numbers (this can be customized):
1. GAIA predictor for Russell 1000
2. Change in the GAIA predictor for Russell 1000 (current value minus previous value)
3. Dummy variable indicating if GAIA predictor for Russell 1000 is zero (i.e. not available).

In [5]:
env.observation_space

Box(3,)

## Action space

`Simplex` is ray's action space associated with the Dirichlet distribution ([reference](https://github.com/ray-project/ray/issues/4440)).

Note that the Simplex (Dirichlet) space is 2-dimensional although there are 3 contracts. In this particular environment, `Cash` is always assumed to be 0% (design choice), so the agent does not get a choice on the target amount of cash in the portfolio.

In [4]:
env.action_space

Simplex((2,); [1, 1])

In [5]:
env.action_space.contracts

[Cash(USD), ETF(Russell 1000, SMART, USD), ETF(7-10Y T-Bills, SMART, USD)]

In [6]:
env.action_space.sample()

array([0.9516352 , 0.04836483], dtype=float32)
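
To make the dimensionality concrete, the sketch below draws a point from a flat Dirichlet distribution over two assets using only the standard library, then prepends the fixed 0% cash weight to obtain target weights for all three contracts. The helper names are hypothetical and this is not how `Simplex.sample` is implemented; the `[1, 1]` in the printed space is assumed to be the Dirichlet concentration parameters.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw from Dirichlet(alphas) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def to_target_weights(action, cash_weight=0.0):
    """Prepend the fixed cash weight to the two ETF weights."""
    return [cash_weight] + list(action)


rng = random.Random(0)
action = sample_dirichlet([1.0, 1.0], rng)  # 2 numbers summing to 1
weights = to_target_weights(action)         # cash, Russell 1000, T-Bills
```

The two sampled numbers always sum to one, so fixing cash at zero leaves the three-asset portfolio fully invested.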

## Interact with the environment
Interact with the environment until the episode has finished. This API is exactly the same as the one implemented in `openai-gym`, so if you are not familiar with it please [read the docs](https://gym.openai.com/docs/).

At each step (associated with the timestamp `env.now`), the agent uses the state $\mathbf{s} \in \mathbb{R}^3$ to take the action $\mathbf{a} \in \mathbb{R}^3$, subject to $\sum_{i=1}^{3} a_i=1$, representing the target weights of the portfolio (since the cash weight is fixed at 0, only the two ETF weights are actually chosen). If the target weights differ from the current weights, the portfolio is rebalanced: assets are sold or bought to bring the current weights to their target levels. The reward $r \in \mathbb{R}$ is the simple percentage change in the agent's net liquidation value in dollars.
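
The reward and the rebalancing described here can be written out in a few lines. This is a simplified sketch that ignores commissions and spreads; the function names and the numbers are illustrative assumptions, not `trading-gym` code.

```python
def simple_return(nlv_before, nlv_after):
    """Reward: simple percentage change of the net liquidation value."""
    return nlv_after / nlv_before - 1.0


def rebalancing_trades(current_weights, target_weights, nlv):
    """Dollar amount to buy (+) or sell (-) per asset to reach the target weights."""
    return [(target - current) * nlv
            for current, target in zip(current_weights, target_weights)]


# Portfolio worth $100, currently 50/50, with a 70/30 target (cash fixed at 0).
trades = rebalancing_trades([0.0, 0.5, 0.5], [0.0, 0.7, 0.3], nlv=100.0)
reward = simple_return(100.0, 100.5)
```

Because the weights sum to one before and after, the trades net out to zero dollars when costs are ignored.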

In [8]:
Interaction = namedtuple('Interaction', ['time', 's', 'a', 'r', "s_prime"])

state = env.reset()  # 'training-set' by default
done = False
while not done:
    old_state = state
    action = env.action_space.sample()
    state, reward, done, info = env.step(action)
    
    # This print should show that something is going on.
    print(Interaction(env.now, old_state, action, reward, state), '\n')

print('End of the episode, you are done.')

Interaction(time=Timestamp('2007-09-03 00:00:00'), s=array([ 1.2695895 ,  2.67764494, -1.        ]), a=array([0.5274689, 0.4725311], dtype=float32), r=5.946156874547803e-05, s_prime=array([ 1.86227179,  1.2500046 , -1.        ])) 

Interaction(time=Timestamp('2007-09-04 00:00:00'), s=array([ 1.86227179,  1.2500046 , -1.        ]), a=array([0.560311, 0.439689], dtype=float32), r=0.005561203789692604, s_prime=array([ 2.2995324 ,  0.92221043, -1.        ])) 

Interaction(time=Timestamp('2007-09-05 00:00:00'), s=array([ 2.2995324 ,  0.92221043, -1.        ]), a=array([0.7738601 , 0.22613987], dtype=float32), r=-0.0065893325685768556, s_prime=array([ 2.12572615, -0.36656843, -1.        ])) 

Interaction(time=Timestamp('2007-09-06 00:00:00'), s=array([ 2.12572615, -0.36656843, -1.        ]), a=array([0.5096948, 0.4903052], dtype=float32), r=0.0011510323569501324, s_prime=array([ 1.95916477, -0.35128853, -1.        ])) 

Interaction(time=Timestamp('2007-09-07 00:00:00'), s=array([ 1.95916477,

## Visualizations
All visualizations already implemented in `trading-gym` can be found in `Renderer`. From this class, you can produce visualizations (using the method `.to_plotly`) or retrieve the underlying data (`.to_frame`).

In [9]:
renderer = env.render()
renderer

<trading_gym.renderers.Renderer at 0x7fbc2615bb00>

In [10]:
# Renderer's attributes.
list(vars(renderer).keys())

['track_record',
 'benchmark',
 'risk_free',
 'cost_of_commissions',
 'cost_of_spread',
 'target_weights',
 'pnl',
 'performance_contribution',
 'risk_contribution',
 'cumulative_performance',
 'level',
 'capm']

We now visualize a few of them:
1. `target_weights`: historical actions
2. `pnl`: (cumulative) profit and losses through time
3. `cumulative_performance`: (cumulative) profit and losses through time as a percentage of the starting capital ($100 by default)

In [11]:
renderer.target_weights.to_plotly()
renderer.pnl.to_plotly()
renderer.cumulative_performance.to_plotly()

## Tearsheet
Common financial metrics to assess return, risk, and risk-adjusted return of an investment. These numbers are incredibly useful when displayed in `tensorboard` over the number of training iterations.

We can (optionally) pass a few arguments to `tearsheet` to enrich the set of metrics returned.

Descriptions and source code for these metrics can be found in `trading_gym.metrics`; they are also standard metrics that are easy to look up online.

In [12]:
levels = renderer.level.to_frame()
levels.tearsheet(
    risk_free=env._load_risk_free(),
    benchmark=env._load_benchmark(),
    weights=renderer.track_record.to_frame('weights_target'),
)

|  |  | Strategy | Index(Aric-Benchmark) | Index(USD 1M Deposit) | Cash(USD) | ETF(Russell 1000, SMART, USD) | ETF(7-10Y T-Bills, SMART, USD) |
|---|---|---|---|---|---|---|---|
| Context | From | 2007-08-31 | 2007-08-31 | 2007-08-31 | 2007-08-31 | 2007-08-31 | 2007-08-31 |
| Context | To | 2007-09-27 | 2007-09-27 | 2007-09-27 | 2007-09-27 | 2007-09-27 | 2007-09-27 |
| Context | Years | 0.0739726 | 0.0739726 | 0.0739726 | 0.0739726 | 0.0739726 | 0.0739726 |
| Context | Observations | 20 | 20 | 20 | 20 | 20 | 20 |
| Context | Risk-free asset | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) |
| Context | Risk-free CAGR | 0.0572772 | 0.0572772 | 0.0572772 | 0.0572772 | 0.0572772 | 0.0572772 |
| Return | CAGR | 0.476049 | 0.724249 | 0.0572772 | 0 | 0.724159 | 0.00951005 |
| Return | CAGR over cash | 0.418771 | 0.666972 | 0 | -0.0572772 | 0.666882 | -0.0477672 |
| Return | Overall return | 0.0292214 | 0.0411227 | 0.00412855 | 0 | 0.0411187 | 0.000700404 |
| Risk | Volatility | 0.0756612 | 0.155471 | 0.00204313 | 0 | 0.155472 | 0.0750822 |
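
The return rows of this tearsheet can be cross-checked by hand. The sketch below assumes that `Years` is the number of calendar days divided by 365 and that `CAGR` compounds the overall return over that span. These formulas are inferred from the numbers in the table, not taken from `trading_gym.metrics`.

```python
from datetime import date


def years_between(start, end):
    """Elapsed time in years, assuming a 365-day year."""
    return (end - start).days / 365


def cagr(overall_return, years):
    """Compound annual growth rate implied by an overall return."""
    return (1.0 + overall_return) ** (1.0 / years) - 1.0


years = years_between(date(2007, 8, 31), date(2007, 9, 27))
strategy_cagr = cagr(0.0292214, years)      # 'Strategy' overall return
cagr_over_cash = strategy_cagr - 0.0572772  # minus the risk-free CAGR
print(years, strategy_cagr, cagr_over_cash)
```

Under these assumptions the printed values closely match the `Years`, `CAGR` and `CAGR over cash` entries of the `Strategy` column. With less than a month of data, annualizing inflates small moves considerably, so the CAGR figures here say little about long-run performance.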


## Out of sample assessment of a (random) agent
The ONLY difference is to specify the 'test-set' when resetting the environment ('training-set' by default). So here I'm just copy-pasting the code that we previously used to interact with the environment.

In practice, the action should not be a random action but an output of your policy network, and you want the results ('Strategy') to look as good as possible out-of-sample. If the strategy is as good as Aric's benchmark, then the RL solution might be deployed in production.

In [13]:
# Interact with the environment.
state = env.reset(fold='test-set')  # 'training-set' by default
done = False
while not done:
    action = env.action_space.sample()
    state, reward, done, info = env.step(action)
    
# Render results (your (random) agent is labelled as 'Strategy').
renderer = env.render()
renderer.plotly_report()
renderer.tearsheet()

|  |  | Strategy | Index(Aric-Benchmark) | Index(USD 1M Deposit) | Cash(USD) | ETF(Russell 1000, SMART, USD) | ETF(7-10Y T-Bills, SMART, USD) |
|---|---|---|---|---|---|---|---|
| Context | From | 2008-03-19 | 2008-03-19 | 2008-03-19 | 2008-03-19 | 2008-03-19 | 2008-03-19 |
| Context | To | 2018-08-28 | 2018-08-28 | 2018-08-28 | 2018-08-28 | 2018-08-28 | 2018-08-28 |
| Context | Years | 10.4493 | 10.4493 | 10.4493 | 10.4493 | 10.4493 | 10.4493 |
| Context | Observations | 2725 | 2725 | 2725 | 2725 | 2725 | 2725 |
| Context | Risk-free asset | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) |
| Context | Risk-free CAGR | 0.00681294 | 0.00681294 | 0.00681294 | 0.00681294 | 0.00681294 | 0.00681294 |
| Return | CAGR | 0.104861 | 0.158586 | 0.00681294 | 0 | 0.104507 | 0.0339243 |
| Return | CAGR over cash | 0.0980485 | 0.151773 | 0 | -0.00681294 | 0.0976941 | 0.0271113 |
| Return | Overall return | 1.83489 | 3.65592 | 0.0735266 | 0 | 1.82541 | 0.417089 |
| Risk | Volatility | 0.086638 | 0.0970738 | 0.000598812 | 0 | 0.197859 | 0.0766871 |
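
Replacing the random action with a policy amounts to swapping `env.action_space.sample()` for a function of the current state. The sketch below shows that pattern on a tiny stand-in environment; `rollout`, `CountdownEnv` and `constant_policy` are hypothetical names, and how a trained agent maps a state to an action depends on the library used.

```python
class CountdownEnv:
    """Hypothetical gym-like environment that ends after `n_steps` steps."""

    def __init__(self, n_steps=3):
        self.n_steps = n_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0, 0.0, -1.0]

    def step(self, action):
        self.t += 1
        reward = 0.01 * action[0]  # made-up reward
        return [float(self.t), 0.0, -1.0], reward, self.t >= self.n_steps, {}


def rollout(env, policy):
    """Run one episode, choosing each action as policy(state)."""
    state, done, rewards = env.reset(), False, []
    while not done:
        state, reward, done, _ = env.step(policy(state))
        rewards.append(reward)
    return rewards


def constant_policy(state):
    """Always target 60% in the first ETF and 40% in the second."""
    return [0.6, 0.4]


rewards = rollout(CountdownEnv(n_steps=3), constant_policy)
```

Passing the policy as a plain callable keeps the evaluation loop identical for random, hand-written and learned policies.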


# Solve GAIA-v7 with ray (2-folds split)

In [6]:
import ray
from ray import rllib, tune
from trading_gym.ray.logger import calculate_tearsheet, CustomLogger
from copy import deepcopy
ray.init()
ray.__version__

2019-06-21 16:20:32,466	INFO node.py:498 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-06-21_16-20-32_465718_36057/logs.
2019-06-21 16:20:32,609	INFO services.py:409 -- Waiting for redis server at 127.0.0.1:23371 to respond...
2019-06-21 16:20:32,749	INFO services.py:409 -- Waiting for redis server at 127.0.0.1:19700 to respond...
2019-06-21 16:20:32,755	INFO services.py:806 -- Starting Redis shard with 10.0 GB max memory.
2019-06-21 16:20:32,834	INFO node.py:512 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-06-21_16-20-32_465718_36057/logs.
2019-06-21 16:20:32,840	INFO services.py:1442 -- Starting the Plasma object store with 20.0 GB memory using /dev/shm.


'0.7.1'

## Setting up the experiment

In [17]:
config = rllib.agents.ppo.DEFAULT_CONFIG.copy()
config['env'] = GAIAPredictorsContinuousV7
config['gamma'] = tune.grid_search([0])
config['num_workers'] = 6
config['callbacks']['on_train_result'] = tune.function(calculate_tearsheet)
config['entropy_coeff'] = tune.grid_search([1e-5])
config['batch_mode'] = 'complete_episodes'
config['model']['use_lstm'] = False  # model option: keep the feed-forward network
config['lr'] = tune.grid_search([1e-5])
config['num_sgd_iter'] = tune.grid_search([8])
config['sgd_minibatch_size'] = 128
config['train_batch_size'] = tune.grid_search([4000])
config['use_gae'] = tune.grid_search([False])
config['vf_share_layers'] = False
config['vf_loss_coeff'] = tune.grid_search([0.])
config['vf_clip_param'] = tune.grid_search([0.])
config['lambda'] = tune.grid_search([0.])
config['kl_coeff'] = 0.2
config['kl_target'] = 0.01
config['clip_param'] = 0.55

Note that we pass the previously defined `env_config` here, which will be used internally by `ray` to initialize the env(s).

In [18]:
config['env_config'] = env_config

In [29]:
experiment = tune.Experiment(
    name='Playground-2folds',
    run=rllib.agents.ppo.PPOTrainer,
    stop={"timesteps_total": 1000000},
    config=deepcopy(config),
    num_samples=1,
    local_dir='logs',
    #checkpoint_freq=int(1e4 / config['train_batch_size']),  # checkpoint every 100k iters
    checkpoint_at_end=True,
    max_failures=0,
    loggers=[CustomLogger],
)
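
Two quick sanity checks on this setup, written as plain arithmetic. `tune.grid_search` launches one trial per combination of the listed values, so single-element lists yield a single trial. With `train_batch_size=4000`, the stop condition of 1,000,000 timesteps corresponds to about 250 training iterations, assuming each iteration collects exactly one train batch. The `grid` dictionary below simply restates a few values from the configuration cell; it is not how `tune` represents them internally.

```python
from itertools import product

# Values restated from the configuration cell (one candidate each).
grid = {
    'gamma': [0],
    'entropy_coeff': [1e-5],
    'lr': [1e-5],
    'num_sgd_iter': [8],
    'train_batch_size': [4000],
}

# One trial per element of the Cartesian product of the candidate lists.
variants = [dict(zip(grid, values)) for values in product(*grid.values())]
n_trials = len(variants)

# Iterations needed to reach the stop condition, one train batch per iteration.
iterations = 1000000 // variants[0]['train_batch_size']
```

The value 250 is consistent with the `checkpoint_250` directory that appears in the restored checkpoint path later in the notebook. Adding a second candidate to any list would double the number of trials.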

## Custom network architecture (OPTIONAL)
The default architecture in `ray` is an MLP with 2 hidden layers of 256 units each and tanh activations. As an exercise, say that we want to use dropout. In order to do that, we create a [custom architecture](https://ray.readthedocs.io/en/latest/rllib-models.html#custom-models-tensorflow).

TODO: make sure that dropout is handled correctly at test time.

In [15]:
from ray.rllib.models import ModelCatalog
from ray.rllib.models.model import Model
from ray.rllib.models.misc import normc_initializer, get_activation_fn
import tensorflow as tf
import tensorflow.contrib.slim as slim


class MLP(Model):
    def _build_layers_v2(self, input_dict: dict, num_outputs: int, config: dict):
        import tensorflow.contrib.slim as slim

        with tf.name_scope("fc_net"):
            last_layer = input_dict['obs']
            activation = get_activation_fn(config.get("fcnet_activation"))
            for i, size in enumerate(config.get("fcnet_hiddens"), 1):
                last_layer = slim.fully_connected(
                    inputs=last_layer,
                    num_outputs=size,
                    weights_initializer=normc_initializer(1.0),
                    activation_fn=activation,
                    scope="fc{}".format(i),
                )
                last_layer = tf.layers.dropout(
                    inputs=last_layer,
                    rate=config['custom_options']["fcnet_dropout_rate"],
                    training=input_dict['is_training'],
                    name="dropout{}".format(i),
                )
            output = slim.fully_connected(
                inputs=last_layer,
                num_outputs=num_outputs,
                weights_initializer=normc_initializer(0.01),
                activation_fn=None,
                scope="fc_out",
            )
            return output, last_layer

ModelCatalog.register_custom_model(MLP.__name__, MLP)

In [19]:
config['model']['custom_options'] = {'fcnet_dropout_rate': 0.5}
config['model']['custom_model'] = MLP.__name__
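
Regarding the TODO on test-time dropout: dropout layers are meant to be active only during training. The sketch below implements the usual "inverted dropout" rule in plain Python to show the intended behavior. In training mode each unit is zeroed with probability `rate` and the survivors are scaled by `1 / (1 - rate)`; in inference mode the input passes through unchanged. It illustrates the technique only; whether `ray` sets `is_training` as expected when the agent is evaluated still needs to be verified.

```python
import random


def dropout(values, rate, training, rng):
    """Inverted dropout on a list of activations."""
    if not training or rate == 0.0:
        return list(values)  # identity at test time
    keep_prob = 1.0 - rate
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(activations, rate=0.5, training=True, rng=rng)
test_out = dropout(activations, rate=0.5, training=False, rng=rng)
```

If dropout stayed active at test time, the agent's actions would be noisy even for identical states, which would distort the out-of-sample assessment.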

## Run the experiment
Results are automatically saved to disk. Later on, we will restore an agent that we trained in this section.

Note: results are saved both to the local directory 'logs' and to 'ray_results'.

In [None]:
trials = tune.run_experiments(
    experiments=experiment,
    search_alg=tune.suggest.BasicVariantGenerator(),
    scheduler=tune.schedulers.FIFOScheduler(),
    verbose=1,
    reuse_actors=False,
    resume=False,
)

## Tensorboard
Screenshot of Tensorboard for the `Playground-2folds` experiment:

![alt text](tensorboard-playground-2-folds.png "Title")

## Restore the agent
All objects saved to disk by `ray` must be serialized (~pickled). This does not seem very clean, but I've not found a better way to restore agents so far. Please do explore alternatives, or raise an issue in ray and ask.

In [2]:
from ray import cloudpickle
from ray.utils import binary_to_hex, hex_to_binary


def cloudpickleloads(obj):
    """Recursively decode hex-encoded cloudpickle payloads nested in `obj`."""
    if isinstance(obj, dict):
        try:
            # The dict itself is a serialized payload.
            return cloudpickle.loads(hex_to_binary(obj["value"]))
        except Exception:
            # Not a payload (or not decodable): decode its values in place.
            for key, value in obj.items():
                if isinstance(value, dict):
                    if sorted(value) == ['_type', 'value']:
                        obj[key] = cloudpickle.loads(hex_to_binary(value["value"]))
                    else:
                        obj[key] = cloudpickleloads(value)
                elif isinstance(value, list):
                    for i, item in enumerate(value):
                        obj[key][i] = cloudpickleloads(item)
    return obj
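
The format that `cloudpickleloads` undoes can be illustrated with the standard library alone: an object is pickled, the bytes are hex-encoded so that they fit in a JSON string, and the process is reversed on load. The sketch uses `pickle` and `binascii` as stand-ins for `cloudpickle` and ray's `binary_to_hex`/`hex_to_binary` helpers, and the `_type` label is a placeholder. Treat the exact layout of ray's experiment-state file as an assumption.

```python
import binascii
import json
import pickle


def to_json_value(obj):
    """Pickle `obj` and wrap the hex-encoded bytes in a JSON-friendly dict."""
    payload = binascii.hexlify(pickle.dumps(obj)).decode('ascii')
    return {'_type': 'PICKLED', 'value': payload}  # placeholder type label


def from_json_value(wrapped):
    """Inverse of `to_json_value`."""
    return pickle.loads(binascii.unhexlify(wrapped['value']))


original = {'lr': 1e-05, 'folds': ('training-set', 'test-set')}
text = json.dumps({'config': to_json_value(original)})  # what ends up on disk
restored = from_json_value(json.loads(text)['config'])
```

Hex encoding roughly doubles the size of the payload, but it lets arbitrary Python objects (such as environment classes) travel inside a plain JSON file.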

In [12]:
path = '/home/Nicholas/Desktop/trading-gym/logs/Playground-2folds/experiment_state-2019-06-17_10-01-21.json'
# path = '/home/Nicholas/Desktop/trading-gym/notebooks/registry/gaia/v7/logs/Playground-2folds-tuning/experiment_state-2019-06-20_15-53-36.json'

with open(path) as f:
    metadata = json.load(f)

runner_data = metadata['runner_data']
stats = metadata['stats']

checkpoint = metadata['checkpoints'][-1]
checkpoint = cloudpickleloads(checkpoint)
checkpoint_path = cloudpickle.loads(hex_to_binary(checkpoint['_checkpoint'])).value
print(checkpoint_path)

config = checkpoint['config']
env_cls = config['env']
env_config = config['env_config']
path_restore = os.path.join(checkpoint['logdir'], checkpoint_path)

logs/Playground-2folds/PPOTrainer_GAIAPredictorsContinuousV7_0_entropy_coeff=1e-05,gamma=0,lambda=0.0,lr=1e-05,num_sgd_iter=8,train_batch_size=4000,use_g_2019-06-17_10-01-21ymqkrkjb/checkpoint_250/checkpoint-250


In [13]:
agent = rllib.agents.ppo.PPOTrainer(config, env_cls)
agent.restore(path_restore)

2019-06-21 16:29:12,613	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)


<class 'module'>


2019-06-21 16:29:14,786	INFO policy_evaluator.py:735 -- Built policy map: {'default_policy': <ray.rllib.policy.tf_policy_template.PPOTFPolicy object at 0x7f3164671be0>}
2019-06-21 16:29:14,789	INFO policy_evaluator.py:736 -- Built preprocessor map: {'default_policy': <ray.rllib.models.preprocessors.NoPreprocessor object at 0x7f3164671898>}
2019-06-21 16:29:14,791	INFO policy_evaluator.py:347 -- Built filter map: {'default_policy': <ray.rllib.utils.filter.NoFilter object at 0x7f31646713c8>}
2019-06-21 16:29:15,004	INFO multi_gpu_optimizer.py:80 -- LocalMultiGPUOptimizer devices ['/cpu:0']


NameError: name 'path_restore' is not defined

## Run and visualize the agent out-of-sample
`.sample_episode` makes it easy to assess an agent on a specific fold of the data. All parameters are optional and default to testing a random agent on the training-set. The returned `episode` should collect most (if not all) of the information needed to assess a policy on a given fold.

In [None]:
episode = env.sample_episode(
    fold='test-set',
    policy=agent,
    episode_length=None,
    benchmark=env._load_benchmark().squeeze(),
    risk_free=env._load_risk_free().squeeze(),
    burn=1,
)

In [None]:
episode.renderer.plotly_report()
episode.renderer.tearsheet()

(pid=37006) 2019-06-21 16:29:24,434	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 1 on CPU (please ignore any CUDA init errors)
(pid=37006) <class 'module'>
(pid=37006) 2019-06-21 16:29:24.474987: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
(pid=37005) 2019-06-21 16:29:24,641	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 2 on CPU (please ignore any CUDA init errors)
(pid=37005) <class 'module'>
(pid=37005) 2019-06-21 16:29:24.697700: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
(pid=37006) 2019-06-21 16:29:25,319	INFO dynamic_tf_policy.py:265 -- Initializing loss function with dummy input:

Now we use `episode` to create a couple of charts that help to understand and visualize the behavior of the RL agent.

In [22]:
from trading_gym.contracts import ETF, Index


# Load.
actions = episode.actions_as_frame()
states = episode.states_as_frame()

# Parse.
gaia_predictor = states[0].to_frame('GAIA Predictor')
target_weight_russell_1000 = actions[ETF('Russell 1000')]
target_weight_russell_1000.name = 'Target weight: ' + str(target_weight_russell_1000.name)
mapping = gaia_predictor.join(target_weight_russell_1000)

# Visualize.
mapping.iplot(
    title="Historical GAIA predictor for Russell 1000 vs agent's target weights",
    secondary_y='GAIA Predictor',
    yTitle=target_weight_russell_1000.name,
    secondary_y_title='GAIA Predictor',
    legend={'orientation': 'h'},
)
mapping.set_index('GAIA Predictor').iplot(
    title='Policy: mapping from GAIA predictor (state) to target weight for Russell 1000 (action)',
    xTitle='GAIA predictor for Russell 1000 (standardized)',
    yTitle='Target weight for Russell 1000',
    kind='scatter',
    mode='markers',
    size=4,
)

## Conclusions

As we can see from the tearsheet and chart above, the RL agent does not outperform the benchmark out-of-sample. Visualizing the mapping function shows in advance how the RL agent would behave in previously unseen situations, and it is a useful way to communicate the results and build trust in them.

Walk-forward optimization is certainly something sensible to try, as this naive 2-fold split assumes that the RL agent is never re-trained during the out-of-sample period as new data arrives. In the next section we repeat the same process while re-training the RL agent every year.
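
A walk-forward scheme with an expanding training window and a one-year test window can be generated as sketched below. `walk_forward_folds` is a hypothetical helper; the date conventions (training up to 31 December, testing on the following calendar year) are one reasonable choice, not the only one.

```python
from datetime import datetime


def walk_forward_folds(first_year, last_year):
    """Yield (year, folds): train on everything up to `year`, test on `year + 1`."""
    for year in range(first_year, last_year + 1):
        yield year, {
            'training-set': [datetime.min, datetime(year, 12, 31)],
            'test-set': [datetime(year + 1, 1, 1), datetime(year + 1, 12, 31)],
        }


all_folds = dict(walk_forward_folds(2007, 2017))
```

Each dictionary has the same shape as the `folds` entry of `env_config`, so it can be passed to the environment unchanged; the training window never overlaps the test year that follows it.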

# Solve GAIA-v7 with ray (walk-forward, no transfer learning)
TODO

## Set up and run the experiment(s)

In [20]:
for year in range(2007, 2018):
    print('_______________________________________{}____________________________________________'.format(year))
    config['env_config'] = {
        'folds': {
            'training-set': [datetime.min, datetime(year, 12, 31)],
            'test-set': [datetime(year + 1, 1, 1), datetime(year + 1, 12, 31)],
        }
    }
    experiment = tune.Experiment(
        name='Playground-WalkForward{}'.format(year),
        run=rllib.agents.ppo.PPOTrainer,
        stop={"timesteps_total": 50000},
        config=deepcopy(config),
        num_samples=1,
        local_dir='logs',
        #checkpoint_freq=int(1e4 / config['train_batch_size']),  # checkpoint every 100k iters
        checkpoint_at_end=True,
        max_failures=0,
        loggers=[CustomLogger],
    )
    trials = tune.run_experiments(
        experiments=experiment,
        search_alg=tune.suggest.BasicVariantGenerator(),
        scheduler=tune.schedulers.FIFOScheduler(),
        verbose=0,
        reuse_actors=False,
        resume=False,
    )

2019-06-17 15:01:34,304	INFO tune.py:65 -- Did not find checkpoint file in logs/Playground-WalkForward2007.
2019-06-17 15:01:34,305	INFO tune.py:232 -- Starting a new experiment.


_______________________________________2007____________________________________________
(pid=96351) 2019-06-17 15:01:39,538	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
(pid=96351) 2019-06-17 15:01:39.539850: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
(pid=96351) <class 'module'>
(pid=96351) 2019-06-17 15:01:47,225	INFO dynamic_tf_policy.py:265 -- Initializing loss function with dummy input:
(pid=96351) { 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
(pid=96351)   'actions': <tf.Tensor 'default_policy/actions:0' shape=(?, 2) dtype=float32>,
(pid=96351)   'advantages': <tf.Tensor 'default_policy/advantages:0' shape=(?,) dtype=float32>,
(pid=

(pid=96355) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=96353) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=96352) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=96348) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=96349) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=96355) 2019-06-17 15:03:11,594	INFO policy

(pid=96351) 2019-06-17 15:03:27,843	INFO multi_gpu_impl.py:146 -- Training on concatenated sample batches:
(pid=96351) { 'inputs': [ np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.475),
(pid=96351)               np.ndarray((4000,), dtype=float32, min=-0.027, max=0.028, mean=0.0),
(pid=96351)               np.ndarray((4000, 3), dtype=float32, min=-14.063, max=11.025, mean=0.09),
(pid=96351)               np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.5),
(pid=96351)               np.ndarray((4000,), dtype=float32, min=-6.374, max=6.426, mean=-0.0),
(pid=96351)               np.ndarray((4000, 2), dtype=float32, min=-0.016, max=0.016, mean=0.002),
(pid=96351)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mean=0.0),
(pid=96351)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mean=

2019-06-17 15:07:20,622	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPOTrainer_GAIAPredictorsContinuousV7_0_entropy_coeff=1e-05,gamma=0,lambda=0.0,lr=1e-05,num_sgd_iter=8,train_batch_size=4000,use_gae=False,vf_clip_param=0.0,vf_loss_coeff=0.0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-06-17 15:07:20,649	INFO tune.py:65 -- Did not find checkpoint file in logs/Playground-WalkForward2008.
2019-06-17 15:07:20,650	INFO tune.py:232 -- Starting a new experiment.


_______________________________________2008____________________________________________

2019-06-17 15:13:08,725	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPOTrainer_GAIAPredictorsContinuousV7_0_entropy_coeff=1e-05,gamma=0,lambda=0.0,lr=1e-05,num_sgd_iter=8,train_batch_size=4000,use_gae=False,vf_clip_param=0.0,vf_loss_coeff=0.0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-06-17 15:13:08,749	INFO tune.py:65 -- Did not find checkpoint file in logs/Playground-WalkForward2009.
2019-06-17 15:13:08,750	INFO tune.py:232 -- Starting a new experiment.


_______________________________________2009____________________________________________

2019-06-17 15:18:58,855	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPOTrainer_GAIAPredictorsContinuousV7_0_entropy_coeff=1e-05,gamma=0,lambda=0.0,lr=1e-05,num_sgd_iter=8,train_batch_size=4000,use_gae=False,vf_clip_param=0.0,vf_loss_coeff=0.0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-06-17 15:18:58,878	INFO tune.py:65 -- Did not find checkpoint file in logs/Playground-WalkForward2010.
2019-06-17 15:18:58,879	INFO tune.py:232 -- Starting a new experiment.


_______________________________________2010____________________________________________
(pid=110149) <class 'module'>
(pid=110149) 2019-06-17 15:19:05,930	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
(pid=110149) 2019-06-17 15:19:05.932139: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
(pid=110149) 2019-06-17 15:19:13,520	INFO dynamic_tf_policy.py:265 -- Initializing loss function with dummy input:
(pid=110149) { 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
(pid=110149)   'actions': <tf.Tensor 'default_policy/actions:0' shape=(?, 2) dtype=float32>,
(pid=110149)   'advantages': <tf.Tensor 'default_policy/advantages:0' shape=(?,) dtype=float32>,

(pid=130795) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=130793) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=130792) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=130796) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=130791) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.

(pid=110149) 2019-06-17 15:20:58,202	INFO multi_gpu_impl.py:146 -- Training on concatenated sample batches:
(pid=110149) { 'inputs': [ np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.475),
(pid=110149)               np.ndarray((4000,), dtype=float32, min=-0.04, max=0.027, mean=0.0),
(pid=110149)               np.ndarray((4000, 3), dtype=float32, min=-14.063, max=15.924, mean=0.101),
(pid=110149)               np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.5),
(pid=110149)               np.ndarray((4000,), dtype=float32, min=-8.401, max=6.082, mean=-0.0),
(pid=110149)               np.ndarray((4000, 2), dtype=float32, min=-0.015, max=0.014, mean=0.002),
(pid=110149)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mean=0.0),
(pid=110149)               np.ndarray((4000,), dtype=float32, min=0.0, max=

2019-06-17 15:24:51,286	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPOTrainer_GAIAPredictorsContinuousV7_0_entropy_coeff=1e-05,gamma=0,lambda=0.0,lr=1e-05,num_sgd_iter=8,train_batch_size=4000,use_gae=False,vf_clip_param=0.0,vf_loss_coeff=0.0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-06-17 15:24:51,311	INFO tune.py:65 -- Did not find checkpoint file in logs/Playground-WalkForward2011.
2019-06-17 15:24:51,312	INFO tune.py:232 -- Starting a new experiment.


_______________________________________2011____________________________________________
(pid=130804) <class 'module'>
(pid=130804) 2019-06-17 15:24:58,223	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
(pid=130804) 2019-06-17 15:24:58.224153: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
(pid=130804) 2019-06-17 15:25:05,748	INFO dynamic_tf_policy.py:265 -- Initializing loss function with dummy input:
(pid=130804) { 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
(pid=130804)   'actions': <tf.Tensor 'default_policy/actions:0' shape=(?, 2) dtype=float32>,
(pid=130804)   'advantages': <tf.Tensor 'default_policy/advantages:0' shape=(?,) dtype=float32>,

(pid=130805) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=130815) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=130820) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=130812) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=130806) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.

(pid=130804) 2019-06-17 15:26:46,154	INFO multi_gpu_impl.py:146 -- Training on concatenated sample batches:
(pid=130804) { 'inputs': [ np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.475),
(pid=130804)               np.ndarray((4000,), dtype=float32, min=-0.039, max=0.038, mean=0.0),
(pid=130804)               np.ndarray((4000, 3), dtype=float32, min=-14.063, max=12.267, mean=0.094),
(pid=130804)               np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.5),
(pid=130804)               np.ndarray((4000,), dtype=float32, min=-8.028, max=7.781, mean=0.0),
(pid=130804)               np.ndarray((4000, 2), dtype=float32, min=-0.011, max=0.009, mean=-0.003),
(pid=130804)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mean=0.0),
(pid=130804)               np.ndarray((4000,), dtype=float32, min=0.0, max

2019-06-17 15:30:35,975	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPOTrainer_GAIAPredictorsContinuousV7_0_entropy_coeff=1e-05,gamma=0,lambda=0.0,lr=1e-05,num_sgd_iter=8,train_batch_size=4000,use_gae=False,vf_clip_param=0.0,vf_loss_coeff=0.0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-06-17 15:30:36,002	INFO tune.py:65 -- Did not find checkpoint file in logs/Playground-WalkForward2012.
2019-06-17 15:30:36,004	INFO tune.py:232 -- Starting a new experiment.


_______________________________________2012____________________________________________
(pid=130809) <class 'module'>
(pid=130809) 2019-06-17 15:30:42,992	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
(pid=130809) 2019-06-17 15:30:42.993810: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
(pid=130809) 2019-06-17 15:30:50,858	INFO dynamic_tf_policy.py:265 -- Initializing loss function with dummy input:
(pid=130809) { 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
(pid=130809)   'actions': <tf.Tensor 'default_policy/actions:0' shape=(?, 2) dtype=float32>,
(pid=130809)   'advantages': <tf.Tensor 'default_policy/advantages:0' shape=(?,) dtype=float32>,

(pid=20462) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=20460) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=20463) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=20461) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=20459) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=20459) 2019-06-17 15:32:20,483	INFO policy

(pid=130809) 2019-06-17 15:32:34,410	INFO multi_gpu_impl.py:146 -- Training on concatenated sample batches:
(pid=130809) { 'inputs': [ np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.475),
(pid=130809)               np.ndarray((4000,), dtype=float32, min=-0.052, max=0.048, mean=0.0),
(pid=130809)               np.ndarray((4000, 3), dtype=float32, min=-14.063, max=10.392, mean=0.097),
(pid=130809)               np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.5),
(pid=130809)               np.ndarray((4000,), dtype=float32, min=-10.879, max=9.907, mean=0.0),
(pid=130809)               np.ndarray((4000, 2), dtype=float32, min=-0.008, max=0.007, mean=-0.001),
(pid=130809)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mean=0.0),
(pid=130809)               np.ndarray((4000,), dtype=float32, min=0.0, ma

2019-06-17 15:36:23,993	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPOTrainer_GAIAPredictorsContinuousV7_0_entropy_coeff=1e-05,gamma=0,lambda=0.0,lr=1e-05,num_sgd_iter=8,train_batch_size=4000,use_gae=False,vf_clip_param=0.0,vf_loss_coeff=0.0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-06-17 15:36:24,018	INFO tune.py:65 -- Did not find checkpoint file in logs/Playground-WalkForward2013.
2019-06-17 15:36:24,019	INFO tune.py:232 -- Starting a new experiment.


_______________________________________2013____________________________________________
(pid=20471) 2019-06-17 15:36:31,027	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
(pid=20471) 2019-06-17 15:36:31.027912: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
(pid=20471) <class 'module'>
(pid=20471) 2019-06-17 15:36:38,964	INFO dynamic_tf_policy.py:265 -- Initializing loss function with dummy input:
(pid=20471) { 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
(pid=20471)   'actions': <tf.Tensor 'default_policy/actions:0' shape=(?, 2) dtype=float32>,
(pid=20471)   'advantages': <tf.Tensor 'default_policy/advantages:0' shape=(?,) dtype=float32>,

(pid=20480) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=20488) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=20482) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=20480) 2019-06-17 15:38:04,754	INFO policy_evaluator.py:441 -- Generating sample batch of size 200
(pid=20480) 2019-06-17 15:38:04,797	INFO sampler.py:308 -- Raw obs from env: { 0: { 'agent0': np.ndarray((3,), dtype=float64, min=0.0, max=1.0, mean=0.333)}}
(pid=20480) 2019-06-17 15:38:04,797	INFO sampler.py:309 -- Info return from env: {0: {'agent0': None}}
(pid=20480) 2019-06

(pid=20471) 2019-06-17 15:38:23,417	INFO tf_run_builder.py:92 -- Executing TF run without tracing. To dump TF timeline traces to disk, set the TF_TIMELINE_DIR environment variable.


2019-06-17 15:42:09,000	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPOTrainer_GAIAPredictorsContinuousV7_0_entropy_coeff=1e-05,gamma=0,lambda=0.0,lr=1e-05,num_sgd_iter=8,train_batch_size=4000,use_gae=False,vf_clip_param=0.0,vf_loss_coeff=0.0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-06-17 15:42:09,024	INFO tune.py:65 -- Did not find checkpoint file in logs/Playground-WalkForward2014.
2019-06-17 15:42:09,025	INFO tune.py:232 -- Starting a new experiment.


_______________________________________2014____________________________________________
(pid=20485) <class 'module'>
(pid=20485) 2019-06-17 15:42:16,002	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
(pid=20485) 2019-06-17 15:42:16.003488: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
(pid=20485) 2019-06-17 15:42:23,481	INFO dynamic_tf_policy.py:265 -- Initializing loss function with dummy input:
(pid=20485) { 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
(pid=20485)   'actions': <tf.Tensor 'default_policy/actions:0' shape=(?, 2) dtype=float32>,
(pid=20485)   'advantages': <tf.Tensor 'default_policy/advantages:0' shape=(?,) dtype=float32>,

(pid=40198) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=40192) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=40199) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=40191) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=40200) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=40190) 2019-06-17 15:43:48,321	INFO policy

(pid=20485) 2019-06-17 15:44:05,782	INFO multi_gpu_impl.py:146 -- Training on concatenated sample batches:
(pid=20485) { 'inputs': [ np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.475),
(pid=20485)               np.ndarray((4000,), dtype=float32, min=-0.063, max=0.029, mean=0.0),
(pid=20485)               np.ndarray((4000, 3), dtype=float32, min=-14.063, max=9.882, mean=0.087),
(pid=20485)               np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.5),
(pid=20485)               np.ndarray((4000,), dtype=float32, min=-13.672, max=6.26, mean=0.0),
(pid=20485)               np.ndarray((4000, 2), dtype=float32, min=-0.014, max=0.014, mean=-0.0),
(pid=20485)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mean=0.0),
(pid=20485)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mean=0.

2019-06-17 15:47:54,417	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPOTrainer_GAIAPredictorsContinuousV7_0_entropy_coeff=1e-05,gamma=0,lambda=0.0,lr=1e-05,num_sgd_iter=8,train_batch_size=4000,use_gae=False,vf_clip_param=0.0,vf_loss_coeff=0.0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-06-17 15:47:54,443	INFO tune.py:65 -- Did not find checkpoint file in logs/Playground-WalkForward2015.
2019-06-17 15:47:54,444	INFO tune.py:232 -- Starting a new experiment.


_______________________________________2015____________________________________________
(pid=40242) <class 'module'>
(pid=40242) 2019-06-17 15:48:01,510	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
(pid=40242) 2019-06-17 15:48:01.511496: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
(pid=40242) 2019-06-17 15:48:09,151	INFO dynamic_tf_policy.py:265 -- Initializing loss function with dummy input:
(pid=40242) { 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
(pid=40242)   'actions': <tf.Tensor 'default_policy/actions:0' shape=(?, 2) dtype=float32>,
(pid=40242)   'advantages': <tf.Tensor 'default_policy/advantages:0' shape=(?,) dtype=float32>,

(pid=40235) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=40233) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=40237) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=40255) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=40252) 2019-06-17 15:49:33,469	INFO policy_evaluator.py:441 -- Generating sample batch of size 200
(pid=40252) 2019-06-17 15:49:33,542	INFO sampler.py:308 -- Raw obs from env: { 0: { 'agent0': np.ndarray((3,), dtype=f

(pid=40242) 2019-06-17 15:49:50,136	INFO multi_gpu_impl.py:146 -- Training on concatenated sample batches:
(pid=40242) { 'inputs': [ np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.475),
(pid=40242)               np.ndarray((4000,), dtype=float32, min=-0.028, max=0.052, mean=0.0),
(pid=40242)               np.ndarray((4000, 3), dtype=float32, min=-14.063, max=12.585, mean=0.08),
(pid=40242)               np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.5),
(pid=40242)               np.ndarray((4000,), dtype=float32, min=-11.636, max=10.889, mean=-0.0),
(pid=40242)               np.ndarray((4000, 2), dtype=float32, min=-0.006, max=0.007, mean=0.001),
(pid=40242)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mean=0.0),
(pid=40242)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mea

2019-06-17 15:53:39,681	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPOTrainer_GAIAPredictorsContinuousV7_0_entropy_coeff=1e-05,gamma=0,lambda=0.0,lr=1e-05,num_sgd_iter=8,train_batch_size=4000,use_gae=False,vf_clip_param=0.0,vf_loss_coeff=0.0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-06-17 15:53:39,706	INFO tune.py:65 -- Did not find checkpoint file in logs/Playground-WalkForward2016.
2019-06-17 15:53:39,707	INFO tune.py:232 -- Starting a new experiment.


_______________________________________2016____________________________________________
(pid=40234) <class 'module'>
(pid=40234) 2019-06-17 15:53:46,725	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
(pid=40234) 2019-06-17 15:53:46.726350: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
(pid=40234) 2019-06-17 15:53:54,383	INFO dynamic_tf_policy.py:265 -- Initializing loss function with dummy input:
(pid=40234) { 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
(pid=40234)   'actions': <tf.Tensor 'default_policy/actions:0' shape=(?, 2) dtype=float32>,
(pid=40234)   'advantages': <tf.Tensor 'default_policy/advantages:0' shape=(?,) dtype=float32>,

(pid=60069) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=60067) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=60072) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=60068) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=60071) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=60067) 2019-06-17 15:55:19,475	INFO policy

(pid=40234) 2019-06-17 15:55:35,710	INFO multi_gpu_impl.py:146 -- Training on concatenated sample batches:
(pid=40234) { 'inputs': [ np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.475),
(pid=40234)               np.ndarray((4000,), dtype=float32, min=-0.04, max=0.05, mean=0.0),
(pid=40234)               np.ndarray((4000, 3), dtype=float32, min=-9.108, max=15.356, mean=0.106),
(pid=40234)               np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.5),
(pid=40234)               np.ndarray((4000,), dtype=float32, min=-9.646, max=11.868, mean=-0.0),
(pid=40234)               np.ndarray((4000, 2), dtype=float32, min=-0.012, max=0.006, mean=-0.003),
(pid=40234)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mean=0.0),
(pid=40234)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mean=

2019-06-17 15:59:24,362	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPOTrainer_GAIAPredictorsContinuousV7_0_entropy_coeff=1e-05,gamma=0,lambda=0.0,lr=1e-05,num_sgd_iter=8,train_batch_size=4000,use_gae=False,vf_clip_param=0.0,vf_loss_coeff=0.0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2019-06-17 15:59:24,386	INFO tune.py:65 -- Did not find checkpoint file in logs/Playground-WalkForward2017.
2019-06-17 15:59:24,386	INFO tune.py:232 -- Starting a new experiment.


_______________________________________2017____________________________________________
(pid=60096) 2019-06-17 15:59:31,471	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
(pid=60096) 2019-06-17 15:59:31.471788: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
(pid=60096) <class 'module'>
(pid=60096) 2019-06-17 15:59:39,486	INFO dynamic_tf_policy.py:265 -- Initializing loss function with dummy input:
(pid=60096) { 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
(pid=60096)   'actions': <tf.Tensor 'default_policy/actions:0' shape=(?, 2) dtype=float32>,
(pid=60096)   'advantages': <tf.Tensor 'default_policy/advantages:0' shape=(?,) dtype=float32>,

(pid=60086) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=60084) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=60081) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=60094) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=60088) Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
(pid=60088) 2019-06-17 16:01:03,869	INFO policy

(pid=60096) 2019-06-17 16:01:18,627	INFO multi_gpu_impl.py:146 -- Training on concatenated sample batches:
(pid=60096) { 'inputs': [ np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.475),
(pid=60096)               np.ndarray((4000,), dtype=float32, min=-0.029, max=0.037, mean=0.0),
(pid=60096)               np.ndarray((4000, 3), dtype=float32, min=-14.063, max=11.937, mean=0.064),
(pid=60096)               np.ndarray((4000, 2), dtype=float32, min=0.0, max=1.0, mean=0.5),
(pid=60096)               np.ndarray((4000,), dtype=float32, min=-6.425, max=8.653, mean=-0.0),
(pid=60096)               np.ndarray((4000, 2), dtype=float32, min=-0.011, max=0.004, mean=-0.004),
(pid=60096)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mean=0.0),
(pid=60096)               np.ndarray((4000,), dtype=float32, min=0.0, max=0.0, mea

2019-06-17 16:05:01,312	INFO ray_trial_executor.py:187 -- Destroying actor for trial PPOTrainer_GAIAPredictorsContinuousV7_0_entropy_coeff=1e-05,gamma=0,lambda=0.0,lr=1e-05,num_sgd_iter=8,train_batch_size=4000,use_gae=False,vf_clip_param=0.0,vf_loss_coeff=0.0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
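
The next cell hardcodes one `experiment_state-*.json` path per walk-forward year and carries a TODO to update them after each run. As a minimal sketch, assuming the directory layout visible in the logs above (`logs/Playground-WalkForward<year>/experiment_state-<timestamp>.json`), the mapping could be rebuilt automatically. The helper name `latest_experiment_states` and its parameters `logs_dir` and `prefix` are hypothetical and not part of `trading-gym` or `ray`.

```python
import glob
import os
import re


def latest_experiment_states(logs_dir, prefix='Playground-WalkForward'):
    """Map each walk-forward year to its most recent experiment_state file.

    Hypothetical helper: assumes directories named `<prefix><year>` that
    contain files named `experiment_state-<timestamp>.json`.
    """
    paths = {}
    pattern = os.path.join(logs_dir, prefix + '*', 'experiment_state-*.json')
    for path in glob.glob(pattern):
        # The year is the four-digit suffix of the experiment directory name.
        dirname = os.path.basename(os.path.dirname(path))
        match = re.fullmatch(re.escape(prefix) + r'(\d{4})', dirname)
        if match is None:
            continue
        year = int(match.group(1))
        # Timestamps such as 2019-06-17_15-01-34 sort chronologically as
        # strings, so the lexicographically largest file name is the latest.
        current = paths.get(year)
        if current is None or os.path.basename(path) > os.path.basename(current):
            paths[year] = path
    # Return the years in ascending order.
    return dict(sorted(paths.items()))
```

Calling `latest_experiment_states('logs')` from the notebook's directory would then return a dictionary with the same shape as the hand-written `paths`, keyed by year.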


In [71]:
# raise ValueError('TODO: update paths with latest runs')
# Experiment-state file saved for each walk-forward year.
paths = {
    2007: '/home/Nicholas/Desktop/trading-gym/notebooks/registry/gaia/v7/logs/Playground-WalkForward2007/experiment_state-2019-06-17_15-01-34.json',
    2008: '/home/Nicholas/Desktop/trading-gym/notebooks/registry/gaia/v7/logs/Playground-WalkForward2008/experiment_state-2019-06-17_15-07-20.json',
    2009: '/home/Nicholas/Desktop/trading-gym/notebooks/registry/gaia/v7/logs/Playground-WalkForward2009/experiment_state-2019-06-17_15-13-08.json',
    2010: '/home/Nicholas/Desktop/trading-gym/notebooks/registry/gaia/v7/logs/Playground-WalkForward2010/experiment_state-2019-06-17_15-18-58.json',
    2011: '/home/Nicholas/Desktop/trading-gym/notebooks/registry/gaia/v7/logs/Playground-WalkForward2011/experiment_state-2019-06-17_15-24-51.json',
    2012: '/home/Nicholas/Desktop/trading-gym/notebooks/registry/gaia/v7/logs/Playground-WalkForward2012/experiment_state-2019-06-17_15-30-36.json',
    2013: '/home/Nicholas/Desktop/trading-gym/notebooks/registry/gaia/v7/logs/Playground-WalkForward2013/experiment_state-2019-06-17_15-36-24.json',
    2014: '/home/Nicholas/Desktop/trading-gym/notebooks/registry/gaia/v7/logs/Playground-WalkForward2014/experiment_state-2019-06-17_15-42-09.json',
    2015: '/home/Nicholas/Desktop/trading-gym/notebooks/registry/gaia/v7/logs/Playground-WalkForward2015/experiment_state-2019-06-17_15-47-54.json',
    2016: '/home/Nicholas/Desktop/trading-gym/notebooks/registry/gaia/v7/logs/Playground-WalkForward2016/experiment_state-2019-06-17_15-53-39.json',
    2017: '/home/Nicholas/Desktop/trading-gym/notebooks/registry/gaia/v7/logs/Playground-WalkForward2017/experiment_state-2019-06-17_15-59-24.json',
}


episodes = dict()
agents = dict()
for year, path in paths.items():
    # RESTORE part (a): load the experiment metadata and locate the last checkpoint.
    with open(path) as f:
        metadata = json.load(f)

    runner_data = metadata['runner_data']
    stats = metadata['stats']

    # NOTE: `cloudpickleloads`, `cloudpickle`, `hex_to_binary` and `rllib` are not
    # defined in this cell; they are assumed to be defined or imported earlier in the notebook.
    checkpoint = metadata['checkpoints'][-1]
    checkpoint = cloudpickleloads(checkpoint)
    checkpoint_path = cloudpickle.loads(hex_to_binary(checkpoint['_checkpoint'])).value

    config = checkpoint['config']
    # There is no real need to redefine env_cls, as it is always the same.
    env_cls = config['env']
    env_config = config['env_config']
    path_restore = os.path.join(checkpoint['logdir'], checkpoint_path)

    # RESTORE part (b): rebuild the trainer and restore it from the checkpoint.
    agent = rllib.agents.ppo.PPOTrainer(config, env_cls)
    # agent.restore(path_restore)
    # THIS IS A BUG: the private `_restore` is called in place of the public `restore` above.
    agent._restore(path_restore)

    # Evaluate the restored agent on the out-of-sample fold.
    env = env_cls(env_config)
    episode = env.sample_episode(
        fold='test-set',
        policy=agent,
        episode_length=None,
        benchmark=env._load_benchmark().squeeze(),
        risk_free=env._load_risk_free().squeeze(),
        burn=1,
    )

    episodes[year] = episode
    agents[year] = agent

2019-06-18 16:09:24,784	INFO policy_evaluator.py:312 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)


<class 'module'>


2019-06-18 16:09:33,837	INFO policy_evaluator.py:735 -- Built policy map: {'default_policy': <ray.rllib.policy.tf_policy_template.PPOTFPolicy object at 0x7fb99f649f98>}
2019-06-18 16:09:33,839	INFO policy_evaluator.py:736 -- Built preprocessor map: {'default_policy': <ray.rllib.models.preprocessors.NoPreprocessor object at 0x7fb99f649b70>}
2019-06-18 16:09:33,840	INFO policy_evaluator.py:347 -- Built filter map: {'default_policy': <ray.rllib.utils.filter.NoFilter object at 0x7fb99f6499b0>}
2019-06-18 16:09:34,253	INFO multi_gpu_optimizer.py:80 -- LocalMultiGPUOptimizer devices ['/cpu:0']

In [51]:
levels = list()
mappings = pd.DataFrame()
mapping_functions = dict()
for year in paths:
    episode = episodes[year]
    agent = agents[year]

    # Load.
    actions = episode.actions_as_frame()
    states = episode.states_as_frame()
    
    # Parse.
    gaia_predictor = states[0].to_frame('GAIA Predictor')
    
    # Previously: target_weight_russell_1000 = actions[ETF('Russell 1000')]
    # The Russell 1000 ETF is the first action column, so select it by position.
    target_weight_russell_1000 = actions[actions.columns[0]]
    target_weight_russell_1000.name = 'Target weight: ' + str(target_weight_russell_1000.name)
    mapping = gaia_predictor.join(target_weight_russell_1000)
    mapping_function = mapping.set_index('GAIA Predictor')

    levels.append(episode.renderer.level.to_frame().pct_change())
    mappings = mappings.append(mapping)
    mapping_functions[year] = mapping_function

    # Visualize.
    mapping.iplot(
        title="Historical GAIA predictor for Russell 1000 vs agent's target weights",
        secondary_y='GAIA Predictor',
        yTitle=target_weight_russell_1000.name,
        secondary_y_title='GAIA Predictor',
        legend={'orientation': 'h'},
    )
    mapping_function.iplot(
        title='Policy: mapping from GAIA predictor (state) to target weight for Russell 1000 (action)',
        xTitle='GAIA predictor for Russell 1000 (standardized)',
        yTitle='Target weight for Russell 1000',
        kind='scatter',
        mode='markers',
        size=4,
    )

In [58]:
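# Stack the per-year daily returns and compound them into cumulative performance (in %).
daily_ret = pd.concat(levels).sort_index().fillna(0)
cumulative_performance = (1 + daily_ret).cumprod() - 1
cumulative_performance *= 100

# The second column is the Aric benchmark; the relative series is a difference of cumulative % returns.
aric = cumulative_performance.columns[1]
cumulative_performance['Strategy relative to Aric-Benchmark'] = cumulative_performance['Strategy'] - cumulative_performance[aric]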


# Visualizations.
cumulative_performance.iplot(
    legend={'orientation': 'h'},
    yTitle='Total returns',
)
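
The compounding step in the cell above can be checked by hand on a tiny example. The sketch below uses plain Python and three hypothetical daily returns (not data from this notebook) to show why `(1 + daily_ret).cumprod() - 1` differs from simply summing the daily returns.

```python
from itertools import accumulate
from operator import mul

# Hypothetical daily returns: +1%, -2%, +3% (illustration only).
daily_ret = [0.01, -0.02, 0.03]

# Multiply the growth factors (1 + r) cumulatively: the level of 1 unit invested.
levels = list(accumulate((1 + r for r in daily_ret), mul))

# Cumulative performance in percent, mirroring `((1 + daily_ret).cumprod() - 1) * 100`.
cumulative_performance = [100 * (level - 1) for level in levels]

# Compounded result is roughly 1.9494 %, versus 2 % from naively summing the returns.
print(cumulative_performance[-1], 100 * sum(daily_ret))
```

Note that the relative column in the cell above is an arithmetic difference of two cumulative returns, so it is not itself a compounded return series.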

In [61]:
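# Convert cumulative performance (in %) back to levels (growth of 1 unit invested).
levels = (1 + cumulative_performance / 100)
# Calendar-year return: last level of the year over the first level of the same year.
annual_rets = (levels.resample('Y').last() / levels.resample('Y').first() - 1)

# Recompute the relative column as a difference of annual returns.
annual_rets['Strategy relative to Aric-Benchmark'] = annual_rets['Strategy'] - annual_rets[aric]
annual_rets.index = annual_rets.index.year
annual_rets *= 100
annual_rets.iplot(kind='bar', legend={'orientation': 'h'}, yTitle='%')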

In [62]:
levels.drop('Strategy relative to Aric-Benchmark', axis='columns').tearsheet(
    benchmark=env._load_benchmark().loc['2008':].squeeze(),
    risk_free=env._load_risk_free().loc['2008':].squeeze(),
    #weights=env.broker.track_record.to_frame('weights_target').iloc[1:]
)

| | | Strategy | Index(Aric-Benchmark) | Index(USD 1M Deposit) | Cash(USD) | ETF(Russell 1000, SMART, USD) | ETF(7-10Y T-Bills, SMART, USD) |
|---|---|---|---|---|---|---|---|
| Context | From | 2008-01-01 | 2008-01-01 | 2008-01-01 | 2008-01-01 | 2008-01-01 | 2008-01-01 |
| Context | To | 2018-08-28 | 2018-08-28 | 2018-08-28 | 2018-08-28 | 2018-08-28 | 2018-08-28 |
| Context | Years | 10.663 | 10.663 | 10.663 | 10.663 | 10.663 | 10.663 |
| Context | Observations | 2771 | 2771 | 2771 | 2771 | 2771 | 2771 |
| Context | Risk-free asset | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) | Index(USD 1M Deposit) |
| Context | Risk-free CAGR | 0.00736481 | 0.00736481 | 0.00736481 | 0.00736481 | 0.00736481 | 0.00736481 |
| Return | CAGR | 0.124306 | 0.161394 | 0.00732137 | 0 | 0.08955 | 0.0400452 |
| Return | CAGR over cash | 0.116941 | 0.15403 | -4.34354e-05 | -0.00736481 | 0.0821852 | 0.0326804 |
| Return | Overall return | 2.48803 | 3.93038 | 0.0808886 | 0 | 1.49556 | 0.519945 |
| Risk | Volatility | 0.095085 | 0.0974619 | 0.000660706 | 0 | 0.199054 | 0.0776511 |


In [63]:
mapping_functions

{2007:                 Target weight: ETF(Russell 1000, SMART, USD)
 GAIA Predictor                                              
 0.000000                                            0.198985
 0.000000                                            0.198985
 0.000000                                            0.198985
 0.000000                                            0.198985
 0.000000                                            0.198985
 0.000000                                            0.198985
 0.000000                                            0.198985
 1.688722                                            0.596809
 0.000000                                            0.355523
 0.000000                                            0.198985
 0.000000                                            0.198985
 0.000000                                            0.198985
 0.000000                                            0.198985
 0.000000                                            0.198985
 0

In [70]:
import plotly.plotly as py
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot
import cufflinks
cufflinks.go_offline()
init_notebook_mode(connected=False)


# One scatter trace per year: GAIA predictor (x) vs. the agent's target weight (y).
traces = list()
for year, series in mapping_functions.items():
    trace = go.Scatter(
        x=list(series.squeeze().index[:-1]),
        y=list(series.squeeze().values[:-1]),
        mode='markers',
        name=str(year),  # trace names are strings
    )
    traces.append(trace)

layout = go.Layout(
    title='GAIA vs RL mapping functions',
    xaxis=dict(title='GAIA Mapping'),
    yaxis=dict(title='PPO Mapping'),
)
fig = go.Figure(data=traces, layout=layout)
iplot(fig, filename='scatter-mode')

# iplot(traces, filename='scatter-mode')

## Tensorboard

## Restore the agents

## Run and visualize the agents out-of-sample

# Solve GAIA-v7 with ray (walk-forward, transfer learning)
TODO: implement walk-forward re-training with a sliding (i.e. fixed-length, not expanding) window. An expanding window is undesirable here: old observations would be included in every re-training round, so the agent would see them more often than recent observations and end up overweighting them.
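
As a starting point for this TODO, here is a minimal sketch of how sliding-window folds could be generated. The helper `sliding_window_folds`, its parameters, and the year range are hypothetical (they are not part of `trading-gym`). The returned dicts only mimic the `env_config['folds']` layout used earlier in this notebook, and the fold bounds are assumed to be inclusive dates.

```python
from datetime import datetime, timedelta


def sliding_window_folds(first_year, last_year, train_years):
    """Yield (test_year, folds) pairs for a sliding-window walk-forward.

    Hypothetical helper: each training window spans exactly `train_years`
    calendar years and is followed by a one-year test window.
    """
    one_day = timedelta(days=1)
    for test_year in range(first_year + train_years, last_year + 1):
        train_start = datetime(test_year - train_years, 1, 1)
        test_start = datetime(test_year, 1, 1)
        test_end = datetime(test_year + 1, 1, 1) - one_day
        folds = {
            # Assumption: bounds are inclusive, as the earlier folds suggest.
            'training-set': [train_start, test_start - one_day],
            'test-set': [test_start, test_end],
        }
        yield test_year, folds


# Example: 5-year training windows over a hypothetical 2000-2018 sample.
walk_forward = dict(sliding_window_folds(first_year=2000, last_year=2018, train_years=5))
for test_year, folds in walk_forward.items():
    print(test_year, folds['training-set'], folds['test-set'])
```

Each `folds` dict could then be placed in `env_config['folds']` for one re-training round. Because the training window moves forward instead of growing, every observation appears in at most `train_years` rounds, which avoids the overweighting of old data described above.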