
Error encountered on 'Unreal Stacked Lstm example' #80

Closed · JaCoderX opened this issue Nov 21, 2018 · 14 comments

@JaCoderX (Contributor) commented Nov 21, 2018

I encountered an error and I'm trying to figure out whether it is a wrong setting on my end or a possible bug.

I took the 'Unreal Stacked Lstm example' and set the trainer to work with PPO.
In the strategy params I made the following changes:

MyCerebro.addstrategy(
    ...
    skip_frame=60,  # skip_frame_period <= avg_period <= time_embedding_period
    time_dim=128,
    avg_period=100,
    ...

I'm experimenting with skip_frame and time_dim to compare two models trained on different time frames, with the following settings:

  • First model -
    one year of data at 1 min resolution, skip_frame ~ one hour (60 frames), time_dim ~ 2 hours (or above)

  • Second model -
    one year of data at 1 hour resolution, skip_frame ~ one hour (1 frame), time_dim ~ 120 hours (or above)

With these settings the models make decisions at the same times (every hour), but after learning two different representations of the same data, as sketched below.
I'm curious to see whether the '1 min model' can also learn the representation of the '1 hour model'.
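Roughly, the two configurations I have in mind (just my intended settings written out as plain values for clarity, not actual BTgym config):

# Both models end up acting roughly once per hour of market time,
# but each sees a different representation of the same year of data:
model_1_min = dict(data_resolution='1 min', skip_frame=60, time_dim=128)    # ~2+ hours of embedding
model_1_hour = dict(data_resolution='1 hour', skip_frame=1, time_dim=120)   # ~120 hours of embedding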

I only get this error after changing to the above values (there is no error with the default values: skip_frame=10, time_dim=30, avg_period=20).
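Both sets of values satisfy the ordering constraint from the strategy comment (skip_frame <= avg_period <= time_dim), so that constraint itself isn't what I'm violating; a quick check:

# skip_frame_period <= avg_period <= time_embedding_period must hold for both settings:
for name, (skip_frame, avg_period, time_dim) in {
    'default': (10, 20, 30),
    'changed': (60, 100, 128),
}.items():
    assert skip_frame <= avg_period <= time_dim, name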

This is the error:

Traceback (most recent call last):
File "/home/jack/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1292, in _do_call
return fn(*args)
File "/home/jack/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1277, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/jack/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1367, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Nan in summary histogram for: PPO/model/advantage
[[{{node PPO/model/advantage}} = HistogramSummary[T=DT_FLOAT, _device="/job:worker/replica:0/task:0/device:CPU:0"](PPO/model/advantage/tag, _recv_PPO/PPO/on_policy_advantage_pl_0)]]
[[{{node PPO/clip_by_global_norm/mul_47_S555}} = _Recvclient_terminated=false, recv_device="/job:ps/replica:0/task:0/device:CPU:0", send_device="/job:worker/replica:0/task:0/device:CPU:0", send_device_incarnation=-1203984878618912631, tensor_name="edge_4873_PPO/clip_by_global_norm/mul_47", tensor_type=DT_FLOAT, _device="/job:ps/replica:0/task:0/device:CPU:0"]]

@Kismuz (Owner) commented Nov 21, 2018

@JacobHanouna ,

  • The log says the advantage estimation graph returned NaN, which has been passed to the freaky, NaN-intolerant tf.summary.histogram:
    "InvalidArgumentError (see above for traceback): Nan in summary histogram for: PPO/model/advantage".
  • It's unclear what caused the NaN; I'd recommend trying the A3C class with the same settings to check whether the error persists.
  • While playing with env. parameters it is good practice to run the environment manually a couple of times before launching the TF cluster; some inconsistent behaviour can be spotted right away. I do it so often that I wrote a very simple wrapper to collect data; it can be used like this:
from btgym import BTGymEnv  # assumed import; the original snippet had a placeholder import here
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

from btgym.research.misc_utils import EnvRunner

env = BTGymEnv(**my_env_config_kwargs)
data_provider = EnvRunner(env=env)

# Get episode and prepare data,
# here - from train dataset, set sample_type=1 to get test:

obs = data_provider.get_episode(sample_type=0)

image = data_provider.env.render('episode')

data_provider.close()

# Get stream of external observations: 
data = np.concatenate(obs['external'], axis=1)

# See what data looks like for an agent:
plt.figure(num=1, figsize=(14, 8))
plt.title('External data:')
plt.grid(True)
_ = plt.plot(data[-1, :, :])

# Show rendered episode:
plt.figure(num=2, figsize=(22, 30))
plt.title('Episode summary:')
_ = plt.imshow(image)

# See rewards closeup:
r = np.asarray(obs['reward'])
plt.figure(num=3, figsize=(14, 8))
plt.title('Reward:')
_ = plt.plot(r)
plt.grid(True)

JaCoderX changed the title from "Error encountered when using PPO on 'Unreal Stacked Lstm example'" to "Error encountered on 'Unreal Stacked Lstm example'" on Nov 22, 2018
@JaCoderX (Contributor, Author):

While playing with env. parameters it is good practice to run the environment manually a couple of times before launching the TF cluster; some inconsistent behaviour can be spotted right away. I do it so often that I wrote a very simple wrapper to collect data; it can be used like this:

Thanks for the tip. I will use it :)

It's unclear what caused the NaN; I'd recommend trying the A3C class with the same settings to check whether the error persists.

OK, you are right, this is not PPO-related. I get the same error when using the BaseAAC trainer with StackedLstmPolicy as well.

I also tried using BaseAacPolicy, which raises a different error (let me know if you need the full traceback):

INFO:tensorflow:Restoring parameters from /home/jack/tmp/test/train/model.ckpt-0
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.NotFoundError'>, Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key AAC/global/conv2d/_layer_1/W/Adam not found in checkpoint
	 [[{{node save/RestoreV2}} = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:ps/replica:0/task:0/device:CPU:0"](_recv_save/Const_0_S1, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]] 

All of these errors happen only if I change to the above settings in the strategy.

@Kismuz (Owner) commented Nov 23, 2018

@JacobHanouna, as the traceback says, it failed to load saved model weights into the new graph; this usually appears when you have changed the tf.graph definition (e.g. switched from PPO to A3C or made some other alterations to the graph) and then tried to load a previously saved model. Discard old checkpoints and start from scratch, e.g. as in the sketch below.
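A minimal way to do that (the path here is just the log directory visible in the traceback above; adjust it to whatever log_dir your launcher uses):

import shutil

# Wipe stale checkpoints/summaries so training starts from a freshly initialized graph:
shutil.rmtree('/home/jack/tmp/test', ignore_errors=True)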

@JaCoderX (Contributor, Author):

@Kismuz
This was tested on a clean model (I made sure to delete the result folder after every test).

@Kismuz (Owner) commented Nov 23, 2018

What is it trying to restore then? Have you got a clean manual env. run with the strategy settings altered your way?

  • Another option for the orig. error: switch back to PPO and alter the BaseAAC method _combine_summaries - comment out the following line:
                         tf.summary.histogram('advantage', self.local_network.on_pi_adv_target),

@JaCoderX (Contributor, Author):

@JacobHanouna, as the traceback says, it failed to load saved model weights into the new graph; this usually appears when you have changed the tf.graph definition (e.g. switched from PPO to A3C or made some other alterations to the graph) and then tried to load a previously saved model. Discard old checkpoints and start from scratch.

You were right, I probably didn't clear that result. Using BaseAacPolicy gives the same error as the original.

from btgym.research.misc_utils import EnvRunner

Can you share EnvRunner as well (if it is not private, of course :) )?

Another option for the orig. error: switch back to PPO and alter the BaseAAC method _combine_summaries - comment out the following line

After commenting out this line I get a different error:

InvalidArgumentError (see above for traceback): Found Inf or NaN global norm. : Tensor had NaN values
[[{{node AAC/VerifyFinite/CheckNumerics}} = CheckNumerics[T=DT_FLOAT, message="Found Inf or NaN global norm.", _device="/job:worker/replica:0/task:0/device:CPU:0"](AAC/global_norm/global_norm)]]
INFO:tensorflow:Error reported to Coordinator: <class 'RuntimeError'>, process() exception occurred

@Kismuz (Owner) commented Nov 23, 2018

can you share EnvRunner

it's here, update BTgym:

git pull
pip install --upgrade -e .

Found Inf or NaN global norm. : Tensor had NaN values

Well, time to manually check the environment run.
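One cheap thing to look at on a manually collected episode is whether any observation or reward already contains NaN/Inf before it ever reaches the trainer; a minimal sketch on top of the EnvRunner snippet from above:

import numpy as np

# `obs` is the dict returned by data_provider.get_episode(...) in the earlier snippet:
external = np.concatenate(obs['external'], axis=1)
rewards = np.asarray(obs['reward'])

print('external observations finite:', np.isfinite(external).all())
print('rewards finite:', np.isfinite(rewards).all())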

@JaCoderX (Contributor, Author):

I ran it manually but got no error.

@Kismuz (Owner) commented Nov 24, 2018

@JacobHanouna, can you share the exact env. setup code so I can replicate the error?

@JaCoderX (Contributor, Author):

All env settings are the ones used in the 'unreal example'; I only changed the strategy a bit, as follows (skip_frame, time_dim, avg_period):

# Define strategy and broker account parameters:
MyCerebro.addstrategy(
    DevStrat_4_11,
    start_cash=2000,  # initial broker cash
    commission=0.0001,  # commission to imitate spread
    leverage=10.0,
    order_size=2000,  # fixed stake, mind leverage
    drawdown_call=10,  # max % to lose, in percent of initial cash
    target_call=10,  # max % to win, same
    skip_frame=60,  # skip_frame_period <= avg_period <= time_embedding_period
    time_dim=128,
    avg_period=100,
    gamma=0.99,
    reward_scale=7,  # gradient's nitrox, touch with care!
    state_ext_scale=np.linspace(3e3, 1e3, num=5)
)

@Kismuz (Owner) commented Nov 24, 2018

@JacobHanouna,
Well, there is a caveat with setting time_dim and avg_period. Namely, the BTgym API shell needs to infer the observation state shape before an actual instance of the class DevStrat_4_11 is created; that means (taking into account the backtrader paradigm that strategy instances are created at actual backtest runtime) the observation state shape and all variables it depends upon have to be class attributes, and therefore cannot be set via the parameters dictionary (and, consequently, via the addstrategy() method).
So, redefining skip_frame is fine, but for the others mentioned there is no easier way than to subclass the strategy and explicitly set all class attributes; in your case it could look like this:

import numpy as np

from gym import spaces
from btgym import DictSpace
from btgym.research import DevStrat_4_11  # assumed import path for the base strategy

class DevStrat_4_11_prime(DevStrat_4_11):
    time_dim = 128  
    skip_frame = 60
    avg_period = 100
    portfolio_actions = ('hold', 'buy', 'sell', 'close')
    gamma = 0.99  
    state_ext_scale = np.linspace(3e3, 1e3, num=5)
    params = dict(
        # Note: fake `Width` dimension to use 2d conv etc.:
        state_shape=
        {
            'external': spaces.Box(low=-100, high=100, shape=(time_dim, 1, 5), dtype=np.float32),
            'internal': spaces.Box(low=-2, high=2, shape=(avg_period, 1, 6), dtype=np.float32),
            'metadata': DictSpace(
                {
                    'type': spaces.Box(
                        shape=(),
                        low=0,
                        high=1,
                        dtype=np.uint32
                    ),
                    'trial_num': spaces.Box(
                        shape=(),
                        low=0,
                        high=10 ** 10,
                        dtype=np.uint32
                    ),
                    'trial_type': spaces.Box(
                        shape=(),
                        low=0,
                        high=1,
                        dtype=np.uint32
                    ),
                    'sample_num': spaces.Box(
                        shape=(),
                        low=0,
                        high=10 ** 10,
                        dtype=np.uint32
                    ),
                    'first_row': spaces.Box(
                        shape=(),
                        low=0,
                        high=10 ** 10,
                        dtype=np.uint32
                    ),
                    'timestamp': spaces.Box(
                        shape=(),
                        low=0,
                        high=np.finfo(np.float64).max,
                        dtype=np.float64
                    ),
                }
            )
        },
        cash_name='default_cash',
        asset_names=['default_asset'],
        start_cash=None,
        commission=None,
        leverage=1.0,
        drawdown_call=5,
        target_call=19,
        portfolio_actions=portfolio_actions,
        initial_action=None,
        initial_portfolio_action=None,
        skip_frame=skip_frame,
        gamma=gamma,
        reward_scale=1.0,
        state_ext_scale=state_ext_scale,  # EURUSD
        state_int_scale=1.0,
        metadata={},
    )
...............
MyCerebro.addstrategy(
    DevStrat_4_11_prime,
    start_cash=2000,  
    commission=0.0001,  
    leverage=10.0,
    order_size=2000,  
    drawdown_call=10, 
    target_call=10,  
    skip_frame=60, 
    gamma=0.99,
    reward_scale=7, 
)

Ugly, I know; I'll try to address it in the future.

@JaCoderX (Contributor, Author):

OK Thanks :)

@JaCoderX (Contributor, Author) commented Dec 8, 2018

@Kismuz, I came across another issue while playing around with this example.

Under the trainer_config I have enabled use_value_replay=True and got the following error:

File "/home/jack/btgym/btgym/algorithms/policy/base.py", line 373, in get_pc_target
feeder = {self.pc_change_state_in: state['external'], self.pc_change_last_state_in: last_state['external']}
AttributeError: 'AacStackedRL2Policy' object has no attribute 'pc_change_state_in'

I checked the code, and in StackedLstmPolicy you have disabled Aux task 1 - Pixel Control - so pc_change_state_in never gets declared... but it appears that the base class is expecting it.
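A simplified illustration of the mismatch (hypothetical class names, not the actual BTgym code):

class BasePolicyLike:
    def get_pc_target(self, state, last_state):
        # The base implementation unconditionally references the pixel-control placeholders:
        return {
            self.pc_change_state_in: state['external'],
            self.pc_change_last_state_in: last_state['external'],
        }

class StackedLstmLikePolicy(BasePolicyLike):
    def __init__(self):
        # Pixel-control aux task disabled -> pc_change_state_in is never created.
        pass

policy = StackedLstmLikePolicy()
try:
    policy.get_pc_target({'external': 0}, {'external': 0})
except AttributeError as e:
    print(e)  # ... object has no attribute 'pc_change_state_in'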

@Kismuz (Owner) commented Dec 9, 2018

@JacobHanouna,
Fixed, update the package; thanks for spotting it.
