Gaes fees #21

crazypythonista · 2022-03-06T22:59:04Z

Hello, I have been working through this from scratch. I got as far as training the model and visualizing it, but the run crashes partway through the training session without saving the model.

VC:
Python: 3.8.10
tensorflow: 2.3.1
Windows: 11
No IDLE; running in script mode from a Windows PowerShell virtual env.

Below is the complete Traceback of the error I received.

2022-03-07 04:17:43.095316: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-03-07 04:17:43.100610: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]
Traceback (most recent call last):
  File "RL-Bitcoin-trading-bot_7.py", line 501, in <module>
    train_multiprocessing(CustomEnv, agent, train_df, train_df_nomalized, num_worker = 5, training_batch_size=50, visualize=True, EPISODES=5)
  File "D:\Mine\RLCurrent\multiprocessing_env.py", line 95, in train_multiprocessing
    a_loss, c_loss = agent.replay(states[worker_id], actions[worker_id], rewards[worker_id], predictions[worker_id], dones[worker_id], next_states[worker_id])
  File "RL-Bitcoin-trading-bot_7.py", line 121, in replay
    advantages, target = self.get_gaes(rewards, dones, np.squeeze(values), np.squeeze(next_values))
  File "RL-Bitcoin-trading-bot_7.py", line 93, in get_gaes
    deltas = [r + gamma * (1 - d) * nv - v for r, d, nv, v in zip(rewards, dones, next_values, values)]
  File "RL-Bitcoin-trading-bot_7.py", line 93, in <listcomp>
    deltas = [r + gamma * (1 - d) * nv - v for r, d, nv, v in zip(rewards, dones, next_values, values)]
TypeError: unsupported operand type(s) for +: 'NoneType' and 'float'

Any sort of help is highly appreciated. If needed I'll post code snippets as well for more clarity.
Thanks.

HoaxParagon · 2022-03-06T23:15:14Z

This is a duplicate of #18

HoaxParagon · 2022-03-07T14:15:09Z

Also a duplicate of #9

wanga10000 · 2022-03-14T07:41:03Z

Hey, I think the problem originates from the output of critic_predict. My guess is that the author's original PPO implementation had the critic model also take the previous predicted value as an input, but that was removed in this tutorial, so the critic model no longer uses a previous-value input. Maybe try removing the np.zeros input in the critic_predict function.
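
A note on reading the traceback: in the expression r + gamma * (1 - d) * nv - v, the only + has r on its left, so "unsupported operand type(s) for +: 'NoneType' and 'float'" means some entry of rewards was None when get_gaes ran. A minimal sketch for locating it is below. The helper names find_none_inputs and compute_deltas and the default gamma=0.99 are assumptions made for illustration, not code from the tutorial; only the delta expression is copied from the traceback.

```python
def find_none_inputs(**named_sequences):
    """Return {name: [indices]} for every sequence that contains None."""
    bad = {}
    for name, seq in named_sequences.items():
        indices = [i for i, x in enumerate(seq) if x is None]
        if indices:
            bad[name] = indices
    return bad


def compute_deltas(rewards, dones, values, next_values, gamma=0.99):
    """Compute the per-step deltas, failing early with a readable message.

    gamma=0.99 is an assumed default; use whatever the script actually passes.
    """
    bad = find_none_inputs(
        rewards=rewards, dones=dones, values=values, next_values=next_values
    )
    if bad:
        # Report which input holds None and at which steps.
        raise ValueError(f"None found in get_gaes inputs: {bad}")
    # Same expression as line 93 in the traceback.
    return [
        r + gamma * (1 - d) * nv - v
        for r, d, nv, v in zip(rewards, dones, next_values, values)
    ]
```

Calling find_none_inputs(rewards=rewards, dones=dones) just before the failing line should show which step produced the None, which then points back to where that value was appended during the episode.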
