This repository has been archived by the owner on Jan 16, 2023. It is now read-only.
For example, in network.py for dmlab, you use clipped_reward (line 112). Can you explain why you construct the reward and the network this way? And are there any constraints on how I set rewards in the environment?
Thanks
For this project we didn't try to innovate much on the networks. The network is really from the IMPALA work: https://arxiv.org/abs/1802.01561
For some games it may make sense not to clip, or to represent the reward as a one-hot vector. However, I haven't found this to be very important. It matters most when the same object in a game can yield different rewards depending on other factors in the game.
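For illustration, here is a minimal sketch of the two options discussed above (clipping versus a one-hot reward representation). The function names are hypothetical and not taken from the repository:

```python
import numpy as np

def clip_reward(reward):
    # Reward clipping as commonly used in IMPALA-style agents:
    # squash the scalar reward to [-1, 1] so reward scale across
    # games does not affect learning dynamics.
    return float(np.clip(reward, -1.0, 1.0))

def one_hot_reward(reward):
    # Alternative: encode only the reward's sign as a one-hot
    # vector [negative, zero, positive]. The network then sees
    # that "something happened" without the raw magnitude.
    idx = int(np.sign(reward)) + 1  # -1 -> 0, 0 -> 1, +1 -> 2
    return np.eye(3, dtype=np.float32)[idx]

print(clip_reward(5.0))      # 1.0
print(one_hot_reward(-3.0))  # [1. 0. 0.]
```

Either representation can be fed to the network alongside the observation; clipping keeps the input a single scalar, while the one-hot form discards magnitude entirely.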