Agents:
- Renamed agent argument `reward_preprocessing` to `reward_processing`; in the case of the Tensorforce agent, it moved into `reward_estimation[reward_processing]` (see the sketch below)
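A minimal sketch of the renamed argument, assuming a Gym CartPole environment; the clipping spec and the Tensorforce agent's other arguments shown here are illustrative assumptions, not part of this change:

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

# Most agents: formerly `reward_preprocessing`, now the top-level argument
# `reward_processing`.
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    reward_processing=dict(type='clipping', lower=-1.0, upper=1.0),
)

# Tensorforce agent: the argument moves inside `reward_estimation`.
agent = Agent.create(
    agent='tensorforce', environment=environment,
    update=dict(unit='timesteps', batch_size=64),
    objective='policy_gradient',
    reward_estimation=dict(
        horizon=20,
        reward_processing=dict(type='clipping', lower=-1.0, upper=1.0),
    ),
)
```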
Distributions:
- New categorical distribution argument `skip_linear`, which skips adding the implicit linear logits layer (illustrated below)
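A hedged sketch of how this might be specified; the per-action-type `distributions` mapping and the network sizes are assumptions for illustration:

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

# The final dense layer already outputs one logit per action value (2 for
# CartPole), so the categorical distribution's implicit linear logits layer
# can be skipped via `skip_linear`.
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    policy=dict(
        network=[dict(type='dense', size=32), dict(type='dense', size=2)],
        distributions=dict(int=dict(type='categorical', skip_linear=True)),
    ),
)
```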
Environments:
- Support for multi-actor parallel environments via the new function `Environment.num_actors()` (see the sketch after this list)
- Runner uses multi-actor parallelism by default if the environment is multi-actor
- New optional `Environment` function `episode_return()`, which returns the true return of the last episode, for cases where the cumulative sum of environment rewards is not a suitable metric for the runner display
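A minimal sketch of a custom multi-actor environment; the batched reset/execute convention (one row per actor), the dummy dynamics, and the episode length are assumptions for illustration:

```python
import numpy as np
from tensorforce import Environment


class MultiActorEnv(Environment):
    """Toy environment with 4 independent actors (illustrative sketch)."""

    def __init__(self):
        super().__init__()
        self._timestep = 0
        self._true_return = 0.0

    def states(self):
        return dict(type='float', shape=(3,))

    def actions(self):
        return dict(type='int', num_values=2)

    def num_actors(self):
        # Returning > 1 marks this as a multi-actor environment.
        return 4

    def reset(self):
        self._timestep = 0
        self._true_return = 0.0
        # Assumed convention: one state per actor.
        return np.random.random(size=(4, 3)).astype(np.float32)

    def execute(self, actions):
        # Assumed convention: batched states, terminal flags and rewards.
        self._timestep += 1
        states = np.random.random(size=(4, 3)).astype(np.float32)
        terminal = np.full(shape=(4,), fill_value=(self._timestep >= 100))
        reward = np.random.random(size=(4,)).astype(np.float32)
        self._true_return += float(reward.mean())
        return states, terminal, reward

    def episode_return(self):
        # Optional: the "true" episode return for runner display, e.g. a game
        # score, when the raw cumulative reward sum is not meaningful.
        return self._true_return
```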
Examples:
- New `vectorized_environment.py` and `multiactor_environment.py` scripts to illustrate how to set up a vectorized/multi-actor environment (a usage sketch follows)
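In the spirit of those scripts, a minimal run loop reusing the `MultiActorEnv` sketch defined above; since `num_actors()` returns more than one, the Runner picks up multi-actor parallelism by default:

```python
from tensorforce import Agent, Runner

# `MultiActorEnv` is the sketch class from the Environments section above.
environment = MultiActorEnv()
agent = Agent.create(agent='ppo', environment=environment, batch_size=10)

runner = Runner(agent=agent, environment=environment)
runner.run(num_episodes=100)
runner.close()
```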