Update ALE for ROM fixes and snapshot/restore of state #15
These ALE fixes alter the semantics of the following Atari envs:
- Asteroids: add score wrapping
- Breakout: reduce minimal action set
- Kaboom: add missing action, fix terminal (was never done?)
- Kangaroo: different score(?)
- QBert: fix lives/terminal issue
- WizardOfWor: different score and lives(?)
By default only errors are logged.
n.b. only the existing ROMs are updated and new ROMs are not included
`State` is the environment state without system state like pseudorandomness while `SystemState` is the full state. To restore an identical environment where the same actions yield the same successors, make use of `SystemState`s.
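The distinction can be illustrated with a toy model. The sketch below is not the ALE or atari-py API: `ToyEmulator` and all of its methods are hypothetical names, with a score standing in for the environment state and a Python `random.Random` standing in for the emulator's pseudorandomness. It only shows why a snapshot that omits the RNG cannot guarantee identical successors.

```python
import random


class ToyEmulator:
    """Hypothetical stand-in for an emulator; not the ALE or atari-py API."""

    def __init__(self, seed=0):
        self.score = 0                  # environment state
        self.rng = random.Random(seed)  # system state: pseudorandomness

    def act(self, action):
        # The successor depends on both the action and the RNG.
        self.score += action + self.rng.randint(0, 9)
        return self.score

    def clone_state(self):
        # Analogue of `State`: environment only, RNG excluded.
        return {"score": self.score}

    def restore_state(self, snapshot):
        self.score = snapshot["score"]

    def clone_system_state(self):
        # Analogue of `SystemState`: environment plus RNG.
        return {"score": self.score, "rng": self.rng.getstate()}

    def restore_system_state(self, snapshot):
        self.score = snapshot["score"]
        self.rng.setstate(snapshot["rng"])


def rollout(emu, actions):
    """Apply the actions in order and return the score after each one."""
    return [emu.act(a) for a in actions]


emu = ToyEmulator(seed=123)
emu.act(1)
actions = [1, 2, 3, 1, 2, 3, 1, 2]

# Take both kinds of snapshot at the same moment.
full = emu.clone_system_state()
partial = emu.clone_state()

first = rollout(emu, actions)

# Full restore: the RNG is rewound, so the same actions replay identically.
emu.restore_system_state(full)
second = rollout(emu, actions)
identical_replay = first == second

# Partial restore: the score is reset, but the RNG keeps its advanced state,
# so later successors are no longer guaranteed to match.
emu.restore_state(partial)
rng_rewound = emu.rng.getstate() == full["rng"]
```

In this toy, `identical_replay` holds only because the full snapshot carries the RNG. After the partial restore the score matches the snapshot, but the generator has moved on.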
Alright, this should be ready. I did a few simple checks of the environments, but more testing and review might be a good idea before merge.
@nottombrown could you try running our RL algorithm baselines with this branch?
Bumped the version and am testing now using PPO (because it's fast)
Is it supposed to preserve seed determinism with previous versions? I'm a bit confused by the graph above, @shelhamer. It looks like Enduro was preserved, but not Pong.
Awesome -- the rare uniform improvement. Let's merge!
@nottombrown I'd expect slight improvement in Qbert and Breakout because of minor alterations of the ROMs in 199fdf3 (which fix a terminal condition and reduce the minimal action set respectively), and otherwise equivalent results. However, seeds are not preserved across this change because of a switch in the ALE RNG in Farama-Foundation/Arcade-Learning-Environment@a5241ce; see https://github.com/mgbellemare/Arcade-Learning-Environment/commits/master/src/emucore/Random.cxx for more history. The new seed/RNG scheme is still deterministic, but it gives different random numbers, so it's all the more important to bump the Atari envs to v4.
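As a rough illustration of why a seed does not carry over when the generator changes, here is a minimal sketch. It does not reproduce either ALE RNG: `ToyLCG`, `old_stream`, and `new_stream` are hypothetical names, with a simple linear congruential generator standing in for an "old" RNG and Python's `random.Random` standing in for a "new" one.

```python
import random


class ToyLCG:
    """Hypothetical linear congruential generator standing in for an 'old' RNG."""

    def __init__(self, seed):
        self.state = seed % 2**31

    def next_int(self, bound):
        # One LCG step, using the constants from the well-known ANSI C example.
        self.state = (1103515245 * self.state + 12345) % 2**31
        return self.state % bound


def old_stream(seed, n=5):
    """First n draws from the stand-in 'old' generator."""
    rng = ToyLCG(seed)
    return [rng.next_int(1000) for _ in range(n)]


def new_stream(seed, n=5):
    """First n draws from the stand-in 'new' generator."""
    rng = random.Random(seed)
    return [rng.randrange(1000) for _ in range(n)]


# Each scheme is deterministic on its own ...
old_is_deterministic = old_stream(42) == old_stream(42)
new_is_deterministic = new_stream(42) == new_stream(42)

# ... but the same seed yields a different sequence under each scheme.
seed_carries_over = old_stream(42) == new_stream(42)
```

Results stay reproducible within one scheme, but runs seeded identically before and after the switch are not comparable draw for draw, which is the motivation for a version bump.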
Thanks @nottombrown for catching the version bump!
gym 0.9.1 upgrades the Atari environments to v4 by:
- upgrading atari-py to the latest ALE, including ROM fixes openai/atari-py#15
- upgrading gym to the latest atari-py and declaring Atari v4 #584
- exposing snapshot/restore of Atari environments through gym #575

Switching to v4 is encouraged for correctness, although the environment differences are minor. For exact comparison with existing Atari v3 results, make use of earlier gym and atari-py versions.
random->mat1 = 4753849;
random->mat2 = 3231259923;
random->tmat = 614784120;

int i;
Declaring variables in the middle of a block is not allowed in C89; it only works there as a GNU extension (C99 and later permit it).
Update ALE for ROM fixes and snapshot/restore of state
gym 0.9.1 upgrades the Atari environments to v4 by:
- upgrading atari-py to the latest ALE, including ROM fixes openai/atari-py#15
- upgrading gym to the latest atari-py and declaring Atari v4 openai/gym#584
- exposing snapshot/restore of Atari environments through gym openai/gym#575

Switching to v4 is encouraged for correctness, although the environment differences are minor. For exact comparison with existing Atari v3 results, make use of earlier gym and atari-py versions.
ALE development has gone on from the point that atari-py forked. While atari-py forked from ALE without keeping its history, the diff can be inspected by comparing `atari-py/atari_py/ale_interface/src` and `ale/src`. Among the various fixes and improvements two stand out for the purposes of gym: ROM fixes, and snapshot/restore of state (`cloneSystemState()` and `restoreSystemState()`).

This is required to address openai/gym#402. I'm happy to carry out the gym change to expose the snapshot/restore functionality next.
Do not merge until we double-check that nothing breaks and decide that we're happy with the new ROM semantics. Once merged, gym should bump the env versions when the new atari-py is incorporated.