Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] Add functionality of FireReset to AtariPreprocessing wrapper #805

Closed

Conversation

balazsgyenes
Copy link

Description

Breakout requires an agent to use FIRE to launch the next "ball" after losing a life or losing the game.
Without this feature, the agent can get stuck if it hasn't learned to press FIRE after losing a life (or losing the game), and the FIRE action otherwise does nothing.

This PR integrates the functionality of Stable-Baselines3's FireResetEnv wrapper into Gymnasium's AtariPreprocessing wrapper.

Fixes #781

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Factor life tracking and reactions into subfunctions.
@balazsgyenes
Copy link
Author

Note: I'm not quite done with this feature yet, and have not tested and fully documented the changes. Sorry for the delay, I will do it as soon as I can. In the meantime, feedback is very much appreciated.

In addition, I noticed another potential bug that I wanted to ask about before addressing. The wrapper currently takes the max over the last two observations of the N frame skip frames. However, if the wrapped environment is done (either due to truncated or terminated), no further frames are saved, and the obs_buffer can be in any state (2 old observations from the previous step, 1 new / 1 old, 2 new observations). This has probably never caused any issues because Atari environments are typically not trained with time limits (I think?), and the final observation is ignored anyway if terminated=True. However, it is incorrect if truncated=True.

A potential fix would use a deque object from the collections library to save every observation, with only the last 2 observations retained. This would likely come with a performance penalty, since e.g. for frame_skip=4, half the observations are not even generated. @pseudo-rnd-thoughts, do you think this bug should be addressed?

@pseudo-rnd-thoughts
Copy link
Member

Reading through openai/baselines#240 then I believe that FireReset is not necessary as a feature as it is implemented in ale-py automatically. Therefore, SB3 and CleanRL etc who use it should be able to remove it from the environment stack

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Proposal] Add the functionality of FireResetEnv to the AtariPreprocessing wrapper
2 participants