Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Randomize LunarLander wind generation at reset to gain statistical independence between episodes #959

Conversation

TobiasKallehauge
Copy link
Contributor

@TobiasKallehauge TobiasKallehauge commented Mar 8, 2024

Description

This request changes the way gymnasium/envs/box2d/lunar_lander.py randomly draws a new wind_idx and torque_idx so new indexes are drawn randomly whenever the environment is reset rather than only at initialization. This ensures that the environment is statistically independent between episodes, which it is currently not. Changed the version from v2 to v3 and added a unit test to check that the seed is correctly working in the new version.

Fixes #954

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

…ical independence between epsiodes

This will ensure that the environment is statistically independent between episodes, which it is currently not.
Copy link
Collaborator

@Kallinteris-Andreas Kallinteris-Andreas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will also need Unit Testing to validate the changes.

gymnasium/envs/box2d/lunar_lander.py Outdated Show resolved Hide resolved
@pseudo-rnd-thoughts
Copy link
Member

@TobiasKallehauge this seems to pass all of our internal testing.
As this effects an agent's performance, we should update the environment's version number and add a note to the environment's docstring in version control about the reason for the change.

Then I think we should be go to merge

@TobiasKallehauge
Copy link
Contributor Author

I committed the version change to the documentation and updated the default version used for tests. Let me know if something else should be changed.

@Kallinteris-Andreas
Copy link
Collaborator

I want to see a unit test validating the change.

also Bump the version in registration

@pseudo-rnd-thoughts
Copy link
Member

@Kallinteris-Andreas What do you mean?

@TobiasKallehauge Could you add a new test in tests/envs/test_env_implementations.py that using different reset(seed) will cause wind_idx and torque_idx to change. (Probably use known seeds to avoid future randomness issues)

You need to change gymnasium/envs/__init__.py on the lunarlander registration to be v3 not v2

Test if setting the same seed causes same initial wind and torque index. Also testing if setting different seed causes different initial wind and torque index
@TobiasKallehauge
Copy link
Contributor Author

I added the unit test now, and the seed works as expected - I hope the test complies with your standards.

There was still an error in the previous commit for the version due to changing the version number but maybe the new one will pass after changing the tests

@TobiasKallehauge
Copy link
Contributor Author

There seems to be a few more places where the version number is referenced:

  • gymnasium/wrappers/rendering.py
  • docs/tutorials/gymnasium_basics/vector_envs_tutorial.py

Right now gymnasium/wrappers/rendering.py is throwing an error. There are also some markdown files where it is references:

  • docs/index.md
  • docs/introduction/migration_guide.md
  • docs/introduction/basic_usage.md

I will update the version in all these places (including in the markdown files) - let me know if this is wrong

Copy link
Collaborator

@Kallinteris-Andreas Kallinteris-Andreas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good to me.

Copy link
Member

@pseudo-rnd-thoughts pseudo-rnd-thoughts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @TobiasKallehauge and your rapid responses

@pseudo-rnd-thoughts pseudo-rnd-thoughts merged commit fd4ae52 into Farama-Foundation:main Mar 9, 2024
11 checks passed
@TobiasKallehauge
Copy link
Contributor Author

Thanks to you as well! I am happy to contribute

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Proposal] Randomize LunarLander wind generation at reset to gain statistical independence between episodes
3 participants