Randomize LunarLander wind generation at reset to gain statistical independence between episodes #959

TobiasKallehauge · 2024-03-08T09:49:44Z

Description

This pull request changes how gymnasium/envs/box2d/lunar_lander.py draws wind_idx and torque_idx: new indexes are now drawn every time the environment is reset rather than only once at initialization. This makes the wind statistically independent between episodes, which it currently is not. The version is bumped from v2 to v3, and a unit test checks that seeding works correctly in the new version.
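A minimal sketch of the idea, not the actual diff: the class below is a hypothetical stand-in for the environment, Python's `random.Random` stands in for the environment's seeded generator, and the index range is an illustrative assumption. The only point is where the draw happens, in `reset` rather than in `__init__`.

```python
import random


class WindIndexSketch:
    """Hypothetical stand-in for the wind bookkeeping in LunarLander."""

    INDEX_RANGE = (-9999, 9999)  # assumed range, for illustration only

    def __init__(self, enable_wind=True):
        self.enable_wind = enable_wind
        self._rng = random.Random()
        # Before the change, the indexes were drawn once, here, so they
        # were not redrawn between episodes.
        self.wind_idx = None
        self.torque_idx = None

    def reset(self, seed=None):
        if seed is not None:
            # Reseeding makes the draws below reproducible.
            self._rng = random.Random(seed)
        if self.enable_wind:
            # After the change: fresh indexes at the start of every episode.
            low, high = self.INDEX_RANGE
            self.wind_idx = self._rng.randrange(low, high)
            self.torque_idx = self._rng.randrange(low, high)
        return self.wind_idx, self.torque_idx
```

Because the draw goes through the generator that `reset(seed=...)` controls, the same seed reproduces the same pair of indexes, while consecutive unseeded resets keep producing new ones.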

Fixes #954

Type of change

New feature (non-breaking change which adds functionality)

Checklist:

I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Kallinteris-Andreas

We will also need Unit Testing to validate the changes.

gymnasium/envs/box2d/lunar_lander.py

pseudo-rnd-thoughts · 2024-03-08T11:40:06Z

@TobiasKallehauge this seems to pass all of our internal testing.
As this affects an agent's performance, we should update the environment's version number and add a note to the version history in the environment's docstring explaining the reason for the change.

Then I think we should be good to merge.

TobiasKallehauge · 2024-03-08T12:03:59Z

I committed the version change to the documentation and updated the default version used for tests. Let me know if something else should be changed.

Kallinteris-Andreas · 2024-03-08T13:19:41Z

I want to see a unit test validating the change.

Also, bump the version in the registration.

pseudo-rnd-thoughts · 2024-03-08T13:54:25Z

@Kallinteris-Andreas What do you mean?

@TobiasKallehauge Could you add a new test in tests/envs/test_env_implementations.py checking that different reset(seed) values cause wind_idx and torque_idx to change? (Probably use known seeds to avoid future randomness issues.)

You also need to change the LunarLander registration in gymnasium/envs/__init__.py to v3 instead of v2.
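One way such a test could be shaped, as a sketch rather than the test that was eventually merged: `StubWindEnv` is a hypothetical stand-in so the check runs without Box2D, and the helper names and the seeds 0 and 1 are arbitrary choices made for illustration.

```python
import random


class StubWindEnv:
    """Hypothetical stand-in exposing wind_idx and torque_idx like the real env."""

    def __init__(self):
        self._rng = random.Random()
        self.wind_idx = None
        self.torque_idx = None

    def reset(self, seed=None):
        if seed is not None:
            self._rng = random.Random(seed)
        self.wind_idx = self._rng.randrange(-9999, 9999)
        self.torque_idx = self._rng.randrange(-9999, 9999)


def wind_indexes_after_reset(make_env, seed):
    """Build a fresh env, reset it with `seed`, and return its two indexes."""
    env = make_env()
    env.reset(seed=seed)
    return env.wind_idx, env.torque_idx


def check_reset_seeding(make_env, seed_a=0, seed_b=1):
    """Same seed must reproduce the index pair; different seeds must not."""
    same_1 = wind_indexes_after_reset(make_env, seed_a)
    same_2 = wind_indexes_after_reset(make_env, seed_a)
    other = wind_indexes_after_reset(make_env, seed_b)
    assert same_1 == same_2
    assert same_1 != other


check_reset_seeding(StubWindEnv)
```

Against the real environment, `make_env` could plausibly be something like `lambda: gymnasium.make("LunarLander-v3", enable_wind=True).unwrapped`, assuming Gymnasium and Box2D are installed; that call is not exercised here.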

TobiasKallehauge · 2024-03-08T14:53:26Z

I added the unit test now, and the seed works as expected - I hope the test complies with your standards.

The previous commit still failed a check because of the version number change, but the new one may pass now that the tests have been updated.

TobiasKallehauge · 2024-03-08T15:28:12Z

There seem to be a few more places where the version number is referenced:

gymnasium/wrappers/rendering.py
docs/tutorials/gymnasium_basics/vector_envs_tutorial.py

Right now gymnasium/wrappers/rendering.py is throwing an error. There are also some markdown files where it is referenced:

docs/index.md
docs/introduction/migration_guide.md
docs/introduction/basic_usage.md

I will update the version in all of these places (including the markdown files). Let me know if this is wrong.
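A hedged sketch of how leftover references could be located before updating them. The function name, the two environment ids used as search strings, and the file suffixes are assumptions chosen for illustration; this is not a tool from the repository.

```python
from pathlib import Path


def find_stale_references(
    root,
    needles=("LunarLander-v2", "LunarLanderContinuous-v2"),  # assumed old ids
    suffixes=(".py", ".md"),
):
    """Return (path, line_number, line) for each line mentioning an old id."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(needle in line for needle in needles):
                hits.append((path, number, line.strip()))
    return hits
```

Calling `find_stale_references(".")` from the repository root and printing each hit as `path:line` would give a checklist of the scripts and markdown files still pointing at the old version.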

Kallinteris-Andreas

Overall this looks good to me.

gymnasium/envs/box2d/lunar_lander.py

pseudo-rnd-thoughts

Thanks for the PR, @TobiasKallehauge, and for your rapid responses.

TobiasKallehauge · 2024-03-10T14:41:54Z

Thanks to you as well! I am happy to contribute

TobiasKallehauge added 3 commits March 8, 2024 10:02

Moved random number seen for wind to reset function to ensure statist…

cc25271

…ical independence between episodes. This will ensure that the environment is statistically independent between episodes, which it is currently not.

Removed extra space

0887f2e

Formatting whitespace

be40220

Kallinteris-Andreas requested changes Mar 8, 2024

gymnasium/envs/box2d/lunar_lander.py (outdated, resolved)

Use internal numpy random for wind index generation to control seed

ffe78ee

TobiasKallehauge added 2 commits March 8, 2024 12:58

Added new version in to documentation describing the change

a5421e2

Changed default version from v2 to v3

0d0bf76

Farama-Foundation deleted a comment from Kallinteris-Andreas Mar 8, 2024

TobiasKallehauge added 3 commits March 8, 2024 14:58

Added new version to register

f0c76d9

Added new version to continuous version as well

eaa9dcc

Added unit test

97de3c8

Test if setting the same seed causes same initial wind and torque index. Also testing if setting different seed causes different initial wind and torque index

Updated the version number of LunarLander to v3 in a few scripts and m…

b673392

…arkdown files

Kallinteris-Andreas requested changes Mar 8, 2024

Kallinteris-Andreas reviewed Mar 8, 2024

gymnasium/envs/box2d/lunar_lander.py (outdated, resolved)

Added link to GitHub issue

60652c7

pseudo-rnd-thoughts approved these changes Mar 9, 2024

pseudo-rnd-thoughts merged commit fd4ae52 into Farama-Foundation:main Mar 9, 2024
11 checks passed
