Significant changes in scores with the v0.0.11 physics #184

valentinmace · 2022-04-06T16:59:24Z

Context:

I am running my implementation of PGA-MAP-Elites (see paper) in the Brax halfcheetah environment.

I obtained good and consistent results with the old v0.0.10 physics in halfcheetah (and in all other environments too), ranging from 6000 to 7000 at the very end of the run.

Problem description:

Since version v0.0.11, these results have changed a lot. Here are the points I want to mention:

1. At the beginning of the runs, scores are significantly higher and have more variance. With the old physics, the first scores that I plotted ranged from 2000 to 3000 at most and never went outside these bounds. With the new physics, starting scores range from 4000 to 7000 (which is more than what I obtained for a full run using the old physics). Of course these absolute values are not relevant, but the sudden change is.
2. For a full run, there is way less amplitude in the score evolution. Final scores with the new physics still range from 6000 to 7000, but now they start way higher.
3. I want to emphasize that obtaining 7000 at the very beginning of a run seems problematic, since for another seed with the same parameters I might not reach 7000 even at the end of the run.
4. It makes comparisons with old results and Brax-oriented papers irrelevant, at least in halfcheetah.

Additional details:

All parameters are kept equal for these comparisons, only the physics changes (I made use of legacy_spring=True to switch between versions).
The results reported here take into account multiple seeds for each version, by that I mean that I see a very significant change that cannot be due to statistical noise.
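
A minimal sketch of one way to check that a gap between the two physics versions exceeds seed-to-seed noise: Welch's t statistic computed over one score per seed. The function name `welch_t` and the choice of test are assumptions made for illustration; the original comparison may have been done differently.

```python
import math
import statistics


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom.

    `a` and `b` are sequences of scores, one per seed, with at least two
    values each (hypothetical inputs, e.g. new-physics vs. old-physics runs).
    """
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each sample needs at least two seeds")
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    # Squared standard error of each mean (sample variance / sample size).
    se_a = statistics.variance(a) / len(a)
    se_b = statistics.variance(b) / len(b)
    if se_a + se_b == 0:
        raise ValueError("both samples have zero variance")
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df
```

Calling `welch_t(new_scores, old_scores)` with the first-iteration scores of each seed gives a statistic that can be compared against a t distribution with `df` degrees of freedom; a large absolute value suggests the difference is unlikely to come from seed noise alone.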


cdfreeman-google · 2022-04-08T17:30:53Z

Indeed, the position-based dynamics physics introduced in v0.0.11 is quite different from the old springy physics. It models stiff joints much more accurately, whereas the legacy_spring=True physics of the original Brax was significantly more wobbly.

As an engine developer, this is a tricky problem more generally. When we know there's an area where we can improve the physics, we want to be able to make those improvements, and have them reflected in the environments that we support. At the same time, certain users, especially in the RL research community, want the physics to be exactly the same all the time, because they typically use these sorts of environments as baselines and want to be able to compare across papers.

We take the opinionated approach that our environments will always have the best physics we have to offer, but that the old behavior should remain available to users who want it, hence the legacy_spring=True kwarg. Alternatively, you can specify which Brax version you're comparing against, which should be an unambiguous indicator of environment version. Version pinning is probably a good idea anyway when doing comparison experiments.
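
One way to make the version unambiguous in practice is to store it next to every result. The sketch below is an illustration under stated assumptions: `run_metadata` is a hypothetical helper, not part of Brax, and it only relies on the standard-library `importlib.metadata` to read the installed package version.

```python
import platform
from importlib import metadata


def run_metadata(package="brax", **settings):
    """Collect the installed package version plus run settings for logging.

    `settings` holds whatever the experiment wants to record, for example
    legacy_spring=True (hypothetical usage; nothing is passed to Brax here).
    """
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # the package is not installed in this environment
    return {
        "package": package,
        "version": version,
        "python": platform.python_version(),
        "settings": settings,
    }
```

Writing `run_metadata(env_name="halfcheetah", legacy_spring=True)` into the same file as the scores (for instance with `json.dumps`) lets a reader of the results recover the exact physics that produced them.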

Stepping back: we made a pretty significant effort to maintain backwards compatibility support in the new version, and tried to loudly inform users about these changes in warning messages. Is there anything you feel like you cannot do in the new version that you could in the old? Ideally, these updates are purely additive to engine performance and capability.

valentinmace · 2022-04-11T09:25:08Z

Thanks for your answer.

There is nothing we cannot do in the new version that we could do in the old. My concern was mostly regarding RL research and comparability between papers.

As you said there is always the solution of mentioning the Brax version in the paper, but in terms of practical use I'm afraid it will not always be easy to find the exact version used in other works or simply to compare to multiple works that use different versions (if Brax physics change on a regular basis). As a Brax adept I'd like it to become a standard in the RL/Evo community and hope it won't be an obstacle :)

Another point that I want to emphasize is point 3 (iii.) from the original post. If, for any reason, the change in physics makes it suddenly very easy (or very difficult) to obtain good results on a given environment, I fear it won't help to discriminate between good and bad research ideas.

Again, thanks for answering our concerns. I totally understand your opinionated approach and will stick with legacy_spring for now!

qlan3 · 2023-12-17T07:12:50Z

With Brax 0.9.3, PPO only reaches a return of about 3000 in HalfCheetah with the implementation and hyper-parameters provided in the notebook, much lower than the previous return (~8000).

btaba · 2024-01-02T18:33:46Z

Hi @qlan3

As brax has evolved from version 0.0.11 to now, the physics has once again changed quite a bit, and @cdfreeman-google's comment still holds. For best reproducibility I would stick to a specific version of Brax and physics backend (generalized, positional, or spring).
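
A minimal sketch of a guard that refuses to run when the installed version or the backend name does not match what a set of results was produced with. `check_pinned` is a hypothetical helper and the way it is wired into a training script is an assumption; only the three backend names come from the comment above.

```python
from importlib import metadata

# Backend names as listed in the comment above.
BACKENDS = ("generalized", "positional", "spring")


def check_pinned(expected_version, backend, package="brax"):
    """Raise if the backend is unknown or the installed version differs from the pin."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend {backend!r}; expected one of {BACKENDS}")
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        raise RuntimeError(f"{package} is not installed") from None
    if installed != expected_version:
        raise RuntimeError(
            f"{package} {installed} is installed, "
            f"but the pinned version is {expected_version}"
        )
    return installed
```

Calling `check_pinned("0.9.3", "positional")` at the top of an experiment script fails fast when the environment drifts, instead of silently producing scores under different physics.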
