Change end-of-episode in CarRacing to termination as opposed to truncation #813

RedTachyon · 2023-12-05T00:54:42Z

There's been some heated debate about this, and I'm still open to having my mind changed, but imo, with the new termination-truncation split, this is a clear example of a termination rather than a truncation.

Truncation is meant to represent hitting the "externally imposed" time limit as opposed to actually reaching a terminal state. When the car does the laps that it needs, the episode should be terminated.

To give some more information, I also added an info entry describing the reason for termination (whether or not the lap was finished)
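As a rough illustration of how downstream code might use that info entry, here is a minimal sketch. The key name `lap_finished` and the helper `describe_episode_end` are assumptions made for this example; the description above only says that an entry exists, not what it is called.

```python
from typing import Any, Mapping


def describe_episode_end(
    terminated: bool, truncated: bool, info: Mapping[str, Any]
) -> str:
    """Classify why an episode ended, given the flags and info from env.step().

    Hypothetical helper: "lap_finished" is an assumed key name.
    """
    if terminated:
        # Assumed key; missing means the environment did not report a reason.
        lap_finished = info.get("lap_finished")
        if lap_finished is True:
            return "terminated: lap finished"
        if lap_finished is False:
            return "terminated: lap not finished"
        return "terminated: reason not reported"
    if truncated:
        # Truncation is the externally imposed time limit, not a terminal state.
        return "truncated: external time limit"
    return "running"
```

In a training loop, this would be called with the `terminated`, `truncated`, and `info` values returned by each `env.step(action)` call, for example to log how often episodes end with a completed lap.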

This also requires a version bump, since implementations that correctly handle termination/truncation may converge to different policies.


pseudo-rnd-thoughts · 2023-12-06T00:40:59Z

There are a few options, and we need to plot training curves for each:

1. Do nothing and keep the buggy truncation case.
2. Fix the truncation testing to check that the agent truncates correctly after 2 (or more) laps (adding the truncation lap number as an environment parameter, for later) - https://pypi.org/project/ufal.pybox2d/
3. Change to termination after a single lap.

Kallinteris-Andreas · 2023-12-14T14:04:36Z

For reference, this is the initial discussion: #106

Here is my analysis: giving a truncation signal after finishing the lap would imply that the episode is not really over and that there would be more laps left to complete, while giving a termination signal would indicate that nothing matters after completing the lap. For example, crashing the car afterwards would be totally fine (realistically, that is not going to happen).

In terms of learning performance, I suspect both options would be really close regardless.
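To make the semantic difference concrete, here is a minimal sketch of a one-step TD target, assuming a value-based learner that bootstraps from the next state's value estimate. The function name `td_target` and its parameters are hypothetical and chosen for illustration; this is not code from the PR or from any particular library.

```python
def td_target(
    reward: float,
    next_value: float,
    terminated: bool,
    truncated: bool,
    gamma: float = 0.99,
) -> float:
    """One-step TD target that distinguishes termination from truncation.

    Hypothetical helper for illustration only.
    """
    if terminated:
        # Terminal state: nothing after this step matters, so no bootstrap.
        return reward
    # Running or truncated: the task would have continued, so the target
    # still bootstraps from the estimated value of the next state.
    # `truncated` is accepted only to make explicit that it does not
    # cut off the bootstrap term.
    return reward + gamma * next_value


# Same final transition, labelled two ways.
as_termination = td_target(1.0, 10.0, terminated=True, truncated=False)
as_truncation = td_target(1.0, 10.0, terminated=False, truncated=True)
```

Under these assumptions, the two targets differ by exactly `gamma * next_value` on the final transition of the lap, which is why any behavioral difference would show up mainly near the end of the race and only for implementations that treat the two flags differently.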

pseudo-rnd-thoughts · 2024-07-31T12:08:50Z

@RedTachyon Any chance you can run some training plots for this to measure the change?
Can you also update the changelog for the environment? Otherwise, this looks good to me.

RedTachyon · 2024-08-08T22:33:23Z

Re: changelog - do you mean in the docstring? That's already done. The docs site page (I think) gets automatically generated from that.

Re: experiments - probably too much effort for the benefit. The impact would likely be visible in the qualitative behavior of the agent towards the very end of the race, and that's contingent on the implementation correctly accounting for the termination/truncation split

RedTachyon added 3 commits December 5, 2023 01:47

Change end-of-episode truncation to termination in Car Racing. Bump e…

60bebb0

…nv version

Add an info entry

ad3b1a3

Rename info entry

795508b

pseudo-rnd-thoughts mentioned this pull request Dec 12, 2023

[request] Clarify in the description of Car Racing environment, how many laps is the objective #840

Open

RedTachyon added 2 commits July 30, 2024 20:22

Merge branch 'main' into carracing-termination

9e06113

Replace any stray CarRacing-v2 with CarRacing-v3

6d106ff

pseudo-rnd-thoughts approved these changes Aug 9, 2024

pseudo-rnd-thoughts merged commit c9e2957 into main Aug 9, 2024
23 checks passed

pseudo-rnd-thoughts deleted the carracing-termination branch August 30, 2024 09:31
