I don't think it's still an MDP if you change this to a termination, since the observation doesn't really carry any information about whether a lap has been completed or how much of the lap has been completed (or does it? I'm not sure). So making it a termination may be problematic, at least as I understand it.
Proposal
Currently in Car Racing, when the agent finishes a lap, the environment is marked as truncated instead of terminated. This seems like a really odd choice to me.
Gymnasium/gymnasium/envs/box2d/car_racing.py, lines 557 to 561 in cc7d8dd
This was added in openai/gym#2890 alongside an actual fix to the environment logic. I suspect the review focused on the bug fix and overlooked the undiscussed change, so it slipped through the cracks. (BTW now you see why I'm always being annoying about out-of-scope changes in PRs and similar stuff)
Finishing a lap is a very clear example of episode termination. You reach a terminal state after making a full loop, and the episode ends. It should never have been marked as truncation.
The annoying part is that the environment version was bumped for this (but also for the actual bug fix), so we'll have to bump it again. But I can't really see any justification for keeping this marked as truncation, which is inconsistent with the entire rationale for what truncation is meant to be (reaching the time limit). The explanation in some of the comments was that finishing the lap shouldn't be treated as a failure, but termination does not imply failure. Failure or success is defined by the reward. Termination says "You're done, nothing more to do". Truncation says "You took too long, try again".
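To make the distinction concrete, here is a minimal sketch of how the two flags typically feed into a one-step TD target in a value-based agent. The function `td_target` and its parameters are hypothetical names for illustration only, not part of Gymnasium or of the Car Racing code. It assumes the usual convention that a learner stops bootstrapping on termination but keeps bootstrapping on truncation.

```python
def td_target(reward, next_value, terminated, truncated, gamma=0.99):
    """One-step TD target under the terminated/truncated convention.

    Hypothetical helper for illustration; not a Gymnasium API.
    """
    if terminated:
        # Terminal state: nothing more to do, so there is no future return.
        return reward
    # Not terminated. This also covers truncated=True: the episode was cut
    # short (e.g. time limit), but the next state still has value, so we
    # keep bootstrapping from the value estimate.
    return reward + gamma * next_value


# Same transition, flagged two different ways:
as_termination = td_target(1.0, 10.0, terminated=True, truncated=False, gamma=0.5)
as_truncation = td_target(1.0, 10.0, terminated=False, truncated=True, gamma=0.5)
print(as_termination, as_truncation)
```

With the current flag, a learner following this convention would still add a discounted value estimate for the state after the finished lap, even though the episode has nothing left in it.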
@pseudo-rnd-thoughts @jkterry1 @araffin