Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix jerky camera angle in car racing env #2525

Merged
merged 6 commits into from Feb 6, 2022

Conversation

carlosluis
Copy link
Contributor

@carlosluis carlosluis commented Dec 12, 2021

Fixes #2099

You can find context and investigations in the issue.

@araffin
Copy link
Contributor

araffin commented Dec 16, 2021

Please wait a little before merging, there might be two additional fixes to include. I will try to detail them tomorrow, but in short:

@araffin
Copy link
Contributor

araffin commented Dec 17, 2021

reward on the last tile is weird: only negative (breaks Markov assumption again as no info about the last tile is given)

In fact, the bug seems to come from the completion of a lap wrongly detected. See below:

car_racing.mp4

@carlosluis
Copy link
Contributor Author

No worries, we will wait a bit before merging because we want to migrate to pygame from pyglet as well alongside the version bump (context in #2099). I won't have time to get around to do that before I go on vacation for the rest of the year, so there's no rush on my side.

@araffin
Copy link
Contributor

araffin commented Dec 20, 2021

I found the issue, unfortunately, it is not clear what to do...
My agent was actually cutting corner... so it managed to complete the track without visiting every tile, as a result, the episode is not terminated when the car finishes one full lap.
There are different options:

  1. do not change anything, that's the agent fault
  2. consider that the track is completed if we pass through the first tile again and that 95% of all tiles are visited. (detecting a full lap is not trivial as going in the wrong direction is allowed)
  3. change the way visit is computed but this may introduce other reward hack/bugs

cutting_corner

@jkterry1
Copy link
Collaborator

It seems to me like #2 is the most natural. Is there any big issue with this approach that I'm not seeing? We'd obviously need to try learning with the changes before merging to ensure no unexpected behaviors occur given though.

@carlosluis
Copy link
Contributor Author

@araffin I implemented the workaround #2 you suggested. I tested manually by controlling myself the car but since you seem to have a good learning setup for this env maybe it's worth it for you to train again with this changes to verify there are no new exploits

@RedTachyon
Copy link
Contributor

I don't have any strong opinions on the functionality, but I'd like to see some comments in the code explaining what's going on in lines 391-394 since it's a pretty unusual pattern. Can this be simplified/made more pythonic? Also, couldn't this operation be performed once per initialization or once per reset, and the coordinates of the tile can then be cached?

@araffin
Copy link
Contributor

araffin commented Jan 12, 2022

@araffin I implemented the workaround #2 you suggested. I tested manually by controlling myself the car but since you seem to have a good learning setup for this env maybe it's worth it for you to train again with this changes to verify there are no new exploits

sorry, I've got notifications off on gym, I implemented the fixes in my fork:
https://github.com/araffin/gym/tree/fix/carracing

For detecting the first tile, you should probably use the idx of the tile instead (see my fork where I set an id for each tile).

I think we should provide an argument to choose how much should be visited to consider the lap over.

@jkterry1 jkterry1 mentioned this pull request Jan 13, 2022
13 tasks
@jkterry1
Copy link
Collaborator

@araffin are you happy with carlos' changes?

@jkterry1 jkterry1 merged commit 62e5272 into openai:master Feb 6, 2022
and self.env.tile_visited_count / len(self.env.track)
> self.lap_complete_percent
):
self.env_new_lap = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized this should be self.env.new_lap no?
and in that case, termination is due to reaching lap end (should not be treated as normal termination)

@pseudo-rnd-thoughts
Copy link
Contributor

Would you want to make a PR for this?

@araffin
Copy link
Contributor

araffin commented May 18, 2022

Would you want to make a PR for this?

well, current implementation is buggy, so I would say yes ;). We also need to fix TimeLimit wrapper (it should check if the TimeLimit.truncated key already exist or not)

@araffin araffin mentioned this pull request Jun 14, 2022
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CarRacing-v0 - Weird Camera Jerk Issue
5 participants