Change AgentProcessor logic to fix memory leak #3383
vincentpierre
left a comment
Can we have some training results on Bouncer and Hallway to see if there are regressions?
    if not self.last_step_result[_gid][0].done:
        if "action" in take_action_outputs:
            self.policy.save_previous_action(
                [global_id], take_action_outputs["action"]
Should this be _gid?
Yes
    self.policy.save_previous_action(
        previous_action.agent_ids, take_action_outputs["action"]
    # Index is needed to grab from last_take_action_outputs
    self.last_step_result[global_id] = (
Should this be _gid?
This one needs to be global_id; it's in a different loop.
How do you know it works without a test?
By tracking the size of the dicts while running the example envs (Hallway, Bouncer). Still need to think of a good unit test for this; we need to emulate on-demand decisions.
Added a unit test that checks the basic deletion mechanism between resets.
Great, just wanted to make sure we had some checks on it.


After a recent change, an agent's ID now changes after every episode reset. As a result, the dictionaries keyed by agent ID inside the AgentProcessor kept growing, because entries for old IDs were never removed. This PR cleans up how items are added and deleted at reset boundaries so that the dicts no longer grow.
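
The cleanup idea can be illustrated with a minimal sketch. This is not the ML-Agents implementation: `TinyAgentProcessor`, `add_experience`, `_clean_agent_data`, `experience_buffers`, and `run_episodes` are hypothetical names chosen for illustration, and only `last_step_result` echoes a name from the diff. The sketch assumes that a finished agent's ID is never reused, so its entries can be dropped as soon as its final step is processed.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


class TinyAgentProcessor:
    """Hypothetical stand-in for AgentProcessor; not the ML-Agents code."""

    def __init__(self) -> None:
        # Both dicts are keyed by agent ID, so stale IDs must be removed explicitly.
        self.last_step_result: Dict[str, Tuple[Any, bool]] = {}
        self.experience_buffers: Dict[str, List[Any]] = defaultdict(list)

    def add_experience(self, agent_id: str, obs: Any, done: bool) -> None:
        self.experience_buffers[agent_id].append(obs)
        if done:
            # Episode ended: this ID will not be seen again, so drop its state.
            self._clean_agent_data(agent_id)
        else:
            self.last_step_result[agent_id] = (obs, done)

    def _clean_agent_data(self, agent_id: str) -> None:
        # pop() with a default avoids a KeyError if the agent finished on its first step.
        self.last_step_result.pop(agent_id, None)
        self.experience_buffers.pop(agent_id, None)


def run_episodes(processor: TinyAgentProcessor, num_episodes: int, steps: int) -> None:
    for episode in range(num_episodes):
        # Assumption for illustration: a fresh ID is issued after every reset.
        agent_id = f"agent-0-episode-{episode}"
        for step in range(steps):
            processor.add_experience(agent_id, obs=step, done=(step == steps - 1))


processor = TinyAgentProcessor()
run_episodes(processor, num_episodes=100, steps=5)
dict_sizes = (len(processor.last_step_result), len(processor.experience_buffers))
print(dict_sizes)
```

Checking that both sizes stay flat across many simulated resets mirrors the manual check described in the conversation (tracking dict sizes on Hallway and Bouncer), just without a real environment.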