Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Agent reset() not called before starting new episodes in SingleThreadedWorker #51

Open
dwang9302 opened this issue Feb 22, 2019 · 5 comments
Assignees

Comments

@dwang9302
Copy link

Coming from tensorforce (where agent.reset() is called in episode loop), and by reading the doc on agents/agent.py, it seems agent.reset() is supposed to be called before starting a new episode. However currently it does not seem to be called in SingleThreadedWorker nor RayWorker before new episodes, although preprocessors stack seems to have been reset explicitly.

It would be nice if you could clarify the purpose of agent.reset() and when it is supposed to be called. Would appreciate some examples..

def reset(self):
        """
        Must be implemented to define some reset behavior (before starting a new episode).
        This could include resetting the preprocessor and other Components.
        """
        pass  # optional

Refs:
https://github.com/rlgraph/rlgraph/blob/master/rlgraph/agents/agent.py
https://github.com/rlgraph/rlgraph/blob/master/rlgraph/execution/single_threaded_worker.py

@michaelschaarschmidt
Copy link
Contributor

Hey, thanks for raising this! Will take care of shortly

@michaelschaarschmidt
Copy link
Contributor

Irrespective of the issue just a short piece of info on why the preprocessor code is even there:

The reason the preprocessors are done separately in Python is because we noticed an issue around image resizing in TensorFlow (https://hackernoon.com/how-tensorflows-tf-image-resize-stole-60-days-of-my-life-aba5eb093f35), so to do any benchmarks around images, we realised we would need to have the option to do the preprocessing with CV2 out-of-graph.

@michaelschaarschmidt
Copy link
Contributor

Cont.:

The heart of the issue is then that agent.reset() is meant to reset the preprocessor, but because of this separation as a consequence of this critical TF bug, it is not being called currently.

In any case, the name reset() is maybe misleading because users could reasonably expect that reset() fully resets the internal state (i.e. reinitalises all variables in particular).

We should hence maybe have a reset() method to fully reset agent state, and an episode_reset() which resets preprocessor state.

@dwang9302
Copy link
Author

dwang9302 commented Feb 27, 2019

Since agent.reset() is not being called because of an external bug, I assume the worker(s) logic would remain the same until the TF bug is fixed? The reason I asked is that I put some logic into my modified agent and an end-of-episode agent.reset() call becomes critical in training.
Thanks for the reply and clarification!

@michaelschaarschmidt
Copy link
Contributor

Yes since preprocessing is being handled via the python/numpy preprocessor stack implementations by default now, the tf preprocessor does not need resetting (even thought it would not hurt). So you could add a reset call if your agent needed some specific extra resetting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants