Is there an easy way to train PPO offline using the current API, i.e. pre-collecting state, action, reward tuples and training on those (as opposed to using a memory agent with observations)?
The Runner documentation on https://tensorforce.readthedocs.io/en/latest/runner.html says you should be able to call something like
agent.observe(state, action, reward, terminal_state)
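For concreteness, here is a minimal sketch of the kind of offline loop I have in mind. The agent config and the transition data are just placeholders, and the observe signature is taken from the Runner docs quote above; I am not sure this is actually how the current API expects pre-collected actions to be fed in, which is exactly my question.

```python
from tensorforce import Agent

# Placeholder PPO agent; the state/action specs would match the collected data.
agent = Agent.create(
    agent='ppo',
    states=dict(type='float', shape=(4,)),
    actions=dict(type='int', num_values=2),
    max_episode_timesteps=200,
    batch_size=10,
)

# Pre-collected (state, action, reward, terminal) tuples, illustrative values only.
transitions = [
    ([0.1, 0.0, -0.2, 0.3], 1, 0.5, False),
    ([0.2, 0.1, -0.1, 0.2], 0, 1.0, True),
]

for state, action, reward, terminal in transitions:
    # Signature as quoted from the Runner documentation; possibly not the
    # intended way to inject externally collected experience.
    agent.observe(state, action, reward, terminal)
```

If this is not supported, is there another recommended way to update a PPO agent from experience that was not generated by agent.act()?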