Describe the solution you'd like
When using fit_online, it would be very useful to save all the experiences into an MDPDataset so that we can later use them for offline RL to improve the policy.
Does it make sense to reuse the Buffers already implemented for online learning? Or should we think about another mechanism, such as the OpenAI Gym Monitor wrapper, to do this?
Perhaps not all online learning algorithms use a Buffer; maybe a new parameter to fit_online could save every transition into a history, and at every save interval also save the corresponding MDPDataset?
@jamartinh Hello, sorry for the late response... I've implemented a to_mdp_dataset method on ReplayBuffer. Do you think this is good enough for your requirement? 5936131
# convert online buffer to static dataset by tracing Transition objects in the buffer.
dataset = replay_buffer.to_mdp_dataset()
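To illustrate what a to_mdp_dataset-style conversion does under the hood, here is a minimal, self-contained sketch. The SimpleReplayBuffer and Transition classes are hypothetical stand-ins (not d3rlpy's actual implementation); the point is just the core step of slicing a flat transition log into episodes by terminal flags, which is what a static dataset needs.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    # Minimal transition record; a real buffer would also store
    # next observations and other metadata.
    observation: float
    action: int
    reward: float
    terminal: bool


@dataclass
class SimpleReplayBuffer:
    # Hypothetical stand-in for an online replay buffer.
    transitions: list = field(default_factory=list)

    def append(self, observation, action, reward, terminal):
        self.transitions.append(
            Transition(observation, action, reward, terminal)
        )

    def to_episodes(self):
        # Group the flat transition log into episodes using terminal
        # flags -- the core of converting a buffer to a static dataset.
        episodes, current = [], []
        for t in self.transitions:
            current.append(t)
            if t.terminal:
                episodes.append(current)
                current = []
        if current:
            # Keep a trailing partial episode, if any.
            episodes.append(current)
        return episodes
```

The resulting list of episodes maps naturally onto an episodic dataset structure like MDPDataset, which expects transitions grouped per episode rather than one flat stream.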