
Offline version of AWR #5

Open
FineArtz opened this issue May 22, 2021 · 1 comment

Comments

@FineArtz
Hi, I am trying to modify AWR into an offline (fully off-policy) version. The paper states that one can simply treat the dataset as the replay buffer and that no further modifications are needed. However, I notice that if I remove the sampling in rl_agent.train, line 105 in rl_agent.py:

train_return, train_path_count, new_sample_count = self._rollout_train(self._samples_per_iter)

then new_sample_count stays 0, so the number of update steps is also 0.

Could you point out the proper way to modify the code to obtain offline AWR?

@xbpeng (Owner) commented May 25, 2021

You can just change the code so that the number of update steps does not depend on new_sample_count, e.g. replace the current computation

critic_steps = int(np.ceil(self._critic_steps * new_sample_count / self._samples_per_iter))

with a constant.
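
A minimal sketch of that change, assuming _critic_steps and _actor_steps are the per-iteration hyperparameters already defined in rl_agent.py and that the actor step count is computed in the same way:

# Offline variant: use the configured step counts directly instead of
# scaling them by new_sample_count, which stays 0 when no new rollouts
# are collected.
critic_steps = self._critic_steps
actor_steps = self._actor_steps

With this change, each training iteration performs a fixed number of critic and actor updates on the fixed dataset, regardless of how many new samples were collected.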
