Random sampling in tools::sample_episodes #42

Open
dirkmcpherson opened this issue Oct 31, 2023 · 1 comment

@dirkmcpherson

Hi,

I was going over your dataset code and noticed that you're sampling from the episode buffer randomly. That's generally the right call, since subsequent episodes are strongly correlated, but your technique picks a random episode at each step (sampling with replacement) rather than guaranteeing that every episode is seen once before any episode is seen twice.

It's probably not a big deal, since the sampling is uniform on average, but I was wondering whether there was a reason for this implementation choice?
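To illustrate the distinction I mean, here is a minimal sketch (hypothetical function names, not the repo's code) contrasting with-replacement sampling against epoch-style shuffled sampling, where every episode is seen once per pass before any repeats:

```python
import random

def sample_with_replacement(episodes, n, rng=random):
    # Each draw is independent, so an episode can repeat
    # before all others have been seen.
    return [rng.choice(episodes) for _ in range(n)]

def sample_epochs(episodes, n, rng=random):
    # Shuffle the buffer, walk it end to end, then reshuffle:
    # every episode appears exactly once per epoch.
    out = []
    while len(out) < n:
        epoch = list(episodes)
        rng.shuffle(epoch)
        out.extend(epoch)
    return out[:n]
```

Both are uniform in expectation; the epoch version just bounds how long any single episode can go unsampled.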

Thanks again for writing this repo.

@NM512
Owner

NM512 commented Nov 4, 2023

Hi,

Thank you for your question.

In the context of off-policy reinforcement learning, it's common practice to sample steps stochastically from the replay buffer. As a reference, the implementation in the original DreamerV3 uses a similar approach: it randomly selects chunks of 1024 successive steps and then samples sequences from those chunks.
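Roughly, that chunk-then-sequence idea can be sketched like this (illustrative names and sizes only, not the actual DreamerV3 code):

```python
import random

def sample_sequence(steps, chunk_size=1024, seq_len=64, rng=random):
    """Pick a random chunk of successive steps, then a random
    contiguous sequence inside that chunk.

    `steps` is a flat list of transitions; chunk_size echoes the 1024
    mentioned above, and seq_len is an assumed training sequence length.
    """
    n_chunks = max(1, len(steps) // chunk_size)
    c = rng.randrange(n_chunks)
    chunk = steps[c * chunk_size:(c + 1) * chunk_size]
    start = rng.randrange(max(1, len(chunk) - seq_len + 1))
    return chunk[start:start + seq_len]
```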

In my repository, data is saved on an episode-by-episode basis within the replay buffer. This choice was made because it makes individual episode data easier to handle.
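An episode-keyed buffer of that kind might look roughly like this (a sketch under assumed names, not the repo's actual code):

```python
import random

class EpisodeReplayBuffer:
    """Stores whole episodes; sampling draws a random episode,
    then a random window of steps within it."""

    def __init__(self):
        self.episodes = []  # each episode is a list of step records

    def add_episode(self, episode):
        self.episodes.append(episode)

    def sample(self, seq_len, rng=random):
        ep = rng.choice(self.episodes)  # uniform over stored episodes
        start = rng.randrange(max(1, len(ep) - seq_len + 1))
        return ep[start:start + seq_len]
```

Keeping episode boundaries intact also means a sampled sequence never straddles two episodes, which a flat step buffer has to handle explicitly.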

I hope this clarifies the implementation choice. If you have any further questions, please feel free to share them.
