ExperienceSourceFirstLast #17

Closed
raymondchua opened this issue Dec 10, 2018 · 1 comment

Comments

@raymondchua

Can someone explain the main difference between ExperienceSourceFirstLast and ExperienceSource? Are we still storing every incoming state?

@Shmuma
Owner

Shmuma commented May 6, 2019

Sorry for the delay :).

The main difference is that ExperienceSource produces all traces of the given length, whereas ExperienceSourceFirstLast returns only the first and last states of each trace, together with the discounted reward accumulated between them. An example illustrates this.

Suppose we have a single episode with states 0 -> 1 -> 2 -> 3 -> 4, and the episode terminates at the last state.

With ExperienceSource(steps_count=3), iteration produces the following data:

  • [Experience(state=0), Experience(state=1), Experience(state=2)]
  • [Experience(state=1), Experience(state=2), Experience(state=3)]
  • [Experience(state=2), Experience(state=3), Experience(state=4)]
  • [Experience(state=3), Experience(state=4)]
  • [Experience(state=4)]
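The windowing above can be sketched in plain Python. This is only a toy stand-in that works on a list of state labels: `sliding_traces` is a hypothetical helper, not part of the library, and the real ExperienceSource also tracks actions, rewards and done flags.

```python
from collections import deque


def sliding_traces(states, steps_count):
    """Yield windows of up to `steps_count` consecutive states from one episode.

    Hypothetical illustration only; it mimics the listing above, not the
    library's actual implementation.
    """
    window = deque(maxlen=steps_count)
    for state in states:
        window.append(state)
        if len(window) == steps_count:
            yield list(window)
    # Episode shorter than the window: emit what we have once.
    if 0 < len(window) < steps_count:
        yield list(window)
    # Episode is over: emit the progressively shorter tails.
    while len(window) > 1:
        window.popleft()
        yield list(window)


traces = list(sliding_traces([0, 1, 2, 3, 4], steps_count=3))
for trace in traces:
    print(trace)
```

Each full window is followed, at the end of the episode, by shorter tails, which is why the last two entries in the list above contain fewer than three items.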

But ExperienceSourceFirstLast(steps_count=3) will return the following:

  • ExperienceFirstLast(state=0, last_state=2)
  • ExperienceFirstLast(state=1, last_state=3)
  • ExperienceFirstLast(state=2, last_state=None)
  • ExperienceFirstLast(state=3, last_state=None)

The reward returned by ExperienceSourceFirstLast is aggregated using the gamma passed to the constructor.
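As a rough sketch of what "aggregated using gamma" means, the per-step rewards of one trace are folded into a single discounted sum. The helper name `discounted_reward` and the reward values below are made up for illustration and are not taken from the library.

```python
def discounted_reward(rewards, gamma):
    """Return r0 + gamma*r1 + gamma^2*r2 + ... for one trace of rewards."""
    total = 0.0
    # Fold from the last step backwards so each earlier step
    # discounts everything that comes after it.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical rewards for a three-step trace.
rewards = [1.0, 1.0, 1.0]
total = discounted_reward(rewards, gamma=0.5)
print(total)  # 1.0 + 0.5 + 0.25 = 1.75
```

With gamma=1.0 this reduces to a plain sum of the rewards in the trace.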

Most of the time, ExperienceSourceFirstLast is more convenient, as we don't normally need the intermediate states. But sometimes we need more control, and then ExperienceSource can be handy.
In terms of implementation, ExperienceSourceFirstLast is a wrapper around ExperienceSource.

@Shmuma Shmuma closed this as completed May 6, 2019