@pderichai
Contributor

Summary

Curriculum thresholding based on rewards was shown to be broken in #895. This PR redefines min_lesson_length as the minimum number of episodes that must be completed in a lesson. Once that many episodes have completed, the curriculum becomes eligible for increment.

Changes

  • The PPOTrainer now holds a reward_buffer. This buffer is a fixed-size queue that stores the cumulative rewards of the most recent episodes completed by the trainer.
  • trainer_controller.py uses the average cumulative reward in the reward_buffer to determine whether the reward threshold has been met. It sets the capacity of the reward_buffer when constructing the PPOTrainer. The capacity must be at least as large as the min_lesson_length in the curriculum; in this implementation it is set to exactly that value.
  • The lesson_length field has been removed from Curriculum since it was a vague metric. Ensuring that the minimum number of episodes has completed is now the responsibility of trainer_controller.py.
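
The buffer behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the RewardBuffer class and the should_increment_lesson helper are hypothetical names, and the sketch assumes that a full buffer is what signals that min_lesson_length episodes have completed.

```python
from collections import deque


class RewardBuffer:
    """Hypothetical fixed-size queue of cumulative episode rewards."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        # deque(maxlen=...) silently evicts the oldest entry when full.
        self._rewards = deque(maxlen=capacity)

    def add(self, cumulative_reward):
        """Record the cumulative reward of a completed episode."""
        self._rewards.append(cumulative_reward)

    def is_full(self):
        """True once `capacity` episodes have been recorded."""
        return len(self._rewards) == self._rewards.maxlen

    def mean(self):
        """Average cumulative reward over the stored episodes."""
        return sum(self._rewards) / len(self._rewards)


def should_increment_lesson(buffer, threshold):
    # Assumption: the buffer capacity equals min_lesson_length, so a full
    # buffer means the minimum number of episodes has completed.
    return buffer.is_full() and buffer.mean() > threshold


# Usage: capacity 3 stands in for min_lesson_length.
buffer = RewardBuffer(capacity=3)
for reward in (1.0, 2.0, 3.0):
    buffer.add(reward)
eligible = should_increment_lesson(buffer, threshold=1.5)
```

Checking is_full() before mean() keeps the lesson from incrementing on a lucky early episode, which is the case the minimum episode count is meant to rule out.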

@pderichai pderichai changed the base branch from develop to release-v0.5 August 31, 2018 20:41
@pderichai pderichai removed the request for review from vincentpierre August 31, 2018 20:44
- if ((progress > self.data['thresholds'][self.lesson_num]) and
-         (self.lesson_length > self.data['min_lesson_length'])):
+ if progress > self.data['thresholds'][self.lesson_num]:
+     print(progress, 'is above the threshold, successfully incrementing lesson')
Contributor

Why do we print here instead of logging?

Contributor Author

Woops, that was a debug print. Let me remove that.

@pderichai
Contributor Author

Merging, approved offline.

@pderichai
Contributor Author

Actually merging now, approved offline again.

@pderichai pderichai merged commit a4e7140 into release-v0.5 Sep 5, 2018
@pderichai pderichai deleted the develop-cl-bug-fix branch September 5, 2018 00:00
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 19, 2021

3 participants