[RLlib] Add simple curriculum learning API and example script. #15740

Merged
sven1977 merged 13 commits into ray-project:master from curriculum_learning_api on May 16, 2021

Conversation

@sven1977 (Contributor) commented May 11, 2021

This PR:

  • Adds a simple curriculum learning API: an optional, configurable function is executed at the end of each training iteration
    and determines whether the env should be set to a new task.
  • Formalizes the existing task get/set API used in MAML, generalizing it for curriculum learning (and, e.g., MAML-style) task setting.
  • Adds an example script and a test case.
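
The flow described in the list above can be sketched in plain Python, without importing RLlib. This is only an illustration under stated assumptions: `ToyTaskEnv`, `curriculum_fn`, `after_train_iteration`, the reward thresholds, and the two-argument function signature are all hypothetical and are not taken from this PR's code. Only the `env_task_fn` config key and the idea of a task get/set interface come from the description.

```python
from typing import Any, Dict


class ToyTaskEnv:
    """Hypothetical stand-in for an env that exposes a task get/set interface."""

    def __init__(self, task: int = 1) -> None:
        self._task = task

    def get_task(self) -> int:
        return self._task

    def set_task(self, task: int) -> None:
        self._task = task


def curriculum_fn(train_results: Dict[str, Any], env: ToyTaskEnv,
                  max_task: int = 5) -> int:
    """Return the task the env should use next.

    Assumed rule (illustration only): move up one level once the mean
    episode reward reaches 100 * current task, capped at `max_task`.
    """
    current = env.get_task()
    if train_results["episode_reward_mean"] >= 100.0 * current:
        return min(current + 1, max_task)
    return current


def after_train_iteration(config: Dict[str, Any], env: ToyTaskEnv,
                          train_results: Dict[str, Any]) -> None:
    """Mimic the end-of-iteration step: call the optional function, set the task."""
    task_fn = config.get("env_task_fn")
    if task_fn is not None:
        env.set_task(task_fn(train_results, env))


env = ToyTaskEnv()
config = {"env_task_fn": curriculum_fn}
for reward in (50.0, 120.0, 150.0, 250.0):
    after_train_iteration(config, env, {"episode_reward_mean": reward})
    print(reward, "->", env.get_task())
```

Leaving `env_task_fn` as `None` makes `after_train_iteration` a no-op, which matches the "optional" wording in the description.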

Why are these changes needed?

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@sven1977 added the tests-ok label ("The tagger certifies test failures are unrelated and assumes personal liability.") on May 13, 2021

# Check `env_task_fn` for possible update of the env's task.
if self.config["env_task_fn"] is not None:
    assert callable(self.config["env_task_fn"])
Contributor

Assertions in code can be a bit unwieldy, especially when users see them.

I would consider writing a special check function that also prints a user-friendly message.
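
A minimal sketch of what such a check function could look like. The name `validate_env_task_fn` and the message wording are hypothetical; this is not the helper that was actually added in response to this comment.

```python
from typing import Any, Dict


def validate_env_task_fn(config: Dict[str, Any]) -> None:
    """Raise a readable error if `env_task_fn` is set but not callable."""
    task_fn = config.get("env_task_fn")
    if task_fn is not None and not callable(task_fn):
        raise ValueError(
            "`env_task_fn` must be None or a callable, but got "
            f"{task_fn!r} of type {type(task_fn).__name__}."
        )


validate_env_task_fn({"env_task_fn": None})          # OK: feature disabled
validate_env_task_fn({"env_task_fn": lambda *a: 0})  # OK: callable

try:
    validate_env_task_fn({"env_task_fn": "not a function"})
except ValueError as err:
    print(err)
```

Unlike a bare `assert`, the explicit `ValueError` still fires when Python runs with `-O` and tells the user which config key to fix.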

Contributor Author

done

Comment on lines 32 to 34
parser.add_argument("--stop-iters", type=int, default=50)
parser.add_argument("--stop-timesteps", type=int, default=200000)
parser.add_argument("--stop-reward", type=float, default=10000.0)
Contributor

Add help strings?
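
For illustration, the three arguments could carry help strings like this. The help texts and the parser description are assumed wording, not the text that ended up in the example script.

```python
import argparse

parser = argparse.ArgumentParser(
    description="Curriculum learning example (sketch).")
parser.add_argument(
    "--stop-iters", type=int, default=50,
    help="Number of training iterations after which to stop.")
parser.add_argument(
    "--stop-timesteps", type=int, default=200000,
    help="Number of environment timesteps after which to stop.")
parser.add_argument(
    "--stop-reward", type=float, default=10000.0,
    help="Mean episode reward at which to stop.")

# Parse an explicit list so the sketch does not depend on sys.argv.
args = parser.parse_args(["--stop-iters", "10"])
print(args.stop_iters, args.stop_timesteps, args.stop_reward)
```

With `help=` set, running the script with `--help` lists each stopping criterion next to its flag.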

Contributor Author

Done. I'll do this for all the other example scripts in a separate PR.


if args.as_test:
    check_learning_achieved(results, args.stop_reward)
ray.shutdown()
Contributor

You don't need this, right?

  1. In the local case, Ray will shut down when the process exits.
  2. In the remote case, Ray will disconnect when the process exits.

Contributor Author

I remember a time when not calling shutdown at the end of tests would lead to re-init errors when we ran these tests, e.g. in the CI.

@richardliaw (Contributor) left a comment

nice!

@richardliaw (Contributor) left a comment

consider highlighting this in a FAQ

@sven1977 merged commit d89fb82 into ray-project:master on May 16, 2021
@sven1977 deleted the curriculum_learning_api branch on June 2, 2023 20:14

3 participants