[RLlib] Metrics do-over 01: Introducing and testing new MetricsLogger and Stats APIs. #44442

Merged

Contributor

@sven1977 sven1977 commented Apr 3, 2024

RLlib Metrics, custom user metrics, and ResultsDict do-over for new API stack:

This PR introduces a new unified API (MetricsLogger) that both RLlib's core codebase and its users can use to log (custom) metrics from within any component. Whether you are inside Algorithm (callback, overridden training_step, custom eval function, etc.), EnvRunner (callback), or Learner (e.g. a custom loss), you can now use the exact same API to log your custom metrics.

The new MetricsLogger, which is held by all the above components under the self.metrics property, exposes the following simple API to log stats values:

# Log an individual value. By default, all logged values under the
# same key get mean-reduced once you call `metrics.reduce()`.
metrics.log_value(key="my_loss", value=-0.002)
# By default, mean-reduction happens via EMA (with coeff=0.01). You can change the EMA coeff like this:
metrics.log_value(key="my_more_recent_loss", value=-0.002, ema_coeff=0.5)
# Or, if you would like to use a sliding window instead of EMA, you can also do:
metrics.log_value(key="my_win50_loss", value=-0.002, window=50)

# Use the same API, but for a lifetime counter. Note that here, we reduce with the "sum" method.
metrics.log_value("my_counter", 100, reduce="sum")
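
To make the three reduction modes above concrete, here is a minimal, self-contained sketch in plain Python. It does not use RLlib; the helper names (`ema_reduce`, `window_reduce`, `sum_reduce`) are hypothetical, and seeding the EMA with the first logged value is an assumption, not something this PR description specifies.

```python
def ema_reduce(values, ema_coeff=0.01):
    """Exponential moving average over `values` (assumed: seeded with values[0])."""
    ema = values[0]
    for v in values[1:]:
        # A larger coefficient weights recent values more heavily.
        ema = (1.0 - ema_coeff) * ema + ema_coeff * v
    return ema


def window_reduce(values, window):
    """Plain mean over the most recent `window` values."""
    recent = values[-window:]
    return sum(recent) / len(recent)


def sum_reduce(values):
    """Lifetime counter: add up everything logged so far."""
    return sum(values)


ema_result = ema_reduce([0.0, 1.0, 1.0], ema_coeff=0.5)
window_result = window_reduce([1.0, 2.0, 3.0], window=2)
counter_result = sum_reduce([100, 50])
```

With `ema_coeff=0.5` the average moves quickly toward new values, while the default of 0.01 makes each new value count for only 1%.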

There are two situations in which all values logged so far are "reduced" or "merged":

  1. At the end of a component's cycle. For example, if you call EnvRunner.sample(), the EnvRunner calls the reduce() method on its MetricsLogger object at the end of this call and returns the results. Note that this does not necessarily mean that all historic data is reduced at this time. For example, if a stat under the "abc" key has window=1000 and the EnvRunner logged only 50 new values during the sample() call, the 950 previously logged values still remain in the cache under that key.

  2. After n parallel components (e.g. n EnvRunners) have returned their reduced results, the controlling component (e.g. Algorithm object controlling the n remote EnvRunners) will have to merge the n received result dicts.
    This can be achieved with the MetricsLogger of the controlling component:

# inside Algorithm

# Collect n result dicts from n EnvRunners (each dict already reduced by each EnvRunner's own MetricsLogger).
n_result_dicts = self.workers.foreach_worker(lambda env_runner: env_runner.sample())

# Log (and thereby merge) each of the n result dicts using our own MetricsLogger:
self.metrics.log_n_dicts(n_result_dicts, key="env_runner_results")

# Now, the Algorithm's own MetricsLogger object contains all the data from all EnvRunners, in a reduced fashion.

# Let Algorithm return its own (reduced) results, containing all the (reduced/merged) EnvRunner results.
return self.metrics.reduce()
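
The two situations above can be sketched with a toy stand-in. This is not RLlib's MetricsLogger: `ToyWindowLogger` and `merge_n_dicts` are hypothetical names, and merging by averaging each key across the n dicts is an assumed rule chosen for illustration, since the actual merge behavior of log_n_dicts is not spelled out here.

```python
from collections import deque


class ToyWindowLogger:
    """Hypothetical, simplified logger that keeps a sliding window per key."""

    def __init__(self, window):
        self._window = window
        self._values = {}

    def log_value(self, key, value):
        # deque(maxlen=...) silently drops the oldest value once the window is full.
        self._values.setdefault(key, deque(maxlen=self._window)).append(value)

    def reduce(self):
        # Report the window mean; values stay cached for the next cycle.
        return {k: sum(v) / len(v) for k, v in self._values.items()}


def merge_n_dicts(result_dicts):
    """Merge already-reduced dicts by averaging each key (assumed merge rule)."""
    grouped = {}
    for result in result_dicts:
        for key, value in result.items():
            grouped.setdefault(key, []).append(value)
    return {k: sum(vs) / len(vs) for k, vs in grouped.items()}


# Situation 1: reduce() at the end of a cycle keeps the window contents.
runner = ToyWindowLogger(window=3)
for v in (1.0, 2.0, 3.0):
    runner.log_value("abc", v)
first = runner.reduce()
runner.log_value("abc", 6.0)  # window now holds 2.0, 3.0, 6.0
second = runner.reduce()

# Situation 2: the controlling component merges n reduced dicts.
merged = merge_n_dicts([{"abc": 1.0}, {"abc": 3.0}])
```

The second reduce() still sees two of the values logged before the first reduce(), which mirrors the window=1000 example in item 1.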

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: sven1977 <svenmika1977@gmail.com>
…nup_examples_folder_03

# Conflicts:
#	rllib/examples/multi_agent/multi_agent_pendulum.py

and wip

Collaborator

@simonsays1980 simonsays1980 left a comment

LGTM. Great and much appreciated PR. Some nits here and there and a couple of questions. What is missing, imo, is a description of the basic structure of the ResultDict for EnvRunner, Learner, etc., and of the overall ResultDict containing all of them. This matters when organizing custom metrics, specifically if a user wants to log them to a specific chart in TensorBoard/WandB.
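
As an illustration of why the nesting matters for charting, here is a small sketch that flattens a nested results dict into slash-separated tags, the form that tools such as TensorBoard typically use to group charts. The dict layout is made up for this example: only "env_runner_results" appears in the PR description, while "learner_results" and the helper `flatten_results` are hypothetical.

```python
def flatten_results(results, prefix="", sep="/"):
    """Turn a nested dict into {"outer/inner": value} pairs."""
    flat = {}
    for key, value in results.items():
        tag = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-dicts, carrying the path so far as the prefix.
            flat.update(flatten_results(value, tag, sep))
        else:
            flat[tag] = value
    return flat


# Hypothetical layout, for illustration only.
example_results = {
    "env_runner_results": {"my_counter": 100},
    "learner_results": {"my_loss": -0.002},
}
flat_tags = flatten_results(example_results)
```

Under this scheme, the key a metric is logged under directly determines which chart group it ends up in.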

else:
    for batch_or_episode in sampled_data:
        if max_agent_steps:
            agent_or_env_steps += (
Collaborator

Do we need this if we return metrics? And when don't we want to return metrics?

Contributor Author

Yeah, unfortunately we do in this case, because we have to determine right here (before even logging the actual steps to the Algorithm's metrics) when to stop the while loop.

We might rearrange this entire utility, but for now, it works just fine.
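
A rough sketch of the constraint described above: the sampling loop needs a local step counter for its own stop condition, independent of whatever is later logged to metrics. The function `sample_until` and its arguments are hypothetical and only loosely modeled on the snippet under review.

```python
def sample_until(sample_fn, max_steps):
    """Call `sample_fn` repeatedly until at least `max_steps` steps are collected."""
    collected = []
    steps = 0
    while steps < max_steps:
        batch = sample_fn()
        # Count locally: the loop condition needs this number before any
        # metrics have been logged or reduced.
        steps += len(batch)
        collected.append(batch)
    return collected, steps


# Each fake "batch" holds 4 steps, so 3 calls are needed to reach 10.
batches, total_steps = sample_until(lambda: [0] * 4, max_steps=10)
```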

@sven1977 sven1977 removed the do-not-merge Do not merge this PR! label Apr 18, 2024
@sven1977 sven1977 changed the title [RLlib] Complete metrics, custom metrics and ResultsDict do-over. [RLlib] Metrics do-over 01: Introducing and testing new MetricsLogger and Stats APIs. Apr 19, 2024
…lete_metrics_and_stats_do_over

# Conflicts:
#	rllib/utils/test_utils.py
@sven1977 sven1977 merged commit 054aad6 into ray-project:master Apr 19, 2024
5 checks passed
@sven1977 sven1977 deleted the complete_metrics_and_stats_do_over branch April 20, 2024 07:05
Labels
rllib RLlib related issues rllib-newstack