allenwang28
Contributor

Not sure what issue @felipemello1 had earlier but I was able to see:

WandbBackend: Logged 92 metrics at step 2
=== [global_logger_5Dxs_r0] - METRICS STEP 2 ===
  buffer/add/count_episodes_added: 16.0
  ...
  policy_worker_perf/update/total_duration_avg_s: 6.04653369140625
  policy_worker_perf/update/total_duration_max_s: 6.04653369140625
  reference_perf/forward/avg_sequence_length: 1024.0
  reference_perf/forward/count_forward_passes: 2.0
==============================

@meta-cla meta-cla bot added the CLA Signed label Oct 2, 2025
Contributor

@felipemello1 felipemello1 left a comment

good stuff!

I think I added this later, thinking of workers that may crash/respawn after we call .init_backends, and it probably fixed this issue too. Glad that you tried!

https://github.com/meta-pytorch/forge/blob/1f27b54984bd2c79c82d2c07b9540e057aa81c23/src/forge/observability/metric_actors.py#L243
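
For readers following along, here is a rough sketch of the general idea described in this comment: a worker that starts (or respawns) after the backends were initialized registers itself on first use and picks up the stored configuration. This is not the code at the link above. The class names `GlobalLogger` and `WorkerLogger`, the `register`/`configure`/`log` methods, and the config dict are all hypothetical and chosen only for illustration; only the name `init_backends` comes from the thread, and its real signature may differ.

```python
class GlobalLogger:
    """Hypothetical controller that remembers the backend configuration."""

    def __init__(self):
        self.config = None      # set once init_backends is called
        self.registered = {}    # worker name -> WorkerLogger

    def init_backends(self, config):
        # Store the config so workers that appear later can still receive it.
        self.config = dict(config)
        for worker in self.registered.values():
            worker.configure(self.config)

    def register(self, name, worker):
        self.registered[name] = worker
        # Late registration: hand over the config if backends already exist.
        if self.config is not None:
            worker.configure(self.config)


class WorkerLogger:
    """Hypothetical per-worker logger that registers itself lazily."""

    def __init__(self, name, global_logger):
        self.name = name
        self.global_logger = global_logger
        self.config = None
        self.logged = []

    def configure(self, config):
        self.config = config

    def log(self, key, value):
        # On first use (or after a respawn) try to register and fetch config.
        if self.config is None:
            self.global_logger.register(self.name, self)
        if self.config is None:
            return False  # backends not initialized yet, metric is dropped
        self.logged.append((key, value))
        return True
```

With this shape, a worker created after `init_backends` still ends up configured the first time it logs, which is the crash/respawn case mentioned above.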

@allenwang28 allenwang28 merged commit 97a33e4 into meta-pytorch:main Oct 2, 2025
5 checks passed
@allenwang28 allenwang28 deleted the policy_measure branch October 2, 2025 00:29
@felipemello1 felipemello1 mentioned this pull request Oct 5, 2025
14 tasks