@ZhiyuLi-Nvidia ZhiyuLi-Nvidia commented Jul 29, 2025

What does this PR do?

Add the following metrics, as shown in the attached W&B run:

  • train/mean_prompt_length: mean prompt length, in tokens
  • train/total_num_tokens: total number of tokens (prompt plus generation tokens)
  • train/throughput_per_gpu: number of tokens per second per GPU (prompt plus generation tokens)
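
To make the three definitions concrete, here is a minimal Python sketch. The helper name `compute_token_metrics`, its arguments, and the sample numbers are hypothetical and are not taken from this PR's code; the sketch only assumes that per-sample prompt and generation token counts, the step's wall-clock time, and the GPU count are available.

```python
from typing import Sequence


def compute_token_metrics(
    prompt_lengths: Sequence[int],
    generation_lengths: Sequence[int],
    step_time_s: float,
    num_gpus: int,
) -> dict[str, float]:
    """Hypothetical helper: derive the three metrics for one training step."""
    if not prompt_lengths or len(prompt_lengths) != len(generation_lengths):
        raise ValueError("need one prompt length and one generation length per sample")
    if step_time_s <= 0 or num_gpus <= 0:
        raise ValueError("step_time_s and num_gpus must be positive")

    total_prompt_tokens = sum(prompt_lengths)
    # Total counts both prompt and generation tokens, as in the metric description.
    total_tokens = total_prompt_tokens + sum(generation_lengths)

    return {
        "train/mean_prompt_length": total_prompt_tokens / len(prompt_lengths),
        "train/total_num_tokens": float(total_tokens),
        # Tokens per second, divided evenly across all GPUs.
        "train/throughput_per_gpu": total_tokens / step_time_s / num_gpus,
    }


# Made-up inputs: three samples, a 2-second step, 4 GPUs.
metrics = compute_token_metrics([10, 20, 30], [100, 200, 300], step_time_s=2.0, num_gpus=4)
print(metrics)
```

With these made-up inputs the mean prompt length is 20 tokens, the total is 660 tokens, and the per-GPU throughput is 82.5 tokens per second.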

https://wandb.ai/nvidia/grpo-dev/runs/33pb40kk?nw=nwuserzhiyul

As a quick verification:

  • train/total_num_tokens = num_prompts_per_step × num_generations_per_prompt × (train/mean_prompt_length + train/mean_total_tokens_per_sample)
    • actual: 2869598 vs. calculated: 2869596.16 (small numerical difference)
  • train/throughput_per_gpu = train/total_num_tokens / step time / total_num_gpus
    • actual: 1199 vs. calculated: 1199
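
The two identities above can be checked with a short Python sketch. The function names and the small example inputs are hypothetical (the run's actual batch size, step time, and GPU count are not stated here); only the pair 2869598 and 2869596.16 comes from the verification above.

```python
import math


def expected_total_num_tokens(
    num_prompts_per_step: int,
    num_generations_per_prompt: int,
    mean_prompt_length: float,
    mean_total_tokens_per_sample: float,
) -> float:
    """Hypothetical helper: reconstruct the step's token total from logged means."""
    num_samples = num_prompts_per_step * num_generations_per_prompt
    return num_samples * (mean_prompt_length + mean_total_tokens_per_sample)


def expected_throughput_per_gpu(
    total_num_tokens: float, step_time_s: float, total_num_gpus: int
) -> float:
    """Hypothetical helper: tokens per second per GPU for one step."""
    return total_num_tokens / step_time_s / total_num_gpus


# Made-up inputs, chosen only to show the arithmetic.
total = expected_total_num_tokens(8, 4, 100.5, 400.25)
throughput = expected_throughput_per_gpu(total, step_time_s=10.0, total_num_gpus=8)
print(total, throughput)

# The reported pair agrees to within a relative tolerance of 1e-6, which is
# consistent with a total reconstructed from averaged (floating-point) means.
print(math.isclose(2869598, 2869596.16, rel_tol=1e-6))
```

Comparing with a relative tolerance rather than exact equality is a design choice for this kind of check: a total rebuilt from logged means will rarely match the exact integer token count.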

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Signed-off-by: Zhiyu Li <zhiyul@nvidia.com>
@ZhiyuLi-Nvidia ZhiyuLi-Nvidia requested review from SahilJain314, parthchadha and terrykong and removed request for parthchadha July 29, 2025 08:07
Signed-off-by: Zhiyu Li <zhiyul@nvidia.com>
@snowmanwwg snowmanwwg linked an issue Jul 29, 2025 that may be closed by this pull request
@ZhiyuLi-Nvidia
Contributor Author

@terrykong @SahilJain314 could you take a look?

Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>
@ZhiyuLi-Nvidia ZhiyuLi-Nvidia force-pushed the zhiyul/add_throughput_metric branch from 345388b to 707a7d2 Compare July 31, 2025 22:00
@terrykong terrykong added this pull request to the merge queue Aug 1, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 1, 2025
@terrykong terrykong added this pull request to the merge queue Aug 1, 2025
Merged via the queue into main with commit 60633f4 Aug 2, 2025
15 checks passed
@terrykong terrykong deleted the zhiyul/add_throughput_metric branch August 2, 2025 01:36
tpoisonooo pushed a commit to tpoisonooo/RL that referenced this pull request Aug 4, 2025
…eMo#781)

Signed-off-by: Zhiyu Li <zhiyul@nvidia.com>
Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>
Signed-off-by: tpoisonooo <khj.application@aliyun.com>
soodoshll pushed a commit to soodoshll/RL that referenced this pull request Aug 13, 2025
…eMo#781)

Signed-off-by: Zhiyu Li <zhiyul@nvidia.com>
Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>
Signed-off-by: Qidong Su <qidongs@nvidia.com>
PrinsYin pushed a commit to PrinsYin/RL that referenced this pull request Nov 30, 2025
…eMo#781)

Signed-off-by: Zhiyu Li <zhiyul@nvidia.com>
Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>
Development

Successfully merging this pull request may close these issues.

No built-in tokens/second metric