fix: reward profiling not require usage by cmunley1 · Pull Request #824 · NVIDIA-NeMo/Gym

cmunley1 · 2026-03-05T01:17:39Z

(Gym) phtran@eos0065:~/lustre_phtran/Gym$ ng_collect_rollouts \
    +agent_name=verifiers_agent \
    +input_jsonl_fpath=responses_api_agents/verifiers_agent/data/acereason-math-example.jsonl \
    +output_jsonl_fpath=responses_api_agents/verifiers_agent/data/acereason-math-example-rollouts.jsonl \
    +limit=5
Limiting the number of rows to 5
Using `verifiers_agent` for rows that do not already have an agent ref
Repeating rows 1 times (in a pattern of abc to aabbcc)!
Reading rows: 4it [00:00, 12291.00it/s]
Clearing output fpath since `resume_from_cache=False`!
Collecting rollouts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:50<00:00, 10.01s/it]
Traceback (most recent call last):
  File "/lustre/fsw/coreai_dlalgo_genai/phtran/Gym/.venv/bin/ng_collect_rollouts", line 10, in <module>
    sys.exit(collect_rollouts())
             ^^^^^^^^^^^^^^^^^^
  File "/lustre/fsw/coreai_dlalgo_genai/phtran/Gym/nemo_gym/rollout_collection.py", line 357, in collect_rollouts
    asyncio.run(rch.run_from_config(config))
  File "/home/phtran/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/home/phtran/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/phtran/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/lustre/fsw/coreai_dlalgo_genai/phtran/Gym/nemo_gym/rollout_collection.py", line 279, in run_from_config
    group_level_metrics, agent_level_metrics = rp.profile_from_data(rows, results)
                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lustre/fsw/coreai_dlalgo_genai/phtran/Gym/nemo_gym/reward_profile.py", line 91, in profile_from_data
    result = result | result["response"].get("usage", dict())
             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for |: 'dict' and 'NoneType'

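The verifiers agent crashes after a recent change to `ng_collect_rollouts`, which requires a `usage` field in agent responses. As the traceback shows, the `.get("usage", dict())` fallback still yields `None` for this agent, so the dict merge raises a `TypeError`. This PR makes the fallback handle that case without erroring.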

Signed-off-by: cmunley1 <cmunley@nvidia.com>
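
For readers unfamiliar with this failure mode, here is a minimal sketch of why the fallback can still produce `None` and one way to guard against it. The actual diff is not shown in this conversation, so `merge_usage` and the sample rows below are hypothetical illustrations, not the code merged in this PR. The key point is that a `dict.get` default applies only when the key is absent, not when it is present with a `None` value.

```python
from typing import Any


def merge_usage(result: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: merge token-usage fields into a result row.

    `dict.get("usage", {})` only falls back when the key is absent; a key
    that is present with value None still returns None. `or {}` covers both.
    """
    usage = result["response"].get("usage") or {}
    return result | usage


# Illustrative rows (made-up values), one per case.
with_usage = {"reward": 1.0, "response": {"usage": {"total_tokens": 42}}}
null_usage = {"reward": 0.0, "response": {"usage": None}}
no_usage = {"reward": 0.5, "response": {}}

for row in (with_usage, null_usage, no_usage):
    merged = merge_usage(row)
    print(sorted(merged))
```

With this guard, rows from agents that report `usage` as null or omit it pass through unchanged, while rows that do report usage get those fields merged in as before.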

copy-pr-bot · 2026-03-05T01:17:43Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

fail gracefully

e7c1853

Signed-off-by: cmunley1 <cmunley@nvidia.com>

cmunley1 changed the title ~~fail gracefully~~ fix: reward profiling not require usage Mar 5, 2026

bxyu-nvidia approved these changes Mar 5, 2026

bxyu-nvidia merged commit 545fb5f into main Mar 5, 2026
5 checks passed

bxyu-nvidia deleted the cmunley1/reward-profile-fail-gracefully branch March 5, 2026 03:41
