docs: version bump, CTA link changes#722

Closed
lbliii wants to merge 276 commits into main from
llane/docs-remove-link

Conversation

@lbliii
Contributor

@lbliii lbliii commented Feb 17, 2026

No description provided.

kbhardwaj-nvidia and others added 30 commits September 8, 2025 20:13
…nfo (#27)

Signed-off-by: Brian Yu <bxyu@nvidia.com>
Updated the log message printed when running `ng_prepare_data` from,
for example:

"Found 0 agent server instance configs withOUT datasets:"

to 

"Found 0 agent server instance configs WITHOUT datasets:" 

to match the format of the subsequent logs, for example: 
"Found 1 agent server instance configs WITH datasets:"

Signed-off-by: chrismun <cmunley@nvidia.com>
Update README for resources servers to reflect the updated CLI

Signed-off-by: chrismun <cmunley@nvidia.com>
…item to be present (#19)

Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Peter Jin <pjin@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Co-authored-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
This change adds a new dataset for the library judge math resources
server. The new dataset contains math problems from Stack Overflow.

This dataset corresponds to the StackOverflow Dump Data entry in the RL
verifier data tracking
[spreadsheet](https://docs.google.com/spreadsheets/d/1VK4-ZonMSR-4Ulk161Au1f-nGhs4r9V9-beScboaR3I/edit?gid=0#gid=0&range=7:7).

The prompts in the dataset have been formatted in the same way as the
OpenMathReasoning dataset that was previously added. All the Stack
Overflow problems with expected answers were included, so it may be
necessary in the future to filter the problems according to difficulty
based on the responses of a model.

Approval for the use of this dataset is covered in this ticket:
[DGPTT-96](https://jirasw.nvidia.com/browse/DGPTT-96).

Signed-off-by: Damon Mosk-Aoyama <dmoskaoyama@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Address #43 and
#45. Thank you to @xinyu-dev
for the raises!

---------

Signed-off-by: Brian Yu <bxyu@nvidia.com>
Add LLM-as-judge resources server

Introduce a resources server that uses an LLM as a judge to compare a
model’s generated answer against an expected (gold) answer.

### Details
- Only the last assistant message is graded.
- Optional regex extraction for both question and response:
  - Uses the last regex match; returns the first non-empty capture group,
    else the whole match.
- Optional second-pass verification to mitigate positional bias (swap
  expected/generated).
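
The last-match extraction rule above can be sketched in a few lines of Python. This is only an illustration, not the server's actual code: the function name `extract_last_match` is hypothetical, and returning the unmodified text when the pattern is missing or does not match is an assumption that the description does not cover.

```python
import re
from typing import Optional


def extract_last_match(text: str, pattern: Optional[str]) -> str:
    """Apply last-match semantics: use the final regex match in `text`."""
    if not pattern:
        return text  # no regex configured: grade the text as-is
    matches = list(re.finditer(pattern, text))
    if not matches:
        return text  # assumption: fall back to the full text on no match
    last = matches[-1]
    # Return the first non-empty capture group, else the whole match.
    for group in last.groups():
        if group:
            return group
    return last.group(0)
```

For example, with the pattern `\\boxed\{([^}]*)\}` a response containing `\boxed{1}` followed by `\boxed{42}` would yield `42` under this sketch.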

### Configuration:
- judge_system_message (optional): system prompt for the judge.
- judge_prompt_template (required): must include {question},
{expected_answer}, {generated_answer}.
- judge_equal_label / judge_not_equal_label: defaults to [[A=B]] /
[[A!=B]].
- check_twice_swap (bool, default false): if true, on an initial equal
verdict, perform a second judge pass with expected/prediction swapped;
reward remains 1 only if the second pass is also equal.
- reward_if_swap_fails (float, default 0.0): reward to assign if the
  second pass fails.
  - The training framework needs to decide how to discard such a
    sample based on this reward.
- question_extract_regex (optional): extract from the last user message
  (last match semantics).
- response_extract_regex (optional): extract from the last assistant
  message (last match semantics).

The question regex can be helpful if the user decided to train with
in-context examples.
The response regex can be useful if training specifically expects the
final response to be inside \boxed or \text, for instance.

### Steps
- Extracts question from the last user message; applies
question_extract_regex if provided.
- Extracts generated answer from the last assistant message; applies
response_extract_regex if provided.
- Calls the judge model with a configurable prompt and optional system
message.
- Parses the judge’s last message for judge_equal_label /
judge_not_equal_label to set reward and verdict.
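
The verdict parsing and the optional swap check described above can be sketched as follows. This is a hedged illustration rather than the server's implementation: `parse_verdict` and `judge_reward` are hypothetical names, the `judge` callable stands in for the call to the judge model, and two behaviors are assumptions: the label that appears first in the judge output wins, and an output with no equal label counts as not equal.

```python
from typing import Callable

EQUAL_LABEL = "[[A=B]]"       # default judge_equal_label from the description
NOT_EQUAL_LABEL = "[[A!=B]]"  # default judge_not_equal_label

# Hypothetical judge signature: (question, expected_answer, generated_answer) -> judge text
Judge = Callable[[str, str, str], str]


def parse_verdict(judge_text: str) -> bool:
    """Return True when the judge output reads as 'equal'."""
    eq_pos = judge_text.find(EQUAL_LABEL)
    ne_pos = judge_text.find(NOT_EQUAL_LABEL)
    if eq_pos == -1:
        return False  # assumption: a missing equal label means not equal
    return ne_pos == -1 or eq_pos < ne_pos  # assumption: first label wins


def judge_reward(
    judge: Judge,
    question: str,
    expected: str,
    generated: str,
    check_twice_swap: bool = False,
    reward_if_swap_fails: float = 0.0,
) -> float:
    """Reward 1.0 for an equal verdict, optionally confirmed by a swapped pass."""
    if not parse_verdict(judge(question, expected, generated)):
        return 0.0
    if not check_twice_swap:
        return 1.0
    # Second pass with expected/generated swapped to mitigate positional bias.
    if parse_verdict(judge(question, generated, expected)):
        return 1.0
    return reward_if_swap_fails
```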

### Running the server
```bash
config_paths="responses_api_models/openai_model/configs/openai_model.yaml, \
resources_servers/equivalence_llm_judge/configs/equivalence_llm_judge.yaml"

ng_run "+config_paths=[$config_paths]" \
  +equivalence_llm_judge.resources_servers.equivalence_llm_judge.judge_model_server.name=openai_model
```

### Collecting rollouts
```bash
ng_collect_rollouts +agent_name=equivalence_llm_judge_simple_agent \
  +input_jsonl_fpath=resources_servers/equivalence_llm_judge/data/example.jsonl \
  +output_jsonl_fpath=results/example_rollouts.jsonl \
  +limit=5
```

### Licensing
Code: Apache 2.0   
Data: CC-BY-NC-3.0 (Examples from
https://huggingface.co/datasets/allenai/sciq)

---------

Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: soares-f <soarescmsa@gmail.com>
Signed-off-by: Khushi Bhardwaj <kbhardwaj@nvidia.com>
Signed-off-by: Frankie Siino <fsiino@nvidia.com>
Co-authored-by: bxyu-nvidia <bxyu@nvidia.com>
Co-authored-by: Khushi Bhardwaj <kbhardwaj@nvidia.com>
Co-authored-by: fsiino-nvidia <fsiino@nvidia.com>
Signed-off-by: Peter Jin <pjin@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
…44)

### Switch python_math_exec to use session cookies and fix tool usage

### Changes
1. Updated `python_math_exec` server to use built-in session cookies:
   - Removed manual `session_id` handling from request/response models
   - Now using `request.session[SESSION_ID_KEY]` for state tracking
   
2. Removed `simple_agent_stateful/*`
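
As a rough sketch of the session-based state tracking in change 1, the snippet below uses a plain dict in place of `request.session`. Everything here is an assumption for illustration: the value of `SESSION_ID_KEY`, the helper name `get_session_state`, and the in-memory store are hypothetical and are not taken from the `python_math_exec` code.

```python
import uuid
from typing import Any, Dict, MutableMapping

SESSION_ID_KEY = "session_id"  # assumed value; the change only names the constant

# Hypothetical in-memory store of per-session execution state.
_session_state: Dict[str, Dict[str, Any]] = {}


def get_session_state(session: MutableMapping[str, Any]) -> Dict[str, Any]:
    """Return the state for this session, creating a session ID on first use.

    `session` stands in for `request.session`, which behaves like a dict.
    """
    if SESSION_ID_KEY not in session:
        session[SESSION_ID_KEY] = uuid.uuid4().hex
    return _session_state.setdefault(session[SESSION_ID_KEY], {})
```

Because the ID lives in the session cookie, request and response models no longer need to carry a `session_id` field.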

### To test, follow these steps from the README.md of `python_math_exec`

1. Download dataset
```
ng_download_dataset_from_gitlab \
    +dataset_name=open_math_reasoning_problems_tool \
    +version=0.0.1 \
    +artifact_fpath=open_math_reasoning_problems_tool.jsonl \
    +output_fpath=data/open_math_reasoning_problems_tool.jsonl
```


2. Start server 
```
ng_run "+config_paths=[responses_api_agents/simple_agent/configs/simple_agent.yaml,responses_api_models/openai_model/configs/openai_model.yaml,resources_servers/python_math_exec/configs/python_math_exec.yaml]"     +simple_agent.responses_api_agents.simple_agent.resources_server.name=python_math_exec
```


3. Collect trajectories
```
ng_collect_rollouts +agent_name=simple_agent +input_jsonl_fpath=data/open_math_reasoning_problems_tool.jsonl +output_jsonl_fpath=results/open_math_reasoning_problems_tool_output_new.jsonl +limit=1
```

---------

Signed-off-by: Rahul Chand <rchand@nvidia.com>
Co-authored-by: Rahul Chand <rchand@cw-dfw-cs-001-vscode-02.cm.cluster>
## Details 

This is a resources server for a text-based Sudoku game. Correctness is
checked programmatically. A reward is returned at each step (each call
of the `make_move` function), and the final reward is the sum of the
step rewards.
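
A programmatic move check of this kind could look like the sketch below. It is an illustration under assumptions, not the server's code: `is_valid_move` and `total_reward` are hypothetical names, the check is rule-based (row, column, subgrid), and the real server may instead compare a move against the stored solution.

```python
import math
from typing import List, Sequence


def is_valid_move(board: List[List[int]], row: int, col: int, number: int) -> bool:
    """Check a move against Sudoku rules; row/col are 1-indexed, 0 marks an empty cell."""
    size = len(board)
    box = math.isqrt(size)  # 2 for a 4x4 board, 3 for a 9x9 board
    r, c = row - 1, col - 1
    if not (0 <= r < size and 0 <= c < size and 1 <= number <= size):
        return False
    if board[r][c] != 0:
        return False  # never overwrite a filled cell
    if number in board[r]:
        return False  # row conflict
    if any(board[i][c] == number for i in range(size)):
        return False  # column conflict
    r0, c0 = (r // box) * box, (c // box) * box
    return all(
        board[i][j] != number
        for i in range(r0, r0 + box)
        for j in range(c0, c0 + box)
    )


def total_reward(move_rewards: Sequence[float]) -> float:
    """The final reward is the sum of the per-step rewards."""
    return sum(move_rewards)
```

On the 4x4 board from the example rollout in this description, placing 3 at R2 C3 is rejected because column 3 already contains a 3, while placing 1 at R1 C1 is accepted.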

**Dataset**: The dataset is procedurally generated. To keep it
compatible with the current trajectory generation framework, we provide
a `simple_sudoku.jsonl` and a script (steps below) that can be used to
create a jsonl for trajectory generation. This will be changed
later.


## Steps to run

1. Download reference dataset

```
ng_download_dataset_from_gitlab \
    +dataset_name=simple_sudoku \
    +version=0.0.2 \
    +artifact_fpath=simple_sudoku.jsonl \
    +output_fpath=data/simple_sudoku.jsonl
```

2. Create a larger jsonl (the reference dataset has just 1 data point, as
a reference for what the system prompt and tool definition should look like).
The command below generates 5 examples. Currently the game parameters, board
size and number of clues, are selected randomly within a reasonable range:
board size is 4 or 9, and the number of clues is between 6 and 12 for a board
of size 4 and between 16 and 48 for a board of size 9. In the future we could
replace this with a parameter controlling how difficult we want the game to be.

Run below from the `simple_sudoku/` folder
```
python generate_sudoku_jsonl.py ../../data/simple_sudoku.jsonl 5 ../../data/sudoku_batch.jsonl
```
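
The random parameter selection described in step 2 might be sketched like this. The function name `sample_game_params`, the `num_clues` key, and the choice of inclusive bounds are assumptions for illustration; `generate_sudoku_jsonl.py` itself may differ.

```python
import random
from typing import Dict, Optional

# Clue ranges per board size, as stated in step 2 (bounds assumed inclusive).
CLUE_RANGES = {4: (6, 12), 9: (16, 48)}


def sample_game_params(rng: Optional[random.Random] = None) -> Dict[str, int]:
    """Pick a board size, then a clue count within that size's range."""
    rng = rng or random.Random()
    scale = rng.choice(sorted(CLUE_RANGES))
    low, high = CLUE_RANGES[scale]
    return {"scale": scale, "num_clues": rng.randint(low, high)}
```

Passing a seeded `random.Random` makes the generated batch reproducible, which could help when comparing rollouts across runs.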


3. Start the servers (this starts the simple_game_agent, the model
server & the Sudoku environment server)

```
ng_run "+config_paths=[responses_api_agents/simple_game_agent/configs/simple_game_agent.yaml,responses_api_models/openai_model/configs/openai_model.yaml,resources_servers/simple_sudoku/configs/simple_sudoku.yaml]" +simple_game_agent.responses_api_agents.simple_game_agent.resources_server.name=simple_sudoku
```

4. Start trajectory collection

```
ng_collect_traj +agent_name=simple_game_agent +input_jsonl_fpath=data/sudoku_batch.jsonl +output_jsonl_fpath=results/sudoku_output_NEWLOGIC.jsonl
```

---

**Example Rollouts**:

```
{"responses_create_params": {"input": [], "tools": [{"name": "make_move", "parameters": {"type": "object", "properties": {"row": {"type": "integer", "description": "Row number (1-4)"}, "col": {"type": "integer", "description": "Column number (1-4)"}, "number": {"type": "integer", "description": "Digit to place (1-4)"}}, "required": ["row", "col", "number"], "additionalProperties": false}, "strict": true, "type": "function", "description": "Place a digit in the Sudoku grid"}]}, "response": {"id": "resp_68bd93aa0e9081948ee1fb98147b354701b662383935ba9f", "created_at": 1757254570.0, "error": null, "incomplete_details": null, "instructions": null, "metadata": {}, "model": "gpt-4.1-2025-04-14", "object": "response", "output": [{"arguments": "{\"row\":2,\"col\":3,\"number\":3}", "call_id": "call_HRb8CVBJzjVXbqiYvDNf63yS", "name": "make_move", "type": "function_call", "id": "fc_68bd93a1faec8194b1669db531b87b8a039ed859b3babbe5", "status": "completed"}, {"call_id": "call_HRb8CVBJzjVXbqiYvDNf63yS", "output": "{\"success\": false, \"message\": \"Incorrect move: R2 C3 = 3 violates Sudoku rules\", \"game_state\": {\"current_board\": [[0, 4, 2, 0], [0, 0, 0, 1], [4, 1, 3, 0], [3, 2, 0, 0]], \"solution\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 2], [3, 2, 1, 4]], \"scale\": 4, \"initial_empty_count\": 8, \"moves_made\": 1, \"correct_moves\": 0}, \"board_text\": \"   C1 C2   C3 C4  \\nR1  .  4 |  2  .\\nR2  .  . |  .  1\\n   - - - - - - \\nR3  4  1 |  3  .\\nR4  3  2 |  .  .\", \"is_complete\": false, \"move_reward\": 0.0}", "type": "function_call_output", "id": null, "status": null}, {"arguments": "{\"row\":1,\"col\":1,\"number\":1}", "call_id": "call_hM2UMng5iSWnuET4aMpeKIoJ", "name": "make_move", "type": "function_call", "id": "fc_68bd93a31d648193b06f9923513b16640b983aa011463905", "status": "completed"}, {"call_id": "call_hM2UMng5iSWnuET4aMpeKIoJ", "output": "{\"success\": true, \"message\": \"Correct move! 
R1 C1 = 1\", \"game_state\": {\"current_board\": [[1, 4, 2, 0], [0, 0, 0, 1], [4, 1, 3, 0], [3, 2, 0, 0]], \"solution\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 2], [3, 2, 1, 4]], \"scale\": 4, \"initial_empty_count\": 8, \"moves_made\": 2, \"correct_moves\": 1}, \"board_text\": \"   C1 C2   C3 C4  \\nR1  1  4 |  2  .\\nR2  .  . |  .  1\\n   - - - - - - \\nR3  4  1 |  3  .\\nR4  3  2 |  .  .\", \"is_complete\": false, \"move_reward\": 0.0}", "type": "function_call_output", "id": null, "status": null}, {"arguments": "{\"row\":1,\"col\":4,\"number\":3}", "call_id": "call_Q866NsiyJeD807NjpnQxJk0V", "name": "make_move", "type": "function_call", "id": "fc_68bd93a44ac081978ee4664c6853523e0e28ca231f6afdf1", "status": "completed"}, {"call_id": "call_Q866NsiyJeD807NjpnQxJk0V", "output": "{\"success\": true, \"message\": \"Correct move! R1 C4 = 3\", \"game_state\": {\"current_board\": [[1, 4, 2, 3], [0, 0, 0, 1], [4, 1, 3, 0], [3, 2, 0, 0]], \"solution\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 2], [3, 2, 1, 4]], \"scale\": 4, \"initial_empty_count\": 8, \"moves_made\": 3, \"correct_moves\": 2}, \"board_text\": \"   C1 C2   C3 C4  \\nR1  1  4 |  2  3\\nR2  .  . |  .  1\\n   - - - - - - \\nR3  4  1 |  3  .\\nR4  3  2 |  .  .\", \"is_complete\": false, \"move_reward\": 0.0}", "type": "function_call_output", "id": null, "status": null}, {"arguments": "{\"row\":2,\"col\":1,\"number\":2}", "call_id": "call_giYxUnyIL7BLZCp7hKHPxHn9", "name": "make_move", "type": "function_call", "id": "fc_68bd93a5838c8196b7ac194f34837bfb039cf8ecc424db03", "status": "completed"}, {"call_id": "call_giYxUnyIL7BLZCp7hKHPxHn9", "output": "{\"success\": true, \"message\": \"Correct move! 
R2 C1 = 2\", \"game_state\": {\"current_board\": [[1, 4, 2, 3], [2, 0, 0, 1], [4, 1, 3, 0], [3, 2, 0, 0]], \"solution\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 2], [3, 2, 1, 4]], \"scale\": 4, \"initial_empty_count\": 8, \"moves_made\": 4, \"correct_moves\": 3}, \"board_text\": \"   C1 C2   C3 C4  \\nR1  1  4 |  2  3\\nR2  2  . |  .  1\\n   - - - - - - \\nR3  4  1 |  3  .\\nR4  3  2 |  .  .\", \"is_complete\": false, \"move_reward\": 0.0}", "type": "function_call_output", "id": null, "status": null}, {"arguments": "{\"row\":2,\"col\":2,\"number\":3}", "call_id": "call_y9bTBhRd2FvRDadiap5nNujC", "name": "make_move", "type": "function_call", "id": "fc_68bd93a6908c8194962f9d783ef9c08405b4c0e16f7cb6be", "status": "completed"}, {"call_id": "call_y9bTBhRd2FvRDadiap5nNujC", "output": "{\"success\": true, \"message\": \"Correct move! R2 C2 = 3\", \"game_state\": {\"current_board\": [[1, 4, 2, 3], [2, 3, 0, 1], [4, 1, 3, 0], [3, 2, 0, 0]], \"solution\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 2], [3, 2, 1, 4]], \"scale\": 4, \"initial_empty_count\": 8, \"moves_made\": 5, \"correct_moves\": 4}, \"board_text\": \"   C1 C2   C3 C4  \\nR1  1  4 |  2  3\\nR2  2  3 |  .  1\\n   - - - - - - \\nR3  4  1 |  3  .\\nR4  3  2 |  .  .\", \"is_complete\": false, \"move_reward\": 0.0}", "type": "function_call_output", "id": null, "status": null}, {"arguments": "{\"row\":2,\"col\":3,\"number\":4}", "call_id": "call_YV4txJW0g9sgS5UZpS0cnzTC", "name": "make_move", "type": "function_call", "id": "fc_68bd93a780408197a8f194c0c7253d3e07d834853ce8c529", "status": "completed"}, {"call_id": "call_YV4txJW0g9sgS5UZpS0cnzTC", "output": "{\"success\": true, \"message\": \"Correct move! 
R2 C3 = 4\", \"game_state\": {\"current_board\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 0], [3, 2, 0, 0]], \"solution\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 2], [3, 2, 1, 4]], \"scale\": 4, \"initial_empty_count\": 8, \"moves_made\": 6, \"correct_moves\": 5}, \"board_text\": \"   C1 C2   C3 C4  \\nR1  1  4 |  2  3\\nR2  2  3 |  4  1\\n   - - - - - - \\nR3  4  1 |  3  .\\nR4  3  2 |  .  .\", \"is_complete\": false, \"move_reward\": 0.0}", "type": "function_call_output", "id": null, "status": null}, {"arguments": "{\"row\":3,\"col\":4,\"number\":2}", "call_id": "call_tOUxWELZGfigplBp5lEqQLvs", "name": "make_move", "type": "function_call", "id": "fc_68bd93a895148197bfb733d4c5af173f0db6c67fa5f6bb70", "status": "completed"}, {"call_id": "call_tOUxWELZGfigplBp5lEqQLvs", "output": "{\"success\": true, \"message\": \"Correct move! R3 C4 = 2\", \"game_state\": {\"current_board\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 2], [3, 2, 0, 0]], \"solution\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 2], [3, 2, 1, 4]], \"scale\": 4, \"initial_empty_count\": 8, \"moves_made\": 7, \"correct_moves\": 6}, \"board_text\": \"   C1 C2   C3 C4  \\nR1  1  4 |  2  3\\nR2  2  3 |  4  1\\n   - - - - - - \\nR3  4  1 |  3  2\\nR4  3  2 |  .  .\", \"is_complete\": false, \"move_reward\": 0.0}", "type": "function_call_output", "id": null, "status": null}, {"arguments": "{\"row\":4,\"col\":3,\"number\":1}", "call_id": "call_JxdEvztCmWNb0oxvgDzqFJFV", "name": "make_move", "type": "function_call", "id": "fc_68bd93a992a88193a5a0bf428d294140050e29fd449a66b6", "status": "completed"}, {"call_id": "call_JxdEvztCmWNb0oxvgDzqFJFV", "output": "{\"success\": true, \"message\": \"Correct move! 
R4 C3 = 1\", \"game_state\": {\"current_board\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 2], [3, 2, 1, 0]], \"solution\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 2], [3, 2, 1, 4]], \"scale\": 4, \"initial_empty_count\": 8, \"moves_made\": 8, \"correct_moves\": 7}, \"board_text\": \"   C1 C2   C3 C4  \\nR1  1  4 |  2  3\\nR2  2  3 |  4  1\\n   - - - - - - \\nR3  4  1 |  3  2\\nR4  3  2 |  1  .\", \"is_complete\": false, \"move_reward\": 0.0}", "type": "function_call_output", "id": null, "status": null}, {"arguments": "{\"row\":4,\"col\":4,\"number\":4}", "call_id": "call_FAcCh4KDrE657ryKXNgZL695", "name": "make_move", "type": "function_call", "id": "fc_68bd93ab49f481949fa445b5d6ad069501b662383935ba9f", "status": "completed"}, {"call_id": "call_FAcCh4KDrE657ryKXNgZL695", "output": "{\"success\": true, \"message\": \"Correct move! R4 C4 = 4. Congratulations! Puzzle completed!\", \"game_state\": {\"current_board\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 2], [3, 2, 1, 4]], \"solution\": [[1, 4, 2, 3], [2, 3, 4, 1], [4, 1, 3, 2], [3, 2, 1, 4]], \"scale\": 4, \"initial_empty_count\": 8, \"moves_made\": 9, \"correct_moves\": 8}, \"board_text\": \"   C1 C2   C3 C4  \\nR1  1  4 |  2  3\\nR2  2  3 |  4  1\\n   - - - - - - \\nR3  4  1 |  3  2\\nR4  3  2 |  1  4\", \"is_complete\": true, \"move_reward\": 1.0}", "type": "function_call_output", "id": null, "status": null}], "parallel_tool_calls": false, "temperature": 1.0, "tool_choice": "auto", "tools": [{"name": "make_move", "parameters": {"type": "object", "properties": {"row": {"type": "integer", "description": "Row number (1-4)"}, "col": {"type": "integer", "description": "Column number (1-4)"}, "number": {"type": "integer", "description": "Digit to place (1-4)"}}, "required": ["row", "col", "number"], "additionalProperties": false}, "strict": true, "type": "function", "description": "Place a digit in the Sudoku grid"}], "top_p": 1.0, "background": false, "max_output_tokens": null, "max_tool_calls": 1, "previous_response_id": 
null, "prompt": null, "reasoning": {"effort": null, "generate_summary": null, "summary": null}, "service_tier": "default", "status": "completed", "text": {"format": {"type": "text"}, "verbosity": "medium"}, "top_logprobs": 0, "truncation": "disabled", "usage": {"input_tokens": 1103, "input_tokens_details": {"cached_tokens": 0}, "output_tokens": 23, "output_tokens_details": {"reasoning_tokens": 0}, "total_tokens": 1126}, "user": null, "prompt_cache_key": null, "safety_identifier": null, "store": true, "input": [{"role": "user", "type": "message", "content": "You are playing a simple version of Sudoku.\nEach row is numbered from 1 to 4, and each column is also numbered from 1 to 4.\nEmpty cells are represented by '.', and pre-filled cells contain digits from 1 to 4.\n\nYour objective is to fill the empty cells in the 4x4 grid with digits from 1 to 4 such that:\n1. Each row contains all digits from 1 to 4 without repetition.\n2. Each column contains all digits from 1 to 4 without repetition.\n3. Each 2x2 subgrid contains all digits from 1 to 4 without repetition.\n\nRules and Instructions:\n1. **Do not overwrite** the initial numbers provided in the grid.\n2. **Only fill** empty cells represented by '.'.\n3. You must respond with the format '\\boxed{row column number}', e.g. \\boxed{1 1 5}.\n4. **Ensure** that your move does not violate Sudoku rules. Invalid moves will result in penalties.\nUse the make_move function to submit your moves. Good luck!\n\n\n\n   C1 C2   C3 C4  \nR1  .  4 |  2  .\nR2  .  . |  .  1\n   - - - - - - \nR3  4  1 |  3  .\nR4  3  2 |  .  ."}]}, "reward": 1.0, "total_moves": 9, "is_complete": true}
```
---

---------

Signed-off-by: Rahul Chand <rchand@nvidia.com>
Reverts #30

Signed-off-by: Brian Yu <bxyu@nvidia.com>
This change updates the train_data_utils via `ng_prepare_data` to apply
data aggregations to the other keys within an `example.jsonl` file.

---------

Signed-off-by: Frankie Siino <fsiino@nvidia.com>
Co-authored-by: bxyu-nvidia <bxyu@nvidia.com>
…mprovements (#77)

Signed-off-by: Brian Yu <bxyu@nvidia.com>
…ning (#66)

Remove unnecessary GitHub Actions CI and add uv config to enable
dependency scanning

* This project's current CI doesn't need to build and test through a
Docker image, so this deletes the unnecessary CI Dockerfile and GitHub
Actions template
* Adding `managed = true` under `[tool.uv]` to allow for repo dependency
scanning

---------

Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
This implements a simple rounding rule for `AvgMinMax` floats in order
to keep example_metrics consistent.
For background, the addition of median and std dev did not assign a
ceiling for decimal places, so trivial value differences such as `1.2 !=
1.200002` caused ValueErrors.
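
A rounding rule of this kind can be sketched as below. The class name, its field names, and the choice of four decimal places are all hypothetical; the actual `AvgMinMax` definition and precision are not shown in this description.

```python
from dataclasses import dataclass, fields

ROUND_DECIMALS = 4  # hypothetical precision; the real value is not stated


@dataclass
class AvgMinMaxSketch:
    """Stand-in for AvgMinMax; the field names are assumptions."""

    average: float
    minimum: float
    maximum: float
    median: float
    stddev: float

    def __post_init__(self) -> None:
        # Round every float once at construction so later equality checks
        # are not tripped by trivial differences such as 1.2 vs 1.200002.
        for f in fields(self):
            setattr(self, f.name, round(getattr(self, f.name), ROUND_DECIMALS))
```

With this rule, two metric objects that differ only beyond the rounding precision compare equal, so a consistency check between stored and recomputed metrics would no longer raise.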

---------

Signed-off-by: Frankie Siino <fsiino@nvidia.com>
…dd uvicorn logging filtering (#79)

Signed-off-by: Brian Yu <bxyu@nvidia.com>
From now on, any GitHub repo under the NVIDIA-NeMo org will use the
default Issues system and behavior unless it is overridden, i.e. the
individual repo has a .github/ISSUE_TEMPLATE folder in the repo itself,
which overrides the default behavior coming from the NVIDIA-NeMo/.github
repo (which establishes the default system).

The default behavior for issue creation is now:
- There is a Bug Report issue
- There is a Feature Request issue
- Blank issues are NOT allowed to be created
# Add `num_repeats` hyperparameter for dataset repetition

## Summary
Adds optional `num_repeats` parameter to `DatasetConfig` that allows
repeating each dataset sample during training data processing and
preparation.

## Changes
- **Config**: Added `num_repeats: Optional[int] = Field(default=None,
ge=1)` to `DatasetConfig`
- **Processing**: Modified `_iter_dataset_lines()` to repeat each line
`num_repeats` times (defaults to 1)
- **Integration**: Updated data validation, metrics aggregation, and
preparation workflows to handle repeated samples
- **Documentation**: Updated README with usage examples

## Usage
```yaml
datasets:
  - name: train
    type: train
    jsonl_fpath: data/train.jsonl
    num_repeats: 3  # Each sample appears 3 times during processing
```
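
The repetition behavior can be pictured with the small generator below. It is a sketch under assumptions, not the actual `_iter_dataset_lines()`: the name `iter_repeated_lines` is hypothetical, and emitting the copies of each sample consecutively (rather than looping over the whole dataset several times) is an assumed ordering.

```python
from typing import Iterable, Iterator, Optional


def iter_repeated_lines(
    lines: Iterable[str], num_repeats: Optional[int] = None
) -> Iterator[str]:
    """Yield each line `num_repeats` times; None behaves like 1."""
    repeats = 1 if num_repeats is None else num_repeats
    if repeats < 1:
        raise ValueError("num_repeats must be >= 1")
    for line in lines:
        for _ in range(repeats):
            yield line
```

With `num_repeats: 3` as in the config above, a two-sample file would produce six lines under this sketch.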

## Testing
Added comprehensive unit tests covering:
- Configuration validation (accepts positive integers, rejects invalid
values)
- Data iteration with different repeat values
- Metrics aggregation with repeated samples
- Data preparation workflow integration

All existing functionality preserved with backward compatibility
(defaults to 1 repetition).

---------

Signed-off-by: Mahan Fathi <mfathi@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Co-authored-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Khushi Bhardwaj <kbhardwaj@nvidia.com>
Co-authored-by: Khushi Bhardwaj <kbhardwaj@nvidia.com>
cwing-nvidia and others added 9 commits February 15, 2026 21:40
will add back later with Nano 3 recipe

---------

Signed-off-by: Chris Wing <cwing@nvidia.com>
adds docs for trl integration 

see #371

---------

Signed-off-by: cmunley1 <cmunley@nvidia.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>
Co-authored-by: Lawrence Lane <llane@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
- Replace intro with clearer scope statement
- Add tip linking to README for existing environments
- Add verification method as fifth environment property with link to
concepts page
- Improve explanatory text for rollout structure and core capabilities
tables
- Add section headers for tables
- Remove duplicate Verification Methods table (covered in concepts)
- Remove Reference Implementations table (covered in README)
- Remove Learning Path section
- Update environment tutorial card descriptions and add
multi-environment card to docs home

---------

Signed-off-by: Chris Wing <cwing@nvidia.com>
Fixes #670 
- Fix #-available-resource-servers anchor to #-available-environments
across 5 files
- Remove Next Steps section from unsloth tutorial
- Improve environment card descriptions

Signed-off-by: Chris Wing <cwing@nvidia.com>
no content changes just style guide application

Signed-off-by: Lawrence Lane <llane@nvidia.com>
…ting Started (#721)

- Remove first-training-run.md (duplicated unsloth tutorial content)
- Add "Start Training" as recommended next step in quickstart, rollout
collection
- Add training bullet to README next steps
- Remove first-training-run from toctree and landing page

Signed-off-by: Chris Wing <cwing@nvidia.com>
Co-authored-by: Lawrence Lane <llane@nvidia.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>
@lbliii lbliii self-assigned this Feb 17, 2026
Signed-off-by: Lawrence Lane <llane@nvidia.com>
@lbliii lbliii changed the title version bump docs: version bump, CTA link changes Feb 17, 2026
Signed-off-by: Brian Yu <bxyu@nvidia.com>
bxyu-nvidia
bxyu-nvidia previously approved these changes Feb 17, 2026
@lbliii lbliii enabled auto-merge (squash) February 17, 2026 20:45
@vadam5 vadam5 closed this Mar 10, 2026
auto-merge was automatically disabled March 10, 2026 23:08

Pull request was closed

@vadam5 vadam5 force-pushed the llane/docs-remove-link branch from 9d4ed48 to aa52c7e Compare March 10, 2026 23:08
@copy-pr-bot

copy-pr-bot bot commented Mar 10, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@bxyu-nvidia
Contributor

Sorry folks, this PR was closed by mistake when one of our folks force-pushed diverging refs to GitHub. We are looking to remedy this and re-open the PR.

@vadam5
Contributor

vadam5 commented Mar 11, 2026

Replacement PR opened here: #880

bxyu-nvidia pushed a commit that referenced this pull request Mar 20, 2026
Replacement PR for: #722
Contributors: @lbliii

---------

Signed-off-by: Lawrence Lane <llane@nvidia.com>
Co-authored-by: Lawrence Lane <llane@nvidia.com>
MahanFathi pushed a commit that referenced this pull request Mar 24, 2026
Replacement PR for: #722
Contributors: @lbliii

---------

Signed-off-by: Lawrence Lane <llane@nvidia.com>
Co-authored-by: Lawrence Lane <llane@nvidia.com>
jsw-zorro pushed a commit to niletron/Gym that referenced this pull request Apr 7, 2026
Replacement PR for: NVIDIA-NeMo#722
Contributors: @lbliii

---------

Signed-off-by: Lawrence Lane <llane@nvidia.com>
Co-authored-by: Lawrence Lane <llane@nvidia.com>