feat: support multiple datasets for response dataset#1691
Conversation
ff87f2c to
5835ce7
Compare
94e40a6 to
c0b8cde
Compare
2a4cedd to
20f3a62
Compare
d9836a6 to
8577efb
Compare
f9def0d to
ec862a3
Compare
74a26c0 to
a990378
Compare
📝 WalkthroughWalkthroughIntroduces multi-dataset support for the GRPO framework through a new configuration file, refactored data handling across two modules, an override-aware config merge utility, and corresponding functional tests. Changes
Sequence Diagram(s)sequenceDiagram
participant Config as Config Loader
participant DataList as Multiple Datasets
participant Merge as Dataset Merger
participant Processor as Task Processors
participant Train as Training
Config->>DataList: Load each dataset config
DataList->>DataList: Apply per-dataset defaults
DataList->>Merge: Collect all datasets
Merge->>Merge: Concatenate datasets
Merge->>Processor: Build per-task processors<br/>(per dataset)
Processor->>Processor: Merge processor mappings<br/>(task_name → processor)
Processor->>Train: Create AllTaskProcessedDataset<br/>with merged data + processors
Train->>Train: Execute training
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
nemo_rl/utils/config.py (1)
1-1: Update the copyright header year to include 2026.Line 1 still shows 2025; please update to include the current year. As per coding guidelines, headers must include the current year.
📝 Suggested header update
-# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved. +# Copyright (c) 2025-2026, NVIDIA CORPORATION. All rights reserved.examples/run_sft.py (1)
1-1: Update the copyright header year to include 2026.Line 1 still shows 2025; please update to include the current year. As per coding guidelines, headers must include the current year.
📝 Suggested header update
-# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved. +# Copyright (c) 2025-2026, NVIDIA CORPORATION. All rights reserved.nemo_rl/data/utils.py (1)
1-1: Update the copyright header year to include 2026.Line 1 still shows 2025; please update to include the current year. As per coding guidelines, headers must include the current year.
📝 Suggested header update
-# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved. +# Copyright (c) 2025-2026, NVIDIA CORPORATION. All rights reserved.
🤖 Fix all issues with AI agents
In `@nemo_rl/data/utils.py`:
- Around line 75-77: The normalization logic in utils.py currently only wraps
single-dataset entries when data_config["train"] or data_config["validation"] is
a plain dict, but misses omegaconf.DictConfig; import DictConfig from omegaconf
and update the two isinstance checks (the one that normalizes
data_config["train"] and the one for data_config["validation"]) to check
isinstance(..., (dict, DictConfig)) so single-dataset DictConfig objects are
wrapped into a list before iterating (this fixes errors in
load_response_dataset).
In `@tests/functional/grpo_multiple_datasets.sh`:
- Around line 20-40: The shell command at the end of
tests/functional/grpo_multiple_datasets.sh uses an unquoted $@ which can break
arguments with spaces; update the invocation to use "$@" (i.e., replace $@ with
"$@") so all passed overrides/arguments preserve their boundaries when appended
to the uv run command before the redirection and tee pipeline.
🧹 Nitpick comments (1)
nemo_rl/utils/config.py (1)
30-44: Consider supporting nested_override_markers for consistency and future-proofing.Currently, the function only processes top-level
_override_markers. While no nested markers are used today, a recursive approach would ensure consistent behavior if nested overrides are ever needed.Suggested recursive handling
+def _apply_override_markers( + base_cfg: DictConfig, override_cfg: DictConfig +) -> None: + for key, value in list(override_cfg.items()): + if isinstance(value, DictConfig): + if value.get("_override_", False): + value.pop("_override_") + base_cfg.pop(key, None) + else: + child_base = base_cfg.get(key) + if isinstance(child_base, DictConfig): + _apply_override_markers(child_base, value) + else: + _apply_override_markers(OmegaConf.create({}), value) + def merge_with_override( base_config: DictConfig, override_config: DictConfig ) -> DictConfig: """Merge configs with support for _override_ marker to completely override sections.""" - for key in list(override_config.keys()): - if isinstance(override_config[key], DictConfig): - if override_config[key].get("_override_", False): - # remove the _override_ marker - override_config[key].pop("_override_") - # remove the key from base_config so it won't be merged - if key in base_config: - base_config.pop(key) + _apply_override_markers(base_config, override_config)
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
d1d8e05 to
c570a05
Compare
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Related issue: #1049
Usage
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.