[benchmarking] Adds support for fine-grained config overrides and global ray config#1840
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
Greptile SummaryThis PR replaces the shallow Confidence Score: 5/5Safe to merge; all prior P1 concerns have been addressed and the only remaining finding is a P2 documentation gap. Prior thread issues (None-document crash, key-order match robustness, non-dict append semantics) are either fixed or previously flagged. The one new finding is a P2 usability note about scalar-list append-only semantics — it doesn't cause incorrect benchmark execution or data loss, so it does not reduce the score below 5. No files require special attention; the scalar-list documentation gap in benchmarking/run.py and benchmarking/README.md is optional to address before merge. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["merge_config_files(config_files)"] --> B["for each YAML file"]
B --> C["yaml.full_load_all(f)"]
C --> D{document is None?}
D -- yes --> B
D -- no --> E["update_config(config_dict, new_dict)"]
E --> F{key in config_dict?}
F -- no --> G["config_dict[key] = value"]
F -- yes --> H{both dicts?}
H -- yes --> I["recurse: update_config(config_dict[key], value)"]
H -- no --> J{both lists?}
J -- yes --> K["for each item in override list"]
K --> L{item is non-empty dict?}
L -- yes --> M["find base item with same first-key name & value"]
M -- found --> N["recurse: update_config(base_item, item)"]
M -- not found --> O["append item to base list"]
L -- no --> P["append scalar/empty to base list"]
J -- no --> Q["config_dict[key] = value (replace)"]
E --> R["return config_dict"]
R --> S["Session.from_dict(config_dict)"]
S --> T["Entry.from_dict per entry"]
T --> U["Session.__post_init__"]
U --> V["entry.ray = {**global_ray, **entry.ray}"]
Reviews (7): Last reviewed commit: "handles empty YAML files." | Re-trigger Greptile |
| @@ -12,6 +12,8 @@ | |||
| # See the License for the specific language governing permissions and | |||
| # limitations under the License. | |||
|
|
|||
There was a problem hiding this comment.
File-level ERA001 suppression too broad
# ruff: noqa: ERA001 disables "found commented-out code" for the entire file. The trigger appears to be the YAML example block in lines 76–87 of update_config. Suppressing the rule file-wide will silently hide any genuinely commented-out code added anywhere in run.py in the future. Prefer a targeted inline suppression on just those lines, or wrap the example in a docstring.
Or, better, move the inline YAML example into the function's docstring and remove this directive entirely.
| first_key = next(iter(sub_val.keys())) | ||
| for config_sub_val in config_dict[key]: | ||
| if ( | ||
| isinstance(config_sub_val, dict) | ||
| and config_sub_val | ||
| and next(iter(config_sub_val.keys())) == first_key | ||
| and config_sub_val[first_key] == sub_val[first_key] |
There was a problem hiding this comment.
Key-order dependency can silently drop overrides
List items are matched by comparing the first key name of the override dict against the first key name of each base dict (next(iter(config_sub_val.keys())) == first_key). If a base entry has a different first key than the override — e.g., enabled: comes before name: in the base file but name: is first in the override — the condition is always False, so the override item is silently appended rather than merged.
This creates a duplicate entry that is only caught later in Session.__post_init__ as a confusing ValueError with no hint about the root cause.
A more robust match uses config_sub_val.get(first_key) == sub_val[first_key] (removing the next(iter(...)) == first_key check), so matching works regardless of key ordering in the base dict.
…ched by first key), and merge_config_files() to apply it across all config files. Replaces the previous shallow dict.update() in main() which would clobber entire top-level keys (e.g. all entries) when overriding config. Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Adds a top-level `ray` section to the YAML config that defines global
ray defaults (num_cpus, num_gpus, enable_object_spilling) inherited by
all entries. Per-entry `ray` sections now only need to specify keys that
differ from the global defaults.
Session.__post_init__ applies the merge: `{**self.ray, **entry.ray}`,
so per-entry values always take precedence. Updates nightly-benchmark.yaml
to use a global `ray` section and removes redundant per-entry ray blocks
(18 entries now specify only their overrides; 18 had no overrides and had
their ray sections removed entirely). Updates README to document the behavior.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
rayconfiguration which is inherited by each entry. Entries can override the global ray config by providing their ownraysection.These changes make it practical to write small override files that change only specific entries or requirements without duplicating the full configuration. For example:
Base config (
nightly-benchmark.yaml) defines many entries including:Override file (
my_overrides.yaml) changes only that entry's timeout and requirement minimum:Running with both files:
Results in
domain_classification_xennausingtimeout_s: 2000andmin_value: 2000, while all other entries remain unchanged.The previous implementation performed a shallow copy, which would mean the entire
entrieslist would be overridden by the content above and every entry specified in previously-read YAMLs (eg.nightly-benchmark.yaml) would not be present in the final config.The global
raysection eliminates a significant amount of repeated YAML fromnightly-benchmark.yaml