Simplify puzzletron test configs: use HF model names and shared base YAMLs #1039
Merged
kevalmorabia97 merged 7 commits into dkorzekwa/any_model_other_models on Mar 17, 2026
Conversation
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
danielkorzekwa approved these changes on Mar 17, 2026.
Merged commit 1357b26 into dkorzekwa/any_model_other_models. 11 checks passed.
What does this PR do?

Type of change: New tests / Refactoring

Simplifies the puzzletron test infrastructure by:

- Removing the `hf_configs/` folder — HuggingFace configs are now loaded on the fly via `AutoConfig.from_pretrained(hf_model_name)` instead of from cached static files.
- Removing the `HF_MODEL_CARD_NAMES` mapping — HF model names (e.g. `meta-llama/Llama-3.1-8B-Instruct`) are passed directly as test parameters.
- Replacing the hardcoded VL model check with `hasattr(config, "text_config") and hasattr(config, "vision_config")` for generic detection.
- Unifying ~6k lines of near-identical YAML into shared base configs with per-model overrides:
  - `validate_model_defaults.yaml`, `validate_solutions_defaults.yaml` — shared validation params
  - `pruning/pruning_defaults.yaml`, `pruning/ffn_pruning_base.yaml`, `pruning/attn_pruning.yaml`, `pruning/hidden_dim_pruning.yaml` — shared pruning bases
  - Per-model configs are keyed by HF model name (e.g. `meta-llama/Llama-3.1-8B-Instruct/`) and contain only model-specific overrides (e.g. just the `layer_descriptor._target_class`)
- Removing the `hydra_config_subdir` parameter from the test parametrize — the config path is derived from `hf_model_name` directly.
- Removing unused `bypass:` entries from all per-model main YAMLs.

Usage
Testing

All 8 parametrized test cases in `test_puzzletron.py` pass.

CI Job: https://github.com/NVIDIA/Model-Optimizer/actions/runs/23087216443/job/67065820836
Before your PR is "Ready for review"
Additional Information

Hydra packaging notes (non-obvious fixes required):

- Added `# @package _global_` to all per-model main YAMLs — needed when `config_name` contains path separators; otherwise Hydra nests all keys under the org/model package.
- Added `@_here_` to sub-defaults inside `pruning/` configs — prevents Hydra from compounding the `pruning` package at each inheritance level (`pruning` → `pruning.pruning` → `pruning.pruning.pruning`).
- Moved `hydra/hydra_logging=disabled` from the YAML defaults list to `overrides=` in `puzzletron.py` — the YAML override syntax broke with nested config paths.
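Schematically, the two packaging directives might be applied like this; the file names and defaults entries below are illustrative, not copied from the repo:

```yaml
# @package _global_
# Illustrative per-model main YAML (e.g. a file under
# meta-llama/Llama-3.1-8B-Instruct/). Without the directive above, a
# config_name containing "/" makes Hydra nest every key under an
# org/model package.
defaults:
  - validate_model_defaults
  - pruning: ffn_pruning_base
  - _self_
---
# Illustrative pruning base config: the @_here_ package keeps the inherited
# keys in this config's own package instead of compounding them into
# pruning.pruning.pruning at each inheritance level.
defaults:
  - pruning_defaults@_here_
  - _self_
```

`hydra/hydra_logging=disabled`, by contrast, moves out of such defaults lists entirely and into the `overrides=` argument when the config is composed in `puzzletron.py`.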