gpt-oss 20b support #889
Conversation
Review threads on:

- `modelopt/torch/puzzletron/anymodel/models/gpt_oss_20b/gpt_oss_pruned_to_mxfp4.py` (resolved)
- `examples/puzzletron/configs/gptoss-20b_remove_experts_memory/gptoss-20b.yaml` (outdated, resolved)
- `...es/puzzletron/configs/gptoss-20b_remove_experts_memory/gptoss-20b_remove_experts_memory.yaml` (outdated, resolved)
- `modelopt/torch/puzzletron/anymodel/models/gpt_oss_20b/gpt_oss_pruned_to_mxfp4.py` (outdated, resolved)
Signed-off-by: mchochowski <mchochowski@nvidia.com>
…uator (NVIDIA#894)

This PR adds Nemo Evaluator support to the AnyModel branch. It includes documentation and a deployment script that allow for evaluation of AnyModel Puzzletron checkpoints with Nemo Evaluator. We assume development on a GPU node, following the current tutorial style, so we don't rely on Slurm-based deployment/evaluation, but instead use direct evaluation via `eval-factory run_eval`.

Signed-off-by: jrausch <jrausch@nvidia.com>
Signed-off-by: mchochowski <mchochowski@nvidia.com>
## What does this PR do?

**Overview:**

- Update the AnyModel Puzzletron tutorial to use lm-eval. We add a script that monkey-patches lm-eval to use the patched AnyModel model loading.
- No need for running Ray deployments or replacing the NeMo Export-Deploy deployment script with a patched version.
- Moved instructions for using NeMo Evaluator to an alternative README file.

Signed-off-by: jrausch <jrausch@nvidia.com>
Signed-off-by: mchochowski <mchochowski@nvidia.com>
## What does this PR do?

**Overview:** Updated the license of `examples/puzzletron/evaluation/lm_eval_anymodel.py` to match that of the reference `examples/llm_eval/lm_eval_hf.py`.

Signed-off-by: jrausch <jrausch@nvidia.com>
Signed-off-by: mchochowski <mchochowski@nvidia.com>
Signed-off-by: mchochowski <mchochowski@nvidia.com>
…ml config Signed-off-by: mchochowski <mchochowski@nvidia.com>
Signed-off-by: mchochowski <mchochowski@nvidia.com>
Force-pushed from e07dbaa to ee182b5 (compare)
Signed-off-by: mchochowski <mchochowski@nvidia.com>
kevalmorabia97
left a comment
Left some comments. Also seeing that pre-commit formatting was not applied. Please run `pre-commit run --all-files`.
Why do we need this env variable and to add the workdir to sys.path?
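For context on the question above, this is the general pattern being asked about: reading a working directory from an environment variable and prepending it to `sys.path` so that local modules become importable. The variable name `PUZZLETRON_WORKDIR` here is purely illustrative, not the one used in the PR:

```python
# Hypothetical sketch of env-var-driven sys.path manipulation.
# Scripts launched from outside the workdir can't import its local
# modules unless the directory is added to the import search path.
import os
import sys

workdir = os.environ.get("PUZZLETRON_WORKDIR", os.getcwd())
if workdir not in sys.path:
    sys.path.insert(0, workdir)  # prepend so local modules win lookup

print(workdir in sys.path)  # True
```

The reviewer's point stands: if the script is always launched from the workdir, or the package is properly installed, this mutation is unnecessary.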
Why do we need to broadcast_list twice instead of reusing output of first call for 2nd one?
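The suggestion above can be sketched abstractly. A stub stands in for a distributed collective such as `torch.distributed.broadcast_object_list` so the example runs single-process; the point is simply that one broadcast result can be shared by both consumers, halving the collective calls:

```python
# Sketch of the review suggestion: broadcast once and reuse the result,
# instead of issuing the same broadcast twice. The stub counts calls so
# the saving is observable.
calls = {"n": 0}

def broadcast_list(objs):
    calls["n"] += 1      # count how many collective calls we issue
    return list(objs)    # stub: rank 0 just returns its own objects

payload = ["prompt-a", "prompt-b"]

# Before: two separate broadcasts of the same payload.
#   first = broadcast_list(payload)
#   second = broadcast_list(payload)

# After: broadcast once, reuse the output for both consumers.
shared = broadcast_list(payload)
first, second = shared, shared

print(calls["n"], first is second)  # 1 True
```

In a real distributed setting each avoided broadcast also saves a synchronization point across ranks, not just a function call.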
This is actually nemo-deploy code with a patch; we didn't want to touch the internals, only update the model loading.
Signed-off-by: mchochowski <mchochowski@nvidia.com>
Signed-off-by: chochowski <Marcin.Chochowski@gmail.com>
Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Signed-off-by: chochowski <Marcin.Chochowski@gmail.com>
## What does this PR do?

Adds gpt-oss-20b support for puzzle any-model pruning.

**Type of change:** new feature

**Overview:** Adds descriptor, converter, and YAML configuration files for expert removal. Introduces slight changes on conversion to account for the mxfp4-quantized checkpoint of gpt-oss.

## Usage

```python
# Add a code snippet demonstrating how to use this
```

## Testing

## Before your PR is "*Ready for review*"

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes/No
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes/No

## Additional Information

---------

Signed-off-by: mchochowski <mchochowski@nvidia.com>
Signed-off-by: jrausch <jrausch@nvidia.com>
Signed-off-by: chochowski <Marcin.Chochowski@gmail.com>
Co-authored-by: J Rausch <38429553+j-rausch@users.noreply.github.com>
Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
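To make the "expert removal" idea in the overview concrete, here is a minimal sketch of pruning experts from a mixture-of-experts layer in plain PyTorch. The `TinyMoE` module and `remove_experts` helper are illustrative only; they are not the Puzzletron descriptor/converter this PR adds, and real expert removal for an mxfp4-quantized checkpoint also has to rewrite the quantized weight blocks:

```python
# Illustrative expert removal: keep a subset of experts and shrink the
# router so its output dimension matches the surviving experts.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, num_experts=4, dim=8):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.router = nn.Linear(dim, num_experts)  # one logit per expert

def remove_experts(moe, keep):
    # Drop pruned experts; order of `keep` defines the new expert indices.
    moe.experts = nn.ModuleList(moe.experts[i] for i in keep)
    # Rebuild the router with fewer outputs, copying the kept logit rows.
    new_router = nn.Linear(moe.router.in_features, len(keep))
    with torch.no_grad():
        new_router.weight.copy_(moe.router.weight[list(keep)])
        new_router.bias.copy_(moe.router.bias[list(keep)])
    moe.router = new_router
    return moe

moe = remove_experts(TinyMoE(), keep=[0, 2])
print(len(moe.experts), moe.router.out_features)  # 2 2
```

Slicing the router rows (rather than reinitializing them) preserves the relative routing preferences among the surviving experts, which is typically what a memory-motivated pruning pass wants.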