SpecDec Bench: PostProcess flag #759

IzzyPutterman · 2026-01-11T08:26:42Z

What does this PR do?

Type of change: ? Bug Fix: https://nvbugspro.nvidia.com/bug/5795144

Overview: ? Pass postprocess flag to handle slicing message.

Usage

# Add a code snippet demonstrating how to use this

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed.
Is this change backward compatible?: Yes/No
Did you write any new necessary tests?: Yes/No
Did you add or update any necessary documentation?: Yes/No
Did you update Changelog?: Yes/No

Additional Information

Summary by CodeRabbit

Release Notes

New Features
- Introduced --postprocess command-line option to select postprocessing strategy. Users can choose "base" (default, preserves existing behavior) or "gptoss" (new alternative method) with validation to reject invalid selections.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2026-01-11T08:26:52Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Adds a new --postprocess CLI option to support selecting between "base" (existing) and "gptoss" (new) postprocessing modes. The selected postprocessing function is threaded through the call stack to the main loop.

Changes

Cohort / File(s)	Summary
Postprocessing CLI option `examples/specdec_bench/run.py`	Introduces `--postprocess` CLI argument with "base" and "gptoss" choices (default "base"). Imports `postprocess_gptoss` from utils. Wires postprocessing selection through `run_simple` to `run_loop`, passing the appropriate postprocess callable based on user input.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: adding a PostProcess flag to SpecDec Bench, which aligns with the changeset's focus on introducing a new --postprocess CLI option and postprocessing path selection.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

examples/specdec_bench/run.py (1)
36-46: Update the docstring to document the postprocess parameter.

The function signature now includes a postprocess parameter, but the docstring doesn't document it. Please add documentation for this parameter to maintain clarity.
📝 Proposed docstring update
     """
     Async version of run_loop with concurrency control using a semaphore.
 
     Args:
         runner: The model runner instance
         dataset: The dataset containing requests
         tokenizer: The tokenizer instance
         output_length: Maximum output length
+        postprocess: Postprocessing function to apply to model outputs
         concurrency: Maximum number of concurrent requests (default: 10)
     """

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5104513 and 068dcc3.

📒 Files selected for processing (1)

examples/specdec_bench/run.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: linux
GitHub Check: wait-checks / wait
GitHub Check: build-docs
GitHub Check: code-quality

🔇 Additional comments (3)

examples/specdec_bench/run.py (3)

118-123: LGTM: Clean postprocessing function selection.

The selection logic correctly maps the CLI argument to the appropriate postprocessing function. The ValueError provides a good safeguard if run_simple is called programmatically.

197-204: LGTM: Well-structured CLI argument.

The --postprocess argument is properly configured with appropriate defaults, choices validation, and maintains backward compatibility by defaulting to "base".

21-27: Import verified. The postprocess_gptoss function is properly defined in specdec_bench.utils and will import without errors.

codecov · 2026-01-11T08:37:15Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.62%. Comparing base (5104513) to head (4a4723c).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #759   +/-   ##
=======================================
  Coverage   74.62%   74.62%           
=======================================
  Files         192      192           
  Lines       18989    18989           
=======================================
  Hits        14171    14171           
  Misses       4818     4818

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Signed-off-by: Izzy Putterman <iputterman@nvidia.com>

## What does this PR do? **Type of change:** ? Bug Fix: https://nvbugspro.nvidia.com/bug/5795144 **Overview:** ? Pass postprocess flag to handle slicing message. ## Usage  ```python # Add a code snippet demonstrating how to use this ``` ## Testing  ## Before your PR is "*Ready for review*"  - **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed. - **Is this change backward compatible?**: Yes/No  - **Did you write any new necessary tests?**: Yes/No - **Did you add or update any necessary documentation?**: Yes/No - **Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes/No  ## Additional Information   ## Summary by CodeRabbit ## Release Notes * **New Features** * Introduced --postprocess command-line option to select postprocessing strategy. Users can choose "base" (default, preserves existing behavior) or "gptoss" (new alternative method) with validation to reject invalid selections. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub>  Signed-off-by: Izzy Putterman <iputterman@nvidia.com>

## What does this PR do? **Type of change:** ? Bug Fix: https://nvbugspro.nvidia.com/bug/5795144 **Overview:** ? Pass postprocess flag to handle slicing message. ## Usage  ```python # Add a code snippet demonstrating how to use this ``` ## Testing  ## Before your PR is "*Ready for review*"  - **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed. - **Is this change backward compatible?**: Yes/No  - **Did you write any new necessary tests?**: Yes/No - **Did you add or update any necessary documentation?**: Yes/No - **Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes/No  ## Additional Information   ## Summary by CodeRabbit ## Release Notes * **New Features** * Introduced --postprocess command-line option to select postprocessing strategy. Users can choose "base" (default, preserves existing behavior) or "gptoss" (new alternative method) with validation to reject invalid selections. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub>  Signed-off-by: Izzy Putterman <iputterman@nvidia.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

IzzyPutterman requested a review from a team as a code owner January 11, 2026 08:26

IzzyPutterman requested a review from ChenhanYu January 11, 2026 08:26

coderabbitai bot reviewed Jan 11, 2026

View reviewed changes

SpecDec Bench: PostProcess flag

4a4723c

Signed-off-by: Izzy Putterman <iputterman@nvidia.com>

IzzyPutterman force-pushed the iputterman/postproc-gptoss branch from 068dcc3 to 4a4723c Compare January 11, 2026 12:28

kevalmorabia97 requested review from h-guo18 and yeyu-nvidia January 12, 2026 19:20

kevalmorabia97 approved these changes Jan 12, 2026

View reviewed changes

yeyu-nvidia approved these changes Jan 12, 2026

View reviewed changes

kevalmorabia97 merged commit 727da95 into main Jan 12, 2026
36 checks passed

kevalmorabia97 deleted the iputterman/postproc-gptoss branch January 12, 2026 20:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

SpecDec Bench: PostProcess flag #759

SpecDec Bench: PostProcess flag #759

Uh oh!

IzzyPutterman commented Jan 11, 2026 •

edited

Loading

Uh oh!

coderabbitai bot commented Jan 11, 2026 •

edited

Loading

Review skipped

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Uh oh!

codecov bot commented Jan 11, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

SpecDec Bench: PostProcess flag #759

SpecDec Bench: PostProcess flag #759

Uh oh!

Conversation

IzzyPutterman commented Jan 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Usage

Testing

Before your PR is "Ready for review"

Additional Information

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai bot commented Jan 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Jan 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

IzzyPutterman commented Jan 11, 2026 •

edited

Loading

coderabbitai bot commented Jan 11, 2026 •

edited

Loading

codecov bot commented Jan 11, 2026 •

edited

Loading