SpecDec Bench: PostProcess flag #759
Important: Review skipped. Auto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
examples/specdec_bench/run.py (1)
36-46: Update the docstring to document the `postprocess` parameter.
The function signature now includes a `postprocess` parameter, but the docstring doesn't document it. Please add documentation for this parameter to maintain clarity.
📝 Proposed docstring update

      """
      Async version of run_loop with concurrency control using a semaphore.

      Args:
          runner: The model runner instance
          dataset: The dataset containing requests
          tokenizer: The tokenizer instance
          output_length: Maximum output length
    +     postprocess: Postprocessing function to apply to model outputs
          concurrency: Maximum number of concurrent requests (default: 10)
      """
🔇 Additional comments (3)
examples/specdec_bench/run.py (3)
118-123: LGTM: Clean postprocessing function selection.
The selection logic correctly maps the CLI argument to the appropriate postprocessing function. The ValueError provides a good safeguard if `run_simple` is called programmatically.
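The selection pattern described in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the code in the PR: `postprocess_gptoss` is named in the review, but its body here is a placeholder, and `postprocess_base`, `_POSTPROCESSORS`, and `select_postprocess` are hypothetical names.

```python
def postprocess_base(text):
    """Hypothetical default: return the model output unchanged."""
    return text


def postprocess_gptoss(text):
    """Placeholder body; the real function lives in specdec_bench.utils."""
    return text.strip()


# Hypothetical lookup table from CLI value to postprocessing callable.
_POSTPROCESSORS = {"base": postprocess_base, "gptoss": postprocess_gptoss}


def select_postprocess(name):
    """Map a postprocess name to a callable, rejecting unknown names."""
    try:
        return _POSTPROCESSORS[name]
    except KeyError:
        raise ValueError(
            f"Unknown postprocess {name!r}; "
            f"expected one of {sorted(_POSTPROCESSORS)}"
        ) from None
```

The explicit `ValueError` matters for programmatic callers, who bypass whatever validation the command-line parser performs.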
197-204: LGTM: Well-structured CLI argument.
The `--postprocess` argument is properly configured with appropriate defaults, choices validation, and maintains backward compatibility by defaulting to "base".
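As a rough illustration of such an argument, the sketch below uses `argparse` with the flag name, default, and choices stated in this PR. The parser description, help text, and `build_parser` helper are assumptions made for the example and may differ from `run.py`.

```python
import argparse


def build_parser():
    """Hypothetical parser containing only the flag discussed here."""
    parser = argparse.ArgumentParser(description="specdec_bench sketch")
    parser.add_argument(
        "--postprocess",
        type=str,
        default="base",              # preserves the pre-existing behavior
        choices=["base", "gptoss"],  # argparse rejects any other value
        help="Postprocessing strategy applied to model outputs",
    )
    return parser


args = build_parser().parse_args(["--postprocess", "gptoss"])
default_args = build_parser().parse_args([])
```

With `choices` set, an invalid value makes `argparse` print a usage error and exit, so invalid selections never reach the benchmark code.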
21-27: Import verified.
The `postprocess_gptoss` function is properly defined in `specdec_bench.utils` and will import without errors.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files

    @@           Coverage Diff           @@
    ##             main     #759   +/-   ##
    =======================================
      Coverage   74.62%   74.62%
    =======================================
      Files         192      192
      Lines       18989    18989
    =======================================
      Hits        14171    14171
      Misses       4818     4818

☔ View full report in Codecov by Sentry.
Signed-off-by: Izzy Putterman <iputterman@nvidia.com>
Force-pushed from 068dcc3 to 4a4723c
## What does this PR do?

**Type of change:** Bug Fix: https://nvbugspro.nvidia.com/bug/5795144

**Overview:** Pass postprocess flag to handle slicing message.

## Usage

```python
# Add a code snippet demonstrating how to use this
```

## Testing

## Before your PR is "*Ready for review*"

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes/No
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes/No

## Additional Information

## Summary by CodeRabbit

## Release Notes

* **New Features**
  * Introduced --postprocess command-line option to select postprocessing strategy. Users can choose "base" (default, preserves existing behavior) or "gptoss" (new alternative method) with validation to reject invalid selections.

Signed-off-by: Izzy Putterman <iputterman@nvidia.com>
Signed-off-by: Izzy Putterman <iputterman@nvidia.com>
Signed-off-by: Jingyu Xin <jingyux@nvidia.com>
What does this PR do?
Type of change: Bug Fix: https://nvbugspro.nvidia.com/bug/5795144
Overview: Pass postprocess flag to handle slicing message.