
[https://nvbugs/5808603][fix] Add bias support to WeightOnlyQuantLinearMethod#12317

Merged
stnie merged 1 commit into NVIDIA:main from stnie:bugfix/5808603
Mar 19, 2026

Conversation

@stnie (Collaborator) commented Mar 18, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Fixed bias handling in linear layer operations to ensure bias is correctly applied to outputs across all execution paths.
  • Tests

    • Enhanced test coverage for quantized linear operations with parameterized bias configuration validation.

Description

WeightOnlyQuantLinearMethod did not apply bias, which is used, for example, by Qwen2.5 models.

This PR:

  • Modified WeightOnlyQuantLinearMethod to include bias in the output calculation.
  • Updated test_weight_only_quant_linear to parameterize bias and validate its effect on output.

Test Coverage

  • Updated test_weight_only_quant_linear to parameterize bias and validate its effect on output.

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

…arMethod and update corresponding tests

- Modified WeightOnlyQuantLinearMethod to include bias in the output calculation.
- Updated test_weight_only_quant_linear to parameterize bias and validate its effect on output.

Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
@stnie (Collaborator, Author) commented Mar 18, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #39446 [ run ] triggered by Bot. Commit: 5763719 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #39446 [ run ] completed with state SUCCESS. Commit: 5763719
/LLM/main/L0_MergeRequest_PR pipeline #30673 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

CI Report

Link to invocation

@stnie stnie marked this pull request as ready for review March 18, 2026 14:40
@stnie stnie requested a review from a team as a code owner March 18, 2026 14:40
@stnie stnie requested a review from yuxianq March 18, 2026 14:40
@coderabbitai
Contributor

coderabbitai bot commented Mar 18, 2026

📝 Walkthrough

Adds bias support to the Linear module's apply method by conditionally adding a bias tensor to the output. Extends the weight-only quantized linear test with parameterized bias cases, including bias tensor creation, loading, and reference output adjustment.

Changes

  • Linear Module Bias Support — tensorrt_llm/_torch/modules/linear.py
    Adds three lines to conditionally apply bias in the Linear.apply path: checks whether a bias tensor is provided and adds it to the computed output.
  • Quantized Linear Test Extension — tests/unittest/_torch/thop/parallel/test_weight_only_quant_linear.py
    Introduces bias as a pytest parameterized test parameter (True/False). Creates a bias tensor when enabled, passes it to the Linear constructor and the load_weights payload, and conditionally adjusts the reference output computation to include the bias term.
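The parameterized test described above might look roughly like the following. This is a simplified sketch that checks only the reference math, not the actual quantized kernel; the test name, shapes, and use of `torch.nn.functional.linear` are illustrative assumptions:

```python
import pytest
import torch

@pytest.mark.parametrize("bias", [True, False])
def test_weight_only_quant_linear_bias(bias: bool) -> None:
    # Simplified sketch of the updated test: build an optional bias
    # tensor, run the linear op, and adjust the reference output so it
    # includes the bias term only when the parameter is enabled.
    x = torch.randn(2, 8)
    w = torch.randn(16, 8)
    b = torch.randn(16) if bias else None
    out = torch.nn.functional.linear(x, w, b)
    ref = x @ w.t()
    if bias:
        ref = ref + b
    torch.testing.assert_close(out, ref)
```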

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning — Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check ✅ Passed — The title clearly and specifically describes the main change: adding bias support to WeightOnlyQuantLinearMethod, which aligns with the code changes in the PR.
  • Description check ✅ Passed — The PR description includes a clear problem statement, explains the changes made, and lists test coverage, following the required template structure.



Comment @coderabbitai help to get the list of available commands and usage tips.


@stnie stnie merged commit eb5b602 into NVIDIA:main Mar 19, 2026
9 of 10 checks passed