Skip to content

[TRTLLM-11385][chore] Mark TRTLLMSampler as deprecated and update documentation#11938

Open
Funatiq wants to merge 2 commits intoNVIDIA:mainfrom
Funatiq:dev/deprecate_trtllm_sampler
Open

[TRTLLM-11385][chore] Mark TRTLLMSampler as deprecated and update documentation#11938
Funatiq wants to merge 2 commits intoNVIDIA:mainfrom
Funatiq:dev/deprecate_trtllm_sampler

Conversation

@Funatiq
Copy link
Copy Markdown
Collaborator

@Funatiq Funatiq commented Mar 5, 2026

Summary by CodeRabbit

  • Documentation

    • Updated sampling documentation, removing backend-specific content and guidance on sampler selection.
  • Deprecations

    • TRTLLMSampler is deprecated and will be removed in release 1.4. TorchSampler is now the default and exclusive sampling backend for all configurations.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Copy link
Copy Markdown
Collaborator

@ixlmar ixlmar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 6, 2026

/bot run

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #38014 [ run ] triggered by Bot. Commit: b04bf0d Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #38014 [ run ] completed with state SUCCESS. Commit: b04bf0d
/LLM/main/L0_MergeRequest_PR pipeline #29445 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq Funatiq force-pushed the dev/deprecate_trtllm_sampler branch from b04bf0d to 5a2cd58 Compare March 6, 2026 13:45
@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 6, 2026

/bot run --stage-list "H100_PCIe-PyTorch-1, H100_PCIe-PyTorch-2, DGX_B200-4_GPUs-PyTorch-Ray-1"

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #38029 [ run ] triggered by Bot. Commit: 5a2cd58 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #38029 [ run ] completed with state FAILURE. Commit: 5a2cd58
/LLM/main/L0_MergeRequest_PR pipeline #29456 (Partly Tested) completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 6, 2026

/bot run --stage-list "DGX_H100-PyTorch-1, DGX_H100-PyTorch-3, DGX_B200-4_GPUs-PyTorch-Ray-1"

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #38051 [ run ] triggered by Bot. Commit: 5a2cd58 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #38051 [ run ] completed with state SUCCESS. Commit: 5a2cd58
/LLM/main/L0_MergeRequest_PR pipeline #29478 (Partly Tested) completed with status: 'SUCCESS'

Link to invocation

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 6, 2026

/bot run

@Funatiq Funatiq marked this pull request as ready for review March 7, 2026 07:38
@Funatiq Funatiq requested review from a team as code owners March 7, 2026 07:38
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 7, 2026

📝 Walkthrough

Walkthrough

These changes deprecate TRTLLMSampler functionality by removing automatic beam-search activation while retaining explicit invocation with deprecation warnings. Documentation is generalized to reduce sampler-specific references, and test code is updated to remove reliance on the deprecated sampler type.

Changes

Cohort / File(s) Summary
Documentation
docs/source/features/sampling.md
Removed backend-specific sampler references (Torch Sampler and TRTLLM Sampler) and examples. Generalized section title from "LLM API sampling behavior when using Torch Sampler" to "LLM API sampling behavior" and standardized "sampler" capitalization.
Core Library Changes
tensorrt_llm/llmapi/llm_args.py, tensorrt_llm/_torch/pyexecutor/_util.py
Updated sampler_type field status from "beta" to "deprecated" with revised description. Removed conditional path that activated TRTLLMSampler for beam search; now only uses TRTLLMSampler when explicitly set, with deprecation warning logged.
Test Updates
tests/unittest/_torch/modeling/test_modeling_nemotron_h.py, tests/unittest/_torch/ray_orchestrator/multi_gpu/test_accuracy_with_allreduce_strategy.py, tests/unittest/api_stability/references/llm.yaml
Removed explicit sampler_type argument from LLM construction and test parametrization. Updated test function signature to remove sampler_type parameter. Updated API stability reference metadata to reflect deprecated status.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ⚠️ Warning The PR description is empty except for the repository template structure; no actual explanation, test coverage details, or checklist completion are provided by the author. Add a clear description explaining why TRTLLMSampler is being deprecated, what the migration path is, and list relevant tests that validate the changes (e.g., tests that verify deprecation warnings and sampler behavior).
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title clearly and specifically describes the main change: marking TRTLLMSampler as deprecated and updating documentation, with proper JIRA ticket reference and chore type indicator.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/source/features/sampling.md`:
- Around line 84-96: The paragraph about FlashInfer should be restricted to the
default/TorchSampler path: update the text so it explicitly states that
FlashInfer optimizations and sorting-free implementations apply to TorchSampler
(or the sampler returned by _util.instantiate_sampler() when it yields a
TorchSampler) and not to other backends like TRTLLMSampler; mention that
TRTLLMSampler remains available until 1.4 and may not use FlashInfer, and keep
the note about disable_flashinfer_sampling scoped to TorchSampler behavior.
Locate references to _util.instantiate_sampler, TorchSampler, and TRTLLMSampler
and adjust wording so the FlashInfer performance notes only apply to
TorchSampler/default sampler.

In `@tensorrt_llm/llmapi/llm_args.py`:
- Around line 3014-3019: The Field declaration for sampler_type currently marks
the entire parameter deprecated via status="deprecated", which incorrectly flags
values like auto and TorchSampler as deprecated; remove the status="deprecated"
argument from the sampler_type Field so the parameter itself is not deprecated,
keep/update the description to explicitly state only SamplerType.TRTLLMSampler
(TRTLLMSampler) is deprecated and ensure any runtime warning continues to
trigger only when sampler_type == SamplerType.TRTLLMSampler; this involves
editing the sampler_type Field call and the description text around
SamplerType/TRTLLMSampler and leaving SamplerType.auto/TorchSampler behavior
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8aeee868-37bc-4f11-a957-1208be1f3ebf

📥 Commits

Reviewing files that changed from the base of the PR and between 5b0c956 and 5a2cd58.

📒 Files selected for processing (6)
  • docs/source/features/sampling.md
  • tensorrt_llm/_torch/pyexecutor/_util.py
  • tensorrt_llm/llmapi/llm_args.py
  • tests/unittest/_torch/modeling/test_modeling_nemotron_h.py
  • tests/unittest/_torch/ray_orchestrator/multi_gpu/test_accuracy_with_allreduce_strategy.py
  • tests/unittest/api_stability/references/llm.yaml
💤 Files with no reviewable changes (1)
  • tests/unittest/_torch/modeling/test_modeling_nemotron_h.py

@Funatiq Funatiq force-pushed the dev/deprecate_trtllm_sampler branch from 5a2cd58 to 06c1c8f Compare March 9, 2026 14:27
@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #39951 [ run ] completed with state SUCCESS. Commit: b720871
/LLM/main/L0_MergeRequest_PR pipeline #31116 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 23, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #39974 [ run ] triggered by Bot. Commit: b720871 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #39974 [ run ] completed with state SUCCESS. Commit: b720871
/LLM/main/L0_MergeRequest_PR pipeline #31136 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 24, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40086 [ run ] triggered by Bot. Commit: b720871 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40086 [ run ] completed with state SUCCESS. Commit: b720871
/LLM/main/L0_MergeRequest_PR pipeline #31238 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 24, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40148 [ run ] triggered by Bot. Commit: b720871 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40148 [ run ] completed with state FAILURE. Commit: b720871
/LLM/main/L0_MergeRequest_PR pipeline #31294 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 24, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40161 [ run ] triggered by Bot. Commit: b720871 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40161 [ run ] completed with state SUCCESS. Commit: b720871
/LLM/main/L0_MergeRequest_PR pipeline #31305 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq Funatiq marked this pull request as ready for review March 25, 2026 09:07
@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 25, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40308 [ run ] triggered by Bot. Commit: b720871 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40308 [ run ] completed with state SUCCESS. Commit: b720871
/LLM/main/L0_MergeRequest_PR pipeline #31419 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Funatiq added 2 commits March 27, 2026 16:46
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
@Funatiq Funatiq force-pushed the dev/deprecate_trtllm_sampler branch from b720871 to 391d683 Compare March 27, 2026 15:46
@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 27, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40509 [ run ] triggered by Bot. Commit: 391d683 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40509 [ run ] completed with state SUCCESS. Commit: 391d683
/LLM/main/L0_MergeRequest_PR pipeline #31597 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 30, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40685 [ run ] triggered by Bot. Commit: 391d683 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40685 [ run ] completed with state SUCCESS. Commit: 391d683
/LLM/main/L0_MergeRequest_PR pipeline #31714 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 30, 2026

/bot run --disable-fail-fast

1 similar comment
@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 30, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40774 [ run ] triggered by Bot. Commit: 391d683 Link to invocation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants