[None][refactor] Request management in ScheduledRequests #11784

Merged
Funatiq merged 4 commits into NVIDIA:main from Funatiq:dev/refactor/scheduled_batch on Mar 7, 2026

Conversation

@Funatiq
Collaborator

@Funatiq Funatiq commented Feb 27, 2026

Summary by CodeRabbit

Release Notes

Refactoring & Improvements

  • Refactored request scheduling and batching logic to improve handling of mixed context and generation workloads in the executor pipeline.
  • Enhanced resource management and KV cache preparation for more efficient batch processing.
  • Updated request tracking mechanisms across the model engine, scheduler, and sampler for better performance consistency.

Description

The main changes are:

  • Introduce separate lists in ScheduledRequests for context_requests_chunking and context_requests_last_chunk.
  • Store ScheduledRequests separately in BatchState and SampleState (this prepares for selecting which requests should run sampling).
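
For readers unfamiliar with the executor, the layout described above can be pictured roughly as follows. This is a minimal sketch, not the code from this PR: the field names `context_requests_chunking` and `context_requests_last_chunk` come from the description, while `generation_requests`, the `scheduled_requests` attribute, and the use of plain strings as requests are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScheduledRequests:
    # Context requests that still have further prompt chunks to process.
    context_requests_chunking: List[str] = field(default_factory=list)
    # Context requests that are processing their final prompt chunk.
    context_requests_last_chunk: List[str] = field(default_factory=list)
    # Assumed name for the list of requests in the generation phase.
    generation_requests: List[str] = field(default_factory=list)


@dataclass
class BatchState:
    # Hypothetical attribute name: the batch keeps its own reference
    # to the requests that were scheduled for it.
    scheduled_requests: ScheduledRequests


@dataclass
class SampleState:
    # Hypothetical attribute name: the sampler state keeps a separate
    # reference, so it could later hold a different selection.
    scheduled_requests: ScheduledRequests


scheduled = ScheduledRequests(
    context_requests_chunking=["req-a"],
    context_requests_last_chunk=["req-b"],
    generation_requests=["req-c", "req-d"],
)
batch_state = BatchState(scheduled_requests=scheduled)
sample_state = SampleState(scheduled_requests=scheduled)
```

Keeping a reference in each state object means a later change could give SampleState a narrower set of requests without touching BatchState.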

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR follows the TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@Funatiq
Collaborator Author

Funatiq commented Feb 27, 2026

/bot run

@tensorrt-cicd
Collaborator

PR_Github #37069 [ run ] triggered by Bot. Commit: 0c95753 Link to invocation

@Funatiq Funatiq force-pushed the dev/refactor/scheduled_batch branch from 0c95753 to c9aecea on February 27, 2026 12:38
@Funatiq
Collaborator Author

Funatiq commented Feb 27, 2026

/bot run

@tensorrt-cicd
Collaborator

PR_Github #37072 [ run ] triggered by Bot. Commit: c9aecea Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #37072 [ run ] completed with state SUCCESS. Commit: c9aecea
/LLM/main/L0_MergeRequest_PR pipeline #28704 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq Funatiq force-pushed the dev/refactor/scheduled_batch branch from c9aecea to a10d4bb on February 27, 2026 15:24
@Funatiq
Collaborator Author

Funatiq commented Feb 27, 2026

/bot run

@tensorrt-cicd
Collaborator

PR_Github #37087 [ run ] triggered by Bot. Commit: a10d4bb Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #37087 [ run ] completed with state SUCCESS. Commit: a10d4bb
/LLM/main/L0_MergeRequest_PR pipeline #28715 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq Funatiq force-pushed the dev/refactor/scheduled_batch branch from a10d4bb to 82372d1 on February 28, 2026 11:28
@Funatiq
Collaborator Author

Funatiq commented Feb 28, 2026

/bot run

@tensorrt-cicd
Collaborator

PR_Github #37167 [ run ] triggered by Bot. Commit: 82372d1 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #37167 [ run ] completed with state SUCCESS. Commit: 82372d1
/LLM/main/L0_MergeRequest_PR pipeline #28777 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq Funatiq changed the title from "[refactor] Request management in ScheduledRequests" to "[None][refactor] Request management in ScheduledRequests" on Feb 28, 2026
@Funatiq Funatiq force-pushed the dev/refactor/scheduled_batch branch from 82372d1 to ba0482f on February 28, 2026 15:40
@Funatiq
Collaborator Author

Funatiq commented Feb 28, 2026

/bot run

@tensorrt-cicd
Collaborator

PR_Github #37176 [ run ] triggered by Bot. Commit: ba0482f Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #37176 [ run ] completed with state SUCCESS. Commit: ba0482f
/LLM/main/L0_MergeRequest_PR pipeline #28784 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Collaborator Author

Funatiq commented Feb 28, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #37181 [ run ] triggered by Bot. Commit: ba0482f Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #37181 [ run ] completed with state DISABLED
CI server is currently disabled for scheduled maintenance. Estimated completion time: 8 PM PST on 2/28.

Link to invocation

@chzblych
Collaborator

chzblych commented Mar 1, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #37210 [ run ] triggered by Bot. Commit: ba0482f Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #37210 [ run ] completed with state SUCCESS. Commit: ba0482f
/LLM/main/L0_MergeRequest_PR pipeline #28799 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Collaborator Author

Funatiq commented Mar 1, 2026

/bot run

@tensorrt-cicd
Collaborator

PR_Github #37219 [ run ] triggered by Bot. Commit: ba0482f Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #37219 [ run ] completed with state SUCCESS. Commit: ba0482f
/LLM/main/L0_MergeRequest_PR pipeline #28807 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Collaborator Author

Funatiq commented Mar 1, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #37234 [ run ] triggered by Bot. Commit: ba0482f Link to invocation

Collaborator

@eopXD eopXD left a comment

Looks good to me. Thank you.

@tensorrt-cicd
Collaborator

PR_Github #37819 [ run ] completed with state SUCCESS. Commit: d3130b3
/LLM/main/L0_MergeRequest_PR pipeline #29281 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq Funatiq force-pushed the dev/refactor/scheduled_batch branch from 28f118c to 44f1fef on March 5, 2026 07:56
@Funatiq
Collaborator Author

Funatiq commented Mar 5, 2026

/bot skip --comment "All tests passed in last pipeline. Only release check was failing and was fixed in main."

@tensorrt-cicd
Collaborator

PR_Github #37831 [ skip ] triggered by Bot. Commit: 44f1fef Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #37831 [ skip ] completed with state SUCCESS. Commit: 44f1fef
Skipping testing for commit 44f1fef

Link to invocation

@Funatiq Funatiq force-pushed the dev/refactor/scheduled_batch branch from 7c9f79e to 1bbe808 on March 6, 2026 08:38
@Funatiq
Collaborator Author

Funatiq commented Mar 6, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #37996 [ run ] triggered by Bot. Commit: 1bbe808 Link to invocation

@Funatiq Funatiq requested a review from venkywonka March 6, 2026 15:08
Funatiq added 4 commits March 6, 2026 15:08
- Separate context requests into chunking and last chunk lists.
- Add context_requests property to combine chunking and last chunk lists.
- Add num_context_requests and num_generation_requests properties.
- Add scheduled requests to BatchState.

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
- Introduce append_context_request and append_generation_request functions for ScheduledRequests.
- Append context requests to the appropriate lists in the ScheduledRequests object.

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
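
The commit messages above mention append helpers, a combined context_requests property, and two count properties. A minimal sketch of how these could fit together follows. It is not the code merged in this PR: the `Request` class, its `is_last_context_chunk` flag, the choice to model the helpers as methods, and the ordering inside `context_requests` (chunking first) are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Request:
    # Hypothetical stand-in for the executor's request type.
    request_id: int
    # Hypothetical flag: True when this step handles the final prompt chunk.
    is_last_context_chunk: bool = True


@dataclass
class ScheduledRequests:
    context_requests_chunking: List[Request] = field(default_factory=list)
    context_requests_last_chunk: List[Request] = field(default_factory=list)
    generation_requests: List[Request] = field(default_factory=list)

    def append_context_request(self, request: Request) -> None:
        # Route the request to the list matching its chunking state.
        if request.is_last_context_chunk:
            self.context_requests_last_chunk.append(request)
        else:
            self.context_requests_chunking.append(request)

    def append_generation_request(self, request: Request) -> None:
        self.generation_requests.append(request)

    @property
    def context_requests(self) -> List[Request]:
        # Combined view; builds a new list on every access.
        return self.context_requests_chunking + self.context_requests_last_chunk

    @property
    def num_context_requests(self) -> int:
        return (len(self.context_requests_chunking)
                + len(self.context_requests_last_chunk))

    @property
    def num_generation_requests(self) -> int:
        return len(self.generation_requests)


scheduled = ScheduledRequests()
scheduled.append_context_request(Request(1, is_last_context_chunk=False))
scheduled.append_context_request(Request(2, is_last_context_chunk=True))
scheduled.append_generation_request(Request(3))
print(scheduled.num_context_requests, scheduled.num_generation_requests)
```

In this sketch the count properties read the underlying lists directly, so callers that only need sizes avoid building the combined list.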
@Funatiq Funatiq force-pushed the dev/refactor/scheduled_batch branch from ff52270 to d668f0b on March 6, 2026 15:09
@Funatiq
Collaborator Author

Funatiq commented Mar 6, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #37996 [ run ] completed with state SUCCESS. Commit: 1bbe808
/LLM/main/L0_MergeRequest_PR pipeline #29428 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #38045 [ run ] triggered by Bot. Commit: d668f0b Link to invocation

@Funatiq Funatiq enabled auto-merge (squash) March 6, 2026 17:55
@tensorrt-cicd
Collaborator

PR_Github #38045 [ run ] completed with state SUCCESS. Commit: d668f0b
/LLM/main/L0_MergeRequest_PR pipeline #29473 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Collaborator Author

Funatiq commented Mar 6, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #38065 [ run ] triggered by Bot. Commit: d668f0b Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #38065 [ run ] completed with state SUCCESS. Commit: d668f0b
/LLM/main/L0_MergeRequest_PR pipeline #29492 completed with status: 'SUCCESS'

Link to invocation

@Funatiq Funatiq merged commit 2087b24 into NVIDIA:main Mar 7, 2026
5 checks passed
@Funatiq Funatiq deleted the dev/refactor/scheduled_batch branch March 7, 2026 07:00
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Mar 9, 2026
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

10 participants