NVIDIA / TensorRT-LLM Public

Pull requests: NVIDIA/TensorRT-LLM

316 Open 2,601 Closed

Pull requests list

[Test] - Waive or fix few known test failures

#5769 opened Jul 6, 2025 by chzblych

feat: Enable Gemma3 with FlashInfer backend

#5768 opened Jul 5, 2025 by brb-nv • Draft

[TRTLLM-5881] feat: Integrate TRT-LLM Gen FP4 block scale MoE with Pytorch workflow kernel autotuner

#5764 opened Jul 4, 2025 by DomBrown • Draft

Fix docker cache mount

#5763 opened Jul 4, 2025 by MartinMarciniszyn

[nvbugs/5369799] fix: Update disaggregation handling in sampler

#5762 opened Jul 4, 2025 by stnie • Draft

[nvbugs/5345391] fix: chunked prefill + overlap scheduling

#5761 opened Jul 4, 2025 by Funatiq • Draft

Fix --image_path param error in multimodal run.py tests (label: Community want to contribute)

#5757 opened Jul 4, 2025 by pandalee99

Fix cancel request bug in attentiondp

#5754 opened Jul 4, 2025 by Shunkangz

[TRTLLM-5530][BREAKING CHANGE] refactor: unify KvCacheConfig in LLM class for pytorch backend

#5752 opened Jul 4, 2025 by Superjomn

[TRTLLM-5530][BREAKING CHANGE] refactor: LLM arglist rename mixed_sampler to enable_mixed_sampler

#5751 opened Jul 4, 2025 by Superjomn

chore: log stack trace on error in openai server

#5749 opened Jul 4, 2025 by zhengd-nv

Update transformers to 4.53.0

#5747 opened Jul 4, 2025 by Wanli-Jiang

[nvbug5266240] chore: unwaive test_llm_with_dummy_weights

#5744 opened Jul 4, 2025 by Superjomn

[feat] Adds optional module cache for TRT-LLM Gen Gemm interfaces

#5743 opened Jul 4, 2025 by davidclark-nv

feat: moe prepare support topk % 4 != 0

#5742 opened Jul 4, 2025 by WeiHaocheng

[feat] Support nvidia/Cosmos-Reason1-7B

#5739 opened Jul 4, 2025 by meatybobby

Custom attention mask in TRTLLM attention backend

#5738 opened Jul 4, 2025 by brb-nv • Draft

chores: merge examples for v1.0 doc

#5736 opened Jul 3, 2025 by hchings • Draft

1 of 3 tasks

[TRTLLM-6070] docs: Add initial documentation for trtllm-bench CLI.

#5734 opened Jul 3, 2025 by FrankD412

[TRTLLM-5847][feat] Support n-gram speculative decoding with disagg

#5732 opened Jul 3, 2025 by raayandhar

chore: some refactor on WideEP

#5727 opened Jul 3, 2025 by dongxuy04

[ci] speedup fused moe tests

#5726 opened Jul 3, 2025 by omera-nv

[mock] Testing the auto-assign PR functionality

#5725 opened Jul 3, 2025 by venkywonka • Draft

[feat] [https://nvbugs/5369010] Add TRTLLM MoE nvfp4 cubins for mid-high concurrency

#5723 opened Jul 3, 2025 by rosenrodt

fix: auto-load sliding window config for VSWA models

#5722 opened Jul 3, 2025 by jaedeok-nvidia
