Pull requests: NVIDIA/TensorRT-LLM
Refactor the topk parallelization part for the routing kernels (#5567, opened Jun 28, 2025 by ChristinaZ)
test: Deprecate gpt_model_type "v1" static batching from triton_backe… (#5562, opened Jun 27, 2025 by mc-nv)
Implement --served_model_name and improve command line parsing (#5561, opened Jun 27, 2025 by pathorn)
[TRTLLM-4926][feat] Reimplement metrics endpoint with stats about requests (#5560, opened Jun 27, 2025 by pathorn)
[fix] Use decorator for request cancelation and handle CancelledError (#5559, opened Jun 27, 2025 by pathorn)
[nvbug/5337601][fix] Fix disagg + speculative decoding (#5558, opened Jun 27, 2025 by Tabrizian)
Refactor moe permute and finalize op by removing duplicated code (#5557, opened Jun 27, 2025 by limin2021)
[TRTLLM-6104] feat: add request_perf_metrics to triton LLMAPI backend (#5554, opened Jun 27, 2025 by xuanzic)
[feat] Support MXFP4 x BF16 Grouped GEMM in FusedMoE Pytorch Module (#5552, opened Jun 27, 2025 by jinyangyuan-nvidia)
feat: Optimize TRTLLM Sampler perf single beam single step (#5550, opened Jun 27, 2025 by dcampora)
rcca: test default kv_cache_reuse option for pytorch multimodal (#5544, opened Jun 27, 2025 by StanleySun639)
[nvbug 5304752][fix]: enhance _check_arguments to filter illegal requests for pytorch backend (#5541, opened Jun 27, 2025 by LinPoly)