Pull requests: NVIDIA/TensorRT-LLM
Fix --image_path param error in multimodal run.py tests
Community want to contribute
PRs initiated from Community
#5757 opened Jul 4, 2025 by pandalee99
[TRTLLM-5530][BREAKING CHANGE] refactor: unify KvCacheConfig in LLM class for pytorch backend
#5752 opened Jul 4, 2025 by Superjomn
[TRTLLM-5530][BREAKING CHANGE] refactor: LLM arglist rename mixed_sampler to enable_mixed_sampler
#5751 opened Jul 4, 2025 by Superjomn
[nvbug5266240] chore: unwaive test_llm_with_dummy_weights
#5744 opened Jul 4, 2025 by Superjomn
[feat] Adds optional module cache for TRT-LLM Gen Gemm interfaces
#5743 opened Jul 4, 2025 by davidclark-nv
[TRTLLM-6070] docs: Add initial documentation for trtllm-bench CLI.
#5734 opened Jul 3, 2025 by FrankD412
[TRTLLM-5847][feat] Support n-gram speculative decoding with disagg
#5732 opened Jul 3, 2025 by raayandhar
[feat] [https://nvbugs/5369010] Add TRTLLM MoE nvfp4 cubins for mid-high concurrency
#5723 opened Jul 3, 2025 by rosenrodt
fix: auto-load sliding window config for VSWA models
#5722 opened Jul 3, 2025 by jaedeok-nvidia