
Issues: NVIDIA/TensorRT-LLM

Issues list

How to build TensorRT-LLM engine on host and deploy to Jetson Orin Nano Super? (labels: question, triaged)
#3149 opened Mar 29, 2025 by Sesameisgod

When will Gemma 3 be supported? (labels: feature request, triaged)
#3143 opened Mar 29, 2025 by bebilli

Executor API: How to get throughput (labels: Investigating, Performance, triaged)
#3142 opened Mar 28, 2025 by khayamgondal

Lookahead decoding and multimodal input support (labels: question, triaged)
#3137 opened Mar 28, 2025 by maxilevi

Force KV Cache Offload (labels: question, triaged)
#3130 opened Mar 27, 2025 by khayamgondal

Model built with ReDrafter produces substantially lower quality outputs (labels: bug)
#3125 opened Mar 27, 2025 by geaned (2 of 4 tasks)

CUDA Device Binding Runtime Error When Running GPT-3 in Multi-Node Mode Using Slurm (labels: bug, triaged)
#3123 opened Mar 27, 2025 by glara76 (4 tasks)

How to implement attention when query and value have different hidden dims? (labels: question, triaged; a generic sketch follows this list)
#3121 opened Mar 27, 2025 by ChaseMonsterAway

Unable to run Deepseek R1 on Blackwell (labels: bug, triaged)
#3118 opened Mar 27, 2025 by pankajroark (1 of 4 tasks)

.devcontainer points to internal Docker image (labels: feature request, triaged)
#3111 opened Mar 26, 2025 by aspctu

How to reproduce 150 TPS using FP8 + MTP=0 + BSZ=1 on H200? (labels: triaged)
#3108 opened Mar 26, 2025 by ghostplant

How to achieve 253 tok/sec with DeepSeek-R1-FP4 on 8xB200 (labels: triaged)
#3058 opened Mar 25, 2025 by jeffye-dev

[Question] Why delete q_b_scale, kv_b_scale, k_b_trans_scale (labels: not a bug)
#2970 opened Mar 21, 2025 by nanmi

Request for Reproduction Configuration of DeepSeek-R1 on H200 & B200 (labels: triaged)
#2964 opened Mar 20, 2025 by xwuShirley

Does trtllm-serve enable prefix caching automatically with Deepseek-R1? (labels: triaged)
#2932 opened Mar 17, 2025 by Bihan

What's the throughput of R1 671B using bs=1 without quant? (labels: not a bug)
#2928 opened Mar 17, 2025 by ghostplant
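
The question in #3121 (attention where the query and the key/value streams have different hidden sizes) is not answered on this page. As a generic illustration only, and not TensorRT-LLM code, here is a minimal PyTorch sketch in which each stream gets its own linear projection into a shared inner dimension before standard scaled-dot-product attention; all class and parameter names below are hypothetical.

# Generic cross-attention sketch (hypothetical names, plain PyTorch):
# the query input and the key/value input have different hidden sizes,
# so each is projected separately into a shared inner dimension.
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    def __init__(self, q_dim: int, kv_dim: int, inner_dim: int, num_heads: int):
        super().__init__()
        assert inner_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = inner_dim // num_heads
        # Separate projections reconcile the mismatched hidden sizes.
        self.q_proj = nn.Linear(q_dim, inner_dim)
        self.k_proj = nn.Linear(kv_dim, inner_dim)
        self.v_proj = nn.Linear(kv_dim, inner_dim)
        self.out_proj = nn.Linear(inner_dim, q_dim)

    def forward(self, q_input: torch.Tensor, kv_input: torch.Tensor) -> torch.Tensor:
        b, tq, _ = q_input.shape
        tk = kv_input.shape[1]
        # Project and split into heads: (batch, heads, time, head_dim).
        q = self.q_proj(q_input).view(b, tq, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(kv_input).view(b, tk, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(kv_input).view(b, tk, self.num_heads, self.head_dim).transpose(1, 2)
        # Scaled dot-product attention; the output keeps the query stream's length.
        attn = torch.nn.functional.scaled_dot_product_attention(q, k, v)
        attn = attn.transpose(1, 2).reshape(b, tq, -1)
        return self.out_proj(attn)

# Example: 1024-dim queries attending over 512-dim key/value features.
layer = CrossAttention(q_dim=1024, kv_dim=512, inner_dim=1024, num_heads=16)
out = layer(torch.randn(2, 8, 1024), torch.randn(2, 32, 512))
print(out.shape)  # torch.Size([2, 8, 1024])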