Pull requests: InternLM/lmdeploy
Pull requests list
Add VRAM bandwidth utilization stat to attention test
#3731
opened Jul 11, 2025 by
lzhangzz
Fix the logic of calculating max_new_tokens and determining finish_reason
improvement
#3727
opened Jul 10, 2025 by
lvhan028
feat(build): Integrate and build turbomind backend directly in setup.py
#3726
opened Jul 10, 2025 by
windreamer
1 of 4 tasks
Override HF config.json via CLI
improvement
#3722
opened Jul 9, 2025 by
CUHKSZzxy
1 of 2 tasks
[ascend] support lora
enhancement
New feature or request
#3715
opened Jul 7, 2025 by
tangzhiyi11
Draft
Relax FP8 TP requirement
enhancement
#3697
opened Jul 1, 2025 by
lzhangzz
Add Gloo communication to turbomind
enhancement
#3362
opened Mar 28, 2025 by
irexyc
Improve turbomind's prefix cache
BC-breaking
improvement
#3332
opened Mar 25, 2025 by
lvhan028
6 of 8 tasks
add deepseekv3 doc
documentation
Improvements or additions to documentation
WIP
#3265
opened Mar 17, 2025 by
CUHKSZzxy
support setting devices for turbomind backend
improvement
#3203
opened Mar 3, 2025 by
irexyc
fix: replace inf with max or min finite value, then do softmax
#3059
opened Jan 21, 2025 by
KenForever1