Pull requests: sgl-project/sglang
[Feature] Support Tensor Parallelism and Weight Slicing for Lora
#4239
opened Mar 9, 2025 by
aoshen524
2 of 3 tasks
fix per_token_group_quant_fp8 illegal memory when num_groups % 16 != 0
#4231
opened Mar 9, 2025 by
BBuf
refactor: move image processors to individual files
#4229
opened Mar 9, 2025 by
mickqian
1 of 6 tasks
[Feature] Prefill assistant response - add continue_final_message parameter
#4226
opened Mar 9, 2025 by
adarshxs
3 tasks done
Add H20 tuning configs support DeepSeek V3/R1 INT8(block-wise)
#4220
opened Mar 9, 2025 by
Ximingwang-09
4 of 6 tasks
Remove vllm ops scaled fp8 quant and accelerate per token quant by 25-30%
#4215
opened Mar 8, 2025 by
hebiao064
3 of 6 tasks
[Fix] Check the device backend before calling empty_cache function
#4212
opened Mar 8, 2025 by
cboss6
1 of 6 tasks
Statistical Analysis of the Output Stability of the Deepseek Model
#4202
opened Mar 8, 2025 by
tanzelin430
Draft
2 of 6 tasks
[ROCm/Draft/No-Merge]: Flex Attention Enablement
amd
collaboration
documentation
DeepGemm integrate to sgl-kernel
high priority
#4165
opened Mar 7, 2025 by
laixinn
6 tasks done
[ROCm] Enable silu_and_mul, gelu_and_mul, gelu_tanh_and_mul in amd platform
#4150
opened Mar 6, 2025 by
yiakwy-xpu-ml-framework-team
6 tasks
Add A800 tuning configs support DeepSeek V3/R1 BF16 and INT8(block-wise)
#4136
opened Mar 6, 2025 by
lambert0312
1 of 6 tasks
Add awq dequantize kernel to sgl with 1x to 3x speedup
#4104
opened Mar 5, 2025 by
zcnrex
6 tasks