Pull requests: sgl-project/sglang
[PD] Raise error for incompatible mooncake version and some minor fixes
#7527
opened Jun 25, 2025 by
ShangmingCai
6 tasks
P/D load balancer forwards profiling requests to instances
#7525
opened Jun 25, 2025 by
gronsti-amd
1 of 6 tasks
[CPU] add c++ kernel to bind CPU cores and memory node
#7524
opened Jun 25, 2025 by
chunyuan-w
[CPU] remove process_group from inputs of shm_allreduce and shm_allgather
cpu
cpu backend performance optimization
intel
sgl-kernel
#7486
opened Jun 24, 2025 by
chunyuan-w
[AMD] Remove vllm's scaled_fp8_quant and moe_sum when SGLANG_USE_AITER=1
high priority
#7484
opened Jun 23, 2025 by
hubertlu-tw
3 of 6 tasks
[Feature] dynamic server payload size limit
#7475
opened Jun 23, 2025 by
khan-yin
4 of 6 tasks
fix(bench_serving): handle None tokenizer.bos_token when apply_chat_template==True
#7466
opened Jun 23, 2025 by
renne444
1 of 6 tasks
[BugFix] Destroy nccl Comm to fix cuda memory leak of destroy_model_parallel
#7465
opened Jun 23, 2025 by
wcsjtu
2 of 6 tasks
Support non-contiguous query input for extend/decode attention
cpu
cpu backend performance optimization
intel
sgl-kernel
#7462
opened Jun 23, 2025 by
yanbing-j
6 tasks
[benchmark] print final benchmark args in json format
#7455
opened Jun 23, 2025 by
staugust
1 of 6 tasks
Fix for fp8 quantization failure of qwen 2.5 VL 7B model.
high priority
#7448
opened Jun 22, 2025 by
PanJason
2 of 6 tasks
Support dynamic LoRA loading / unloading in engine/server API
ready-for-review
#7446
opened Jun 22, 2025 by
lifuhuang
2 of 6 tasks
OPTForCasualLM Support (facebook/opt Series)
new-model
#7440
opened Jun 22, 2025 by
b8zhong
Fix: remove duplicate initial assignments in PrefillBootstrapQueue
#7438
opened Jun 22, 2025 by
hzh0425
1 of 6 tasks
Fix: resolve prefill of retracted request out-of-memory issue when ignore_eos is enabled
high priority
#7434
opened Jun 22, 2025 by
GaoYusong
1 of 6 tasks
Fix stream reasoning parser and Adds Kimi reasoning parser
#7432
opened Jun 22, 2025 by
JustinTong0323
2 of 6 tasks
[RL] add pause and continue generation for async rl training
#7419
opened Jun 21, 2025 by
zhuzilin
1 of 6 tasks
[RL] Add --nccl-port and --other-ports to prevent port conflict
#7418
opened Jun 21, 2025 by
zhuzilin
1 of 6 tasks