Pull requests: sgl-project/sglang
Pull requests list
fix(hicache-kernel): Remove redundant arguments (io_backend and page_size) in memory_pool.py
#8565
opened Jul 30, 2025 by
hzh0425
6 tasks
[Fix] Fix index OOB in get_group_gemm_starts kernel.
#8564
opened Jul 30, 2025 by
HydraQYH
5 of 6 tasks
enable aiter gemm_a8w8_bpreshuffle for ptpc gemm
#8555
opened Jul 30, 2025 by
Yuechguo
6 tasks
feat(server): add image count limit to prevent OOM in multimodal requests
#8553
opened Jul 30, 2025 by
vincentzed
Draft
6 tasks
[NVIDIA] Add Low Latency NVFP4 decode kernels from Flashinfer
#8552
opened Jul 30, 2025 by
azhurkevich
Draft
6 tasks
feat: Add new moe triton for NVIDIA RTX 6000 Ada
#8547
opened Jul 30, 2025 by
17Reset
6 tasks
Add GKE's default CUDA runtime lib location to PATH and LD_LIBRARY_PATH.
#8544
opened Jul 29, 2025 by
pyc96
6 tasks
[Not Ready for merge] fix per token cuda kernel hidden dim cannot divide by 16
#8543
opened Jul 29, 2025 by
hebiao064
6 tasks
Bump transformers to 4.54.1 to fix Gemma cache issue.
#8541
opened Jul 29, 2025 by
lifuhuang
6 tasks
fix: use correct causality condition for flashattention, flashinfer, and triton backends
#8534
opened Jul 29, 2025 by
MahmoudAshraf97
3 of 6 tasks
Fix nan value generated after custom all reduce
#8532
opened Jul 29, 2025 by
kkHuang-amd
6 tasks
[sgl-kernel code style] clean moe_align_block_size kernel token_cnts_buffer
#8526
opened Jul 29, 2025 by
BBuf
6 tasks