Issues: sgl-project/sglang
#5471 [RFC][Feature][Model] Add templated fallback HF transformers model backend in SRT (opened Apr 16, 2025 by XuehaiPan)
#5465 [Bug] SGLang server freezes during high-traffic periods with a 16-GPU DeepSeek v3 setup (opened Apr 16, 2025 by moqimoqidea)
#5458 [Bug] Model stays stuck in the prefill batch stage and never proceeds to decode, eventually triggering a Watchdog Timeout on the compute node (opened Apr 16, 2025 by WALLE-AI)
#5457 [Bug] Rank 0 finishes model loading, but other ranks do not; likely caused by unexpected failures (e.g., OOM) or a slow node (opened Apr 16, 2025 by Alex-9827)
#5455 [Bug] Possible prefix-matching issue affecting numerous VLMs (opened Apr 16, 2025 by KivenChen)
#5453 What is the relationship between ModelRunner and the model implementations (deepseek.py, llama.py, etc.)? (opened Apr 16, 2025 by wangzhen2271)
#5451 [Bug] sglang.bench_serving raises IndexError when the extra body includes stream_options.include_usage (opened Apr 16, 2025 by morty-zxb)
#5450 [Bug] PD disaggregation: KV transfer slows down under high concurrency (opened Apr 16, 2025 by MtFitzRoy)
#5443 [Bug] v0.4.5: NameError: name 'VLLM_AVAILABLE' is not defined in compressed_tensors.py (opened Apr 16, 2025 by kratorado)
#5441 [Bug] ValueError: Model architectures ['Glm4ForCausalLM'] are not supported for now (opened Apr 16, 2025 by chunxingque)
#5437 [Feature] Optimize SegmentPackBits (labels: high priority, speculative-decoding; opened Apr 15, 2025 by zhyncs)
#5429 Help needed: slow inference on a 2x H100x8 setup with DeepSeek-R1 (opened Apr 15, 2025 by seungduk-yanolja)
#5410 [Bug] hiradix_cache raises an exception while executing self.dec_lock_ref (opened Apr 15, 2025 by a4zhangfei)
#5409 [Bug] Auto-truncation still uses the full context length instead of (context_length - max_tokens) (opened Apr 15, 2025 by BaiMoHan)
#5407 Downloading flashinfer-python from https://flashinfer.ai/whl/cu124/torch2.5/flashinfer-python is too slow (opened Apr 15, 2025 by Huixxi)
#5402 [Bug] Llama 4 Scout outputs garbage on 2x MI300X (labels: high priority; opened Apr 15, 2025 by Bihan)
#5391 [Bug] Loading meta-llama/Llama-4-Scout-17B-16E-Instruct with flashinfer-0.2.0(+) raises an error when dtype="float16" (opened Apr 15, 2025 by liye0626)
#5389 [Bug] SGLang server fails during CUDA graph capture: flashinfer JIT requires cuda_fp8.h even with a pre-compiled wheel and --disable-cuda-graph (opened Apr 14, 2025 by brando90)
#5386 [Bug] HF_HUB_OFFLINE no longer supported in version 0.4.5 (opened Apr 14, 2025 by dcfidalgo)
#5377 [Bug] Non-deterministic outputs with a fixed seed parameter in the API server (opened Apr 14, 2025 by BaiMoHan)