Kernel Library for SGLang
For CUDA 11.8:
pip3 install sgl-kernel -i https://docs.sglang.ai/whl/cu118
For CUDA 12.1 or CUDA 12.4:
pip3 install sgl-kernel
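When installation is scripted, the choice between the two commands above can be encoded in a small helper. This is a minimal sketch: the function name `sgl_kernel_pip_command` is hypothetical and not part of sgl-kernel, and it covers only the CUDA versions listed above.

```python
def sgl_kernel_pip_command(cuda_version: str) -> list[str]:
    """Return the pip command documented above for a CUDA version such as "12.4"."""
    base = ["pip3", "install", "sgl-kernel"]
    if cuda_version == "11.8":
        # CUDA 11.8 wheels are served from a dedicated index.
        return base + ["-i", "https://docs.sglang.ai/whl/cu118"]
    if cuda_version in ("12.1", "12.4"):
        # CUDA 12.1 and 12.4 use the default package index.
        return base
    # Anything else is not covered by the instructions above.
    raise ValueError(f"no install command documented for CUDA {cuda_version}")


if __name__ == "__main__":
    print(" ".join(sgl_kernel_pip_command("12.4")))
```

Returning the command as a list of arguments keeps it ready to pass to `subprocess.run` without shell quoting.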
Use Docker to set up the development environment. See Docker setup guide.
Create and enter development container:
docker run -itd --shm-size 32g --gpus all -v $HOME/.cache:/root/.cache --ipc=host --name sglang_zhyncs lmsysorg/sglang:dev /bin/zsh
docker exec -it sglang_zhyncs /bin/zsh
Third-party libraries:
Steps to add a new kernel:
- Implement the kernel in csrc
- Expose the interface in include/sgl_kernel_ops.h
- Create the torch extension in csrc/torch_extension.cc
- Update setup.py to include the new CUDA source
- Expose the Python interface in python
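Because a new kernel touches five places, it is easy to miss one. The sketch below is a heuristic text search, not an official tool: `missing_registrations` and `REGISTRATION_POINTS` are hypothetical names, and the check assumes the kernel's name (or its source file name) appears literally in each location.

```python
from pathlib import Path

# Locations named in the steps above, relative to the sgl-kernel root.
REGISTRATION_POINTS = (
    "csrc",
    "include/sgl_kernel_ops.h",
    "csrc/torch_extension.cc",
    "setup.py",
    "python",
)


def mentions(root: Path, rel: str, needle: str) -> bool:
    """True if `needle` occurs in the file `rel`, or in any file under the directory `rel`."""
    path = root / rel
    if path.is_file():
        files = [path]
    elif path.is_dir():
        # Skip files that are checked as their own registration point,
        # so csrc/torch_extension.cc does not satisfy the plain "csrc" step.
        own = {root / r for r in REGISTRATION_POINTS}
        files = [p for p in path.rglob("*") if p.is_file() and p not in own]
    else:
        files = []
    return any(needle in f.read_text(errors="ignore") for f in files)


def missing_registrations(root: Path, kernel_name: str) -> list[str]:
    """List the registration points that never mention `kernel_name`."""
    return [rel for rel in REGISTRATION_POINTS if not mentions(root, rel, kernel_name)]
```

An empty result only means the name was found everywhere; it does not confirm that the declarations are correct or that the build succeeds.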
Development build:
make build
Note:
sgl-kernel is rapidly evolving. If you experience a compilation failure, try using make rebuild.
- Add pytest tests in tests/
- Add benchmarks in benchmark/ using Triton's benchmark utilities
- Run the test suite
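A common shape for a kernel test is to compare the kernel's output against a slow, obviously correct reference. The sketch below shows only that pattern in plain Python: `kernel_silu` is a hypothetical stand-in for a function exported by the Python interface, and a real test in tests/ would call the compiled kernel on GPU tensors instead.

```python
import math


def reference_silu(x: float) -> float:
    """Reference implementation: x * sigmoid(x), written for clarity."""
    return x / (1.0 + math.exp(-x))


def kernel_silu(x: float) -> float:
    """Hypothetical stand-in for the kernel under test (not a real sgl-kernel call)."""
    return x * (1.0 / (1.0 + math.exp(-x)))


def test_silu_matches_reference():
    # Cover negative, zero, and positive inputs; tolerances allow for
    # small floating-point differences between the two formulations.
    for x in (-4.0, -1.0, 0.0, 0.5, 3.0):
        assert math.isclose(kernel_silu(x), reference_silu(x), rel_tol=1e-6, abs_tol=1e-9)
```

pytest collects any function whose name starts with `test_`, so a file like this needs no extra registration once it is placed under tests/.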
Update the version in both pyproject.toml and version.py
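Since the version lives in two files, a quick consistency check can catch a release where only one was bumped. This is a minimal sketch under two assumptions that should be verified against the actual files: that pyproject.toml contains a line of the form `version = "..."`, and that version.py defines `__version__`. The function names are hypothetical.

```python
import re


def version_from_pyproject(text: str) -> str:
    """Extract the value of the first `version = "..."` line in pyproject.toml text."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no version field found in pyproject.toml text")
    return match.group(1)


def version_from_module(text: str) -> str:
    """Extract the value of `__version__ = "..."` from version.py text (assumed name)."""
    match = re.search(r'^__version__\s*=\s*["\']([^"\']+)["\']', text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no __version__ assignment found in version.py text")
    return match.group(1)


def versions_match(pyproject_text: str, version_py_text: str) -> bool:
    """True when both files declare the same version string."""
    return version_from_pyproject(pyproject_text) == version_from_module(version_py_text)
```

The functions take file contents rather than paths, so the caller decides where version.py lives in the repository.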