- Sun Yat-sen University
- Guangzhou, Guangdong, China
- https://wu-kan.cn/
Stars
A tool for examining GPU scheduling behavior.
A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP)
karthikeyann / cuda-calculator
Forked from szho42/cuda-calculator
HTML/JS port of CUDA Occupancy Calculator
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
A throughput-oriented high-performance serving framework for LLMs
A collection of benchmarks to measure basic GPU capabilities
A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
OpenPPL / CuAssembler
Forked from cloudcores/CuAssembler
An unofficial cuda assembler, for all generations of SASS, hopefully :)
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
A list of tutorials, papers, talks, and open-source projects for emerging compilers and architectures
Timeloop performs modeling, mapping and code-generation for tensor algebra workloads on various accelerator architectures.
The translator that supports translating NVPTX to SPIR-V. This translator is modified from LLVM-SPIR-V Translator.
A library to manipulate font files from Python.
Generate CSS unicode-range from a font file
Sakana widget for Web. | A web widget version of the Lycoris simulator (石蒜模拟器).
An innovative superfamily of fonts for code
A simple Python Pydantic model for Honkai: Star Rail parsed data from the Mihomo API.