🎯
Focusing

Highlights

  • Pro

Organizations

@SYSU-SCC

Material for gpu-mode lectures

Jupyter Notebook 4,134 417 Updated Feb 9, 2025

A tool for examining GPU scheduling behavior.

Cuda 74 18 Updated Aug 17, 2024

A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP)

C++ 391 69 Updated Jan 13, 2025

HTML/JS port of CUDA Occupancy Calculator

CoffeeScript 17 8 Updated Nov 23, 2021

Parboil benchmark

C 4 6 Updated Nov 7, 2016

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Python 566 42 Updated Feb 14, 2025

A throughput-oriented high-performance serving framework for LLMs

Cuda 782 31 Updated Sep 21, 2024

collection of benchmarks to measure basic GPU capabilities

C++ 332 47 Updated Feb 11, 2025

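FGO automation: automatic quest farming, automatic crafting of CE EXP fodder, and planned features (friend point summoning, sorting fodder in the present box)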

17 Updated Sep 7, 2021

Python 12 7 Updated May 28, 2024

A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.

C++ 50 6 Updated Mar 20, 2025

An unofficial cuda assembler, for all generations of SASS, hopefully :)

Python 82 10 Updated Mar 20, 2023

A framework for pipelined computing on GPU

C++ 29 9 Updated Jul 17, 2019

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 810 50 Updated Mar 19, 2025

A list of tutorials, papers, talks, and open-source projects for emerging compilers and architectures

438 35 Updated Jan 15, 2025

Python 1 Updated Jun 30, 2024

Timeloop performs modeling, mapping and code-generation for tensor algebra workloads on various accelerator architectures.

C++ 371 111 Updated Mar 24, 2025

Rodinia benchmark

C 16 5 Updated Jul 5, 2024

The translator that supports translating NVPTX to SPIR-V. This translator is modified from LLVM-SPIR-V Translator.

LLVM 37 8 Updated Oct 25, 2021

Example cmake project for grpc / protobuf

C++ 118 31 Updated Mar 24, 2025

C in four functions

C 10,073 1,456 Updated Dec 26, 2023

CUDA on non-NVIDIA GPUs

Rust 11,033 707 Updated Mar 17, 2025

A library to manipulate font files from Python.

Python 4,551 468 Updated Mar 27, 2025

Generate CSS unicode-range from a font file

TypeScript 7 1 Updated Dec 3, 2023

🐟 "Sakana!" Lycoris simulator

JavaScript 2,036 139 Updated Nov 9, 2022

Sakana widget for Web. | A web widget version of the Lycoris simulator.

TypeScript 1,208 66 Updated Sep 26, 2023

An innovative superfamily of fonts for code

TypeScript 15,705 270 Updated Mar 7, 2025

A simple Python Pydantic model for Honkai: Star Rail parsed data from the Mihomo API.

Python 19,104 2,885 Updated Mar 28, 2025