LLM, CUDA/System, Distributed training
-
KAUST (King Abdullah University of Science and Technology)
Pinned
-
TheCoreTeam/core_scheduler
CoreScheduler: A High-Performance Scheduler for Large Model Training
-
Tiny-DeepSpeed
A minimalistic re-implementation of the DeepSpeed library
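The core idea behind DeepSpeed's ZeRO stage 1, which a minimal re-implementation would cover, is sharding optimizer state across data-parallel ranks so each rank updates only its slice of the parameters before an all-gather. The sketch below simulates this on a single process with NumPy; the function name `zero1_sgd_step` and the in-process "ranks" are illustrative assumptions, not the Tiny-DeepSpeed API.

```python
import numpy as np

def zero1_sgd_step(params, grads, world_size, lr=0.1):
    """Simulated ZeRO stage-1 SGD step (illustrative, single-process).

    Each simulated rank owns a 1/world_size shard of the flattened
    parameters and applies the update only to its shard; afterwards the
    shards are "all-gathered" so every rank holds the full parameters.
    Gradients are assumed already averaged (as after an all-reduce).
    """
    flat_p = np.concatenate([p.ravel() for p in params])
    flat_g = np.concatenate([g.ravel() for g in grads])
    shards = np.array_split(np.arange(flat_p.size), world_size)
    # Each rank updates only the shard it owns.
    for rank in range(world_size):
        idx = shards[rank]
        flat_p[idx] -= lr * flat_g[idx]
    # "All-gather": reshape the flat vector back into per-tensor views.
    out, offset = [], 0
    for p in params:
        out.append(flat_p[offset:offset + p.size].reshape(p.shape))
        offset += p.size
    return out
```

In a real distributed run the per-shard update happens on separate devices and the reassembly is a collective all-gather; the memory win is that each rank stores optimizer state only for its shard.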
-
Flash-Attention-Implementation
Implementation of Flash-Attention (both forward and backward) with PyTorch, CUDA, and Triton
Python
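The forward pass of Flash-Attention rests on online softmax: K/V are processed in blocks with running row maxima and denominators, so the full N x N score matrix is never materialized. A minimal NumPy sketch of that tiling idea (not the repo's CUDA/Triton code; `flash_attention_forward` is a hypothetical name):

```python
import numpy as np

def flash_attention_forward(Q, K, V, block_size=2):
    """Tiled attention forward pass using online softmax.

    Numerically equivalent to softmax(Q @ K.T / sqrt(d)) @ V, but K and V
    are consumed block by block, keeping only O(N * d) state.
    """
    N, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    O = np.zeros((N, d))          # unnormalized output accumulator
    m = np.full(N, -np.inf)       # running row maxima of the scores
    l = np.zeros(N)               # running softmax denominators
    for start in range(0, N, block_size):
        Kb = K[start:start + block_size]
        Vb = V[start:start + block_size]
        S = (Q @ Kb.T) * scale                  # scores for this block
        m_new = np.maximum(m, S.max(axis=1))
        P = np.exp(S - m_new[:, None])          # block numerators
        correction = np.exp(m - m_new)          # rescale earlier partials
        l = l * correction + P.sum(axis=1)
        O = O * correction[:, None] + P @ Vb
        m = m_new
    return O / l[:, None]
```

The backward pass in the actual kernels recomputes the block scores instead of storing them, trading FLOPs for memory; the rescaling trick above is what keeps the streaming softmax numerically stable.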