Peking University
Beijing, China
Stars
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
verl: Volcano Engine Reinforcement Learning for LLMs
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization…
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
LaTeX template for dissertations in Peking University
zhiyunyao / pkuthss
Forked from CasperVector/pkuthss
LaTeX template for dissertations in Peking University
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A curated list for Efficient Large Language Models
Zero Bubble Pipeline Parallelism
Development repository for the Triton language and compiler
Large World Model -- Modeling Text and Video with Million-Length Context
Automatic resource configuration for serverless workflows.
Survey Paper List - Efficient LLM and Foundation Models
SGLang is a fast serving framework for large language models and vision language models.
[TMLR 2024] Efficient Large Language Models: A Survey
Branch Prediction Pin tool, implementing 2-bit saturating counter and perceptron branch predictors.
The official repository for the gem5 computer-system architecture simulator.
A C version of a branch predictor simulator
A repository for research on medium sized language models.
A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM
Codebase for Merging Language Models (ICML 2024)