BingyangWu
  • Peking University
  • Beijing, China


An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 5,965 586 Updated Mar 27, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 5,846 585 Updated Mar 29, 2025

Redis for LLMs

Python 661 73 Updated Mar 28, 2025

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python 2,307 388 Updated Mar 27, 2025
Python 47 3 Updated Dec 3, 2024

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 13,487 2,759 Updated Mar 29, 2025

A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).

Python 153 13 Updated Jul 5, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,951 517 Updated Mar 28, 2025

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 781 63 Updated Sep 4, 2024

An Attention Superoptimizer

C++ 21 Updated Jan 20, 2025

LaTeX template for dissertations in Peking University

TeX 561 190 Updated Apr 25, 2024

LaTeX template for dissertations in Peking University

TeX 23 1 Updated Mar 18, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 45,607 5,572 Updated Mar 28, 2025

A curated list for Efficient Large Language Models

Python 1,572 123 Updated Mar 23, 2025

Zero Bubble Pipeline Parallelism

Python 375 22 Updated Mar 4, 2025

Development repository for the Triton language and compiler

MLIR 15,018 1,891 Updated Mar 29, 2025

Large World Model -- Modeling Text and Video with Millions Context

Python 7,261 556 Updated Oct 19, 2024

Automatic resource configuration for serverless workflows.

Python 20 2 Updated Mar 24, 2024

Survey Paper List - Efficient LLM and Foundation Models

241 18 Updated Sep 22, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 12,618 1,388 Updated Mar 29, 2025

[TMLR 2024] Efficient Large Language Models: A Survey

1,124 95 Updated Feb 27, 2025

Branch Prediction Pin tool, implementing 2-bit saturating counter and perceptron branch predictors.

C++ 23 11 Updated Apr 1, 2016

6.823 Advanced Computer Architecture Lab

C++ 13 9 Updated Oct 17, 2016

The official repository for the gem5 computer-system architecture simulator.

C++ 1,908 1,368 Updated Mar 28, 2025

A C version of Branch Predictor Simulator

C 17 6 Updated Jul 10, 2024
C++ 69 41 Updated Apr 1, 2013

A repository for research on medium sized language models.

Python 493 69 Updated Jan 13, 2025

Naive Bayes-based Context Extension

Python 322 22 Updated Dec 9, 2024

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM

TypeScript 2,934 394 Updated Aug 21, 2024

Codebase for Merging Language Models (ICML 2024)

Python 805 49 Updated May 5, 2024