Chenghua chenghuaWang

🎯

Focusing

Postgraduate student @ BUPT. Previously CS undergrad @ ZJGSU. AI & Sys.

43 followers · 180 following

I work for myself
HangZhou
https://chenghuawang.github.io/keep-moving-forward/

Starred repositories

binary-husky / gpt_academic

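Provides a practical interactive interface for GPT, GLM, and other large language models (LLMs), optimized in particular for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and support for local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…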

Python 67,835 8,319 Updated Mar 8, 2025

touying-typ / touying

Touying is a powerful package for creating presentation slides in Typst.

Typst 1,139 31 Updated Mar 5, 2025

wjakob / nanobind

nanobind: tiny and efficient C++/Python bindings

C++ 2,635 220 Updated Mar 2, 2025

deepseek-ai / 3FS

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 7,738 685 Updated Mar 8, 2025

deepseek-ai / smallpond

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,028 336 Updated Mar 5, 2025

Wan-Video / Wan2.1

Wan: Open and Advanced Large-Scale Video Generative Models

Python 7,713 789 Updated Mar 7, 2025

deepseek-ai / profile-data

Analyze computation-communication overlap in V3/R1.

899 116 Updated Mar 3, 2025

deepseek-ai / EPLB

Expert Parallelism Load Balancer

Python 1,040 151 Updated Feb 27, 2025

deepseek-ai / DualPipe

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,542 246 Updated Mar 5, 2025

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 4,857 474 Updated Mar 10, 2025

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda 7,093 612 Updated Mar 6, 2025

NVIDIA / online-softmax

Benchmark code for the "Online normalizer calculation for softmax" paper

Cuda 85 7 Updated Jul 27, 2018

deepseek-ai / FlashMLA

FlashMLA: Efficient MLA decoding kernels

C++ 11,219 785 Updated Mar 1, 2025

tile-ai / tilelang

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 610 40 Updated Mar 9, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,700 195 Updated Mar 4, 2025

MoonshotAI / MoBA

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 1,629 93 Updated Mar 7, 2025

filipdutescu / modern-cpp-template

A template for modern C++ projects using CMake, Clang-Format, CI, unit testing and more, with support for downstream inclusion.

CMake 1,782 220 Updated Mar 16, 2024

Deep-Agent / R1-V

Witness the aha moment of VLM with less than $3.

Python 3,099 242 Updated Mar 1, 2025

EvolvingLMMs-Lab / open-r1-multimodal

A fork to add multimodal model training to open-r1

Python 998 51 Updated Feb 8, 2025

open-thoughts / open-thoughts

Fully open data curation for reasoning models

Python 1,477 126 Updated Feb 23, 2025

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes/codes for ML SYS.

Python 1,335 69 Updated Mar 8, 2025

simplescaling / s1

s1: Simple test-time scaling

Python 5,899 678 Updated Mar 6, 2025

spcl / QuaRot

Code for Neurips24 paper: QuaRot, an end-to-end 4-bit inference of large language models.

Python 352 32 Updated Nov 26, 2024

pytorch / torchdynamo

A Python-level JIT compiler designed to make unmodified PyTorch programs faster.

Python 1,034 125 Updated Apr 17, 2024

DefTruth / ffpa-attn-mma

📚FFPA(Split-D): Yet another Faster Flash Prefill Attention with O(1) GPU SRAM complexity for headdim > 256, ~2x↑🎉vs SDPA EA.

Cuda 129 5 Updated Mar 5, 2025

zhiyiYo / PyQt-Fluent-Widgets

A fluent design widgets library based on C++ Qt/PyQt/PySide. Make Qt Great Again.

Python 6,288 609 Updated Mar 9, 2025

sail-sg / oat

🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.

Python 213 12 Updated Feb 24, 2025

RLHF-V / RLAIF-V

[CVPR'25] RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness

Python 305 12 Updated Mar 4, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 22,431 2,011 Updated Mar 9, 2025

MoonshotAI / Kimi-k1.5

3,189 190 Updated Mar 7, 2025

Lists (10)

👀 AI

📖 Learning

☄️ Compile & Building

⛏️ Computer Graphics

📚 Database

🍇 Desktop App toolchain

🌱 Distributed Sys

🌟 MLSys

quant

✍️ Writing

Starred topics

leveldb

PyTorch

Lua

Linux

Go

Docker

Tensorflow

C++

C