- Peking University
- https://fxmeng.github.io
Stars
Official Implementation of "JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse"
Train a 1B LLM with 1T tokens from scratch as an individual
TransMLA: Multi-Head Latent Attention Converter
Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs
A library for generative social simulation
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Fully open reproduction of DeepSeek-R1
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
Continual Learning of Large Language Models: A Comprehensive Survey
PyTorch native quantization and sparsity for training and inference
MineStudio: A Streamlined Package for Minecraft AI Agent Development
Must-read Papers on Knowledge Editing for Large Language Models.
📰 Must-read papers and blogs on Speculative Decoding ⚡️
A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed at low training cost, covering base models, vertical-domain fine-tunes and applications, datasets, tutorials, and more.
Awesome LLM compression research papers and tools.
[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
A framework for the evaluation of autoregressive code generation language models.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Modeling, training, eval, and inference code for OLMo
Video+code lecture on building nanoGPT from scratch
Create animations for the optimization trajectory of neural nets