
Starred repositories
📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or other AI tools.
Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models"
Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.
Fully open reproduction of DeepSeek-R1
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
APOLLO: SGD-like Memory, AdamW-level Performance
[arXiv preprint] Official Repository for "Evaluating Language Models as Synthetic Data Generators"
Continual Learning of Large Language Models: A Comprehensive Survey
Representation Engineering: A Top-Down Approach to AI Transparency
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
Efficient Triton Kernels for LLM Training
📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥
Tools for merging pretrained large language models.
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
Minimalistic large language model 3D-parallelism training
Mental health large language model (LLM): The Big Model of Mental Health, Finetune, InternLM2, InternLM2.5, Qwen, ChatGLM, Baichuan, DeepSeek, Mixtral, LLama3, GLM4, Qwen2, LLama3.1
Netease Youdao's open-source embedding and reranker models for RAG products.
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR 2024]
A curated list of Large Language Model (LLM) Interpretability resources.
OpenChat: Advancing Open-source Language Models with Imperfect Data