Stars
FlexAttention-based, minimal vLLM-style inference engine for fast Gemma 2 inference.
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Hierarchical Reasoning Model Official Release
GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai
Official code for our CVPR paper <CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering>
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
Prompts for our Grok chat assistant and the `@grok` bot on X.
Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling
A unified library for object tracking featuring clean room re-implementations of leading multi-object tracking algorithms
MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining
XiaomiMiMo / vllm
Forked from vllm-project/vllm. A high-throughput and memory-efficient inference and serving engine for LLMs
A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Lla…
Pretraining and inference code for a large-scale depth-recurrent language model
Large Language Model-enhanced Recommender System Papers
A list of awesome papers and resources on recommender systems with large language models (LLMs).
Paper List for Recommend-system PreTrained Models
Recommender systems with large language models (Paper list)
Survey: A collection of AWESOME papers and resources on the large language model (LLM) related recommender system topics.
Scalable and memory-optimized training of diffusion models
Build Real-Time Knowledge Graphs for AI Agents
[NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
Large Language Model (LLM) Systems Paper List
Let's make video diffusion practical!