
# 🚀 EfficientAI

**Efficient Inference for LLMs & MLLMs**

An open-source research project from Alibaba Cloud dedicated to efficient large language model inference.

*EfficientAI Banner*



## ✨ Key Features

EfficientAI focuses on inference-time optimizations for LLMs and MLLMs:

| Feature | Description | Status |
| --- | --- | --- |
| 🔹 Activation Sparsity | Dynamic sparsity methods for faster inference | ✅ LaRoSa (ICML 2025) |
| 🔹 Quantization | Post-training & quantization-aware techniques for MLLMs | ✅ MASQuant (CVPR 2026) |
| 🔹 Agentic Reasoning | Efficient tool-use and reasoning frameworks | ✅ D-CORE |
| 🔹 Reproducible Benchmarks | Standardized evaluation pipelines for research & production | 🔄 In Progress |
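To illustrate the activation-sparsity idea behind training-free methods such as LaRoSa: many activations in an LLM's MLP layers are near zero, so zeroing the smallest-magnitude entries before the next matmul can skip work with little accuracy loss. The sketch below is a generic magnitude top-k filter for illustration only; the function name and thresholding scheme are hypothetical and are not this repository's API.

```python
def topk_sparsify(x, keep_ratio=0.5):
    """Keep roughly the top `keep_ratio` fraction of activations by
    magnitude and zero the rest (ties at the threshold may keep more)."""
    k = max(1, round(keep_ratio * len(x)))
    # k-th largest magnitude acts as the sparsification threshold
    thresh = sorted((abs(v) for v in x), reverse=True)[k - 1]
    return [v if abs(v) >= thresh else 0.0 for v in x]

# Toy activation vector: half the entries survive, the rest are zeroed
x = [0.1, -2.0, 0.05, 1.5, -0.3, 0.8]
sparse = topk_sparsify(x, keep_ratio=0.5)
```

In a real kernel the zeroed positions let the subsequent matrix multiply skip whole rows or columns, which is where the speedup comes from.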

## 🔥 Latest Updates

**📰 Changelog**
- **[2026-03]** 🎉 **MASQuant** accepted to CVPR 2026
  Multimodal LLM post-training quantization (PTQ) algorithm with a state-of-the-art accuracy-efficiency trade-off
  📄 Paper | 💻 Code

- **[2026-02]** 🚀 **D-CORE** open-sourced
  Efficient tool-use reasoning via dynamic computation routing
  📄 Paper | 💻 Code | 🎮 Demo

- **[2026-01]** 🏆 **LaRoSa** accepted to ICML 2025
  Training-free activation sparsity for LLM acceleration
  📄 Paper | 💻 Code
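Post-training quantization of the kind MASQuant targets maps full-precision weights to low-bit integers after training, trading a small amount of accuracy for memory and compute savings. As a point of reference only, a minimal generic symmetric per-tensor int8 scheme looks like the following; this is a textbook baseline, not MASQuant's algorithm:

```python
def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q,
    with q an integer in [-127, 127]."""
    scale = max(abs(v) for v in w) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in w]
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original weights."""
    return [v * scale for v in q]

# Toy weight vector: the largest magnitude (2.54) maps to 127
w = [0.5, -1.2, 2.54, -0.02]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
```

With rounding to the nearest integer, the per-weight reconstruction error is bounded by half the scale; PTQ methods for MLLMs refine this baseline with per-channel scales, outlier handling, and calibration data.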


## 📦 Installation

```shell
# Clone the repository
git clone https://github.com/alibaba/EfficientAI.git
cd EfficientAI

# Install dependencies (recommended: use conda)
pip install -r requirements.txt

# Optional: install with specific module support
# pip install -e ".[larosa]"   # for LaRoSa
# pip install -e ".[masquant]" # for MASQuant
```
