Here is the source code for the papers in the EasyInfra series.
- 2026/05/01 Our work EasyBalance: Cross-Layer Load Balancing in Distributed MoE Inference has been accepted to ICML 2026. See you in Seoul!
- 2025/09/18 Our work EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization has been accepted to NeurIPS 2025. See you in San Diego!