llama.cpp-moe

llama.cpp-moe is built for practical Mixture-of-Experts (MoE) inference: local, efficient, and understandable.

This repository centers on MoE-first workflows with lightweight controls that make expert behavior visible, tunable, and easy to reason about.

MoE Innovation Design Philosophy

The original design philosophy section has been moved to work.md, which focuses on the MoE innovation and the --moe-gpu-expert-slot-num design approach.

Repository Notes

  • The previous project README has been preserved as README_OLD.md.
  • Use README_OLD.md as a historical and technical reference; this README defines the project's intent and guiding principles.

About

LLM inference in C/C++
