AlpinDale edited this page Mar 11, 2024 · 4 revisions
Aphrodite Engine
Aphrodite Engine, based on vLLM, is designed for serving LLMs at scale. It supports the majority of HuggingFace models, including Llama, Mistral, and Mixtral.
Aphrodite also supports multiple weight quantization methods for not-at-scale (and at-scale!) use cases. Please see this page for details.