BentoML is an open-source model serving library for building model inference APIs and multi-model serving systems with any open-source or custom AI models. It comes with everything you need for serving optimization and model packaging, and simplifies production deployment via ☁️ BentoCloud.
- 🍱 BentoML: The Unified Model Serving Framework
- 🦾 OpenLLM: Self-hosting Large Language Models Made Easy
- ☁️ BentoCloud: Inference Platform for fast-moving AI teams
👉 Join our Slack community!

👀 Follow us on X @bentomlai and LinkedIn

📖 Read our blog