BentoML

The most flexible way to serve AI models in production

Welcome to BentoML 👋

What is BentoML? 👩‍🍳

BentoML is an open-source model serving library for building performant and scalable AI applications with Python. It comes with everything you need for serving optimization, model packaging, and production deployment.

🔨 Build Anywhere with Open-Source:

🚒 Efficient scaling on your/our Cloud:

  • ☁️ BentoCloud: Inference Platform for enterprise AI teams to build fast, secure, and scalable AI applications.

Get in touch 💬

👉 Join our Slack community!

👀 Follow us on X @bentomlai and LinkedIn

📖 Read our blog

Pinned

  1. BentoML Public

    The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!

    Python · 6.5k stars · 734 forks

  2. OpenLLM Public

    Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint, locally and in the cloud.

    Python · 8.7k stars · 544 forks

Repositories

Showing 10 of 71 repositories
  • rag-tutorials Public

    A series of tutorials implementing a RAG service with BentoML and LlamaIndex

    Python · 1 star · 0 forks · 0 issues · 0 pull requests · Updated Apr 16, 2024
  • BentoML Public

    The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!

    Python · 6,515 stars · Apache-2.0 · 734 forks · 208 issues · 18 pull requests · Updated Apr 16, 2024
  • OpenLLM Public

    Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint, locally and in the cloud.

    Python · 8,674 stars · Apache-2.0 · 544 forks · 80 issues · 11 pull requests · Updated Apr 15, 2024
  • plugins Public

    The swish knife to all things BentoML.

    Starlark · 6 stars · Apache-2.0 · 1 fork · 0 issues · 0 pull requests · Updated Apr 15, 2024
  • BentoIris Public

    How to build an Iris classification application using BentoML

    Python · 0 stars · 1 fork · 0 issues · 0 pull requests · Updated Apr 12, 2024
  • BentoVLLM Public

    Self-host LLMs with vLLM and BentoML

    Python · 12 stars · 6 forks · 2 issues · 0 pull requests · Updated Apr 12, 2024
  • BentoSentenceTransformers Public

    How to build a sentence embedding application using BentoML

    Python · 1 star · 0 forks · 0 issues · 1 pull request · Updated Apr 11, 2024
  • BentoCLIP Public

    Building a CLIP application using BentoML

    Python · 3 stars · 1 fork · 0 issues · 2 pull requests · Updated Apr 10, 2024
  • BentoLCM Public

    How to build a Latent Consistency Model (LCM) application using BentoML

    Python · 0 stars · 0 forks · 0 issues · 2 pull requests · Updated Apr 10, 2024
  • BentoBLIP Public

    How to build an image captioning application on top of a BLIP model with BentoML

    Python · 1 star · 0 forks · 0 issues · 2 pull requests · Updated Apr 10, 2024