A curated list of awesome open-source and commercial platforms for serving ML models in production 🚀
- Banana: Host your ML inference code on serverless GPUs and integrate it into your app with one line of code.
- BentoML: Open-source platform for high-performance ML model serving (minimal service sketch after this list).
- BudgetML: Deploy an ML inference service on a budget in fewer than 10 lines of code.
- Cortex: Machine learning model serving infrastructure.
- Gradio: Create customizable UI components around your models (example after this list).
- GraphPipe: Machine learning model deployment made simple.
- Hydrosphere: Platform for deploying your machine learning models to production.
- KServe (formerly KFServing): Kubernetes custom resource definition for serving ML models on arbitrary frameworks.
- Merlin: A platform for deploying and serving machine learning models.
- Opyrator: Turns your ML code into microservices with web API, interactive GUI, and more.
- PredictionIO: Event collection, deployment of algorithms, evaluation, querying predictive results via APIs.
- Rune: Provides containers to encapsulate and deploy EdgeML pipelines and applications.
- Seldon: Take your ML projects from POC to production with maximum efficiency and minimal risk.
- Streamlit: Lets you create apps for your ML projects with deceptively simple Python scripts (example after this list).
- TensorFlow Serving: Flexible, high-performance serving system for ML models, designed for production (REST client sketch after this list).
- TorchServe: A flexible and easy-to-use tool for serving PyTorch models.
- Triton Inference Server: Provides an optimized cloud and edge inferencing solution.
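
To give a feel for what a serving definition looks like, here is a minimal BentoML sketch. It assumes the BentoML 1.x-style `Service` API; the service name, endpoint, and toy scoring logic are illustrative placeholders, not taken from the project docs.

```python
# Minimal BentoML 1.x-style service sketch; the "model" is a stand-in.
import bentoml
from bentoml.io import JSON

svc = bentoml.Service("toy_classifier")  # hypothetical service name

@svc.api(input=JSON(), output=JSON())
def predict(payload: dict) -> dict:
    # Stand-in for real model inference.
    score = sum(payload.get("features", []))
    return {"label": "positive" if score > 0 else "negative"}
```

Saved as `service.py`, something like `bentoml serve service:svc` starts a local HTTP server exposing the `predict` endpoint.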
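
Gradio turns a plain Python function into a shareable web UI. A minimal sketch, with a placeholder function standing in for a real model call:

```python
import gradio as gr

def predict(text: str) -> str:
    # Replace with a real model call; this heuristic is a placeholder.
    return "positive" if "good" in text.lower() else "negative"

# Map the function to simple text-in, text-out UI components.
demo = gr.Interface(fn=predict, inputs="text", outputs="text")
demo.launch()  # serves a local web UI
```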
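
Streamlit apps are ordinary Python scripts that rerun top-to-bottom on each interaction. A minimal sketch (the scoring logic is a placeholder), launched with `streamlit run app.py`:

```python
import streamlit as st

st.title("Toy sentiment demo")
text = st.text_input("Enter a sentence")
threshold = st.slider("Confidence threshold", 0.0, 1.0, 0.5)

if text:
    score = min(1.0, len(text) / 100)  # stand-in for a model score
    st.write("Prediction:", "positive" if score >= threshold else "negative")
```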
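
TensorFlow Serving exposes a REST predict endpoint once a model is loaded. A minimal client sketch; the model name `my_model`, the default REST port 8501, and the input shape are assumptions about your particular deployment:

```python
import requests

# One instance with four features; adjust to your model's input signature.
payload = {"instances": [[1.0, 2.0, 3.0, 4.0]]}

resp = requests.post(
    "http://localhost:8501/v1/models/my_model:predict",  # hypothetical model name
    json=payload,
)
resp.raise_for_status()
print(resp.json()["predictions"])
```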