A curated list of awesome open-source and commercial platforms for serving ML models in production 🚀
- Banana: Host your ML inference code on serverless GPUs and integrate it into your app with one line of code.
- BentoML: Open-source platform for high-performance ML model serving (minimal service sketch after this list).
- BudgetML: Deploy an ML inference service on a budget in fewer than 10 lines of code.
- Cortex: Machine learning model serving infrastructure.
- Gradio: Create customizable UI components around your models (example after this list).
- GraphPipe: Machine learning model deployment made simple.
- Hydrosphere: Platform for deploying your machine learning models to production.
- KServe (formerly KFServing): Kubernetes custom resource definition for serving ML models on arbitrary frameworks.
- Merlin: A platform for deploying and serving machine learning models.
- Opyrator: Turns your ML code into microservices with web API, interactive GUI, and more.
- PredictionIO: Event collection, deployment of algorithms, evaluation, querying predictive results via APIs.
- Rune: Provides containers to encapsulate and deploy EdgeML pipelines and applications.
- Seldon: Take your ML projects from POC to production with maximum efficiency and minimal risk.
- Streamlit: Lets you create apps for your ML projects with deceptively simple Python scripts (example after this list).
- TensorFlow Serving: Flexible, high-performance serving system for ML models, designed for production (REST client sketch after this list).
- TorchServe: A flexible and easy-to-use tool for serving PyTorch models.
- Triton Inference Server: Provides an optimized cloud and edge inferencing solution.
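
To give a feel for what a serving definition looks like, here is a minimal BentoML sketch. It assumes the BentoML 1.x-style `Service` API; the service name, endpoint, and toy scoring logic are illustrative placeholders, not taken from the project docs.

```python
# Minimal BentoML 1.x-style service sketch; the "model" is a stand-in.
import bentoml
from bentoml.io import JSON

svc = bentoml.Service("toy_classifier")  # hypothetical service name

@svc.api(input=JSON(), output=JSON())
def predict(payload: dict) -> dict:
    # Stand-in for real model inference.
    score = sum(payload.get("features", []))
    return {"label": "positive" if score > 0 else "negative"}
```

Saved as `service.py`, something like `bentoml serve service:svc` starts a local HTTP server exposing the `predict` endpoint.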
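
Gradio turns a plain Python function into a shareable web UI. A minimal sketch, with a placeholder function standing in for a real model call:

```python
import gradio as gr

def predict(text: str) -> str:
    # Replace with a real model call; this heuristic is a placeholder.
    return "positive" if "good" in text.lower() else "negative"

# Map the function to simple text-in, text-out UI components.
demo = gr.Interface(fn=predict, inputs="text", outputs="text")
demo.launch()  # serves a local web UI
```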
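
Streamlit apps are ordinary Python scripts that rerun top-to-bottom on each interaction. A minimal sketch (the scoring logic is a placeholder), launched with `streamlit run app.py`:

```python
import streamlit as st

st.title("Toy sentiment demo")
text = st.text_input("Enter a sentence")
threshold = st.slider("Confidence threshold", 0.0, 1.0, 0.5)

if text:
    score = min(1.0, len(text) / 100)  # stand-in for a model score
    st.write("Prediction:", "positive" if score >= threshold else "negative")
```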
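
TensorFlow Serving exposes a REST predict endpoint once a model is loaded. A minimal client sketch; the model name `my_model`, the default REST port 8501, and the input shape are assumptions about your particular deployment:

```python
import requests

# One instance with four features; adjust to your model's input signature.
payload = {"instances": [[1.0, 2.0, 3.0, 4.0]]}

resp = requests.post(
    "http://localhost:8501/v1/models/my_model:predict",  # hypothetical model name
    json=payload,
)
resp.raise_for_status()
print(resp.json()["predictions"])
```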