A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Latest commit dd44ea5 Aug 13, 2019


Awesome production machine learning

This repository contains a curated list of awesome open source libraries that will help you deploy, monitor, version, scale, and secure your production machine learning.

Quick links to sections in this page

🔍 Explaining predictions & models 🔏 Privacy preserving ML 📜 Model & data versioning
🏁 Model Orchestration ⚔ Adversarial Robustness 🤖 Neural Architecture Search
📓 Reproducible Notebooks 📊 Visualisation frameworks 🔠 Industry-strength NLP
🧵 Data pipelines & ETL 🏷️ Data Labelling 🗞️ Data storage
📡 Functions as a service 🗺️ Computation distribution 📥 Model serialisation
🎁 Compiler optimisation 💸 Data Stream Processing 🌀 Feature engineering
💰 Commercial Platforms

10 Min Video Overview

This 10-minute video provides an overview of the motivations for machine learning operations, as well as a high-level overview of some of the tools in this repo.

Want to receive regular updates on this repo and other advancements?

You can join the Machine Learning Engineer newsletter. You will receive updates on open source frameworks, tutorials and articles curated by machine learning professionals.

Main Content

Explaining Black Box Models and Datasets

  • XAI - eXplainableAI - An eXplainability toolbox for machine learning.
  • Alibi - Alibi is an open source Python library aimed at machine learning model inspection and interpretation. The initial focus of the library is on black-box, instance-based model explanations.
  • SHAP - SHapley Additive exPlanations is a unified approach to explain the output of any machine learning model.
  • DeepLIFT - Codebase that contains the methods in the paper "Learning important features through propagating activation differences". Here are the slides and the video of the 15-minute talk given at ICML.
  • TreeInterpreter - Package for interpreting scikit-learn's decision tree and random forest predictions. Allows decomposing each prediction into bias and feature contribution components as described in
  • LIME - Local Interpretable Model-agnostic Explanations for machine learning models.
  • ELI5 - "Explain Like I'm 5" is a Python package which helps to debug machine learning classifiers and explain their predictions.
  • Skater - Skater is a unified framework that enables model interpretation for all forms of models, helping you build the interpretable machine learning systems often needed for real-world use cases.
  • themis-ml - themis-ml is a Python library built on top of pandas and sklearn that implements fairness-aware machine learning algorithms.
  • AI Fairness 360 - A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.
  • casme - Example of using classifier-agnostic saliency map extraction on ImageNet, presented in the paper "Classifier-agnostic saliency map extraction".
  • ContrastiveExplanation (Foil Trees) - Python script for model agnostic contrastive/counterfactual explanations for machine learning. Accompanying code for the paper "Contrastive Explanations with Local Foil Trees".
  • Microsoft InterpretML - InterpretML is an open-source package for training interpretable models and explaining blackbox systems.
  • DeepVis Toolbox - This is the code required to run the Deep Visualization Toolbox, as well as to generate the neuron-by-neuron visualizations using regularized optimization. The toolbox and methods are described casually here and more formally in this paper.
  • FairML - FairML is a Python toolbox for auditing machine learning models for bias.
  • fairness - This repository is meant to facilitate the benchmarking of fairness aware machine learning algorithms based on this paper.
  • Integrated-Gradients - This repository provides code for implementing integrated gradients for networks with image inputs.
  • iNNvestigate - An open-source library for analyzing Keras models visually by methods such as DeepTaylor-Decomposition, PatternNet, Saliency Maps, and Integrated Gradients.
  • LOFO Importance - LOFO (Leave One Feature Out) Importance calculates the importances of a set of features based on a metric of choice, for a model of choice, by iteratively removing each feature from the set, and evaluating the performance of the model, with a validation scheme of choice, based on the chosen metric.
  • L2X - Code for replicating the experiments in the paper "Learning to Explain: An Information-Theoretic Perspective on Model Interpretation" at ICML 2018
  • Aequitas - An open-source bias audit toolkit for data scientists, machine learning researchers, and policymakers to audit machine learning models for discrimination and bias, and to make informed and equitable decisions around developing and deploying predictive risk-assessment tools.
  • pyBreakDown - A model agnostic tool for decomposition of predictions from black boxes. Break Down Table shows contributions of every variable to a final prediction.
  • rationale - Code to implement learning rationales behind predictions with code for paper "Rationalizing Neural Predictions"
  • TensorFlow's cleverhans - An adversarial example library for constructing attacks, building defenses, and benchmarking both. A Python library to benchmark a machine learning system's vulnerability to adversarial examples.
  • tensorflow's lucid - Lucid is a collection of infrastructure and tools for research in neural network interpretability.
  • tensorflow's Model Analysis - TensorFlow Model Analysis (TFMA) is a library for evaluating TensorFlow models. It allows users to evaluate their models on large amounts of data in a distributed manner, using the same metrics defined in their trainer.
  • Tensorboard's Tensorboard WhatIf - Tensorboard screen to analyse the interactions between inference results and data inputs.
  • Themis - Themis is a testing-based approach for measuring discrimination in a software system.
  • anchor - Code for the paper "High precision model agnostic explanations", a model-agnostic system that explains the behaviour of complex models with high-precision rules called anchors.
  • woe - Tools for WoE Transformation mostly used in ScoreCard Model for credit rating
  • responsibly - Toolkit for auditing and mitigating bias and fairness of machine learning systems
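The leave-one-feature-out idea described in the LOFO Importance entry above can be sketched in a few lines. This is a minimal illustration and not the LOFO library's API: `lofo_importance`, `toy_score` and `TOY_GAIN` are hypothetical names, and `toy_score` merely stands in for "train the model of choice and evaluate it with the validation scheme and metric of choice".

```python
from typing import Callable, Dict, Sequence


def lofo_importance(
    features: Sequence[str],
    score_fn: Callable[[Sequence[str]], float],
) -> Dict[str, float]:
    """Importance of a feature = score with all features - score without it."""
    baseline = score_fn(list(features))
    importances = {}
    for feature in features:
        # Drop exactly one feature and re-evaluate on the remaining ones.
        reduced = [f for f in features if f != feature]
        importances[feature] = baseline - score_fn(reduced)
    return importances


# Hypothetical stand-in for a real train-and-validate step: every feature
# contributes a fixed amount to the validation score.
TOY_GAIN = {"age": 0.25, "income": 0.5, "noise": 0.0}


def toy_score(selected: Sequence[str]) -> float:
    return sum(TOY_GAIN[name] for name in selected)


importances = lofo_importance(list(TOY_GAIN), toy_score)
# Rank features from most to least important.
ranked = sorted(importances, key=importances.get, reverse=True)
print(ranked)
```

In practice `score_fn` would wrap a cross-validated model fit, so a feature whose removal does not lower the score (or even raises it) is a candidate for dropping.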

Privacy Preserving Machine Learning

  • Tensorflow Privacy - A Python library that includes implementations of TensorFlow optimizers for training machine learning models with differential privacy.
  • TF-Encrypted - A Python library built on top of TensorFlow for researchers and practitioners to experiment with privacy-preserving machine learning.
  • PySyft - A Python library for secure, private Deep Learning. PySyft decouples private data from model training, using Multi-Party Computation (MPC) within PyTorch.
  • Uber SQL Differential Privacy - Uber's open source framework that enforces differential privacy for general-purpose SQL queries.
  • Intel Homomorphic Encryption Backend - The Intel HE transformer for nGraph is a Homomorphic Encryption (HE) backend to the Intel nGraph Compiler, Intel's graph compiler for Artificial Neural Networks.

Model and Data Versioning

  • DAGsHub - The home for data science collaboration. A platform, based on DVC, for data science project management and collaboration.
  • Data Version Control (DVC) - A version control tool for data and models that works alongside Git
  • ModelDB - Framework to track all the steps in your ML code to keep track of what version of your model obtained which accuracy, and then visualise it and query it via the UI
  • Pachyderm - Open source distributed processing framework built on Kubernetes, focused mainly on dynamic building of production machine learning pipelines - (Video)
  • steppy - Lightweight, Python3 library for fast and reproducible machine learning experimentation. Introduces simple interface that enables clean machine learning pipeline design.
  • Quilt Data - Versioning, reproducibility and deployment of data and models.
  • ModelChimp - Framework to track and compare all the results and parameters from machine learning models (Video)
  • PredictionIO - An open source Machine Learning Server built on top of a state-of-the-art open source stack for developers and data scientists to create predictive engines for any machine learning task
  • MLflow - Open source platform to manage the ML lifecycle, including experimentation, reproducibility and deployment.
  • Sacred - Tool to help you configure, organize, log and reproduce machine learning experiments.
  • Catalyst - High-level utils for PyTorch DL & RL research. It was developed with a focus on reproducibility, fast experimentation and code/ideas reusing.
  • FGLab - Machine learning dashboard, designed to make prototyping experiments easier.
  • Studio.ML - Model management framework which minimizes the overhead involved with scheduling, running, monitoring and managing artifacts of your machine learning experiments.
  • Flor - Easy to use logger and automatic version controller made for data scientists who write ML code
  • D6tflow - A python library that allows for building complex data science workflows on Python.
  • TRAINS - Auto-Magical Experiment Manager & Version Control for AI.
  • Kedro - Kedro is a workflow development tool that helps you build data pipelines that are robust, scalable, deployable, reproducible and versioned.
  • MLWatcher - MLWatcher is a Python agent that records a large variety of time-series metrics of your running ML classification algorithm. It enables you to monitor in real time.

Model Deployment and Orchestration Frameworks

  • Seldon - Open source platform for deploying and monitoring machine learning models in kubernetes - (Video)
  • Redis-ML - Module available from unstable branch that supports a subset of ML models as Redis data types
  • Model Server for Apache MXNet (MMS) - A model server for Apache MXNet from Amazon Web Services that is able to run MXNet models as well as Gluon models (Amazon's SageMaker runs a custom version of MMS under the hood)
  • Tensorflow Serving - High-performance framework to serve TensorFlow models via the gRPC protocol, able to handle 100k requests per second per core
  • Clipper - Model server project from Berkeley's RISE Lab which includes a standard RESTful API and supports TensorFlow, Scikit-learn and Caffe models
  • DeepDetect - Machine Learning production server for TensorFlow, XGBoost and Caffe models written in C++ and maintained by Jolibrain
  • MLeap - Standardisation of pipeline and model serialization for Spark, Tensorflow and sklearn
  • OpenScoring - REST web service for scoring PMML models built and maintained by
  • Open Platform for AI - Platform that provides complete AI model training and resource management capabilities.
  • NVIDIA TensorRT - TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.
  • NVIDIA TensorRT Inference Server - TensorRT Inference Server is an inference microservice that lets you serve deep learning models in production while maximizing GPU utilization.
  • Kubeflow - A cloud native platform for machine learning based on Google’s internal machine learning pipelines.
  • Polyaxon - A platform for reproducible and scalable machine learning and deep learning on kubernetes. - (Video)
  • Ray - Ray is a flexible, high-performance distributed execution framework for machine learning (VIDEO)

Adversarial Robustness Libraries

  • CleverHans - library for testing adversarial attacks / defenses maintained by some of the most important names in adversarial ML, namely Ian Goodfellow (ex-Google Brain, now Apple) and Nicolas Papernot (Google Brain). Comes with some nice tutorials!
  • IBM Adversarial Robustness Toolbox (ART) - at the time of writing this is the most complete off-the-shelf resource for testing adversarial attacks and defenses. It includes a library of 15 attacks, 10 empirical defenses, and some nice evaluation metrics. Neural networks only.
  • Foolbox - second biggest adversarial library. Has an even longer list of attacks - but no defenses or evaluation metrics. Geared more towards computer vision. Code easier to understand / modify than ART - also better for exploring blackbox attacks on surrogate models.
  • AdvBox - generate adversarial examples from the command line with 0 coding using PaddlePaddle, PyTorch, Caffe2, MxNet, Keras, and TensorFlow. Includes 10 attacks and also 6 defenses. Used to implement StealthTshirt at DEFCON!
  • EvadeML - benchmarking and visualization tool for adversarial ML maintained by Weilin Xu, a PhD at University of Virginia, working with David Evans. Has a tutorial on re-implementation of one of the most important adversarial defense papers - feature squeezing (same team).
  • Adversarial DNN Playground - think TensorFlow Playground, but for Adversarial Examples! A visualization tool designed for learning and teaching - the attack library is limited in size, but it has a nice front-end to it with buttons you can press!
  • AdverTorch - library for adversarial attacks / defenses specifically for PyTorch.
  • TextFool - plausible looking adversarial examples for text generation.
  • Artificial Adversary - Airbnb's library to generate text that reads the same to a human but passes adversarial classifiers.
  • DEEPSEC - another systematic tool for attacking and defending deep learning models.
  • MIA - A library for running membership inference attacks (MIA) against machine learning models.
  • Trickster - Library and experiments for attacking machine learning in discrete domains using graph search.
  • Nicholas Carlini's Adversarial ML reading list - not a library, but a curated list of the most important adversarial papers by one of the leading minds in Adversarial ML, Nicholas Carlini. If you want to discover the 10 papers that matter the most - I would start here.
  • Robust ML - another robustness resource maintained by some of the leading names in adversarial ML. They specifically focus on defenses, and ones that have published code available next to papers. Practical and useful.

Neural Architecture Search

Data Science Notebook Frameworks

  • Jupyter Notebooks - Web interface python sandbox environments for reproducible development
  • Stencila - Stencila is a platform for creating, collaborating on, and sharing data driven content. Content that is transparent and reproducible.
  • RMarkdown - The rmarkdown package is a next generation implementation of R Markdown based on Pandoc.
  • Hydrogen - A plugin for Atom that turns it into a Jupyter-notebook-like interface that prints the outputs directly in the editor.
  • H2O Flow - Jupyter notebook-like interface for H2O to create, save and re-use "flows"

Industrial Strength Visualisation libraries

  • Plotly Dash - Dash is a Python framework for building analytical web applications without the need to write javascript.
  • PDPBox - This repository is inspired by ICEbox. The goal is to visualize the impact of certain features on model prediction for any supervised learning algorithm. (now supports all scikit-learn algorithms)
  • PyCEbox - Python Individual Conditional Expectation Plot Toolbox
  • - An interactive, open source, and browser-based graphing library for Python.
  • Pixiedust - PixieDust is a productivity tool for Python or Scala notebooks, which lets a developer encapsulate business logic into something easy for your customers to consume.
  • ggplot2 - An implementation of the grammar of graphics for R.
  • seaborn - Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.
  • Bokeh - Bokeh is an interactive visualization library for Python that enables beautiful and meaningful visual presentation of data in modern web browsers.
  • matplotlib - A Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.
  • pygal - pygal is a dynamic SVG charting library written in python
  • Geoplotlib - geoplotlib is a python toolbox for visualizing geographical data and making maps
  • Missingno - missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities that allows you to get a quick visual summary of the completeness (or lack thereof) of your dataset.
  • XKCD-style plots - An XKCD theme for matplotlib visualisations
  • yellowbrick - yellowbrick provides matplotlib-based model evaluation plots for scikit-learn and other machine learning libraries.

Industrial Strength NLP

  • SpaCy - Industrial-strength natural language processing library built with python and cython by the team.
  • Flair - Simple framework for state-of-the-art NLP developed by Zalando which builds directly on PyTorch.
  • Wav2Letter++ - A speech to text system developed by Facebook's FAIR teams.
  • GNES - Generic Neural Elastic Search is a cloud-native semantic search system based on deep neural networks.

Data Pipeline ETL Frameworks

  • Apache Airflow - Data Pipeline framework built in Python, including scheduler, DAG definition and a UI for visualisation
  • Azkaban - Azkaban is a batch workflow job scheduler created at LinkedIn to run Hadoop jobs. Azkaban resolves the ordering through job dependencies and provides an easy to use web user interface to maintain and track your workflows.
  • Luigi - Luigi is a Python module that helps you build complex pipelines of batch jobs, handling dependency resolution, workflow management, visualisation, etc
  • Genie - Job orchestration engine to interface and trigger the execution of jobs from Hadoop-based systems
  • Oozie - Workflow scheduler for Hadoop jobs
  • Apache Nifi - Apache NiFi was made for dataflow. It supports highly configurable directed graphs of data routing, transformation, and system mediation logic.
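Several of the schedulers above (Azkaban, Luigi, Airflow) resolve the execution order of jobs from their declared dependencies. The core of that step is a topological sort of the dependency graph, which the sketch below illustrates with Python's standard-library `graphlib` (Python 3.9+). It is not how any of these tools is implemented internally; the job names and the `execution_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each job maps to the set of jobs it depends on.
JOBS = {
    "extract": set(),
    "clean": {"extract"},
    "features": {"clean"},
    "train": {"features"},
    "report": {"train", "clean"},
}


def execution_order(jobs):
    """Return the jobs in an order where every dependency runs first."""
    # static_order() raises graphlib.CycleError if the dependencies are cyclic.
    return list(TopologicalSorter(jobs).static_order())


order = execution_order(JOBS)
print(order)
```

A real scheduler adds retries, parallel execution of independent jobs and state tracking on top of this ordering step.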

Data Labelling Tools and Frameworks

  • Labelimg - Open source graphical image annotation tool written in Python using Qt for the graphical interface, focusing primarily on bounding boxes.
  • Computer Vision Annotation Tool (CVAT) - OpenCV's web-based annotation tool for both videos and images for computer vision algorithms.
  • Visual Object Tagging Tool (VOTT) - Microsoft's Open Source electron app for labelling videos and images for object detection models (with active learning functionality)
  • Label Studio - Multi-domain data labeling and annotation tool with standardized output format
  • Labelbox - Open source image labelling tool with support for semantic segmentation (brush & superpixels), bounding boxes and nested classifications.
  • Doccano - Open source text annotation tools for humans, providing functionality for sentiment analysis, named entity recognition, and machine translation.
  • ImgLab - Image annotation tool for bounding boxes with auto-suggestion and extensibility for plugins.
  • ImageTagger - Image labelling tool with support for collaboration, supporting bounding box, polygon, line, point labelling, label export, etc.
  • PixelAnnotationTool - Image annotation tool with ability to "colour" on the images to select labels for segmentation. Process is semi-automated with the watershed marked algorithm of OpenCV
  • OpenLabeling - Open source tool for labelling images with support for labels, edges, as well as image resizing and zooming in.
  • Semantic Segmentation Editor - Hitachi's Open source tool for labelling camera and LIDAR data.

Data Storage Optimisation

  • EdgeDB - NoSQL interface for Postgres that allows for object interaction to data stored
  • BayesDB - Database that allows for built-in non-parametric Bayesian model discovery and querying for data on a database-like interface - (Video)
  • Apache Arrow - In-memory columnar representation of data compatible with Pandas, Hadoop-based systems, etc
  • Apache Parquet - On-disk columnar representation of data compatible with Pandas, Hadoop-based systems, etc
  • Apache Kafka - Distributed streaming platform framework
  • ClickHouse - ClickHouse is an open source column oriented database management system supported by Yandex - (Video)
  • Alluxio - A virtual distributed storage system that bridges the gap between computation frameworks and storage systems.

Function as a Service Frameworks

  • OpenFaaS - Serverless functions framework with RESTful API on Kubernetes
  • Fission - (Early Alpha) Serverless functions as a service framework on Kubernetes
  • Hydrosphere ML Lambda - Open source model management cluster for deploying, serving and monitoring machine learning models and ad-hoc algorithms with a FaaS architecture
  • Hydrosphere Mist - Serverless proxy for Apache Spark clusters
  • Apache OpenWhisk - Open source, distributed serverless platform that executes functions in response to events at any scale.
  • KNative Serving - Kubernetes based serverless microservices with "scale-to-zero" functionality.

Computation load distribution frameworks

  • Hadoop Open Platform-as-a-service (HOPS) - A multi-tenant open source framework with a RESTful API for data science on Hadoop. It supports Spark and TensorFlow/Keras, is Python-first, and provides a lot of features
  • PyWren - Answer the question of the "cloud button" for python function execution. It's a framework that abstracts AWS Lambda to enable data scientists to execute any Python function - (Video)
  • NumPyWren - Scientific computing framework built on top of pywren to enable numpy-like distributed computations
  • BigDL - Deep learning framework on top of Spark/Hadoop to distribute data and computations across an HDFS system
  • Horovod - Uber's distributed training framework for TensorFlow, Keras, and PyTorch
  • Apache Spark MLlib - Apache Spark's scalable machine learning library in Java, Scala, Python and R
  • Dask - Distributed parallel processing framework for Pandas and NumPy computations - (Video)

Model serialisation formats

  • ONNX - Open Neural Network Exchange Format
  • Neural Network Exchange Format (NNEF) - A standard format to store models across Torch, Caffe, TensorFlow, Theano, Chainer, Caffe2, PyTorch, and MXNet
  • PFA - Created by the same organisation as PMML, the Portable Format for Analytics is an emerging standard for statistical models and data transformation engines.
  • PMML - The Predictive Model Markup Language standard in XML - (Video)
  • MMdnn - Cross-framework solution to convert, visualize and diagnose deep neural network models.
  • Java PMML API - Java libraries for consuming and producing PMML files containing models from different frameworks, including:

Compiler optimisation frameworks

  • Numba - A compiler for Python array and numerical functions

Data Stream Processing

  • Apache Flink - Open source stream processing framework with powerful stream and batch processing capabilities.
  • Faust - Streaming library built on top of Python's Asyncio library using the async kafka client inspired by the kafka streaming library.
  • Kafka Streams - Kafka client library for building applications and microservices where the input and output are stored in Kafka clusters
  • Spark Streaming - Micro-batch processing for streams using the apache spark framework as a backend supporting stateful exactly-once semantics
  • Apache Samza - Distributed stream processing framework. It uses Apache Kafka for messaging, and Apache Hadoop YARN to provide fault tolerance, processor isolation, security, and resource management.
  • Brooklin - Distributed system from LinkedIn for reliable, near-real-time streaming of data between heterogeneous source and destination systems.

Feature Engineering Automation

  • auto-sklearn - Framework to automate algorithm and hyperparameter tuning for sklearn
  • TPOT - Automation of sklearn pipeline creation (including feature selection, pre-processor, etc)
  • tsfresh - Automatic extraction of relevant features from time series
  • Featuretools - An open source framework for automated feature engineering
  • Colombus - A scalable framework to perform exploratory feature selection implemented in R
  • AutoML-GS - Automatic feature and model search with code generation in Python, on top of common data science libraries (tensorflow, sklearn, etc)
  • automl - Automated feature engineering, feature/model selection, hyperparam. optimisation
  • Feature Engine - Feature-engine is a Python library that contains several transformers to engineer features for use in machine learning models.

Commercial Platforms

  • - An end-to-end platform to manage, build and automate machine learning
  • - Machine learning experiment management. Free for open source and students (Video)
  • Skytree 16.0 - End to end machine learning platform (Video)
  • Algorithmia - Cloud platform to build, deploy and serve machine learning models (Video)
  • y-hat - Deployment, updating and monitoring of predictive models in multiple languages (Video)
  • Amazon SageMaker - End-to-end machine learning development and deployment interface where you are able to build notebooks that use EC2 instances as backend, and then can host models exposed on an API
  • Google Cloud Machine Learning Engine - Managed service that enables developers and data scientists to build and bring machine learning models to production.
  • Microsoft Azure Machine Learning service - Build, train, and deploy models from the cloud to the edge.
  • IBM Watson Machine Learning - Create, train, and deploy self-learning models using an automated, collaborative workflow.
  • - community-friendly platform supporting data scientists in creating and sharing machine learning models. Neptune facilitates teamwork, infrastructure management, models comparison and reproducibility.
  • Datmo - Workflow tools for monitoring your deployed models to experiment and optimize models in production.
  • Valohai - Machine orchestration, version control and pipeline management for deep learning.
  • Dataiku - Collaborative data science platform powering both self-service analytics and the operationalization of machine learning models in production.
  • MCenter - MLOps platform automates the deployment, ongoing optimization, and governance of machine learning applications in production.
  • Skafos - Skafos platform bridges the gap between data science, devops and engineering; continuous deployment, automation and monitoring.
  • SKIL - Software distribution designed to help enterprise IT teams manage, deploy, and retrain machine learning models at scale.
  • MLJAR - Platform for rapid prototyping, developing and deploying machine learning models.
  • MissingLink - MissingLink helps data engineers streamline and automate the entire deep learning lifecycle.
  • DataRobot - Automated machine learning platform which enables users to build and deploy machine learning models.
  • RiseML - Machine Learning Platform for Kubernetes: RiseML simplifies running machine learning experiments on bare metal and cloud GPU clusters of any size.
  • Datatron - Machine Learning Model Governance Platform for all your AI models in production for large Enterprises.
  • Talend Studio