A curated list of articles, papers and tools for building and deploying machine learning models, also known as machine learning engineering.
- Where to start
- Data
- Best practice
- Example pipelines
- Conference tracks and workshops
- Big data on a single machine / on the command line
- Software
- Related awesome lists
- The Unreasonable Effectiveness of Data
- Revisiting the Unreasonable Effectiveness of Data
- Why you need to improve your training data, and how to do it
- Rules of Machine Learning: Best Practices for ML Engineering
- What’s your ML test score? A rubric for ML production systems
- Machine Learning: The High Interest Credit Card of Technical Debt
- Introducing the Facebook Field Guide to Machine Learning video series
- Patterns for Research in Machine Learning
- Production Data Science
- Making Netflix Machine Learning Algorithms Reliable
- Scaling Knowledge at Airbnb
- Ad Click Prediction: a View from the Trenches
- Learning a Personalized Homepage
- Distributed Time Travel for Feature Generation
- Reliable Machine Learning in the Wild NIPS 2016 workshop
- Reliable Machine Learning in the Wild ICML 2017 workshop
- KDD 2017 Applied Data Science
- KDD 2018 Applied Data Science
- ECMLPKDD 2016 Industrial track
- ECMLPKDD 2017 Applied Data Science track
- ECMLPKDD 2018
- WWW 2018 Industry track
- Command-line Tools can be 235x Faster than your Hadoop Cluster
- Big Data, Small Machine
- Dask
- Unix for poets
- Data Science at the Command Line
- Data hacks command line utilities
- Split command
- Parallel command
- Xargs command parallel flag
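
The split and parallel commands listed above follow a common pattern: cut the input into chunks, process each chunk independently, then merge the partial results. Below is a minimal Python sketch of that pattern using only the standard library. The function names (`split_lines`, `count_first_field`, `parallel_count`) and the sample data are hypothetical and chosen for illustration; they do not come from any of the linked articles. Threads are used here for portability, so a CPU-bound job would likely need a process pool instead to see a real speed-up.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def split_lines(lines, chunk_size):
    """Yield lists of at most chunk_size items, similar in spirit to `split -l`."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def count_first_field(chunk):
    """Count occurrences of the first whitespace-separated field in one chunk."""
    counts = Counter()
    for line in chunk:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


def parallel_count(lines, chunk_size=2, workers=4):
    """Process chunks concurrently and merge the partial counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map submits every chunk up front, then yields results in order.
        for partial in pool.map(count_first_field, split_lines(lines, chunk_size)):
            total.update(partial)
    return total


# Hypothetical sample input: one record per line, key in the first field.
sample = ["a 1", "b 2", "a 3", "", "c 4", "a 5"]
result = parallel_count(sample)
print(sorted(result.items()))
```

Because each chunk is counted independently and the merge is a simple sum, the same structure maps onto `split` followed by `parallel` or `xargs` on the command line, with one worker per chunk file.
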
- kubeflow: Machine Learning Toolkit for Kubernetes (kubeflow)
- ModelDB: A system to manage machine learning models (MIT)
- mlflow: Open source platform for the complete machine learning lifecycle (Databricks)
- datmo: Open source model tracking tool for data scientists
- Luigi is a Python module that helps you build complex pipelines of batch jobs. (Spotify)
- Airflow is a platform to programmatically author, schedule, and monitor workflows (Airbnb)
- Azkaban workflow manager (LinkedIn)
- Pinball is a scalable workflow manager (Pinterest)
- TensorFlow Serving: A flexible, high-performance serving system for machine learning models (Google)
- deepdetect: Deep Learning API and Server in C++11 with Python bindings and support for Caffe, Tensorflow, XGBoost and TSNE (deepdetect)
- clipper: A low-latency prediction-serving system (Berkeley)
- MLeap: Deploy Spark Pipelines to Production (combust.ml)
- openscoring: REST web service for the true real-time scoring (<1 ms) of R, Scikit-Learn and Apache Spark models (openscoring)
- mxnet-model-server: Model Server for Apache MXNet is a tool for serving neural net models for inference (AWS)
- hydro-serving: ML FaaS - Machine Learning Serving cluster (hydrosphere.io)
- Predictive Model Markup Language (PMML)
- jpmml-sklearn: Java library and command-line application for converting Scikit-Learn pipelines to PMML
- sklearn2pmml: Python library for converting Scikit-Learn pipelines to PMML
- sklearn-porter: Transpile trained scikit-learn estimators to C, Java, JavaScript and others
- Knowledge Repo: A next-generation curated knowledge sharing platform for data scientists and other technical professions.
- AWS Data Pipeline "is a web service that you can use to automate the movement and transformation of data"
- AWS Glue "is a fully managed ETL (extract, transform, and load) service"
- Amazon Simple Workflow Service "makes it easy to build applications that coordinate work across distributed components"
- AWS Batch "enables you to run batch computing workloads on the AWS Cloud"
- Amazon Machine Learning "cloud-based service that makes it easy for developers of all skill levels to use machine learning technology"
- Amazon SageMaker "is a fully managed machine learning service"
- Google Cloud Dataflow "is a unified programming model and a managed service for developing and executing a wide variety of data processing patterns"
- Google Cloud ML Engine "brings the power and flexibility of TensorFlow, scikit-learn and XGBoost to the cloud"
- Azure Batch AI "helps you experiment with your AI models using any framework and then train them at scale across GPU and CPU clusters"
- Azure Machine Learning services "enable building, deploying, and managing machine learning and AI models using any Python tools and libraries"
- Azure Machine Learning Studio "is a collaborative, drag-and-drop tool you can use to build, test, and deploy predictive analytics solutions on your data"