ML Reproducibility with Open Source Tooling - AMLD 2018
This repo contains all of the materials you will need for the "Machine Learning Reproducibility with Open Source Tooling" workshop at Applied ML Days 2018. In this workshop, we will discuss why reproducibility and data provenance matter in any applied machine learning workflow. We will then implement a realistic machine learning workflow that emphasizes these points and uses open source tooling to overcome common reproducibility challenges.
- Reproducibility challenges
- Predictable application behavior with Docker
- Fully reproducible orchestration of ML workflows
- You will need to ssh into a cloud instance. Remind yourself of how to do that, and install an ssh client if you do not already have one.
- You will also need to work a bit at the command line. If you are new to the command line or need a refresher, look through this quick tutorial.