This notebook shows how to use [DVC](https://dvc.org/) [experiments](https://github.com/iterative/dvc/wiki/Experiments) in model development. The example uses the [MNIST](http://yann.lecun.com/exdb/mnist/) dataset of handwritten digits and builds a classification model to predict the digit (0-9) in each image. The model is built in [PyTorch](https://pytorch.org/) as a convolutional neural network with a simplified architecture, so it should run quickly on most computers.

### Getting started

To get started, clone this repository and navigate to it.

The only other prerequisite is [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/). Once conda is installed, create a virtual environment from the existing `environment.yml` file and activate it:

```bash
conda env create -f environment.yml
conda activate dvc
```

If you want to run this notebook directly, do so after activating the conda environment.

### Establishing the pipeline DAG

Before experimenting, a DVC pipeline must be established (see the docs if you are new to DVC). Review the contents of `dvc.yaml` below to see the pipeline.

In [1]:
%%bash
cat dvc.yaml

stages:
  download:
    cmd: python download.py
    deps:
    - download.py
    outs:
    - data/MNIST
  train:
    cmd: python train.py --model_path=model.pt --metrics_path=metrics.yaml
    deps:
    - data/MNIST
    - train.py
    params:
    - lr
    - weight_decay
    outs:
    - model.pt
    metrics:
    - metrics.yaml
  train_checkpoint:
    cmd: python train.py --model_path=model_checkpoint.pt --metrics_path=metrics_checkpoint.yaml --checkpoint=5
    deps:
    - data/MNIST
    - train.py
    params:
    - lr
    - weight_decay
    outs:
    - model_checkpoint.pt:
        checkpoint: true
    metrics:
    - metrics_checkpoint.yaml


The `download` stage gets the data using the `download.py` script. The `train` stage trains and evaluates the model on the downloaded data using the `train.py` script. It uses the `lr` and `weight_decay` parameters defined in `params.yaml`. The model is saved to `model.pt`, and the metrics are saved to `metrics.yaml`. The `train_checkpoint` stage is similar but saves its output periodically.
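
The stage ordering described above follows from matching each stage's `deps` against other stages' outputs. The sketch below illustrates that idea in plain Python; it is not DVC's implementation. The `STAGES` dictionary and the `stage_order` helper are hypothetical names, the `train_checkpoint` stage is left out for brevity, and `metrics.yaml` is treated as an ordinary output.

```python
from graphlib import TopologicalSorter

# Dependencies and outputs copied from the dvc.yaml shown above.
# Assumption: metrics files are treated like any other output here.
STAGES = {
    "download": {"deps": ["download.py"], "outs": ["data/MNIST"]},
    "train": {
        "deps": ["data/MNIST", "train.py"],
        "outs": ["model.pt", "metrics.yaml"],
    },
}


def stage_order(stages):
    """Return stage names so that producers come before their consumers."""
    # Map every output path to the stage that produces it.
    producer = {out: name for name, spec in stages.items() for out in spec["outs"]}
    # A stage depends on another stage when one of its deps is that stage's output.
    graph = {
        name: {producer[dep] for dep in spec["deps"] if dep in producer}
        for name, spec in stages.items()
    }
    return list(TopologicalSorter(graph).static_order())


print(stage_order(STAGES))  # ['download', 'train']
```

Because `train` lists `data/MNIST` as a dependency and `download` lists it as an output, `download` has to run first.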

Execute the pipeline to reproduce the train stage:

In [2]:
%%bash
dvc repro train

Running stage 'download' with command:
	python download.py
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting data/MNIST/raw/train-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz to data/MNIST/raw
Processing...
Done!
Generating lock file 'dvc.lock'
Updating lock file 'dvc.lock'

Running stage 'train' with command:
	python train.py --model_path=model.pt --metrics_path=metrics.yaml
Updating 


**IMPORTANT:** Be sure to `git add` and `git commit` the changes, as in the next cell, before running experiments. Any time you modify the pipeline, rerun `dvc repro` and track the changes with git before running experiments.

In [3]:
%%bash
git add data/.gitignore dvc.lock
git commit -m "download data"

[checkpoint 33caf3c] download data
 2 files changed, 33 insertions(+)
 create mode 100644 data/.gitignore
 create mode 100644 dvc.lock


### Run an experiment

Run an experiment with the default parameters defined in `params.yaml`.

In [4]:
%%bash
dvc exp run train

Stage 'download' didn't change, skipping
Stage 'train' didn't change, skipping
Reproduced experiment '33caf3c'.
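
The "didn't change, skipping" lines above appear because the stage's inputs still match what was recorded in `dvc.lock` during `dvc repro`. The sketch below shows the general idea of hash-based change detection in a heavily simplified form. The helper names `file_md5` and `stage_changed` are hypothetical, and the real check also covers things this sketch ignores, such as parameters, commands, and directories.

```python
import hashlib
import tempfile
from pathlib import Path


def file_md5(path):
    """Return the MD5 hex digest of a file's bytes."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def stage_changed(deps, recorded):
    """Return True if any dependency's current hash differs from the recorded one."""
    return any(file_md5(dep) != recorded.get(dep) for dep in deps)


# Demo on a throwaway file standing in for a dependency such as train.py.
with tempfile.TemporaryDirectory() as tmp:
    dep = Path(tmp) / "train.py"
    dep.write_text("print('v1')\n")
    recorded = {str(dep): file_md5(dep)}  # what a lock file would remember
    unchanged = stage_changed([str(dep)], recorded)
    dep.write_text("print('v2')\n")  # edit the dependency
    changed = stage_changed([str(dep)], recorded)

print(unchanged, changed)  # False True
```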


Review the output of the run, including identifying hashes, metrics, and parameters:

In [5]:
%%bash
dvc exp show

┏━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Experiment ┃ Created  ┃    acc ┃   loss ┃ lr    ┃ weight_decay ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━┩
│ workspace  │ -        │ 0.2895 │ 2.2134 │ 0.001 │ 0            │
│ checkpoint │ 04:58 PM │ 0.2895 │ 2.2134 │ 0.001 │ 0            │
└────────────┴──────────┴────────┴────────┴───────┴──────────────┘
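
The table pairs the metrics (`acc`, `loss`) with the parameters (`lr`, `weight_decay`) that produced them. As a rough illustration of how a training script could read such parameters and write such metrics, here is a minimal standard-library sketch. It assumes both files are flat `key: value` YAML, which the notebook does not confirm. The helpers `parse_flat_yaml` and `dump_flat_yaml` are hypothetical and are not taken from `train.py`; a real script would more likely use a YAML library.

```python
def parse_flat_yaml(text):
    """Parse 'key: value' lines into a dict of floats (flat files only)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, raw = line.partition(":")
        values[key.strip()] = float(raw)
    return values


def dump_flat_yaml(values):
    """Serialize a flat dict back into 'key: value' lines."""
    return "".join(f"{key}: {value}\n" for key, value in values.items())


# Parameter values as reported in the table above; the file layout is assumed.
params = parse_flat_yaml("lr: 0.001\nweight_decay: 0\n")

# Metric values copied from the table; a real script would compute them
# during evaluation instead of hard-coding them.
metrics = {"acc": 0.2895, "loss": 2.2134}

print(params)
print(dump_flat_yaml(metrics), end="")
```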


### Experimenting with different parameters

Experiments can be run and compared with different parameters.
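
To try several values of one parameter, the command lines can be generated rather than typed by hand. The sketch below only builds and prints the commands; it does not run them. It reuses the `dvc exp run <stage> --params name=value` form from the next cell, and that flag may be spelled differently in other DVC versions, so check `dvc exp run --help` for yours. The `sweep_commands` helper and the listed values are hypothetical.

```python
import shlex


def sweep_commands(stage, param, values):
    """Build one `dvc exp run` argument list per parameter value."""
    return [
        ["dvc", "exp", "run", stage, "--params", f"{param}={value}"]
        for value in values
    ]


# Hypothetical weight_decay values chosen only for illustration.
for cmd in sweep_commands("train", "weight_decay", [0.0, 0.01, 0.1]):
    # Print rather than execute; pass `cmd` to subprocess.run to launch it.
    print(shlex.join(cmd))
```

Keeping each command as an argument list avoids shell-quoting problems if it is later passed to `subprocess.run`.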

In [6]:
%%bash
dvc exp run train --params weight_decay=0.1

Stage 'download' didn't change, skipping


ERROR: Failed to reproduce experiment '9cf47aa' - failed to reproduce 'dvc.yaml': restore() got an unexpected keyword argument 'checkpoint_resume'
