This notebook shows how to use [dvc](https://dvc.org/) [experiments](https://github.com/iterative/dvc/wiki/Experiments) in model development. This example uses the [MNIST](http://yann.lecun.com/exdb/mnist/) data of handwritten digits and builds a classification model to predict the digit (0-9) in each image. The model is built in [pytorch](https://pytorch.org/) as a convolutional neural network with a simplified architecture, which should be able to quickly run on most computers.

### Get started

To get started, clone this repository and navigate to it.

The only other prerequisite is [docker](https://www.docker.com). Once docker is installed, build a docker image from the existing `Dockerfile` and run it:

```bash
docker build -t dvc-exp-mnist .
docker run -p 8888:8888 $(docker images -q dvc-exp-mnist)
```

To run the notebook, navigate to the link provided in the output that starts with `http://127.0.0.1:8888/`.

Next, set up git:

In [1]:
email = input("Your email:")
username = input("Your name:")

Your email:dave@iterative.ai
Your name:dave


In [5]:
%%bash -s {email} {username}
git config --global user.email "$1"
git config --global user.name "$2"

Finally, initialize dvc and enable the experiments feature:

In [7]:
%%bash
dvc init -f
dvc config --global core.experiments true

Initialized DVC repository.

You can now commit the changes to git.

+---------------------------------------------------------------------+
|                                                                     |
|        DVC has enabled anonymous aggregate usage analytics.         |
|     Read the analytics documentation (and how to opt-out) here:     |
|             <https://dvc.org/doc/user-guide/analytics>              |
|                                                                     |
+---------------------------------------------------------------------+

What's next?
------------
- Check out the documentation: <https://dvc.org/doc>
- Get help and share ideas: <https://dvc.org/chat>
- Star us on GitHub: <https://github.com/iterative/dvc>


### Establish the pipeline DAG

Before experimenting, a dvc pipeline must be established (see the docs if you are new to dvc). Review the contents of `dvc.yaml` below to see the pipeline.

In [8]:
%%bash
cat dvc.yaml

stages:
  download:
    cmd: python download.py
    deps:
    - download.py
    outs:
    - data/MNIST
  train:
    cmd: python train.py
    deps:
    - data/MNIST
    - train.py
    params:
    - lr
    - weight_decay
    outs:
    - model.pt:
        checkpoint: true
    metrics:
    - metrics.yaml


The download stage gets the data using the `download.py` script. The train stage performs model training and evaluation on the downloaded data using the `train.py` script. The train stage uses the `lr` and `weight_decay` parameters defined in `params.yaml`. The model output is saved to `model.pt`, and the metrics are saved to `metrics.yaml`.
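
The two stages are linked only through `data/MNIST`: it is an output of download and a dependency of train. The sketch below illustrates that idea in plain Python. The `STAGES` dictionary is hand-copied from the `dvc.yaml` shown above, and `stage_order` is a hypothetical helper, not dvc's actual implementation.

```python
from graphlib import TopologicalSorter

# Hand-copied from the dvc.yaml above (only the fields that define the graph).
STAGES = {
    "download": {"deps": ["download.py"], "outs": ["data/MNIST"]},
    "train": {
        "deps": ["data/MNIST", "train.py"],
        "outs": ["model.pt", "metrics.yaml"],
    },
}


def stage_order(stages):
    """Return stage names so that producers come before their consumers."""
    # Map every output path to the stage that produces it.
    producer = {out: name for name, spec in stages.items() for out in spec["outs"]}
    # A stage depends on whichever stage produces one of its deps. Plain source
    # files such as train.py have no producer and therefore add no edge.
    graph = {
        name: {producer[dep] for dep in spec["deps"] if dep in producer}
        for name, spec in stages.items()
    }
    return list(TopologicalSorter(graph).static_order())


print(stage_order(STAGES))
```

This is why `dvc repro download` can be run on its own, while the train stage always needs the downloaded data first.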

Execute the download stage to get the data.

In [9]:
%%bash
dvc repro download

Running stage 'download':
> python download.py
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting data/MNIST/raw/train-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz to data/MNIST/raw
Processing...
Done!
Generating lock file 'dvc.lock'
Updating lock file 'dvc.lock'

To track the changes with git, run:

	git add dvc.lock
Use `dvc push` to send your updates to remote storage.



**IMPORTANT:** Be sure to run the `git add` command above and also `git commit` before running experiments. Anytime you modify the pipeline, be sure to `dvc repro` and track changes with git before running experiments.

In [10]:
%%bash
git add dvc.lock data/.gitignore
git commit -m "download data"

[2.0 513e35e] download data
 1 file changed, 13 insertions(+)
 create mode 100644 dvc.lock


### Run an experiment

Run an experiment with the default parameters defined in `params.yaml`.

In [11]:
%%bash
dvc exp run

Stage 'download' didn't change, skipping
Running stage 'train':
> python train.py
Updating lock file 'dvc.lock'
Checkpoint experiment iteration 'c78819d'.
Updating lock file 'dvc.lock'
Checkpoint experiment iteration '51075f8'.

To track the changes with git, run:

	git add data/MNIST params.yaml train.py dvc.lock dvc.yaml .gitignore

Reproduced experiment(s): exp-9f4b6
Experiment results have been applied to your workspace.

To promote an experiment to a Git branch run:

	dvc exp branch <exp>



Review the output of the run, including identifying hashes, metrics, and parameters:

In [12]:
%%bash
dvc exp show

┏━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Experiment    ┃ Created  ┃    acc ┃   loss ┃ lr    ┃ weight_decay ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━┩
│ workspace     │ -        │ 0.1269 │ 2.2674 │ 0.001 │ 0.1          │
│ 2.0           │ 08:37 PM │      - │      - │ 0.001 │ 0.1          │
│ │ ╓ exp-9f4b6 │ 08:38 PM │ 0.1269 │ 2.2674 │ 0.001 │ 0.1          │
│ │ ╟ 51075f8   │ 08:38 PM │ 0.1269 │ 2.2674 │ 0.001 │ 0.1          │
│ ├─╨ c78819d   │ 08:38 PM │ 0.0977 │ 2.2875 │ 0.001 │ 0.1          │
└───────────────┴──────────┴────────┴────────┴───────┴──────────────┘


Note that the run produced two checkpoint iterations (`c78819d` and `51075f8`) rather than two separate experiments. Checkpoints are optional, but they can be helpful for models that train for a number of epochs. See below for more information about how checkpoints work.
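
As a rough illustration of where checkpoints come from, the sketch below shows a training loop that signals a checkpoint once per epoch. The `train`, `run_epoch`, and `on_checkpoint` names are hypothetical and are not taken from this repository's `train.py`. In a real script the callback would save `model.pt` and `metrics.yaml` and then notify dvc; for the dvc 2.x releases this is assumed to be `dvc.api.make_checkpoint()`, so check the docs for your version.

```python
def train(epochs, run_epoch, on_checkpoint):
    """Run `epochs` training epochs, signalling a checkpoint after each one."""
    history = []
    for epoch in range(epochs):
        # run_epoch stands in for one pass over the training data.
        metrics = run_epoch(epoch)
        history.append(metrics)
        # Assumption: a real script would write model.pt and metrics.yaml
        # here and then tell dvc that a checkpoint is ready.
        on_checkpoint(epoch, metrics)
    return history


recorded = []
train(
    epochs=2,
    run_epoch=lambda epoch: {"epoch": epoch},  # placeholder, no real training
    on_checkpoint=lambda epoch, metrics: recorded.append(epoch),
)
print(len(recorded))
```

With two epochs the callback fires twice, which matches the two checkpoint iterations listed in the table above.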

### Experiment with different parameters

Experiments can be run and compared with different parameters.

In [None]:
%%bash
dvc exp run --params weight_decay=0.1

In [None]:
%%bash
dvc exp show

Increasing `weight_decay` helped a bit. Next, try different `lr` parameters.

Experiments can be added in bulk to the queue and executed on demand (see the `-j` flag for parallel execution!).
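
For more than a handful of values, the queue commands can be generated rather than typed. The sketch below only builds and prints the command lines; it does not run dvc. `queue_commands` is a hypothetical helper, and it uses the `--params` and `--queue` flags exactly as they appear in this notebook.

```python
import shlex


def queue_commands(name, values):
    """Build one `dvc exp run --queue` command per value of a parameter."""
    # Newer dvc releases spell this flag -S/--set-param; --params is kept
    # here to match the dvc version used in this notebook.
    return [
        ["dvc", "exp", "run", "--params", f"{name}={value}", "--queue"]
        for value in values
    ]


for cmd in queue_commands("lr", [0.01, 0.1]):
    print(shlex.join(cmd))
```

Each inner list can be passed to `subprocess.run` as-is once the printed commands look right.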

In [9]:
%%bash
dvc exp run --params lr=0.01 --queue
dvc exp run --params lr=0.1 --queue

Queued experiment '7f1d45d' for future execution.
Queued experiment '58f1dfe' for future execution.


In [10]:
%%bash
dvc exp show

┏━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Experiment    ┃ Created  ┃    acc ┃   loss ┃ lr    ┃ weight_decay ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━┩
│ workspace     │ -        │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ main          │ 11:19 AM │      - │      - │ 0.001 │ 0            │
│ │ ╓ exp-98b33 │ 11:21 AM │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ │ ╟ 715fc6e   │ 11:21 AM │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ ├─╨ 2b34f9f   │ 11:21 AM │ 0.1009 │ 2.3034 │ 0.001 │ 0.1          │
│ │ ╓ exp-ff372 │ 11:19 AM │ 0.1662 │ 2.2971 │ 0.001 │ 0            │
│ │ ╟ 8d76dda   │ 11:19 AM │ 0.1662 │ 2.2971 │ 0.001 │ 0            │
│ ├─╨ a9901cf   │ 11:19 AM │ 0.1118 │ 2.3028 │ 0.001 │ 0            │
│ ├── *58f1dfe  │ 11:21 AM │      - │      - │ 0.1   │ 0.1          │
│ └── *7f1d45d  │ 11:21 AM │      - │      - │ 0.01  │ 0.1          │
└───────────────┴──────────┴────────┴────────┴───────┴──────────────┘


In [11]:
%%bash
dvc exp run --run-all

Running stage 'download':
> python download.py
Stage 'download' didn't change, skipping
Running stage 'download':
> python download.py
Running stage 'train':
> python train.py
Stage 'download' didn't change, skipping
Running stage 'train':
> python train.py
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting data/MNIST/raw/train-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz to data/MNIST/

0it [00:00, ?it/s]0it [00:00, ?it/s]  0%|          | 0/9912422 [00:00<?, ?it/s]  1%|          | 90112/9912422 [00:00<00:11, 875940.93it/s]  0%|          | 16384/9912422 [00:00<01:27, 112853.95it/s]  3%|▎         | 278528/9912422 [00:00<00:06, 1458695.68it/s]  1%|          | 114688/9912422 [00:00<00:23, 425674.08it/s]  6%|▌         | 557056/9912422 [00:00<00:04, 2055465.70it/s]  2%|▏         | 163840/9912422 [00:00<00:23, 423773.94it/s]  9%|▊         | 843776/9912422 [00:00<00:05, 1783472.74it/s] 12%|█▏        | 1171456/9912422 [00:00<00:04, 1994029.17it/s]  4%|▎         | 352256/9912422 [00:00<00:15, 619652.52it/s]  7%|▋         | 712704/9912422 [00:00<00:07, 1282033.56it/s] 14%|█▍        | 1368064/9912422 [00:00<00:06, 1409601.33it/s]  9%|▊         | 860160/9912422 [00:01<00:09, 990862.97it/s]  17%|█▋        | 1720320/9912422 [00:01<00:04, 1756451.11it/s] 21%|██        | 2048000/9912422 [00:01<00:03, 2097923.85it/s] 12%|█▏        | 1212416/9912422 [00:01<00:06, 13898

In [12]:
%%bash
dvc exp show

┏━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Experiment    ┃ Created  ┃    acc ┃   loss ┃ lr    ┃ weight_decay ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━┩
│ workspace     │ -        │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ main          │ 11:19 AM │      - │      - │ 0.001 │ 0            │
│ │ ╓ exp-15d4f │ 11:23 AM │ 0.2012 │  2.071 │ 0.01  │ 0.1          │
│ ├─╨ 5fd4ddb   │ 11:22 AM │ 0.1914 │ 2.2331 │ 0.01  │ 0.1          │
│ │ ╓ exp-98b33 │ 11:21 AM │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ │ ╟ 715fc6e   │ 11:21 AM │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ ├─╨ 2b34f9f   │ 11:21 AM │ 0.1009 │ 2.3034 │ 0.001 │ 0.1          │
│ │ ╓ exp-ff372 │ 11:19 AM │ 0.1662 │ 2.2971 │ 0.001 │ 0            │
│ │ ╟ 8d76dda   │ 11:19 AM │ 0.1662 │ 2.2971 │ 0.001 │ 0            │
│ ├─╨ a9901cf   │ 11:19 AM │ 0.1118 │ 2.3028 │ 0.001 │ 0            │
│ └── *58f1dfe  │ 11:21 AM │      - │      - │ 0.1   │ 0.1          │
└───────────────┴───

**FIXME:** Running queued experiments does not run all queued experiments, at least if there are checkpoints in those experiments. In the example above, there are two checkpoints so that every set of parameters should be run twice. Since there were two different sets of parameters queued, four total experiment runs were expected, but only two ran. Nothing was run for the second set of parameters.

Let's try again to run the remaining queued experiments and to test that including the parallel execution flag doesn't break anything. Nothing should actually run in parallel here, since checkpoints need to be executed sequentially.

In [14]:
%%bash
dvc exp run --run-all -j 2

Running stage 'download':
> python download.py
Stage 'download' didn't change, skipping
Running stage 'train':
> python train.py
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting data/MNIST/raw/train-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz to data/MNIST/raw
Processing...
Done!
Migrating lock file 'dvc.lock' from v1 to v2
Updating lock file 'dvc.lock'

Running stage 'train':
> pyth


In [15]:
%%bash
dvc exp show

┏━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Experiment    ┃ Created  ┃    acc ┃   loss ┃ lr    ┃ weight_decay ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━┩
│ workspace     │ -        │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ main          │ 11:19 AM │      - │      - │ 0.001 │ 0            │
│ │ ╓ exp-e08c4 │ 11:27 AM │  0.101 │  2.303 │ 0.1   │ 0.1          │
│ ├─╨ 7251640   │ 11:26 AM │ 0.1135 │ 2.3026 │ 0.1   │ 0.1          │
│ │ ╓ exp-15d4f │ 11:23 AM │ 0.2012 │  2.071 │ 0.01  │ 0.1          │
│ ├─╨ 5fd4ddb   │ 11:22 AM │ 0.1914 │ 2.2331 │ 0.01  │ 0.1          │
│ │ ╓ exp-98b33 │ 11:21 AM │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ │ ╟ 715fc6e   │ 11:21 AM │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ ├─╨ 2b34f9f   │ 11:21 AM │ 0.1009 │ 2.3034 │ 0.001 │ 0.1          │
│ │ ╓ exp-ff372 │ 11:19 AM │ 0.1662 │ 2.2971 │ 0.001 │ 0            │
│ │ ╟ 8d76dda   │ 11:19 AM │ 0.1662 │ 2.2971 │ 0.001 │ 0            │
│ ├─╨ a9901cf   │ 11

### Iteratively train using checkpoints

Use checkpoints to periodically save the model during training (as shown above), and to resume training from a previously saved state. Resume training the experiment with the best accuracy.

In [16]:
%%bash
dvc exp show --sort-by acc --include-metrics acc

┏━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Experiment    ┃ Created  ┃    acc ┃ lr    ┃ weight_decay ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━┩
│ workspace     │ -        │ 0.1009 │ 0.001 │ 0.1          │
│ main          │ 11:19 AM │      - │ 0.001 │ 0            │
│ │ ╓ exp-98b33 │ 11:21 AM │ 0.1009 │ 0.001 │ 0.1          │
│ │ ╟ 715fc6e   │ 11:21 AM │ 0.1009 │ 0.001 │ 0.1          │
│ ├─╨ 2b34f9f   │ 11:21 AM │ 0.1009 │ 0.001 │ 0.1          │
│ │ ╓ exp-e08c4 │ 11:27 AM │  0.101 │ 0.1   │ 0.1          │
│ ├─╨ 7251640   │ 11:26 AM │ 0.1135 │ 0.1   │ 0.1          │
│ │ ╓ exp-ff372 │ 11:19 AM │ 0.1662 │ 0.001 │ 0            │
│ │ ╟ 8d76dda   │ 11:19 AM │ 0.1662 │ 0.001 │ 0            │
│ ├─╨ a9901cf   │ 11:19 AM │ 0.1118 │ 0.001 │ 0            │
│ │ ╓ exp-15d4f │ 11:23 AM │ 0.2012 │ 0.01  │ 0.1          │
│ ├─╨ 5fd4ddb   │ 11:22 AM │ 0.1914 │ 0.01  │ 0.1          │
└───────────────┴──────────┴────────┴───────┴──────────────┘
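
When reading the sorted table, it helps to check which end holds the best value. The sketch below does this in plain Python. The `ACC` dictionary is copied by hand from the `exp-` rows of the table above, and nothing here calls dvc.

```python
# Accuracy of each experiment tip, copied from the table above.
ACC = {
    "exp-98b33": 0.1009,
    "exp-e08c4": 0.101,
    "exp-ff372": 0.1662,
    "exp-15d4f": 0.2012,
}

# Same ordering as the table: ascending by accuracy.
ascending = sorted(ACC, key=ACC.get)
best = max(ACC, key=ACC.get)

print("top row:   ", ascending[0], ACC[ascending[0]])
print("bottom row:", ascending[-1], ACC[ascending[-1]])
print("best acc:  ", best)
```

In this table the rows are in ascending order, so the highest accuracy is in the last row rather than the first.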


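Apply the experiment in the top row, `exp-98b33`. Note that the table is sorted in ascending order, so this row has the lowest accuracy, not the highest.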

In [19]:
%%bash
dvc exp apply exp-98b333

ERROR: 'exp-98b333' does not appear to be an experiment commit.: unknown Git revision 'exp-98b333'


CalledProcessError: Command 'b'dvc exp apply exp-98b333\n'' returned non-zero exit status 255.

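**FIXME**: That didn't work, most likely because the name was mistyped: the table lists `exp-98b33`, not `exp-98b333`. Separately, what are the lines with `exp-` at the start? They seem to group experiment runs under the same set of parameters, but it's unclear how a user should interpret these lines.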

Let's try the last revision under those same parameters.

In [23]:
%%bash
dvc exp apply 715fc6e

Changes for experiment '715fc6e' have been applied to your current workspace.


Next, resume iterations from this experiment.

In [24]:
%%bash
dvc exp res -r 715fc6e

Stage 'download' didn't change, skipping
Running stage 'train':
> python train.py
Updating lock file 'dvc.lock'


ERROR: Error generating checkpoint, stage: 'train' will be aborted - Failed to set 'refs/exps/b9/f7373bf810065c85ac12f4af1f04a7dbae0b79/exp-98b33'


**FIXME**: Resuming experiment failed.

In [25]:
%%bash
dvc exp show

┏━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Experiment    ┃ Created  ┃    acc ┃   loss ┃ lr    ┃ weight_decay ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━┩
│ workspace     │ -        │ 0.1043 │ 2.3024 │ 0.001 │ 0.1          │
│ main          │ 11:19 AM │      - │      - │ 0.001 │ 0            │
│ │ ╓ exp-e08c4 │ 11:27 AM │  0.101 │  2.303 │ 0.1   │ 0.1          │
│ ├─╨ 7251640   │ 11:26 AM │ 0.1135 │ 2.3026 │ 0.1   │ 0.1          │
│ │ ╓ exp-15d4f │ 11:23 AM │ 0.2012 │  2.071 │ 0.01  │ 0.1          │
│ ├─╨ 5fd4ddb   │ 11:22 AM │ 0.1914 │ 2.2331 │ 0.01  │ 0.1          │
│ │ ╓ exp-98b33 │ 11:21 AM │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ │ ╟ 715fc6e   │ 11:21 AM │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ ├─╨ 2b34f9f   │ 11:21 AM │ 0.1009 │ 2.3034 │ 0.001 │ 0.1          │
│ │ ╓ exp-ff372 │ 11:19 AM │ 0.1662 │ 2.2971 │ 0.001 │ 0            │
│ │ ╟ 8d76dda   │ 11:19 AM │ 0.1662 │ 2.2971 │ 0.001 │ 0            │
│ ├─╨ a9901cf   │ 11

In [26]:
%%bash
cat params.yaml

lr: 0.001
weight_decay: 0.1


The parameters from `dvc exp apply 715fc6e` were checked out into the workspace.

Compare this experiment to the initial one run with the default parameters:

In [29]:
%%bash
dvc exp diff 715fc6e a9901cf

Path          Metric    Value    Change
metrics.yaml  acc       0.1118   0.0109
metrics.yaml  loss      2.3028   -8.8215e-06

Path         Param         Value    Change
params.yaml  weight_decay  0        -0.1
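
The diff output can be read as follows: `Value` is the value in the second revision (`a9901cf`), and `Change` is that value minus the value in the first revision (`715fc6e`). The sketch below reproduces this reading in plain Python. `exp_diff` is a hypothetical helper, and the numbers are copied from the `dvc exp show` tables above, so the rounded `loss` values are left out.

```python
def exp_diff(old, new):
    """Map each key whose value differs to (new value, new - old)."""
    return {
        key: (new[key], new[key] - old[key])
        for key in new
        if key in old and new[key] != old[key]
    }


# Values copied from the `dvc exp show` table above.
rev_715fc6e = {"acc": 0.1009, "lr": 0.001, "weight_decay": 0.1}
rev_a9901cf = {"acc": 0.1118, "lr": 0.001, "weight_decay": 0}

for key, (value, change) in exp_diff(rev_715fc6e, rev_a9901cf).items():
    print(f"{key:<14}{value:<8}{change:+.4f}")
```

Unchanged entries such as `lr` are omitted, which is also why `lr` does not appear in the parameter section of the output above.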



### Persist models

Additional epochs didn't improve accuracy, so commit the model iteration with peak accuracy. Use `dvc exp branch` to turn the experiment rev into a git branch.

In [30]:
%%bash
dvc exp branch 715fc6e 2.0

Git branch '2.0' has been created from experiment 'exp-98b33'.
To switch to the new branch run:

	git checkout 2.0


In [31]:
%%bash
dvc exp show

┏━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Experiment    ┃ Created  ┃    acc ┃   loss ┃ lr    ┃ weight_decay ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━┩
│ workspace     │ -        │ 0.1043 │ 2.3024 │ 0.001 │ 0.1          │
│ main          │ 11:19 AM │      - │      - │ 0.001 │ 0            │
│ │ ╓ exp-e08c4 │ 11:27 AM │  0.101 │  2.303 │ 0.1   │ 0.1          │
│ ├─╨ 7251640   │ 11:26 AM │ 0.1135 │ 2.3026 │ 0.1   │ 0.1          │
│ │ ╓ exp-15d4f │ 11:23 AM │ 0.2012 │  2.071 │ 0.01  │ 0.1          │
│ ├─╨ 5fd4ddb   │ 11:22 AM │ 0.1914 │ 2.2331 │ 0.01  │ 0.1          │
│ │ ╓ 2.0       │ 11:21 AM │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ │ ╟ 715fc6e   │ 11:21 AM │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ ├─╨ 2b34f9f   │ 11:21 AM │ 0.1009 │ 2.3034 │ 0.001 │ 0.1          │
│ │ ╓ exp-ff372 │ 11:19 AM │ 0.1662 │ 2.2971 │ 0.001 │ 0            │
│ │ ╟ 8d76dda   │ 11:19 AM │ 0.1662 │ 2.2971 │ 0.001 │ 0            │
│ ├─╨ a9901cf   │ 11

### Clean up experiments

In [32]:
%%bash
dvc exp gc --workspace



CalledProcessError: Command 'b'dvc exp gc --workspace\n'' returned non-zero exit status 1.

**FIXME**: `dvc exp gc --workspace` did not work.

In [33]:
%%bash
dvc exp show

┏━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Experiment    ┃ Created  ┃    acc ┃   loss ┃ lr    ┃ weight_decay ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━┩
│ workspace     │ -        │ 0.1043 │ 2.3024 │ 0.001 │ 0.1          │
│ main          │ 11:19 AM │      - │      - │ 0.001 │ 0            │
│ │ ╓ exp-e08c4 │ 11:27 AM │  0.101 │  2.303 │ 0.1   │ 0.1          │
│ ├─╨ 7251640   │ 11:26 AM │ 0.1135 │ 2.3026 │ 0.1   │ 0.1          │
│ │ ╓ exp-15d4f │ 11:23 AM │ 0.2012 │  2.071 │ 0.01  │ 0.1          │
│ ├─╨ 5fd4ddb   │ 11:22 AM │ 0.1914 │ 2.2331 │ 0.01  │ 0.1          │
│ │ ╓ 2.0       │ 11:21 AM │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ │ ╟ 715fc6e   │ 11:21 AM │ 0.1009 │ 2.3028 │ 0.001 │ 0.1          │
│ ├─╨ 2b34f9f   │ 11:21 AM │ 0.1009 │ 2.3034 │ 0.001 │ 0.1          │
│ │ ╓ exp-ff372 │ 11:19 AM │ 0.1662 │ 2.2971 │ 0.001 │ 0            │
│ │ ╟ 8d76dda   │ 11:19 AM │ 0.1662 │ 2.2971 │ 0.001 │ 0            │
│ ├─╨ a9901cf   │ 11