From 6b3f209c64e6b6cb8e704fcb808618b33adce479 Mon Sep 17 00:00:00 2001
From: skshetry
Date: Fri, 23 Sep 2022 20:28:21 +0545
Subject: [PATCH] make_checkpoint: rewrite code for every100th iterations (#3983)

* make_checkpoint: rewrite code for every100th iterations

* Restyled by prettier (#3984)

Co-authored-by: Restyled.io

* Update content/docs/api-reference/make_checkpoint.md

Co-authored-by: Jorge Orpinel

* Restyled by prettier (#3991)

Co-authored-by: Restyled.io

Co-authored-by: restyled-io[bot] <32688539+restyled-io[bot]@users.noreply.github.com>
Co-authored-by: Restyled.io
Co-authored-by: Dave Berenbaum
Co-authored-by: Jorge Orpinel
---
 content/docs/api-reference/make_checkpoint.md | 48 ++++++-------------
 1 file changed, 14 insertions(+), 34 deletions(-)

diff --git a/content/docs/api-reference/make_checkpoint.md b/content/docs/api-reference/make_checkpoint.md
index 014b70a530..75a828ab86 100644
--- a/content/docs/api-reference/make_checkpoint.md
+++ b/content/docs/api-reference/make_checkpoint.md
@@ -52,42 +52,29 @@ Let's consider the following `dvc.yaml` file:
 
 ```yaml
 stages:
-  every100:
-    cmd: python iterate.py
+  train:
+    cmd: python train.py
     outs:
-      - int.txt:
+      - model:
           checkpoint: true
 ```
 
-The code in `iterate.py` will execute continuously increment an integer number
-saved in `int.txt` (starting at 0). At 0 and every 100 loops, it makes a
-checkpoint for [DVC experiments]:
+The code in `train.py` will train a model up to a number of epochs. Every 100
+iterations, it saves the `model`, evaluates it, and makes a checkpoint for [DVC
+experiments]:
 
 [dvc experiments]: /doc/user-guide/experiment-management#experiments
 
 ```py
-import os
-
 from dvc.api import make_checkpoint
 
-while True:
-    try:
-        if os.path.exists("int.txt"):
-            with open("int.txt", "r") as f:
-                i_ = int(f.read()) + 1
-        else:
-            i_ = 0
-
-        # ... do something meaningful
-
-        with open("int.txt", "w") as f:
-            f.write(f"{i_}")
+for epoch in range(epochs):
+    train(model, x_train, y_train)
 
-        if i_ % 100 == 0:
-            make_checkpoint()
-
-    except KeyboardInterrupt:
-        exit()
+    if epoch % 100 == 0:
+        save_model(model, "model")
+        evaluate(model, x_test, y_test)
+        make_checkpoint()
 ```
 
 Since `checkpoint` outputs in effect implement a circular dependency,
@@ -115,15 +102,8 @@ Experiment results have been applied to your workspace.
 > DVC checkpoints to behave as expected.
 
 In this example we killed the process (with `[Ctrl] C`) after 3 checkpoints (at
-0, 100, and 200 `i_`). The cache will contain those 3 versions of
-`int.txt`.
-
-```dvc
-$ cat int.txt
-200
-$ ls .dvc/cache
-36 cf f8
-```
+0, 100, and 200 epochs). The cache will contain those 3 versions of
+`model`.
 
 `dvc exp show` will display these checkpoints as an experiment branch:
 