docs: add checkpoints to documentation (#312)
* docs: add checkpoints to documentation

* fix: commit suggestion

Co-authored-by: Wang Bo <bo.wang@jina.ai>

Tadej Svetina and bwanglzu committed Jan 4, 2022
1 parent 8ff5649 commit 0eae320
Showing 2 changed files with 49 additions and 8 deletions.
20 changes: 12 additions & 8 deletions docs/components/tuner.md
@@ -3,10 +3,11 @@
Tuner is one of the three key components of Finetuner. Given an {term}`embedding model` and a {term}`labeled dataset` (see {ref}`the guide on data formats<data-format>` for more information), Tuner trains the model to fit the data.

With Tuner, you can customize the training process to best fit your data, and track your experiments in a clear and transparent manner. You can do things like:
- choose between different loss functions, use hard negative mining for triplets/pairs
- set your own optimizers and learning rates
- track the training and evaluation metrics with Weights and Biases
- write custom callbacks
- Choose between different loss functions, use hard negative mining for triplets/pairs
- Set your own optimizers and learning rates
- Track the training and evaluation metrics with Weights and Biases
- Save checkpoints during training
- Write custom callbacks

You can read more on these different options here or in these sub-sections:

@@ -203,12 +204,13 @@ Then we can create the {class}`~finetuner.tuner.pytorch.PytorchTuner` object. In
- Triplet loss with a hard miner, using the easy positive and semihard negative strategies
- Adam optimizer with an initial learning rate of 0.0005, which will be halved every 30 epochs
- WandB for tracking the experiment
- A {class}`~finetuner.tuner.callback.training_checkpoint.TrainingCheckpoint` to save a checkpoint every epoch, so that if training is interrupted we can later continue from the checkpoint. We need to create a `checkpoints/` folder inside the current directory to store the checkpoints.
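
Since the example expects this `checkpoints/` folder to exist, you may want to create it up front. A minimal sketch (plain Python, not part of the original example):

```python
import os

# Create the folder that TrainingCheckpoint will write to (no-op if it already exists)
os.makedirs('checkpoints', exist_ok=True)
```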

```python
from torch.optim import Adam
from torch.optim.lr_scheduler import MultiStepLR

from finetuner.tuner.callback import WandBLogger
from finetuner.tuner.callback import WandBLogger, TrainingCheckpoint
from finetuner.tuner.pytorch import PytorchTuner
from finetuner.tuner.pytorch.losses import TripletLoss
from finetuner.tuner.pytorch.miner import TripletEasyHardMiner
@@ -225,13 +227,14 @@ loss = TripletLoss(
miner=TripletEasyHardMiner(pos_strategy='easy', neg_strategy='semihard')
)
logger_callback = WandBLogger()
checkpoint = TrainingCheckpoint('checkpoints')

tuner = PytorchTuner(
embed_model,
loss=loss,
configure_optimizer=configure_optimizer,
scheduler_step='epoch',
callbacks=[logger_callback],
callbacks=[logger_callback, checkpoint],
device='cpu',
)
```
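
If training does get interrupted, the tuner can later be restored from one of the saved checkpoints before continuing. A short sketch, assuming a checkpoint file such as `checkpoints/saved_model_epoch_10` exists (the exact file name depends on the epoch at which it was saved):

```python
from finetuner.tuner.callback import TrainingCheckpoint

# Restore the tuner's state from a previously saved checkpoint;
# training can then be resumed as usual, e.g. by calling tuner.fit(...) again
TrainingCheckpoint.load(tuner, 'checkpoints/saved_model_epoch_10')
```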
@@ -244,7 +247,7 @@ from torch.optim import Adam
from torch.optim.lr_scheduler import MultiStepLR

from finetuner.toydata import generate_fashion
from finetuner.tuner.callback import WandBLogger
from finetuner.tuner.callback import WandBLogger, TrainingCheckpoint
from finetuner.tuner.pytorch import PytorchTuner
from finetuner.tuner.pytorch.losses import TripletLoss
from finetuner.tuner.pytorch.miner import TripletEasyHardMiner
@@ -276,13 +279,14 @@ loss = TripletLoss(
miner=TripletEasyHardMiner(pos_strategy='easy', neg_strategy='semihard')
)
logger_callback = WandBLogger()
checkpoint = TrainingCheckpoint('checkpoints')

tuner = PytorchTuner(
embed_model,
loss=loss,
configure_optimizer=configure_optimizer,
scheduler_step='epoch',
callbacks=[logger_callback],
callbacks=[logger_callback, checkpoint],
device='cpu',
)

37 changes: 37 additions & 0 deletions docs/components/tuner/callbacks.md
@@ -3,6 +3,7 @@
Callbacks offer a way to integrate various auxiliary tasks into the training loop. We offer built-in callbacks for some common tasks, such as
- Showing a progress bar (which is shown by default)
- [Tracking experiments](#experiement-tracking)
- [Checkpointing training progress](#checkpoints)

You can also [write your own callbacks](#custom-callbacks).

@@ -32,6 +33,42 @@ tuner = PytorchTuner(..., callbacks=[logger])

You should then be able to see your training runs in wandb.

## Checkpoints

On long training jobs, you may want to periodically save your progress, so that you can
continue from a checkpoint later if training gets interrupted. Or you may want to save
the best model, so that you can use it after training finishes. For these purposes we
offer {class}`~finetuner.tuner.callback.training_checkpoint.TrainingCheckpoint` and {class}`~finetuner.tuner.callback.best_model_checkpoint.BestModelCheckpoint`, respectively.

To use {class}`~finetuner.tuner.callback.training_checkpoint.TrainingCheckpoint`, you would simply add it to `callbacks`. Later, you can load the tuner from the saved checkpoint, as in the example below.


```python
from finetuner.tuner.callback import TrainingCheckpoint
from finetuner.tuner.pytorch import PytorchTuner

checkpoint = TrainingCheckpoint('checkpoints')

tuner = PytorchTuner(..., callbacks=[checkpoint])

# Afterwards, load the tuner from the saved checkpoint
TrainingCheckpoint.load(tuner, 'checkpoints/saved_model_epoch_10')
```

For the {class}`~finetuner.tuner.callback.best_model_checkpoint.BestModelCheckpoint`, you would also add it to `callbacks`, and later you could load the model from it.

```python
from finetuner.tuner.callback import BestModelCheckpoint
from finetuner.tuner.pytorch import PytorchTuner

checkpoint = BestModelCheckpoint('checkpoints')

tuner = PytorchTuner(..., callbacks=[checkpoint])

# Afterwards, load the model from the saved checkpoint
BestModelCheckpoint.load_model(tuner, 'checkpoints/best_model_val_loss')
```
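
Once the best model has been loaded back into the tuner, it can be persisted like any other PyTorch model. A minimal sketch, assuming the tuner exposes the model as `tuner.embed_model` (plain `torch.save` is used here rather than a Finetuner API):

```python
import torch

# Save the weights of the restored best model for later use
torch.save(tuner.embed_model.state_dict(), 'best_model.pt')
```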

## Custom callbacks

If the existing callbacks don't provide the functionality you need, you can easily write your own.
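
As a rough illustration, a custom callback is typically a small class that overrides only the hooks it needs. The sketch below rests on assumptions (a `BaseCallback` base class in `finetuner.tuner.callback.base` and an `on_epoch_end(tuner)` hook); check the actual callback interface for the exact names:

```python
from finetuner.tuner.callback.base import BaseCallback


class PrintEpochCallback(BaseCallback):
    """Hypothetical callback that prints a message after every epoch."""

    def on_epoch_end(self, tuner):
        print('Finished another epoch')


# Hypothetical usage: tuner = PytorchTuner(..., callbacks=[PrintEpochCallback()])
```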
