Description
Currently, I am able to log training metrics to TensorBoard using:
import pytorch_lightning as pl
from pytorch_lightning.loggers import TensorBoardLogger

# writes event files under <save_dir>/my_model
logger = TensorBoardLogger(save_dir=save_dir, name="my_model")
[...]
trainer = pl.Trainer(logger=logger)
This logs training metrics (the loss, for instance) after each batch. Instead, I would like to average these metrics across all batches and log them to TensorBoard only once, at the end of each epoch. This is what the validation_end method does in your example: https://github.com/PyTorchLightning/pytorch-lightning/blob/446a1e23d7fe3b2e07f1a5887fe819d0dfa7d4e0/pl_examples/basic_examples/lightning_module_template.py#L145.
I first thought about writing my own training_end method, but that method is called after each batch rather than at the end of an epoch (as I would have expected). The on_epoch_end hook looks promising, but it does not receive an outputs argument the way training_end does. Essentially, in my model I would like to write something like self.logger.experiment.add_scalar('training_loss', train_loss_mean, global_step=self.current_epoch), but I do not know where to put this line.
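For illustration, here is a minimal sketch of what I have in mind: cache each batch loss in training_step and log the mean from on_epoch_end. The _train_losses attribute and the dummy model/data/optimizer are placeholders of my own (only there to make the snippet self-contained), not a Lightning feature:

import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class MyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(10, 1)
        self._train_losses = []  # my own cache for per-batch losses

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self(x), y)
        self._train_losses.append(loss.detach())  # keep a copy off the graph
        return {'loss': loss}

    def on_epoch_end(self):
        # average the cached batch losses and write a single scalar per epoch
        train_loss_mean = torch.stack(self._train_losses).mean().item()
        self.logger.experiment.add_scalar(
            'training_loss', train_loss_mean, global_step=self.current_epoch)
        self._train_losses = []  # reset for the next epoch

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

    def train_dataloader(self):
        x, y = torch.randn(64, 10), torch.randn(64, 1)
        return DataLoader(TensorDataset(x, y), batch_size=8)

Writing to self.logger.experiment directly bypasses the per-batch logging entirely, so only one point per epoch shows up in TensorBoard, but I am not sure this is the intended way to do it.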
Environment
- OS: Debian GNU/Linux 9.11 (stretch)
- Packaging: pip
- Version: 0.6.1.dev0