First question:
The directories where checkpoints are saved always look something like
$DIR/$SUBDIR/checkpoint/
I have $DIR under control: it can be set with the default_save_path argument of Trainer, and it is also an attribute of the class with the same name.
$SUBDIR seems to be randomly generated. Where is it generated? Is there a way to access it after training?
My use case is: I am training a LightningModule, and by default it saves the best model (checkpoint) during training (the one with the lowest val_loss). I want to load the best model right after training and, for example, run a test run with it.
Second question:
A similar question about wandb logger. Is there any way that I can see which logger directory and subdirectory it is pointing to?
My goal would be to select the best, for example, 5 models out of 20 runs and load them by combining the metrics stored in the logger and the weights stored in the checkpoints. Alternatively, what would be the best practice here?
What have you tried?
I went through the code looking for where $SUBDIR is generated and I did not find it. I could not find anything relevant as members of the classes Trainer or LightningModule either.
Hi, it seems that I found part of the answer: trainer.checkpoint_callback.best_k_models is a dictionary that maps each checkpoint path to its metric (in my case the validation loss). By default it contains only one element, which is the best model, so this works for me:
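A minimal sketch of that kind of lookup, assuming best_k_models really is a plain dict of checkpoint path to monitored metric as described above. The helper names `best_checkpoint` and `top_k_checkpoints` are hypothetical (they are not part of Lightning), and the second one only illustrates how the "best 5 out of 20 runs" selection could be done once each run's dictionary has been collected:

```python
import heapq
from typing import Dict, Iterable, List, Tuple


def best_checkpoint(best_k_models: Dict[str, float], mode: str = "min") -> str:
    """Return the checkpoint path whose monitored metric is best.

    `mode` is "min" for metrics like val_loss and "max" for metrics
    like accuracy.
    """
    if not best_k_models:
        raise ValueError("best_k_models is empty; no checkpoint was recorded")
    pick = min if mode == "min" else max
    return pick(best_k_models, key=best_k_models.get)


def top_k_checkpoints(
    runs: Iterable[Dict[str, float]], k: int = 5, mode: str = "min"
) -> List[Tuple[str, float]]:
    """Merge the per-run dicts and return the k best (path, metric) pairs."""
    merged = [(path, float(metric)) for run in runs for path, metric in run.items()]
    select = heapq.nsmallest if mode == "min" else heapq.nlargest
    return select(k, merged, key=lambda pair: pair[1])


# Assumed usage after trainer.fit(...); MyLightningModule is a placeholder
# for your own LightningModule subclass:
#
#   path = best_checkpoint(trainer.checkpoint_callback.best_k_models)
#   model = MyLightningModule.load_from_checkpoint(path)
#   trainer.test(model)
```

Collecting `trainer.checkpoint_callback.best_k_models` at the end of every run and passing the list to `top_k_checkpoints` avoids having to reconstruct $SUBDIR at all, since the dictionary keys are already the checkpoint paths.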