
bug: restore_checkpoint_path doesn't seem to work. #113

Closed
PaulScemama opened this issue Jul 31, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@PaulScemama
Contributor

Bug Report

Fortuna version: 0.1.21

I am trying to train a MAP posterior approximator first, then continue training with Laplace starting from the MAP checkpoint:

# The two probabilistic models differ only in their posterior_approximator
map_prob_model
laplace_prob_model

checkpoint = "/path/to/map/checkpoint"

# Validation accuracy of the MAP model at the checkpoint is as expected
map_prob_model.load_state(checkpoint)
map_out = map_prob_model.predictive.mean(val_loader.to_inputs_loader())
(map_out.argmax(axis=-1) == val_loader.to_array_targets()).sum() / val_loader.size
# 0.67

from fortuna.metric.classification import accuracy
from fortuna.prob_model import FitCheckpointer, FitConfig, FitMonitor, FitOptimizer

optimizer = FitOptimizer(n_epochs=main_epochs)
monitor = FitMonitor(
    metrics=(accuracy,),
    eval_every_n_epochs=1,
)
checkpointer = FitCheckpointer(
    save_checkpoint_dir=main_save_dir,
    # Start training from the MAP checkpoint
    restore_checkpoint_path="/path/to/map/checkpoint/",
    keep_top_n_checkpoints=2,
)
config = FitConfig(optimizer=optimizer, checkpointer=checkpointer, monitor=monitor)
laplace_status = laplace_prob_model.train(
    fit_config=config,
    train_data_loader=train_loader,
    val_data_loader=val_loader,
)
# Validation accuracy is NOT as expected
laplace_out = laplace_prob_model.predictive.mean(val_loader.to_inputs_loader())
(laplace_out.argmax(axis=-1) == val_loader.to_array_targets()).sum() / val_loader.size
# 0.11

However, it seems like the Laplace model is not starting from the checkpoint I pass into restore_checkpoint_path. Is there a chance that restore_checkpoint_path is not working properly? Let me know if you need more information, and I can provide more detailed code!

Thanks :)

@PaulScemama PaulScemama added the bug Something isn't working label Jul 31, 2023
@gianlucadetommaso
Contributor

Hi Paul,
I could be wrong, but I suspect the issue is not the checkpoint here. The way we obtain the mean of the predictive distribution is by sampling from the posterior. The posterior approximation given by the Laplace approximation is a Gaussian, namely

$\mathcal{N}(\theta^*, \operatorname{diag}(\sigma^2))$

where $\theta^*$ is the MAP and $\sigma$ is a vector of standard deviations for each parameter.

Now, if the estimate of $\sigma$ is very bad (for instance, too large), the samples you obtain from this distribution might be too far from $\theta^*$ to provide good predictive estimates, and therefore good accuracy.

I have noticed myself that the Laplace approximation can suffer quite a bit from this issue. As a possible solution, try last-layer-only Laplace by enabling the freeze_fun in the FitOptimizer. This might help.
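For illustration, here is a minimal sketch of the kind of predicate freeze_fun could be, assuming it maps a parameter path (a sequence of keys in the model's parameter tree) and its value to the strings "trainable" or "frozen"; the path key "output_subnet" is a hypothetical name, not verified against any specific model:

```python
# Hypothetical freeze_fun: keep only the last layer trainable, so the Laplace
# approximation is fit over a much smaller set of parameters.
def freeze_fun(path, value):
    # `path` is a sequence of keys identifying the parameter in the model tree;
    # "output_subnet" is an assumed name for the last-layer sub-tree.
    return "trainable" if "output_subnet" in path else "frozen"

# It would then be passed to the optimizer config, e.g.:
# optimizer = FitOptimizer(freeze_fun=freeze_fun, n_epochs=main_epochs)
```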

Let me know!

@PaulScemama
Contributor Author

PaulScemama commented Jul 31, 2023

@gianlucadetommaso okay! Coincidentally I had just tried that myself, and you're right: it yields more reasonable results now, thank you :) That's good to know about the Laplace approximation.
