ConfidNet performs worse than MCP when I reproduce SVHN results #8

Closed
pfjaeger opened this issue Apr 7, 2021 · 2 comments

@pfjaeger

pfjaeger commented Apr 7, 2021

Hi Charles!
Thank you for your nice work on ConfidNet and for providing this framework with your paper. For an upcoming publication, I would like to run ConfidNet as a baseline. However, when I try to reproduce your results on SVHN, ConfidNet performs worse than MCP:

[Screenshot 2021-04-07 at 11 49 01: image not preserved]

I understand that results are volatile due to the limited number of incorrect predictions, but I tried multiple runs and always got the same performance pattern. So here is exactly what I did:

  • I run the standard exp_svhn.yaml and select the best epoch according to val-accuracy
  • I run ConfidNet training with the following config:

```yaml
# Data parameters
data:
    dataset: svhn
    data_dir: /media/paul/ssd1/datasets/svhn
    input_size: [32,32]
    input_channels: 3
    num_classes: 10
    valid_size: 0.1

# Training parameters
training:
    output_folder: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run_selfconfid
    task: classification
    learner: selfconfid
    nb_epochs: 200
    batch_size: 128
    loss:
        name: selfconfid_mse
        weighting: 1
    optimizer:
        name: adam
        lr: 0.0001
        weight_decay: 0.0001
    lr_schedule:
    ft_on_val: False
    metrics: ['accuracy', 'auc', 'ap_success', 'ap_errors']
    pin_memory: False
    num_workers: 12
    augmentations:
        normalize: [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]

# Model parameters
model:
    name: small_convnet_svhn_selfconfid_classic
    resume: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run/model_epoch_052.ckpt # best val-acc of previous encoder-classifier training
    feature_dim: 512
    uncertainty:
```

  • I select the best epoch according to val-aupr-err
  • I run fine-tuning with the following config:

```yaml
# Data parameters
data:
    dataset: svhn
    data_dir: /media/paul/ssd1/datasets/svhn
    input_size: [32,32]
    input_channels: 3
    num_classes: 10
    valid_size: 0.1

# Training parameters
training:
    output_folder: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run_finetune
    task: classification
    learner: selfconfid
    nb_epochs: 20
    batch_size: 128
    loss:
        name: selfconfid_mse
        weighting: 1
    optimizer:
        name: adam
        lr: 0.0000001 # 1e-7
    lr_schedule:
    ft_on_val: False
    metrics: ['accuracy', 'auc', 'ap_success', 'ap_errors']
    pin_memory: False
    num_workers: 12
    augmentations:
        normalize: [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]

# Model parameters
model:
    name: small_convnet_svhn_selfconfid_cloning
    resume: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run/model_epoch_052.ckpt # best val-acc of previous encoder-classifier training
    feature_dim: 512
    uncertainty: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run_selfconfid/model_epoch_111.ckpt # best AUPR-error of previous ConfidNet training
```

  • Again, I select the epoch with the best val-aupr-err for testing
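
For readers following the steps above, here is a minimal sketch of what "select the best epoch according to val-aupr-err" could look like. It is not the repository's implementation: `average_precision`, `aupr_error`, and `best_epoch` are hypothetical helpers, the per-epoch values are made up, and AUPR-error is assumed to mean average precision with misclassified samples as the positive class, ranked by lowest confidence first.

```python
from typing import Dict, Sequence


def average_precision(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """Mean of precision@k over the ranks k that hold a positive sample.

    Samples are ranked by descending score. Ties keep their input order,
    which is a simplification compared to library implementations.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if is_positive[idx]:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("no positive samples; average precision is undefined")
    return precision_sum / hits


def aupr_error(confidences: Sequence[float], correct: Sequence[bool]) -> float:
    """AUPR with errors as the positive class (assumed definition).

    The confidence is negated so that low-confidence samples rank first.
    """
    scores = [-c for c in confidences]
    errors = [not ok for ok in correct]
    return average_precision(scores, errors)


def best_epoch(val_metric_per_epoch: Dict[int, float]) -> int:
    """Return the epoch whose validation metric is highest."""
    return max(val_metric_per_epoch, key=val_metric_per_epoch.get)


# Hypothetical confidences (e.g. MCP or ConfidNet output) and correctness flags.
confidences = [0.9, 0.6, 0.8, 0.55]
correct = [False, True, True, False]
print("AUPR-error:", aupr_error(confidences, correct))

# Hypothetical per-epoch validation AUPR-error values.
print("best epoch:", best_epoch({50: 0.41, 111: 0.47, 150: 0.44}))
```

With only a handful of misclassified validation samples, a single re-ranked error can shift this metric noticeably, which is consistent with the volatility mentioned above.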

I would like to make sure I use your code correctly so that I do not report unfair baseline results. It would be great if you could give me feedback on this. Thanks in advance!

@chcorbi
Collaborator

chcorbi commented Apr 7, 2021

Hi Paul,

Thank you for your interest in the paper!

Looking back at my configuration files, I've noticed that using no weight decay during ConfidNet training helped obtain better performance. You should try without it in your config file; the rest seems fine to me.
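
As an illustration of the suggested change, here is a minimal sketch that drops the weight decay from a config already loaded into a Python dict. `without_weight_decay` is a hypothetical helper, and the nesting `training` → `optimizer` → `weight_decay` is an assumption based on the config posted above. Whether omitting the key results in zero weight decay depends on how the framework builds its optimizer, so setting it explicitly to 0 may be the safer choice.

```python
import copy


def without_weight_decay(config: dict) -> dict:
    """Return a deep copy of the config with the optimizer's weight_decay removed."""
    new_config = copy.deepcopy(config)
    # Assumed layout: config["training"]["optimizer"]["weight_decay"].
    new_config["training"]["optimizer"].pop("weight_decay", None)
    return new_config


# Trimmed-down stand-in for the ConfidNet training config posted above.
confidnet_config = {
    "training": {
        "learner": "selfconfid",
        "optimizer": {"name": "adam", "lr": 0.0001, "weight_decay": 0.0001},
    }
}

updated = without_weight_decay(confidnet_config)
print(updated["training"]["optimizer"])
```

The original dict is left untouched, so both variants can be kept side by side when comparing runs.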

Please let me know if it helped :)

Charles

@pfjaeger
Author

pfjaeger commented Apr 9, 2021

Thank you for your quick reply!
Good hint about the weight decay, this did indeed help. After experiments on all datasets and multiple runs, I can see that ConfidNet is often better than MCP, although with limited consistency: results and rankings seem very volatile and depend on the current run, the train split, etc. Also: MCP of the dropout-based mean softmax seems to be a strong competitor for ConfidNet ;)
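
For context, a minimal sketch of the "MCP of the dropout-based mean softmax" baseline mentioned here, under the assumption that it means averaging the softmax outputs of several stochastic forward passes with dropout active and taking the maximum of that mean as the confidence. `softmax` and `mc_dropout_mcp` are hypothetical helpers, and the logits are made up rather than produced by a real model.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one vector of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mc_dropout_mcp(logit_samples: Sequence[Sequence[float]]) -> Tuple[int, float]:
    """Predicted class and confidence from T stochastic forward passes.

    logit_samples holds one logit vector per pass for a single input.
    The confidence is the maximum of the mean softmax over the passes.
    """
    probs = [softmax(logits) for logits in logit_samples]
    n_classes = len(probs[0])
    mean_probs = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    confidence = max(mean_probs)
    return mean_probs.index(confidence), confidence


# Two made-up dropout passes for one three-class input.
predicted, confidence = mc_dropout_mcp([[3.0, 0.0, 0.0], [2.0, 1.0, 0.0]])
print(predicted, confidence)
```

With a single pass this reduces to plain MCP, so the same evaluation code can be used for both baselines.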

@pfjaeger pfjaeger closed this as completed Apr 9, 2021