Hi Charles!
Thank you for your nice work on ConfidNet and for providing this framework with your paper. For an upcoming publication, I would like to run ConfidNet as a baseline. However, when I try to reproduce your results on SVHN, ConfidNet performs worse than MCP.
I understand that results are volatile due to the limited number of incorrect predictions, but I tried multiple runs and always got the same performance pattern. So here is exactly what I did:
I run the standard exp_svhn.yaml and select the best epoch according to val-accuracy
I run ConfidNet training with the following config:
```yaml
# Data parameters
data:
  dataset: svhn
  data_dir: /media/paul/ssd1/datasets/svhn
  input_size: [32, 32]
  input_channels: 3
  num_classes: 10
  valid_size: 0.1

# Training parameters
training:
  output_folder: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run_selfconfid
  task: classification
  learner: selfconfid
  nb_epochs: 200
  batch_size: 128
  loss:
    name: selfconfid_mse
    weighting: 1
  optimizer:
    name: adam
    lr: 0.0001
    weight_decay: 0.0001
  lr_schedule:
  ft_on_val: False
  metrics: ['accuracy', 'auc', 'ap_success', 'ap_errors']
  pin_memory: False
  num_workers: 12
  augmentations:
    normalize: [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]

# Model parameters
model:
  name: small_convnet_svhn_selfconfid_classic
  resume: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run/model_epoch_052.ckpt # best val-acc of previous encoder-classifier training
  feature_dim: 512
  uncertainty:
```
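(For reference, my understanding of the `selfconfid_mse` loss from your paper is an MSE between the predicted confidence and the true-class probability (TCP) of the classifier. A rough numpy sketch of what I assume it computes; the function and variable names are mine:)

```python
import numpy as np

def selfconfid_mse(confidence, probs, labels):
    """MSE between the predicted confidence and the true-class
    probability (TCP) -- my reading of selfconfid_mse; names are mine."""
    # softmax probability assigned to the ground-truth class of each sample
    tcp = probs[np.arange(len(labels)), labels]
    return np.mean((confidence - tcp) ** 2)

# toy check: predicting exactly the TCP gives zero loss
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])
print(selfconfid_mse(np.array([0.7, 0.8]), probs, labels))  # 0.0
```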
I then fine-tune ConfidNet with the cloning config:

```yaml
# Data parameters
data:
  dataset: svhn
  data_dir: /media/paul/ssd1/datasets/svhn
  input_size: [32, 32]
  input_channels: 3
  num_classes: 10
  valid_size: 0.1

# Training parameters
training:
  output_folder: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run_finetune
  task: classification
  learner: selfconfid
  nb_epochs: 20
  batch_size: 128
  loss:
    name: selfconfid_mse
    weighting: 1
  optimizer:
    name: adam
    lr: 0.0000001 # 1e-7
  lr_schedule:
  ft_on_val: False
  metrics: ['accuracy', 'auc', 'ap_success', 'ap_errors']
  pin_memory: False
  num_workers: 12
  augmentations:
    normalize: [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]

# Model parameters
model:
  name: small_convnet_svhn_selfconfid_cloning
  resume: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run/model_epoch_052.ckpt # best val-acc of previous encoder-classifier training
  feature_dim: 512
  uncertainty: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run_selfconfid/model_epoch_111.ckpt # best AUPR-error of previous ConfidNet training
```
Again, I select the epoch for testing according to the best val-AUPR-error.
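(For completeness, the way I compute AUPR-error for this selection is average precision with the *errors* as the positive class, ranked by ascending confidence; a rough numpy sketch, assuming this matches your `ap_errors` metric, with names of my own:)

```python
import numpy as np

def aupr_error(confidence, correct):
    """Average precision with errors as positives, using -confidence
    as the ranking score (i.e. least confident samples first)."""
    order = np.argsort(confidence)               # least confident first
    errors = (~correct[order]).astype(float)     # 1 where the prediction is wrong
    tp = np.cumsum(errors)                       # errors retrieved at each cutoff
    precision = tp / np.arange(1, len(errors) + 1)
    # AP = mean of precision evaluated at the rank of each error
    return (precision * errors).sum() / errors.sum()

# perfect ranking (all errors are the least confident samples) -> AP = 1.0
conf = np.array([0.9, 0.2, 0.8, 0.1])
correct = np.array([True, False, True, False])
print(aupr_error(conf, correct))  # 1.0
```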
I would like to make sure I am using your code correctly so as not to report unfair baseline results. It would be great if you could give me feedback on this. Thanks in advance!
Looking back at my configuration files, I've noticed that using no weight decay during ConfidNet training helped to obtain better performance. You should try without it in your config file; the rest seems fine to me.
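Concretely, that would mean removing the `weight_decay` entry from the optimizer block of the ConfidNet training config, something like:

```yaml
optimizer:
  name: adam
  lr: 0.0001
  # weight_decay: 0.0001  <- removed for ConfidNet training
```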
Thank you for your quick reply!
Good hint with the weight decay, this helped indeed. After experiments on all datasets and multiple runs, I can see that ConfidNet is often better than MCP, although with limited consistency: results/rankings seem very volatile and dependent on the particular run, train split, etc. Also: the MCP of the dropout-based mean softmax seems to be strong competition for ConfidNet ;)
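(By dropout-based mean softmax I mean averaging the softmax outputs over T stochastic forward passes with dropout kept active, then taking the max probability as the confidence; a rough numpy sketch of just that aggregation step, names mine:)

```python
import numpy as np

def mc_dropout_confidence(stoch_probs):
    """stoch_probs: (T, N, C) softmax outputs from T forward passes
    with dropout active.  Returns the MCP of the mean softmax per sample."""
    mean_probs = stoch_probs.mean(axis=0)   # (N, C) averaged predictive distribution
    return mean_probs.max(axis=1)           # max probability = confidence score

# toy example: two stochastic passes over one sample
passes = np.array([[[0.6, 0.4]],
                   [[0.8, 0.2]]])
print(mc_dropout_confidence(passes))  # [0.7]
```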