Batchnormtrack Flag setting for cifar10 #10
The exception is STL, where the full training set contains a lot of distractor classes in the unlabelled images not present in test. This is why when setting …
Hi,
in this line, you let the input image be differentiable, while in cluster_greyscale.py you haven't. Could you explain why the gradient is required here?
Hi, that's redundant, thanks for pointing it out. I've removed it.
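To illustrate why the flag was redundant: gradients are only needed for the parameters being optimised, not for the input images themselves. A minimal sketch (using a hypothetical stand-in model, not the repo's network):

```python
import torch
import torch.nn as nn

# Hypothetical tiny model standing in for the clustering network.
model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))

imgs = torch.randn(4, 3, 32, 32)        # plain inputs, requires_grad=False
loss = model(imgs).sum()
loss.backward()                          # parameter gradients still flow

assert imgs.grad is None                 # no input gradient was computed
assert model[1].weight.grad is not None  # weights received gradients
```

Marking the input with `requires_grad` would only matter for techniques that differentiate with respect to the image itself (e.g. adversarial perturbations), which is not the case here.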
Hi @xu-ji |
Yes, it is. |
Is it normal to have an average acc of 84%?
Sorry, I skimmed over that second graph. It's ok but not as good as my reported model. My trained model: [plot omitted] (By the way, if you download the models you can see the plots and records.) As you can see in that graph, the average is 98.4. Other MNIST models I've trained have averaged 96.6, 92.0, 92.5, 95.9.
If I understand correctly, you may run several experiments and choose to show the best one.
Yes, I ran a few experiments and show the best model.
Can we say that as long as the distribution of the train and test sets stays the same, setting this flag or not should not make too much difference? For example, if we split CIFAR10 into a 7:3 train-test partition, train on the train set, and test on the unseen test set using a batch size of 660, this flag should not affect the performance much?
If the test batches' statistics are representative of the training batches' statistics (same class distribution, same input distribution, same size) and training is given enough time for batchnorm stats to reflect the latest features, then yes, theoretically there should not be a material difference between taking the test-time batchnorm statistics from training batches or test batches. In practice, there may be a small difference.
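The point above can be sketched directly with PyTorch's `track_running_stats` option (which is what a flag like `--batchnorm_track` would plausibly toggle; the exact wiring in the repo is an assumption here). With tracking on, eval mode normalises with the accumulated running statistics; with tracking off, each test batch is normalised with its own statistics. The two only diverge materially when the test batch's distribution shifts away from training:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# One BN layer tracking running stats, one always using batch stats.
bn_track = nn.BatchNorm1d(8, track_running_stats=True)
bn_batch = nn.BatchNorm1d(8, track_running_stats=False)

# "Train" on batches from one distribution so running stats accumulate.
for _ in range(100):
    x = torch.randn(660, 8) * 2.0 + 1.0
    bn_track(x)
    bn_batch(x)

bn_track.eval()
bn_batch.eval()

# Test batch from the SAME distribution: the two outputs nearly agree.
x_same = torch.randn(660, 8) * 2.0 + 1.0
diff_same = (bn_track(x_same) - bn_batch(x_same)).abs().max().item()

# Test batch from a SHIFTED distribution: outputs diverge, because only
# bn_batch renormalises with the test batch's own statistics.
x_shift = torch.randn(660, 8) * 5.0 - 3.0
diff_shift = (bn_track(x_shift) - bn_batch(x_shift)).abs().max().item()

assert diff_shift > diff_same
```

This matches the discussion: with representative test batches (same distribution, large batch size), the choice of flag is nearly immaterial; under distribution shift, batch-wise statistics and running statistics disagree.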
Hi @xu-ji,
Thanks for this wonderful work. I am rerunning your code and noticed that in commands.txt, the setting for cifar10 is without --batchnorm_track, while most other commands include this flag. I can understand that freezing the BN can be useful in a finetuning setting, but that is apparently not the case here. Can you tell me why the BN has been frozen for this particular setting when training CIFAR10 from scratch? Thanks for your help in advance.