supernet training #4
Please pull this commit.
When supernet_search.py is executed, the batch size is set to 200. I found that the same model shows a different test accuracy across runs. If update_bn is removed, the accuracy does not change. I think the cause may be in how the 20,000 images are processed: the train_data should not be shuffled.
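A minimal sketch of what a recalibration step like update_bn presumably does (the function body, loader, and default sizes here are assumptions, not the repo's exact code). With shuffle=True, each run feeds a different subset and ordering of images into the BN layers, so the running mean/variance, and hence the test accuracy, change between runs:

```python
import mxnet as mx
from mxnet import autograd, gluon

def update_bn_sketch(net, dataset, ctx, num_images=20000, batch_size=200):
    # Re-estimate BN running mean/var by forwarding a fixed subset of
    # training images through the net in training mode (BN layers only
    # update their running statistics in train mode).
    loader = gluon.data.DataLoader(dataset, batch_size=batch_size,
                                   shuffle=False)  # deterministic order
    seen = 0
    for data, _ in loader:
        with autograd.train_mode():
            net(data.as_in_context(ctx))
        seen += data.shape[0]
        if seen >= num_images:
            break
```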
You are right, very good observation! Indeed, if the training data is shuffled, the same architecture may show different performance across runs, because BN will be updated with different images. I thought shuffling or not would both be fine because of question 3 of this issue.
I found that even if the train_data is not shuffled, the test accuracy obtained each time is still different. Could the update_bn function have a problem? In addition, I re-read the paper. The reason for updating BN seems to be that supernet training is unstable, so shouldn't update_bn be used during supernet training rather than at one-shot model evaluation time?
Did you use the …? To eliminate this concern, I would suggest, in addition to …, changing the evaluation loop to:

```python
supernet = ShuffleNas()
for _ in range(repeat_times):
    # TODO: try to add this in the for loop
    supernet.load_parameters(supernet_params, ctx=context)
    # End of edition
    update_bn(supernet)
    val_acc = get_accuracy(supernet, fixed_block_choice, fixed_channel_choice)
    print(val_acc)
```

If it is still not working as you expected, then the NasBN part may be the problem. Any reproducible unit test case for the NasBN, like this one, would be appreciated. I will do my best to help debug.
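For what it's worth, a hypothetical determinism test along those lines might look like this (reusing the names from the snippet above; this is a sketch, not the repo's actual test suite):

```python
def test_update_bn_deterministic():
    # Run update_bn twice from the same checkpoint; with a non-shuffled
    # loader the recalibrated BN statistics, and so the accuracy,
    # should match exactly across runs.
    accs = []
    for _ in range(2):
        net = ShuffleNas()
        net.load_parameters(supernet_params, ctx=context)
        update_bn(net)
        accs.append(get_accuracy(net, fixed_block_choice, fixed_channel_choice))
    assert abs(accs[0] - accs[1]) < 1e-6
```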
The BN statistics are only mentioned in the section …
I solved the problem of the accuracy change after … The 20,000 images should not be processed with the preprocessing method of …
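In other words, the BN-recalibration subset should go through the deterministic test-time transform rather than the randomized training augmentation, otherwise the recalibrated statistics vary from run to run. A sketch of the distinction, using standard Gluon transforms (not necessarily the repo's exact pipeline):

```python
from mxnet.gluon.data.vision import transforms

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),   # random: fine for training
    transforms.RandomFlipLeftRight(),
    transforms.ToTensor(),
])
test_transform = transforms.Compose([
    transforms.Resize(256),              # deterministic: use this for update_bn
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
```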
After training the supernet according to the above script, the val acc dropped sharply after 60 epochs, but the train acc did not. For comparison, your shufflenas_supernet.log file results: …
Yes, I did use channel selection. Please check whether you were using all channels in the test function. If so, pulling the latest code should help.
Thank you! In addition, it would be better to change …
You are right: for the validation evaluation, we are not supposed to use all channels. And I think you might have misunderstood the code. Before epoch_start_cs, use_all_channels is set to True, and after epoch_start_cs, use_all_channels is set to False. So the val accuracy drop in your experiment was caused by using all channels and is actually fixed by not using them.
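To spell out that warm-up logic (epoch_start_cs and use_all_channels are the flags discussed above; the loop structure and the train_one_epoch helper are assumptions for illustration):

```python
for epoch in range(num_epochs):
    # Warm up with all channels first; enable channel selection (CS)
    # only once the supernet weights are reasonably trained.
    use_all_channels = epoch < epoch_start_cs
    train_one_epoch(supernet, use_all_channels=use_all_channels)  # hypothetical helper
```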
Thank you, but if I follow the script above and start channel selection at the very beginning, the accuracy does not improve; it stays at 0.001, i.e. roughly random-chance level.
I've only been trying …
Update: I did various experiments during the weekend, and here is a short summary: …
Thank you very much. I will try it and report the results promptly.
Training with the updated version of the supernet results in the following error: …