about the input channels #11

Closed

Liupengshuaige opened this issue Apr 29, 2020 · 1 comment

Liupengshuaige commented Apr 29, 2020 (edited)

Thanks for your great work. It seems that you normalize the conv weights. Why? And how does this affect performance?

Owner

MarcoForte commented Apr 29, 2020

Hi, glad you liked the paper. We normalize the conv weights in layers preceding group normalisation, following the advice in this paper: https://arxiv.org/abs/1903.10520 https://github.com/joe-siyuan-qiao/WeightStandardization

Note that I changed their implementation slightly to avoid NaNs during training; see joe-siyuan-qiao/WeightStandardization#1 (comment).
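
For readers unfamiliar with the idea, here is a minimal NumPy sketch of weight standardization: each output filter of the conv weight is shifted to zero mean and scaled to unit variance before the convolution is applied. The function name `standardize_conv_weight` and the `eps` value are hypothetical. Placing `eps` inside the square root is one common way to keep the computation finite when a filter has zero variance; it is an assumption here, not necessarily the exact change described in the linked comment.

```python
import numpy as np


def standardize_conv_weight(weight, eps=1e-5):
    """Standardize each output filter to zero mean and (roughly) unit variance.

    weight: array of shape (out_channels, in_channels, kH, kW).
    eps: hypothetical stabiliser added to the variance before the square
         root, so a constant filter yields zeros instead of NaN.
    """
    # Flatten every filter so statistics are taken over in_channels * kH * kW.
    flat = weight.reshape(weight.shape[0], -1)
    mean = flat.mean(axis=1, keepdims=True)
    var = flat.var(axis=1, keepdims=True)
    standardized = (flat - mean) / np.sqrt(var + eps)
    return standardized.reshape(weight.shape)


# Example: 8 filters, 4 input channels, 3x3 kernels with a non-zero mean.
rng = np.random.default_rng(0)
w = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 3, 3))
w_std = standardize_conv_weight(w)
per_filter = w_std.reshape(8, -1)

# A filter with identical entries has zero variance; the result stays finite.
constant = standardize_conv_weight(np.ones((2, 3, 3, 3)))
```

In a training framework the same computation would run on the layer's weight at every forward pass, so gradients flow through the standardization; this sketch covers only the forward arithmetic.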

The effect of this normalization is significant: it reduces the average number of clicks needed to reach 90% accuracy by around 0.25-0.5 clicks.

Liupengshuaige closed this as completed
