Fix Glorot uniform initialization for convolutional layers #22
The correct way to extend the initialization scheme introduced by Glorot and Bengio for dense layers to convolutional layers is to multiply the `fanIn` and `fanOut` sizes by the receptive field size (the product of the kernel dimensions). Keras, PyTorch, Lasagne, etc. all implement this correction without mentioning it in the relevant docstrings. As discussed for a related initialization in He et al., the correction is needed because the responses produced by a convolutional layer are equivalent to those produced by a pointwise dense layer over a feature space that has been expanded by a factor of the receptive field size.

Fixes the convergence discrepancy seen on CIFAR convnets.
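For concreteness, here is a minimal sketch of the corrected bound computation (written in Swift; the function and parameter names are hypothetical and the actual layer API in this repository may differ). The correction only changes how `fanIn` and `fanOut` are computed; the usual Glorot uniform bound `sqrt(6 / (fanIn + fanOut))` is then applied unchanged:

```swift
import Foundation

/// Computes the bound of the Glorot (Xavier) uniform distribution for a
/// 2-D convolutional layer. Channel counts are scaled by the receptive
/// field size as described above; weights are then drawn from
/// Uniform(-limit, limit). Names here are illustrative, not the repo's API.
func glorotUniformLimit(
    kernelHeight: Int, kernelWidth: Int,
    inputChannels: Int, outputChannels: Int
) -> Double {
    // Receptive field size: product of the kernel's spatial dimensions.
    let receptiveField = kernelHeight * kernelWidth
    // The fix: multiply both fan sizes by the receptive field size.
    let fanIn = inputChannels * receptiveField
    let fanOut = outputChannels * receptiveField
    return (6.0 / Double(fanIn + fanOut)).squareRoot()
}

// Example: a 3x3 convolution mapping 16 channels to 32 channels.
let limit = glorotUniformLimit(
    kernelHeight: 3, kernelWidth: 3, inputChannels: 16, outputChannels: 32)
print("Sample weights from Uniform(-\(limit), \(limit))")
```

Without the receptive-field factor, the bound for this example would be computed from `fanIn + fanOut = 48` instead of `432`, giving initial weights roughly three times too large, which is consistent with the convergence discrepancy observed on the CIFAR convnets.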