Transposed dimensions for VGG when loading weights of conv layers #180

Closed
ysj1173886760 opened this issue Nov 23, 2021 · 3 comments

Comments

@ysj1173886760

When loading the pretrained VGG model, I think the weight matrix for conv layers is [H, W, Cin, Cout].
The code says the pretrained model stores the matrix as [W, H, Cin, Cout], so it transposes it.
I used the same pretrained model to build the original VGG net and evaluated it. It turns out that loading the weights as [H, W, Cin, Cout] gives better results than the [W, H, Cin, Cout] version.
So I want to ask why we are transposing [H, W]. Is there any motivation for doing this?

@anishathalye
Owner

We're trying to match the original VGG network, not transposing filters on purpose, so if that's indeed the case, it's a bug and should be fixed.

Though I think the code in vgg.py is correct as-is? We load from a pre-trained MatConvNet model, which stores weights in [W, H, Cin, Cout] order, according to vl_nnconv docs:

F is an array of dimension FW x FH x FC x K where (FH,FW) are the filter height and width and K the number of filters in the bank. FC is the number of feature channels in each filter.

On the other hand, TensorFlow expects weights in [H, W, Cin, Cout] order, according to tf.nn.conv2d docs:

filter / kernel tensor of shape [filter_height, filter_width, in_channels, out_channels]

So to convert from MatConvNet format to TensorFlow format, we transpose the 0th and 1st indices with kernels = np.transpose(kernels, (1, 0, 2, 3)).
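A minimal sketch of that conversion, for illustration only. The array below is a made-up stand-in for one layer's weights rather than the actual pre-trained model, and the names `kernels_mcn` and `kernels_tf` are hypothetical. A non-square 5x3 filter is used so the axis swap shows up in the shape:

```python
import numpy as np

# Hypothetical stand-in for one conv layer's weights in MatConvNet order:
# [W, H, Cin, Cout]. Width 5 and height 3 are chosen only so the swap is visible.
fw, fh, c_in, c_out = 5, 3, 2, 4
kernels_mcn = np.arange(fw * fh * c_in * c_out, dtype=np.float32)
kernels_mcn = kernels_mcn.reshape(fw, fh, c_in, c_out)

# Swap axes 0 and 1 to obtain TensorFlow's [H, W, Cin, Cout] order.
kernels_tf = np.transpose(kernels_mcn, (1, 0, 2, 3))

print(kernels_mcn.shape)  # (5, 3, 2, 4)
print(kernels_tf.shape)   # (3, 5, 2, 4)

# The value at [w, h, i, o] in the source is found at [h, w, i, o] afterwards.
assert kernels_tf[1, 4, 0, 2] == kernels_mcn[4, 1, 0, 2]
```

VGG's conv filters are 3x3, so for the real weights the shape is identical before and after the transpose; only the values within each filter are mirrored across the diagonal, which is why a wrong layout does not raise a shape error.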

@ysj1173886760
Author

Thanks for your clarification. You really made my day.
I've used the different formats to read the parameters and tested them on some pictures.
Here is the result:
image
You've nearly convinced me. I will investigate further and report back here.

@anishathalye
Owner

Closing due to inactivity; feel free to reopen if there's anything new.
