
weight distribution for deconv layer #214

Closed
r0drigor opened this issue Jan 23, 2020 · 4 comments

@r0drigor

This might be a question that is more about caffe than FlowNet, but I'll ask here anyway.
I'm doing some work with FlowNet2-s (the first iteration of the project). I'm using MATLAB to extract the weight values for each layer, and I can't understand why the deconvolution layers' weights use a different structure than the regular convolution layers.
(I'm using net.params('conv1').get_data().)

For regular convolutions the weight layout is (h, w, c, n), while for deconv layers it's (h, w, n, c). Why is this the case?

@nikolausmayer
Contributor

This is just a quick guess, but "deconvolution" is implemented as transposed convolution (both layers directly use the cublasSgemm matrix multiplication).
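(Illustrative sketch, not caffe or FlowNet code: with a 1x1 kernel the im2col step is trivial, so a convolution is a plain GEMM and a "deconvolution" is the same GEMM with the weight matrix transposed. The transpose is exactly what swaps the roles of the n and c axes in the weight blob.)

```python
import numpy as np

c_in, c_out, hw = 3, 5, 8           # input channels, output channels, spatial size
x = np.random.randn(c_in, hw)       # input feature map, flattened spatially

# Convolution weights: one row per *output* channel -> shape (c_out, c_in)
W_conv = np.random.randn(c_out, c_in)
y = W_conv @ x                      # forward conv: (c_out, hw)

# Transposed convolution maps c_out channels back to c_in channels by
# multiplying with W_conv.T, i.e. the n and c axes trade places.
x_back = W_conv.T @ y               # (c_in, hw)

# Caffe-style blob shapes (1x1 kernel appended; assumed layouts):
# conv:   (num_output, channels, kh, kw) = (c_out, c_in, 1, 1)
# deconv: (channels, num_output, kh, kw) = (c_in, c_out, 1, 1)
W_deconv_blob = W_conv.T.reshape(c_in, c_out, 1, 1)

print(y.shape, x_back.shape, W_deconv_blob.shape)
```

Reading those blobs back through MATLAB (which reverses the axis order) would then give (h, w, c, n) for conv and (h, w, n, c) for deconv, matching what you see.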

@r0drigor
Author

I can't find any documentation on any of this, but maybe I'm just not looking hard enough.
There's probably some kind of transposition involving the height and width parameters, right?
I'm having some difficulty getting reasonable values, and that might be the explanation.
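(For reference, if the two layouts only differ in the n and c axes, then bringing deconv weights into the conv layout is a single axis swap. A NumPy sketch with made-up shapes, assuming the data read back as (h, w, n, c):)

```python
import numpy as np

# Hypothetical example values; the real array would come from
# net.params(...).get_data() in matcaffe.
w_deconv = np.random.randn(7, 7, 64, 32)          # (h, w, n, c)

# Swap the last two axes to get the conv-style (h, w, c, n) layout.
w_as_conv = np.transpose(w_deconv, (0, 1, 3, 2))  # (h, w, c, n)
print(w_as_conv.shape)
```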

Thank you for the fast response.

(After your answer you can lock this issue)

@nikolausmayer
Contributor

I think it's common to implement these "deconvolutions" like that; see e.g. http://deeplearning.net/software/theano_versions/dev/tutorial/conv_arithmetic.html#transposed-convolution-arithmetic

You can check the Forward_cpu implementations in the conv and deconv layers; they should match the method in the link.

I can't tell you why exactly the parameters are structured differently; I can only assume that the transposed convolution plays a role.
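(The arithmetic from the linked tutorial can be shown in a small 1-D sketch, not the caffe code: a stride-s transposed convolution is equivalent to inserting s-1 zeros between input samples and then running an ordinary full convolution with the same kernel.)

```python
import numpy as np

def conv_transpose_1d(x, k, stride=2):
    # Scatter-add formulation: each input sample adds a scaled copy of k.
    out = np.zeros((len(x) - 1) * stride + len(k))
    for i, v in enumerate(x):
        out[i * stride : i * stride + len(k)] += v * k
    return out

def conv_transpose_zero_stuffed(x, k, stride=2):
    # Insert (stride - 1) zeros between samples, then full convolution.
    z = np.zeros((len(x) - 1) * stride + 1)
    z[::stride] = x
    return np.convolve(z, k, mode="full")

x = np.array([1.0, 2.0])
k = np.array([1.0, 1.0, 1.0])
a = conv_transpose_1d(x, k)
b = conv_transpose_zero_stuffed(x, k)
print(a, b)  # both [1. 1. 3. 2. 2.]
```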

@nikolausmayer
Contributor

(closed due to inactivity)

@nikolausmayer nikolausmayer self-assigned this Aug 27, 2020