
Improve TransposedConvolution layer. #1493

Merged
merged 6 commits into mlpack:master on Oct 12, 2019

Conversation

@akhandait
Member

akhandait commented Aug 13, 2018

No description provided.


TransposedConvolution<> module2(1, 1, 4, 4, 1, 1, 2, 2, 5, 5);
TransposedConvolution<> module2(1, 1, 3, 3, 1, 1, 1, 1, 6, 6, 6, 6);


ShikharJ Aug 14, 2018

Member

Please keep the original parameters here. These tests follow from the same Convolutional Arithmetic report, figure 4.1 and onwards.


akhandait Aug 23, 2018

Author Member

@ShikharJ As far as I could figure out, tf.nn.conv2d_transpose doesn't allow the size to go down (it only allows VALID or SAME padding on the output size). In the above example, using the current formula for output size, it should be 4. Can you tell me what parameters (in the tf.nn.conv2d_transpose function), along with the weights below, you used to arrive at the value 2100?


ShikharJ Aug 23, 2018

Member

@akhandait I don't think the size should decrease; take a look at Figure 4.2. Aren't you able to obtain an output size of 8? I don't remember the parameters off the top of my head, though I can derive them; just give me some time, as I need to reinstall TensorFlow. I have mostly moved on to PyTorch these days.


akhandait Aug 29, 2018

Author Member

Sorry for the super late response. I understand the approach that was used. The output size calculated this way includes the padding added to the input of the equivalent convolution operation. However, most deep learning frameworks don't do that (I tried TensorFlow and PyTorch; Theano's approach is exactly the same as our new implementation).
So, we will have to change the parameters (but not the cases according to the paper).
For example, in this case the output size should actually be 6 (as shown in the figure, 1 + 1 is the padding, and 8 is not the size of the original input; I tested it in PyTorch). In PyTorch and our new implementation, this case should have parameters like this:

kernel size = 4
stride = 1
padding = 1 (what all deep learning frameworks do is take the padding of the equivalent convolution operation, so here a padding of 1 was used, which changed the actual input size 6x6 to 8x8; we should output 6 and not 8)
inputHeight and Width = 5
outputHeight and Width = 6

So, 5 = (6 + 2(1) - 4) / 1 + 1.
But, as you correctly mentioned, we should test all the cases from that paper. I am going to shift to PyTorch for calculating the values for testing, as TensorFlow sadly only allows 'SAME' and 'VALID' padding, which doesn't allow us to test all the cases from that paper.


akhandait Aug 29, 2018

Author Member

@zoq Please verify this.


akhandait Sep 4, 2018

Author Member

@ShikharJ @zoq Can you please comment on this?


zoq Sep 16, 2018

Member

@akhandait Sorry for the slow response; I don't see any reason against the change. I also checked PyTorch and, as you already said, it's the same parameter set.


akhandait Sep 27, 2018

Author Member

@zoq Do you have any further comments on the PR? Can you give it a final review?


ShikharJ Sep 27, 2018

Member

@akhandait Your approach sounds reasonable to me; I was actually referring to the decreased kernel sizes, which have now been fixed. I'll need to run the tests for the GANs, check their output, and cross-check the tests as well. I'm planning on doing this over the weekend. I'll let you know what I find.


akhandait Sep 27, 2018

Author Member

@ShikharJ Sounds good. Thanks!

Member

ShikharJ commented Aug 15, 2018

@akhandait You can rebase to master now. #1491 is merged.

@akhandait akhandait force-pushed the akhandait:transposed_conv_fix branch 2 times, most recently from 41654d0 to a3e5c2f Aug 29, 2018
Member

zoq commented Sep 27, 2018

Looks like the Transposed layer test failed.

Member Author

akhandait commented Oct 1, 2018

@zoq It should pass now.


TransposedConvolution<> module5(1, 1, 3, 3, 2, 2, 0, 0, 5, 5);
TransposedConvolution<> module5(1, 1, 3, 3, 2, 2, 0, 0, 2, 2, 5, 5);


ShikharJ Oct 1, 2018

Member

Can you use the same input sizes here as well from the paper as done above? Also for module 6 and 7.


akhandait Nov 11, 2018

Author Member

Maybe there's been some confusion; they already are according to the paper. The above module 5 has input size 2 and output size 5, which follows 4.5 in the paper; similarly, modules 6 and 7 follow 4.6 and 4.7 respectively.

Member

ShikharJ commented Oct 1, 2018

@akhandait Sorry if I'm being unreasonable, but I'm unable to understand the rationale behind inserting the zeros into the input first and then applying the convolution operation. Could you explain a bit? Also, since both the ForwardConvolutionRule and BackwardConvolutionRule are valid, I think it'd be better to use them in the Forward and Backward functions respectively.

Member Author

akhandait commented Nov 7, 2018

@zoq @ShikharJ Sorry for the inactivity on this one, I got busy with some other things. I will get back on this in a couple of days.

Member Author

akhandait commented Nov 11, 2018

@ShikharJ Inserting zeros is done to implement fractionally strided convolutions with the current convolution rules. Another way would have been to implement a separate convolution rule for fractionally strided convolution that performs it without actually adding zeros to the input, but this way was easier for the time being.

@akhandait akhandait force-pushed the akhandait:transposed_conv_fix branch from 4cad362 to 7974bdc Nov 26, 2018
Member Author

akhandait commented Jan 4, 2019

@ShikharJ Any updates here?

Member

ShikharJ commented Jan 4, 2019

@akhandait Apologies, I'll get back on reviewing this and give you an update soon.

Member

zoq commented Mar 30, 2019

@akhandait looks like there is a merge conflict, I can resolve this one if you like, just let me know. @ShikharJ Any update on this one from your side?

Member

ShikharJ commented Mar 30, 2019

@zoq I had been primarily busy with CycleGAN recently, but I think this is close to completion, just need to check the output on the GANs.

Member Author

akhandait commented Mar 31, 2019

I resolved it, let's check the tests now.

Member

ShikharJ commented Apr 6, 2019

@akhandait This seems to have a bit of style issues, do you think you can rectify them?

Member Author

akhandait commented Apr 10, 2019

Hey @ShikharJ, Jenkins isn't showing me where the checks failed; it shows a stack trace instead. Can you please restart the Jenkins build, as I am not sure how to?

Member

zoq commented May 13, 2019

@mlpack-jenkins test this please

@akhandait akhandait force-pushed the akhandait:transposed_conv_fix branch from 6c58629 to f812835 May 25, 2019
Member Author

akhandait commented May 25, 2019

@zoq @ShikharJ I can't understand the static checks warning here:
"The expression kW - padW - 1 > 0 will work as kW - padW != 1." - lines 211, 233
Why is it != 1?
If it had said > 1, I wouldn't have worried about it, but I can't figure out this case.

Member

rcurtin commented May 25, 2019

@akhandait I think both kW and padW are size_t, so this has to do with the fact that negative numbers can't be represented. If kW < padW (which hopefully should not be the case), then kW - padW is some giant number greater than 0. So, the only situation in which the condition kW - padW - 1 > 0 is false is when kW - padW - 1 == 0, or, when kW - padW == 1 (hence the suggested condition kW - padW != 1). I think this is right; hope it helps. :)

@akhandait akhandait force-pushed the akhandait:transposed_conv_fix branch from f812835 to b468374 May 26, 2019
Member Author

akhandait commented May 26, 2019

@rcurtin Ahh, I missed that; silly error. I will just cast it to int.

@@ -222,18 +214,23 @@ class TransposedConvolution
* @param input The input to be padded.
* @param wPad Padding width of the input.
* @param hPad Padding height of the input.
* @param wExtra The number of extra zeros to the right.
* @param hExtra The number of extra zeros to the bottom.
* @param output The padded output data.
*/
template<typename eT>
void Pad(const arma::Mat<eT>& input,


saksham189 Jul 1, 2019

Contributor

Maybe we could introduce a Padding layer and use it inside the convolution layers to avoid redundant code. PyTorch already supports quite a lot of padding layers. This would also make the implementation clearer.


akhandait Jul 5, 2019

Author Member

@saksham189 Has there been any discussion about this on IRC since you posted it? I think this might be a good idea and should be given some thought.


zoq Jul 5, 2019

Member

Sorry for the slow response; this makes total sense to me.


saksham189 Jul 5, 2019

Contributor

@akhandait I will work on adding the Padding layer and remove the redundant code in a new PR so that it is easy to review.


rcurtin Sep 12, 2019

Member

Looks like the Padding layer is done and merged; should we incorporate those changes here, or maybe in another PR after this is merged? It may be worth opening an issue so we don't forget. :)


mlpack-bot bot commented Aug 4, 2019

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

Member Author

akhandait commented Sep 15, 2019

@rcurtin I guess that would be much better for me right now, thanks! I'll push the changes tomorrow.

Contributor

walragatver left a comment

Hi, it looks good. However, I found one small mistake; it's quite small, but maybe quite important.

weights.set_size((outSize * inSize * kW * kH) + outSize, 1);

aW = (outputWidth + kW - 2 * padW - 2) % dW;


walragatver Sep 24, 2019

Contributor

If we are serializing these parameters, then we don't need to calculate them while loading. Let me know if that's not correct; maybe I'm missing something.


rcurtin Oct 6, 2019

Member

Ah, actually, this is a good point. If we are serializing aW and aH, then we have to handle reverse compatibility. (That's not hard, but it's a little tedious.) Since, as far as I can tell, aW and aH are always computed, I'd suggest that we just remove the two lines ar & BOOST_SERIALIZATION_NVP(aW); and ar & BOOST_SERIALIZATION_NVP(aH); above.

Member

rcurtin commented Sep 26, 2019

@walragatver pointed out on IRC today that we shouldn't wait on the release any longer, as mlpack 3.1.1 downloads ensmallen-latest.tar.gz, which as of ensmallen 2.10.0 no longer works! I agreed, so I've gone ahead and released 3.2.0 tonight (which doesn't have the same issue as 3.1.1 did with respect to the ensmallen version), and I've moved this to the 3.2.1 milestone. Once this is finished and merged, we can easily issue a patch release. 👍

Member

rcurtin left a comment

I don't have any further comments on this one other than about the serialization; do we think that it's ready otherwise? I'd love to get this merged and release 3.2.2 soon. :)


Member

rcurtin commented Oct 6, 2019

It looks like there is still a Travis test failing, too. If nobody has time, I can try to debug it.

Member Author

akhandait commented Oct 7, 2019

Sorry for stalling here. I am having issues building mlpack on my system, so it has become harder to work on this, as I need to wait for Travis to build and then debug through those errors. I'll still try to complete this today.

Member

zoq commented Oct 7, 2019

Hopefully the build issues you see are fixed with the latest ensmallen + mlpack version.

Member

rcurtin commented Oct 9, 2019

@akhandait no problem, and I'm happy to help try and debug your system if you like, just let me know. I think @zoq is right that the update of ensmallen might be the thing that makes a difference here; if you make sure the branch is rebased against master and then delete/remake your build directory, that should work out any issues (I hope).

Member Author

akhandait commented Oct 9, 2019

@zoq It didn't. :( I am using ensmallen 2.10.3 and the latest mlpack master (so rebase isn't the issue). I just posted the entire error on IRC; can you please see if it looks familiar? If not, I will post the build config so that someone can reproduce it.

@akhandait akhandait force-pushed the akhandait:transposed_conv_fix branch from 48667d6 to d2c78df Oct 10, 2019
Member Author

akhandait commented Oct 10, 2019

Hmm, one of the Travis jobs is still failing for some reason. I don't think it's due to this PR, though.

Member

rcurtin commented Oct 10, 2019

You're right, it just timed out. Let me try running it again and we'll see what happens...

Member

rcurtin left a comment

Everything looks good to me and the tests are passing. 👍 There was one comment about using the Padding layer, so I think when we merge this we should be sure to open an issue about that:

https://github.com/mlpack/mlpack/pull/1493/files#r299114822

Also, either before or during the merge, we should update HISTORY.md. :)

@mlpack-bot
mlpack-bot bot approved these changes Oct 11, 2019

mlpack-bot bot left a comment

Second approval provided automatically after 24 hours. 👍

@akhandait akhandait merged commit 44457ef into mlpack:master Oct 12, 2019
6 checks passed
LaTeX Documentation Checks: Build finished.
Memory Checks: Build finished.
Static Code Analysis Checks: Build finished.
Style Checks: Build finished.
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed
Member Author

akhandait commented Oct 12, 2019

I forgot to update HISTORY.md before the merge, could someone please do it? Or should I commit directly to the repository?

Member

zoq commented Oct 12, 2019

Updated HISTORY.md in 7d677d0.

@ShikharJ ShikharJ added this to To Review in Shikhar's Board Oct 13, 2019