
Conversation

@ShieLian
Contributor

Reduce the number of ops to do subpixel convolution
Fix #175

@zsdonghao
Member

zsdonghao commented Jul 15, 2017

Hi, are you sure the code is correct? I can't find where h and w are defined in the code.
To test whether the implementation is correct, I suggest testing it on the SRGAN example. Do they also output images with the checkerboard artifact at the beginning?

@ShieLian
Contributor Author

emm...
Sorry for not checking the code before committing. (For readability, the code here differs from my local version.)
w and h should be a and b, or just change a and b to w and h.
If you'd like to make this quick fix yourself, that would be helpful. Otherwise I'll do it tomorrow. (I'm in UTC+8.)

@zsdonghao
Member

Great. What I'm worried about is that your implementation may use a different shuffle method than the original paper. Please double-check, many thanks.

@subzerofun

subzerofun commented Jul 21, 2017

@ShieLian @zsdonghao I'm still experimenting with the SRGAN project (the results are amazing!) and have also mentioned the slow model-building time of the SubpixelConv2d layers in an issue. But I thought that was due to a CPU core/speed bottleneck on my side (I have an older i7 quad-core and thought you need a bazillion-core Xeon machine for this kind of computation).

So I was glad to see a solution to the problem! But as @zsdonghao suspected, @ShieLian: your fix changes the image quality of the GAN's output drastically. You can see a checkerboard pattern in the generated sample images, as well as color artifacts and an overall low-res, blurry output. I tried letting the training run anew for a few epochs, because the model weights from the previous training (checkpoint) seem to reset after the code change. But the resolution (sharp edges, high-frequency details) does not improve, and the checkerboard pattern also persists in all further results.

@ShieLian If your code produced the same output as before, I should see no change at all in my generated images and should be able to use the same weights as before. Or am I wrong here?

If I revert the subpixel code (and model weights), the output images are clear and sharp again. I currently don't have access to the machine with my local repo, but I will upload the images later so you can maybe check why the artifacts happen.

The model build time is nearly instantaneous with this PR; I would love to see it produce the same results as the original code!

@ShieLian
Contributor Author

ShieLian commented Jul 21, 2017

For my local implementation: if the input (shape 1x2x2x4, i.e. batch, H, W, C) is
[
[ [0,1,2,3] , [4,5,6,7] ],
[ [8,9,10,11] , [12,13,14,15] ]
]
Then the output is
[
[ [0], [1], [4], [5]],
[ [2], [3], [6], [7]],
[ [8], [9],[12],[13]],
[[10],[11],[14],[15]]
]
While the original output is
[
[ [0], [2], [4], [6]],
[ [1], [3], [5], [7]],
[ [8], [10],[12],[14]],
[ [9],[11],[13],[15]]
]
Maybe that's why the same weights give different results @subzerofun.
But a model trained completely from scratch will give equivalent results.
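The two shuffle orderings above can be reproduced with a small NumPy sketch (an illustration only, not TensorLayer's actual code; `pixel_shuffle` and its `row_major` flag are hypothetical names). `row_major=True` corresponds to this PR's ordering, `row_major=False` to the previous layer's:

```python
import numpy as np

def pixel_shuffle(x, r, row_major=True):
    """Rearrange an (N, H, W, C*r*r) tensor into (N, H*r, W*r, C).

    row_major=True:  channel c fills block position (c // r, c % r)
                     (the ordering of this PR, same as tf.depth_to_space).
    row_major=False: channel c fills block position (c % r, c // r)
                     (the ordering of the previous TensorLayer layer).
    """
    n, h, w, c = x.shape
    out_c = c // (r * r)
    y = x.reshape(n, h, w, r, r, out_c)    # split channels into (i, j, k)
    if not row_major:
        y = y.transpose(0, 1, 2, 4, 3, 5)  # swap the block row/column roles
    y = y.transpose(0, 1, 3, 2, 4, 5)      # interleave: (n, h, i, w, j, k)
    return y.reshape(n, h * r, w * r, out_c)

x = np.arange(16).reshape(1, 2, 2, 4)      # the 1x2x2x4 input from above
print(pixel_shuffle(x, 2, row_major=True)[0, :, :, 0])
print(pixel_shuffle(x, 2, row_major=False)[0, :, :, 0])
```

The first print reproduces the first 4x4 grid above, the second print the "original" one, so the two layers really do place the same channels at different spatial positions.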

@ShieLian
Contributor Author

[image: the paper's equation defining the periodic shuffling operator PS]
According to the paper, PS(T)_{1,0,0} = T_{0,0,1} = 1.
So strictly speaking, my implementation is the same as the paper's.
But I will try to provide a version that behaves the same as TensorLayer's.
It's up to you to decide which method to use. @zsdonghao
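For reference, the paper's indexing formula for PS can be transcribed literally. This is a sketch only (`ps_paper` is my own name, and the triple loop is deliberately naive); it evaluates the formula PS(T)[x, y, c] = T[x // r, y // r, C*r*(y % r) + C*(x % r) + c] and confirms the PS(T)_{1,0,0} = T_{0,0,1} = 1 check quoted above:

```python
import numpy as np

def ps_paper(T, r):
    """Direct transcription of the periodic shuffling operator PS.

    T has shape (H, W, C*r*r); the result has shape (H*r, W*r, C).
    """
    H, W, Crr = T.shape
    C = Crr // (r * r)
    out = np.empty((H * r, W * r, C), dtype=T.dtype)
    for x in range(H * r):
        for y in range(W * r):
            for c in range(C):
                out[x, y, c] = T[x // r, y // r,
                                 C * r * (y % r) + C * (x % r) + c]
    return out

T = np.arange(16).reshape(2, 2, 4)  # the 2x2x4 example from above
print(ps_paper(T, 2)[1, 0, 0])      # PS(T)_{1,0,0} = T_{0,0,1} = 1
```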

@zsdonghao
Member

let me check

@zsdonghao zsdonghao merged commit a31ead9 into tensorlayer:master Jul 21, 2017
@subzerofun

I'm sorry, but I have no idea about the mathematics behind the changes. But I have started the SRGAN training from scratch with the new code. And when I compare the quality and output progress of the newly generated images to the old ones, the newer ones are clearly worse. I don't know why that is, but it has to be because of this change, since I did not modify anything else in the SRGAN repo.

Take a look for yourself: here are two pictures, both from training epoch no. 175. The new output image has a lot more artifacts, and some parts again show the checkerboard pattern. Just look at the palm leaves; the results are clearly not the same. I will leave the training running for a while, but I don't think the new model will reach the same fidelity as the old one.

Old pixelshuffler code:

[image: 0175_2000_train]

New code:

[image: new]

Could you please check the output again?

@zsdonghao
Member

zsdonghao commented Jul 23, 2017

@subzerofun I am training SRGAN as well; it may finish in 1 to 2 days. I will let you know when the training finishes~ 😊

@ShieLian
Contributor Author

ShieLian commented Jul 23, 2017

@subzerofun
If you train the network starting from the original weights, it may perform very badly.
Try resetting the weights of the Conv layer right before the first SubpixelConv and the weights after it, and train only those.
(In fact, it's because the channel order changed, so training from the original weights can hardly recover. The weights of the layers before the Conv that feeds the first SubpixelConv can all be reused, but that Conv layer's weights need their channel order adjusted.)
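A rough sketch of that channel-order adjustment for the Conv layer feeding the first SubpixelConv. Everything here is an assumption for illustration: `permute_subpixel_weights` is a hypothetical helper (not a TensorLayer API), and it assumes the old ordering stores block position (i, j) of output channel k at index j*r*C + i*C + k while the new ordering expects i*r*C + j*C + k:

```python
import numpy as np

def permute_subpixel_weights(w, r, out_channels):
    # w: conv kernel of shape (kh, kw, in_c, out_channels * r * r) whose
    # outputs feed the subpixel layer. Reorders the last axis so weights
    # trained under one channel-to-space ordering drive the other.
    C = out_channels
    perm = np.empty(C * r * r, dtype=int)
    for i in range(r):          # block row
        for j in range(r):      # block column
            for k in range(C):  # output channel within the block
                perm[i * r * C + j * C + k] = j * r * C + i * C + k
    return w[..., perm]
```

With r=2 and C=1 this swaps the middle two output channels, i.e. it turns the [[0, 2], [1, 3]] block layout into [[0, 1], [2, 3]].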

@ShieLian
Contributor Author

@subzerofun
The version compatible with the original one is here: patch-2

@ShieLian ShieLian deleted the patch-1 branch July 23, 2017 17:37
@subzerofun

I think you misunderstood me; I did delete all model weights (all .npz files) before starting the training again. The comparison above is of a newly trained model, with the only difference being the changed TensorLayer version (this PR), trained from epoch 0 to 175 and compared to the old subpixel function.

There has to be a difference between the two functions (new/old); unfortunately I don't understand the math behind it. @ShieLian But what could be the reason, if you say the (numerical) outcome should be the same? Thank you for the patch; I will test your new version and compare it to the other results (and will delete the old weights, of course!).

@zsdonghao Lucky you: two days on your Titan sounds great. It took me 5–6 days on my 1080 Ti to get to epoch 2000.

@zsdonghao
Member

@ShieLian @subzerofun, I re-trained SRGAN with the simplified version for 1400 epochs; please have a look.

From the results, it seems the simplified version is correct.

Simplified layer:

[image: train_1400]

Previous version:

[image: train_1400_ok]
