Advanced Machine Learning #231

DonaldTsang · 2018-04-13T01:30:50Z

Is it possible to replace Caffe (the slowest on the Python platform) with PyTorch (the fastest overall) or MXNet (which can beat PyTorch on parallel GPUs)?
Is it possible to replace VGG7 with Inception or ResNet, which outperform VGG7?

DonaldTsang · 2018-04-13T01:36:18Z

Resnet https://github.com/facebook/fb.resnet.torch
Inception https://github.com/Moodstocks/inception-v3.torch
Others https://github.com/Cadene/pretrained-models.pytorch

DonaldTsang · 2018-04-13T01:58:26Z

An idea: categorize the images in the database as "pure", "single-JPG", "double-JPG", or "multi-JPG" (JPG as in JPG compression).
Use that category as a measure of how "noisy" an image is, then apply the right amount of denoising so as not to overshoot.
Only the "pure" images should be used as the base dataset for testing compression and reverse image compression.
Reference: https://www.politesi.polimi.it/bitstream/10589/132721/1/2017_04_Chen.pdf

nagadomi · 2018-04-13T15:03:03Z

Is it possible to replace Caffe (the slowest on the Python platform) with PyTorch (the fastest overall) or MXNet (which can beat PyTorch on parallel GPUs)?

waifu2x is implemented in LuaJIT/Torch, not Caffe. Torch already seems outdated, so switching to PyTorch would be good, but for now I don't have the resources to do it.
tsurumeso has released a Chainer version.
https://github.com/tsurumeso/waifu2x-chainer

Is it possible to replace VGG7 with Inception or ResNet, which outperform VGG7?

A ResNet model is already available in the dev branch.
benchmark: https://github.com/nagadomi/waifu2x/blob/dev/appendix/benchmark.md
Unfortunately it is much slower than the current model, so it cannot be used in web services.

An idea: categorize the images in the database as "pure", "single-JPG", "double-JPG", or "multi-JPG" (JPG as in JPG compression).

This has already been done. waifu2x lets you specify the JPEG quality and the number of compression passes for real-time data augmentation during training. The dataset was built from images that have not been JPEG compressed.
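The augmentation idea above can be sketched as a parameter sampler. This is a minimal illustration, not waifu2x's training code: the function names, the quality range, and the maximum number of passes are all assumptions, and the actual JPEG encoding step is left out so the sketch stays dependency-free.

```python
import random

def sample_jpeg_passes(rng, max_passes=2, quality_range=(60, 95)):
    """Return one JPEG quality value per compression pass.

    An empty list means the clean source image is used unchanged.
    max_passes and quality_range are illustrative defaults only.
    """
    n_passes = rng.randint(0, max_passes)
    low, high = quality_range
    return [rng.randint(low, high) for _ in range(n_passes)]

def category(qualities):
    """Map the number of passes onto the labels proposed in the thread."""
    names = {0: "pure", 1: "single-JPG", 2: "double-JPG"}
    return names.get(len(qualities), "multi-JPG")

rng = random.Random(0)
for _ in range(3):
    qualities = sample_jpeg_passes(rng)
    # A real pipeline would now re-encode the clean patch once per quality value.
    print(category(qualities), qualities)
```

Because the degradation is sampled on the fly from clean images, the "noisy" categories never need to be stored in the dataset itself.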

DonaldTsang · 2018-04-14T04:49:52Z

@nagadomi

Unfortunately it is much slower than the current model

Maybe reduce the size of the ResNet by using fewer modules? And compare it with VGG5/7/9/16/19 to create a graph of per-epoch training speed against total training time and accuracy?

waifu2x can specify JPEG quality

What about auto-detection of JPEG quality? Could that be implemented as well?

nagadomi · 2018-04-17T17:12:19Z

Maybe reduce the size of the ResNet by using fewer modules? And compare it with VGG5/7/9/16/19 to create a graph of per-epoch training speed against total training time and accuracy?

With a shallow network, accuracy degrades. I think this is related to the receptive field size (in a fully convolutional network, it depends on the number of layers and the filter size). I think it may be solved with dilated convolution or a progressive approach.
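To make the receptive field point concrete, here is a small calculation for stacks of stride-1 convolutions, where each layer adds (kernel_size - 1) * dilation pixels. The layer counts and dilation rates are illustrative assumptions, not the configuration of any waifu2x model.

```python
def receptive_field(layers):
    """Receptive field (in pixels, one axis) of stacked stride-1 convolutions.

    layers: list of (kernel_size, dilation) tuples.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf

shallow = [(3, 1)] * 3                               # three plain 3x3 convs
plain_7 = [(3, 1)] * 7                               # seven plain 3x3 convs
dilated_7 = [(3, d) for d in (1, 2, 4, 8, 4, 2, 1)]  # same depth, dilated

print(receptive_field(shallow))    # 7
print(receptive_field(plain_7))    # 15
print(receptive_field(dilated_7))  # 45
```

Removing layers shrinks the context each output pixel can see, while dilation enlarges it at the same depth and parameter count.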

What about auto-detection of JPEG quality? Could that be implemented as well?

I have already implemented it, but not as open source work. The JPEG noise level can be predicted as a classification task over sets of image patches.
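Since that implementation is not public, the following is only a guess at the general shape of patch-based classification: classify each patch, then take a majority vote. The patch size, the voting rule, and `toy_classifier` (a stand-in for a trained CNN) are all hypothetical.

```python
from collections import Counter

def extract_patches(image, size, stride):
    """Yield size x size patches from a 2D list of pixel values."""
    height, width = len(image), len(image[0])
    for y in range(0, height - size + 1, stride):
        for x in range(0, width - size + 1, stride):
            yield [row[x:x + size] for row in image[y:y + size]]

def predict_noise_level(image, classify_patch, size=8, stride=8):
    """Classify every patch and return the most common label."""
    votes = Counter(classify_patch(p) for p in extract_patches(image, size, stride))
    if not votes:
        raise ValueError("image is smaller than one patch")
    return votes.most_common(1)[0][0]

def toy_classifier(patch):
    # Placeholder for a trained model: label 1 if the patch has a large
    # value spread, label 0 otherwise.
    flat = [v for row in patch for v in row]
    return 1 if max(flat) - min(flat) >= 10 else 0

image = [[0] * 16 for _ in range(16)]
image[0][0] = 255  # only the top-left patch looks "noisy"
print(predict_noise_level(image, toy_classifier))  # 0 (three of four patches vote 0)
```

Voting over many patches makes the image-level estimate less sensitive to any single patch that is flat or otherwise uninformative.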

DonaldTsang · 2018-04-18T02:25:59Z

@nagadomi what about using expert systems for JPEG noise level detection?

2ji3150 · 2018-05-02T21:45:47Z

It looks like the ResNet version is 2.3 times slower than the upconv version, but it gives better quality than upconv with TTA (8 times slower). That means it is faster than upconv with TTA and also better in quality, so it makes sense for it to replace the normal TTA option. BTW, is there any plan to train a ResNet art model?
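For readers unfamiliar with TTA (test-time augmentation), the sketch below shows why it costs about eight forward passes. It assumes the eight variants are the four rotations with and without a horizontal flip, which is consistent with the 8x figure above but is not taken from the waifu2x source; `model` is any callable that maps a 2D list to a 2D list.

```python
def rot90(img):
    """Rotate a 2D list 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def rot270(img):
    """Rotate a 2D list 90 degrees counterclockwise (inverse of rot90)."""
    return [list(row) for row in zip(*img)][::-1]

def hflip(img):
    """Mirror a 2D list left to right."""
    return [row[::-1] for row in img]

def tta_predict(model, img):
    """Average the model output over 8 flip/rotation variants of the input."""
    outputs = []
    for flip in (False, True):
        for k in range(4):
            x = hflip(img) if flip else img
            for _ in range(k):
                x = rot90(x)
            y = model(x)
            # Undo the transform in reverse order so all outputs align.
            for _ in range(k):
                y = rot270(y)
            if flip:
                y = hflip(y)
            outputs.append(y)
    height, width = len(outputs[0]), len(outputs[0][0])
    return [[sum(o[i][j] for o in outputs) / len(outputs) for j in range(width)]
            for i in range(height)]

img = [[1, 2, 3], [4, 5, 6]]
print(tta_predict(lambda x: x, img))  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

Taking the quoted figures at face value, a model that is 2.3 times slower than upconv would still be roughly 8 / 2.3 ≈ 3.5 times faster than upconv with TTA.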

DonaldTsang · 2018-09-20T05:42:30Z

@2ji3150 @nagadomi New idea: NASNet

It looks like NASNet can outperform most other neural network architectures with LESS computation.

DonaldTsang · 2018-09-20T05:54:18Z

As a reference: #216
(BTW thanks @Yolkis for suggesting that)
We should consider training speed and model generation speed.

nagadomi · 2018-09-20T07:23:47Z

Generally, pooling layers cannot be used in super-resolution tasks.
In network architectures for classification, the input resolution decreases as the number of layers increases, but in super-resolution it does not.
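The difference can be seen by tracing the spatial size through a stack of layers with the usual output-size formula, floor((size + 2 * padding - kernel) / stride) + 1. The two layer stacks below are made-up examples, and the padded 3x3 convolutions are an assumption chosen to keep the arithmetic simple.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

def trace(size, layers):
    """Return the spatial size after each (kernel, stride, padding) layer."""
    sizes = [size]
    for kernel, stride, padding in layers:
        size = out_size(size, kernel, stride, padding)
        sizes.append(size)
    return sizes

conv = (3, 1, 1)  # 3x3 convolution, stride 1, padding 1: keeps the size
pool = (2, 2, 0)  # 2x2 pooling, stride 2: halves the size

classifier = [conv, pool, conv, pool, conv, pool]
sr_net = [conv] * 6

print(trace(64, classifier))  # [64, 64, 32, 32, 16, 16, 8]
print(trace(64, sr_net))      # [64, 64, 64, 64, 64, 64, 64]
```

A classifier can afford to discard spatial detail on the way to a single label, whereas a super-resolution network has to produce an output pixel for every input position.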

DonaldTsang · 2018-11-21T17:58:46Z

@nagadomi is it possible to see this graph (the purple parts) and see if there are alternatives for Waifu2x?

nagadomi · 2018-11-21T23:06:36Z

@DonaldTsang
I added a new model last week.
benchmark: https://github.com/nagadomi/waifu2x/blob/master/appendix/benchmark.md#art (cunet/art)
It consists of two cascaded U-Nets extended with SEBlock (Squeeze-and-Excitation Networks).

Edit:
In the above figure, RefineNet (Stack-U-Net) is a similar model.
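As background on the SEBlock mentioned above, here is a dependency-free sketch of the squeeze-and-excitation operation: global average pooling per channel, two small fully connected layers, and a per-channel rescaling. Biases are omitted and the weights are placeholders; this shows the general technique, not the cunet implementation.

```python
import math

def se_block(features, w1, w2):
    """Apply squeeze-and-excitation to a list of C channels (2D lists).

    w1: reduction weights, shape (C // r) x C
    w2: expansion weights, shape C x (C // r)
    """
    # Squeeze: one global average per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in features]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(features, gates)]

features = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[1.0, 1.0]]    # reduce 2 channels to 1 hidden unit
w2 = [[0.0], [0.0]]  # zero weights give sigmoid(0) = 0.5 for both gates
print(se_block(features, w1, w2))
# [[[0.5, 1.0], [1.5, 2.0]], [[1.0, 1.0], [1.0, 1.0]]]
```

With trained weights the gates differ per channel, letting the network emphasize or suppress feature maps based on global image statistics.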

yu45020 · 2018-11-27T19:47:07Z

@nagadomi
I came here from this issue. Thanks for sharing the new model. Have you tried atrous convolutions for image upscaling?

There is a paper that uses atrous convolutions to segment small objects in satellite images. The model increases the atrous rates and then decreases them. I coded a similar model for my manga text segmentation project and found a clear improvement in accuracy. I am rewriting and testing a similar model for image upscaling. The preliminary results seem acceptable, and I plan to train it thoroughly on a server.

nagadomi · 2018-11-28T02:10:46Z

@yu45020
I have tried dilated/atrous convolution. It is better than an ordinary FCN, but the improvement is not dramatic. Currently, I think that a Residual U-Net (with Concat replaced by Add) has better speed and accuracy than fully dilated convolution networks.
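One plausible contributor to the speed difference is the cost of the convolution that follows each skip connection: concatenation doubles its input channels, while addition leaves them unchanged. The channel count below is an arbitrary example, not a waifu2x setting.

```python
def conv_params(in_channels, out_channels, kernel=3, bias=True):
    """Number of learnable parameters in a 2D convolution layer."""
    weights = kernel * kernel * in_channels * out_channels
    return weights + (out_channels if bias else 0)

C = 64  # illustrative channel count

# U-Net with Concat: skip and decoder features are stacked, so 2C inputs.
concat_params = conv_params(2 * C, C)
# Residual U-Net with Add: features are summed, so C inputs.
add_params = conv_params(C, C)

print(concat_params)  # 73792
print(add_params)     # 36928
```

The multiply-accumulate count of that layer scales the same way, so each merge point costs roughly half as much with Add under these assumptions.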

I am also developing an OCR engine for manga. It is a closed-source product, so I cannot describe the details, but there are results from p. 59 onward in this slide deck (in Japanese).

yu45020 · 2018-11-28T04:15:32Z

@nagadomi
Thanks for the advice! I will also check a U-Net like model before training.

Your project seems to accomplish what I am after. It is very interesting and seems comparable to ABBYY's engine. My project's in-sample predictions achieve similar results, but my goal is only to segment all text pixels. Back to your product: I notice the slides come from a seminar. Do you plan to publish a technical report?

DonaldTsang · 2018-12-08T13:49:24Z

@nagadomi @yu45020 any news? If yes, we can write something up in #251

DonaldTsang mentioned this issue Nov 21, 2018

State of Art ：Think about the Next-Gen waifu2x #236

Closed

yu45020 mentioned this issue Jan 12, 2019

Preliminary Suggestions on Model Configuration nmhkahn/CARN-pytorch#8

Closed

DonaldTsang mentioned this issue May 14, 2019

Advanced CNN Architectures lltcggie/waifu2x-caffe#168

Open
