An alternative to reconstruct large image by parts #238

yu45020 opened this issue Jun 4, 2018 · 0 comments


yu45020 commented Jun 4, 2018

Square grids occur when the output is affected by data other than the input.
For example, zero-padding in a convolution layer spoils the accuracy near the edges of the output image. With a 3x3 convolution and 1px padding, the outermost 1px of the output image becomes unstable. So when padding is used, the unstable area has to be calculated and cut out. I believe batch normalization causes similar problems.
So, in waifu2x, padding and batch normalization are not used.
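
The edge effect described above can be reproduced in a few lines. This is a minimal sketch in NumPy rather than the actual waifu2x or PyTorch code: `conv3x3_zero_pad` is a hypothetical helper, and the random image and box kernel are made up for illustration. It runs a zero-padded 3x3 convolution on a whole image and on its left half alone, then compares the two.

```python
import numpy as np

def conv3x3_zero_pad(x, k):
    """3x3 correlation with 1px zero padding; output has the same size as x."""
    p = np.pad(x, 1, mode="constant")
    h, w = x.shape
    out = np.zeros((h, w), dtype=float)
    for di in range(3):
        for dj in range(3):
            out += k[di, dj] * p[di:di + h, dj:dj + w]
    return out

rng = np.random.default_rng(0)
image = rng.random((8, 8))           # made-up test image
kernel = np.full((3, 3), 1.0 / 9.0)  # simple box filter

full = conv3x3_zero_pad(image, kernel)
tile = conv3x3_zero_pad(image[:, :4], kernel)  # left half processed on its own

# Per-column maximum difference between the tile and the same region of `full`.
diff = np.abs(full[:, :4] - tile)
print(diff.max(axis=0))  # only the last column, next to the cut, is nonzero
```

The tile sees zeros where the full image has real neighbours, so exactly one column at the cut disagrees. Stacking more layers widens that band.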

Why we should care

Many architectures use padding during training, so if waifu2x used padding, it could initialize training from other people's model checkpoints. Models could also concatenate layers (e.g. DCSCN) or add up residuals easily.

Re-training my models is expensive.

But padding makes the edge values unstable, especially when a large image is split into pieces that are then merged back. For example, grids always show up in the following image.

An alternative

Replication border padding + overlapping splitting

I first pad the border of the whole image with replicated edge values, then split it into pieces. Each piece has overlapping borders. After rescaling the pieces, I cut off the overlapped parts and merge the pieces back into one image. I also cut the padded border from the final image.
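
The steps above can be sketched end to end without the real network. Everything named here is an assumption for illustration: `toy_upscale_2x` is a hypothetical stand-in for the model (nearest-neighbour upsampling followed by a zero-padded 3x3 blur), `upscale_by_tiles` is not part of waifu2x or DCSCN, and NumPy's `np.pad(..., mode="edge")` plays the role of replication padding.

```python
import numpy as np

def toy_upscale_2x(x):
    """Hypothetical stand-in model: nearest 2x upsample, then zero-padded 3x3 blur."""
    up = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
    p = np.pad(up, 1, mode="constant")
    h, w = up.shape
    out = np.zeros((h, w))
    for di in range(3):
        for dj in range(3):
            out += p[di:di + h, dj:dj + w]
    return out / 9.0

def upscale_by_tiles(image, model, seg=16, pad=3, scale=2):
    """Replicate-pad the image, upscale overlapping tiles, crop overlaps, merge."""
    height, width = image.shape
    padded = np.pad(image, pad, mode="edge")   # replication padding of the border
    crop = pad * scale                         # overlap width in output pixels
    final = np.zeros((height * scale, width * scale))
    for i in range(0, height, seg):
        for j in range(0, width, seg):
            i_end, j_end = min(i + seg, height), min(j + seg, width)
            # In padded coordinates, tile plus overlap spans [i, i_end + 2*pad).
            part = padded[i:i_end + 2 * pad, j:j_end + 2 * pad]
            out = model(part)
            h_out, w_out = out.shape
            out = out[crop:h_out - crop, crop:w_out - crop]  # drop the overlap
            final[i * scale:i_end * scale, j * scale:j_end * scale] = out
    return final

rng = np.random.default_rng(0)
image = rng.random((40, 50))  # made-up test image

tiled = upscale_by_tiles(image, toy_upscale_2x, seg=16, pad=3)
# Reference: process the padded image in one piece, then cut the padded border.
reference = toy_upscale_2x(np.pad(image, 3, mode="edge"))[6:-6, 6:-6]
# Naive split with no overlap, compared against one-piece processing.
naive = upscale_by_tiles(image, toy_upscale_2x, seg=16, pad=0)

print(np.allclose(tiled, reference))
print(np.allclose(naive, toy_upscale_2x(image)))
```

For this toy model the overlapped version matches one-piece processing while the naive split does not, because the toy model's edge influence is narrower than the overlap. A real network with a wider receptive field may need more overlap for the same guarantee.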

An overlap of 3 pixels seems to be enough, though a larger value might be better.
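
One way to reason about the overlap size is to count how far padding can reach. For stride-1 convolutions, each layer lets border values influence `dilation * (kernel_size - 1) / 2` more pixels. The helper below is a hypothetical sketch of that arithmetic. The layer list is made up and is not the DCSCN configuration.

```python
def unstable_border(layers):
    """Pixels at each edge that padding can influence, for stride-1 layers.

    `layers` is a list of (kernel_size, dilation) pairs with odd kernel sizes.
    """
    return sum(dilation * (kernel_size - 1) // 2 for kernel_size, dilation in layers)

# Hypothetical network: three plain 3x3 convolutions.
print(unstable_border([(3, 1)] * 3))  # → 3
```

A deeper network has a larger theoretical border, so an overlap that looks clean by eye is an empirical result rather than a guarantee.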

Here is a naive and buggy example. The image is sliced starting from the top left. If a slice is narrower than the padded width, the code will raise an error.

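import torch
from torch import nn
from PIL import Image
from torchvision.transforms.functional import to_tensor

# DCSCN is my model class; its import and constructor arguments are omitted here
model = DCSCN(*)
# load an image and resize it
img = Image.open("2.png").convert("RGB")
img_up = img.resize((2*img.size[0], 2*img.size[1]), Image.BILINEAR)
img = to_tensor(img).unsqueeze(0)        # shape (1, 3, H, W)
img_up = to_tensor(img_up).unsqueeze(0)

# main
seg = 78          # tile size in input pixels
padded = 3        # replication padding, also the overlap on each side of a tile
rem = padded * 2  # overlap width after 2x upscaling
img = nn.ReplicationPad2d(padded)(img)
batch, channel, height, width = img.size()
final = torch.zeros((1, 3, height*2, width*2))
for i in range(padded, height, seg):
    for j in range(padded, width, seg):
        # tile plus `padded` pixels of overlap on every side, clipped at the image end
        part = img[:, :, (i-padded):min(i+seg+padded, height), (j-padded):min(j+seg+padded, width)]
        part_u = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)(part)
        out = model.forward_checkpoint((part, part_u))
        out = out[:, :, rem:-rem, rem:-rem]   # remove overlapping part but might raise error
        _, _, p_h, p_w = out.size()
        final[:, :, 2*i:2*i+p_h, 2*j:2*j+p_w] = out
# cut the replication-padded border from the merged result
final_ = final[:, :, rem:-rem, rem:-rem]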
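
The failure case mentioned before the snippet comes from the loop bounds. The sketch below only replays the index arithmetic along one axis. `tile_spans` is a hypothetical helper and the lengths are made-up numbers. It compares the loop as posted, which runs up to the padded length, with a possible adjustment that stops before the trailing padded border.

```python
def tile_spans(padded_len, seg, pad, stop):
    """Return (start, end, core_len) per tile along one axis of the padded image.

    core_len is what remains after dropping `pad` pixels from both ends.
    """
    spans = []
    for i in range(pad, stop, seg):
        start, end = i - pad, min(i + seg + pad, padded_len)
        spans.append((start, end, end - start - 2 * pad))
    return spans

orig_len, seg, pad = 156, 78, 3   # made-up image length; seg and pad as in the snippet
padded_len = orig_len + 2 * pad   # 162

as_posted = tile_spans(padded_len, seg, pad, stop=padded_len)
adjusted = tile_spans(padded_len, seg, pad, stop=padded_len - pad)

print(as_posted)  # → [(0, 84, 78), (78, 162, 78), (156, 162, 0)]
print(adjusted)   # → [(0, 84, 78), (78, 162, 78)]
```

With these numbers the posted bounds produce a last slice that lies entirely in the padding and has nothing left after cropping. Stopping the range at `padded_len - pad` skips it, and the remaining cores still add up to the original length.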

This is the 2x upscaled output image. I zoomed in 10x and could not find any square grids.
