
Generalize bilinear interpolation filler to N-D multilinear/multicubic/lanczos filler #4198

Open · christianpayer wants to merge 4 commits into master
Conversation

christianpayer

This branch implements an n-dimensional generalization of the bilinear filler (#2213) and adds cubic and Lanczos fillers.

It makes #3984 obsolete, as it uses a new common base class for all interpolation fillers. The base class InterpolationFillerBase calculates the weight values by calling the virtual interpolation functions of its derived classes. This PR implements linear, cubic, and Lanczos interpolation fillers; additional interpolation functions (e.g. Hermite, Mitchell, Gaussian) can easily be added by deriving from InterpolationFillerBase and implementing the virtual functions f() and support().
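
For illustration, here is a rough Python sketch of the 1-D kernels that f() and support() would encode for the fillers in this PR. The PR itself is C++; the exact kernel constants below are the standard Matlab-imresize-style choices and are my assumption, not copied from the code:

```python
import numpy as np

# Sketch of the 1-D interpolation kernels behind f()/support().
# Assumption: the cubic kernel is the Keys kernel with a = -0.5, as in
# Matlab's imresize; the PR's C++ code may differ in details.

def linear(x):
    """Linear (triangle) kernel; support() would return 1."""
    x = abs(x)
    return 1.0 - x if x < 1.0 else 0.0

def cubic(x, a=-0.5):
    """Keys cubic kernel; support() would return 2."""
    x = abs(x)
    if x < 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0

def lanczos(x, n=2):
    """Lanczos-n kernel; support() would return n (2, 3, 4)."""
    return np.sinc(x) * np.sinc(x / n) if abs(x) < n else 0.0
```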

I have written a simple test script in Python that tests the fillers with various scaling factors in x and y. I will upload it after some cleanup.

If you have comments/suggestions, let me know!

Christian Payer added 4 commits May 23, 2016 13:59

- add generic interpolation weight filler
  - add base class InterpolationFillerBase that implements common interpolation function calculations
  - change 'linear' filler to 'multilinear' filler
  - add 'cubic', 'lanczos2', 'lanczos3' and 'lanczos4' fillers
  - all fillers support N dimensions and different integer scaling factors per dimension
- remove everything except multilinear
@christianpayer (Author)

This Jupyter notebook tests the new fillers:
http://nbviewer.jupyter.org/gist/christianpayer/3ee54b5c6463927410fff0fbc7cf556d

@Coldmooon commented May 24, 2016

Can this PR be used to downsample a feature map with a fractional factor?

@christianpayer (Author)

@Coldmooon The fillers can also be used to downsample a feature map with fractional factors (1/2, 1/3, 1/4, ...), as they work with both Deconvolution and Convolution layers, where the strides define the up- and downsampling factors, respectively.
As these fillers were originally intended to be used as deconvolution fillers, using them with a convolution layer may not give the desired result: the resulting feature map values are scaled by factor_x * factor_y. So if you use these fillers for downsampling, you need to append an additional layer (e.g. PowerLayer) that scales the values by 1 / (factor_x * factor_y) in order to get the same results as with Matlab, etc.
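
To see where that factor_x * factor_y scale comes from, here is a minimal sketch of my own, assuming the standard Caffe bilinear filler construction: the 1-D kernel for an integer factor sums to the factor, so the separable 2-D kernel sums to factor_x * factor_y, which is exactly what the PowerLayer divides away.

```python
import numpy as np

def bilinear_kernel_1d(factor):
    # same construction as Caffe's bilinear filler for an integer factor
    size = 2 * factor - factor % 2
    f = np.ceil(size / 2.0)
    c = (2 * f - 1 - f % 2) / (2.0 * f)
    return np.array([1 - abs(x / f - c) for x in range(size)])

k = bilinear_kernel_1d(2)   # [0.25, 0.75, 0.75, 0.25]
k2d = np.outer(k, k)        # separable 2-D kernel for factor (2, 2)
print(k.sum())              # 2.0 == factor
print(k2d.sum())            # 4.0 == factor_x * factor_y, hence scale 1/4
```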

@Coldmooon commented May 29, 2016

@christianpayer Thank you very much~! Recently, I've been studying resampling with Caffe, and your PR helps me a lot. I modified Test_interpolation_fillers.ipynb a little to test downsampling. The code is:

```python
import tempfile

def get_kernel(factor, support):
    # kernel size for an integer scaling factor and a given kernel support
    return 2 * support * factor - factor % 2

def get_pad(factor, support):
    # padding that centers the kernel on the sampling grid
    return int(((2 * support - 1) * factor - factor % 2) / 2.)

def downsample_net_file(in_size, factor, method, support):
    kernel = (get_kernel(factor[0], support), get_kernel(factor[1], support))
    pad = (get_pad(factor[0], support), get_pad(factor[1], support))
    stride = (factor[0], factor[1])
    # float division, otherwise the scale is truncated to 0 in Python 2
    scale_factor = 1.0 / (stride[0] * stride[1])
    with tempfile.NamedTemporaryFile(mode='w+', delete=False) as f:
        f.write("""name: 'pythonnet' force_backward: true
          input: 'data' input_shape { dim: %d dim: %d dim: %d dim: %d }
          layer { type: 'Convolution' name: 'downsample' bottom: 'data' top: 'downsample'
          convolution_param { kernel_size: %d kernel_size: %d stride: %d stride: %d pad: %d pad: %d
          num_output: %d group: %d weight_filler: { type: '%s' } bias_term: false } }
          layer { type: 'Power' name: 'scale' bottom: 'downsample' top: 'downsample'
          power_param { scale: %f } }""" % (
          in_size[0], in_size[1], in_size[2], in_size[3],
          kernel[0], kernel[1], stride[0], stride[1], pad[0], pad[1],
          in_size[1], in_size[1], method, scale_factor))
    return f.name
```

Using the multilinear mode, the above network outputs a 180*240 downsampled image that looks very good. But the comparison between out and reference is NOK. The code is:

```python
import os
import numpy
import skimage.transform
import caffe

# img: the H x W x C test image loaded earlier in the notebook
img_blob = img.reshape(1, *img.shape).transpose(0, 3, 1, 2)  # to N x C x H x W
support = 1
factor = (2, 2)
net_file = downsample_net_file(img_blob.shape, factor, 'multilinear', support)
net = caffe.Net(net_file, caffe.TEST)
os.remove(net_file)
net.blobs['data'].data[...] = img_blob
net.forward()
out = net.blobs['downsample'].data[0].transpose(1, 2, 0)
reference = skimage.transform.rescale(img, (0.5, 0.5), mode='constant', order=1, clip=False, cval=0)

if numpy.allclose(out, reference, atol=1e-05):
    print("factor %d %d: OK" % factor)
else:
    print("factor %d %d: NOK" % factor)
```

A further question: what about non-integer factors? For example, 1.4286 for upsampling and 0.7143 for downsampling. A fractional stride will be rounded down in Caffe. In these cases, I use another way to compute kernel_size, stride, and pad, but I don't know if it's correct.

```
# factor: 1.4286; upsample image from 7*7 to 10*10
# solution:
# compute output_size first:
# output_size = factor * input_size = 7 * 1.4286 = 10
# Then, set kernel_size, stride, and pad accordingly:
# (output_size + 2*pad - kernel_size)/stride + 1 = input_size
# we obtain: kernel_size = 4; stride = 1; pad = 0
# The final prototxt is:
layer {
  name: "upsample-7to10"
  type: "Deconvolution"
  bottom: "data"
  top: "upsample"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    kernel_size: 4
    stride: 1
    num_output: 3
    group: 3
    weight_filler {
      type: "multilinear"
    }
    bias_term: false
  }
}
```

```
# factor: 0.7143; image size: 7*7 -> 5*5
# solution:
# output_size = factor * input_size = 0.7143 * 7 = 5
# Then, set kernel_size, stride, and pad accordingly:
# (input_size + 2*pad - kernel_size)/stride + 1 = output_size
# we obtain: kernel_size = 3; stride = 1; pad = 0
# The final prototxt is:
layer {
  name: "downsample-7to5"
  type: "Convolution"
  bottom: "data"
  top: "downsample"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    kernel_size: 3
    stride: 1
    num_output: 3
    group: 3
    weight_filler {
      type: "multilinear"
    }
    bias_term: false
  }
}
```

@ajtulloch (Contributor)

Really nice work! Could you add some unit tests, following the existing examples in test_filler.cpp?

@christianpayer (Author)

@Coldmooon Your changes for downsampling look OK. You are right, there is a difference between the downsampling of this PR and the reference created with skimage. I don't know exactly how skimage performs downsampling, as it can be implemented in multiple ways. Even if you compare the outputs of skimage's linear/cubic resampling with OpenCV's, you will see many differences.
With this PR I tried to resample (almost) exactly like Matlab's imresize. So if you compare the outputs pixel by pixel, you should compare them with Matlab (and hopefully see no difference).

Regarding your question about non-integer factors: this is not possible with a Caffe convolution. With an integer factor, you get the same kernel for every x/y coordinate of the image. With non-integer factors this is not the case; you would need different kernels at different coordinates of the image, which the convolution layer cannot express.

If you want more insight into this, I suggest debugging Matlab's imresize, looking at its 'contributions' function, and comparing its outputs for integer and non-integer factors.
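
To illustrate the point with a sketch of my own (not taken from imresize): the linear weights at an output position depend only on the fractional part of its source coordinate, and that fractional part repeats only for integer factors.

```python
import numpy as np

# For output index i, the continuous source coordinate is
# (i + 0.5) / factor - 0.5, and the linear interpolation weights depend
# only on its fractional part (the "phase").
def fractional_phases(factor, n_out=8):
    coords = (np.arange(n_out) + 0.5) / factor - 0.5
    return coords - np.floor(coords)

print(fractional_phases(2.0))
# [0.75 0.25 0.75 0.25 ...]: 2 repeating phases -> one stride-2 kernel suffices
print(fractional_phases(1.4286))
# no short repeating pattern -> a different kernel per position, which a
# single convolution kernel cannot express
```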

@christianpayer (Author)

@ajtulloch Thanks! I can add some unit tests, but I don't know what should be tested. I could check whether the kernels sum up to a fixed value, or I could compare against hardcoded weights (possibly created with Matlab?) and check for differences.
What would you suggest?
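
For what it's worth, one possible shape for such a check, sketched in Python against pycaffe rather than test_filler.cpp, reusing the downsample_net_file helper from above. The expected weights are the classic 2x bilinear kernel and are my assumption for what the multilinear filler should produce:

```python
import numpy as np
import caffe  # assumes pycaffe is built

# Hypothetical weight check: for factor 2 and support 1, the classic
# 1-D bilinear kernel is [0.25, 0.75, 0.75, 0.25], and the 2-D kernel
# should be its outer product.
expected_1d = np.array([0.25, 0.75, 0.75, 0.25])
net_file = downsample_net_file((1, 3, 360, 480), (2, 2), 'multilinear', 1)
net = caffe.Net(net_file, caffe.TEST)
w = net.params['downsample'][0].data[0, 0]
assert np.allclose(w, np.outer(expected_1d, expected_1d))
assert np.isclose(w.sum(), 4.0)  # kernels sum to factor_x * factor_y
```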

@Coldmooon commented Jun 1, 2016

@christianpayer I tested upsampling and downsampling with non-integer factors. The resampled images in both cases look noisy, much like a downsampled image produced with an integer factor but without the appended Power layer. Now I can see what you meant about the non-integer case. Thank you~.

Maybe it's possible to implement a new nearest-neighbor interpolation layer for non-integer factors. By the way, it seems that STN can create a bilinear sampling kernel; I don't know if STN works with non-integer factors.
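
A rough sketch of what such a nearest-neighbor layer would compute, in 1-D with a hypothetical nearest_resize_1d helper (my illustration, not an existing Caffe layer):

```python
import numpy as np

# Nearest-neighbor resampling with an arbitrary (non-integer) factor:
# round each output's continuous source coordinate to the closest sample.
def nearest_resize_1d(x, factor):
    n_out = int(round(len(x) * factor))
    idx = np.clip(((np.arange(n_out) + 0.5) / factor - 0.5).round().astype(int),
                  0, len(x) - 1)
    return x[idx]

print(nearest_resize_1d(np.arange(7), 10.0 / 7.0))  # 7 samples -> 10 samples
```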
