Please clarify interpolation algorithm for resample2d #358

Open
BruceDai opened this issue Feb 28, 2023 · 5 comments

@BruceDai
Contributor

There are two interpolation modes for resample2d:

  1. Nearest Neighbor interpolation
  2. Linear interpolation

@huningxin @wchao1115 @zolkis Please clarify the interpolation algorithm for each of them, thanks.

@BruceDai
Contributor Author

Link to #270

@fdwr
Collaborator

fdwr commented Mar 1, 2023

BruceDai: You want to sample pixel centers, not the top left of each pixel (which would undesirably shift the output image slightly), and you want the output to be reflection invariant.

For bilinear sampling

scale.x = outputSize.x / inputSize.x
scale.y = outputSize.y / inputSize.y
inputCoordinate.x = (outputCoordinate.x + 0.5) / scale.x - 0.5
inputCoordinate.y = (outputCoordinate.y + 0.5) / scale.y - 0.5

Starting from a given output coordinate, compute the corresponding input coordinate. The integer part of the input coordinate gives the location of the 4 input pixels to sample, and the fractional part gives the weights for a linear interpolation between those adjacent pixels. e.g. given a 1-pixel-tall image being stretched horizontally, if the mapped input coordinate is x = 10.25, then the output pixel is weighted 75% from input pixel x = 10 and 25% from input pixel x = 11. For 2D, you apply the same linear weighting vertically too.
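
For reference, here is a minimal NumPy sketch of that mapping and blending. The helper name resample2d_linear and the edge-clamping behavior are illustrative assumptions, not taken from the WebNN spec:

import numpy as np

def resample2d_linear(x, out_h, out_w):
    # Half-pixel bilinear resampling of a 2D array, clamping reads at the edges.
    x = np.asarray(x, dtype=np.float64)
    in_h, in_w = x.shape
    scale_y, scale_x = out_h / in_h, out_w / in_w
    out = np.empty((out_h, out_w))
    for oy in range(out_h):
        iy = (oy + 0.5) / scale_y - 0.5        # map output pixel center into input space
        y0 = int(np.floor(iy))
        fy = iy - y0                            # fractional part = vertical weight
        y0c, y1c = np.clip([y0, y0 + 1], 0, in_h - 1)
        for ox in range(out_w):
            ix = (ox + 0.5) / scale_x - 0.5
            x0 = int(np.floor(ix))
            fx = ix - x0                        # fractional part = horizontal weight
            x0c, x1c = np.clip([x0, x0 + 1], 0, in_w - 1)
            top = (1 - fx) * x[y0c, x0c] + fx * x[y0c, x1c]
            bottom = (1 - fx) * x[y1c, x0c] + fx * x[y1c, x1c]
            out[oy, ox] = (1 - fy) * top + fy * bottom
    return out

# resample2d_linear([[0,1,2,3], [0,1,2,3], [12,13,14,15], [12,13,14,15]], 8, 8)
# reproduces the 8x8 output of the TF2 example below.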

In ONNX, it is Resize with coordinate_transformation_mode = half_pixel.
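
As a rough sketch of that ONNX mapping (the graph/tensor names are arbitrary and the pin to opset 13 is an assumption), an equivalent Resize node could be built with the onnx helper API:

import onnx
from onnx import helper, TensorProto

# Single Resize node that upsamples a 1x1x4x4 float tensor to 1x1x8x8
# with linear interpolation and half_pixel coordinate transformation.
sizes = helper.make_tensor("sizes", TensorProto.INT64, [4], [1, 1, 8, 8])
resize = helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],   # roi and scales left empty; sizes supplied
    outputs=["Y"],
    mode="linear",
    coordinate_transformation_mode="half_pixel",
)
graph = helper.make_graph(
    [resize], "resize_half_pixel",
    [helper.make_tensor_value_info("X", TensorProto.FLOAT, [1, 1, 4, 4])],
    [helper.make_tensor_value_info("Y", TensorProto.FLOAT, [1, 1, 8, 8])],
    initializer=[sizes],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
onnx.checker.check_model(model)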

In TF2, it looks like:

import tensorflow as tf

# NCHW
x = tf.constant(
        [[[
           [ 0, 1, 2, 3],
           [ 0, 1, 2, 3],
           [12,13,14,15],
           [12,13,14,15]
        ]]],
        dtype=tf.float32
    )
x_nhwc = tf.transpose(x, perm=[0, 2, 3, 1])

y_nhwc = tf.image.resize(
    x_nhwc,
    size=(8,8),
    method=tf.image.ResizeMethod.BILINEAR,
    preserve_aspect_ratio=False,
    antialias=False,
    name=None
)
y = tf.transpose(y_nhwc, perm=[0, 3, 1, 2])
print("x\n", x, sep='')
print("y\n", y, sep='');

#   tf.Tensor(
#   [[[[ 0.  1.  2.  3.]
#      [ 0.  1.  2.  3.]
#      [12. 13. 14. 15.]
#      [12. 13. 14. 15.]]]], shape=(1, 1, 4, 4), dtype=float32)
#   y
#   tf.Tensor(
#   [[[[ 0.    0.25  0.75  1.25  1.75  2.25  2.75  3.  ]
#      [ 0.    0.25  0.75  1.25  1.75  2.25  2.75  3.  ]
#      [ 0.    0.25  0.75  1.25  1.75  2.25  2.75  3.  ]
#      [ 3.    3.25  3.75  4.25  4.75  5.25  5.75  6.  ]
#      [ 9.    9.25  9.75 10.25 10.75 11.25 11.75 12.  ]
#      [12.   12.25 12.75 13.25 13.75 14.25 14.75 15.  ]
#      [12.   12.25 12.75 13.25 13.75 14.25 14.75 15.  ]
#      [12.   12.25 12.75 13.25 13.75 14.25 14.75 15.  ]]]], shape=(1, 1, 8, 8), dtype=float32)

The sampling pattern should look evenly distributed like:

[Figure: evenly distributed half-pixel sampling grid, from https://jricheimer.github.io/tensorflow/2019/02/11/resize-confusion/]

And not like:

[Figure: unevenly distributed sampling pattern]

For nearest neighbor sampling

I don't see any details in https://www.w3.org/TR/webnn/#api-mlgraphbuilder-resample2d, but I presume/propose you just round to nearest with X.5 halves toward negative infinity (a common default in graphics). So an input coordinate of x = 10.4 would read from x = 10, x = 10.9 from x = 11, and x = 10.5 from x = 10. In other words, x = ceil(x - 0.5), not x = floor(x + 0.5). Note this differs from the classic "round halves up" mode used in banking, and it differs from rounding halves to nearest even (which would give a bad staggered appearance).
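
A minimal NumPy sketch of this proposal, using the same half-pixel mapping as the bilinear case (again, the helper name resample2d_nearest and the edge clamping are illustrative assumptions):

import numpy as np

def resample2d_nearest(x, out_h, out_w):
    # Half-pixel nearest-neighbor resampling; halves round toward negative infinity.
    x = np.asarray(x, dtype=np.float64)
    in_h, in_w = x.shape
    scale_y, scale_x = out_h / in_h, out_w / in_w
    out = np.empty((out_h, out_w))
    for oy in range(out_h):
        iy = (oy + 0.5) / scale_y - 0.5
        sy = int(np.clip(np.ceil(iy - 0.5), 0, in_h - 1))   # ceil(v - 0.5): 10.5 -> 10
        for ox in range(out_w):
            ix = (ox + 0.5) / scale_x - 0.5
            sx = int(np.clip(np.ceil(ix - 0.5), 0, in_w - 1))
            out[oy, ox] = x[sy, sx]
    return out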

@BruceDai
Contributor Author

BruceDai commented Mar 7, 2023

@fdwr Thanks so much for your clarifications; please take a look at the resample2d implementations for WebNN-Baseline.

@anssiko
Member

anssiko commented Mar 8, 2023

Just want to say Thank You to @fdwr and @BruceDai for your attention to detail in this and other issues.

We can include in the specification figures similar to those referred to in this issue if/when they help explain the algorithms. Any figures are considered informative and complement the normative algorithms, i.e. they cannot supplant them. You can consider this idea an extra "enhancement" effort, the icing on the cake.

As a concrete example, inline SVG works for smaller and simpler figures: example and source. A good property of SVG is that it allows links, and of course, as a text-based format, it allows for human-readable diffs.

@anssiko
Member

anssiko commented May 19, 2023

We discussed this in https://www.w3.org/2023/05/11-webmachinelearning-minutes.html#t11 and it looks like this would be an opportunity for an interested expert and WG participant (@fdwr maybe? :-)) to propose a PR that clarifies the expected semantics of these two interpolation algorithms:

enum MLInterpolationMode {
  "nearest-neighbor",
  "linear"
};

Normative prose would go into the "Arguments: > mode" part of the resample2d() method section, and informative content such as figures could be added to a green note box following "Returns:", similar to the "can be generically emulated" boxes in various decomposable ops.
