Correct types in GridSample def.cc (#4655)
liqunfu committed Nov 15, 2022
1 parent d006f33 commit 8dc0ed5
Showing 3 changed files with 49 additions and 30 deletions.
25 changes: 15 additions & 10 deletions docs/Changelog.md
@@ -19937,11 +19937,16 @@ This version of the operator has been available since version 16 of the default

### <a name="GridSample-16"></a>**GridSample-16**</a>

-Given an `input` and a flow-field `grid`, computes the `output` using `input` values and pixel locations from `grid`.
-Currently, only spatial (4-D) inputs are supported. For `input` with shape (N, C, H, W) and `grid` with shape (N, H_out, W_out, 2),
-the `output` will have shape (N, C, H_out, W_out).
-For each output location `output[N, C, H_out, W_out]`, the size-2 vector `grid[N, H_out, W_out]` specifies `input` pixel locations `x` and `y`,
-which are used to interpolate the output value `output[N, C, H_out, W_out]`.
+Given an input `X` and a flow-field `grid`, computes the output `Y` using `X` values and pixel locations from `grid`.
+Currently, only spatial (4-D) inputs are supported. For input `X` with shape (N, C, H, W) and `grid` with shape (N, H_out, W_out, 2),
+the output `Y` will have shape (N, C, H_out, W_out).
+
+The tensor `X` contains values at centers of square pixels in a H by W 2-dimensional image.
+The tensor `grid` describes normalized positions where the output `Y` is to be computed
+using a specified interpolation method (the mode) and a padding mode (for grid positions falling outside the 2-dimensional image).
+
+Elements in `grid[N, H_out, W_out]` are size-2 vectors specifying positions in the 2-dimensional space of `X`.
+They are used to interpolate output values of `Y[N, C, H_out, W_out]`.

The GridSample operator is often used in doing grid generator and sampler in the [Spatial Transformer Networks](https://arxiv.org/abs/1506.02025).
See also in [torch.nn.functional.grid_sample](https://pytorch.org/docs/master/generated/torch.nn.functional.grid_sample.html#torch-nn-functional-grid-sample).
@@ -19966,24 +19971,24 @@ This version of the operator has been available since version 16 of the default
<dl>
<dt><tt>X</tt> (differentiable) : T1</dt>
<dd>4-D tensor of shape (N, C, H, W), where N is the batch size, C is the numbers of channels, H and W are the height and width of the input data.</dd>
-<dt><tt>grid</tt> (non-differentiable) : T1</dt>
+<dt><tt>grid</tt> (non-differentiable) : T2</dt>
<dd>Input offset, 4-D tensor of shape (N, H_out, W_out, 2), where H_out and W_out are the height and width of grid and output, Grid specifies the sampling pixel locations normalized by the input spatial dimensions. Therefore, it should have most values in the range of [-1, 1]. If grid has values outside the range of [-1, 1], the corresponding outputs will be handled as defined by padding_mode.</dd>
</dl>

#### Outputs

<dl>
-<dt><tt>Y</tt> (differentiable) : T2</dt>
-<dd>4-D tensor of shape (N, C, H_out, W_out).</dd>
+<dt><tt>Y</tt> (differentiable) : T1</dt>
+<dd>4-D tensor of shape (N, C, H_out, W_out) of sampled values. For integer input types, intermediate values are computed as floating point and cast to integer at the end.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
-<dd>Constrain input types to all tensor types.</dd>
+<dd>Constrain input `X` and output `Y` types to all tensor types.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double)</dt>
-<dd>Constrain output types to float tensors.</dd>
+<dd>Constrain grid types to float tensors.</dd>
</dl>

### <a name="Identity-16"></a>**Identity-16**</a>
25 changes: 15 additions & 10 deletions docs/Operators.md
@@ -9322,11 +9322,16 @@ Other versions of this operator: <a href="Changelog.md#GreaterOrEqual-12">12</a>

### <a name="GridSample"></a><a name="gridsample">**GridSample**</a>

-Given an `input` and a flow-field `grid`, computes the `output` using `input` values and pixel locations from `grid`.
-Currently, only spatial (4-D) inputs are supported. For `input` with shape (N, C, H, W) and `grid` with shape (N, H_out, W_out, 2),
-the `output` will have shape (N, C, H_out, W_out).
-For each output location `output[N, C, H_out, W_out]`, the size-2 vector `grid[N, H_out, W_out]` specifies `input` pixel locations `x` and `y`,
-which are used to interpolate the output value `output[N, C, H_out, W_out]`.
+Given an input `X` and a flow-field `grid`, computes the output `Y` using `X` values and pixel locations from `grid`.
+Currently, only spatial (4-D) inputs are supported. For input `X` with shape (N, C, H, W) and `grid` with shape (N, H_out, W_out, 2),
+the output `Y` will have shape (N, C, H_out, W_out).
+
+The tensor `X` contains values at centers of square pixels in a H by W 2-dimensional image.
+The tensor `grid` describes normalized positions where the output `Y` is to be computed
+using a specified interpolation method (the mode) and a padding mode (for grid positions falling outside the 2-dimensional image).
+
+Elements in `grid[N, H_out, W_out]` are size-2 vectors specifying positions in the 2-dimensional space of `X`.
+They are used to interpolate output values of `Y[N, C, H_out, W_out]`.

The GridSample operator is often used in doing grid generator and sampler in the [Spatial Transformer Networks](https://arxiv.org/abs/1506.02025).
See also in [torch.nn.functional.grid_sample](https://pytorch.org/docs/master/generated/torch.nn.functional.grid_sample.html#torch-nn-functional-grid-sample).
@@ -9351,24 +9356,24 @@ This version of the operator has been available since version 16 of the default
<dl>
<dt><tt>X</tt> (differentiable) : T1</dt>
<dd>4-D tensor of shape (N, C, H, W), where N is the batch size, C is the numbers of channels, H and W are the height and width of the input data.</dd>
-<dt><tt>grid</tt> (non-differentiable) : T1</dt>
+<dt><tt>grid</tt> (non-differentiable) : T2</dt>
<dd>Input offset, 4-D tensor of shape (N, H_out, W_out, 2), where H_out and W_out are the height and width of grid and output, Grid specifies the sampling pixel locations normalized by the input spatial dimensions. Therefore, it should have most values in the range of [-1, 1]. If grid has values outside the range of [-1, 1], the corresponding outputs will be handled as defined by padding_mode.</dd>
</dl>

#### Outputs

<dl>
-<dt><tt>Y</tt> (differentiable) : T2</dt>
-<dd>4-D tensor of shape (N, C, H_out, W_out).</dd>
+<dt><tt>Y</tt> (differentiable) : T1</dt>
+<dd>4-D tensor of shape (N, C, H_out, W_out) of sampled values. For integer input types, intermediate values are computed as floating point and cast to integer at the end.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
-<dd>Constrain input types to all tensor types.</dd>
+<dd>Constrain input `X` and output `Y` types to all tensor types.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double)</dt>
-<dd>Constrain output types to float tensors.</dd>
+<dd>Constrain grid types to float tensors.</dd>
</dl>


29 changes: 19 additions & 10 deletions onnx/defs/tensor/defs.cc
@@ -2340,11 +2340,16 @@ ONNX_OPERATOR_SET_SCHEMA(
.TypeAndShapeInferenceFunction([](InferenceContext& ctx) { resizeShapeInference_opset13_to_18(ctx); }));

static const char* GridSample_ver16_doc = R"DOC(
-Given an `input` and a flow-field `grid`, computes the `output` using `input` values and pixel locations from `grid`.
-Currently, only spatial (4-D) inputs are supported. For `input` with shape (N, C, H, W) and `grid` with shape (N, H_out, W_out, 2),
-the `output` will have shape (N, C, H_out, W_out).
-For each output location `output[N, C, H_out, W_out]`, the size-2 vector `grid[N, H_out, W_out]` specifies `input` pixel locations `x` and `y`,
-which are used to interpolate the output value `output[N, C, H_out, W_out]`.
+Given an input `X` and a flow-field `grid`, computes the output `Y` using `X` values and pixel locations from `grid`.
+Currently, only spatial (4-D) inputs are supported. For input `X` with shape (N, C, H, W) and `grid` with shape (N, H_out, W_out, 2),
+the output `Y` will have shape (N, C, H_out, W_out).
+
+The tensor `X` contains values at centers of square pixels in a H by W 2-dimensional image.
+The tensor `grid` describes normalized positions where the output `Y` is to be computed
+using a specified interpolation method (the mode) and a padding mode (for grid positions falling outside the 2-dimensional image).
+
+Elements in `grid[N, H_out, W_out]` are size-2 vectors specifying positions in the 2-dimensional space of `X`.
+They are used to interpolate output values of `Y[N, C, H_out, W_out]`.
The GridSample operator is often used in doing grid generator and sampler in the [Spatial Transformer Networks](https://arxiv.org/abs/1506.02025).
See also in [torch.nn.functional.grid_sample](https://pytorch.org/docs/master/generated/torch.nn.functional.grid_sample.html#torch-nn-functional-grid-sample).
@@ -2395,25 +2400,29 @@ ONNX_OPERATOR_SET_SCHEMA(
"Grid specifies the sampling pixel locations normalized by the input spatial dimensions. "
"Therefore, it should have most values in the range of [-1, 1]. "
"If grid has values outside the range of [-1, 1], the corresponding outputs will be handled as defined by padding_mode.",
-"T1",
+"T2",
OpSchema::Single,
true,
1,
OpSchema::NonDifferentiable)
.Output(
0,
"Y",
-"4-D tensor of shape (N, C, H_out, W_out).",
-"T2",
+"4-D tensor of shape (N, C, H_out, W_out) of sampled values. "
+"For integer input types, intermediate values are computed as floating point and cast to integer at the end.",
+"T1",
OpSchema::Single,
true,
1,
OpSchema::Differentiable)
-.TypeConstraint("T1", OpSchema::all_tensor_types(), "Constrain input types to all tensor types.")
+.TypeConstraint(
+"T1",
+OpSchema::all_tensor_types(),
+"Constrain input `X` and output `Y` types to all tensor types.")
.TypeConstraint(
"T2",
{"tensor(float16)", "tensor(float)", "tensor(double)"},
-"Constrain output types to float tensors.")
+"Constrain grid types to float tensors.")
.SetDoc(GridSample_ver16_doc)
.TypeAndShapeInferenceFunction([](InferenceContext& ctx) {
propagateElemTypeFromInputToOutput(ctx, 0, 0);
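
The corrected type roles can be illustrated with a small pure-Python sketch. It is not the ONNX reference implementation and is not part of this commit. It assumes a single-channel image, bilinear interpolation, zero padding outside the image, a pixel-center mapping of normalized coordinates, and truncation for the final integer cast; the names `denormalize`, `sample_bilinear`, and `grid_sample_2d` are hypothetical.

```python
import math


def denormalize(coord, size):
    """Map a normalized coordinate in [-1, 1] to a pixel coordinate.

    Assumption: -1 and 1 refer to the outer edges of the first and last
    pixels, so pixel centers sit at half-integer steps inside that range.
    """
    return ((coord + 1.0) * size - 1.0) / 2.0


def sample_bilinear(image, x, y):
    """Bilinearly sample one value from a 2-D list `image` (H rows, W columns).

    `x` indexes the width axis and `y` the height axis, both normalized.
    Positions outside the image contribute 0.0 (assumed zero padding).
    """
    height, width = len(image), len(image[0])
    px, py = denormalize(x, width), denormalize(y, height)
    x0, y0 = math.floor(px), math.floor(py)
    fx, fy = px - x0, py - y0

    def pixel(row, col):
        inside = 0 <= row < height and 0 <= col < width
        return float(image[row][col]) if inside else 0.0

    top = pixel(y0, x0) * (1.0 - fx) + pixel(y0, x0 + 1) * fx
    bottom = pixel(y0 + 1, x0) * (1.0 - fx) + pixel(y0 + 1, x0 + 1) * fx
    return top * (1.0 - fy) + bottom * fy


def grid_sample_2d(image, grid, cast_to_int=False):
    """Sample `image` at every (x, y) pair in `grid` (H_out rows, W_out columns).

    The grid is always floating point (the role of T2). The output follows
    the input type (the role of T1): with `cast_to_int=True` the float
    intermediate is cast to int at the end (truncation is an assumption).
    """
    output = []
    for grid_row in grid:
        out_row = []
        for x, y in grid_row:
            value = sample_bilinear(image, x, y)
            out_row.append(int(value) if cast_to_int else value)
        output.append(out_row)
    return output


image = [[10, 20], [30, 40]]            # integer input, H = W = 2
grid = [[(0.0, 0.0), (-0.5, -0.5)]]     # float grid, H_out = 1, W_out = 2
print(grid_sample_2d(image, grid, cast_to_int=True))
```

The grid stays floating point whatever the element type of the image, while the output takes its type from the image, which matches the `T1`/`T2` assignment in the diff above.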