Unique (#2141)
* deprecate spatial mode of BN

* Initial commit

* More changes

* Revert accidental deletion

* More changes

* Tests

* Minor nits

* Checking in defs file

* Fix build break

* Fix build break

* Resolve comments

* Fix node test case

* Fix build failure

* Fix stale docs

* Resolve PR comments

* Make output naming conventions consistent

* Fix build break

* N-D input

* unique of N-D tensors

* fix test failure

* review according to reviewer's comments

* fix test failures

* fix shape inference tests

* update doc according to reviewer's comments

* fix build and comment for index

* sample implementation for unique unsorted case

* fix type error in shape inference

* update according to reviewer's comment to make unique implementation clean

* fix docs
liqunfu authored and gramalingam committed Aug 1, 2019
1 parent 5a5588a commit 3ba2e31
Showing 31 changed files with 832 additions and 2 deletions.
118 changes: 118 additions & 0 deletions docs/Changelog.md
@@ -10748,3 +10748,121 @@ This version of the operator has been available since version 11 of the default
<dd>Constrain index tensor to int64</dd>
</dl>

### <a name="Unique-11"></a>**Unique-11**

Find the unique elements of a tensor. When an optional attribute 'axis' is provided, unique subtensors sliced along the 'axis' are returned.
Otherwise the input tensor is flattened and unique values of the flattened tensor are returned.

This operator returns the unique values or sliced unique subtensors of the input tensor, plus three optional outputs.
The first output tensor 'Y' contains all unique values or subtensors of the input.
The second, optional output tensor 'indices' contains the indices in 'X' of the first occurrence of each element of 'Y'.
The third, optional output tensor 'inverse_indices' contains, for each element of 'X', the index of its corresponding element in 'Y'.
The fourth, optional output tensor 'counts' contains the number of times each element of 'Y' occurs in the input.

Outputs are sorted in ascending order by default or, when 'sorted' is 0, kept in the order of first occurrence of the values in the input.

https://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html

Example 1:
input_X = [2, 1, 1, 3, 4, 3]
attribute_sorted = 0
attribute_axis = None
output_Y = [2, 1, 3, 4]
output_indices = [0, 1, 3, 4]
output_inverse_indices = [0, 1, 1, 2, 3, 2]
output_counts = [1, 2, 2, 1]

Example 2:
input_X = [[1, 3], [2, 3]]
attribute_sorted = 1
attribute_axis = None
output_Y = [1, 2, 3]
output_indices = [0, 2, 1]
output_inverse_indices = [0, 2, 1, 2]
output_counts = [1, 1, 2]

Example 3:
input_X = [[1, 0, 0], [1, 0, 0], [2, 3, 4]]
attribute_sorted = 1
attribute_axis = 0
output_Y = [[1, 0, 0], [2, 3, 4]]
output_indices = [0, 2]
output_inverse_indices = [0, 0, 1]
output_counts = [2, 1]

Example 4:
input_X = [[[1., 1.], [0., 1.], [2., 1.], [0., 1.]],
           [[1., 1.], [0., 1.], [2., 1.], [0., 1.]]]
attribute_sorted = 1
attribute_axis = 1

Intermediate data are presented below for better understanding:

There are 4 subtensors sliced along axis 1 of input_X (shape = (2, 4, 2)):
A: [[1, 1], [1, 1]],
   [[0, 1], [0, 1]],
   [[2, 1], [2, 1]],
   [[0, 1], [0, 1]].

There are 3 unique subtensors:
[[1, 1], [1, 1]],
[[0, 1], [0, 1]],
[[2, 1], [2, 1]].

Sorted unique subtensors:
B: [[0, 1], [0, 1]],
   [[1, 1], [1, 1]],
   [[2, 1], [2, 1]].

output_Y is constructed from B:
[[[0., 1.], [1., 1.], [2., 1.]],
 [[0., 1.], [1., 1.], [2., 1.]]]

output_indices maps from B to A:
[1, 0, 2]

output_inverse_indices maps from A to B:
[1, 0, 2, 0]

output_counts = [2, 1, 1]
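
The flattened cases (Examples 1 and 2) can be reproduced with a small pure-Python sketch. The helper name `unique_flat` and its `sorted_output` parameter are illustrative assumptions, not part of ONNX or NumPy, and the sketch only handles an already-flattened list of hashable, comparable values.

```python
def unique_flat(values, sorted_output=True):
    """Sketch of Unique on an already-flattened input (hypothetical helper)."""
    first_index = {}  # value -> index of its first occurrence in `values`
    occurrences = {}  # value -> number of times it appears
    for i, v in enumerate(values):
        if v not in first_index:
            first_index[v] = i
            occurrences[v] = 0
        occurrences[v] += 1

    # dicts keep insertion order, so this is first-occurrence order
    y = list(first_index)
    if sorted_output:
        y.sort()

    position = {v: k for k, v in enumerate(y)}  # value -> index in y
    indices = [first_index[v] for v in y]
    inverse_indices = [position[v] for v in values]
    counts = [occurrences[v] for v in y]
    return y, indices, inverse_indices, counts


# Example 1: sorted = 0, no axis
example_1 = unique_flat([2, 1, 1, 3, 4, 3], sorted_output=False)

# Example 2: sorted = 1, no axis; the 2-D input is flattened first
flat = [v for row in [[1, 3], [2, 3]] for v in row]
example_2 = unique_flat(flat, sorted_output=True)
```

Both calls return the tuple `(Y, indices, inverse_indices, counts)` in the same order as the operator's outputs.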

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>(Optional) The dimension along which to apply unique. If not specified, the unique elements of the flattened input are returned.</dd>
<dt><tt>sorted</tt> : int (default is 1)</dt>
<dd>(Optional) Whether to sort the unique elements in ascending order before returning them as output. Must be either 0 or 1 (default).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>An N-D input tensor that is to be processed.</dd>
</dl>

#### Outputs (1 - 4)

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>A tensor of the same type as 'X' containing all the unique values or subtensors sliced along the provided 'axis' in 'X', either sorted or kept in the order in which they first occur in input 'X'.</dd>
<dt><tt>indices</tt> (optional) : tensor(int64)</dt>
<dd>A 1-D INT64 tensor containing the indices in 'X' of the first occurrence of each element of 'Y'. When 'axis' is provided, it contains indices to subtensors in input 'X' on the 'axis'. When 'axis' is not provided, it contains indices to values in the flattened input tensor.</dd>
<dt><tt>inverse_indices</tt> (optional) : tensor(int64)</dt>
<dd>A 1-D INT64 tensor containing, for each element of 'X', the index of its corresponding element in 'Y'. When 'axis' is provided, it contains indices to subtensors in output 'Y' on the 'axis'. When 'axis' is not provided, it contains indices to values in output 'Y'.</dd>
<dt><tt>counts</tt> (optional) : tensor(int64)</dt>
<dd>A 1-D INT64 tensor containing the count of each element of 'Y' in input 'X'.</dd>
</dl>
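
The output descriptions above imply the following shapes. This is a rough sketch only: `unique_output_shapes` is a hypothetical helper (not an ONNX API), it assumes a non-negative 'axis', and it takes the number of unique elements as an argument because that is only known at run time.

```python
from math import prod


def unique_output_shapes(x_shape, num_unique, axis=None):
    """Expected output shapes of Unique (hypothetical helper, non-negative axis only)."""
    if axis is None:
        # flattened input: Y is 1-D, one inverse index per input value
        y_shape = (num_unique,)
        inverse_len = prod(x_shape)
    else:
        # only the dimension at `axis` changes: it shrinks to the unique count
        y_shape = tuple(num_unique if d == axis else n for d, n in enumerate(x_shape))
        inverse_len = x_shape[axis]
    return {
        "Y": y_shape,
        "indices": (num_unique,),
        "inverse_indices": (inverse_len,),
        "counts": (num_unique,),
    }


# Example 2: X has shape (2, 2), no axis, 3 unique values
shapes_example_2 = unique_output_shapes((2, 2), num_unique=3)
# Example 4: X has shape (2, 4, 2), axis = 1, 3 unique subtensors
shapes_example_4 = unique_output_shapes((2, 4, 2), num_unique=3, axis=1)
```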

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Input can be of any tensor type.</dd>
</dl>

243 changes: 243 additions & 0 deletions docs/Operators.md
@@ -139,6 +139,7 @@
* <a href="#Tile">Tile</a>
* <a href="#TopK">TopK</a>
* <a href="#Transpose">Transpose</a>
* <a href="#Unique">Unique</a>
* <a href="#Unsqueeze">Unsqueeze</a>
* <a href="#Upsample">Upsample</a>
* <a href="#Where">Where</a>
@@ -15210,6 +15211,248 @@ expect(node, inputs=[data], outputs=[transposed],
</details>


### <a name="Unique"></a><a name="unique">**Unique**</a>

Find the unique elements of a tensor. When an optional attribute 'axis' is provided, unique subtensors sliced along the 'axis' are returned.
Otherwise the input tensor is flattened and unique values of the flattened tensor are returned.

This operator returns the unique values or sliced unique subtensors of the input tensor, plus three optional outputs.
The first output tensor 'Y' contains all unique values or subtensors of the input.
The second, optional output tensor 'indices' contains the indices in 'X' of the first occurrence of each element of 'Y'.
The third, optional output tensor 'inverse_indices' contains, for each element of 'X', the index of its corresponding element in 'Y'.
The fourth, optional output tensor 'counts' contains the number of times each element of 'Y' occurs in the input.

Outputs are sorted in ascending order by default or, when 'sorted' is 0, kept in the order of first occurrence of the values in the input.

https://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html

Example 1:
input_X = [2, 1, 1, 3, 4, 3]
attribute_sorted = 0
attribute_axis = None
output_Y = [2, 1, 3, 4]
output_indices = [0, 1, 3, 4]
output_inverse_indices = [0, 1, 1, 2, 3, 2]
output_counts = [1, 2, 2, 1]

Example 2:
input_X = [[1, 3], [2, 3]]
attribute_sorted = 1
attribute_axis = None
output_Y = [1, 2, 3]
output_indices = [0, 2, 1]
output_inverse_indices = [0, 2, 1, 2]
output_counts = [1, 1, 2]

Example 3:
input_X = [[1, 0, 0], [1, 0, 0], [2, 3, 4]]
attribute_sorted = 1
attribute_axis = 0
output_Y = [[1, 0, 0], [2, 3, 4]]
output_indices = [0, 2]
output_inverse_indices = [0, 0, 1]
output_counts = [2, 1]

Example 4:
input_X = [[[1., 1.], [0., 1.], [2., 1.], [0., 1.]],
           [[1., 1.], [0., 1.], [2., 1.], [0., 1.]]]
attribute_sorted = 1
attribute_axis = 1

Intermediate data are presented below for better understanding:

There are 4 subtensors sliced along axis 1 of input_X (shape = (2, 4, 2)):
A: [[1, 1], [1, 1]],
   [[0, 1], [0, 1]],
   [[2, 1], [2, 1]],
   [[0, 1], [0, 1]].

There are 3 unique subtensors:
[[1, 1], [1, 1]],
[[0, 1], [0, 1]],
[[2, 1], [2, 1]].

Sorted unique subtensors:
B: [[0, 1], [0, 1]],
   [[1, 1], [1, 1]],
   [[2, 1], [2, 1]].

output_Y is constructed from B:
[[[0., 1.], [1., 1.], [2., 1.]],
 [[0., 1.], [1., 1.], [2., 1.]]]

output_indices maps from B to A:
[1, 0, 2]

output_inverse_indices maps from A to B:
[1, 0, 2, 0]

output_counts = [2, 1, 1]
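
The intermediate steps of Example 4 (slice into A, deduplicate, sort into B, rebuild Y) can be traced with a pure-Python sketch. `unique_axis1_3d` is a hypothetical helper written only for this illustration: it is limited to 3-D nested lists with axis = 1, and it assumes that lexicographic ordering of the subtensors is the intended sort order, which holds for this example.

```python
def unique_axis1_3d(x, sorted_output=True):
    """Sketch of Unique with axis=1 on a 3-D nested list (hypothetical helper)."""
    outer, middle = len(x), len(x[0])
    # A: one subtensor per position along axis 1, made hashable as nested tuples
    slices = [tuple(tuple(x[i][j]) for i in range(outer)) for j in range(middle)]

    first_index, occurrences = {}, {}
    for j, s in enumerate(slices):
        first_index.setdefault(s, j)  # remember only the first occurrence
        occurrences[s] = occurrences.get(s, 0) + 1

    # B: unique subtensors, in first-occurrence order unless sorted
    unique_slices = list(first_index)
    if sorted_output:
        unique_slices.sort()  # lexicographic comparison of nested tuples

    position = {s: k for k, s in enumerate(unique_slices)}
    # rebuild Y by placing the unique subtensors back along axis 1
    y = [[list(s[i]) for s in unique_slices] for i in range(outer)]
    indices = [first_index[s] for s in unique_slices]
    inverse_indices = [position[s] for s in slices]
    counts = [occurrences[s] for s in unique_slices]
    return y, indices, inverse_indices, counts


x = [[[1., 1.], [0., 1.], [2., 1.], [0., 1.]],
     [[1., 1.], [0., 1.], [2., 1.], [0., 1.]]]
y, indices, inverse_indices, counts = unique_axis1_3d(x)
```

Converting each subtensor to a nested tuple is a design choice that makes it usable as a dictionary key; a tensor library would compare slices directly instead.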

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>(Optional) The dimension along which to apply unique. If not specified, the unique elements of the flattened input are returned.</dd>
<dt><tt>sorted</tt> : int (default is 1)</dt>
<dd>(Optional) Whether to sort the unique elements in ascending order before returning them as output. Must be either 0 or 1 (default).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>An N-D input tensor that is to be processed.</dd>
</dl>

#### Outputs (1 - 4)

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>A tensor of the same type as 'X' containing all the unique values or subtensors sliced along the provided 'axis' in 'X', either sorted or kept in the order in which they first occur in input 'X'.</dd>
<dt><tt>indices</tt> (optional) : tensor(int64)</dt>
<dd>A 1-D INT64 tensor containing the indices in 'X' of the first occurrence of each element of 'Y'. When 'axis' is provided, it contains indices to subtensors in input 'X' on the 'axis'. When 'axis' is not provided, it contains indices to values in the flattened input tensor.</dd>
<dt><tt>inverse_indices</tt> (optional) : tensor(int64)</dt>
<dd>A 1-D INT64 tensor containing, for each element of 'X', the index of its corresponding element in 'Y'. When 'axis' is provided, it contains indices to subtensors in output 'Y' on the 'axis'. When 'axis' is not provided, it contains indices to values in output 'Y'.</dd>
<dt><tt>counts</tt> (optional) : tensor(int64)</dt>
<dd>A 1-D INT64 tensor containing the count of each element of 'Y' in input 'X'.</dd>
</dl>
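
The relationships between the four outputs can be checked by hand. The sketch below uses the values from Example 1 (sorted = 0, no 'axis') and plain Python lists; the variable names are illustrative only.

```python
# Values from Example 1 (sorted = 0, no axis)
x = [2, 1, 1, 3, 4, 3]
y = [2, 1, 3, 4]
indices = [0, 1, 3, 4]
inverse_indices = [0, 1, 1, 2, 3, 2]
counts = [1, 2, 2, 1]

# 'indices' gathers Y out of X
y_from_x = [x[i] for i in indices]
# 'inverse_indices' rebuilds X out of Y
x_from_y = [y[k] for k in inverse_indices]
# 'counts' tallies how often each position of Y is referenced
counts_from_inverse = [inverse_indices.count(k) for k in range(len(y))]
```

Here `y_from_x` equals `y`, `x_from_y` equals `x`, and `counts_from_inverse` equals `counts`.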

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Input can be of any tensor type.</dd>
</dl>


#### Examples

<details>
<summary>not_sorted_without_axis</summary>

```python
node_not_sorted = onnx.helper.make_node(
    'Unique',
    inputs=['X'],
    outputs=['Y', 'indices', 'inverse_indices', 'counts'],
    sorted=0
)
# numpy unique does not retain original order (it sorts the output unique values)
# https://github.com/numpy/numpy/issues/8621
# we need to recover unsorted output and indices
x = np.array([2.0, 1.0, 1.0, 3.0, 4.0, 3.0], dtype=np.float32)
y, indices, inverse_indices, counts = np.unique(x, True, True, True)

# prepare index mapping from sorted to unsorted
argsorted_indices = np.argsort(indices)
inverse_indices_map = {i: si for i, si in zip(argsorted_indices, np.arange(len(argsorted_indices)))}

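# reorder indices into first-occurrence order before gathering y from x;
# gathering with the still-sorted indices would return the sorted values
indices = indices[argsorted_indices]
y = np.take(x, indices, axis=0)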
inverse_indices = np.asarray([inverse_indices_map[i] for i in inverse_indices], dtype=np.int64)
counts = counts[argsorted_indices]
# print(y)
# [2.0, 1.0, 3.0, 4.0]
# print(indices)
# [0 1 3 4]
# print(inverse_indices)
# [0, 1, 1, 2, 3, 2]
# print(counts)
# [1, 2, 2, 1]

expect(node_not_sorted, inputs=[x], outputs=[y, indices, inverse_indices, counts], name='test_unique_not_sorted_without_axis')
```

</details>


<details>
<summary>sorted_with_axis</summary>

```python
node_sorted = onnx.helper.make_node(
    'Unique',
    inputs=['X'],
    outputs=['Y', 'indices', 'inverse_indices', 'counts'],
    sorted=1,
    axis=0
)

x = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]], dtype=np.float32)
y, indices, inverse_indices, counts = np.unique(x, True, True, True, axis=0)
# print(y)
# [[1. 0. 0.]
# [2. 3. 4.]]
# print(indices)
# [0 2]
# print(inverse_indices)
# [0 0 1]
# print(counts)
# [2 1]

expect(node_sorted, inputs=[x], outputs=[y, indices, inverse_indices, counts], name='test_unique_sorted_with_axis')
```

</details>


<details>
<summary>sorted_with_axis_3d</summary>

```python
node_sorted = onnx.helper.make_node(
    'Unique',
    inputs=['X'],
    outputs=['Y', 'indices', 'inverse_indices', 'counts'],
    sorted=1,
    axis=1
)

x = np.array([[[1., 1.], [0., 1.], [2., 1.], [0., 1.]],
              [[1., 1.], [0., 1.], [2., 1.], [0., 1.]]], dtype=np.float32)
y, indices, inverse_indices, counts = np.unique(x, True, True, True, axis=1)
# print(y)
# [[[0. 1.]
# [1. 1.]
# [2. 1.]]
# [[0. 1.]
# [1. 1.]
# [2. 1.]]]
# print(indices)
# [1 0 2]
# print(inverse_indices)
# [1 0 2 0]
# print(counts)
# [2 1 1]
expect(node_sorted, inputs=[x], outputs=[y, indices, inverse_indices, counts], name='test_unique_sorted_with_axis_3d')
```

</details>


<details>
<summary>sorted_without_axis</summary>

```python
node_sorted = onnx.helper.make_node(
    'Unique',
    inputs=['X'],
    outputs=['Y', 'indices', 'inverse_indices', 'counts']
)

x = np.array([2.0, 1.0, 1.0, 3.0, 4.0, 3.0], dtype=np.float32)
y, indices, inverse_indices, counts = np.unique(x, True, True, True)
expect(node_sorted, inputs=[x], outputs=[y, indices, inverse_indices, counts], name='test_unique_sorted_without_axis')
```

</details>


### <a name="Unsqueeze"></a><a name="unsqueeze">**Unsqueeze**</a>

Insert single-dimensional entries to the shape of a tensor.