Update softmax family ops behavior to align with other frameworks (fix #2289) (#2879)

* Update softmax family ops behavior to align with other frameworks

* Update logsoftmax, hardmax tests, regenerate docs and test data

* fix wrong input name in function

* regenerate test data

* fix flake8 error

* regenerate docs

* regenerate docs

* add missing type annotation for hardmax

* add the math for softmax family operators

* remove the 'description' field in docs as it is covered by the math

* fix wrong format in axis attr

* replace name with description

* restore the name field for axis attr

* regenerate docs

* regenerate docs

* add the missing name

* regenerate docs

* update reducesum to align with master

* regenerate tests

Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>
Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
2 people authored and jcwchen committed Sep 22, 2020
1 parent 370c187 commit 69e89dd
Showing 81 changed files with 843 additions and 346 deletions.
84 changes: 39 additions & 45 deletions docs/Changelog.md
@@ -16034,20 +16034,14 @@ This version of the operator has been available since version 13 of the default

### <a name="Hardmax-13"></a>**Hardmax-13**

The operator computes the hardmax (1 for the first maximum value, and 0 for all others) values for each layer in the batch
of the given input.
The operator computes the hardmax values for the given input:

The input does not need to explicitly be a 2D vector; rather, it will be
coerced into one. For an arbitrary n-dimensional tensor
input \in [a_0, a_1, ..., a_{k-1}, a_k, ..., a_{n-1}] and k is
the axis provided, then input will be coerced into a 2-dimensional tensor with
dimensions [a_0 * ... * a_{k-1}, a_k * ... * a_{n-1}]. For the default
case where axis=1, this means the input tensor will be coerced into a 2D tensor
of dimensions [a_0, a_1 * ... * a_{n-1}], where a_0 is often the batch size.
In this situation, we must have a_0 = N and a_1 * ... * a_{n-1} = D.
Each of these dimensions must be matched correctly, or else the operator
will throw errors. The output tensor has the same shape
and contains the hardmax values of the corresponding input.
Hardmax(element in input, axis) = 1 if the element is the first maximum value along the specified axis, 0 otherwise

The input does not need to explicitly be a 2D vector. The "axis" attribute
indicates the dimension along which Hardmax will be performed.
The output tensor has the same shape
and contains the Hardmax values of the corresponding input.
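
The new definition above can be sketched in NumPy (an illustrative sketch, not the ONNX reference implementation; `np.argmax` conveniently returns the *first* maximum, matching the spec):

```python
import numpy as np

def hardmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """1 for the first maximum value along `axis`, 0 for all other elements."""
    first_max = np.argmax(x, axis=axis)  # index of the first maximum along `axis`
    out = np.zeros_like(x)
    np.put_along_axis(out, np.expand_dims(first_max, axis), 1, axis=axis)
    return out
```

With ties, only the first maximum gets a 1: `hardmax([[3., 3., 1.]])` yields `[[1., 0., 0.]]`.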

#### Version

@@ -16056,8 +16050,12 @@ This version of the operator has been available since version 13 of the default
#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 1)</dt>
<dd>Describes the axis of the inputs when coerced to 2D; defaults to one because the 0th axis most likely describes the batch_size. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).</dd>
<dt><tt>axis</tt> : int (default is -1)</dt>
<dd>Describes the dimension Hardmax will be performed on. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).</dd>
</dl>

#### Inputs
@@ -16299,20 +16297,14 @@ This version of the operator has been available since version 13 of the default

### <a name="LogSoftmax-13"></a>**LogSoftmax-13**

The operator computes the logsoftmax (log of softmax) values for each layer in the batch
of the given input.
The operator computes the log of softmax values for the given input:

The input does not need to explicitly be a 2D vector; rather, it will be
coerced into one. For an arbitrary n-dimensional tensor
input \in [a_0, a_1, ..., a_{k-1}, a_k, ..., a_{n-1}] and k is
the axis provided, then input will be coerced into a 2-dimensional tensor with
dimensions [a_0 * ... * a_{k-1}, a_k * ... * a_{n-1}]. For the default
case where axis=1, this means the input tensor will be coerced into a 2D tensor
of dimensions [a_0, a_1 * ... * a_{n-1}], where a_0 is often the batch size.
In this situation, we must have a_0 = N and a_1 * ... * a_{n-1} = D.
Each of these dimensions must be matched correctly, or else the operator
will throw errors. The output tensor has the same shape
and contains the logsoftmax values of the corresponding input.
LogSoftmax(input, axis) = Log(Softmax(input, axis=axis))

The input does not need to explicitly be a 2D vector. The "axis" attribute
indicates the dimension along which LogSoftmax will be performed.
The output tensor has the same shape
and contains the LogSoftmax values of the corresponding input.
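
The identity `LogSoftmax(input, axis) = Log(Softmax(input, axis=axis))` can be sketched in NumPy as follows (a sketch, not the reference implementation; the max subtraction is a standard numerical-stability shift assumed here, and it leaves the result mathematically unchanged):

```python
import numpy as np

def logsoftmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Log(Softmax(x, axis)) computed in a numerically stable way."""
    shifted = x - np.max(x, axis=axis, keepdims=True)  # shift by the max to avoid overflow in exp
    return shifted - np.log(np.sum(np.exp(shifted), axis=axis, keepdims=True))
```

For example, `logsoftmax([0., 0.])` gives `[-log(2), -log(2)]`, since the softmax of two equal values is `[0.5, 0.5]`.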

#### Version

@@ -16321,8 +16313,12 @@ This version of the operator has been available since version 13 of the default
#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 1)</dt>
<dd>Describes the axis of the inputs when coerced to 2D; defaults to one because the 0th axis most likely describes the batch_size. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).</dd>
<dt><tt>axis</tt> : int (default is -1)</dt>
<dd>Describes the dimension LogSoftmax will be performed on. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).</dd>
</dl>

#### Inputs
@@ -18073,20 +18069,14 @@ This version of the operator has been available since version 13 of the default

### <a name="Softmax-13"></a>**Softmax-13**

The operator computes the softmax (normalized exponential) values for each layer in the batch
of the given input.
The operator computes the normalized exponential values for the given input:

The input does not need to explicitly be a 2D vector; rather, it will be
coerced into one. For an arbitrary n-dimensional tensor
input \in [a_0, a_1, ..., a_{k-1}, a_k, ..., a_{n-1}] and k is
the axis provided, then input will be coerced into a 2-dimensional tensor with
dimensions [a_0 * ... * a_{k-1}, a_k * ... * a_{n-1}]. For the default
case where axis=1, this means the input tensor will be coerced into a 2D tensor
of dimensions [a_0, a_1 * ... * a_{n-1}], where a_0 is often the batch size.
In this situation, we must have a_0 = N and a_1 * ... * a_{n-1} = D.
Each of these dimensions must be matched correctly, or else the operator
will throw errors. The output tensor has the same shape
and contains the softmax values of the corresponding input.
Softmax(input, axis) = Exp(input) / ReduceSum(Exp(input), axis=axis, keepdims=1)

The input does not need to explicitly be a 2D vector. The "axis" attribute
indicates the dimension along which Softmax will be performed.
The output tensor has the same shape
and contains the Softmax values of the corresponding input.
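
The formula `Softmax(input, axis) = Exp(input) / ReduceSum(Exp(input), axis=axis, keepdims=1)` can be sketched in NumPy (a sketch under the standard max-subtraction stability assumption, not the reference implementation):

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Exp(x) / ReduceSum(Exp(x), axis, keepdims=1), shifted by the max for stability."""
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))  # shifting by the max cancels in the ratio
    return e / np.sum(e, axis=axis, keepdims=True)
```

Each slice along `axis` sums to 1, e.g. `softmax([[1., 1.]])` is `[[0.5, 0.5]]`.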

#### Version

@@ -18095,8 +18085,12 @@ This version of the operator has been available since version 13 of the default
#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 1)</dt>
<dd>Describes the axis of the inputs when coerced to 2D; defaults to one because the 0th axis most likely describes the batch_size. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).</dd>
<dt><tt>axis</tt> : int (default is -1)</dt>
<dd>Describes the dimension Softmax will be performed on. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).</dd>
</dl>

#### Inputs
