Update softmax family ops behavior to align with other frameworks (fix #2289) #2879
Conversation
"from the back. Accepted range is [-r, r-1] where r = rank(input).", | ||
AttributeProto::INT, | ||
static_cast<int64_t>(1)); | ||
"axis", axis_attr, AttributeProto::INT, static_cast<int64_t>(-1)); |
The version should also be bumped, given the default value of "axis" changed.
In my opinion, if it's only a default value change, I'd suggest not bumping. The benefit is not big enough to be worth the cost of a version bump.
I think the current softmax op version has already been bumped since last release (current version is already 13). Is it necessary to bump it once more?
No, not in that case. There should be only one version bump per release.
Yeah, changing any of these {attribute names, attribute default values, tensor meanings} is a version-breaking change.
.FillUsing(
    SoftmaxFamilyDocGenerator("softmax", "normalized exponential"))
.SetContextDependentFunctionBodyBuilder(
    [](const FunctionBodyBuildContext& ctx,
Did you verify the correctness of the subgraph? (I.e., that the "expanded" and non-"expanded" models generated in this PR produce the same outputs when fed the same inputs.)
Here's the old decomposition I wrote down before (ignore the flattening part now, I guess):
https://fdwr.github.io/LostOnnxDocs/OperatorFormulas.html
function SoftMax(Input; Output; axis):
FlattenedInput = Flatten(Input, axis) // Flatten to 2D
NormalizedInput = SoftMax2D(FlattenedInput; ; axis)
Output = Reshape(NormalizedInput, Shape(Input))
endfunction
function SoftMax2D(Input; Output; axis):
MaxInput = ReduceMax(Input, axes=[1], keepdims=1)
ExpInput = Exp(Input - MaxInput)
ReducedExpInput = ReduceSum(ExpInput, axes=[1], keepdims=1)
Output = ExpInput / ReducedExpInput
endfunction
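To sanity-check that decomposition against the new per-axis definition, here is a small numpy mirror of it (my sketch, not code from the PR). Note the 2-D flattening reproduces the per-axis result only when axis is the last axis, which is presumably why the flattening part can be ignored under the opset-13 semantics:

import numpy as np

def softmax_direct(x, axis):
    # Per-axis softmax, the opset-13 reference behavior.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def softmax_via_flatten(x, axis):
    # Mirrors Flatten -> SoftMax2D -> Reshape from the pseudocode above.
    axis = axis % x.ndim
    flat = x.reshape(int(np.prod(x.shape[:axis], dtype=np.int64)), -1)  # Flatten(Input, axis)
    m = flat.max(axis=1, keepdims=True)            # ReduceMax, axes=[1], keepdims=1
    e = np.exp(flat - m)                           # Exp(Input - MaxInput)
    out = e / e.sum(axis=1, keepdims=True)         # ExpInput / ReducedExpInput
    return out.reshape(x.shape)                    # Reshape(..., Shape(Input))

x = np.random.randn(2, 3, 4)
print(np.allclose(softmax_via_flatten(x, -1), softmax_direct(x, -1)))  # True: last axis
print(np.allclose(softmax_via_flatten(x, 1), softmax_direct(x, 1)))    # False: inner axis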
@linkerzhang @fdwr I have verified the correctness of the subgraph by running inference on it in onnxruntime and comparing the results with the reference numpy implementation.
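For readers who want to reproduce that kind of check, a sketch of what it might look like (illustrative only, not the PR's actual test code; the tensor shape and tolerances are my choices):

import numpy as np
import onnx
import onnxruntime as ort
from onnx import TensorProto, helper

# A single Softmax node using the new opset-13 default (axis=-1).
node = helper.make_node("Softmax", ["x"], ["y"])
graph = helper.make_graph(
    [node], "softmax_check",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, [2, 3, 4])],
    [helper.make_tensor_value_info("y", TensorProto.FLOAT, [2, 3, 4])])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
onnx.checker.check_model(model)

x = np.random.randn(2, 3, 4).astype(np.float32)
sess = ort.InferenceSession(model.SerializeToString(),
                            providers=["CPUExecutionProvider"])
got = sess.run(None, {"x": x})[0]

# Reference numpy implementation of per-axis softmax.
e = np.exp(x - x.max(axis=-1, keepdims=True))
ref = e / e.sum(axis=-1, keepdims=True)
np.testing.assert_allclose(got, ref, rtol=1e-5, atol=1e-6)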
@linkerzhang @wschin @fdwr I have updated the PR according to the comments. Please review it again. Thanks!
@wschin @linkerzhang I have updated this PR. Is it OK to merge? Thanks!
Could you please sync this branch with master again? Sorry for the late request; we want this PR in for this release.
Done. Do I need to sign off according to the DCO check? I tried signing off the commits following its instructions, but I found it makes the git history confusing.
@prasanthpul, is DCO required now? @daquexian, maybe you can do
[Update] As discussed with @prasanthpul offline, DCO is not required now, so I just merged it. Thank you for all the hard work!
Update softmax family ops behavior to align with other frameworks (fix onnx#2289) (onnx#2879)

* Update softmax family ops behavior to align with other frameworks
* Update logsoftmax, hardmax tests, regenerate docs and test data
* fix wrong input name in function
* regenerate test data
* fix flake8 error
* regenerate docs
* regenerate docs
* add missing type annotation for hardmax
* add the math for softmax family operators
* remove the 'description' field in docs as it is covered by the math
* fix wrong format in axis attr
* replace name with description
* restore the name field for axis attr
* regenerate docs
* regenerate docs
* add the missing name
* regenerate docs
* update reducesum to align with master
* regenerate tests

Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>
Fix #2289
This PR updates softmax and logsoftmax so that they are functions and align with other frameworks (TensorFlow, PyTorch, etc.); hardmax is also updated but remains a primitive operator.
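For completeness, hardmax under the same per-axis convention is a one-hot encoding of the argmax along axis (ties go to the first occurrence). A minimal numpy sketch (mine, matching my reading of the updated spec):

import numpy as np

def hardmax(x, axis=-1):
    # One-hot of the argmax along `axis`; first occurrence wins on ties.
    out = np.zeros_like(x)
    idx = np.expand_dims(np.argmax(x, axis=axis), axis)
    np.put_along_axis(out, idx, 1.0, axis=axis)
    return out

x = np.array([[1.0, 3.0, 2.0],
              [4.0, 0.0, 4.0]], dtype=np.float32)
print(hardmax(x, axis=-1))
# [[0. 1. 0.]
#  [1. 0. 0.]]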