From 3193b023cff0c984816ea007b3fe2046e2fa9fef Mon Sep 17 00:00:00 2001 From: Chin Huang Date: Mon, 30 Mar 2020 19:27:11 -0700 Subject: [PATCH] Rel 1.7.103 verify (#2687) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Fix Greater/LessOrEqual function definition (#2645) * Fix Greater/LessOrEqual function definition * Update test data Co-authored-by: Ke Zhang * Suppress a warning in unsqueeze (#2637) I keep getting this warning when building PyTorch: ``` In file included from /home/hong/wsrc/pytorch/third_party/onnx/onnx/defs/tensor/utils.h:6, from /home/hong/wsrc/pytorch/third_party/onnx/onnx/defs/tensor/defs.cc:4: /home/hong/wsrc/pytorch/third_party/onnx/onnx/defs/tensor/defs.cc: In lambda function: /home/hong/wsrc/pytorch/third_party/onnx/onnx/defs/tensor/defs.cc:1414:22: warning: unnecessary parentheses in declaration of ‘i’ [-Wparentheses] for (size_t(i) = 0; i < axes.size(); ++i) { ^ /home/hong/wsrc/pytorch/third_party/onnx/onnx/defs/schema.h:959:12: note: in definition of macro ‘ONNX_OPERATOR_SET_SCHEMA_EX’ return impl.SetName(#name) \ ^~~~ /home/hong/wsrc/pytorch/third_party/onnx/onnx/defs/tensor/defs.cc:1369:1: note: in expansion of macro ‘ONNX_OPERATOR_SET_SCHEMA’ ONNX_OPERATOR_SET_SCHEMA( ``` This commit should fix it and modernize the code a bit. Co-authored-by: Ke Zhang * [Training] Add Adagrad optimizer operator (#1955) * Adagrad draft * MIMO * Support multiple tensors to be optimized * Address comments * Move optimizers to a new place Remove copied Add momentum Save Remove momentum Fix Move constants to attributes * Fix build * Add shape test Add two node tests Update test coverage * Fix shape inf * Fix shape inf * fix shape inf * Format * Add function type * Merge lines * Format * Fix version number * Update op version in model files * Fix a test function and update related test files * Update onnx/backend/test/case/node/adagrad.py * Remove unused file * sync docs * Fix shape test * sync doc * sync with master * Update onnx/defs/training/defs.cc Co-Authored-By: Michał Karzyński * sync doc * address comments * address a minor comment * Polish one line Co-authored-by: Michał Karzyński * [Training] SG with Momentum Optimizer (#1959) * SG with Momentum * Register Op Fix Update other docs * Add shape inference code and polish definition * Update docs * Add test cases and fix several bugs * Remove accidentally added copy * Alpha -> alpha & Beta -> beta * Clarify an attribute * Fix an attribute * Fix bug * Fix missing attributes * sync doc * Remove unused domain * sync with master Co-authored-by: Chin Huang * Change type of label tensor to int32/int64 in SoftmaxCrossEntropyLoss spec. (#2667) * Update Pow input types in Opset 12 (#2666) * Update Pow input types in Opset 12 * gen doc and tests * remove uints and 8 bit ints * add tests * remove uint int x tests * Adding CI for ONNX Debug mode (Linux, OSX) (#2651) * adding an osx build, linux build, with and without onnx_ml for debug mode * test debug mode with ONNX_ML=1 * Rename OPTIONAL to OPTIONAL_VALUE (#2682) Co-authored-by: G. 
Ramalingam * Update Batchnorm test (#2674) * Update Batchnorm test * relax shape inference on scalar * Remove unnecessary copies and std::move (#2684) * Update sequence test case so input is not scalar and splits are specified (#2675) * Update sequence test case so input is not scalar and splits are specified * Add spaces to make the checker happy * Use cmake GNUInstallDirs (#2661) https://cmake.org/cmake/help/latest/module/GNUInstallDirs.html this allows installing the libraries (and headers) in a different location than `lib` (Gentoo uses lib64 for 64-bit libs); also change the .cmake files to avoid conflicts when building both 32-bit and 64-bit (avoids conflicting/overwritten files) Co-authored-by: Ke Zhang * Add 'ignore_index' input in the spec for SoftmaxCrossEntropyLoss and NLLLoss. (#2680) * Add 'ignore_index' input in the spec for SoftmaxCrossEntropyLoss and NLLLoss. * Add tests. * build break. * build break. * clean up. * build break. * Change ignore_index to attribute. * Change ignore_index to attribute. * PR feedback. * PR feedback. * Make ignore_index optional in NLLLoss. * Build break. * remove trailing spaces to fix build break. * Build break. * Update spec doc. * Fix NLLLoss function definition to fix test: test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded * PR feedback. * Fix test for softmax cross entropy loss to exclude ignore_index'ed weights from the sum of weights. * Build break. * Reduce binary size of libraries consuming ONNX (part 1/2) (#2643) * Change the return type for the zipmap operator to match the description in the spec. * Reduce binary size of libraries consuming ONNX (part 1/2) * Fix build error * Replace separate Get*Doc() functions with an easy macro for greater convenience * Add one more macro for complicated operator documentation. Co-authored-by: Ke Zhang * Update pybind (#2340) (#2688) * Change version number for release verification Change version number for release verification Co-authored-by: Takeshi Watanabe Co-authored-by: Ke Zhang Co-authored-by: Hong Xu Co-authored-by: Wei-Sheng Chin Co-authored-by: Michał Karzyński Co-authored-by: M. Zeeshan Siddiqui Co-authored-by: Lara Haidar Co-authored-by: Vinitra Swamy Co-authored-by: Changming Sun Co-authored-by: G. 
Ramalingam Co-authored-by: Changming Sun Co-authored-by: Scott McKay Co-authored-by: Gustavo Alvarez <462213+sl1pkn07@users.noreply.github.com> Co-authored-by: Pranav Sharma --- .travis.yml | 13 + .travis/install.sh | 4 + CMakeLists.txt | 12 +- VERSION_NUMBER | 2 +- docs/Changelog.md | 153 +++- docs/Operators.md | 399 ++++++++++- docs/TestCoverage.md | 266 ++++++- onnx/backend/test/case/model/sequence.py | 15 +- onnx/backend/test/case/node/batchnorm.py | 8 +- onnx/backend/test/case/node/momentum.py | 147 ++++ .../case/node/negativeloglikelihoodloss.py | 44 +- onnx/backend/test/case/node/pow.py | 69 +- .../test/case/node/softmaxcrossentropy.py | 47 +- .../test_batchnorm_epsilon_old/model.onnx | 2 +- .../model.onnx | Bin 451 -> 455 bytes .../test_data_set_0/input_5.pb | 2 +- .../test_data_set_0/output_0.pb | Bin 496 -> 496 bytes .../test_data_set_0/output_1.pb | 2 +- .../test_data_set_0/output_2.pb | 2 +- .../test_data_set_0/output_3.pb | 4 +- .../test_data_set_0/output_4.pb | 2 +- .../test_batchnorm_example_old/model.onnx | 2 +- .../model.onnx | Bin 431 -> 435 bytes .../test_data_set_0/input_5.pb | 2 +- .../test_data_set_0/output_0.pb | Bin 39 -> 39 bytes .../test_data_set_0/output_1.pb | Bin 27 -> 27 bytes .../test_data_set_0/output_2.pb | 2 +- .../test_data_set_0/output_3.pb | Bin 20 -> 26 bytes .../test_data_set_0/output_4.pb | 2 +- .../test/data/node/test_momentum/model.onnx | Bin 0 -> 317 bytes .../test_momentum/test_data_set_0/input_0.pb | 1 + .../test_momentum/test_data_set_0/input_1.pb | Bin 0 -> 15 bytes .../test_momentum/test_data_set_0/input_2.pb | 1 + .../test_momentum/test_data_set_0/input_3.pb | Bin 0 -> 17 bytes .../test_momentum/test_data_set_0/input_4.pb | 1 + .../test_momentum/test_data_set_0/output_0.pb | 1 + .../test_momentum/test_data_set_0/output_1.pb | 1 + .../node/test_momentum_multiple/model.onnx | Bin 0 -> 462 bytes .../test_data_set_0/input_0.pb | 1 + .../test_data_set_0/input_1.pb | Bin 0 -> 15 bytes .../test_data_set_0/input_2.pb | Bin 0 -> 14 bytes .../test_data_set_0/input_3.pb | Bin 0 -> 18 bytes .../test_data_set_0/input_4.pb | Bin 0 -> 14 bytes .../test_data_set_0/input_5.pb | Bin 0 -> 18 bytes .../test_data_set_0/input_6.pb | Bin 0 -> 14 bytes .../test_data_set_0/input_7.pb | Bin 0 -> 18 bytes .../test_data_set_0/output_0.pb | Bin 0 -> 18 bytes .../test_data_set_0/output_1.pb | Bin 0 -> 22 bytes .../test_data_set_0/output_2.pb | 1 + .../test_data_set_0/output_3.pb | 1 + .../model.onnx | 2 +- .../model.onnx | Bin 1757 -> 1757 bytes .../model.onnx | 2 +- .../model.onnx | Bin 1837 -> 1837 bytes .../model.onnx | Bin 246 -> 246 bytes .../model.onnx | Bin 2319 -> 2319 bytes .../model.onnx | Bin 244 -> 244 bytes .../model.onnx | Bin 2302 -> 2302 bytes .../model.onnx | 2 +- .../model.onnx | Bin 2572 -> 2572 bytes .../model.onnx | Bin 288 -> 288 bytes .../model.onnx | Bin 3897 -> 3897 bytes .../model.onnx | Bin 286 -> 286 bytes .../model.onnx | Bin 3130 -> 3130 bytes .../model.onnx | Bin 0 -> 320 bytes .../test_data_set_0/input_0.pb | Bin 0 -> 2180 bytes .../test_data_set_0/input_1.pb | Bin 0 -> 451 bytes .../test_data_set_0/input_2.pb | 1 + .../test_data_set_0/output_0.pb | 1 + .../model.onnx | Bin 0 -> 4425 bytes .../test_data_set_0/input_0.pb | Bin 0 -> 2180 bytes .../test_data_set_0/input_1.pb | Bin 0 -> 451 bytes .../test_data_set_0/input_2.pb | 1 + .../test_data_set_0/output_0.pb | 1 + .../node/test_nesterov_momentum/model.onnx | Bin 0 -> 326 bytes .../test_data_set_0/input_0.pb | 1 + .../test_data_set_0/input_1.pb | Bin 0 -> 15 bytes 
.../test_data_set_0/input_2.pb | 1 + .../test_data_set_0/input_3.pb | Bin 0 -> 17 bytes .../test_data_set_0/input_4.pb | 1 + .../test_data_set_0/output_0.pb | 1 + .../test_data_set_0/output_1.pb | 1 + .../test/data/node/test_pow/model.onnx | 4 +- .../node/test_pow/test_data_set_0/output_0.pb | Bin 254 -> 254 bytes .../data/node/test_pow_bcast_array/model.onnx | 4 +- .../node/test_pow_bcast_scalar/model.onnx | Bin 108 -> 108 bytes .../data/node/test_pow_example/model.onnx | 4 +- .../data/node/test_pow_types_float/model.onnx | 16 + .../test_data_set_0/input_0.pb | Bin 0 -> 33 bytes .../test_data_set_0/input_1.pb | Bin 0 -> 21 bytes .../test_data_set_0/output_0.pb | Bin 0 -> 33 bytes .../test_pow_types_float32_int32/model.onnx | 16 + .../test_data_set_0/input_0.pb | Bin 0 -> 21 bytes .../test_data_set_0/input_1.pb | Bin 0 -> 21 bytes .../test_data_set_0/output_0.pb | Bin 0 -> 21 bytes .../test_pow_types_float32_int64/model.onnx | 16 + .../test_data_set_0/input_0.pb | Bin 0 -> 21 bytes .../test_data_set_0/input_1.pb | Bin 0 -> 33 bytes .../test_data_set_0/output_0.pb | Bin 0 -> 21 bytes .../test_pow_types_float32_uint32/model.onnx | 16 + .../test_data_set_0/input_0.pb | Bin 0 -> 21 bytes .../test_data_set_0/input_1.pb | Bin 0 -> 21 bytes .../test_data_set_0/output_0.pb | Bin 0 -> 21 bytes .../test_pow_types_float32_uint64/model.onnx | 16 + .../test_data_set_0/input_0.pb | Bin 0 -> 21 bytes .../test_data_set_0/input_1.pb | Bin 0 -> 33 bytes .../test_data_set_0/output_0.pb | Bin 0 -> 21 bytes .../data/node/test_pow_types_int/model.onnx | 16 + .../test_data_set_0/input_0.pb | Bin 0 -> 21 bytes .../test_data_set_0/input_1.pb | Bin 0 -> 33 bytes .../test_data_set_0/output_0.pb | Bin 0 -> 21 bytes .../test_pow_types_int32_float32/model.onnx | 16 + .../test_data_set_0/input_0.pb | Bin 0 -> 21 bytes .../test_data_set_0/input_1.pb | Bin 0 -> 21 bytes .../test_data_set_0/output_0.pb | Bin 0 -> 21 bytes .../test_pow_types_int32_int32/model.onnx | 16 + .../test_data_set_0/input_0.pb | Bin 0 -> 21 bytes .../test_data_set_0/input_1.pb | Bin 0 -> 21 bytes .../test_data_set_0/output_0.pb | Bin 0 -> 21 bytes .../test_pow_types_int64_float32/model.onnx | 16 + .../test_data_set_0/input_0.pb | Bin 0 -> 33 bytes .../test_data_set_0/input_1.pb | Bin 0 -> 21 bytes .../test_data_set_0/output_0.pb | Bin 0 -> 33 bytes .../test_pow_types_int64_int64/model.onnx | 16 + .../test_data_set_0/input_0.pb | Bin 0 -> 33 bytes .../test_data_set_0/input_1.pb | Bin 0 -> 33 bytes .../test_data_set_0/output_0.pb | Bin 0 -> 33 bytes .../model.onnx | Bin 165 -> 165 bytes .../test_data_set_0/input_1.pb | Bin 33 -> 21 bytes .../model.onnx | Bin 176 -> 176 bytes .../test_data_set_0/input_1.pb | Bin 59 -> 35 bytes .../model.onnx | Bin 1327 -> 1327 bytes .../test_data_set_0/input_1.pb | Bin 59 -> 35 bytes .../model.onnx | Bin 1277 -> 1277 bytes .../test_data_set_0/input_1.pb | Bin 33 -> 21 bytes .../model.onnx | Bin 192 -> 192 bytes .../test_data_set_0/input_1.pb | Bin 33 -> 21 bytes .../model.onnx | Bin 1395 -> 1395 bytes .../test_data_set_0/input_1.pb | Bin 33 -> 21 bytes .../model.onnx | Bin 0 -> 226 bytes .../test_data_set_0/input_0.pb | 1 + .../test_data_set_0/input_1.pb | Bin 0 -> 21 bytes .../test_data_set_0/input_2.pb | 1 + .../test_data_set_0/output_0.pb | 1 + .../model.onnx | Bin 0 -> 1598 bytes .../test_data_set_0/input_0.pb | 1 + .../test_data_set_0/input_1.pb | Bin 0 -> 21 bytes .../test_data_set_0/input_2.pb | 1 + .../test_data_set_0/output_0.pb | 1 + .../model.onnx | 2 +- .../test_data_set_0/input_1.pb | Bin 33 -> 21 bytes 
.../model.onnx | 2 +- .../test_data_set_0/input_1.pb | Bin 33 -> 21 bytes .../model.onnx | 2 +- .../test_data_set_0/input_1.pb | Bin 33 -> 21 bytes .../model.onnx | 2 +- .../test_data_set_0/input_1.pb | Bin 33 -> 21 bytes .../test_softmax_cross_entropy_sum/model.onnx | Bin 163 -> 163 bytes .../test_data_set_0/input_1.pb | Bin 33 -> 21 bytes .../model.onnx | Bin 1262 -> 1262 bytes .../test_data_set_0/input_1.pb | Bin 33 -> 21 bytes onnx/common/constants.h | 2 +- onnx/cpp2py_export.cc | 4 +- onnx/defs/generator/defs.cc | 18 +- onnx/defs/logical/defs.cc | 78 +- onnx/defs/logical/old.cc | 29 +- onnx/defs/math/defs.cc | 678 ++++++++++++------ onnx/defs/math/old.cc | 135 ++-- onnx/defs/nn/defs.cc | 296 ++++---- onnx/defs/nn/old.cc | 147 ++-- onnx/defs/operator_sets-training.h | 2 + onnx/defs/operator_sets.h | 2 + onnx/defs/reduction/defs.cc | 41 +- onnx/defs/reduction/old.cc | 24 +- onnx/defs/rnn/defs.cc | 21 +- onnx/defs/rnn/old.cc | 22 +- onnx/defs/schema.cc | 61 +- onnx/defs/schema.h | 116 ++- onnx/defs/sequence/defs.cc | 2 +- onnx/defs/tensor/defs.cc | 16 +- onnx/defs/tensor/old.cc | 18 +- onnx/defs/traditionalml/defs.cc | 136 ++-- onnx/defs/traditionalml/old.cc | 2 +- onnx/defs/training/defs.cc | 154 +++- onnx/optimizer/pass_manager.cc | 4 +- onnx/shape_inference/implementation.cc | 4 +- onnx/test/shape_inference_test.py | 41 ++ onnx/version_converter/convert.cc | 4 +- third_party/pybind11 | 2 +- 189 files changed, 2662 insertions(+), 807 deletions(-) create mode 100644 onnx/backend/test/case/node/momentum.py create mode 100644 onnx/backend/test/data/node/test_momentum/model.onnx create mode 100644 onnx/backend/test/data/node/test_momentum/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_momentum/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_momentum/test_data_set_0/input_2.pb create mode 100644 onnx/backend/test/data/node/test_momentum/test_data_set_0/input_3.pb create mode 100644 onnx/backend/test/data/node/test_momentum/test_data_set_0/input_4.pb create mode 100644 onnx/backend/test/data/node/test_momentum/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_momentum/test_data_set_0/output_1.pb create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/model.onnx create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_2.pb create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_3.pb create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_4.pb create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_5.pb create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_6.pb create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_7.pb create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/output_1.pb create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/output_2.pb create mode 100644 onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/output_3.pb create mode 100644 
onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/model.onnx create mode 100644 onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/input_2.pb create mode 100644 onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/model.onnx create mode 100644 onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/test_data_set_0/input_2.pb create mode 100644 onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_nesterov_momentum/model.onnx create mode 100644 onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_2.pb create mode 100644 onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_3.pb create mode 100644 onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_4.pb create mode 100644 onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/output_1.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float/model.onnx create mode 100644 onnx/backend/test/data/node/test_pow_types_float/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_int32/model.onnx create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_int32/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_int32/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_int32/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_int64/model.onnx create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_int64/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_int64/test_data_set_0/input_1.pb create mode 100644 
onnx/backend/test/data/node/test_pow_types_float32_int64/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_uint32/model.onnx create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_uint32/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_uint32/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_uint32/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_uint64/model.onnx create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_uint64/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_uint64/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_float32_uint64/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int/model.onnx create mode 100644 onnx/backend/test/data/node/test_pow_types_int/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int32_float32/model.onnx create mode 100644 onnx/backend/test/data/node/test_pow_types_int32_float32/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int32_float32/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int32_float32/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int32_int32/model.onnx create mode 100644 onnx/backend/test/data/node/test_pow_types_int32_int32/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int32_int32/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int32_int32/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int64_float32/model.onnx create mode 100644 onnx/backend/test/data/node/test_pow_types_int64_float32/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int64_float32/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int64_float32/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int64_int64/model.onnx create mode 100644 onnx/backend/test/data/node/test_pow_types_int64_int64/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int64_int64/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_pow_types_int64_int64/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/model.onnx create mode 100644 onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/input_2.pb create mode 100644 onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/output_0.pb create mode 100644 onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/model.onnx 
create mode 100644 onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/test_data_set_0/input_0.pb create mode 100644 onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/test_data_set_0/input_1.pb create mode 100644 onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/test_data_set_0/input_2.pb create mode 100644 onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/test_data_set_0/output_0.pb diff --git a/.travis.yml b/.travis.yml index 5abb6f90e99..8d7c5580a80 100644 --- a/.travis.yml +++ b/.travis.yml @@ -12,6 +12,16 @@ matrix: env: PYTHON_VERSION=python3 ONNX_ML=0 language: python python: "3.6" + - os: linux + sudo: required + env: PYTHON_VERSION=python3 ONNX_ML=0 ONNX_DEBUG=1 + language: python + python: "3.6" + - os: linux + sudo: required + env: PYTHON_VERSION=python3 ONNX_ML=1 ONNX_DEBUG=1 + language: python + python: "3.6" - os: osx osx_image: xcode9.3 env: PYTHON_VERSION=python2 ONNX_ML=0 @@ -34,6 +44,9 @@ matrix: - os: osx osx_image: xcode9.3 env: PYTHON_VERSION=python3 + - os: osx + osx_image: xcode9.3 + env: PYTHON_VERSION=python3 ONNX_DEBUG=1 - os: linux sudo: required env: PYTHON_VERSION=python2 LITE=1 diff --git a/.travis/install.sh b/.travis/install.sh index 1c6555eea83..edfc75db787 100755 --- a/.travis/install.sh +++ b/.travis/install.sh @@ -13,5 +13,9 @@ fi export CMAKE_ARGS="${CMAKE_ARGS} -DONNXIFI_DUMMY_BACKEND=ON" export ONNX_NAMESPACE=ONNX_NAMESPACE_FOO_BAR_FOR_CI +if [ "${ONNX_DEBUG}" == "1" ]; then + export DEBUG=1 +fi + time python setup.py --quiet bdist_wheel --universal --dist-dir . find . -maxdepth 1 -name "*.whl" -ls -exec pip install {} \; diff --git a/CMakeLists.txt b/CMakeLists.txt index ca3a65d7fd8..0aa9fda2451 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -640,12 +640,14 @@ if(MSVC) add_msvc_runtime_flag(onnxifi_dummy) endif() +include(GNUInstallDirs) + install(DIRECTORY ${ONNX_ROOT}/onnx - DESTINATION include + DESTINATION ${CMAKE_INSTALL_INCLUDEDIR} FILES_MATCHING PATTERN "*.h") install(DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/onnx - DESTINATION include + DESTINATION ${CMAKE_INSTALL_INCLUDEDIR} FILES_MATCHING PATTERN "*.h") @@ -660,13 +662,13 @@ configure_file( install(FILES ${PROJECT_BINARY_DIR}/ONNXConfigVersion.cmake ${PROJECT_BINARY_DIR}/ONNXConfig.cmake - DESTINATION share/cmake/ONNX + DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/ONNX COMPONENT dev) -install(EXPORT ONNXTargets DESTINATION share/cmake/ONNX) +install(EXPORT ONNXTargets DESTINATION "${CMAKE_INSTALL_LIBDIR}/cmake/ONNX") install(TARGETS onnx onnx_proto onnxifi onnxifi_dummy onnxifi_loader - EXPORT ONNXTargets DESTINATION lib) + EXPORT ONNXTargets DESTINATION ${CMAKE_INSTALL_LIBDIR}) if(NOT ANDROID AND NOT IOS) install(TARGETS onnxifi_wrapper diff --git a/VERSION_NUMBER b/VERSION_NUMBER index a5e19fb8444..8f27ab8c269 100644 --- a/VERSION_NUMBER +++ b/VERSION_NUMBER @@ -1 +1 @@ -1.7.102 \ No newline at end of file +1.7.103 \ No newline at end of file diff --git a/docs/Changelog.md b/docs/Changelog.md index 0f66c6e22eb..e10bcaf83d3 100644 --- a/docs/Changelog.md +++ b/docs/Changelog.md @@ -14801,6 +14801,8 @@ This version of the operator has been available since version 12 of the default #### Attributes
+
ignore_index : int
+
Specifies a target value that is ignored and does not contribute to the input gradient. It is an optional value and valid values are [0, C).
reduction : string (default is mean)
Type of reduction to apply to loss: none, sum, mean (default). 'none': the output is the loss for each sample. 'sum': the output will be summed. 'mean': the sum of the output will be divided by the sum of applied weights.
@@ -14832,6 +14834,42 @@ This version of the operator has been available since version 12 of the default
Constrain target to integer types
+### **Pow-12** + + Pow takes input data (Tensor) and exponent Tensor, and + produces one output data (Tensor) where the function `f(x) = x^exponent` + is applied to the data tensor elementwise. + This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md). + +#### Version + +This version of the operator has been available since version 12 of the default ONNX operator set. + +#### Inputs +
+
X : T
+
First operand, base of the exponent.
+
Y : T1
+
Second operand, power of the exponent.
+
+ +#### Outputs + +
+
Z : T
+
Output tensor (same size as X)
+
+ +#### Type Constraints + +
+
T : tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)
+
Constrain input X and output types to float/int tensors.
+
T1 : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)
+
Constrain input Y types to float/int tensors.
+
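To make the asymmetric type constraints above concrete: the output Z takes the base X's type T, not the exponent's type T1. A minimal NumPy illustration of that rule (an editor's sketch of the intent, not code from this patch; `np.power` alone would promote the dtype):

```python
import numpy as np

x = np.array([1, 2, 3], dtype=np.float32)  # base, type T
y = np.array([4, 5, 6], dtype=np.int64)    # exponent, type T1
z = np.power(x, y).astype(x.dtype)         # cast mirrors the "Z has type T" rule
print(z, z.dtype)                          # [  1.  32. 729.] float32
```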
+ ### **ReduceMax-12** Computes the max of the input tensor's element along the provided axes. The resulted @@ -14956,6 +14994,8 @@ This version of the operator has been available since version 12 of the default #### Attributes
+
ignore_index : int
+
Specifies a target value that is ignored and does not contribute to the input gradient. It is an optional value and valid values are [0, C).
reduction : string (default is mean)
Type of reduction to apply to loss: none, sum, mean (default). 'none': no reduction will be applied. 'sum': the output will be summed. 'mean': the sum of the output will be divided by the number of elements in the output.
@@ -14965,7 +15005,7 @@ This version of the operator has been available since version 12 of the default
scores : T
The predicted outputs with shape [batch_size, class_size], or [batch_size, class_size, D1, D2 , ..., Dk], where K is the number of dimensions.
-
labels : T
+
labels : Tind
The ground truth output tensor, with shape [batch_size], or [batch_size, D1, D2, ..., Dk], where K is the number of dimensions.
weights (optional) : T
A manual rescaling weight given to each class. If given, it has to be a 1D Tensor assigning weight to each of the classes. Otherwise, it is treated as if having all ones.
@@ -14985,6 +15025,8 @@ This version of the operator has been available since version 12 of the default
T : tensor(float16), tensor(float), tensor(double)
Constrain input and output types to float tensors.
+
Tind : tensor(int32), tensor(int64)
+
Constrain target to integer types
### **UnfoldToDepth-12** @@ -15424,3 +15466,112 @@ This version of the operator has been available since version 1 of the 'ai.onnx.
Allow inputs and outputs to be any kind of tensor.
+### **ai.onnx.training.Momentum-1** + + Compute one iteration of stochastic gradient update with momentum. + This operator can conduct the optimization of multiple tensor variables. + + Let's define the behavior of this operator. As you can imagine, SG with momentum requires + several parameters: + + - The learning-rate "R". + - The update count "T". That is, the number of conducted training iterations. It should + be zero in the first training iteration. + - An L2-norm regularization coefficient "norm_coefficient". + - A decay coefficient of previous accumulated gradient (i.e., momentum) "alpha". + - The scaling coefficient of current gradient "beta". + - An attribute "mode" to choose whether standard momentum or Nesterov's momentum should + be used. + + For the sake of simplicity, assume that there is only one tensor (called "X") to be optimized. + Other necessary inputs are "X"'s gradient (called "G") and "X"'s momentum (called "V"). This + Momentum operator maps all these inputs to the new value of "X" (called "X_new") and its new + momentum (called "V_new"). + + This operator supports two different momentum algorithms. Set the attribute "mode" to + "nesterov" if Nesterov's momentum is desired. Otherwise, set the attribute "mode" to + "standard" to use standard momentum. Computation details are described subsequently. + + Let "+", "-", "*", and "/" be element-wise operations with numpy-style broadcasting. + + Pseudo code for SG with standard momentum: + + // Add gradient of 0.5 * norm_coefficient * ||X||^2, where ||X||^2 is the sum of squared + // values of all elements in X. + G_regularized = norm_coefficient * X + G + + // In the first training iteration, beta should always be 1. + beta_adjusted = T > 0 ? beta : 1 + + // Compute the current momentum based on previous momentum and the current gradient. + V_new = alpha * V + beta_adjusted * G_regularized + + // Update X. + X_new = X - R * V_new + + Pseudo code for SG with Nesterov's momentum: + + // Add gradient of 0.5 * norm_coefficient * ||X||^2, where ||X||^2 is the sum of squared + // values of all elements in X. + G_regularized = norm_coefficient * X + G; + + // In the first training iteration, beta should always be 1. + beta_adjusted = T > 0 ? beta : 1 + + // Compute the current momentum based on previous momentum and the current gradient. + V_new = alpha * V + beta_adjusted * G_regularized; + + // Compute final update direction and then update X. + X_new = X - R * (G_regularized + alpha * V_new) + + If one assigns this operator to optimize multiple inputs, for example "X_1" and "X_2", the same + pseudo code is extended to handle all tensors jointly. More specifically, we can view "X" as a + concatenation of "X_1" and "X_2" (of course, their gradients and momentums should + be concatenated too) and then the pseudo code becomes applicable. + +#### Version + +This version of the operator has been available since version 1 of the 'ai.onnx.training' operator set. + +#### Attributes +
+
alpha : float (required)
+
The decay factor of momentum. It should be a scalar.
+
beta : float (required)
+
The coefficient of gradient in computing new momentum. It should be a scalar.
+
mode : string (required)
+
Its value should be either "nesterov" or "standard". The value "nesterov" leads to the use of Nesterov's momentum while "standard" invokes the stochastic gradient method with standard momentum.
+
norm_coefficient : float (required)
+
Coefficient of 0.5 * norm_coefficient * ||X||^2.
+
+ +#### Inputs (3 - ∞) + +
+
R : T1
+
The learning rate.
+
T : T2
+
Update count of "X". It should be a scalar.
+
inputs (variadic, heterogeneous) : T3
+
It sequentially contains the current values of optimized tensors, then their gradient tensors, and finally their momentum tensors. For example, if two tensors "X_1" and "X_2" are optimized, the expected input list would be ["X_1", "X_2", gradient of "X_1", gradient of "X_2", momentum of "X_1", momentum of "X_2"].
+
+ +#### Outputs (1 - ∞) + +
+
outputs (variadic, heterogeneous) : T3
+
It sequentially contains the new values of optimized tensors and then the new values of their momentum tensors. For example, if two tensors "X_1" and "X_2" are optimized, the output list would be [new value of "X_1", new value of "X_2", new momentum of "X_1", new momentum of "X_2"].
+
+ +#### Type Constraints + +
+
T1 : tensor(float), tensor(double)
+
Constrain input types to float scalars.
+
T2 : tensor(int64)
+
Constrain input types to 64-bit integer scalars.
+
T3 : tensor(float), tensor(double)
+
Constrain input types to float tensors.
+
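A minimal NumPy sketch of the update rules just described, covering both modes and the variadic input layout (the helper name is hypothetical and this is an illustration only, not the normative definition):

```python
import numpy as np

def run_momentum(r, t, tensors, norm_coefficient, alpha, beta, mode="standard"):
    # The variadic inputs hold the optimized tensors first, then their
    # gradients, then their momentums: [X_1..X_n, G_1..G_n, V_1..V_n].
    n = len(tensors) // 3
    xs, gs, vs = tensors[:n], tensors[n:2 * n], tensors[2 * n:]
    new_xs, new_vs = [], []
    for x, g, v in zip(xs, gs, vs):
        g_regularized = norm_coefficient * x + g  # add L2-regularizer gradient
        beta_adjusted = beta if t > 0 else 1.0    # beta is forced to 1 at t == 0
        v_new = alpha * v + beta_adjusted * g_regularized
        if mode == "nesterov":
            x_new = x - r * (g_regularized + alpha * v_new)
        else:
            x_new = x - r * v_new
        new_xs.append(x_new)
        new_vs.append(v_new)
    # Outputs list the new tensor values first, then the new momentums.
    return new_xs + new_vs
```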
+ diff --git a/docs/Operators.md b/docs/Operators.md index 10c0d12a6a2..88d8345c3b5 100644 --- a/docs/Operators.md +++ b/docs/Operators.md @@ -175,6 +175,7 @@ * ai.onnx.training.Adagrad * ai.onnx.training.Gradient * ai.onnx.training.GraphCall + * ai.onnx.training.Momentum ## ai.onnx (default) ### **Abs** @@ -1968,7 +1969,9 @@ s = np.array([1.0, 1.5]).astype(np.float32) bias = np.array([0, 1]).astype(np.float32) mean = np.array([0, 3]).astype(np.float32) var = np.array([1, 1.5]).astype(np.float32) -training_mode = np.ones(1, dtype=bool) +# np.bool(1) raises "'bool' object has no attribute 'dtype'" while generating test data, +# so work around it by using np.byte(1).astype(bool) +training_mode = np.byte(1).astype(bool) y, saved_mean, saved_var, output_mean, output_var = batchnorm_training_mode(x, s, bias, mean, var) node = onnx.helper.make_node( @@ -1987,7 +1990,7 @@ s = np.random.randn(3).astype(np.float32) bias = np.random.randn(3).astype(np.float32) mean = np.random.randn(3).astype(np.float32) var = np.random.rand(3).astype(np.float32) -training_mode = np.ones(1, dtype=bool) +training_mode = np.byte(1).astype(bool) momentum = 0.9 epsilon = 1e-2 y, saved_mean, saved_var, output_mean, output_var = batchnorm_training_mode(x, s, bias, mean, var, momentum, epsilon) @@ -10739,6 +10742,8 @@ This version of the operator has been available since version 12 of the default #### Attributes
+
ignore_index : int
+
Specifies a target value that is ignored and does not contribute to the input gradient. It is an optional value and valid values are [0, C).
reduction : string (default is mean)
Type of reduction to apply to loss: none, sum, mean (default). 'none': the output is the loss for each sample. 'sum': the output will be summed. 'mean': the sum of the output will be divided by the sum of applied weights.
@@ -10958,6 +10963,36 @@ expect(node, inputs=[input, target, weight], outputs=[negative_log_likelihood_lo +
+input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index + +```python +reduction = 'sum' +ignore_index = np.int64(0) +node = onnx.helper.make_node( + 'NegativeLogLikelihoodLoss', + inputs=['input', 'target', 'weight'], + outputs=['loss'], + reduction=reduction, + ignore_index=ignore_index +) + +N, C, dim1, dim2 = 3, 5, 6, 6 +np.random.seed(0) +input = np.random.rand(N, C, dim1, dim2).astype(np.float32) +target = np.random.randint(0, high=C, size=(N, dim1, dim2)) +target[0][0][0] = 0 +weight = np.random.rand(C).astype(np.float32) + +negative_log_likelihood_loss = compute_negative_log_likelihood_loss(input, target, weight=weight, reduction=reduction, ignore_index=ignore_index) + +expect(node, inputs=[input, target, weight], outputs=[negative_log_likelihood_loss], + name='test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index') +``` + +
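For reference, a rough sketch of the semantics the `compute_negative_log_likelihood_loss` helper above implements for this case (an approximation written against the spec text, not the verbatim contents of onnx/backend/test/case/node/negativeloglikelihoodloss.py):

```python
import numpy as np

def nll_loss_sketch(input, target, weight=None, reduction='sum', ignore_index=None):
    # Gather the (optionally class-weighted) negative score of the target
    # class for every element; ignored targets contribute neither loss nor weight.
    N, C = input.shape[0], input.shape[1]
    flat_input = input.reshape(N, C, -1)   # [N, C, D] with D = prod(d1..dk)
    flat_target = target.reshape(N, -1)    # [N, D]
    loss = np.zeros(flat_target.shape, dtype=input.dtype)
    applied_weight = np.zeros(flat_target.shape, dtype=input.dtype)
    for i in range(flat_target.shape[0]):
        for j in range(flat_target.shape[1]):
            c = flat_target[i, j]
            if ignore_index is not None and c == ignore_index:
                continue
            w = weight[c] if weight is not None else 1.0
            loss[i, j] = -flat_input[i, c, j] * w
            applied_weight[i, j] = w
    if reduction == 'none':
        return loss.reshape(target.shape)
    if reduction == 'sum':
        return np.array(loss.sum(), dtype=input.dtype)
    # 'mean' normalizes by the sum of the weights actually applied.
    return np.array(loss.sum() / applied_weight.sum(), dtype=input.dtype)
```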
+ + ### **NonMaxSuppression** Filter out boxes that have high intersection-over-union (IOU) overlap with previously selected boxes. @@ -11953,16 +11988,16 @@ for mode in ['edge', 'reflect']: #### Version -This version of the operator has been available since version 7 of the default ONNX operator set. +This version of the operator has been available since version 12 of the default ONNX operator set. -Other versions of this operator: Pow-1 +Other versions of this operator: Pow-1, Pow-7 #### Inputs
X : T
First operand, base of the exponent.
-
Y : T
+
Y : T1
Second operand, power of the exponent.
@@ -11976,8 +12011,10 @@ Other versions of this operator: Pow-1 #### Type Constraints
-
T : tensor(float16), tensor(float), tensor(double)
-
Constrain input and output types to float tensors.
+
T : tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)
+
Constrain input X and output types to float/int tensors.
+
T1 : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)
+
Constrain input Y types to float/int tensors.
@@ -11995,13 +12032,13 @@ node = onnx.helper.make_node( x = np.array([1, 2, 3]).astype(np.float32) y = np.array([4, 5, 6]).astype(np.float32) -z = np.power(x, y) # expected output [1., 32., 729.] +z = pow(x, y) # expected output [1., 32., 729.] expect(node, inputs=[x, y], outputs=[z], name='test_pow_example') x = np.arange(60).reshape(3, 4, 5).astype(np.float32) y = np.random.randn(3, 4, 5).astype(np.float32) -z = np.power(x, y) +z = pow(x, y) expect(node, inputs=[x, y], outputs=[z], name='test_pow') ``` @@ -12021,7 +12058,7 @@ node = onnx.helper.make_node( x = np.array([1, 2, 3]).astype(np.float32) y = np.array(2).astype(np.float32) -z = np.power(x, y) # expected output [1., 4., 9.] +z = pow(x, y) # expected output [1., 4., 9.] expect(node, inputs=[x, y], outputs=[z], name='test_pow_bcast_scalar') @@ -12033,7 +12070,7 @@ node = onnx.helper.make_node( x = np.array([[1, 2, 3], [4, 5, 6]]).astype(np.float32) y = np.array([1, 2, 3]).astype(np.float32) # expected output [[1, 4, 27], [4, 25, 216]] -z = np.power(x, y).astype(np.float32) +z = pow(x, y) expect(node, inputs=[x, y], outputs=[z], name='test_pow_bcast_array') ``` @@ -12041,6 +12078,68 @@ expect(node, inputs=[x, y], outputs=[z], +
+types + +```python +node = onnx.helper.make_node( + 'Pow', + inputs=['x', 'y'], + outputs=['z'], +) + +x = np.array([1, 2, 3]).astype(np.float32) +y = np.array([4, 5, 6]).astype(np.int64) +z = pow(x, y) # expected output [1., 32., 729.] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_float32_int64') + +x = np.array([1, 2, 3]).astype(np.int64) +y = np.array([4, 5, 6]).astype(np.float32) +z = pow(x, y) # expected output [1, 32, 729] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_int64_float32') + +x = np.array([1, 2, 3]).astype(np.float32) +y = np.array([4, 5, 6]).astype(np.int32) +z = pow(x, y) # expected output [1., 32., 729.] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_float32_int32') + +x = np.array([1, 2, 3]).astype(np.int32) +y = np.array([4, 5, 6]).astype(np.float32) +z = pow(x, y) # expected output [1, 32, 729] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_int32_float32') + +x = np.array([1, 2, 3]).astype(np.float32) +y = np.array([4, 5, 6]).astype(np.uint64) +z = pow(x, y) # expected output [1., 32., 729.] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_float32_uint64') + +x = np.array([1, 2, 3]).astype(np.float32) +y = np.array([4, 5, 6]).astype(np.uint32) +z = pow(x, y) # expected output [1., 32., 729.] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_float32_uint32') + +x = np.array([1, 2, 3]).astype(np.int64) +y = np.array([4, 5, 6]).astype(np.int64) +z = pow(x, y) # expected output [1, 32, 729] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_int64_int64') + +x = np.array([1, 2, 3]).astype(np.int32) +y = np.array([4, 5, 6]).astype(np.int32) +z = pow(x, y) # expected output [1, 32, 729] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_int32_int32') +``` + +
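Note that these snippets call a local `pow` helper instead of `np.power` directly. Presumably (its definition lives in onnx/backend/test/case/node/pow.py and is not shown in this excerpt) it casts the result back to the base's dtype so the output matches the `T` constraint above; a sketch:

```python
import numpy as np

def pow(x, y):  # shadows the builtin, as the test file appears to do
    # np.power promotes mixed dtypes (e.g. float32 ** int64 -> float64);
    # Pow-12 says the output Z has the base X's type T, so cast back.
    return np.power(x, y).astype(x.dtype)
```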
+ + ### **QLinearConv** The convolution operator consumes a quantized input tensor, its scale and zero point, @@ -18208,6 +18307,8 @@ This version of the operator has been available since version 12 of the default #### Attributes
+
ignore_index : int
+
Specifies a target value that is ignored and does not contribute to the input gradient. It is an optional value and valid values are [0, C).
reduction : string (default is mean)
Type of reduction to apply to loss: none, sum, mean (default). 'none': no reduction will be applied. 'sum': the output will be summed. 'mean': the sum of the output will be divided by the number of elements in the output.
@@ -18217,7 +18318,7 @@ This version of the operator has been available since version 12 of the default
scores : T
The predicted outputs with shape [batch_size, class_size], or [batch_size, class_size, D1, D2 , ..., Dk], where K is the number of dimensions.
-
labels : T
+
labels : Tind
The ground truth output tensor, with shape [batch_size], or [batch_size, D1, D2, ..., Dk], where K is the number of dimensions.
weights (optional) : T
A manual rescaling weight given to each class. If given, it has to be a 1D Tensor assigning weight to each of the classes. Otherwise, it is treated as if having all ones.
@@ -18237,6 +18338,8 @@ This version of the operator has been available since version 12 of the default
T : tensor(float16), tensor(float), tensor(double)
Constrain input and output types to float tensors.
+
Tind : tensor(int32), tensor(int64)
+
Constrain target to integer types
@@ -18327,6 +18430,38 @@ expect(node, inputs=[x, labels, weights], outputs=[sce], name='test_softmax_cros +
+softmaxcrossentropy_mean_weights_ignore_index + +```python +# Define operator attributes. +reduction = 'mean' +ignore_index = np.int64(0) + +# Create operator. +node = onnx.helper.make_node('SoftmaxCrossEntropyLoss', + inputs=['x', 'y', 'w'], + outputs=['z'], + reduction=reduction, + ignore_index=ignore_index) + +# Define operator inputs. +np.random.seed(0) +x = np.random.rand(3, 5).astype(np.float32) +labels = np.random.randint(0, high=5, size=(3, )) +labels[0] = 0 +weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32) + +# Compute SoftmaxCrossEntropyLoss +sce = softmaxcrossentropy(x, labels, weight=weights, ignore_index=ignore_index) + +# Check results +expect(node, inputs=[x, labels, weights], outputs=[sce], name='test_softmax_cross_entropy_mean_weight_ignore_index') +``` + +
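A rough sketch of what the `softmaxcrossentropy` helper computes for this [N, C], reduction='mean' case (illustrative and written against the spec text; the reference implementation is in onnx/backend/test/case/node/softmaxcrossentropy.py):

```python
import numpy as np

def softmaxcrossentropy_sketch(x, labels, weight=None, ignore_index=None):
    # Numerically naive log-softmax over the class axis.
    log_prob = x - np.log(np.exp(x).sum(axis=1, keepdims=True))
    losses, applied = [], []
    for i in range(x.shape[0]):
        c = labels[i]
        if ignore_index is not None and c == ignore_index:
            continue  # ignored samples add neither loss nor weight
        w = 1.0 if weight is None else weight[c]
        losses.append(-log_prob[i, c] * w)
        applied.append(w)
    # 'mean' divides by the sum of the weights actually applied, which is
    # exactly how ignore_index'ed samples are excluded from the average.
    total = np.array(losses, dtype=x.dtype).sum()
    return np.array(total / np.array(applied, dtype=x.dtype).sum(), dtype=x.dtype)
```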
+ +
softmaxcrossentropy_none @@ -21529,3 +21664,243 @@ This version of the operator has been available since version 1 of the 'ai.onnx.
+### **ai.onnx.training.Momentum** + + Compute one iteration of stochastic gradient update with momentum. + This operator can conduct the optimization of multiple tensor variables. + + Let's define the behavior of this operator. As you can imagine, SG with momentum requires + several parameters: + + - The learning-rate "R". + - The update count "T". That is, the number of conducted training iterations. It should + be zero in the first training iteration. + - An L2-norm regularization coefficient "norm_coefficient". + - A decay coefficient of previous accumulated gradient (i.e., momentum) "alpha". + - The scaling coefficient of current gradient "beta". + - An attribute "mode" to choose whether standard momentum or Nesterov's momentum should + be used. + + For the sake of simplicity, assume that there is only one tensor (called "X") to be optimized. + Other necessary inputs are "X"'s gradient (called "G") and "X"'s momentum (called "V"). This + Momentum operator maps all these inputs to the new value of "X" (called "X_new") and its new + momentum (called "V_new"). + + This operator supports two different momentum algorithms. Set the attribute "mode" to + "nesterov" if Nesterov's momentum is desired. Otherwise, set the attribute "mode" to + "standard" to use standard momentum. Computation details are described subsequently. + + Let "+", "-", "*", and "/" be element-wise operations with numpy-style broadcasting. + + Pseudo code for SG with standard momentum: + + // Add gradient of 0.5 * norm_coefficient * ||X||^2, where ||X||^2 is the sum of squared + // values of all elements in X. + G_regularized = norm_coefficient * X + G + + // In the first training iteration, beta should always be 1. + beta_adjusted = T > 0 ? beta : 1 + + // Compute the current momentum based on previous momentum and the current gradient. + V_new = alpha * V + beta_adjusted * G_regularized + + // Update X. + X_new = X - R * V_new + + Pseudo code for SG with Nesterov's momentum: + + // Add gradient of 0.5 * norm_coefficient * ||X||^2, where ||X||^2 is the sum of squared + // values of all elements in X. + G_regularized = norm_coefficient * X + G; + + // In the first training iteration, beta should always be 1. + beta_adjusted = T > 0 ? beta : 1 + + // Compute the current momentum based on previous momentum and the current gradient. + V_new = alpha * V + beta_adjusted * G_regularized; + + // Compute final update direction and then update X. + X_new = X - R * (G_regularized + alpha * V_new) + + If one assigns this operator to optimize multiple inputs, for example "X_1" and "X_2", the same + pseudo code is extended to handle all tensors jointly. More specifically, we can view "X" as a + concatenation of "X_1" and "X_2" (of course, their gradients and momentums should + be concatenated too) and then the pseudo code becomes applicable. + +#### Version + +This version of the operator has been available since version 1 of the 'ai.onnx.training' operator set. + +#### Attributes +
+
alpha : float (required)
+
The decay factor of momentum. It should be a scalar.
+
beta : float (required)
+
The coefficient of gradient in computing new momentum. It should be a scalar.
+
mode : string (required)
+
Its value should be either "nesterov" or "standard". The value "nesterov" leads to the use of Nesterov's momentum while "standard" invokes the stochastic gradient method with standard momentum.
+
norm_coefficient : float (required)
+
Coefficient of 0.5 * norm_coefficient * ||X||^2.
+
+ +#### Inputs (3 - ∞) + +
+
R : T1
+
The learning rate.
+
T : T2
+
Update count of "X". It should be a scalar.
+
inputs (variadic, heterogeneous) : T3
+
It sequentially contains the current values of optimized tensors, then their gradient tensors, and finally their momentum tensors. For example, if two tensors "X_1" and "X_2" are optimized, the expected input list would be ["X_1", "X_2", gradient of "X_1", gradient of "X_2", momentum of "X_1", momentum of "X_2"].
+
+ +#### Outputs (1 - ∞) + +
+
outputs (variadic, heterogeneous) : T3
+
It sequentially contains the new values of optimized tensors and then the new values of their momentum tensors. For example, if two tensors "X_1" and "X_2" are optimized, the output list would be [new value of "X_1", new value of "X_2", new momentum of "X_1", new momentum of "X_2"].
+
+ +#### Type Constraints + +
+
T1 : tensor(float), tensor(double)
+
Constrain input types to float scalars.
+
T2 : tensor(int64)
+
Constrain input types to 64-bit integer scalars.
+
T3 : tensor(float), tensor(double)
+
Constrain input types to float tensors.
+
+ + +#### Examples + +
+momentum + +```python +# Define operator attributes. +norm_coefficient = 0.001 +alpha = 0.95 +beta = 0.1 + +# Create operator. +node = onnx.helper.make_node('Momentum', + inputs=['R', 'T', 'X', 'G', 'V'], + outputs=['X_new', 'V_new'], + norm_coefficient=norm_coefficient, + alpha=alpha, + beta=beta, + mode='standard', + domain='ai.onnx.training' + ) + +# Define operator inputs. +r = np.array(0.1, dtype=np.float32) # scalar +t = np.array(0, dtype=np.int64) # scalar +x = np.array([1.2, 2.8], dtype=np.float32) +g = np.array([-0.94, -2.5], dtype=np.float32) +v = np.array([1.7, 3.6], dtype=np.float32) + +# Compute expected outputs of Momentum. +x_new, v_new = apply_momentum(r, t, x, g, v, + norm_coefficient, alpha, beta) + +# Check results. +expect(node, inputs=[r, t, x, g, v], + outputs=[x_new, v_new], name='test_momentum', + opset_imports=[onnx.helper.make_opsetid('ai.onnx.training', 1)]) +``` + +
+ + +
+momentum_multiple + +```python +# Define operator attributes. +norm_coefficient = 0.001 +alpha = 0.95 +beta = 0.85 + +node = onnx.helper.make_node('Momentum', + inputs=['R', 'T', 'X1', 'X2', + 'G1', 'G2', 'H1', 'H2'], + outputs=['X1_new', 'X2_new', + 'V1_new', 'V2_new'], + norm_coefficient=norm_coefficient, + alpha=alpha, + beta=beta, + mode='standard', + domain='ai.onnx.training' + ) + +# Define operator inputs. +r = np.array(0.1, dtype=np.float32) # scalar +t = np.array(0, dtype=np.int64) # scalar + +x1 = np.array([1.0], dtype=np.float32) +g1 = np.array([-1.0], dtype=np.float32) +v1 = np.array([2.0], dtype=np.float32) + +x2 = np.array([1.0, 2.0], dtype=np.float32) +g2 = np.array([-1.0, -3.0], dtype=np.float32) +v2 = np.array([4.0, 1.0], dtype=np.float32) + +# Compute expected outputs of Momentum. +x1_new, v1_new = apply_momentum(r, t, x1, g1, v1, + norm_coefficient, alpha, beta) +x2_new, v2_new = apply_momentum(r, t, x2, g2, v2, + norm_coefficient, alpha, beta) + +# Check results. +expect(node, inputs=[r, t, x1, x2, g1, g2, v1, v2], + outputs=[x1_new, x2_new, v1_new, v2_new], name='test_momentum_multiple', + opset_imports=[onnx.helper.make_opsetid('ai.onnx.training', 1)]) +``` + +
+ + +
+nesterov_momentum + ```python +# Define operator attributes. +norm_coefficient = 0.01 +alpha = 0.95 +beta = 1.0 + +# Create operator. +node = onnx.helper.make_node('Momentum', + inputs=['R', 'T', 'X', 'G', 'V'], + outputs=['X_new', 'V_new'], + norm_coefficient=norm_coefficient, + alpha=alpha, + beta=beta, + mode='nesterov', + domain='ai.onnx.training' + ) + +# Define operator inputs. +r = np.array(0.1, dtype=np.float32) # scalar +t = np.array(0, dtype=np.int64) # scalar +x = np.array([1.2, 2.8], dtype=np.float32) +g = np.array([-0.94, -2.5], dtype=np.float32) +v = np.array([1.7, 3.6], dtype=np.float32) + +# Compute expected outputs of Momentum in Nesterov mode. +x_new, v_new = apply_nesterov(r, t, x, g, v, + norm_coefficient, alpha, beta) + +# Check results. +expect(node, inputs=[r, t, x, g, v], + outputs=[x_new, v_new], name='test_nesterov_momentum', + opset_imports=[onnx.helper.make_opsetid('ai.onnx.training', 1)]) +``` +
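The `apply_momentum` and `apply_nesterov` helpers these examples call are defined in the new onnx/backend/test/case/node/momentum.py. Roughly, following the operator's pseudo code (a sketch, not the file's verbatim contents):

```python
import numpy as np

def apply_momentum(r, t, x, g, v, norm_coefficient, alpha, beta):
    g_regularized = norm_coefficient * x + g  # add the L2-regularizer gradient
    beta_adjusted = beta if t > 0 else 1.0    # beta is forced to 1 at t == 0
    v_new = alpha * v + beta_adjusted * g_regularized
    x_new = x - r * v_new                     # standard momentum step
    return x_new, v_new

def apply_nesterov(r, t, x, g, v, norm_coefficient, alpha, beta):
    g_regularized = norm_coefficient * x + g
    beta_adjusted = beta if t > 0 else 1.0
    v_new = alpha * v + beta_adjusted * g_regularized
    # Nesterov look-ahead: step along the gradient plus the decayed new momentum.
    x_new = x - r * (g_regularized + alpha * v_new)
    return x_new, v_new
```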
+ + diff --git a/docs/TestCoverage.md b/docs/TestCoverage.md index 3a2b7cc7eb3..f0bdd734020 100644 --- a/docs/TestCoverage.md +++ b/docs/TestCoverage.md @@ -5,7 +5,7 @@ * [Overall Test Coverage](#overall-test-coverage) # Node Test Coverage ## Summary -Node tests have covered 145/163 (88.96%, 5 generators excluded) common operators. +Node tests have covered 146/164 (89.02%, 5 generators excluded) common operators. Node tests have covered 0/0 (N/A) experimental operators. @@ -1257,7 +1257,9 @@ s = np.array([1.0, 1.5]).astype(np.float32) bias = np.array([0, 1]).astype(np.float32) mean = np.array([0, 3]).astype(np.float32) var = np.array([1, 1.5]).astype(np.float32) -training_mode = np.ones(1, dtype=bool) +# np.bool(1) raises "'bool' object has no attribute 'dtype'" while generating test data, +# so work around it by using np.byte(1).astype(bool) +training_mode = np.byte(1).astype(bool) y, saved_mean, saved_var, output_mean, output_var = batchnorm_training_mode(x, s, bias, mean, var) node = onnx.helper.make_node( @@ -1276,7 +1278,7 @@ s = np.random.randn(3).astype(np.float32) bias = np.random.randn(3).astype(np.float32) mean = np.random.randn(3).astype(np.float32) var = np.random.rand(3).astype(np.float32) -training_mode = np.ones(1, dtype=bool) +training_mode = np.byte(1).astype(bool) momentum = 0.9 epsilon = 1e-2 y, saved_mean, saved_var, output_mean, output_var = batchnorm_training_mode(x, s, bias, mean, var, momentum, epsilon) @@ -6011,6 +6013,132 @@ expect(node, inputs=[x, y], outputs=[z], +### Momentum +There are 3 test cases, listed as following: +
+momentum + +```python +# Define operator attributes. +norm_coefficient = 0.001 +alpha = 0.95 +beta = 0.1 + +# Create operator. +node = onnx.helper.make_node('Momentum', + inputs=['R', 'T', 'X', 'G', 'V'], + outputs=['X_new', 'V_new'], + norm_coefficient=norm_coefficient, + alpha=alpha, + beta=beta, + mode='standard', + domain='ai.onnx.training' + ) + +# Define operator inputs. +r = np.array(0.1, dtype=np.float32) # scalar +t = np.array(0, dtype=np.int64) # scalar +x = np.array([1.2, 2.8], dtype=np.float32) +g = np.array([-0.94, -2.5], dtype=np.float32) +v = np.array([1.7, 3.6], dtype=np.float32) + +# Compute expected outputs of Momentum. +x_new, v_new = apply_momentum(r, t, x, g, v, + norm_coefficient, alpha, beta) + +# Check results. +expect(node, inputs=[r, t, x, g, v], + outputs=[x_new, v_new], name='test_momentum', + opset_imports=[onnx.helper.make_opsetid('ai.onnx.training', 1)]) +``` + +
+
+momentum_multiple + +```python +# Define operator attributes. +norm_coefficient = 0.001 +alpha = 0.95 +beta = 0.85 + +node = onnx.helper.make_node('Momentum', + inputs=['R', 'T', 'X1', 'X2', + 'G1', 'G2', 'H1', 'H2'], + outputs=['X1_new', 'X2_new', + 'V1_new', 'V2_new'], + norm_coefficient=norm_coefficient, + alpha=alpha, + beta=beta, + mode='standard', + domain='ai.onnx.training' + ) + +# Define operator inputs. +r = np.array(0.1, dtype=np.float32) # scalar +t = np.array(0, dtype=np.int64) # scalar + +x1 = np.array([1.0], dtype=np.float32) +g1 = np.array([-1.0], dtype=np.float32) +v1 = np.array([2.0], dtype=np.float32) + +x2 = np.array([1.0, 2.0], dtype=np.float32) +g2 = np.array([-1.0, -3.0], dtype=np.float32) +v2 = np.array([4.0, 1.0], dtype=np.float32) + +# Compute expected outputs of Momentum. +x1_new, v1_new = apply_momentum(r, t, x1, g1, v1, + norm_coefficient, alpha, beta) +x2_new, v2_new = apply_momentum(r, t, x2, g2, v2, + norm_coefficient, alpha, beta) + +# Check results. +expect(node, inputs=[r, t, x1, x2, g1, g2, v1, v2], + outputs=[x1_new, x2_new, v1_new, v2_new], name='test_momentum_multiple', + opset_imports=[onnx.helper.make_opsetid('ai.onnx.training', 1)]) +``` + +
+
+nesterov_momentum + ```python +# Define operator attributes. +norm_coefficient = 0.01 +alpha = 0.95 +beta = 1.0 + +# Create operator. +node = onnx.helper.make_node('Momentum', + inputs=['R', 'T', 'X', 'G', 'V'], + outputs=['X_new', 'V_new'], + norm_coefficient=norm_coefficient, + alpha=alpha, + beta=beta, + mode='nesterov', + domain='ai.onnx.training' + ) + +# Define operator inputs. +r = np.array(0.1, dtype=np.float32) # scalar +t = np.array(0, dtype=np.int64) # scalar +x = np.array([1.2, 2.8], dtype=np.float32) +g = np.array([-0.94, -2.5], dtype=np.float32) +v = np.array([1.7, 3.6], dtype=np.float32) + +# Compute expected outputs of Momentum in Nesterov mode. +x_new, v_new = apply_nesterov(r, t, x, g, v, + norm_coefficient, alpha, beta) + +# Check results. +expect(node, inputs=[r, t, x, g, v], + outputs=[x_new, v_new], name='test_nesterov_momentum', + opset_imports=[onnx.helper.make_opsetid('ai.onnx.training', 1)]) +``` +
+ + ### Mul There are 2 test cases, listed as following:
@@ -6084,7 +6212,7 @@ expect(node, inputs=[x], outputs=[y], ### NegativeLogLikelihoodLoss -There are 7 test cases, listed as following: +There are 8 test cases, listed as following:
input_shape_is_NC @@ -6255,6 +6383,34 @@ expect(node, inputs=[input, target, weight], outputs=[negative_log_likelihood_lo name='test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum') ``` +
+
+input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index + +```python +reduction = 'sum' +ignore_index = np.int64(0) +node = onnx.helper.make_node( + 'NegativeLogLikelihoodLoss', + inputs=['input', 'target', 'weight'], + outputs=['loss'], + reduction=reduction, + ignore_index=ignore_index +) + +N, C, dim1, dim2 = 3, 5, 6, 6 +np.random.seed(0) +input = np.random.rand(N, C, dim1, dim2).astype(np.float32) +target = np.random.randint(0, high=C, size=(N, dim1, dim2)) +target[0][0][0] = 0 +weight = np.random.rand(C).astype(np.float32) + +negative_log_likelihood_loss = compute_negative_log_likelihood_loss(input, target, weight=weight, reduction=reduction, ignore_index=ignore_index) + +expect(node, inputs=[input, target, weight], outputs=[negative_log_likelihood_loss], + name='test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index') +``` +
@@ -6841,7 +6997,7 @@ for mode in ['edge', 'reflect']: ### Pow -There are 2 test cases, listed as following: +There are 3 test cases, listed as following:
pow @@ -6854,13 +7010,13 @@ node = onnx.helper.make_node( x = np.array([1, 2, 3]).astype(np.float32) y = np.array([4, 5, 6]).astype(np.float32) -z = np.power(x, y) # expected output [1., 32., 729.] +z = pow(x, y) # expected output [1., 32., 729.] expect(node, inputs=[x, y], outputs=[z], name='test_pow_example') x = np.arange(60).reshape(3, 4, 5).astype(np.float32) y = np.random.randn(3, 4, 5).astype(np.float32) -z = np.power(x, y) +z = pow(x, y) expect(node, inputs=[x, y], outputs=[z], name='test_pow') ``` @@ -6878,7 +7034,7 @@ node = onnx.helper.make_node( x = np.array([1, 2, 3]).astype(np.float32) y = np.array(2).astype(np.float32) -z = np.power(x, y) # expected output [1., 4., 9.] +z = pow(x, y) # expected output [1., 4., 9.] expect(node, inputs=[x, y], outputs=[z], name='test_pow_bcast_scalar') @@ -6890,11 +7046,71 @@ node = onnx.helper.make_node( x = np.array([[1, 2, 3], [4, 5, 6]]).astype(np.float32) y = np.array([1, 2, 3]).astype(np.float32) # expected output [[1, 4, 27], [4, 25, 216]] -z = np.power(x, y).astype(np.float32) +z = pow(x, y) expect(node, inputs=[x, y], outputs=[z], name='test_pow_bcast_array') ``` +
+
+types + +```python +node = onnx.helper.make_node( + 'Pow', + inputs=['x', 'y'], + outputs=['z'], +) + +x = np.array([1, 2, 3]).astype(np.float32) +y = np.array([4, 5, 6]).astype(np.int64) +z = pow(x, y) # expected output [1., 32., 729.] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_float32_int64') + +x = np.array([1, 2, 3]).astype(np.int64) +y = np.array([4, 5, 6]).astype(np.float32) +z = pow(x, y) # expected output [1, 32, 729] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_int64_float32') + +x = np.array([1, 2, 3]).astype(np.float32) +y = np.array([4, 5, 6]).astype(np.int32) +z = pow(x, y) # expected output [1., 32., 729.] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_float32_int32') + +x = np.array([1, 2, 3]).astype(np.int32) +y = np.array([4, 5, 6]).astype(np.float32) +z = pow(x, y) # expected output [1, 32, 729] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_int32_float32') + +x = np.array([1, 2, 3]).astype(np.float32) +y = np.array([4, 5, 6]).astype(np.uint64) +z = pow(x, y) # expected output [1., 32., 729.] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_float32_uint64') + +x = np.array([1, 2, 3]).astype(np.float32) +y = np.array([4, 5, 6]).astype(np.uint32) +z = pow(x, y) # expected output [1., 32., 729.] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_float32_uint32') + +x = np.array([1, 2, 3]).astype(np.int64) +y = np.array([4, 5, 6]).astype(np.int64) +z = pow(x, y) # expected output [1, 32, 729] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_int64_int64') + +x = np.array([1, 2, 3]).astype(np.int32) +y = np.array([4, 5, 6]).astype(np.int32) +z = pow(x, y) # expected output [1, 32, 729] +expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_int32_int32') +``` +
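+The `pow` helper used above wraps `np.power` and casts the result back to the
+dtype of the base tensor; it is added by this patch in
+onnx/backend/test/case/node/pow.py. The cast matters because NumPy's type
+promotion would otherwise change the output dtype for mixed inputs, while
+Pow-12 keeps the output type of the first input:
+
+```python
+import numpy as np
+
+def pow(x, y):
+    # Cast back to the base dtype, mirroring the Pow-12 output type rule.
+    return np.power(x, y).astype(x.dtype)
+
+x = np.array([1, 2, 3], dtype=np.float32)
+y = np.array([4, 5, 6], dtype=np.int64)
+assert np.power(x, y).dtype == np.float64  # promotion without the cast
+assert pow(x, y).dtype == np.float32       # dtype follows the base tensor
+```
+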
@@ -10485,7 +10701,7 @@ expect(node, inputs=[x], outputs=[y], ### SoftmaxCrossEntropyLoss -There are 6 test cases, listed as following: +There are 7 test cases, listed as following:
softmaxcrossentropy_mean @@ -10564,6 +10780,36 @@ sce = softmaxcrossentropy(x, labels, weight=weights) expect(node, inputs=[x, labels, weights], outputs=[sce], name='test_softmax_cross_entropy_mean_weight') ``` +
+
+softmaxcrossentropy_mean_weights_ignore_index + +```python +# Define operator attributes. +reduction = 'mean' +ignore_index = np.int64(0) + +# Create operator. +node = onnx.helper.make_node('SoftmaxCrossEntropyLoss', + inputs=['x', 'y', 'w'], + outputs=['z'], + reduction=reduction, + ignore_index=ignore_index) + +# Define operator inputs. +np.random.seed(0) +x = np.random.rand(3, 5).astype(np.float32) +labels = np.random.randint(0, high=5, size=(3, )) +labels[0] = 0 +weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32) + +# Compute SoftmaxCrossEntropyLoss +sce = softmaxcrossentropy(x, labels, weight=weights, ignore_index=ignore_index) + +# Check results +expect(node, inputs=[x, labels, weights], outputs=[sce], name='test_softmax_cross_entropy_mean_weight_ignore_index') +``` +
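+Because reduction='mean' divides by the sum of the gathered weights, zeroing
+the weight of ignored targets removes them from both the numerator and the
+denominator. A small worked check with illustrative numbers (not taken from
+the test above):
+
+```python
+import numpy as np
+
+# Per-sample weighted losses and gathered weights for 3 samples;
+# sample 0 hit ignore_index, so its weight (and loss) is 0.
+weighted_loss = np.array([0.0, 1.4, 0.6], dtype=np.float32)
+gather_weight = np.array([0.0, 0.8, 0.9], dtype=np.float32)
+
+mean_loss = weighted_loss.sum() / gather_weight.sum()
+# = (1.4 + 0.6) / (0.8 + 0.9); only the non-ignored samples contribute.
+```
+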
softmaxcrossentropy_none

diff --git a/onnx/backend/test/case/model/sequence.py b/onnx/backend/test/case/model/sequence.py
index 560d440ca8a..1322c8db1bb 100644
--- a/onnx/backend/test/case/model/sequence.py
+++ b/onnx/backend/test/case/model/sequence.py
@@ -330,14 +330,15 @@ def make_graph(
     expect(model, inputs=[x], outputs=[out], name="test_sequence_model7")
 
     #8th testcase - split zero length
-    seq_split_node = onnx.helper.make_node('SplitToSequence', ['X'], ['seq_1'])
+    seq_split_node = onnx.helper.make_node('SplitToSequence', ['X', 'Splits'], ['seq_1'])
     seq_len_node = onnx.helper.make_node('SequenceLength', ['seq_1'], ['len'])
 
-    tensor_shape = []  # type: ignore
-    len_shape = []  # type: ignore
+    tensor_shape = ['n']  # type: ignore
+    splits_shape = [3]  # type: ignore
 
     x = np.array([]).astype(np.float32)
-    out_len = np.int64(0)
+    splits = np.array([0, 0, 0]).astype(np.int64)
+    out_len = np.int64(3)
 
     graph = onnx.helper.make_graph(
         nodes=[seq_split_node, seq_len_node],
@@ -348,9 +349,9 @@ def make_graph(
                 onnx.TensorProto.FLOAT,
                 tensor_shape),  # type: ignore
             onnx.helper.make_tensor_value_info(
-                'Split',
+                'Splits',
                 onnx.TensorProto.INT64,
-                len_shape)],  # type: ignore
+                splits_shape)],  # type: ignore
         outputs=[
             onnx.helper.make_tensor_value_info(
                 'len',
@@ -358,4 +359,4 @@ def make_graph(
                 len_shape)])  # type: ignore
 
     model = onnx.helper.make_model(graph, producer_name='backend-test')
-    expect(model, inputs=[x], outputs=[out_len], name="test_sequence_model8")
+    expect(model, inputs=[x, splits], outputs=[out_len], name="test_sequence_model8")
diff --git a/onnx/backend/test/case/node/batchnorm.py b/onnx/backend/test/case/node/batchnorm.py
index 44ae086329f..52d30d03d45 100644
--- a/onnx/backend/test/case/node/batchnorm.py
+++ b/onnx/backend/test/case/node/batchnorm.py
@@ -23,7 +23,7 @@ def batchnorm_test_mode(x, s, bias, mean, var, epsilon=1e-5):  # type: ignore
 
 def batchnorm_training_mode(x, s, bias, mean, var, momentum=0.9, epsilon=1e-5):  # type: ignore
     axis = np.arange(len(x.shape))
-    np.delete(axis, 1)
+    axis = np.delete(axis, 1)
     axis = tuple(axis)
     saved_mean = x.mean(axis=axis)
     saved_var = x.var(axis=axis)
@@ -43,7 +43,9 @@ def export_train():  # type: () -> None
     bias = np.array([0, 1]).astype(np.float32)
     mean = np.array([0, 3]).astype(np.float32)
     var = np.array([1, 1.5]).astype(np.float32)
-    training_mode = np.ones(1, dtype=bool)
+    # np.bool(1) fails during test data generation with "'bool' object has no attribute 'dtype'",
+    # so work around it by using np.byte(1).astype(bool)
+    training_mode = np.byte(1).astype(bool)
     y, saved_mean, saved_var, output_mean, output_var = batchnorm_training_mode(x, s, bias, mean, var)
 
     node = onnx.helper.make_node(
@@ -62,7 +64,7 @@ def export_train():  # type: () -> None
     bias = np.random.randn(3).astype(np.float32)
     mean = np.random.randn(3).astype(np.float32)
     var = np.random.rand(3).astype(np.float32)
-    training_mode = np.ones(1, dtype=bool)
+    training_mode = np.byte(1).astype(bool)
     momentum = 0.9
     epsilon = 1e-2
     y, saved_mean, saved_var, output_mean, output_var = batchnorm_training_mode(x, s, bias, mean, var, momentum, epsilon)
diff --git a/onnx/backend/test/case/node/momentum.py b/onnx/backend/test/case/node/momentum.py
new file mode 100644
index 00000000000..530f6062ccb
--- /dev/null
+++ b/onnx/backend/test/case/node/momentum.py
@@ -0,0 +1,147 @@
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+
+import numpy as np  # type: ignore
+
+import onnx
+from ..base import Base
+from . import expect
+
+
+def apply_momentum(r, t, x, g, v, norm_coefficient, alpha, beta):  # type: ignore
+    # Add gradient of regularization term.
+    g_regularized = norm_coefficient * x + g
+    # Coefficient of gradient should be 1 at the first iteration.
+    beta_adjusted = beta if t > 0 else 1
+    # Update momentum.
+    v_new = alpha * v + beta_adjusted * g_regularized
+    # Apply SG with momentum update rule.
+    x_new = x - r * v_new
+    return x_new, v_new
+
+
+def apply_nesterov(r, t, x, g, v, norm_coefficient, alpha, beta):  # type: ignore
+    # Add gradient of regularization term.
+    g_regularized = norm_coefficient * x + g
+    # Coefficient of gradient should be 1 at the first iteration.
+    beta_adjusted = beta if t > 0 else 1
+    # Update momentum.
+    v_new = alpha * v + beta_adjusted * g_regularized
+    # Apply Nesterov with momentum update rule.
+    x_new = x - r * (g_regularized + alpha * v_new)
+    return x_new, v_new
+
+
+class Momentum(Base):
+
+    @staticmethod
+    def export_momentum():  # type: () -> None
+        # Define operator attributes.
+        norm_coefficient = 0.001
+        alpha = 0.95
+        beta = 0.1
+
+        # Create operator.
+        node = onnx.helper.make_node('Momentum',
+                                     inputs=['R', 'T', 'X', 'G', 'V'],
+                                     outputs=['X_new', 'V_new'],
+                                     norm_coefficient=norm_coefficient,
+                                     alpha=alpha,
+                                     beta=beta,
+                                     mode='standard',
+                                     domain='ai.onnx.training'
+                                     )
+
+        # Define operator inputs.
+        r = np.array(0.1, dtype=np.float32)  # scalar
+        t = np.array(0, dtype=np.int64)  # scalar
+        x = np.array([1.2, 2.8], dtype=np.float32)
+        g = np.array([-0.94, -2.5], dtype=np.float32)
+        v = np.array([1.7, 3.6], dtype=np.float32)
+
+        # Compute expected outputs of Momentum.
+        x_new, v_new = apply_momentum(r, t, x, g, v,
+                                      norm_coefficient, alpha, beta)
+
+        # Check results.
+        expect(node, inputs=[r, t, x, g, v],
+               outputs=[x_new, v_new], name='test_momentum',
+               opset_imports=[onnx.helper.make_opsetid('ai.onnx.training', 1)])
+
+    @staticmethod
+    def export_nesterov_momentum():  # type: () -> None
+        # Define operator attributes.
+        norm_coefficient = 0.01
+        alpha = 0.95
+        beta = 1.0
+
+        # Create operator.
+        node = onnx.helper.make_node('Momentum',
+                                     inputs=['R', 'T', 'X', 'G', 'V'],
+                                     outputs=['X_new', 'V_new'],
+                                     norm_coefficient=norm_coefficient,
+                                     alpha=alpha,
+                                     beta=beta,
+                                     mode='nesterov',
+                                     domain='ai.onnx.training'
+                                     )
+
+        # Define operator inputs.
+        r = np.array(0.1, dtype=np.float32)  # scalar
+        t = np.array(0, dtype=np.int64)  # scalar
+        x = np.array([1.2, 2.8], dtype=np.float32)
+        g = np.array([-0.94, -2.5], dtype=np.float32)
+        v = np.array([1.7, 3.6], dtype=np.float32)
+
+        # Compute expected outputs of Momentum (nesterov mode).
+        x_new, v_new = apply_nesterov(r, t, x, g, v,
+                                      norm_coefficient, alpha, beta)
+
+        # Check results.
+        expect(node, inputs=[r, t, x, g, v],
+               outputs=[x_new, v_new], name='test_nesterov_momentum',
+               opset_imports=[onnx.helper.make_opsetid('ai.onnx.training', 1)])
+
+    @staticmethod
+    def export_momentum_multiple():  # type: () -> None
+        # Define operator attributes.
+        norm_coefficient = 0.001
+        alpha = 0.95
+        beta = 0.85
+
+        node = onnx.helper.make_node('Momentum',
+                                     inputs=['R', 'T', 'X1', 'X2',
+                                             'G1', 'G2', 'V1', 'V2'],
+                                     outputs=['X1_new', 'X2_new',
+                                              'V1_new', 'V2_new'],
+                                     norm_coefficient=norm_coefficient,
+                                     alpha=alpha,
+                                     beta=beta,
+                                     mode='standard',
+                                     domain='ai.onnx.training'
+                                     )
+
+        # Define operator inputs.
+        r = np.array(0.1, dtype=np.float32)  # scalar
+        t = np.array(0, dtype=np.int64)  # scalar
+
+        x1 = np.array([1.0], dtype=np.float32)
+        g1 = np.array([-1.0], dtype=np.float32)
+        v1 = np.array([2.0], dtype=np.float32)
+
+        x2 = np.array([1.0, 2.0], dtype=np.float32)
+        g2 = np.array([-1.0, -3.0], dtype=np.float32)
+        v2 = np.array([4.0, 1.0], dtype=np.float32)
+
+        # Compute expected outputs of Momentum.
+        x1_new, v1_new = apply_momentum(r, t, x1, g1, v1,
+                                        norm_coefficient, alpha, beta)
+        x2_new, v2_new = apply_momentum(r, t, x2, g2, v2,
+                                        norm_coefficient, alpha, beta)
+
+        # Check results.
+        expect(node, inputs=[r, t, x1, x2, g1, g2, v1, v2],
+               outputs=[x1_new, x2_new, v1_new, v2_new], name='test_momentum_multiple',
+               opset_imports=[onnx.helper.make_opsetid('ai.onnx.training', 1)])
diff --git a/onnx/backend/test/case/node/negativeloglikelihoodloss.py b/onnx/backend/test/case/node/negativeloglikelihoodloss.py
index 7f655c7c87f..1d08addc7af 100644
--- a/onnx/backend/test/case/node/negativeloglikelihoodloss.py
+++ b/onnx/backend/test/case/node/negativeloglikelihoodloss.py
@@ -10,7 +10,7 @@ from . import expect
 
 
-def compute_negative_log_likelihood_loss(input, target, weight=None, reduction='mean'):  # type: ignore
+def compute_negative_log_likelihood_loss(input, target, weight=None, reduction='mean', ignore_index=None):  # type: ignore
     ''' Compute negative_log_likelihood_loss '''
     input_shape = input.shape
@@ -19,20 +19,37 @@ def compute_negative_log_likelihood_loss(input, target, weight=None, reduction='
         N, C = input_shape
         neg_gather_element_input = np.zeros((N, ), dtype=np.float32)
         for i in range(N):
-            neg_gather_element_input[i] = -input[i][target[i]]
+            if target[i] != ignore_index:
+                neg_gather_element_input[i] = -input[i][target[i]]
     else:
         N, C, dim1, dim2 = input_shape
         neg_gather_element_input = np.zeros((N, dim1, dim2), dtype=np.float32)
         for i in range(N):
             for d1 in range(dim1):
                 for d2 in range(dim2):
-                    neg_gather_element_input[i][d1][d2] = -input[i][target[i][d1][d2]][d1][d2]
+                    if target[i][d1][d2] != ignore_index:
+                        neg_gather_element_input[i][d1][d2] = -input[i][target[i][d1][d2]][d1][d2]
 
     loss = neg_gather_element_input
 
     if weight is not None:
         # Gather(input=weight, index=target)
         gather_weight = np.take(weight, target)
+
+        if ignore_index is not None:
+            if len(input_shape) == 2:
+                for i in range(input_shape[0]):
+                    if target[i] == ignore_index:
+                        gather_weight[i] = 0
+
+            # Inputs here are either (N, C) or (N, C, d1, d2), so the spatial
+            # case is len(input_shape) == 4 with target of shape (N, d1, d2).
+            if len(input_shape) == 4:
+                for i in range(input_shape[0]):
+                    for d1 in range(input_shape[2]):
+                        for d2 in range(input_shape[3]):
+                            if target[i][d1][d2] == ignore_index:
+                                gather_weight[i][d1][d2] = 0
+
         loss = gather_weight * loss
     if reduction == 'mean':
         return loss.sum() / gather_weight.sum()
@@ -189,3 +206,27 @@ def export_input_shape_is_NCd1d2_with_weight_reduction_sum():  # type: () -> Non
 
         expect(node, inputs=[input, target, weight], outputs=[negative_log_likelihood_loss],
                name='test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum')
+
+    @staticmethod
+    def export_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index():  # type: () -> None
+        reduction = 'sum'
+        ignore_index = np.int64(0)
+        node = onnx.helper.make_node(
+            'NegativeLogLikelihoodLoss',
+            inputs=['input', 'target', 'weight'],
+            outputs=['loss'],
+            reduction=reduction,
+            ignore_index=ignore_index
+        )
+
+        N, C, dim1, dim2 = 3, 5, 6, 6
+        np.random.seed(0)
+        input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
+        target = np.random.randint(0, high=C, size=(N, dim1, dim2))
+        target[0][0][0] = 0
+        weight = np.random.rand(C).astype(np.float32)
+
+        negative_log_likelihood_loss
= compute_negative_log_likelihood_loss(input, target, weight=weight, reduction=reduction, ignore_index=ignore_index) + + expect(node, inputs=[input, target, weight], outputs=[negative_log_likelihood_loss], + name='test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index') diff --git a/onnx/backend/test/case/node/pow.py b/onnx/backend/test/case/node/pow.py index a48dfbdcabe..14d73aa6c66 100644 --- a/onnx/backend/test/case/node/pow.py +++ b/onnx/backend/test/case/node/pow.py @@ -10,6 +10,11 @@ from . import expect +def pow(x, y): # type: ignore + z = np.power(x, y).astype(x.dtype) + return z + + class Pow(Base): @staticmethod @@ -22,13 +27,13 @@ def export(): # type: () -> None x = np.array([1, 2, 3]).astype(np.float32) y = np.array([4, 5, 6]).astype(np.float32) - z = np.power(x, y) # expected output [1., 32., 729.] + z = pow(x, y) # expected output [1., 32., 729.] expect(node, inputs=[x, y], outputs=[z], name='test_pow_example') x = np.arange(60).reshape(3, 4, 5).astype(np.float32) y = np.random.randn(3, 4, 5).astype(np.float32) - z = np.power(x, y) + z = pow(x, y) expect(node, inputs=[x, y], outputs=[z], name='test_pow') @@ -42,7 +47,7 @@ def export_pow_broadcast(): # type: () -> None x = np.array([1, 2, 3]).astype(np.float32) y = np.array(2).astype(np.float32) - z = np.power(x, y) # expected output [1., 4., 9.] + z = pow(x, y) # expected output [1., 4., 9.] expect(node, inputs=[x, y], outputs=[z], name='test_pow_bcast_scalar') @@ -54,6 +59,62 @@ def export_pow_broadcast(): # type: () -> None x = np.array([[1, 2, 3], [4, 5, 6]]).astype(np.float32) y = np.array([1, 2, 3]).astype(np.float32) # expected output [[1, 4, 27], [4, 25, 216]] - z = np.power(x, y).astype(np.float32) + z = pow(x, y) expect(node, inputs=[x, y], outputs=[z], name='test_pow_bcast_array') + + @staticmethod + def export_types(): # type: () -> None + node = onnx.helper.make_node( + 'Pow', + inputs=['x', 'y'], + outputs=['z'], + ) + + x = np.array([1, 2, 3]).astype(np.float32) + y = np.array([4, 5, 6]).astype(np.int64) + z = pow(x, y) # expected output [1., 32., 729.] + expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_float32_int64') + + x = np.array([1, 2, 3]).astype(np.int64) + y = np.array([4, 5, 6]).astype(np.float32) + z = pow(x, y) # expected output [1, 32, 729] + expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_int64_float32') + + x = np.array([1, 2, 3]).astype(np.float32) + y = np.array([4, 5, 6]).astype(np.int32) + z = pow(x, y) # expected output [1., 32., 729.] + expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_float32_int32') + + x = np.array([1, 2, 3]).astype(np.int32) + y = np.array([4, 5, 6]).astype(np.float32) + z = pow(x, y) # expected output [1, 32, 729] + expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_int32_float32') + + x = np.array([1, 2, 3]).astype(np.float32) + y = np.array([4, 5, 6]).astype(np.uint64) + z = pow(x, y) # expected output [1., 32., 729.] + expect(node, inputs=[x, y], outputs=[z], + name='test_pow_types_float32_uint64') + + x = np.array([1, 2, 3]).astype(np.float32) + y = np.array([4, 5, 6]).astype(np.uint32) + z = pow(x, y) # expected output [1., 32., 729.] 
+        expect(node, inputs=[x, y], outputs=[z],
+               name='test_pow_types_float32_uint32')
+
+        x = np.array([1, 2, 3]).astype(np.int64)
+        y = np.array([4, 5, 6]).astype(np.int64)
+        z = pow(x, y)  # expected output [1, 32, 729]
+        expect(node, inputs=[x, y], outputs=[z],
+               name='test_pow_types_int64_int64')
+
+        x = np.array([1, 2, 3]).astype(np.int32)
+        y = np.array([4, 5, 6]).astype(np.int32)
+        z = pow(x, y)  # expected output [1, 32, 729]
+        expect(node, inputs=[x, y], outputs=[z],
+               name='test_pow_types_int32_int32')
diff --git a/onnx/backend/test/case/node/softmaxcrossentropy.py b/onnx/backend/test/case/node/softmaxcrossentropy.py
index 1948bc02f43..3e8adca88f6 100644
--- a/onnx/backend/test/case/node/softmaxcrossentropy.py
+++ b/onnx/backend/test/case/node/softmaxcrossentropy.py
@@ -10,7 +10,7 @@ from . import expect
 
 
-def softmaxcrossentropy(x, target, weight=None, reduction='mean'):  # type: ignore
+def softmaxcrossentropy(x, target, weight=None, reduction='mean', ignore_index=None):  # type: ignore
     max_x = np.max(x, axis=1, keepdims=True)
     exp_x = np.exp(x - max_x)
     p = exp_x / np.sum(exp_x, axis=1, keepdims=True)
@@ -20,17 +20,32 @@ def softmaxcrossentropy(x, target, weight=None, reduction='mean'):  # type: igno
         N, C = input_shape
         neg_gather_element_input = np.zeros((N, ), dtype=np.float32)
         for i in range(N):
-            neg_gather_element_input[i] = -inp[i][target[i]]
+            if target[i] != ignore_index:
+                neg_gather_element_input[i] = -inp[i][target[i]]
 
     if len(input_shape) == 3:
         N, C, D = input_shape
         neg_gather_element_input = np.zeros((N, D), dtype=np.float32)
         for i in range(N):
             for d in range(D):
-                neg_gather_element_input[i][d] = -inp[i][target[i][d]][d]
+                if target[i][d] != ignore_index:
+                    neg_gather_element_input[i][d] = -inp[i][target[i][d]][d]
 
     loss = neg_gather_element_input
 
     if weight is not None:
         gather_weight = np.take(weight, target)
+
+        if ignore_index is not None:
+            if len(input_shape) == 2:
+                for i in range(input_shape[0]):
+                    if target[i] == ignore_index:
+                        gather_weight[i] = 0
+
+            if len(input_shape) == 3:
+                for i in range(input_shape[0]):
+                    for j in range(input_shape[2]):
+                        if target[i][j] == ignore_index:
+                            gather_weight[i][j] = 0
+
         loss = gather_weight * loss
     if reduction == 'mean':
         return loss.sum() / gather_weight.sum()
@@ -178,3 +193,29 @@ def export_softmaxcrossentropy_mean_weights():  # type: () -> None
 
         # Check results
         expect(node, inputs=[x, labels, weights], outputs=[sce], name='test_softmax_cross_entropy_mean_weight')
+
+    @staticmethod
+    def export_softmaxcrossentropy_mean_weights_ignore_index():  # type: () -> None
+        # Define operator attributes.
+        reduction = 'mean'
+        ignore_index = np.int64(0)
+
+        # Create operator.
+        node = onnx.helper.make_node('SoftmaxCrossEntropyLoss',
+                                     inputs=['x', 'y', 'w'],
+                                     outputs=['z'],
+                                     reduction=reduction,
+                                     ignore_index=ignore_index)
+
+        # Define operator inputs.
+ np.random.seed(0) + x = np.random.rand(3, 5).astype(np.float32) + labels = np.random.randint(0, high=5, size=(3, )) + labels[0] = 0 + weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32) + + # Compute SoftmaxCrossEntropyLoss + sce = softmaxcrossentropy(x, labels, weight=weights, ignore_index=ignore_index) + + # Check results + expect(node, inputs=[x, labels, weights], outputs=[sce], name='test_softmax_cross_entropy_mean_weight_ignore_index') diff --git a/onnx/backend/test/data/node/test_batchnorm_epsilon_old/model.onnx b/onnx/backend/test/data/node/test_batchnorm_epsilon_old/model.onnx index a58bef46800..870cd21363c 100644 --- a/onnx/backend/test/data/node/test_batchnorm_epsilon_old/model.onnx +++ b/onnx/backend/test/data/node/test_batchnorm_epsilon_old/model.onnx @@ -1,4 +1,4 @@ - backend-test: + backend-test: A x s diff --git a/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/model.onnx b/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/model.onnx index 94158119efbc9108fa9f4a63ab6648871246b134..b34f0a5391af2f70e7b05760c795fa2c5bc74873 100644 GIT binary patch delta 119 zcmX@ie4JT`gI$OxDKR-aH7`ZCB(=E2YQsh$5k^^YF5Z%&#LT?Ry!80o{FGE7HZB$p zP9cUQX)eafi3cSn$1|FlNpNu$CzhqA#OJ0a<_U3ead0pSv2ZbQFeiy~aYAIu5{r-} IoR|c70CGwjvj6}9 delta 107 zcmX@ke3)5?gH4DhDKR-aH7`ZCB(=E2YRyI=5k?tlF5Z%&#LT?Ry!80o{FGE7E-nrZ zP9YX9CJx5Qc8tM#B3xX>iDjuN@wusqc|vSlEF6qN3`xRVoDk8n#3GoW6O#ZB0PA`g A4FCWD diff --git a/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/input_5.pb b/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/input_5.pb index 2b72d47c5f9..f72fe9452db 100644 --- a/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/input_5.pb +++ b/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/input_5.pb @@ -1 +1 @@ - B training_modeJ \ No newline at end of file + B training_modeJ \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/output_0.pb index 29af295c856d443c74b4fc4c2fb15286670b7847..77cd8aabed707547c796b76dd9094c23c8f03e0e 100644 GIT binary patch literal 496 zcmVL1PBEX0YU+JO5g)EN&rCB5V=1gXV*W6$_zl`00Te*uVFut=G8yh zC7eG3te!vooVh;)qpLqaR`5UY{L4SXcB(&sTf0ARl(j#QZ1O*?Q<*>1f3!dJnQlK{ z^$0-6+>XC;L#n?NIGR4dbuhq&AA>(YZ2!LGP=>zKkO07J7#zQ~9#+3*?d3ko zi^o4$1IxXhbsWFI(AmEaFxJ1I7jnI1?i9XNJ>$N95evWAtscKtxVJu_W|_Y3myf<{ zIsd-2tv5Z)Ss1?zaACfjL(aaab#%QEkD0zQR`tHe+%CQx3`f0K=@h=|fT+Ehd#b*K zMTNc+(||uaJCQ$1<#InYBNIOyW34}E1dl(kHXT26S-d~31!+IX&Za*cP|82s*s4FZ z)89YctVTa4F1bHVEr36l*=#?YyM;fCzmq?02d6)U5Ia9e{;t24-E6<;!;L=@So^=y zDHy=Y9@D=E7mB{4h9W=o^}@d}D=@vY%h;kZ29Jl*O5A2C#yY@L1PBEX0YU+JO5g*+{PaJ79JW8Y$k0C|d<8&E!udbsIBh?jcF#Yv zube;7=bk^gm$pCcO{zcGwB|p5zs5iDMX5hdAh zC?mj7!{xu%?!UbkhVRbum!(v`MW-~YMj2MFwMRI^te7Up>RH#+^{}k mijh6m&)vRa61P5jB!|BE0V+K@pQ$~(L~Xucsz|(0{F*%z= \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/output_3.pb b/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/output_3.pb index 16d491d0a59..b6f0d22db61 100644 --- a/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/output_3.pb +++ b/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/output_3.pb @@ -1,2 +1,2 @@ -B -saved_meanJ`> \ No newline at end of file +B +saved_meanJ "`= 4t>O= \ No 
newline at end of file diff --git a/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/output_4.pb b/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/output_4.pb index 7f174abd3a1..9d7c5f6aeca 100644 --- a/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/output_4.pb +++ b/onnx/backend/test/data/node/test_batchnorm_epsilon_training_mode/test_data_set_0/output_4.pb @@ -1 +1 @@ -B saved_varJ{ҋ? \ No newline at end of file +B saved_varJ 0+X?- ?? \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_batchnorm_example_old/model.onnx b/onnx/backend/test/data/node/test_batchnorm_example_old/model.onnx index b7b47f79c09..d21127e42c0 100644 --- a/onnx/backend/test/data/node/test_batchnorm_example_old/model.onnx +++ b/onnx/backend/test/data/node/test_batchnorm_example_old/model.onnx @@ -1,4 +1,4 @@ - backend-test: + backend-test: . x s diff --git a/onnx/backend/test/data/node/test_batchnorm_example_training_mode/model.onnx b/onnx/backend/test/data/node/test_batchnorm_example_training_mode/model.onnx index 97d51b19d7fd0deb73b4a502c1851cc8a4d286ac..8727e3e78bef95f27f6a5a37009ad6dfd3bed150 100644 GIT binary patch delta 110 zcmZ3_yqQ^ugI$OxDKR-aH7`ZCB(=E2YR*C-MhS5)-jbrk%)HFJ^!VKTlvE)$E*1_> zA%@9%jKO9STwKM8WvMCgxv7bHLR?%N9E?INTudBHN#b0b5Sg;XB4i0CCIKD*HLw`X delta 106 zcmdnYyq;NzgH4DhDKR-aH7`ZCB(=E2YQ{n#Mj2@?-jbrk%)HFJ^!VKTlvE)uE)EV( zAr>ws4#vq^jKO*$TwKM8WvMCgxv7bHLTp?t9E?H?Ny1#55Ye*4BAB2PlK>9@yww<) diff --git a/onnx/backend/test/data/node/test_batchnorm_example_training_mode/test_data_set_0/input_5.pb b/onnx/backend/test/data/node/test_batchnorm_example_training_mode/test_data_set_0/input_5.pb index 2b72d47c5f9..f72fe9452db 100644 --- a/onnx/backend/test/data/node/test_batchnorm_example_training_mode/test_data_set_0/input_5.pb +++ b/onnx/backend/test/data/node/test_batchnorm_example_training_mode/test_data_set_0/input_5.pb @@ -1 +1 @@ - B training_modeJ \ No newline at end of file + B training_modeJ \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_batchnorm_example_training_mode/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_batchnorm_example_training_mode/test_data_set_0/output_0.pb index e698ff662bf0754defaf14212c0ecdd0c1c59bef..5211daa8b6d8a206444ac22784393c083a5cafdb 100644 GIT binary patch literal 39 qcmd;JzsRICtjR^+; literal 39 vcmd;JPx# literal 27 icmd;J5@2-V&Mz$~C@qQ4O-;=6;+Qp4(k?B{%mDyq$_O|B diff --git a/onnx/backend/test/data/node/test_batchnorm_example_training_mode/test_data_set_0/output_2.pb b/onnx/backend/test/data/node/test_batchnorm_example_training_mode/test_data_set_0/output_2.pb index 1ea67621406..ab0f6f0d655 100644 --- a/onnx/backend/test/data/node/test_batchnorm_example_training_mode/test_data_set_0/output_2.pb +++ b/onnx/backend/test/data/node/test_batchnorm_example_training_mode/test_data_set_0/output_2.pb @@ -1,2 +1,2 @@ B -output_varJ?""? \ No newline at end of file +output_varJwww?UU? 
\ No newline at end of file diff --git a/onnx/backend/test/data/node/test_batchnorm_example_training_mode/test_data_set_0/output_3.pb b/onnx/backend/test/data/node/test_batchnorm_example_training_mode/test_data_set_0/output_3.pb index bee5bfe8c58006b1d5dab906dc05bcd5af4312e2..0593c9a9e913fdcfa48cd550bd009486df4864e5 100644 GIT binary patch literal 26 dcmd;J5@2-VDo!j*O^MG7xV6oq@!q*%8{X2*C_+2TgcS-fg%GPE-0qz0e8DWL6Mpz)^JX#kdw>-*y8M39(lI)jG z`xuFj&o_}wChhD1kUn5oN-L7n@i_H*fT1?E9J1C5szp;D;5vQ6m`tOK%E-+|Rm$B; z4FAsDLKX`(HblFzZG<94ul(NID)i;$%dPXqZ!l!Que?~DMbC-8^UD#L>x3Avz=OqK Ykas~jz)P~unuq6{-O2eQ{&NK-HOvS#`d{?sPGqVL5 zgNqS4Y0S!+i7{TY83DlN$^{1_OTr`KG$bBm;z1@EWK2K(WG9=#e_K68mbTsX)MmRT Rdup?7ug7!mr!Z*3Lw|4RTkik> literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_0.pb new file mode 100644 index 00000000000..d0483cc61f7 --- /dev/null +++ b/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_0.pb @@ -0,0 +1 @@ +BRJ= \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..54656de61df113af74638d4e31110b7cb8c8c9f0 GIT binary patch literal 15 QcmWe&cVZ0j;$VOR01J-+0RR91 literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_2.pb b/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_2.pb new file mode 100644 index 0000000000000000000000000000000000000000..259b58bf7f3d11fde5b959e44f4a97ae791ed8c5 GIT binary patch literal 14 Vcmd;J6kv2>iZJwIVPI&m2LKBq0rda? 
literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_3.pb b/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_3.pb new file mode 100644 index 0000000000000000000000000000000000000000..eb51654d455357db15a81c2fc6a46a89d28f47b2 GIT binary patch literal 18 Zcmd;J5@2*ayRs1VPI(34*&}q0%QOH literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_5.pb b/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_5.pb new file mode 100644 index 0000000000000000000000000000000000000000..62e5b3a8f7515918af3f49d0b6536a2fcea0da0b GIT binary patch literal 18 Zcmd;J5@2*@-XybVPIfz000T20cHRI literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_7.pb b/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/input_7.pb new file mode 100644 index 0000000000000000000000000000000000000000..cb33a3de6635f4b8e576e11be4af29c88323dfdf GIT binary patch literal 18 Xcmd;J5@2*<@-Xt^U|?u)0AhOp6S@Mc literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/output_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..cc63aee5c44bdc27df70bb0c29c8593dda990200 GIT binary patch literal 18 Zcmd;J6kv2>i!hAOOD*?eVPI&m2LK%N1EK%` literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/output_1.pb b/onnx/backend/test/data/node/test_momentum_multiple/test_data_set_0/output_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..a3e2b32c78b8014b6d81469d0adad991b2143b0c GIT binary patch literal 22 dcmd;J5@2*x0a8IgMA~DHah?k69R1j delta 11 ScmZ3>x0a8IgKZ;|Hah?k4gzcd diff --git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_reduction_mean/model.onnx b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_reduction_mean/model.onnx index aee8b92e5d9617b5b8626e76db65b087e2e1ec65..6455c78d411b2ec4b88aad83580bda7faecb46af 100644 GIT binary patch delta 10 Rcmeyy_>GZ?gMA{?7XTDt1BUGZ?gKZ+y7XTDp1BL(q diff --git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_reduction_mean_expanded/model.onnx b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_reduction_mean_expanded/model.onnx index 5eb7e7cd05b1400214ee9a0ee202b6a875a34a8e..d7a6ef04b5c4d5c44e324e65d58bd5fb93fba7bc 100644 GIT binary patch delta 11 ScmeAd>K9_-VBg5Z%LxDv$pRe! 
delta 11 ScmeAd>K9_-VB5&V%LxDv!~z@u diff --git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_reduction_sum/model.onnx b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_reduction_sum/model.onnx index 27a5fadc85ac98f84fbc3ddd71bf6d309777f6f1..4f134ce1b2aeef7049b710a0061fa8cbcfc7ffa7 100644 GIT binary patch delta 10 Rcmeyu_=S;)gMA{?Cjb;X1Azbl delta 10 Rcmeyu_=S;)gKZ+yCjb;T1AqVk diff --git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_reduction_sum_expanded/model.onnx b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_reduction_sum_expanded/model.onnx index 37e8a8bc8213f8e3d6df34a3cca91689bd844bf4..99f9bfe5f25a43ca1f1f1849174234afad371fc0 100644 GIT binary patch delta 11 Scmew-_)n0DgMA~@9}WN-Km+Fh delta 11 Scmew-_)n0DgKZ?gigMA~D3?l#!>jEMG delta 11 ScmbQoG>?gigKZ;|3?l#!<^mxA diff --git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_expanded/model.onnx b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_expanded/model.onnx index 129b0d2170914885694c1f701f5b9733f41695e0..b4b60b33b3d4ab8dde194b5482e8f60c2492b700 100644 GIT binary patch delta 11 Scmdlbu}gx9gMA~D1rGod>H@6* delta 11 Scmdlbu}gx9gKZ;|1rGod0}6LXUz6J$lq@=nv?@A?XgArE#}pH`britN+r8O@oJ* zWxk%7cg2;=>uR?rtDXY+@cI)&4Rd1;eSj=fohE*dWONf`+B)yocN@}GFke(UU79_$ zrg2V{Tzb5P+-U1MLSq$uuV$wx=Hzups^10U>^kxN=P63$$FL0TNW#`>Y$q0n-ZO%4 zK;Kjc#(CVlt7r9sM-v)6--8ntnTr26Zy!bvatR9&$pT`Lu;#36(Dwe~PD&_5CiAi5 bTwP3_0R+#d!)$#6e?K5INI5rqi;MghwVYZ+ literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/input_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..1a71820b36a1975cdfa96cecb1c5fcf20bc15a86 GIT binary patch literal 2180 zcmWO6izC+e0tWDs_r*a=#9_lCN^T>U^!~oj6Ro*KvxXwHS1ftmg^^3SZm}Yv%UUNY za>{KbtqW>vN?NsASv9(>#jHA0TbGrc&mZy8oTMq!lxxbTO!ShiPE1Z;6?nnS5x>?t&E!&U%}P1 zQaHH0$Ir@#akog1dcSXz!cEP%xcx1zd}PY3Z7Doc=0KHRCmK!-;%Tij|8C9@9&NKZ z+HA|x#HH|*spE(I-SD6LHR7J>QCTqrJ$Wt%|Gp^B-3`VP2P?ekJd?>S>*3q|5QqG- zG1_Z^2%S5^T{eqX^L268-xuX)k7K9Z5^CK#BQ_O=5Z`|z_&kfcV~?P&o6T9*OsJeM zs4RS2=%25^PMbeP>#kZ9&m0qN7wX_`as{VX6pEh~I3!{2@D5E1hv%#6tfVJi641<24Cf zALPOftJOr!-dS)pHmB#e`n2DkhJCY=IA~`|g}D}&svFUJcpg2E>_Xk$KKR@YfpR|( zeJX`>J`_}$zJzix7LybO_`4-R+)(|Dj{04wT;R;s*dAz%xI^3aoMiv?XQE})o=^N9 zz%xRN^UGuSQ%Npi5{bMn8#-Msfv??u$;8TwegpMT27iOG9fqv!3gQ#>>D2mo1rO<# zvLQ0k~Fm;Z*4F z2(#LVb?%1TV->-&LyyJNRCV5xG+YgUBA>Wq0<348i3D_i0R z8Dj^RNwvLg5bZ{=3th`6qfaoc={2$rS@l_03qk2Vk zXE^8UWpTmtbjl23`Pa<%IAL=Wa{j$fjI6(ivlkL@tw{qJKZc3VyFDn)Q%Jq&P?8<0>hhvUU4VX4}VFqaZ|l?|ix zHwP9NI5Wzo5@(A~p+4UOW2A0aHwN?>zcZai?^@C2`V?_E;wCaH)cDX#izkvC zs6Txp$A^v319Ay zrR|wOmJ3Vdj=#;S@jmQJS|@r;1|U+4P!$%+q&Q7+FF&0R{x}TZx(k9 zb?dQgjvQO&ER-}0N5#<~Gv0{J;{4KFZb-U>l3UieS8T}ut(W-IHTtF#y?w$y7bt41+r^G>^T4gl(O;X%WP_bMdrEmBhEM ziI|?!j=j4BY3rNAPbYuE_NG znb*^V|5nFyxI3KLN!}vq)_M%7?!nsoN0F*(hi0arvzr5Jg3S5OFRm=Q_5mi3HDjKc zJwKhMWR8P2)lOF6QOjRa-E~(6*-e!CzCQ`yOB&37Bc8wc91#bueUEIr@1(2-4e literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/input_1.pb 
b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..febe677b84bd42bb04ce84d35a913f994cf373da GIT binary patch literal 451 zcmYk0!3}^g3t%MGtEa6_|Pfx bC^U87NU3|sG8-P}c-+G$^r*q!@vvikna%{# literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/input_2.pb b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/input_2.pb new file mode 100644 index 00000000000..6d74ae43745 --- /dev/null +++ b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/input_2.pb @@ -0,0 +1 @@ +BweightJY?`o?p?W=K? \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/output_0.pb new file mode 100644 index 00000000000..d97cebfae6a --- /dev/null +++ b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index/test_data_set_0/output_0.pb @@ -0,0 +1 @@ +BlossJbڿ \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/model.onnx b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/model.onnx new file mode 100644 index 0000000000000000000000000000000000000000..96173792901c7d2ff7370762369770a0fd8e895c GIT binary patch literal 4425 zcmd5=Pixdb6yM2av*~kbm$8%{1R=)-5B2CtEaGMD!Np52Lz=wJ4x62qDaSu)}j^{YP~G-A`^KlV24&hs%2$_p2bTPQj4R* z^dP-2&ZL=%GnCoPh!WE(F|trXR|_FCRg_j#rMMEfTE@1MifN@9k#@pii?*IC{h~rV z$I&hD;}zxIks1?I+Fnu^VcqW2yJ8jy<^a zfcgRWD?%D>!om7~8g*?z!Dq%_6GE31%7*eemPHgE*8NzS(QW9R#(9OKe#nNwh;=TQ z51$w*wL0L1!W(c~fA!%{4@kNL&~yV*%4&^F6BVg|a6EHhd8RkXWrFaP@%QAI=IT9v zL*zD$^NsO$tc%)drU^!~oj6Ro*KvxXwHS1ftmg^^3SZm}Yv%UUNY za>{KbtqW>vN?NsASv9(>#jHA0TbGrc&mZy8oTMq!lxxbTO!ShiPE1Z;6?nnS5x>?t&E!&U%}P1 zQaHH0$Ir@#akog1dcSXz!cEP%xcx1zd}PY3Z7Doc=0KHRCmK!-;%Tij|8C9@9&NKZ z+HA|x#HH|*spE(I-SD6LHR7J>QCTqrJ$Wt%|Gp^B-3`VP2P?ekJd?>S>*3q|5QqG- zG1_Z^2%S5^T{eqX^L268-xuX)k7K9Z5^CK#BQ_O=5Z`|z_&kfcV~?P&o6T9*OsJeM zs4RS2=%25^PMbeP>#kZ9&m0qN7wX_`as{VX6pEh~I3!{2@D5E1hv%#6tfVJi641<24Cf zALPOftJOr!-dS)pHmB#e`n2DkhJCY=IA~`|g}D}&svFUJcpg2E>_Xk$KKR@YfpR|( zeJX`>J`_}$zJzix7LybO_`4-R+)(|Dj{04wT;R;s*dAz%xI^3aoMiv?XQE})o=^N9 zz%xRN^UGuSQ%Npi5{bMn8#-Msfv??u$;8TwegpMT27iOG9fqv!3gQ#>>D2mo1rO<# zvLQ0k~Fm;Z*4F z2(#LVb?%1TV->-&LyyJNRCV5xG+YgUBA>Wq0<348i3D_i0R z8Dj^RNwvLg5bZ{=3th`6qfaoc={2$rS@l_03qk2Vk zXE^8UWpTmtbjl23`Pa<%IAL=Wa{j$fjI6(ivlkL@tw{qJKZc3VyFDn)Q%Jq&P?8<0>hhvUU4VX4}VFqaZ|l?|ix zHwP9NI5Wzo5@(A~p+4UOW2A0aHwN?>zcZai?^@C2`V?_E;wCaH)cDX#izkvC zs6Txp$A^v319Ay zrR|wOmJ3Vdj=#;S@jmQJS|@r;1|U+4P!$%+q&Q7+FF&0R{x}TZx(k9 zb?dQgjvQO&ER-}0N5#<~Gv0{J;{4KFZb-U>l3UieS8T}ut(W-IHTtF#y?w$y7bt41+r^G>^T4gl(O;X%WP_bMdrEmBhEM ziI|?!j=j4BY3rNAPbYuE_NG znb*^V|5nFyxI3KLN!}vq)_M%7?!nsoN0F*(hi0arvzr5Jg3S5OFRm=Q_5mi3HDjKc zJwKhMWR8P2)lOF6QOjRa-E~(6*-e!CzCQ`yOB&37Bc8wc91#bueUEIr@1(2-4e literal 0 HcmV?d00001 diff 
--git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..febe677b84bd42bb04ce84d35a913f994cf373da GIT binary patch literal 451 zcmYk0!3}^g3t%MGtEa6_|Pfx bC^U87NU3|sG8-P}c-+G$^r*q!@vvikna%{# literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/test_data_set_0/input_2.pb b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/test_data_set_0/input_2.pb new file mode 100644 index 00000000000..6d74ae43745 --- /dev/null +++ b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/test_data_set_0/input_2.pb @@ -0,0 +1 @@ +BweightJY?`o?p?W=K? \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/test_data_set_0/output_0.pb new file mode 100644 index 00000000000..d97cebfae6a --- /dev/null +++ b/onnx/backend/test/data/node/test_negative_log_likelihood_loss_input_shape_is_NCd1d2_with_weight_reduction_sum_ignore_index_expanded/test_data_set_0/output_0.pb @@ -0,0 +1 @@ +BlossJbڿ \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_nesterov_momentum/model.onnx b/onnx/backend/test/data/node/test_nesterov_momentum/model.onnx new file mode 100644 index 0000000000000000000000000000000000000000..1f3fb547acead6c3d2cf783e4d4632074cc15b86 GIT binary patch literal 326 zcmd;J7vf1uOwLZtOVKS!EiSQ|%f!{q$i*1M#TdfH7{SHp&czre#2OKwms&2w8U~`2 zIDGSSQ}aqnbG7)nSQB#!G7?3Njf?FUFfwZKaj_(&mL!TYFf@Sq!dxu5`6;PN9C<*q zQ;YJ;7BDhvNpT6}N6k_9I;b0VE0C7UV9CjdwALwu)E-ntB3=0<%2NOuZ9bF&{ bSs+Q63+hEAZ6HCghmln}iEv>!QGgKueA-B# literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_0.pb new file mode 100644 index 00000000000..d0483cc61f7 --- /dev/null +++ b/onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_0.pb @@ -0,0 +1 @@ +BRJ= \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..54656de61df113af74638d4e31110b7cb8c8c9f0 GIT binary patch literal 15 QcmWe&cVZ0j;$VOR01J-+0RR91 literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_2.pb b/onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_2.pb new file mode 100644 index 00000000000..15244fd4b75 --- /dev/null +++ b/onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_2.pb @@ -0,0 +1 @@ +BXJ?333@ \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_3.pb 
b/onnx/backend/test/data/node/test_nesterov_momentum/test_data_set_0/input_3.pb new file mode 100644 index 0000000000000000000000000000000000000000..439d577ffbbfb621346b9f41aca167107861274a GIT binary patch literal 17 Ycmd;J5@2*qTJG#E!I=7L#LjkfIa{vGU delta 62 zcmV-E0KxzM0saAy8UeqTJG#E!I=GR$LjgD&aR2}S diff --git a/onnx/backend/test/data/node/test_pow_bcast_array/model.onnx b/onnx/backend/test/data/node/test_pow_bcast_array/model.onnx index 8b630fa48b5..0edfc2fb0ef 100644 --- a/onnx/backend/test/data/node/test_pow_bcast_array/model.onnx +++ b/onnx/backend/test/data/node/test_pow_bcast_array/model.onnx @@ -1,4 +1,4 @@ - backend-test:a + backend-test:a  x yz"Powtest_pow_bcast_arrayZ @@ -13,4 +13,4 @@ z   -B \ No newline at end of file +B \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_pow_bcast_scalar/model.onnx b/onnx/backend/test/data/node/test_pow_bcast_scalar/model.onnx index b9eaabe2114b120ff0c798541e98105d77aa6b10..bea5d68cad06df8c466ae46682581112f02f9784 100644 GIT binary patch delta 10 Rcmd1FVd7w)$dt**0{{%B0rLO= delta 10 Rcmd1FVd7w($dt**2>=X>0qg(( diff --git a/onnx/backend/test/data/node/test_pow_example/model.onnx b/onnx/backend/test/data/node/test_pow_example/model.onnx index ddc1ea3ac00..a4aa9a398bf 100644 --- a/onnx/backend/test/data/node/test_pow_example/model.onnx +++ b/onnx/backend/test/data/node/test_pow_example/model.onnx @@ -1,4 +1,4 @@ - backend-test:U + backend-test:U  x yz"Powtest_pow_exampleZ @@ -13,4 +13,4 @@ z  -B \ No newline at end of file +B \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_pow_types_float/model.onnx b/onnx/backend/test/data/node/test_pow_types_float/model.onnx new file mode 100644 index 00000000000..c0fd50393cd --- /dev/null +++ b/onnx/backend/test/data/node/test_pow_types_float/model.onnx @@ -0,0 +1,16 @@ + backend-test:Y + +x +yz"Powtest_pow_types_floatZ +x + + +Z +y + + +b +z + + +B \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_pow_types_float/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_pow_types_float/test_data_set_0/input_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..d91963f15c890955c4049fa33858b8d64ca4c264 GIT binary patch literal 33 Ycmd;J7GQT`tniXxWPkuBD9sF|0V1^lMgRZ+ literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_float/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_pow_types_float/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..a760307aaa9d741605ce33d06bfeb54d3669be86 GIT binary patch literal 21 acmd;J7GQK@tn}hxU}$h)U|0ae2OIz(Yy-~# literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_float/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_pow_types_float/test_data_set_0/output_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..c77d2103a132ff11755f8c395343ece7c1ccd1ff GIT binary patch literal 33 acmd;J7GQT`tn!jzWPkt#D1DO&!TTL032)r>i_@% literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_float32_int64/model.onnx b/onnx/backend/test/data/node/test_pow_types_float32_int64/model.onnx new file mode 100644 index 00000000000..9fe81897e54 --- /dev/null +++ b/onnx/backend/test/data/node/test_pow_types_float32_int64/model.onnx @@ -0,0 +1,16 @@ + backend-test:a + +x +yz"Powtest_pow_types_float32_int64Z +x + + +Z +y + + +b +z + + +B \ No newline at end of file diff --git 
a/onnx/backend/test/data/node/test_pow_types_float32_int64/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_pow_types_float32_int64/test_data_set_0/input_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..62e4e87e30c2908b48e0e912c49f073faf7953fd GIT binary patch literal 21 acmd;J7GQK@tnlJtU}&&sU|?_nA_o8)lme{) literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_float32_int64/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_pow_types_float32_int64/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..2a776165494d557dfd635a732120b459acb76fcf GIT binary patch literal 33 Ycmd;J7GQT`tn`v#VSoTuD9r|?0V7}mPyhe` literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_float32_int64/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_pow_types_float32_int64/test_data_set_0/output_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..0cc39708ca5cd984c0c32285657b92f0545914ba GIT binary patch literal 21 ccmd;J7GQK@tn%VvU}&&sU|?`!a4>TL032)r>i_@% literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_float32_uint32/model.onnx b/onnx/backend/test/data/node/test_pow_types_float32_uint32/model.onnx new file mode 100644 index 00000000000..110e4dc2fc7 --- /dev/null +++ b/onnx/backend/test/data/node/test_pow_types_float32_uint32/model.onnx @@ -0,0 +1,16 @@ + backend-test:b + +x +yz"Powtest_pow_types_float32_uint32Z +x + + +Z +y + +  +b +z + + +B \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_pow_types_float32_uint32/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_pow_types_float32_uint32/test_data_set_0/input_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..62e4e87e30c2908b48e0e912c49f073faf7953fd GIT binary patch literal 21 acmd;J7GQK@tnlJtU}&&sU|?_nA_o8)lme{) literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_float32_uint32/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_pow_types_float32_uint32/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..7918aa428c7bc792a28f31407f517e45ba1c62f9 GIT binary patch literal 21 Zcmd;J7T|GWtn}hxVPIfj1!6WJ1^^SH0Z9M= literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_float32_uint32/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_pow_types_float32_uint32/test_data_set_0/output_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..0cc39708ca5cd984c0c32285657b92f0545914ba GIT binary patch literal 21 ccmd;J7GQK@tn%VvU}&&sU|?`!a4>TL032)r>i_@% literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_float32_uint64/model.onnx b/onnx/backend/test/data/node/test_pow_types_float32_uint64/model.onnx new file mode 100644 index 00000000000..46cdbc7e253 --- /dev/null +++ b/onnx/backend/test/data/node/test_pow_types_float32_uint64/model.onnx @@ -0,0 +1,16 @@ + backend-test:b + +x +yz"Powtest_pow_types_float32_uint64Z +x + + +Z +y + +  +b +z + + +B \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_pow_types_float32_uint64/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_pow_types_float32_uint64/test_data_set_0/input_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..62e4e87e30c2908b48e0e912c49f073faf7953fd GIT binary patch literal 21 acmd;J7GQK@tnlJtU}&&sU|?_nA_o8)lme{) literal 0 HcmV?d00001 diff --git 
a/onnx/backend/test/data/node/test_pow_types_float32_uint64/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_pow_types_float32_uint64/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..015d9afbace42a4d241bc77acb6ea8770177b674 GIT binary patch literal 33 Ycmd;J7T|Satn`v#VSoTuD9r|?0VEUwRsaA1 literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_float32_uint64/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_pow_types_float32_uint64/test_data_set_0/output_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..0cc39708ca5cd984c0c32285657b92f0545914ba GIT binary patch literal 21 ccmd;J7GQK@tn%VvU}&&sU|?`!a4>TL032)r>i_@% literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_int/model.onnx b/onnx/backend/test/data/node/test_pow_types_int/model.onnx new file mode 100644 index 00000000000..2e62ba5ad4f --- /dev/null +++ b/onnx/backend/test/data/node/test_pow_types_int/model.onnx @@ -0,0 +1,16 @@ + backend-test:W + +x +yz"Powtest_pow_types_intZ +x + + +Z +y + + +b +z + + +B \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_pow_types_int/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_pow_types_int/test_data_set_0/input_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..62e4e87e30c2908b48e0e912c49f073faf7953fd GIT binary patch literal 21 acmd;J7GQK@tnlJtU}&&sU|?_nA_o8)lme{) literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_int/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_pow_types_int/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..2a776165494d557dfd635a732120b459acb76fcf GIT binary patch literal 33 Ycmd;J7GQT`tn`v#VSoTuD9r|?0V7}mPyhe` literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_int/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_pow_types_int/test_data_set_0/output_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..0cc39708ca5cd984c0c32285657b92f0545914ba GIT binary patch literal 21 ccmd;J7GQK@tn%VvU}&&sU|?`!a4>TL032)r>i_@% literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_int32_float32/model.onnx b/onnx/backend/test/data/node/test_pow_types_int32_float32/model.onnx new file mode 100644 index 00000000000..bc90f825a44 --- /dev/null +++ b/onnx/backend/test/data/node/test_pow_types_int32_float32/model.onnx @@ -0,0 +1,16 @@ + backend-test:a + +x +yz"Powtest_pow_types_int32_float32Z +x + + +Z +y + + +b +z + + +B \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_pow_types_int32_float32/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_pow_types_int32_float32/test_data_set_0/input_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..bb7ca3491e33a4debc334ff548aa5cca7ec638db GIT binary patch literal 21 Zcmd;J7GQH?tnlJtWME)m0%B$$1^^P@0XYBw literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_int32_float32/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_pow_types_int32_float32/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..a760307aaa9d741605ce33d06bfeb54d3669be86 GIT binary patch literal 21 acmd;J7GQK@tn}hxU}$h)U|0ae2OIz(Yy-~# literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_int32_float32/test_data_set_0/output_0.pb 
b/onnx/backend/test/data/node/test_pow_types_int32_float32/test_data_set_0/output_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..67fd11d8712c63df32379d3d5d267b267b5638de GIT binary patch literal 21 acmd;J7GQH?tn%VvWME)W0OFfW3=9AlO9C+f literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_int32_int32/model.onnx b/onnx/backend/test/data/node/test_pow_types_int32_int32/model.onnx new file mode 100644 index 00000000000..ee0ccee1ac5 --- /dev/null +++ b/onnx/backend/test/data/node/test_pow_types_int32_int32/model.onnx @@ -0,0 +1,16 @@ + backend-test:_ + +x +yz"Powtest_pow_types_int32_int32Z +x + + +Z +y + + +b +z + + +B \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_pow_types_int32_int32/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_pow_types_int32_int32/test_data_set_0/input_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..bb7ca3491e33a4debc334ff548aa5cca7ec638db GIT binary patch literal 21 Zcmd;J7GQH?tnlJtWME)m0%B$$1^^P@0XYBw literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_int32_int32/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_pow_types_int32_int32/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..194c2ad3ae1744cdd30210b1f3770f246a19e3eb GIT binary patch literal 21 Zcmd;J7GQH?tn}hxVPIfj1!6WJ1^^Q_0Yd-) literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_int32_int32/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_pow_types_int32_int32/test_data_set_0/output_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..67fd11d8712c63df32379d3d5d267b267b5638de GIT binary patch literal 21 acmd;J7GQH?tn%VvWME)W0OFfW3=9AlO9C+f literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_int64_float32/model.onnx b/onnx/backend/test/data/node/test_pow_types_int64_float32/model.onnx new file mode 100644 index 00000000000..3aa62ec7f97 --- /dev/null +++ b/onnx/backend/test/data/node/test_pow_types_int64_float32/model.onnx @@ -0,0 +1,16 @@ + backend-test:a + +x +yz"Powtest_pow_types_int64_float32Z +x + + +Z +y + + +b +z + + +B \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_pow_types_int64_float32/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_pow_types_int64_float32/test_data_set_0/input_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..d91963f15c890955c4049fa33858b8d64ca4c264 GIT binary patch literal 33 Ycmd;J7GQT`tniXxWPkuBD9sF|0V1^lMgRZ+ literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_int64_float32/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_pow_types_int64_float32/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..a760307aaa9d741605ce33d06bfeb54d3669be86 GIT binary patch literal 21 acmd;J7GQK@tn}hxU}$h)U|0ae2OIz(Yy-~# literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_pow_types_int64_float32/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_pow_types_int64_float32/test_data_set_0/output_0.pb new file mode 100644 index 0000000000000000000000000000000000000000..c77d2103a132ff11755f8c395343ece7c1ccd1ff GIT binary patch literal 33 acmd;J7GQT`tn!jzWPkt#D1DO&!T^@Bjd3gakJL delta 36 qcmZ3_wVrE(Fslr^5C<0%2Qv^eC2?~xRtd3jv2ZX7F*q>^@Bjd3paeJo diff --git 
a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_3d_expanded/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_3d_expanded/test_data_set_0/input_1.pb index cd3dd54a5b93f253a108e3cb7a976f79bb38e27b..87a0fc4fe9ba2edea50d17f0ee92267c58f8a54a 100644 GIT binary patch literal 35 fcmd;J=3o+Fb7HLYl3-?FU;tqzAZCHmK#BnXB&Gpa literal 59 hcmd;J=3o+FcVevcGGJza02s{#<+DI(7$3@I002oy0dW8T diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_expanded/model.onnx b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_expanded/model.onnx index e7787066fa6e634d83f18e332d7cf14c8d1ebbe0..19875ee4259d07ba683cd5f0fa51ddf8cfce359a 100644 GIT binary patch delta 32 ncmey%`ImFUQx*v}Ar>ws4(23oF2*V$HZB$pMj-|#CIKD*gN_A+ delta 32 ncmey%`ImFUQx*w!Ar>ws4(23oF2*V$HZB$pMj-|#CIKD*gQf+A diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_expanded/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_expanded/test_data_set_0/input_1.pb index 6251a1cdb8d5f655406beff874d0b8f495b4e18b..bc4806043c89396003882cc95bab3524afb90a85 100644 GIT binary patch literal 21 Zcmd;J7GQH?tn}hxWME)m0b*t#1^^QN0XzTz literal 33 Ycmd;J7GQT`tn`v#WPkt`D9sF|0V41LNdN!< diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight/model.onnx b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight/model.onnx index dfe74a2fa911457af577a64845e77027b2af7b34..4d23b428c71965804fd36da45832eeccd8a917c5 100644 GIT binary patch delta 11 ScmX@Wcz|)jbVjy`Gc*7hzXSpR delta 11 ScmX@Wcz|)jbVl}xGc*7h!vq5W diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight/test_data_set_0/input_1.pb index 6251a1cdb8d5f655406beff874d0b8f495b4e18b..bc4806043c89396003882cc95bab3524afb90a85 100644 GIT binary patch literal 21 Zcmd;J7GQH?tn}hxWME)m0b*t#1^^QN0XzTz literal 33 Ycmd;J7GQT`tn`v#WPkt`D9sF|0V41LNdN!< diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_expanded/model.onnx b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_expanded/model.onnx index b27ae19a56bd7aa0ec9086a19243e7864b06928e..b5f370b67e56b9c6d0c9bc0c05a2e390b8d30471 100644 GIT binary patch delta 13 Ucmey&^_gpf7b_#%WN%gt03+7~3;+NC delta 13 Ucmey&^_gpf7b_$CWN%gt03+N44FCWD diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_expanded/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_expanded/test_data_set_0/input_1.pb index 6251a1cdb8d5f655406beff874d0b8f495b4e18b..bc4806043c89396003882cc95bab3524afb90a85 100644 GIT binary patch literal 21 Zcmd;J7GQH?tn}hxWME)m0b*t#1^^QN0XzTz literal 33 Ycmd;J7GQT`tn`v#WPkt`D9sF|0V41LNdN!< diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/model.onnx b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/model.onnx new file mode 100644 index 0000000000000000000000000000000000000000..0817553caac6ad330b6e5b03e70077303df0c407 GIT binary patch literal 226 zcmYk0y$*sf6ot8fSg%A)G0~aP#L3N3N4L?uA`}k6}usEFLp&%lC9#U+J6Br2sJ{3P*G^Z)m1)J9@WP}mgPyPLcW``uA2b;EETfgEE5T@ z*H#DiW{H!6C!bhlQiN{KBhi6FcSg<1LBwKXVti0BbHgg0rBB8FeYZf*pfmIdP=Ypb ibVj$i9!3!f@u+K{0aXv62c!EGp`{+W`pr7n3;zccP&xGg literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/input_0.pb 
b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/input_0.pb new file mode 100644 index 00000000000..aeb52e93c10 --- /dev/null +++ b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/input_0.pb @@ -0,0 +1 @@ +BxJ<  ?7?N?w} ?H>QY%?n >~J?e?^k?l?Z{= \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..29f42478fda2a3b734ddaa7ee030d968c21e231f GIT binary patch literal 21 Xcmd;J7GQH?tn}hx00I^uW(Hya67m5% literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/input_2.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/input_2.pb new file mode 100644 index 00000000000..2186ff73f9e --- /dev/null +++ b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/input_2.pb @@ -0,0 +1 @@ +BwJfff?333?L?fff?fff? \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/output_0.pb new file mode 100644 index 00000000000..840b42e6bd1 --- /dev/null +++ b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index/test_data_set_0/output_0.pb @@ -0,0 +1 @@ +BzJ? \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/model.onnx b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/model.onnx new file mode 100644 index 0000000000000000000000000000000000000000..fc50e774ebf05403000c572c6e42e6b374474261 GIT binary patch literal 1598 zcmcgsJx{|h5UrE6q(=nEWq{RVCH??n0=gB3W!xRnk6Hx~@mcRJz=bG39m_ zua8BZ*(w*uGGB>A@r{th+;CYaT?sB#E?O?yGQlM0vqoh`YW2onl9u@x;F};FoPf@` zq_|0$j{&}jb3I7oT+gU2nU4W}-0MLanmBH`NzD2akvEx$n-zR`&Ogb%oqJKv``}rw znUBGas_QvL8Y-Oj!BQ8ztTc}5SQqd~;52kVwm>$N?AYzAC=w&r0{O>sA(nEkb#A?N zcn$r^HmJ2o7Favo6Mr~>=&zgJboRuf5C8Gu>A+h21wQY%?n >~J?e?^k?l?Z{= \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/test_data_set_0/input_1.pb new file mode 100644 index 0000000000000000000000000000000000000000..29f42478fda2a3b734ddaa7ee030d968c21e231f GIT binary patch literal 21 Xcmd;J7GQH?tn}hx00I^uW(Hya67m5% literal 0 HcmV?d00001 diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/test_data_set_0/input_2.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/test_data_set_0/input_2.pb new file mode 100644 index 00000000000..2186ff73f9e --- /dev/null +++ b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/test_data_set_0/input_2.pb @@ -0,0 +1 @@ +BwJfff?333?L?fff?fff? 
\ No newline at end of file diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/test_data_set_0/output_0.pb new file mode 100644 index 00000000000..840b42e6bd1 --- /dev/null +++ b/onnx/backend/test/data/node/test_softmax_cross_entropy_mean_weight_ignore_index_expanded/test_data_set_0/output_0.pb @@ -0,0 +1 @@ +BzJ? \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_none/model.onnx b/onnx/backend/test/data/node/test_softmax_cross_entropy_none/model.onnx index 53f852258e1..dce630c51ac 100644 --- a/onnx/backend/test/data/node/test_softmax_cross_entropy_none/model.onnx +++ b/onnx/backend/test/data/node/test_softmax_cross_entropy_none/model.onnx @@ -9,7 +9,7 @@ Z y - + b z diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_none/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_none/test_data_set_0/input_1.pb index 6251a1cdb8d5f655406beff874d0b8f495b4e18b..bc4806043c89396003882cc95bab3524afb90a85 100644 GIT binary patch literal 21 Zcmd;J7GQH?tn}hxWME)m0b*t#1^^QN0XzTz literal 33 Ycmd;J7GQT`tn`v#WPkt`D9sF|0V41LNdN!< diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_none_expanded/model.onnx b/onnx/backend/test/data/node/test_softmax_cross_entropy_none_expanded/model.onnx index 4504fd8a571..bb633ae1ffd 100644 --- a/onnx/backend/test/data/node/test_softmax_cross_entropy_none_expanded/model.onnx +++ b/onnx/backend/test/data/node/test_softmax_cross_entropy_none_expanded/model.onnx @@ -23,7 +23,7 @@ QSoftmaxCrossEntropyLoss_test_softmax_cross_entropy_none_expanded_functionlog_pr Z y - + b z diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_none_expanded/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_none_expanded/test_data_set_0/input_1.pb index 6251a1cdb8d5f655406beff874d0b8f495b4e18b..bc4806043c89396003882cc95bab3524afb90a85 100644 GIT binary patch literal 21 Zcmd;J7GQH?tn}hxWME)m0b*t#1^^QN0XzTz literal 33 Ycmd;J7GQT`tn`v#WPkt`D9sF|0V41LNdN!< diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_none_weights/model.onnx b/onnx/backend/test/data/node/test_softmax_cross_entropy_none_weights/model.onnx index e19f2be1fe9..1f3dd146cbb 100644 --- a/onnx/backend/test/data/node/test_softmax_cross_entropy_none_weights/model.onnx +++ b/onnx/backend/test/data/node/test_softmax_cross_entropy_none_weights/model.onnx @@ -10,7 +10,7 @@ Z y - + Z w diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_none_weights/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_none_weights/test_data_set_0/input_1.pb index 6251a1cdb8d5f655406beff874d0b8f495b4e18b..bc4806043c89396003882cc95bab3524afb90a85 100644 GIT binary patch literal 21 Zcmd;J7GQH?tn}hxWME)m0b*t#1^^QN0XzTz literal 33 Ycmd;J7GQT`tn`v#WPkt`D9sF|0V41LNdN!< diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_none_weights_expanded/model.onnx b/onnx/backend/test/data/node/test_softmax_cross_entropy_none_weights_expanded/model.onnx index ea22b63b577..7e7af80dcdc 100644 --- a/onnx/backend/test/data/node/test_softmax_cross_entropy_none_weights_expanded/model.onnx +++ b/onnx/backend/test/data/node/test_softmax_cross_entropy_none_weights_expanded/model.onnx @@ -25,7 +25,7 @@ YSoftmaxCrossEntropyLoss_test_softmax_cross_entropy_none_weights_expanded_functi Z y - + Z w 
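The *_ignore_index test data added above pins down the semantics implemented further below in this patch: with reduction = "mean", the loss is divided by the sum of the weights that were actually applied, and samples whose target equals ignore_index contribute neither loss nor weight. As a reading aid only (not part of the patch), here is a minimal standalone C++ sketch of that computation; the scores, targets, weights, and ignore_index value are made up for illustration, and the reference outputs in the .pb blobs come from the ONNX backend test generator, not from this sketch.

```
// Sketch only: weighted-mean SoftmaxCrossEntropyLoss with ignore_index.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
  // Hypothetical scores for N=3 samples over C=3 classes (made-up values).
  std::vector<std::vector<double>> scores = {
      {1.0, 2.0, 0.5}, {0.1, 0.2, 0.3}, {2.0, 0.5, 1.0}};
  std::vector<int> target = {1, 2, 0};
  std::vector<double> weight = {0.9, 0.7, 0.8};  // per-class weights
  const int ignore_index = 2;  // targets equal to this are skipped entirely

  double loss_sum = 0.0, weight_sum = 0.0;
  for (std::size_t n = 0; n < scores.size(); ++n) {
    if (target[n] == ignore_index) continue;  // ignored: weight treated as 0
    // log-softmax over the class axis
    double mx = *std::max_element(scores[n].begin(), scores[n].end());
    double denom = 0.0;
    for (double s : scores[n]) denom += std::exp(s - mx);
    double log_prob = (scores[n][target[n]] - mx) - std::log(denom);
    loss_sum += -log_prob * weight[target[n]];
    weight_sum += weight[target[n]];
  }
  // reduction == "mean": divide by the sum of the applied weights, not by N.
  // This mirrors the Div(loss_sum, weight_gather_sum) node built later on.
  std::printf("loss = %f\n", loss_sum / weight_sum);
  return 0;
}
```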
diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_none_weights_expanded/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_none_weights_expanded/test_data_set_0/input_1.pb index 6251a1cdb8d5f655406beff874d0b8f495b4e18b..bc4806043c89396003882cc95bab3524afb90a85 100644 GIT binary patch literal 21 Zcmd;J7GQH?tn}hxWME)m0b*t#1^^QN0XzTz literal 33 Ycmd;J7GQT`tn`v#WPkt`D9sF|0V41LNdN!< diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_sum/model.onnx b/onnx/backend/test/data/node/test_softmax_cross_entropy_sum/model.onnx index e9f7ea879ee06bb073d5ebdd9e7a89b0c07bd2bc..a3b4bb690971fa54434ec12f11836cc9e7a9a317 100644 GIT binary patch delta 31 mcmZ3?xR`N5w*;FI3l|dya}qZfW0epa7YhfY5Q7tw01p6M7z9E9 delta 31 mcmZ3?xR`N5w*ws4(23oF2*V$HZB$pMj-|#CIKD*epdxo delta 32 ncmaFI`HpkLH5LhWAr>ws4(23oF2*V$HZB$pMj-|#CIKD*es2X> diff --git a/onnx/backend/test/data/node/test_softmax_cross_entropy_sum_expanded/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_softmax_cross_entropy_sum_expanded/test_data_set_0/input_1.pb index 6251a1cdb8d5f655406beff874d0b8f495b4e18b..bc4806043c89396003882cc95bab3524afb90a85 100644 GIT binary patch literal 21 Zcmd;J7GQH?tn}hxWME)m0b*t#1^^QN0XzTz literal 33 Ycmd;J7GQT`tn`v#WPkt`D9sF|0V41LNdN!< diff --git a/onnx/common/constants.h b/onnx/common/constants.h index fc2a2212280..1725d6ca475 100644 --- a/onnx/common/constants.h +++ b/onnx/common/constants.h @@ -12,7 +12,7 @@ namespace ONNX_NAMESPACE { constexpr const char* AI_ONNX_ML_DOMAIN = "ai.onnx.ml"; constexpr const char* AI_ONNX_TRAINING_DOMAIN = "ai.onnx.training"; constexpr const char* ONNX_DOMAIN = ""; -constexpr bool OPTIONAL = false; +constexpr bool OPTIONAL_VALUE = false; // For dimension denotation. constexpr const char* DATA_BATCH = "DATA_BATCH"; diff --git a/onnx/cpp2py_export.cc b/onnx/cpp2py_export.cc index 26d52ab6f2e..fc5cb5b2575 100644 --- a/onnx/cpp2py_export.cc +++ b/onnx/cpp2py_export.cc @@ -270,7 +270,7 @@ PYBIND11_MODULE(onnx_cpp2py_export, onnx_cpp2py_export) { [](const py::bytes& bytes, const std::vector& names) { ModelProto proto{}; ParseProtoFromPyBytes(&proto, bytes); - auto const result = optimization::Optimize(std::move(proto), names); + auto const result = optimization::Optimize(proto, names); std::string out; result.SerializeToString(&out); return py::bytes(out); @@ -282,7 +282,7 @@ PYBIND11_MODULE(onnx_cpp2py_export, onnx_cpp2py_export) { ModelProto proto{}; ParseProtoFromPyBytes(&proto, bytes); auto const result = - optimization::OptimizeFixed(std::move(proto), names); + optimization::OptimizeFixed(proto, names); std::string out; result.SerializeToString(&out); return py::bytes(out); diff --git a/onnx/defs/generator/defs.cc b/onnx/defs/generator/defs.cc index 0c0fe5c5130..2b5cd7f2d1a 100644 --- a/onnx/defs/generator/defs.cc +++ b/onnx/defs/generator/defs.cc @@ -184,7 +184,7 @@ ONNX_OPERATOR_SET_SCHEMA( "(Optional) The value of the output elements." "Should be a one-element tensor. If not specified, it defaults to a tensor of value 0 and datatype float32", AttributeProto::TENSOR, - OPTIONAL) + OPTIONAL_VALUE) .Input( 0, "input", @@ -313,7 +313,7 @@ ONNX_OPERATOR_SET_SCHEMA( "the data type of the input tensor T1 is used. 
If input tensor T1 is also not" "specified, then type defaults to 'float'.", AttributeProto::INT, - OPTIONAL) + OPTIONAL_VALUE) .Input( 0, "input", @@ -397,7 +397,7 @@ ONNX_OPERATOR_SET_SCHEMA( "seed", "(Optional) Seed to the random generator, if not specified we will auto generate one.", AttributeProto::FLOAT, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "dtype", "The data type for the elements of the output tensor. If not specified, default is TensorProto::FLOAT.", @@ -447,7 +447,7 @@ ONNX_OPERATOR_SET_SCHEMA( "seed", "(Optional) Seed to the random generator, if not specified we will auto generate one.", AttributeProto::FLOAT, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "dtype", "The data type for the elements of the output tensor. Default is TensorProto::FLOAT.", @@ -497,13 +497,13 @@ ONNX_OPERATOR_SET_SCHEMA( "seed", "(Optional) Seed to the random generator, if not specified we will auto generate one.", AttributeProto::FLOAT, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "dtype", "(Optional) The data type for the elements of the output tensor, if not specified, we will use " "the data type of the input tensor.", AttributeProto::INT, - OPTIONAL) + OPTIONAL_VALUE) .Input( 0, "input", @@ -562,13 +562,13 @@ ONNX_OPERATOR_SET_SCHEMA( "seed", "(Optional) Seed to the random generator, if not specified we will auto generate one.", AttributeProto::FLOAT, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "dtype", "(Optional) The data type for the elements of the output tensor, if not specified, we will use " "the data type of the input tensor.", AttributeProto::INT, - OPTIONAL) + OPTIONAL_VALUE) .Input( 0, "input", @@ -617,7 +617,7 @@ ONNX_OPERATOR_SET_SCHEMA( "seed", "(Optional) Seed to the random generator, if not specified we will auto generate one.", AttributeProto::FLOAT, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "dtype", "(Optional) The data type for the elements of the output tensor, if not specified, we will use int32.", diff --git a/onnx/defs/logical/defs.cc b/onnx/defs/logical/defs.cc index 4bb95d3d234..40a80de642c 100644 --- a/onnx/defs/logical/defs.cc +++ b/onnx/defs/logical/defs.cc @@ -7,39 +7,42 @@ namespace ONNX_NAMESPACE { inline void unaryLogicalOpInference(InferenceContext& ctx) { - // Type inference - updateOutputElemType(ctx, 0, TensorProto::BOOL); - // Shape inference - if (hasInputShape(ctx, 0)) { - propagateShapeFromInputToOutput(ctx, 0, 0); - } + // Type inference + updateOutputElemType(ctx, 0, TensorProto::BOOL); + // Shape inference + if (hasInputShape(ctx, 0)) { + propagateShapeFromInputToOutput(ctx, 0, 0); + } } std::function BinaryLogicDocGenerator(const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR( + doc = R"DOC( Returns the tensor resulted from performing the `{name}` logical operation elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support). 
{broadcast_doc} )DOC"; ReplaceAll(doc, "{name}", name); - ReplaceAll(doc, "{broadcast_doc}", GenerateBroadcastingDocMul().c_str()); - schema.SetDoc(doc); - schema.Input(0, "A", "First input operand for the logical operator.", "T"); - schema.Input(1, "B", "Second input operand for the logical operator.", "T"); - schema.Output(0, "C", "Result tensor.", "T1"); - schema.TypeAndShapeInferenceFunction([](InferenceContext& ctx) { - // Type inference - updateOutputElemType(ctx, 0, TensorProto::BOOL); - // Shape inference - if (hasNInputShapes(ctx, 2)) - bidirectionalBroadcastShapeInference( - ctx.getInputType(0)->tensor_type().shape(), - ctx.getInputType(1)->tensor_type().shape(), - *ctx.getOutputType(0)->mutable_tensor_type()->mutable_shape()); - }); - }; + ReplaceAll( + doc, "{broadcast_doc}", GenerateBroadcastingDocMul().c_str());); + schema.SetDoc(doc); + schema.Input(0, "A", "First input operand for the logical operator.", "T"); + schema.Input(1, "B", "Second input operand for the logical operator.", "T"); + schema.Output(0, "C", "Result tensor.", "T1"); + schema.TypeAndShapeInferenceFunction([](InferenceContext& ctx) { + // Type inference + updateOutputElemType(ctx, 0, TensorProto::BOOL); + // Shape inference + if (hasNInputShapes(ctx, 2)) + bidirectionalBroadcastShapeInference( + ctx.getInputType(0)->tensor_type().shape(), + ctx.getInputType(1)->tensor_type().shape(), + *ctx.getOutputType(0)->mutable_tensor_type()->mutable_shape()); + }); + }; } ONNX_OPERATOR_SET_SCHEMA( @@ -172,7 +175,8 @@ ONNX_OPERATOR_SET_SCHEMA( BitShift, 11, OpSchema() - .SetDoc(std::string(BitShift_ver11_doc) + GenerateBroadcastingDocMul()) + .SetDoc(GET_OP_DOC_STR( + std::string(BitShift_ver11_doc) + GenerateBroadcastingDocMul())) .Input(0, "X", "First operand, input to be shifted.", "T") .Input(1, "Y", "Second operand, amounts of shift.", "T") .Output(0, "Z", "Output tensor", "T") @@ -213,12 +217,11 @@ ONNX_OPERATOR_SET_SCHEMA( {"tensor(bool)"}, "Constrains output to boolean tensor.") .TypeAndShapeInferenceFunction(InferenceFunction()) - .FunctionBody(FunctionBodyHelper::BuildNodes({ - // nodes: {outputs, op, inputs, attributes} - {{"O1"}, "Less", {"A", "B"}}, - {{"O2"}, "Equal", {"A", "B"}}, - {{"C"}, "Or", {"O1", "O2"}} - }))); + .FunctionBody(FunctionBodyHelper::BuildNodes( + {// nodes: {outputs, op, inputs, attributes} + {{"O1"}, "Less", {"A", "B"}}, + {{"O2"}, "Equal", {"A", "B"}}, + {{"C"}, "Or", {"O1", "O2"}}}))); ONNX_OPERATOR_SET_SCHEMA( GreaterOrEqual, @@ -234,11 +237,10 @@ ONNX_OPERATOR_SET_SCHEMA( {"tensor(bool)"}, "Constrains output to boolean tensor.") .TypeAndShapeInferenceFunction(InferenceFunction()) - .FunctionBody(FunctionBodyHelper::BuildNodes({ - // nodes: {outputs, op, inputs, attributes} - {{"O1"}, "Greater", {"A", "B"}}, - {{"O2"}, "Equal", {"A", "B"}}, - {{"C"}, "Or", {"O1", "O2"}} - }))); - -} // namespace ONNX_NAMESPACE + .FunctionBody(FunctionBodyHelper::BuildNodes( + {// nodes: {outputs, op, inputs, attributes} + {{"O1"}, "Greater", {"A", "B"}}, + {{"O2"}, "Equal", {"A", "B"}}, + {{"C"}, "Or", {"O1", "O2"}}}))); + +} // namespace ONNX_NAMESPACE \ No newline at end of file diff --git a/onnx/defs/logical/old.cc b/onnx/defs/logical/old.cc index 770ae6cbbc4..424a2f9c22f 100644 --- a/onnx/defs/logical/old.cc +++ b/onnx/defs/logical/old.cc @@ -16,7 +16,8 @@ inline void logicalOpInference_opset1(InferenceContext& ctx) { std::function BinaryLogicDocGenerator_opset1( const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( 
Returns the tensor resulted from performing the `{name}` logical operation elementwise on the input tensors `A` and `B`. @@ -24,7 +25,7 @@ If broadcasting is enabled, the right-hand-side argument will be broadcasted to match the shape of left-hand-side argument. See the doc of `Add` for a detailed description of the broadcasting rules. )DOC"; - ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{name}", name);); schema.SetDoc(doc); schema.Attr( "broadcast", @@ -35,7 +36,7 @@ detailed description of the broadcasting rules. "axis", "If set, defines the broadcast dimensions.", AttributeProto::INT, - OPTIONAL); + OPTIONAL_VALUE); schema.Input(0, "A", "Left input tensor for the logical operator.", "T"); schema.Input(1, "B", "Right input tensor for the logical operator.", "T"); schema.Output(0, "C", "Result tensor.", "T1"); @@ -43,16 +44,20 @@ detailed description of the broadcasting rules. }; } -std::function BinaryLogicDocGenerator_opset7(const char* name) { +std::function BinaryLogicDocGenerator_opset7( + const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR( + doc = R"DOC( Returns the tensor resulted from performing the `{name}` logical operation elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support). {broadcast_doc} )DOC"; - ReplaceAll(doc, "{name}", name); - ReplaceAll(doc, "{broadcast_doc}", GenerateBroadcastingDocMul().c_str()); + ReplaceAll(doc, "{name}", name); + ReplaceAll( + doc, "{broadcast_doc}", GenerateBroadcastingDocMul().c_str());); schema.SetDoc(doc); schema.Input(0, "A", "First input operand for the logical operator.", "T"); schema.Input(1, "B", "Second input operand for the logical operator.", "T"); @@ -173,9 +178,7 @@ ONNX_OPERATOR_SET_SCHEMA( .FillUsing(BinaryLogicDocGenerator_opset7("greater")) .TypeConstraint( "T", - {"tensor(float16)", - "tensor(float)", - "tensor(double)"}, + {"tensor(float16)", "tensor(float)", "tensor(double)"}, "Constrains input to float tensors.") .TypeConstraint( "T1", @@ -189,15 +192,11 @@ ONNX_OPERATOR_SET_SCHEMA( .FillUsing(BinaryLogicDocGenerator_opset7("less")) .TypeConstraint( "T", - {"tensor(float16)", - "tensor(float)", - "tensor(double)"}, + {"tensor(float16)", "tensor(float)", "tensor(double)"}, "Constrains input to float tensors.") .TypeConstraint( "T1", {"tensor(bool)"}, "Constrains output to boolean tensor.")); - - } // namespace ONNX_NAMESPACE diff --git a/onnx/defs/math/defs.cc b/onnx/defs/math/defs.cc index 3d387a58592..98fb875664d 100644 --- a/onnx/defs/math/defs.cc +++ b/onnx/defs/math/defs.cc @@ -1,9 +1,9 @@ // Copyright (c) ONNX Project Contributors. // Licensed under the MIT license. +#include #include #include "onnx/defs/function.h" -#include #include "onnx/defs/schema.h" #include "onnx/defs/tensor_proto_util.h" @@ -11,13 +11,16 @@ namespace ONNX_NAMESPACE { std::function MathDocGenerator(const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR( + doc = R"DOC( Performs element-wise binary {name} (with Numpy-style broadcasting support). 
{broadcast_doc} )DOC"; - ReplaceAll(doc, "{name}", name); - ReplaceAll(doc, "{broadcast_doc}", GenerateBroadcastingDocMul().c_str()); + ReplaceAll(doc, "{name}", name); + ReplaceAll( + doc, "{broadcast_doc}", GenerateBroadcastingDocMul().c_str());); schema.SetDoc(doc); schema.Input(0, "A", "First operand.", "T"); schema.Input(1, "B", "Second operand.", "T"); @@ -41,7 +44,8 @@ std::function SoftmaxFamilyDocGenerator( const char* name, const char* description) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( The operator computes the {name} ({description}) values for each layer in the batch of the given input. @@ -57,8 +61,8 @@ Each of these dimensions must be matched correctly, or else the operator will throw errors. The output tensor has the same shape and contains the {name} values of the corresponding input. )DOC"; - ReplaceAll(doc, "{name}", name); - ReplaceAll(doc, "{description}", description); + ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{description}", description);); schema.SetDoc(doc); schema.Attr( "axis", @@ -87,7 +91,7 @@ and contains the {name} values of the corresponding input. schema.TypeAndShapeInferenceFunction([](InferenceContext& ctx) { // Type inference propagateElemTypeFromInputToOutput(ctx, 0, 0); - + // Shape inference starts if (!hasNInputShapes(ctx, 1)) { return; @@ -95,12 +99,17 @@ and contains the {name} values of the corresponding input. // Validate the value of 'axis' const TensorShapeProto& input_shape = - ctx.getInputType(0)->tensor_type().shape(); + ctx.getInputType(0)->tensor_type().shape(); int r = input_shape.dim_size(); int axis = static_cast(getAttribute(ctx, "axis", 1)); if (axis < -r || axis >= r) { - fail_shape_inference( - "'axis' must be in [", -r, " , " , (r-1) , "]. Its actual value is: ", axis); + fail_shape_inference( + "'axis' must be in [", + -r, + " , ", + (r - 1), + "]. Its actual value is: ", + axis); } // Shape inference @@ -419,25 +428,31 @@ ONNX_OPERATOR_SET_SCHEMA( Celu, 12, OpSchema() - .SetDoc(celu_ver12_doc) - .Input(0, "X", "Input tensor", "T") - .Output(0, "Y", "Output tensor", "T") - .Attr( - "alpha", - "The Alpha value in Celu formula which control the shape of " - "the unit. The default value is 1.0.", - AttributeProto::FLOAT, - celu_default_alpha) - .TypeConstraint( - "T", - {"tensor(float16)", "tensor(float)", "tensor(double)"}, - "Constrain input and output types to floating-point tensors.") - .FunctionBody(FunctionBodyHelper::BuildNodes( - {// nodes: {outputs, op, inputs, attributes} - FunctionBodyHelper::NodeDef{{"alpha"}, "Constant", {}, {MakeRefAttribute("value_float", "alpha", AttributeProto::FLOAT)}}, - {{"X_alpha"}, "Div", {"X", "alpha"}}, - {{"Elu_Result"}, "Elu", {"X_alpha"}, {{"alpha", 1.f}}}, - {{"Y"}, "Mul", {"alpha", "Elu_Result"}}}))); + .SetDoc(celu_ver12_doc) + .Input(0, "X", "Input tensor", "T") + .Output(0, "Y", "Output tensor", "T") + .Attr( + "alpha", + "The Alpha value in Celu formula which control the shape of " + "the unit. 
The default value is 1.0.", + AttributeProto::FLOAT, + celu_default_alpha) + .TypeConstraint( + "T", + {"tensor(float16)", "tensor(float)", "tensor(double)"}, + "Constrain input and output types to floating-point tensors.") + .FunctionBody(FunctionBodyHelper::BuildNodes( + {// nodes: {outputs, op, inputs, attributes} + FunctionBodyHelper::NodeDef{{"alpha"}, + "Constant", + {}, + {MakeRefAttribute( + "value_float", + "alpha", + AttributeProto::FLOAT)}}, + {{"X_alpha"}, "Div", {"X", "alpha"}}, + {{"Elu_Result"}, "Elu", {"X_alpha"}, {{"alpha", 1.f}}}, + {{"Y"}, "Mul", {"alpha", "Elu_Result"}}}))); static const char* Exp_ver6_doc = R"DOC( Calculates the exponential of the given input tensor, element-wise. @@ -505,7 +520,7 @@ ONNX_OPERATOR_SET_SCHEMA( "Constrain input and output types to float tensors.") .TypeAndShapeInferenceFunction(propagateShapeAndTypeFromFirstInput)); -static const char* Pow_ver7_doc = R"DOC( +static const char* Pow_ver12_doc = R"DOC( Pow takes input data (Tensor) and exponent Tensor, and produces one output data (Tensor) where the function `f(x) = x^exponent`, is applied to the data tensor elementwise. @@ -513,16 +528,35 @@ is applied to the data tensor elementwise. ONNX_OPERATOR_SET_SCHEMA( Pow, - 7, + 12, OpSchema() - .SetDoc(std::string(Pow_ver7_doc) + GenerateBroadcastingDocMul()) + .SetDoc(GET_OP_DOC_STR( + std::string(Pow_ver12_doc) + GenerateBroadcastingDocMul())) .Input(0, "X", "First operand, base of the exponent.", "T") - .Input(1, "Y", "Second operand, power of the exponent.", "T") + .Input(1, "Y", "Second operand, power of the exponent.", "T1") .Output(0, "Z", "Output tensor (same size as X)", "T") .TypeConstraint( "T", - {"tensor(float16)", "tensor(float)", "tensor(double)"}, - "Constrain input and output types to float tensors.") + {"tensor(int32)", + "tensor(int64)", + "tensor(float16)", + "tensor(float)", + "tensor(double)"}, + "Constrain input X and output types to float/int tensors.") + .TypeConstraint( + "T1", + {"tensor(uint8)", + "tensor(uint16)", + "tensor(uint32)", + "tensor(uint64)", + "tensor(int8)", + "tensor(int16)", + "tensor(int32)", + "tensor(int64)", + "tensor(float16)", + "tensor(float)", + "tensor(double)"}, + "Constrain input Y types to float/int tensors.") .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { propagateElemTypeFromInputToOutput(ctx, 0, 0); if (hasNInputShapes(ctx, 2)) @@ -542,9 +576,9 @@ ONNX_OPERATOR_SET_SCHEMA( PRelu, 9, OpSchema() - .SetDoc( - PRelu_ver9_doc + - GenerateBroadcastingDocUni("tensor slope", "input tensor X")) + .SetDoc(GET_OP_DOC_STR( + std::string(PRelu_ver9_doc) + + GenerateBroadcastingDocUni("tensor slope", "input tensor X"))) .Input(0, "X", "Input tensor", "T") .Input( 1, @@ -605,17 +639,21 @@ ONNX_OPERATOR_SET_SCHEMA( "Constrain input and output types to float tensors.") .TypeAndShapeInferenceFunction(propagateShapeAndTypeFromFirstInput)); -// Generate opschema for element-wise ops. Leaves type constraint "T" unspecified. +// Generate opschema for element-wise ops. Leaves type constraint "T" +// unspecified. std::function ElementwiseMultiOpDocGenerator( const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR( + doc = R"DOC( Element-wise {name} of each of the input tensors (with Numpy-style broadcasting support). All inputs and outputs must have the same data type. 
{broadcast_doc} )DOC"; - ReplaceAll(doc, "{name}", name); - ReplaceAll(doc, "{broadcast_doc}", GenerateBroadcastingDocMul().c_str()); + ReplaceAll(doc, "{name}", name); + ReplaceAll( + doc, "{broadcast_doc}", GenerateBroadcastingDocMul().c_str());); schema.SetDoc(doc); schema.Input( 0, @@ -695,11 +733,7 @@ ONNX_OPERATOR_SET_SCHEMA( 12, OpSchema() .SetDoc(Clip_ver12_doc) - .Input( - 0, - "input", - "Input tensor whose elements to be clipped", - "T") + .Input(0, "input", "Input tensor whose elements to be clipped", "T") .Input( 1, "min", @@ -797,11 +831,10 @@ ONNX_OPERATOR_SET_SCHEMA( Gemm, 11, OpSchema() - .SetDoc( - Gemm_ver11_doc + - GenerateBroadcastingDocUni("tensor C", "tensor A * B") + - "\n" + - GenerateOptionalArgumentsDoc()) + .SetDoc(GET_OP_DOC_STR( + std::string(Gemm_ver11_doc) + + GenerateBroadcastingDocUni("tensor C", "tensor A * B") + "\n" + + GenerateOptionalArgumentsDoc())) .Input( 0, "A", @@ -1601,40 +1634,49 @@ output = [5, 3, 0] )DOC"; ONNX_OPERATOR_SET_SCHEMA( - CumSum, - 11, + CumSum, + 11, OpSchema() - .SetDoc(CumSum_ver11_doc) - .Attr( - "exclusive", - "If set to 1 will return exclusive sum in which the top element is not included." - " In other terms, if set to 1, the j-th output element would be the sum of the first (j-1) elements." - " Otherwise, it would be the sum of the first j elements.", - AttributeProto::INT, - static_cast(0)) - .Attr( - "reverse", - "If set to 1 will perform the sums in reverse direction.", - AttributeProto::INT, - static_cast(0)) - .Input(0, "x", "An input tensor that is to be processed.", "T") - .Input(1, "axis", "(Optional) A 0-D tensor. Must be in the range [-rank(x), rank(x)-1]. " - "Negative value means counting dimensions from the back.", "T2") - .Output(0, "y", - "Output tensor of the same type as 'x' with cumulative sums of the x's elements", - "T") - .TypeConstraint("T", { - "tensor(uint32)", - "tensor(uint64)", - "tensor(int32)", - "tensor(int64)", - "tensor(float)", - "tensor(double)"}, "Input can be of any tensor type.") - .TypeConstraint("T2", { - "tensor(int32)", - "tensor(int64)"}, "axis tensor can be int32 or int64 only") - .TypeAndShapeInferenceFunction(ONNX_NAMESPACE::propagateShapeAndTypeFromFirstInput) - ); + .SetDoc(CumSum_ver11_doc) + .Attr( + "exclusive", + "If set to 1 will return exclusive sum in which the top element is not included." + " In other terms, if set to 1, the j-th output element would be the sum of the first (j-1) elements." + " Otherwise, it would be the sum of the first j elements.", + AttributeProto::INT, + static_cast(0)) + .Attr( + "reverse", + "If set to 1 will perform the sums in reverse direction.", + AttributeProto::INT, + static_cast(0)) + .Input(0, "x", "An input tensor that is to be processed.", "T") + .Input( + 1, + "axis", + "(Optional) A 0-D tensor. Must be in the range [-rank(x), rank(x)-1]. 
" + "Negative value means counting dimensions from the back.", + "T2") + .Output( + 0, + "y", + "Output tensor of the same type as 'x' with cumulative sums of the x's elements", + "T") + .TypeConstraint( + "T", + {"tensor(uint32)", + "tensor(uint64)", + "tensor(int32)", + "tensor(int64)", + "tensor(float)", + "tensor(double)"}, + "Input can be of any tensor type.") + .TypeConstraint( + "T2", + {"tensor(int32)", "tensor(int64)"}, + "axis tensor can be int32 or int64 only") + .TypeAndShapeInferenceFunction( + ONNX_NAMESPACE::propagateShapeAndTypeFromFirstInput)); static const char* Round_ver11_doc = R"DOC( Round takes one input Tensor and rounds the values, element-wise, meaning @@ -1702,8 +1744,7 @@ ONNX_OPERATOR_SET_SCHEMA( const auto mat_w = input_shape.dim(rank - 1); const auto mat_h = input_shape.dim(rank - 2); - if (mat_w.has_dim_value() && - mat_h.has_dim_value() && + if (mat_w.has_dim_value() && mat_h.has_dim_value() && (mat_w.dim_value() != mat_h.dim_value())) { fail_shape_inference( "The inner-most 2 dimensions must have the same size (mat_w:", @@ -1713,7 +1754,7 @@ ONNX_OPERATOR_SET_SCHEMA( ")."); } - for (int i=0; i < rank - 2; ++i) { + for (int i = 0; i < rank - 2; ++i) { auto* dim = output_shape->add_dim(); *dim = input_shape.dim(i); } @@ -1810,45 +1851,156 @@ TensorProto ToDimensionOneTensor(int32_t value) { return t; } -bool BuildContextDependentFunctionBody(const FunctionBodyBuildContext& ctx, const OpSchema& schema, FunctionProto& functionProto) { +TensorProto ToDimensionOneFloatTensor(float value) { + auto t = ToTensor(std::vector({value})); + t.add_dims(1); + return t; +} + +bool BuildContextDependentFunctionBody( + const FunctionBodyBuildContext& ctx, + const OpSchema& schema, + FunctionProto& functionProto) { std::vector body; - body.push_back({{"expanded_target"}, "Unsqueeze", {"target"}, {MakeAttribute("axes", std::vector({1}))}}); - body.push_back({{"input_gather_element"}, "GatherElements", {"input", "expanded_target"}, {MakeAttribute("axis", (int64_t)1)}}); + body.push_back({{"expanded_target"}, + "Unsqueeze", + {"target"}, + {MakeAttribute("axes", std::vector({1}))}}); + body.push_back({{"input_gather_element"}, + "GatherElements", + {"input", "expanded_target"}, + {MakeAttribute("axis", (int64_t)1)}}); body.push_back({{"loss_NCdd"}, "Neg", {"input_gather_element"}}); - body.push_back({{"const_zero"}, "Constant", {}, {MakeAttribute("value", ToDimensionOneTensor(0))}}); - body.push_back({{"const_one"}, "Constant", {}, {MakeAttribute("value", ToDimensionOneTensor(1))}}); - body.push_back({{"loss_N1dd"}, "Slice", {"loss_NCdd", "const_zero", "const_one", "const_one"}}); - - if (!ctx.hasInput(2)) { - if (ctx.getAttribute("reduction")->s() == "none") { - body.push_back({{"loss"}, "Squeeze", {"loss_N1dd"}, {MakeAttribute("axes", std::vector({1}))}}); + body.push_back({{"const_zero"}, + "Constant", + {}, + {MakeAttribute("value", ToDimensionOneTensor(0))}}); + body.push_back({{"const_one"}, + "Constant", + {}, + {MakeAttribute("value", ToDimensionOneTensor(1))}}); + body.push_back({{"loss_N1dd"}, + "Slice", + {"loss_NCdd", "const_zero", "const_one", "const_one"}}); + + if (ctx.getAttribute("ignore_index") == nullptr) { + if (!ctx.hasInput(2)) { + if (ctx.getAttribute("reduction")->s() == "none") { + body.push_back({{"loss"}, + "Squeeze", + {"loss_N1dd"}, + {MakeAttribute("axes", std::vector({1}))}}); + } else { + body.push_back({{"loss_Ndd"}, + "Squeeze", + {"loss_N1dd"}, + {MakeAttribute("axes", std::vector({1}))}}); + if (ctx.getAttribute("reduction")->s() == "mean") { 
+ body.push_back({{"loss"}, + "ReduceMean", + {"loss_Ndd"}, + {MakeAttribute("keepdims", (int64_t)0)}}); + } else { + body.push_back({{"loss"}, + "ReduceSum", + {"loss_Ndd"}, + {MakeAttribute("keepdims", (int64_t)0)}}); + } + } } else { - body.push_back({{"loss_Ndd"}, "Squeeze", {"loss_N1dd"}, {MakeAttribute("axes", std::vector({1}))}}); - if(ctx.getAttribute("reduction")->s() == "mean") { - body.push_back({{"loss"}, "ReduceMean", {"loss_Ndd"}, {MakeAttribute("keepdims", (int64_t)0)}}); + body.push_back({{"weight_gather"}, "Gather", {"weight", "target"}}); + body.push_back({{"loss_unweighted"}, + "Squeeze", + {"loss_N1dd"}, + {MakeAttribute("axes", std::vector({1}))}}); + if (ctx.getAttribute("reduction")->s() == "none") { + body.push_back({{"loss"}, "Mul", {"loss_unweighted", "weight_gather"}}); } else { - body.push_back({{"loss"}, "ReduceSum", {"loss_Ndd"}, {MakeAttribute("keepdims", (int64_t)0)}}); + body.push_back( + {{"loss_Ndd"}, "Mul", {"loss_unweighted", "weight_gather"}}); + if (ctx.getAttribute("reduction")->s() == "mean") { + body.push_back({{"loss_sum"}, + "ReduceSum", + {"loss_Ndd"}, + {MakeAttribute("keepdims", (int64_t)0)}}); + body.push_back({{"weight_gather_sum"}, + "ReduceSum", + {"weight_gather"}, + {MakeAttribute("keepdims", (int64_t)0)}}); + body.push_back({{"loss"}, "Div", {"loss_sum", "weight_gather_sum"}}); + } else { + body.push_back({{"loss"}, + "ReduceSum", + {"loss_Ndd"}, + {MakeAttribute("keepdims", (int64_t)0)}}); + } } - } + } } else { - body.push_back({{"weight_gather"}, "Gather", {"weight", "target"}}); - body.push_back({{"loss_unweighted"}, "Squeeze", {"loss_N1dd"}, {MakeAttribute("axes", std::vector({1}))}}); + body.push_back( + {{"const_ignore_index"}, + "Constant", + {}, + {MakeAttribute( + "value", + ToDimensionOneTensor(ctx.getAttribute("ignore_index")->i()))}}); + body.push_back({{"const_zero_float"}, + "Constant", + {}, + {MakeAttribute("value", ToDimensionOneFloatTensor(0.0f))}}); + if (!ctx.hasInput(2)) { + body.push_back({{"input_shape"}, "Shape", {"input"}}); + body.push_back({{"input_class"}, + "Slice", + {"input_shape", "const_one", "const_one"}}); + body.push_back({{"const_weights_ones"}, + "ConstantOfShape", + {"input_class"}, + {MakeAttribute("value", ToDimensionOneFloatTensor(1))}}); + body.push_back( + {{"weights_default"}, + "ScatterElements", + {"const_weights_ones", "const_ignore_index", "const_zero_float"}}); + body.push_back( + {{"weight_gather"}, "Gather", {"weights_default", "target"}}); + } else { + body.push_back({{"weights_default"}, + "ScatterElements", + {"weight", "const_ignore_index", "const_zero_float"}}); + body.push_back( + {{"weight_gather"}, "Gather", {"weights_default", "target"}}); + } + + body.push_back({{"loss_unweighted"}, + "Squeeze", + {"loss_N1dd"}, + {MakeAttribute("axes", std::vector({1}))}}); if (ctx.getAttribute("reduction")->s() == "none") { body.push_back({{"loss"}, "Mul", {"loss_unweighted", "weight_gather"}}); } else { - body.push_back({{"loss_Ndd"}, "Mul", {"loss_unweighted", "weight_gather"}}); - if(ctx.getAttribute("reduction")->s() == "mean") { - body.push_back({{"loss_sum"}, "ReduceSum", {"loss_Ndd"}, {MakeAttribute("keepdims", (int64_t)0)}}); - body.push_back({{"weight_gather_sum"}, "ReduceSum", {"weight_gather"}, {MakeAttribute("keepdims", (int64_t)0)}}); + body.push_back( + {{"loss_Ndd"}, "Mul", {"loss_unweighted", "weight_gather"}}); + if (ctx.getAttribute("reduction")->s() == "mean") { + body.push_back({{"loss_sum"}, + "ReduceSum", + {"loss_Ndd"}, + {MakeAttribute("keepdims", (int64_t)0)}}); + 
body.push_back({{"weight_gather_sum"}, + "ReduceSum", + {"weight_gather"}, + {MakeAttribute("keepdims", (int64_t)0)}}); body.push_back({{"loss"}, "Div", {"loss_sum", "weight_gather_sum"}}); } else { - body.push_back({{"loss"}, "ReduceSum", {"loss_Ndd"}, {MakeAttribute("keepdims", (int64_t)0)}}); + body.push_back({{"loss"}, + "ReduceSum", + {"loss_Ndd"}, + {MakeAttribute("keepdims", (int64_t)0)}}); } } } auto func_nodes = FunctionBodyHelper::BuildNodes(body); - for (const auto node : func_nodes) { + for (const auto& node : func_nodes) { auto new_node = functionProto.add_node(); new_node->CopyFrom(node); } @@ -1888,6 +2040,12 @@ ONNX_OPERATOR_SET_SCHEMA( "'mean': the sum of the output will be divided by the sum of applied weights.", AttributeProto::STRING, std::string("mean")) + .Attr( + "ignore_index", + "Specifies a target value that is ignored and does not contribute to the input gradient. " + "It is an optional value and valid values are [0, C).", + AttributeProto::INT, + false) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -1896,73 +2054,87 @@ ONNX_OPERATOR_SET_SCHEMA( "Tind", {"tensor(int32)", "tensor(int64)"}, "Constrain target to integer types") - .SetContextDependentFunctionBodyBuilder(BuildContextDependentFunctionBody) + .SetContextDependentFunctionBodyBuilder( + BuildContextDependentFunctionBody) .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { - // Type inference - propagateElemTypeFromInputToOutput(ctx, 0, 0); + // Type inference + propagateElemTypeFromInputToOutput(ctx, 0, 0); - // Shape inference - if (hasInputShape(ctx, 0) && hasInputShape(ctx, 1)) { - const TensorShapeProto& input_shape = ctx.getInputType(0)->tensor_type().shape(); - const TensorShapeProto& target_shape = ctx.getInputType(1)->tensor_type().shape(); - - const int input_rank = static_cast(input_shape.dim_size()); - const int target_rank = static_cast(target_shape.dim_size()); - - if (input_rank < 2) { - fail_shape_inference("Input rank must be >= 2.") - } - if (target_rank != input_rank - 1) { - fail_shape_inference("Target rank must be 1 less than the input rank.") - } - - // match input dimensions (N, C, d1, ..., dk) with target dimensions of (C, - // d1, ..., dk) - for (int dim = 0; dim < target_rank; dim++) { - const auto input_dim = dim == 0 ? 
input_shape.dim(dim) : input_shape.dim(dim + 1); - const auto target_dim = target_shape.dim(dim); - if (input_dim.has_dim_value() && target_dim.has_dim_value() && input_dim.dim_value() != target_dim.dim_value()) - fail_shape_inference("Input and target dimension value mismatch.") - } - - if (ctx.getNumInputs() == 3) { - const TensorShapeProto& weight_shape = ctx.getInputType(2)->tensor_type().shape(); - if (weight_shape.dim_size() != 1) - fail_shape_inference("Weight rank must be 1.") + // Shape inference + if (hasInputShape(ctx, 0) && hasInputShape(ctx, 1)) { + const TensorShapeProto& input_shape = + ctx.getInputType(0)->tensor_type().shape(); + const TensorShapeProto& target_shape = + ctx.getInputType(1)->tensor_type().shape(); + + const int input_rank = static_cast(input_shape.dim_size()); + const int target_rank = static_cast(target_shape.dim_size()); + + if (input_rank < 2) { + fail_shape_inference("Input rank must be >= 2.") + } + if (target_rank != input_rank - 1) { + fail_shape_inference( + "Target rank must be 1 less than the input rank.") + } + + // match input dimensions (N, C, d1, ..., dk) with target + // dimensions of (C, d1, ..., dk) + for (int dim = 0; dim < target_rank; dim++) { + const auto input_dim = + dim == 0 ? input_shape.dim(dim) : input_shape.dim(dim + 1); + const auto target_dim = target_shape.dim(dim); + if (input_dim.has_dim_value() && target_dim.has_dim_value() && + input_dim.dim_value() != target_dim.dim_value()) + fail_shape_inference( + "Input and target dimension value mismatch.") + } + + if (ctx.getNumInputs() == 3) { + const TensorShapeProto& weight_shape = + ctx.getInputType(2)->tensor_type().shape(); + if (weight_shape.dim_size() != 1) + fail_shape_inference("Weight rank must be 1.") const auto weight_dim = weight_shape.dim(0); - const auto input_dim_1 = input_shape.dim(1); - if (input_dim_1.has_dim_value() && weight_dim.has_dim_value() && weight_dim.dim_value() != input_dim_1.dim_value()) - fail_shape_inference("Input and weight dimension value mismatch.") - } - - TensorShapeProto* output_shape = ctx.getOutputType(0)->mutable_tensor_type()->mutable_shape(); - - if (ctx.getAttribute("reduction")->s() == "none") { - // output tensor is of shape (N, d1, d2, ..., dk) if reduction attribute - // is "none". - for (int i = 0; i < input_rank - 1; i++) { - auto* dim = output_shape->add_dim(); - if (i == 0) - *dim = input_shape.dim(i); - else - *dim = input_shape.dim(i + 1); - } - } - // otherwise output is a scalar. - }})); + const auto input_dim_1 = input_shape.dim(1); + if (input_dim_1.has_dim_value() && weight_dim.has_dim_value() && + weight_dim.dim_value() != input_dim_1.dim_value()) + fail_shape_inference( + "Input and weight dimension value mismatch.") + } -void einsumRankInference( - ONNX_NAMESPACE::InferenceContext& ctx, std::string equation) { + TensorShapeProto* output_shape = + ctx.getOutputType(0)->mutable_tensor_type()->mutable_shape(); + if (ctx.getAttribute("reduction")->s() == "none") { + // output tensor is of shape (N, d1, d2, ..., dk) if + // reduction attribute is "none". + for (int i = 0; i < input_rank - 1; i++) { + auto* dim = output_shape->add_dim(); + if (i == 0) + *dim = input_shape.dim(i); + else + *dim = input_shape.dim(i + 1); + } + } + // otherwise output is a scalar. 
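+ // (No dims are added to output_shape in that case; an empty shape
+ // proto is rank-0, i.e. a scalar.)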
+ } + })); + +void einsumRankInference( + ONNX_NAMESPACE::InferenceContext& ctx, + std::string equation) { const size_t numInputs = ctx.getNumInputs(); if (numInputs < 1 || !hasNInputShapes(ctx, static_cast(numInputs))) { return; } auto* output_shape = getOutputShape(ctx, 0); - std::string left_equation; + std::string left_equation; - equation.erase(std::remove(equation.begin(), equation.end(), ' '), equation.end()); // Remove space char + equation.erase( + std::remove(equation.begin(), equation.end(), ' '), + equation.end()); // Remove space char auto mid_index = equation.find("->"); if (mid_index != std::string::npos) { // Separate right and left hand sides of the equation @@ -1979,17 +2151,21 @@ void einsumRankInference( // Parse the left-hand side std::stringstream str(left_equation); - while(std::getline(str, term, ',')) { + while (std::getline(str, term, ',')) { auto ellipsis_index = term.find("..."); if (ellipsis_index != std::string::npos) { if (numInputs <= num_operands) { - fail_shape_inference("Number of input tensors does not match the operands in the equation."); + fail_shape_inference( + "Number of input tensors does not match the operands in the equation."); } - // If there is an ellipsis, the number of dimensions it represents must be total dim - letter dimensions - size_t rank = ctx.getInputType(num_operands)->tensor_type().shape().dim_size(); + // If there is an ellipsis, the number of dimensions it represents + // must be total dim - letter dimensions + size_t rank = + ctx.getInputType(num_operands)->tensor_type().shape().dim_size(); if (num_ellipsis == 0) { num_ellipsis_indices = rank - term.size() + 3; - } else { // ellipsis has been seen before. Check that if dimensions are compatible + } else { // ellipsis has been seen before. 
Check that if dimensions + // are compatible if (num_ellipsis_indices != rank - term.size() + 3) { fail_shape_inference("Ellipsis represents incompatible dimensions."); } @@ -2000,7 +2176,8 @@ void einsumRankInference( } if (numInputs != num_operands) { - fail_shape_inference("Number of input tensors does not match the operands in the equation."); + fail_shape_inference( + "Number of input tensors does not match the operands in the equation."); } const size_t number_of_letters = 26; @@ -2009,12 +2186,14 @@ void einsumRankInference( if (mid_index != std::string::npos) { std::string right_equation = equation.substr(mid_index + 2); auto right_ellipsis_index = right_equation.find("..."); - if (right_ellipsis_index != std::string::npos) { // Right-hand side contains ellipsis + if (right_ellipsis_index != + std::string::npos) { // Right-hand side contains ellipsis for (size_t i = 0; i < num_ellipsis; ++i) { output_shape->add_dim(); } } - for (char c: right_equation) { // Add a dimension per each character in right hand equation + for (char c : right_equation) { // Add a dimension per each character + // in right hand equation if (c != '.') { output_shape->add_dim(); } @@ -2024,7 +2203,8 @@ void einsumRankInference( for (size_t i = 0; i < num_ellipsis_indices; i++) { output_shape->add_dim(); } - for (size_t i = 0; i < left_equation.size(); i++) { // Count chars that appear exactly once on left hand side + for (size_t i = 0; i < left_equation.size(); + i++) { // Count chars that appear exactly once on left hand side if ((left_equation.at(i) != ',') && (left_equation.at(i) != '.')) { num_letter_occurrences[left_equation.at(i) - 'a']++; } @@ -2067,15 +2247,8 @@ ONNX_OPERATOR_SET_SCHEMA( 12, OpSchema() .SetDoc(Einsum_ver12_doc) - .Attr( - "equation", - "Einsum expression string.", - AttributeProto::STRING) - .Input(0, - "Inputs", - "Operands", - "T", - OpSchema::Variadic) + .Attr("equation", "Einsum expression string.", AttributeProto::STRING) + .Input(0, "Inputs", "Operands", "T", OpSchema::Variadic) .Output(0, "Output", "Output tensor", "T") .TypeConstraint( "T", @@ -2088,7 +2261,7 @@ ONNX_OPERATOR_SET_SCHEMA( if (equation.compare("") == 0) { return; } - einsumRankInference(ctx, equation); + einsumRankInference(ctx, equation); })); static const char* Inverse_ver12_doc = R"DOC( @@ -2125,8 +2298,7 @@ ONNX_OPERATOR_SET_SCHEMA( const auto mat_w = input_shape.dim(rank - 1); const auto mat_h = input_shape.dim(rank - 2); - if (mat_w.has_dim_value() && - mat_h.has_dim_value() && + if (mat_w.has_dim_value() && mat_h.has_dim_value() && (mat_w.dim_value() != mat_h.dim_value())) { fail_shape_inference( "The inner-most 2 dimensions must have the same size (mat_w:", @@ -2168,7 +2340,10 @@ L = ReduceSum(L), if reduction = 'sum'; .)DOC"; -bool BuildContextDependentFunctionBodyMSD(const FunctionBodyBuildContext& ctx, const OpSchema& schema, FunctionProto& functionProto) { +bool BuildContextDependentFunctionBodyMSD( + const FunctionBodyBuildContext& ctx, + const OpSchema& schema, + FunctionProto& functionProto) { std::vector body; body.push_back(FunctionBodyHelper::Const("Q_Pow", 2)); body.push_back({{"X_Sub"}, "Sub", {"scores", "labels"}}); @@ -2179,9 +2354,15 @@ bool BuildContextDependentFunctionBodyMSD(const FunctionBodyBuildContext& ctx, c } else { body.push_back({{"X_Pow"}, "Pow", {"X_Sub", "Q_Pow"}}); if (ctx.getAttribute("reduction")->s() == "sum") { - body.push_back({{"output"}, "ReduceSum", {"X_Pow"}, {MakeAttribute("keepdims", (int64_t)0)}}); + body.push_back({{"output"}, + "ReduceSum", + {"X_Pow"}, + 
{MakeAttribute("keepdims", (int64_t)0)}}); } else { - body.push_back({{"output"}, "ReduceMean", {"X_Pow"}, {MakeAttribute("keepdims", (int64_t)0)}}); + body.push_back({{"output"}, + "ReduceMean", + {"X_Pow"}, + {MakeAttribute("keepdims", (int64_t)0)}}); } } } else { @@ -2191,15 +2372,21 @@ bool BuildContextDependentFunctionBodyMSD(const FunctionBodyBuildContext& ctx, c } else { body.push_back({{"X_Mul"}, "Mul", {"weights", "X_Pow"}}); if (ctx.getAttribute("reduction")->s() == "sum") { - body.push_back({{"output"}, "ReduceSum", {"X_Mul"}, {MakeAttribute("keepdims", (int64_t)0)}}); + body.push_back({{"output"}, + "ReduceSum", + {"X_Mul"}, + {MakeAttribute("keepdims", (int64_t)0)}}); } else { - body.push_back({{"output"}, "ReduceMean", {"X_Mul"}, {MakeAttribute("keepdims", (int64_t)0)}}); + body.push_back({{"output"}, + "ReduceMean", + {"X_Mul"}, + {MakeAttribute("keepdims", (int64_t)0)}}); } } } auto func_nodes = FunctionBodyHelper::BuildNodes(body); - for (const auto node : func_nodes) { + for (const auto& node : func_nodes) { auto new_node = functionProto.add_node(); new_node->CopyFrom(node); } @@ -2241,18 +2428,18 @@ ONNX_OPERATOR_SET_SCHEMA( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, "Constrain input and output types to float tensors.") - .SetContextDependentFunctionBodyBuilder(BuildContextDependentFunctionBodyMSD) + .SetContextDependentFunctionBodyBuilder( + BuildContextDependentFunctionBodyMSD) .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { - propagateElemTypeFromInputToOutput(ctx, 0, 0); - std::string reduction = getAttribute(ctx, "reduction", "mean"); - if (reduction.compare("none") == 0) { - if (hasInputShape(ctx, 0)) { - propagateShapeFromInputToOutput(ctx, 0, 0); - } - } else { - updateOutputShape(ctx, 0, TensorShapeProto()); + propagateElemTypeFromInputToOutput(ctx, 0, 0); + std::string reduction = getAttribute(ctx, "reduction", "mean"); + if (reduction.compare("none") == 0) { + if (hasInputShape(ctx, 0)) { + propagateShapeFromInputToOutput(ctx, 0, 0); } - + } else { + updateOutputShape(ctx, 0, TensorShapeProto()); + } })); const char* reduction_doc_sce = @@ -2293,7 +2480,10 @@ If reduction = 'mean', the output is scalar: ReduceMean(L), or if weight is prov where tensor W is of shape (N, D1, D2, ..., Dk) and W[n][d1][d2]...[dk] = weights[labels[i][d1][d2]...[dk]]. 
)DOC"; -bool BuildContextDependentFunctionBodySCE(const FunctionBodyBuildContext& ctx, const OpSchema& schema, FunctionProto& functionProto) { +bool BuildContextDependentFunctionBodySCE( + const FunctionBodyBuildContext& ctx, + const OpSchema& schema, + FunctionProto& functionProto) { std::vector body; body.push_back({{"X_Max"}, "Max", {"scores"}}); body.push_back({{"X_Sub"}, "Sub", {"scores", "X_Max"}}); @@ -2301,16 +2491,36 @@ bool BuildContextDependentFunctionBodySCE(const FunctionBodyBuildContext& ctx, c body.push_back({{"X_RS"}, "ReduceSum", {"X_Exp"}}); body.push_back({{"X_Div"}, "Div", {"X_Exp", "X_RS"}}); body.push_back({{"log_prob"}, "Log", {"X_Div"}}); - if (!ctx.hasInput(2)) { - body.push_back({ {"output"}, "NegativeLogLikelihoodLoss", {"log_prob", "labels"}, - {MakeRefAttribute("reduction", AttributeProto::STRING)}}); + if (ctx.getAttribute("ignore_index") == nullptr) { + if (!ctx.hasInput(2)) { + body.push_back({{"output"}, + "NegativeLogLikelihoodLoss", + {"log_prob", "labels"}, + {MakeRefAttribute("reduction", AttributeProto::STRING)}}); + } else { + body.push_back({{"output"}, + "NegativeLogLikelihoodLoss", + {"log_prob", "labels", "weights"}, + {MakeRefAttribute("reduction", AttributeProto::STRING)}}); + } } else { - body.push_back({{"output"}, "NegativeLogLikelihoodLoss", {"log_prob", "labels", "weights"}, - {MakeRefAttribute("reduction", AttributeProto::STRING)}}); + if (!ctx.hasInput(2)) { + body.push_back({{"output"}, + "NegativeLogLikelihoodLoss", + {"log_prob", "labels"}, + {MakeRefAttribute("reduction", AttributeProto::STRING), + MakeRefAttribute("ignore_index", AttributeProto::INT)}}); + } else { + body.push_back({{"output"}, + "NegativeLogLikelihoodLoss", + {"log_prob", "labels", "weights"}, + {MakeRefAttribute("reduction", AttributeProto::STRING), + MakeRefAttribute("ignore_index", AttributeProto::INT)}}); + } } auto func_nodes = FunctionBodyHelper::BuildNodes(body); - for (const auto node : func_nodes) { + for (const auto& node : func_nodes) { auto new_node = functionProto.add_node(); new_node->CopyFrom(node); } @@ -2329,6 +2539,12 @@ ONNX_OPERATOR_SET_SCHEMA( reduction_doc_sce, AttributeProto::STRING, std::string("mean")) + .Attr( + "ignore_index", + "Specifies a target value that is ignored and does not contribute to the input gradient. " + "It is an optional value and valid values are [0, C).", + AttributeProto::INT, + false) .Input( 0, "scores", @@ -2340,8 +2556,8 @@ ONNX_OPERATOR_SET_SCHEMA( "labels", "The ground truth output tensor, with shape [batch_size], or " "[batch_size, D1, D2, ..., Dk], where K is the number of dimensions.", - "T") - .Input( + "Tind") + .Input( 2, "weights", "A manual rescaling weight given to each class. If given, it has to " @@ -2356,27 +2572,31 @@ ONNX_OPERATOR_SET_SCHEMA( "shape of [batch_size], or [batch_size, D1, D2, ..., Dk] in case of " "K-dimensional loss. Otherwise, it is a scalar.", "T") - .Output( - 1, - "log_prob", - "Log probability tensor. If the output of softmax is prob, its value is log(prob).", - "T", - OpSchema::Optional) + .Output( + 1, + "log_prob", + "Log probability tensor. 
If the output of softmax is prob, its value is log(prob).", + "T", + OpSchema::Optional) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, "Constrain input and output types to float tensors.") - .SetContextDependentFunctionBodyBuilder(BuildContextDependentFunctionBodySCE) + .TypeConstraint( + "Tind", + {"tensor(int32)", "tensor(int64)"}, + "Constrain target to integer types") + .SetContextDependentFunctionBodyBuilder( + BuildContextDependentFunctionBodySCE) .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { - propagateElemTypeFromInputToOutput(ctx, 0, 0); - std::string reduction = getAttribute(ctx, "reduction", "mean"); - if (reduction.compare("none") == 0) { - if (hasInputShape(ctx, 1)) { - propagateShapeFromInputToOutput(ctx, 1, 0); - } - } else { - updateOutputShape(ctx, 0, TensorShapeProto()); + propagateElemTypeFromInputToOutput(ctx, 0, 0); + std::string reduction = getAttribute(ctx, "reduction", "mean"); + if (reduction.compare("none") == 0) { + if (hasInputShape(ctx, 1)) { + propagateShapeFromInputToOutput(ctx, 1, 0); } - + } else { + updateOutputShape(ctx, 0, TensorShapeProto()); + } })); } // namespace ONNX_NAMESPACE diff --git a/onnx/defs/math/old.cc b/onnx/defs/math/old.cc index 7244c03af89..3e8ebf71a6d 100644 --- a/onnx/defs/math/old.cc +++ b/onnx/defs/math/old.cc @@ -11,7 +11,8 @@ std::function SoftmaxFamilyDocGenerator_opset1( const char* name, const char* description) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( The operator computes the {name} ({description}) values for each layer in the batch of the given input. The input is a 2-D tensor (Tensor) of size (batch_size x input_feature_dimensions). The output tensor has the same shape @@ -28,8 +29,8 @@ In this situation, we must have a_0 = N and a_1 * ... * a_{n-1} = D. Each of these dimensions must be matched correctly, or else the operator will throw errors. )DOC"; - ReplaceAll(doc, "{name}", name); - ReplaceAll(doc, "{description}", description); + ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{description}", description);); schema.SetDoc(doc); schema.Attr( "axis", @@ -100,11 +101,12 @@ Attribute `broadcast=1` needs to be passed to enable broadcasting. std::function MathDocGenerator_old(const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( Performs element-wise binary {name} (with limited broadcast support). {broadcast_doc})DOC"; - ReplaceAll(doc, "{name}", name); - ReplaceAll(doc, "{broadcast_doc}", kBroadcastDoc_old); + ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{broadcast_doc}", kBroadcastDoc_old);); schema.SetDoc(doc); schema.Attr( "broadcast", @@ -119,12 +121,12 @@ Performs element-wise binary {name} (with limited broadcast support). "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "axis", "If set, defines the broadcast dimensions. See doc for details.", AttributeProto::INT, - OPTIONAL); + OPTIONAL_VALUE); schema.Input( 0, "A", @@ -146,11 +148,12 @@ Performs element-wise binary {name} (with limited broadcast support). std::function MathDocGenerator_old_opset6(const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( Performs element-wise binary {name} (with limited broadcast support). 
{broadcast_doc})DOC"; - ReplaceAll(doc, "{name}", name); - ReplaceAll(doc, "{broadcast_doc}", kBroadcastDoc_old); + ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{broadcast_doc}", kBroadcastDoc_old);); schema.SetDoc(doc); schema.Attr( "broadcast", @@ -161,7 +164,7 @@ Performs element-wise binary {name} (with limited broadcast support). "axis", "If set, defines the broadcast dimensions. See doc for details.", AttributeProto::INT, - OPTIONAL); + OPTIONAL_VALUE); schema.Input( 0, "A", @@ -249,7 +252,7 @@ ONNX_OPERATOR_SET_SCHEMA( "axis", "If set, defines the broadcast dimensions. See doc for details.", AttributeProto::INT, - OPTIONAL) + OPTIONAL_VALUE) .Output(0, "Z", "Output tensor (same size as X)", "T") .TypeConstraint( "T", @@ -257,6 +260,33 @@ ONNX_OPERATOR_SET_SCHEMA( "Constrain input and output types to float tensors.") .TypeAndShapeInferenceFunction(propagateShapeAndTypeFromFirstInput)); +static const char* Pow_ver7_doc = R"DOC( +Pow takes input data (Tensor) and exponent Tensor, and +produces one output data (Tensor) where the function `f(x) = x^exponent`, +is applied to the data tensor elementwise. +)DOC"; + +ONNX_OPERATOR_SET_SCHEMA( + Pow, + 7, + OpSchema() + .SetDoc(std::string(Pow_ver7_doc) + GenerateBroadcastingDocMul()) + .Input(0, "X", "First operand, base of the exponent.", "T") + .Input(1, "Y", "Second operand, power of the exponent.", "T") + .Output(0, "Z", "Output tensor (same size as X)", "T") + .TypeConstraint( + "T", + {"tensor(float16)", "tensor(float)", "tensor(double)"}, + "Constrain input and output types to float tensors.") + .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { + propagateElemTypeFromInputToOutput(ctx, 0, 0); + if (hasNInputShapes(ctx, 2)) + bidirectionalBroadcastShapeInference( + ctx.getInputType(0)->tensor_type().shape(), + ctx.getInputType(1)->tensor_type().shape(), + *ctx.getOutputType(0)->mutable_tensor_type()->mutable_shape()); + })); + static const char* Neg_ver1_doc = R"DOC( Neg takes one input data (Tensor) and produces one output data (Tensor) where each element flipped sign, y = -x, is applied to @@ -277,7 +307,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -303,7 +333,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -329,7 +359,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -355,7 +385,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -381,7 +411,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -407,7 +437,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -433,7 +463,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy 
optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -464,7 +494,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -498,7 +528,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .SetDoc(Selu_ver1_doc) .Input(0, "X", "Input tensor", "T") .Output(0, "Y", "Output tensor", "T") @@ -530,7 +560,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .SetDoc(Elu_ver1_doc) .Input(0, "X", "1D input tensor", "T") .Output(0, "Y", "1D input tensor", "T") @@ -562,7 +592,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -591,7 +621,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -620,7 +650,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -654,7 +684,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -689,9 +719,9 @@ ONNX_OPERATOR_SET_SCHEMA( PRelu, 7, OpSchema() - .SetDoc( - PRelu_ver7_doc + - GenerateBroadcastingDocUni("tensor slope", "input tensor X")) + .SetDoc(GET_OP_DOC_STR( + std::string(PRelu_ver7_doc) + + GenerateBroadcastingDocUni("tensor slope", "input tensor X"))) .Input(0, "X", "Input tensor", "T") .Input( 1, @@ -726,7 +756,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -759,7 +789,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .SetDoc(HardSigmoid_ver1_doc) .Input(0, "X", "Input tensor", "T") .Output(0, "Y", "Output tensor", "T") @@ -787,7 +817,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -812,7 +842,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -837,7 +867,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -867,7 +897,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeConstraint( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, @@ -888,12 +918,12 @@ 
ONNX_OPERATOR_SET_SCHEMA( "min", "Minimum value, under which element is replaced by min", AttributeProto::FLOAT, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "max", "Maximum value, above which element is replaced by max", AttributeProto::FLOAT, - OPTIONAL) + OPTIONAL_VALUE) // This attribute was added via AllowConsumed API in OpSchema. // After removing the API, we're now using the Attr API to simulate the // old definition. @@ -901,7 +931,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Input(0, "input", "Input tensor whose elements to be clipped", "T") .Output(0, "output", "Output tensor with clipped input elements", "T") .TypeConstraint( @@ -1055,9 +1085,9 @@ ONNX_OPERATOR_SET_SCHEMA( Gemm, 7, OpSchema() - .SetDoc( - Gemm_ver7_doc + - GenerateBroadcastingDocUni("tensor C", "tensor A * B")) + .SetDoc(GET_OP_DOC_STR( + std::string(Gemm_ver7_doc) + + GenerateBroadcastingDocUni("tensor C", "tensor A * B"))) .Input( 0, "A", @@ -1145,9 +1175,9 @@ ONNX_OPERATOR_SET_SCHEMA( Gemm, 9, OpSchema() - .SetDoc( - Gemm_ver9_doc + - GenerateBroadcastingDocUni("tensor C", "tensor A * B")) + .SetDoc(GET_OP_DOC_STR( + std::string(Gemm_ver9_doc) + + GenerateBroadcastingDocUni("tensor C", "tensor A * B"))) .Input( 0, "A", @@ -1616,11 +1646,7 @@ ONNX_OPERATOR_SET_SCHEMA( 11, OpSchema() .SetDoc(Clip_ver11_doc) - .Input( - 0, - "input", - "Input tensor whose elements to be clipped", - "T") + .Input(0, "input", "Input tensor whose elements to be clipped", "T") .Input( 1, "min", @@ -1645,13 +1671,16 @@ ONNX_OPERATOR_SET_SCHEMA( std::function ElementwiseMultiOpDocGenerator_old( const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR( + doc = R"DOC( Element-wise {name} of each of the input tensors (with Numpy-style broadcasting support). All inputs and outputs must have the same data type. {broadcast_doc} )DOC"; - ReplaceAll(doc, "{name}", name); - ReplaceAll(doc, "{broadcast_doc}", GenerateBroadcastingDocMul().c_str()); + ReplaceAll(doc, "{name}", name); + ReplaceAll( + doc, "{broadcast_doc}", GenerateBroadcastingDocMul().c_str());); schema.SetDoc(doc); schema.Input( 0, diff --git a/onnx/defs/nn/defs.cc b/onnx/defs/nn/defs.cc index 1c0601d7456..76b8b880d27 100644 --- a/onnx/defs/nn/defs.cc +++ b/onnx/defs/nn/defs.cc @@ -89,10 +89,10 @@ void convPoolShapeInference( std::vector effective_kernel_shape = kernel_shape; for (int i = 0; i < static_cast(kernel_shape.size()); i++) { // accounting for dilation, how big is the kernel in this dimension - effective_kernel_shape[i] = (effective_kernel_shape[i] - 1) * dilations[i] + 1; + effective_kernel_shape[i] = + (effective_kernel_shape[i] - 1) * dilations[i] + 1; } - std::vector pads; if (getRepeatedAttribute(ctx, "pads", pads)) { if (pads.size() != n_input_dims * 2) { @@ -115,7 +115,9 @@ void convPoolShapeInference( residual -= stride; } } - int64_t total_pad = residual == 0 ? effective_kernel_shape[i] - stride : effective_kernel_shape[i] - residual; + int64_t total_pad = residual == 0 + ? 
effective_kernel_shape[i] - stride + : effective_kernel_shape[i] - residual; if (total_pad < 0) total_pad = 0; int64_t half_pad_small = total_pad >> 1; @@ -167,7 +169,8 @@ void convPoolShapeInference( if (ceil_mode == 1) strided_kernel_positions = (int64_t)(std::ceil( - (effective_input_size - effective_kernel_shape[i]) / float(strides[i]))); + (effective_input_size - effective_kernel_shape[i]) / + float(strides[i]))); else strided_kernel_positions = (effective_input_size - effective_kernel_shape[i]) / strides[i]; @@ -184,11 +187,15 @@ void convPoolShapeInference( } } -std::vector GetSupportedDataTypesForPoolingOps(bool supports8bit){ - if (supports8bit) { - return {"tensor(float16)", "tensor(float)", "tensor(double)", "tensor(int8)", "tensor(uint8)"}; - } - return {"tensor(float16)", "tensor(float)", "tensor(double)"}; +std::vector GetSupportedDataTypesForPoolingOps(bool supports8bit) { + if (supports8bit) { + return {"tensor(float16)", + "tensor(float)", + "tensor(double)", + "tensor(int8)", + "tensor(uint8)"}; + } + return {"tensor(float16)", "tensor(float)", "tensor(double)"}; } std::function PoolOpSchemaGenerator( @@ -198,7 +205,9 @@ std::function PoolOpSchemaGenerator( bool use_dilation, bool supports8bit = false) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR( + doc = R"DOC( {name} consumes an input tensor X and applies {opName} pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. {opName} pooling consisting of computing the {opName} on all values of a @@ -228,14 +237,14 @@ std::function PoolOpSchemaGenerator( ``` {additionalDescription} )DOC"; - ReplaceAll(doc, "{name}", name); - ReplaceAll(doc, "{opName}", opName); - ReplaceAll(doc, "{additionalDescription}", additionalDescription); - ReplaceAll( - doc, - "{kernelSpatialShape}", - use_dilation ? "((kernel_spatial_shape[i] - 1) * dilations[i] + 1)" - : "kernel_spatial_shape[i]"); + ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{opName}", opName); + ReplaceAll(doc, "{additionalDescription}", additionalDescription); + ReplaceAll( + doc, + "{kernelSpatialShape}", + use_dilation ? "((kernel_spatial_shape[i] - 1) * dilations[i] + 1)" + : "kernel_spatial_shape[i]");); schema.SetDoc(doc); schema.Attr( "kernel_shape", @@ -245,13 +254,13 @@ std::function PoolOpSchemaGenerator( "strides", "Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "auto_pad", auto_pad_doc, AttributeProto::STRING, std::string("NOTSET")); - schema.Attr("pads", pads_doc, AttributeProto::INTS, OPTIONAL); + schema.Attr("pads", pads_doc, AttributeProto::INTS, OPTIONAL_VALUE); schema.Attr( "ceil_mode", "Whether to use ceil or floor (default) to compute the output shape.", @@ -283,8 +292,9 @@ std::function PoolOpSchemaGenerator( schema.TypeConstraint( "T", GetSupportedDataTypesForPoolingOps(supports8bit), - supports8bit ? "Constrain input and output types to float and 8 bit tensors." - : "Constrain input and output types to float tensors."); + supports8bit + ? "Constrain input and output types to float and 8 bit tensors." + : "Constrain input and output types to float tensors."); schema.TypeAndShapeInferenceFunction([use_dilation](InferenceContext& ctx) { propagateElemTypeFromInputToOutput(ctx, 0, 0); if (ctx.getNumOutputs() > 1) { @@ -335,7 +345,7 @@ ONNX_OPERATOR_SET_SCHEMA( "dilations", "Dilation value along each spatial axis of filter. 
If not present, the dilation defaults to 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Output( 1, "Indices", @@ -475,8 +485,8 @@ ONNX_OPERATOR_SET_SCHEMA( "strides", "Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL) - .Attr("pads", pads_doc, AttributeProto::INTS, OPTIONAL) + OPTIONAL_VALUE) + .Attr("pads", pads_doc, AttributeProto::INTS, OPTIONAL_VALUE) .Input( 0, "X", @@ -530,13 +540,14 @@ ONNX_OPERATOR_SET_SCHEMA( std::function LpPoolOpSchemaGenerator(const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( {name} consumes an input tensor X and applies Lp pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Lp pooling consisting of computing the Lp norm on all values of a subset of the input tensor according to the kernel size and downsampling the data into the output tensor Y for further processing.)DOC"; - ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{name}", name);); schema.SetDoc(doc); schema.Attr( "kernel_shape", @@ -546,13 +557,13 @@ std::function LpPoolOpSchemaGenerator(const char* name) { "strides", "Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "auto_pad", auto_pad_doc, AttributeProto::STRING, std::string("NOTSET")); - schema.Attr("pads", pads_doc, AttributeProto::INTS, OPTIONAL); + schema.Attr("pads", pads_doc, AttributeProto::INTS, OPTIONAL_VALUE); schema.Attr( "p", "p value of the Lp norm used to pool over the input data.", @@ -636,11 +647,12 @@ void roiPoolTypeShapeInference(InferenceContext& ctx) { std::function RoiPoolOpSchemaGenerator(const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( ROI {name} pool consumes an input tensor X and region of interests (RoIs) to apply {name} pooling across each RoI, to produce output 4-D tensor of shape (num_rois, channels, pooled_shape[0], pooled_shape[1]).)DOC"; - ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{name}", name);); schema.SetDoc(doc); schema.Attr( "pooled_shape", @@ -688,10 +700,11 @@ ONNX_OPERATOR_SET_SCHEMA( std::function ConvOpSchemaGenerator(const char* filter_desc) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( The convolution operator consumes an input tensor and {filter_desc}, and computes the output.)DOC"; - ReplaceAll(doc, "{filter_desc}", filter_desc); + ReplaceAll(doc, "{filter_desc}", filter_desc);); schema.SetDoc(doc); schema.Input( 0, @@ -745,27 +758,23 @@ computes the output.)DOC"; "kernel_shape", "The shape of the convolution kernel. If not present, should be inferred from input W.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "dilations", "dilation value along each spatial axis of the filter. If not present, the dilation defaults is 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "strides", "Stride along each spatial axis. 
If not present, the stride defaults is 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "auto_pad", auto_pad_doc, AttributeProto::STRING, std::string("NOTSET")); - schema.Attr( - "pads", - pads_doc, - AttributeProto::INTS, - OPTIONAL); + schema.Attr("pads", pads_doc, AttributeProto::INTS, OPTIONAL_VALUE); schema.Attr( "group", "number of groups input channels and output channels are divided into.", @@ -898,17 +907,17 @@ ONNX_OPERATOR_SET_SCHEMA( "kernel_shape", "The shape of the convolution kernel. If not present, should be inferred from input 'w'.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "dilations", "dilation value along each spatial axis of the filter. If not present, the dilation defaults to 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "strides", "Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "pads", "Padding for the beginning and ending along each spatial axis, it can take any value greater than or equal to 0." @@ -918,14 +927,13 @@ ONNX_OPERATOR_SET_SCHEMA( "This attribute cannot be used simultaneously with auto_pad attribute. If not present, the padding defaults" "to 0 along start and end of each spatial axis.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "group", "number of groups input channels and output channels are divided into. default is 1.", AttributeProto::INT, static_cast(1)) - .TypeAndShapeInferenceFunction([](InferenceContext& - ctx) { + .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { auto x_type = ctx.getInputType(0); auto w_type = ctx.getInputType(3); if (nullptr == x_type || nullptr == w_type || @@ -1038,17 +1046,17 @@ ONNX_OPERATOR_SET_SCHEMA( "kernel_shape", "The shape of the convolution kernel. If not present, should be inferred from input 'w'.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "dilations", "dilation value along each spatial axis of the filter. If not present, the dilation defaults to 1 along each axis.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "strides", "Stride along each spatial axis. If not present, the stride defaults to 1 along each axis.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "pads", "Padding for the beginning and ending along each spatial axis, it can take any value greater than or equal to 0." @@ -1058,14 +1066,13 @@ ONNX_OPERATOR_SET_SCHEMA( "This attribute cannot be used simultaneously with auto_pad attribute. If not present, the padding defaults" "to 0 along start and end of each spatial axis.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "group", "number of groups input channels and output channels are divided into. 
default is 1.", AttributeProto::INT, static_cast(1)) - .TypeAndShapeInferenceFunction([](InferenceContext& - ctx) { + .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { auto x_type = ctx.getInputType(0); auto w_type = ctx.getInputType(1); auto y_type = ctx.getOutputType(0); @@ -1077,8 +1084,7 @@ ONNX_OPERATOR_SET_SCHEMA( } // Right now we only support int32 - y_type->mutable_tensor_type()->set_elem_type( - TensorProto::INT32); + y_type->mutable_tensor_type()->set_elem_type(TensorProto::INT32); convPoolShapeInference(ctx, true, false, 0, 1); })); @@ -1180,7 +1186,7 @@ void convTransposeShapeInference(InferenceContext& ctx) { } } } - + std::vector output_shape; bool output_shape_presented = true; if (getRepeatedAttribute(ctx, "output_shape", output_shape)) { @@ -1243,7 +1249,8 @@ void convTransposeShapeInference(InferenceContext& ctx) { std::function ConvTransposeOpSchemaGenerator( const char* filter_desc) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( The convolution transpose operator consumes an input tensor and {filter_desc}, and computes the output. @@ -1258,7 +1265,7 @@ output_shape can also be explicitly specified in which case pads values are auto Else: pads[start_i] = total_padding[i] - (total_padding[i]/2); pads[end_i] = (total_padding[i]/2). )DOC"; - ReplaceAll(doc, "{filter_desc}", filter_desc); + ReplaceAll(doc, "{filter_desc}", filter_desc);); schema.SetDoc(doc); schema.Input( 0, @@ -1304,13 +1311,13 @@ output_shape can also be explicitly specified in which case pads values are auto "kernel_shape", "The shape of the convolution kernel. If not present, should be inferred from input W.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "output_shape", "The shape of the output can be explicitly set which will cause pads values to be auto generated. If output_shape is specified " "pads values are ignored. See doc for details for equations to generate pads", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "output_padding", "Additional elements added to the side with higher coordinate indices in the output. " @@ -1324,23 +1331,23 @@ output_shape can also be explicitly specified in which case pads values are auto "participates in the computation of the needed padding amount. " "This is also called adjs or adjustment in some frameworks.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "dilations", "dilation value along each spatial axis of the filter. If not present, the dilation defaults to 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "strides", "Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "auto_pad", auto_pad_doc, AttributeProto::STRING, std::string("NOTSET")); - schema.Attr("pads", pads_doc, AttributeProto::INTS, OPTIONAL); + schema.Attr("pads", pads_doc, AttributeProto::INTS, OPTIONAL_VALUE); schema.Attr( "group", "number of groups input channels and output channels are divided into.", @@ -1388,12 +1395,13 @@ std::function GlobalPoolingOpSchemaGenerator( const char* op_type, const char* op) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( Global{op_type} consumes an input tensor X and applies {op} pooling across the values in the same channel. 
This is equivalent to {op_type} with kernel size equal to the spatial dimension of input tensor.)DOC"; - ReplaceAll(doc, "{op_type}", op_type); - ReplaceAll(doc, "{op}", op); + ReplaceAll(doc, "{op_type}", op_type); + ReplaceAll(doc, "{op}", op);); schema.SetDoc(doc); schema.Input( 0, @@ -1436,12 +1444,13 @@ std::function<void(OpSchema&)> GlobalLpPoolingOpSchemaGenerator( const char* op_type, const char* op) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( Global{op_type} consumes an input tensor X and applies {op} pooling across the values in the same channel. This is equivalent to {op_type} with kernel size equal to the spatial dimension of input tensor.)DOC"; - ReplaceAll(doc, "{op_type}", op_type); - ReplaceAll(doc, "{op}", op); + ReplaceAll(doc, "{op_type}", op_type); + ReplaceAll(doc, "{op}", op);); schema.SetDoc(doc); schema.Attr( "p", @@ -1471,7 +1480,6 @@ std::function<void(OpSchema&)> GlobalLpPoolingOpSchemaGenerator( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, "Constrain input and output types to float tensors."); - schema.SetDoc(doc); schema.TypeAndShapeInferenceFunction( [](InferenceContext& ctx) { globalPoolTypeShapeInference(ctx); }); }; @@ -1521,7 +1529,9 @@ ONNX_OPERATOR_SET_SCHEMA( 12, OpSchema() .NumOutputs({1, 5}) - .SetDoc(BatchNormalization_ver12_doc + GenerateOptionalArgumentsDoc()) + .SetDoc(GET_OP_DOC_STR( + std::string(BatchNormalization_ver12_doc) + + GenerateOptionalArgumentsDoc())) .Attr( "epsilon", "The epsilon value to use to avoid division by zero.", @@ -1598,70 +1608,94 @@ ONNX_OPERATOR_SET_SCHEMA( "Constrain input 'training_mode' types to boolean tensors.") .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { propagateElemTypeFromInputToOutput(ctx, 0, 0); - if(hasInputShape(ctx, 0)){ + if (hasInputShape(ctx, 0)) { propagateShapeFromInputToOutput(ctx, 0, 0); auto& x_input_shape = getInputShape(ctx, 0); int num_channels = 1; - if ( static_cast<int>(x_input_shape.dim_size()) > 1 && x_input_shape.dim(1).has_dim_value()) { - num_channels = static_cast<int>(x_input_shape.dim(1).dim_value()); + if (static_cast<int>(x_input_shape.dim_size()) > 1 && + x_input_shape.dim(1).has_dim_value()) { + num_channels = static_cast<int>(x_input_shape.dim(1).dim_value()); } if (hasInputShape(ctx, 1)) { - auto& scale_input_shape = getInputShape(ctx, 1); - if(static_cast<int>(scale_input_shape.dim_size()) != 1 || !scale_input_shape.dim(0).has_dim_value() || static_cast<int>(scale_input_shape.dim(0).dim_value()) != num_channels) { - fail_shape_inference("All scale, B, mean and var must be tensors of shape C."); - } + auto& scale_input_shape = getInputShape(ctx, 1); + if (static_cast<int>(scale_input_shape.dim_size()) != 1 || + !scale_input_shape.dim(0).has_dim_value() || + static_cast<int>(scale_input_shape.dim(0).dim_value()) != + num_channels) { + fail_shape_inference( + "All scale, B, mean and var must be tensors of shape C."); + } } if (hasInputShape(ctx, 2)) { - auto& b_input_shape = getInputShape(ctx, 2); - if(static_cast<int>(b_input_shape.dim_size()) != 1 || !b_input_shape.dim(0).has_dim_value() || static_cast<int>(b_input_shape.dim(0).dim_value()) != num_channels) { - fail_shape_inference("All scale, B, mean and var must be tensors of shape C."); - } + auto& b_input_shape = getInputShape(ctx, 2); + if (static_cast<int>(b_input_shape.dim_size()) != 1 || + !b_input_shape.dim(0).has_dim_value() || + static_cast<int>(b_input_shape.dim(0).dim_value()) != + num_channels) { + fail_shape_inference( + "All scale, B, mean and var must be tensors of shape C."); + } } if (hasInputShape(ctx, 3)) { - auto& mean_input_shape = getInputShape(ctx, 3); - if(static_cast<int>(mean_input_shape.dim_size() != 1)|| !mean_input_shape.dim(0).has_dim_value() || static_cast<int>(mean_input_shape.dim(0).dim_value()) != num_channels) { - fail_shape_inference("All scale, B, mean and var must be tensors of shape C."); - } + auto& mean_input_shape = getInputShape(ctx, 3); + if (static_cast<int>(mean_input_shape.dim_size() != 1) || + !mean_input_shape.dim(0).has_dim_value() || + static_cast<int>(mean_input_shape.dim(0).dim_value()) != + num_channels) { + fail_shape_inference( + "All scale, B, mean and var must be tensors of shape C."); + } } if (hasInputShape(ctx, 4)) { - auto& var_input_shape = getInputShape(ctx, 4); - if(static_cast<int>(var_input_shape.dim_size()) != 1 || !var_input_shape.dim(0).has_dim_value() || static_cast<int>(var_input_shape.dim(0).dim_value()) != num_channels) { - fail_shape_inference("All scale, B, mean and var must be tensors of shape C."); - } + auto& var_input_shape = getInputShape(ctx, 4); + if (static_cast<int>(var_input_shape.dim_size()) != 1 || + !var_input_shape.dim(0).has_dim_value() || + static_cast<int>(var_input_shape.dim(0).dim_value()) != + num_channels) { + fail_shape_inference( + "All scale, B, mean and var must be tensors of shape C."); + } } if (ctx.getNumInputs() > 5 && hasInputShape(ctx, 5)) { - auto& mode_input_shape = getInputShape(ctx, 5); - if (static_cast<int>(mode_input_shape.dim_size()) != 0) { - fail_shape_inference("Training_mode is not a scalar boolean."); + auto& mode_input_shape = getInputShape(ctx, 5); + // if mode is not scalar or tensor of rank 1, fail shape inference + if (static_cast<int>(mode_input_shape.dim_size()) != 0) { + if (static_cast<int>(mode_input_shape.dim_size()) > 1 || + !mode_input_shape.dim(0).has_dim_value() || + static_cast<int>(mode_input_shape.dim(0).dim_value()) != + 1) { + fail_shape_inference( + "Training_mode is not a scalar boolean."); } + } } if (ctx.getNumOutputs() > 1) { - TensorShapeProto outputs_shape; - *outputs_shape.add_dim() = x_input_shape.dim(1); // channel - - propagateElemTypeFromInputToOutput(ctx, 0, 1); - updateOutputShape(ctx, 1, outputs_shape); - - if (ctx.getNumOutputs() > 2){ - propagateElemTypeFromInputToOutput(ctx, 0, 2); - updateOutputShape(ctx, 2, outputs_shape); - } - - if (ctx.getNumOutputs() > 3){ - propagateElemTypeFromInputToOutput(ctx, 0, 3); - updateOutputShape(ctx, 3, outputs_shape); - } - - if (ctx.getNumOutputs() > 4){ - propagateElemTypeFromInputToOutput(ctx, 0, 4); - updateOutputShape(ctx, 4, outputs_shape); - } + TensorShapeProto outputs_shape; + *outputs_shape.add_dim() = x_input_shape.dim(1); // channel + + propagateElemTypeFromInputToOutput(ctx, 0, 1); + updateOutputShape(ctx, 1, outputs_shape); + + if (ctx.getNumOutputs() > 2) { + propagateElemTypeFromInputToOutput(ctx, 0, 2); + updateOutputShape(ctx, 2, outputs_shape); + } + + if (ctx.getNumOutputs() > 3) { + propagateElemTypeFromInputToOutput(ctx, 0, 3); + updateOutputShape(ctx, 3, outputs_shape); + } + + if (ctx.getNumOutputs() > 4) { + propagateElemTypeFromInputToOutput(ctx, 0, 4); + updateOutputShape(ctx, 4, outputs_shape); + } } } })); @@ -1763,13 +1797,23 @@ ONNX_OPERATOR_SET_SCHEMA( Dropout, 12, OpSchema() - .SetDoc(Dropout_ver12_doc + GenerateOptionalArgumentsDoc()) - .Attr("seed", "(Optional) Seed to the random generator, if not specified we will auto generate one.", AttributeProto::INT, OPTIONAL) + .SetDoc(GET_OP_DOC_STR( + std::string(Dropout_ver12_doc) + GenerateOptionalArgumentsDoc())) + .Attr( + "seed", + "(Optional) Seed to the random generator, if not specified we 
will auto generate one.", + AttributeProto::INT, + OPTIONAL_VALUE) .Input(0, "data", "The input data as Tensor.", "T") - .Input(1, "ratio", "The ratio of random dropout, with value in [0, 1). If this input was not set, " - "or if it was set to 0, the output would be a simple copy of the input. " - "If it's non-zero, output will be a random dropout of the scaled input, which is typically " - "the case during training.", "T1", OpSchema::Optional) + .Input( + 1, + "ratio", + "The ratio of random dropout, with value in [0, 1). If this input was not set, " + "or if it was set to 0, the output would be a simple copy of the input. " + "If it's non-zero, output will be a random dropout of the scaled input, which is typically " + "the case during training.", + "T1", + OpSchema::Optional) .Output(0, "output", "The output.", "T") .Output(1, "mask", "The output mask.", "T2", OpSchema::Optional) .TypeConstraint( @@ -1793,7 +1837,7 @@ ONNX_OPERATOR_SET_SCHEMA( if (ctx.getNumInputs() > 1 && hasInputShape(ctx, 1)) { auto& ratio_input_shape = getInputShape(ctx, 1); if (static_cast(ratio_input_shape.dim_size()) != 0) { - fail_shape_inference("Ratio of Dropout must be a scalar."); + fail_shape_inference("Ratio of Dropout must be a scalar."); } } if (ctx.getNumOutputs() == 2) { @@ -2000,14 +2044,14 @@ ONNX_OPERATOR_SET_SCHEMA( "It's an 1-D tensor starting with the collections of all 1-grams and ending with the collections of n-grams. " "The i-th element in pool stores the n-gram that should be mapped to coordinate ngram_indexes[i] in the output vector.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "pool_int64s", "List of int64 n-grams learned from the training set. Either this or pool_strings attributes must be present but not both. " "It's an 1-D tensor starting with the collections of all 1-grams and ending with the collections of n-grams. " "The i-th element in pool stores the n-gram that should be mapped to coordinate ngram_indexes[i] in the output vector.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "ngram_counts", "The starting indexes of 1-grams, 2-grams, and so on in pool. " @@ -2028,7 +2072,7 @@ ONNX_OPERATOR_SET_SCHEMA( "By default, weights is an all-one tensor.This attribute is used when mode is \"IDF\" or \"TFIDF\" " "to scale the associated word counts.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "mode", "The weighting criteria. It can be one of \"TF\" (term frequency), " @@ -2105,13 +2149,13 @@ ONNX_OPERATOR_SET_SCHEMA( "stopwords", "List of stop words. If not set, no word would be removed from X.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "locale", "Environment dependent string that denotes the locale according to which output strings needs to be upper/lowercased." 
"Default en_US or platform specific equivalent as decided by the implementation.", AttributeProto::STRING, - OPTIONAL) + OPTIONAL_VALUE) .SetDoc(StringNormalizer_ver10_doc) .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { auto output_elem_type = ctx.getOutputType(0)->mutable_tensor_type(); diff --git a/onnx/defs/nn/old.cc b/onnx/defs/nn/old.cc index e6251590771..5b6173786ff 100644 --- a/onnx/defs/nn/old.cc +++ b/onnx/defs/nn/old.cc @@ -87,10 +87,10 @@ void convPoolShapeInference1( std::vector effective_kernel_shape = kernel_shape; for (int i = 0; i < static_cast(kernel_shape.size()); i++) { // accounting for dilation, how big is the kernel in this dimension - effective_kernel_shape[i] = (effective_kernel_shape[i] - 1) * dilations[i] + 1; + effective_kernel_shape[i] = + (effective_kernel_shape[i] - 1) * dilations[i] + 1; } - std::vector pads; if (getRepeatedAttribute(ctx, "pads", pads)) { if (pads.size() != n_input_dims * 2) { @@ -113,7 +113,9 @@ void convPoolShapeInference1( residual -= stride; } } - int64_t total_pad = residual == 0 ? effective_kernel_shape[i] - stride : effective_kernel_shape[i] - residual; + int64_t total_pad = residual == 0 + ? effective_kernel_shape[i] - stride + : effective_kernel_shape[i] - residual; if (total_pad < 0) total_pad = 0; int64_t half_pad_small = total_pad >> 1; @@ -128,7 +130,7 @@ void convPoolShapeInference1( } } } - + auto output_shape = ctx.getOutputType(0)->mutable_tensor_type()->mutable_shape(); @@ -165,7 +167,8 @@ void convPoolShapeInference1( if (ceil_mode == 1) strided_kernel_positions = (int64_t)(std::ceil( - (effective_input_size - effective_kernel_shape[i]) / float(strides[i]))); + (effective_input_size - effective_kernel_shape[i]) / + float(strides[i]))); else strided_kernel_positions = (effective_input_size - effective_kernel_shape[i]) / strides[i]; @@ -187,7 +190,9 @@ std::function PoolOpSchemaGenerator_9( const char* opName, const char* additionalDescription) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR( + doc = R"DOC( {name} consumes an input tensor X and applies {opName} pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. 
{opName} pooling consisting of computing the {opName} on all values of a @@ -210,22 +215,25 @@ std::function PoolOpSchemaGenerator_9( ``` {additionalDescription} )DOC"; - ReplaceAll(doc, "{name}", name); - ReplaceAll(doc, "{opName}", opName); - ReplaceAll(doc, "{additionalDescription}", additionalDescription); + ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{opName}", opName); + ReplaceAll(doc, "{additionalDescription}", additionalDescription);); schema.SetDoc(doc); schema.Attr( "kernel_shape", "The size of the kernel along each axis.", AttributeProto::INTS); schema.Attr( - "strides", "Stride along each spatial axis.", AttributeProto::INTS, OPTIONAL); + "strides", + "Stride along each spatial axis.", + AttributeProto::INTS, + OPTIONAL_VALUE); schema.Attr( "auto_pad", auto_pad_doc2, AttributeProto::STRING, std::string("NOTSET")); - schema.Attr("pads", pads_doc2, AttributeProto::INTS, OPTIONAL); + schema.Attr("pads", pads_doc2, AttributeProto::INTS, OPTIONAL_VALUE); schema.Input( 0, "X", @@ -275,7 +283,9 @@ std::function PoolOpSchemaGenerator_10( bool use_dilation, int opsetNum) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR( + doc = R"DOC( {name} consumes an input tensor X and applies {opName} pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. {opName} pooling consisting of computing the {opName} on all values of a @@ -305,30 +315,32 @@ std::function PoolOpSchemaGenerator_10( ``` {additionalDescription} )DOC"; - ReplaceAll(doc, "{name}", name); - ReplaceAll(doc, "{opName}", opName); - ReplaceAll(doc, "{additionalDescription}", additionalDescription); - ReplaceAll( - doc, - "{kernelSpatialShape}", - use_dilation ? "((kernel_spatial_shape[i] - 1) * dilations[i] + 1)" - : "kernel_spatial_shape[i]"); + ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{opName}", opName); + ReplaceAll(doc, "{additionalDescription}", additionalDescription); + ReplaceAll( + doc, + "{kernelSpatialShape}", + use_dilation ? "((kernel_spatial_shape[i] - 1) * dilations[i] + 1)" + : "kernel_spatial_shape[i]");); schema.SetDoc(doc); schema.Attr( "kernel_shape", "The size of the kernel along each axis.", AttributeProto::INTS); schema.Attr( - "strides", - opsetNum == 11 ? "Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis." : "Stride along each spatial axis.", - AttributeProto::INTS, - OPTIONAL); + "strides", + opsetNum == 11 + ? "Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis." + : "Stride along each spatial axis.", + AttributeProto::INTS, + OPTIONAL_VALUE); schema.Attr( "auto_pad", auto_pad_doc2, AttributeProto::STRING, std::string("NOTSET")); - schema.Attr("pads", pads_doc2, AttributeProto::INTS, OPTIONAL); + schema.Attr("pads", pads_doc2, AttributeProto::INTS, OPTIONAL_VALUE); schema.Attr( "ceil_mode", "Whether to use ceil or floor (default) to compute the output shape.", @@ -470,7 +482,7 @@ ONNX_OPERATOR_SET_SCHEMA( "dilations", "Dilation value along each spatial axis of filter.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Output( 1, "Indices", @@ -506,7 +518,7 @@ ONNX_OPERATOR_SET_SCHEMA( "dilations", "Dilation value along each spatial axis of filter. 
If not present, the dilation defaults to 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Output( 1, "Indices", @@ -646,8 +658,8 @@ ONNX_OPERATOR_SET_SCHEMA( "strides", "Stride along each spatial axis.", AttributeProto::INTS, - OPTIONAL) - .Attr("pads", pads_doc2, AttributeProto::INTS, OPTIONAL) + OPTIONAL_VALUE) + .Attr("pads", pads_doc2, AttributeProto::INTS, OPTIONAL_VALUE) .Input( 0, "X", @@ -732,18 +744,18 @@ ONNX_OPERATOR_SET_SCHEMA( "kernel_shape", "The size of the kernel along each axis.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "strides", "Stride along each axis.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "auto_pad", auto_pad_doc1, AttributeProto::STRING, std::string("NOTSET")) - .Attr("pads", pads_doc1, AttributeProto::INTS, OPTIONAL) + .Attr("pads", pads_doc1, AttributeProto::INTS, OPTIONAL_VALUE) .Attr( "p", "p value of the Lp norm used to pool over the input data, default is 2.0.", @@ -775,26 +787,30 @@ ONNX_OPERATOR_SET_SCHEMA( std::function LpPoolOpSchemaGenerator_10(const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( {name} consumes an input tensor X and applies Lp pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Lp pooling consisting of computing the Lp norm on all values of a subset of the input tensor according to the kernel size and downsampling the data into the output tensor Y for further processing.)DOC"; - ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{name}", name);); schema.SetDoc(doc); schema.Attr( "kernel_shape", "The size of the kernel along each axis.", AttributeProto::INTS); schema.Attr( - "strides", "Stride along each spatial axis.", AttributeProto::INTS, OPTIONAL); + "strides", + "Stride along each spatial axis.", + AttributeProto::INTS, + OPTIONAL_VALUE); schema.Attr( "auto_pad", auto_pad_doc2, AttributeProto::STRING, std::string("NOTSET")); - schema.Attr("pads", pads_doc2, AttributeProto::INTS, OPTIONAL); + schema.Attr("pads", pads_doc2, AttributeProto::INTS, OPTIONAL_VALUE); schema.Attr( "p", "p value of the Lp norm used to pool over the input data.", @@ -840,12 +856,14 @@ static const char* GlobalLpPool_ver1_doc = R"DOC( the values in the same channel. This is equivalent to LpPool with kernel size equal to the spatial dimension of input tensor.)DOC"; - std::function ConvOpSchemaGenerator_10(const char* filter_desc) { +std::function ConvOpSchemaGenerator_10( + const char* filter_desc) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( The convolution operator consumes an input tensor and {filter_desc}, and computes the output.)DOC"; - ReplaceAll(doc, "{filter_desc}", filter_desc); + ReplaceAll(doc, "{filter_desc}", filter_desc);); schema.SetDoc(doc); schema.Input( 0, @@ -899,20 +917,23 @@ computes the output.)DOC"; "kernel_shape", "The shape of the convolution kernel. 
If not present, should be inferred from input W.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "dilations", "dilation value along each spatial axis of the filter.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( - "strides", "Stride along each spatial axis.", AttributeProto::INTS, OPTIONAL); + "strides", + "Stride along each spatial axis.", + AttributeProto::INTS, + OPTIONAL_VALUE); schema.Attr( "auto_pad", auto_pad_doc2, AttributeProto::STRING, std::string("NOTSET")); - schema.Attr("pads", pads_doc2, AttributeProto::INTS, OPTIONAL); + schema.Attr("pads", pads_doc2, AttributeProto::INTS, OPTIONAL_VALUE); schema.Attr( "group", "number of groups input channels and output channels are divided into.", @@ -1027,7 +1048,7 @@ void convTransposeShapeInference1(InferenceContext& ctx) { } } } - + std::vector output_shape; bool output_shape_presented = true; if (getRepeatedAttribute(ctx, "output_shape", output_shape)) { @@ -1090,7 +1111,8 @@ void convTransposeShapeInference1(InferenceContext& ctx) { std::function ConvTransposeOpSchemaGenerator_10( const char* filter_desc) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( The convolution transpose operator consumes an input tensor and {filter_desc}, and computes the output. @@ -1105,7 +1127,7 @@ output_shape can also be explicitly specified in which case pads values are auto Else: pads[start_i] = total_padding[i] - (total_padding[i]/2); pads[end_i] = (total_padding[i]/2). )DOC"; - ReplaceAll(doc, "{filter_desc}", filter_desc); + ReplaceAll(doc, "{filter_desc}", filter_desc);); schema.SetDoc(doc); schema.Input( 0, @@ -1151,32 +1173,35 @@ output_shape can also be explicitly specified in which case pads values are auto "kernel_shape", "The shape of the convolution kernel. If not present, should be inferred from input W.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "output_shape", "The shape of the output can be explicitly set which will cause pads values to be auto generated. If output_shape is specified " "pads values are ignored. See doc for details for equations to generate pads", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "output_padding", "The zero-padding added to one side of the output." 
" This is also called adjs/adjustment in some frameworks.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "dilations", "dilation value along each spatial axis of the filter.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( - "strides", "Stride along each spatial axis.", AttributeProto::INTS, OPTIONAL); + "strides", + "Stride along each spatial axis.", + AttributeProto::INTS, + OPTIONAL_VALUE); schema.Attr( "auto_pad", auto_pad_doc2, AttributeProto::STRING, std::string("NOTSET")); - schema.Attr("pads", pads_doc2, AttributeProto::INTS, OPTIONAL); + schema.Attr("pads", pads_doc2, AttributeProto::INTS, OPTIONAL_VALUE); schema.Attr( "group", "number of groups input channels and output channels are divided into.", @@ -1353,7 +1378,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "epsilon", "The epsilon value to use to avoid division by zero, default is 1e-5f.", @@ -1401,7 +1426,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "is_test", "(int, default 0) if nonzero, run dropout in test mode where " @@ -1463,7 +1488,8 @@ ONNX_OPERATOR_SET_SCHEMA( Dropout, 7, OpSchema() - .SetDoc(Dropout_ver7_doc + GenerateOptionalArgumentsDoc()) + .SetDoc(GET_OP_DOC_STR( + std::string(Dropout_ver7_doc) + GenerateOptionalArgumentsDoc())) .Attr( "ratio", "The ratio of random dropout", @@ -1490,7 +1516,8 @@ ONNX_OPERATOR_SET_SCHEMA( Dropout, 10, OpSchema() - .SetDoc(Dropout_ver10_doc + GenerateOptionalArgumentsDoc()) + .SetDoc(GET_OP_DOC_STR( + std::string(Dropout_ver10_doc) + GenerateOptionalArgumentsDoc())) .Attr( "ratio", "The ratio of random dropout", @@ -1647,7 +1674,9 @@ ONNX_OPERATOR_SET_SCHEMA( 9, OpSchema() .NumOutputs({1, 5}) - .SetDoc(BatchNormalization_ver9_doc + GenerateOptionalArgumentsDoc()) + .SetDoc(GET_OP_DOC_STR( + std::string(BatchNormalization_ver9_doc) + + GenerateOptionalArgumentsDoc())) .Attr( "epsilon", "The epsilon value to use to avoid division by zero.", @@ -1835,8 +1864,10 @@ ONNX_OPERATOR_SET_SCHEMA( BatchNormalization, 7, OpSchema() + .SetDoc(GET_OP_DOC_STR( + std::string(BatchNormalization_ver7_doc) + + GenerateOptionalArgumentsDoc())) .NumOutputs({1, 5}) - .SetDoc(BatchNormalization_ver7_doc + GenerateOptionalArgumentsDoc()) .Attr( "spatial", "If true, compute the mean and variance across per activation. " diff --git a/onnx/defs/operator_sets-training.h b/onnx/defs/operator_sets-training.h index 902159ecd50..2cef74d018b 100644 --- a/onnx/defs/operator_sets-training.h +++ b/onnx/defs/operator_sets-training.h @@ -10,6 +10,7 @@ namespace ONNX_NAMESPACE { // Declare training operators. 
class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(OnnxTraining, 1, Gradient); class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(OnnxTraining, 1, GraphCall); +class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(OnnxTraining, 1, Momentum); class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(OnnxTraining, 1, Adagrad); // Iterate over schema from ai.onnx.training version 1 @@ -18,6 +19,7 @@ class OpSet_OnnxTraining_ver1 { static void ForEachSchema(std::function<void(OpSchema&&)> fn) { fn(GetOpSchema<ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(OnnxTraining, 1, Gradient)>()); fn(GetOpSchema<ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(OnnxTraining, 1, GraphCall)>()); + fn(GetOpSchema<ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(OnnxTraining, 1, Momentum)>()); fn(GetOpSchema<ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(OnnxTraining, 1, Adagrad)>()); } }; diff --git a/onnx/defs/operator_sets.h b/onnx/defs/operator_sets.h index 8895988ff59..2668a1950d9 100644 --- a/onnx/defs/operator_sets.h +++ b/onnx/defs/operator_sets.h @@ -732,6 +732,7 @@ class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 12, MeanSquaredDistance); class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 12, LessOrEqual); class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 12, GreaterOrEqual); class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 12, SoftmaxCrossEntropyLoss); +class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 12, Pow); // Iterate over schema from ai.onnx version 12 class OpSet_Onnx_ver12 { @@ -758,6 +759,7 @@ class OpSet_Onnx_ver12 { fn(GetOpSchema<ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 12, LessOrEqual)>()); fn(GetOpSchema<ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 12, GreaterOrEqual)>()); fn(GetOpSchema<ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 12, SoftmaxCrossEntropyLoss)>()); + fn(GetOpSchema<ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 12, Pow)>()); } }; diff --git a/onnx/defs/reduction/defs.cc b/onnx/defs/reduction/defs.cc index 75e533b16b2..6fbc832daca 100644 --- a/onnx/defs/reduction/defs.cc +++ b/onnx/defs/reduction/defs.cc @@ -7,35 +7,39 @@ namespace ONNX_NAMESPACE { -std::vector<std::string> GetSupportedDataTypesForReductionOps(bool supports8bit){ - if (supports8bit) { - auto data_types = OpSchema::numeric_types_for_math_reduction(); - data_types.push_back("tensor(uint8)"); - data_types.push_back("tensor(int8)"); +std::vector<std::string> GetSupportedDataTypesForReductionOps( + bool supports8bit) { + if (supports8bit) { + auto data_types = OpSchema::numeric_types_for_math_reduction(); + data_types.push_back("tensor(uint8)"); + data_types.push_back("tensor(int8)"); - return data_types; - } + return data_types; + } - return OpSchema::numeric_types_for_math_reduction(); + return OpSchema::numeric_types_for_math_reduction(); } -std::function<void(OpSchema&)> ReduceDocGenerator(const char* name, bool supports_8bit_datatypes = false) { +std::function<void(OpSchema&)> ReduceDocGenerator( + const char* name, + bool supports_8bit_datatypes = false) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( Computes the {name} of the input tensor's element along the provided axes. The resulted tensor has the same rank as the input if keepdims equal 1. If keepdims equal 0, then the resulted tensor have the reduced dimension pruned. The above behavior is similar to numpy, with the exception that numpy default keepdims to False instead of True.)DOC"; - ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{name}", name);); schema.SetDoc(doc.c_str()); schema.Attr( "axes", "A list of integers, along which to reduce. The default is to reduce over " "all the dimensions of the input tensor. Accepted range is [-r, r-1] where r = rank(data).", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "keepdims", "Keep the reduced dimension or not, default 1 mean keep reduced dimension.", @@ -46,9 +50,9 @@ False instead of True.)DOC"; schema.TypeConstraint( "T", GetSupportedDataTypesForReductionOps(supports_8bit_datatypes), - supports_8bit_datatypes ? - "Constrain input and output types to high-precision and 8 bit numeric tensors." 
: - "Constrain input and output types to high-precision numeric tensors."); + supports_8bit_datatypes + ? "Constrain input and output types to high-precision and 8 bit numeric tensors." + : "Constrain input and output types to high-precision numeric tensors."); schema.TypeAndShapeInferenceFunction([](InferenceContext& ctx) { propagateElemTypeFromInputToOutput(ctx, 0, 0); if (!hasNInputShapes(ctx, 1)) { @@ -147,7 +151,8 @@ ONNX_OPERATOR_SET_SCHEMA( std::function ArgReduceDocGenerator(const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( Computes the indices of the {name} elements of the input tensor's element along the provided axis. The resulting tensor has the same rank as the input if keepdims equal 1. If keepdims equal 0, then the resulting tensor have the reduced dimension pruned. @@ -155,7 +160,7 @@ If select_last_index is True (default False), the index of the last occurrence o is selected if the {name} appears more than once in the input. Otherwise the index of the first occurrence is selected. The type of the output tensor is integer.)DOC"; - ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{name}", name);); schema.SetDoc(doc.c_str()); schema.Attr( "axis", @@ -200,7 +205,7 @@ The type of the output tensor is integer.)DOC"; axis = axis_proto->i(); if (axis < -input_ndim || axis >= input_ndim) { fail_shape_inference( - "'axis' must be in [-rank(indices), rank(indices)-1]"); + "'axis' must be in [-rank(indices), rank(indices)-1]"); } if (axis < 0) axis += input_ndim; diff --git a/onnx/defs/reduction/old.cc b/onnx/defs/reduction/old.cc index 9a017f02135..d7ef7e05b5a 100644 --- a/onnx/defs/reduction/old.cc +++ b/onnx/defs/reduction/old.cc @@ -10,24 +10,25 @@ std::function ReduceDocGenerator_opset1( const char* name, int opset = 1) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( Computes the {name} of the input tensor's element along the provided axes. The resulted tensor has the same rank as the input if keepdims equal 1. If keepdims equal 0, then the resulted tensor have the reduced dimension pruned. The above behavior is similar to numpy, with the exception that numpy default keepdims to False instead of True.)DOC"; - ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{name}", name);); schema.SetDoc(doc.c_str()); schema.Attr( "axes", - opset >= 11 ? - "A list of integers, along which to reduce. The default is to reduce over " - "all the dimensions of the input tensor. Accepted range is [-r, r-1] where r = rank(data)." : - "A list of integers, along which to reduce. The default is to reduce over " - "all the dimensions of the input tensor.", + opset >= 11 + ? "A list of integers, along which to reduce. The default is to reduce over " + "all the dimensions of the input tensor. Accepted range is [-r, r-1] where r = rank(data)." + : "A list of integers, along which to reduce. The default is to reduce over " + "all the dimensions of the input tensor.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "keepdims", "Keep the reduced dimension or not, default 1 mean keep reduced dimension.", @@ -143,12 +144,13 @@ ONNX_OPERATOR_SET_SCHEMA( std::function ArgReduceDocGenerator_opset1(const char* name) { return [=](OpSchema& schema) { - std::string doc = R"DOC( + std::string doc; + POPULATE_OP_DOC_STR(doc = R"DOC( Computes the indices of the {name} elements of the input tensor's element along the provided axis. 
The resulted tensor has the same rank as the input if keepdims equal 1. If keepdims equal 0, then the resulted tensor have the reduced dimension pruned. The type of the output tensor is integer.)DOC"; - ReplaceAll(doc, "{name}", name); + ReplaceAll(doc, "{name}", name);); schema.SetDoc(doc.c_str()); schema.Attr( "axis", @@ -268,7 +270,7 @@ The type of the output tensor is integer.)DOC"; axis = axis_proto->i(); if (axis < -input_ndim || axis >= input_ndim) { fail_shape_inference( - "'axis' must be in [-rank(indices), rank(indices)-1]"); + "'axis' must be in [-rank(indices), rank(indices)-1]"); } if (axis < 0) axis += input_ndim; diff --git a/onnx/defs/rnn/defs.cc b/onnx/defs/rnn/defs.cc index a640669e28d..63875b19a46 100644 --- a/onnx/defs/rnn/defs.cc +++ b/onnx/defs/rnn/defs.cc @@ -62,7 +62,7 @@ std::function RNNDocGenerator(const char* /*name*/) { "hidden_size", "Number of neurons in the hidden layer", AttributeProto::INT, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "activation_alpha", "Optional scaling values used by some activation functions. The values " @@ -70,21 +70,21 @@ std::function RNNDocGenerator(const char* /*name*/) { "in LSTM. Default values are the same as of corresponding ONNX operators." "For example with LeakyRelu, the default alpha is 0.01.", AttributeProto::FLOATS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "activation_beta", "Optional scaling values used by some activation functions. The values " "are consumed in the order of activation functions, for example (f, g, h) " "in LSTM. Default values are the same as of corresponding ONNX operators.", AttributeProto::FLOATS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "clip", "Cell clip threshold. Clipping bounds the elements of a tensor " "in the range of [-threshold, +threshold] and is applied to the input " "of activations. No clip if not specified.", AttributeProto::FLOAT, - OPTIONAL); + OPTIONAL_VALUE); schema.Input( 0, "X", @@ -197,7 +197,8 @@ ONNX_OPERATOR_SET_SCHEMA( RNN, 7, OpSchema() - .SetDoc(RNN_ver7_doc + GenerateOptionalArgumentsDoc()) + .SetDoc(GET_OP_DOC_STR( + std::string(RNN_ver7_doc) + GenerateOptionalArgumentsDoc())) .Attr( "activations", "One (or two if bidirectional) activation function for " @@ -309,7 +310,8 @@ ONNX_OPERATOR_SET_SCHEMA( GRU, 7, OpSchema() - .SetDoc(GRU_ver7_doc + GenerateOptionalArgumentsDoc()) + .SetDoc(GET_OP_DOC_STR( + std::string(GRU_ver7_doc) + GenerateOptionalArgumentsDoc())) .Attr( "activations", "A list of 2 (or 4 if bidirectional) activation functions " @@ -317,7 +319,7 @@ ONNX_OPERATOR_SET_SCHEMA( "of the activation functions specified above. Optional: See the equations " "for default if not specified.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "linear_before_reset", "When computing the output of the hidden gate, " @@ -437,7 +439,8 @@ ONNX_OPERATOR_SET_SCHEMA( LSTM, 7, OpSchema() - .SetDoc(LSTM_ver7_doc + GenerateOptionalArgumentsDoc()) + .SetDoc(GET_OP_DOC_STR( + std::string(LSTM_ver7_doc) + GenerateOptionalArgumentsDoc())) .Attr( "activations", "A list of 3 (or 6 if bidirectional) activation functions " @@ -445,7 +448,7 @@ ONNX_OPERATOR_SET_SCHEMA( "be one of the activation functions specified above. 
Optional: See the equations " "for default if not specified.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "input_forget", "Couple the input and forget gates if 1.", diff --git a/onnx/defs/rnn/old.cc b/onnx/defs/rnn/old.cc index 28c0861328e..a75858b3f2a 100644 --- a/onnx/defs/rnn/old.cc +++ b/onnx/defs/rnn/old.cc @@ -13,21 +13,21 @@ std::function RNNDocGeneratorOld(const char* /*name*/) { "hidden_size", "Number of neurons in the hidden layer", AttributeProto::INT, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "activation_alpha", "Optional scaling values used by some activation functions. The values " "are consumed in the order of activation functions, for example (f, g, h) " "in LSTM.", AttributeProto::FLOATS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "activation_beta", "Optional scaling values used by some activation functions. The values " "are consumed in the order of activation functions, for example (f, g, h) " "in LSTM.", AttributeProto::FLOATS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "output_sequence", "The sequence output for the hidden is optional if 0. Default 0.", @@ -39,7 +39,7 @@ std::function RNNDocGeneratorOld(const char* /*name*/) { "in the range of [-threshold, +threshold] and is applied to the input " "of activations. No clip if not specified.", AttributeProto::FLOAT, - OPTIONAL); + OPTIONAL_VALUE); schema.Input( 0, "X", @@ -171,7 +171,7 @@ ONNX_OPERATOR_SET_SCHEMA( "of the activation functions specified above. Optional: See the equations " "for default if not specified.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Input( 1, "W", @@ -266,7 +266,7 @@ std::function RNNDocGenerator1(const char* /*name*/) { "hidden_size", "Number of neurons in the hidden layer", AttributeProto::INT, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "activation_alpha", "Optional scaling values used by some activation functions. The values " @@ -274,14 +274,14 @@ std::function RNNDocGenerator1(const char* /*name*/) { "in LSTM. Default values are the same as of corresponding ONNX operators." "For example with LeakyRelu, the default alpha is 0.01.", AttributeProto::FLOATS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "activation_beta", "Optional scaling values used by some activation functions. The values " "are consumed in the order of activation functions, for example (f, g, h) " "in LSTM. Default values are the same as of corresponding ONNX operators.", AttributeProto::FLOATS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "output_sequence", "The sequence output for the hidden is optional if 0. Default 0.", @@ -293,7 +293,7 @@ std::function RNNDocGenerator1(const char* /*name*/) { "in the range of [-threshold, +threshold] and is applied to the input " "of activations. No clip if not specified.", AttributeProto::FLOAT, - OPTIONAL); + OPTIONAL_VALUE); schema.Input( 0, "X", @@ -527,7 +527,7 @@ ONNX_OPERATOR_SET_SCHEMA( "of the activation functions specified above. Optional: See the equations " "for default if not specified.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "linear_before_reset", "When computing the output of the hidden gate, " @@ -655,7 +655,7 @@ ONNX_OPERATOR_SET_SCHEMA( "be one of the activation functions specified above. 
Optional: See the equations " "for default if not specified.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "input_forget", "Couple the input and forget gates if 1, default 0.", diff --git a/onnx/defs/schema.cc b/onnx/defs/schema.cc index 01d0829b23e..fb32c8518ec 100644 --- a/onnx/defs/schema.cc +++ b/onnx/defs/schema.cc @@ -28,36 +28,6 @@ DbgOperatorSetTracker& DbgOperatorSetTracker::Instance() { } #endif -OpSchema::FormalParameter::FormalParameter( - std::string name, - DataTypeSet allowed_type_set, - std::string type_str, - std::string description, - FormalParameterOption param_option, - bool is_homogeneous, - int min_arity) - : name_(std::move(name)), - type_set_(std::move(allowed_type_set)), - type_str_(std::move(type_str)), - description_(std::move(description)), - param_option_(param_option), - is_homogeneous_(is_homogeneous), - min_arity_(min_arity) {} - -OpSchema::FormalParameter::FormalParameter( - std::string name, - std::string description, - std::string type_str, - FormalParameterOption param_option, - bool is_homogeneous, - int min_arity) - : name_(std::move(name)), - type_str_(std::move(type_str)), - description_(std::move(description)), - param_option_(param_option), - is_homogeneous_(is_homogeneous), - min_arity_(min_arity) {} - const std::string& OpSchema::FormalParameter::GetName() const { return name_; } @@ -441,11 +411,6 @@ OpSchema& OpSchema::SetSupportLevel(SupportType support) { return *this; } -OpSchema& OpSchema::SetDoc(std::string doc) { - doc_ = std::move(doc); - return *this; -} - // Functions to specify name for the operator schema. OpSchema& OpSchema::SetName(std::string name) { name_ = std::move(name); @@ -607,7 +572,7 @@ OpSchema& OpSchema::AllowUncheckedAttributes() { OpSchema& OpSchema::Input( int n, std::string name, - std::string description, + const std::string& description, std::string type_str, OpSchema::FormalParameterOption param_option, bool is_homogeneous, @@ -617,7 +582,11 @@ OpSchema& OpSchema::Input( } inputs_[n] = FormalParameter( std::move(name), - std::move(description), +#ifndef __ONNX_NO_DOC_STRINGS + description, +#else + std::string(), +#endif std::move(type_str), param_option, is_homogeneous, @@ -636,7 +605,11 @@ OpSchema& OpSchema::Input( return Input( n, std::string(name), +#ifndef __ONNX_NO_DOC_STRINGS std::string(description), +#else + std::string(), +#endif std::string(type_str), param_option, is_homogeneous, @@ -646,7 +619,7 @@ OpSchema& OpSchema::Input( OpSchema& OpSchema::Output( int n, std::string name, - std::string description, + const std::string& description, std::string type_str, OpSchema::FormalParameterOption param_option, bool is_homogeneous, @@ -656,7 +629,11 @@ OpSchema& OpSchema::Output( } outputs_[n] = FormalParameter( std::move(name), - std::move(description), +#ifndef __ONNX_NO_DOC_STRINGS + description, +#else + std::string(), +#endif std::move(type_str), param_option, is_homogeneous, @@ -675,7 +652,11 @@ OpSchema& OpSchema::Output( return Output( n, std::string(name), +#ifndef __ONNX_NO_DOC_STRINGS std::string(description), +#else + std::string(), +#endif std::string(type_str), param_option, is_homogeneous, @@ -747,7 +728,7 @@ bool OpSchema::BuildContextDependentFunction( } OpSchema& OpSchema::FunctionBody(const std::vector& func_nodes) { - for (const auto node : func_nodes) { + for (const auto& node : func_nodes) { auto new_node = function_body_.add_node(); new_node->CopyFrom(node); } diff --git a/onnx/defs/schema.h b/onnx/defs/schema.h index 3a2dec8c20e..23cc72530fb 100644 --- 
a/onnx/defs/schema.h +++ b/onnx/defs/schema.h @@ -28,12 +28,13 @@ namespace ONNX_NAMESPACE { struct FunctionBodyBuildContext { virtual const AttributeProto* getAttribute(const std::string& name) const = 0; virtual bool hasInput(int i) const = 0; - virtual bool hasOutput(int i) const = 0; + virtual bool hasOutput(int i) const = 0; virtual ~FunctionBodyBuildContext() {} }; struct FunctionBodyBuildContextImpl : public FunctionBodyBuildContext { - FunctionBodyBuildContextImpl(NodeProto& node_proto) : node_proto_(node_proto) { + FunctionBodyBuildContextImpl(NodeProto& node_proto) + : node_proto_(node_proto) { for (auto& attr : *node_proto.mutable_attribute()) { attributesByName_[attr.name()] = &attr; } @@ -58,17 +59,19 @@ struct FunctionBodyBuildContextImpl : public FunctionBodyBuildContext { if (i >= node_proto_.output_size()) return false; return node_proto_.output(i) != ""; - } + } std::unordered_map attributesByName_; NodeProto node_proto_; }; -using FunctionBodyQueryFunction = std::function; +using FunctionBodyQueryFunction = + std::function; class OpSchema; -using ContextDependentFunctionBodyBuilder = std::function; +using ContextDependentFunctionBodyBuilder = std::function< + bool(const FunctionBodyBuildContext&, const OpSchema&, FunctionProto&)>; class SchemaError final : public std::runtime_error { public: @@ -148,20 +151,39 @@ class OpSchema final { explicit FormalParameter( std::string name, - DataTypeSet type_set, + DataTypeSet allowed_type_set, std::string type_str, - std::string description, + const std::string& description, FormalParameterOption param_option = Single, bool is_homogeneous = true, - int min_arity = 1); + int min_arity = 1) + : name_(std::move(name)), + type_set_(std::move(allowed_type_set)), + type_str_(std::move(type_str)), +#ifndef __ONNX_NO_DOC_STRINGS + description_(description), +#endif + param_option_(param_option), + is_homogeneous_(is_homogeneous), + min_arity_(min_arity) { + } explicit FormalParameter( std::string name, - std::string description, + const std::string& description, std::string type_str, FormalParameterOption param_option = Single, bool is_homogeneous = true, - int min_arity = 1); + int min_arity = 1) + : name_(std::move(name)), + type_str_(std::move(type_str)), +#ifndef __ONNX_NO_DOC_STRINGS + description_(description), +#endif + param_option_(param_option), + is_homogeneous_(is_homogeneous), + min_arity_(min_arity) { + } // Get formal parameter name. const std::string& GetName() const; @@ -329,7 +351,14 @@ class OpSchema final { return *this; } - OpSchema& SetDoc(std::string doc); + OpSchema& SetDoc(const std::string& doc) { +#ifndef __ONNX_NO_DOC_STRINGS + doc_ = doc; +#else + ONNX_UNUSED_PARAMETER(doc); +#endif + return *this; + } // Functions to specify name for the operator schema. 
OpSchema& SetName(const char* name); @@ -464,7 +493,7 @@ class OpSchema final { OpSchema& Input( int n, std::string name, - std::string description, + const std::string& description, std::string type_str, FormalParameterOption param_option = Single, bool is_homogeneous = true, @@ -483,7 +512,7 @@ class OpSchema final { OpSchema& Output( int n, std::string name, - std::string description, + const std::string& description, std::string type_str, FormalParameterOption param_option = Single, bool is_homogeneous = true, @@ -663,7 +692,9 @@ class OpSchema final { } OpSchema& FunctionBody(const std::vector& func_nodes); - OpSchema& FunctionBody(const std::vector& func_nodes, const std::vector& opsets); + OpSchema& FunctionBody( + const std::vector& func_nodes, + const std::vector& opsets); const FunctionProto* GetFunction() const; @@ -671,9 +702,12 @@ class OpSchema final { return functionBuilder_ != nullptr; } - OpSchema& SetContextDependentFunctionBodyBuilder(ContextDependentFunctionBodyBuilder); - - bool BuildContextDependentFunction(const FunctionBodyBuildContext& ctx, FunctionProto& functionProto) const; + OpSchema& SetContextDependentFunctionBodyBuilder( + ContextDependentFunctionBodyBuilder); + + bool BuildContextDependentFunction( + const FunctionBodyBuildContext& ctx, + FunctionProto& functionProto) const; // Verifies that the schema is valid and all specifications are compatible. // It will also parse all type strings specified for inputs/outputs into valid @@ -946,7 +980,8 @@ OpSchema GetOpSchema(); ONNX_OPERATOR_SET_SCHEMA_EX(name, OnnxML, AI_ONNX_ML_DOMAIN, ver, true, impl) #define ONNX_TRAINING_OPERATOR_SET_SCHEMA(name, ver, impl) \ - ONNX_OPERATOR_SET_SCHEMA_EX(name, OnnxTraining, AI_ONNX_TRAINING_DOMAIN, ver, true, impl) + ONNX_OPERATOR_SET_SCHEMA_EX( \ + name, OnnxTraining, AI_ONNX_TRAINING_DOMAIN, ver, true, impl) // Defines specialization of GetOpSchema for a class whose name is determined // based on a convention using name, domain, and version. Operator schema are @@ -1044,4 +1079,49 @@ inline std::string GenerateBroadcastingDocUni( return ret; } +/* + * Macros for setting operator documentation + * Use this macro for simple SetDoc() calls that generate documentation + * directly. This is the macro to use in almost all cases. + * Sample usage guidelines: + * const char* doc_str = "foo"; + * SetDoc(GET_OP_DOC_STR(doc_str)) + * + * SetDoc(GET_OP_DOC_STR( + std::string(BitShift_ver11_doc) + GenerateBroadcastingDocMul())) + */ +#ifndef __ONNX_NO_DOC_STRINGS +#define GET_OP_DOC_STR(doc_str) (doc_str) +#else +#define GET_OP_DOC_STR(doc_str) ("") +#endif + +/* + * Use this macro when the documentation needs to be populated in some + * complicated way like string substitutions, etc before calling SetDoc. + * Sample usage guidelines: + std::string doc; + POPULATE_OP_DOC_STR( + doc = R"DOC( +Returns the tensor resulted from performing the `{name}` logical operation +elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting +support). 
+ +{broadcast_doc} +)DOC"; + ReplaceAll(doc, "{name}", name); + ReplaceAll( + doc, "{broadcast_doc}", GenerateBroadcastingDocMul().c_str());); + schema.SetDoc(doc); + * + */ +#ifndef __ONNX_NO_DOC_STRINGS +#define POPULATE_OP_DOC_STR(DocPopulatorCode) \ + do { \ + DocPopulatorCode \ + } while (0) +#else +#define POPULATE_OP_DOC_STR(DocPopulatorCode) +#endif + } // namespace ONNX_NAMESPACE diff --git a/onnx/defs/sequence/defs.cc b/onnx/defs/sequence/defs.cc index 571729cfb7b..fd4e5fd562f 100644 --- a/onnx/defs/sequence/defs.cc +++ b/onnx/defs/sequence/defs.cc @@ -23,7 +23,7 @@ ONNX_OPERATOR_SET_SCHEMA( "(Optional) The data type of the tensors in the output sequence. " "The default type is 'float'.", AttributeProto::INT, - OPTIONAL) + OPTIONAL_VALUE) .Output( 0, "output", diff --git a/onnx/defs/tensor/defs.cc b/onnx/defs/tensor/defs.cc index bc4b3afc2bb..66f54df264a 100644 --- a/onnx/defs/tensor/defs.cc +++ b/onnx/defs/tensor/defs.cc @@ -147,12 +147,12 @@ num_blocks[d] = floor((input_spatial_shape[d] + 2 * padding[d] - dilation[d] * ( "dilations", "Dilation value along each spatial axis of the extracted blocks. If not present, the dilation defaults to 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "strides", "Stride along each spatial axis of the input image. If not present, the stride defaults to 1 along each spatial axis.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.Attr( "pads", "Padding for the beginning and ending along each spatial axis, it can take any value greater " @@ -162,7 +162,7 @@ num_blocks[d] = floor((input_spatial_shape[d] + 2 * padding[d] - dilation[d] * ( "added at the beginning of axis `i` and xi_end, the number of pixels added at " "the end of axis `i`. If not present, the padding defaults to 0 along start and end of each spatial axis.", AttributeProto::INTS, - OPTIONAL); + OPTIONAL_VALUE); schema.TypeAndShapeInferenceFunction([](InferenceContext& ctx) { propagateElemTypeFromInputToOutput(ctx, 0, 0); unfoldToDepthShapeInference(ctx); @@ -566,7 +566,7 @@ ONNX_OPERATOR_SET_SCHEMA( "where r = rank(input).", AttributeProto::INT, static_cast(0)) - .Attr("split", "length of each output. Values should be >= 0.", AttributeProto::INTS, OPTIONAL) + .Attr("split", "length of each output. Values should be >= 0.", AttributeProto::INTS, OPTIONAL_VALUE) .SetDoc(Split_ver11_doc) .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { for (int i = 0; i < static_cast(ctx.getNumOutputs()); ++i) { @@ -912,7 +912,7 @@ ONNX_OPERATOR_SET_SCHEMA( "A list of integers. By default, reverse the dimensions, " "otherwise permute the axes according to the values given.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Input(0, "data", "An input tensor.", "T") .Output(0, "transposed", "Transposed output.", "T") .TypeConstraint( @@ -1464,7 +1464,7 @@ ONNX_OPERATOR_SET_SCHEMA( "List of integers indicating the dimensions to squeeze. Negative value means counting dimensions " "from the back. Accepted range is [-r, r-1] where r = rank(data).", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .SetDoc(Squeeze_ver11_doc) .Input(0, "data", "Tensors with at least max(dims) dimensions.", "T") .Output(0, "squeezed", "Reshaped tensor with same data as input.", "T") @@ -2007,7 +2007,7 @@ ONNX_OPERATOR_SET_SCHEMA( "input is flattened before elements being selected. Negative value means counting dimensions " "from the back. 
Accepted range is [-r, r-1] where r = rank(input).", AttributeProto::INT, - OPTIONAL) + OPTIONAL_VALUE) .Input(0, "input", "Tensor of rank r >= 1.", "T") .Input( 1, @@ -2514,7 +2514,7 @@ ONNX_OPERATOR_SET_SCHEMA( "flattened input are returned. Negative value means counting dimensions " "from the back. Accepted range is [-r, r-1] where r = rank(input).", AttributeProto::INT, - OPTIONAL) + OPTIONAL_VALUE) .Input(0, "X", "A N-D input tensor that is to be processed.", "T") .Output( 0, diff --git a/onnx/defs/tensor/old.cc b/onnx/defs/tensor/old.cc index 2c5af8d04a5..2c17914c3f4 100644 --- a/onnx/defs/tensor/old.cc +++ b/onnx/defs/tensor/old.cc @@ -136,7 +136,7 @@ ONNX_OPERATOR_SET_SCHEMA( "axis", "Which axis to concat on. Default value is 1.", AttributeProto::INT, - OPTIONAL) + OPTIONAL_VALUE) .SetDoc(Concat_ver1_doc) .Input( 0, @@ -251,8 +251,8 @@ ONNX_OPERATOR_SET_SCHEMA( "T", {"tensor(float16)", "tensor(float)", "tensor(double)"}, "Constrain input types to float tensors.") - .Attr("axis", "Which axis to split on", AttributeProto::INT, OPTIONAL) - .Attr("split", "length of each output", AttributeProto::INTS, OPTIONAL) + .Attr("axis", "Which axis to split on", AttributeProto::INT, OPTIONAL_VALUE) + .Attr("split", "length of each output", AttributeProto::INTS, OPTIONAL_VALUE) .SetDoc(Split_ver1_doc)); static const char* Pad_ver1_doc = R"DOC( @@ -318,7 +318,7 @@ ONNX_OPERATOR_SET_SCHEMA( 1, OpSchema() .SetDoc(Reshape_ver1_doc) - .Attr("shape", "New shape", AttributeProto::INTS, OPTIONAL) + .Attr("shape", "New shape", AttributeProto::INTS, OPTIONAL_VALUE) // This attribute was added via AllowConsumed API in OpSchema. // After removing the API, we're now using the Attr API to simulate the // old definition. @@ -326,7 +326,7 @@ ONNX_OPERATOR_SET_SCHEMA( "consumed_inputs", "legacy optimization attribute.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Input(0, "data", "An input tensor.", "T") .Output(0, "reshaped", "Reshaped data.", "T") .TypeConstraint( @@ -602,7 +602,7 @@ ONNX_OPERATOR_SET_SCHEMA( "It's optional. If not present, will be treated as " "[0, 1, ..., len(`starts`) - 1].", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "starts", "Starting indices of corresponding axis in `axes`", @@ -1167,7 +1167,7 @@ ONNX_OPERATOR_SET_SCHEMA( "axes", "List of non-negative integers, indicate the dimensions to squeeze.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .SetDoc(Squeeze_ver1_doc) .Input(0, "data", "Tensors with at least max(dims) dimensions.", "T") .Output(0, "squeezed", "Reshaped tensor with same data as input.", "T") @@ -1450,7 +1450,7 @@ ONNX_OPERATOR_SET_SCHEMA( "(Optional) Axis along which to take slices. If not specified, " "input is flattened before elements being selected.", AttributeProto::INT, - OPTIONAL) + OPTIONAL_VALUE) .Input(0, "input", "Tensor of rank r >= 1.", "T") .Input( 1, @@ -1500,7 +1500,7 @@ ONNX_OPERATOR_SET_SCHEMA( "Which axis to split on. 
", AttributeProto::INT, static_cast(0)) - .Attr("split", "length of each output", AttributeProto::INTS, OPTIONAL) + .Attr("split", "length of each output", AttributeProto::INTS, OPTIONAL_VALUE) .SetDoc(Split_ver2_doc) .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { for (int i = 0; i < static_cast(ctx.getNumOutputs()); ++i) { diff --git a/onnx/defs/traditionalml/defs.cc b/onnx/defs/traditionalml/defs.cc index 696e3e8cb8d..ad1496d6d66 100644 --- a/onnx/defs/traditionalml/defs.cc +++ b/onnx/defs/traditionalml/defs.cc @@ -146,12 +146,12 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "cats_strings", "The strings of the map. This sequence must be the same length as the 'cats_int64s' sequence", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "cats_int64s", "The integers of the map. This sequence must be the same length as the 'cats_strings' sequence.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "default_string", "A string to use when an input integer value is not found in the map.
One and only one of the 'default_*' attributes must be defined.", @@ -217,12 +217,12 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "string_vocabulary", "A string vocabulary array.
One and only one of the vocabularies must be defined.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "int64_vocabulary", "An integer vocabulary array.
One and only one of the vocabularies must be defined.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { auto input_elem_type = ctx.getInputType(0) ->map_type() @@ -267,7 +267,7 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "inputdimensions", "The size of each input in the input list", AttributeProto::INTS, - OPTIONAL)); + OPTIONAL_VALUE)); static const char* Imputer_ver1_doc = R"DOC( Replaces inputs that equal one value with another, leaving all other elements alone.
@@ -298,7 +298,7 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "imputed_value_floats", "Value(s) to change to", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "replaced_value_float", "A value that needs replacing.", @@ -308,7 +308,7 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "imputed_value_int64s", "Value(s) to change to.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "replaced_value_int64", "A value that needs replacing.", @@ -354,28 +354,28 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "keys_strings", "A list of strings. One and only one of 'keys_*'s should be set.", AttributeProto::STRINGS, - OPTIONAL) - .Attr("keys_int64s", "A list of ints.", AttributeProto::INTS, OPTIONAL) + OPTIONAL_VALUE) + .Attr("keys_int64s", "A list of ints.", AttributeProto::INTS, OPTIONAL_VALUE) .Attr( "keys_floats", "A list of floats.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "values_strings", "A list of strings. One and only one of 'value_*'s should be set.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "values_int64s", "A list of ints.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "values_floats", "A list of floats.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "default_string", "A string.", @@ -493,7 +493,7 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "intercepts", "A collection of intercepts.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "multi_class", "Indicates whether to do OvR or multinomial (0=OvR is the default).", @@ -503,12 +503,12 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "classlabels_strings", "Class labels when using string labels. One and only one 'classlabels' attribute must be defined.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "classlabels_ints", "Class labels when using integer labels. One and only one 'classlabels' attribute must be defined.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "post_transform", "Indicates the transform to apply to the scores vector.
One of 'NONE,' 'SOFTMAX,' 'LOGISTIC,' 'SOFTMAX_ZERO,' or 'PROBIT'", @@ -607,12 +607,12 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "coefficients", "Weights of the model(s).", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "intercepts", "Weights of the intercepts, if used.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "targets", "The total number of regression targets, 1 if not defined.", @@ -690,12 +690,12 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "cats_int64s", "List of categories, ints.
One and only one of the 'cats_*' attributes must be defined.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "cats_strings", "List of categories, strings.
One and only one of the 'cats_*' attributes must be defined.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "zeros", "If true and a category is not present, will return all zeros; if false and a category is not found, the operator will fail.", @@ -724,12 +724,12 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "offset", "First, offset by this.
Can be length of features in an [N,F] tensor or length 1, in which case it applies to all features, regardless of dimension count.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "scale", "Second, multiply by this.
Can be length of features in an [N,F] tensor or length 1, in which case it applies to all features, regardless of dimension count.
Must be same length as 'offset'", AttributeProto::FLOATS, - OPTIONAL)); + OPTIONAL_VALUE)); static const char* SVMClassifier_ver1_doc = R"DOC( Support Vector Machine classifier @@ -767,21 +767,21 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "kernel_params", "List of 3 elements containing gamma, coef0, and degree, in that order. Zero if unused for the kernel.", AttributeProto::FLOATS, - OPTIONAL) - .Attr("vectors_per_class", "", AttributeProto::INTS, OPTIONAL) - .Attr("support_vectors", "", AttributeProto::FLOATS, OPTIONAL) - .Attr("coefficients", "", AttributeProto::FLOATS, OPTIONAL) + OPTIONAL_VALUE) + .Attr("vectors_per_class", "", AttributeProto::INTS, OPTIONAL_VALUE) + .Attr("support_vectors", "", AttributeProto::FLOATS, OPTIONAL_VALUE) + .Attr("coefficients", "", AttributeProto::FLOATS, OPTIONAL_VALUE) .Attr( "prob_a", "First set of probability coefficients.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "prob_b", "Second set of probability coefficients. This array must be same size as prob_a.
If these are provided then output Z are probability estimates, otherwise they are raw scores.", AttributeProto::FLOATS, - OPTIONAL) - .Attr("rho", "", AttributeProto::FLOATS, OPTIONAL) + OPTIONAL_VALUE) + .Attr("rho", "", AttributeProto::FLOATS, OPTIONAL_VALUE) .Attr( "post_transform", "Indicates the transform to apply to the score.
One of 'NONE,' 'SOFTMAX,' 'LOGISTIC,' 'SOFTMAX_ZERO,' or 'PROBIT'", @@ -791,12 +791,12 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "classlabels_strings", "Class labels if using string labels.
One and only one of the 'classlabels_*' attributes must be defined.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "classlabels_ints", "Class labels if using integer labels.
One and only one of the 'classlabels_*' attributes must be defined.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { std::vector label_strs; auto result = @@ -841,12 +841,12 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "kernel_params", "List of 3 elements containing gamma, coef0, and degree, in that order. Zero if unused for the kernel.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "support_vectors", "Chosen support vectors", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "one_class", "Flag indicating whether the regression is a one-class SVM or not.", @@ -856,7 +856,7 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "coefficients", "Support vector coefficients.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "n_supports", "The number of support vectors.", @@ -867,7 +867,7 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "Indicates the transform to apply to the score.
One of 'NONE,' 'SOFTMAX,' 'LOGISTIC,' 'SOFTMAX_ZERO,' or 'PROBIT.'", AttributeProto::STRING, std::string("NONE")) - .Attr("rho", "", AttributeProto::FLOATS, OPTIONAL)); + .Attr("rho", "", AttributeProto::FLOATS, OPTIONAL_VALUE)); static const char* TreeEnsembleClassifier_ver1_doc = R"DOC( Tree Ensemble classifier. Returns the top class for each of N inputs.
@@ -908,77 +908,77 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "nodes_treeids", "Tree id for each node.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_nodeids", "Node id for each node. Ids may restart at zero for each tree, but it is not required to.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_featureids", "Feature id for each node.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_values", "Thresholds to do the splitting on for each node.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_hitrates", "Popularity of each node, used for performance and may be omitted.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_modes", "The node kind, that is, the comparison to make at the node. There is no comparison to make at a leaf node.
One of 'BRANCH_LEQ', 'BRANCH_LT', 'BRANCH_GTE', 'BRANCH_GT', 'BRANCH_EQ', 'BRANCH_NEQ', 'LEAF'", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_truenodeids", "Child node if expression is true.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_falsenodeids", "Child node if expression is false.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_missing_value_tracks_true", "For each node, define what to do in the presence of a missing value: if a value is missing (NaN), use the 'true' or 'false' branch based on the value in this array.
This attribute may be left undefined, and the default value is false (0) for all nodes.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "class_treeids", "The id of the tree that this node is in.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "class_nodeids", "Node id that this weight is for.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "class_ids", "The index of the class list that each weight is for.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "class_weights", "The weight for the class in class_id.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "classlabels_strings", "Class labels if using string labels.
One and only one of the 'classlabels_*' attributes must be defined.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "classlabels_int64s", "Class labels if using integer labels.
One and only one of the 'classlabels_*' attributes must be defined.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "post_transform", "Indicates the transform to apply to the score.
One of 'NONE,' 'SOFTMAX,' 'LOGISTIC,' 'SOFTMAX_ZERO,' or 'PROBIT.'", @@ -988,7 +988,7 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "base_values", "Base values for classification, added to final class score; the size must be the same as the classes or can be left unassigned (assumed 0)", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { std::vector label_strs; auto result = @@ -1033,72 +1033,72 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "nodes_treeids", "Tree id for each node.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_nodeids", "Node id for each node. Node ids must restart at zero for each tree and increase sequentially.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_featureids", "Feature id for each node.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_values", "Thresholds to do the splitting on for each node.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_hitrates", "Popularity of each node, used for performance and may be omitted.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_modes", "The node kind, that is, the comparison to make at the node. There is no comparison to make at a leaf node.
One of 'BRANCH_LEQ', 'BRANCH_LT', 'BRANCH_GTE', 'BRANCH_GT', 'BRANCH_EQ', 'BRANCH_NEQ', 'LEAF'", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_truenodeids", "Child node if expression is true", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_falsenodeids", "Child node if expression is false", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "nodes_missing_value_tracks_true", "For each node, define what to do in the presence of a NaN: use the 'true' (if the attribute value is 1) or 'false' (if the attribute value is 0) branch based on the value in this array.
This attribute may be left undefined and the default value is false (0) for all nodes.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "target_treeids", "The id of the tree that each node is in.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "target_nodeids", "The node id of each weight.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "target_ids", "The index of the target that each weight is for.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "target_weights", "The weight for each target.", AttributeProto::FLOATS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "n_targets", "The total number of targets.", AttributeProto::INT, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "post_transform", "Indicates the transform to apply to the score.
One of 'NONE,' 'SOFTMAX,' 'LOGISTIC,' 'SOFTMAX_ZERO,' or 'PROBIT'", @@ -1113,7 +1113,7 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "base_values", "Base values for classification, added to final class score; the size must be the same as the classes or can be left unassigned (assumed 0)", AttributeProto::FLOATS, - OPTIONAL)); + OPTIONAL_VALUE)); static const char* ZipMap_ver1_doc = R"DOC( Creates a map from the input and the attributes.
@@ -1137,12 +1137,12 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "classlabels_strings", "The keys when using string keys.
One and only one of the 'classlabels_*' attributes must be defined.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "classlabels_int64s", "The keys when using int keys.
One and only one of the 'classlabels_*' attributes must be defined.", AttributeProto::INTS, - OPTIONAL) + OPTIONAL_VALUE) .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { std::vector classlabels_strings; bool result = getRepeatedAttribute( diff --git a/onnx/defs/traditionalml/old.cc b/onnx/defs/traditionalml/old.cc index 81f7646d0c3..cd3d7dddd37 100644 --- a/onnx/defs/traditionalml/old.cc +++ b/onnx/defs/traditionalml/old.cc @@ -37,7 +37,7 @@ ONNX_ML_OPERATOR_SET_SCHEMA( "classes_strings", "A list of labels.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "default_int64", "An integer to use when an input string value is not found in the map.
One and only one of the 'default_*' attributes must be defined.", diff --git a/onnx/defs/training/defs.cc b/onnx/defs/training/defs.cc index 96d3f7c873e..607f469a752 100644 --- a/onnx/defs/training/defs.cc +++ b/onnx/defs/training/defs.cc @@ -177,7 +177,7 @@ ONNX_TRAINING_OPERATOR_SET_SCHEMA( "intermediate variables) that can be generated from inputs " "cannot be included in this attribute.", AttributeProto::STRINGS, - OPTIONAL) + OPTIONAL_VALUE) .Attr( "y", "The targeted tensor. It can be viewed as the output of the " @@ -461,4 +461,154 @@ ONNX_TRAINING_OPERATOR_SET_SCHEMA( propagateShapeFromInputToOutput(ctx, i_in, i_out); }})); -} // namespace ONNX_NAMESPACE +static const char* Momentum_ver1_doc = R"DOC( + Compute one iteration of stochastic gradient update with momentum. + This operator can conduct the optimization of multiple tensor variables. + + Let's define the behavior of this operator. SG with momentum requires + several parameters: + + - The learning-rate "R". + - The update count "T". That is, the number of conducted training iterations. It should + be zero in the first training iteration. + - An L2-norm regularization coefficient "norm_coefficient". + - A decay coefficient of the previous accumulated gradient (i.e., momentum), "alpha". + - The scaling coefficient of the current gradient, "beta". + - An attribute "mode" that selects whether standard momentum or Nesterov's momentum should + be used. + + For the sake of simplicity, assume that there is only one tensor (called "X") to be optimized. + Other necessary inputs are "X"'s gradient (called "G") and "X"'s momentum (called "V"). This + Momentum operator maps all these inputs to the new value of "X" (called "X_new") and its new + momentum (called "V_new"). + + This operator supports two different momentum algorithms. Set the attribute "mode" to + "nesterov" if Nesterov's momentum is desired. Otherwise, set the attribute "mode" to + "standard" to use standard momentum. Computation details are described subsequently. + + Let "+", "-", "*", and "/" be element-wise operations with numpy-style broadcasting. + + Pseudo code for SG with standard momentum: + + // Add gradient of 0.5 * norm_coefficient * ||X||^2, where ||X||^2 is the sum of squared + // values of all elements in X. + G_regularized = norm_coefficient * X + G + + // In the first training iteration, beta should always be 1. + beta_adjusted = T > 0 ? beta : 1 + + // Compute the current momentum based on previous momentum and the current gradient. + V_new = alpha * V + beta_adjusted * G_regularized + + // Update X. + X_new = X - R * V_new + + Pseudo code for SG with Nesterov's momentum: + + // Add gradient of 0.5 * norm_coefficient * ||X||^2, where ||X||^2 is the sum of squared + // values of all elements in X. + G_regularized = norm_coefficient * X + G; + + // In the first training iteration, beta should always be 1. + beta_adjusted = T > 0 ? beta : 1 + + // Compute the current momentum based on previous momentum and the current gradient. + V_new = alpha * V + beta_adjusted * G_regularized; + + // Compute final update direction and then update X. + X_new = X - R * (G_regularized + alpha * V_new) + + If this operator is assigned to optimize multiple inputs, for example "X_1" and "X_2", the same + pseudo code is extended to handle all tensors jointly. More specifically, we can view "X" as a + concatenation of "X_1" and "X_2" (of course, their gradients and momentums should + be concatenated too) and then the pseudo code above becomes applicable. 
+)DOC"; + +ONNX_TRAINING_OPERATOR_SET_SCHEMA( + Momentum, + 1, + OpSchema() + .SetDoc(Momentum_ver1_doc) + .Input(0, "R", "The learning rate.", "T1") + .Input(1, "T", "Update count of \"X\". It should be a scalar.", "T2") + .Input( + 2, + "inputs", + "It sequentially contains the current values of optimized tensors, then their " + "gradient tensors, and finally their momentum tensors. For example, if two tensors " + "\"X_1\" and \"X_2\" are optimized, The expected input list would be " + "[\"X_1\", \"X_2\", gradient of \"X_1\", gradient of \"X_2\", momentum of \"X_1\", momentum of \"X_2\"].", + "T3", + OpSchema::Variadic, + false) + .Output( + 0, + "outputs", + "It sequentially contains the new values of optimized tensors and then the new " + "values of their momentum tensors. For example, if two tensors \"X_1\" and \"X_2\" are " + "optimized, the output list would be [new value of \"X_1,\" new value of \"X_2\" " + "new momentum of \"X_1\", new momentum of \"X_2\"].", + "T3", + OpSchema::Variadic, + false) + .Attr( + "alpha", + "The decay factor of momentum. It should be a scalar.", + AttributeProto::FLOAT) + .Attr( + "beta", + "The coefficient of gradient in computing new momentum. It should be a scalar.", + AttributeProto::FLOAT) + .Attr( + "norm_coefficient", + "Coefficient of 0.5 * norm_coefficient * ||X||^2.", + AttributeProto::FLOAT) + .Attr( + "mode", + "Its value should be either \"nesterov\" or \"standard\". The value \"nesterov\" leads " + "to the use of Nesterov's momentum while \"standard\" invokes stochastic gradient method " + "using standard momentum", + AttributeProto::STRING) + .TypeConstraint( + "T1", + {"tensor(float)", "tensor(double)"}, + "Constrain input types to float scalars.") + .TypeConstraint( + "T2", + {"tensor(int64)"}, + "Constrain input types to 64-bit integer scalars.") + .TypeConstraint( + "T3", + {"tensor(float)", "tensor(double)"}, + "Constrain input types to float tensors.") + .TypeAndShapeInferenceFunction([](InferenceContext& ctx) { + // Assume that the input list is [R, T, X1, X2, G1, G2, V1, V2] and + // output list is [X1_new, X2_new, V1_new, V2_new] for explaining + // the code below in a simpler way. + + // The count of input tensors excluding "R" and "T". + auto num_adjustable_tensors = ctx.getNumInputs() - 2; + + // Check number of (optimized tensor, gradient, momentum) tuples. + if (num_adjustable_tensors % 3 != 0) + fail_shape_inference( + "The sum of optimized tensor count and momentum tensor count ", + "should be a multiple of 2 in the input list of Momentum operator"); + + // The count of "X1" and "X2". + auto num_optimized_tensors = num_adjustable_tensors / 3; + for (size_t i = 0; i < num_optimized_tensors; ++i){ + // Pass X1's/X2's shapes to X1_new/X2_new. + size_t i_in = 2 + i; + size_t i_out = i; + propagateElemTypeFromInputToOutput(ctx, i_in, i_out); + propagateShapeFromInputToOutput(ctx, i_in, i_out); + // Pass V1's/V2's shapes to V1_new/V2_new. 
+ i_in = 2 + 2 * num_optimized_tensors + i; + i_out = i + num_optimized_tensors; + propagateElemTypeFromInputToOutput(ctx, i_in, i_out); + propagateShapeFromInputToOutput(ctx, i_in, i_out); + } + })); + +} // namespace ONNX_NAMESPACE \ No newline at end of file diff --git a/onnx/optimizer/pass_manager.cc b/onnx/optimizer/pass_manager.cc index 8f1ebe38ba5..7ad365a0636 100644 --- a/onnx/optimizer/pass_manager.cc +++ b/onnx/optimizer/pass_manager.cc @@ -14,7 +14,7 @@ void GeneralPassManager::add(std::shared_ptr pass) { } std::shared_ptr GeneralPassManager::run(Graph& graph) { - for (std::shared_ptr pass : this->passes) { + for (const std::shared_ptr& pass : this->passes) { auto pass_analysis = pass->runPass(graph); } return std::shared_ptr(new EmptyPassManagerAnalysis()); @@ -25,7 +25,7 @@ std::shared_ptr FixedPointPassManager::run(Graph& graph) { do { fixed_point_optimization_done = false; - for (std::shared_ptr pass : this->passes) { + for (const std::shared_ptr& pass : this->passes) { std::shared_ptr analysis = pass->runPass(graph); if (pass->getPassAnalysisType() == PassAnalysisType::Empty) { continue; diff --git a/onnx/shape_inference/implementation.cc b/onnx/shape_inference/implementation.cc index 237ca6c22e2..f1a0b16c68e 100644 --- a/onnx/shape_inference/implementation.cc +++ b/onnx/shape_inference/implementation.cc @@ -354,12 +354,12 @@ void InferShapeForFunctionNode( NodeProto copy_n(n); // Add attribute information into the temporary node copy_n.clear_attribute(); - for (auto attr : n.attribute()) { + for (const auto& attr : n.attribute()) { if (attr.has_ref_attr_name()) { if (attr_map.count(attr.ref_attr_name())) { auto copy_attr = *attr_map[attr.ref_attr_name()]; copy_attr.set_name(attr.name()); - copy_n.add_attribute()->CopyFrom(std::move(copy_attr)); + copy_n.add_attribute()->CopyFrom(copy_attr); } } else { copy_n.add_attribute()->CopyFrom(attr); diff --git a/onnx/test/shape_inference_test.py b/onnx/test/shape_inference_test.py index 9e6e121c84a..24df2c4797c 100644 --- a/onnx/test/shape_inference_test.py +++ b/onnx/test/shape_inference_test.py @@ -2893,6 +2893,47 @@ def test_adagrad_multiple(self): # type: () -> None make_tensor_value_info('H2_new', TensorProto.FLOAT, (3, 4))], opset_imports=[helper.make_opsetid('', 12), helper.make_opsetid('ai.onnx.training', 1)]) + def test_momentum(self): # type: () -> None + graph = self._make_graph( + [('R', TensorProto.FLOAT, ()), # scalar's shape is () + ('T', TensorProto.INT64, ()), # scalar's shape is () + ('X', TensorProto.FLOAT, (1, 2)), + ('G', TensorProto.FLOAT, (1, 2)), + ('V', TensorProto.FLOAT, (1, 2))], + [make_node('Momentum', ['R', 'T', 'X', 'G', 'V'], ['X_new', 'V_new'], + alpha=0.9, beta=1.0, norm_coefficient=0.02, mode='standard', + domain='ai.onnx.training')], + []) + self._assert_inferred( + graph, + [make_tensor_value_info('X_new', TensorProto.FLOAT, (1, 2)), + make_tensor_value_info('V_new', TensorProto.FLOAT, (1, 2))], + opset_imports=[helper.make_opsetid('', 12), helper.make_opsetid('ai.onnx.training', 1)]) + + def test_momentum_multiple(self): # type: () -> None + graph = self._make_graph( + [('R', TensorProto.FLOAT, ()), # scalar's shape is () + ('T', TensorProto.INT64, ()), # scalar's shape is () + ('X1', TensorProto.FLOAT, (1, 2)), + ('X2', TensorProto.FLOAT, (3, 4)), + ('G1', TensorProto.FLOAT, (1, 2)), + ('G2', TensorProto.FLOAT, (3, 4)), + ('V1', TensorProto.FLOAT, (1, 2)), + ('V2', TensorProto.FLOAT, (3, 4))], + [make_node('Momentum', ['R', 'T', 'X1', 'X2', 'G1', 'G2', 'V1', 'V2'], + ['X1_new', 'X2_new', 
'V1_new', 'V2_new'], + alpha=0.9, beta=1.0, norm_coefficient=0.02, mode='nesterov', + domain='ai.onnx.training')], + []) + + self._assert_inferred( + graph, + [make_tensor_value_info('X1_new', TensorProto.FLOAT, (1, 2)), + make_tensor_value_info('X2_new', TensorProto.FLOAT, (3, 4)), + make_tensor_value_info('V1_new', TensorProto.FLOAT, (1, 2)), + make_tensor_value_info('V2_new', TensorProto.FLOAT, (3, 4))], + opset_imports=[helper.make_opsetid('', 12), helper.make_opsetid('ai.onnx.training', 1)]) + def test_pad_opset10(self): # type: () -> None graph = self._make_graph( [('x', TensorProto.FLOAT, (1, None, 2))], diff --git a/onnx/version_converter/convert.cc b/onnx/version_converter/convert.cc index 256b2e405db..c2e4438ea0c 100644 --- a/onnx/version_converter/convert.cc +++ b/onnx/version_converter/convert.cc @@ -22,8 +22,8 @@ ModelProto DefaultVersionConverter::convert_version( const ModelProto& mp_in, const OpSetID& initial_version, const OpSetID& target_version) const { - const std::string initial_domain = initial_version.domain(); - const std::string target_domain = target_version.domain(); + const std::string& initial_domain = initial_version.domain(); + const std::string& target_domain = target_version.domain(); assertDefaultDomain(initial_domain, target_domain); for (auto it = mp_in.opset_import().begin(); it != mp_in.opset_import() diff --git a/third_party/pybind11 b/third_party/pybind11 index a1041190c8b..80d452484c5 160000 --- a/third_party/pybind11 +++ b/third_party/pybind11 @@ -1 +1 @@ -Subproject commit a1041190c8b8ff0cd9e2f0752248ad5e3789ea0c +Subproject commit 80d452484c5409444b0ec19383faa84bb7a4d351
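For reviewers who want to sanity-check the Momentum spec added above, here is a minimal NumPy sketch of the update rule from Momentum_ver1_doc. It is an illustrative aid, not part of the patch; the helper name `apply_momentum` and the sample values are made up for the example:

```python
import numpy as np

def apply_momentum(r, t, x, g, v, norm_coefficient, alpha, beta, mode="standard"):
    # Gradient of 0.5 * norm_coefficient * ||X||^2 is norm_coefficient * X.
    g_regularized = norm_coefficient * x + g
    # In the first training iteration (T == 0), beta is forced to 1.
    beta_adjusted = beta if t > 0 else 1.0
    # New momentum blends the previous momentum with the regularized gradient.
    v_new = alpha * v + beta_adjusted * g_regularized
    if mode == "standard":
        x_new = x - r * v_new
    else:  # "nesterov"
        x_new = x - r * (g_regularized + alpha * v_new)
    return x_new, v_new

# Mirrors test_momentum: a single (1, 2) tensor updated in "standard" mode.
x = np.array([[1.0, 2.0]], dtype=np.float32)
g = np.array([[0.1, 0.2]], dtype=np.float32)
v = np.zeros((1, 2), dtype=np.float32)
x_new, v_new = apply_momentum(0.1, 0, x, g, v,
                              norm_coefficient=0.02, alpha=0.9, beta=1.0)
```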