[DML] MatrixMultiplyIntegerToFloat by AnaghaRaoAMD · Pull Request #19608 · microsoft/onnxruntime

AnaghaRaoAMD · 2024-02-22T17:16:41Z

Description

DML Implementation for com.microsoft.MatMulIntegerToFloat

.\onnxruntime_test_all.exe --gtest_filter="*MatMulIntegerToFloat.*"
Note: Google Test filter = *MatMulIntegerToFloat.*
[==========] Running 22 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 22 tests from MatMulIntegerToFloat
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_S8S8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_S8S8 (620 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_S8S8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_S8S8 (497 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8S8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8S8 (488 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_S8S8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_S8S8 (503 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_U8U8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_U8U8 (495 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_U8U8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_U8U8 (488 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_U8U8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_U8U8 (492 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_U8X8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_U8X8 (502 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_S8U8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_S8U8 (452 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_S8U8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_S8U8 (454 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8U8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8U8 (446 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_S8U8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_S8U8 (508 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_U8S8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_U8S8 (456 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_U8S8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_U8S8 (455 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_U8S8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_U8S8 (447 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_U8S8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_U8S8 (465 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_U8U8
[       OK ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_U8U8 (111 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_U8S8
[       OK ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_U8S8 (115 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_S8S8
[       OK ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_S8S8 (114 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_S8U8
[       OK ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_S8U8 (110 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16
[       OK ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16 (112 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulInteger_With_ZeroPoint
[       OK ] MatMulIntegerToFloat.MatMulInteger_With_ZeroPoint (337 ms)
[----------] 22 tests from MatMulIntegerToFloat (8679 ms total)

[----------] Global test environment tear-down
[==========] 22 tests from 1 test suite ran. (8680 ms total)
[  PASSED  ] 22 tests.
memleakdbg:
----- No memory leaks detected -----

Motivation and Context

* `CalculateMatMulIntegerToFloat` replaces the CPU EP run as the reference.
* Added more FP32 test cases to isolate all input data type combinations.
* Added fixed input to the `MatMulIntegerToFloat_FP16*` test cases.
* `onnxruntime/test/testdata/matmul_integer_to_float.py` is capable of generating FP16 models, but we do not produce any for now.
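
For readers unfamiliar with the operator, here is a minimal sketch of the kind of reference computation a helper such as `CalculateMatMulIntegerToFloat` could perform. It is not the code from this PR: the function name, the pure-Python list representation, and the restriction to scalar (per-tensor) scales and zero points are assumptions made for illustration.

```python
def matmul_integer_to_float_reference(a, b, a_scale, b_scale,
                                      a_zero_point=0, b_zero_point=0, bias=None):
    """Hypothetical reference: Y = (A - a_zp) @ (B - b_zp) * a_scale * b_scale + bias.

    Assumes scalar scales and zero points; a and b are lists of rows of ints.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    result = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            # Accumulate in integers so the reference itself adds no rounding error.
            acc = 0
            for k in range(inner):
                acc += (a[i][k] - a_zero_point) * (b[k][j] - b_zero_point)
            value = acc * a_scale * b_scale
            if bias is not None:
                value += bias[j]  # bias is applied per output column
            out_row.append(value)
        result.append(out_row)
    return result


y = matmul_integer_to_float_reference(
    [[1, 2], [3, 4]], [[5, 6], [7, 8]],
    a_scale=0.5, b_scale=0.25, a_zero_point=1, b_zero_point=2,
    bias=[1.0, -1.0],
)
print(y)  # [[1.625, -0.25], [3.625, 2.25]]
```

Computing the expected values directly, rather than running the CPU EP, keeps the reference independent of any execution provider's rounding behavior.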

[Cherry Pick Reviewed] Commit all MatrixMultiplyIntegerToFloat PRs:

* [MatrixMultiplyIntegerToFloat](https://github.com/microsoft/onnxruntime/pull/18275/commits/bf642a4d35691a13ff0ecef11cb8a9571c5a5610) (https://github.com/microsoft/onnxruntime/pull/16804)
* [MatMulIntToFloat Enable FP16 and update tensor ORT-DML indexing](https://github.com/microsoft/onnxruntime/pull/18275/commits/8237548d14f11a165a9b82bf181f8762e65f6142) (https://github.com/microsoft/onnxruntime/pull/16871)
* [Disable MatMulIntegerToFloat transformation for FP16 on CPU EP](https://github.com/microsoft/onnxruntime/pull/18275/commits/b16bf809dea31872ccb664f2622711966078e3f5) (https://github.com/microsoft/onnxruntime/pull/18239)

### Description

MatMulIntegerToFloat tests were noticed to be failing for the DML EP; the root cause was inaccuracies in the CPU EP implementation for some data type combinations.

```
.\onnxruntime_test_all.exe --gtest_filter="*MatMulIntegerToFloat.*"
[==========] 22 tests from 1 test suite ran. (8680 ms total)
[  PASSED  ] 22 tests.
memleakdbg:
----- No memory leaks detected -----
```

### Motivation and Context

* `CalculateMatMulIntegerToFloat` replaces the CPU EP run as the reference.
* Added more FP32 test cases to isolate all input data type combinations.
* Added fixed input to the `MatMulIntegerToFloat_FP16*` test cases. The gtest framework has no support for comparing `onnxruntime::MLFloat16` values directly. This leads to an FP32 reference -> FP16 tensor -> FP32 reference conversion, which adds inaccuracies.

![image](https://github.com/microsoft/onnxruntime/assets/127366241/c6aaf68e-44df-42be-9860-df2cb0dd7a56)

* Removed `MatMulIntegerToFloatHelper`, as it is the same as `MatMulHelper`.
* `onnxruntime/test/testdata/matmul_integer_to_float.py` is still capable of generating FP16 models, but we do not produce any for now.
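
The FP32 -> FP16 -> FP32 round trip mentioned above can be illustrated with a small sketch. It uses Python's `struct` half-precision format (`"e"`) as a stand-in for `onnxruntime::MLFloat16`; the helper name `roundtrip_fp16` and the sample value are assumptions for illustration, not code from the test suite.

```python
import struct


def roundtrip_fp16(value):
    """Round a float to IEEE 754 half precision, then widen it back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


reference = 0.1  # hypothetical FP32 reference value
as_half = roundtrip_fp16(reference)

# The value changes in the round trip, so comparing against the original
# reference needs a tolerance sized for half precision.
print(as_half, abs(as_half - reference))

# Values exactly representable in half precision survive unchanged.
print(roundtrip_fp16(0.5) == 0.5)  # True
```

Fixed inputs make this rounding deterministic from run to run, which is one plausible reason for preferring them over random inputs in the FP16 tests.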

AnaghaRaoAMD · 2024-03-01T19:27:02Z

/azp run Big Models, Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline

azure-pipelines · 2024-03-01T19:27:21Z

Azure Pipelines successfully started running 3 pipeline(s).

AnaghaRaoAMD · 2024-03-01T19:40:31Z

/azp run Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, Linux QNN CI Pipeline, MacOS CI Pipeline, Windows ARM64 QNN CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows x64 QNN CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

azure-pipelines · 2024-03-01T19:40:38Z

You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.

AnaghaRaoAMD · 2024-03-01T19:41:44Z

/azp run Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, Linux QNN CI Pipeline, MacOS CI Pipeline, Windows ARM64 QNN CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline

azure-pipelines · 2024-03-01T19:42:20Z

Azure Pipelines successfully started running 9 pipeline(s).

AnaghaRaoAMD · 2024-03-01T19:42:34Z

/azp run Windows x64 QNN CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

azure-pipelines · 2024-03-01T19:42:55Z

Azure Pipelines successfully started running 5 pipeline(s).

tianleiwu · 2024-03-05T05:28:38Z

@raoanag, please adjust the test threshold (always use abs error and not relative error?). Here is an example test failure:
NoZeroPoint_HasBias_test_U8S8
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:272
The difference between cur_expected[i] and cur_actual[i] is 6.6161155700683594e-06, which exceeds *(params.relative_error) * std::abs(cur_expected[i]), where
cur_expected[i] evaluates to 0.00020684301853179932,
cur_actual[i] evaluates to 0.00020022690296173096, and
*(params.relative_error) * std::abs(cur_expected[i]) evaluates to 4.1368602978764102e-06.
i:211
Google Test trace:
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:484: provider type: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/providers/base_tester.cc:791: registered execution providers: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 1183722559

fs-eire · 2024-03-06T00:46:30Z

WebAssembly CI also fails on test "MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8S8".

build log link

[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8S8
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:272: Failure
The difference between cur_expected[i] and cur_actual[i] is 1.9073486328125e-06, which exceeds *(params.relative_error) * std::abs(cur_expected[i]), where
cur_expected[i] evaluates to 1.9073486328125e-06,
cur_actual[i] evaluates to 0, and
*(params.relative_error) * std::abs(cur_expected[i]) evaluates to 3.8146971803598717e-08.
i:497
Google Test trace:
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:484: provider type: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/providers/base_tester.cc:791: registered execution providers: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 1298328783

/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:272: Failure
The difference between cur_expected[i] and cur_actual[i] is 1.9073486328125e-06, which exceeds *(params.relative_error) * std::abs(cur_expected[i]), where
cur_expected[i] evaluates to 1.9073486328125e-06,
cur_actual[i] evaluates to 0, and
*(params.relative_error) * std::abs(cur_expected[i]) evaluates to 3.8146971803598717e-08.
i:497
Google Test trace:
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:484: provider type: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/providers/base_tester.cc:791: registered execution providers: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 1298328783

/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:272: Failure
The difference between cur_expected[i] and cur_actual[i] is 1.3113021850585938e-06, which exceeds *(params.relative_error) * std::abs(cur_expected[i]), where
cur_expected[i] evaluates to 1.3113021850585938e-06,
cur_actual[i] evaluates to -0, and
*(params.relative_error) * std::abs(cur_expected[i]) evaluates to 2.6226043559063328e-08.
i:497
Google Test trace:
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:484: provider type: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/providers/base_tester.cc:791: registered execution providers: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 1298328783

/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:272: Failure
The difference between cur_expected[i] and cur_actual[i] is 1.3113021850585938e-06, which exceeds *(params.relative_error) * std::abs(cur_expected[i]), where
cur_expected[i] evaluates to 1.3113021850585938e-06,
cur_actual[i] evaluates to -0, and
*(params.relative_error) * std::abs(cur_expected[i]) evaluates to 2.6226043559063328e-08.
i:497
Google Test trace:
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:484: provider type: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/providers/base_tester.cc:791: registered execution providers: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 1298328783

[  FAILED  ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8S8 (7 ms)

### Description

Check that float/double/float16/bfloat16 tensors are close, like [numpy.isclose](https://numpy.org/doc/stable/reference/generated/numpy.isclose.html).

```
absolute(a - b) <= (atol + rtol * absolute(b))
```

The default tolerance thresholds:

- float: atol=1e-5 and rtol=1e-4
- float16: atol=0.0025 and rtol=0.001
- bfloat16: atol=0.02 and rtol=0.01

### Motivation and Context

The current pipeline fails frequently because only relative tolerance is used in #19608:

[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_U8S8
1: C:\a\_work\1\s\onnxruntime\test\providers\checkers.cc(272): error: The difference between cur_expected[i] and cur_actual[i] is 1.3113021850585938e-06, which exceeds *(params.relative_error) * std::abs(cur_expected[i]), where
1: cur_expected[i] evaluates to -1.3113021850585938e-06,
1: cur_actual[i] evaluates to 0, and
1: *(params.relative_error) * std::abs(cur_expected[i]) evaluates to 2.6226043559063328e-08.

It is not reasonable to use relative tolerance alone for a small value very close to 0. Combining relative tolerance with a positive absolute tolerance avoids this issue.
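
A minimal sketch of why the combined check helps, using the values from the failure above. The function names are hypothetical, and the relative tolerance of 0.02 is inferred from the log (2.6226e-08 / 1.3113e-06), not read from the test source.

```python
def relative_only(actual, expected, rtol):
    """Relative-only check, mirroring the failing comparison in the log."""
    return abs(actual - expected) <= rtol * abs(expected)


def is_close(actual, expected, atol=1e-5, rtol=1e-4):
    """numpy.isclose-style check: |a - b| <= atol + rtol * |b| (float defaults above)."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


expected = -1.3113021850585938e-06
actual = 0.0

# The allowed error shrinks with the expected value, so near zero it is tiny.
print(relative_only(actual, expected, rtol=0.02))  # False

# The absolute term keeps a floor under the allowed error.
print(is_close(actual, expected))  # True
```

With a relative-only check, an actual value of exactly 0 can never match a nonzero expected value for any `rtol` below 1, no matter how small the expected value is.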


github-advanced-security AI found potential problems Feb 22, 2024

Comment thread onnxruntime/test/testdata/transform/fusion/matmul_integer_to_float.py Fixed

AnaghaRaoAMD force-pushed the user/anarao/DMLEPMatMulInt2Flt branch 5 times, most recently from ce6aa75 to 3cec30a Compare February 23, 2024 19:10

AnaghaRaoAMD marked this pull request as ready for review February 23, 2024 20:09

AnaghaRaoAMD requested a review from a team as a code owner February 23, 2024 20:09

AnaghaRaoAMD requested review from tbqh and yufenglee February 23, 2024 20:09

AnaghaRaoAMD and others added 6 commits February 27, 2024 12:11

Doc updates

9cceffa

Resolve conflicts

1c74a29

Lint runner

88f988e

adding back 120 character

795241c

AnaghaRaoAMD force-pushed the user/anarao/DMLEPMatMulInt2Flt branch from 139236a to ddf8f78 Compare February 27, 2024 21:28

Linx Build fix

6fe223c

AnaghaRaoAMD force-pushed the user/anarao/DMLEPMatMulInt2Flt branch from ddf8f78 to 6fe223c Compare February 28, 2024 00:41

tbqh previously approved these changes Feb 28, 2024

yufenglee previously approved these changes Feb 29, 2024

update constexpr Linix build error

a4c8158

AnaghaRaoAMD dismissed stale reviews from yufenglee and tbqh via a4c8158 March 1, 2024 20:17

Anagha Rao added 2 commits March 1, 2024 14:46

Update tolerance

577706f

Increase tolerance for CPU

66c21b2

yufenglee approved these changes Mar 4, 2024

AnaghaRaoAMD merged commit 27b1dc9 into main Mar 4, 2024

AnaghaRaoAMD deleted the user/anarao/DMLEPMatMulInt2Flt branch March 4, 2024 19:55

tianleiwu mentioned this pull request Mar 6, 2024

Update tolerance of provider tests to fix flaky tests #19792

Merged


[DML] MatrixMultiplyIntegerToFloat #19608
AnaghaRaoAMD merged 10 commits into main from user/anarao/DMLEPMatMulInt2Flt


6 participants
