Disable MatMulIntegerToFloat transformation for FP16 on CPU EP #18239

AnaghaRaoAMD · 2023-11-02T07:34:43Z

Description

MatMulIntegerToFloat is updated to support FP16. The FP16 transformation relies on an FP16 "Mul" node, which the CPU EP does not directly support.

For now, the FP16 transformation is supported only on the DML EP, so all FP16 tests are disabled on CPU.
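
The gating described above can be pictured with a minimal, self-contained sketch. It does not use any ONNX Runtime types: `ToyNode`, `ElemType`, and `ShouldFuse` are hypothetical names introduced only for illustration, and the rule that float32 stays eligible on both EPs is an assumption inferred from the test output in this description, not a copy of the optimizer code.

```cpp
#include <string>

// Hypothetical stand-ins for illustration; these are not ONNX Runtime types.
enum class ElemType { kFloat, kFloat16 };

struct ToyNode {
  std::string ep;        // execution provider the node is assigned to
  ElemType output_type;  // element type the fused node would produce
};

// Returns true when the toy fusion may run for this node.
// Assumed rule: float32 is eligible on the CPU and DML EPs,
// float16 only on the DML EP, and any other EP is skipped.
inline bool ShouldFuse(const ToyNode& node) {
  const bool on_cpu = node.ep == "CPUExecutionProvider";
  const bool on_dml = node.ep == "DmlExecutionProvider";
  if (!on_cpu && !on_dml) {
    return false;
  }
  if (node.output_type == ElemType::kFloat16) {
    return on_dml;
  }
  return true;
}
```

Under these assumptions, an FP16 node assigned to the CPU EP is left untouched, while the same node assigned to the DML EP is eligible for fusion.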

Test results without the -use_dml build flag

onnxruntime_test_all.exe --gtest_filter="*MatMulIntegerToFloat*"
Note: Google Test filter = *MatMulIntegerToFloat*
[==========] Running 8 tests from 4 test suites.
[----------] Global test environment set-up.
[----------] 1 test from CPU_U8S8_Precision_Tests
[ RUN      ] CPU_U8S8_Precision_Tests.MatMulIntegerToFloat
[       OK ] CPU_U8S8_Precision_Tests.MatMulIntegerToFloat (181 ms)
[----------] 1 test from CPU_U8S8_Precision_Tests (181 ms total)

[----------] 1 test from GraphTransformationTests
[ RUN      ] GraphTransformationTests.MatMulIntegerToFloatTest
[       OK ] GraphTransformationTests.MatMulIntegerToFloatTest (17 ms)
[----------] 1 test from GraphTransformationTests (17 ms total)

[----------] 1 test from QDQTransformerTests
[ RUN      ] QDQTransformerTests.MatMulIntegerToFloat
[       OK ] QDQTransformerTests.MatMulIntegerToFloat (656 ms)
[----------] 1 test from QDQTransformerTests (656 ms total)

[----------] 5 tests from MatMulIntegerToFloat
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_U8X8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_U8X8 (195 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_U8X8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_U8X8 (206 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_S8S8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_S8S8 (107 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_S8S8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_S8S8 (114 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulInteger_With_ZeroPoint
[       OK ] MatMulIntegerToFloat.MatMulInteger_With_ZeroPoint (227 ms)
[----------] 5 tests from MatMulIntegerToFloat (854 ms total)

[----------] Global test environment tear-down
[==========] 8 tests from 4 test suites ran. (1713 ms total)
[  PASSED  ] 8 tests.
memleakdbg:
----- No memory leaks detected -----

onnxruntime_test_all.exe --gtest_filter="GraphTransformationTests.MatMulIntegerToFloat*"
Note: Google Test filter = GraphTransformationTests.MatMulIntegerToFloat*
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from GraphTransformationTests
[ RUN      ] GraphTransformationTests.MatMulIntegerToFloatTest
[       OK ] GraphTransformationTests.MatMulIntegerToFloatTest (13 ms)
[ RUN      ] GraphTransformationTests.MatMulIntegerToFloat16Test
[       OK ] GraphTransformationTests.MatMulIntegerToFloat16Test (4 ms)
[----------] 2 tests from GraphTransformationTests (20 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (22 ms total)
[  PASSED  ] 2 tests.
memleakdbg:
----- No memory leaks detected -----

Motivation and Context

AnaghaRaoAMD · 2023-11-02T20:28:07Z

#20231102.3 • Disable MatMulIntegerToFloat Transformation for FP16 on CPU EP Windows GPU CI Pipeline

onnxruntime/core/optimizer/matmul_integer_to_float.cc

sumitsays · 2023-11-02T22:20:29Z

We should also add a test case in graph_transform_test.cc to test the MatMulIntegerToFloat for FP16 type.

sumitsays · 2023-11-03T02:58:05Z

We should also add a test case in graph_transform_test.cc to test the MatMulIntegerToFloat for FP16 type.

It looks like we can manually assign each node to the DML EP before calling ApplyTransformers, so the test can check that the fusion happens on the DML EP in the FP16 case.

// Pin every node to the DML EP; pass the constant itself, not its name as a string literal.
for (auto& node : graph.Nodes()) {
  node.SetExecutionProviderType(kDmlExecutionProvider);
}

Then we would be able to assert the count of MatMulIntegerToFloat nodes.
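
The flow suggested in this comment (assign every node to one EP, run the transformer, then count the fused nodes) can be sketched without the ONNX Runtime test harness. Everything below is a toy model: `SketchNode`, `AssignAllNodes`, `ToyFuse`, and `CountOps` are hypothetical names, and the MatMulInteger, Cast, Mul chain is a simplified assumption about the pattern being fused, not the optimizer's actual matching logic.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph node; not an ONNX Runtime type.
struct SketchNode {
  std::string op_type;
  std::string ep;  // empty until an execution provider is assigned
};

// Step 1: pin every node to a single execution provider.
inline void AssignAllNodes(std::vector<SketchNode>& nodes, const std::string& ep) {
  for (auto& node : nodes) {
    node.ep = ep;
  }
}

// Step 2 (toy fusion, assumed pattern): collapse MatMulInteger -> Cast -> Mul
// into one MatMulIntegerToFloat node, but only when all three are on the DML EP.
inline std::vector<SketchNode> ToyFuse(const std::vector<SketchNode>& nodes) {
  const std::string dml = "DmlExecutionProvider";
  std::vector<SketchNode> out;
  std::size_t i = 0;
  while (i < nodes.size()) {
    const bool fusible =
        i + 2 < nodes.size() &&
        nodes[i].op_type == "MatMulInteger" && nodes[i].ep == dml &&
        nodes[i + 1].op_type == "Cast" && nodes[i + 1].ep == dml &&
        nodes[i + 2].op_type == "Mul" && nodes[i + 2].ep == dml;
    if (fusible) {
      out.push_back({"MatMulIntegerToFloat", dml});
      i += 3;
    } else {
      out.push_back(nodes[i]);
      i += 1;
    }
  }
  return out;
}

// Step 3: count nodes per op type so a test can assert on the result.
inline std::map<std::string, int> CountOps(const std::vector<SketchNode>& nodes) {
  std::map<std::string, int> counts;
  for (const auto& node : nodes) {
    ++counts[node.op_type];
  }
  return counts;
}
```

In this toy model, a test would call `AssignAllNodes(graph, "DmlExecutionProvider")`, run `ToyFuse`, and assert that `CountOps` reports one `MatMulIntegerToFloat` node and no remaining `MatMulInteger` node. Without the EP assignment the same graph is returned unfused.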

onnxruntime/core/optimizer/matmul_integer_to_float.cc

onnxruntime/test/contrib_ops/matmul_integer_to_float_test.cc

onnxruntime/test/optimizer/graph_transform_test.cc

onnxruntime/test/testdata/transform/fusion/matmul_integer_to_float.py

 if __name__ == "__main__":
-    GenerateModel("matmul_integer_to_float.onnx")
-    GenerateModel("matmul_integer_to_float16.onnx", output_type_fp16=True)
+    GenerateModel("matmul_integer_to_float.onnx")



… (#18554) [Cherry Pick Reviewed]

Disable MatMulIntegerToFloat Transformation for FP16 on CPU EP

fd12db0

AnaghaRaoAMD marked this pull request as ready for review November 2, 2023 17:11

AnaghaRaoAMD requested review from adtsai, sumitsays and yufenglee November 2, 2023 17:11

yufenglee approved these changes Nov 2, 2023

AnaghaRaoAMD requested a review from zhangxiang1993 November 2, 2023 21:05

sumitsays reviewed Nov 2, 2023

onnxruntime/core/optimizer/matmul_integer_to_float.cc (outdated)

Add DML EP to MatMulIntegerToFloatFusion

65857f1

sumitsays reviewed Nov 3, 2023

onnxruntime/core/optimizer/matmul_integer_to_float.cc (outdated)

Adding FP16 Transformation test for DML EP

1239ae5

sumitsays reviewed Nov 3, 2023

onnxruntime/test/contrib_ops/matmul_integer_to_float_test.cc

sumitsays reviewed Nov 3, 2023

onnxruntime/test/optimizer/graph_transform_test.cc

sumitsays reviewed Nov 3, 2023

onnxruntime/test/optimizer/graph_transform_test.cc

Copy model to fusion folder

cdb94fa

AnaghaRaoAMD force-pushed the user/anagrao/MatMulupdate branch from 94de302 to cdb94fa Compare November 3, 2023 16:31

github-advanced-security bot found potential problems Nov 3, 2023

sumitsays approved these changes Nov 3, 2023

AnaghaRaoAMD merged commit 7d4dba7 into DmlPrototype Nov 3, 2023

AnaghaRaoAMD deleted the user/anagrao/MatMulupdate branch November 3, 2023 17:05

AnaghaRaoAMD mentioned this pull request Nov 3, 2023

Enable MatrixMultiplyIntegerToFloat on DML #18275

Merged

AnaghaRaoAMD mentioned this pull request Nov 22, 2023

Disable MatMulIntegerToFloat transformation for FP16 on CPU EP #18553

Closed

ekmixon mentioned this pull request Apr 6, 2024

[Snyk] Security upgrade eslint from 7.25.0 to 9.0.0 ekmixon/onnxruntime#192

Open

MaxMood96 mentioned this pull request Apr 6, 2024

[Snyk] Security upgrade eslint from 7.25.0 to 9.0.0 MaxMood96/onnxruntime#449

Closed
