Detect VNNI support (avx512) in CI and skip ORT test for int8 models if missing VNNI #526
Conversation
Signed-off-by: jcwchen <jacky82226@gmail.com>
@mengniwang95 If you have bandwidth, would you mind helping me review this PR? I added two files for testing: test-mobilenetv2-12-int8.tar.gz and test-resnet50-v1-12-int8.onnx; you can see the results in the CIs of the latest commit.
I will remove these two testing files before merging. Thank you for your time!
This reverts commit bf0e98f. Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
Hope you don't mind the drive-by review.
# TODO: now ONNX only supports Protobuf <= 3.20.1
python -m pip install protobuf==3.20.1
python -m pip install onnx onnxruntime requests py-cpuinfo
python -m cpuinfo
Why run cpuinfo here?
If you want to keep it, could you please add a comment?
I kept it intentionally for future debugging, since the CPU information can be very helpful when investigating why ORT inference behaves differently on different machines. I will send a PR to add a comment about it.
from cpuinfo import get_cpu_info  # required by has_vnni_support below

import ort_test_dir_utils
import onnxruntime
import onnx
import test_utils


def has_vnni_support():
    return 'avx512' in str(get_cpu_info()['flags'])
I don't think this is strictly correct. From my reading, there are CPUs that support avx512 "foundation" but not VNNI, which is an extension.
I think the flag to check for is avx512_vnni, and maybe also avx512_4vnniw.
Sources:
https://en.wikichip.org/wiki/x86/avx512_vnni
https://bugzilla.redhat.com/show_bug.cgi?id=1761678
https://unix.stackexchange.com/questions/43539/what-do-the-flags-in-proc-cpuinfo-mean
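For illustration, a minimal sketch of the stricter check the review suggests. It takes the flag list as a parameter so it can be exercised without real hardware; in the actual code the list would come from py-cpuinfo's `get_cpu_info()['flags']`, and the exact flag spellings are an assumption based on the /proc/cpuinfo conventions referenced above:

```python
def has_vnni_support(flags) -> bool:
    """Return True if the CPU flag list indicates AVX-512 VNNI support.

    Checks the VNNI extension flags specifically rather than any 'avx512'
    substring, so plain AVX-512F machines are not misclassified.
    Flag names are assumed to follow /proc/cpuinfo conventions.
    """
    # avx512_vnni: the standard VNNI extension;
    # avx512_4vnniw: the Knights Mill word variant mentioned in the review.
    vnni_flags = {'avx512_vnni', 'avx512_4vnniw'}
    return bool(vnni_flags & set(flags))
```

Called as `has_vnni_support(get_cpu_info()['flags'])`, this would only return True on machines exposing one of the VNNI flags, not on AVX-512F-only machines.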
Thank you for catching this! The function/PR name here might be misleading: I ran several experiments and concluded that machines with avx512f behave differently from machines without avx512f. Unfortunately, I don't have machines with avx512_vnni support, so I am not sure whether machines with avx512_vnni behave the same as machines with avx512f only (without avx512_vnni). At least based on previous int8-related PRs, the output.pb produced by machines with avx512_vnni seems to be the same as the output.pb produced by machines with only avx512f support.
In summary, we need to distinguish three cases:
1. Machines without any avx512 support
2. Machines with avx512f support but without avx512_vnni support
3. Machines with avx512_vnni support
I am sure that 1 != 2 and 1 != 3, but I am not sure whether 2 == 3.
cc @mengniwang95, who is the main contributor of int8 models in the ONNX Model Zoo. Feel free to comment if you have any insight. Thank you for the help!
    return 'avx512' in str(get_cpu_info()['flags'])


def skip_quant_models_if_missing_vnni(model_name):
Suggested change:
from typing import Optional


def ort_skip_reason(model_path: str) -> Optional[str]:
    if '-int8' in model_path and not has_vnni_support():
        return f'Skip ORT test for {model_path} because this machine lacks VNNI support and the output.pb was produced with VNNI support.'
    model = onnx.load(model_path)
    if model.opset_import[0].version < 7:
        return f'Skip ORT test for {model_path} because ORT only *guarantees* support for models stamped with opset version 7'
    return None


def run_backend_ort(...):
    skip_reason = ort_skip_reason(model_path)
    if skip_reason:
        # log skip reason
        return
Thank you for the suggestion! I have created a new PR to improve it: #533.
Description
Sometimes GitHub Actions machines don't have VNNI support, but all the test data for int8 models was produced by machines with VNNI support. To keep CI consistent, the CI should detect whether the machine has VNNI support and, if not, skip the ORT test for int8 models.
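The skip decision described here can be sketched roughly as follows. This is a hypothetical helper, not the merged code, and it uses the stricter avx512_vnni flag raised in the review, whereas the PR as written matches any avx512 flag:

```python
def should_skip_ort_test(model_name: str, cpu_flags) -> bool:
    """Decide whether to skip the ORT inference test for a model.

    int8 models' expected output.pb files were produced on machines with
    VNNI support, so results on non-VNNI machines may legitimately differ.
    `cpu_flags` is assumed to be the flag list from py-cpuinfo.
    """
    is_int8_model = '-int8' in model_name
    has_vnni = 'avx512_vnni' in set(cpu_flags)
    return is_int8_model and not has_vnni
```

The key design point is that only the combination "int8 model AND no VNNI" skips; fp32 models still run everywhere, and int8 models still run on VNNI machines.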
Motivation and Context
Closes #522.