AArch64: convolution diff issue with gemm_s8u8s32 kernel #2675

Open
@xiang1guo

Background

This is a follow-up to issue 2674 and shares its background, but it concerns a different test case, shown below. The case is skipped in #2168 because it fails.

test_graph_unit_dnnl_large_partition_usm_cpu(test_large_partition_execute.Int8Resnet50Stage2Block)

Summary

While analyzing the issue, I found that the failure can also be reproduced with benchdnn alone, without the graph API component.
The failing implementation is convolution,gemm_s8u8s32:ref.
See the following log:

ONEDNN_VERBOSE=1 ./tests/benchdnn/benchdnn --conv --reset --allow-enum-tags-only=0 --engine=cpu --dir=FWD_I --alg=direct --dt=u8:s8:u8 --bia-dt=f32 \
  --stag=acdb --wtag=any --dtag=acdb --attr-post-ops=eltwise_relu --attr-scales=src0:common:0.5+dst:common:0.5+wei:per_oc --attr-zero-points=src0:common:1+dst:common:1 --attr-scratchpad=user \
  mb1_ic8oc8_ih12oh12kh1sh1dh0ph0_iw12ow12kw1sw1dw0pw0
onednn_verbose,v1,info,oneDNN v3.8.0 (commit af1410c21a7455af587ae496c719ac7896d8ed95)
onednn_verbose,v1,info,cpu,runtime:OpenMP,nthr:4
onednn_verbose,v1,info,cpu,isa:AArch64 SVE (256 bits)
onednn_verbose,v1,info,gpu,runtime:none
onednn_verbose,v1,info,graph,backend,0:dnnl_backend
onednn_verbose,v1,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,v1,graph,info,template:operation,engine,partition_id,partition_kind,op_names,data_formats,logical_tensors,fpmath_mode,implementation,backend,exec_time
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:f32::blocked:a::f0 dst:f32::blocked:a::f0,,,8,0.0109863
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:f32::blocked:abcd::f0 dst:s8::blocked:cdba::f8:zpm1,,,8x8x1x1,0.163086
onednn_verbose,v1,primitive,exec,cpu,reorder,rnn_data_reorder,undef,src:f32::blocked:abcd::f0 dst:s8::blocked:abcd::f0,,,8x8x1x1,0.0268555
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:s8::blocked:abcd::f0 dst:s8::blocked:cdba::f8:zpm1,,,8x8x1x1,0.0200195
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:f32::blocked:abcd::f0 dst:u8::blocked:acdb::f0,,,1x8x12x12,0.104004
onednn_verbose,v1,primitive,exec,cpu,convolution,gemm_s8u8s32:ref,forward_inference,src:u8::blocked:acdb::f0 wei:s8:a:blocked:cdba::f8:zpm1 bia:f32:a:blocked:a::f0 dst:u8::blocked:acdb::f0,attr-scratchpad:user attr-scales:src0:0:f32+dst:0:f32+wei:1:f32 attr-zero-points:src0:0:s32+dst:0:s32 attr-post-ops:eltwise_relu,alg:convolution_direct,mb1_ic8oc8_ih12oh12kh1sh1dh0ph0_iw12ow12kw1sw1dw0pw0,0.177002
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:u8::blocked:acdb::f0 dst:f32::blocked:abcd::f0,,,1x8x12x12,0.0229492
[   0][DST][0:0:0:0] exp_f32:          14 exp:          14 got:          16 diff:       2 rdiff:0.142857
[   1][DST][0:0:0:1] exp_f32:        21.5 exp:          22 got:          23 diff:       1 rdiff:0.0454545
[   2][DST][0:0:0:2] exp_f32:        15.5 exp:          16 got:          17 diff:       1 rdiff:  0.0625
[   3][DST][0:0:0:3] exp_f32:       14.75 exp:          15 got:          16 diff:       1 rdiff:0.0666667
[   4][DST][0:0:0:4] exp_f32:          19 exp:          19 got:          20 diff:       1 rdiff:0.0526316
[   5][DST][0:0:0:5] exp_f32:       11.75 exp:          12 got:          13 diff:       1 rdiff:0.0833333
[   6][DST][0:0:0:6] exp_f32:          15 exp:          15 got:          16 diff:       1 rdiff:0.0666667
[   7][DST][0:0:0:7] exp_f32:        15.5 exp:          16 got:          17 diff:       1 rdiff:  0.0625
[   8][DST][0:0:0:8] exp_f32:           8 exp:           8 got:          10 diff:       2 rdiff:    0.25
[   9][DST][0:0:0:9] exp_f32:       20.25 exp:          20 got:          22 diff:       2 rdiff:     0.1
[COMPARE_STATS][DST]: trh=0 err_max_diff:      32 err_max_rdiff:      32 all_max_diff:      32 all_max_rdiff:      32
0:FAILED (errors:897 total:1152) __REPRO: --conv --allow-enum-tags-only=false --dir=FWD_I --dt=u8:s8:u8 --bia-dt=f32 --stag=acdb --dtag=acdb --attr-scales=src:common:0.5+dst:common:0.5+wei:per_oc --attr-zero-points=src:common:1+dst:common:1 --attr-post-ops=relu --attr-scratchpad=user mb1ic8ih12oc8oh12kh1ph0
tests:1 passed:0 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:1 listed:0
total: 0.01s; fill: 0.00s (52%); compute_ref: 0.00s (5%); compare: 0.00s (11%);
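
To see whether the mismatches follow a pattern (for example, a constant offset), the per-element lines in the log above can be tallied. The following is a minimal sketch, not part of benchdnn: the helper names `parse_mismatches` and `signed_diff_histogram` are hypothetical, and the regular expression only assumes the `[idx][DST][coords] exp_f32: ... exp: ... got: ...` layout shown in this log.

```python
import re
from collections import Counter

# Matches per-element mismatch lines as printed in the log above, e.g.
# [   0][DST][0:0:0:0] exp_f32:   14 exp:   14 got:   16 diff:   2 rdiff:0.142857
LINE_RE = re.compile(
    r"\[\s*(\d+)\]\[DST\]\[([\d:]+)\]\s+exp_f32:\s*(\S+)\s+exp:\s*(\S+)"
    r"\s+got:\s*(\S+)\s+diff:\s*(\S+)\s+rdiff:\s*(\S+)"
)


def parse_mismatches(log_text):
    """Return one dict per mismatch line found in log_text (hypothetical helper)."""
    rows = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue  # verbose lines, summaries, etc. are ignored
        rows.append({
            "index": int(m.group(1)),
            "coords": tuple(int(c) for c in m.group(2).split(":")),
            "exp_f32": float(m.group(3)),
            "exp": float(m.group(4)),
            "got": float(m.group(5)),
        })
    return rows


def signed_diff_histogram(rows):
    """Count occurrences of (got - exp), keeping the sign that 'diff:' drops."""
    return Counter(int(r["got"] - r["exp"]) for r in rows)
```

Saving the benchdnn output to a file and passing its contents to `parse_mismatches` gives the rows; for the ten mismatches printed above, `got` is larger than `exp` in every case.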

Environment

  • system: Linux 22.04.1-Ubuntu SMP aarch64 aarch64 aarch64 GNU/Linux
  • gcc: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
  • cmake: 3.22.1

Steps to reproduce

  • Build library:
1. Set up the ACL library
    git clone --branch v24.11.1 --depth 1 https://github.com/ARM-software/ComputeLibrary.git
    cd ComputeLibrary
    git checkout 1f3bf6bbc4a1a57b5915fc0a19b195ae53acc06d
    scons -j4 Werror=0 debug=0 neon=1 opencl=0 embed_kernels=0 os=linux arch=armv8.2-a build=native multi_isa=1 fixed_format_kernels=1 cppthreads=0 openmp=1 examples=0 validation_tests=0
2. export ACL_ROOT_DIR=/path/to/ComputeLibrary
3. Build oneDNN
    cmake .. -DDNNL_AARCH64_USE_ACL=ON -DONEDNN_BUILD_GRAPH=ON -DDNNL_CPU_RUNTIME=OMP -DONEDNN_WERROR=ON -DDNNL_BUILD_FOR_CI=ON -DONEDNN_TEST_SET=NIGHTLY -DCMAKE_BUILD_TYPE=Debug
    make -j 4
  • Run test:
ONEDNN_VERBOSE=1 ./tests/benchdnn/benchdnn --conv --reset --allow-enum-tags-only=0 --engine=cpu --dir=FWD_I --alg=direct --dt=u8:s8:u8 --bia-dt=f32 \
  --stag=acdb --wtag=any --dtag=acdb --attr-post-ops=eltwise_relu --attr-scales=src0:common:0.5+dst:common:0.5+wei:per_oc --attr-zero-points=src0:common:1+dst:common:1 --attr-scratchpad=user \
  mb1_ic8oc8_ih12oh12kh1sh1dh0ph0_iw12ow12kw1sw1dw0pw0
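
For readers checking the expected values by hand, the attributes in the command above (source and destination scales of 0.5, per-output-channel weight scales, source and destination zero points of 1, a ReLU post-op, f32 bias) can be sketched for a single output pixel of this 1x1 convolution. This is only an illustration based on my reading of the oneDNN quantization model; the order of operations is an assumption, the function name `quantized_conv1x1_point` is hypothetical, and it is not the gemm_s8u8s32 code path.

```python
def quantized_conv1x1_point(src_u8, wei_s8, bias_f32, wei_scales,
                            src_scale=0.5, dst_scale=0.5, src_zp=1, dst_zp=1):
    """Reference-style u8:s8:u8 1x1 convolution for one spatial point.

    src_u8:     list of IC unsigned 8-bit inputs for this pixel
    wei_s8:     OC x IC signed 8-bit weights
    bias_f32:   OC float biases
    wei_scales: OC per-output-channel weight scales
    Assumed order: zero-point shift, integer accumulate, apply src/wei
    scales, add bias, ReLU, divide by dst scale, add dst zero point,
    round to nearest even, saturate to [0, 255].
    """
    out = []
    for oc, w_row in enumerate(wei_s8):
        # Integer (s32-style) accumulation with the source zero point removed.
        acc = sum((s - src_zp) * w for s, w in zip(src_u8, w_row))
        val = acc * src_scale * wei_scales[oc] + bias_f32[oc]
        val = max(val, 0.0)               # eltwise_relu post-op
        val = val / dst_scale + dst_zp    # destination scale and zero point
        q = round(val)                    # Python rounds halves to even
        out.append(min(255, max(0, q)))   # saturate to u8
    return out
```

For example, `quantized_conv1x1_point([3, 5], [[2, -1]], [1.0], [1.0])` accumulates `(3-1)*2 + (5-1)*(-1) = 0`, adds the bias to get 1.0, and maps it to `1.0 / 0.5 + 1 = 3`. The rounding used here matches the `exp_f32` to `exp` pairs in the log (21.5 becomes 22, 20.25 becomes 20).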

Labels

  • platform:cpu-aarch64 (Codeowner: @oneapi-src/onednn-cpu-aarch64)
  • sighting (Suspicious library behavior. Should be promoted to a bug when confirmed)
