tensor_cast asan flake #11203

Closed
GMNGeoffrey opened this issue Nov 17, 2022 · 2 comments · Fixed by #11206
@GMNGeoffrey

https://github.com/iree-org/iree/actions/runs/3492591056/jobs/5846535193 (a presubmit run that was testing other changes, but should still be representative) failed in iree/tests/e2e/tensor_ops/tensor_cast.mlir.test, and only when the test was run many times. The logs are not very helpful, though:

2022-11-17T22:58:53.5635431Z  588/1085 Test  #136: iree/tests/e2e/tensor_ops/tensor_cast.mlir.test ..........................................................***Failed    2.99 sec
2022-11-17T22:58:53.5636685Z -- Testing: 1 tests, 1 workers --
2022-11-17T22:58:53.5673679Z FAIL: IREE :: e2e/tensor_ops/tensor_cast.mlir (1 of 1)
2022-11-17T22:58:53.5675536Z ******************** TEST 'IREE :: e2e/tensor_ops/tensor_cast.mlir' FAILED ********************
2022-11-17T22:58:53.5676066Z Script:
2022-11-17T22:58:53.5676456Z --
2022-11-17T22:58:53.5677664Z : 'RUN: at line 1';   iree-run-mlir --iree-hal-target-backends=llvm-cpu /work/tests/e2e/tensor_ops/tensor_cast.mlir | FileCheck /work/tests/e2e/tensor_ops/tensor_cast.mlir
2022-11-17T22:58:53.5679528Z : 'RUN: at line 2';   [[ $IREE_VMVX_DISABLE == 1 ]] || (iree-run-mlir --iree-hal-target-backends=vmvx /work/tests/e2e/tensor_ops/tensor_cast.mlir | FileCheck /work/tests/e2e/tensor_ops/tensor_cast.mlir)
2022-11-17T22:58:53.5681710Z : 'RUN: at line 3';   [[ $IREE_VULKAN_DISABLE == 1 ]] || (iree-run-mlir --iree-hal-target-backends=vulkan-spirv /work/tests/e2e/tensor_ops/tensor_cast.mlir | FileCheck /work/tests/e2e/tensor_ops/tensor_cast.mlir)
2022-11-17T22:58:53.5682646Z --
2022-11-17T22:58:53.5683006Z Exit Code: 1
2022-11-17T22:58:53.5683266Z 
2022-11-17T22:58:53.5683441Z Command Output (stderr):
2022-11-17T22:58:53.5683889Z --
2022-11-17T22:58:53.5684402Z Tracer caught signal 11: addr=0x4978 pc=0x55726567fa9a sp=0x7f88659f9d40
2022-11-17T22:58:53.5685027Z ==47855==LeakSanitizer has encountered a fatal error.
2022-11-17T22:58:53.5685788Z ==47855==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1
2022-11-17T22:58:53.5687135Z ==47855==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)
2022-11-17T22:58:53.5687562Z 
2022-11-17T22:58:53.5687757Z --
2022-11-17T22:58:53.5687958Z 
2022-11-17T22:58:53.5693532Z ********************
2022-11-17T22:58:53.5762529Z ********************
2022-11-17T22:58:53.5762959Z Failed Tests (1):
2022-11-17T22:58:53.5763535Z   IREE :: e2e/tensor_ops/tensor_cast.mlir
2022-11-17T22:58:53.5763856Z 
2022-11-17T22:58:53.5763867Z 
2022-11-17T22:58:53.5764044Z Testing Time: 2.82s
@bjacob

bjacob commented Nov 18, 2022

I installed exactly the same SwiftShader by running build_tools/third_party/swiftshader/build_vk_swiftshader.sh and have run this test a thousand times, but I can't reproduce the failure.

One bit of information from the error message is that addr=0x4978 looks abnormally small, as if it were a null pointer dereference in disguise (disguised by some offset from a base pointer that happens to be null here). So I googled this a bit and found google/sanitizers#1353.
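
To make the "null pointer plus offset" reading concrete, here is a minimal sketch. The structure layout is entirely hypothetical (the real object involved in the crash is unknown); it only shows how reading a member through a null base pointer produces a small faulting address such as 0x4978.

```python
import ctypes

# addr= value from the "Tracer caught signal 11" line in the log.
FAULT_ADDR = 0x4978


class HypotheticalState(ctypes.Structure):
    """Made-up layout for illustration; not a real IREE or sanitizer type."""

    _fields_ = [
        ("padding", ctypes.c_char * FAULT_ADDR),  # stand-in for earlier members
        ("member", ctypes.c_int),                 # the member being accessed
    ]


def faulting_address(base_pointer: int) -> int:
    """Address touched when `member` is read through `base_pointer`."""
    return base_pointer + HypotheticalState.member.offset


# With a null base pointer, the access lands at the member's offset.
print(hex(faulting_address(0)))
```

If the base pointer were valid, the same access would land far from address zero, which is why a tiny addr= value hints at a null base rather than a wild pointer.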

@bjacob

bjacob commented Nov 18, 2022

Found google/sanitizers#1342. Let's try LSAN_OPTIONS=use_tls=0, as in mqudsi/fish-shell@7718d44?
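
A rough sketch of how that option could be tried locally, assuming the test is launched from a Python wrapper. The helper name `with_sanitizer_option` is made up for this example; it merges one option into a colon-separated sanitizer variable (the same separator the LSAN hint in the log uses) without dropping options that are already set.

```python
import os


def with_sanitizer_option(env, variable, option):
    """Return a copy of `env` with `option` set in a colon-separated variable.

    Hypothetical helper: an existing entry with the same key is replaced,
    and all other entries are preserved in order.
    """
    updated = dict(env)
    key = option.split("=", 1)[0]
    existing = [o for o in updated.get(variable, "").split(":") if o]
    kept = [o for o in existing if o.split("=", 1)[0] != key]
    updated[variable] = ":".join(kept + [option])
    return updated


env = with_sanitizer_option(os.environ, "LSAN_OPTIONS", "use_tls=0")
print(env["LSAN_OPTIONS"])
```

The resulting `env` can be passed as `subprocess.run(..., env=env)` when invoking the `iree-run-mlir` command from the log above.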

bjacob added a commit that referenced this issue Nov 21, 2022
Fixes #11203.

This PR started out as a tentative massaging of LSAN to avert some issue
on CI bots, but that didn't work and it later appeared that there was a
rich history of LSAN issues with Swiftshader and the correct course of
action was to disable the LSAN part of ASAN when running ASAN tests on
Vulkan.

Reusing this PR so we keep the motivating history in one place. See the
comment below: #11206 (comment)
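
For illustration only, the idea of disabling the LSAN part of ASAN just for Vulkan tests might look like the sketch below. This is not taken from #11206; the function name is hypothetical, and the assumption is that leak checking is switched off through ASAN's `detect_leaks=0` runtime option. The backend names are the ones that appear in the test's RUN lines.

```python
def sanitizer_env_for_backend(backend, base_env):
    """Return a copy of `base_env`, disabling leak detection for Vulkan only.

    Hypothetical helper: other backends keep their ASAN_OPTIONS untouched.
    """
    env = dict(base_env)
    if backend.startswith("vulkan"):
        options = [
            o
            for o in env.get("ASAN_OPTIONS", "").split(":")
            if o and not o.startswith("detect_leaks=")
        ]
        env["ASAN_OPTIONS"] = ":".join(options + ["detect_leaks=0"])
    return env


for backend in ("llvm-cpu", "vmvx", "vulkan-spirv"):
    print(backend, sanitizer_env_for_backend(backend, {}).get("ASAN_OPTIONS"))
```

Keeping leak detection on for the CPU backends preserves that coverage where SwiftShader is not involved.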