Describe the bug
AMD's AddressSanitizer reports a memory corruption bug in amrex::InitRandom(). It's currently unclear if this is a real bug or a false positive.
bwibking@moth:~/quokka/build> ./src/HydroBlast3D/test_hydro3d_blast ../tests/benchmark_unigrid_256.in
Initializing AMReX (23.10-23-g601cc4ee80e0)...
MPI initialized with 1 MPI processes
MPI initialized with thread support level 0
Initializing HIP...
HIP initialized with 1 device.
=================================================================
==1379366==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000326cfe8 at pc 0x7fc7796a5ea7 bp 0x7ffd90522b00 sp 0x7ffd905222c0
READ of size 32 at 0x00000326cfe8 thread T0
    #0 0x7fc7796a5ea6 in __interceptor_memcpy (/opt/rocm-5.7.0/llvm/lib/clang/17.0.0/lib/linux/libclang_rt.asan-x86_64.so+0xa5ea6) (BuildId: e2f6676d7d0ade0de2c4ac32fa5856892b18b70a)
    #1 0x7fc7781440a9 (/opt/rocm-5.7.0/lib/libamdhip64.so.5+0x3440a9) (BuildId: 7342fbe1c361ada40d7aa3c1da36c32f3fbe143d)
    #2 0x7fc7781462f6 (/opt/rocm-5.7.0/lib/libamdhip64.so.5+0x3462f6) (BuildId: 7342fbe1c361ada40d7aa3c1da36c32f3fbe143d)
    #3 0x7fc7781465a6 (/opt/rocm-5.7.0/lib/libamdhip64.so.5+0x3465a6) (BuildId: 7342fbe1c361ada40d7aa3c1da36c32f3fbe143d)
    #4 0x7fc778112434 (/opt/rocm-5.7.0/lib/libamdhip64.so.5+0x312434) (BuildId: 7342fbe1c361ada40d7aa3c1da36c32f3fbe143d)
    #5 0x7fc7780dcc53 (/opt/rocm-5.7.0/lib/libamdhip64.so.5+0x2dcc53) (BuildId: 7342fbe1c361ada40d7aa3c1da36c32f3fbe143d)
    #6 0x7fc777f835e9 (/opt/rocm-5.7.0/lib/libamdhip64.so.5+0x1835e9) (BuildId: 7342fbe1c361ada40d7aa3c1da36c32f3fbe143d)
    #7 0x7fc777e89c0e (/opt/rocm-5.7.0/lib/libamdhip64.so.5+0x89c0e) (BuildId: 7342fbe1c361ada40d7aa3c1da36c32f3fbe143d)
    #8 0x7fc777fe650e (/opt/rocm-5.7.0/lib/libamdhip64.so.5+0x1e650e) (BuildId: 7342fbe1c361ada40d7aa3c1da36c32f3fbe143d)
    #9 0x7fc778010bd9 (/opt/rocm-5.7.0/lib/libamdhip64.so.5+0x210bd9) (BuildId: 7342fbe1c361ada40d7aa3c1da36c32f3fbe143d)
    #10 0x7fc777fe6f91 (/opt/rocm-5.7.0/lib/libamdhip64.so.5+0x1e6f91) (BuildId: 7342fbe1c361ada40d7aa3c1da36c32f3fbe143d)
    #11 0x7fc777ff13e7 in hipLaunchKernel (/opt/rocm-5.7.0/lib/libamdhip64.so.5+0x1f13e7) (BuildId: 7342fbe1c361ada40d7aa3c1da36c32f3fbe143d)
    #12 0x2ae25ef in std::enable_if<MaybeDeviceRunnable<(anonymous namespace)::ResizeRandomSeed(unsigned long)::'lambda'(int)>::value, void>::type amrex::ParallelFor<256, int, (anonymous namespace)::ResizeRandomSeed(unsigned long)::'lambda'(int), void>(amrex::Gpu::KernelInfo const&, int, (anonymous namespace)::ResizeRandomSeed(unsigned long)::'lambda'(int)&&) /home/bwibking/quokka/extern/amrex/Src/Base/AMReX_GpuLaunchFunctsG.H:878:5
    #13 0x2ae25ef in void amrex::ParallelFor<int, (anonymous namespace)::ResizeRandomSeed(unsigned long)::'lambda'(int), void>(int, (anonymous namespace)::ResizeRandomSeed(unsigned long)::'lambda'(int)&&) /home/bwibking/quokka/extern/amrex/Src/Base/AMReX_GpuLaunchFunctsG.H:1457:5
    #14 0x2ae25ef in (anonymous namespace)::ResizeRandomSeed(unsigned long) /home/bwibking/quokka/extern/amrex/Src/Base/AMReX_Random.cpp:54:5
    #15 0x2ae25ef in amrex::InitRandom(unsigned long, int, unsigned long) /home/bwibking/quokka/extern/amrex/Src/Base/AMReX_Random.cpp:104:5
    #16 0x2a21bf9 in amrex::Initialize(int&, char**&, bool, ompi_communicator_t*, std::function<void ()> const&, std::ostream&, std::ostream&, void (*)(char const*)) /home/bwibking/quokka/extern/amrex/Src/Base/AMReX.cpp:618:5
    #17 0x29fc0cc in main /home/bwibking/quokka/src/main.cpp:22:2
    #18 0x7fc773c3feaf in __libc_start_call_main (/lib64/libc.so.6+0x3feaf) (BuildId: b39d468aead6d9ede227751ffe093da287488648)
    #19 0x7fc773c3ff5f in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3ff5f) (BuildId: b39d468aead6d9ede227751ffe093da287488648)
    #20 0x2897644 in _start (/home/bwibking/quokka/build/src/HydroBlast3D/test_hydro3d_blast+0x2897644)

0x00000326cfe8 is located 56 bytes before global variable 'EOSData::mindens' defined in '/home/bwibking/quokka/extern/Microphysics/interfaces/eos_data.cpp' (0x326d020) of size 8
0x00000326cfe8 is located 24 bytes before global variable 'EOSData::maxtemp' defined in '/home/bwibking/quokka/extern/Microphysics/interfaces/eos_data.cpp' (0x326d000) of size 8
0x00000326cfe8 is located 0 bytes after global variable 'EOSData::mintemp' defined in '/home/bwibking/quokka/extern/Microphysics/interfaces/eos_data.cpp' (0x326cfe0) of size 8
SUMMARY: AddressSanitizer: global-buffer-overflow (/opt/rocm-5.7.0/llvm/lib/clang/17.0.0/lib/linux/libclang_rt.asan-x86_64.so+0xa5ea6) (BuildId: e2f6676d7d0ade0de2c4ac32fa5856892b18b70a) in __interceptor_memcpy
Shadow bytes around the buggy address:
  0x00000326cd00: 00 00 00 00 f9 f9 f9 f9 01 f9 f9 f9 00 00 00 00
  0x00000326cd80: f9 f9 f9 f9 01 f9 f9 f9 01 f9 f9 f9 00 f9 f9 f9
  0x00000326ce00: 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9
  0x00000326ce80: 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9
  0x00000326cf00: 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9
=>0x00000326cf80: 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9 00[f9]f9 f9
  0x00000326d000: 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9
  0x00000326d080: 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9
  0x00000326d100: 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9
  0x00000326d180: 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9 00 00 00 00
  0x00000326d200: 00 00 00 00 01 f9 f9 f9 00 f9 f9 f9 04 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==1379366==ABORTING
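To make the report easier to read, here is a small Python sketch that redoes the address arithmetic from the log. It only checks which bytes a 32-byte read starting at 0x00000326cfe8 would touch. It does not say whether the finding is real or a false positive. The helper names (`overlap`, `describe_read`) are made up for illustration. The shadow-row lookup assumes the row labels in the dump are application addresses, which the 0x80 stride between rows (16 shadow bytes times 8 application bytes) suggests but which I have not confirmed for AMD's ASAN build.

```python
# Values copied from the AddressSanitizer report above.
FAULT_ADDR = 0x00000326CFE8  # address of the faulting READ
READ_SIZE = 32               # "READ of size 32"

# (name, start address, size in bytes) for the globals named in the report.
GLOBALS = [
    ("EOSData::mintemp", 0x326CFE0, 8),
    ("EOSData::maxtemp", 0x326D000, 8),
    ("EOSData::mindens", 0x326D020, 8),
]


def overlap(a_start, a_end, b_start, b_end):
    """Bytes shared by the half-open ranges [a_start, a_end) and [b_start, b_end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def describe_read(addr, size, globals_):
    """Map each global's name to the number of its bytes covered by the read."""
    return {
        name: overlap(addr, addr + size, start, start + gsize)
        for name, start, gsize in globals_
    }


covered = describe_read(FAULT_ADDR, READ_SIZE, GLOBALS)
# Bytes of the read that fall on none of the listed globals (i.e. redzone).
redzone_bytes = READ_SIZE - sum(covered.values())

# Shadow row marked "=>" in the report. ASSUMPTION: the row label is an
# application address and each shadow byte covers 8 application bytes.
ROW_BASE = 0x326CF80
ROW = "00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9".split()
shadow_index = (FAULT_ADDR - ROW_BASE) // 8

if __name__ == "__main__":
    for name, start, _ in GLOBALS:
        print(f"{name}: starts {start - FAULT_ADDR:+d} bytes from the fault, "
              f"{covered[name]} bytes covered by the read")
    print(f"redzone bytes covered: {redzone_bytes}")
    print(f"shadow byte at index {shadow_index}: {ROW[shadow_index]}")
```

Under these assumptions the read starts immediately after `EOSData::mintemp`, crosses 24 bytes of global redzone, and ends after covering all 8 bytes of `EOSData::maxtemp`. The offsets (0, 24 and 56 bytes) and the bracketed `f9` shadow byte match what the report prints.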
Additional context
Although the test suite passes on Moth, we have seen bizarre and incorrect behavior of Quokka when running production simulations on AMD GPUs. We want to determine whether this is the cause.
BenWibking changed the title from "[HIP] memory corruption inside AMReX reported by ASAN on AMD GPU" to "[HIP] memory corruption reported by ASAN on AMD GPU" on Nov 19, 2023
According to Weiqun, this is a false positive. However, the same issue appears in different places in the Microphysics unit tests, so something is likely going wrong, but ASAN is misdiagnosing it.
BenWibking changed the title from "[HIP] memory corruption reported by ASAN on AMD GPU" to "[HIP] memory errors affecting Microphysics codes on AMD GPU" on Nov 19, 2023
To Reproduce
Steps to reproduce the behavior:
Build with the moth-sanitizer.profile settings on Moth (add build profile for AMDGPU ASAN #446).