Description
As the title says, vkCreateComputePipelines triggers a segmentation fault in driver code when creating compute pipelines for certain shaders on a Samsung S23.
Stack trace:
02-12 15:17:01.627 13994 13994 F DEBUG : Cmdline: /data/local/tmp/etvk/execute_bpte /data/local/tmp/etvk/models/scenex_v9_512_vulkan_fp16.bpte
02-12 15:17:01.627 13994 13994 F DEBUG : pid: 13984, tid: 13984, name: execute_bpte >>> /data/local/tmp/etvk/execute_bpte <<<
02-12 15:17:01.627 13994 13994 F DEBUG : uid: 2000
02-12 15:17:01.627 13994 13994 F DEBUG : tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
02-12 15:17:01.627 13994 13994 F DEBUG : pac_enabled_keys: 000000000000000f (PR_PAC_APIAKEY, PR_PAC_APIBKEY, PR_PAC_APDAKEY, PR_PAC_APDBKEY)
02-12 15:17:01.627 13994 13994 F DEBUG : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0000000000000010
02-12 15:17:01.627 13994 13994 F DEBUG : Cause: null pointer dereference
02-12 15:17:01.627 13994 13994 F DEBUG : x0 0000000000000000 x1 0000007fdfced698 x2 0000007fdfced6a0 x3 b40000734a6b3bc0
02-12 15:17:01.627 13994 13994 F DEBUG : x4 0000000000000000 x5 f291713a99c3f5e2 x6 b4000072c9e4f8e0 x7 0000000000000000
02-12 15:17:01.627 13994 13994 F DEBUG : x8 0000007fdfced6a0 x9 b40000734a69f400 x10 0000000000000000 x11 0000007fdfced720
02-12 15:17:01.627 13994 13994 F DEBUG : x12 0000000000000001 x13 0000000000000002 x14 000000000000002b x15 0000000000000000
02-12 15:17:01.627 13994 13994 F DEBUG : x16 0000000000000022 x17 0000000000000001 x18 000000734e580000 x19 0000000000000000
02-12 15:17:01.627 13994 13994 F DEBUG : x20 b40000734a6b3bc0 x21 b4000072c9f7cca0 x22 0000007fdfced698 x23 0000000000000006
02-12 15:17:01.627 13994 13994 F DEBUG : x24 000000734dfbb000 x25 000000000000467e x26 000000734dfbb000 x27 0000000000000015
02-12 15:17:01.627 13994 13994 F DEBUG : x28 0000000000000001 x29 0000007fdfced7f0
02-12 15:17:01.627 13994 13994 F DEBUG : lr 006c7872b74cf4cc sp 0000007fdfced5e0 pc 00000072b73c19f4 pst 0000000060001000
02-12 15:17:01.627 13994 13994 F DEBUG : 34 total frames
02-12 15:17:01.627 13994 13994 F DEBUG : backtrace:
02-12 15:17:01.627 13994 13994 F DEBUG : NOTE: Function names and BuildId information is missing for some frames due
02-12 15:17:01.627 13994 13994 F DEBUG : NOTE: to unreadable libraries. For unwinds of apps, only shared libraries
02-12 15:17:01.627 13994 13994 F DEBUG : NOTE: found under the lib/ directory are readable.
02-12 15:17:01.627 13994 13994 F DEBUG : NOTE: On this device, run setenforce 0 to make the libraries readable.
02-12 15:17:01.627 13994 13994 F DEBUG : NOTE: Unreadable libraries:
02-12 15:17:01.627 13994 13994 F DEBUG : NOTE: /data/local/tmp/etvk/execute_bpte
02-12 15:17:01.627 13994 13994 F DEBUG : #00 pc 00000000005bf9f4 /vendor/lib64/libllvm-qgl.so (!!!0000!94f922484c7c2a123d83cc9a7b0fc0!dc3d4da3a2!+84) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #01 pc 00000000006cd4c8 /vendor/lib64/libllvm-qgl.so (!!!0000!8aa9316cfa40ea8e13922cfdcda509!dc3d4da3a2!+120) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #02 pc 00000000006ccdd4 /vendor/lib64/libllvm-qgl.so (!!!0000!99fd46ca6897ca43f4eedd7822487a!dc3d4da3a2!+436) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #03 pc 0000000000910934 /vendor/lib64/libllvm-qgl.so (!!!0000!866bd28e17dc06a823006799f7570e!dc3d4da3a2!+532) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #04 pc 000000000090ccdc /vendor/lib64/libllvm-qgl.so (!!!0000!2a7897fa7e385f84d70f5d88ea5046!dc3d4da3a2!+2508) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #05 pc 0000000000d68a64 /vendor/lib64/libllvm-qgl.so (!!!0000!4ee45c73f202da09ceb9e97299e78c!dc3d4da3a2!+724) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #06 pc 00000000006e41c8 /vendor/lib64/libllvm-qgl.so (!!!0000!e39e8cf324350f3c5a7f77e6d95208!dc3d4da3a2!+472) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #07 pc 00000000006e3af4 /vendor/lib64/libllvm-qgl.so (!!!0000!367303fb02553850da321d3446c78a!dc3d4da3a2!+100) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #08 pc 000000000081b5a4 /vendor/lib64/libllvm-qgl.so (!!!0000!0e406d1c583002d7aa7c873d54dca9!dc3d4da3a2!+372) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #09 pc 000000000081a314 /vendor/lib64/libllvm-qgl.so (!!!0000!115a3b096d9bc78c0dfb42d0e49024!dc3d4da3a2!+116) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #10 pc 0000000000819b54 /vendor/lib64/libllvm-qgl.so (!!!0000!aa916b5e953dd3dca1b992ddb2c964!dc3d4da3a2!+788) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #11 pc 0000000000d47d0c /vendor/lib64/libllvm-qgl.so (!!!0000!51d38902a0381d361b611c909947d9!dc3d4da3a2!+60) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #12 pc 0000000000977bc0 /vendor/lib64/libllvm-qgl.so (!!!0000!34520e27c398aec80a9430978fab84!dc3d4da3a2!+1424) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #13 pc 0000000000972908 /vendor/lib64/libllvm-qgl.so (!!!0000!b351d96637f21e15c92b76750b44e2!dc3d4da3a2!+760) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #14 pc 0000000000971790 /vendor/lib64/libllvm-qgl.so (CreateQGLCProgram(QGPUCompiler::CompileData*)+48) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #15 pc 00000000009712d0 /vendor/lib64/libllvm-qgl.so (!!!0000!1e9735fa2d7fa7113c5ea09cbdfdc0!dc3d4da3a2!+320) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #16 pc 0000000000970040 /vendor/lib64/libllvm-qgl.so (!!!0000!aefccce6a332610a9b22f30d0961cc!dc3d4da3a2!+592) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #17 pc 000000000004cfc4 /vendor/lib64/libllvm-glnext.so (!!!0000!3dcaee58dbbfbd4511f8fc7a97b9b9!dc3d4da3a2!+900) (BuildId: 9e51ef917b23889becdc61e58b6448fc)
02-12 15:17:01.627 13994 13994 F DEBUG : #18 pc 000000000027dd0c /vendor/lib64/hw/vulkan.adreno.so (!!!0000!9f8153b2695670b78964f3638e2666!dc3d4da3a2!+8076) (BuildId: 9ddb695a94bf97a272a018a299b56fb4)
02-12 15:17:01.627 13994 13994 F DEBUG : #19 pc 000000000027b3f4 /vendor/lib64/hw/vulkan.adreno.so (!!!0000!2aa5082753cd3c7ad1b8091f24093d!dc3d4da3a2!+340) (BuildId: 9ddb695a94bf97a272a018a299b56fb4)
02-12 15:17:01.627 13994 13994 F DEBUG : #20 pc 0000000000299b90 /vendor/lib64/hw/vulkan.adreno.so (!!!0000!4a8b3805ee4e9b1d8ce9b59e2f189a!dc3d4da3a2!+1072) (BuildId: 9ddb695a94bf97a272a018a299b56fb4)
02-12 15:17:01.627 13994 13994 F DEBUG : #21 pc 000000000029941c /vendor/lib64/hw/vulkan.adreno.so (qglinternal::vkCreateComputePipelines(VkDevice_T*, VkPipelineCache_T*, unsigned int, VkComputePipelineCreateInfo const*, VkAllocationCallbacks const*, VkPipeline_T**)+684) (BuildId: 9ddb695a94bf97a272a018a299b56fb4)
02-12 15:17:01.627 13994 13994 F DEBUG : #22 pc 0000000001049bc8 /data/local/tmp/etvk/execute_bpte
02-12 15:17:01.627 13994 13994 F DEBUG : #23 pc 0000000000f2310c /data/local/tmp/etvk/execute_bpte
02-12 15:17:01.627 13994 13994 F DEBUG : #24 pc 0000000000f099ec /data/local/tmp/etvk/execute_bpte
02-12 15:17:01.628 13994 13994 F DEBUG : #25 pc 0000000000f08918 /data/local/tmp/etvk/execute_bpte
02-12 15:17:01.628 13994 13994 F DEBUG : #26 pc 0000000000697780 /data/local/tmp/etvk/execute_bpte
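To see at a glance where the crash sits, the backtrace can be summarized per library. The following is a minimal sketch, assuming the tombstone frame format shown in the trace (`#NN pc <offset> <library path>`); the helper names `parse_frames` and `frames_per_library` are hypothetical and not part of any Android tooling.

```python
import re
from collections import Counter

# Matches tombstone frame lines of the form "#NN pc <hex offset> <library path> ..."
FRAME_RE = re.compile(r"#(\d+)\s+pc\s+([0-9a-f]+)\s+(\S+)")


def parse_frames(log_text):
    """Return a list of (frame_index, pc_offset, library_path) tuples."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((int(m.group(1)), int(m.group(2), 16), m.group(3)))
    return frames


def frames_per_library(frames):
    """Count how many frames belong to each library."""
    return Counter(lib for _, _, lib in frames)


# Four frames copied from the trace above.
sample = """\
02-12 15:17:01.627 13994 13994 F DEBUG : #00 pc 00000000005bf9f4 /vendor/lib64/libllvm-qgl.so (!!!0000!94f922484c7c2a123d83cc9a7b0fc0!dc3d4da3a2!+84) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #14 pc 0000000000971790 /vendor/lib64/libllvm-qgl.so (CreateQGLCProgram(QGPUCompiler::CompileData*)+48) (BuildId: 197773a235861a62fb29a17d08291e53)
02-12 15:17:01.627 13994 13994 F DEBUG : #17 pc 000000000004cfc4 /vendor/lib64/libllvm-glnext.so (!!!0000!3dcaee58dbbfbd4511f8fc7a97b9b9!dc3d4da3a2!+900) (BuildId: 9e51ef917b23889becdc61e58b6448fc)
02-12 15:17:01.627 13994 13994 F DEBUG : #22 pc 0000000001049bc8 /data/local/tmp/etvk/execute_bpte
"""

frames = parse_frames(sample)
for lib, count in frames_per_library(frames).most_common():
    print(f"{count:3d}  {lib}")
```

Feeding the full logcat output to `parse_frames` instead of `sample` gives the per-library breakdown for the whole trace.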
Environment
Android NDK: 29.0.13846066
Vulkan SDK: 1.4.321.0
GLSLC version:
shaderc v2023.8 v2025.3
spirv-tools v2025.3 v2022.4-833-g33e02568
glslang 11.1.0-1253-gefd24d75
Target: SPIR-V 1.0
Repro steps
Ensure that the Vulkan SDK is installed (latest version is OK) and that glslc exists on your path:
glslc --version
The Android NDK must also be installed. Any NDK version past NDK r17c should suffice. Set the ANDROID_NDK environment variable to the install location:
export ANDROID_NDK=...
Repository Setup
Set up the ExecuTorch repo. I prepared the ssj_s23_segv_repro branch to make the issue easier to reproduce.
git clone https://github.com/pytorch/executorch.git
cd executorch
git fetch
git checkout ssj_s23_segv_repro
git submodule update --init
Export Model
Build the Python libraries and install ExecuTorch into your Python environment. Run from the ExecuTorch root:
./install_executorch.sh -e
Export the model to use for reproduction:
python ./cnn_toy.py
python ./cnn_toy.py --fp16
This should create two model files in the current dir:
$ ls | grep pte
cnn_toy_vulkan_fp16.pte
cnn_toy_vulkan_fp32.pte
Build libraries and model runner for Android
For this step, ensure ANDROID_NDK is set to the install path of the Android NDK.
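A quick preflight check can confirm both prerequisites before configuring. This is only a sketch: the function name `check_prerequisites` is hypothetical, and the toolchain path it probes is the one passed to `-DCMAKE_TOOLCHAIN_FILE` in the cmake command in this section.

```python
import os
import shutil


def check_prerequisites(env=None):
    """Return a list of problems found; an empty list means everything looks OK."""
    env = os.environ if env is None else env
    problems = []

    # glslc must be discoverable on PATH (it ships with the Vulkan SDK).
    if shutil.which("glslc", path=env.get("PATH", "")) is None:
        problems.append("glslc not found on PATH")

    # ANDROID_NDK must point at an NDK install containing the CMake toolchain file.
    ndk = env.get("ANDROID_NDK")
    if not ndk:
        problems.append("ANDROID_NDK is not set")
    else:
        toolchain = os.path.join(ndk, "build", "cmake", "android.toolchain.cmake")
        if not os.path.isfile(toolchain):
            problems.append(f"toolchain file missing: {toolchain}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```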
Running from ExecuTorch Root:
cmake . \
-DCMAKE_INSTALL_PREFIX=cmake-out-android-so \
-DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
-DANDROID_SUPPORT_FLEXIBLE_PAGE_SIZES=ON \
--preset "android-arm64-v8a" \
-DANDROID_PLATFORM=android-28 \
-DPYTHON_EXECUTABLE=python \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_C_COMPILER_LAUNCHER=ccache \
-DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
-DEXECUTORCH_PAL_DEFAULT=posix \
-DEXECUTORCH_BUILD_VULKAN=ON \
-DEXECUTORCH_BUILD_TESTS=OFF \
-DEXECUTORCH_BUILD_EXTENSION_EVALUE_UTIL=ON \
-DEXECUTORCH_BUILD_EXECUTOR_RUNNER=ON \
-DEXECUTORCH_ENABLE_EVENT_TRACER=ON \
-Bcmake-out-android-so && \
cmake --build cmake-out-android-so -j16 --target install --config Release
Then, push model files to device and attempt inference:
export MODEL_PATH=./cnn_toy_vulkan_fp16.pte && \
export MODEL_FILE=$(basename ${MODEL_PATH}) && \
adb shell mkdir -p /data/local/tmp/etvk/models/ && \
adb push $MODEL_PATH /data/local/tmp/etvk/models/$MODEL_FILE && \
adb push cmake-out-android-so/executor_runner /data/local/tmp/etvk && \
adb shell /data/local/tmp/etvk/executor_runner --model_path /data/local/tmp/etvk/models/$MODEL_FILE
When executing the model, the runtime will log the compute pipeline currently being created:
[ET-VK] Skipping pipeline for shader: concat_2_buffer_float
[ET-VK] Creating pipeline 1/34: buffer_to_nchw_float_float
[ET-VK] Creating pipeline 2/34: mean_per_row_buffer_float
[ET-VK] Creating pipeline 3/34: clone_image_to_buffer_float_float
When the segmentation fault occurs, the output stops. The last shader name printed identifies the compute pipeline that triggered the fault.
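If the runner output is captured to a file, the offending shader can be pulled out automatically. A minimal sketch, assuming the `[ET-VK] Creating pipeline N/M: name` log format shown above; `last_pipeline` is a hypothetical helper name.

```python
import re

# Matches lines such as "[ET-VK] Creating pipeline 3/34: clone_image_to_buffer_float_float"
CREATING_RE = re.compile(r"\[ET-VK\] Creating pipeline (\d+)/(\d+): (\S+)")


def last_pipeline(log_text):
    """Return (index, total, shader_name) for the last 'Creating pipeline' line, or None."""
    last = None
    for line in log_text.splitlines():
        m = CREATING_RE.search(line)
        if m:
            last = (int(m.group(1)), int(m.group(2)), m.group(3))
    return last


sample = """\
[ET-VK] Skipping pipeline for shader: concat_2_buffer_float
[ET-VK] Creating pipeline 1/34: buffer_to_nchw_float_float
[ET-VK] Creating pipeline 2/34: mean_per_row_buffer_float
[ET-VK] Creating pipeline 3/34: clone_image_to_buffer_float_float
"""

print(last_pipeline(sample))  # (3, 34, 'clone_image_to_buffer_float_float')
```

"Skipping pipeline" lines are ignored on purpose, since a skipped shader never reaches vkCreateComputePipelines.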
Inspecting GLSL/SPIR-V compute shaders
To inspect the GLSL/SPIR-V code of the shader:
$ cat cmake-out-android-so/vulkan_compute_shaders/clone_image_to_buffer_float_float.glsl
$ cat cmake-out-android-so/vulkan_compute_shaders/clone_image_to_buffer_float_float.spv
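Note that the .spv file is a binary SPIR-V module, so `cat` prints raw bytes; the `spirv-dis` tool from SPIRV-Tools can disassemble it to text. As a lightweight alternative, the sketch below decodes just the five-word module header (magic number, version, generator, ID bound, schema), which is enough to confirm that the module targets SPIR-V 1.0 as listed under Environment. The function name `read_spirv_header` is hypothetical.

```python
import struct

SPIRV_MAGIC = 0x07230203  # first word of every SPIR-V module


def read_spirv_header(data):
    """Decode the five-word SPIR-V module header from raw bytes."""
    if len(data) < 20:
        raise ValueError("too short to be a SPIR-V module")
    # Try little-endian first, then big-endian in case the module is byte-swapped.
    for fmt in ("<5I", ">5I"):
        magic, version, generator, bound, schema = struct.unpack_from(fmt, data)
        if magic == SPIRV_MAGIC:
            return {
                # Version word layout: 0 | major | minor | 0 (one byte each).
                "version": ((version >> 16) & 0xFF, (version >> 8) & 0xFF),
                "generator": generator,
                "bound": bound,
                "schema": schema,
            }
    raise ValueError("bad SPIR-V magic number")


# Synthetic header for illustration only: SPIR-V 1.0, generator 0, bound 42, schema 0.
demo = struct.pack("<5I", SPIRV_MAGIC, 0x00010000, 0, 42, 0)
print(read_spirv_header(demo))
```

To apply it to a real shader, read the file in binary mode (for example `open(path, "rb").read()`) and pass the bytes to `read_spirv_header`.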