Closed
Description
I have recently been trying different models and testing them with android_rpc. However, I discovered that many of them fail on OpenCL, Vulkan, or both. The results for some of the models are listed below.
| | deploy_model_on_android.py | PyTorch Pretrained Resnet18 | Fast-depth | SC-SfMLearner |
|---|---|---|---|---|
| OpenCL | Fail | Success | Fail | Success |
| Vulkan | Success | Success | Fail | Fail |
All models run on the CPU without problems. For fast-depth on Vulkan, the mobile side hangs until the watchdog wakes up. For both OpenCL failures, logcat shows a crash like the following:
2020-10-07 16:18:25.702 30873-30900/org.apache.tvm.tvmrpc W/System.err: Load module from /data/user/0/org.apache.tvm.tvmrpc/cache/tvm4j_rpc_7263567248671157602/net.so
2020-10-07 16:18:28.912 30873-30900/org.apache.tvm.tvmrpc E/libc++abi: terminating with uncaught exception of type std::bad_cast: std::bad_cast
2020-10-07 16:18:28.912 30873-30900/org.apache.tvm.tvmrpc A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 30900 (Thread-2), pid 30873 (mrpc:RPCProcess)
2020-10-07 16:18:28.936 30961-30961/? W/crash_dump64: type=1400 audit(0.0:815): avc: denied { search } for name="org.apache.tvm.tvmrpc" dev="sda45" ino=6684883 scontext=u:r:crash_dump:s0:c512,c768 tcontext=u:object_r:app_data_file:s0:c512,c768 tclass=dir permissive=0
2020-10-07 16:18:28.952 30961-30961/? I/crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstone
2020-10-07 16:18:28.953 859-859/? I//system/bin/tombstoned: received crash request for pid 30873
2020-10-07 16:18:28.955 30961-30961/? I/crash_dump64: performing dump of process 30873 (target tid = 30900)
2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: Build fingerprint: 'google/walleye/walleye:8.1.0/OPM2.171026.006.G1/4820017:user/release-keys'
2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: Revision: 'MP1'
2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: ABI: 'arm64'
2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: pid: 30873, tid: 30900, name: Thread-2 >>> org.apache.tvm.tvmrpc:RPCProcess <<<
2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
2020-10-07 16:18:28.957 30961-30961/? A/DEBUG: Abort message: 'terminating with uncaught exception of type std::bad_cast: std::bad_cast'
2020-10-07 16:18:28.957 30961-30961/? A/DEBUG: x0 0000000000000000 x1 00000000000078b4 x2 0000000000000006 x3 0000000000000008
2020-10-07 16:18:28.957 30961-30961/? A/DEBUG: x4 fefeff75ad4fe127 x5 fefeff75ad4fe127 x6 fefeff75ad4fe127 x7 7f7f7f7fff7fff7f
2020-10-07 16:18:28.957 30961-30961/? A/DEBUG: x8 0000000000000083 x9 0000000010000000 x10 00000076b128cd00 x11 0000000000000001
2020-10-07 16:18:28.958 30961-30961/? A/DEBUG: x12 0000000000000018 x13 0000000000000000 x14 0000000000000000 x15 003668d56858db21
2020-10-07 16:18:28.958 30961-30961/? A/DEBUG: x16 00000057b320afa8 x17 0000007748dc352c x18 00000076a1bad640 x19 0000000000007899
2020-10-07 16:18:28.958 30961-30961/? A/DEBUG: x20 00000000000078b4 x21 0000000000000083 x22 ffffff80ffffffc8 x23 00000076b128cef0
2020-10-07 16:18:28.958 30961-30961/? A/DEBUG: x24 00000076b128cdd0 x25 00000076b128ce10 x26 00000076a8b460c0 x27 00000076a1bad538
2020-10-07 16:18:28.958 30961-30961/? A/DEBUG: x28 00000076a5059f00 x29 00000076b128cd40 x30 0000007748d78760
2020-10-07 16:18:28.958 30961-30961/? A/DEBUG: sp 00000076b128cd00 pc 0000007748d78788 pstate 0000000060000000
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: backtrace:
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #00 pc 000000000001d788 /system/lib64/libc.so (abort+120)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #01 pc 000000000009ce88 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #02 pc 000000000009d07c /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #03 pc 00000000000aead0 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #04 pc 00000000000ae0fc /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #05 pc 00000000000ae058 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so (__cxa_throw+112)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #06 pc 00000000000abe5c /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++.so (std::__1::locale::use_facet(std::__1::locale::id&) const+216)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #07 pc 00000000001d8ed8 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (_ZNSt3__124__put_character_sequenceIcNS_11char_traitsIcEEEERNS_13basic_ostreamIT_T0_EES7_PKS4_m+160)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #08 pc 0000000001329348 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::scalarizeLoad(llvm::Instruction const*)+648)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #09 pc 00000000013235e8 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::scalarizeInstruction(llvm::Instruction const*)+68)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #10 pc 0000000001322710 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::scalarizeModule(llvm::Module&)+804)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #11 pc 0000000001333800 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::runOnModule(llvm::Module&)+8)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #12 pc 0000000000315768 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::MPPassManager::runOnModule(llvm::Module&)+464)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #13 pc 00000000003163a8 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::PassManagerImpl::run(llvm::Module&)+400)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #14 pc 00000000009bd17c /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::llclib::Compile(llvm::Module*, void* (*)(unsigned int), char**, unsigned int&, llvm::Module*, llvm::CLPrintfInterpreter const*)+5504)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #15 pc 000000000156bf9c /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (clang::clanglib::Codegen(llvm::MemoryBuffer*, cl_compiler_target, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, llvm::OwningArrayPtr<char>&, unsigned int&, cl_rs_compiler_info*)+1164)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #16 pc 0000000001583268 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so ((anonymous namespace)::CompilationModel::link()+6848)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #17 pc 0000000001579950 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (cl_compiler_link_program+364)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #18 pc 0000000000053ae8 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libCB.so (cl_program_link_immediate+832)
2020-10-07 16:18:29.504 859-859/? E//system/bin/tombstoned: Tombstone written to: /data/tombstones/tombstone_07
I believe this is caused by bugs in TVM's mobile GPU support. Attached are a TorchScript model of fast-depth and the corresponding test script to reproduce the bug. The test was performed on a Pixel 2 running Android 8.1, while the server side used the demo_android docker image with PyTorch 1.4 installed.
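To make the logcat backtrace above easier to triage, here is a minimal sketch that extracts the frames and lists the shared libraries the crash passes through, in order of first appearance. It assumes debuggerd lines in the same `#NN pc <addr> <path> (<symbol>)` shape as the log above. The names `FRAME_RE`, `parse_backtrace`, and `libraries_in_order` are made up for illustration and are not part of TVM or the Android tooling. `SAMPLE` holds four frames copied from the log.

```python
import re
from collections import OrderedDict

# Matches debuggerd backtrace frames such as:
#   "... A/DEBUG: #00 pc 000000000001d788 /system/lib64/libc.so (abort+120)"
# The "(symbol+offset)" part is optional because some frames are unsymbolized.
FRAME_RE = re.compile(
    r"#(?P<idx>\d+)\s+pc\s+(?P<pc>[0-9a-f]+)\s+(?P<lib>\S+)"
    r"(?:\s+\((?P<sym>.*)\))?\s*$"
)

# Four frames copied verbatim from the log above (hypothetical test input).
SAMPLE = """\
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #00 pc 000000000001d788 /system/lib64/libc.so (abort+120)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #05 pc 00000000000ae058 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so (__cxa_throw+112)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #06 pc 00000000000abe5c /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++.so (std::__1::locale::use_facet(std::__1::locale::id&) const+216)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #08 pc 0000000001329348 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::scalarizeLoad(llvm::Instruction const*)+648)
"""


def parse_backtrace(log_text):
    """Return one dict per backtrace frame found in the logcat text."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append({
                "index": int(match.group("idx")),
                # Keep only the file name, not the full install path.
                "lib": match.group("lib").rsplit("/", 1)[-1],
                # None when the frame has no symbol information.
                "symbol": match.group("sym"),
            })
    return frames


def libraries_in_order(frames):
    """Return library names in order of first appearance, without duplicates."""
    return list(OrderedDict.fromkeys(frame["lib"] for frame in frames))


if __name__ == "__main__":
    for name in libraries_in_order(parse_backtrace(SAMPLE)):
        print(name)
```

Run on the full log, this should list `libc.so`, `libc++_shared.so`, `libc++.so`, `libllvm-qcom.so`, and `libCB.so`. That makes it easy to see that the exception is thrown while `libllvm-qcom.so` is linking the OpenCL program, and that the throw path crosses both `libc++.so` and `libc++_shared.so`.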