
NNAPIFlags for Specifying CPU or GPU Inference Do Not Take Effect #2

Open
BeIanChang opened this issue Feb 28, 2024 · 2 comments

@BeIanChang

BeIanChang commented Feb 28, 2024

  • ONNX Runtime version: 1.17.0
  • Android version: 10
  • Kotlin version: 1.9.22
  • Java version: 2.1
  • SDK Build-Tools: 33.0.1
  • AGP: 8.1.3
  • GPU: ARM Mali GPU | G310
  • CPU: ARMv8 Processor rev 4 (v8l)

Steps to Reproduce:

I'm trying to configure the ONNX Runtime session within the DepthAnything class to use the NNAPI execution provider with specific flags (e.g. USE_FP16 and CPU_DISABLED). Here's the code snippet for setting up the session options:

    import ai.onnxruntime.OrtEnvironment
    import ai.onnxruntime.OrtSession
    import ai.onnxruntime.providers.NNAPIFlags
    import java.util.EnumSet

    private val ortEnvironment = OrtEnvironment.getEnvironment()
    private val ortSession: OrtSession
    private val inputName: String

    init {
        // Create session options with the NNAPI execution provider,
        // requesting FP16 inference and disabling NNAPI's CPU fallback
        val options = OrtSession.SessionOptions().apply {
            addNnapi(EnumSet.of(NNAPIFlags.USE_FP16, NNAPIFlags.CPU_DISABLED))
        }
        // Load the model from app assets and create the inference session
        val modelByteArray = context.assets.open("depth_anything_small_fp16.onnx").readBytes()
        ortSession = ortEnvironment.createSession(modelByteArray, options)
        inputName = ortSession.inputNames.iterator().next()
    }

Expected Behavior:
I expected the model inference to run using NNAPI with FP16 precision and without using the CPU.

Actual Behavior:
The inference seems to run as if these options were not applied at all. The performance and behavior do not change regardless of the flags set.

@shubham0204 (Owner)

@BeIanChang I have not explored the USE_FP16 and CPU_DISABLED options in onnxruntime. I'll study them and check how we can apply these options effectively to the Depth-Anything ONNX model used in this project.

@BeIanChang (Author)

In the Android Studio Logcat, I got the following output while loading the model:

2024-02-28 17:49:49.464 17037-17037 onnxruntime             com.ml.shubham0204.depthanything     W   [W:onnxruntime:, graph.cc:108 MergeShapeInfo] Error merging shape info for output. 'post-postop-output' source:{-1,-1,-1,1} target:{1,518,518,3}. Falling back to lenient merge.
2024-02-28 17:49:49.481 17037-17037 Manager                 com.ml.shubham0204.depthanything     I  DeviceManager::DeviceManager
2024-02-28 17:49:49.482 17037-17037 Manager                 com.ml.shubham0204.depthanything     I  findAvailableDevices
2024-02-28 17:49:52.742 17037-17037 onnxruntime             com.ml.shubham0204.depthanything     W   [W:onnxruntime:ort-java, nnapi_execution_provider.cc:225 GetCapability] NnapiExecutionProvider::GetCapability, number of partitions supported by NNAPI: 3 number of nodes in the graph: 724 number of nodes supported by NNAPI: 11
2024-02-28 17:49:52.808 17037-17037 onnxruntime             com.ml.shubham0204.depthanything     W   [W:onnxruntime:ort-java, nnapi_execution_provider.cc:225 GetCapability] NnapiExecutionProvider::GetCapability, number of partitions supported by NNAPI: 3 number of nodes in the graph: 725 number of nodes supported by NNAPI: 11
2024-02-28 17:49:52.869 17037-17037 TypeManager             com.ml.shubham0204.depthanything     I  Failed to read /vendor/etc/nnapi_extensions_app_allowlist ; No app allowlisted for vendor extensions use.
2024-02-28 17:49:55.624 17037-17037 onnxruntime             com.ml.shubham0204.depthanything     W   [W:onnxruntime:, session_state.cc:1166 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-02-28 17:49:55.624 17037-17037 onnxruntime             com.ml.shubham0204.depthanything     W   [W:onnxruntime:, session_state.cc:1168 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.

So I guess that most of the model is simply not supported by NNAPI: the log shows only 11 of the graph's ~724 nodes are assigned to the NNAPI execution provider (split across 3 partitions), so the rest of the graph falls back to the CPU provider regardless of the flags.
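One way to confirm which nodes actually land on NNAPI is to rerun with verbose session logging, as the `VerifyEachNodeIsAssignedToAnEp` warning above suggests. The sketch below assumes the `setSessionLogLevel` / `setSessionLogVerbosityLevel` methods and `OrtLoggingLevel` enum from the onnxruntime Java API; `createVerboseSession` itself is a hypothetical helper, not code from this project:

```kotlin
import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtLoggingLevel
import ai.onnxruntime.OrtSession
import ai.onnxruntime.providers.NNAPIFlags
import java.util.EnumSet

// Hypothetical helper: create the session with verbose logging so that
// onnxruntime prints per-node execution-provider assignments to Logcat.
// Per the warning above, this requires a non-minimal onnxruntime build.
fun createVerboseSession(modelBytes: ByteArray): OrtSession {
    val env = OrtEnvironment.getEnvironment()
    val options = OrtSession.SessionOptions().apply {
        addNnapi(EnumSet.of(NNAPIFlags.USE_FP16, NNAPIFlags.CPU_DISABLED))
        // VERBOSE logging makes the node-to-EP assignment visible
        setSessionLogLevel(OrtLoggingLevel.ORT_LOGGING_LEVEL_VERBOSE)
        setSessionLogVerbosityLevel(1)
    }
    return env.createSession(modelBytes, options)
}
```

With this in place, Logcat should show each node tagged with either `NnapiExecutionProvider` or `CPUExecutionProvider`, making it clear how much of the model the flags can actually affect.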
