
Genai LLMInference: Failed to load GPU model with the error "Failed to build program executable - Out of host memoryPass" #5406

Open
KosuriSireesha opened this issue May 15, 2024 · 7 comments
Assignees
Labels
platform:android (Issues with Android as Platform), stat:awaiting response (Waiting for user response), task:LLM inference (Issues related to MediaPipe LLM Inference Gen AI setup), type:bug (Bug in the Source Code of MediaPipe Solution)

Comments

@KosuriSireesha

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

None

OS Platform and Distribution

Android 14

Mobile device if the issue happens on mobile device

Android Mobile device

Browser and version if the issue happens on browser

No response

Programming Language and version

Kotlin

MediaPipe version

0.10.14

Bazel version

No response

Solution

LLMInference

Android Studio, NDK, SDK versions (if issue is related to building in Android environment)

No response

Xcode & Tulsi version (if issue is related to building for iOS)

No response

Describe the actual behavior

Initialization of LlmInference fails when loading GPU models with the latest Maven package (0.10.14), with the error "Failed to build program executable - Out of host memoryPass".

Describe the expected behaviour

The LLM Inference app should run using the GPU model and retrieve information successfully.

Standalone code/steps you may have used to try to get what you need

Tested on an Android mobile device.
Followed the steps described at https://ai.google.dev/edge/mediapipe/solutions/genai/llm_inference/android:
1. Used the LLM Inference example app from https://github.com/googlesamples/mediapipe
2. Added the Maven dependency to the build.gradle file:
 dependencies {
    implementation 'com.google.mediapipe:tasks-genai:0.10.14'
}
https://mvnrepository.com/artifact/com.google.mediapipe/tasks-genai
3. GPU model used in the LlmInference example: gemma-2b-it-gpu-int4.bin (downloaded from https://www.kaggle.com/models/google/gemma/tfLite)


4. Ran the LlmInference app on the mobile device (a minimal initialization sketch is included after the notes below).

Initialization of LlmInference fails when loading GPU models with the latest Maven package (0.10.14), with the error "Failed to build program executable - Out of host memoryPass".

It also fails with the tasks-genai Maven package 0.10.13.

Note: CPU models work fine. With Maven package 0.10.11, GPU models also work fine.
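
For reference, the initialization that fails is roughly the following. This is a minimal sketch assuming the tasks-genai Kotlin API described in the LLM Inference documentation; the on-device model path is a placeholder and should point to wherever the gemma-2b-it-gpu-int4.bin file was pushed.

    import android.content.Context
    import com.google.mediapipe.tasks.genai.llminference.LlmInference

    fun createGpuLlm(context: Context): LlmInference {
        val options = LlmInference.LlmInferenceOptions.builder()
            // Placeholder path; adjust to where the GPU model file is stored on the device.
            .setModelPath("/data/local/tmp/llm/gemma-2b-it-gpu-int4.bin")
            .setMaxTokens(1024)
            .setTopK(40)
            .setTemperature(0.8f)
            .setRandomSeed(0)
            .build()

        // createFromOptions() is where the crash surfaces: the native
        // LlmGpuCalculator aborts with "Failed to build program executable -
        // Out of host memoryPass" on 0.10.13/0.10.14, but succeeds on 0.10.11.
        return LlmInference.createFromOptions(context, options)
    }

A subsequent llmInference.generateResponse(prompt) is never reached, because the process aborts inside createFromOptions().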

Other info / Complete Logs

Error logs -
05-14 16:51:08.646 24123 24123 F DEBUG   : Abort message: 'F0000 00:00:1715685668.150224   24107 llm_engine.cc:185] Check failed: graph_->WaitUntilIdle() is OK (UNKNOWN: CalculatorGraph::Run() failed: 
05-14 16:51:08.646 24123 24123 F DEBUG   : Calculator::Open() for node "LlmGpuCalculator" failed: Failed to build program executable - Out of host memoryPass) '

05-14 16:51:08.647 24123 24123 F DEBUG   : backtrace:
05-14 16:51:08.647 24123 24123 F DEBUG   :       #00 pc 000000000005b690  /apex/com.android.runtime/lib64/bionic/libc.so (abort+168) (BuildId: a9682a43d4afba2f7ad4dbb2a45a3a46)
05-14 16:51:08.647 24123 24123 F DEBUG   :       #01 pc 000000000056c0e8  /data/app/~~l8VmStNhgr8s-Zsnqe56Vw==/com.google.mediapipe.examples.llminference-ozuwrcwM1I2BwNY2zNhIwQ==/base.apk!libllm_inference_engine_jni.so (offset 0x100000)
05-14 16:51:08.647 24123 24123 F DEBUG   :       #02 pc 000000000056c468  /data/app/~~l8VmStNhgr8s-Zsnqe56Vw==/com.google.mediapipe.examples.llminference-ozuwrcwM1I2BwNY2zNhIwQ==/base.apk!libllm_inference_engine_jni.so (offset 0x100000)
05-14 16:51:08.647 24123 24123 F DEBUG   :       #03 pc 000000000056c208  /data/app/~~l8VmStNhgr8s-Zsnqe56Vw==/com.google.mediapipe.examples.llminference-ozuwrcwM1I2BwNY2zNhIwQ==/base.apk!libllm_inference_engine_jni.so (offset 0x100000)
@KosuriSireesha KosuriSireesha added the type:bug Bug in the Source Code of MediaPipe Solution label May 15, 2024
@kuaashish
Collaborator

kuaashish commented May 16, 2024

Hi @KosuriSireesha,

Could you confirm whether you are running the code on a physical device or an emulator? If it is a physical device, please provide the complete configuration and device name so we can reproduce and better understand the issue.

Thank you!!

@KosuriSireesha
Author

Hi @kuaashish,
I tested this issue on a physical device. It has a SD8635 chipset (Adreno 735 GPU).

@kuaashish kuaashish added task:LLM inference Issues related to MediaPipe LLM Inference Gen AI setup platform:android Issues with Android as Platform labels May 16, 2024
@kuaashish
Collaborator

Hi @schmidt-sebastian,

Can you please look into this issue too?

Thank you!!

@kuaashish kuaashish added the stat:awaiting googler Waiting for Google Engineer's Response label May 16, 2024
@schmidt-sebastian
Collaborator

What phone is this running on? We currently only support higher-end Android hardware.

@KosuriSireesha
Author

Hi @schmidt-sebastian,
I tested on recent hardware with a SD8635 chipset (Adreno 735 GPU). GPU models worked on version 0.10.11; they failed on versions 0.10.13 and 0.10.14.
Can you give more details on what counts as a higher-end Android hardware phone?
What is the minimum hardware requirement for the GPU models to work?
In what cases do we get the error "Failed to build program executable - Out of host memoryPass"?

@KosuriSireesha
Author

Hi @schmidt-sebastian
Any update on this issue?

@kuaashish
Collaborator

Hi @KosuriSireesha,

Could you please provide the name of your device and its RAM details? We believe there have been slight changes to the GPU in newer versions, which might be causing your device to be unable to run the model.

Thank you!!
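
If it helps, a small snippet using standard Android APIs (nothing MediaPipe-specific) can collect the device name and RAM details, for example:

    import android.app.ActivityManager
    import android.content.Context
    import android.os.Build

    fun deviceSummary(context: Context): String {
        val am = context.getSystemService(Context.ACTIVITY_SERVICE) as ActivityManager
        val memInfo = ActivityManager.MemoryInfo()
        am.getMemoryInfo(memInfo)
        val totalRamGb = memInfo.totalMem / (1024.0 * 1024.0 * 1024.0)
        // Produces e.g. "<manufacturer> <model> (<device>), Android 14, ~8.0 GB RAM"
        return "%s %s (%s), Android %s, ~%.1f GB RAM".format(
            Build.MANUFACTURER, Build.MODEL, Build.DEVICE,
            Build.VERSION.RELEASE, totalRamGb)
    }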

@kuaashish kuaashish added stat:awaiting response Waiting for user response and removed stat:awaiting googler Waiting for Google Engineer's Response labels May 30, 2024
Projects
None yet
Development

No branches or pull requests

3 participants