Repeated token generation for GPU-based Gemma models on device #5414
Labels: os:windows, platform:android, stat:awaiting response, task:LLM inference, type:support
Have I written custom code (as opposed to using a stock example script provided in MediaPipe)?
No
OS Platform and Distribution
Windows 11, Android 14
MediaPipe Tasks SDK version
No response
Task name (e.g. Image classification, Gesture recognition etc.)
LLMInference
Programming Language and version (e.g. C++, Python, Java)
Kotlin/Java
Describe the actual behavior
The GPU model generates garbage output: the same token is repeated indefinitely.
Describe the expected behavior
A meaningful response from the model
Standalone code/steps you may have used to try to get what you need
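No reproduction code was attached. For reference, a typical GPU-backed Gemma setup with the MediaPipe LLM Inference Kotlin API looks roughly like the sketch below; the model path and sampling parameters are illustrative assumptions, not the reporter's actual configuration:

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

fun runGemma(context: Context): String {
    // Hypothetical on-device path to a GPU-converted Gemma model bundle.
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma-2b-it-gpu-int4.bin")
        .setMaxTokens(512)       // upper bound on prompt + response tokens
        .setTopK(40)             // illustrative sampling settings
        .setTemperature(0.8f)
        .build()

    // Create the inference engine and run a single prompt.
    val llmInference = LlmInference.createFromOptions(context, options)
    return llmInference.generateResponse("Write a short haiku about Android.")
}
```

With the issue as described, `generateResponse` would return a string consisting of one token repeated until the token limit is hit, rather than a coherent completion.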
Other info / Complete Logs