npu support #342

referearn89-lang · 2026-05-08T16:02:39Z

Sir, thank you for making this application. I want to suggest something you probably already know: many devices nowadays come with an NPU for AI, and Snapdragon devices are among them. If I'm right, you have already implemented NPU support for image generation. If we could also use the NPU for text generation, we would get faster replies without heating the device too much.
That's all, I just wanted to suggest NPU support for text generation. Thank you! The Pocket Pal app supports the NPU for text generation, if you want inspiration.

maoist2009 · 2026-06-28T21:29:50Z

Qualcomm's NPU LLM support is still quite outdated and lacks the latest on-device models such as Gemma4 and Qwen3.5.

However, I found that Google's official LiteRT already supports running Gemma4 E2B on the SM8750 (and, in theory, the 8850 through forward compatibility). On my Poco F7 Ultra it runs at 15 tokens per second with very little heat. Memory usage is significant, though: a 4096-token context consumes 5 GB, and maxing out the 128K context takes 9 GB.
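
As a rough sanity check on those two memory figures, here is a minimal sketch that fits a straight line through them. It assumes memory grows linearly with context length, that "GB" means 10**9 bytes, and that "128K" means 131072 tokens; none of that was measured, and the function names are made up for illustration.

```python
# Rough linear fit to the two memory figures quoted above.
# Assumptions (not measured): memory grows linearly with context length,
# "GB" means 10**9 bytes, and "128K" means 131072 tokens.

GB = 10**9


def fit_linear(ctx_small, mem_small_gb, ctx_large, mem_large_gb):
    """Return (fixed_bytes, bytes_per_token) for a line through two points."""
    per_token = (mem_large_gb - mem_small_gb) * GB / (ctx_large - ctx_small)
    fixed = mem_small_gb * GB - per_token * ctx_small
    return fixed, per_token


def estimate_gb(ctx_tokens, fixed, per_token):
    """Estimated total memory in GB for a given context length."""
    return (fixed + per_token * ctx_tokens) / GB


fixed, per_token = fit_linear(4096, 5, 131072, 9)
print(f"fixed cost ~ {fixed / GB:.2f} GB")
print(f"per-token cost ~ {per_token / 1000:.1f} kB")
print(f"32K context ~ {estimate_gb(32768, fixed, per_token):.2f} GB")
```

Under those assumptions the fixed cost comes out a little under 5 GB and each extra token of context costs roughly 31 kB, so an intermediate context such as 32K would land at about 6 GB.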

I am trying to create files for Qwen3.5 4B.
