android: patch DT_NEEDED on GPU samplers so they actually load (fixes #270) #271
Open
prithidevghosh wants to merge 1 commit into
The two sampler .sos in upstream LiteRT-LM's prebuilt/android_arm64/ (libLiteRtTopKOpenClSampler.so, libLiteRtTopKWebGpuSampler.so) reference LiteRtCreateEnvironment as undefined but declare no NEEDED dependency on the library that provides it. Bionic's per-library linker namespace then refuses to resolve the symbol at dlopen, the samplers fail to load, and the engine silently falls back to CPU sampling. Measured impact in the wild: ~3 tok/s decode on Gemma 4 E2B INT4 instead of ~8.7 tok/s with GPU sampling (upstream google-ai-edge/LiteRT-LM#2211 reports a 2.87x speedup after the equivalent patch). This distribution links LiteRt symbols statically into the rebuilt libLiteRtLm.so, so unlike the upstream workaround (which adds libLiteRt.so) we add libLiteRtLm.so as the NEEDED dependency. patchelf is added as a host-side build dep (brew/apt). Fixes DenisovAV#270.
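The failure condition described above (an undefined symbol with no NEEDED entry for its provider) can be checked on the host before and after patching. The following is a minimal sketch, assuming llvm-readelf is on PATH and prints GNU-style tables; the function names and the READELF variable are hypothetical and are not part of build_android.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, for illustration only.
# READELF can be overridden, e.g. with the llvm-readelf shipped in the NDK.
READELF="${READELF:-llvm-readelf}"

# True if <lib.so> lists <symbol> as undefined (Ndx column is UND).
has_undefined_symbol() {  # usage: has_undefined_symbol <lib.so> <symbol>
  "$READELF" --dyn-syms "$1" | awk -v sym="$2" '
    { name = $8; sub(/@.*/, "", name) }   # drop any @VERSION suffix
    $7 == "UND" && name == sym { found = 1 }
    END { exit !found }'
}

# True if <lib.so> has a DT_NEEDED entry for <needed.so>.
declares_needed() {  # usage: declares_needed <lib.so> <needed.so>
  "$READELF" -d "$1" | grep -F "(NEEDED)" | grep -qF "[$2]"
}

# True if the symbol is undefined and its provider is not in DT_NEEDED,
# which is the situation this change is meant to fix.
needs_patch() {  # usage: needs_patch <lib.so> <symbol> <provider.so>
  has_undefined_symbol "$1" "$2" && ! declares_needed "$1" "$3"
}
```

For example, `needs_patch prebuilt/android_arm64/libLiteRtTopKOpenClSampler.so LiteRtCreateEnvironment libLiteRtLm.so` would be expected to succeed on the unpatched sampler and fail once libLiteRtLm.so appears in its NEEDED list.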
Summary
Fixes #270: the two GPU sampler .sos flutter_gemma ships in prebuilt/android_arm64/ (libLiteRtTopKOpenClSampler.so and libLiteRtTopKWebGpuSampler.so) silently fail to dlopen at runtime because they reference LiteRtCreateEnvironment (and friends) as undefined symbols but don't declare any NEEDED dependency on the library that provides it. The engine then falls back to CPU sampling, which costs ~3× in end-to-end decode throughput on Gemma 4 E2B INT4 (3 tok/s vs 8.7 tok/s; upstream google-ai-edge/LiteRT-LM#2211 measured a 2.87× speedup after the equivalent patch).
Fix
One new step (8b.) in native/litert_lm/build_android.sh: after copying Google's prebuilt companion .sos into prebuilt/android_arm64/, run patchelf --add-needed libLiteRtLm.so on both samplers. The verification step at the end of the script now also prints the post-patch DT_NEEDED list so CI logs make the fix visible.
The upstream workaround in google-ai-edge/LiteRT-LM#2211 targets libLiteRt.so. This distribution links LiteRt symbols statically into the rebuilt libLiteRtLm.so (see the comment above the bazelisk build, lines 96–102), so the correct target here is libLiteRtLm.so. I verified with llvm-readelf --dyn-syms that LiteRtCreateEnvironment is exported GLOBAL DEFAULT from the existing libLiteRtLm.so build, so adding it as NEEDED is sufficient; no other changes needed.
patchelf is added as a host-side build prerequisite (brew install patchelf / apt-get install patchelf); the script exits early with a clear message if it's missing. The --add-needed step is idempotent: it skips if libLiteRtLm.so is already in the NEEDED list, which lets the script be re-run safely.
Test plan
patchelf installed (or installs it before running build_android.sh).
llvm-readelf -d prebuilt/android_arm64/libLiteRtTopKOpenClSampler.so | grep NEEDED shows libLiteRtLm.so.
libLiteRtTopKWebGpuSampler.so.
Notes for the maintainer
-Wl,-z,max-page-size=16384) is preserved: patchelf --add-needed only touches the dynamic section (PT_DYNAMIC), not PT_LOAD alignment. Recent patchelf (≥0.18) handles this correctly.
linkopts; google-ai-edge/LiteRT-LM#2211 is still open. When upstream lands the proper fix, this patchelf step becomes a no-op (the idempotency check short-circuits it) and can be removed.
gh pr create --repo DenisovAV/flutter_gemma --base main --head prithidevghosh:fix/android-sampler-dt-needed --title "android: patch DT_NEEDED on GPU samplers so they actually load (fixes #270)" --body-file /tmp/pr_body.md 2>&1 | tail -5
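For readers who want to see the shape of the step described under Fix without opening the diff, here is a minimal sketch of a prerequisite check plus an idempotent --add-needed pass. It is not the code from build_android.sh: the function names are hypothetical, and printing the result with patchelf --print-needed is an assumption made here for brevity (the PR's own verification step uses llvm-readelf).

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not copied from build_android.sh.

# Fail early with an install hint when patchelf is missing.
require_patchelf() {
  if ! command -v patchelf >/dev/null 2>&1; then
    echo "error: patchelf not found (brew install patchelf / apt-get install patchelf)" >&2
    return 1
  fi
}

# Add <needed.so> to DT_NEEDED of <lib.so> unless it is already listed.
add_needed_once() {  # usage: add_needed_once <lib.so> <needed.so>
  local so="$1" needed="$2"
  # --print-needed emits the current DT_NEEDED entries, one per line.
  if patchelf --print-needed "$so" | grep -qxF "$needed"; then
    echo "skip: $so already needs $needed"
  else
    patchelf --add-needed "$needed" "$so"
    echo "patched: $so now needs $needed"
  fi
}

# Patch both GPU samplers in <dir> and print their DT_NEEDED lists.
patch_samplers() {  # usage: patch_samplers <dir>
  local dir="$1" so
  require_patchelf || return 1
  for so in libLiteRtTopKOpenClSampler.so libLiteRtTopKWebGpuSampler.so; do
    add_needed_once "$dir/$so" libLiteRtLm.so
    patchelf --print-needed "$dir/$so"   # make the result visible in logs
  done
}
```

Calling `patch_samplers prebuilt/android_arm64` twice should patch on the first run and report "skip" on the second, which is the re-run behaviour the description relies on.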