v0.6.6 - Switch export to LiteRT and make Android input layout-adaptive (#548)
Summary
v0.6.6 is a major Android model update: the app moves from legacy TFLite export tooling to LiteRT-based `.tflite` models, adds smarter Android runtime compatibility for both old and new model layouts, and trims the example app by making Snapdragon QNN/NPU support optional.
Key Changes
- Switched Android official model exports to LiteRT
  The release now uses Ultralytics' newer `format=litert` export flow instead of the older TFLite export path. Official Android assets now come from `v0.6.6` and use the new `_w8a32.tflite` naming.
- Android runtime now auto-adapts to different input layouts
  New LiteRT exports use a different tensor layout than older `.tflite` files. The Android code now detects this automatically, so it can run both of the following without users needing to manually adjust anything:
  - new LiteRT NCHW models
  - older legacy NHWC TFLite models
- Fixed support for more model types on Android
  Segment, semantic segmentation, and pose models exported with LiteRT now load and run correctly. The runtime was updated to recognize the newer LiteRT output tensor naming and shape conventions.
- Official Android assets updated to w8a32 LiteRT
  The project now ships w8a32 LiteRT models for Android, chosen because they stay small, work with the GPU path, and do not require calibration data during export.
- Export scripts were simplified and modernized
  The Android export pipeline now targets LiteRT directly, uses the newer `quantize` setting, removes legacy conversion workarounds, and depends on `ultralytics>=8.4.83`.
- Snapdragon QNN runtime is now opt-in in the example app
  The example Android app no longer bundles the heavy Qualcomm QNN/NPU runtime by default. This dramatically reduces app size, while still allowing NPU testing when explicitly enabled.
- Small release and workflow polish
  - Example app build number was bumped for store upload acceptance.
  - Slack release notifications were simplified and improved with clickable PR links.
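
The layout auto-detection described above can be illustrated with a small sketch. This is not the plugin's Android code: it is a hypothetical Python helper (the name `detect_input_layout` and the 3-channel assumption are ours) that shows one plausible way to tell an NCHW input from an NHWC one by looking only at the 4-D input tensor shape.

```python
from typing import Sequence


def detect_input_layout(shape: Sequence[int], channels: int = 3) -> str:
    """Guess the layout of a 4-D image input tensor from its shape.

    Assumption: the model takes a `channels`-channel image (RGB by default),
    so the channel axis is the one whose size equals `channels`.
    """
    if len(shape) != 4:
        raise ValueError(f"expected a 4-D input shape, got {list(shape)}")
    _, dim1, _, dim3 = shape
    if dim1 == channels and dim3 != channels:
        return "NCHW"  # e.g. [1, 3, 640, 640]
    if dim3 == channels and dim1 != channels:
        return "NHWC"  # e.g. [1, 640, 640, 3]
    # Both or neither axis match: refuse to guess rather than mis-feed the model.
    raise ValueError(f"ambiguous or unsupported input shape: {list(shape)}")


print(detect_input_layout([1, 3, 640, 640]))  # NCHW
print(detect_input_layout([1, 640, 640, 3]))  # NHWC
```

Raising on an ambiguous shape is a deliberate choice in this sketch: silently picking a layout would produce wrong predictions instead of a visible error.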
Purpose & Impact
- Better future compatibility for Android models
  Moving to LiteRT aligns the Flutter app with the current Ultralytics export path, making it easier to support newer YOLO models and export improvements going forward.
- Smoother upgrades for existing users
  Because Android now auto-detects input layout and output conventions, older TFLite models should continue working, while new LiteRT models work out of the box too.
- More reliable support across tasks
  Users working with segmentation, semantic segmentation, and pose should see better stability and fewer model-loading/runtime issues on Android.
- Smaller downloads for Android app testers
  Making QNN optional cuts the default example app size significantly, which helps with faster installs and easier testing. Users who want Snapdragon NPU acceleration can still enable it explicitly.
- Practical performance-focused default for Android
  The new w8a32 LiteRT assets were selected as a balanced default: small file size, GPU-friendly, and easier export. For most users, that means a more dependable Android deployment path.
- Cleaner maintenance for developers
  The export and release flow is now simpler, with less legacy logic and clearer model packaging, which should reduce breakage and make future releases easier to manage.
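
To make the "smoother upgrades" point concrete: supporting both layouts means the same camera frame has to be packed differently depending on the model. The sketch below is purely illustrative and is not taken from the plugin. The function name `hwc_to_chw` and the flat-list representation are assumptions; it shows how an interleaved (NHWC-style) pixel buffer could be repacked into the planar (NCHW-style) order.

```python
from typing import List, Sequence


def hwc_to_chw(pixels: Sequence[float], height: int, width: int,
               channels: int = 3) -> List[float]:
    """Repack a flat interleaved HWC buffer into planar CHW order."""
    plane = height * width
    if len(pixels) != plane * channels:
        raise ValueError("buffer length does not match height * width * channels")
    out = [0.0] * (plane * channels)
    for i in range(plane):          # i indexes a pixel in row-major order
        for c in range(channels):   # c indexes the channel within that pixel
            out[c * plane + i] = pixels[i * channels + c]
    return out


# 1x2 RGB image: pixel 0 = (1, 2, 3), pixel 1 = (4, 5, 6)
print(hwc_to_chw([1, 2, 3, 4, 5, 6], height=1, width=2))  # [1, 4, 2, 5, 3, 6]
```

A legacy NHWC model would consume the interleaved buffer unchanged, so under this sketch only the NCHW path pays the cost of the extra repacking pass.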
What's Changed
- Make Snapdragon QNN runtime opt-in in the example app by @glenn-jocher in #543
- Bump example app build number to 0.6.5+12 by @glenn-jocher in #544
- Use quantize export arg in TFLite export examples by @glenn-jocher in #545
- Simplify Slack notification messages by @glenn-jocher in #546
- Linkify PR numbers in publish Slack notifications by @glenn-jocher in #547
- Switch export to LiteRT and make Android input layout-adaptive by @glenn-jocher in #548
Full Changelog: v0.6.5...v0.6.6