
v0.6.6 - Switch export to LiteRT and make Android input layout-adaptive (#548)


@UltralyticsAssistant released this 30 Jun 17:49
e0dfb65

🌟 Summary

🚀 v0.6.6 is a major Android model update: the app moves from legacy TFLite export tooling to LiteRT-based .tflite models, adds smarter Android runtime compatibility for both old and new model layouts, and trims the example app by making Snapdragon QNN/NPU support optional.

📊 Key Changes

  • Switched Android official model exports to LiteRT 📦
    The release now uses Ultralytics’ newer format=litert export flow instead of the older TFLite export path. Official Android assets now come from v0.6.6 and use the new _w8a32.tflite naming.

  • Android runtime now auto-adapts to different input layouts 🔄
    New LiteRT exports use a different tensor layout than older .tflite files. The Android code now detects the layout automatically, so both of the following run without manual adjustment:

    • new LiteRT NCHW models
    • older legacy NHWC TFLite models
  • Fixed support for more model types on Android ✅
    Segment, semantic segmentation, and pose models exported with LiteRT now load and run correctly. The runtime was updated to recognize the newer LiteRT output tensor naming and shape conventions.

  • Official Android assets updated to w8a32 LiteRT ⚡
    The project now ships w8a32 LiteRT models for Android, chosen because they stay small, work with the GPU path, and do not require calibration data during export.

  • Export scripts were simplified and modernized 🛠️
    The Android export pipeline now targets LiteRT directly, uses the newer quantize setting, removes legacy conversion workarounds, and depends on ultralytics>=8.4.83.

  • Snapdragon QNN runtime is now opt-in in the example app 📉
    The example Android app no longer bundles the heavy Qualcomm QNN/NPU runtime by default. This dramatically reduces app size, while still allowing NPU testing when explicitly enabled.

  • Small release and workflow polish ✨

    • Example app build number was bumped for store upload acceptance.
    • Slack release notifications were simplified and improved with clickable PR links.
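
The input-layout auto-detection described above can be illustrated with a minimal sketch. This is not the plugin's actual Android code: it assumes a 4-D image input whose channel dimension has size 3, and the names detect_input_layout and hwc_to_chw are hypothetical, chosen only for this example.

```python
from typing import List, Sequence


def detect_input_layout(shape: Sequence[int]) -> str:
    """Guess the layout of a 4-D image input tensor from its shape.

    Assumption for this sketch: the model takes 3-channel images, so the
    position of the dimension equal to 3 tells NCHW apart from NHWC.
    """
    if len(shape) != 4:
        raise ValueError(f"expected a 4-D input shape, got {list(shape)}")
    _, dim1, _, dim3 = shape
    if dim1 == 3 and dim3 != 3:
        return "NCHW"  # e.g. [1, 3, 640, 640]
    if dim3 == 3 and dim1 != 3:
        return "NHWC"  # e.g. [1, 640, 640, 3]
    raise ValueError(f"cannot infer layout from shape {list(shape)}")


def hwc_to_chw(image: List[List[List[float]]]) -> List[List[List[float]]]:
    """Reorder an image indexed as [row][col][channel] to [channel][row][col]."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [
        [[image[y][x][c] for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]
```

A caller would check the model's reported input shape once at load time and reorder each frame with hwc_to_chw only when the detected layout is NCHW.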

🎯 Purpose & Impact

  • Better future compatibility for Android models 🤝
    Moving to LiteRT aligns the Flutter app with the current Ultralytics export path, making it easier to support newer YOLO models and export improvements going forward.

  • Smoother upgrades for existing users 🔁
    Because Android now auto-detects input layout and output conventions, older TFLite models should continue working, while new LiteRT models work out of the box too.

  • More reliable support across tasks 🎯
    Users working with segmentation, semantic segmentation, and pose should see better stability and fewer model-loading/runtime issues on Android.

  • Smaller downloads for Android app testers 📦
    Making QNN optional cuts the default example app size significantly, which helps with faster installs and easier testing. Users who want Snapdragon NPU acceleration can still enable it explicitly.

  • Practical performance-focused default for Android ⚡
    The new w8a32 LiteRT assets were selected as a balanced default: small file size, GPU-friendly, and easier export. For most users, that means a more dependable Android deployment path.

  • Cleaner maintenance for developers 👨‍💻
    The export and release flow is now simpler, with less legacy logic and clearer model packaging, which should reduce breakage and make future releases easier to manage.

What's Changed

Full Changelog: v0.6.5...v0.6.6