Releases · qualcomm/ai-hub-models

19 Jun 01:24

qaihm-bot

v0.56.0

ca623d4

v0.56.0 Latest

Model improvements & fixes

Added evaluation support for Detectron2_Detection
Fixed dataset issue to reflect the correct on-device accuracy for BERT
Fixed accuracy discrepancy between torch and on-device for HuggingFace WavLM Base Plus
Fixed input type to avoid mismatch for VideoMAE
Changed the RTMDet model so that it can be quantized to w8a16
Qwen 2.5 7B VL metadata now correctly includes all context lengths
PiperTTS variants correctly specify the language in their description.
Running demo.py on-device now works with custom input shapes
MMMU multimodal eval dataset and evaluator were added for VLMs
Added the ability to run a curated 100-prompt evaluation across LLMs/VLMs in evaluate.py
Added performance numbers for Samsung Galaxy S26 across all models.
Published perf numbers for XR 2 Gen 2 are now measured using Samsung Galaxy S22 instead of QCS8450 (Proxy), as Proxy devices will soon be deprecated in AI Hub Workbench.

04 Jun 09:40

qaihm-bot

v0.55.0

9e147e3

v0.55.0

New Models

NAFNet (denoise and deblur)
PiperTTS (en/de/it)
YOLOv8-OBB
YOLOv11-Pose

Improvements

YOLO models now support exporting to dynamic batch size
EdgeTAM now supports video object tracking (in addition to image segmentation)
Link to Voice AI SDK added to website and README for relevant models

Bug Fixes

Add missing ddcolor dependency

Performance Numbers

Updated performance numbers with the latest version of AI Hub Workbench
Dragonwing IQ-8275 EVK is now featured on model cards as a similar device to SA7255P

19 May 22:27

qaihm-bot

v0.54.0

eb19c08

v0.54.0

New Models

Pi0.5 (pi05) — Vision-language-action model

Improvements

Select Qualcomm devices that are not hosted on AI Hub Workbench (i.e., Dragonwing Q-7790, Dragonwing Q-8750, Dragonwing IQ-X5121, Dragonwing IQ-X7181, SA8255P ADP, SA8650P ADP) now appear on the device dropdown list for applicable models alongside metrics for a similar hosted device. This helps users to identify supported device/model pairings even when AI Hub Workbench cannot host the actual device.

Bug Fixes

Fix FCN-ResNet50 accuracy gap: Previous evaluation used incorrect label mapping for the 21-class VOC segmentation task, producing artificially low metrics.
Fix DETR models mixed precision: Switched DETR-ResNet50/101 and Conditional-DETR quantization to use fp16 as the higher precision, resolving an accuracy regression with the previous int16 recipe.
Fix MeloTTS empty assets in config: Model assets in config.json were serializing as empty due to a Pydantic v2 type annotation issue

05 May 16:26

qaihm-bot

v0.53.1

a3b8ab0

v0.53.1

New models
- SixDRepNet (sixd_repnet)
- RangeNet-Plus-Plus (RangeNet++, rangenet_plus_plus)
Re-instated models
- HuggingFace-WavLM-Base-Plus (huggingface_wavlm_base_plus)
Improvements and bug fixes
- Fixed export metadata.json, which was missing IO specs for models released as context binaries
- Evaluation bug fixes for huggingface_wavlm_base_plus
- llama_v3_2_3b_instruct_ssd now supports up to 8K context length (enabled by config changes and by splitting the LM head out into a separate context binary)

28 Apr 15:49

qaihm-bot

v0.52.0

bbca4cf

v0.52.0

New Models

Qwen2.5-VL-7B (handles both vision and text input; pre-compiled assets are available)
ResNet34-SSD

Improvements

Downloads from model cards are now zip files that match the output of the export command, including metadata and auxiliary files.
Fix profiling issues with ddrnet23_slim and lama_dilated.
Add Genie App scripts to LLMs/VLMs (run with genie-app -s genie-app-script.txt)
Bug fixes
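
As a rough illustration of working with the zipped model-card downloads mentioned above, the following Python sketch lists and unpacks an archive using only the standard library. The archive built here is a stand-in so the example runs without a real download, and its member names (`model.bin`, `metadata.json`) are hypothetical placeholders, not the actual layout of an AI Hub Models download.

```python
import tempfile
import zipfile
from pathlib import Path


def make_demo_archive(zip_path: Path) -> None:
    """Create a stand-in archive; the member names are hypothetical."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("model.bin", b"\x00" * 16)
        archive.writestr("metadata.json", '{"placeholder": true}')


def list_archive(zip_path: Path) -> list[str]:
    """Return the sorted member names without extracting anything."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(archive.namelist())


def extract_archive(zip_path: Path, dest: Path) -> list[str]:
    """Extract all members into dest and return their relative paths."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(
        p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file()
    )


# Demo: build the stand-in archive in a temporary directory and inspect it.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "model_download.zip"
    make_demo_archive(demo_zip)
    print(list_archive(demo_zip))
    print(extract_archive(demo_zip, Path(tmp) / "unpacked"))
```

For a real download, point `list_archive` at the downloaded file first to see which metadata and auxiliary files it actually contains before extracting.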

22 Apr 00:04

qaihm-bot

v0.51.0

a52c65d

v0.51.0

New Models

CREStereo
DnCNN — First image de-noising model
Mask2Former
YOLOv9-Det

Reinstated Models

Whisper-Medium
RF-DETR
Swin family (Swin-Base, Swin-Small, Swin-Tiny, SwinV2-Base)

New Features

Whisper, MeloTTS, and Opus models can now be exported to run with the Voice AI runtime
Enhanced model metadata — get_input_spec() now carries typed preprocessing metadata (image normalization, resize info) via InputSpecEntry for programmatic access

Improvements

Bumped default QAIRT version to 2.45; all performance and accuracy numbers updated
Upgraded LLM Genie runtime to QAIRT 2.45
Improved mediapipe_selfie visualization to match MediaPipe reference
Model metadata produced by export changed from YAML to JSON and now includes LLM metadata
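
Since the metadata produced by export is now JSON, it can be inspected with Python's standard `json` module. The sketch below loads a metadata file and reports the type of each top-level entry. It deliberately assumes nothing about the real schema: the file name and the sample keys (`example_section`, `example_list`) are made up for illustration only.

```python
import json
import tempfile
from pathlib import Path


def load_metadata(path: Path) -> dict:
    """Load a JSON metadata file and require a top-level object."""
    with path.open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        raise ValueError(
            f"expected a JSON object at top level, got {type(data).__name__}"
        )
    return data


def summarize(data: dict) -> dict[str, str]:
    """Map each top-level key to the type name of its value."""
    return {key: type(value).__name__ for key, value in data.items()}


# Demo with a made-up document; a real export will have different keys.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "metadata.json"
    sample_path.write_text(
        json.dumps({"example_section": {"a": 1}, "example_list": [1, 2]}),
        encoding="utf-8",
    )
    for key, kind in summarize(load_metadata(sample_path)).items():
        print(f"{key}: {kind}")
```

Summarizing by type first is a cautious way to explore an unfamiliar metadata file before writing code that depends on specific fields.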

Bug Fixes / Cleanup

Fixed swinv2_base accuracy — added missing attention replacement that was causing ~39% accuracy loss
Fixed a bug that prevented the Whisper demo from running successfully
Removed extraneous normalization from the DDRNet and PidNet demos
Qwen 2.5 7B export now includes sequence length of 1
Fixed electra_bert_base_discrim_google evaluation script for batch size > 1
Removed ONNXRUNTIME_GENAI runtime

09 Apr 20:54

qaihm-bot

v0.50.2

15564da

v0.50.2

Improvements:

All the assets and performance/accuracy numbers were updated.

09 Apr 20:53

qaihm-bot

v0.50.1

bd9f6e1

v0.50.1

Bug Fixes:

Fixed Windows compatibility for Llama 3.2 3B Instruct SSD
In export.py, models with multiple components now support changing input size for each component separately
Fix issue where LiteRT was not included in the SDK / Tool versions in metadata created by export.py
Fixed issue where evaluate.py for electra_bert_base_discrim_google would always produce accuracy of 0
Fix issue where CLI commands would fail on Python 3.11+
Restore missing package README for display on PyPI

09 Apr 20:53

qaihm-bot

v0.50.0

df201c7

v0.50.0

New Models & Assets

Qwen3-4B-Instruct-2507
Stereonet

Bug Fixes:

Fixed export of Qwen 2.5 7B Instruct
SD 2.1 had regressed and was returning noise instead of a natural image. This has been fixed.

Improvements:

Added support for PyTorch 2.11

24 Mar 02:55

qaihm-bot

v0.49.1

c7381f9

v0.49.1

General Updates

Added quantized variants for DETR-ResNet50, DETR-ResNet50-DC5, DETR-ResNet101, and DETR-ResNet101-DC5
Removed BiseNet and BGNet due to licensing concerns
The Llama 3.2 3B Instruct SSD variant uses Self Speculative Decoding (SSD), an inference acceleration technique that achieves on-target speedup while guaranteeing output accuracy identical to the base model. Choose this variant over llama_v3_2_3b_instruct for faster token generation on supported devices.
Updated performance & accuracy data from the latest version of AI Hub Workbench

Bug Fixes

Fixed batchnorm unfolding issue in MediaPipe Hand Gesture, enabling the model to be fully NPU-resident when quantized with TFLite
Fixed non-determinism in loading the BSD300 dataset. This previously caused us to report incorrect accuracy data for several super resolution models.
MeloTTS has been updated to work around an HTP issue with summation that produced incorrect shapes at runtime. This update is available only via the export script and is not yet available with pre-generated assets.
