Running Frigate on the Allwinner A733 / Vivante VIP9000 NPU #23418
unnamedwild-ux started this conversation in Show and tell
I got Frigate object detection running on the Vivante VIP9000 NPU of an Allwinner A733 SBC (Radxa Cubie A7A) via a custom
detector plugin + an ACUITY-compiled NBG model (~25–60 ms/inference). Along the way I hit a nasty, non-obvious bug that made
every model misclassify (a car detected as a person with a full-frame box). Writing it up in case it helps anyone on
Allwinner/Rockchip/other VIPLite-based NPUs.
Solution repo: https://github.com/unnamedwild-ux/frigate_npu_vivante
This is a community integration for an NPU that Frigate doesn't support out of the box — sharing in case it's useful or worth
folding in.
Setup
How it works
A custom vivante detector plugin (drop-in at frigate/detectors/plugins/vivante.py) loads an NBG (ACUITY-compiled network) and
runs it on the NPU via the VIPLite 2.0 API through ctypes. The model is trimmed to its 6 raw conv heads (3 scales × {box-DFL,
class}); the DFL/box decode + NMS run on the CPU in the plugin, because quantizing the post-processing (NMS/TopK/argmax) on
the NPU loses accuracy / isn't supported. The plugin auto-detects the head type and reg_max (16 for YOLOv8/9, 17 for
YOLO-NAS), so it handles YOLOv8/v9/YOLO-NAS DFL heads without code changes.
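To make the CPU-side decode concrete, here is a minimal sketch of a DFL box decode with reg_max inferred from the channel count. It is not the plugin's actual code: the function names, the (4, reg_max) channel grouping, and the 0.5 cell-centre offset are assumptions taken from the usual YOLOv8 convention, and reg_max here simply means the number of bins per box side.

```python
import numpy as np

def infer_reg_max(box_channels: int) -> int:
    """A DFL box head carries 4 sides x reg_max bins, so channels must divide by 4."""
    if box_channels % 4 != 0:
        raise ValueError(f"not a DFL box head: {box_channels} channels")
    return box_channels // 4

def dfl_decode(box_head: np.ndarray, stride: int) -> np.ndarray:
    """Decode one (H, W, 4*reg_max) box head into (H*W, 4) xyxy boxes in pixels."""
    h, w, ch = box_head.shape
    reg_max = infer_reg_max(ch)
    # Assumed channel grouping: [left bins | top bins | right bins | bottom bins].
    logits = box_head.reshape(h * w, 4, reg_max).astype(np.float32)
    # Numerically stable softmax over the bins of each side.
    logits -= logits.max(axis=-1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Expected bin index = distance from the cell centre, in grid units.
    dist = probs @ np.arange(reg_max, dtype=np.float32)
    # Cell centres in grid units (assumed 0.5 offset), row-major like the head.
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    cx = xs.reshape(-1) + 0.5
    cy = ys.reshape(-1) + 0.5
    x1 = (cx - dist[:, 0]) * stride
    y1 = (cy - dist[:, 1]) * stride
    x2 = (cx + dist[:, 2]) * stride
    y2 = (cy + dist[:, 3]) * stride
    return np.stack([x1, y1, x2, y2], axis=1)
```

With this convention a 64-channel box head gives 16 bins and a 68-channel head gives 17, which is how one code path can serve both head types. Class scores and NMS would be applied to the decoded boxes afterwards.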
The problem(s) it solves
vip_query_output reports each head's dims as [H, W, C, 1], but the NPU actually writes the buffer in CHW (channel-major)
memory order. If you reshape it as the reported [H, W, C], classes and boxes get scrambled across grid cells, so objects are
detected but mislabeled — e.g. a clearly visible car comes out as person/boat with a near-full-frame box at high confidence.
This affects every model, so it's easy to misdiagnose as "the NPU is broken" or "bad quantization."
How I confirmed it: feed a uniform gray image to both the NPU and the ACUITY CPU simulation (pegasus inference). The outputs
only match when the NPU buffer is reshaped as [C, H, W] (cosine similarity 1.0000, versus ~0.54 as HWC). On a real frame,
NHWC-RGB input + CHW decode gives car 0.96 (matching the float ONNX). The fix is one reshape:
import numpy as np
arr = raw.reshape(ch, s, s)    # NPU buffer is in CHW memory order (correct)
arr = np.moveaxis(arr, 0, -1)  # -> (H, W, C) for decoding
# NOT raw.reshape(H, W, C): that scrambles classes/boxes
(Input stays NHWC uint8 RGB — that part was always fine.)
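The layout check described above can be reproduced on synthetic data. The sketch below is only an illustration: a random tensor stands in for the simulator output, the head size (8×8, 64 channels) is made up, and `to_hwc` / `cosine` are hypothetical helper names, not functions from the plugin or from VIPLite.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two tensors, compared element by element."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def to_hwc(raw: np.ndarray, ch: int, s: int, layout: str) -> np.ndarray:
    """Interpret a flat output buffer under the given memory layout, return (H, W, C)."""
    if layout == "chw":
        return np.moveaxis(raw.reshape(ch, s, s), 0, -1)
    if layout == "hwc":
        return raw.reshape(s, s, ch)
    raise ValueError(f"unknown layout: {layout}")

# Synthetic stand-in for the reference (simulator) output of one head, in HWC.
rng = np.random.default_rng(0)
s, ch = 8, 64  # hypothetical head size
reference = rng.standard_normal((s, s, ch)).astype(np.float32)

# Emulate a buffer written in CHW order, as the NPU does according to the post.
raw = np.ascontiguousarray(np.moveaxis(reference, -1, 0)).ravel()

for layout in ("chw", "hwc"):
    print(layout, round(cosine(to_hwc(raw, ch, s, layout), reference), 4))
```

On a real board, `raw` would be the output buffer read back from the NPU and `reference` the simulator's tensor for the same input; whichever layout gives a similarity near 1 is the one the hardware actually uses.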
End-to-end conversion guide (ONNX → NBG). Trim to the 6 heads, quantize, export for the a733/0x1000003B target. It also notes
that int16 quantization is worth it over uint8: uint8 dropped a real-scene car from ~0.80 to ~0.50, below Frigate's
thresholds, while int16 keeps ~float accuracy.
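As a rough intuition for why the wider type helps, the sketch below compares the round-trip error of a generic symmetric linear quantizer at 8 and 16 bits. This is not ACUITY's quantizer (its uint8 and int16 schemes differ in detail), the input is random data rather than real head outputs, and `fake_quantize` is a hypothetical helper.

```python
import numpy as np

def fake_quantize(x: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric linear quantize/dequantize round trip with one per-tensor scale."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = float(np.abs(x).max())
    if max_abs == 0.0:
        return x.copy()  # nothing to quantize
    scale = max_abs / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return (q * scale).astype(np.float32)

# Random stand-in for raw head outputs (not real model data).
rng = np.random.default_rng(0)
logits = rng.normal(0.0, 4.0, size=10_000).astype(np.float32)

err8 = float(np.abs(fake_quantize(logits, 8) - logits).max())
err16 = float(np.abs(fake_quantize(logits, 16) - logits).max())
print(f"max round-trip error: 8-bit {err8:.5f}, 16-bit {err16:.7f}")
```

The step size shrinks by roughly a factor of 256 going from 8 to 16 bits, so small shifts in the raw head outputs that can push a score across a detection threshold become much less likely.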
A working detector contract for Frigate: input nhwc / rgb / uint8, CPU DFL+NMS, box clamping to avoid the NaN crash in the
norfair tracker, etc.
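The box clamping step could look something like the sketch below. It is an illustration under assumptions, not the plugin's code: `sanitize_boxes` is a hypothetical name, boxes are assumed to be pixel xyxy on a square model input, and the goal is only to make sure no NaN, out-of-range, or zero-area box reaches the tracker.

```python
import numpy as np

def sanitize_boxes(boxes_xyxy, input_size: int):
    """Normalise pixel xyxy boxes to [0, 1] and drop non-finite or empty ones.

    Returns the cleaned boxes and the indices of the rows that survived, so the
    matching scores and class ids can be filtered the same way.
    """
    boxes = np.asarray(boxes_xyxy, dtype=np.float32).reshape(-1, 4)
    # Drop rows containing NaN or inf before doing any arithmetic on them.
    idx = np.flatnonzero(np.isfinite(boxes).all(axis=1))
    # DFL decode can overshoot the frame, so clamp after normalising.
    norm = np.clip(boxes[idx] / float(input_size), 0.0, 1.0)
    # A box clamped to zero width or height carries no usable position.
    keep = (norm[:, 2] > norm[:, 0]) & (norm[:, 3] > norm[:, 1])
    return norm[keep], idx[keep]
```

For example, a box of (-56, -56, 64, 64) on a 640-pixel input becomes (0, 0, 0.1, 0.1), while a box lying entirely outside the frame collapses to zero width and is dropped.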
What's in the repo
Licensing note: the repo intentionally does not ship the proprietary VIPLite .so libs (they come from the board's npu-runtime
package) or any Frigate+ model — it documents how to obtain/convert your own. Only my own code + docs are included (MIT).