v0.4.0

github-actions released this 13 Apr 04:33 (commit ec2e559)

What's Changed

Added

  • OutputQuantization::effective(qmode) helper that normalizes
    Kinara's per-qmode dequantization formulas into a single
    (scale, offset) pair. Only qmode 9 is currently supported; other
    modes return Error::UnsupportedQmode.
  • InputPreprocess struct (core) / class (ara2-py) holding the
    per-channel image normalization parameters (mean, scale,
    bgr_to_rgb, aspect_resize, mirror, center_crop), queried via
    Model::input_preprocess(i) / model.input_preprocess(i).
  • InputQuantization.qmode and InputQuantization.offset fields
    sourced from dv_model_input_preprocess_param::qmode and ::offset.
  • Ara2Info metadata section (DvmMetadata.ara2) parsing the
    optional ara2.qmode field from edgefirst.json embedded in DVMs.
  • Session.close() and Model.close() Python methods; both are
    idempotent and are called by the __exit__ path so with blocks
    now actually release resources.
  • Error::UnsupportedQmode(i32) variant raised by dequantize() when
    a model uses a quantization mode other than 9.
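
The normalization performed by OutputQuantization::effective(qmode) can be pictured with a small pure-Python model. This is only a sketch, not the crate's implementation: the class and exception below are stand-ins, the field names qn and offset are assumptions, and the choice of qn as the effective scale for qmode 9 is inferred from the migration notes rather than confirmed.

```python
from dataclasses import dataclass
from typing import Tuple


class UnsupportedQmode(Exception):
    """Stand-in for Error::UnsupportedQmode(i32); not the real binding."""

    def __init__(self, qmode: int):
        super().__init__(f"unsupported quantization mode: qmode={qmode}")
        self.qmode = qmode


@dataclass(frozen=True)
class OutputQuantization:
    # Field names are assumptions made for this illustration.
    qn: float
    offset: int

    def effective(self, qmode: int) -> Tuple[float, int]:
        """Return a single (scale, offset) pair for the given qmode."""
        if qmode != 9:
            # Every mode other than 9 is rejected.
            raise UnsupportedQmode(qmode)
        # Assumption: under qmode 9 the effective scale is qn.
        return (self.qn, self.offset)


oq = OutputQuantization(qn=0.5, offset=3)
print(oq.effective(9))  # (0.5, 3)
```

Returning one pair regardless of mode means a caller only ever needs the single expression (raw - offset) * scale, with the per-mode differences hidden behind the helper.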

Changed

  • Model::dequantize() (core and Python) now uses the correct
    qmode-9 formula (raw - offset) * scale. It previously applied the
    qmode 0-3 formula raw / qn, which silently produced values off
    by several orders of magnitude on current production models.
  • ara2-py::Model.set_input_tensor accepts any numpy array whose
    total byte length matches the tensor size — the buffer is obtained
    via tobytes() and memcpy'd verbatim. Callers no longer need to
    .view(np.uint8) before calling. Non-contiguous arrays are handled
    transparently (numpy makes a contiguous copy on the fly).
  • ara2-py::Model.get_output_tensor returns a typed array
    (int8/uint8/int16/uint16/float32) reshaped to the
    tensor's declared (C, H, W) shape. Callers that relied on the
    legacy flat-uint8 return must either use the typed array directly
    or call .ravel().
  • ara2-py::Session and ara2-py::Model now wrap Option<inner>
    internally. Method calls on a closed instance raise
    Ara2Error("session is closed") / "model is closed" instead of
    returning stale data.
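
As a rough illustration of the dequantize() change, the sketch below compares the two formulas in plain Python. The function names and sample values are invented for illustration and are not part of the ara2 API; the same number is passed as both scale and qn purely so the two results can be compared side by side.

```python
def dequantize_qmode9(raw, scale, offset):
    """New behaviour: (raw - offset) * scale, applied per element."""
    return [(r - offset) * scale for r in raw]


def dequantize_legacy(raw, qn):
    """Old behaviour: the qmode 0-3 formula raw / qn."""
    return [r / qn for r in raw]


# Hypothetical int8 samples and parameters, chosen as powers of two
# so the floating-point results are exact.
raw = [-126, 2, 66]
scale = 2 ** -6  # 0.015625
offset = 2

new_values = dequantize_qmode9(raw, scale, offset)
old_values = dequantize_legacy(raw, scale)

print(new_values)  # [-2.0, 0.0, 1.0]
print(old_values)  # [-8064.0, 128.0, 4224.0]
```

With a scale well below 1, dividing instead of multiplying inflates every value by roughly 1 / scale squared, which is where the orders-of-magnitude error comes from.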

Fixed

  • Quantization: dequantize() produced grossly wrong values on
    every qmode-9 model (which is every production DVM today).
  • Input dtype erasure: set_input_tensor rejected any array that
    wasn't uint8, forcing a .view(np.uint8) workaround in every
    consumer.
  • Output dtype erasure: get_output_tensor returned uint8
    regardless of the tensor's actual signedness, forcing a manual
    .view(int8) on consumers.
  • Input zero-point confusion: consumers were reading
    InputQuantization.mean as an integer zero-point, but mean was
    the per-channel float normalization mean, not a quantization
    zero-point. The actual zero-point is now exposed as
    InputQuantization.offset.
  • Missing close: Session.__exit__ and Model.__exit__ were
    no-ops; there was no way to deterministically release resources
    outside a context manager. close() methods plus a real __exit__
    body fix both cases.
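
The output dtype erasure item comes down to how the same bytes are interpreted. The standard-library sketch below involves neither ara2 nor numpy, and the byte values are made up. It reads one buffer as unsigned and then as signed 8-bit integers, which is the reinterpretation consumers previously had to perform by hand with .view(int8).

```python
# Made-up raw bytes standing in for a signed 8-bit output tensor.
buf = bytes([0xFF, 0x80, 0x7F, 0x00])

# Interpreted as uint8, which is what the legacy flat return implied.
as_uint8 = list(memoryview(buf).cast("B"))

# Interpreted as int8, the tensor's actual type in this example.
as_int8 = list(memoryview(buf).cast("b"))

print(as_uint8)  # [255, 128, 127, 0]
print(as_int8)   # [-1, -128, 127, 0]
```

Any value with the high bit set differs between the two readings, so dequantizing the unsigned view of a signed tensor gives wrong results for every negative sample.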

Removed (BREAKING)

  • OutputQuantization.scale field (was output_scale in the C
    struct; unused by every downstream consumer).
  • InputQuantization.mean and InputQuantization.scale fields —
    moved to InputPreprocess.

Migration

Consumers of edgefirst-ara2 0.3.x must update:

0.3.x                                                   | 0.4.0
--------------------------------------------------------+-------------------------------------------------------------------------------------
int(iq.mean) as zero-point                              | iq.offset (true zero-point)
iq.mean, iq.scale (per-channel)                         | model.input_preprocess(i).mean, .scale
oq.scale (used with old dequantized = raw / qn formula) | oq.qn (used with new dequantized = (raw - offset) * qn formula; see Changed section)
model.set_input_tensor(0, arr.view(np.uint8))           | model.set_input_tensor(0, arr)
raw.view(np.int8) after get_output_tensor               | unnecessary; returned array is already typed
session.__exit__(None, None, None)                      | session.close()
model.__exit__(None, None, None)                        | model.close()
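
For the close() entries, the behaviour described in these notes (idempotent close, a real __exit__, and an error on use after close) can be modelled in a few lines of Python. FakeSession, its do_work() method and the local Ara2Error class are hypothetical stand-ins written for this sketch; they are not the ara2-py classes.

```python
class Ara2Error(Exception):
    """Local stand-in for the library's error type."""


class FakeSession:
    """Hypothetical model of a wrapper holding an optional inner handle."""

    def __init__(self):
        # Stands in for the native handle; None means "closed".
        self._inner = object()

    def _require_open(self):
        if self._inner is None:
            raise Ara2Error("session is closed")
        return self._inner

    def do_work(self):
        # Hypothetical method: any call must first check the handle.
        self._require_open()
        return "ok"

    def close(self):
        # Idempotent: closing an already closed session does nothing.
        self._inner = None

    def __enter__(self):
        self._require_open()
        return self

    def __exit__(self, exc_type, exc, tb):
        # The context-manager exit path releases the handle via close().
        self.close()
        return False


with FakeSession() as session:
    print(session.do_work())  # ok

session.close()  # safe to call again after the with block

try:
    session.do_work()
except Ara2Error as err:
    print(err)  # session is closed
```

Calling close() directly gives the same deterministic release as leaving a with block, which is why calls to __exit__(None, None, None) can be replaced with close().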

Non-qmode-9 DVMs now raise Ara2Error("unsupported quantization mode: qmode=N ...") from dequantize(). If you encounter this, file an issue with the model so qmode 0-3 support can be added with a test fixture.


crates.io: ara2 | ara2-sys
PyPI: edgefirst-ara2