v0.4.0
What's Changed
Added
- `OutputQuantization::effective(qmode)` helper that normalizes
  Kinara's per-qmode dequantization formulas into a single
  `(scale, offset)` pair. Only qmode 9 is currently supported; other
  modes return `Error::UnsupportedQmode`.
- `InputPreprocess` struct (core) / class (ara2-py) holding the
  per-channel image normalization parameters (`mean`, `scale`,
  `bgr_to_rgb`, `aspect_resize`, `mirror`, `center_crop`), queried via
  `Model::input_preprocess(i)` / `model.input_preprocess(i)`.
- `InputQuantization.qmode` and `InputQuantization.offset` fields
  sourced from `dv_model_input_preprocess_param::qmode` and `::offset`.
- `Ara2Info` metadata section (`DvmMetadata.ara2`) parsing the
  optional `ara2.qmode` field from `edgefirst.json` embedded in DVMs.
- `Session.close()` and `Model.close()` Python methods; both are
  idempotent and are called by the `__exit__` path so `with` blocks
  now actually release resources.
- `Error::UnsupportedQmode(i32)` variant raised by `dequantize()` when
  a model uses a quantization mode other than 9.
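As a rough illustration of the close semantics described for `Session.close()` and `Model.close()`, the sketch below uses a stand-in class. `Handle`, `ClosedError`, `run()` and `release_count` are hypothetical names invented for this example; they are not part of ara2-py, and the real wrappers may be structured differently.

```python
class ClosedError(RuntimeError):
    """Hypothetical stand-in for Ara2Error."""


class Handle:
    """Hypothetical stand-in for a Session or Model wrapper."""

    def __init__(self):
        # Stand-in for the wrapped native object; None means "closed".
        self._inner = object()
        self.release_count = 0

    def close(self):
        # Idempotent: only the first call releases anything.
        if self._inner is not None:
            self._inner = None
            self.release_count += 1

    def run(self):
        # Any method call on a closed handle fails loudly.
        if self._inner is None:
            raise ClosedError("session is closed")
        return "ok"

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions from the with-body


with Handle() as h:
    result = h.run()

h.close()  # second close is a no-op

try:
    h.run()
    error_message = None
except ClosedError as e:
    error_message = str(e)
```

After the `with` block the handle has been released exactly once, and a later call raises instead of returning stale data.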
Changed
- `Model::dequantize()` (core and Python) now uses the correct
  qmode-9 formula `(raw - offset) * scale`. Previously it applied the
  qmode 0-3 formula `raw / qn`, which silently produced values off
  by several orders of magnitude on current production models.
- `ara2-py::Model.set_input_tensor` accepts any numpy array whose
  total byte length matches the tensor size; the buffer is obtained
  via `tobytes()` and memcpy'd verbatim. Callers no longer need to
  `.view(np.uint8)` before calling. Non-contiguous arrays are handled
  transparently (numpy makes a contiguous copy on the fly).
- `ara2-py::Model.get_output_tensor` returns a typed array
  (`int8` / `uint8` / `int16` / `uint16` / `float32`) reshaped to the
  tensor's declared `(C, H, W)` shape. Callers that relied on the
  legacy flat-`uint8` return must either use the typed array directly
  or call `.ravel()`.
- `ara2-py::Session` and `ara2-py::Model` now wrap `Option<inner>`
  internally. Method calls on a closed instance raise
  `Ara2Error("session is closed")` / `"model is closed"` instead of
  returning stale data.
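For a sense of scale, the sketch below applies both formulas from the `dequantize()` entry to one made-up raw value. The function names and the numbers are illustrative assumptions, not the library's code or values from a real DVM; feeding the same stored number to both formulas is only a way to make the gap visible.

```python
def dequantize_qmode_0_3(raw, qn):
    # Formula the notes attribute to qmodes 0-3.
    return raw / qn


def dequantize_qmode_9(raw, scale, offset):
    # Formula the notes give for qmode 9.
    return (raw - offset) * scale


# Made-up values, chosen to be exact in binary floating point.
raw, offset, scale = 100, -28, 0.03125

old_value = dequantize_qmode_0_3(raw, scale)        # 3200.0
new_value = dequantize_qmode_9(raw, scale, offset)  # 4.0
ratio = old_value / new_value                       # 800.0
```

With these numbers the two formulas disagree by a factor of 800; the actual discrepancy on a given model depends on its stored parameters.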
Fixed
- Quantization: `dequantize()` produced grossly wrong values on
  every qmode-9 model (which is every production DVM today).
- Input dtype erasure: `set_input_tensor` rejected any array that
  wasn't `uint8`, forcing a `.view(np.uint8)` workaround in every
  consumer.
- Output dtype erasure: `get_output_tensor` returned `uint8`
  regardless of the tensor's actual signedness, forcing a manual
  `.view(int8)` on consumers.
- Input zero-point confusion: Consumers were reading
  `InputQuantization.mean` as an integer zero-point, but `mean` was
  the per-channel float normalization mean, not a quantization
  zero-point. The actual zero-point is now exposed as
  `InputQuantization.offset`.
- Missing close: `Session.__exit__` and `Model.__exit__` were
  no-ops; there was no way to deterministically release resources
  outside a context manager. `close()` methods plus a real `__exit__`
  body fix both cases.
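The two dtype-erasure items can be reproduced with plain numpy, without the library. The sketch below is only an illustration under assumed values: the tensor contents and the `(1, 2, 2)` shape are made up, and `np.frombuffer` stands in for whatever ara2-py does internally to build its output arrays.

```python
import numpy as np

# Illustrative int8 tensor with a (C, H, W) shape of (1, 2, 2).
tensor = np.array([[[-3, 0], [5, 127]]], dtype=np.int8)

# Input side: tobytes() returns the C-order bytes whatever the dtype,
# so reinterpreting as uint8 first changes nothing.
payload = tensor.tobytes()
same_bytes = payload == tensor.view(np.uint8).tobytes()

# A transposed view is not C-contiguous; tobytes() still returns a
# packed C-order copy of the view's elements.
transposed = tensor.transpose(0, 2, 1)
packed = transposed.tobytes()

# Output side: the same bytes read as flat uint8 lose the sign...
flat_u8 = np.frombuffer(payload, dtype=np.uint8)
# ...while a typed, reshaped array recovers the original values.
typed = np.frombuffer(payload, dtype=np.int8).reshape(tensor.shape)
```

Here `flat_u8` holds 253 where the tensor holds -3, which is the kind of value a consumer previously had to repair with `.view(int8)`.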
Removed (BREAKING)
- `OutputQuantization.scale` field (was `output_scale` in the C
  struct; unused by every downstream consumer).
- `InputQuantization.mean` and `InputQuantization.scale` fields,
  moved to `InputPreprocess`.
Migration
Consumers of edgefirst-ara2 0.3.x must update:
| 0.3.x | 0.4.0 |
|---|---|
| `int(iq.mean)` as zero-point | `iq.offset` (true zero-point) |
| `iq.mean`, `iq.scale` (per-channel) | `model.input_preprocess(i).mean`, `.scale` |
| `oq.scale` (used with old `dequantized = raw / qn` formula) | `oq.qn` (used with new `dequantized = (raw - offset) * qn` formula; see Changed section) |
| `model.set_input_tensor(0, arr.view(np.uint8))` | `model.set_input_tensor(0, arr)` |
| `raw.view(np.int8)` after `get_output_tensor` | unnecessary; returned array is already typed |
| `session.__exit__(None, None, None)` | `session.close()` |
| `model.__exit__(None, None, None)` | `model.close()` |
Non-qmode-9 DVMs now raise `Ara2Error("unsupported quantization mode: qmode=N ...")` from `dequantize()`. If you encounter this, file an issue with the model so qmode 0-3 support can be added with a test fixture.
crates.io: ara2 | ara2-sys
PyPI: edgefirst-ara2