Skip to content

onnxutils: rewrite onnx_helper around Ort::IoBinding + TensorMap#44

Open
bendavid wants to merge 1 commit into
mainfrom
onnx-helper-iobinding
Open

onnxutils: rewrite onnx_helper around Ort::IoBinding + TensorMap#44
bendavid wants to merge 1 commit into
mainfrom
onnx-helper-iobinding

Conversation

@bendavid
Copy link
Copy Markdown
Owner

Summary

  • onnx_helper rewritten to use persistent per-slot ORT-owned buffers pre-bound through Ort::IoBinding. Per-call cost reduces to session.Run(opts, binding) — no per-call Ort::Value construction, no per-call name array setup.
  • Eigen tensors fill the persistent buffers via Eigen::TensorMap<…, Eigen::RowMajor> assignment, so ORT's row-major contract is enforced regardless of the caller's storage layout. The previous code path could silently scramble data when callers passed default-ColMajor multi-axis Eigen tensors (the bug surfaced downstream in WRemnants's shift+smear reweight helpers for every NVar ≥ 2 caller and was wholly invisible to compilation, unit tests, and ONNX itself).
  • Two constructors: the no-shapes form queries the model and throws if any axis is dynamic; the explicit-shape form takes input_shapes / output_shapes by const-reference for models with dynamic axes the caller pins at construction.
  • Input tuple passed by const & to operator(); output tuple remains mutable.
  • onnx_helper_alloc removed — its only consumer (shift_smear_reweight_helpers.hpp in WRemnants) migrates to the new onnx_helper with explicit shapes in a companion WRemnants change.
  • Data-member surface shrinks from ~9 vectors to 5 (env, sessions, input Values, output Values, bindings). Names/shapes/element-types/MemoryInfo/allocator wrapper are all local to the construction-time init_ — ORT internalises everything the helper needs at the moment it's handed in.

Test plan

  • WRemnants shift_smear_reweight_helpers.hpp migration cross-check: PyTorch checkpoint vs deployed ONNX vs ScaleHelperSimpleReweight (NVar=1) vs JpsiCorrectionsUncReweightHelper (closureA, NVar=2) all agree at ~1e-9 float32 noise.
  • WRemnants CI on the companion branch.
  • Verify any other downstream narf::onnx_helper consumers (none in WRemnants).

🤖 Generated with Claude Code

The previous ``onnx_helper`` allocated Ort::Values at construction
(static shapes only) and copied through them via ``std::copy``; the
companion ``onnx_helper_alloc`` constructed Ort::Values per call as
non-owning views over the caller's Eigen buffers (zero-copy, but
forwarded the caller's storage order verbatim to ORT). The zero-copy
path silently scrambled tensor data for any caller passing an Eigen
default-ColMajor multi-axis tensor — ORT consumes the buffer as
row-major, and the layout mismatch only became visible when at least
two axes were length > 1.

New ``onnx_helper``:

  * Persistent ORT-owned buffers per slot. ``Ort::Value::CreateTensor(allocator, …)``
    is called once during construction; the resulting Values are
    pre-bound through one ``Ort::IoBinding`` per slot. ``operator()``
    reduces to ``session.Run(opts, binding)``; no per-call
    Ort::Value construction, no per-call name array setup.
  * Eigen tensors fill the persistent buffers via
    ``Eigen::TensorMap<…, RowMajor>`` assignment, which enforces ORT's
    row-major contract regardless of the caller's storage layout —
    a ColMajor multi-axis tensor passed by the caller now produces
    correct numerical results (with a strided copy) instead of a
    silent scramble.
  * Two constructors: the no-shapes form queries the model and
    throws if any axis is dynamic; the explicit-shape form takes
    ``input_shapes`` and ``output_shapes`` by const-reference for
    models with dynamic axes the caller pins at construction.
  * Inputs taken by ``const &`` in ``operator()``; output tuple
    remains mutable.
  * Shared global default allocator (``Ort::AllocatorWithDefaultOptions``),
    constructed local to ``init_``: the wrapper is ``Unowned`` around
    an ORT-library-lifetime singleton, so the Ort::Value buffers we
    allocate keep working long after the wrapper goes out of scope.

``onnx_helper_alloc`` is removed — its only consumer
(``wremnants/production/include/shift_smear_reweight_helpers.hpp``)
migrates to the new ``onnx_helper`` with explicit shapes in a
companion commit.

The data-member surface shrinks from ~9 vectors to 5: env, sessions,
input Ort::Values, output Ort::Values, IoBindings. Names, shapes,
element types, memory info, the allocator wrapper are all local to
``init_`` — ORT internalizes everything the helper needs at the
moment it's handed in (the IoBinding copies the name; CreateTensor
copies the shape into the Value's TensorShape; the element type
lives in the Value; the default allocator singleton outlives any
wrapper).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
bendavid added a commit to bendavid/WRemnants that referenced this pull request May 24, 2026
Pair with bendavid/narf#44. The narf submodule is bumped to the
``onnx-helper-iobinding`` tip so ``narf::onnx_helper`` exposes the
new persistent-buffer + IoBinding + TensorMap-based fill interface
and ``narf::onnx_helper_alloc`` is gone.

ReweightModel becomes a class template on ``NVar`` and constructs
``narf::onnx_helper`` once with the explicit input/output shapes
[{1,F}, {1,NCond}, {1,NVar,F}, {1,NVar,F}] / [{1,NVar}], pinning
the model's dynamic axes at instantiation. ``run`` is no longer a
function template — it takes the fixed-shape RowMajor Eigen
tensors directly, inputs as ``const &``. ``ReweightEvaluator<NVar>``
holds a ``std::shared_ptr<ReweightModel<NVar>>`` and calls
``model_->run(...)`` without a template argument.

Per-call we no longer pay for Ort::Value construction or name array
setup — the IoBinding is one-shot at construction; ``operator()``
copies in/out through ``Eigen::TensorMap<…, RowMajor>`` views over
the persistent ORT-owned buffers. The RowMajor declarations on the
caller-side Eigen tensors still match the model shape for the copy
to be a simple memcpy, but they no longer need to match the
network's storage order for correctness — Eigen's cross-layout
assignment handles that.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant