Fix VGF runtime aborts on 0-dim tensor inputs and Python scalar coercion (#19446)
psiddh wants to merge 1 commit into pytorch:main
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19446
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (2 Unrelated Failures)
As of commit 358b4e2 with merge base a49171d:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@psiddh has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104603739.
Pull request overview
Fixes hard aborts when bridging between ATen tensors and ExecuTorch tensors for valid 0-dim (scalar) tensors by avoiding invalid nullptr assertions on empty metadata arrays.
Changes:
- Relax `check_tensor_meta` to only assert non-null sizes/strides storage for ranked tensors (`dim() > 0`).
- Add regression tests covering zero-dimensional aliasing in both bridge directions.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| extension/aten_util/aten_bridge.cpp | Gates sizes/strides nullptr checks on dim() > 0 to allow valid scalar tensor metadata. |
| extension/aten_util/test/aten_bridge_test.cpp | Adds regression tests for 0-dim tensor aliasing to ensure the bridge doesn’t abort. |
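
The gating summarized in the table can be modeled roughly as follows. This is only a Python illustration of the logic, not the C++ code in `aten_bridge.cpp`: the function name `check_meta_storage` and its arguments are hypothetical, and `None` stands in for a null `data()` pointer.

```python
from typing import Optional, Sequence


def check_meta_storage(
    dim: int,
    sizes: Optional[Sequence[int]],
    strides: Optional[Sequence[int]],
) -> None:
    """Hypothetical model of the relaxed check; None stands in for nullptr."""
    # A 0-dim (scalar) tensor has empty sizes/strides, so missing storage is
    # valid there. Only ranked tensors must provide both arrays.
    if dim > 0:
        if sizes is None:
            raise ValueError("ranked tensor has null sizes storage")
        if strides is None:
            raise ValueError("ranked tensor has null strides storage")


# A scalar tensor with no metadata storage passes the check.
check_meta_storage(0, None, None)

# A rank-3 tensor without strides storage is still rejected.
try:
    check_meta_storage(3, (2, 3, 4), None)
    rejected = False
except ValueError:
    rejected = True
```

The second call mirrors the case the existing death test covers: a ranked tensor is still rejected, so the relaxation only affects scalars.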
…ion (pytorch#19446)

Summary:
The ARM-backend VGF tests (e.g. ``test_sum_dim_intlist_vgf_quant`` and ``test_sum_dim_intlist_vgf_no_quant`` for all 19 parametrizations) were hard-aborting the pytest process because of two latent bugs that compounded:

1. **C++ aten_bridge nullptr assert on 0-dim tensors (T270603238).** ``executorch/extension/aten_util/aten_bridge.cpp::check_tensor_meta`` had two unconditional ``ET_CHECK_MSG(b.{sizes,strides}().data() != nullptr, ...)`` asserts. For 0-dim (scalar) tensors, ``sizes()``/``strides()`` are empty ``IntArrayRef``s whose ``.data()`` may legitimately return nullptr, so the process aborted on every valid scalar tensor input. Fix: gate the nullptr checks on ``b.dim() > 0``. The subsequent loops are no-ops when ``dim() == 0``, and ``dim_order_to_stride_nocheck`` already early-returns for ``dims == 0`` (``dim_order_util.h:132-134``), so the relaxed asserts are safe.

2. **VGF Python runner over-wrapping non-tensor inputs (Error::InvalidArgument 0x12).** ``runner_fb.run_vgf`` previously called ``torch.tensor(x)`` on every non-tensor input (including ``None``/``bool``/``int``), producing 0-dim tensors. The lowered method's signature, however, expects ``EValue`` tags ``Int``/``Bool``/``None`` for those slots, so receiving a ``Tensor`` caused ``Method::set_inputs`` to reject the inputs. The pybindings layer (``pybindings.cpp:804-809``) already natively handles ``None``/``bool``/``int`` Python objects; the runner just had to stop interfering. Fix: only wrap Python ``float`` (and other unknown types) as 0-dim tensors, which was the original ``addmm`` alpha/beta motivation. Pass ``None``/``bool``/``int`` through unchanged.

3. **Regression tests** for the C++ fix in ``executorch/extension/aten_util/test/aten_bridge_test.cpp``: ``AliasETensorToATenTensorZeroDim`` and ``AliasATTensorToETensorZeroDim`` construct true 0-dim tensors via ``at::scalar_tensor`` and verify that the bridge does not abort. The existing ``AliasETensorToATenTensorFail`` death test still fires for ranked tensors with empty strides because that case has ``dim() == 3 > 0``.

Fixes T270603238.

Differential Revision: D104603739
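
The runner-side coercion rule described above can be sketched as follows. This is a minimal illustration, not the code in `runner_fb.run_vgf`: the helper name `coerce_inputs` and its `is_tensor` / `wrap_scalar` parameters are hypothetical, and they are injected so that the sketch runs without PyTorch. `FakeTensor` is likewise a stand-in used only for the demonstration.

```python
from typing import Any, Callable, List, Sequence


def coerce_inputs(
    inputs: Sequence[Any],
    is_tensor: Callable[[Any], bool],
    wrap_scalar: Callable[[Any], Any],
) -> List[Any]:
    """Hypothetical helper: wrap only values the runtime cannot take natively."""
    coerced = []
    for x in inputs:
        if is_tensor(x):
            # Already a tensor: leave untouched.
            coerced.append(x)
        elif x is None or isinstance(x, (bool, int)):
            # Passed through unchanged so the method sees None/Bool/Int
            # values rather than 0-dim tensors.
            coerced.append(x)
        else:
            # Python float (and other unknown types) become 0-dim tensors.
            coerced.append(wrap_scalar(x))
    return coerced


class FakeTensor:
    """Stand-in for a tensor type, used only to exercise the sketch."""

    def __init__(self, value: Any) -> None:
        self.value = value


result = coerce_inputs(
    [None, True, 3, 2.5],
    is_tensor=lambda x: isinstance(x, FakeTensor),
    wrap_scalar=FakeTensor,
)
```

With PyTorch installed, one would presumably pass `lambda x: isinstance(x, torch.Tensor)` and `torch.tensor` for the two callables; only the `2.5` entry above ends up wrapped.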