Use typed RaggedDim sentinel for ragged dimensions #320
Merged
Ragged dims were previously encoded as raw `null` entries in
`TensorType.dims`, violating the implicit non-null-element contract of
`Iterable<Dimension<?>>` and erasing the structural raggedness signal for
downstream consumers (Hybridize's input-signature inference collapsed `null`
to `SymbolicDim("?")`). Replace with a typed `RaggedDim extends
Dimension<Void>` plus `DimensionType.Ragged`, migrate the seven emit sites in
the five ragged generators, and update `TensorShapeUtil.areBroadcastable` and
`TensorShapeUtil.getBroadcastedShapes` to treat `RaggedDim` as
broadcast-compatible and propagate it through the result (preserving the
legacy null-as-compatible behavior the ragged-broadcast tests depend on).
Ragged-test fixtures renamed from `TENSOR_*_NONE_*` to `TENSOR_*_RAGGED_*` to
reflect the new representation; genuinely-shared `TENSOR_2_NONE_INT32`
(testDataset10's `DatasetFromGeneratorGenerator` output) retained as-is.
The dynamic-batch / placeholder `null` sites in `Input.java`,
`FlowFromDirectoryGenerator.java`, and `TensorGenerator.java` are out of
scope here; tracked separately at wala#545.
Closes wala#544.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per review on #320: the explanation of why `super(null)` is mechanically necessary is an implementation detail, not contract-level documentation. Tag it `@implNote` so the description line stays focused on what the constructor does. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
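The Javadoc split described in that commit can be sketched as follows. This is a minimal illustration, not the project's code: the `Dimension` base class here is a simplified stand-in with an assumed `value()` accessor, and only the `RaggedDim extends Dimension<Void>` shape and the `super(null)` call come from the discussion above.

```java
// Hypothetical stand-in for the project's base class; the real Dimension in
// TensorType.java may have different fields and method names.
abstract class Dimension<T> {
  private final T value;

  protected Dimension(T value) {
    this.value = value;
  }

  /** Returns the wrapped value, or null when the dimension carries none. */
  public T value() {
    return value;
  }
}

final class RaggedDim extends Dimension<Void> {
  /**
   * Creates a sentinel marking a ragged dimension.
   *
   * @implNote {@code Void} has no instances, so {@code null} is the only
   *     argument that can be passed to the superclass constructor.
   */
  RaggedDim() {
    super(null);
  }

  @Override
  public String toString() {
    return "ragged";
  }
}

class ImplNoteSketch {
  public static void main(String[] args) {
    Dimension<?> dim = new RaggedDim();
    // The list element itself is non-null even though it wraps no value.
    System.out.println(dim + " wraps " + dim.value());
  }
}
```

The description line states what the constructor does; the `@implNote` carries the mechanical reason for `super(null)`.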
Pull request overview
This PR introduces a typed sentinel (TensorType.RaggedDim) to represent ragged tensor dimensions (previously encoded as raw null entries), and updates ragged tensor generators, broadcasting utilities, and test fixtures to use and recognize this representation.
Changes:
- Add `TensorType.RaggedDim` and `DimensionType.Ragged` to represent ragged dimensions explicitly.
- Update ragged tensor shape emitters to append `new RaggedDim()` instead of `null`.
- Teach `TensorShapeUtil` broadcast logic to treat `RaggedDim` as broadcast-compatible and to propagate raggedness into the broadcast result; update tests accordingly.
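A rough sketch of the broadcast rule described above, under stated assumptions: `Dim`, `NumericDim`, and `BroadcastSketch` are simplified, hypothetical stand-ins rather than the real `TensorType` and `TensorShapeUtil` APIs, and returning `null` for incompatible shapes replaces whatever exception the project throws. Only the behavior "a ragged dim is compatible with anything and survives into the result" is taken from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified dimension model; not the project's TensorType API.
interface Dim {}

final class NumericDim implements Dim {
  final int size;

  NumericDim(int size) {
    this.size = size;
  }

  @Override
  public String toString() {
    return Integer.toString(size);
  }
}

final class RaggedDim implements Dim {
  @Override
  public String toString() {
    return "ragged";
  }
}

class BroadcastSketch {
  /** Compatible if either dim is ragged, either size is 1, or both sizes match. */
  static boolean compatible(Dim a, Dim b) {
    if (a instanceof RaggedDim || b instanceof RaggedDim) return true;
    int x = ((NumericDim) a).size;
    int y = ((NumericDim) b).size;
    return x == y || x == 1 || y == 1;
  }

  /** Broadcasts right-aligned shapes; returns null when they are incompatible. */
  static List<Dim> broadcast(List<Dim> a, List<Dim> b) {
    int rank = Math.max(a.size(), b.size());
    List<Dim> out = new ArrayList<>();
    for (int i = 0; i < rank; i++) {
      int ia = a.size() - rank + i;
      int ib = b.size() - rank + i;
      // Missing leading dims behave like size 1.
      Dim da = ia >= 0 ? a.get(ia) : new NumericDim(1);
      Dim db = ib >= 0 ? b.get(ib) : new NumericDim(1);
      if (!compatible(da, db)) return null;
      if (da instanceof RaggedDim || db instanceof RaggedDim) {
        out.add(new RaggedDim()); // Raggedness propagates into the result.
      } else {
        int x = ((NumericDim) da).size;
        int y = ((NumericDim) db).size;
        out.add(new NumericDim(x == 1 ? y : x));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Dim> left = Arrays.asList(new NumericDim(2), new RaggedDim());
    List<Dim> right = Arrays.asList(new NumericDim(1), new NumericDim(4));
    System.out.println(broadcast(left, right));
  }
}
```

With a raw `null` in place of `RaggedDim`, the same logic would need a null check on every element access; the typed sentinel turns that into an ordinary `instanceof` branch.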
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/util/TensorShapeUtil.java | Extend broadcast compatibility/propagation rules to recognize RaggedDim. |
| com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/types/TensorType.java | Add RaggedDim sentinel + enum tag for typed ragged dimensions. |
| com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/RaggedTensorFromValues.java | Emit ragged shapes using RaggedDim instead of raw null. |
| com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/RaggedRange.java | Emit ragged axis using RaggedDim. |
| com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/RaggedFromNestedValueRowIds.java | Emit K ragged dimensions using RaggedDim. |
| com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/RaggedFromNestedRowLengths.java | Emit K ragged dimensions using RaggedDim. |
| com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/RaggedConstant.java | Emit ragged dimensions using RaggedDim. |
| com.ibm.wala.cast.python.ml.test/source/com/ibm/wala/cast/python/ml/test/TestTensorflow2Model.java | Rename/update ragged fixtures to use RaggedDim-based shapes. |
Per CodeQL / github-code-quality bot on #320: the `Long` boxed counter at `RaggedConstant.java:432` is never null and never needs to be. Switching to primitive `long` removes misleading nullability and the per-iteration unboxing of `R` from `long`-vs-`Long` comparison. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
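The boxed-versus-primitive counter difference can be illustrated with a small, hypothetical loop; `CounterSketch` and its methods are invented for this sketch and do not reproduce the code at `RaggedConstant.java:432`.

```java
class CounterSketch {
  /** Primitive counter: the comparison and increment involve no boxing. */
  static long countPrimitive(long limit) {
    long count = 0;
    for (long i = 0; i < limit; i++) {
      count++;
    }
    return count;
  }

  /** Boxed counter: each comparison unboxes i, and each increment unboxes and re-boxes it. */
  static long countBoxed(long limit) {
    long count = 0;
    for (Long i = 0L; i < limit; i++) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    System.out.println(countPrimitive(5) == countBoxed(5));
  }
}
```

Both loops compute the same result; the primitive form also makes it visible in the type that the counter can never be `null`.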
Per Copilot review on #320: the constant name suggests a rank-4 shape with three unknown dims, but the value was constructed with only two dimensions (`NumericDim(2), null`). The constant carried `@SuppressWarnings("unused")` and had no readers anywhere in the suite. Deleting clears the inconsistency without behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
| Coverage Diff | master | #320 | +/- |
|---|---|---|---|
| Coverage | 70.80% | 70.82% | +0.01% |
| Complexity | 2606 | 2608 | +2 |
| Files | 266 | 266 | |
| Lines | 19815 | 19852 | +37 |
| Branches | 3195 | 3198 | +3 |
| Hits | 14031 | 14061 | +30 |
| Misses | 4504 | 4510 | +6 |
| Partials | 1280 | 1281 | +1 |
This was referenced May 22, 2026
khatchad added a commit to ponder-lab/Hybridize-Functions-Refactoring that referenced this pull request May 27, 2026
…typed dim sentinels
Mechanical update of 42 `testHasLikelyTensorParameter*` expected `TensorType`s to match the new typed-sentinel encoding Ariadne 0.45.0 emits per wala/ML#545 (`DynamicDim`) and ponder-lab/ML#320 (`RaggedDim`):
- Tests 100-104, 124, 125 (`keras.Input`): leading batch dim `null` → `DynamicDim.INSTANCE`.
- Tests 59-66 (`RaggedTensor.from_nested_row_splits`), 71-82 (`RaggedTensor.from_row_*` / `from_value_rowids`): interior ragged dims `null` → `RaggedDim.INSTANCE`.
- Tests 67-70 (`from_nested_value_rowids`), 118-122 (`from_nested_row_lengths`): mixed shape `(4, None, None, None)` → ragged dims at positions 1-2 (`RaggedDim.INSTANCE`) plus a dynamic flat-values dim at position 3 (`DynamicDim.INSTANCE`).
- Tests 112-117 (`ragged.range`): both parameter and signature dims tighten from `null` / `SymbolicDim("?")` to `NumericDim(5)` per ponder-lab/ML#332's `RaggedRange` precision improvement.
Also adds the two new imports (`DynamicDim`, `RaggedDim`) and drops the now-resolved typed-sentinel TODO from `testHasLikelyTensorParameter59()`'s docstring; the Hybridize#524 `RaggedTensorSpec` emission TODO stays.
3 tasks
khatchad added a commit to ponder-lab/Hybridize-Functions-Refactoring that referenced this pull request May 27, 2026
The per-position consensus loop in `inferSpec` collapsed every non-`NumericDim`
to a `SymbolicDim("?")` wildcard via a fall-through `else`. With Ariadne 0.45.0
shipping typed dim sentinels (`DynamicDim` per wala/ML#545; `RaggedDim` per
ponder-lab/ML#320), the consumer can name those cases explicitly.
Splits the fan-out into independent branches:
- Disagreement across contexts → wildcard.
- `instanceof NumericDim` → keep the concrete value.
- `instanceof DynamicDim` → wildcard (`tf.keras.Input`-style batch axes encode
as `SymbolicDim("?")` in `TensorSpec`; same surface behavior as before).
- `instanceof RaggedDim` → wildcard, carrying a `TODO(#524)` marker so the
`RaggedTensorSpec` emission work lands on a clearly-anchored branch.
- Otherwise → wildcard.
Surface behavior is unchanged; the existing `testHasLikelyTensorParameter*`
suite (492 tests, all green) covers the regression surface. Drops the stale
`wala/ML#544` reference from the inline comment—the typed-sentinel half of
that TODO shipped in Ariadne 0.45.0.
Closes #561.
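The branch structure listed in that commit message might look roughly like the sketch below. Everything here is an assumption made for illustration: the `Dim` types are simplified stand-ins for Ariadne's, `consensus` is a hypothetical helper rather than the real `inferSpec`, and the spec entry is modeled as a plain string with `"?"` standing in for `SymbolicDim("?")`.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins for Ariadne's dimension types.
interface Dim {}

final class NumericDim implements Dim {
  final int size;

  NumericDim(int size) {
    this.size = size;
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof NumericDim && ((NumericDim) o).size == size;
  }

  @Override
  public int hashCode() {
    return Integer.hashCode(size);
  }
}

final class DynamicDim implements Dim {
  static final DynamicDim INSTANCE = new DynamicDim();

  private DynamicDim() {}
}

final class RaggedDim implements Dim {
  static final RaggedDim INSTANCE = new RaggedDim();

  private RaggedDim() {}
}

class ConsensusSketch {
  static final String WILDCARD = "?";

  /** Picks the spec entry for one position from the dims seen in each context (non-empty). */
  static String consensus(List<Dim> dimsAtPosition) {
    Dim first = dimsAtPosition.get(0);
    for (Dim d : dimsAtPosition) {
      if (!d.equals(first)) return WILDCARD; // Disagreement across contexts.
    }
    if (first instanceof NumericDim) {
      return Integer.toString(((NumericDim) first).size); // Keep the concrete value.
    }
    if (first instanceof DynamicDim) {
      return WILDCARD; // Batch-style axis.
    }
    if (first instanceof RaggedDim) {
      return WILDCARD; // Named branch where a ragged-specific spec could later be emitted.
    }
    return WILDCARD; // Any other dimension kind.
  }

  public static void main(String[] args) {
    System.out.println(consensus(Arrays.<Dim>asList(new NumericDim(5), new NumericDim(5))));
    System.out.println(consensus(Arrays.<Dim>asList(new NumericDim(5), RaggedDim.INSTANCE)));
  }
}
```

Three of the branches return the same wildcard, which is why surface behavior is unchanged; the value of the split is that each case now has its own anchor for future work.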
Summary
- Add `RaggedDim extends Dimension<Void>` + `DimensionType.Ragged` in `TensorType.java`. Replaces raw `null` per-element entries that previously encoded ragged dimensions and violated the implicit non-null-element contract of `Iterable<Dimension<?>>`.
- Migrate the seven emit sites in the five ragged generators (`RaggedConstant`, `RaggedFromNestedRowLengths`, `RaggedFromNestedValueRowIds`, `RaggedRange`, `RaggedTensorFromValues`) from `shape.add(null)` to `shape.add(new RaggedDim())`.
- Update `TensorShapeUtil.areBroadcastable` and `TensorShapeUtil.getBroadcastedShapes` to recognize `RaggedDim` as broadcast-compatible and propagate it through the result. Without this, ragged-on-ragged broadcasts (`testAdd66` through `testAdd99`, `testGradient`, `testGradient2`) throw `NonBroadcastableShapesException`.
- Rename ragged-test fixtures from `TENSOR_*_NONE_*` to `TENSOR_*_RAGGED_*` to match the new representation. The genuinely-shared `TENSOR_2_NONE_INT32` (used by `testDataset10` / `testDataset10a`, whose output comes from `DatasetFromGeneratorGenerator`, still null-emitting) is retained as-is.
Out of scope
Dynamic-batch and placeholder `null` sites in `Input.java:137`, `Input.java:154`, `FlowFromDirectoryGenerator.java:75`, and `TensorGenerator.java:2683` still encode raw `null`. Tracked separately at wala#545.
Test plan
- `mvn clean install -DskipTests` from root passes.
- `mvn test` from root: 777 tests, 0 failures, 0 errors, 3 skipped.
Closes wala#544.
🤖 Generated with Claude Code