Allow selected zero-valid tileop verifier paths#720
Conversation
Codex Review该评论由 review 机器人自动更新。
SummaryVerifier relaxations in this PR are too broad for mgather/mscatter, thistogram, and trem/trems: they now accept zero-valid selector/tmp shapes that can still reach runtime on non-empty ops. Findings
The new
The histogram changes allow
These verifiers used to require the scratch tile to have at least one valid row; after this change they only reject negative rows. The new all-zero test case is fine, but the relaxation is broader than that: a kernel with |
There was a problem hiding this comment.
Code Review
This pull request enables support for statically empty valid regions (where valid dimensions can be zero) in the PTO IR and during lowering to EmitC. When a static valid dimension or row is zero, the lowering process now elides emitting hardware calls (such as TSTORE, TCVT, TMOV, and TPUSH), effectively treating them as no-ops. The review feedback suggests generalizing the hasStaticZeroValidDim and hasStaticZeroValidRow helper functions in PTOToEmitC.cpp to support other PTO types with valid shapes, such as TensorViewType and PartitionTensorViewType, rather than limiting them only to TileBufType.
| static bool hasStaticZeroValidDim(Type type) { | ||
| auto tileTy = dyn_cast<pto::TileBufType>(type); | ||
| if (!tileTy) | ||
| return false; | ||
| ArrayRef<int64_t> validShape = tileTy.getValidShape(); | ||
| return llvm::any_of(validShape, | ||
| [](int64_t dim) { return dim == 0; }); | ||
| } | ||
|
|
||
| static bool hasStaticZeroValidRow(Type type) { | ||
| auto tileTy = dyn_cast<pto::TileBufType>(type); | ||
| if (!tileTy) | ||
| return false; | ||
| ArrayRef<int64_t> validShape = tileTy.getValidShape(); | ||
| return !validShape.empty() && validShape[0] == 0; | ||
| } |
There was a problem hiding this comment.
The current implementation of hasStaticZeroValidDim and hasStaticZeroValidRow only supports pto::TileBufType. However, other types like pto::TensorViewType and pto::PartitionTensorViewType also have valid shapes and can be passed to operations like pto.tpush (as documented in the manual). If these types are used, the zero-valid dimension check will return false, preventing the elision of the EmitC hardware calls.
We should generalize these helpers to support all PTO types that have a valid shape.
static bool hasStaticZeroValidDim(Type type) {
ArrayRef<int64_t> validShape;
if (auto tileTy = dyn_cast<pto::TileBufType>(type))
validShape = tileTy.getValidShape();
else if (auto viewTy = dyn_cast<pto::TensorViewType>(type))
validShape = viewTy.getValidShape();
else if (auto partTy = dyn_cast<pto::PartitionTensorViewType>(type))
validShape = partTy.getValidShape();
else
return false;
return llvm::any_of(validShape, [](int64_t dim) { return dim == 0; });
}
static bool hasStaticZeroValidRow(Type type) {
ArrayRef<int64_t> validShape;
if (auto tileTy = dyn_cast<pto::TileBufType>(type))
validShape = tileTy.getValidShape();
else if (auto viewTy = dyn_cast<pto::TensorViewType>(type))
validShape = viewTy.getValidShape();
else if (auto partTy = dyn_cast<pto::PartitionTensorViewType>(type))
validShape = partTy.getValidShape();
else
return false;
return !validShape.empty() && validShape[0] == 0;
}a29013f to
a29413d
Compare
a29413d to
858a55d
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 858a55d45b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (srcValid[0] != ShapedType::kDynamic && dstValid[0] != ShapedType::kDynamic && | ||
| srcValid[0] != dstValid[0]) | ||
| return emitOpError("expects src and dst to have the same valid_shape[0]"); | ||
| if (srcValid[0] != ShapedType::kDynamic && srcValid[0] == 0) | ||
| return emitOpError("expects src valid_shape[0] to be non-zero"); | ||
| if (srcValid[1] != ShapedType::kDynamic && srcValid[1] == 0) | ||
| return emitOpError("expects src valid_shape[1] to be non-zero"); | ||
| if (dstValid[0] != ShapedType::kDynamic && dstValid[0] == 0) | ||
| return emitOpError("expects dst valid_shape[0] to be non-zero"); | ||
| if (dstValid[1] != ShapedType::kDynamic && dstValid[1] == 0) | ||
| return emitOpError("expects dst valid_shape[1] to be non-zero"); | ||
| return success(); |
There was a problem hiding this comment.
Reject non-empty expands from zero-valid scalar
When the source column vector has valid_shape[1] == 0 but the destination has a positive valid column count, this verifier now succeeds because the removed non-zero checks leave only the row-count comparison. That accepts a trowexpand that tries to fill a non-empty destination from an empty scalar column, and the EmitC lowering still emits TROWEXPAND unconditionally, so the invalid case can reach codegen instead of being rejected or modeled as an empty/no-op result.
Useful? React with 👍 / 👎.
858a55d to
1f9dab4
Compare
A5 板测成功
|
A3 板测失败
失败用例
|
A3 板测失败详情:PR #720syncall_binding
tprefetch_async_binding
|
…ons (hw-native-sys#708) PyPTO's dual-AIV no-op replay (SplitMode::None on a2a3) stamps a fully-empty valid region (v_row/v_col = 0) on the inactive lane's tiles. This makes that "no useful output" marker legal end-to-end through PTOAS for the affected ops, completing the PTOAS-side work for hw-native-sys#708 beyond hw-native-sys#720 (which fixed only the explicit-operand subview path). pto.subview inferred path (no valid operand, result type declares v=0): - SubViewOp::verify: accept dstValid[dim]==0 when the valid operand is omitted. - Lowering (PTOViewToMemref clampSubViewValidDim/lowerSubViewOps): derive the bind_tile valid extent from the result type's valid_shape instead of backfilling the subview size, so the 0 marker survives lowering instead of being silently widened back to a real write on the no-op lane. - SubViewOp::isCompatibleReturnTypes override (PTOOps.td extraClassDeclaration): accept a declared empty (0) in place of the size-inferred valid extent at parse time; verify() backstops any operand/type contradiction. Reduction / row-expand ops (blocker hw-native-sys#3): add a fully-empty-dst (valid==[0,0]) early-accept to verifyRowReductionValidRegion (trowmax/trowsum), verifyColReductionValidRegion (tcolmax/tcolsum), TRowExpandOp::verify, and verifyTRowExpandReduceLikeOp (trowexpandmax/min/expdif). Only the [0,0] marker is accepted; one-sided/partial empties still fall through to the existing strict checks, so hw-native-sys#720's *_invalid.pto negative tests stay green. Scope / deferred: - This commits to the "hardware Rv=0 is a no-op" branch of the item-3 audit (pto-isa#143), which is not verified here; mitigated by accepting only the fully-empty marker (an op that writes zero elements). - EmitC still emits the instruction for a v=0 op (e.g. TSTORE/TROWMAX); ISA-level elision/no-op is out of scope, matching hw-native-sys#720. - TRowExpandAddOp and the argmax/argmin family are left strict (not in scope). Tests: 4 new lit fixtures (subview inferred positive + 2 negative bounds; reductions positive). Full lit suite passes (287/287). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fixes #708.
This PR relaxes verifier/type semantics for selected zero-valid tile regions:
Validation: