
fix(pass): store iter-arg returns to InOut params in ConvertTensorToTileOps #783

Merged
lyfne123 merged 1 commit into hw-native-sys:main from zhangqi-chen:fix/iter-arg-store-sinking-in-convert-tensor-to-tile
Mar 31, 2026

Conversation

@zhangqi-chen
Collaborator

@zhangqi-chen zhangqi-chen commented Mar 30, 2026

Summary

  • ConvertTensorToTileOps: for InCore returns that feed back as ForStmt iter-args, tile.store now targets the existing In param (auto-promoted to InOut) instead of adding new Out params. The store is placed outside the IfStmt, referencing the phi variable directly — no store sinking into branches needed.
  • InitMemRef: tile alias assignments (a = b) now share the source's MemRef instead of allocating a fresh one, preventing empty-memref aliases in IfStmt branches.
  • MemoryReuse YieldFixupMutator: when IfStmt branches yield tiles with different MemRefs, tile.move is inserted in the else branch to unify to the then-branch's canonical MemRef (mirroring the existing ForStmt pattern).
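
The InitMemRef change in the second bullet can be pictured with a small toy model. This is only a sketch under stated assumptions: `MemRefTable`, `Define`, and `Alias` are hypothetical names invented here, and MemRefs are reduced to integer ids; none of this is the pass's actual data structure.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical toy model (not the real InitMemRef code): each tile
// variable name maps to an integer MemRef id.
struct MemRefTable {
  std::unordered_map<std::string, int> memref_of;
  int next_id = 0;

  // A tile produced by a real op gets a fresh MemRef.
  int Define(const std::string& var) {
    memref_of[var] = next_id;
    return next_id++;
  }

  // An alias assignment `dst = src` reuses the source's MemRef
  // instead of allocating a new (and otherwise empty) one.
  int Alias(const std::string& dst, const std::string& src) {
    int id = memref_of.at(src);
    memref_of[dst] = id;
    return id;
  }
};
```

With this model, `Define("a")` followed by `Alias("b", "a")` leaves both names on the same id, so an alias created inside an IfStmt branch never ends up with a MemRef of its own.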

@coderabbitai

coderabbitai bot commented Mar 30, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Added utilities to detect/resolve tile-alias chains and locate yields, introduced an analysis to derive per-InCore iter-argument mappings from orchestration loops, extended TransformIncoreFunction to accept these mappings and track merged return indices, and added a store-sinking phase that inserts tile.store into IfStmt branches and updates returns accordingly.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Core transform & utilities: src/ir/transforms/convert_tensor_to_tile_ops_pass.cpp | Added helper utilities for tile-alias chain detection/resolution, backward yield discovery, and yield-adjacent insertion. |
| Iter-arg analysis: src/ir/transforms/convert_tensor_to_tile_ops_pass.cpp | Introduced IterArgMapping and AnalyzeIterArgMappings that scan orchestration functions for ForStmt loops with InCore calls and derive mappings from InCore return indices to loop iter-arg positions. |
| TransformIncoreFunction & driver: src/ir/transforms/convert_tensor_to_tile_ops_pass.cpp | Updated TransformIncoreFunction signature to accept IterArgMapping; added merged_return_indices to IncoreTransformResult. Implemented store-sinking into terminal IfStmt branches, tile-alias resolution for stored tiles, adjusted tensor_to_tile substitutions, and changed return processing to bypass new Out params for merged indices. Pass driver now computes iter-arg mappings once and supplies them per InCore function. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • lyfne123
  • Hzfengsy

Poem

🐰 I mapped the tiles where yields once played,

I sank the stores in branches, unafraid;
From loops to InCore, indices now sing,
Hop—merge the returns, and let the tiles spring!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 77.78%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding iter-arg-aware store sinking to promote In parameters to InOut and avoid redundant Out parameters in the ConvertTensorToTileOps pass. |
| Description check | ✅ Passed | The PR description clearly relates to the changeset, detailing specific implementation changes in ConvertTensorToTileOps, InitMemRef, and MemoryReuse components. |


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces an optimization to the ConvertTensorToTileOps pass by sinking tile.store operations into IfStmt branches when return values correspond to loop iteration arguments. It adds a new analysis phase to map InCore function returns to call arguments and includes utilities for resolving tile alias chains and manipulating statement lists. Feedback was provided regarding an inconsistency in the documentation for FindYieldStmt, which should be updated to reflect its recursive implementation.

@zhangqi-chen zhangqi-chen force-pushed the fix/iter-arg-store-sinking-in-convert-tensor-to-tile branch from f49d451 to b11bf7c Compare March 30, 2026 03:16

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ir/transforms/convert_tensor_to_tile_ops_pass.cpp`:
- Around line 1634-1642: The code currently picks last_if_stmt and
unconditionally replaces its result with fresh return_vars but only updates the
final ReturnStmt via tensor_to_tile, leaving any subsequent statements
referencing the old IfStmt result broken; fix by only sinking (modifying
last_if_stmt, last_if_index, and generating new_if_stmt) when that IfStmt is the
terminal non-return statement (i.e., last_if_index == new_stmts.size()-1 or all
following statements are safe returns), otherwise after creating new_if_stmt
perform a suffix rewrite that substitutes old_rv -> new_rv in every statement
after last_if_index (use the same tensor_to_tile substitution logic you use for
the ReturnStmt) so no later statement keeps referencing the dead/typed-out old
IfStmt; refer to last_if_stmt, last_if_index, new_stmts, new_if_stmt,
tensor_to_tile, ReturnStmt, and old_rv/new_rv when implementing the checks and
suffix substitution.
- Around line 466-468: The code currently stores a single mapping into
result[gvar->name_] (when discovering mapping for an InCore call inside a
ForStmt), so the last loop wins and can mis-map loop-carried args across
different call sites; change the assignment logic to detect conflicting mappings
for the same InCore function: when you find a mapping for gvar->name_, compare
it to any existing mapping in result[gvar->name_]; if they match do nothing, if
they differ mark that function as ambiguous (e.g., store a special sentinel or a
skip flag) and do not apply sinking for that callee in TransformIncoreFunction;
apply the same compare-or-mark-ambiguous change to the other occurrence handling
mappings (the block around the second occurrence noted in the review) so all
discovered mappings for a given InCore function must match before enabling store
sinking.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5c4d4389-5ae4-46b8-a3f5-9237f9d1177e

📥 Commits

Reviewing files that changed from the base of the PR and between eccb091 and f49d451.

📒 Files selected for processing (1)
  • src/ir/transforms/convert_tensor_to_tile_ops_pass.cpp


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
src/ir/transforms/convert_tensor_to_tile_ops_pass.cpp (2)

1632-1640: ⚠️ Potential issue | 🔴 Critical

Only sink when the rewritten IfStmt is terminal, or rewrite suffix uses.

The transform rewrites last_if_stmt in place, but statements after Line 1750 are not re-substituted from old return vars to new ones. If the chosen IfStmt is non-terminal, downstream users can reference stale vars/types.

Minimal safe guard
-    if (!sink_candidates.empty()) {
+    if (!sink_candidates.empty() && last_if_index + 1 == new_stmts.size()) {

Also applies to: 1748-1751

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ir/transforms/convert_tensor_to_tile_ops_pass.cpp` around lines 1632 -
1640, The current sink picks the last IfStmt (last_if_stmt) from new_stmts
without ensuring it is terminal; only sink (rewrite) when that IfStmt is
terminal (no statements after its index) or else update the trailing statements
to substitute old return variables/types to the newly rewritten ones; locate the
selection logic that builds last_if_stmt/last_if_index from new_stmts and either
(a) add a guard that skips sinking unless last_if_index == new_stmts.size()-1,
or (b) after rewriting IfStmt, walk new_stmts from last_if_index+1 onward and
apply the same return-var/type remapping used for the IfStmt so downstream uses
are updated.
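
The two options in this finding can be sketched on a toy statement list. Everything below is an assumption made for illustration: `ToyStmt`, `IsTerminal`, and `RewriteSuffix` are hypothetical names, statements are reduced to the variable names they read, and the trailing ReturnStmt that the real pass handles separately is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an IR statement: only the names it reads.
struct ToyStmt {
  std::vector<std::string> uses;
};

// Option (a): allow sinking only if the IfStmt is the last statement.
bool IsTerminal(std::size_t if_index, std::size_t num_stmts) {
  return if_index + 1 == num_stmts;
}

// Option (b): rename old return vars in every statement after the IfStmt,
// so no later statement keeps referring to the replaced variables.
void RewriteSuffix(
    std::vector<ToyStmt>& stmts, std::size_t if_index,
    const std::unordered_map<std::string, std::string>& old_to_new) {
  for (std::size_t i = if_index + 1; i < stmts.size(); ++i) {
    for (std::string& name : stmts[i].uses) {
      auto it = old_to_new.find(name);
      if (it != old_to_new.end()) name = it->second;
    }
  }
}
```

Option (a) is the cheaper guard and matches the one-line diff suggested above; option (b) keeps the optimization available for non-terminal IfStmts at the cost of a pass over the suffix.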

466-468: ⚠️ Potential issue | 🟠 Major

Handle conflicting iter-arg mappings per callee instead of last-writer-wins.

At Line 467, a later loop can overwrite an earlier mapping for the same InCore callee. If call-site mappings differ, sinking can target the wrong parameter for one caller.

Suggested fix
 std::unordered_map<std::string, IterArgMapping> AnalyzeIterArgMappings(
     const std::vector<FunctionPtr>& functions) {
   std::unordered_map<std::string, IterArgMapping> result;
+  std::unordered_set<std::string> ambiguous_callees;
@@
-          if (!mapping.empty()) {
-            result[gvar->name_] = std::move(mapping);
+          if (!mapping.empty()) {
+            if (ambiguous_callees.count(gvar->name_) > 0) {
+              break;
+            }
+            auto it_existing = result.find(gvar->name_);
+            if (it_existing == result.end()) {
+              result.emplace(gvar->name_, mapping);
+            } else if (it_existing->second != mapping) {
+              result.erase(it_existing);
+              ambiguous_callees.insert(gvar->name_);
+            }
             break;  // Found the right InCore call for this ForStmt
           }
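
For readers who want to try the compare-or-mark-ambiguous idea outside the pass, here is a self-contained sketch of the same logic. It assumes a mapping is simply "return index to call-arg position"; `ToyMapping`, `MappingTable`, and `Record` are hypothetical names, and the real `IterArgMapping` type may look different.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Assumption: return index -> call-arg position. Not the real IterArgMapping.
using ToyMapping = std::map<int, int>;

struct MappingTable {
  std::unordered_map<std::string, ToyMapping> result;
  std::unordered_set<std::string> ambiguous;

  // Record a mapping seen at one call site. Identical mappings are accepted;
  // a conflicting one removes the callee and marks it ambiguous permanently,
  // so a later call site cannot re-enable it.
  void Record(const std::string& callee, const ToyMapping& mapping) {
    if (mapping.empty() || ambiguous.count(callee) > 0) return;
    auto it = result.find(callee);
    if (it == result.end()) {
      result.emplace(callee, mapping);
    } else if (it->second != mapping) {
      result.erase(it);
      ambiguous.insert(callee);
    }
  }
};
```

A consumer would then enable the iter-arg handling only for callees still present in `result`, which avoids the last-writer-wins behavior described in this finding.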
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ir/transforms/convert_tensor_to_tile_ops_pass.cpp`:
- Around line 368-374: The loop currently skips traversing ForStmt bodies when
for_stmt->iter_args_ is empty, causing nested loops/calls to be missed; change
the control flow so that if stmt is a ForStmt you always push its body (e.g.,
worklist.push_back(FlattenToStmts(for_stmt->body_))) into the worklist even when
for_stmt->iter_args_.empty(), and only skip recursion/continue for non-ForStmt
cases (still handle WhileStmt as before); update the conditional around
As<ForStmt>(stmt) and the use of iter_args_ to ensure ForStmt bodies are always
traversed.
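
A minimal sketch of the traversal this finding asks for, on a toy IR. `ToyNode` and `CollectCalls` are hypothetical and stand in for the real statement classes and worklist; the point is only that a loop body is pushed whether or not the loop has iter-args.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy IR: a node is either a call or a for-loop with a body.
struct ToyNode {
  enum class Kind { Call, For };
  Kind kind = Kind::Call;
  std::string callee;          // used when kind == Call
  bool has_iter_args = false;  // used when kind == For
  std::vector<ToyNode> body;   // used when kind == For
};

// Collect every callee name in program order, descending into loop bodies
// regardless of whether the loop carries iter-args.
std::vector<std::string> CollectCalls(const std::vector<ToyNode>& top) {
  std::vector<std::string> calls;
  std::vector<const ToyNode*> worklist;
  for (auto it = top.rbegin(); it != top.rend(); ++it) worklist.push_back(&*it);
  while (!worklist.empty()) {
    const ToyNode* node = worklist.back();
    worklist.pop_back();
    if (node->kind == ToyNode::Kind::Call) {
      calls.push_back(node->callee);
      continue;
    }
    // Always push the body: has_iter_args would only decide whether a
    // mapping is derived for this loop, not whether it is traversed.
    for (auto it = node->body.rbegin(); it != node->body.rend(); ++it) {
      worklist.push_back(&*it);
    }
  }
  return calls;
}
```

In this model a call nested inside an inner loop is still found when the outer loop has no iter-args, which is the case the finding says is currently missed.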


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f8c4b7f5-3cdc-4c4f-a829-6181d665b3ad

📥 Commits

Reviewing files that changed from the base of the PR and between f49d451 and b11bf7c.

📒 Files selected for processing (1)
  • src/ir/transforms/convert_tensor_to_tile_ops_pass.cpp

@zhangqi-chen zhangqi-chen changed the title fix(pass): add iter-arg-aware store sinking to ConvertTensorToTileOps [WIP]fix(pass): add iter-arg-aware store sinking to ConvertTensorToTileOps Mar 30, 2026
@zhangqi-chen zhangqi-chen force-pushed the fix/iter-arg-store-sinking-in-convert-tensor-to-tile branch from b11bf7c to 46048ff Compare March 31, 2026 08:27
@zhangqi-chen zhangqi-chen changed the title [WIP]fix(pass): add iter-arg-aware store sinking to ConvertTensorToTileOps fix(pass): store iter-arg returns to InOut params in ConvertTensorToTileOps Mar 31, 2026
When an InCore function returns tile values that feed back as iter-args
in a ForStmt loop, the pass now stores to the existing In param
(auto-promoted to InOut by UpgradeWrittenTensorParamDirections) instead
of adding new Out params. The tile.store is placed outside the IfStmt,
referencing the phi variable directly.

Supporting fixes in downstream passes:
- InitMemRef: tile alias assignments (a = b) now share the source's
  MemRef instead of allocating a fresh one, preventing empty-memref
  aliases in IfStmt branches
- MemoryReuse YieldFixupMutator: when IfStmt branches yield tiles with
  different MemRefs, insert tile.move in the else branch to unify to
  the then-branch's canonical MemRef (mirroring the existing ForStmt
  tile.move pattern)

Key additions:
- AnalyzeIterArgMappings: scans orchestration functions to map InCore
  return indices to their corresponding call arg positions via
  yield/iter-arg tracing
- TransformIncoreFunction Phase 3: for returns with iter-arg mappings,
  tile.store targets the In param directly (no new Out param, no
  tensor.create at call site)
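
The yield/iter-arg tracing described for AnalyzeIterArgMappings can be sketched on a toy loop model. This is a simplified reading of the commit message, not the pass's code: `ToyLoop` and `DeriveMapping` are hypothetical, variables are plain strings, and it is assumed that `yields[i]` feeds `iter_args[i]`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model of one ForStmt containing one InCore call.
struct ToyLoop {
  std::vector<std::string> iter_args;  // loop-carried variables
  std::vector<std::string> call_args;  // arguments passed to the InCore call
  std::vector<std::string> call_rets;  // variables bound to the call's returns
  std::vector<std::string> yields;     // assumed: yields[i] feeds iter_args[i]
};

// For each return index r: find the yield position p that yields the r-th
// result, then the call-arg position a that passes iter_args[p].
// The result maps r -> a; returns that are not loop-carried are skipped.
std::map<std::size_t, std::size_t> DeriveMapping(const ToyLoop& loop) {
  std::map<std::size_t, std::size_t> mapping;
  for (std::size_t r = 0; r < loop.call_rets.size(); ++r) {
    for (std::size_t p = 0;
         p < loop.yields.size() && p < loop.iter_args.size(); ++p) {
      if (loop.yields[p] != loop.call_rets[r]) continue;
      for (std::size_t a = 0; a < loop.call_args.size(); ++a) {
        if (loop.call_args[a] == loop.iter_args[p]) {
          mapping[r] = a;
          break;
        }
      }
      break;
    }
  }
  return mapping;
}
```

For a loop with iter-arg `acc`, call args `(x, w, acc)`, returns `(tmp, acc_next)`, and yield `acc_next`, the sketch maps return index 1 to call-arg position 2, which is the parameter a store would target under this model.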
@zhangqi-chen zhangqi-chen force-pushed the fix/iter-arg-store-sinking-in-convert-tensor-to-tile branch from 46048ff to 871edd6 Compare March 31, 2026 09:24
2 participants