Skip to content

[SandboxIR][Region] Auxiliary vector metadata now requires a region #137443

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 27, 2025

Conversation

vporpo
Copy link
Contributor

@vporpo vporpo commented Apr 26, 2025

The auxiliary vector is represented by the !sandboxaux metadata. But until now adding an instruction to the aux vector would not automatically add it to the region (i.e., it would not mark it with !sandboxvec). This behavior was problematic when generating regions from metadata, because you wouldn't know which region owns the auxiliary vector.

This patch fixes this issue: We now require that an instruction with !sandboxaux also defines its region with !sandboxvec.

@vporpo
Copy link
Contributor Author

vporpo commented May 23, 2025

ping

Copy link
Collaborator

@slackito slackito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TrackScore changes look unrelated to the rest of the patch.

@vporpo
Copy link
Contributor Author

vporpo commented May 24, 2025

The TrackScore changes look unrelated to the rest of the patch.

Yeah it looks unrelated, but it is a workaround to avoid affecting the region's cost when adding an instruction to an aux vector. Let me explain: The way we calculate the region's total instruction cost at the moment is by keeping track of all the instructions added to the region. But with this patch we want any instruction in an aux vector to also be in a region. But doing that would affect the region's cost, which is undesirable.

We will eventually move away from this region-based cost tracking, which should clean this up.

@slackito
Copy link
Collaborator

Hm, it's unfortunate that we have to break the invariant that the cost of the region is the cost of the instructions in it. So now we can have an instruction added to a Region just because it's in an aux vector, but it doesn't count towards the cost. What if later we decide to expand the Region by adding that instruction "for real" and want its cost to be counted?

Is there another way to encode the association between a Region and its aux vector in metadata so we don't have to do this?

@vporpo
Copy link
Contributor Author

vporpo commented May 24, 2025

When we add an instruction to the region "for real" its cost will always be counted, because the Sandbox IR callbacks will call Region::add() and count the cost. It's only when we add an instruction to an auxiliary vector that we don't count the cost, which makes sense because even though the instruction is tagged as being in the region it hasn't actually been added in the "normal" way, i.e., by creating it and adding it to the BB.

So yes, this patch breaks the assumption that the cost in the scoreboard corresponds to the instructions that are tagged as members of a Region, but I don't think this is a very important assumption. I think we should focus more on getting the Region design as nice as possible and not focus on its interaction with the cost tracking, because conceptually the cost tracking should be disjoint from the Region anyway. I mean the Region is an IR-level concept that helps us with serialization and deserialization, while the cost tracking is specific to the vectorizer and can be implemented in many different ways, including having it completely removed from Region.

I think the main question is whether we want to allow Aux instructions not being "members" of a region but still being "owned" by a Region.
For example with the help of "!ownedby" we could describe the following:

define void @foo(i8 %v) {
  %t0 = add i8 %v, 0, !sandboxvec !0, !sandboxaux !1 // %t0 is a member of region !0 and also at Aux[0] of region !0
  %t1 = add i8 %v, 1, !sandboxaux !1, !ownedby !0 // %t1 is not in Region !0 but it's in Aux[1]of region !0
  ret void
}
!0 = distinct !{!"sandboxregion"}
!1 = !{i32 0}
!2 = !{i32 1}

So when you parse the IR for this code you will create a Region, place %t0 in its members and at Aux[0], then add %t1 to Aux[1] but not in its members.

But I think this complicates things a bit. I am trying to think of a reason why would want to support this other than the issue with the cost calculation.

Copy link
Collaborator

@slackito slackito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have a plan to untangle Region and cost tracking I don't think it's worth blocking on this now. Approving with a small nit.

@@ -115,7 +115,7 @@ class Region {
Context::CallbackID EraseInstCB;

/// Adds I to the set.
void add(Instruction *I);
void add(Instruction *I, bool TrackScore = true);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment explaining what's the intended use case for passing false here. Actually, if it doesn't make sense outside of setAux and createRegionsFromMD we could even add a separate overload (instead of a default argument) and make the 2-argument overload private. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree. Having two separate functions and keeping the 2-argument one private makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm add() is already private because we don't want the user to start messing with the region instructions because they get added in by the handlers. But anyway I created an addImpl(Instruction *I, bool DontTrackScore) which is for internal use only.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's private then I'm fine with having an optional arg. Sorry I didn't notice at first.

Nit: booleans with a negative name can be confusing to reason about, I liked TrackScore better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to keep two functions because it make is more obvious which one is the one you should avoid.

Regarding the boolean name, it is indeed confusing. How about IgnoreScore or SkipScore? I prefer the flipped logic compared to the original version because it kind of implies that tracking the score is the normal think to do.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

The auxiliary vector is represented by the !sandboxaux metadata.
But until now adding an instruction to the aux vector would not automatically
add it to the region (i.e., it would not mark it with !sandboxvec).
This behavior was problematic when generating regions from metadata, because
you wouldn't know which region owns the auxiliary vector.

This patch fixes this issue: We now require that an instruction with
!sandboxaux also defines its region with !sandboxvec.
@vporpo vporpo merged commit a2b5fd7 into llvm:main May 27, 2025
11 checks passed
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
…lvm#137443)

The auxiliary vector is represented by the !sandboxaux metadata. But
until now adding an instruction to the aux vector would not
automatically add it to the region (i.e., it would not mark it with
!sandboxvec). This behavior was problematic when generating regions from
metadata, because you wouldn't know which region owns the auxiliary
vector.

This patch fixes this issue: We now require that an instruction with
!sandboxaux also defines its region with !sandboxvec.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants