[TIR] Fix plan buffer allocation location for loop carried dependencies #12757
The pass `PlanAndUpdateBufferAllocationLocation` appears to mishandle the case where the indices of a buffer access involve a loop-carried dependency. As an example, the block `b1`'s read access to the intermediate buffer `C` on iteration `i` depends on `b0`'s write of `C` on both `i` and `i-1`, so we should not put the allocation of `C` under loop `i`, which is the LCA position chosen by the current planning strategy.

To fix the issue, we change the behavior of `DetectBufferLCA` to be aware of opaque block iters (loop-carried dependencies and other more complex behaviors are categorized as `opaque` in the iter type annotation). It enforces that every legal "ancestor" of the buffer accesses must dominate all loops related to the opaque block iters used within the buffer indices. E.g., since `vi` is opaque and the indices of buffer `C` use `vi`, loop `i` must be under the planned allocation point of `C`.

For an interesting workload related to loop-carried dependencies, refer to https://discuss.tvm.apache.org/t/rfc-introducing-a-rolling-buffer-scheduling-primitive/9836, where the intermediate results of the previous iteration are reused as much as possible.
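
To make the intended rule concrete, here is a minimal, self-contained sketch in plain Python. It is not the actual `DetectBufferLCA` implementation and does not use the TVM API. The `Access` class, the `plan_alloc_site` function, and the loop names are hypothetical, and the loop nest is modeled simply as a list of loop names from outermost to innermost. The sketch first takes the plain LCA of all accesses and then lifts it above any loop that is bound to an opaque block iter appearing in the buffer indices.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Access:
    """One access to a buffer (hypothetical toy model, not a TVM class).

    loop_path:    names of the enclosing loops, outermost first.
    opaque_loops: loops bound to opaque block iters that appear in the
                  buffer indices of this access.
    """
    loop_path: List[str]
    opaque_loops: Set[str] = field(default_factory=set)


def common_prefix(a: List[str], b: List[str]) -> List[str]:
    """Longest common prefix of two loop paths, i.e. their LCA in the loop tree."""
    out = []
    for x, y in zip(a, b):
        if x != y:
            break
        out.append(x)
    return out


def plan_alloc_site(accesses: List[Access]) -> List[str]:
    """Return the loops the allocation is nested under ([] means the root)."""
    # Step 1: plain LCA of all access sites.
    lca = list(accesses[0].loop_path)
    for acc in accesses[1:]:
        lca = common_prefix(lca, acc.loop_path)

    # Step 2: the allocation site must dominate every loop bound to an
    # opaque iter used in the indices, so cut the path just above that loop.
    for acc in accesses:
        for depth, loop in enumerate(acc.loop_path):
            if loop in acc.opaque_loops:
                lca = lca[:depth]
    return lca


# b0 writes C and b1 reads C; both sit under loop "i" and index C with vi.
plain = [Access(["i", "j"]), Access(["i", "k"])]
carried = [Access(["i", "j"], {"i"}), Access(["i", "k"], {"i"})]

print(plan_alloc_site(plain))    # allocation under loop "i"
print(plan_alloc_site(carried))  # allocation lifted above loop "i"
```

In this toy model, treating `vi` as a regular iter keeps the allocation of `C` under loop `i`, while marking it opaque moves the allocation to the root, so the values written on iteration `i-1` remain available on iteration `i`.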
cc @Hzfengsy @junrushao1994