-
Notifications
You must be signed in to change notification settings - Fork 15.5k
Closed
Labels
Description
The current loop-fusion pass contains redundant and unnecessary checks that are causing multiple issues and crashes (e.g., #166560, #166535, #165031, #80301). In addition, a required domination check is missing (#168263). Considering that the current loop fusion only supports adjacent loops, we are able to simplify the checks in this pass. Following plans outline how these loop fusion checks should be restructured. There are two approaches we have considered and tested:
Approach 1: Simplify isControlFlowEquivalent
This function is currently unreliable and unnecessary. Its existing uses can be replaced or eliminated:
- During candidate collection (collectFusionCandidates): A simplified version of
isControlFlowEquivalentthat only verifies if first loop dominates the second and the second post-dominates the first would be sufficient here. - During fusion (fuseCandidates) in
isSafeToMoveBefore: At this stage, loop legality has already been verified the control-flow of the loops, and re-checking control-flow equivalence is redundant.1
Approach 2: Remove isControlFlowEquivalent
In this approach, we check the adjacency of the loops during collection rather than fusion:
- During candidate collection (collectFusionCandidates):
isControlFlowEquivalentis replaced withisAdjacent. - During fusion (fuseCandidates): Condition
isAdjacentis removed.isControlFlowEquivalentis also removed fromisSafeToMoveBeforebecause of the same reason as in the first approach.
- In this approach, dominancy and post-dominancy of the loops are not explicitly checked anymore. No need to check whether
PDT->dominates(FC1, FC0), as this condition is already verified implicitly in isAdjacent. But for the guarded loops, isAdjacent is not a sufficient check to make sure thatDT->dominates(FC0, FC1). So one additional change that comes with this approach is to check this dominancy along with isAdjacent.
Additional Effects
- Simplify FusionCandidateCompare
In either approaches, ordering can be determined only by checking which loop dominates the other, reducing the number of required checks here.
Comparing the computational complexity2
-
Original loop fusion
- During collection (collectFusionCandidates), it checks each new candidate with the beginning of each candidate set, O(n.k).
- During fusion (fuseCandidates), it checks each loop with all others in the set, O(n^2/k).
-
Approach 1
- During collection (collectFusionCandidates), it checks each new candidate with the beginning of each candidate set, O(n.k).
- During fusion (fuseCandidates), it checks each loop only with its following loop in the set, since it is the only possible adjacent loop, O(n).
-
Approach 2
- During collection (collectFusionCandidates), since the condition to keep in the same set is adjacency, each new candidates need to be checked with all members of all sets, O(n^2).
- During fusion (fuseCandidates), it checks each loop only with its following loop in the set, since it is the only adjacent loop, O(n).