-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Description
This task is part of feature #18249 illustrating join order enumeration for Q5 using a Join Ranking, where higher ranks indicate earlier execution. It assumes ranks are precomputed and focuses solely on enumeration and pruning. Rank computation and pruning strategies will be described in separated tickets
Enumeration Examples
Figure 2 shows three joins ranked 4, one ranked 3, and two ranked 2. The three rank-4 joins are selected first in the join order, initiating three in-progress plans as illustrated in Figure 3. In these examples, we assume the join output remains sorted on the join key if the larger input is already sorted. This reflects the behavior of merge joins, which preserve sort order. Hash joins, on the other hand, do not generally guarantee output order, though some engines may preserve probe-side order in specific cases.
Figure 2: Join Graph Annotated with Join Ranked
After a join is selected, remaining join ranks are recalculated using updated properties and may change from earlier values.
Figure 3: Level-1 Partial Plans
In this enumeration scheme/examples, join type matters: if inputs are sorted, both merge and hash joins are considered; otherwise, only hash joins are used. As a result, the three plans above expand into five plans shown in Figure 4.
Figure 4: Level-1 Partial Plans with Join Type Selection
To avoid plan space explosion during enumeration, we retain at most two plans after each step. A pruning strategy is applied to discard less promising options. We'll cover pruning later; for now, we assume it eliminates plans (1.2) and (2), as shown in Figure 5.
Figure 5: Level-1 Partial Plans After Join Type Selection and Pruning
Similarly, the remaining level-1 partial plans (1.1) and (3) generate three level-2 plans, as shown in Figure 6.
Figure 6: Level-2 Partial Plans
Then plan (3.1) is pruned because it shares the same structure as plan (1.1.1).
Figure 7: Level-2 Partial Plans After Pruning
Continuing, level-2 plans (1.1.1) and (3.2) each contain a single highest-ranked join (rank 3), resulting in two level-3 plans shown in Figure 8.
Figure 8: Level-3 Partial Plans
Continuing, Figures 9 and 10 show the level-4 plans before and after pruning, respectively.
Figure 9: Level-4 Partial Plans
Figure10: Level-4 Partial Plans after Pruning
Finally, Figure 11 shows the two final-level plans, with plan (3.2.1.1.1) selected as the best.
Figure11: Final-level Plans
Enumeration Summary
As shown, the enumeration process depends on three customizable components:
- A join ranking strategy, which can vary and be tailored as needed
- A pruning mechanism, which can also be adapted
- An option to include or exclude join types (i.e., physical plans) during enumeration
This suggests we can design a flexible enumeration framework that avoids relying on fixed join rankings or pruning strategies.