Conversation

@Dandandan
Contributor

@Dandandan Dandandan commented Nov 22, 2025

Which issue does this PR close?

Rationale for this change

This makes round-robin repartitioning distribute batches more evenly across output partitions.
The benchmarks probably won't reflect this much (maybe on very high core counts?), as the partitioning already mostly happens on the source side.

What changes are included in this PR?

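Set the starting output partition for round-robin based on the input partition, so that the inputs no longer all start at the same output partition.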
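
To illustrate the idea, here is a minimal sketch that counts how many batches each output partition receives, with and without a per-input starting offset. It is not the code from this PR: the helper names `start_partition` and `distribute`, the offset formula, and the workload (two input partitions with two batches each, three output partitions) are all assumptions chosen for illustration.

```rust
/// Hypothetical helper: spread the starting output partition of each input
/// partition evenly over the outputs. This formula is an assumption for
/// illustration, not necessarily the one used in the PR.
fn start_partition(input_partition: usize, num_inputs: usize, num_outputs: usize) -> usize {
    (input_partition * num_outputs) / num_inputs
}

/// Count the batches each output partition receives when every input
/// partition sends `batches_per_input` batches round-robin.
fn distribute(
    num_inputs: usize,
    num_outputs: usize,
    batches_per_input: usize,
    offset_start: bool,
) -> Vec<usize> {
    let mut counts = vec![0; num_outputs];
    for input in 0..num_inputs {
        // Without the offset, every input starts at output partition 0.
        let mut next = if offset_start {
            start_partition(input, num_inputs, num_outputs)
        } else {
            0
        };
        for _ in 0..batches_per_input {
            counts[next] += 1;
            next = (next + 1) % num_outputs;
        }
    }
    counts
}

fn main() {
    // Assumed workload: 2 inputs, 3 outputs, 2 batches per input.
    println!("all start at 0: {:?}", distribute(2, 3, 2, false)); // [2, 2, 0]
    println!("offset start:   {:?}", distribute(2, 3, 2, true)); // [1, 2, 1]
}
```

With few batches per input, starting every input at partition 0 leaves the highest-numbered output empty in this sketch, while the offset variant gives every output at least one batch.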

Are these changes tested?

Existing tests

Are there any user-facing changes?

@github-actions github-actions bot added the core (Core DataFusion crate) and physical-plan (Changes to the physical-plan crate) labels Nov 22, 2025
}
assert_eq!(partition_row_counts.len(), 3);
-assert_eq!(partition_row_counts[0], 2);
+assert_eq!(partition_row_counts[0], 1);
Contributor Author

@Dandandan Dandandan Nov 22, 2025

This shows the improvement: all output partitions get at least one batch, instead of the batches being skewed toward the left-side (lowest-index) partitions.

@Dandandan Dandandan changed the title from "Avoid repartition skew" to "Avoid skew in Roundrobin repartition" Nov 22, 2025
@Dandandan
Contributor Author

I couldn't show a performance difference on my 10-core machine.

Contributor

@alamb alamb left a comment

Makes sense to me -- thank you @Dandandan

I have scheduled a benchmark run just to make sure we didn't miss something, but I don't think we did.

@alamb
Contributor

alamb commented Nov 23, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing improve_round_robin (7923e9c) to e6d1773 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb
Contributor

alamb commented Nov 23, 2025

🤖: Benchmark completed

Details

Comparing HEAD and improve_round_robin
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ improve_round_robin ┃    Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 0     │  2595.38 ms │          2597.17 ms │ no change │
│ QQuery 1     │  1177.94 ms │          1190.90 ms │ no change │
│ QQuery 2     │  2380.03 ms │          2409.04 ms │ no change │
│ QQuery 3     │  1123.45 ms │          1169.65 ms │ no change │
│ QQuery 4     │  2306.66 ms │          2299.56 ms │ no change │
│ QQuery 5     │ 28310.15 ms │         28487.39 ms │ no change │
│ QQuery 6     │  4053.54 ms │          4070.51 ms │ no change │
│ QQuery 7     │  3539.32 ms │          3484.79 ms │ no change │
└──────────────┴─────────────┴─────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 45486.47ms │
│ Total Time (improve_round_robin)   │ 45709.01ms │
│ Average Time (HEAD)                │  5685.81ms │
│ Average Time (improve_round_robin) │  5713.63ms │
│ Queries Faster                     │          0 │
│ Queries Slower                     │          0 │
│ Queries with No Change             │          8 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ improve_round_robin ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.15 ms │             2.38 ms │  1.11x slower │
│ QQuery 1     │    49.73 ms │            56.40 ms │  1.13x slower │
│ QQuery 2     │   135.28 ms │           152.26 ms │  1.13x slower │
│ QQuery 3     │   165.47 ms │           180.46 ms │  1.09x slower │
│ QQuery 4     │  1134.59 ms │          1197.59 ms │  1.06x slower │
│ QQuery 5     │  1515.78 ms │          1575.32 ms │     no change │
│ QQuery 6     │     2.13 ms │             2.13 ms │     no change │
│ QQuery 7     │    56.23 ms │            56.15 ms │     no change │
│ QQuery 8     │  1495.27 ms │          1514.66 ms │     no change │
│ QQuery 9     │  1924.03 ms │          1984.95 ms │     no change │
│ QQuery 10    │   374.62 ms │           379.12 ms │     no change │
│ QQuery 11    │   426.78 ms │           432.36 ms │     no change │
│ QQuery 12    │  1451.16 ms │          1479.07 ms │     no change │
│ QQuery 13    │  2172.22 ms │          2188.10 ms │     no change │
│ QQuery 14    │  1310.14 ms │          1335.93 ms │     no change │
│ QQuery 15    │  1261.43 ms │          1293.28 ms │     no change │
│ QQuery 16    │  2684.79 ms │          2746.00 ms │     no change │
│ QQuery 17    │  2646.52 ms │          2709.54 ms │     no change │
│ QQuery 18    │  5450.75 ms │          5190.07 ms │     no change │
│ QQuery 19    │   128.17 ms │           131.84 ms │     no change │
│ QQuery 20    │  2019.99 ms │          1986.07 ms │     no change │
│ QQuery 21    │  2384.41 ms │          2308.55 ms │     no change │
│ QQuery 22    │  4026.18 ms │          3892.83 ms │     no change │
│ QQuery 23    │ 13686.85 ms │         12851.26 ms │ +1.07x faster │
│ QQuery 24    │   213.69 ms │           220.83 ms │     no change │
│ QQuery 25    │   475.22 ms │           477.64 ms │     no change │
│ QQuery 26    │   211.72 ms │           229.80 ms │  1.09x slower │
│ QQuery 27    │  2890.26 ms │          2821.92 ms │     no change │
│ QQuery 28    │ 24093.16 ms │         23316.79 ms │     no change │
│ QQuery 29    │  1012.48 ms │           968.54 ms │     no change │
│ QQuery 30    │  1372.89 ms │          1365.68 ms │     no change │
│ QQuery 31    │  1430.46 ms │          1395.63 ms │     no change │
│ QQuery 32    │  4937.49 ms │          4567.13 ms │ +1.08x faster │
│ QQuery 33    │  6032.51 ms │          5668.22 ms │ +1.06x faster │
│ QQuery 34    │  6052.95 ms │          6020.43 ms │     no change │
│ QQuery 35    │  1966.38 ms │          1919.65 ms │     no change │
│ QQuery 36    │   126.48 ms │           118.61 ms │ +1.07x faster │
│ QQuery 37    │    53.75 ms │            53.95 ms │     no change │
│ QQuery 38    │   122.80 ms │           120.61 ms │     no change │
│ QQuery 39    │   200.01 ms │           200.19 ms │     no change │
│ QQuery 40    │    41.49 ms │            43.51 ms │     no change │
│ QQuery 41    │    39.54 ms │            41.39 ms │     no change │
│ QQuery 42    │    32.83 ms │            33.60 ms │     no change │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 97810.75ms │
│ Total Time (improve_round_robin)   │ 95230.45ms │
│ Average Time (HEAD)                │  2274.67ms │
│ Average Time (improve_round_robin) │  2214.66ms │
│ Queries Faster                     │          4 │
│ Queries Slower                     │          6 │
│ Queries with No Change             │         33 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ improve_round_robin ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 130.05 ms │           132.03 ms │     no change │
│ QQuery 2     │  27.07 ms │            28.60 ms │  1.06x slower │
│ QQuery 3     │  34.31 ms │            39.21 ms │  1.14x slower │
│ QQuery 4     │  29.36 ms │            29.37 ms │     no change │
│ QQuery 5     │  88.72 ms │            87.85 ms │     no change │
│ QQuery 6     │  19.61 ms │            19.71 ms │     no change │
│ QQuery 7     │ 220.58 ms │           225.71 ms │     no change │
│ QQuery 8     │  34.36 ms │            35.66 ms │     no change │
│ QQuery 9     │ 102.35 ms │           103.43 ms │     no change │
│ QQuery 10    │  66.06 ms │            65.16 ms │     no change │
│ QQuery 11    │  18.34 ms │            17.60 ms │     no change │
│ QQuery 12    │  52.17 ms │            51.29 ms │     no change │
│ QQuery 13    │  47.91 ms │            47.14 ms │     no change │
│ QQuery 14    │  14.02 ms │            14.20 ms │     no change │
│ QQuery 15    │  26.16 ms │            24.45 ms │ +1.07x faster │
│ QQuery 16    │  25.21 ms │            25.57 ms │     no change │
│ QQuery 17    │ 150.56 ms │           149.85 ms │     no change │
│ QQuery 18    │ 278.19 ms │           282.53 ms │     no change │
│ QQuery 19    │  37.49 ms │            37.54 ms │     no change │
│ QQuery 20    │  50.17 ms │            49.66 ms │     no change │
│ QQuery 21    │ 321.74 ms │           328.07 ms │     no change │
│ QQuery 22    │  17.75 ms │            18.11 ms │     no change │
└──────────────┴───────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 1792.17ms │
│ Total Time (improve_round_robin)   │ 1812.71ms │
│ Average Time (HEAD)                │   81.46ms │
│ Average Time (improve_round_robin) │   82.40ms │
│ Queries Faster                     │         1 │
│ Queries Slower                     │         2 │
│ Queries with No Change             │        19 │
│ Queries with Failure               │         0 │
└────────────────────────────────────┴───────────┘

@Dandandan
Contributor Author

Looks like mostly noise.
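
As a rough check of that reading, the sketch below computes the relative change in total time per benchmark, using the totals copied from the Benchmark Summary tables in this thread. The helper name `percent_change` is hypothetical and not part of the benchmark script.

```rust
/// Relative change of `new_ms` versus `base_ms`, in percent.
/// Negative values mean the branch was faster than HEAD.
fn percent_change(base_ms: f64, new_ms: f64) -> f64 {
    (new_ms - base_ms) / base_ms * 100.0
}

fn main() {
    // (benchmark, total HEAD ms, total improve_round_robin ms),
    // taken from the summary tables posted by the benchmark bot.
    let totals = [
        ("clickbench_extended", 45486.47, 45709.01),
        ("clickbench_partitioned", 97810.75, 95230.45),
        ("tpch_mem_sf1", 1792.17, 1812.71),
    ];
    for (name, head, branch) in totals {
        println!("{name}: {:+.2}%", percent_change(head, branch));
    }
}
```

All three totals land within about 3% of HEAD, in both directions, which fits the "mostly noise" reading.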

@Dandandan Dandandan added this pull request to the merge queue Nov 23, 2025
Merged via the queue into apache:main with commit c2ba087 Nov 23, 2025
75 checks passed

Labels

core (Core DataFusion crate), physical-plan (Changes to the physical-plan crate)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Roundrobin repartition skews to left-side partitions

2 participants