
chore: Optimize schema rewriter usages #21158

Merged
comphead merged 1 commit into apache:main from comphead:schema_adapter
Mar 26, 2026

Conversation

@comphead
Contributor

@comphead comphead commented Mar 25, 2026

Which issue does this PR close?

  • Closes #.

Rationale for this change

The rewriter actually has 3 responsibilities:

  1. Index remapping — column indices in expressions may not match the file schema
  2. Type casting — when logical and physical field types differ
  3. Missing column handling — replacing references to absent columns with nulls

Do not spend cycles on the schema rewrite if no predicate is set or if the logical schema is equal to the physical schema.
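A minimal sketch of the idea, assuming simplified stand-in types: `DataType`, `Field`, `Schema`, `Expr`, `rewrite`, and `adapt_predicate` are all hypothetical names for illustration and are not DataFusion's actual API. It shows the three rewrite responsibilities and the two early exits (no predicate, identical schemas) that avoid walking the expression tree. The sketch also assumes that when the schemas are equal, the column indices in the predicate already line up with the file.

```rust
// Hypothetical, simplified stand-ins; these are NOT DataFusion's real types.
#[derive(Clone, Debug, PartialEq)]
enum DataType {
    Int32,
    Int64,
    Utf8,
}

#[derive(Clone, Debug, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

type Schema = Vec<Field>;

#[derive(Clone, Debug, PartialEq)]
enum Expr {
    /// Reference to a column by name and positional index.
    Column { name: String, index: usize },
    /// Cast of an inner expression to a target type.
    Cast { expr: Box<Expr>, to: DataType },
    /// Typed null standing in for a column that is absent from the file.
    Null(DataType),
    /// Binary equality, so the rewrite has a tree to recurse into.
    Eq(Box<Expr>, Box<Expr>),
}

/// Full rewrite: walks the whole expression tree.
fn rewrite(expr: &Expr, logical: &Schema, physical: &Schema) -> Expr {
    match expr {
        Expr::Column { name, .. } => {
            let Some(logical_field) = logical.iter().find(|f| &f.name == name) else {
                // Unknown column: left untouched in this sketch.
                return expr.clone();
            };
            match physical.iter().position(|f| &f.name == name) {
                // 3. Missing column handling: substitute a typed null.
                None => Expr::Null(logical_field.data_type.clone()),
                Some(index) => {
                    // 1. Index remapping: use the column's position in the file.
                    let column = Expr::Column { name: name.clone(), index };
                    if physical[index].data_type == logical_field.data_type {
                        column
                    } else {
                        // 2. Type casting: file type differs from logical type.
                        Expr::Cast {
                            expr: Box::new(column),
                            to: logical_field.data_type.clone(),
                        }
                    }
                }
            }
        }
        Expr::Cast { expr: inner, to } => Expr::Cast {
            expr: Box::new(rewrite(inner, logical, physical)),
            to: to.clone(),
        },
        Expr::Null(_) => expr.clone(),
        Expr::Eq(left, right) => Expr::Eq(
            Box::new(rewrite(left, logical, physical)),
            Box::new(rewrite(right, logical, physical)),
        ),
    }
}

/// Adapts an optional predicate to a file, skipping the rewrite when possible.
fn adapt_predicate(
    predicate: Option<&Expr>,
    logical: &Schema,
    physical: &Schema,
) -> Option<Expr> {
    // Early exit 1: no predicate, so there is nothing to rewrite.
    let predicate = predicate?;
    // Early exit 2: identical schemas, so the rewrite would be a no-op
    // (assumes indices already match the file in that case).
    if logical == physical {
        return Some(predicate.clone());
    }
    Some(rewrite(predicate, logical, physical))
}
```

Calling `adapt_predicate(None, ...)` returns `None` immediately, and passing the same schema for both arguments returns the predicate unchanged; only a genuine schema mismatch pays for the tree walk.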

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the datasource (Changes to the datasource crate) label Mar 25, 2026
@comphead comphead requested a review from adriangb March 25, 2026 21:10
@comphead
Contributor Author

Hi @adriangb, I'd appreciate it if you could check the PR as an expert in schema adapter rewrites. If you think it makes sense, we can also port it to #21078 and #21079.

Contributor

@adriangb adriangb left a comment

Nice!

Does this show up in any benchmarks? I think if we can show a perf improvement (i.e. a regression when this code was originally introduced) then that's a good justification to backport 😄

@comphead
Contributor Author

> Nice!
>
> Does this show up in any benchmarks? I think if we can show a perf improvement (i.e. a regression when this code was originally introduced) then that's a good justification to backport 😄

I'm running it right now on a distributed cluster through Comet and will attach a reference.

@comphead
Contributor Author

I was able to see a 10% gain, which is not very impressive for the entire heavyweight test.
Tbh I was expecting more, and it is difficult to get accurate numbers as there are multiple processes involved, but the smoke test (with vs without) shows up to a 10% overall gain. It would still probably be nice for this to go into 53.1.0.

@comphead comphead added this pull request to the merge queue Mar 26, 2026
Merged via the queue into apache:main with commit 757ce78 Mar 26, 2026
32 checks passed
@comphead comphead deleted the schema_adapter branch March 26, 2026 03:38
comphead added a commit to comphead/arrow-datafusion that referenced this pull request Mar 26, 2026
@comphead
Contributor Author

run benchmarks

@adriangbot

🤖 Benchmark running (GKE) | trigger
Linux bench-c4137709444-570-52q9g 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux
Comparing schema_adapter (0346cbc) to 51f13d7 (merge-base) diff using: tpch
Results will be posted here when complete



@adriangbot

🤖 Benchmark running (GKE) | trigger
Linux bench-c4137709444-568-8rfl7 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux
Comparing schema_adapter (0346cbc) to 51f13d7 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete



@adriangbot

🤖 Benchmark running (GKE) | trigger
Linux bench-c4137709444-569-v4ff8 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux
Comparing schema_adapter (0346cbc) to 51f13d7 (merge-base) diff using: tpcds
Results will be posted here when complete



@adriangbot

🤖 Benchmark completed (GKE) | trigger

Comparing HEAD and schema_adapter
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query     ┃                           HEAD ┃                 schema_adapter ┃    Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 1  │ 46.00 / 46.54 ±0.68 / 47.87 ms │ 45.63 / 46.42 ±1.02 / 48.39 ms │ no change │
│ QQuery 2  │ 21.34 / 21.48 ±0.10 / 21.59 ms │ 21.15 / 21.58 ±0.51 / 22.57 ms │ no change │
│ QQuery 3  │ 32.13 / 32.27 ±0.17 / 32.55 ms │ 32.17 / 32.40 ±0.17 / 32.58 ms │ no change │
│ QQuery 4  │ 20.83 / 21.64 ±0.68 / 22.42 ms │ 20.78 / 21.65 ±0.65 / 22.33 ms │ no change │
│ QQuery 5  │ 48.79 / 50.49 ±1.16 / 52.02 ms │ 48.34 / 49.47 ±1.10 / 51.54 ms │ no change │
│ QQuery 6  │ 17.48 / 17.59 ±0.10 / 17.72 ms │ 17.43 / 17.82 ±0.45 / 18.43 ms │ no change │
│ QQuery 7  │ 55.78 / 57.02 ±1.15 / 58.50 ms │ 55.23 / 56.18 ±0.74 / 57.47 ms │ no change │
│ QQuery 8  │ 48.41 / 48.91 ±0.25 / 49.12 ms │ 48.56 / 48.72 ±0.10 / 48.85 ms │ no change │
│ QQuery 9  │ 55.04 / 55.85 ±0.48 / 56.40 ms │ 54.37 / 56.75 ±1.90 / 59.98 ms │ no change │
│ QQuery 10 │ 71.83 / 72.68 ±0.66 / 73.77 ms │ 71.86 / 73.70 ±1.41 / 76.13 ms │ no change │
│ QQuery 11 │ 14.30 / 14.43 ±0.10 / 14.60 ms │ 14.26 / 15.13 ±0.98 / 16.71 ms │ no change │
│ QQuery 12 │ 28.58 / 28.73 ±0.08 / 28.79 ms │ 28.35 / 29.31 ±0.86 / 30.77 ms │ no change │
│ QQuery 13 │ 39.33 / 39.86 ±0.35 / 40.42 ms │ 40.13 / 40.99 ±0.74 / 42.26 ms │ no change │
│ QQuery 14 │ 28.84 / 29.12 ±0.20 / 29.44 ms │ 28.83 / 29.17 ±0.48 / 30.09 ms │ no change │
│ QQuery 15 │ 34.07 / 34.93 ±0.74 / 35.82 ms │ 34.14 / 34.89 ±1.18 / 37.22 ms │ no change │
│ QQuery 16 │ 16.25 / 16.62 ±0.32 / 17.17 ms │ 16.23 / 16.76 ±0.40 / 17.24 ms │ no change │
│ QQuery 17 │ 74.45 / 76.21 ±1.47 / 78.64 ms │ 73.52 / 74.83 ±1.16 / 76.46 ms │ no change │
│ QQuery 18 │ 78.28 / 79.81 ±1.25 / 81.26 ms │ 78.13 / 80.01 ±1.87 / 83.50 ms │ no change │
│ QQuery 19 │ 38.08 / 38.51 ±0.45 / 39.26 ms │ 37.91 / 38.36 ±0.34 / 38.85 ms │ no change │
│ QQuery 20 │ 40.79 / 41.65 ±0.63 / 42.42 ms │ 40.17 / 41.64 ±1.02 / 43.00 ms │ no change │
│ QQuery 21 │ 64.64 / 66.28 ±1.07 / 67.67 ms │ 65.14 / 66.77 ±1.40 / 68.95 ms │ no change │
│ QQuery 22 │ 17.88 / 18.11 ±0.14 / 18.31 ms │ 17.76 / 18.04 ±0.26 / 18.46 ms │ no change │
└───────────┴────────────────────────────────┴────────────────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary             ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (HEAD)             │ 908.74ms │
│ Total Time (schema_adapter)   │ 910.61ms │
│ Average Time (HEAD)           │  41.31ms │
│ Average Time (schema_adapter) │  41.39ms │
│ Queries Faster                │        0 │
│ Queries Slower                │        0 │
│ Queries with No Change        │       22 │
│ Queries with Failure          │        0 │
└───────────────────────────────┴──────────┘

Resource Usage

tpch — base (merge-base)

Metric Value
Wall time 4.8s
Peak memory 4.0 GiB
Avg memory 3.6 GiB
CPU user 33.8s
CPU sys 2.9s
Disk read 0 B
Disk write 136.0 KiB

tpch — branch

Metric Value
Wall time 4.8s
Peak memory 4.0 GiB
Avg memory 3.6 GiB
CPU user 33.8s
CPU sys 3.0s
Disk read 0 B
Disk write 72.0 KiB


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Comparing HEAD and schema_adapter
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                     HEAD ┃                           schema_adapter ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │           43.12 / 44.02 ±0.77 / 45.12 ms │           43.10 / 44.01 ±0.86 / 45.61 ms │     no change │
│ QQuery 2  │        145.64 / 146.94 ±1.15 / 148.46 ms │        145.59 / 146.92 ±1.22 / 148.91 ms │     no change │
│ QQuery 3  │        114.00 / 115.77 ±1.02 / 116.73 ms │        114.39 / 114.97 ±0.53 / 115.88 ms │     no change │
│ QQuery 4  │    1345.44 / 1365.40 ±13.05 / 1385.78 ms │    1301.63 / 1343.02 ±23.08 / 1372.65 ms │     no change │
│ QQuery 5  │        172.81 / 174.26 ±1.33 / 175.89 ms │        172.42 / 176.22 ±2.34 / 179.07 ms │     no change │
│ QQuery 6  │     990.74 / 1038.40 ±31.39 / 1075.54 ms │    1030.46 / 1056.20 ±21.14 / 1085.07 ms │     no change │
│ QQuery 7  │        353.55 / 356.65 ±3.17 / 362.61 ms │        352.86 / 357.02 ±3.04 / 360.19 ms │     no change │
│ QQuery 8  │        115.00 / 116.70 ±1.62 / 119.44 ms │        114.72 / 116.50 ±1.12 / 117.77 ms │     no change │
│ QQuery 9  │        101.47 / 105.02 ±2.42 / 107.68 ms │        100.62 / 103.32 ±3.26 / 107.33 ms │     no change │
│ QQuery 10 │        108.27 / 109.53 ±0.96 / 111.00 ms │        108.21 / 109.00 ±0.54 / 109.90 ms │     no change │
│ QQuery 11 │        968.49 / 978.81 ±9.72 / 994.46 ms │        917.60 / 925.15 ±7.72 / 938.76 ms │ +1.06x faster │
│ QQuery 12 │           45.57 / 47.17 ±0.87 / 48.06 ms │           44.15 / 45.75 ±1.47 / 48.32 ms │     no change │
│ QQuery 13 │        407.73 / 412.07 ±5.20 / 421.84 ms │        400.66 / 404.00 ±1.82 / 405.77 ms │     no change │
│ QQuery 14 │    1018.17 / 1036.38 ±11.35 / 1046.80 ms │     1013.65 / 1026.98 ±7.37 / 1036.18 ms │     no change │
│ QQuery 15 │           16.72 / 17.96 ±0.83 / 18.85 ms │           15.50 / 16.21 ±1.05 / 18.31 ms │ +1.11x faster │
│ QQuery 16 │           41.65 / 42.60 ±0.73 / 43.34 ms │           40.66 / 42.26 ±1.92 / 45.99 ms │     no change │
│ QQuery 17 │        242.63 / 243.93 ±1.21 / 245.89 ms │        237.67 / 240.61 ±2.26 / 243.47 ms │     no change │
│ QQuery 18 │        128.73 / 129.82 ±0.93 / 131.43 ms │        127.77 / 129.58 ±1.28 / 131.69 ms │     no change │
│ QQuery 19 │        156.27 / 158.92 ±1.65 / 160.71 ms │        154.82 / 156.07 ±1.03 / 157.95 ms │     no change │
│ QQuery 20 │           13.99 / 14.67 ±0.40 / 15.17 ms │           13.24 / 14.11 ±0.62 / 15.16 ms │     no change │
│ QQuery 21 │           19.33 / 19.80 ±0.28 / 20.16 ms │           19.66 / 20.04 ±0.32 / 20.43 ms │     no change │
│ QQuery 22 │        482.90 / 487.66 ±4.27 / 495.50 ms │        480.04 / 482.87 ±3.67 / 490.02 ms │     no change │
│ QQuery 23 │        872.62 / 882.34 ±6.41 / 890.18 ms │       874.41 / 891.21 ±12.11 / 905.76 ms │     no change │
│ QQuery 24 │        410.70 / 414.81 ±3.05 / 418.49 ms │        415.46 / 420.02 ±3.16 / 424.03 ms │     no change │
│ QQuery 25 │        349.14 / 352.83 ±1.90 / 354.56 ms │        351.98 / 354.23 ±1.39 / 355.74 ms │     no change │
│ QQuery 26 │           82.81 / 83.91 ±1.70 / 87.28 ms │           82.73 / 83.60 ±0.89 / 85.20 ms │     no change │
│ QQuery 27 │        349.06 / 349.82 ±0.75 / 351.13 ms │        347.13 / 351.40 ±3.03 / 356.24 ms │     no change │
│ QQuery 28 │        150.21 / 151.36 ±0.96 / 153.06 ms │        149.57 / 151.65 ±1.25 / 153.01 ms │     no change │
│ QQuery 29 │        298.65 / 300.77 ±1.97 / 304.25 ms │        297.25 / 300.48 ±2.34 / 302.50 ms │     no change │
│ QQuery 30 │           43.95 / 45.25 ±1.20 / 47.02 ms │           45.77 / 46.72 ±0.83 / 47.83 ms │     no change │
│ QQuery 31 │        170.11 / 171.65 ±1.42 / 174.14 ms │        169.97 / 171.47 ±1.40 / 173.57 ms │     no change │
│ QQuery 32 │           57.37 / 58.95 ±1.55 / 61.25 ms │           57.31 / 58.52 ±1.32 / 60.42 ms │     no change │
│ QQuery 33 │        141.53 / 143.07 ±1.56 / 145.85 ms │        139.19 / 142.22 ±1.63 / 144.10 ms │     no change │
│ QQuery 34 │        106.84 / 107.88 ±1.07 / 109.84 ms │        106.87 / 108.21 ±1.47 / 110.94 ms │     no change │
│ QQuery 35 │        107.11 / 108.88 ±1.52 / 111.01 ms │        107.28 / 108.46 ±0.98 / 109.84 ms │     no change │
│ QQuery 36 │        213.25 / 218.86 ±3.61 / 223.73 ms │        211.58 / 219.50 ±5.29 / 227.51 ms │     no change │
│ QQuery 37 │        178.30 / 183.10 ±3.37 / 186.92 ms │        178.44 / 179.81 ±1.66 / 182.98 ms │     no change │
│ QQuery 38 │           82.47 / 86.28 ±2.82 / 91.08 ms │           87.69 / 89.89 ±1.67 / 92.43 ms │     no change │
│ QQuery 39 │        127.11 / 128.17 ±0.75 / 129.46 ms │        123.22 / 128.71 ±4.30 / 136.49 ms │     no change │
│ QQuery 40 │        112.18 / 116.93 ±5.69 / 127.55 ms │        110.73 / 116.46 ±6.68 / 129.07 ms │     no change │
│ QQuery 41 │           14.36 / 15.84 ±1.38 / 18.07 ms │           14.64 / 15.53 ±0.77 / 16.83 ms │     no change │
│ QQuery 42 │        107.12 / 108.19 ±0.86 / 109.50 ms │        106.12 / 108.91 ±2.32 / 112.07 ms │     no change │
│ QQuery 43 │           83.60 / 84.88 ±0.93 / 86.11 ms │           83.38 / 84.97 ±1.60 / 88.05 ms │     no change │
│ QQuery 44 │           11.23 / 12.07 ±0.89 / 13.39 ms │           11.38 / 12.30 ±1.00 / 14.17 ms │     no change │
│ QQuery 45 │           52.69 / 53.61 ±0.84 / 54.85 ms │           52.84 / 54.11 ±1.18 / 56.08 ms │     no change │
│ QQuery 46 │        227.50 / 230.98 ±2.04 / 233.07 ms │        230.00 / 231.59 ±0.91 / 232.52 ms │     no change │
│ QQuery 47 │        699.44 / 706.87 ±5.50 / 713.53 ms │        689.48 / 699.74 ±8.53 / 709.21 ms │     no change │
│ QQuery 48 │        290.34 / 295.62 ±2.68 / 297.86 ms │        287.36 / 292.08 ±3.11 / 295.68 ms │     no change │
│ QQuery 49 │        253.50 / 254.87 ±1.48 / 256.72 ms │        254.67 / 257.07 ±1.50 / 259.36 ms │     no change │
│ QQuery 50 │        228.98 / 233.16 ±3.57 / 238.69 ms │        231.32 / 236.95 ±4.84 / 243.88 ms │     no change │
│ QQuery 51 │        179.31 / 183.87 ±2.60 / 186.78 ms │        184.32 / 187.08 ±1.67 / 188.68 ms │     no change │
│ QQuery 52 │        107.71 / 109.17 ±1.55 / 111.99 ms │        107.11 / 108.38 ±0.88 / 109.46 ms │     no change │
│ QQuery 53 │        103.52 / 104.83 ±1.01 / 106.27 ms │        101.83 / 103.32 ±0.97 / 104.65 ms │     no change │
│ QQuery 54 │        148.72 / 149.40 ±0.38 / 149.83 ms │        146.46 / 148.17 ±1.28 / 149.79 ms │     no change │
│ QQuery 55 │        107.57 / 108.73 ±0.88 / 110.02 ms │        107.63 / 108.62 ±0.64 / 109.17 ms │     no change │
│ QQuery 56 │        142.56 / 143.60 ±1.13 / 145.44 ms │        140.75 / 142.29 ±1.28 / 144.63 ms │     no change │
│ QQuery 57 │        175.11 / 178.66 ±2.20 / 181.55 ms │        174.64 / 175.57 ±0.89 / 176.99 ms │     no change │
│ QQuery 58 │        295.12 / 299.81 ±3.90 / 306.78 ms │       301.52 / 311.65 ±11.61 / 333.61 ms │     no change │
│ QQuery 59 │        199.05 / 200.71 ±1.30 / 202.37 ms │        196.95 / 199.01 ±2.18 / 203.16 ms │     no change │
│ QQuery 60 │        144.21 / 145.57 ±0.87 / 146.71 ms │        142.92 / 144.44 ±1.07 / 145.78 ms │     no change │
│ QQuery 61 │        170.68 / 172.11 ±1.35 / 174.18 ms │        171.85 / 173.53 ±0.96 / 174.85 ms │     no change │
│ QQuery 62 │       873.88 / 890.36 ±16.31 / 912.98 ms │       913.18 / 943.44 ±24.13 / 973.93 ms │  1.06x slower │
│ QQuery 63 │        105.37 / 107.66 ±2.42 / 112.32 ms │        103.41 / 106.97 ±1.84 / 108.47 ms │     no change │
│ QQuery 64 │        692.75 / 702.30 ±4.95 / 706.00 ms │        698.93 / 702.71 ±4.32 / 710.87 ms │     no change │
│ QQuery 65 │        255.83 / 258.94 ±3.37 / 264.98 ms │        254.01 / 255.76 ±1.36 / 257.49 ms │     no change │
│ QQuery 66 │       231.11 / 256.67 ±18.05 / 276.60 ms │        237.84 / 250.41 ±8.68 / 260.71 ms │     no change │
│ QQuery 67 │        308.94 / 315.69 ±4.18 / 320.60 ms │        315.64 / 321.01 ±4.46 / 329.14 ms │     no change │
│ QQuery 68 │        282.32 / 285.43 ±1.84 / 287.87 ms │        279.22 / 283.64 ±2.62 / 286.94 ms │     no change │
│ QQuery 69 │        103.11 / 106.29 ±1.71 / 108.27 ms │        102.40 / 104.33 ±1.86 / 107.52 ms │     no change │
│ QQuery 70 │        342.85 / 346.23 ±3.67 / 351.90 ms │        339.23 / 347.60 ±9.36 / 364.96 ms │     no change │
│ QQuery 71 │        135.81 / 137.44 ±1.93 / 141.03 ms │        133.39 / 135.74 ±2.42 / 139.92 ms │     no change │
│ QQuery 72 │        713.92 / 723.19 ±8.99 / 738.71 ms │       713.18 / 728.10 ±13.04 / 751.36 ms │     no change │
│ QQuery 73 │        103.57 / 107.50 ±3.75 / 113.23 ms │        102.28 / 104.77 ±2.71 / 109.62 ms │     no change │
│ QQuery 74 │       563.68 / 584.51 ±17.71 / 610.61 ms │       563.66 / 577.71 ±17.03 / 609.57 ms │     no change │
│ QQuery 75 │        276.02 / 279.09 ±1.62 / 280.63 ms │        276.38 / 278.95 ±1.70 / 281.69 ms │     no change │
│ QQuery 76 │        133.03 / 134.35 ±0.92 / 135.64 ms │        131.30 / 134.64 ±2.33 / 138.01 ms │     no change │
│ QQuery 77 │        188.68 / 190.93 ±1.85 / 193.70 ms │        187.33 / 189.37 ±1.61 / 191.83 ms │     no change │
│ QQuery 78 │        348.90 / 354.65 ±4.17 / 359.14 ms │        349.32 / 356.39 ±3.84 / 360.94 ms │     no change │
│ QQuery 79 │        235.58 / 238.04 ±1.89 / 240.72 ms │        233.99 / 234.45 ±0.48 / 235.18 ms │     no change │
│ QQuery 80 │        327.90 / 331.34 ±2.46 / 334.97 ms │        326.57 / 330.30 ±3.72 / 337.33 ms │     no change │
│ QQuery 81 │           27.02 / 28.60 ±2.30 / 33.16 ms │           25.82 / 26.99 ±0.98 / 28.59 ms │ +1.06x faster │
│ QQuery 82 │        202.25 / 206.59 ±2.66 / 210.66 ms │        199.77 / 202.01 ±1.90 / 205.23 ms │     no change │
│ QQuery 83 │           39.47 / 40.53 ±1.12 / 42.39 ms │           38.73 / 39.59 ±0.61 / 40.49 ms │     no change │
│ QQuery 84 │           47.88 / 49.90 ±1.58 / 51.85 ms │           48.36 / 48.93 ±0.40 / 49.41 ms │     no change │
│ QQuery 85 │        151.03 / 151.95 ±0.76 / 153.01 ms │        148.69 / 150.34 ±0.92 / 151.33 ms │     no change │
│ QQuery 86 │           38.41 / 39.08 ±0.47 / 39.78 ms │           38.45 / 39.72 ±0.99 / 41.09 ms │     no change │
│ QQuery 87 │           86.15 / 89.13 ±3.43 / 95.84 ms │           84.79 / 89.42 ±2.82 / 93.06 ms │     no change │
│ QQuery 88 │          98.96 / 99.89 ±0.87 / 101.23 ms │          98.94 / 99.72 ±0.64 / 100.70 ms │     no change │
│ QQuery 89 │        118.57 / 120.32 ±1.35 / 122.59 ms │        116.93 / 119.36 ±1.94 / 122.55 ms │     no change │
│ QQuery 90 │           23.85 / 24.52 ±0.61 / 25.39 ms │           23.24 / 24.21 ±1.12 / 26.39 ms │     no change │
│ QQuery 91 │           66.13 / 66.59 ±0.29 / 66.99 ms │           61.63 / 64.29 ±1.38 / 65.53 ms │     no change │
│ QQuery 92 │           57.55 / 58.50 ±0.67 / 59.56 ms │           56.66 / 57.74 ±0.74 / 58.80 ms │     no change │
│ QQuery 93 │        188.67 / 191.69 ±2.01 / 194.04 ms │        190.56 / 192.38 ±1.52 / 194.38 ms │     no change │
│ QQuery 94 │           61.35 / 62.74 ±0.86 / 63.89 ms │           60.20 / 61.34 ±0.70 / 62.40 ms │     no change │
│ QQuery 95 │        133.22 / 134.80 ±1.34 / 137.13 ms │        133.58 / 135.32 ±1.78 / 138.76 ms │     no change │
│ QQuery 96 │           72.55 / 74.60 ±1.30 / 76.31 ms │           72.92 / 74.00 ±0.89 / 75.07 ms │     no change │
│ QQuery 97 │        128.98 / 132.73 ±2.05 / 135.20 ms │        126.49 / 129.90 ±2.26 / 132.96 ms │     no change │
│ QQuery 98 │        152.89 / 154.32 ±0.93 / 155.15 ms │        154.47 / 155.91 ±1.23 / 157.94 ms │     no change │
│ QQuery 99 │ 10740.08 / 10787.05 ±39.73 / 10858.52 ms │ 10722.14 / 10753.67 ±25.06 / 10797.21 ms │     no change │
└───────────┴──────────────────────────────────────────┴──────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary             ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)             │ 33724.05ms │
│ Total Time (schema_adapter)   │ 33639.84ms │
│ Average Time (HEAD)           │   340.65ms │
│ Average Time (schema_adapter) │   339.80ms │
│ Queries Faster                │          3 │
│ Queries Slower                │          1 │
│ Queries with No Change        │         95 │
│ Queries with Failure          │          0 │
└───────────────────────────────┴────────────┘
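
The "Change" labels in these tables are consistent with a ratio of the two mean times and a noise band of about 5%. The sketch below reproduces that arithmetic; the function name `change_label` and the exact 5% threshold are assumptions inferred from the rows in this thread, not taken from the benchmark runner's source.

```rust
/// Hypothetical helper that reproduces the labels seen in the "Change" column.
/// Inputs are the mean times (the middle value of "min / mean ±stddev / max").
fn change_label(base_mean_ms: f64, branch_mean_ms: f64) -> String {
    // Assumed noise band: differences within 5% are reported as "no change".
    const NOISE_THRESHOLD: f64 = 0.05;

    // Ratio below 1.0 means the branch is faster than the base.
    let ratio = branch_mean_ms / base_mean_ms;

    if (ratio - 1.0).abs() <= NOISE_THRESHOLD {
        "no change".to_string()
    } else if ratio < 1.0 {
        // Report the speedup as base / branch.
        format!("+{:.2}x faster", 1.0 / ratio)
    } else {
        // Report the slowdown as branch / base.
        format!("{:.2}x slower", ratio)
    }
}
```

With the TPC-DS means above, `change_label(978.81, 925.15)` (Query 11) gives "+1.06x faster", `change_label(890.36, 943.44)` (Query 62) gives "1.06x slower", and `change_label(1365.40, 1343.02)` (Query 4) stays inside the band and gives "no change".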

Resource Usage

tpcds — base (merge-base)

Metric Value
Wall time 169.0s
Peak memory 5.6 GiB
Avg memory 4.7 GiB
CPU user 271.2s
CPU sys 17.8s
Disk read 0 B
Disk write 702.8 MiB

tpcds — branch

Metric Value
Wall time 168.5s
Peak memory 5.5 GiB
Avg memory 4.6 GiB
CPU user 269.7s
CPU sys 19.0s
Disk read 0 B
Disk write 148.0 KiB


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Comparing HEAD and schema_adapter
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                   HEAD ┃                         schema_adapter ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │           1.36 / 4.70 ±6.51 / 17.71 ms │           1.37 / 4.75 ±6.53 / 17.80 ms │     no change │
│ QQuery 1  │         15.05 / 15.44 ±0.23 / 15.76 ms │         14.58 / 15.29 ±0.38 / 15.66 ms │     no change │
│ QQuery 2  │         45.89 / 46.35 ±0.31 / 46.83 ms │         45.18 / 46.01 ±0.77 / 47.38 ms │     no change │
│ QQuery 3  │         43.67 / 45.26 ±1.15 / 46.79 ms │         47.04 / 48.33 ±0.95 / 49.34 ms │  1.07x slower │
│ QQuery 4  │      291.10 / 296.76 ±3.64 / 301.93 ms │      333.04 / 341.89 ±8.16 / 353.18 ms │  1.15x slower │
│ QQuery 5  │      345.15 / 347.79 ±2.55 / 352.09 ms │      394.38 / 397.87 ±2.37 / 400.88 ms │  1.14x slower │
│ QQuery 6  │            5.65 / 6.53 ±0.97 / 8.41 ms │           6.63 / 8.97 ±2.35 / 13.45 ms │  1.37x slower │
│ QQuery 7  │         17.05 / 17.52 ±0.50 / 18.44 ms │         18.05 / 18.31 ±0.18 / 18.53 ms │     no change │
│ QQuery 8  │      417.04 / 427.68 ±6.55 / 436.64 ms │     421.74 / 479.79 ±42.43 / 533.88 ms │  1.12x slower │
│ QQuery 9  │     638.48 / 716.63 ±39.48 / 743.25 ms │     632.03 / 650.84 ±11.31 / 661.32 ms │ +1.10x faster │
│ QQuery 10 │      100.09 / 102.80 ±2.08 / 105.86 ms │        93.47 / 97.45 ±5.41 / 107.97 ms │ +1.05x faster │
│ QQuery 11 │      112.66 / 114.91 ±1.33 / 116.63 ms │      104.48 / 106.66 ±1.77 / 109.37 ms │ +1.08x faster │
│ QQuery 12 │     350.88 / 384.16 ±16.73 / 394.58 ms │      346.63 / 358.07 ±9.19 / 373.22 ms │ +1.07x faster │
│ QQuery 13 │      459.03 / 472.60 ±7.82 / 481.75 ms │     457.24 / 473.72 ±19.64 / 511.62 ms │     no change │
│ QQuery 14 │      350.55 / 357.26 ±6.19 / 368.65 ms │      350.11 / 356.64 ±6.11 / 367.91 ms │     no change │
│ QQuery 15 │     376.35 / 399.59 ±22.10 / 435.56 ms │     359.58 / 382.74 ±22.39 / 425.49 ms │     no change │
│ QQuery 16 │     809.18 / 853.01 ±39.89 / 899.08 ms │     812.18 / 853.06 ±26.08 / 881.64 ms │     no change │
│ QQuery 17 │     710.05 / 773.15 ±45.05 / 840.03 ms │     730.25 / 780.43 ±49.66 / 858.37 ms │     no change │
│ QQuery 18 │  1464.71 / 1506.04 ±34.43 / 1562.58 ms │ 1387.09 / 1574.40 ±108.09 / 1691.69 ms │     no change │
│ QQuery 19 │        37.02 / 46.98 ±10.77 / 66.03 ms │         36.71 / 38.65 ±1.66 / 41.45 ms │ +1.22x faster │
│ QQuery 20 │     717.89 / 743.43 ±29.62 / 792.44 ms │     719.29 / 746.52 ±25.54 / 794.61 ms │     no change │
│ QQuery 21 │      763.16 / 777.63 ±8.76 / 788.10 ms │     763.47 / 779.83 ±16.56 / 809.65 ms │     no change │
│ QQuery 22 │  1130.21 / 1171.21 ±23.48 / 1198.45 ms │  1125.30 / 1146.18 ±21.78 / 1178.15 ms │     no change │
│ QQuery 23 │ 3107.64 / 3236.30 ±130.09 / 3441.59 ms │  3057.42 / 3180.93 ±90.67 / 3260.95 ms │     no change │
│ QQuery 24 │       99.89 / 102.66 ±1.47 / 104.25 ms │       98.81 / 103.26 ±2.74 / 107.38 ms │     no change │
│ QQuery 25 │      139.17 / 141.69 ±1.56 / 143.34 ms │      138.09 / 140.12 ±1.99 / 143.52 ms │     no change │
│ QQuery 26 │      101.71 / 102.40 ±0.86 / 104.04 ms │       98.83 / 100.96 ±1.17 / 101.92 ms │     no change │
│ QQuery 27 │      856.41 / 869.48 ±6.66 / 874.05 ms │      837.05 / 844.06 ±4.41 / 850.34 ms │     no change │
│ QQuery 28 │  7748.02 / 7823.70 ±49.10 / 7881.93 ms │  7701.38 / 7844.37 ±74.28 / 7916.38 ms │     no change │
│ QQuery 29 │         51.50 / 54.48 ±4.45 / 63.32 ms │         50.93 / 55.98 ±5.39 / 64.21 ms │     no change │
│ QQuery 30 │     365.17 / 391.97 ±17.46 / 415.27 ms │      369.96 / 377.08 ±6.26 / 384.52 ms │     no change │
│ QQuery 31 │     401.20 / 427.55 ±17.59 / 455.17 ms │      369.03 / 375.99 ±6.62 / 388.30 ms │ +1.14x faster │
│ QQuery 32 │  1075.36 / 1157.08 ±55.09 / 1231.03 ms │  1137.72 / 1218.69 ±90.45 / 1392.58 ms │  1.05x slower │
│ QQuery 33 │  1450.68 / 1495.62 ±40.71 / 1557.45 ms │  1468.29 / 1599.09 ±97.43 / 1681.47 ms │  1.07x slower │
│ QQuery 34 │  1476.58 / 1609.04 ±98.28 / 1705.62 ms │   1472.71 / 1477.52 ±5.29 / 1486.95 ms │ +1.09x faster │
│ QQuery 35 │      382.33 / 391.46 ±5.33 / 398.71 ms │      385.60 / 392.55 ±4.15 / 396.54 ms │     no change │
│ QQuery 36 │      119.28 / 121.82 ±2.55 / 126.43 ms │      125.26 / 127.28 ±1.69 / 130.00 ms │     no change │
│ QQuery 37 │         47.18 / 50.08 ±2.63 / 53.72 ms │         49.02 / 50.21 ±0.68 / 50.94 ms │     no change │
│ QQuery 38 │         74.80 / 76.58 ±1.44 / 78.23 ms │         74.33 / 75.86 ±1.16 / 77.55 ms │     no change │
│ QQuery 39 │      206.40 / 219.33 ±9.67 / 228.63 ms │      224.06 / 231.66 ±8.27 / 245.27 ms │  1.06x slower │
│ QQuery 40 │         24.20 / 26.35 ±1.52 / 28.90 ms │         26.74 / 28.18 ±1.64 / 30.61 ms │  1.07x slower │
│ QQuery 41 │         20.26 / 21.60 ±0.87 / 22.92 ms │         20.78 / 22.26 ±1.15 / 24.30 ms │     no change │
│ QQuery 42 │         18.92 / 19.86 ±0.60 / 20.52 ms │         20.57 / 21.15 ±0.44 / 21.69 ms │  1.06x slower │
└───────────┴────────────────────────────────────────┴────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary             ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)             │ 27966.47ms │
│ Total Time (schema_adapter)   │ 27973.62ms │
│ Average Time (HEAD)           │   650.38ms │
│ Average Time (schema_adapter) │   650.55ms │
│ Queries Faster                │          7 │
│ Queries Slower                │         10 │
│ Queries with No Change        │         26 │
│ Queries with Failure          │          0 │
└───────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

Metric Value
Wall time 141.0s
Peak memory 40.1 GiB
Avg memory 32.5 GiB
CPU user 1324.2s
CPU sys 96.8s
Disk read 0 B
Disk write 2.8 GiB

clickbench_partitioned — branch

Metric Value
Wall time 141.2s
Peak memory 43.2 GiB
Avg memory 29.8 GiB
CPU user 1315.6s
CPU sys 104.6s
Disk read 0 B
Disk write 764.0 KiB


mbutrovich pushed a commit that referenced this pull request Mar 26, 2026
