(test benchmark runner) morsel driven execution#6

Open
adriangb wants to merge 146 commits into adriangb:main from
Dandandan:parquet-morsel-driven-execution-237164415184908839

Conversation

@adriangb
Owner

@adriangb adriangb commented Mar 9, 2026

No description provided.

google-labs-jules bot and others added 30 commits February 22, 2026 13:12
This PR implements morsel-driven execution for Parquet files in DataFusion, enabling row-group level work sharing across partitions to mitigate data skew.

Key changes:
- Introduced `WorkQueue` in `datafusion/datasource/src/file_stream.rs` as a shared pool of work.
- Added `morselize` method to `FileOpener` trait to allow dynamic splitting of files into morsels.
- Implemented `morselize` for `ParquetOpener` to split files into individual row groups.
- Cached `ParquetMetaData` in `ParquetMorsel` extensions to avoid redundant I/O.
- Modified `FileStream` to support work stealing from the shared queue.
- Implemented `Weak` pointer pattern for `WorkQueue` in `FileScanConfig` to support plan re-executability.
- Added `MorselizingGuard` to ensure shared state consistency on cancellation.
- Added `allow_morsel_driven` configuration option (enabled by default for Parquet).
- Implemented row-group pruning during the morselization phase for better efficiency.
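
The work-sharing idea in the list above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the PR's code: `Morsel`, `SharedQueue`, `pull`, and `run_partitions` are hypothetical names, and the real `WorkQueue` and `FileOpener::morselize` signatures may differ. Each file is split into one morsel per row group, and every partition pulls from the same queue until it is empty, so a partition that finishes early picks up remaining work instead of idling.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical unit of work: one row group of one file.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Morsel {
    pub file: String,
    pub row_group: usize,
}

/// Hypothetical shared queue (an assumption, not the PR's `WorkQueue`).
#[derive(Default)]
pub struct SharedQueue {
    morsels: Mutex<VecDeque<Morsel>>,
}

impl SharedQueue {
    /// Split a file into one morsel per row group and enqueue them.
    pub fn morselize(&self, file: &str, num_row_groups: usize) {
        let mut queue = self.morsels.lock().unwrap();
        for row_group in 0..num_row_groups {
            queue.push_back(Morsel {
                file: file.to_string(),
                row_group,
            });
        }
    }

    /// Claim the next morsel, or `None` once all work has been taken.
    pub fn pull(&self) -> Option<Morsel> {
        self.morsels.lock().unwrap().pop_front()
    }
}

/// Run `partitions` workers that pull until the queue is empty.
/// Returns how many morsels each worker processed.
pub fn run_partitions(queue: Arc<SharedQueue>, partitions: usize) -> Vec<usize> {
    let handles: Vec<thread::JoinHandle<usize>> = (0..partitions)
        .map(|_| {
            let queue = Arc::clone(&queue);
            thread::spawn(move || {
                let mut processed = 0usize;
                while queue.pull().is_some() {
                    // A real scan would decode the row group here.
                    processed += 1;
                }
                processed
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

How the morsels are divided among workers depends on scheduling, but every morsel is processed exactly once regardless of how unevenly the files are sized.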

Tests:
- Added `parquet_morsel_driven_execution` test to verify work distribution and re-executability.
- Added `parquet_morsel_driven_enabled_by_default` to verify the default configuration.

Co-authored-by: Dandandan <163737+Dandandan@users.noreply.github.com>
timsaucer and others added 21 commits March 10, 2026 11:34
## Which issue does this PR close?

- Closes apache#17035

## Rationale for this change

Now that we have proper `FFI_ConfigOptions` we can pass these to scalar
UDFs via FFI.

## What changes are included in this PR?

Instead of passing default options, pass in converted config options
from the input.

Also did a drive-by cleanup, switching to `FFI_ColumnarValue` now that
it is available.

## Are these changes tested?

Unit test added.

## Are there any user-facing changes?

This is a breaking API change, but not one that users will interact with
directly. It breaks the ABI for FFI libraries, which is currently
unstable.
…cking in instrumentedObjectStore (apache#20802)

## Which issue does this PR close?
Related to apache#18138 but does not close any issue.

## Rationale for this change
`TimeToFirstItemStream` held an `Arc<Mutex<Vec<RequestDetails>>>` and a
`request_index` to write the duration back into the shared request list.
After seeing @alamb's and @BlakeOrth's reviews on PR apache#19127
suggesting improvements, I wanted to make this change.

## What changes are included in this PR?
- Replace Arc<Mutex<Vec<RequestDetails>>> + index in
TimeToFirstItemStream with a per-request Arc<AtomicU64>
- Store duration as nanoseconds in AtomicU64 (0 = not yet set) with
Release/Acquire ordering
- Start the timer lazily on the first poll_next call instead of at
stream creation, so only actual storage latency is measured
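
A minimal sketch of the pattern described in the list above, using only the standard library. The type and method names (`FirstItemLatency`, `LazyTimer`, `on_poll`, `on_first_item`) are hypothetical and not taken from the PR. The duration is stored as nanoseconds in an `AtomicU64` with 0 reserved for "not yet set", and the clock starts on the first poll rather than at construction.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Hypothetical per-request slot; a stored 0 means "not yet set".
#[derive(Clone, Default)]
pub struct FirstItemLatency(Arc<AtomicU64>);

impl FirstItemLatency {
    /// Publish the duration. Values are clamped to at least 1ns so that
    /// 0 keeps its "unset" meaning (a choice made for this sketch).
    pub fn set(&self, d: Duration) {
        let nanos = (d.as_nanos().min(u64::MAX as u128) as u64).max(1);
        self.0.store(nanos, Ordering::Release);
    }

    /// Read the duration, if one has been published.
    pub fn get(&self) -> Option<Duration> {
        match self.0.load(Ordering::Acquire) {
            0 => None,
            n => Some(Duration::from_nanos(n)),
        }
    }
}

/// Hypothetical stand-in for the stream wrapper: the timer starts lazily.
pub struct LazyTimer {
    started: Option<Instant>,
    slot: FirstItemLatency,
}

impl LazyTimer {
    pub fn new(slot: FirstItemLatency) -> Self {
        LazyTimer { started: None, slot }
    }

    /// Call on every poll; only the first call starts the clock.
    pub fn on_poll(&mut self) {
        if self.started.is_none() {
            self.started = Some(Instant::now());
        }
    }

    /// Call when the first item arrives; records the latency once.
    pub fn on_first_item(&mut self) {
        if let Some(start) = self.started {
            if self.slot.get().is_none() {
                self.slot.set(start.elapsed());
            }
        }
    }
}
```

Because each request owns its own atomic, a reader can observe the value without taking a lock or indexing into a shared vector.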

## Are these changes tested?
Existing tests cover this, and I've also added a time comparison.

## Are there any user-facing changes?
No
## Which issue does this PR close?

- Part of apache#20585

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

## Rationale for this change

String UDFs should preserve string representation where feasible.
`translate` previously accepted Utf8View input but emitted Utf8, causing
an unnecessary type downgrade. This aligns `translate` with the expected
behavior of returning the same string type as its primary input.

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

1. Updated `translate` return type inference to emit Utf8View when input
is Utf8View, while preserving existing behavior for Utf8 and LargeUtf8.
2. Refactored `translate` and `translate_with_map` to use explicit
string builders (via a local `TranslateOutput` helper trait) instead of
`.collect::<GenericStringArray<T>>()`, so the correct output array type
is produced for each input type.
3. Added unit tests for Utf8View input (basic, null, non-ASCII) and
sqllogictests verifying `arrow_typeof` output for all three string
types.

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

Yes. Unit tests and sqllogictests are included.

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

No.

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
…pache#20822)

## Which issue does this PR close?

- Related to apache#20603 

## Rationale for this change

This PR enables Parquet row-level filter pushdown for struct field
access expressions, which previously fell back to a full scan followed
by a separate filtering pass. That is a significant performance penalty
for queries filtering on struct fields in large Parquet files (like
Variant types!).

Filters on struct fields like `WHERE s['foo'] > 67` were not being
pushed into the Parquet decoder. This is because `PushdownChecker` sees
the underlying `Column("s")` has a `Struct` type and unconditionally
rejects it, without considering that `get_field` resolves to a primitive
leaf. With this change, deeply nested access like `s['outer']['inner']`
will also get pushed down, because the logical simplifier flattens it
before it reaches the physical plan.

Note: this does not address the projection side and should not be
blocked by it. `SELECT s['foo']` still reads the entire struct rather
than just the needed leaf column. That requires separate changes to how
the opener builds its projection mask.
…bytes (apache#20719)

## Which issue does this PR close?

- Closes apache#19569.

## Rationale for this change
This was the last remaining usage as far as I can see, so I've changed
it. I don't think this is on the hot path, so if you prefer we can close
the PR and the issue with it.

## What changes are included in this PR?
Instead of formatting each byte with a `write!` format string, write the
hex digits using a constant character mapping.
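
As a rough illustration of the technique (not the PR's code), a hex writer backed by a constant lookup table might look like the sketch below. The names `HEX_CHARS`, `write_hex`, and `to_hex` are hypothetical, and the uppercase digit choice is an assumption.

```rust
use std::fmt::{self, Write};

/// Lookup table mapping a 4-bit value to its hex digit.
/// Uppercase is an assumption made for this sketch.
const HEX_CHARS: &[u8; 16] = b"0123456789ABCDEF";

/// Write `bytes` as hex, two digits per byte, without going through
/// the formatting machinery for every byte.
pub fn write_hex<W: Write>(out: &mut W, bytes: &[u8]) -> fmt::Result {
    for &b in bytes {
        // High nibble first, then low nibble.
        out.write_char(HEX_CHARS[(b >> 4) as usize] as char)?;
        out.write_char(HEX_CHARS[(b & 0x0F) as usize] as char)?;
    }
    Ok(())
}

/// Convenience wrapper that collects the hex digits into a `String`.
pub fn to_hex(bytes: &[u8]) -> String {
    let mut s = String::with_capacity(bytes.len() * 2);
    write_hex(&mut s, bytes).expect("writing to a String cannot fail");
    s
}
```

Since `write_hex` accepts any `fmt::Write`, the same helper can be called with a `fmt::Formatter` inside a `Display` or `Debug` implementation.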

## Are these changes tested?
Ran the debug display tests:
```
running 1 test
test scalar::tests::test_binary_display ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 364 filtered out; finished in 0.00s
```


## Are there any user-facing changes?
No
## Which issue does this PR close?

- Closes apache#16281

## Rationale for this change

The Substrait sqllogictest was failing for subqueries.

```
query failed: DataFusion error: This feature is not implemented: Cannot convert <subquery> to Substrait
```

## What changes are included in this PR?

- Added support for `ScalarSubquery` and `Exists` expressions in the
Substrait producer.

## Are these changes tested?
Yes

## Are there any user-facing changes?

…er proves zero selectivity (apache#20743)

## Which issue does this PR close?

- Closes apache#20742

## Rationale for this change

- see apache#20742

## What changes are included in this PR?

In `collect_new_statistics`, when a filter proves no rows can match, use
a typed null (e.g., ScalarValue::Int32(None)) instead of untyped
ScalarValue::Null for column min/max/sum values. The column's data type
is looked up from the schema so that downstream interval analysis can
still intersect intervals of the same type.

## Are these changes tested?

Yes, one test case was added.

## Are there any user-facing changes?

…verhead (apache#20623)

## Which issue does this PR close?

- Closes apache#20622.

## Rationale for this change

Several array set operations (e.g., `array_distinct`, `array_union`,
`array_intersect`, `array_except`) share a similar structure:

* Convert the input(s) using `RowConverter`, ideally in bulk
* Apply the set operation as appropriate, which involves adding or
removing elements from the candidate set of result `Rows`
* Convert the final set of `Rows` back into `ArrayRef`

We can do better for the final step: instead of converting from `Rows`
back into `ArrayRef`, we can just track which indices in the input(s)
correspond to the values we want to return. We can then grab those
values with a single `take`, which avoids the `Row` -> `ArrayRef`
deserialization overhead. This is a 5-20% performance win, depending on
the set operation and the characteristics of the input.

The only wrinkle is that for `intersect` and `union`, because there are
multiple inputs we need to concatenate the inputs together so that we
have a single index space. It turns out that this optimization is a win,
even incurring the `concat` overhead.
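
The index-tracking idea can be sketched with plain slices. This is a simplified illustration under stated assumptions, not the PR's code: `distinct_indices`, `take`, and `union_by_index` are hypothetical helpers, a `HashSet` of references stands in for the `RowConverter`-encoded rows, and the local `take` stands in for Arrow's `take` kernel. Instead of rebuilding values from the set, the code records which input positions survive and gathers them once at the end.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Indices of the first occurrence of each distinct value, in input order.
pub fn distinct_indices<T: Eq + Hash>(values: &[T]) -> Vec<usize> {
    let mut seen = HashSet::new();
    values
        .iter()
        .enumerate()
        // `insert` returns true only the first time a value is seen.
        .filter(|(_, v)| seen.insert(*v))
        .map(|(i, _)| i)
        .collect()
}

/// Gather the values at `indices` in a single pass.
pub fn take<T: Clone>(values: &[T], indices: &[usize]) -> Vec<T> {
    indices.iter().map(|&i| values[i].clone()).collect()
}

/// Union of two inputs: concatenate them first so both share one index
/// space, then keep the first occurrence of each value.
pub fn union_by_index<T: Eq + Hash + Clone>(left: &[T], right: &[T]) -> Vec<T> {
    let combined: Vec<T> = left.iter().chain(right.iter()).cloned().collect();
    let indices = distinct_indices(&combined);
    take(&combined, &indices)
}
```

The concatenation in `union_by_index` mirrors the `concat` step mentioned above: it costs one extra copy but lets a single gather produce the result.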

## What changes are included in this PR?

* Add a benchmark for `array_except`
* Implement this optimization for `array_distinct`, `array_union`,
`array_intersect`, `array_except`

## Are these changes tested?

Yes, and benchmarked.

## Are there any user-facing changes?

No.
…che#20842)

Bumps
[taiki-e/install-action](https://github.com/taiki-e/install-action) from
2.68.16 to 2.68.25.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's
releases</a>.</em></p>
<blockquote>
<h2>2.68.25</h2>
<ul>
<li>
<p>Update <code>zizmor@latest</code> to 1.23.1.</p>
</li>
<li>
<p>Update <code>tombi@latest</code> to 0.9.4.</p>
</li>
<li>
<p>Update <code>cargo-semver-checks@latest</code> to 0.47.0.</p>
</li>
</ul>
<h2>2.68.24</h2>
<ul>
<li>Avoid triggering <a
href="https://docs.zizmor.sh/audits/#ref-confusion">zizmor
ref-confusion</a> when using this action in form of <code>uses:
taiki-e/install-action@v2</code> or <code>uses:
taiki-e/install-action@&lt;tool_name&gt;</code>.</li>
</ul>
<h2>2.68.23</h2>
<ul>
<li>
<p>Update <code>zizmor@latest</code> to 1.23.0.</p>
</li>
<li>
<p>Update <code>tombi@latest</code> to 0.9.3.</p>
</li>
<li>
<p>Update <code>mise@latest</code> to 2026.3.5.</p>
</li>
</ul>
<h2>2.68.22</h2>
<ul>
<li>
<p>Update <code>release-plz@latest</code> to 0.3.157.</p>
</li>
<li>
<p>Update <code>cargo-binstall@latest</code> to 1.17.7.</p>
</li>
<li>
<p>Update <code>mise@latest</code> to 2026.3.4.</p>
</li>
</ul>
<h2>2.68.21</h2>
<ul>
<li>
<p>Update <code>tombi@latest</code> to 0.9.2.</p>
</li>
<li>
<p>Update <code>uv@latest</code> to 0.10.9.</p>
</li>
<li>
<p>Update <code>rclone@latest</code> to 1.73.2.</p>
</li>
<li>
<p>Update <code>cargo-sort@latest</code> to 2.1.1.</p>
</li>
</ul>
<h2>2.68.20</h2>
<ul>
<li>
<p>Update <code>tombi@latest</code> to 0.9.1.</p>
</li>
<li>
<p>Update <code>cargo-neat@latest</code> to 0.3.2.</p>
</li>
</ul>
<h2>2.68.19</h2>
<ul>
<li>
<p>Update <code>mise@latest</code> to 2026.3.3.</p>
</li>
<li>
<p>Update <code>cargo-auditable@latest</code> to 0.7.4.</p>
</li>
<li>
<p>Update <code>cargo-sort@latest</code> to 2.1.0.</p>
</li>
</ul>
<h2>2.68.18</h2>
<ul>
<li>
<p>Update <code>uv@latest</code> to 0.10.8.</p>
</li>
<li>
<p>Update <code>grcov@latest</code> to 0.10.7.</p>
</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md">taiki-e/install-action's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
<p>All notable changes to this project will be documented in this
file.</p>
<p>This project adheres to <a href="https://semver.org">Semantic
Versioning</a>.</p>
<!-- raw HTML omitted -->
<h2>[Unreleased]</h2>
<h2>[2.68.25] - 2026-03-08</h2>
<ul>
<li>
<p>Update <code>zizmor@latest</code> to 1.23.1.</p>
</li>
<li>
<p>Update <code>tombi@latest</code> to 0.9.4.</p>
</li>
<li>
<p>Update <code>cargo-semver-checks@latest</code> to 0.47.0.</p>
</li>
</ul>
<h2>[2.68.24] - 2026-03-08</h2>
<ul>
<li>Avoid triggering <a
href="https://docs.zizmor.sh/audits/#ref-confusion">zizmor
ref-confusion</a> when using this action in form of <code>uses:
taiki-e/install-action@v2</code> or <code>uses:
taiki-e/install-action@&lt;tool_name&gt;</code>.</li>
</ul>
<h2>[2.68.23] - 2026-03-08</h2>
<ul>
<li>
<p>Update <code>zizmor@latest</code> to 1.23.0.</p>
</li>
<li>
<p>Update <code>tombi@latest</code> to 0.9.3.</p>
</li>
<li>
<p>Update <code>mise@latest</code> to 2026.3.5.</p>
</li>
</ul>
<h2>[2.68.22] - 2026-03-07</h2>
<ul>
<li>
<p>Update <code>release-plz@latest</code> to 0.3.157.</p>
</li>
<li>
<p>Update <code>cargo-binstall@latest</code> to 1.17.7.</p>
</li>
<li>
<p>Update <code>mise@latest</code> to 2026.3.4.</p>
</li>
</ul>
<h2>[2.68.21] - 2026-03-07</h2>
<ul>
<li>
<p>Update <code>tombi@latest</code> to 0.9.2.</p>
</li>
<li>
<p>Update <code>uv@latest</code> to 0.10.9.</p>
</li>
<li>
<p>Update <code>rclone@latest</code> to 1.73.2.</p>
</li>
<li>
<p>Update <code>cargo-sort@latest</code> to 2.1.1.</p>
</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/taiki-e/install-action/commit/a37010ded18ff788be4440302bd6830b1ae50d8b"><code>a37010d</code></a>
Release 2.68.25</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/ffc2b1c2fffc4b45c7ec734311b1717dc5b6c320"><code>ffc2b1c</code></a>
Update <code>zizmor@latest</code> to 1.23.1</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/8f3b52a0c24fe848f8b36e972306155b0f668242"><code>8f3b52a</code></a>
Update <code>tombi@latest</code> to 0.9.4</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/df9c07a392aaa960ce05346ab056a01a0d3b4dd0"><code>df9c07a</code></a>
Update <code>cargo-semver-checks@latest</code> to 0.47.0</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/3c19ebdd96e392c121ba23d56d739b7c23e79dc1"><code>3c19ebd</code></a>
zizmor: Enable ref-confusion</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/b18b9d93a43496aeda12369e7563d9251abc2fe1"><code>b18b9d9</code></a>
Release 2.68.24</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/5ccf6295e62df96d2686cb3c579301c6d3da6a89"><code>5ccf629</code></a>
codegen: Avoid allocation in workspace_root()</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/93ea0b33c357ab5e56584967e551b351d558ff99"><code>93ea0b3</code></a>
Avoid triggering zizmor ref-confusion</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/7c8485f1068cb2274a4b805d3d8ec77237d4fdf3"><code>7c8485f</code></a>
Update script and CI config</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/fc2a2b349fea94690f6a04dcec522c55f51fe2fd"><code>fc2a2b3</code></a>
Release 2.68.23</li>
<li>Additional commits viewable in <a
href="https://github.com/taiki-e/install-action/compare/d6e286fa45544157a02d45a43742857ebbc25d12...a37010ded18ff788be4440302bd6830b1ae50d8b">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=taiki-e/install-action&package-manager=github_actions&previous-version=2.68.16&new-version=2.68.25)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…20843)

Bumps [github/codeql-action](https://github.com/github/codeql-action)
from 4.32.5 to 4.32.6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/github/codeql-action/releases">github/codeql-action's
releases</a>.</em></p>
<blockquote>
<h2>v4.32.6</h2>
<ul>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.3">2.24.3</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3548">#3548</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/github/codeql-action/blob/main/CHANGELOG.md">github/codeql-action's
changelog</a>.</em></p>
<blockquote>
<h1>CodeQL Action Changelog</h1>
<p>See the <a
href="https://github.com/github/codeql-action/releases">releases
page</a> for the relevant changes to the CodeQL CLI and language
packs.</p>
<h2>[UNRELEASED]</h2>
<ul>
<li>Fixed <a
href="https://redirect.github.com/github/codeql-action/issues/3555">a
bug</a> which caused the CodeQL Action to fail loading repository
properties if a &quot;Multi select&quot; repository property was
configured for the repository. <a
href="https://redirect.github.com/github/codeql-action/pull/3557">#3557</a></li>
<li>The CodeQL Action now loads <a
href="https://docs.github.com/en/organizations/managing-organization-settings/managing-custom-properties-for-repositories-in-your-organization">custom
repository properties</a> on GitHub Enterprise Server, enabling the
customization of features such as
<code>github-codeql-disable-overlay</code> that was previously only
available on GitHub.com. <a
href="https://redirect.github.com/github/codeql-action/pull/3559">#3559</a></li>
</ul>
<h2>4.32.6 - 05 Mar 2026</h2>
<ul>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.3">2.24.3</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3548">#3548</a></li>
</ul>
<h2>4.32.5 - 02 Mar 2026</h2>
<ul>
<li>Repositories owned by an organization can now set up the
<code>github-codeql-disable-overlay</code> custom repository property to
disable <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis for CodeQL</a>. First, create a custom repository
property with the name <code>github-codeql-disable-overlay</code> and
the type &quot;True/false&quot; in the organization's settings. Then in
the repository's settings, set this property to <code>true</code> to
disable improved incremental analysis. For more information, see <a
href="https://docs.github.com/en/organizations/managing-organization-settings/managing-custom-properties-for-repositories-in-your-organization">Managing
custom properties for repositories in your organization</a>. This
feature is not yet available on GitHub Enterprise Server. <a
href="https://redirect.github.com/github/codeql-action/pull/3507">#3507</a></li>
<li>Added an experimental change so that when <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis</a> fails on a runner — potentially due to
insufficient disk space — the failure is recorded in the Actions cache
so that subsequent runs will automatically skip improved incremental
analysis until something changes (e.g. a larger runner is provisioned or
a new CodeQL version is released). We expect to roll this change out to
everyone in March. <a
href="https://redirect.github.com/github/codeql-action/pull/3487">#3487</a></li>
<li>The minimum memory check for improved incremental analysis is now
skipped for CodeQL 2.24.3 and later, which has reduced peak RAM usage.
<a
href="https://redirect.github.com/github/codeql-action/pull/3515">#3515</a></li>
<li>Reduced log levels for best-effort private package registry
connection check failures to reduce noise from workflow annotations. <a
href="https://redirect.github.com/github/codeql-action/pull/3516">#3516</a></li>
<li>Added an experimental change which lowers the minimum disk space
requirement for <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis</a>, enabling it to run on standard GitHub Actions
runners. We expect to roll this change out to everyone in March. <a
href="https://redirect.github.com/github/codeql-action/pull/3498">#3498</a></li>
<li>Added an experimental change which allows the
<code>start-proxy</code> action to resolve the CodeQL CLI version from
feature flags instead of using the linked CLI bundle version. We expect
to roll this change out to everyone in March. <a
href="https://redirect.github.com/github/codeql-action/pull/3512">#3512</a></li>
<li>The previously experimental changes from versions 4.32.3, 4.32.4,
3.32.3 and 3.32.4 are now enabled by default. <a
href="https://redirect.github.com/github/codeql-action/pull/3503">#3503</a>,
<a
href="https://redirect.github.com/github/codeql-action/pull/3504">#3504</a></li>
</ul>
<h2>4.32.4 - 20 Feb 2026</h2>
<ul>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.2">2.24.2</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3493">#3493</a></li>
<li>Added an experimental change which improves how certificates are
generated for the authentication proxy that is used by the CodeQL Action
in Default Setup when <a
href="https://docs.github.com/en/code-security/how-tos/secure-at-scale/configure-organization-security/manage-usage-and-access/giving-org-access-private-registries">private
package registries are configured</a>. This is expected to generate more
widely compatible certificates and should have no impact on analyses
which are working correctly already. We expect to roll this change out
to everyone in February. <a
href="https://redirect.github.com/github/codeql-action/pull/3473">#3473</a></li>
<li>When the CodeQL Action is run <a
href="https://docs.github.com/en/code-security/how-tos/scan-code-for-vulnerabilities/troubleshooting/troubleshooting-analysis-errors/logs-not-detailed-enough#creating-codeql-debugging-artifacts-for-codeql-default-setup">with
debugging enabled in Default Setup</a> and <a
href="https://docs.github.com/en/code-security/how-tos/secure-at-scale/configure-organization-security/manage-usage-and-access/giving-org-access-private-registries">private
package registries are configured</a>, the &quot;Setup proxy for
registries&quot; step will output additional diagnostic information that
can be used for troubleshooting. <a
href="https://redirect.github.com/github/codeql-action/pull/3486">#3486</a></li>
<li>Added a setting which allows the CodeQL Action to enable network
debugging for Java programs. This will help GitHub staff support
customers with troubleshooting issues in GitHub-managed CodeQL
workflows, such as Default Setup. This setting can only be enabled by
GitHub staff. <a
href="https://redirect.github.com/github/codeql-action/pull/3485">#3485</a></li>
<li>Added a setting which enables GitHub-managed workflows, such as
Default Setup, to use a <a
href="https://github.com/dsp-testing/codeql-cli-nightlies">nightly
CodeQL CLI release</a> instead of the latest, stable release that is
used by default. This will help GitHub staff support customers whose
analyses for a given repository or organization require early access to
a change in an upcoming CodeQL CLI release. This setting can only be
enabled by GitHub staff. <a
href="https://redirect.github.com/github/codeql-action/pull/3484">#3484</a></li>
</ul>
<h2>4.32.3 - 13 Feb 2026</h2>
<ul>
<li>Added experimental support for testing connections to <a
href="https://docs.github.com/en/code-security/how-tos/secure-at-scale/configure-organization-security/manage-usage-and-access/giving-org-access-private-registries">private
package registries</a>. This feature is not currently enabled for any
analysis. In the future, it may be enabled by default for Default Setup.
<a
href="https://redirect.github.com/github/codeql-action/pull/3466">#3466</a></li>
</ul>
<h2>4.32.2 - 05 Feb 2026</h2>
<ul>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.1">2.24.1</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3460">#3460</a></li>
</ul>
<h2>4.32.1 - 02 Feb 2026</h2>
<ul>
<li>A warning is now shown in Default Setup workflow logs if a <a
href="https://docs.github.com/en/code-security/how-tos/secure-at-scale/configure-organization-security/manage-usage-and-access/giving-org-access-private-registries">private
package registry is configured</a> using a GitHub Personal Access Token
(PAT), but no username is configured. <a
href="https://redirect.github.com/github/codeql-action/pull/3422">#3422</a></li>
<li>Fixed a bug which caused the CodeQL Action to fail when repository
properties cannot successfully be retrieved. <a
href="https://redirect.github.com/github/codeql-action/pull/3421">#3421</a></li>
</ul>
<h2>4.32.0 - 26 Jan 2026</h2>
<ul>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.0">2.24.0</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3425">#3425</a></li>
</ul>
<h2>4.31.11 - 23 Jan 2026</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/github/codeql-action/commit/0d579ffd059c29b07949a3cce3983f0780820c98"><code>0d579ff</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3551">#3551</a>
from github/update-v4.32.6-72d2d850d</li>
<li><a
href="https://github.com/github/codeql-action/commit/d4c6be7cf1c47a33a06fa9183269e133e6863574"><code>d4c6be7</code></a>
Update changelog for v4.32.6</li>
<li><a
href="https://github.com/github/codeql-action/commit/72d2d850d1f91d4e1e024f4cf4276fd16bb68462"><code>72d2d85</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3548">#3548</a>
from github/update-bundle/codeql-bundle-v2.24.3</li>
<li><a
href="https://github.com/github/codeql-action/commit/23f983ce00d9a853697a6aaa9eae8d5abbf14849"><code>23f983c</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3544">#3544</a>
from github/dependabot/github_actions/dot-github/wor...</li>
<li><a
href="https://github.com/github/codeql-action/commit/832e97ccad228ef72e06ffee26f6251bceeb7e5f"><code>832e97c</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3545">#3545</a>
from github/dependabot/github_actions/dot-github/wor...</li>
<li><a
href="https://github.com/github/codeql-action/commit/5ef38c0b13c2f0f5ce928cb7706f5fb19fc97ae2"><code>5ef38c0</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3546">#3546</a>
from github/dependabot/npm_and_yarn/tar-7.5.10</li>
<li><a
href="https://github.com/github/codeql-action/commit/80c9cda73902bba67939606c4bf3a1d9606bb150"><code>80c9cda</code></a>
Add changelog note</li>
<li><a
href="https://github.com/github/codeql-action/commit/f2669dd916c673b2811839169929a8ba71bb7634"><code>f2669dd</code></a>
Update default bundle to codeql-bundle-v2.24.3</li>
<li><a
href="https://github.com/github/codeql-action/commit/bd03c44cf40965f5476f66fad404194e4cb35710"><code>bd03c44</code></a>
Merge branch 'main' into
dependabot/github_actions/dot-github/workflows/actio...</li>
<li><a
href="https://github.com/github/codeql-action/commit/102d7627b63c066871badf0743c11b2f6dd9c9e9"><code>102d762</code></a>
Bump tar from 7.5.7 to 7.5.10</li>
<li>Additional commits viewable in <a
href="https://github.com/github/codeql-action/compare/c793b717bc78562f491db7b0e93a3a178b099162...0d579ffd059c29b07949a3cce3983f0780820c98">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github/codeql-action&package-manager=github_actions&previous-version=4.32.5&new-version=4.32.6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

## Rationale for this change

```
Crate:     paste
Version:   1.0.15
Warning:   unmaintained
Title:     paste - no longer maintained
Date:      2024-10-07
ID:        RUSTSEC-2024-0436
```

We also need to remove `paste` from the project.

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
…he#20231)

## Which issue does this PR close?

Closes apache#20194

## Rationale for this change

A query with `ROW_NUMBER() OVER (... ORDER BY CASE WHEN col='0' THEN 1
ELSE 0 END)` combined with a filter `nvl(t2.value_2_3,'0')='0'` fails
with a `SanityCheckPlan` error. This worked in 50.3.0 but broke in
52.1.0.

## What changes are included in this PR?

**Root cause**: `collect_columns_from_predicate_inner` was extracting
equality pairs where neither side was a `Column` (e.g. `nvl(col, '0') =
'0'`), creating equivalence classes between complex expressions and
literals. `normalize_expr`'s deep traversal would then replace the
literal `'0'` inside unrelated sort/window CASE WHEN expressions with
the complex NVL expression, corrupting the sort ordering and causing a
mismatch between `SortExec`'s reported output ordering and
`BoundedWindowAggExec`'s expected ordering.

**Fix** (two changes in `filter.rs`):
1. **`collect_columns_from_predicate_inner`**: Only extract equality
pairs where at least one side is a `Column` reference. This matches the
function's documented intent ("Column-Pairs") and prevents
complex-expression-to-literal equivalence classes from being created.
2. **`extend_constants`**: Recognize `Literal` expressions as inherently
constant (previously only checked `is_expr_constant` on the input's
equivalence properties, which doesn't know about literals). This ensures
constant propagation still works for `complex_expr = literal` predicates
— e.g. `nvl(col, '0')` is properly marked as constant after the filter.
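
The two changes above can be illustrated with a small, self-contained sketch. It does not use DataFusion's `PhysicalExpr` types: the `Expr` enum and the helper names `collect_column_equalities` and `constant_from_equality` are hypothetical stand-ins, chosen only to show the shape of the rule (keep an equality pair only if one side is a column; treat the non-literal side of `expr = literal` as constant).

```rust
// Illustrative stand-ins only: these are NOT DataFusion's expression types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    // A function call such as `nvl(col, '0')`.
    Call(String, Vec<Expr>),
    // Equality comparison: `left = right`.
    Eq(Box<Expr>, Box<Expr>),
    // Conjunction: `left AND right`.
    And(Box<Expr>, Box<Expr>),
}

/// Collect `(left, right)` equality pairs from a conjunctive predicate,
/// keeping only pairs in which at least one side is a plain column.
fn collect_column_equalities(predicate: &Expr, out: &mut Vec<(Expr, Expr)>) {
    match predicate {
        Expr::And(l, r) => {
            collect_column_equalities(l, out);
            collect_column_equalities(r, out);
        }
        Expr::Eq(l, r) => {
            let has_column =
                matches!(**l, Expr::Column(_)) || matches!(**r, Expr::Column(_));
            if has_column {
                out.push(((**l).clone(), (**r).clone()));
            }
        }
        // Anything else contributes no equality pairs.
        _ => {}
    }
}

/// For `left = right`, return the side that is constant after the filter
/// because the other side is a literal (literals are inherently constant).
fn constant_from_equality(left: &Expr, right: &Expr) -> Option<Expr> {
    match (left, right) {
        (Expr::Literal(_), Expr::Literal(_)) => None,
        (other, Expr::Literal(_)) | (Expr::Literal(_), other) => Some(other.clone()),
        _ => None,
    }
}
```

In this sketch, `nvl(value_2_3, '0') = '0'` yields no equality pair, so no equivalence class ties the literal `'0'` to the NVL call, while `constant_from_equality` still reports the NVL call as constant.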

## How was this tested?

- Unit test `test_collect_columns_skips_non_column_pairs` verifying the
filtering logic
- Sqllogictest reproducing the exact query from the issue
- Full test suites: equivalence tests (51 passed), physical-plan tests
(1255 passed), physical-optimizer tests (20 passed)
- Manual verification with datafusion-cli running the reproduction query

## Test plan
- [x] Unit test for `collect_columns_from_predicate_inner` column
filtering
- [x] Sqllogictest regression test for apache#20194
- [x] Existing test suites pass
- [x] Manual reproduction query succeeds

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Bumps [quinn-proto](https://github.com/quinn-rs/quinn) from 0.11.13 to
0.11.14.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/quinn-rs/quinn/releases">quinn-proto's
releases</a>.</em></p>
<blockquote>
<h2>quinn-proto 0.11.14</h2>
<p><a href="https://github.com/jxs"><code>@​jxs</code></a> reported a
denial of service issue in quinn-proto 5 days ago:</p>
<ul>
<li><a
href="https://github.com/quinn-rs/quinn/security/advisories/GHSA-6xvm-j4wr-6v98">https://github.com/quinn-rs/quinn/security/advisories/GHSA-6xvm-j4wr-6v98</a></li>
</ul>
<p>We coordinated with them to release this version to patch the issue.
Unfortunately the maintainers missed these issues during code review and
we did not have enough fuzzing coverage -- we regret the oversight and
have added an additional fuzzing target.</p>
<p>Organizations that want to participate in coordinated disclosure can
contact us privately to discuss terms.</p>
<h2>What's Changed</h2>
<ul>
<li>Fix over-permissive proto dependency edge by <a
href="https://github.com/Ralith"><code>@​Ralith</code></a> in <a
href="https://redirect.github.com/quinn-rs/quinn/pull/2385">quinn-rs/quinn#2385</a></li>
<li>0.11.x: avoid unwrapping VarInt decoding during parameter parsing by
<a href="https://github.com/djc"><code>@​djc</code></a> in <a
href="https://redirect.github.com/quinn-rs/quinn/pull/2559">quinn-rs/quinn#2559</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/quinn-rs/quinn/commit/2c315aa7f9c2a6c1db87f8f51f40623a427c78fd"><code>2c315aa</code></a>
proto: bump version to 0.11.14</li>
<li><a
href="https://github.com/quinn-rs/quinn/commit/8ad47f431e7deb82c08b09c2e33ef85aa88fd212"><code>8ad47f4</code></a>
Use newer rustls-pki-types PEM parser API</li>
<li><a
href="https://github.com/quinn-rs/quinn/commit/c81c0289abe30d8437ccbf9b6304e2bc9c707cea"><code>c81c028</code></a>
ci: fix workflow syntax</li>
<li><a
href="https://github.com/quinn-rs/quinn/commit/0050172969f7e69e136c433181330da7790d8d73"><code>0050172</code></a>
ci: pin wasm-bindgen-cli version</li>
<li><a
href="https://github.com/quinn-rs/quinn/commit/8a6f82c58d1c565eab78f986e614223e6ed76a85"><code>8a6f82c</code></a>
Take semver-compatible dependency updates</li>
<li><a
href="https://github.com/quinn-rs/quinn/commit/e52db4ad8df0f9720e7b0e32ecc0e48c9a93de0f"><code>e52db4a</code></a>
Apply suggestions from clippy 1.91</li>
<li><a
href="https://github.com/quinn-rs/quinn/commit/6df7275c582ca9b7225e0ccf9f9871a55eb73155"><code>6df7275</code></a>
chore: Fix <code>unnecessary_unwrap</code> clippy</li>
<li><a
href="https://github.com/quinn-rs/quinn/commit/c8eefa07e087b06d8f2b78ff262ce8ac952994f1"><code>c8eefa0</code></a>
proto: avoid unwrapping varint decoding during parameters parsing</li>
<li><a
href="https://github.com/quinn-rs/quinn/commit/9723a977754c8662001b0fef97aab8f3ddf1df92"><code>9723a97</code></a>
fuzz: add fuzzing target for parsing transport parameters</li>
<li><a
href="https://github.com/quinn-rs/quinn/commit/eaf0ef30252cef4acec21f150427e604cd4271c9"><code>eaf0ef3</code></a>
Fix over-permissive proto dependency edge (<a
href="https://redirect.github.com/quinn-rs/quinn/issues/2385">#2385</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/quinn-rs/quinn/compare/quinn-proto-0.11.13...quinn-proto-0.11.14">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=quinn-proto&package-manager=cargo&previous-version=0.11.13&new-version=0.11.14)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/apache/datafusion/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Which issue does this PR close?

- Closes apache#20841

## Rationale for this change

We want to split IO and CPU work to allow for more (NUMA-aware) parallelism
and better utilization of both.
This enables, for example, more coalescing, prefetching, parallel IO, and
more parallel / incremental decoding.
It also allows morsels to be handled purely at the CPU level, without
repeating IO for each morsel.

## What changes are included in this PR?

This PR only refactors `ParquetOpener` to use `ParquetPushDecoder`. I used
Claude to rewrite it and to keep the changes small.
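
To make the IO/CPU split concrete, here is a toy push-style decoder. It is only a sketch of the pattern and not the arrow-rs API: `PushDecoder`, `DecodeResult`, and `decode_all` are hypothetical names, and the "file" is an in-memory buffer in which each "row group" is a 4-byte little-endian integer. The decoder never performs IO; it reports which byte ranges it needs, and the driver decides how to fetch them.

```rust
use std::ops::Range;

/// What the decoder needs next (hypothetical, for illustration only).
#[derive(Debug, PartialEq)]
enum DecodeResult {
    /// The caller must fetch these byte ranges and push them in.
    NeedsData(Vec<Range<usize>>),
    /// A decoded value is ready.
    Data(u32),
    /// Nothing is left to decode.
    Finished,
}

/// Toy decoder: each "row group" is a 4-byte little-endian integer.
struct PushDecoder {
    pending: Vec<Range<usize>>, // ranges still to decode, in order
    buffered: Option<Vec<u8>>,  // bytes pushed for the front range
}

impl PushDecoder {
    fn new(num_groups: usize) -> Self {
        let pending = (0..num_groups).map(|i| i * 4..(i + 1) * 4).collect();
        Self { pending, buffered: None }
    }

    /// CPU-only step: decodes buffered bytes or asks for more data.
    fn try_decode(&mut self) -> DecodeResult {
        if self.pending.is_empty() {
            return DecodeResult::Finished;
        }
        match self.buffered.take() {
            Some(bytes) => {
                self.pending.remove(0);
                let mut arr = [0u8; 4];
                arr.copy_from_slice(&bytes);
                DecodeResult::Data(u32::from_le_bytes(arr))
            }
            None => DecodeResult::NeedsData(vec![self.pending[0].clone()]),
        }
    }

    /// Hand the decoder the bytes for the range it asked for.
    fn push(&mut self, bytes: Vec<u8>) {
        self.buffered = Some(bytes);
    }
}

/// The driver owns the IO; here "IO" is just slicing an in-memory buffer.
fn decode_all(file: &[u8]) -> Vec<u32> {
    let mut decoder = PushDecoder::new(file.len() / 4);
    let mut out = Vec::new();
    loop {
        match decoder.try_decode() {
            DecodeResult::NeedsData(ranges) => {
                for r in ranges {
                    decoder.push(file[r].to_vec());
                }
            }
            DecodeResult::Data(v) => out.push(v),
            DecodeResult::Finished => return out,
        }
    }
}
```

Because the driver sees the requested ranges before any bytes are read, it is free to coalesce, prefetch, or parallelize the fetches, which is the kind of flexibility the rationale above describes.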

## Are these changes tested?

Covered by existing tests. Nothing should change; the arrow-rs code also
uses `ParquetPushDecoder`.

## Are there any user-facing changes?

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ache#20780)

## Which issue does this PR close?

- Closes #.

## Rationale for this change

Some ClickBench queries (Q7, Q15, Q16, Q18) have redundant projections
when sorting by count.
This change is probably not a measurable improvement, but the plan looks
better (for non-TopK sorts it could be measurable).

## What changes are included in this PR?

## Are these changes tested?
Existing tests.

## Are there any user-facing changes?

---------

Co-authored-by: Claude <noreply@anthropic.com>
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
Besides pushing `LimitExec` down the query plan, there is another
optimization that allows plan nodes to *absorb* a limit, so that they can
potentially stop early.

I’ve noticed that this form of limit absorption has not been implemented
by many operators. This suggests the optimization is non-obvious, so I’d
like to improve the documentation for it.

A recent PR that implements this optimization is:
- apache#20228
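
As a rough illustration of limit absorption, the sketch below models an operator as a plain Rust iterator over batches. It is not DataFusion code: `PassThrough` and its `with_fetch` method are hypothetical names. The point is that an operator that knows the limit can stop polling its input as soon as enough rows have been produced, instead of relying on a separate limit node above it.

```rust
/// Hypothetical operator that forwards batches and can absorb a limit.
struct PassThrough<I: Iterator<Item = Vec<i32>>> {
    input: I,
    /// Remaining rows to emit; `None` means no limit was absorbed.
    remaining: Option<usize>,
}

impl<I: Iterator<Item = Vec<i32>>> PassThrough<I> {
    fn new(input: I) -> Self {
        Self { input, remaining: None }
    }

    /// Absorb a limit of `fetch` rows into this operator.
    fn with_fetch(mut self, fetch: usize) -> Self {
        self.remaining = Some(fetch);
        self
    }
}

impl<I: Iterator<Item = Vec<i32>>> Iterator for PassThrough<I> {
    type Item = Vec<i32>;

    fn next(&mut self) -> Option<Vec<i32>> {
        // Stop early, without polling the input, once the limit is met.
        if self.remaining == Some(0) {
            return None;
        }
        let mut batch = self.input.next()?;
        if let Some(rem) = self.remaining.as_mut() {
            // Trim the last batch so no more than the limit is emitted.
            batch.truncate(*rem);
            *rem -= batch.len();
        }
        Some(batch)
    }
}
```

With a limit of 6 rows and input batches of 4 rows each, this operator pulls only two batches from its input and trims the second one to 2 rows; without the absorbed limit it would drain the whole input.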

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#20797

## Rationale for this change

- see apache#20797

## What changes are included in this PR?

Implement serialization/deserialization for `preserve_order` in `RepartitionExec`.

## Are these changes tested?

Added one test case.

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
…ache#20627)

## Summary
- include synchronous `start_next_file()` / `FileOpener::open()` setup
time in `time_elapsed_scanning_total`
- keep existing `time_opening` and scanning timers lifecycle intact
- avoid timer overlap by scoping the temporary timer before calling
`time_scanning_total.start()`

## Details
In `FileStreamState::Open`, `start_next_file()` is invoked before
`time_scanning_total.start()`. If `open()` performs synchronous work
before returning the future, that time was previously unaccounted for in
`time_elapsed_scanning_total`.

This change wraps the `start_next_file()` call in a scoped timer on the
same `time_scanning_total` metric so the missing segment is recorded.
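
A minimal sketch of the scoped-timer idea follows. The types here are simplified stand-ins rather than DataFusion's metric types: `Time`, `ScopedTimer`, and `open_with_accounting` are hypothetical, and the body of `start_next_file` just sleeps to simulate synchronous setup work. The inner block ensures the temporary timer is dropped, and its time recorded, before the long-lived scanning timer would start.

```rust
use std::cell::Cell;
use std::time::{Duration, Instant};

/// Hypothetical accumulated-time metric (not DataFusion's metric type).
#[derive(Default)]
struct Time {
    total: Cell<Duration>,
}

impl Time {
    /// Start a guard that adds its lifetime to this metric when dropped.
    fn timer(&self) -> ScopedTimer<'_> {
        ScopedTimer { metric: self, start: Instant::now() }
    }

    fn elapsed(&self) -> Duration {
        self.total.get()
    }
}

struct ScopedTimer<'a> {
    metric: &'a Time,
    start: Instant,
}

impl Drop for ScopedTimer<'_> {
    fn drop(&mut self) {
        let updated = self.metric.total.get() + self.start.elapsed();
        self.metric.total.set(updated);
    }
}

/// Stand-in for an opener whose synchronous setup takes measurable time.
fn start_next_file() -> &'static str {
    std::thread::sleep(Duration::from_millis(5));
    "pending-open-future"
}

fn open_with_accounting(time_scanning_total: &Time) -> &'static str {
    // Scope the temporary timer so it is dropped (and recorded) before the
    // long-lived scanning timer would be started, which avoids overlap.
    let pending = {
        let _setup = time_scanning_total.timer();
        start_next_file()
    };
    // In the real stream, the scanning timer would be started here.
    pending
}
```

After one call to `open_with_accounting`, the metric holds at least the 5 ms spent inside `start_next_file`, which is the segment that was previously unaccounted for.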

- Fixes apache#20571

## Validation
I tested by reading CSV files via AWS S3.

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

```
Crate:     generational-arena
Version:   0.2.9
Warning:   unmaintained
Title:     `generational-arena` is unmaintained
Date:      2024-02-11
ID:        RUSTSEC-2024-0014
URL:       https://rustsec.org/advisories/RUSTSEC-2024-0014
```

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

## Rationale for this change

Move dependencies from the main dependencies section to the dev-dependencies section.

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
…oder

Resolve merge conflicts in opener.rs, clickbench.slt, and
projection_pushdown.slt. Adapt the morsel-driven bloom filter pruning
in open() to use a separate ParquetRecordBatchStreamBuilder (as
upstream now does) since prune_by_bloom_filters requires that type,
not the new ParquetPushDecoderBuilder.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>