[WIP][NO MERGE] ASF Infra test #18656
Closed
…che#18508) Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.62.46 to 2.62.47. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's releases</a>.</em></p> <blockquote> <h2>2.62.47</h2> <ul> <li> <p>Update <code>vacuum@latest</code> to 0.20.0.</p> </li> <li> <p>Update <code>cargo-nextest@latest</code> to 0.9.111.</p> </li> <li> <p>Update <code>cargo-shear@latest</code> to 1.6.2.</p> </li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md">taiki-e/install-action's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> <p>All notable changes to this project will be documented in this file.</p> <p>This project adheres to <a href="https://semver.org">Semantic Versioning</a>.</p> <!-- raw HTML omitted --> <h2>[Unreleased]</h2> <ul> <li> <p>Update <code>cargo-udeps@latest</code> to 0.1.60.</p> </li> <li> <p>Update <code>zizmor@latest</code> to 1.16.3.</p> </li> </ul> <h2>[2.62.47] - 2025-11-05</h2> <ul> <li> <p>Update <code>vacuum@latest</code> to 0.20.0.</p> </li> <li> <p>Update <code>cargo-nextest@latest</code> to 0.9.111.</p> </li> <li> <p>Update <code>cargo-shear@latest</code> to 1.6.2.</p> </li> </ul> <h2>[2.62.46] - 2025-11-04</h2> <ul> <li> <p>Update <code>vacuum@latest</code> to 0.19.5.</p> </li> <li> <p>Update <code>syft@latest</code> to 1.37.0.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.11.2.</p> </li> <li> <p>Update <code>knope@latest</code> to 0.21.5.</p> </li> </ul> <h2>[2.62.45] - 2025-11-02</h2> <ul> <li> <p>Update <code>zizmor@latest</code> to 1.16.2.</p> </li> <li> <p>Update <code>cargo-binstall@latest</code> to 1.15.10.</p> </li> <li> <p>Update <code>ubi@latest</code> to 0.8.4.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.11.1.</p> </li> <li> <p>Update 
<code>cargo-semver-checks@latest</code> to 0.45.0.</p> </li> </ul> <h2>[2.62.44] - 2025-11-01</h2> <ul> <li>Update <code>mise@latest</code> to 2025.11.0.</li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/taiki-e/install-action/commit/6f9c7cc51aa54b13cbcbd12f8bbf69d8ba405b4b"><code>6f9c7cc</code></a> Release 2.62.47</li> <li><a href="https://github.com/taiki-e/install-action/commit/f13cacde469bbeca99e2ca0b0118337dd536aaf7"><code>f13cacd</code></a> Update <code>vacuum@latest</code> to 0.20.0</li> <li><a href="https://github.com/taiki-e/install-action/commit/62c4f5632b45a86418e529c41d1b2f82063b35a9"><code>62c4f56</code></a> Update <code>cargo-nextest@latest</code> to 0.9.111</li> <li><a href="https://github.com/taiki-e/install-action/commit/800a584e84678ab6b0c92051141d4a2942098533"><code>800a584</code></a> Update <code>cargo-shear@latest</code> to 1.6.2</li> <li>See full diff in <a href="https://github.com/taiki-e/install-action/compare/f535147c22906d77695e11cb199e764aa610a4fc...6f9c7cc51aa54b13cbcbd12f8bbf69d8ba405b4b">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Oleks V <comphead@users.noreply.github.com>
…LargeListView` types (apache#18432)

## Which issue does this PR close?

- Closes apache#18351

## Rationale for this change

Make `array_slice` accept `ListView` / `LargeListView` inputs.

## What changes are included in this PR?

- Extend `array_slice_inner` to handle `ListView`/`LargeListView` arrays directly.
- Share the stride/bounds logic between list and list-view implementations via a new `SlicePlan`.

## Are these changes tested?

Yes

## Are there any user-facing changes?

Yes. `array_slice` now accepts `ListView` and `LargeListView` arrays without requiring an explicit cast.
## Which issue does this PR close?

- Part of apache#18142.

## Rationale for this change

This PR consolidates all the `builtin-functions` examples into a single example binary. We have agreed on the pattern and can apply it to the remaining examples.

## What changes are included in this PR?

## Are these changes tested?

## Are there any user-facing changes?

---------

Co-authored-by: Sergey Zhukov <szhukov@aligntech.com>
## Which issue does this PR close?

- Related apache#18210

## Rationale for this change

To keep the logic in binary operators clear and make it possible to use binary operators for nested data structures in upcoming changes.

## What changes are included in this PR?

Another housekeeping refactor for binary operators.

- Keep the API of the datum module consistent by using `Operator` instead of a kernel function
- Move the nested data structure check into the cmp operators. This allows us to implement binary operators for `List`, `Struct`, etc.

## Are these changes tested?

Unit tests

## Are there any user-facing changes?

N/A
## Which issue does this PR close?

- Closes apache#18431

## Rationale for this change

The `trace_id` in the result depends on a random number. I think it's better to remove it from the SQL to get a stable result.

## What changes are included in this PR?

Remove the `trace_id` from the SQL and from the asserted result.

## Are these changes tested?

N/A

## Are there any user-facing changes?

No
…rnal invariant checks (apache#18511)

## Which issue does this PR close?

- Closes apache#15492

## Rationale for this change

See the issue for the rationale and an example.

This PR introduces the following macros to make invariant checks and returning internal errors easier. When an assertion fails, the error message also includes more detail (the expected and actual values), which makes debugging easier.

- `assert_or_internal_err!()`
- `assert_eq_or_internal_err!()`
- `assert_ne_or_internal_err!()`

```rust
// before
if field.name() != expected.name() {
    return internal_err!(
        "Field name mismatch at index {}: expected '{}', found '{}'",
        idx,
        expected.name(),
        field.name()
    );
}

// after
assert_eq_or_internal_err!(
    field.name(),
    expected.name(),
    "Field name mismatch at index {}",
    idx
);
```

If the assertion fails, the error now reads:

```
Internal error: Assertion failed: field.name() == expected.name() (left: "foo", right: "bar"): Field name mismatch at index 3.
```

## What changes are included in this PR?

1. Add the macros and unit tests for them
2. Update a few internal error patterns where these macros apply

## Are these changes tested?

UTs

## Are there any user-facing changes?

No

---------

Co-authored-by: Alex Huang <huangweijun1001@gmail.com>
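The idea behind the macros above can be sketched with a small stand-alone macro. This is only an illustration: `assert_eq_or_err!` and `check_field_name` are hypothetical names, the error type is a plain `String` rather than DataFusion's error type, and the real macros may be implemented differently.

```rust
/// Hypothetical, simplified stand-in for an "assert or return an error" macro.
/// On mismatch it returns early with a message that includes both operands.
macro_rules! assert_eq_or_err {
    ($left:expr, $right:expr, $($arg:tt)+) => {{
        // Evaluate each operand exactly once.
        let (l, r) = (&$left, &$right);
        if l != r {
            return Err(format!(
                "Internal error: Assertion failed: {} == {} (left: {:?}, right: {:?}): {}",
                stringify!($left),
                stringify!($right),
                l,
                r,
                format!($($arg)+)
            ));
        }
    }};
}

/// Example caller: returns an error instead of panicking when names differ.
fn check_field_name(idx: usize, actual: &str, expected: &str) -> Result<(), String> {
    assert_eq_or_err!(actual, expected, "Field name mismatch at index {}", idx);
    Ok(())
}
```

Calling `check_field_name(3, "foo", "bar")` yields an `Err` whose message carries both values and the index, while matching names return `Ok(())`.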
…cal-optimizer (apache#18555)

## Which issue does this PR close?

- Closes apache#18547.

## What changes are included in this PR?

Enforce the clippy lint `needless_pass_by_value` in `datafusion-physical-optimizer`.

## Are these changes tested?

Yes

## Are there any user-facing changes?

No

…pache#18554)

## Which issue does this PR close?

- Closes apache#18546.

## Rationale for this change

Enforce the clippy lint `needless_pass_by_value`.

## Are these changes tested?

Yes

## Are there any user-facing changes?

No

…mmon (apache#18556)

## Which issue does this PR close?

- Closes apache#18543

## What changes are included in this PR?

Enforce the clippy lint `needless_pass_by_value` in `datafusion-physical-expr-common`.

## Are these changes tested?

Yes

## Are there any user-facing changes?

No
…ion-physical-expr` (apache#18557)

## Which issue does this PR close?

- Closes apache#18544.

## Rationale for this change

See apache#18503 for details.

## What changes are included in this PR?

I enabled the clippy lint rule and then fixed nearly all instances.

## Are these changes tested?

As part of the normal test suite, yes.

## Are there any user-facing changes?

The following `pub(crate)` APIs were changed:

- `regex_match_dyn` in `datafusion/physical-expr/src/expressions/binary/kernels.rs`
- `regex_match_dyn_scalar` in `datafusion/physical-expr/src/expressions/binary/kernels.rs`

But no fully `pub` functions were changed.
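For readers unfamiliar with the lint, here is a minimal sketch of the kind of change `needless_pass_by_value` asks for. The function names are made up for illustration and are not taken from the DataFusion code base; the attributes only have an effect when the code is checked with clippy.

```rust
/// Takes ownership of the vector although it only reads from it.
/// Clippy would flag this; the `allow` mirrors what one might do for a
/// public function whose signature must stay stable.
#[allow(clippy::needless_pass_by_value)]
pub fn total_len_owned(names: Vec<String>) -> usize {
    names.iter().map(String::len).sum()
}

/// Borrowing version suggested by the lint: callers keep their data.
#[deny(clippy::needless_pass_by_value)]
pub fn total_len(names: &[String]) -> usize {
    names.iter().map(String::len).sum()
}
```

With the borrowing signature, a caller can write `total_len(&names)` and continue to use `names` afterwards, which is the main practical benefit.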
Most of these file source implementations cannot operate without a schema:
they all have `.expect("schema must be set")` calls, which defeats using the
language to enforce correctness.
This is an attempt to rework that by making it so you have to pass in a
schema to construct them.
That said there are downsides:
1. More boilerplate.
2. Requires that the schema passed into `FileScanConfig` and
`FileSource` match.
I feel like there's another twist to this needed... maybe moving the
schema out of `FileScanConfig`? That's not currently possible, it's used
in both places. Maybe having a `FileScan` and a `FileScanConfig` and
having construction be `FileScan::new(FileSource::new(config), config)`?
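The trade-off described above can be illustrated with a toy example. All types here (`Schema`, `LazySource`, `StrictSource`) are hypothetical stand-ins, not the actual `FileSource` or `FileScanConfig` types: the first variant defers the schema and can panic at runtime, the second requires it at construction so the missing-schema state cannot exist.

```rust
use std::sync::Arc;

/// Toy schema type, standing in for a real Arrow schema.
#[derive(Debug, PartialEq)]
pub struct Schema {
    pub fields: Vec<String>,
}

/// "Before" shape: the schema is optional and checked at runtime.
pub struct LazySource {
    schema: Option<Arc<Schema>>,
}

impl LazySource {
    pub fn new() -> Self {
        Self { schema: None }
    }

    pub fn with_schema(mut self, schema: Arc<Schema>) -> Self {
        self.schema = Some(schema);
        self
    }

    /// Panics if `with_schema` was never called.
    pub fn schema(&self) -> &Schema {
        self.schema.as_ref().expect("schema must be set")
    }
}

/// "After" shape: a source cannot be built without a schema.
pub struct StrictSource {
    schema: Arc<Schema>,
}

impl StrictSource {
    pub fn new(schema: Arc<Schema>) -> Self {
        Self { schema }
    }

    /// Infallible: the constructor guarantees the schema exists.
    pub fn schema(&self) -> &Schema {
        &self.schema
    }
}
```

The second downside listed above still applies to this sketch: nothing stops a caller from handing one schema to the source and a different one to the surrounding config.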
## Which issue does this PR close?

## Rationale for this change

A small fix for a rare case in the SLT runner where it panics instead of printing the result.

## What changes are included in this PR?

Code change in sqllogictest

## Are these changes tested?

Manual test

## Are there any user-facing changes?

No
)

## Which issue does this PR close?

## Rationale for this change

Complex expressions are not supported as prepared statement arguments.

## What changes are included in this PR?

Simplify the arguments of the prepared statement first.

## Are these changes tested?

UT

## Are there any user-facing changes?

No

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
…ache#18423)

## Which issue does this PR close?

- Closes apache#18278

## Rationale for this change

Convenience method for when parsing has already been done and we want to start from an expr object instead of a SQL string.

## What changes are included in this PR?

## Are these changes tested?

Added test

## Are there any user-facing changes?

Yes, new public API.
…#18565)

## Which issue does this PR close?

- part of apache#17558
- port of apache#18551

## Rationale for this change

Let's update the version numbers!

## What changes are included in this PR?

- forward port the change from apache#18551 to main

## Are these changes tested?

by CI

## Are there any user-facing changes?

New version
## Rationale for this change

Noticed while doing apache#18424 that the list types `List` and `FixedSizeList` use `MutableData` to build the reversed array. Using `take` turns out to be a lot faster, ~70% for both `List` and `FixedSizeList`. This PR also reworks the benchmark added in apache#18425, and these are the results compared to the implementation on main:

```
# cargo bench --bench array_reverse
   Compiling datafusion-functions-nested v50.3.0 (/Users/vegard/dev/datafusion/datafusion/functions-nested)
    Finished `bench` profile [optimized] target(s) in 42.08s
     Running benches/array_reverse.rs (target/release/deps/array_reverse-2c473eed34a53d0a)
Gnuplot not found, using plotters backend
Benchmarking array_reverse_list: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.3s, or reduce sample count to 70.
array_reverse_list      time:   [62.201 ms 62.551 ms 62.946 ms]
                        change: [−70.137% −69.965% −69.785%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

Benchmarking array_reverse_list_view: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.3s, or reduce sample count to 70.
array_reverse_list_view time:   [61.649 ms 61.905 ms 62.185 ms]
                        change: [−16.122% −15.623% −15.087%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

array_reverse_fixed_size_list
                        time:   [4.7936 ms 4.8292 ms 4.8741 ms]
                        change: [−76.435% −76.196% −75.951%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 20 outliers among 100 measurements (20.00%)
  8 (8.00%) low mild
  5 (5.00%) high mild
  7 (7.00%) high severe
```

## Are these changes tested?

Covered by existing sqllogic tests, and one new test for `FixedSizeList`.
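The "reverse via `take`" idea can be modeled in plain Rust without the Arrow crate. The sketch below is an assumption-laden illustration, not the PR's code: `reversed_take_indices` and `take` are hypothetical helpers over slices, and offsets are assumed to start at 0 and be non-decreasing, as in an Arrow list array.

```rust
/// Build gather indices that reverse the elements inside each list.
/// `offsets` has one more entry than there are lists; list `i` spans
/// `offsets[i]..offsets[i + 1]` in the flat values buffer.
fn reversed_take_indices(offsets: &[usize]) -> Vec<usize> {
    let mut indices = Vec::with_capacity(offsets.last().copied().unwrap_or(0));
    for window in offsets.windows(2) {
        // Walk each list's value range back to front.
        indices.extend((window[0]..window[1]).rev());
    }
    indices
}

/// Minimal gather: pick `values[i]` for every index, in order.
fn take<T: Clone>(values: &[T], indices: &[usize]) -> Vec<T> {
    indices.iter().map(|&i| values[i].clone()).collect()
}
```

Because the list lengths do not change, the original offsets can be reused for the result; only the flat values are gathered in a new order, in a single pass.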
## Which issue does this PR close?

This PR consolidates all the `custom_data_source` examples into a single example binary. We have agreed on the pattern and can apply it to the remaining examples.

- Part of apache#18142.

## Rationale for this change

## What changes are included in this PR?

## Are these changes tested?

## Are there any user-facing changes?

---------

Co-authored-by: Sergey Zhukov <szhukov@aligntech.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
## Which issue does this PR close?

- apache#17211

It's not yet clear to me if this will fully close the above issue, or if it's just the first step. I think there may be more work to do, so I'm not going to have this auto-close the issue.

## Rationale for this change

tl;dr of the issue: normalizing the access pattern(s) for objects for partitioned tables should not only reduce the number of requests to a backing object store, but will also allow any existing and/or future caching mechanisms to apply equally to both directory-partitioned and flat tables.

List request on `main`:

```sql
DataFusion CLI v50.2.0
> \object_store_profiling summary
ObjectStore Profile mode set to Summary
> CREATE EXTERNAL TABLE overture_partitioned STORED AS PARQUET LOCATION 's3://overturemaps-us-west-2/release/2025-09-24.0/';
0 row(s) fetched.
Elapsed 37.236 seconds.

Object Store Profiling
Instrumented Object Store: instrument_mode: Summary, inner: AmazonS3(overturemaps-us-west-2)
Summaries:
+-----------+----------+-----+-----+-----+-----+-------+
| Operation | Metric   | min | max | avg | sum | count |
+-----------+----------+-----+-----+-----+-----+-------+
| List      | duration |     |     |     |     | 1     |
| List      | size     |     |     |     |     | 1     |
+-----------+----------+-----+-----+-----+-----+-------+
Instrumented Object Store: instrument_mode: Summary, inner: AmazonS3(overturemaps-us-west-2)
Summaries:
+-----------+----------+-----------+-----------+-------------+-------------+-------+
| Operation | Metric   | min       | max       | avg         | sum         | count |
+-----------+----------+-----------+-----------+-------------+-------------+-------+
| Get       | duration | 0.044411s | 0.338399s | 0.104535s   | 162.133179s | 1551  |
| Get       | size     | 8 B       | 1285059 B | 338457.56 B | 524947683 B | 1551  |
| List      | duration |           |           |             |             | 3     |
| List      | size     |           |           |             |             | 3     |
+-----------+----------+-----------+-----------+-------------+-------------+-------+
> select count(*) from overture_partitioned;
+------------+
| count(*)   |
+------------+
| 4219677254 |
+------------+
1 row(s) fetched.
Elapsed 40.061 seconds.

Object Store Profiling
Instrumented Object Store: instrument_mode: Summary, inner: AmazonS3(overturemaps-us-west-2)
Summaries:
+-----------+----------+-----------+-----------+-------------+-------------+-------+
| Operation | Metric   | min       | max       | avg         | sum         | count |
+-----------+----------+-----------+-----------+-------------+-------------+-------+
| Get       | duration | 0.042554s | 0.453125s | 0.103147s   | 159.980835s | 1551  |
| Get       | size     | 8 B       | 1285059 B | 338457.56 B | 524947683 B | 1551  |
| List      | duration | 0.043498s | 0.196298s | 0.092462s   | 2.034174s   | 22    |
| List      | size     |           |           |             |             | 22    |
+-----------+----------+-----------+-----------+-------------+-------------+-------+
> select count(*) from overture_partitioned;
+------------+
| count(*)   |
+------------+
| 4219677254 |
+------------+
1 row(s) fetched.
Elapsed 0.924 seconds.

Object Store Profiling
Instrumented Object Store: instrument_mode: Summary, inner: AmazonS3(overturemaps-us-west-2)
Summaries:
+-----------+----------+-----------+-----------+-----------+-----------+-------+
| Operation | Metric   | min       | max       | avg       | sum       | count |
+-----------+----------+-----------+-----------+-----------+-----------+-------+
| List      | duration | 0.040526s | 0.161407s | 0.092792s | 2.041431s | 22    |
| List      | size     |           |           |           |           | 22    |
+-----------+----------+-----------+-----------+-----------+-----------+-------+
>
```

List requests for this PR:

```sql
DataFusion CLI v50.2.0
> \object_store_profiling summary
ObjectStore Profile mode set to Summary
> CREATE EXTERNAL TABLE overture_partitioned STORED AS PARQUET LOCATION 's3://overturemaps-us-west-2/release/2025-09-24.0/';
0 row(s) fetched.
Elapsed 33.962 seconds.

Object Store Profiling
Instrumented Object Store: instrument_mode: Summary, inner: AmazonS3(overturemaps-us-west-2)
Summaries:
+-----------+----------+-----+-----+-----+-----+-------+
| Operation | Metric   | min | max | avg | sum | count |
+-----------+----------+-----+-----+-----+-----+-------+
| List      | duration |     |     |     |     | 1     |
| List      | size     |     |     |     |     | 1     |
+-----------+----------+-----+-----+-----+-----+-------+
Instrumented Object Store: instrument_mode: Summary, inner: AmazonS3(overturemaps-us-west-2)
Summaries:
+-----------+----------+-----------+-----------+-------------+-------------+-------+
| Operation | Metric   | min       | max       | avg         | sum         | count |
+-----------+----------+-----------+-----------+-------------+-------------+-------+
| Get       | duration | 0.043832s | 0.342730s | 0.110505s   | 171.393509s | 1551  |
| Get       | size     | 8 B       | 1285059 B | 338457.56 B | 524947683 B | 1551  |
| List      | duration |           |           |             |             | 3     |
| List      | size     |           |           |             |             | 3     |
+-----------+----------+-----------+-----------+-------------+-------------+-------+
> select count(*) from overture_partitioned;
+------------+
| count(*)   |
+------------+
| 4219677254 |
+------------+
1 row(s) fetched.
Elapsed 38.119 seconds.

Object Store Profiling
Instrumented Object Store: instrument_mode: Summary, inner: AmazonS3(overturemaps-us-west-2)
Summaries:
+-----------+----------+-----------+-----------+-------------+-------------+-------+
| Operation | Metric   | min       | max       | avg         | sum         | count |
+-----------+----------+-----------+-----------+-------------+-------------+-------+
| Get       | duration | 0.043186s | 0.296394s | 0.099681s   | 154.605286s | 1551  |
| Get       | size     | 8 B       | 1285059 B | 338457.56 B | 524947683 B | 1551  |
| List      | duration |           |           |             |             | 1     |
| List      | size     |           |           |             |             | 1     |
+-----------+----------+-----------+-----------+-------------+-------------+-------+
> select count(*) from overture_partitioned;
+------------+
| count(*)   |
+------------+
| 4219677254 |
+------------+
1 row(s) fetched.
Elapsed 0.815 seconds.

Object Store Profiling
Instrumented Object Store: instrument_mode: Summary, inner: AmazonS3(overturemaps-us-west-2)
Summaries:
+-----------+----------+-----+-----+-----+-----+-------+
| Operation | Metric   | min | max | avg | sum | count |
+-----------+----------+-----+-----+-----+-----+-------+
| List      | duration |     |     |     |     | 1     |
| List      | size     |     |     |     |     | 1     |
+-----------+----------+-----+-----+-----+-----+-------+
>
```

List operations

| Action | `main` | this PR |
| ---- | ---- | ---- |
| Create Table | 3 | 3 |
| Cold-cache Query | 22 | 1 |
| Warm-cache Query | 22 | 1 |

## What changes are included in this PR?

- Refactored helpers related to listing, discovering, and pruning objects based on partitions to normalize the strategy between partitioned and flat tables

## Are these changes tested?

Yes. The internal methods that have been modified are covered by existing tests.

## Are there any user-facing changes?

No

## Additional Notes

I want to surface that I believe there is a chance for a performance _regression_ for certain queries against certain tables. One performance related mechanism the existing code implements, but this code currently omits, is (potentially) reducing the number of partitions listed based on query filters. In order for the existing code to exercise this optimization the query filters must contain all the path elements of a subdirectory as column filters.

E.g. given a table with a directory-partitioning structure like:

```
path/to/table/a=1/b=2/c=3/data.parquet
```

This query:

```sql
select count(*) from table where a=1 and b=2;
```

Will result in listing the following path:

```
LIST: path/to/table/a=1/b=2/
```

Whereas this query:

```sql
select count(*) from table where b=2;
```

Will result in listing the following path:

```
LIST: path/to/table/
```

I believe the real-world impact of this omission is likely minimal, at least when using high-latency storage such as S3 or other object stores, especially considering the existing implementation is likely to execute multiple sequential `LIST` operations due to its breadth-first search implementation.

The most likely configuration for a table that would be negatively impacted would be a table that holds many thousands of underlying objects (most cloud stores return recursive list requests with page sizes of many hundreds to thousands of objects) with a relatively shallow partition structure. I may be able to find or build a dataset that fulfills these criteria to test this assertion if there's concern about it.

I believe we could also augment the existing low-level `object_store` interactions to allow listing a prefix on a table, which would allow the same pruning of list operations with the code in this PR. The downside to this approach is it either complicates future caching efforts, or leads to cache fragmentation in a simpler cache implementation. I didn't include these changes in this PR to avoid the change set being too large.

## cc

@alamb

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
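The prefix-pruning behavior described in the notes above (a filter only narrows the listing if every leading partition column is also filtered) can be sketched as follows. `list_prefix` is a hypothetical helper written for this illustration; it handles only simple `column = value` filters and is not the code on `main` or in this PR.

```rust
/// Compute the narrowest path that can be listed, given the table's
/// partition columns (outermost first) and equality filters as
/// `(column, value)` pairs.
fn list_prefix(table_root: &str, partition_cols: &[&str], filters: &[(&str, &str)]) -> String {
    let mut prefix = format!("{}/", table_root.trim_end_matches('/'));
    for col in partition_cols {
        match filters.iter().find(|(name, _)| name == col) {
            // This path element is fixed by a filter: descend into it.
            Some((_, value)) => prefix.push_str(&format!("{col}={value}/")),
            // First unfiltered column: deeper filters cannot narrow the prefix.
            None => break,
        }
    }
    prefix
}
```

With partition columns `a`, `b`, `c`, the filters `a=1 and b=2` give `path/to/table/a=1/b=2/`, while a filter on `b` alone falls back to `path/to/table/`, matching the two `LIST` examples above.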
…18491)

## Which issue does this PR close?

- Closes apache#17027

## Rationale for this change

`output_batches` should be a common metric in all operators, thus should ideally be added to `BaselineMetrics`

```
> explain analyze select * from generate_series(1, 1000000) as t1(v1) order by v1 desc;
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type | plan |
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | SortExec: expr=[v1@0 DESC], preserve_partitioning=[false], metrics=[output_rows=1000000, elapsed_compute=535.320324ms, output_bytes=7.6 MB, output_batches=123, spill_count=0, spilled_bytes=0.0 B, spilled_rows=0, batches_split=0] |
| | ProjectionExec: expr=[value@0 as v1], metrics=[output_rows=1000000, elapsed_compute=208.379µs, output_bytes=7.7 MB, output_batches=123] |
| | LazyMemoryExec: partitions=1, batch_generators=[generate_series: start=1, end=1000000, batch_size=8192], metrics=[output_rows=1000000, elapsed_compute=15.924291ms, output_bytes=7.7 MB, output_batches=123] |
| | |
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.492 second
```

## What changes are included in this PR?

- Added `output_batches` into `BaselineMetrics` with `DEV` MetricType
  - Tracked through `record_poll()` API
  - Changes are similar to apache#18268
- Refactored `assert_metrics` macro to take multiple metrics strings for substring check
- Added `output_bytes` and `output_batches` tracking in `TopK` operator
- Added `baseline` metrics for `RepartitionExec`

## Are these changes tested?

Added UT

## Are there any user-facing changes?

Changes in the `EXPLAIN ANALYZE` output, `output_batches` will be added to `metrics=[...]`
…pr (apache#18532) ## Which issue does this PR close? - Closes apache#18504. ## Rationale for this change Followed suggestions to not update any public-facing APIs and put the lint rule in the appropriate spot. ## What changes are included in this PR? * Add `#![deny(clippy::needless_pass_by_value)]` and `#![cfg_attr(test, allow(clippy::needless_pass_by_value))]` to `lib.rs`. * Add `#[allow(clippy::needless_pass_by_value)]` to public functions * fix `rewrite_in_terms_of_projection()` and `get_exprs_except_skipped()` to use references per the lint suggestion ## Are these changes tested? Yes, though the same test failed even without changes to the public APIs: `test expr_rewriter::order_by::test::rewrite_sort_cols_by_agg_alias ... FAILED` I'll append the logs for your convenience: ``` failures: ---- expr_rewriter::order_by::test::rewrite_sort_cols_by_agg_alias stdout ---- running: 'c1 --> c1 -- column *named* c1 that came out of the projection, (not t.c1)' running: 'min(c2) --> "min(c2)" -- (column *named* "min(t.c2)"!)' thread 'expr_rewriter::order_by::test::rewrite_sort_cols_by_agg_alias' (27524241) panicked at datafusion/expr/src/expr_rewriter/order_by.rs:308:13: assertion `left == right` failed: input:Sort { expr: AggregateFunction(AggregateFunction { func: AggregateUDF { inner: Min { name: "min", signature: Signature { type_signature: VariadicAny, volatility: Immutable, parameter_names: None } } }, params: AggregateFunctionParams { args: [Column(Column { relation: None, name: "c2" })], distinct: false, filter: None, order_by: [], null_treatment: None } }), asc: true, nulls_first: true } rewritten:Sort { expr: Column(Column { relation: None, name: "min(t.c2)" }), asc: true, nulls_first: true } expected:Sort { expr: Column(Column { relation: Some(Bare { table: "min(t" }), name: "c2)" }), asc: true, nulls_first: true } left: Sort { expr: Column(Column { relation: None, name: "min(t.c2)" }), asc: true, nulls_first: true } right: Sort { expr: Column(Column { relation: 
Some(Bare { table: "min(t" }), name: "c2)" }), asc: true, nulls_first: true } note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace failures: expr_rewriter::order_by::test::rewrite_sort_cols_by_agg_alias ``` ## Are there any user-facing changes? No, all modifications were constrained to internal APIs. --------- Co-authored-by: Yongting You <2010youy01@gmail.com>
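For readers unfamiliar with the lint, the kind of change `clippy::needless_pass_by_value` asks for looks roughly like this. The two functions are hypothetical stand-ins, not the DataFusion functions touched by this commit.

```rust
// Before: takes ownership of `names` although it only reads them.
// Clippy would flag this under `needless_pass_by_value`; the `allow`
// mirrors how the commit keeps public signatures unchanged.
#[allow(clippy::needless_pass_by_value)]
fn total_len_by_value(names: Vec<String>) -> usize {
    names.iter().map(|n| n.len()).sum()
}

// After: borrows a slice, so callers keep ownership and avoid a clone.
fn total_len(names: &[String]) -> usize {
    names.iter().map(String::len).sum()
}
```

Both versions compute the same result; only the second lets a caller reuse its vector afterwards without cloning it.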
## Which issue does this PR close? ## Rationale for this change `get_field` doesn't support nested keys. ## What changes are included in this PR? Add support for nested keys in `get_field`. ## Are these changes tested? Yes, with unit tests. ## Are there any user-facing changes? No --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123. --> - Closes apache#16688. ## Rationale for this change <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> Currently DataFusion can only read Arrow files if they're in the File format, not the Stream format. I work with a bunch of Stream format files and wanted native support. ## What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> To accomplish the above, this PR splits the Arrow datasource into two separate implementations (`ArrowStream*` and `ArrowFile*`) with a facade on top to differentiate between the formats at query planning time. ## Are these changes tested? <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> Yes, there are end-to-end sqllogictests along with tests for the changes within datasource-arrow. ## Are there any user-facing changes? <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please add the `api change` label. --> Technically yes, in that we support a new format now. I'm not sure which documentation would need to be updated? 
--------- Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
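One way a facade could tell the two Arrow IPC layouts apart is by sniffing the leading bytes: the File format starts with the magic string `ARROW1`, while the Stream format does not. The sketch below relies only on that property; `ArrowIpcFormat` and `sniff_format` are hypothetical names, and whether the PR detects the format this way is an assumption.

```rust
/// Hypothetical classification of an Arrow IPC payload (names are illustrative).
#[derive(Debug, PartialEq)]
enum ArrowIpcFormat {
    File,
    Stream,
}

/// Magic bytes that open (and close) an Arrow IPC file.
const ARROW_MAGIC: &[u8] = b"ARROW1";

/// Classify a payload from its first bytes. Anything that does not start
/// with the file magic is treated as a stream; a real reader would still
/// need to validate the stream messages themselves.
fn sniff_format(header: &[u8]) -> ArrowIpcFormat {
    if header.starts_with(ARROW_MAGIC) {
        ArrowIpcFormat::File
    } else {
        ArrowIpcFormat::Stream
    }
}
```

Only the first six bytes of the object need to be read for this check, which is cheap enough to do once at planning time.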
…che#18581) Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.62.47 to 2.62.49. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's releases</a>.</em></p> <blockquote> <h2>2.62.49</h2> <ul> <li> <p>Update <code>cargo-binstall@latest</code> to 1.15.11.</p> </li> <li> <p>Update <code>cargo-auditable@latest</code> to 0.7.2.</p> </li> <li> <p>Update <code>vacuum@latest</code> to 0.20.2.</p> </li> </ul> <h2>2.62.48</h2> <ul> <li> <p>Update <code>mise@latest</code> to 2025.11.3.</p> </li> <li> <p>Update <code>cargo-audit@latest</code> to 0.22.0.</p> </li> <li> <p>Update <code>vacuum@latest</code> to 0.20.1.</p> </li> <li> <p>Update <code>uv@latest</code> to 0.9.8.</p> </li> <li> <p>Update <code>cargo-udeps@latest</code> to 0.1.60.</p> </li> <li> <p>Update <code>zizmor@latest</code> to 1.16.3.</p> </li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md">taiki-e/install-action's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> <p>All notable changes to this project will be documented in this file.</p> <p>This project adheres to <a href="https://semver.org">Semantic Versioning</a>.</p> <!-- raw HTML omitted --> <h2>[Unreleased]</h2> <h2>[2.62.49] - 2025-11-09</h2> <ul> <li> <p>Update <code>cargo-binstall@latest</code> to 1.15.11.</p> </li> <li> <p>Update <code>cargo-auditable@latest</code> to 0.7.2.</p> </li> <li> <p>Update <code>vacuum@latest</code> to 0.20.2.</p> </li> </ul> <h2>[2.62.48] - 2025-11-08</h2> <ul> <li> <p>Update <code>mise@latest</code> to 2025.11.3.</p> </li> <li> <p>Update <code>cargo-audit@latest</code> to 0.22.0.</p> </li> <li> <p>Update <code>vacuum@latest</code> to 0.20.1.</p> </li> <li> <p>Update <code>uv@latest</code> to 0.9.8.</p> </li> <li> <p>Update <code>cargo-udeps@latest</code> to 0.1.60.</p> </li> 
<li> <p>Update <code>zizmor@latest</code> to 1.16.3.</p> </li> </ul> <h2>[2.62.47] - 2025-11-05</h2> <ul> <li> <p>Update <code>vacuum@latest</code> to 0.20.0.</p> </li> <li> <p>Update <code>cargo-nextest@latest</code> to 0.9.111.</p> </li> <li> <p>Update <code>cargo-shear@latest</code> to 1.6.2.</p> </li> </ul> <h2>[2.62.46] - 2025-11-04</h2> <ul> <li> <p>Update <code>vacuum@latest</code> to 0.19.5.</p> </li> <li> <p>Update <code>syft@latest</code> to 1.37.0.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.11.2.</p> </li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/taiki-e/install-action/commit/44c6d64aa62cd779e873306675c7a58e86d6d532"><code>44c6d64</code></a> Release 2.62.49</li> <li><a href="https://github.com/taiki-e/install-action/commit/3a701df4c2a3e11596a1c5a65eb0e69c79ee4a82"><code>3a701df</code></a> Update <code>cargo-binstall@latest</code> to 1.15.11</li> <li><a href="https://github.com/taiki-e/install-action/commit/4242e04eb80c4492261074808c18d638aa247de0"><code>4242e04</code></a> Update <code>cargo-auditable@latest</code> to 0.7.2</li> <li><a href="https://github.com/taiki-e/install-action/commit/3df5533ef842d100d27dbd43c2fbd8aa0cccddcc"><code>3df5533</code></a> Update <code>vacuum@latest</code> to 0.20.2</li> <li><a href="https://github.com/taiki-e/install-action/commit/e797ba6a25dbd8669057e123b02812e16138589e"><code>e797ba6</code></a> Release 2.62.48</li> <li><a href="https://github.com/taiki-e/install-action/commit/bcf91e02acc5cc0ed84eac8d763b7328a3c7cd3f"><code>bcf91e0</code></a> Update <code>mise@latest</code> to 2025.11.3</li> <li><a href="https://github.com/taiki-e/install-action/commit/e78113b60c103d89241857d78e2610df1305cffd"><code>e78113b</code></a> Update <code>cargo-audit@latest</code> to 0.22.0</li> <li><a 
href="https://github.com/taiki-e/install-action/commit/0ef486444ebe65689986d037f4b61d8292b5a4ed"><code>0ef4864</code></a> Update <code>vacuum@latest</code> to 0.20.1</li> <li><a href="https://github.com/taiki-e/install-action/commit/5eda7b198531ad7024688974dd308f7ea0bd21aa"><code>5eda7b1</code></a> Update <code>uv@latest</code> to 0.9.8</li> <li><a href="https://github.com/taiki-e/install-action/commit/3853a413e6de756806bca9b522388e2d2b5abbd6"><code>3853a41</code></a> Update <code>cargo-udeps@latest</code> to 0.1.60</li> <li>Additional commits viewable in <a href="https://github.com/taiki-e/install-action/compare/6f9c7cc51aa54b13cbcbd12f8bbf69d8ba405b4b...44c6d64aa62cd779e873306675c7a58e86d6d532">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [maturin](https://github.com/pyo3/maturin) from 1.9.6 to 1.10.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/pyo3/maturin/releases">maturin's releases</a>.</em></p> <blockquote> <h2>v1.10.0</h2> <h2>What's Changed</h2> <ul> <li>Fix generated WHEEL Tag metadata to be spec compliant. by <a href="https://github.com/jsirois"><code>@jsirois</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2762">PyO3/maturin#2762</a></li> <li>Export all Cargo URL metadata items to Python by <a href="https://github.com/chrysn"><code>@chrysn</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2760">PyO3/maturin#2760</a></li> <li>Update maximum Python version to 3.14 by <a href="https://github.com/messense"><code>@messense</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2763">PyO3/maturin#2763</a></li> <li>Remove shebang from non-executable <strong>init</strong>.py file by <a href="https://github.com/musicinmybrain"><code>@musicinmybrain</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2775">PyO3/maturin#2775</a></li> <li>Stop warning about missing <code>extension-module</code> feature on pyo3 0.26+ by <a href="https://github.com/messense"><code>@messense</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2789">PyO3/maturin#2789</a></li> <li><code>--profile</code> conflicts with <code>--release</code> (and/or <code>--debug</code>) by <a href="https://github.com/davidhewitt"><code>@davidhewitt</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2793">PyO3/maturin#2793</a></li> <li>Bump MSRV to 1.83.0 by <a href="https://github.com/messense"><code>@messense</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2790">PyO3/maturin#2790</a></li> <li>respect CLI profile over pyproject.toml by <a href="https://github.com/davidhewitt"><code>@davidhewitt</code></a> in <a 
href="https://redirect.github.com/PyO3/maturin/pull/2794">PyO3/maturin#2794</a></li> <li>chore: add FreeBSD 14.3 amd64 sysconfig by <a href="https://github.com/fleetingbytes"><code>@fleetingbytes</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2805">PyO3/maturin#2805</a></li> <li>Add Cygwin support by <a href="https://github.com/lazka"><code>@lazka</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2819">PyO3/maturin#2819</a></li> <li>PyO3: do not add <code>extension-module</code> feature in template and tutorial by <a href="https://github.com/Tpt"><code>@Tpt</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2821">PyO3/maturin#2821</a></li> <li>Remove add_directory() from ModuleWriter trait by <a href="https://github.com/e-nomem"><code>@e-nomem</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2824">PyO3/maturin#2824</a></li> <li>Correct wheel naming when targeting iOS by <a href="https://github.com/freakboy3742"><code>@freakboy3742</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2827">PyO3/maturin#2827</a></li> <li>Add support for iOS cross-platform virtual environments by <a href="https://github.com/freakboy3742"><code>@freakboy3742</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2828">PyO3/maturin#2828</a></li> <li>add <code>editable-profile</code> option by <a href="https://github.com/davidhewitt"><code>@davidhewitt</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2826">PyO3/maturin#2826</a></li> <li>Make sdist reproducible by <a href="https://github.com/e-nomem"><code>@e-nomem</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2831">PyO3/maturin#2831</a></li> <li>always use "library" mode to generate uniffi bindings by <a href="https://github.com/davidhewitt"><code>@davidhewitt</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2840">PyO3/maturin#2840</a></li> <li>If an interpreter is 
available, use it, even when building ABI3. by <a href="https://github.com/freakboy3742"><code>@freakboy3742</code></a> in <a href="https://redirect.github.com/PyO3/maturin/pull/2829">PyO3/maturin#2829</a></li> </ul> <h2>New Contributors</h2> <ul> <li><a href="https://github.com/jsirois"><code>@jsirois</code></a> made their first contribution in <a href="https://redirect.github.com/PyO3/maturin/pull/2762">PyO3/maturin#2762</a></li> <li><a href="https://github.com/chrysn"><code>@chrysn</code></a> made their first contribution in <a href="https://redirect.github.com/PyO3/maturin/pull/2760">PyO3/maturin#2760</a></li> <li><a href="https://github.com/ddelange"><code>@ddelange</code></a> made their first contribution in <a href="https://redirect.github.com/PyO3/maturin/pull/2769">PyO3/maturin#2769</a></li> <li><a href="https://github.com/vvsagar"><code>@vvsagar</code></a> made their first contribution in <a href="https://redirect.github.com/PyO3/maturin/pull/2783">PyO3/maturin#2783</a></li> <li><a href="https://github.com/MatthijsKok"><code>@MatthijsKok</code></a> made their first contribution in <a href="https://redirect.github.com/PyO3/maturin/pull/2799">PyO3/maturin#2799</a></li> <li><a href="https://github.com/fleetingbytes"><code>@fleetingbytes</code></a> made their first contribution in <a href="https://redirect.github.com/PyO3/maturin/pull/2805">PyO3/maturin#2805</a></li> <li><a href="https://github.com/linkmauve"><code>@linkmauve</code></a> made their first contribution in <a href="https://redirect.github.com/PyO3/maturin/pull/2811">PyO3/maturin#2811</a></li> <li><a href="https://github.com/lazka"><code>@lazka</code></a> made their first contribution in <a href="https://redirect.github.com/PyO3/maturin/pull/2819">PyO3/maturin#2819</a></li> <li><a href="https://github.com/e-nomem"><code>@e-nomem</code></a> made their first contribution in <a href="https://redirect.github.com/PyO3/maturin/pull/2824">PyO3/maturin#2824</a></li> <li><a 
href="https://github.com/freakboy3742"><code>@freakboy3742</code></a> made their first contribution in <a href="https://redirect.github.com/PyO3/maturin/pull/2827">PyO3/maturin#2827</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/PyO3/maturin/compare/v1.9.6...v1.10.0">https://github.com/PyO3/maturin/compare/v1.9.6...v1.10.0</a></p> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/PyO3/maturin/blob/main/Changelog.md">maturin's changelog</a>.</em></p> <blockquote> <h2>[1.10.0]</h2> <ul> <li>Add <code>tool.maturin.editable-profile</code> option to override profile for editable package installations.</li> <li>Add support for Cygwin.</li> <li>When building <code>abi3</code> wheels on non-Windows platforms that aren't cross-compiling, the <code>sysconfigdata</code> of the interpreter used to run maturin will now be used, rather than a dummy interpreter.</li> <li>Allow iOS cross-platform virtual environments, such as those used by cibuildwheel, to imply an iOS target.</li> <li>Fix iOS wheel naming to be compliant with PEP 730.</li> <li>Fix generated WHEEL Tag metadata to be spec compliant.</li> <li>Fix incorrect warning about missing <code>extension-module</code> feature on PyO3 0.26+.</li> <li>Remove <code>add_directory()</code> from ModuleWriter and make it an implementation detail for the specific impl.</li> <li>Clear out uid/gid and set deterministic mtime for files in sdist.</li> <li>Always use "library" mode to build uniffi bindings.</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/PyO3/maturin/commit/c3093d1c1089a65e4baca7bb98b930ce0b297863"><code>c3093d1</code></a> release: 1.10.0 (<a href="https://redirect.github.com/pyo3/maturin/issues/2841">#2841</a>)</li> <li><a href="https://github.com/PyO3/maturin/commit/a41bc8654c7753106a50c52fbea6c51f63e18adb"><code>a41bc86</code></a> If an interpreter is available, use 
it, even when building ABI3. (<a href="https://redirect.github.com/pyo3/maturin/issues/2829">#2829</a>)</li> <li><a href="https://github.com/PyO3/maturin/commit/e75305205431319e1b9a70b5a76bb84e8bfa60bb"><code>e753052</code></a> always use "library" mode to generate uniffi bindings (<a href="https://redirect.github.com/pyo3/maturin/issues/2840">#2840</a>)</li> <li><a href="https://github.com/PyO3/maturin/commit/216b643ea45794107922308c71b352f06d27a163"><code>216b643</code></a> Update manylinux/musllinux policies to the latest main (<a href="https://redirect.github.com/pyo3/maturin/issues/2836">#2836</a>)</li> <li><a href="https://github.com/PyO3/maturin/commit/044ba832245e67924cb950e81597949ae14acb97"><code>044ba83</code></a> Revert "Upgrade goblin to 0.10" (<a href="https://redirect.github.com/pyo3/maturin/issues/2837">#2837</a>)</li> <li><a href="https://github.com/PyO3/maturin/commit/bb3d629fb79cc41e3fb71a75a86eed599a2d2643"><code>bb3d629</code></a> Upgrade goblin to 0.10 (<a href="https://redirect.github.com/pyo3/maturin/issues/2833">#2833</a>)</li> <li><a href="https://github.com/PyO3/maturin/commit/837549608af174d232b6f9e15f04ab3e77258bf5"><code>8375496</code></a> Use <code>serial_test</code> for tests that modifies env vars (<a href="https://redirect.github.com/pyo3/maturin/issues/2832">#2832</a>)</li> <li><a href="https://github.com/PyO3/maturin/commit/9dc2f5fc546d609436805e655b4c02b4ebf9287b"><code>9dc2f5f</code></a> Make sdist reproducible (<a href="https://redirect.github.com/pyo3/maturin/issues/2831">#2831</a>)</li> <li><a href="https://github.com/PyO3/maturin/commit/685efba876ad23417d506e40f21c8bebacb0c00f"><code>685efba</code></a> ci: bump to Python 3.14, update runners (<a href="https://redirect.github.com/pyo3/maturin/issues/2830">#2830</a>)</li> <li><a href="https://github.com/PyO3/maturin/commit/57aa6ed9663c70ebd57b78eebe7143b5fa3b0839"><code>57aa6ed</code></a> add <code>editable-profile</code> option (<a 
href="https://redirect.github.com/pyo3/maturin/issues/2826">#2826</a>)</li> <li>Additional commits viewable in <a href="https://github.com/pyo3/maturin/compare/v1.9.6...v1.10.0">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123. --> - Closes #. ## Rationale for this change `cargo-machete` identifies an unused dependency, which blocks a number of Dependabot update PRs (for example apache#18580). <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> ## What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> ## Are these changes tested? <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> ## Are there any user-facing changes? <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please add the `api change` label. -->
… require code changes. (apache#18586) ## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123. --> Part of apache#18503 ## Rationale for this change <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> Enforce the lint rule on all crates that already pass this extra check and therefore need no further code changes. ## What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> ## Are these changes tested? <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> ## Are there any user-facing changes? <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please add the `api change` label. --> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123. --> - Closes apache#18341. - Closes apache#9370 ## Rationale for this change <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> In some cases two `RepartitionExec` operators appear consecutively in the plan. This is unneeded overhead, and eliminating it provides speedups. Full Report: [The Physical Optimizer and Fixing Consecutive Repartitions In the Enforce Distribution Rule.pdf](https://github.com/user-attachments/files/23420831/The.Physical.Optimizer.and.Fixing.Consecutive.Repartitions.In.the.Enforce.Distribution.Rule.pdf) Issue Report: [Fixing Consecutive Repartitions In the Enforce Distribution Rule.pdf](https://github.com/user-attachments/files/23420880/Fixing.Consecutive.Repartitions.In.the.Enforce.Distribution.Rule.pdf) ## What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> - Changes the logic that adds repartitions in `enforce_distribution.rs` - Updates many tests and benchmarks to mirror the new behavior ## Are these changes tested? <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> Yes, benchmarked and tested; see the report for benchmark results. ## Are there any user-facing changes? 
<!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please add the `api change` label. --> --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
…8540) ## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123. --> - Closes apache#18155. ## Rationale for this change <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> ## What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> Merges the functionality of `CoalesceAsyncExecInput` into `CoalesceBatches` to remove redundant optimizer logic and simplify batch coalescing behavior. ## Are these changes tested? <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> Behavior is covered by the existing `CoalesceBatches` and optimizer tests. ## Are there any user-facing changes? No <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please add the `api change` label. -->
## Which issue does this PR close? - Part of apache#17558 - Related to apache#18549 ## Rationale for this change While working on apache#18549 I noticed that the release download page was out of date: https://datafusion.apache.org/download.html <img width="816" height="456" alt="Screenshot 2025-11-08 at 6 31 50 AM" src="https://github.com/user-attachments/assets/4678bc8e-0d85-4951-91af-3b1c0ee00a26" /> ## What changes are included in this PR? Update links (I will comment more inline) ## Are these changes tested? I manually rendered the page: <img width="846" height="869" alt="Screenshot 2025-11-08 at 6 39 00 AM" src="https://github.com/user-attachments/assets/e8312c7c-b4af-43f9-9767-ae234c08c928" /> ## Are there any user-facing changes? Yes, the download page is now up to date.
…e#18603)

## Which issue does this PR close?

- Closes apache#18375.

## What changes are included in this PR?

Enhances the help message shown for an invalid command and lists the `help` command in that message.

## Are these changes tested?

Manual test

<img width="737" height="190" alt="image" src="https://github.com/user-attachments/assets/073d4b96-db6f-448b-a7a4-3f9fa456a72e" />

## Are there any user-facing changes?

No
## Which issue does this PR close?

- Closes apache#18597

## Rationale for this change

A check was recently added to `invoke_with_args` (apache#17515) that compares the output type of the result with the expected output type from the UDF. Because the fast path does not add the timezone, the assertion added in that PR fails.

## What changes are included in this PR?

Include timezone information in the fast path.

## Are these changes tested?

Yes, added a unit test

## Are there any user-facing changes?

No
## Which issue does this PR close?

- Closes: apache#18606
- Relates to: apache#7001

## Rationale for this change

Move batch coalescing inside the filter exec. We can use `BatchCoalescer::push_batch_with_filter`, which should give a speedup compared to filtering individual batches and concatenating them afterwards.

## What changes are included in this PR?

Changes `FilterExec` to coalesce batches internally. This PR does not remove `CoalesceBatchesExec` from the plans; I plan to create an issue and a follow-up PR for that after this is merged. Until then, `CoalesceBatchesExec` should be mostly a no-op with limited overhead, as its input batches are already well-sized.

## Are these changes tested?

## Are there any user-facing changes?
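The rationale of the `FilterExec` change above (copy only the selected rows straight into the output buffer instead of filtering each batch and concatenating afterwards) can be sketched with a toy model. This is a minimal illustration under stated assumptions, not the arrow or DataFusion implementation: `Batch`, `ToyCoalescer`, and `filter_and_coalesce` are hypothetical names, a batch is reduced to a single `i64` column, and only the method name `push_batch_with_filter` is borrowed from the PR text.

```rust
/// Toy stand-in for a record batch: a single column of i64 values.
/// (Assumption for illustration; real batches are multi-column Arrow arrays.)
type Batch = Vec<i64>;

/// Hypothetical, simplified coalescer. NOT arrow's `BatchCoalescer`.
struct ToyCoalescer {
    target_rows: usize,
    buffer: Vec<i64>,
    completed: Vec<Batch>,
}

impl ToyCoalescer {
    fn new(target_rows: usize) -> Self {
        assert!(target_rows > 0, "target_rows must be positive");
        Self {
            target_rows,
            buffer: Vec::with_capacity(target_rows),
            completed: Vec::new(),
        }
    }

    /// Copies only the rows whose mask entry is true into the in-progress
    /// buffer, emitting a completed batch whenever the buffer reaches
    /// `target_rows`. No intermediate filtered batch is materialized.
    fn push_batch_with_filter(&mut self, batch: &[i64], mask: &[bool]) {
        assert_eq!(batch.len(), mask.len(), "mask must match batch length");
        for (value, keep) in batch.iter().zip(mask) {
            if *keep {
                self.buffer.push(*value);
                if self.buffer.len() == self.target_rows {
                    let full = std::mem::replace(
                        &mut self.buffer,
                        Vec::with_capacity(self.target_rows),
                    );
                    self.completed.push(full);
                }
            }
        }
    }

    /// Flushes any remaining rows as a final, possibly smaller, batch.
    fn finish(mut self) -> Vec<Batch> {
        if !self.buffer.is_empty() {
            self.completed.push(std::mem::take(&mut self.buffer));
        }
        self.completed
    }
}

/// Applies `pred` to every input batch and returns well-sized output batches.
fn filter_and_coalesce(
    batches: &[Batch],
    target_rows: usize,
    pred: impl Fn(i64) -> bool,
) -> Vec<Batch> {
    let mut coalescer = ToyCoalescer::new(target_rows);
    for batch in batches {
        let mask: Vec<bool> = batch.iter().map(|v| pred(*v)).collect();
        coalescer.push_batch_with_filter(batch, &mask);
    }
    coalescer.finish()
}
```

In this toy version, three input batches holding the values 1 through 10, filtered for even numbers with `target_rows = 3`, come out as two batches, `[2, 4, 6]` and `[8, 10]`, without ever building the small per-input filtered batches.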
## Which issue does this PR close?

- None, simply corrects two unit tests

## Rationale for this change

The `test_and_null_boolean_intervals` and `test_or_null_boolean_intervals` tests are a bit misleading. They try to create a 'null' interval using `Interval::try_new(ScalarValue::Boolean(None), ScalarValue::Boolean(None))`. The implementation of `try_new` normalises this to `Interval::UNCERTAIN`, which is not the same as 'null'.

## What changes are included in this PR?

Adds tests demonstrating:

- `Interval::try_new(ScalarValue::Boolean(None), ScalarValue::Boolean(None)) == Interval::UNCERTAIN`
- `Interval::UNCERTAIN` contains `ScalarValue::Boolean(Some(true))` and `ScalarValue::Boolean(Some(false))`
- `Interval::UNCERTAIN` does not contain `ScalarValue::Boolean(None)` or `ScalarValue::Null`

Renames `test_and_null_boolean_intervals` and `test_or_null_boolean_intervals`

## Are these changes tested?

Test only changes

## Are there any user-facing changes?

No
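The distinction this PR draws between `Interval::UNCERTAIN` and a 'null' interval can be illustrated with a small stand-alone model. The sketch below is not DataFusion's `Interval` type: `BoolInterval`, its `try_new`, and `contains_value` are hypothetical stand-ins, and widening a missing bound to `false` (lower) or `true` (upper) is an assumption chosen so that the three properties listed above hold.

```rust
/// Toy model of a closed boolean interval [lower, upper].
/// Hypothetical type for illustration; NOT DataFusion's `Interval`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BoolInterval {
    lower: bool,
    upper: bool,
}

/// The interval [false, true]: the value may be either true or false.
const UNCERTAIN: BoolInterval = BoolInterval { lower: false, upper: true };

impl BoolInterval {
    /// Assumed normalisation: a missing (None) bound is widened to the
    /// loosest possible bound, so (None, None) becomes [false, true].
    /// Returns None for the empty interval [true, false].
    fn try_new(lower: Option<bool>, upper: Option<bool>) -> Option<Self> {
        let lower = lower.unwrap_or(false);
        let upper = upper.unwrap_or(true);
        if lower && !upper {
            None
        } else {
            Some(Self { lower, upper })
        }
    }

    /// A concrete boolean is contained if it lies within the bounds
    /// (false < true). A null value (None) is never contained.
    fn contains_value(&self, value: Option<bool>) -> bool {
        match value {
            Some(v) => self.lower <= v && v <= self.upper,
            None => false,
        }
    }
}
```

In this model, `BoolInterval::try_new(None, None)` yields `Some(UNCERTAIN)`, which contains both `Some(true)` and `Some(false)` but not `None`, mirroring why "unbounded on both sides" is a statement about possible values rather than a null.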
…::or` tests (apache#18621)

## Which issue does this PR close?

None, unit test clarification

## Rationale for this change

The unit tests for `Interval::and` and `Interval::not` are written using a hard-to-interpret matrix of boolean values. It's much easier to scan them for correctness when they use the constants instead.

## What changes are included in this PR?

- Replace raw boolean values with constant references
- Ensure tests cover all permutations
- Add missing test for `Interval::or`

## Are these changes tested?

Test only change

## Are there any user-facing changes?

No
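To show why named constants make such truth tables easier to scan than raw boolean pairs, here is a self-contained sketch. It is a toy model, not DataFusion's code: `BoolInterval`, the three constants, and the `and`, `or`, and `not` methods are hypothetical, and the bound-wise definitions are an assumption about how boolean interval logic is usually defined.

```rust
/// Toy closed boolean interval [lower, upper]. Hypothetical type for
/// illustration; NOT DataFusion's `Interval`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BoolInterval {
    lower: bool,
    upper: bool,
}

const CERTAINLY_FALSE: BoolInterval = BoolInterval { lower: false, upper: false };
const UNCERTAIN: BoolInterval = BoolInterval { lower: false, upper: true };
const CERTAINLY_TRUE: BoolInterval = BoolInterval { lower: true, upper: true };

/// Every valid boolean interval, for enumerating all permutations.
const ALL: [BoolInterval; 3] = [CERTAINLY_FALSE, UNCERTAIN, CERTAINLY_TRUE];

impl BoolInterval {
    /// Bound-wise conjunction (assumed definition).
    fn and(self, other: Self) -> Self {
        Self {
            lower: self.lower && other.lower,
            upper: self.upper && other.upper,
        }
    }

    /// Bound-wise disjunction (assumed definition).
    fn or(self, other: Self) -> Self {
        Self {
            lower: self.lower || other.lower,
            upper: self.upper || other.upper,
        }
    }

    /// Negation swaps and inverts the bounds.
    fn not(self) -> Self {
        Self {
            lower: !self.upper,
            upper: !self.lower,
        }
    }
}

/// Expected AND results, rows and columns ordered as in `ALL`.
const AND_TABLE: [[BoolInterval; 3]; 3] = [
    [CERTAINLY_FALSE, CERTAINLY_FALSE, CERTAINLY_FALSE],
    [CERTAINLY_FALSE, UNCERTAIN, UNCERTAIN],
    [CERTAINLY_FALSE, UNCERTAIN, CERTAINLY_TRUE],
];

/// Expected OR results, rows and columns ordered as in `ALL`.
const OR_TABLE: [[BoolInterval; 3]; 3] = [
    [CERTAINLY_FALSE, UNCERTAIN, CERTAINLY_TRUE],
    [UNCERTAIN, UNCERTAIN, CERTAINLY_TRUE],
    [CERTAINLY_TRUE, CERTAINLY_TRUE, CERTAINLY_TRUE],
];

/// Checks every (left, right) permutation against the expected tables.
fn tables_hold() -> bool {
    (0..3).all(|i| {
        (0..3).all(|j| {
            ALL[i].and(ALL[j]) == AND_TABLE[i][j] && ALL[i].or(ALL[j]) == OR_TABLE[i][j]
        })
    })
}
```

Written this way, each table cell reads as a claim such as "`UNCERTAIN` and `CERTAINLY_TRUE` is `UNCERTAIN`", which is quicker to verify by eye than a pair of raw `true`/`false` bounds.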
## Which issue does this PR close?

apache#18158 introduced a regression in the table type created with `DataFrame::into_view()`

## Rationale for this change

Correct regression

## What changes are included in this PR?

One line fix

## Are these changes tested?

Unit test added

## Are there any user-facing changes?

None
… SQL planner based on Datatype (apache#18599)

## Which issue does this PR close?

- as discussed in apache#17261

## Rationale for this change

The logical plans for the Int64 and UInt64 datatypes differ: in the UInt64 logical plan the Unions are wrapped in a Projection, so the EliminateNestedUnion optimizer rule is not applied, leading to significantly longer execution time.

## What changes are included in this PR?

Separates the benchmarks by datatype and converts a datatype-specific function into a generic one.

## Are these changes tested?

Yes.

## Are there any user-facing changes?

No, benchmarks only.

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
## Which issue does this PR close?

Closes apache#18516, although it still does not allow `SELECT * WHERE ree_encoded_column_name = 'test'` because the necessary cast support is not available yet. See the issue for more details.

## What changes are included in this PR?

Adds type coercion logic for RunEndEncoded types to datafusion-expr-common. I've basically tried to copy what's done for dictionaries. It's quite possible I missed something!

## Are these changes tested?

Yes, adds new tests for type coercion of REE types to datafusion-expr-common.
Labels
catalog
Related to the catalog crate
common
Related to common crate
core
Core DataFusion crate
datasource
Changes to the datasource crate
development-process
Related to development process of DataFusion
documentation
Improvements or additions to documentation
functions
Changes to functions implementation
logical-expr
Logical plan and expressions
optimizer
Optimizer rules
physical-expr
Changes to the physical-expr crates
physical-plan
Changes to the physical-plan crate
proto
Related to proto crate
spark
sql
SQL Planner
sqllogictest
SQL Logic Tests (.slt)
substrait
Changes to the substrait crate