From d4ca4dfe12710a2d25e75fe434e38c4ac732dccd Mon Sep 17 00:00:00 2001 From: Matt Butrovich Date: Wed, 20 May 2026 16:05:18 -0400 Subject: [PATCH] add changelog --- dev/changelog/54.0.0.md | 916 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 916 insertions(+) create mode 100644 dev/changelog/54.0.0.md diff --git a/dev/changelog/54.0.0.md b/dev/changelog/54.0.0.md new file mode 100644 index 0000000000000..22382d868365e --- /dev/null +++ b/dev/changelog/54.0.0.md @@ -0,0 +1,916 @@ + + +# Apache DataFusion 54.0.0 Changelog + +This release consists of 727 commits from 139 contributors. See credits at the end of this changelog for more information. + +See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgrading.html) for information on how to upgrade from previous versions. + +**Breaking changes:** + +- Add `ExecutionPlan::apply_expressions()` [#20337](https://github.com/apache/datafusion/pull/20337) (LiaCastaneda) +- Add `Field` to `Expr::Cast` -- allow logical expressions to express a cast to an extension type [#18136](https://github.com/apache/datafusion/pull/18136) (paleolimbot) +- feat: parse `JsonAccess` as a binary operator, add `Operator::Colon` [#20628](https://github.com/apache/datafusion/pull/20628) (Samyak2) +- Wrap Arc to Statistics for `partition_statistics` API [#20570](https://github.com/apache/datafusion/pull/20570) (xudong963) +- Replace ahash with foldhash for faster hashing in datafusion-common [#20958](https://github.com/apache/datafusion/pull/20958) (Dandandan) +- fix: `arrays_zip/list_zip` allow single array argument [#21047](https://github.com/apache/datafusion/pull/21047) (hsiang-c) +- Remove file prefetching from FileStream [#20916](https://github.com/apache/datafusion/pull/20916) (Dandandan) +- Remove as_any from scalar UDF trait definition [#20812](https://github.com/apache/datafusion/pull/20812) (timsaucer) +- Provide session to the udtf call [#20222](https://github.com/apache/datafusion/pull/20222) (askalt) 
+- chore: remove as_any from aggregate and window functions [#21209](https://github.com/apache/datafusion/pull/21209) (timsaucer) +- chore: remove as_any from ExecutionPlan [#21263](https://github.com/apache/datafusion/pull/21263) (timsaucer) +- fix: Prefer numeric in type coercion for comparisons [#20426](https://github.com/apache/datafusion/pull/20426) (neilconway) +- refactor(pruning): remove column param from PruningStatistics::row_counts [#21369](https://github.com/apache/datafusion/pull/21369) (adriangb) +- Remove CastColumnExpr and custom_file_casts example; unify on field-aware CastExpr [#21563](https://github.com/apache/datafusion/pull/21563) (kosiew) +- perf: Optimize NULL handling in `StringViewArrayBuilder` [#21538](https://github.com/apache/datafusion/pull/21538) (neilconway) +- Remove `as_any` on the `PhysicalExpr` trait [#21573](https://github.com/apache/datafusion/pull/21573) (timsaucer) +- Remove trait function `as_any` from datafusion-datasource [#21576](https://github.com/apache/datafusion/pull/21576) (timsaucer) +- feat: change approx percentile/median UDFs to return floats [#21074](https://github.com/apache/datafusion/pull/21074) (theirix) +- chore: Rename concat-specific string builders, make pub(crate) [#21695](https://github.com/apache/datafusion/pull/21695) (neilconway) +- perf: Implement physical execution of uncorrelated scalar subqueries [#21240](https://github.com/apache/datafusion/pull/21240) (neilconway) +- Add lambda support and array_transform udf [#21679](https://github.com/apache/datafusion/pull/21679) (gstvg) +- perf: strength reduce hash partition modulo (up to 1.16x faster) [#21900](https://github.com/apache/datafusion/pull/21900) (Dandandan) +- feat: Improve InListExpr types, flatten dict haystacks and validate in try_new_from_array [#21402](https://github.com/apache/datafusion/pull/21402) (buraksenn) +- feat: type-keyed extensions map for PartitionedFile [#21993](https://github.com/apache/datafusion/pull/21993) (adriangb) +- 
Add support for lambda column capture [#21323](https://github.com/apache/datafusion/pull/21323) (gstvg) +- feat: Add Protobuf support for Explain node [#21994](https://github.com/apache/datafusion/pull/21994) (danielhumanmod) +- deprecate: mark Statistics V2 framework (PR #14699) as deprecated [#22071](https://github.com/apache/datafusion/pull/22071) (alamb) +- feat: impl Any for MemoryPool [#21803](https://github.com/apache/datafusion/pull/21803) (haohuaijin) +- Add metrics to `FFI_ExecutionPlan` [#22136](https://github.com/apache/datafusion/pull/22136) (mailmindlin) +- fix(aggregate): show aliased expr in explain [#21739](https://github.com/apache/datafusion/pull/21739) (kumarUjjawal) +- proto: serialize dynamic filters on Sort, Aggregate, HashJoin plan nodes [#22011](https://github.com/apache/datafusion/pull/22011) (jayshrivastava) +- Add exact HigherOrderSignature [#22326](https://github.com/apache/datafusion/pull/22326) (LiaCastaneda) +- Add a memory bound FileStatisticsCache for the Listing Table [#20047](https://github.com/apache/datafusion/pull/20047) (mkleen) +- Add configurable UNION DISTINCT to FILTER rewrite optimization [#21075](https://github.com/apache/datafusion/pull/21075) (xiedeyantu) +- minor: make HigherOrderSignature less error-prone [#22106](https://github.com/apache/datafusion/pull/22106) (gstvg) +- Expose `ExecutionPlan` statistics across the FFI boundary [#22157](https://github.com/apache/datafusion/pull/22157) (mailmindlin) +- feat: optional timezone for coerce_int96 [#22318](https://github.com/apache/datafusion/pull/22318) (andygrove) + +**Performance related:** + +- perf: Optimize `array_to_string` to avoid a copy [#20639](https://github.com/apache/datafusion/pull/20639) (neilconway) +- perf: Apply logical regexp optimizations to Utf8View and LargeUtf8 inputs [#20581](https://github.com/apache/datafusion/pull/20581) (petern48) +- perf: Optimize `array_concat` using `MutableArrayData` 
[#20620](https://github.com/apache/datafusion/pull/20620) (neilconway) +- perf: Optimize `to_char` to allocate less, fix NULL handling [#20635](https://github.com/apache/datafusion/pull/20635) (neilconway) +- Eliminate deterministic group by keys with deterministic transformations [#20706](https://github.com/apache/datafusion/pull/20706) (Dandandan) +- perf: short-circuit and collect_bool for IN list with column references [#20694](https://github.com/apache/datafusion/pull/20694) (zhangxffff) +- perf: sort replace free()->try_grow() pattern with try_resize() to reduce memory pool interactions [#20729](https://github.com/apache/datafusion/pull/20729) (mbutrovich) +- perf: Optimize set operations to avoid RowConverter deserialization overhead [#20623](https://github.com/apache/datafusion/pull/20623) (neilconway) +- perf: Use batched row conversion for `array_has_any`, `array_has_all` [#20588](https://github.com/apache/datafusion/pull/20588) (neilconway) +- perf: Optimize array set ops on sliced arrays [#20693](https://github.com/apache/datafusion/pull/20693) (neilconway) +- perf: Optimize comparison on nested types [#20716](https://github.com/apache/datafusion/pull/20716) (neilconway) +- perf: Optimize `array_positions()` for scalar needle [#20770](https://github.com/apache/datafusion/pull/20770) (neilconway) +- perf: Optimize `approx_distinct()` for string, binary inputs [#21037](https://github.com/apache/datafusion/pull/21037) (neilconway) +- perf: Optimize `approx_distinct` for inline Utf8View [#21064](https://github.com/apache/datafusion/pull/21064) (neilconway) +- perf: Optimize `strpos()` for scalar needle, plus optimize UTF-8 codepath [#20754](https://github.com/apache/datafusion/pull/20754) (neilconway) +- perf: Optimize `lpad()`, `rpad()` for scalar args [#20657](https://github.com/apache/datafusion/pull/20657) (neilconway) +- perf: add in-place fast path for ScalarValue::add [#20959](https://github.com/apache/datafusion/pull/20959) (kumarUjjawal) +- perf: 
Optimize `array_sort()` [#21083](https://github.com/apache/datafusion/pull/21083) (neilconway) +- Super fast extended tests and improved planning speed linux [#21084](https://github.com/apache/datafusion/pull/21084) (blaginin) +- Add a builder to `SimplifyContext` to avoid allocating default values [#21092](https://github.com/apache/datafusion/pull/21092) (AdamGS) +- Avoid creating new RecordBatches to simplify expressions [#20534](https://github.com/apache/datafusion/pull/20534) (alamb) +- perf: optimize scatter with type-specific specialization [#20498](https://github.com/apache/datafusion/pull/20498) (CuteChuanChuan) +- perf: Optimize `array_min`, `array_max` for arrays of primitive types [#21101](https://github.com/apache/datafusion/pull/21101) (neilconway) +- perf: optimize map validation for common key types [#20805](https://github.com/apache/datafusion/pull/20805) (lyne7-sc) +- perf: specialized SemiAntiSortMergeJoinStream [#20806](https://github.com/apache/datafusion/pull/20806) (mbutrovich) +- Improvement: keep order-preserving repartitions for streaming aggregates [#21107](https://github.com/apache/datafusion/pull/21107) (xudong963) +- perf: Add support for `GroupsAccumulator` to `string_agg` [#21154](https://github.com/apache/datafusion/pull/21154) (neilconway) +- perf: Optimize `split_part`, support `Utf8View` [#21119](https://github.com/apache/datafusion/pull/21119) (neilconway) +- perf: sort-merge join (SMJ) batch deferred filtering and move mark joins to bitwise stream. 
Near-unique LEFT and FULL SMJ 20-50x faster [#21184](https://github.com/apache/datafusion/pull/21184) (mbutrovich) +- perf: Optimize `string_to_array` for scalar args [#21131](https://github.com/apache/datafusion/pull/21131) (neilconway) +- Misc minor optimizations to query optimizer performance [#21128](https://github.com/apache/datafusion/pull/21128) (AdamGS) +- ensure dynamic filters are correctly pushed down through aggregations [#21059](https://github.com/apache/datafusion/pull/21059) (jayshrivastava) +- perf: Merge Precision in-place [#21219](https://github.com/apache/datafusion/pull/21219) (AdamGS) +- feat: support GroupsAccumulator for first_value and last_value with string/binary types [#21090](https://github.com/apache/datafusion/pull/21090) (UBarney) +- perf: Optimize `split_part` for scalar args [#21238](https://github.com/apache/datafusion/pull/21238) (neilconway) +- perf: optimize object store requests when reading JSON [#20823](https://github.com/apache/datafusion/pull/20823) (ariel-miculas) +- perf: Optimize `split_part` for `Utf8View` [#21420](https://github.com/apache/datafusion/pull/21420) (neilconway) +- Eliminate outer joins with empty relations via null-padded projection [#21321](https://github.com/apache/datafusion/pull/21321) (SubhamSinghal) +- Optimize `regexp_replace` by stripping trailing .\* from anchored patterns. 
2.4x improvement (ClickBench Q28) [#21379](https://github.com/apache/datafusion/pull/21379) (Dandandan) +- perf: use DynComparator in sort-merge join (SMJ), microbenchmark queries up to 12% faster, TPC-H overall ~5% faster [#21484](https://github.com/apache/datafusion/pull/21484) (mbutrovich) +- perf: Optimize NULL handling in `substr` [#21519](https://github.com/apache/datafusion/pull/21519) (neilconway) +- perf: replace SMJ's join_filter_not_matched_map HashMap with Vec [#21517](https://github.com/apache/datafusion/pull/21517) (mbutrovich) +- perf: Optimize NULL handling in `find_in_set` [#21464](https://github.com/apache/datafusion/pull/21464) (neilconway) +- perf: Optimize NULL handling in `lcm`, `gcd` [#21468](https://github.com/apache/datafusion/pull/21468) (neilconway) +- perf: Optimize NULL handling in `arrays_zip` [#21475](https://github.com/apache/datafusion/pull/21475) (neilconway) +- perf: Optimize NULL handling in `array_remove` [#21532](https://github.com/apache/datafusion/pull/21532) (neilconway) +- perf: Optimize NULL handling in `array_slice` [#21482](https://github.com/apache/datafusion/pull/21482) (neilconway) +- perf: Optimize NULL handling in some datetime functions [#21477](https://github.com/apache/datafusion/pull/21477) (neilconway) +- perf: Optimize NULL handling in `array_has` [#21471](https://github.com/apache/datafusion/pull/21471) (neilconway) +- perf: Optimize `Utf8View` string concat [#21535](https://github.com/apache/datafusion/pull/21535) (neilconway) +- Conditionally build page pruning predicates [#21480](https://github.com/apache/datafusion/pull/21480) (fpetkovski) +- perf: add fast path for uniform fill values in `array_resize` [#20617](https://github.com/apache/datafusion/pull/20617) (lyne7-sc) +- perf : Optimize count distinct using bitmaps instead of hashsets for smaller datatypes [#21456](https://github.com/apache/datafusion/pull/21456) (coderfender) +- perf: Optimize `left`, `right` to reduce copying 
[#21442](https://github.com/apache/datafusion/pull/21442) (neilconway) +- perf: Optimize `substr` for Utf8, LargeUtf8 [#21366](https://github.com/apache/datafusion/pull/21366) (neilconway) +- feat: Optimize ORDER BY by Pruning Functionally Redundant Sort Keys [#21362](https://github.com/apache/datafusion/pull/21362) (xiedeyantu) +- perf: Optimize logical optimizer's `OptimizeProjections` pass [#21726](https://github.com/apache/datafusion/pull/21726) (neilconway) +- perf: Optimize `DFSchema::qualified_name` [#21722](https://github.com/apache/datafusion/pull/21722) (neilconway) +- perf: Tweak vec capacity in `project_statistics` [#21734](https://github.com/apache/datafusion/pull/21734) (neilconway) +- perf: Reduce `Box` and `Arc` allocation churn during tree rewriting [#21749](https://github.com/apache/datafusion/pull/21749) (neilconway) +- perf: Implement groups accumulator count distinct primitive types [#21561](https://github.com/apache/datafusion/pull/21561) (coderfender) +- perf: Optimize approx count distinct using bitmaps instead of HLL for smaller int datatypes [#21453](https://github.com/apache/datafusion/pull/21453) (coderfender) +- perf: Optimize `lower`, `upper` for sliced arrays [#21814](https://github.com/apache/datafusion/pull/21814) (neilconway) +- perf: Add bulk NULL-aware string builders, use in `lower` and `upper` [#21789](https://github.com/apache/datafusion/pull/21789) (neilconway) +- perf: Use bulk-NULL builder in `uuid` [#21845](https://github.com/apache/datafusion/pull/21845) (neilconway) +- Skip map_expressions rebuild for Extension nodes with empty expressions [#21701](https://github.com/apache/datafusion/pull/21701) (zhuqi-lucas) +- Refactor InListExpr into static-filter modules [#21649](https://github.com/apache/datafusion/pull/21649) (geoffreyclaude) +- perf: Use bulk-NULL string builder in `initcap` [#21863](https://github.com/apache/datafusion/pull/21863) (neilconway) +- perf: Use bulk-NULL builder in `chr` 
[#21847](https://github.com/apache/datafusion/pull/21847) (neilconway) +- perf: implement convert_to_state for SparkAvg [#21548](https://github.com/apache/datafusion/pull/21548) (azhangd) +- perf: optimise `first_value`, `last_value` aggregate function [#21383](https://github.com/apache/datafusion/pull/21383) (theirix) +- perf(spark): use 256-entry byte-pair table in hex encoding [#21836](https://github.com/apache/datafusion/pull/21836) (Scolliq) +- perf: Optimize `substr_index` to use bulk-NULL string builder [#21877](https://github.com/apache/datafusion/pull/21877) (neilconway) +- perf: Use bulk-NULL builder in `replace` [#21849](https://github.com/apache/datafusion/pull/21849) (neilconway) +- Add SQL based benchmarking harness, port tpch to use framework [#21707](https://github.com/apache/datafusion/pull/21707) (Omega359) +- perf: Add `BulkNullStringArrayBuilder` trait, use in `repeat` [#21854](https://github.com/apache/datafusion/pull/21854) (neilconway) +- perf: optimize retract_batch for `median` and `percentile_cont` [#21894](https://github.com/apache/datafusion/pull/21894) (lyne7-sc) +- perf: Optimize `reverse` using bulk-NULL string builders [#21991](https://github.com/apache/datafusion/pull/21991) (neilconway) +- perf: Optimize `lower`, `upper` for ASCII inputs [#21980](https://github.com/apache/datafusion/pull/21980) (neilconway) +- perf: Cast entire Date32 array to Date64 on 1st failure [#21948](https://github.com/apache/datafusion/pull/21948) (huymq1710) +- perf: Use `NullBuffer::union_many` [#22070](https://github.com/apache/datafusion/pull/22070) (neilconway) +- perf: improve Int64 `generate_series` and `range` performance [#21891](https://github.com/apache/datafusion/pull/21891) (lyne7-sc) +- perf: batch contiguous extend calls in `array_replace` [#22119](https://github.com/apache/datafusion/pull/22119) (lyne7-sc) +- perf: Add `append_with` to string builders, use in `replace` [#22029](https://github.com/apache/datafusion/pull/22029) (neilconway) +- 
perf: reuse mask in `truncate_list_nulls` and avoid counting all true bits [#22158](https://github.com/apache/datafusion/pull/22158) (rluvaton) +- Skip RowFilter and page pruning for fully matched row groups [#21637](https://github.com/apache/datafusion/pull/21637) (xudong963) +- perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [#22172](https://github.com/apache/datafusion/pull/22172) (RyanJamesStewart) +- perf: Elimiate SortExec on generate_series() [#22238](https://github.com/apache/datafusion/pull/22238) (2010YOUY01) +- perf: coalesce batches before sending to distributor channels in RepartitionExec [#22010](https://github.com/apache/datafusion/pull/22010) (gabotechs) +- Resolve MIN/MAX from Parquet metadata for Single-mode aggregates and CAST projections [#21651](https://github.com/apache/datafusion/pull/21651) (Dandandan) +- Compact more aggressively in TopK based upon memory usage [#20381](https://github.com/apache/datafusion/pull/20381) (cetra3) + +**Implemented enhancements:** + +- feat: support nanosecond date_part [#20674](https://github.com/apache/datafusion/pull/20674) (mhilton) +- feat: Support Spark `array_contains` builtin function [#20685](https://github.com/apache/datafusion/pull/20685) (comphead) +- feat: Integrate CastColumnExpr into PhysicalExprAdapter [#20269](https://github.com/apache/datafusion/pull/20269) (kumarUjjawal) +- feat: `partition_statistics()` for HashJoinExec [#20711](https://github.com/apache/datafusion/pull/20711) (jonathanc-n) +- feat: make DefaultLogicalExtensionCodec support serialisation of buil… [#20638](https://github.com/apache/datafusion/pull/20638) (Acfboy) +- feat: correct struct column names for `arrays_zip` return type [#20886](https://github.com/apache/datafusion/pull/20886) (comphead) +- feat: Reduce allocations for aggregating `Statistics` [#20768](https://github.com/apache/datafusion/pull/20768) (jonathanc-n) +- feat: add `custom_string_literal_override` to unparser Dialect trait 
[#20590](https://github.com/apache/datafusion/pull/20590) (goldmedal) +- feat: Extract NDV (distinct_count) statistics from Parquet metadata [#19957](https://github.com/apache/datafusion/pull/19957) (asolimando) +- feat: support repartitioning of FFI execution plans [#20449](https://github.com/apache/datafusion/pull/20449) (timsaucer) +- feat: create a datafusion-example for in-memory file format [#20394](https://github.com/apache/datafusion/pull/20394) (kumarUjjawal) +- feat: implement PhysicalOptimizerRule in FFI crate [#20451](https://github.com/apache/datafusion/pull/20451) (timsaucer) +- feat(metric): Add output skewness metric to detect skewed plans easier [#21211](https://github.com/apache/datafusion/pull/21211) (2010YOUY01) +- feat: add sort pushdown benchmark and SLT tests [#21213](https://github.com/apache/datafusion/pull/21213) (zhuqi-lucas) +- feat(sql): unparse array_has as ANY for Postgres [#20654](https://github.com/apache/datafusion/pull/20654) (vimeh) +- feat: feature-gate `sqllogictests` datafusion-substrait behind optional 'substrait' feature [#21268](https://github.com/apache/datafusion/pull/21268) (zhuqi-lucas) +- feat: generate reversed-name data for sort pushdown benchmark [#21266](https://github.com/apache/datafusion/pull/21266) (zhuqi-lucas) +- feat: Complete basic `LATERAL JOIN` functionality [#21202](https://github.com/apache/datafusion/pull/21202) (neilconway) +- feat: Use NDV for equality filter selectivity calculation [#20789](https://github.com/apache/datafusion/pull/20789) (jonathanc-n) +- feat: make BatchPartitioner::partition_iter public [#21341](https://github.com/apache/datafusion/pull/21341) (hcrosse) +- feat: spark compatible float to timestamp cast with ANSI support [#21212](https://github.com/apache/datafusion/pull/21212) (coderfender) +- feat(spark): Adds spark round function [#21062](https://github.com/apache/datafusion/pull/21062) (SubhamSinghal) +- feat: make DataFrame::create_physical_plan take &self instead of self 
[#20562](https://github.com/apache/datafusion/pull/20562) (xanderbailey) +- feat: add support for parquet content defined chunking options [#21110](https://github.com/apache/datafusion/pull/21110) (kszucs) +- feat: sort file groups by statistics during sort pushdown (Sort pushdown phase 2) [#21182](https://github.com/apache/datafusion/pull/21182) (zhuqi-lucas) +- feat: Set NDV to Exact(1) for numeric equality filter predicates [#21077](https://github.com/apache/datafusion/pull/21077) (asolimando) +- feat: make sort pushdown BufferExec capacity configurable, default 1GB [#21426](https://github.com/apache/datafusion/pull/21426) (zhuqi-lucas) +- feat: Propagate orderings through struct-producing projections [#21218](https://github.com/apache/datafusion/pull/21218) (rkrishn7) +- feat: add cast_to_type UDF for type-based casting [#21322](https://github.com/apache/datafusion/pull/21322) (adriangb) +- feat: Add pluggable StatisticsRegistry for operator-level statistics propagation [#21483](https://github.com/apache/datafusion/pull/21483) (asolimando) +- feat: Add Hash trait to Aggregate enums [#21569](https://github.com/apache/datafusion/pull/21569) (rluvaton) +- feat(substrait): support Placeholder <-> DynamicParameter in Substrait producer/consumer [#20977](https://github.com/apache/datafusion/pull/20977) (bvolpato) +- feat: add `with_metadata` scalar UDF to attach Arrow field metadata [#21509](https://github.com/apache/datafusion/pull/21509) (adriangb) +- feat: Additional Canonical Extension Types [#21291](https://github.com/apache/datafusion/pull/21291) (tschwarzinger) +- feat: Add memory-limited execution for NestedLoopJoinExec [#21448](https://github.com/apache/datafusion/pull/21448) (viirya) +- feat(stats): cap NDV at row count in statistics estimation [#21081](https://github.com/apache/datafusion/pull/21081) (asolimando) +- feat: support `array_compact` builtin function [#21522](https://github.com/apache/datafusion/pull/21522) (comphead) +- feat: add a config to 
disable subquery_sort_elimination [#21614](https://github.com/apache/datafusion/pull/21614) (haohuaijin) +- feat: extend single ndv optimization to non-arithmetic supporting types for equality predicates [#21473](https://github.com/apache/datafusion/pull/21473) (buraksenn) +- feat: extend interval analysis support for temporal types [#21520](https://github.com/apache/datafusion/pull/21520) (buraksenn) +- feat: add sort_pushdown_inexact benchmark for RG reorder [#21674](https://github.com/apache/datafusion/pull/21674) (zhuqi-lucas) +- feat: support '>', '<', '>=', '<=', '<>' in all operator [#21416](https://github.com/apache/datafusion/pull/21416) (buraksenn) +- feat: Add support for `LEFT JOIN LATERAL` [#21352](https://github.com/apache/datafusion/pull/21352) (neilconway) +- feat: Expose used `MemoryPool` details in `ResourcesExhausted` error messages [#20387](https://github.com/apache/datafusion/pull/20387) (erenavsarogullari) +- feat: estimate cardinality for semi and anti-joins using distinct counts [#20904](https://github.com/apache/datafusion/pull/20904) (buraksenn) +- feat: support `ListView` and `LargeListView` in `ScalarValue` [#21669](https://github.com/apache/datafusion/pull/21669) (Jefffrey) +- feat: add cosine_distance scalar function [#21542](https://github.com/apache/datafusion/pull/21542) (crm26) +- feat: remove `__unnest_placeholder` from struct unnest projection [#21725](https://github.com/apache/datafusion/pull/21725) (akoshchiy) +- feat(unparser): Keep inner join `Filter → TableScan` predicates to `WHERE` instead of moving to `JOIN ON` [#21694](https://github.com/apache/datafusion/pull/21694) (sgrebnov) +- feat: minor lambda perf improvements [#21896](https://github.com/apache/datafusion/pull/21896) (comphead) +- feat: automatically cast `ListView` to `List` for UDFs [#21855](https://github.com/apache/datafusion/pull/21855) (Jefffrey) +- feat: support binary arguments for StringConcat operator 
[#21883](https://github.com/apache/datafusion/pull/21883) (theirix) +- feat: add inner_product scalar function [#21861](https://github.com/apache/datafusion/pull/21861) (crm26) +- feat: Support RIGHT/FULL joins in NLJ memory-limited execution [#21833](https://github.com/apache/datafusion/pull/21833) (viirya) +- feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec` [#21886](https://github.com/apache/datafusion/pull/21886) (huymq1710) +- feat: Making From conversions fallible with `TryFrom` [#21985](https://github.com/apache/datafusion/pull/21985) (Soham-Bhattacharjee-work) +- feat: support spark compatible floor function [#21933](https://github.com/apache/datafusion/pull/21933) (athlcode) +- feat: fix NTILE distribution logic [#22051](https://github.com/apache/datafusion/pull/22051) (comphead) +- feat: implement retract_batch for array_agg sliding window support [#22015](https://github.com/apache/datafusion/pull/22015) (SubhamSinghal) +- feat: Upgrade to sqlparser-rs 0.62.0 [#22069](https://github.com/apache/datafusion/pull/22069) (andygrove) +- feat: fix windows frame positive/neg overflows [#22140](https://github.com/apache/datafusion/pull/22140) (comphead) +- feat: fix AVG sliding windows wrong results with NULLs [#22139](https://github.com/apache/datafusion/pull/22139) (comphead) +- feat: fix windows decimal casting frame [#22174](https://github.com/apache/datafusion/pull/22174) (comphead) +- feat: eliminate GlobalLimitExec when input statistics prove limit is already satisfied [#22150](https://github.com/apache/datafusion/pull/22150) (xiedeyantu) +- feat: globally reorder files and row groups by statistics for TopK queries [#21956](https://github.com/apache/datafusion/pull/21956) (zhuqi-lucas) +- feat: Restore nullability when consuming substrait fields [#22105](https://github.com/apache/datafusion/pull/22105) (neilconway) +- feat: add array_normalize scalar function [#22013](https://github.com/apache/datafusion/pull/22013) 
(crm26) +- feat: add Spark-compatible xxhash64 function [#21967](https://github.com/apache/datafusion/pull/21967) (andygrove) + +**Fixed bugs:** + +- fix: make the `sql` feature truly optional [#20625](https://github.com/apache/datafusion/pull/20625) (linhr) +- fix: use try_shrink instead of shrink in try_resize [#20424](https://github.com/apache/datafusion/pull/20424) (ariel-miculas) +- fix: Provide more generic API for the capacity limit parsing [#20372](https://github.com/apache/datafusion/pull/20372) (erenavsarogullari) +- fix: Fix bug in `array_has` scalar path with sliced arrays [#20677](https://github.com/apache/datafusion/pull/20677) (neilconway) +- fix: `HashJoin` panic with String dictionary keys (don't flatten keys) [#20505](https://github.com/apache/datafusion/pull/20505) (alamb) +- fix: Return `probe_side.len()` for RightMark/Anti count(\*) queries [#20710](https://github.com/apache/datafusion/pull/20710) (jonathanc-n) +- fix: preserve None projection semantics across FFI boundary in ForeignTableProvider::scan [#20393](https://github.com/apache/datafusion/pull/20393) (Kontinuation) +- fix(spark): handle divide-by-zero in Spark `mod`/`pmod` with ANSI mode support [#20461](https://github.com/apache/datafusion/pull/20461) (davidlghellin) +- fix: sqllogictest cannot convert to Substrait [#19739](https://github.com/apache/datafusion/pull/19739) (kumarUjjawal) +- fix: interval analysis error when have two filterexec that inner filter proves zero selectivity [#20743](https://github.com/apache/datafusion/pull/20743) (haohuaijin) +- fix: SanityCheckPlan error with window functions and NVL filter [#20231](https://github.com/apache/datafusion/pull/20231) (EeshanBembi) +- fix: Avoid unnecessary type casts in `concat_ws` [#20436](https://github.com/apache/datafusion/pull/20436) (neilconway) +- fix: Remove `!=0` check from `supports_collect_by_thresholds` [#20730](https://github.com/apache/datafusion/pull/20730) (jonathanc-n) +- fix: do not recompute hash join exec 
properties if not required [#20900](https://github.com/apache/datafusion/pull/20900) (askalt) +- fix: Optimize `!~ '.*'` case to `col IS NULL AND Boolean(NULL)` instead of `Eq ""` [#20702](https://github.com/apache/datafusion/pull/20702) (petern48) +- fix: Track metrics in hash joins with empty build sides [#20810](https://github.com/apache/datafusion/pull/20810) (nuno-faria) +- fix: dfbench respects DATAFUSION_RUNTIME_MEMORY_LIMIT env var [#20631](https://github.com/apache/datafusion/pull/20631) (adriangb) +- fix(spark): return input string for PATH/FILE on schemeless URLs in `parse_url` [#20506](https://github.com/apache/datafusion/pull/20506) (davidlghellin) +- fix: InList Dictionary filter pushdown type mismatch [#20962](https://github.com/apache/datafusion/pull/20962) (erratic-pattern) +- fix: Run release verification with `--profile=ci` [#20987](https://github.com/apache/datafusion/pull/20987) (alamb) +- fix: move overflow guard before dense ratio in hash join to prevent overflows [#20998](https://github.com/apache/datafusion/pull/20998) (buraksenn) +- fix: improve GroupOrdering docs [#20994](https://github.com/apache/datafusion/pull/20994) (alamb) +- fix: update clickbench expected plan for NDV-aware optimization [#21050](https://github.com/apache/datafusion/pull/21050) (asolimando) +- fix: use datafusion_expr instead of datafusion crate in spark [#21043](https://github.com/apache/datafusion/pull/21043) (davidlghellin) +- Fix CTE reference resolution slt tests [#21049](https://github.com/apache/datafusion/pull/21049) (jonahgao) +- fix: validate wrapped negation during type coercion [#20965](https://github.com/apache/datafusion/pull/20965) (myandpr) +- fix(sql): handle GROUP BY ALL with aliased aggregates [#20943](https://github.com/apache/datafusion/pull/20943) (kumarUjjawal) +- fix: string_to_array('', delim) returns empty array for PostgreSQL compatibility [#21104](https://github.com/apache/datafusion/pull/21104) (dd-david-levin) +- Fix push_down_filter 
for children with non-empty fetch fields [#21057](https://github.com/apache/datafusion/pull/21057) (shivbhatia10) +- fix(stats): widen sum_value integer arithmetic to SUM-compatible types [#20865](https://github.com/apache/datafusion/pull/20865) (kumarUjjawal) +- fix: skip empty metadata in intersect_metadata_for_union to prevent s… [#21127](https://github.com/apache/datafusion/pull/21127) (RafaelHerrero) +- fix: Df int timestamp cast fix failing CI [#21163](https://github.com/apache/datafusion/pull/21163) (coderfender) +- fix(unparser): Fix BigQuery timestamp literal format in SQL unparsing [#21103](https://github.com/apache/datafusion/pull/21103) (sgrebnov) +- fix: propagate errors for unsupported table function arguments instead of silently dropping them [#21135](https://github.com/apache/datafusion/pull/21135) (buraksenn) +- fix: Fix `main` compilation failure [#21242](https://github.com/apache/datafusion/pull/21242) (2010YOUY01) +- fix: Revert "Fix/support duplicate column names #6543 (#21126)" [#21254](https://github.com/apache/datafusion/pull/21254) (mbutrovich) +- fix: Fix three bugs in query decorrelation [#21208](https://github.com/apache/datafusion/pull/21208) (neilconway) +- fix: date overflow panic [#21233](https://github.com/apache/datafusion/pull/21233) (haohuaijin) +- fix: `SELECT * EXCLUDE(...)` silently returns empty rows when all columns are excluded [#21259](https://github.com/apache/datafusion/pull/21259) (xiedeyantu) +- fix(unparser): use to_rfc3339 for default TIMESTAMPTZ formatting [#21295](https://github.com/apache/datafusion/pull/21295) (sgrebnov) +- fix: use spill writer's schema instead of the first batch schema for spill files [#21293](https://github.com/apache/datafusion/pull/21293) (gruuya) +- fix: binary string concat [#20787](https://github.com/apache/datafusion/pull/20787) (theirix) +- fix(sql): fix a bug when planning semi- or antijoins [#20990](https://github.com/apache/datafusion/pull/20990) (aalexandrov) +- fix(datasource): 
keep stats absent when collect_stats is false [#21149](https://github.com/apache/datafusion/pull/21149) (kumarUjjawal) +- fix: preserve source field metadata in TryCast expressions [#21390](https://github.com/apache/datafusion/pull/21390) (adriangb) +- fix: skips projection pruning for whole subtree [#20545](https://github.com/apache/datafusion/pull/20545) (Acfboy) +- fix: preserve subquery structure when unparsing SubqueryAlias over Ag… [#21099](https://github.com/apache/datafusion/pull/21099) (yonatan-sevenai) +- fix: FilterExec should drop projection when apply projection pushdown [#21460](https://github.com/apache/datafusion/pull/21460) (haohuaijin) +- fix: preserve duplicate GROUPING SETS rows [#21058](https://github.com/apache/datafusion/pull/21058) (xiedeyantu) +- fix: apply the left side schema on the right side in set expressions [#21052](https://github.com/apache/datafusion/pull/21052) (gruuya) +- fix: Use codepoints in `lpad`, `rpad`, `translate` [#21405](https://github.com/apache/datafusion/pull/21405) (neilconway) +- fix: PostgreSQL dialect can not support tinyint type [#21445](https://github.com/apache/datafusion/pull/21445) (xiedeyantu) +- fix: DataFusion benchmark panicked: failed to cast '2013-07-01' to UInt16 [#21498](https://github.com/apache/datafusion/pull/21498) (xiedeyantu) +- fix(sql): return planner error for malformed typed literals [#21454](https://github.com/apache/datafusion/pull/21454) (officialasishkumar) +- fix: Preserve quoted mixed-case identifiers in the `pivot_unpivot` example [#21432](https://github.com/apache/datafusion/pull/21432) (niebayes) +- fix(spark): array_repeat returns repeated NULLs instead of NULL when element is NULL [#21558](https://github.com/apache/datafusion/pull/21558) (buraksenn) +- fix: grouping with alias [#21438](https://github.com/apache/datafusion/pull/21438) (timsaucer) +- fix(spark): mod/pmod returns NULL instead of NaN for float division by zero [#21557](https://github.com/apache/datafusion/pull/21557) 
(buraksenn) +- fix: LazyMemoryExec should produce independent streams per execute() [#21565](https://github.com/apache/datafusion/pull/21565) (viirya) +- fix: json scan performance on local files [#21478](https://github.com/apache/datafusion/pull/21478) (ariel-miculas) +- fix(benchmarks): correct TPC-H benchmark SQL [#21615](https://github.com/apache/datafusion/pull/21615) (kumarUjjawal) +- fix: suppress nondeterministic metrics in agg_dyn_e2e sqllogictest [#21657](https://github.com/apache/datafusion/pull/21657) (mbutrovich) +- fix: Fix compilation error on `main` [#21664](https://github.com/apache/datafusion/pull/21664) (2010YOUY01) +- fix: `median` retract logic for sliding window frames [#21300](https://github.com/apache/datafusion/pull/21300) (lyne7-sc) +- fix: Fix Spark `slice` function `Null` type to `GenericListArray` casting issue [#20469](https://github.com/apache/datafusion/pull/20469) (erenavsarogullari) +- fix: Remove nested async block causing Stacked Borrows violation in PushDecoderStreamState [#21663](https://github.com/apache/datafusion/pull/21663) (mbutrovich) +- fix: impl `handle_child_pushdown_result` for `SortExec` [#21527](https://github.com/apache/datafusion/pull/21527) (haohuaijin) +- fix: SortMergeJoin full outer join incorrectly matches rows when filter evaluates to NULL [#21660](https://github.com/apache/datafusion/pull/21660) (mbutrovich) +- fix: try again to fix Miri in ParquetOpener [#21680](https://github.com/apache/datafusion/pull/21680) (mbutrovich) +- fix: `optimize_projections` failure after mark joins created by `EXISTS OR EXISTS` [#21265](https://github.com/apache/datafusion/pull/21265) (buraksenn) +- fix: import from `datafusion_expr` in `make_valid_utf8` [#21687](https://github.com/apache/datafusion/pull/21687) (hcrosse) +- fix: linearized operands in physical binaryexpr protobuf to avoid recursion limit [#21031](https://github.com/apache/datafusion/pull/21031) (haohuaijin) +- fix: remove unnecessary `as_any()` to fix 
compilation error [#21693](https://github.com/apache/datafusion/pull/21693) (Jefffrey) +- fix: Prevent CLI crash on wide tables [#21721](https://github.com/apache/datafusion/pull/21721) (Geethapranay1) +- fix(unparser): make `BigQueryDialect` more robust [#21296](https://github.com/apache/datafusion/pull/21296) (sgrebnov) +- fix: insert placeholder type inference showing wrong type when there is function wrapped placeholder (unknown type) [#20744](https://github.com/apache/datafusion/pull/20744) (buraksenn) +- fix: array_concat widens container variant for mixed List/LargeList inputs [#21704](https://github.com/apache/datafusion/pull/21704) (hcrosse) +- fix: Fix local `datafusion-cli` test failure [#21761](https://github.com/apache/datafusion/pull/21761) (2010YOUY01) +- fix: Validate spill read schema [#21738](https://github.com/apache/datafusion/pull/21738) (2010YOUY01) +- fix: improve sort pushdown benchmark data and add DESC LIMIT queries [#21711](https://github.com/apache/datafusion/pull/21711) (zhuqi-lucas) +- fix: rebind RecursiveQueryExec batches to the declared output schema [#21770](https://github.com/apache/datafusion/pull/21770) (adriangb) +- fix: Enable `arrow-ipc/zstd` in `datasource-arrow` to make `test_spill_compression` pass in every config [#21504](https://github.com/apache/datafusion/pull/21504) (AdamGS) +- fix: Do not highlight the CLI hint directly [#21858](https://github.com/apache/datafusion/pull/21858) (nuno-faria) +- fix: fix elapsed_compute metric in ParquetSink to report encoding time only [#21825](https://github.com/apache/datafusion/pull/21825) (fred1268) +- fix: grouping separator for float and decimal [#20268](https://github.com/apache/datafusion/pull/20268) (Druva-D) +- fix: Fix `.gitignore` in `benchmarks/` [#21954](https://github.com/apache/datafusion/pull/21954) (2010YOUY01) +- fix(proto): correctly serialize FilterExec empty projection [#21885](https://github.com/apache/datafusion/pull/21885) (Adez017) +- fix: Make conversion from 
FileDecryptionProperties to ConfigFileDecryptionProperties fallible [#21603](https://github.com/apache/datafusion/pull/21603) (adamreeve) +- fix: Avoid unnecessary input repartitioning with `ScalarSubqueryExec` [#21986](https://github.com/apache/datafusion/pull/21986) (neilconway) +- fix: error on CREATE EXTERNAL TABLE with no files and no explicit schema [#21965](https://github.com/apache/datafusion/pull/21965) (adriangb) +- fix: `median` returns Float64 for integer inputs to avoid truncation [#21988](https://github.com/apache/datafusion/pull/21988) (CuteChuanChuan) +- fix: Correct the number of pruned/matched Parquet pages [#22031](https://github.com/apache/datafusion/pull/22031) (nuno-faria) +- fix: use datafusion_expr instead of datafusion crate [#22052](https://github.com/apache/datafusion/pull/22052) (hsiang-c) +- fix(spark): align parse_url empty FILE path [#21969](https://github.com/apache/datafusion/pull/21969) (kumarUjjawal) +- fix: drop input plan early in `CoalescePartitionsExec` [#22017](https://github.com/apache/datafusion/pull/22017) (Samyak2) +- fix: track join_arrays memory in reservation after SMJ spill [#21962](https://github.com/apache/datafusion/pull/21962) (SubhamSinghal) +- fix: Avoid `overlay` panic on valid Unicode input, Postgres compatibility [#22046](https://github.com/apache/datafusion/pull/22046) (neilconway) +- fix: Panic in Spark's `format_string` for illegal characters [#22077](https://github.com/apache/datafusion/pull/22077) (neilconway) +- fix: Incorrect behavior for `FILTER` on NULLs [#22068](https://github.com/apache/datafusion/pull/22068) (neilconway) +- fix: coerce operand types in Interval mul/div/intersect/union/contains [#22027](https://github.com/apache/datafusion/pull/22027) (adriangb) +- fix(bench): avoid OOM in `array_replace` bench [#22120](https://github.com/apache/datafusion/pull/22120) (kumarUjjawal) +- fix: Nested self-referential CASE chains should not cause exponential hashing work during physical planning. 
[#22175](https://github.com/apache/datafusion/pull/22175) (avantgardnerio) +- fix: preserve Inexact precision in Statistics [#22146](https://github.com/apache/datafusion/pull/22146) (timsaucer) +- fix: Handle EXECUTE without statement name [#22204](https://github.com/apache/datafusion/pull/22204) (Dandandan) +- fix(sql): reject duplicate unqualified names in CTAS, CREATE VIEW, and SELECT INTO [#22290](https://github.com/apache/datafusion/pull/22290) (kumarUjjawal) +- fix: reduce memory allocation overhead during partial aggregation ear… [#22165](https://github.com/apache/datafusion/pull/22165) (ariel-miculas) +- fix: Fix bug with structurally equal correlated subqueries [#22313](https://github.com/apache/datafusion/pull/22313) (neilconway) +- fix: return error instead of capacity overflow panic in generate_series [#22323](https://github.com/apache/datafusion/pull/22323) (sweb) +- fix: simplifier on leaf nodes returns null [#22368](https://github.com/apache/datafusion/pull/22368) (timsaucer) + +**Documentation updates:** + +- Update DataFusion meetups page on docs [#20629](https://github.com/apache/datafusion/pull/20629) (alamb) +- docs: Update `datafusion-cli` doc for `top-memory-consumers` config [#20390](https://github.com/apache/datafusion/pull/20390) (erenavsarogullari) +- [main] Update version to 52.2.0 [#20573](https://github.com/apache/datafusion/pull/20573) (alamb) +- Update releases links with releases in 2025-2026 [#20630](https://github.com/apache/datafusion/pull/20630) (alamb) +- doc: Add more context to `Precision` [#20713](https://github.com/apache/datafusion/pull/20713) (jonathanc-n) +- Minor: Add comment explaining rationale to avoid dependencies on functions [#20667](https://github.com/apache/datafusion/pull/20667) (alamb) +- Hash join buffering on probe side [#19761](https://github.com/apache/datafusion/pull/19761) (gabotechs) +- Copy limits before repartitions [#20736](https://github.com/apache/datafusion/pull/20736) (avantgardnerio) +- Allow SQL 
`TypePlanner` to plan SQL types as extension types [#20676](https://github.com/apache/datafusion/pull/20676) (paleolimbot) +- doc: Add documentation for pushing limit into plan [#20271](https://github.com/apache/datafusion/pull/20271) (2010YOUY01) +- [main] Bump to 52.3.0 and changelog (#20790) [#20849](https://github.com/apache/datafusion/pull/20849) (alamb) +- refactor: Improve `SessionContext::parse_duration` API [#20816](https://github.com/apache/datafusion/pull/20816) (erenavsarogullari) +- docs: in release email, be specific about changelog location [#20975](https://github.com/apache/datafusion/pull/20975) (kevinjqliu) +- optimizer: Add configuration to disable join reordering [#21072](https://github.com/apache/datafusion/pull/21072) (2010YOUY01) +- docs: Improve getting started and testing guides for humans and agents [#20970](https://github.com/apache/datafusion/pull/20970) (alamb) +- docs: clarify NULL handling for array_remove functions (#21014) [#21018](https://github.com/apache/datafusion/pull/21018) (Xavrir) +- chore: Add `substr()` benchmarks, refactor [#20803](https://github.com/apache/datafusion/pull/20803) (neilconway) +- docs: Document the TableProvider evaluation order for filter, limit and projection [#21091](https://github.com/apache/datafusion/pull/21091) (alamb) +- Add `arrow_try_cast` UDF [#21130](https://github.com/apache/datafusion/pull/21130) (adriangb) +- docs: Add explicit fmt and clippy commands to AGENTS.md [#21171](https://github.com/apache/datafusion/pull/21171) (zhuqi-lucas) +- docs: add KalamDB to known users [#21181](https://github.com/apache/datafusion/pull/21181) (jamals86) +- [main] Update version to 53.0.0 and bring changelog [#21189](https://github.com/apache/datafusion/pull/21189) (alamb) +- Migrate Avro reader to arrow-avro and remove internal conversion code [#17861](https://github.com/apache/datafusion/pull/17861) (getChan) +- Add metric category filtering for EXPLAIN ANALYZE 
[#21160](https://github.com/apache/datafusion/pull/21160) (adriangb) +- docs: Add `RESET` Command Documentation [#21245](https://github.com/apache/datafusion/pull/21245) (erenavsarogullari) +- chore: fix upgrade guide link for object_store release notes [#21283](https://github.com/apache/datafusion/pull/21283) (haohuaijin) +- doc: Add documentation explaining the behavior of `null` values in struct comparisons [#21226](https://github.com/apache/datafusion/pull/21226) (xiedeyantu) +- [docs] Add weekly sync details to contributor communication guide [#21298](https://github.com/apache/datafusion/pull/21298) (alamb) +- [docs] add sql example to timestamp/datetime docs for time zone [#21082](https://github.com/apache/datafusion/pull/21082) (buraksenn) +- Update documentation with recent blogs and events [#21462](https://github.com/apache/datafusion/pull/21462) (alamb) +- Update 53 upgrade guide to note release, other changes [#21449](https://github.com/apache/datafusion/pull/21449) (alamb) +- docs: Incorporate writing table provider blog post to user documentation [#21398](https://github.com/apache/datafusion/pull/21398) (buraksenn) +- remove as_any from TableProvider, SchemaProvider, CatalogProvider, and CatalogProviderList [#21346](https://github.com/apache/datafusion/pull/21346) (timsaucer) +- port 52.5.0 changelog to main [#21553](https://github.com/apache/datafusion/pull/21553) (alamb) +- Add `arrow_field(expr)` scalar UDF [#21389](https://github.com/apache/datafusion/pull/21389) (adriangb) +- Reorder `cargo publish` commands by dependency [#21552](https://github.com/apache/datafusion/pull/21552) (alamb) +- chore(deps): update jinja2 requirement from <4,>=3.1 to >=3.1.6,<4 in /docs [#21606](https://github.com/apache/datafusion/pull/21606) (dependabot[bot]) +- chore(deps): update pydata-sphinx-theme requirement from <1,>=0.16 to >=0.17.0,<1 in /docs [#21609](https://github.com/apache/datafusion/pull/21609) (dependabot[bot]) +- Add release management page to the
documentation [#21001](https://github.com/apache/datafusion/pull/21001) (alamb) +- Perf: Window topn optimisation [#21479](https://github.com/apache/datafusion/pull/21479) (SubhamSinghal) +- chore(deps): update setuptools requirement from <83,>=82 to >=82.0.1,<83 in /docs [#21607](https://github.com/apache/datafusion/pull/21607) (dependabot[bot]) +- chore(deps): update maturin requirement from <2,>=1.11 to >=1.13.1,<2 in /docs [#21608](https://github.com/apache/datafusion/pull/21608) (dependabot[bot]) +- docs: Update `map_extract` examples [#21360](https://github.com/apache/datafusion/pull/21360) (nuno-faria) +- docs: add April 2026 readings and meetup links [#21644](https://github.com/apache/datafusion/pull/21644) (alamb) +- chore: backport version from `branch-53`, update some dependencies [#21708](https://github.com/apache/datafusion/pull/21708) (comphead) +- chore: add `array_remove_*` NULL handling changes to `Upgrade Guide` [#21769](https://github.com/apache/datafusion/pull/21769) (comphead) +- docs: fix some comments on query_planning example [#21783](https://github.com/apache/datafusion/pull/21783) (jotare) +- docs: fix typos in documentation [#21875](https://github.com/apache/datafusion/pull/21875) (jx2lee) +- docs: refresh CLI usage output in the user guide [#21874](https://github.com/apache/datafusion/pull/21874) (jx2lee) +- docs: clarify ExecutionProps and TaskContext docs [#21872](https://github.com/apache/datafusion/pull/21872) (alamb) +- chore: add internal markdown link check [#21831](https://github.com/apache/datafusion/pull/21831) (Geethapranay1) +- Update documentation for PhysicalExpr::evaluate_bounds [#21879](https://github.com/apache/datafusion/pull/21879) (alamb) +- chore(deps): update pydata-sphinx-theme requirement from <1,>=0.17.0 to >=0.17.1,<1 in /docs [#21889](https://github.com/apache/datafusion/pull/21889) (dependabot[bot]) +- docs(optimizer): add generated optimizer rules reference 
[#21824](https://github.com/apache/datafusion/pull/21824) (kumarUjjawal) +- add any_match higher-order function [#21903](https://github.com/apache/datafusion/pull/21903) (LiaCastaneda) +- docs: update commiter list [#21978](https://github.com/apache/datafusion/pull/21978) (coderfender) +- chore: update PMC/committer list [#21989](https://github.com/apache/datafusion/pull/21989) (comphead) +- Support '0' value for parse_capacity_limit() [#22014](https://github.com/apache/datafusion/pull/22014) (mkleen) +- docs: add llms.txt ecosystem hub at site root [#22003](https://github.com/apache/datafusion/pull/22003) (timsaucer) +- chore(deps): update maturin requirement from <2,>=1.13.1 to >=1.13.3,<2 in /docs [#22127](https://github.com/apache/datafusion/pull/22127) (dependabot[bot]) +- fix `date_part('isodow')` [#22116](https://github.com/apache/datafusion/pull/22116) (sdf-jkl) +- docs: updating arrays_zip output field naming [#22133](https://github.com/apache/datafusion/pull/22133) (timsaucer) +- Add rand() alias for random() [#22147](https://github.com/apache/datafusion/pull/22147) (xiedeyantu) +- chore: Update Rust toolchain to 1.95 [#22177](https://github.com/apache/datafusion/pull/22177) (Dandandan) +- docs: add DataFusion Java to subproject listings [#22149](https://github.com/apache/datafusion/pull/22149) (andygrove) +- Fix: deadlink in "Concepts, Reading, Events" page to DataFusion blog [#22325](https://github.com/apache/datafusion/pull/22325) (JarroVGIT) +- fixing factorial negative values [#22278](https://github.com/apache/datafusion/pull/22278) (raushanprabhakar1) +- docs(optimizer): Fix PushDownFilter doc typos. 
[#22320](https://github.com/apache/datafusion/pull/22320) (JSOD11) +- minor: add higher-order function methods to SessionContext [#21950](https://github.com/apache/datafusion/pull/21950) (gstvg) +- Add higher-order functions changes to upgrade guide [#22107](https://github.com/apache/datafusion/pull/22107) (gstvg) +- chore(deps): update myst-parser requirement from <6,>=5 to >=5.1.0,<6 in /docs [#22378](https://github.com/apache/datafusion/pull/22378) (dependabot[bot]) +- feat(functions-nested): add array_filter higher-order function [#21895](https://github.com/apache/datafusion/pull/21895) (ologlogn) +- Add SQL as a category in breaking API change policy [#22179](https://github.com/apache/datafusion/pull/22179) (alamb) + +**Other:** + +- Add metrics for parquet sink [#20307](https://github.com/apache/datafusion/pull/20307) (xudong963) +- Extend dynamic filter to joins that preserve probe side ON [#20447](https://github.com/apache/datafusion/pull/20447) (helgikrs) +- Improve sqllogicteset speed by creating only a single large file rather than 2 [#20586](https://github.com/apache/datafusion/pull/20586) (Tim-53) +- cli: Fix datafusion-cli hint edge cases [#20609](https://github.com/apache/datafusion/pull/20609) (comphead) +- Speedup sqllogictests by running long running tests first [#20576](https://github.com/apache/datafusion/pull/20576) (alamb) +- Fix custom metric display [#20643](https://github.com/apache/datafusion/pull/20643) (gabotechs) +- refactor: Set expected runtime config in error message when the used disk space during the spilling process has exceeded the allocation limit [#20375](https://github.com/apache/datafusion/pull/20375) (erenavsarogullari) +- more families for the CI [#20663](https://github.com/apache/datafusion/pull/20663) (blaginin) +- CI: Add CodeQL workflow for GitHub Actions security scanning [#20636](https://github.com/apache/datafusion/pull/20636) (kevinjqliu) +- chore(deps): bump astral-sh/setup-uv from 7.3.0 to 7.3.1 
[#20660](https://github.com/apache/datafusion/pull/20660) (dependabot[bot]) +- chore(deps): bump taiki-e/install-action from 2.68.8 to 2.68.16 [#20661](https://github.com/apache/datafusion/pull/20661) (dependabot[bot]) +- Improve formatting of datatypes [#20605](https://github.com/apache/datafusion/pull/20605) (emilk) +- Add explain plans for ClickBench queries [#20666](https://github.com/apache/datafusion/pull/20666) (alamb) +- Add files_processed and files_scanned metrics to FileStreamMetrics [#20592](https://github.com/apache/datafusion/pull/20592) (adriangb) +- Speedup push_down_filter_regression.slt by using uncompressed parquet [#20652](https://github.com/apache/datafusion/pull/20652) (alamb) +- Implement cardinality_effect for window execs and UnionExec [#20321](https://github.com/apache/datafusion/pull/20321) (getChan) +- ci: Harden labeler workflow, remove unnecessary checkout from pull_request_target job [#20637](https://github.com/apache/datafusion/pull/20637) (kevinjqliu) +- Add tests for sqllogictest prioritization [#20656](https://github.com/apache/datafusion/pull/20656) (alamb) +- correct parquet leaf index mapping when schema contains struct cols [#20698](https://github.com/apache/datafusion/pull/20698) (friendlymatthew) +- Reattach parquet metadata cache after deserializing in datafusion-proto [#20574](https://github.com/apache/datafusion/pull/20574) (nathanb9) +- Wire up with_new_state with DataSource [#20718](https://github.com/apache/datafusion/pull/20718) (gabotechs) +- chore: Enable `assigning_clones` clippy lint [#20670](https://github.com/apache/datafusion/pull/20670) (neilconway) +- FFI_TableOptions are using default values only [#20721](https://github.com/apache/datafusion/pull/20721) (timsaucer) +- Improve documentation for `AggregateUdfImpl::simplify` and `WindowUDFImpl::simplify` [#20712](https://github.com/apache/datafusion/pull/20712) (alamb) +- Fix test that's broken on Windows due to naive path handling 
[#20692](https://github.com/apache/datafusion/pull/20692) (Rafferty97) +- Fix DELETE/UPDATE filter extraction when predicates are pushed down into TableScan [#19884](https://github.com/apache/datafusion/pull/19884) (kosiew) +- use linker optimization for extended sqllogictests [#20740](https://github.com/apache/datafusion/pull/20740) (blaginin) +- Push even local limits past windows [#20752](https://github.com/apache/datafusion/pull/20752) (avantgardnerio) +- Add case-heavy LEFT JOIN benchmark and debug timing/logging for PushDownFilter hot paths [#20664](https://github.com/apache/datafusion/pull/20664) (kosiew) +- Fix repartition from dropping data when spilling [#20672](https://github.com/apache/datafusion/pull/20672) (xanderbailey) +- test: Add `datafusion-cli` `fair` and `unbounded` memory-pool test coverage [#20565](https://github.com/apache/datafusion/pull/20565) (erenavsarogullari) +- ser/de fetch in FilterExec [#20738](https://github.com/apache/datafusion/pull/20738) (haohuaijin) +- Add tests for simplifying multiple aggregate expressions [#20723](https://github.com/apache/datafusion/pull/20723) (alamb) +- Update reverse UDF to emit utf8view when input is utf8view [#20604](https://github.com/apache/datafusion/pull/20604) (Omega359) +- Make lower and upper emit Utf8View for Utf8View input [#20616](https://github.com/apache/datafusion/pull/20616) (kumarUjjawal) +- Fix FilterExec converting Absent column stats to Exact(NULL) [#20391](https://github.com/apache/datafusion/pull/20391) (fwojciec) +- Clean up date_part preimage implementation [#20350](https://github.com/apache/datafusion/pull/20350) (sdf-jkl) +- Make Physical CastExpr Field-aware and unify cast semantics across physical expressions [#20814](https://github.com/apache/datafusion/pull/20814) (kosiew) +- Pass ConfigOptions to scalar UDFs via FFI [#20454](https://github.com/apache/datafusion/pull/20454) (timsaucer) +- [datafusion-cli] Replace mutex with AtomicU64 for stream duration tracking in 
instrumentedObjectStore [#20802](https://github.com/apache/datafusion/pull/20802) (buraksenn) +- Make translate emit Utf8View for Utf8View input [#20624](https://github.com/apache/datafusion/pull/20624) (shivaaang) +- Allow filters on struct fields to be pushed down into Parquet scan [#20822](https://github.com/apache/datafusion/pull/20822) (friendlymatthew) +- Used constant with mapping instead of write! to display scalar value bytes [#20719](https://github.com/apache/datafusion/pull/20719) (buraksenn) +- chore(deps): bump taiki-e/install-action from 2.68.16 to 2.68.25 [#20842](https://github.com/apache/datafusion/pull/20842) (dependabot[bot]) +- chore(deps): bump github/codeql-action from 4.32.5 to 4.32.6 [#20843](https://github.com/apache/datafusion/pull/20843) (dependabot[bot]) +- chore: Ignore RUSTSEC-2024-0421 [#20850](https://github.com/apache/datafusion/pull/20850) (comphead) +- chore(deps): bump quinn-proto from 0.11.13 to 0.11.14 [#20859](https://github.com/apache/datafusion/pull/20859) (dependabot[bot]) +- Use `ParquetPushDecoder` in `ParquetOpener` [#20839](https://github.com/apache/datafusion/pull/20839) (Dandandan) +- [Minor] Remove redundant ProjectionExec nodes in sort-based plans [#20780](https://github.com/apache/datafusion/pull/20780) (Dandandan) +- impl ser/de for preserve_order in RepartitionExec [#20798](https://github.com/apache/datafusion/pull/20798) (haohuaijin) +- Fix FileStream scanning_total to include sync next-file open time [#20627](https://github.com/apache/datafusion/pull/20627) (RatulDawar) +- chore: Ignore RUSTSEC-2024-0014 [#20862](https://github.com/apache/datafusion/pull/20862) (comphead) +- chore: clean up dependencies [#20861](https://github.com/apache/datafusion/pull/20861) (comphead) +- Add benchmark for struct field filter pushdown in Parquet [#20829](https://github.com/apache/datafusion/pull/20829) (friendlymatthew) +- Add Null Type Coercions for Placeholders [#20543](https://github.com/apache/datafusion/pull/20543) 
(cetra3) +- Minor: Deprecate unused `PartitionedFileStream` [#20869](https://github.com/apache/datafusion/pull/20869) (alamb) +- chore(deps): bump substrait from 0.62 to 0.63.0 [#20876](https://github.com/apache/datafusion/pull/20876) (benbellick) +- [Minor] propagate distinct_count as inexact through unions [#20846](https://github.com/apache/datafusion/pull/20846) (buraksenn) +- try to remove redundant alias in expression rewriter and select [#20867](https://github.com/apache/datafusion/pull/20867) (buraksenn) +- Fix duplicate group keys after hash aggregation spill (#20724) [#20858](https://github.com/apache/datafusion/pull/20858) (gboucher90) +- Include .proto files in datafusion-proto-common distribution [#20921](https://github.com/apache/datafusion/pull/20921) (haohuaijin) +- Check sqllogictests for any dangling config settings (#17914) [#20838](https://github.com/apache/datafusion/pull/20838) (cj-zhukov) +- Add support for ListView in unnest [#20760](https://github.com/apache/datafusion/pull/20760) (brancz) +- Project only accessed struct leaves in Parquet row filter pushdown [#20854](https://github.com/apache/datafusion/pull/20854) (friendlymatthew) +- minor: Move PreparedAccessPlan to same module as ParquetAccessPlan [#20929](https://github.com/apache/datafusion/pull/20929) (alamb) +- chore(deps): bump pyjwt from 2.11.0 to 2.12.0 [#20938](https://github.com/apache/datafusion/pull/20938) (dependabot[bot]) +- Rewrite `SUM(expr + scalar)` --> `SUM(expr) + scalar*COUNT(expr)` [#20749](https://github.com/apache/datafusion/pull/20749) (alamb) +- Add AGENTS.md / CLAUDE.md [#20939](https://github.com/apache/datafusion/pull/20939) (Dandandan) +- Support `columns_sorted` in row_filters [#20497](https://github.com/apache/datafusion/pull/20497) (sdf-jkl) +- Add --simulate-latency / SIMULATE_LATENCY option to dfbench / ./bench.sh [#20954](https://github.com/apache/datafusion/pull/20954) (Dandandan) +- Minor: make signatures of `SessionContext::register_*` methods 
consistent [#20873](https://github.com/apache/datafusion/pull/20873) (alexandreyc) +- test: add reproducer for Dictionary InList pushdown type mismatch (#2… [#20960](https://github.com/apache/datafusion/pull/20960) (erratic-pattern) +- Extract shared `ParquetReadPlan` for leaf column resolution [#20913](https://github.com/apache/datafusion/pull/20913) (friendlymatthew) +- chore: Remove usage of `paste` crate [#20946](https://github.com/apache/datafusion/pull/20946) (coderfender) +- Use exact distinct_count from statistics if exists for `COUNT(DISTINCT column))` calculations [#20845](https://github.com/apache/datafusion/pull/20845) (buraksenn) +- thin-ci [#20972](https://github.com/apache/datafusion/pull/20972) (blaginin) +- chore(deps): bump lz4_flex from 0.12.0 to 0.12.1 [#20973](https://github.com/apache/datafusion/pull/20973) (dependabot[bot]) +- Fix decimal log precision for non-power values [#20433](https://github.com/apache/datafusion/pull/20433) (kumarUjjawal) +- chore(deps): bump Swatinem/rust-cache from 2.8.2 to 2.9.1 [#20979](https://github.com/apache/datafusion/pull/20979) (dependabot[bot]) +- chore(deps): bump taiki-e/install-action from 2.68.25 to 2.68.34 [#20983](https://github.com/apache/datafusion/pull/20983) (dependabot[bot]) +- chore(deps): bump github/codeql-action from 4.32.6 to 4.33.0 [#20982](https://github.com/apache/datafusion/pull/20982) (dependabot[bot]) +- chore(deps): bump astral-sh/setup-uv from 7.3.1 to 7.6.0 [#20981](https://github.com/apache/datafusion/pull/20981) (dependabot[bot]) +- chore(deps): bump runs-on/action from 2.0.3 to 2.1.0 [#20980](https://github.com/apache/datafusion/pull/20980) (dependabot[bot]) +- [Minor] Update Cargo.lock, Fix Tokio minor breaking change [#20978](https://github.com/apache/datafusion/pull/20978) (Dandandan) +- chore(deps): Revert "chore(deps): bump runs-on/action from 2.0.3 to 2.1.0 (#20980)" [#21002](https://github.com/apache/datafusion/pull/21002) (mbutrovich) +- bug: fix `array_remove_*` with 
NULLS [#21013](https://github.com/apache/datafusion/pull/21013) (comphead) +- Simplify logic for memory pressure partial emit from ordered group by [#20559](https://github.com/apache/datafusion/pull/20559) (alamb) +- Fix memory reservation starvation in sort-merge [#20642](https://github.com/apache/datafusion/pull/20642) (xudong963) +- infra: automatically delete branch on pr merge [#21033](https://github.com/apache/datafusion/pull/21033) (kevinjqliu) +- Add support for nested lists in substrait consumer [#20953](https://github.com/apache/datafusion/pull/20953) (alexanderbianchi) +- build: update Rust toolchain version to 1.94.0 [#21045](https://github.com/apache/datafusion/pull/21045) (dariocurr) +- chore: Cleanup fully-qualified ScalarFunctionArgs [#20804](https://github.com/apache/datafusion/pull/20804) (neilconway) +- Support '>', '<', '>=', '<=', '<>' in any operator [#20830](https://github.com/apache/datafusion/pull/20830) (buraksenn) +- keep fetch when merge FilterExec in FilterPushdown [#21070](https://github.com/apache/datafusion/pull/21070) (haohuaijin) +- Fix Subtraction overflow in `max_distinct_count` when hash join has a pushed-down limit [#20799](https://github.com/apache/datafusion/pull/20799) (KARTIK64-rgb) +- Restore Sort unparser guard for correct ORDER BY placement [#20658](https://github.com/apache/datafusion/pull/20658) (krinart) +- chore(deps): bump rustls-webpki from 0.103.9 to 0.103.10 [#21089](https://github.com/apache/datafusion/pull/21089) (dependabot[bot]) +- chore: Remove duplicate imports in test code [#21061](https://github.com/apache/datafusion/pull/21061) (neilconway) +- test: update sqllogictest expectation for negation type coercion [#21102](https://github.com/apache/datafusion/pull/21102) (myandpr) +- fix[physical-expr-adapter]: support casting structs nested inside complex types [#20907](https://github.com/apache/datafusion/pull/20907) (asubiotto) +- Fix index panic in unparser with mismatched stacked projections 
[#21094](https://github.com/apache/datafusion/pull/21094) (friendlymatthew) +- chore: Fix all sqllogictest dangling configs [#21108](https://github.com/apache/datafusion/pull/21108) (2010YOUY01) +- Preserve SPM when parent maintains input order [#21097](https://github.com/apache/datafusion/pull/21097) (rkrishn7) +- chore: update testcontainers and astral-tokio-tar for cargo audit [#21114](https://github.com/apache/datafusion/pull/21114) (getChan) +- Spark soundex function implementation [#20725](https://github.com/apache/datafusion/pull/20725) (kazantsev-maksim) +- chore(deps): bump env_logger from 0.11.9 to 0.11.10 in the all-other-cargo-deps group across 1 directory [#21136](https://github.com/apache/datafusion/pull/21136) (dependabot[bot]) +- Fix `elapsed_compute` metric for Parquet DataSourceExec [#20767](https://github.com/apache/datafusion/pull/20767) (ernestprovo23) +- chore(deps): bump taiki-e/install-action from 2.68.34 to 2.69.7 [#21133](https://github.com/apache/datafusion/pull/21133) (dependabot[bot]) +- chore(deps): bump github/codeql-action from 4.33.0 to 4.34.1 [#21132](https://github.com/apache/datafusion/pull/21132) (dependabot[bot]) +- Update to arrow/parquet `58.1.0` [#21044](https://github.com/apache/datafusion/pull/21044) (alamb) +- Simplify sqllogictest timing summary to boolean flag and remove top-N modes [#20598](https://github.com/apache/datafusion/pull/20598) (kosiew) +- Substrait join consumer should not merge nullability of join keys [#21121](https://github.com/apache/datafusion/pull/21121) (hareshkh) +- Enable debug assertions in CI. 
[#20832](https://github.com/apache/datafusion/pull/20832) (stuhood) +- chore(deps): bump requests from 2.32.5 to 2.33.0 [#21153](https://github.com/apache/datafusion/pull/21153) (dependabot[bot]) +- feat : support spark compatible int to timestamp cast [#20555](https://github.com/apache/datafusion/pull/20555) (coderfender) +- [Minor]: support window functions in order by expressions [#20963](https://github.com/apache/datafusion/pull/20963) (buraksenn) +- chore: Optimize schema rewriter usages [#21158](https://github.com/apache/datafusion/pull/21158) (comphead) +- Add benchmarks for Parquet struct leaf-level projection pruning [#21180](https://github.com/apache/datafusion/pull/21180) (friendlymatthew) +- chore: re-export projection in datafusion::datasource [#21185](https://github.com/apache/datafusion/pull/21185) (rluvaton) +- test: add SMJ benchmarks from #21184 [#21188](https://github.com/apache/datafusion/pull/21188) (mbutrovich) +- Fix sort merge interleave overflow [#20922](https://github.com/apache/datafusion/pull/20922) (xudong963) +- Reduce parquet struct projection benchmark data volume [#21187](https://github.com/apache/datafusion/pull/21187) (friendlymatthew) +- Minor: compute qualify window expressions only when QUALIFY clause is present [#21173](https://github.com/apache/datafusion/pull/21173) (buraksenn) +- fix[physical-plan/aggregates]: fix grouping by Ree [#21195](https://github.com/apache/datafusion/pull/21195) (asubiotto) +- [main] add 52.4.0 changelog [#21053](https://github.com/apache/datafusion/pull/21053) (alamb) +- Use leaf level `ProjectionMask` for parquet projections [#20925](https://github.com/apache/datafusion/pull/20925) (friendlymatthew) +- test: scale remaining sort-merge join (SMJ) benchmark queries [#21200](https://github.com/apache/datafusion/pull/21200) (mbutrovich) +- Fix: MemTable LIMIT ignored with reordered projections [#21177](https://github.com/apache/datafusion/pull/21177) (RamakrishnaChilaka) +- No cargo test for 
`sort_mem_validation` [#21222](https://github.com/apache/datafusion/pull/21222) (blaginin) +- Fix/support duplicate column names #6543 [#21126](https://github.com/apache/datafusion/pull/21126) (RafaelHerrero) +- Use spot instances for extended tests [#21221](https://github.com/apache/datafusion/pull/21221) (blaginin) +- chore: Cleanup Cargo profiles [#21214](https://github.com/apache/datafusion/pull/21214) (neilconway) +- chore(benchmark): Fix/update compile profile benchmark [#21223](https://github.com/apache/datafusion/pull/21223) (2010YOUY01) +- Basic Extension Type Registry Implementation [#20312](https://github.com/apache/datafusion/pull/20312) (tschwarzinger) +- chore(deps): bump serialize-javascript, terser-webpack-plugin and copy-webpack-plugin in /datafusion/wasmtest/datafusion-wasm-app [#21235](https://github.com/apache/datafusion/pull/21235) (dependabot[bot]) +- chore(deps-dev): bump node-forge from 1.3.2 to 1.4.0 in /datafusion/wasmtest/datafusion-wasm-app [#21225](https://github.com/apache/datafusion/pull/21225) (dependabot[bot]) +- chore(deps): bump cryptography from 46.0.5 to 46.0.6 [#21224](https://github.com/apache/datafusion/pull/21224) (dependabot[bot]) +- Fix FilterExec tree render missing fetch display [#21230](https://github.com/apache/datafusion/pull/21230) (zhuqi-lucas) +- ci: use ubuntu-slim runner for lightweight CI jobs [#21252](https://github.com/apache/datafusion/pull/21252) (CuteChuanChuan) +- kill `check_run_id` and `pr_number` from extended tests [#21228](https://github.com/apache/datafusion/pull/21228) (blaginin) +- [Minor] add non topk benchmarks for utf8/utf8view string aggregates [#21073](https://github.com/apache/datafusion/pull/21073) (buraksenn) +- ci: Add datafusion/sql as a folder to trigger extended tests for on changes [#21255](https://github.com/apache/datafusion/pull/21255) (mbutrovich) +- Misc minor optimization in the Physical Optimizer [#21216](https://github.com/apache/datafusion/pull/21216) (AdamGS) +- chore: 
Replace `TryInto` impl by `TryFrom` [#21203](https://github.com/apache/datafusion/pull/21203) (Tpt) +- Refactor parquet datasource into an explicit state machine [#21190](https://github.com/apache/datafusion/pull/21190) (alamb) +- Add flat vs. struct field projection benchmarks [#21257](https://github.com/apache/datafusion/pull/21257) (friendlymatthew) +- Refactor: expose predicate constant inference from physical-expr [#21167](https://github.com/apache/datafusion/pull/21167) (xudong963) +- Add end-to-end Parquet tests for List and LargeList struct schema evolution [#20840](https://github.com/apache/datafusion/pull/20840) (kosiew) +- chore(deps): bump taiki-e/install-action from 2.69.7 to 2.70.3 [#21271](https://github.com/apache/datafusion/pull/21271) (dependabot[bot]) +- chore(deps): bump rustyline from 17.0.2 to 18.0.0 [#21276](https://github.com/apache/datafusion/pull/21276) (dependabot[bot]) +- chore(deps): bump ctor from 0.6.3 to 0.8.0 [#21282](https://github.com/apache/datafusion/pull/21282) (dependabot[bot]) +- chore(deps): bump snmalloc-rs from 0.3.8 to 0.7.4 [#21280](https://github.com/apache/datafusion/pull/21280) (dependabot[bot]) +- chore(deps): bump sha1 from 0.10.6 to 0.11.0 [#21277](https://github.com/apache/datafusion/pull/21277) (dependabot[bot]) +- chore(deps): bump astral-sh/setup-uv from 7.6.0 to 8.0.0 [#21272](https://github.com/apache/datafusion/pull/21272) (dependabot[bot]) +- chore(deps): bump github/codeql-action from 4.34.1 to 4.35.1 [#21273](https://github.com/apache/datafusion/pull/21273) (dependabot[bot]) +- chore(deps): bump pygments from 2.19.2 to 2.20.0 [#21256](https://github.com/apache/datafusion/pull/21256) (dependabot[bot]) +- feat(memory_pool): add `TrackConsumersPool::metrics()` to expose cons… [#21147](https://github.com/apache/datafusion/pull/21147) (bert-beyondloops) +- Update repeat UDF to emit utf8view when input is utf8view [#20645](https://github.com/apache/datafusion/pull/20645) (Omega359) +- chore(deps): bump the 
all-other-cargo-deps group across 1 directory with 7 updates [#21274](https://github.com/apache/datafusion/pull/21274) (dependabot[bot]) +- chore(deps): bump runs-on/action from 2.0.3 to 2.1.0 [#21134](https://github.com/apache/datafusion/pull/21134) (dependabot[bot]) +- chore: add `.claude/settings.local.json` to `.gitignore` [#21312](https://github.com/apache/datafusion/pull/21312) (jonahgao) +- Add `FileStreamBuilder` for creating FileStreams [#21261](https://github.com/apache/datafusion/pull/21261) (alamb) +- refactor: Split Parquet BloomFilter CPU and IO into separate states [#21285](https://github.com/apache/datafusion/pull/21285) (alamb) +- chore(deps): bump object_store from 0.13.1 to 0.13.2 [#21275](https://github.com/apache/datafusion/pull/21275) (dependabot[bot]) +- Merge queue: make dev checks required + add .asf.yaml validation [#21239](https://github.com/apache/datafusion/pull/21239) (blaginin) +- Adds INList and Between expr to skip outer join [#21303](https://github.com/apache/datafusion/pull/21303) (SubhamSinghal) +- No merge group for rust.yml yet [#21343](https://github.com/apache/datafusion/pull/21343) (blaginin) +- Disallow order by within ordered-set aggregate functions argument lists [#20421](https://github.com/apache/datafusion/pull/20421) (cj-zhukov) +- chore: Fix clippy and CI [#21287](https://github.com/apache/datafusion/pull/21287) (comphead) +- Split FileStreamMetrics into its own module [#21340](https://github.com/apache/datafusion/pull/21340) (alamb) +- Skip probe-side consumption when hash join build side is empty [#21068](https://github.com/apache/datafusion/pull/21068) (kosiew) +- Use ParquetMetaDataPushDecoder instead of ParquetMetaDataReader [#21357](https://github.com/apache/datafusion/pull/21357) (Dandandan) +- Eliminate redundant `ProjectionExec`s [#21333](https://github.com/apache/datafusion/pull/21333) (Dandandan) +- Minor: add tests for regexp_replace and capture groups 
[#21413](https://github.com/apache/datafusion/pull/21413) (alamb) +- bench: add benchmarks for first_value, last_value [#21409](https://github.com/apache/datafusion/pull/21409) (theirix) +- chore(deps): bump the all-other-cargo-deps group with 4 updates [#21435](https://github.com/apache/datafusion/pull/21435) (dependabot[bot]) +- chore(deps): bump taiki-e/install-action from 2.70.3 to 2.74.0 [#21434](https://github.com/apache/datafusion/pull/21434) (dependabot[bot]) +- test: Add `datafusion.format.*` configs test coverage [#21355](https://github.com/apache/datafusion/pull/21355) (erenavsarogullari) +- Estimate aggregate output rows using existing NDV statistics [#20926](https://github.com/apache/datafusion/pull/20926) (buraksenn) +- Follow-up: remove interleave panic recovery after Arrow 58.1.0 [#21436](https://github.com/apache/datafusion/pull/21436) (xudong963) +- writing table to parquet followed by read and schema check [#21444](https://github.com/apache/datafusion/pull/21444) (Rich-T-kid) +- chore(deps): bump cryptography from 46.0.6 to 46.0.7 [#21489](https://github.com/apache/datafusion/pull/21489) (dependabot[bot]) +- Preserve logical cast field semantics during physical lowering with field-aware CastExpr [#20836](https://github.com/apache/datafusion/pull/20836) (kosiew) +- Add more regexp_replace test coverage [#21485](https://github.com/apache/datafusion/pull/21485) (alamb) +- Introduce Morselizer API, rewrite `ParquetOpener` to `ParquetMorselizer` [#21327](https://github.com/apache/datafusion/pull/21327) (alamb) +- chore: create benches small ints for count_distinct [#21521](https://github.com/apache/datafusion/pull/21521) (coderfender) +- refactor: extract sort pushdown logic from FileScanConfig into separate module [#21457](https://github.com/apache/datafusion/pull/21457) (zhuqi-lucas) +- chore: Add array_slice tests for overlapping nulls across inputs [#21540](https://github.com/apache/datafusion/pull/21540) (neilconway) +- Migrate 
PhysicalExprAdapter to unified CastExpr and remove CastColumnExpr usage [#21493](https://github.com/apache/datafusion/pull/21493) (kosiew) +- Unify cast handling by removing `CastColumnExpr` branches in pruning and ordering equivalence [#21545](https://github.com/apache/datafusion/pull/21545) (kosiew) +- [datafusion-spark] Add Spark-compatible ceil function [#20593](https://github.com/apache/datafusion/pull/20593) (shivbhatia10) +- sql: render PostgreSQL array literals as ARRAY[...] in unparser [#21513](https://github.com/apache/datafusion/pull/21513) (xiedeyantu) +- physical_optimizer: preserve_file_partitions when num file groups < target_partitions [#21533](https://github.com/apache/datafusion/pull/21533) (jayshrivastava) +- EliminateOuterJoin with Like, IsTrue, IsFalse, IsNotUnknown [#21549](https://github.com/apache/datafusion/pull/21549) (SubhamSinghal) +- chore(deps): bump hashbrown from 0.16.1 to 0.17.0 [#21611](https://github.com/apache/datafusion/pull/21611) (dependabot[bot]) +- chore(deps): bump ctor from 0.8.0 to 0.10.0 [#21612](https://github.com/apache/datafusion/pull/21612) (dependabot[bot]) +- Rewrite FileStream in terms of Morsel API [#21342](https://github.com/apache/datafusion/pull/21342) (alamb) +- Consolidate special case `regexp_match` logic [#21486](https://github.com/apache/datafusion/pull/21486) (alamb) +- chore(deps): bump taiki-e/install-action from 2.74.0 to 2.75.10 [#21605](https://github.com/apache/datafusion/pull/21605) (dependabot[bot]) +- chore(deps): bump the all-other-cargo-deps group across 1 directory with 3 updates [#21610](https://github.com/apache/datafusion/pull/21610) (dependabot[bot]) +- bench: first_last remove noisy benchmarks, add update_batch [#21487](https://github.com/apache/datafusion/pull/21487) (theirix) +- chore: Fix `typo` problems [#21495](https://github.com/apache/datafusion/pull/21495) (erenavsarogullari) +- chore(deps-dev): bump follow-redirects from 1.15.6 to 1.16.0 in 
/datafusion/wasmtest/datafusion-wasm-app [#21601](https://github.com/apache/datafusion/pull/21601) (dependabot[bot]) +- bench: Scale sort benchmarks to 1M rows to exercise merge path [#21630](https://github.com/apache/datafusion/pull/21630) (mbutrovich) +- Port filter_pushdown.rs async tests to sqllogictest [#21620](https://github.com/apache/datafusion/pull/21620) (adriangb) +- chore: fix cargo audit and dependencies check on main [#21655](https://github.com/apache/datafusion/pull/21655) (alamb) +- Spark make_valid_utf8 function implementation [#20633](https://github.com/apache/datafusion/pull/20633) (kazantsev-maksim) +- chore(deps): update tokio from 1.51 to 1.52 [#21670](https://github.com/apache/datafusion/pull/21670) (ahmed-mez) +- Use ListArray nullability instead of offsets for `array_element`, `array_any_value`. [#21672](https://github.com/apache/datafusion/pull/21672) (tabac) +- chore: breakdown `array.slt` into smaller files [#21658](https://github.com/apache/datafusion/pull/21658) (comphead) +- chore: Add more tests with `GROUP BY` to test spark `collect_set` [#21659](https://github.com/apache/datafusion/pull/21659) (comphead) +- Add strategy-focused InList benchmarks [#21648](https://github.com/apache/datafusion/pull/21648) (geoffreyclaude) +- Fix massive spill files for StringView/BinaryView columns II [#21633](https://github.com/apache/datafusion/pull/21633) (adriangb) +- chore: Backport 53.1.0 changelog [#21686](https://github.com/apache/datafusion/pull/21686) (comphead) +- refactor: Introduce SpillState enum for memory-limited NLJ execution [#21636](https://github.com/apache/datafusion/pull/21636) (viirya) +- Support Date32/Date64 in unwrap_cast optimization [#21665](https://github.com/apache/datafusion/pull/21665) (Dandandan) +- feat[expr-common]: add REE arithmetic coercion for numeric and decimal [#21179](https://github.com/apache/datafusion/pull/21179) (asubiotto) +- Make `test_display_pg_json` pass regardless of build setup and dependencies 
[#21502](https://github.com/apache/datafusion/pull/21502) (AdamGS) +- refactor: Share left-side spill file across partitions on OOM fallback [#21699](https://github.com/apache/datafusion/pull/21699) (viirya) +- Spark is_valid_utf8 function implementation [#21627](https://github.com/apache/datafusion/pull/21627) (kazantsev-maksim) +- chore: use bench array helpers from Arrow bench_util [#21544](https://github.com/apache/datafusion/pull/21544) (theirix) +- chore: add count distinct group benchmarks [#21575](https://github.com/apache/datafusion/pull/21575) (coderfender) +- minor: More comments to `read_spill_as_stream` [#21713](https://github.com/apache/datafusion/pull/21713) (2010YOUY01) +- Dynamic work scheduling in FileStream [#21351](https://github.com/apache/datafusion/pull/21351) (alamb) +- chore: Update Release instructions [#21705](https://github.com/apache/datafusion/pull/21705) (comphead) +- test: add tests for spill file sizes to verify View GC [#21750](https://github.com/apache/datafusion/pull/21750) (RatulDawar) +- chore(deps): bump astral-sh/setup-uv from 8.0.0 to 8.1.0 [#21759](https://github.com/apache/datafusion/pull/21759) (dependabot[bot]) +- chore(deps): bump aws-config from 1.8.15 to 1.8.16 in the all-other-cargo-deps group [#21760](https://github.com/apache/datafusion/pull/21760) (dependabot[bot]) +- chore(deps): bump github/codeql-action from 4.35.1 to 4.35.2 [#21758](https://github.com/apache/datafusion/pull/21758) (dependabot[bot]) +- chore(deps): bump taiki-e/install-action from 2.75.10 to 2.75.18 [#21757](https://github.com/apache/datafusion/pull/21757) (dependabot[bot]) +- Snowflake Unparser dialect and UNNEST support [#21593](https://github.com/apache/datafusion/pull/21593) (yonatan-sevenai) +- Skip files outside partition structure in hive-partitioned listing tables [#21756](https://github.com/apache/datafusion/pull/21756) (zhuqi-lucas) +- Handle canceled partitioned hash join dynamic filters lazily 
[#21666](https://github.com/apache/datafusion/pull/21666) (adriangb) +- Improve ergonomics for ExecutionPlanMetricsSet and MetricsSet [#21762](https://github.com/apache/datafusion/pull/21762) (gabotechs) +- [Minor]: unify ANY/ALL planning and align ANY NULL semantics with PG [#21743](https://github.com/apache/datafusion/pull/21743) (buraksenn) +- [Minor]: fix security audit because of rustls-webpki version [#21785](https://github.com/apache/datafusion/pull/21785) (buraksenn) +- refactor: Simplify NLJ re-scans with `ReplayableStreamSource` [#21742](https://github.com/apache/datafusion/pull/21742) (2010YOUY01) +- ci: permit stale workflow to delete cache [#21772](https://github.com/apache/datafusion/pull/21772) (Jefffrey) +- Unparser drops ORDER BY alias when flattening Projection through SubqueryAlias [#21491](https://github.com/apache/datafusion/pull/21491) (yonatan-sevenai) +- chore: re-enable `add_months` overflow test [#21774](https://github.com/apache/datafusion/pull/21774) (Jefffrey) +- chore: add aggregation test for listview types [#21776](https://github.com/apache/datafusion/pull/21776) (Jefffrey) +- chore: re-enable `array_union` nested null array edge case test [#21773](https://github.com/apache/datafusion/pull/21773) (Jefffrey) +- Fix: allow coercion from Binary and LargeBinary into BinaryView [#21800](https://github.com/apache/datafusion/pull/21800) (bert-beyondloops) +- chore: leave specialised bench helpers [#21810](https://github.com/apache/datafusion/pull/21810) (theirix) +- Add quote style and trimming to csv writer [#20813](https://github.com/apache/datafusion/pull/20813) (xanderbailey) +- chore(deps): bump picomatch from 2.3.1 to 2.3.2 in /datafusion/wasmtest/datafusion-wasm-app [#21164](https://github.com/apache/datafusion/pull/21164) (dependabot[bot]) +- perf(substr_index): speed up scalar and Utf8View [#21754](https://github.com/apache/datafusion/pull/21754) (kumarUjjawal) +- Fix PushdownSort dropping LIMIT when eliminating SortExec 
[#21744](https://github.com/apache/datafusion/pull/21744) (sgrebnov) +- chore: use Arc::unwrap_or_clone in more places [#21823](https://github.com/apache/datafusion/pull/21823) (Dandandan) +- build: explicitly set `publish = false` for internal crates [#21869](https://github.com/apache/datafusion/pull/21869) (rluvaton) +- chore: bump API limit for stale workflow [#21867](https://github.com/apache/datafusion/pull/21867) (Jefffrey) +- chore: bump `sha` & `md-5` to `0.11.0` [#21840](https://github.com/apache/datafusion/pull/21840) (Jefffrey) +- feat : ABI upgrade from abi_stabby to stabby since abi_stable is no longer maintained [#21030](https://github.com/apache/datafusion/pull/21030) (coderfender) +- Add protobuf serialization/deserialization support for `EmptyTable` scans [#20844](https://github.com/apache/datafusion/pull/20844) (OlegWock) +- Support Dictionary Arrays in MIN/MAX Aggregates [#21315](https://github.com/apache/datafusion/pull/21315) (kosiew) +- Fix some GH action permission issues identified by CodeQL [#21838](https://github.com/apache/datafusion/pull/21838) (Jefffrey) +- Add support for nested types to nullif. 
[#21764](https://github.com/apache/datafusion/pull/21764) (tabac) +- chore(deps): bump taiki-e/install-action from 2.75.18 to 2.75.23 [#21887](https://github.com/apache/datafusion/pull/21887) (dependabot[bot]) +- chore(deps): bump libloading from 0.8.9 to 0.9.0 [#21890](https://github.com/apache/datafusion/pull/21890) (dependabot[bot]) +- refactor `array_remove` benchmarks & add nested benches [#21834](https://github.com/apache/datafusion/pull/21834) (Jefffrey) +- Update `astral-tokio-tar` to appease cargo_audit [#21902](https://github.com/apache/datafusion/pull/21902) (alamb) +- Remove unnecessary Mutex in SharedMemoryReservation [#21899](https://github.com/apache/datafusion/pull/21899) (gabotechs) +- ci: add breaking change detector [#21499](https://github.com/apache/datafusion/pull/21499) (rluvaton) +- Fix GH action permissions in `rust.yml` and `docs.yaml` workflows [#21884](https://github.com/apache/datafusion/pull/21884) (Jefffrey) +- chore: fix `iff` typos [#21904](https://github.com/apache/datafusion/pull/21904) (comphead) +- Deduplicate InList primitive static filters [#21932](https://github.com/apache/datafusion/pull/21932) (geoffreyclaude) +- Fix nesting of permissions block in docs workflow [#21930](https://github.com/apache/datafusion/pull/21930) (Jefffrey) +- dependencies check are now required to merge ci [#21940](https://github.com/apache/datafusion/pull/21940) (blaginin) +- build: allow posting comments on PRs made from forks and fix missing protobuf [#21913](https://github.com/apache/datafusion/pull/21913) (rluvaton) +- Use shared statistics merge for union stats [#21430](https://github.com/apache/datafusion/pull/21430) (kumarUjjawal) +- Add ClickBench URL pushdown benchmark [#21945](https://github.com/apache/datafusion/pull/21945) (xudong963) +- test(sqllogictest): stabilize parquet output_rows_skew with WITH ORDER [#21898](https://github.com/apache/datafusion/pull/21898) (RatulDawar) +- Skip unnecessary plan rebuild in adjust_input_keys_ordering 
for non-join plans [#21947](https://github.com/apache/datafusion/pull/21947) (zhuqi-lucas) +- Adding Use of arrow's has_true() / has_false() [#21806](https://github.com/apache/datafusion/pull/21806) (raushanprabhakar1) +- feat[expr-common]: support regex and LIKE coercion on REE and Dict value types that require an extra coercion step [#21924](https://github.com/apache/datafusion/pull/21924) (asubiotto) +- feat[expr-common]: support REE in coalesce [#21919](https://github.com/apache/datafusion/pull/21919) (asubiotto) +- proto: serialize and dedupe dynamic filters v2 [#21807](https://github.com/apache/datafusion/pull/21807) (jayshrivastava) +- chore: fix `datafusion-spark` substring [#21963](https://github.com/apache/datafusion/pull/21963) (comphead) +- Respect DATA_DIR location for sql benchmarks [#21961](https://github.com/apache/datafusion/pull/21961) (Omega359) +- ci: use base repository branch for breaking change detector [#22006](https://github.com/apache/datafusion/pull/22006) (rluvaton) +- bench: add to_char_array_date32 [#22007](https://github.com/apache/datafusion/pull/22007) (huymq1710) +- ci: add `auto detected api change` label on breaking change detecting in the CI [#21953](https://github.com/apache/datafusion/pull/21953) (rluvaton) +- Fix fully matched row groups with null counts [#21907](https://github.com/apache/datafusion/pull/21907) (xudong963) +- functions: Add dict support for get field [#21115](https://github.com/apache/datafusion/pull/21115) (brancz) +- fix(physical-plan): set column byte_size to 0 in FilterExec zero-row interval stats [#21999](https://github.com/apache/datafusion/pull/21999) (buraksenn) +- Explicitly declare spill codec dependency in `physical-plan` [#21917](https://github.com/apache/datafusion/pull/21917) (kosiew) +- Add benchmark_runner for sql_benchmarks with help and list commands [#22001](https://github.com/apache/datafusion/pull/22001) (Omega359) +- chore: `datafusion-spark` substring to support Binary types 
[#21979](https://github.com/apache/datafusion/pull/21979) (comphead) +- Add reusable plan-time schema alignment helper and apply to RecursiveQueryExec [#21912](https://github.com/apache/datafusion/pull/21912) (kosiew) +- Upgrade to arrow-rs / parquet / avro 58.2.0 [#21812](https://github.com/apache/datafusion/pull/21812) (alamb) +- kill `linux-build-lib` from extended tests [#21227](https://github.com/apache/datafusion/pull/21227) (blaginin) +- chore: Rust checks are required + merge queue [#21941](https://github.com/apache/datafusion/pull/21941) (blaginin) +- Add wide-schema benchmark suite for measuring per-file metadata overhead [#21970](https://github.com/apache/datafusion/pull/21970) (adriangb) +- chore(deps): bump ctor from 0.10.1 to 1.0.1 [#22023](https://github.com/apache/datafusion/pull/22023) (dependabot[bot]) +- ci: narrow macOS test scope to datafusion-ffi, run benchmarks on amd64 [#22048](https://github.com/apache/datafusion/pull/22048) (blaginin) +- chore(deps): bump github/codeql-action from 4.35.2 to 4.35.3 [#22019](https://github.com/apache/datafusion/pull/22019) (dependabot[bot]) +- chore(deps): bump taiki-e/install-action from 2.74.0 to 2.77.0 [#22018](https://github.com/apache/datafusion/pull/22018) (dependabot[bot]) +- chore(deps): bump the all-other-cargo-deps group across 1 directory with 2 updates [#22022](https://github.com/apache/datafusion/pull/22022) (dependabot[bot]) +- Allow benchmark allocator features together [#21905](https://github.com/apache/datafusion/pull/21905) (xudong963) +- Rich t kid/introduce dict benchmarks [#21860](https://github.com/apache/datafusion/pull/21860) (Rich-T-kid) +- Add benchmarks for dictionary path of new_group_values [#22004](https://github.com/apache/datafusion/pull/22004) (Rich-T-kid) +- Support `IS (NOT) DISTINCT FROM` in Unparser [#22054](https://github.com/apache/datafusion/pull/22054) (cetra3) +- chore: Fix broken build with `--benches --all-features` 
[#22081](https://github.com/apache/datafusion/pull/22081) (neilconway) +- Chore: Fix TPC-DS schema/query (fixes q30 run) [#22086](https://github.com/apache/datafusion/pull/22086) (Dandandan) +- chore(deps): (fix CI) bump taiki-e/install-action from 2.77.0 to 2.77.6 [#22110](https://github.com/apache/datafusion/pull/22110) (gstvg) +- Prevent empty grouping sets from being eliminated on empty input [#22039](https://github.com/apache/datafusion/pull/22039) (xiedeyantu) +- Consolidate and document SQL AST shims [#22094](https://github.com/apache/datafusion/pull/22094) (alamb) +- Support distinct-from predicates in Parquet pruning [#22084](https://github.com/apache/datafusion/pull/22084) (Dandandan) +- minor: Track Parquet rows and pages matched when the page index is skipped [#22085](https://github.com/apache/datafusion/pull/22085) (nuno-faria) +- Update to `arrow` / `parquet` from 58.2.0 --> 58.3.0 [#22066](https://github.com/apache/datafusion/pull/22066) (alamb) +- Add sqllogictest coverage for unused UNNEST pruning edge cases [#22074](https://github.com/apache/datafusion/pull/22074) (kosiew) +- chore(deps): bump actions/labeler from 6.0.1 to 6.1.0 [#22124](https://github.com/apache/datafusion/pull/22124) (dependabot[bot]) +- chore(deps): bump the all-other-cargo-deps group with 5 updates [#22128](https://github.com/apache/datafusion/pull/22128) (dependabot[bot]) +- chore(deps): bump github/codeql-action from 4.35.3 to 4.35.4 [#22122](https://github.com/apache/datafusion/pull/22122) (dependabot[bot]) +- mem: Cleanup resources of done streams immediately [#22064](https://github.com/apache/datafusion/pull/22064) (EmilyMatt) +- Propagate field metadata through NTH_VALUE, FIRST_VALUE, and LAST_VALUE window functions [#22112](https://github.com/apache/datafusion/pull/22112) (paleolimbot) +- Minor: Disallow async function in lambdas [#22097](https://github.com/apache/datafusion/pull/22097) (gstvg) +- chore(deps): bump runs-on/action from 2.1.0 to 2.1.2 
[#22123](https://github.com/apache/datafusion/pull/22123) (dependabot[bot]) +- bench: remove stale `array_expression` benchmark [#22143](https://github.com/apache/datafusion/pull/22143) (kumarUjjawal) +- Add resolve_lambda_variables helper to Expr and LogicalPlan [#22101](https://github.com/apache/datafusion/pull/22101) (gstvg) +- Fix panic on deep compound identifiers [#22186](https://github.com/apache/datafusion/pull/22186) (Dandandan) +- Refactor scalar min/max dispatch into function-based helpers [#22062](https://github.com/apache/datafusion/pull/22062) (kosiew) +- fix missing window expressions when unparsing plans without outer projections [#21801](https://github.com/apache/datafusion/pull/21801) (nathanb9) +- chore(deps): bump urllib3 from 2.6.3 to 2.7.0 [#22109](https://github.com/apache/datafusion/pull/22109) (dependabot[bot]) +- Call take arrays once per repartitioned input batch [#22159](https://github.com/apache/datafusion/pull/22159) (gene-bordegaray) +- Refactor parquet row filter setup [#22191](https://github.com/apache/datafusion/pull/22191) (xudong963) +- fix date_bin overflows subtracting extreme nanosecond timestamp origin [#22251](https://github.com/apache/datafusion/pull/22251) (xiedeyantu) +- fix date_trunc overflows converting extreme non-ns timestamps to nanoseconds [#22262](https://github.com/apache/datafusion/pull/22262) (xiedeyantu) +- Extract parquet push decoder module [#22289](https://github.com/apache/datafusion/pull/22289) (xudong963) +- Track spill read-back memory in SMJ [#22103](https://github.com/apache/datafusion/pull/22103) (SubhamSinghal) +- refactor(parquet-datasource): split opener.rs into an opener/ module [#22346](https://github.com/apache/datafusion/pull/22346) (adriangb) +- refactor(parquet-datasource): split sink and schema_coercion out of file_format.rs [#22347](https://github.com/apache/datafusion/pull/22347) (adriangb) +- fixing negative power to zero [#22277](https://github.com/apache/datafusion/pull/22277) 
(raushanprabhakar1) +- refactor(parquet-datasource): split bloom_filter out of row_group_filter.rs [#22348](https://github.com/apache/datafusion/pull/22348) (adriangb) +- Revert "[Minor]: unify ANY/ALL planning and align ANY NULL semantics with PG (#21743)" [#22345](https://github.com/apache/datafusion/pull/22345) (alamb) +- Fix pruning predicate for `LIKE` expressions with escape sequences [#22375](https://github.com/apache/datafusion/pull/22375) (masonh22) +- Fix: lead/lag extreme offsets handling [#22243](https://github.com/apache/datafusion/pull/22243) (Dandandan) +- chore(deps): fix CI, bump astral-tokio-tar [#22382](https://github.com/apache/datafusion/pull/22382) (gstvg) +- chore: Replace stray old-style string builder in `substr` [#22183](https://github.com/apache/datafusion/pull/22183) (neilconway) +- chore(deps): bump taiki-e/install-action from 2.77.6 to 2.79.2 [#22377](https://github.com/apache/datafusion/pull/22377) (dependabot[bot]) +- chore(deps): bump github/codeql-action from 4.35.4 to 4.35.5 [#22376](https://github.com/apache/datafusion/pull/22376) (dependabot[bot]) +- chore(deps-dev): bump webpack-dev-server from 5.2.1 to 5.2.4 in /datafusion/wasmtest/datafusion-wasm-app [#22349](https://github.com/apache/datafusion/pull/22349) (dependabot[bot]) +- chore(deps): bump idna from 3.11 to 3.15 [#22381](https://github.com/apache/datafusion/pull/22381) (dependabot[bot]) +- Refactor Spark `format_string` numeric `%c` conversion dispatch [#22166](https://github.com/apache/datafusion/pull/22166) (kosiew) +- chore(deps): bump sysinfo from 0.38.4 to 0.39.2 [#22380](https://github.com/apache/datafusion/pull/22380) (dependabot[bot]) +- fix regexp_count should count empty-pattern matches [#22311](https://github.com/apache/datafusion/pull/22311) (xiedeyantu) +- Actually preserve predicate execution order in PushDownFilter [#21643](https://github.com/apache/datafusion/pull/21643) (joroKr21) +- chore(deps): bump qs and body-parser in 
/datafusion/wasmtest/datafusion-wasm-app [#22321](https://github.com/apache/datafusion/pull/22321) (dependabot[bot]) + +## Credits + +Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor. + +``` + 70 Neil Conway + 68 dependabot[bot] + 51 Andrew Lamb + 26 Burak Şen + 25 Oleks V + 21 Daniël Heres + 20 Adrian Garcia Badaracco + 18 Kumar Ujjawal + 16 kosiew + 15 Matt Butrovich + 15 Tim Saucer + 14 Qi Zhu + 14 Zhen Chen + 13 Dmitrii Blaginin + 13 Jeffrey Vo + 13 Yongting You + 13 xudong.w + 12 Huaijin + 11 Bhargava Vadlamani + 10 Eren Avsarogullari + 10 Matthew Kim + 9 gstvg + 8 Raz Luvaton + 8 Subham Singhal + 8 theirix + 6 Adam Gutglick + 6 Gabriel + 6 Jonathan Chen + 5 Alessandro Solimando + 5 Alfonso Subiotto Marqués + 5 Bruce Ritchie + 5 Liang-Chi Hsieh + 5 Nuno Faria + 5 Sergei Grebnov + 4 Andy Grove + 4 Ariel Miculas-Trif + 4 Jayant Shrivastava + 4 Kevin Liu + 4 lyne + 3 Brent Gardner + 3 David López + 3 Dewey Dunnington + 3 Geoffrey Claude + 3 Harrison Crosse + 3 Huy Mac + 3 Kazantsev Maksim + 3 Konstantin Tarasov + 3 Lía Adriana + 3 Namgung Chan + 3 Peter L + 3 RIchard Baah + 3 Ratul Dawar + 3 Raushan Prabhakar + 3 Xander + 3 Yonatan Striem Amit + 3 Yu-Chuan Hung + 3 crm26 + 2 Acfboy + 2 Adam Curtis + 2 Albert Skalt + 2 Anastasios Bakogiannis + 2 Bert Vermeiren + 2 Frederic Branczyk + 2 Geethapranay1 + 2 Jonah Gao + 2 Liam Feehery + 2 Marko Grujic + 2 Michael Kleen + 2 Peter Nguyen + 2 Rafael Herrero + 2 Rohan Krishnaswamy + 2 Samyak Sarnayak + 2 Sergey Zhukov + 2 Shiv Bhatia + 2 Tobias Schwarzinger + 2 hsiang-c + 2 jj.lee + 2 linfeng + 2 yaommen + 1 Adam Reeve + 1 Ahmed Mezghani + 1 Alex Zhang + 1 Alexander Alexandrov + 1 Alexander Rafferty + 1 Alexandre Crayssac + 1 Andrey Koshchiy + 1 Asish Kumar + 1 Ben Bellick + 1 Bruno Volpato + 1 Daniel Tu + 1 Druva + 1 EeshanBembi + 1 Emil Ernerfeldt + 1 Emily Matheys + 1 Ernest Provo + 1 Filip Petkovski + 1 Filip Wojciechowski + 1 Florian Müller + 1 Fred 
Thomas + 1 Gene Bordegaray + 1 Georgi Krastev + 1 Guillaume Boucher + 1 Haresh Khanna + 1 Helgi Kristvin Sigurbjarnarson + 1 Heran Lin + 1 Jamal Saad + 1 Jarro van Ginkel + 1 Jax Liu + 1 Joan Antoni RE + 1 Justin O'Dwyer + 1 Kartik Gupta + 1 Krishna Sudarshan J + 1 Kristin Cowalcijk + 1 Krisztián Szűcs + 1 Lavkesh Lahngir + 1 Martin Hilton + 1 Mason + 1 Nathan + 1 Oleh + 1 Ramakrishna Chilaka + 1 Rizky Mirzaviandy Priambodo + 1 RyanStewart + 1 Shivaang + 1 Soham Bhattacharjee + 1 Stu Hood + 1 Thomas Tanon + 1 Tim-53 + 1 UBarney + 1 Viktor Yershov + 1 Vinay Mehta + 1 Zhang Xiaofeng + 1 aditya singh rathore + 1 alexanderbianchi + 1 blaginin + 1 dario curreri + 1 dd-david-levin + 1 gabriel + 1 nathan + 1 niebayes +``` + +Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.