[VL] Useful Velox PRs not merged into upstream #11585

This issue tracks useful Velox PRs, mostly submitted by the Gluten community, that have not been merged upstream. An entry is removed automatically once its PR is merged. Comment your PR number below if you want it tracked here. These PRs have not been picked into gluten/velox because of the rebase effort; you may pick them yourself if necessary.


## Tracked PRs (2026-03-11)

### ANSI

- [16307](https://github.com/facebookincubator/velox/pull/16307): ✅ [PICKABLE] feat: Support decimal type for Spark checked_multiply function

### Bug Fixes

- [15751](https://github.com/facebookincubator/velox/pull/15751): 📕 [CLOSED] feat: Flush row group by buffered bytes in parquet writer
- [13734](https://github.com/facebookincubator/velox/pull/13734): 📕 [CLOSED] fix: Support constant value for lead/lag function
- [12684](https://github.com/facebookincubator/velox/pull/12684): 📕 [CLOSED] fix: Make type check reflect the corresponding logical type
- [12630](https://github.com/facebookincubator/velox/pull/12630): 📕 [CLOSED] fix(sparksql): Fix result mismatch cases in casting varchar to timestamp
- [12563](https://github.com/facebookincubator/velox/pull/12563): 📕 [CLOSED] fix: HashProbe load LazyVector before wrapping
- [10925](https://github.com/facebookincubator/velox/pull/10925): 📕 [CLOSED] Fix invalid 'rawResultNulls_' in SelectiveDecimalColumnReader
- [8825](https://github.com/facebookincubator/velox/pull/8825): 📕 [CLOSED] Fix Spark split function
- [8888](https://github.com/facebookincubator/velox/pull/8888): 📕 [CLOSED] Fix NaN in Spark array_intersect and array_except functions
- [15173](https://github.com/facebookincubator/velox/pull/15173): ✅ [PICKABLE] fix(parquet): Fix reading array of row
- [8014](https://github.com/facebookincubator/velox/pull/8014): 📕 [CLOSED] Fix adapters issue caused by null config in FileHandleGenerator
- [14504](https://github.com/facebookincubator/velox/pull/14504): 📕 [CLOSED] fix: Fix incorrect null results when unrelated child fields are missing from the requested type in readers
- [15534](https://github.com/facebookincubator/velox/pull/15534): ⚠️ [CONFLICT] fix: Trim numeric suffix when casting string to real/double
- [15313](https://github.com/facebookincubator/velox/pull/15313): 📕 [CLOSED] fix(s3): Make the MinioServer instance only be initialized once
- [14277](https://github.com/facebookincubator/velox/pull/14277): 📕 [CLOSED] fix: An unloaded lazy vector cannot be wrapped by two different top level vectors
- [13138](https://github.com/facebookincubator/velox/pull/13138): 📕 [CLOSED] fix: Fix the full outer join result mismatch issue with multi duplicated match
- [11772](https://github.com/facebookincubator/velox/pull/11772): 📕 [CLOSED] fix: Fix the MergeSource data lost issue
- [11068](https://github.com/facebookincubator/velox/pull/11068): 📕 [CLOSED] Fix full outer result mismatch issue when output contains multiple matching rows
- [10402](https://github.com/facebookincubator/velox/pull/10402): 📕 [CLOSED] Fix the "An unsupported nested encoding was found." exception in parquet writer
- [13907](https://github.com/facebookincubator/velox/pull/13907): ✅ [PICKABLE] feat: Fix the full outer join result mismatch issue
- [11771](https://github.com/facebookincubator/velox/pull/11771): ⚠️ [CONFLICT] fix: Fix smj result mismatch issue in semi, anti and full outer join
- [15711](https://github.com/facebookincubator/velox/pull/15711): ✅ [PICKABLE] fix: Reduce memory spike in aggregate window functions
- [15343](https://github.com/facebookincubator/velox/pull/15343): ✅ [PICKABLE] feat(parquet): Allow reading a wider integer as a narrower one
- [16164](https://github.com/facebookincubator/velox/pull/16164): ✅ [PICKABLE] fix: Def/rep level calculation for legacy Parquet lists
- [16511](https://github.com/facebookincubator/velox/pull/16511): ✅ [PICKABLE] fix: Check for corrupted repeat/define lengths in Parquet headers
- [15953](https://github.com/facebookincubator/velox/pull/15953): ✅ [PICKABLE] fix: Use EvictingCacheMap for compiled regular expressions

### Enhancement

- [14472](https://github.com/facebookincubator/velox/pull/14472): ⚠️ [CONFLICT] feat: Support multi-threaded asynchronous data upload to object storage
- [14214](https://github.com/facebookincubator/velox/pull/14214): 📕 [CLOSED] feat(parquet): Support page‑level pruning
- [11285](https://github.com/facebookincubator/velox/pull/11285): 📕 [CLOSED] misc: Optimize the computation of sliding window kRange frame bound
- [10638](https://github.com/facebookincubator/velox/pull/10638): 📕 [CLOSED] Distinguish null constant and non-null constant in simple function's initialize method
- [9591](https://github.com/facebookincubator/velox/pull/9591): 📕 [CLOSED] Support offset-based timezone
- [8769](https://github.com/facebookincubator/velox/pull/8769): 📕 [CLOSED] Enable UNKNOWN type in type dispatch
- [15707](https://github.com/facebookincubator/velox/pull/15707): 📕 [CLOSED] feat: Enable the hash join to accept a pre-built hash table for joining
- [13762](https://github.com/facebookincubator/velox/pull/13762): ✅ [PICKABLE] feat: Optimize nested loop other join types with small build side
- [11808](https://github.com/facebookincubator/velox/pull/11808): ✅ [PICKABLE] feat: Add negated hugeint filters
- [11740](https://github.com/facebookincubator/velox/pull/11740): ✅ [PICKABLE] feat: Support decimal schema evolution in Parquet scan
- [11646](https://github.com/facebookincubator/velox/pull/11646): ⚠️ [CONFLICT] feat: Support row group skip for Parquet decimal
- [5962](https://github.com/facebookincubator/velox/pull/5962): ✅ [PICKABLE] feat: Support struct schema evolution matching by name
- [5464](https://github.com/facebookincubator/velox/pull/5464): 📕 [CLOSED] Add decimal type support for Spark first/last aggregate functions
- [11836](https://github.com/facebookincubator/velox/pull/11836): 📕 [CLOSED] feat: Optimize serializer decompress buffer for BufferInputStream
- [11824](https://github.com/facebookincubator/velox/pull/11824): 📕 [CLOSED] feat: Support prefix comparator in spill merge
- [11703](https://github.com/facebookincubator/velox/pull/11703): 📕 [CLOSED] feat(prefix sort): Eliminate null byte from prefix encoder when single sort key
- [11685](https://github.com/facebookincubator/velox/pull/11685): 📕 [CLOSED] feat: Support read file stream without buffer
- [11954](https://github.com/facebookincubator/velox/pull/11954): 📕 [CLOSED] feat: Support Spark explode outer
- [13862](https://github.com/facebookincubator/velox/pull/13862): 📕 [CLOSED] feat: Support spill write batch size limit
- [10456](https://github.com/facebookincubator/velox/pull/10456): 📕 [CLOSED] Support semi projection join type in smj
- [13041](https://github.com/facebookincubator/velox/pull/13041): ⚠️ [CONFLICT] feat: Enable the hash join to accept a pre-built hash table for joining
- [11272](https://github.com/facebookincubator/velox/pull/11272): 📕 [CLOSED] Support string type for PrefixSort
- [13817](https://github.com/facebookincubator/velox/pull/13817): 📕 [CLOSED] feat: Add zstd compression for unified compression API
- [11206](https://github.com/facebookincubator/velox/pull/11206): 📕 [CLOSED] Supports serializing a range of rows for UnsafeRowFast
- [7734](https://github.com/facebookincubator/velox/pull/7734): 📕 [CLOSED] Fix Parquet writer to produce evenly-sized row groups
- [15116](https://github.com/facebookincubator/velox/pull/15116): 📕 [CLOSED] feat: In the str_to_map Spark function, entryDelimiter and keyValueDelimiter are supported for more characters
- [15751](https://github.com/facebookincubator/velox/pull/15751): 📕 [CLOSED] feat: Flush row group by buffered bytes in parquet writer
- [15409](https://github.com/facebookincubator/velox/pull/15409): ✅ [PICKABLE] feat(spilling): Fallback to timsort when allocation of prefix sort buffer memory fails during spilling
- [15290](https://github.com/facebookincubator/velox/pull/15290): ⚠️ [CONFLICT] perf(spilling): Support serializing rows to avoid extracting it as vector
- [15848](https://github.com/facebookincubator/velox/pull/15848): ✅ [PICKABLE] feat: Allow subfield rename and deletion for ORC format
- [15458](https://github.com/facebookincubator/velox/pull/15458): ⚠️ [CONFLICT] perf: Optimize basic numeric upcast
- [15300](https://github.com/facebookincubator/velox/pull/15300): 📕 [CLOSED] feat: Add support for ORC writer
- [16514](https://github.com/facebookincubator/velox/pull/16514): ✅ [PICKABLE] feat(sparksql): Support multi-character delimiters in str_to_map
- [16547](https://github.com/facebookincubator/velox/pull/16547): ✅ [PICKABLE] perf(exec): Tiled column-major extraction for RowContainer
- [16546](https://github.com/facebookincubator/velox/pull/16546): ✅ [PICKABLE] perf(exec): Combine low-selectivity filter results in HashProbe
- [16545](https://github.com/facebookincubator/velox/pull/16545): ✅ [PICKABLE] perf(exec): Add AMAC prefetch optimization for listJoinResults

### Iceberg

- [14276](https://github.com/facebookincubator/velox/pull/14276): 📕 [CLOSED] feat(iceberg): Add Iceberg all functions

### JSON

- [11433](https://github.com/facebookincubator/velox/pull/11433): 📕 [CLOSED] Fix JSON parser to allow control characters in JSON string input
- [12892](https://github.com/facebookincubator/velox/pull/12892): 📕 [CLOSED] feat(sparksql): Support wildcard in json path for get_json_object function
- [5179](https://github.com/facebookincubator/velox/pull/5179): 📕 [CLOSED] Optimize get_json_object Spark function using simdjson
- [6016](https://github.com/facebookincubator/velox/pull/6016): 📕 [CLOSED] Reject duplicated keys in abstract join node
- [14801](https://github.com/facebookincubator/velox/pull/14801): 📕 [CLOSED] fix: Minify JSON objects/arrays in Spark get_json_object

### Regexp

- [10279](https://github.com/facebookincubator/velox/pull/10279): 📕 [CLOSED] Introduce Hyperscan lib to implement regexp functions
- [8387](https://github.com/facebookincubator/velox/pull/8387): 📕 [CLOSED] Fix signature of regexp_replace Spark function and register it in Spark function registry

### Spark Functions

- [7555](https://github.com/facebookincubator/velox/pull/7555): 📕 [CLOSED] Add date_format Spark function
- [6296](https://github.com/facebookincubator/velox/pull/6296): 📕 [CLOSED] Add SparkSQL url_decode function
- [9719](https://github.com/facebookincubator/velox/pull/9719): 📕 [CLOSED] Support `allowPrecisionLoss` in Spark decimal ops
- [11304](https://github.com/facebookincubator/velox/pull/11304): 📕 [CLOSED] Register re-usable Presto date_trunc functions for Spark
- [9714](https://github.com/facebookincubator/velox/pull/9714): 📕 [CLOSED] Add session timezone getter
- [7086](https://github.com/facebookincubator/velox/pull/7086): 📕 [CLOSED] Support Spark array_union function
- [7083](https://github.com/facebookincubator/velox/pull/7083): 📕 [CLOSED] Add Spark quarter function
- [12780](https://github.com/facebookincubator/velox/pull/12780): 📕 [CLOSED] feat: Add Spark to_pretty_string function
- [12763](https://github.com/facebookincubator/velox/pull/12763): 📕 [CLOSED] feat: Add Spark make_dt_interval function
- [12762](https://github.com/facebookincubator/velox/pull/12762): 📕 [CLOSED] feat: Add CAST(interval year month as integer)
- [12521](https://github.com/facebookincubator/velox/pull/12521): 📕 [CLOSED] [Velox] Add Support for Day Time Interval Type
- [12369](https://github.com/facebookincubator/velox/pull/12369): 📕 [CLOSED] feat: Add support for Timestamp to Integral for Spark
- [12230](https://github.com/facebookincubator/velox/pull/12230): 📕 [CLOSED] feat: Add support for double to timestamp cast for Spark
- [12229](https://github.com/facebookincubator/velox/pull/12229): 📕 [CLOSED] feat: Add Spark support to cast double to timestamp 
- [10788](https://github.com/facebookincubator/velox/pull/10788): 📕 [CLOSED] Add Spark split_part function
- [8692](https://github.com/facebookincubator/velox/pull/8692): 📕 [CLOSED] Add map_from_entries Spark function
- [10359](https://github.com/facebookincubator/velox/pull/10359): 📕 [CLOSED] Add SparkSql function to_pretty_string
- [12749](https://github.com/facebookincubator/velox/pull/12749): 📕 [CLOSED] feat: Register spark map_from_entries function
- [11033](https://github.com/facebookincubator/velox/pull/11033): 📕 [CLOSED] Support sparksql approx_percentile
- [10280](https://github.com/facebookincubator/velox/pull/10280): 📕 [CLOSED] feat: Register function for map_from_arrays for SparkSQL
- [11114](https://github.com/facebookincubator/velox/pull/11114): 📕 [CLOSED] Support all patterns for Spark CAST(varchar as timestamp)
- [12512](https://github.com/facebookincubator/velox/pull/12512): 📕 [CLOSED] fix(expr): Align cast from decimal to float/double with Spark and Presto
- [4859](https://github.com/facebookincubator/velox/pull/4859): 📕 [CLOSED] Add months_between Spark function
- [4830](https://github.com/facebookincubator/velox/pull/4830): 📕 [CLOSED] Add next_day Spark function
- [10641](https://github.com/facebookincubator/velox/pull/10641): 📕 [CLOSED] Support overflow in Timestamp::toTimeZone method
- [5419](https://github.com/facebookincubator/velox/pull/5419): 📕 [CLOSED] Add substring_index Spark function
- [11126](https://github.com/facebookincubator/velox/pull/11126): 📕 [CLOSED] Skip overflow check for decimal add in agg function
- [9272](https://github.com/facebookincubator/velox/pull/9272): 📕 [CLOSED] Add normalize_nan Spark function
- [8356](https://github.com/facebookincubator/velox/pull/8356): 📕 [CLOSED] Add nanvl Spark function
- [7204](https://github.com/facebookincubator/velox/pull/7204): 📕 [CLOSED] Add corr Spark function
