Skip to content

feat: adds array_add function#22459

Open
SubhamSinghal wants to merge 4 commits into
apache:mainfrom
SubhamSinghal:array-add-function
Open

feat: adds array_add function#22459
SubhamSinghal wants to merge 4 commits into
apache:mainfrom
SubhamSinghal:array-add-function

Conversation

@SubhamSinghal
Copy link
Copy Markdown
Contributor

@SubhamSinghal SubhamSinghal commented May 22, 2026

Which issue does this PR close?

  • Part of #21536 (array_add — first PR in the vector math series).

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Yes, via SLT only (array_add.slt). Coverage:

  • Happy paths: basic, negative components, single-element, empty, multi-row.
  • NULL propagation: whole-row NULL on each side / both sides; element-level NULL on each side / both sides at same
    and different positions.
  • Type / variant: integer literals, mixed int+float, LargeList×LargeList, mixed List+LargeList,
    FixedSizeListList coercion, Float32 leaf, Int64 leaf.
  • Decimal handling: Decimal128 / Decimal256 rejected at planning; explicit cast to DOUBLE opt-in works.
  • Error paths: per-row length mismatch (exec), unsupported non-list input (plan), non-numeric leaf (plan), boolean
    leaf (plan), nested list (plan), wrong arg count.
  • Aliases: list_add single-row + multi-row.
  • Composition: array_add(array_add(...), ...) chained — single-row, with element NULLs propagating across both
    layers, and multi-row with row-level NULL.

Are there any user-facing changes?

Yes — two new functions:

  • array_add(array1, array2) → List<Float64> / LargeList<Float64>
  • list_add(...) alias

Both exposed via expr_fn and registered in all_default_nested_functions(). Documented inline via #[user_doc]
(description, syntax, SQL example, argument descriptions).

No breaking API changes.

@github-actions github-actions Bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels May 22, 2026
@crm26
Copy link
Copy Markdown
Contributor

crm26 commented May 22, 2026

Heads up — the check configs.md and ***_functions.md is up-to-date job is red because docs/source/user-guide/sql/scalar_functions.md needs to be regenerated for the new array_add entry from your #[user_doc] macro.

One-command fix:

./dev/update_function_docs.sh
git add docs/source/user-guide/sql/scalar_functions.md

Then amend or new-commit + push.

Ok(arg_types[0].clone())
}

fn coerce_types(&self, arg_types: &[DataType]) -> Result<Vec<DataType>> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic in here seems to differ from what we usually have, e.g.

fn coerce_types(&self, arg_types: &[DataType]) -> Result<Vec<DataType>> {
let [_, _] = take_function_args(self.name(), arg_types)?;
let coercion = Some(&ListCoercion::FixedSizedListToList);
for arg_type in arg_types {
if !matches!(arg_type, Null | List(_) | LargeList(_) | FixedSizeList(..)) {
return plan_err!("{} does not support type {arg_type}", self.name());
}
}
// If any input is `LargeList`, both sides must be widened to `LargeList`
// so the runtime dispatch in `cosine_distance_inner` sees a homogeneous
// pair. Follows the pattern in `ArrayConcat::coerce_types`.
let any_large_list = arg_types.iter().any(|t| matches!(t, LargeList(_)));
let coerced = arg_types
.iter()
.map(|arg_type| {
if matches!(arg_type, Null) {
let field = Arc::new(Field::new_list_field(DataType::Float64, true));
return if any_large_list {
LargeList(field)
} else {
List(field)
};
}
let coerced = coerced_type_with_base_type_only(
arg_type,
&DataType::Float64,
coercion,
);
match coerced {
List(field) if any_large_list => LargeList(field),
other => other,
}
})
.collect();
Ok(coerced)
}

Is there a reason for this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Jefffrey I have added more checks like

  1. List of List is handled upfront. List<List> throws error as non numeric
  2. Decimal128 and Decimal256 types throws error as downcasting it would loose precision.
  3. throws error if element type is non-numeric to avoid automatic typecasting of boolean or string type to numeric.

Let me know if I should get rid of these checks.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @crm26

We should try to ensure these array math functions are consistent with each other on the signature they accept

Copy link
Copy Markdown
Contributor Author

@SubhamSinghal SubhamSinghal May 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Jefffrey can we make this a util function?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That could be useful to keep them aligned, until we have better support in the signature API for these types of behaviour

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Jefffrey Create utility function coerce_array_math_arg_types with same implementation as other existing functions. We can add more checks to common function if needed.

Comment thread datafusion/functions-nested/src/array_add.rs Outdated
Comment thread datafusion/functions-nested/src/array_add.rs Outdated
Comment thread datafusion/functions-nested/src/array_add.rs Outdated
Comment thread datafusion/functions-nested/src/array_add.rs Outdated
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label May 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation functions Changes to functions implementation sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants