feat(expr): array_{ndims,lower,upper,length,dims} #10197

Merged
merged 8 commits into main from feat-expr-array-dims
Jun 9, 2023

Conversation

xiangjinwu
Contributor

@xiangjinwu xiangjinwu commented Jun 6, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Support the following array functions:

  • array_ndims ( anyarray ) → integer
  • array_lower ( anyarray, integer ) → integer
  • array_upper ( anyarray, integer ) → integer (currently returns bigint, to be fixed as breaking change)
  • array_length ( anyarray, integer ) → integer (currently returns bigint, to be fixed as breaking change)
  • array_dims ( anyarray ) → text

The key differences between PostgreSQL array (tensor) and RisingWave array (nested list) are exhibited in related e2e tests. See #3811 (comment) for rationale.

As for implementation:

  • array_ndims can be statically evaluated based on its type only, similar to pg_typeof.
  • array_lower is always 1 for valid input, so it is rewritten to a simple CASE WHEN expression.
  • array_upper is semantically the same as array_length.
  • array_dims can also be rewritten to array_lower + array_upper + concat_ws. But a direct implementation is simpler.
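
The first two points can be illustrated with a minimal Rust sketch. It is not the code in this PR: `DataType` here is a hypothetical, simplified enum, and what counts as "valid input" for array_lower (a non-NULL, non-empty array with dimension 1) is an assumption made for illustration.

```rust
/// Hypothetical, simplified stand-in for a SQL type (not RisingWave's actual `DataType`).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    List(Box<DataType>),
}

/// `array_ndims`: count the `List` layers in the type. No array data is read,
/// which is why the call can be evaluated statically from the type alone.
fn array_ndims(ty: &DataType) -> i32 {
    match ty {
        DataType::List(inner) => 1 + array_ndims(inner),
        _ => 0,
    }
}

/// `array_lower` as a CASE WHEN-style rewrite: 1 for valid input, NULL (`None`) otherwise.
/// ASSUMPTION: "valid" means a non-NULL, non-empty array and dimension 1.
fn array_lower<T>(arr: Option<&[T]>, dim: Option<i32>) -> Option<i32> {
    match (arr, dim) {
        (Some(a), Some(1)) if !a.is_empty() => Some(1),
        _ => None,
    }
}
```

Under these assumptions, a value of type `int[][]` has `array_ndims` 2 regardless of its contents, and `array_lower` never needs a dedicated runtime function.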

Limitations:

  • array_upper / array_length second argument is limited to 1, even when the array is rectangular.
  • array_dims only works for 1d array, even when the array is rectangular.
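
A rough Rust sketch of these limitations follows. The `Value` and `ArrayFnError` types are hypothetical, and reporting an error for unsupported input is an assumption: this description does not say whether the real functions raise an error or return NULL. Empty arrays are not modeled (PostgreSQL returns NULL for them).

```rust
/// Hypothetical value model for a nested list (not RisingWave's actual types).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i32),
    List(Vec<Value>),
}

/// Hypothetical error type used only in this sketch.
#[derive(Debug, PartialEq)]
enum ArrayFnError {
    UnsupportedDimension(i32),
    NotOneDimensional,
}

/// `array_length`: only the outermost dimension (1) is accepted,
/// even if the nested lists happen to form a rectangle.
fn array_length(arr: &[Value], dim: i32) -> Result<i64, ArrayFnError> {
    if dim != 1 {
        return Err(ArrayFnError::UnsupportedDimension(dim));
    }
    Ok(arr.len() as i64)
}

/// `array_upper` is the same as `array_length` because the lower bound is always 1.
fn array_upper(arr: &[Value], dim: i32) -> Result<i64, ArrayFnError> {
    array_length(arr, dim)
}

/// `array_dims`: direct implementation for 1d arrays, formatted as `[1:n]`.
/// The sketch detects nesting from the values; a real system would use the type.
fn array_dims(arr: &[Value]) -> Result<String, ArrayFnError> {
    if arr.iter().any(|v| matches!(v, Value::List(_))) {
        return Err(ArrayFnError::NotOneDimensional);
    }
    Ok(format!("[1:{}]", arr.len()))
}
```

For a flat three-element array this yields a length of 3 and the text `[1:3]`. A rectangular 2x2 nested list is still rejected by `array_dims` and by `array_length` with a second argument of 2.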

Resolves #8201
Resolves #10135

Checklist For Contributors

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Documentation

Types of user-facing changes

  • SQL commands, functions, and operators

Release note

Support array_ndims, array_lower, array_upper, array_length, array_dims, where the latter 4 are limited to 1 dimension.

@github-actions github-actions bot added type/feature user-facing-changes Contains changes that are visible to users labels Jun 6, 2023
codecov bot commented Jun 6, 2023

Codecov Report

Merging #10197 (2af3c45) into main (9688b22) will decrease coverage by 0.02%.
The diff coverage is 32.53%.

@@            Coverage Diff             @@
##             main   #10197      +/-   ##
==========================================
- Coverage   70.73%   70.71%   -0.02%     
==========================================
  Files        1237     1237              
  Lines      211682   211761      +79     
==========================================
+ Hits       149725   149744      +19     
- Misses      61957    62017      +60     
Flag Coverage Δ
rust 70.71% <32.53%> (-0.02%) ⬇️

Impacted Files Coverage Δ
src/common/src/types/mod.rs 62.85% <0.00%> (-0.60%) ⬇️
src/frontend/src/expr/pure.rs 93.44% <ø> (ø)
src/frontend/src/expr/type_inference/func.rs 78.52% <0.00%> (-1.18%) ⬇️
src/expr/src/vector_op/array_length.rs 13.04% <9.52%> (-11.96%) ⬇️
src/frontend/src/binder/expr/function.rs 86.40% <65.78%> (-1.12%) ⬇️

... and 8 files with indirect coverage changes

Member

@xxchan xxchan left a comment

I had a hard time understanding what lower/upper bounds mean, because if they are just 1/length, why do the new concepts exist?

It seems PG has this because it supports custom array bounds [lb:ub]={..}. But I still cannot come up with any use cases. 🤡 Do you know whether they are useful?

Others generally LGTM.

@xiangjinwu
Contributor Author

xiangjinwu commented Jun 7, 2023

I had a hard time understanding what lower/upper bounds mean, because if they are just 1/length, why do the new concepts exist?

It seems PG has this because it supports custom array bounds [lb:ub]={..}. But I still cannot come up with any use cases. 🤡 Do you know whether they are useful?

Maybe it is hard to come up with use cases because most programming languages do not support custom array bounds, and we are so used to not relying on them. Although arrays are 0-indexed or 1-indexed in different contexts, we adapt easily with +1 or -1. For example, Pascal does support custom array bounds (var arr: array[lb..ub] of integer;), and I only ever used 1..n when implementing a binary heap, where 1 is the root and the children of node n are 2n and 2n+1.

Even if we find a use case and would like to support it, we also need to make sure the sinks and UDF servers are able to handle it.

@xxchan
Member

xxchan commented Jun 7, 2023 via email

@xiangjinwu
Contributor Author

I don’t think our users might need it (or even know it), so I’m not sure whether the functions are worth adding…

I agree. The only function a new user would need is the unary array_length from #8636. All 5 PostgreSQL functions here are just for compatibility (of non-empty 1d array) with existing queries (especially from PostgreSQL clients / drivers).

Contributor

@yezizp2012 yezizp2012 left a comment

LGTM!

@xxchan
Member

xxchan commented Jun 7, 2023

for compatibility (of non-empty 1d array) with existing queries (especially from PostgreSQL clients / drivers).

Ah, I didn't notice the requirement is from #10134, so I was confused about why it was being added. 🤪

@xiangjinwu
Contributor Author

for compatibility (of non-empty 1d array) with existing queries (especially from PostgreSQL clients / drivers).

Ah, I didn't notice the requirement is from #10134, so I was confused about why it was being added. 🤪

There's also #8201. I also intended the test expectations in this PR to serve as concrete examples of the semantics (compatibilities and incompatibilities) proposed in #3811 (comment).

@Honeta Honeta linked an issue Jun 8, 2023 that may be closed by this pull request
@xiangjinwu xiangjinwu added this pull request to the merge queue Jun 9, 2023
Merged via the queue into main with commit cb7d24f Jun 9, 2023
34 checks passed
@xiangjinwu xiangjinwu deleted the feat-expr-array-dims branch June 9, 2023 03:22
Labels
  • type/feature
  • user-facing-changes: Contains changes that are visible to users
  • 📖✓: Covered or will be covered in the user docs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

  • Support functions: array_upper and array_lower
  • feat: implement array_ndims
  • Support array_upper function
4 participants