Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

window::shift to work for all array types #388

Merged
merged 6 commits into from
Jun 8, 2021

Conversation

jimexist
Copy link
Member

@jimexist jimexist commented May 31, 2021

Which issue does this PR close?

Closes #392 .

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@codecov-commenter
Copy link

Codecov Report

Merging #388 (2764be8) into master (f41cb17) will increase coverage by 0.03%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #388      +/-   ##
==========================================
+ Coverage   82.61%   82.65%   +0.03%     
==========================================
  Files         162      162              
  Lines       44228    44283      +55     
==========================================
+ Hits        36538    36600      +62     
+ Misses       7690     7683       -7     
Impacted Files Coverage Δ
arrow/src/compute/kernels/window.rs 100.00% <100.00%> (ø)
parquet/src/encodings/encoding.rs 95.04% <0.00%> (+0.19%) ⬆️
arrow/src/array/transform/mod.rs 89.48% <0.00%> (+0.29%) ⬆️
arrow/src/array/array.rs 76.47% <0.00%> (+0.39%) ⬆️
arrow/src/array/data.rs 72.37% <0.00%> (+0.90%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f41cb17...2764be8. Read the comment docs.

Copy link
Contributor

@nevi-me nevi-me left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments

pub fn shift(array: &Array, offset: i64) -> Result<ArrayRef> {
let value_len = array.len() as i64;
if offset == 0 {
Ok(array.slice(0, array.len()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If offset == 0, why not return array.clone()? array.slice does incur small overhead as it might have to descend into the array children for nested types (see #389 )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's not that easy given Array does not derive Copy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. This works: Ok(make_array(array.data_ref().clone()))

Going this route avoids:

  • checking the offset and length
  • recomputing the null count

// Concatenate both arrays, add nulls after if shift > 0 else before
if offset > 0 {
concat(&[null_arr.as_ref(), slice.as_ref()])
pub fn shift(array: &Array, offset: i64) -> Result<ArrayRef> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! May you please add tests for struct and list types.

@jimexist jimexist marked this pull request as ready for review June 3, 2021 23:49
@alamb
Copy link
Contributor

alamb commented Jun 7, 2021

Do you think this one is ready now @nevi-me ?

Copy link
Member

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. 👍

Copy link
Contributor

@Dandandan Dandandan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@alamb alamb merged commit 0f55b82 into apache:master Jun 8, 2021
@jimexist jimexist deleted the shift-array-all-cases branch June 15, 2021 01:33
alamb pushed a commit that referenced this pull request Jun 19, 2021
* add more doc test for window::shift

* use Ok(make_array(array.data_ref().clone()))

* shift array for not only primitive cases

* include more test cases

* add back copied

* fix renaming
alamb pushed a commit that referenced this pull request Jun 20, 2021
* add more doc test for window::shift

* use Ok(make_array(array.data_ref().clone()))

* shift array for not only primitive cases

* include more test cases

* add back copied

* fix renaming
alamb added a commit that referenced this pull request Jun 22, 2021
* add more doc test for window::shift

* use Ok(make_array(array.data_ref().clone()))

* shift array for not only primitive cases

* include more test cases

* add back copied

* fix renaming

Co-authored-by: Jiayu Liu <Jimexist@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

window::shift can work for more than just primitive array type
6 participants