GH-32613: [C++] Simplify IPC writer for dense unions #33822
Hello @kou, I have fixed the broken clang-format test. Thanks.
@github-actions crossbow submit -g cpp
Revision: d787277 Submitted crossbow builds: ursacomputing/crossbow @ actions-f892614f66
Do existing tests cover this case?
I think they should. I was looking at the implementation in Go as a reference, and it doesn't seem like the PR that introduced the change (#13806) added any additional tests for finding the offset. I am of the idea right now that existing tests should break if this change is wrong. Please let me know if this is not true, and a unit test is warranted.
If you add a bug to the target logic and there are tests for this case, those existing tests will fail.
But won't those tests run as part of the CI pipeline? I can definitely try to run them, but would need some help 😅. Would running all tests in the ipc/ package be sufficient?
You can just push a commit that includes a bug to this branch.
Hello @kou, I have introduced a bug in the code. I added a +1 to the array index while accessing shifted offsets.
Looks like I will revert the bug if the test results seem satisfactory to you.
Thanks. Could you revert the bug?
I have reverted the three commits in which I pushed bugs.
+1
Benchmark runs are scheduled for baseline = 3ab246f and contender = b413ac4. b413ac4 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
['Python', 'R'] benchmarks have high level of regressions.
) JIRA: https://issues.apache.org/jira/browse/ARROW-17339
Closes: apache#32613

### Rationale for this change
Dense union offsets are always non-strictly monotonic for any given child as mandated by the spec, The C++ implementation still assumes that the offsets may be in any order. This can be improved.

### What changes are included in this PR?
Just a change to eliminate looping over the size of a `DenseUnionArray` twice.

### Are these changes tested?
I am not functionally changing anything. All changes respect the spec, and behavior should be 1:1 with the existing implementation. I believe existing tests should suffice.

### Are there any user-facing changes?
There are no user facing changes for this.

* Closes: apache#32613

Lead-authored-by: Ramasai Tadepalli <ramasai.tadepalli+3108@gmail.com>
Co-authored-by: Ramasai <ramasai.tadepalli+3108@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
JIRA: https://issues.apache.org/jira/browse/ARROW-17339
Closes: #32613
Rationale for this change
Dense union offsets are always monotonic (non-strictly increasing) for any given child, as mandated by the spec. The C++ implementation still assumes that the offsets may be in any order. This can be improved.
What changes are included in this PR?
Just a change to eliminate looping over the size of a `DenseUnionArray` twice.
Are these changes tested?
I am not functionally changing anything. All changes respect the spec, and behavior should be 1:1 with the existing implementation. I believe existing tests should suffice.
Are there any user-facing changes?
There are no user-facing changes for this.
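
To illustrate the rationale, here is a minimal standalone sketch of a single-pass offset rebase. It is not the code from this PR and does not use the Arrow API. The names `RebasedUnion`, `RebaseDenseUnionOffsets`, `child_ids`, and `num_children` are hypothetical, and the sketch assumes each slot already carries a child index in the range `[0, num_children)`. Because offsets for a given child never decrease, the first offset seen for a child is also its minimum, so no separate pass is needed to find it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch (not an Arrow type).
struct RebasedUnion {
  std::vector<int32_t> shifted_offsets;  // offsets relative to each child's first used slot
  std::vector<int32_t> child_starts;     // first offset seen per child, -1 if the child is unused
  std::vector<int32_t> child_lengths;    // number of child slots spanned by the input
};

// Assumes child_ids[i] is in [0, num_children) and that, for each child,
// offsets are non-decreasing in slot order (the property the PR relies on).
RebasedUnion RebaseDenseUnionOffsets(const std::vector<int8_t>& child_ids,
                                     const std::vector<int32_t>& offsets,
                                     int num_children) {
  RebasedUnion out;
  out.shifted_offsets.resize(offsets.size());
  out.child_starts.assign(num_children, -1);
  out.child_lengths.assign(num_children, 0);

  for (std::size_t i = 0; i < offsets.size(); ++i) {
    const int8_t child = child_ids[i];
    // The first occurrence of a child holds its smallest offset.
    if (out.child_starts[child] == -1) {
      out.child_starts[child] = offsets[i];
    }
    const int32_t shifted = offsets[i] - out.child_starts[child];
    out.shifted_offsets[i] = shifted;
    // Offsets never decrease per child, so the latest slot gives the length.
    out.child_lengths[child] = shifted + 1;
  }
  return out;
}
```

With child ids `{0, 1, 0, 1, 1}` and offsets `{3, 5, 4, 6, 7}`, the single loop records start 3 for child 0 and start 5 for child 1, and rewrites the offsets relative to those starts. If the offsets were allowed in any order, the minimum per child would have to be found in a first pass before shifting, which is the second loop the PR describes removing.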