-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Java] Why does VectorSchemaRoot::slice return reference to same root when slice index is [0, rowCount] ? #35275
Comments
Also in this documentation https://arrow.apache.org/docs/java/vector_schema_root.html#vectorschemaroot |
lidavidm
pushed a commit
that referenced
this issue
May 8, 2023
…5476) ### Rationale for this change The optimization in the `slice` function for VectorSchemaRoot that returns the reference to the original root when the slice covers the entire root should be removed because this is an inconsistent interface (sometimes you get a totally new autocloseable and sometimes you get the original one). Also - in the [this documentation page](https://arrow.apache.org/docs/java/vector_schema_root.html#vectorschemaroot) it explicitly says that "A **NEW** VectorSchemaRoot can be sliced from an existing root without copying data:" so any time you call `slice`, the slice root should be a new one (even if it's the entire root). The test for slice has also been changed to have more coverage of the slice function (not just slicing from the start index). ### What changes are included in this PR? Remove self reference return in `VectorSchemaRoot#split` and expand test coverage for the function. ### Are these changes tested? Yes ### Are there any user-facing changes? Yes **This PR includes breaking changes to public APIs.** * Closes: #35275 Authored-by: Parth <parthchonkar@gmail.com> Signed-off-by: David Li <li.davidm96@gmail.com>
liujiacheng777
pushed a commit
to LoongArch-Python/arrow
that referenced
this issue
May 11, 2023
…ot (apache#35476) ### Rationale for this change The optimization in the `slice` function for VectorSchemaRoot that returns the reference to the original root when the slice covers the entire root should be removed because this is an inconsistent interface (sometimes you get a totally new autocloseable and sometimes you get the original one). Also - in the [this documentation page](https://arrow.apache.org/docs/java/vector_schema_root.html#vectorschemaroot) it explicitly says that "A **NEW** VectorSchemaRoot can be sliced from an existing root without copying data:" so any time you call `slice`, the slice root should be a new one (even if it's the entire root). The test for slice has also been changed to have more coverage of the slice function (not just slicing from the start index). ### What changes are included in this PR? Remove self reference return in `VectorSchemaRoot#split` and expand test coverage for the function. ### Are these changes tested? Yes ### Are there any user-facing changes? Yes **This PR includes breaking changes to public APIs.** * Closes: apache#35275 Authored-by: Parth <parthchonkar@gmail.com> Signed-off-by: David Li <li.davidm96@gmail.com>
ArgusLi
pushed a commit
to Bit-Quill/arrow
that referenced
this issue
May 15, 2023
…ot (apache#35476) ### Rationale for this change The optimization in the `slice` function for VectorSchemaRoot that returns the reference to the original root when the slice covers the entire root should be removed because this is an inconsistent interface (sometimes you get a totally new autocloseable and sometimes you get the original one). Also - in the [this documentation page](https://arrow.apache.org/docs/java/vector_schema_root.html#vectorschemaroot) it explicitly says that "A **NEW** VectorSchemaRoot can be sliced from an existing root without copying data:" so any time you call `slice`, the slice root should be a new one (even if it's the entire root). The test for slice has also been changed to have more coverage of the slice function (not just slicing from the start index). ### What changes are included in this PR? Remove self reference return in `VectorSchemaRoot#split` and expand test coverage for the function. ### Are these changes tested? Yes ### Are there any user-facing changes? Yes **This PR includes breaking changes to public APIs.** * Closes: apache#35275 Authored-by: Parth <parthchonkar@gmail.com> Signed-off-by: David Li <li.davidm96@gmail.com>
rtpsw
pushed a commit
to rtpsw/arrow
that referenced
this issue
May 16, 2023
…ot (apache#35476) ### Rationale for this change The optimization in the `slice` function for VectorSchemaRoot that returns the reference to the original root when the slice covers the entire root should be removed because this is an inconsistent interface (sometimes you get a totally new autocloseable and sometimes you get the original one). Also - in the [this documentation page](https://arrow.apache.org/docs/java/vector_schema_root.html#vectorschemaroot) it explicitly says that "A **NEW** VectorSchemaRoot can be sliced from an existing root without copying data:" so any time you call `slice`, the slice root should be a new one (even if it's the entire root). The test for slice has also been changed to have more coverage of the slice function (not just slicing from the start index). ### What changes are included in this PR? Remove self reference return in `VectorSchemaRoot#split` and expand test coverage for the function. ### Are these changes tested? Yes ### Are there any user-facing changes? Yes **This PR includes breaking changes to public APIs.** * Closes: apache#35275 Authored-by: Parth <parthchonkar@gmail.com> Signed-off-by: David Li <li.davidm96@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the usage question you have. Please include as many useful details as possible.
Hi Arrow community,
In the VectorSchemaRoot::slice implementation it checks if the slice covers the entirety of the root and returns the reference to the same root if so.
Looks like this is done as an optimization to prevent creating another root + transferring to it.
Doesn't this optimization make
slice
inconsistent? Sometimes you get back a new autocloseable resource that has to be separately closed and sometimes you get back the same root.So if you want to calculate something based on a set of different slices of your root then you have to manually check whether the current slice you are requesting isn't the whole root to avoid closing the original root.
Since
splitAndTransfer
do no copying of the actual data, is this optimization worth this tradeoff?Component(s)
Java
The text was updated successfully, but these errors were encountered: