Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Java] DenseUnionVector has no way to set offset/validity directly #24826

Closed
asfimport opened this issue May 1, 2020 · 7 comments
Closed

[Java] DenseUnionVector has no way to set offset/validity directly #24826

asfimport opened this issue May 1, 2020 · 7 comments

Comments

@asfimport
Copy link
Collaborator

You can set the type ID manually, but you cannot set the offset or validity directly. Ideally, we'd have an API like Python that lets us build it directly from constituent vectors and the offsets/type IDs.

Reporter: David Li / @lidavidm

Note: This issue was originally created as ARROW-8666. Please see the migration documentation for further details.

@asfimport
Copy link
Collaborator Author

Liya Fan / @liyafan82:
@lidavidm Thanks for opening this issue.
Could you please give more details about your scenario?
Currently, we have loadFieldBuffers, does it meet your needs?

@asfimport
Copy link
Collaborator Author

David Li / @lidavidm:
Hi [~fan_li_ya], loadFieldBuffers should do the trick, thanks for pointing it out!

The motivation here is we are constructing a dense union vector where the child arrays are in turn struct vectors. Semantically, this is intended as the union of distinct top-level schemas so that we can pass through different datasets in a single Flight RPC. Ideally we can do this with minimal/no copying or allocation, so being able to just transfer the arrays from the input VectorSchemaRoots and construct the right type ID and offset buffers is perfect.

@asfimport
Copy link
Collaborator Author

Bryan Cutler / @BryanCutler:
@lidavidm can this be closed then, or are there further API improvements needed?

@asfimport
Copy link
Collaborator Author

David Li / @lidavidm:
Closing.

@jarohen
Copy link
Contributor

jarohen commented Apr 2, 2024

@lidavidm We've found setOffset to be useful here when constructing DUVs by adding values directly to the underlying vectors - i.e. we do duv.setTypeId(typeId), duv.setOffset(offset), duv.getVectorByType(typeId).set(lng) (we don't do getVectorByType for every row - this is cached).

We've got a commit for it here - would it be something we could donate upstream?

(We've got plenty of end-to-end tests covering it in XTDB, but would also add some unit tests if we raised a PR here.)

@lidavidm
Copy link
Member

lidavidm commented Apr 2, 2024

Sure, I think it makes sense to have still.

@lidavidm lidavidm reopened this Apr 2, 2024
lidavidm pushed a commit that referenced this issue Apr 4, 2024
### Are these changes tested?

Yes

### Are there any user-facing changes?

Yes - the addition of a public DUV.setOffset method
* GitHub Issue: #24826

Authored-by: James Henderson <james@jarohen.dev>
Signed-off-by: David Li <li.davidm96@gmail.com>
@lidavidm lidavidm added this to the 16.0.0 milestone Apr 4, 2024
@lidavidm
Copy link
Member

lidavidm commented Apr 4, 2024

Issue resolved by pull request 40985
#40985

@lidavidm lidavidm closed this as completed Apr 4, 2024
tolleybot pushed a commit to tmct/arrow that referenced this issue May 2, 2024
### Are these changes tested?

Yes

### Are there any user-facing changes?

Yes - the addition of a public DUV.setOffset method
* GitHub Issue: apache#24826

Authored-by: James Henderson <james@jarohen.dev>
Signed-off-by: David Li <li.davidm96@gmail.com>
vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
### Are these changes tested?

Yes

### Are there any user-facing changes?

Yes - the addition of a public DUV.setOffset method
* GitHub Issue: apache#24826

Authored-by: James Henderson <james@jarohen.dev>
Signed-off-by: David Li <li.davidm96@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants