ARROW-11193: [Java][Documentation] Add Java ListVector Documentation #9142

Closed
50 changes: 49 additions & 1 deletion docs/source/java/vector.rst
@@ -217,6 +217,55 @@ to be declared is that writer/reader is not as efficient as direct access
}
}

Building ListVector
===================

A :class:`ListVector` is a vector that holds a list of values at each index. Working with one follows the same steps as described above (create > allocate > mutate > set value count > access > clear), but the details differ slightly because you need to both create the vector and set the list of values for each index.

For example, the code below shows how to build a :class:`ListVector` of ints using the writer :class:`UnionListWriter`. The vector has 10 entries (indexes 0 to 9), and the list at index i holds the five values j * i for j from 0 to 4: [[0, 0, 0, 0, 0], [0, 1, 2, 3, 4], [0, 2, 4, 6, 8], …, [0, 9, 18, 27, 36]]. The values within each list may be in any order.

Contributor

The values at each index can be in arbitrary order. Might this example confuse readers?

Contributor Author

Hey, yes, that is a possibility. We can always note in the docs that the values can be added in arbitrary order. Alternatively, instead of using a for loop to add the values, we could use predefined arrays in arbitrary order, such as [[1, 2, 3], [3, 2, 1], [10, 30, 20]]. Also, looking at the tests for ListVector, they all appear to use ascending order.

Contributor

Sure, I see your point. It makes sense.
Maybe we should not stress here that the values are in increasing order?

Contributor Author

Sounds good, I will update the docs to reflect this.

.. code-block:: Java

    try (BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE);
         ListVector listVector = ListVector.empty("vector", allocator)) {
        UnionListWriter writer = listVector.getWriter();
        for (int i = 0; i < 10; i++) {
            writer.startList();
            writer.setPosition(i);
            for (int j = 0; j < 5; j++) {
                writer.writeInt(j * i);
            }
            writer.setValueCount(5);
            writer.endList();
        }
        listVector.setValueCount(10);
    }
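
As a cross-check on the example, the contents that the loop above is expected to write can be computed in plain Java, without Arrow. This is only an illustrative sketch: the class `ExpectedListContents` and its helpers are hypothetical names introduced here, and the `offsets` array is an assumption that mirrors the list layout of the Arrow columnar format, where list i spans child positions `offsets[i]` up to (but not including) `offsets[i + 1]`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of Arrow: models the expected contents only.
public class ExpectedListContents {

    // Nested lists equivalent to what the writer loop produces (value = j * i).
    static List<List<Integer>> expectedLists(int listCount, int listSize) {
        List<List<Integer>> lists = new ArrayList<>();
        for (int i = 0; i < listCount; i++) {
            List<Integer> inner = new ArrayList<>();
            for (int j = 0; j < listSize; j++) {
                inner.add(j * i);
            }
            lists.add(inner);
        }
        return lists;
    }

    // Offsets in the style of a list layout: one more entry than there are lists.
    static int[] offsets(List<List<Integer>> lists) {
        int[] offsets = new int[lists.size() + 1];
        for (int i = 0; i < lists.size(); i++) {
            offsets[i + 1] = offsets[i] + lists.get(i).size();
        }
        return offsets;
    }

    public static void main(String[] args) {
        List<List<Integer>> lists = expectedLists(10, 5);
        System.out.println(lists.get(2)); // [0, 2, 4, 6, 8]
        System.out.println(lists.get(9)); // [0, 9, 18, 27, 36]
        System.out.println(Arrays.toString(offsets(lists)));
    }
}
```

Comparing these nested lists with the result of `listVector.getObject(i)` for each index is one way to sanity-check the writer code.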

:class:`ListVector` values can be accessed either through the get API or through the reader class :class:`UnionListReader`. To read all the values, iterate over the indexes first, and then over the values of each inner list.

.. code-block:: Java

    // access via get API
    for (int i = 0; i < listVector.getValueCount(); i++) {
        if (!listVector.isNull(i)) {
            ArrayList<Integer> elements = (ArrayList<Integer>) listVector.getObject(i);
            for (Integer element : elements) {
                System.out.println(element);
            }
        }
    }

    // access via reader
    UnionListReader reader = listVector.getReader();
    for (int i = 0; i < listVector.getValueCount(); i++) {
        reader.setPosition(i);
        while (reader.next()) {
            IntReader intReader = reader.reader();
            if (intReader.isSet()) {
                System.out.println(intReader.readInteger());
            }
        }
    }
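
The nested enumeration above (outer loop over indexes, null check, inner loop over list values) can be sketched against a simplified plain-Java model of a list vector. The names below (`ListLayoutReadSketch`, `readAll`) are hypothetical and not Arrow APIs, and the `boolean[]` validity flags are a simplifying assumption; Arrow itself stores validity as a bitmap. The sample data reuses the arbitrary-order lists suggested in the review, with one null entry added for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not part of Arrow: flat child values plus offsets and validity.
public class ListLayoutReadSketch {

    // Rebuilds the nested lists; a null entry stands for an index that is not set.
    static List<List<Integer>> readAll(int[] values, int[] offsets, boolean[] valid) {
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < valid.length; i++) {
            if (!valid[i]) {
                result.add(null);
                continue;
            }
            List<Integer> inner = new ArrayList<>();
            // List i spans child positions offsets[i] (inclusive) to offsets[i + 1] (exclusive).
            for (int k = offsets[i]; k < offsets[i + 1]; k++) {
                inner.add(values[k]);
            }
            result.add(inner);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 3, 2, 1, 10, 30, 20};
        int[] offsets = {0, 3, 3, 6, 9};
        boolean[] valid = {true, false, true, true};
        // prints [[1, 2, 3], null, [3, 2, 1], [10, 30, 20]]
        System.out.println(readAll(values, offsets, valid));
    }
}
```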

Slicing
=======
@@ -235,4 +284,3 @@ referring to some logical sub-sequence of the data through :class:`TransferPair`
tp.splitAndTransfer(0, 5);
IntVector sliced = (IntVector) tp.getTo();
// In this case, the vector values are [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] and the sliced vector values are [0, 1, 2, 3, 4].