ARROW-8386: [Python] Fix error when pyarrow.jvm gets an empty vector
When `pyarrow.jvm` receives an empty Vector from the JVM, the returned buffer list is empty, and `pa.Array.from_buffers` then fails with a `ValueError` because it still expects a list populated with buffers. This change checks whether the buffer list is empty and, if so, manually creates an empty pyarrow Array of the same type.

Closes #6889 from BryanCutler/python-jvm-empty-array-ARROW-8386

Authored-by: Bryan Cutler <cutlerb@gmail.com>
Signed-off-by: Bryan Cutler <cutlerb@gmail.com>
BryanCutler committed Apr 13, 2020
1 parent 3ff0c18 commit 712b8f2
Showing 2 changed files with 15 additions and 1 deletion.
7 changes: 6 additions & 1 deletion python/pyarrow/jvm.py
@@ -277,9 +277,14 @@ def array(jvm_array):
             "Cannot convert JVM Arrow array of type {},"
             " complex types not yet implemented.".format(minor_type_str))
     dtype = field(jvm_array.getField()).type
-    length = jvm_array.getValueCount()
     buffers = [jvm_buffer(buf)
                for buf in list(jvm_array.getBuffers(False))]
+
+    # If JVM has an empty Vector, buffer list will be empty so create manually
+    if len(buffers) == 0:
+        return pa.array([], type=dtype)
+
+    length = jvm_array.getValueCount()
     null_count = jvm_array.getNullCount()
     return pa.Array.from_buffers(dtype, length, buffers, null_count)

9 changes: 9 additions & 0 deletions python/pyarrow/tests/test_jvm.py
@@ -219,6 +219,15 @@ def test_jvm_array(root_allocator, pa_type, py_data, jvm_type):
     assert py_array.equals(jvm_array)


+def test_jvm_array_empty(root_allocator):
+    cls = "org.apache.arrow.vector.{}".format('IntVector')
+    jvm_vector = jpype.JClass(cls)("vector", root_allocator)
+    jvm_vector.allocateNew()
+    jvm_array = pa_jvm.array(jvm_vector)
+    assert len(jvm_array) == 0
+    assert jvm_array.type == pa.int32()
+
+
 # These test parameters mostly use an integer range as an input as this is
 # often the only type that is understood by both Python and Java
 # implementations of Arrow.