apacheGH-39788: [Python] Validate max_chunksize in Table.to_batches (apache#39796)

### Rationale for this change

Validate that the `max_chunksize` keyword is strictly positive, to avoid an infinite loop in `Table.to_batches`.
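
To illustrate why a non-positive chunk size can hang, here is a minimal pure-Python sketch of a chunking loop guarded by the same check. It is not Arrow's implementation: `chunk_bounds` and its parameters are hypothetical names used only for illustration, and the loop merely assumes that a reader advances by at most `max_chunksize` rows per step.

```python
from typing import List, Optional, Tuple


def chunk_bounds(num_rows: int,
                 max_chunksize: Optional[int] = None) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs that cover num_rows rows.

    Hypothetical helper, not part of pyarrow.
    """
    # Same guard as in the patch: reject zero and negative sizes up front.
    if max_chunksize is not None and not max_chunksize > 0:
        raise ValueError("'max_chunksize' should be strictly positive")

    # With no limit, emit the rows as a single chunk.
    step = num_rows if max_chunksize is None else max_chunksize

    bounds = []
    offset = 0
    while offset < num_rows:
        length = min(step, num_rows - offset)
        bounds.append((offset, length))
        # Without the guard, a step of 0 would leave offset unchanged and a
        # negative step would move it backwards, so the loop would never end.
        offset += length
    return bounds


print(chunk_bounds(5, 2))
```

In this sketch the guard is what guarantees that `offset` strictly increases on every iteration, which is the property the loop needs in order to terminate.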

* Closes: apache#39788

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
jorisvandenbossche authored and zanmato1984 committed Feb 28, 2024
1 parent 3c6e02a commit 2cc57af
Showing 2 changed files with 5 additions and 0 deletions.
2 changes: 2 additions & 0 deletions python/pyarrow/table.pxi
@@ -4172,6 +4172,8 @@ cdef class Table(_Tabular):
        reader.reset(new TableBatchReader(deref(self.table)))

        if max_chunksize is not None:
            if not max_chunksize > 0:
                raise ValueError("'max_chunksize' should be strictly positive")
            c_max_chunksize = max_chunksize
            reader.get().set_chunksize(c_max_chunksize)

3 changes: 3 additions & 0 deletions python/pyarrow/tests/test_table.py
@@ -1089,6 +1089,9 @@ def test_table_to_batches():
    table_from_iter = pa.Table.from_batches(iter([batch1, batch2, batch1]))
    assert table.equals(table_from_iter)

    with pytest.raises(ValueError):
        table.to_batches(max_chunksize=0)


def test_table_basics():
    data = [