apacheGH-37050: [Python][Interchange protocol] Add a workaround for empty dataframes (apache#38037)

### Rationale for this change

The implementation of the DataFrame Interchange Protocol does not currently support consuming dataframes with zero chunks (empty dataframes).

### What changes are included in this PR?

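Add a workaround so that conversion no longer raises an error in this case: when the dataframe yields no chunks, `_from_dataframe` converts the dataframe object itself as a single chunk.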
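
The fallback pattern can be sketched in plain Python, independent of pyarrow. This is only a toy illustration: `ToyFrame`, `convert_chunk`, and `convert` are hypothetical names invented here and are not part of pyarrow or the interchange protocol. The sketch assumes that an empty dataframe yields no chunks but still carries its column names.

```python
class ToyFrame:
    """Hypothetical stand-in for an interchange dataframe (not a pyarrow class)."""

    def __init__(self, names, rows=(), chunks=()):
        self.names = list(names)      # column names (the "schema")
        self.rows = list(rows)        # row tuples held by this frame
        self._chunks = list(chunks)   # child frames returned by get_chunks()

    def get_chunks(self):
        # An empty frame has no chunks, so this iterator yields nothing.
        return iter(self._chunks)


def convert_chunk(chunk):
    """Turn one chunk into a column-name -> list-of-values mapping."""
    return {
        name: [row[i] for row in chunk.rows]
        for i, name in enumerate(chunk.names)
    }


def convert(df):
    """Convert every chunk, falling back to the frame itself if there are none."""
    batches = [convert_chunk(chunk) for chunk in df.get_chunks()]
    if not batches:
        # Same idea as the workaround: treat the frame as a single chunk
        # so the column names survive even though there is no data.
        batches.append(convert_chunk(df))
    return batches


empty = ToyFrame(["col1"])
chunk = ToyFrame(["col1"], rows=[(1,), (2,)])
chunked = ToyFrame(["col1"], chunks=[chunk])

print(convert(empty))
print(convert(chunked))
```

Without the fallback, `convert(empty)` would return an empty list and the column names would be lost, which is the toy analogue of having no batches to build a table from.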

### Are these changes tested?

Yes, added `test_empty_dataframe` in `python/pyarrow/tests/interchange/test_conversion.py`.

### Are there any user-facing changes?

No.

* Closes: apache#37050

Authored-by: AlenkaF <frim.alenka@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
AlenkaF authored and dgreiss committed Feb 17, 2024
1 parent 75510a2 commit a18259a
Showing 2 changed files with 11 additions and 0 deletions.
4 changes: 4 additions & 0 deletions python/pyarrow/interchange/from_dataframe.py
@@ -136,6 +136,10 @@ def _from_dataframe(df: DataFrameObject, allow_copy=True):
         batch = protocol_df_chunk_to_pyarrow(chunk, allow_copy)
         batches.append(batch)

+    if not batches:
+        batch = protocol_df_chunk_to_pyarrow(df)
+        batches.append(batch)
+
     return pa.Table.from_batches(batches)


7 changes: 7 additions & 0 deletions python/pyarrow/tests/interchange/test_conversion.py
@@ -513,3 +513,10 @@ def test_allow_copy_false_bool_categorical():
     df = df.astype("category")
     with pytest.raises(RuntimeError):
         pi.from_dataframe(df, allow_copy=False)
+
+
+def test_empty_dataframe():
+    schema = pa.schema([('col1', pa.int8())])
+    df = pa.table([[]], schema=schema)
+    dfi = df.__dataframe__()
+    assert pi.from_dataframe(dfi) == df
