@fangchenli
Contributor

What changes were proposed in this pull request?

Add tests for pa.Array.to_pandas with default arguments

Why are the changes needed?

We want to monitor changes in PyArrow's behavior.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit tests.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.5

@github-actions

JIRA Issue Information

=== Sub-task SPARK-54944 ===
Summary: Add tests for pa.Array.to_pandas with default arguments
Assignee: None
Status: Open
Affected: ["4.2.0"]


This comment was automatically generated by GitHub Actions

@fangchenli fangchenli marked this pull request as ready for review January 12, 2026 18:57
@Yicong-Huang
Contributor

I think test-only PRs need [TESTS] in the title.

class PyArrowArrayToPandasTests(unittest.TestCase):
    """Test pa.Array.to_pandas with default arguments."""

    def test_integer_types(self):
Contributor

For to_pandas, we have seen many OOM and segfault issues in PySpark, which uses PyArrow's to_pandas underneath. I hope we can cover more types, especially nested struct/list/JSON.

Comment on lines +84 to +92
# With nulls, integers convert to float64 by default (numpy doesn't have nullable int)
arr = pa.array([1, None, 3], type=pa.int64())
series = arr.to_pandas()
Contributor

How about NaN?

@fangchenli fangchenli changed the title from "[SPARK-54944][Python] Add tests for pa.Array.to_pandas with default arguments" to "[SPARK-54944][Python][TESTS] Add tests for pa.Array.to_pandas with default arguments" Jan 13, 2026
@fangchenli fangchenli force-pushed the pa-array-to-pandas-tests branch from 7720374 to 6e42fe8 Compare January 14, 2026 00:28
@fangchenli fangchenli changed the title from "[SPARK-54944][Python][TESTS] Add tests for pa.Array.to_pandas with default arguments" to "[SPARK-54944][PYTHON][TESTS] Add tests for pa.Array.to_pandas with default arguments" Jan 14, 2026