@zhengruifeng
Contributor

What changes were proposed in this pull request?

Optimize Py4J config calls in df.toPandas

Why are the changes needed?

In Spark Connect, all configs are fetched in a single batch. Spark Classic can apply the same optimization in df.toPandas: fetching all required configs in one batch minimizes the number of Py4J calls.
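
To illustrate why batching helps, here is a minimal sketch that counts round trips against a stand-in config store. It does not use PySpark or Py4J. `CountingConfBackend`, `fetch_one_by_one`, and `fetch_in_batch` are hypothetical names made up for this example, and the config keys are only illustrative (the exact set fetched by this patch is not listed in this description).

```python
from typing import Dict, List


class CountingConfBackend:
    """Hypothetical stand-in for a JVM-side config store reached over Py4J.

    Every method call models one Python-to-JVM round trip.
    """

    def __init__(self, values: Dict[str, str]) -> None:
        self._values = dict(values)
        self.round_trips = 0

    def get(self, key: str) -> str:
        # One round trip per key.
        self.round_trips += 1
        return self._values[key]

    def get_all(self, keys: List[str]) -> List[str]:
        # One round trip for the whole list of keys.
        self.round_trips += 1
        return [self._values[k] for k in keys]


def fetch_one_by_one(backend: CountingConfBackend, keys: List[str]) -> Dict[str, str]:
    """Look up each config with its own call (N keys, N round trips)."""
    return {k: backend.get(k) for k in keys}


def fetch_in_batch(backend: CountingConfBackend, keys: List[str]) -> Dict[str, str]:
    """Look up all configs with a single call (N keys, 1 round trip)."""
    return dict(zip(keys, backend.get_all(keys)))


# Illustrative keys and values only; assumed for the sake of the example.
SAMPLE_CONFS = {
    "spark.sql.session.timeZone": "UTC",
    "spark.sql.execution.arrow.pyspark.enabled": "true",
    "spark.sql.execution.arrow.pyspark.fallback.enabled": "true",
    "spark.sql.execution.arrow.pyspark.selfDestruct.enabled": "false",
}
KEYS = list(SAMPLE_CONFS)

if __name__ == "__main__":
    per_key = CountingConfBackend(SAMPLE_CONFS)
    batched = CountingConfBackend(SAMPLE_CONFS)
    assert fetch_one_by_one(per_key, KEYS) == fetch_in_batch(batched, KEYS)
    print("per-key round trips:", per_key.round_trips)
    print("batched round trips:", batched.round_trips)
```

Both strategies return the same values; only the number of modeled round trips differs, and that count is the cost this change aims to reduce.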

Does this PR introduce any user-facing change?

No

How was this patch tested?

CI

Was this patch authored or co-authored using generative AI tooling?

No

@zhengruifeng zhengruifeng changed the title [SPARK-54300][PYTHON] Optimize Py4J config calls in df.toPandas [SPARK-54300][PYTHON] Optimize Py4J calls in df.toPandas Nov 11, 2025
@zhengruifeng zhengruifeng changed the title [SPARK-54300][PYTHON] Optimize Py4J calls in df.toPandas [SPARK-54300][PYTHON] Optimize Py4J calls in df.toPandas Nov 11, 2025
@zhengruifeng zhengruifeng marked this pull request as draft November 11, 2025 08:00
@zhengruifeng zhengruifeng changed the title [SPARK-54300][PYTHON] Optimize Py4J calls in df.toPandas [WIP][SPARK-54300][PYTHON] Optimize Py4J calls in df.toPandas Nov 11, 2025
@zhengruifeng zhengruifeng marked this pull request as ready for review November 12, 2025 02:33
@zhengruifeng zhengruifeng changed the title [WIP][SPARK-54300][PYTHON] Optimize Py4J calls in df.toPandas [SPARK-54300][PYTHON] Optimize Py4J calls in df.toPandas Nov 12, 2025
@zhengruifeng
Contributor Author

merged to master

@zhengruifeng zhengruifeng deleted the py4j_conf_topandas branch November 12, 2025 02:41
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
Closes apache#52994 from zhengruifeng/py4j_conf_topandas.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>