
[SPARK-56584][PYTHON] Generalize RESULT_TYPE_MISMATCH_FOR_ARROW_UDF error class and remove dead SCHEMA_MISMATCH_FOR_ARROW_PYTHON_UDF #55494

Closed
Yicong-Huang wants to merge 1 commit into apache:master from Yicong-Huang:SPARK-56584

@Yicong-Huang commented Apr 23, 2026

What changes were proposed in this pull request?

Three related cleanups in the PySpark result-verify path:

  1. Rename error class RESULT_TYPE_MISMATCH_FOR_ARROW_UDF to the more general RESULT_COLUMN_TYPES_MISMATCH (parallel to RESULT_COLUMN_NAMES_MISMATCH / RESULT_COLUMN_SCHEMA_MISMATCH). The error is raised from the generic verify_arrow_result path in python/pyspark/worker.py; the name shouldn't mention "ARROW_UDF".
  2. Reword the message to align with its siblings:
    • Before: Columns do not match in their data type: <mismatch>.
    • After: Column types of the returned data do not match specified schema. Mismatch: <mismatch>.
  3. Remove the dead error class SCHEMA_MISMATCH_FOR_ARROW_PYTHON_UDF. git grep confirms no code path raises it, and its message body is identical to SCHEMA_MISMATCH_FOR_PANDAS_UDF.
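
To illustrate the renamed error class and its reworded message, here is a minimal sketch of a column-type check. It is not the code in python/pyspark/worker.py: `ResultVerifyError`, `find_type_mismatches`, `verify_column_types`, the plain-string type names, and the format of the `<mismatch>` detail are all hypothetical. Only the class name RESULT_COLUMN_TYPES_MISMATCH and the message template come from this PR.

```python
from typing import Dict, List, Tuple

# Message template quoted from the PR description; everything else is illustrative.
ERROR_CLASSES: Dict[str, str] = {
    "RESULT_COLUMN_TYPES_MISMATCH": (
        "Column types of the returned data do not match specified schema. "
        "Mismatch: <mismatch>."
    ),
}


class ResultVerifyError(Exception):
    """Hypothetical stand-in for an error that carries an error class name."""

    def __init__(self, error_class: str, message_parameters: Dict[str, str]):
        self.error_class = error_class
        message = ERROR_CLASSES[error_class]
        # Substitute each <placeholder> in the template with its value.
        for key, value in message_parameters.items():
            message = message.replace(f"<{key}>", value)
        super().__init__(f"[{error_class}] {message}")


Column = Tuple[str, str]  # (column name, type name as a plain string)


def find_type_mismatches(
    expected: List[Column], actual: List[Column]
) -> List[Tuple[str, str, str]]:
    """Return (name, expected type, actual type) for each differing column."""
    return [
        (name, exp_type, act_type)
        for (name, exp_type), (_, act_type) in zip(expected, actual)
        if exp_type != act_type
    ]


def verify_column_types(expected: List[Column], actual: List[Column]) -> None:
    """Raise if any returned column type differs from the declared schema."""
    mismatches = find_type_mismatches(expected, actual)
    if mismatches:
        # The detail format below is an assumption, not PySpark's actual output.
        detail = ", ".join(
            f"column '{name}' (expected {exp}, actual {act})"
            for name, exp, act in mismatches
        )
        raise ResultVerifyError(
            "RESULT_COLUMN_TYPES_MISMATCH", {"mismatch": detail}
        )
```

Because the class name no longer mentions "ARROW_UDF", a check of this shape could be shared by any result-verify caller rather than only Arrow UDF code paths.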

Why are the changes needed?

Part of SPARK-55388 (Refactor PythonEvalType processing logic). Cleanup to make error class names and messages consistent across the result-verify path, and to remove dead code.

Does this PR introduce any user-facing change?

Yes. The user-visible error class name and message for result column type mismatches in Arrow UDFs both change. The unreleased SCHEMA_MISMATCH_FOR_ARROW_PYTHON_UDF class is removed; no code raised it.

How was this patch tested?

Updated 4 existing asserts in test_arrow_grouped_map.py and test_arrow_cogrouped_map.py to match the new message.
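
As a rough sketch of what such an assert might look like, the following self-contained test matches the new message text with `assertRaisesRegex`. It is not taken from the updated test files: `fake_udf_result_check`, the `RuntimeError` type, the `[ERROR_CLASS]` prefix, and the test class name are assumptions made so the example runs without Spark.

```python
import re
import unittest

# Fixed part of the new message, quoted from the PR description.
NEW_MESSAGE = "Column types of the returned data do not match specified schema."


def fake_udf_result_check() -> None:
    # Hypothetical stand-in for running an Arrow grouped-map UDF whose
    # output types disagree with the declared schema.
    raise RuntimeError(
        "[RESULT_COLUMN_TYPES_MISMATCH] " + NEW_MESSAGE + " Mismatch: <mismatch>."
    )


class ResultTypeMismatchTest(unittest.TestCase):
    def test_message(self) -> None:
        # re.escape keeps the literal '.' in the message from acting as a wildcard.
        with self.assertRaisesRegex(RuntimeError, re.escape(NEW_MESSAGE)):
            fake_udf_result_check()


def run_sketch_tests() -> unittest.TestResult:
    """Load and run the single sketch test case, returning the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ResultTypeMismatchTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Matching only the fixed part of the message keeps the assert independent of however the mismatch detail is rendered.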

Was this patch authored or co-authored using generative AI tooling?

No

@Yicong-Huang

cc @gaogaotiantian

@HyukjinKwon

Merged to master.
