[SPARK-56169][SQL] Fix ClassCastException in error reporting when GetStructField child type is changed by plan transformation #54970

Closed
ilicmarkodb wants to merge 1 commit into apache:master from ilicmarkodb:fix-get-struct-field-classcast
@ilicmarkodb (Contributor) commented Mar 24, 2026

What changes were proposed in this pull request?

SPARK-53470 added ExpectsInputTypes to GetStructField so that checkInputDataTypes() catches the case where a plan transformation changes the child's type from StructType to something else. This can happen when an analyzer rule inserts a projection that changes a column's output type after GetStructField was already created referencing that column.

However, when CheckAnalysis detects this mismatch, the error formatting path (toPrettySQL -> usePrettyExpression) accesses GetStructField.dataType, which calls childSchema, which in turn evaluates child.dataType.asInstanceOf[StructType]. That cast throws a raw ClassCastException before the proper DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE error can be reported.
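The failure mode described above can be reproduced in miniature without Spark. The sketch below is only an illustration: every type in it (DataType, StructType, Attribute, UnsafeGetStructField, and the describe formatter) is a simplified, hypothetical stand-in rather than a Catalyst class, and it assumes nothing about Spark's real code beyond the unsafe cast quoted in the description.

```scala
import scala.util.Try

object UnsafeCastDemo {
  // Simplified stand-ins for Catalyst types; these are NOT Spark's real classes.
  sealed trait DataType
  case object StringType extends DataType
  final case class StructField(name: String, dataType: DataType)
  final case class StructType(fields: Vector[StructField]) extends DataType

  trait Expression { def dataType: DataType }
  final case class Attribute(name: String, dataType: DataType) extends Expression

  // Mirrors the pre-fix shape: childSchema blindly casts the child's type.
  final case class UnsafeGetStructField(child: Expression, ordinal: Int) extends Expression {
    private def childSchema: StructType = child.dataType.asInstanceOf[StructType]
    def dataType: DataType = childSchema.fields(ordinal).dataType
  }

  // Hypothetical error formatter: it reads dataType while building the message,
  // just as the pretty-printing path is described as doing.
  def describe(e: Expression): String = s"expression of type ${e.dataType}"

  // Simple class name of the exception that escapes describe, if any.
  def failureKind(e: Expression): Option[String] =
    Try(describe(e)).failed.toOption.map(_.getClass.getSimpleName)

  def main(args: Array[String]): Unit = {
    // Simulates a child whose type was rewritten from a struct to a string
    // after the field access was constructed.
    val rewritten = UnsafeGetStructField(Attribute("col", StringType), 0)
    println(failureKind(rewritten))
  }
}
```

In this model, formatting the mismatched expression fails with a ClassCastException of its own, so the caller never reaches the point where it would report the intended type-mismatch error.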

This PR fixes two things:

  1. usePrettyExpression checks child.dataType before accessing childSchema, falling back to a safe representation when the child is not a StructType
  2. childSchema uses pattern matching instead of an unsafe cast, throwing a clear SparkException.internalError instead of ClassCastException
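The two changes listed above can be sketched on a self-contained model. This is a hedged illustration, not the patch itself: the types are hypothetical stand-ins for Catalyst classes, InternalErrorStandIn substitutes for whatever SparkException.internalError produces, and the fallback text emitted by pretty is an assumption, since the description does not show the actual "safe representation".

```scala
import scala.util.Try

object SafeSchemaDemo {
  // Simplified stand-ins for Catalyst types; these are NOT Spark's real classes.
  sealed trait DataType
  case object StringType extends DataType
  final case class StructField(name: String, dataType: DataType)
  final case class StructType(fields: Vector[StructField]) extends DataType

  trait Expression { def dataType: DataType }
  final case class Attribute(name: String, dataType: DataType) extends Expression

  // Hypothetical stand-in for the internal error raised by the real fix.
  final class InternalErrorStandIn(msg: String) extends RuntimeException(msg)

  final case class GetStructField(child: Expression, ordinal: Int) extends Expression {
    // Pattern match instead of asInstanceOf, so a non-struct child produces
    // a descriptive error rather than a ClassCastException.
    def childSchema: StructType = child.dataType match {
      case s: StructType => s
      case other =>
        throw new InternalErrorStandIn(s"GetStructField expects a StructType child, got $other")
    }
    def dataType: DataType = childSchema.fields(ordinal).dataType
  }

  // Hypothetical pretty-printer: inspects the child's type first and never
  // touches childSchema when the child is not a struct.
  def pretty(e: Expression): String = e match {
    case g: GetStructField =>
      g.child.dataType match {
        case s: StructType => s"${pretty(g.child)}.${s.fields(g.ordinal).name}"
        case _             => s"${pretty(g.child)}[${g.ordinal}]" // assumed fallback format
      }
    case Attribute(name, _) => name
    case other              => other.toString
  }

  def main(args: Array[String]): Unit = {
    val broken = GetStructField(Attribute("col", StringType), 0)
    println(pretty(broken))                      // formatting no longer throws
    println(Try(broken.dataType).failed.map(_.getMessage))
  }
}
```

With the guard in the formatter, building the error message succeeds for a mismatched child, which leaves room for the caller to raise the intended type-mismatch error. The pattern match in childSchema remains a second line of defence for any other code path that reads the schema directly.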

Why are the changes needed?

Without this fix, users see a raw ClassCastException: StringType$ cannot be cast to StructType instead of the proper DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE error that checkInputDataTypes() was trying to report.

Does this PR introduce any user-facing change?

Yes - users now get a proper AnalysisException with error class DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE instead of a raw ClassCastException.

How was this patch tested?

New tests.

Was this patch authored or co-authored using generative AI tooling?

Yes

Co-authored-by: Claude

@ilicmarkodb force-pushed the fix-get-struct-field-classcast branch 2 times, most recently from e05f76d to e1176a1, on March 24, 2026 00:28
@ilicmarkodb force-pushed the fix-get-struct-field-classcast branch from e1176a1 to 787f217 on March 24, 2026 00:28
@cloud-fan (Contributor) commented

thanks, merging to master!

@cloud-fan cloud-fan closed this in 75757e1 Mar 24, 2026