[SPARK-38407][SQL][3.2] ANSI Cast: loosen the limitation of casting non-null complex types #35754

Closed

gengliangwang
Member

What changes were proposed in this pull request?

When ANSI mode is off, `ArrayType(DoubleType, containsNull = false)` cannot be cast to `ArrayType(IntegerType, containsNull = false)`, because an overflow can produce null results, which would break the non-null constraint.

When ANSI mode is on, Spark SQL currently has the same behavior. However, this restriction is not correct, because the non-null constraint won't be broken: Spark SQL can simply execute the cast and throw a runtime error on overflow, just as it does when casting DoubleType to IntegerType.

This applies to MapType and StructType as well. This PR loosens the limitation on casting non-null Array/Map/Struct types.
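The rule described above can be sketched as a small, self-contained toy model. This is not Spark's implementation: `ArrayOf`, `WIDENING_ORDER`, `is_upcast`, and `array_cast_allowed` are hypothetical names invented for illustration, and the widening order is an assumption made only for this sketch.

```python
from dataclasses import dataclass

# Hypothetical widening order, used only for this sketch.
WIDENING_ORDER = ("byte", "short", "int", "long", "float", "double")


@dataclass(frozen=True)
class ArrayOf:
    """Toy stand-in for ArrayType(elementType, containsNull)."""
    element: str
    contains_null: bool


def is_upcast(src: str, dst: str) -> bool:
    """True when src can be widened to dst without loss in this toy order."""
    return WIDENING_ORDER.index(src) <= WIDENING_ORDER.index(dst)


def element_cast_may_yield_null(src: str, dst: str, ansi: bool) -> bool:
    # Following the PR description: with ANSI off, a non-widening cast may
    # produce null; with ANSI on, it raises at runtime, so no null appears.
    return (not ansi) and not is_upcast(src, dst)


def array_cast_allowed(src: ArrayOf, dst: ArrayOf, ansi: bool) -> bool:
    # A source that may hold nulls can never feed a non-null target.
    if src.contains_null and not dst.contains_null:
        return False
    # A cast that may introduce nulls can't target a non-null array.
    if element_cast_may_yield_null(src.element, dst.element, ansi) and not dst.contains_null:
        return False
    return True


doubles = ArrayOf("double", contains_null=False)
ints = ArrayOf("int", contains_null=False)
print(array_cast_allowed(doubles, ints, ansi=False))  # False
print(array_cast_allowed(doubles, ints, ansi=True))   # True
```

In this model the only difference between the two modes is whether an element cast can introduce a null, which is the point the description makes.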

Why are the changes needed?

For ANSI mode compliance
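<img width="559" alt="image" src="https://user-images.githubusercontent.com/1097932/156600154-24e3817b-ba22-45ce-a16f-faf7461af73e.png">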

Does this PR introduce any user-facing change?

Yes. For Cast under ANSI mode or table insertion, a complex type that doesn't contain nulls can be cast to another non-null complex type as long as the element types are castable. Before this change, this was only allowed when the source element type could be upcast to the target element type.
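To illustrate the runtime side of this change, here is a rough sketch of what "execute the cast and throw a runtime error on overflow" could look like for an array of doubles cast to 32-bit integers. It is a simplified model, not Spark code: `ansi_double_to_int` and `ansi_cast_array` are hypothetical helpers, and the choice of `ArithmeticError` and truncation toward zero are assumptions of this sketch.

```python
import math

INT_MIN, INT_MAX = -2**31, 2**31 - 1


def ansi_double_to_int(x: float) -> int:
    """Cast one double to a 32-bit int, raising instead of returning null."""
    if not math.isfinite(x):
        raise ArithmeticError(f"cannot cast {x!r} to int")
    truncated = int(x)  # truncates toward zero
    if not INT_MIN <= truncated <= INT_MAX:
        raise ArithmeticError(f"casting {x!r} to int causes overflow")
    return truncated


def ansi_cast_array(values):
    """Cast every element; the result never contains null (None)."""
    return [ansi_double_to_int(v) for v in values]


print(ansi_cast_array([1.9, -2.5]))  # [1, -2]

try:
    ansi_cast_array([1.0, 1e10])
except ArithmeticError as err:
    print("cast failed:", err)
```

Because a failing element aborts the whole cast, the output either contains only integers or is never produced, so a `containsNull = false` target stays valid.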

How was this patch tested?

Unit tests.

…ll complex types


Closes apache#35724 from gengliangwang/canCast.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
@gengliangwang
Member Author

This PR ports #35724 to branch-3.2.

@gengliangwang
Member Author

Merging to branch-3.2

gengliangwang added a commit that referenced this pull request Mar 9, 2022
…on-null complex types


Closes #35754 from gengliangwang/port-35724.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
kazuyukitanimura pushed a commit to kazuyukitanimura/spark that referenced this pull request Aug 10, 2022
…on-null complex types


Closes apache#35754 from gengliangwang/port-35724.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit d99cbbf)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>