
[SPARK-39865][SQL][3.3] Show proper error messages on the overflow errors of table insert #37311

Conversation

gengliangwang
Member

What changes were proposed in this pull request?

In Spark 3.3, the error message of ANSI CAST was improved. However, table insertion uses the same CAST expression:

> create table tiny(i tinyint);
> insert into tiny values (1000);

org.apache.spark.SparkArithmeticException[CAST_OVERFLOW]: The value 1000 of the type "INT" cannot be cast to "TINYINT" due to an overflow. Use `try_cast` to tolerate overflow and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.

Showing the hint `If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error` doesn't help at all. This PR fixes the error message. After the changes, the error message for this example becomes:

org.apache.spark.SparkArithmeticException: [CAST_OVERFLOW_IN_TABLE_INSERT] Fail to insert a value of "INT" type into the "TINYINT" type column `i` due to an overflow. Use `try_cast` on the input value to tolerate overflow and return NULL instead.
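
For reference, a minimal sketch of how the suggested `try_cast` could be applied to the input value (this assumes the `tiny` table from the example above, and that column `i` is nullable, which is the default):

```
> create table tiny(i tinyint);
> insert into tiny values (try_cast(1000 as tinyint));
-- The overflowing value becomes NULL instead of raising an error.
> select * from tiny;
```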

Why are the changes needed?

Show proper error messages for overflow errors during table insertion. The current message is very confusing.

Does this PR introduce any user-facing change?

Yes, after the changes, proper error messages are shown for overflow errors during table insertion.

How was this patch tested?

Unit test

…of table insert


Closes apache#37283 from gengliangwang/insertionOverflow.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
@gengliangwang gengliangwang changed the title [SPARK-39865][SQL] Show proper error messages on the overflow errors of table insert [SPARK-39865][SQL][3.3] Show proper error messages on the overflow errors of table insert Jul 27, 2022
@gengliangwang
Member Author

gengliangwang commented Jul 27, 2022

@cloud-fan this is a backport of #37283 for branch-3.3

@gengliangwang
Member Author

Tests passed on https://github.com/gengliangwang/spark/runs/7546291041
Merging to branch-3.3

gengliangwang added a commit that referenced this pull request Jul 28, 2022
…rors of table insert


Closes #37311 from gengliangwang/PR_TOOL_PICK_PR_37283_BRANCH-3.3.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>