[SPARK-44004][SQL] Assign name & improve error message for frequent LEGACY errors. #41504

Closed
wants to merge 18 commits into from

itholic
Contributor

@itholic itholic commented Jun 8, 2023

What changes were proposed in this pull request?

This PR proposes to assign names to frequently occurring LEGACY errors and to improve their error messages.

Why are the changes needed?

To improve the errors that occur most frequently.

Does this PR introduce any user-facing change?

No API changes; only error messages are affected.

How was this patch tested?

The existing CI should pass.

@itholic
Contributor Author

itholic commented Jun 8, 2023

cc @MaxGekk @srielau @cloud-fan could you take a look when you find some time?

@itholic
Contributor Author

itholic commented Jun 8, 2023

Thanks @cloud-fan for the review. I have just addressed the comments.

core/src/main/resources/error/error-classes.json (outdated)
@@ -1754,6 +1779,11 @@
],
"sqlState" : "42826"
},
"OPERATION_NOT_ALLOWED" : {
"message" : [
"Operation not allowed: <message>. This error occurs when attempting an operation that is not currently supported. Please check the documentation for the list of allowable operations."
Member

Instead of the generic message, please add sub-classes.

I told you about this in your PR #39965

Contributor Author

Oh, I didn't realize that the PR was closed. Let me revisit that PR and revert the change here.
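
For readers unfamiliar with the sub-class approach requested above, here is a minimal Python sketch of the idea: rather than one generic `<message>` placeholder, the main error class carries a short prefix and each sub-class supplies its own specific, parameterized message. The `subClass` key mirrors the layout used in error-classes.json. The sub-class names (`EXAMPLE_SUB_CLASS_A`, `EXAMPLE_SUB_CLASS_B`), their messages, their parameters, and the `format_error` helper are hypothetical and are not taken from Spark.

```python
import re

# Hypothetical table mirroring the layout of error-classes.json.
# The sub-class names, messages, and parameters are illustrative
# assumptions, not entries from the actual Spark file.
ERROR_CLASSES = {
    "OPERATION_NOT_ALLOWED": {
        "message": ["Operation not allowed:"],
        "subClass": {
            "EXAMPLE_SUB_CLASS_A": {
                "message": ["<operation> is not supported on <objectType>."]
            },
            "EXAMPLE_SUB_CLASS_B": {
                "message": ["<operation> requires the config <config> to be enabled."]
            },
        },
    }
}


def format_error(error_class: str, params: dict) -> str:
    """Render '[MAIN.SUB] <main message> <sub message>' with placeholders filled.

    This is a hypothetical helper, not Spark's implementation.
    """
    main_name, _, sub_name = error_class.partition(".")
    main = ERROR_CLASSES[main_name]
    template = "\n".join(main["message"])
    if sub_name:
        sub = main["subClass"][sub_name]
        template += " " + "\n".join(sub["message"])

    def fill(match: re.Match) -> str:
        # Raises KeyError when a placeholder has no matching parameter.
        return str(params[match.group(1)])

    body = re.sub(r"<([A-Za-z0-9_]+)>", fill, template)
    return f"[{error_class}] {body}"


if __name__ == "__main__":
    print(format_error(
        "OPERATION_NOT_ALLOWED.EXAMPLE_SUB_CLASS_A",
        {"operation": "ALTER COLUMN", "objectType": "views"},
    ))
```

With sub-classes, each failure site names the exact condition it hit, so users can search for the fully qualified class instead of parsing free-form text.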

@itholic
Contributor Author

itholic commented Jun 14, 2023

@MaxGekk I addressed the comments, but CI keeps failing and I can't work out the reason from the log below:

02:24:58.931 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 81.0 (TID 129) (localhost executor driver): org.apache.spark.SparkUpgradeException: [INCONSISTENT_BEHAVIOR_CROSS_VERSION.READ_ANCIENT_DATETIME] You may get a different result due to the upgrading to Spark >= 3.0:
reading dates before 1582-10-15 or timestamps before 1900-01-01T00:00:00Z
from Parquet files can be ambiguous, as the files may be written by
Spark 2.x or legacy versions of Hive, which uses a legacy hybrid calendar
that is different from Spark 3.0+'s Proleptic Gregorian calendar.
See more details in SPARK-31404. You can set the SQL config "spark.sql.parquet.datetimeRebaseModeInRead" or
the datasource option "datetimeRebaseMode" to "LEGACY" to rebase the datetime values
w.r.t. the calendar difference during reading. To read the datetime values
as it is, set the SQL config or the datasource option to "CORRECTED".

Could you help me resolve the CI failure when you find some time?

also cc @cloud-fan FYI

@itholic
Contributor Author

itholic commented Jun 19, 2023

CI passed. cc @MaxGekk @cloud-fan

Member

@MaxGekk MaxGekk left a comment

Waiting for CI.

@MaxGekk
Member

MaxGekk commented Jun 21, 2023

+1, LGTM. Merging to master.
Thank you, @itholic.
