
Revert "[SPARK-25496][SQL] Deprecate from_utc_timestamp and to_utc_timestamp" #27351

Closed
wants to merge 2 commits

Conversation

gatorsmile
Member

This reverts commit 1d20d13.

@@ -1,113 +0,0 @@
/*
Member Author


The tests are moved by #26885

@dongjoon-hyun
Member

cc @MaxGekk and @cloud-fan

Member

@dongjoon-hyun left a comment


+1, LGTM. Pending Jenkins. (This is a non-trivial clean revert that touches multiple PRs. Most parts are the same; the other adapted parts, including StreamingAggregationSuite, look good.)
Thank you, @gatorsmile .

@gatorsmile
Member Author

Thanks! @dongjoon-hyun

Both Hive and Impala have such functions. If we decide to drop or deprecate the existing functions, we should simplify the migration by introducing the needed replacement functions first. Adding a conf is not enough in such cases.

I think we should introduce more time/timestamp-related built-in functions in the future. Processing dates and timestamps is a common task in data processing, cleaning, and analysis. We have to be more careful when introducing a behavior-breaking change.
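For readers unfamiliar with the functions being discussed: the following is a minimal plain-Python sketch (not Spark code) of what Spark's `from_utc_timestamp` does. It takes a timezone-naive timestamp, interprets it as UTC, and returns the wall-clock time in the target zone, again as a naive value; the `from_utc_timestamp` helper name and the example instant below are illustrative assumptions.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

def from_utc_timestamp(ts: datetime, tz: str) -> datetime:
    # Interpret the naive timestamp as UTC, convert to the target
    # zone's wall-clock time, and strip the zone again. This shifting
    # of naive values is the behavior debated in this PR.
    return (ts.replace(tzinfo=timezone.utc)
              .astimezone(ZoneInfo(tz))
              .replace(tzinfo=None))

shifted = from_utc_timestamp(datetime(2020, 1, 24, 12, 0), "Asia/Seoul")
print(shifted)  # 2020-01-24 21:00:00 (Seoul is UTC+9)
```

The result looks like a plausible timestamp but no longer denotes the same instant as the input, which is the crux of the disagreement below.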

@SparkQA

SparkQA commented Jan 24, 2020

Test build #117342 has finished for PR 27351 at commit af5a793.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 24, 2020

Test build #117336 has finished for PR 27351 at commit 4a273a1.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Retest this please.

Member

@HyukjinKwon left a comment


I am good with this approach too, since the code freeze is close.
I think #24195 (comment) can be an option too.

@SparkQA

SparkQA commented Jan 24, 2020

Test build #117344 has finished for PR 27351 at commit af5a793.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Ping @MaxGekk and @cloud-fan once more, since they are the original author and committer.

@MaxGekk
Member

MaxGekk commented Jan 24, 2020

@gatorsmile I do understand the importance of backward compatibility, but:

  • The deprecated functions break the semantics of the timestamp type. Its values are microseconds since the epoch, in the UTC time zone. Placing values with a different meaning into that type spreads errors across user apps unpredictably.
  • You mentioned Hive/Impala, but if we look at commonly used time APIs such as the Java 8 time API, Joda-Time, etc., those libraries don't provide such functions.
  • Isn't the major version Spark 3.0 a good time to remove ugly functions from the API? If you think we should put more details into the Spark docs, let's do that.
  • How else can we inform users that some functions are deprecated, if not by throwing an exception and forcing users to set a SQL config explicitly? At the moment, unfortunately, we don't have any mechanism for informing users about deprecated functions.

In summary, I cannot agree with reverting the changes. I would instead update the docs, place a link to the docs in the exception message, and update the expression description as well.
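MaxGekk's first bullet can be illustrated concretely. A timestamp value is microseconds since the Unix epoch in UTC; shifting it by a zone offset, as `from_utc_timestamp` does, silently changes which instant the stored micros denote. A minimal sketch in plain Python (the `epoch_micros` helper and the UTC+9 offset are illustrative assumptions, not Spark internals):

```python
from datetime import datetime, timezone, timedelta

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def epoch_micros(ts: datetime) -> int:
    """Microseconds since the Unix epoch for a UTC-aware timestamp."""
    return int((ts - EPOCH) / timedelta(microseconds=1))

utc_noon = datetime(2020, 1, 24, 12, 0, tzinfo=timezone.utc)
# A from_utc_timestamp-style shift to UTC+9 (e.g. Asia/Seoul); the result
# is still stored as if it were UTC micros-since-epoch:
shifted = utc_noon + timedelta(hours=9)

print(epoch_micros(utc_noon))  # 1579867200000000
print(epoch_micros(shifted))   # 1579899600000000 -- a different instant
```

The two stored values differ by nine hours of micros, so any downstream code that treats the column as "micros since epoch in UTC" now reads the wrong instant.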

@dongjoon-hyun
Member

I'll merge this. We can deprecate it later, over two releases:
In the first release, Apache Spark should only warn via a log message.
In the next release, Apache Spark can introduce a flag and switch the default.

@HyukjinKwon
Member

Yeah, I think it's fine to revert it for now, as it has become pretty controversial. @MaxGekk, I think we can open a new PR as guided. We should probably document the alternatives clearly.

Labels: none
Projects: none yet
Participants: 5