[MINOR][DOCS] Document when `current_date` and `current_timestamp` are evaluated #29892

MaxGekk · 2020-09-28T11:59:09Z

What changes were proposed in this pull request?

Explicitly document that `current_date` and `current_timestamp` are evaluated once, at the start of query evaluation, and that all calls of `current_date`/`current_timestamp` within the same query return the same value.

Why are the changes needed?

Users could expect that `current_date` and `current_timestamp` return the date/timestamp at the moment each call is executed, but in fact the optimizer folds these functions into literals at the start of query evaluation:

spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala

Lines 71 to 91 in 0df8dd6

    
    /**
     * Computes the current date and time to make sure we return the same result in a single query.
     */
    object ComputeCurrentTime extends Rule[LogicalPlan] {
      def apply(plan: LogicalPlan): LogicalPlan = {
        val currentDates = mutable.Map.empty[String, Literal]
        val timeExpr = CurrentTimestamp()
        val timestamp = timeExpr.eval(EmptyRow).asInstanceOf[Long]
        val currentTime = Literal.create(timestamp, timeExpr.dataType)

        plan transformAllExpressions {
          case CurrentDate(Some(timeZoneId)) =>
            currentDates.getOrElseUpdate(timeZoneId, {
              Literal.create(
                LocalDate.now(DateTimeUtils.getZoneId(timeZoneId)),
                DateType)
            })
          case CurrentTimestamp() | Now() => currentTime
        }
      }
    }
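
To make the folding behavior concrete, here is a minimal, self-contained sketch of the same idea outside of Spark. It is not Spark code: `Expr`, `CurrentTimestampExpr`, `Lit`, `Pair`, and `FoldCurrentTime` are hypothetical names invented for illustration, and the clock is passed in as a function so the behavior can be observed deterministically. The point it illustrates is that the clock is read once and every occurrence of the "current timestamp" expression is replaced with the same literal.

```scala
// Toy model of the folding idea. These types are hypothetical stand-ins,
// NOT Spark's Expression, Literal, or CurrentTimestamp classes.
sealed trait Expr
case object CurrentTimestampExpr extends Expr              // plays the role of CurrentTimestamp()
final case class Lit(value: Long) extends Expr             // plays the role of a Literal
final case class Pair(left: Expr, right: Expr) extends Expr // any node with child expressions

object FoldCurrentTime {
  // Reads the clock exactly once, then rewrites the whole tree so that
  // every CurrentTimestampExpr becomes the same literal value.
  def fold(plan: Expr, clock: () => Long): Expr = {
    val now = Lit(clock())
    def rewrite(e: Expr): Expr = e match {
      case CurrentTimestampExpr => now
      case Pair(l, r)           => Pair(rewrite(l), rewrite(r))
      case lit: Lit             => lit
    }
    rewrite(plan)
  }
}
```

With a clock that returns a different value on each read, both occurrences in `Pair(CurrentTimestampExpr, CurrentTimestampExpr)` still fold to the same literal, because the clock is consulted before the rewrite starts rather than once per occurrence.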

Does this PR introduce any user-facing change?

No

How was this patch tested?

By running `./dev/scalastyle`.

SparkQA · 2020-09-28T12:50:02Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33797/

SparkQA · 2020-09-28T13:13:57Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33797/

srowen

Looks good, though do you want to merge this into the PR that fixes this behavior?

SparkQA · 2020-09-28T16:50:38Z

Test build #129181 has finished for PR 29892 at commit df64fae.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2020-09-29T05:20:23Z

thanks, merging to master/3.0!

…e evaluated

### What changes were proposed in this pull request?
Explicitly document that `current_date` and `current_timestamp` are executed at the start of query evaluation. And all calls of `current_date`/`current_timestamp` within the same query return the same value

### Why are the changes needed?
Users could expect that `current_date` and `current_timestamp` return the current date/timestamp at the moment of query execution but in fact the functions are folded by the optimizer at the start of query evaluation: https://github.com/apache/spark/blob/0df8dd60733066076967f0525210bbdb5e12415a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala#L71-L91

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
by running `./dev/scalastyle`.

Closes #29892 from MaxGekk/doc-current_date.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 1b60ff5)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>


Document when current_date/current_timestamp are evaluated

df64fae

probot-autolabeler bot added PYTHON R SQL labels Sep 28, 2020

srowen reviewed Sep 28, 2020


HyukjinKwon approved these changes Sep 29, 2020


cloud-fan closed this in 1b60ff5 Sep 29, 2020

MaxGekk deleted the doc-current_date branch December 11, 2020 20:28
