Skip to content

Conversation

@kmitchener
Copy link
Contributor

@kmitchener kmitchener commented Aug 19, 2022

Which issue does this PR close?

Closes #3057 .

Rationale for this change

What changes are included in this PR?

Root cause of the issue was that down one code path, creating a physical plan wasn't setting the query_execution_start_time in session state. I've fixed it by making the code the same between DataFrame's create_physical_plan() and SessionContext's create_physical_plan().

  • I added tests to ensure now() returns a consistent value within a statement but different values when called in different statements within the same SessionContext. I added tests for SQL and using the DataFrame API to ensure it works via the different code paths for collecting dataframes.
  • views are fixed so that they return now() properly
    • a side-effect of this is that we now store unoptimized plans in view definitions and optimize it on every view execution. I believe this is preferable behavior anyway and also don't see any other way to have now() or functions like that work in views.
  • Some functions are incorrectly declared as unary #3069 is partially resolved .. needed to fix the now() expression definition here

I'm still not that happy with how I solved it the root issue with the SessionState cloning via this code duplication, but can follow up with a separate PR.

Are there any user-facing changes?

now() works correctly across statements in the same session regardless of code path taken to execute the SQL

…ent statements. fix the issue so the tests pass.
@github-actions github-actions bot added core Core DataFusion crate physical-expr Changes to the physical-expr crates labels Aug 19, 2022
@codecov-commenter
Copy link

codecov-commenter commented Aug 19, 2022

Codecov Report

Merging #3210 (c95326a) into master (3df9f80) will decrease coverage by 0.00%.
The diff coverage is 96.55%.

@@            Coverage Diff             @@
##           master    #3210      +/-   ##
==========================================
- Coverage   85.85%   85.85%   -0.01%     
==========================================
  Files         291      291              
  Lines       52899    52938      +39     
==========================================
+ Hits        45416    45448      +32     
- Misses       7483     7490       +7     
Impacted Files Coverage Δ
datafusion/core/src/physical_plan/mod.rs 88.00% <ø> (ø)
...tafusion/physical-expr/src/datetime_expressions.rs 83.52% <ø> (ø)
datafusion/proto/src/from_proto.rs 35.40% <0.00%> (+0.12%) ⬆️
datafusion/core/tests/sql/timestamp.rs 99.65% <97.82%> (-0.17%) ⬇️
datafusion/core/src/dataframe.rs 88.91% <100.00%> (+0.05%) ⬆️
datafusion/core/src/datasource/view.rs 87.06% <100.00%> (+0.22%) ⬆️
datafusion/core/src/execution/context.rs 78.18% <100.00%> (-0.06%) ⬇️
datafusion/expr/src/expr_fn.rs 91.06% <100.00%> (+0.05%) ⬆️
datafusion/expr/src/window_frame.rs 92.43% <0.00%> (-0.85%) ⬇️
.../physical-expr/src/aggregate/array_agg_distinct.rs 79.41% <0.00%> (-0.78%) ⬇️
... and 15 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @kmitchener.

Bonus points if you could put a brief description in the user guide for now() while you are here, in the SQL scalar function reference.

@kmitchener
Copy link
Contributor Author

Good thinking, docs updated.

@kmitchener
Copy link
Contributor Author

Actually, I'm going to make this a WIP because I tested with a now() in a view and that code path is still broken. Needs a more robust fix. Will pick it up again in a couple days.

@kmitchener kmitchener changed the title fix issue with now() returning same value across statements WIP: fix issue with now() returning same value across statements Aug 20, 2022
@liukun4515
Copy link
Contributor

Actually, I'm going to make this a WIP because I tested with a now() in a view and that code path is still broken. Needs a more robust fix. Will pick it up again in a couple days.

You can convert this pr as a draft @kmitchener If there are some bug or issue not resolved

@kmitchener kmitchener marked this pull request as draft August 20, 2022 08:54
… during the create_physical_plan() call so this test is misleading and duplicative
…ssion

add test for now() expression via dataframe API
@github-actions github-actions bot added the logical-expr Logical plan and expressions label Aug 21, 2022
don't store optimized logical plan in View. store the original unmodified plan and optimize on every execution of the view.
added tests to cover all cases
@kmitchener kmitchener changed the title WIP: fix issue with now() returning same value across statements fix issue with now() returning same value across statements Aug 22, 2022
@kmitchener kmitchener marked this pull request as ready for review August 22, 2022 17:38
@kmitchener
Copy link
Contributor Author

@andygrove PTAL

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I think one test needs to be added. See earlier comment.

@andygrove andygrove merged commit eedc787 into apache:master Aug 23, 2022
@ursabot
Copy link

ursabot commented Aug 23, 2022

Benchmark runs are scheduled for baseline = c72f547 and contender = eedc787. eedc787 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@kmitchener kmitchener deleted the fix-now-across-statements branch August 23, 2022 10:45
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for thisPR @kmitchener 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

once now() is called in a statement, it forever returns the same value

6 participants