Support TIMESTAMP AS OF, VERSION AS OF in SQL #128

Closed
spmp opened this issue Aug 14, 2019 · 21 comments

@spmp

spmp commented Aug 14, 2019

Please add support for the time travel functions TIMESTAMP AS OF, VERSION AS OF, and DESCRIBE HISTORY.

Time travel is a critical Delta use case.

Cheers.

--
Updated by @zsxwing : Spark has added the SQL syntax support in https://issues.apache.org/jira/browse/SPARK-37219 . This will be supported when Delta supports Spark 3.3 (#1217).

@mukulmurthy
Collaborator

Hi @spmp ,

The functionality is already available through the Scala APIs. We don't have custom SQL support yet because that depends on changes in Spark. Documentation for the time travel options is available at:
https://docs.delta.io/latest/delta-batch.html#query-an-older-snapshot-of-a-table-time-travel

Documentation for the Scala APIs for DESCRIBE HISTORY is available at:
https://docs.delta.io/latest/delta-utility.html#history
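
For readers who want to see roughly what those Scala APIs look like, here is a minimal sketch. It assumes Spark and the Delta Lake library are on the classpath; the object and method names (`TimeTravelSketch`, `readVersion`, `readTimestamp`, `history`) are hypothetical wrappers made up for illustration, while `versionAsOf`, `timestampAsOf`, and `DeltaTable.history()` are the reader options and utility call described in the docs linked above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import io.delta.tables.DeltaTable

// Hypothetical helper object, for illustration only.
object TimeTravelSketch {
  // Read the table as of a specific commit version (load-time option).
  def readVersion(spark: SparkSession, path: String, version: Long): DataFrame =
    spark.read.format("delta").option("versionAsOf", version).load(path)

  // Read the table as of a timestamp string such as "2019-08-14 00:00:00".
  def readTimestamp(spark: SparkSession, path: String, timestamp: String): DataFrame =
    spark.read.format("delta").option("timestampAsOf", timestamp).load(path)

  // Scala counterpart of DESCRIBE HISTORY: one row per commit.
  def history(spark: SparkSession, path: String): DataFrame =
    DeltaTable.forPath(spark, path).history()
}
```

Note that both reads are options on the DataFrame reader, so they apply when a path is loaded rather than to a table already registered in a catalog.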

Thanks, and please let us know if you have further questions.

@spmp
Author

spmp commented Aug 25, 2019

(sorry for late response, I received no email notification...)
This feature request is for the SQL API, so that these functions are available on a registered table rather than only as a load-time option.
@mukulmurthy is there an expected time frame for this functionality to exist in Spark? I assume a lot of coordination is required between Spark and delta-io.

@mukulmurthy
Collaborator

You're right, there's definitely some interaction with Spark needed. It's hard to give a good estimate for this because it depends on the Spark 3.0 release (which can't really be scheduled, since it's a community effort with lots of different voices), but we're hoping for sometime this fall.

I'll reopen this issue (and add SQL to the title) so we can use it to track SQL APIs for these features.

@mukulmurthy mukulmurthy reopened this Aug 26, 2019
@mukulmurthy mukulmurthy changed the title from "Support TIMESTAMP AS OF, VERSION AS OF, and DESCRIBE HISTORY" to "Support TIMESTAMP AS OF, VERSION AS OF, and DESCRIBE HISTORY in SQL" Aug 26, 2019
@mukulmurthy mukulmurthy added the enhancement (New feature or request) label Aug 26, 2019
@mukulmurthy mukulmurthy added this to the Future Roadmap milestone Aug 26, 2019
@holydrinker

holydrinker commented Feb 20, 2020

Hi guys, I'm reading this and it seems that SQL support was integrated in delta.

I tried this simple code snippet and it seems to work:

def showDeltaTableHistoryViaSQL() = {
  spark.sql(s"DESCRIBE HISTORY '${mydata.tbAccountsPath}'").show(false)
}

Did I understand correctly? Can this issue be closed?

@tdas
Contributor

tdas commented Feb 25, 2020

Yes, we did add support for simple SQL commands like DESCRIBE HISTORY by providing a SparkSession extension that users can add to their session. What this extension does is inject additional grammar rules that can be used to parse a SQL command that Spark's default rules fail to parse. This was easy to add and works nicely for simple SQL commands that do not take expressions. But adding support for complex SQL commands (e.g. MERGE) or adding new keywords inside existing commands (e.g. TIMESTAMP AS OF inside SELECT) is quite hard.
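
As a rough illustration of the extension mechanism described above, the sketch below registers Delta's session extension and then runs the `DESCRIBE HISTORY` command from the earlier snippet. It assumes Spark and Delta Lake are on the classpath; `DeltaSqlSession`, `build`, and `showHistory` are hypothetical names, and the local master is only there to make the example self-contained.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper object, for illustration only.
object DeltaSqlSession {
  // Build a session with Delta's SQL extension registered, so that
  // commands Spark's default parser rejects can still be parsed.
  def build(appName: String): SparkSession =
    SparkSession.builder()
      .appName(appName)
      .master("local[*]") // assumption: local run for illustration
      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
      .getOrCreate()

  // Run DESCRIBE HISTORY against a path-based Delta table and print it.
  def showHistory(spark: SparkSession, tablePath: String): Unit =
    spark.sql(s"DESCRIBE HISTORY '$tablePath'").show(false)
}
```

Later Delta releases also document setting `spark.sql.catalog.spark_catalog` to `org.apache.spark.sql.delta.catalog.DeltaCatalog`; whether that is needed depends on the Delta and Spark versions in use, so check the documentation for your release.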

Since we have already added DESCRIBE HISTORY but not others, let me update the title to reflect that.

Incidentally, support for the MERGE SQL command will come with the 0.7.0 release, after Spark 3.0 is released (the next release is 0.6.0, in March).

@tdas tdas changed the title from "Support TIMESTAMP AS OF, VERSION AS OF, and DESCRIBE HISTORY in SQL" to "Support TIMESTAMP AS OF, VERSION AS OF in SQL" Feb 25, 2020
@holydrinker

Nice! Thank you very much for your support and your answer!

@brucemen711

Hi @tdas, when will this feature be available? I'm looking forward to it. Thanks.

@spmp
Author

spmp commented Feb 7, 2021

Thanks @brucemen711 for bumping this. I have looked again in 2021 at https://docs.delta.io/latest/delta-utility.html#history and seen that there is (AFAIK) still no SQL interface to TIMESTAMP AS OF and VERSION AS OF.
I was looking for this for the use case where consumers are connecting via a SQL interface such as Hive.

@YannByron
Contributor

I agree that this feature depends on changes in Spark. It involves adding keywords, overriding some visit functions in AstBuilder, and adding unresolved logical plans for time travel.
But we do need this on Spark 2.x. Shall we open a related Spark issue? I would be glad to work on it.
@tdas @mukulmurthy

@zsxwing
Member

zsxwing commented Apr 7, 2021

@YannByron Feel free to raise a Spark issue to ask for the time travel SQL syntax support.

@YannByron
Contributor

> @YannByron Feel free to raise a Spark issue to ask for the time travel SQL syntax support.

https://issues.apache.org/jira/browse/SPARK-34978

@spmp
Author

spmp commented Apr 8, 2021

I am looking forward to it. @YannByron, please keep us in the loop for testing 8) Thanks

@AFFogarty
Contributor

Since this currently doesn't work in SQL, can we remove the SQL examples from the public docs for now? https://docs.delta.io/1.0.0/delta-batch.html#syntax

@zsxwing
Member

zsxwing commented Sep 1, 2021

@AFFogarty Good call. We will review the doc and update the examples.

@spmp
Author

spmp commented Sep 2, 2021

Kia ora All,
To clarify, is this now a Spark issue, with the discussion moving to the Jira issue above (https://issues.apache.org/jira/browse/SPARK-34978)?

If so, should we call this out and close this issue?

The use case I really have in mind is a non-technical user accessing a Delta table via Hive, so I am looking for SQL interfaces to these commands that such a user, or downstream processes, can use.
Is there an existing solution for accessing Delta and time travel with Hive or similar?

@nchammas

> Since this currently doesn't work in SQL, can we remove the SQL examples from the public docs for now? https://docs.delta.io/1.0.0/delta-batch.html#syntax

+1 on this. It's very confusing to look at these docs and then wonder why it doesn't work in practice. :)

@dennyglee dennyglee added the documentation (Improvements or additions to documentation) label and removed the enhancement (New feature or request) label Oct 7, 2021
@dennyglee
Contributor

Great call out @nchammas - this should be fixed for the next documentation release. Will keep this open until this is resolved. Thanks!

@dennyglee dennyglee modified the milestones: Future Roadmap, 1.1.0 Oct 7, 2021
@dennyglee dennyglee self-assigned this Oct 7, 2021
@Tagar
Contributor

Tagar commented Nov 7, 2021

FYI apache/spark#34497

@zjffdu

zjffdu commented Nov 27, 2021

I hit the same issue today. Could you update the official documentation? https://docs.delta.io/latest/delta-batch.html#query-an-older-snapshot-of-a-table-time-travel
Here's another documentation error I found: #845

@zsxwing
Member

zsxwing commented Nov 29, 2021

> I hit the same issue today. Could you update the official documentation? https://docs.delta.io/latest/delta-batch.html#query-an-older-snapshot-of-a-table-time-travel

We are working on the fix for the doc. It will be out soon.

@allisonport-db
Collaborator

Closing this as support was added in #1288 and is being released in 2.1
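
For anyone landing here later, the SQL form requested in this issue looks roughly like the statements built below. This is a small sketch with no Spark dependency: `TimeTravelSql` and its methods are hypothetical helpers that only render query strings, and the table name, path, and timestamp used with them are made-up examples. The resulting strings are meant to be passed to `spark.sql(...)` on a Delta release that supports this syntax.

```scala
// Hypothetical helper object that renders time travel queries as strings.
object TimeTravelSql {
  // SELECT against a table as of a specific commit version.
  def versionAsOf(table: String, version: Long): String =
    s"SELECT * FROM $table VERSION AS OF $version"

  // SELECT against a table as of a timestamp string.
  def timestampAsOf(table: String, timestamp: String): String =
    s"SELECT * FROM $table TIMESTAMP AS OF '$timestamp'"

  // Path-based table identifier, for tables not registered in a catalog.
  def pathTable(path: String): String = s"delta.`$path`"
}
```

For example, `spark.sql(TimeTravelSql.versionAsOf("events", 5)).show()` would query a hypothetical `events` table at version 5, assuming a session configured for Delta.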
