This repository has been archived by the owner on Dec 18, 2023. It is now read-only.

Support for incremental models #5

Open
buremba opened this issue May 10, 2019 · 6 comments

@buremba

buremba commented May 10, 2019

This is one of dbt's killer features, and I think it should be available for Presto as well. I will try to create a PR for it, but it might take a few weeks.

If anyone is willing to implement this feature in less than two weeks, feel free to comment.

@drewbanin
Contributor

Thanks @buremba! How are you thinking about implementing incremental models? We like to use the merge statement on Snowflake/BigQuery, but I don't believe that's possible on Presto. For databases that don't support merge, we implement incremental models with a delete and an insert. You can find some more info on this approach here.

I think the delete + insert approach will work, but it's unfortunately not atomic, as I don't believe Presto supports transactions. Do you know how most Presto projects handle incremental updates to a table? My understanding is that this is typically handled with partitions, not insert/delete statements.
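
For readers unfamiliar with the delete + insert approach, here is a minimal sketch of the idea. It uses Python's built-in sqlite3 purely as a stand-in database, not Presto, and the table names (`events`, `events_new`), the `unique_key` argument, and the helper name are all hypothetical. Rows in the target that share a key with the new batch are deleted, then the whole batch is inserted.

```python
import sqlite3


def incremental_delete_insert(conn, target, staging, unique_key):
    """Delete target rows whose key appears in staging, then append staging."""
    # `with conn` commits on success and rolls back on error. sqlite3 gives us
    # this atomicity; whether Presto can is exactly the open question here.
    with conn:
        conn.execute(
            f"DELETE FROM {target} "
            f"WHERE {unique_key} IN (SELECT {unique_key} FROM {staging})"
        )
        conn.execute(f"INSERT INTO {target} SELECT * FROM {staging}")


# Hypothetical example data: one existing table and one batch of new rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, status TEXT)")
conn.execute("CREATE TABLE events_new (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [(1, "old"), (2, "old")])
conn.executemany(
    "INSERT INTO events_new VALUES (?, ?)", [(2, "updated"), (3, "new")]
)
conn.commit()

incremental_delete_insert(conn, "events", "events_new", "id")
rows = conn.execute("SELECT id, status FROM events ORDER BY id").fetchall()
print(rows)
```

Without the surrounding transaction, a failure between the two statements would leave the target with the old rows deleted and the new ones missing, which is the atomicity concern raised above.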

@buremba
Author

buremba commented May 10, 2019

@drewbanin At the moment, the only way to make this work in Presto is delete + insert, but Presto does support transactions to some extent (https://prestodb.github.io/docs/current/sql/start-transaction.html).

The problem is that Presto is really just a distributed query executor. Under the hood, it has a concept of connectors, each of which might be backed by an RDBMS, S3, Hadoop, Elasticsearch, etc. AFAIK, only a few connectors support transactions, so this feature won't be available for most of them. I will look into which connectors support transactions and let you know.
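
As a rough sketch of what a transactional delete + insert could look like, the helper below only builds the list of statements. `START TRANSACTION` and `COMMIT` are Presto statements; the helper name, the table names, and the key column are hypothetical, and whether a given connector accepts the `DELETE` or the multi-statement transaction at all is an assumption that would need checking per connector.

```python
def transactional_delete_insert(target, staging, unique_key):
    """Build the statements for a delete + insert wrapped in one transaction.

    This only produces SQL text. Running it requires a client that sends all
    four statements in the same transaction, and a connector that supports
    both transactions and this form of DELETE (an assumption, not a given).
    """
    return [
        "START TRANSACTION",
        f"DELETE FROM {target} "
        f"WHERE {unique_key} IN (SELECT {unique_key} FROM {staging})",
        f"INSERT INTO {target} SELECT * FROM {staging}",
        "COMMIT",
    ]


# Hypothetical table and column names, for illustration only.
statements = transactional_delete_insert("events", "events_new", "id")
for statement in statements:
    print(statement + ";")
```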

@Cabeda

Cabeda commented Jan 30, 2020

Are there any updates on this?

@chickenPopcorn

Can we use ALTER TABLE with a new location to simulate this? I believe the ALTER TABLE statement is atomic. We would first have to merge the existing data with the incoming data and save the result to a new location, then use ALTER TABLE to point the original table to the new location. The drawbacks are that it is not space efficient (maybe it needs a vacuum step) and that it is only suited to infrequent batch use.
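
To make the idea concrete, here is a small in-memory simulation in Python. It is not Presto or Hive code: `catalog` stands in for the metastore (table name to location), `storage` stands in for the files at each location, and the helper name and paths are hypothetical. The final repoint is a single assignment, which plays the role that a statement such as Hive's `ALTER TABLE ... SET LOCATION` would play in a real setup.

```python
def swap_location_incremental(catalog, storage, table, incoming,
                              unique_key, new_location):
    """Merge incoming rows with existing data, write to a new location, repoint."""
    old_location = catalog[table]

    # Step 1: merge existing and incoming rows; incoming wins on key conflicts.
    merged = {row[unique_key]: row for row in storage[old_location]}
    merged.update({row[unique_key]: row for row in incoming})

    # Step 2: write the merged data to the new location. The table still
    # points at the old location, so readers are unaffected so far.
    storage[new_location] = list(merged.values())

    # Step 3: repoint the table in one step (the "ALTER TABLE" moment).
    catalog[table] = new_location

    # The old location is now unreferenced and is a candidate for cleanup.
    return old_location


# Hypothetical paths and rows, for illustration only.
catalog = {"events": "s3://example-bucket/events/v1"}
storage = {
    "s3://example-bucket/events/v1": [
        {"id": 1, "status": "old"},
        {"id": 2, "status": "old"},
    ]
}
incoming = [{"id": 2, "status": "updated"}, {"id": 3, "status": "new"}]

stale = swap_location_incremental(
    catalog, storage, "events", incoming, "id", "s3://example-bucket/events/v2"
)
current_rows = storage[catalog["events"]]
print(current_rows)
```

Until the stale location is deleted, both copies of the data exist side by side, which is the space cost mentioned above.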

@mdesmet
Contributor

mdesmet commented Aug 10, 2021

They are working on MERGE INTO for Hive: trinodb/trino#7708

@Cabeda

Cabeda commented Sep 8, 2021

@mdesmet I've opened an issue on the dbt-trino adapter, and I'll try to put together a proof of concept.

5 participants