Support Hive managed Transactional/ACID tables #576
Comments
Proposal for supporting Insert Only Transactional Hive tables: https://docs.google.com/document/d/1GwSNc_6jSP8N1SyJIHnaMeWvh7JzhGPZ7QKm1DHJqEw/edit?usp=sharing @electrum @martint @dain please review. I have a WIP implementation of this that can read Insert Only Transactional tables from Hive 3.
@electrum what do you think about creating a high-level issue to track this work? I would say the proposal above is part of the Hive 3.0 support project, and a sub-item of supporting new table organizations. The proposal only covers reading insert-only tables, not writing to them and not full ACID tables.
I read the proposal and it seems reasonable. One question is that it mentions deltas, which to my understanding are only for full ACID, not insert-only. From an implementation perspective, table layouts are deprecated. I’m working on a PR to remove them from the Hive connector. I suggest waiting for that, which should be complete and merged sometime next week, then send a PR on top. I’m happy to review and work with you on it.
@electrum I remembered there was discussion about removing table layouts, but I saw them in master and so went ahead with them. We can wait until your changes land and then replace the table layout dependency with whatever is used to fetch partitions.
@electrum will Presto support Hive ACID tables in the future? When I used Hive to build my company's data warehouse, I found that some dimension tables (SCD) and some fact tables (accumulator snapshot tables) need updates. To allow that, I enabled transaction support in Hive, but this broke Presto, because Presto can't read Hive tables that support row updates or deletes.
@Jedda1314Jessie we are adding read support for ACID tables. A review has been opened for the first part, which supports reading INSERT ONLY ACID tables. I will be adding docs and reviews for full ACID support soon too.
Proposal for Full ACID table reads: https://docs.google.com/document/d/1VrF48kqr_paTtF5iSwryRhZvMwL_ovtbjcAedQd7hyk/edit?usp=sharing I have this implemented and it works fine. Some performance optimizations are possible on top of it, which can be picked up once we finalize this solution.
@stagraqubole is there any update or schedule for this? Thanks!
@stagraqubole got it, thanks for your reply!
Presto 331 (pending release) can read from Hive transactional/ORC ACID tables. The outstanding things are:
All the above is tracked by the Hive 3 umbrella issue #1218
I am using Presto version 0.232 and Hive version 3.1.0. Does this integration support Hive 3.1.0 managed transactional/ACID tables?
@MohammedLayeeq please try with Presto 331 -- https://prestosql.io/download.html
Presto fails to run queries against Hive managed tables. In Hive 3.x these tables default to transactional/insert-only (see https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/hive-overview/content/hive_upgrade_changes.html). Given that this is the default table type in Hive 3.x for all Hive managed tables, can Presto be enhanced to handle them?
Anyone upgrading to Hive 3.x will find Presto much less useful without this support, as Presto can only handle ‘external’ tables in Hive 3.x.
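For readers trying to work out which kind of table they have, here is a minimal Python sketch that classifies a table from its Hive table properties. It assumes the usual Hive convention that `transactional=true` marks a transactional table and `transactional_properties=insert_only` marks the insert-only variant. The function name `classify_hive_table` and the returned labels are hypothetical and are not part of Presto or Hive.

```python
def classify_hive_table(parameters):
    """Classify a Hive table from its table properties (a plain dict).

    Hypothetical helper for illustration only; not a Presto or Hive API.
    """
    # Assumption: the 'transactional' property is compared case-insensitively.
    transactional = parameters.get("transactional", "false").lower() == "true"
    if not transactional:
        return "non_transactional"

    # Assumption: a missing 'transactional_properties' value means full ACID.
    props = parameters.get("transactional_properties", "default").lower()
    if props == "insert_only":
        return "insert_only"
    return "full_acid"


if __name__ == "__main__":
    # Example property maps, as they might appear in SHOW TBLPROPERTIES output.
    print(classify_hive_table({"transactional": "true",
                               "transactional_properties": "insert_only"}))
    print(classify_hive_table({"transactional": "true"}))
    print(classify_hive_table({"EXTERNAL": "TRUE"}))
```

A table classified as `insert_only` is the kind covered by the first read-support proposal in this thread. `full_acid` tables additionally allow row updates and deletes.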
Error message when trying to access a ‘transactional/insert-only’ table: