Presto plugin for Hoodie #81
Comments
Here we hope to add support for faster (batch) point lookups. |
Is this completed? It seems the Presto patch is merged. I was trying to follow the Hudi docs to build a POC with Presto and a minimal Hive metastore (https://stackoverflow.com/questions/43727964/does-presto-require-a-hive-metastore-to-read-parquet-files-from-s3) on top of S3 files written by Hudi. It seems Presto
Any guidance on what went wrong? Thanks in advance! |
Yes, the Presto patch is merged, and Presto support is via the Hive catalog.
For copy-on-write, Hoodie support works without any need for a plugin. This ticket is more for the longer term. |
Thanks for the clarification, @vinothchandar. I guess it was due to my own local environment setup, so I fell back to the local setup from the quickstart, with hadoop-2.6.0-cdh5.4.7, hive-1.1.0-cdh5.4.7, Presto 0.205, and a dataset generated by HoodieJavaApp with default options. Presto is working fine for
|
I dug a little more. It seems Presto is trying to decode the binary-serialized Parquet written by Hudi into a double, which will require a Not sure if the above holds true. If it does, will every open-source Presto user running Uber's Hudi hit the same issue? I wonder how this is resolved at Uber. |
So, this does not seem like a Hudi issue to me. The Parquet files generated by Hudi are standard Parquet files. Can you try copying the files themselves into another (non-Hudi) table and see if that works? Also, validate that the Parquet files are good via Spark SQL. Please feel free to open a new issue for this, since this one is about the Presto plugin. |
Thanks for the info @vinothchandar |
Nice to hear. Let's put together a Hoodie Docker container for the future. Given so many dependencies, it can be overwhelming at times :) |
Totally agree. Any thoughts on which folder this should go in? |
We can create a hoodie-docker folder and host all the scripts and Dockerfiles there.