Is it possible to reference tables stored as Parquet files in Iceberg on Glue via an external source? #184
Comments
Mmm, I've started working on Iceberg as a source using an AWS catalog via a plugin mechanism here: https://github.com/jwills/dbt-duckdb/blob/master/dbt/adapters/duckdb/plugins/iceberg.py ...but it's still pretty early/experimental. Can you tell me a bit more about what you're up to, and I'll see if I can figure out a way to help?
Yes, we have a DB schema that we pipe to Glue and want to do some testing/analysis on. The DB updates every few hours with an Iceberg merge, so we need just the latest snapshot to pipe into DuckDB, and we then want to run dbt transformations on it.
Okay-- so in theory, this will work:
...and yeah, in theory, that should load the latest Iceberg snapshot in its entirety as a PyArrow table into DuckDB for subsequent transformation by the rest of the pipeline. (But again, this is super new and I haven't tried it for real, so maybe take a look at dbt-athena and see if that meets your needs, etc., etc.)
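For anyone who wants to see the idea outside of dbt, here is a rough sketch of "load the latest Iceberg snapshot as a PyArrow table into DuckDB" using PyIceberg and the DuckDB Python API directly. This is not necessarily what the plugin does internally; it is only an illustration under assumptions. The helper name `load_latest_snapshot`, the table identifier `my_glue_db.my_table`, the view name `my_table_latest`, and the file `analysis.duckdb` are all hypothetical, and AWS credentials are assumed to come from the environment.

```python
def load_latest_snapshot(catalog, table_identifier, con, view_name):
    """Read an Iceberg table's current snapshot and expose it to DuckDB.

    `catalog` is expected to behave like a PyIceberg catalog and `con`
    like a DuckDB connection (hypothetical helper, not part of dbt-duckdb).
    """
    table = catalog.load_table(table_identifier)
    # With no snapshot_id given, scan() reads the table's current snapshot.
    arrow_table = table.scan().to_arrow()
    # register() makes the Arrow table queryable as a view on this connection.
    con.register(view_name, arrow_table)
    return arrow_table


def example_usage():
    # Imports are local so the helper above does not need AWS access to import.
    import duckdb
    from pyiceberg.catalog import load_catalog

    # Assumption: a Glue-backed catalog, with credentials/region from the environment.
    catalog = load_catalog("glue", **{"type": "glue"})
    con = duckdb.connect("analysis.duckdb")  # hypothetical database file
    load_latest_snapshot(catalog, "my_glue_db.my_table", con, "my_table_latest")
    # The registered view only lives as long as the connection, so copy it
    # into a real table if later steps should see it.
    con.execute("CREATE OR REPLACE TABLE my_table AS SELECT * FROM my_table_latest")
    print(con.execute("SELECT count(*) FROM my_table").fetchall())
```

Calling `example_usage()` needs `pyiceberg` (with Glue support), `pyarrow`, and `duckdb` installed. Since this reads the whole snapshot into memory, it is probably only reasonable for tables that fit comfortably on one machine.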
I see a method to write to Glue, but no reference for how to use it to read the data back out.
Please help!