Support DuckDB warehouse #1124

Open
ntnhaatj opened this issue Aug 29, 2023 · 5 comments

@ntnhaatj

First of all, thank you for your awesome product.

Is your feature request related to a problem? Please describe.
No, this is not motivated by a problem. It's about supporting an additional warehouse that I believe would be highly suitable for data-quality observability.

Why DuckDB?

  • Use the Apache Arrow zero-copy support in DuckDB to work seamlessly with various data lake file formats (such as JSON, Parquet, Iceberg, Delta Lake, and others).
  • Leverage DuckDB's robust vectorized engine, which aligns well with the profiling and analyzing requirements of data-quality use cases.

Would you be willing to contribute this feature?
Yes, I am. I've already started development in my forked repository, and it works to a certain extent. However, a few tests are failing, and I would appreciate your guidance to finalize it.

Current limitation
While working on my forked repo, I encountered some limitations:

@wfclark5

wfclark5 commented Oct 4, 2023

@ntnhaatj How's your progress on this? Were you able to figure out the last few tests?

@ntnhaatj
Author

ntnhaatj commented Oct 6, 2023

> @ntnhaatj How's your progress on this? Were you able to figure out the last few tests?

DuckDB has just released v0.9.0, which officially fixes the name-shadowing issue in CTEs.

I tested it on my forked branch, and it passes almost all of the defined test suite. A few tests still fail (3, if I remember correctly), all related to schema changes; I just haven't had time to fix them.

@mamonu

mamonu commented Oct 19, 2023

That is great news. Today I tried to test how dbt-duckdb works with Elementary, and my heart sank when I saw this error message: Adapter "duckdb" is not supported on Elementary.

Thanks for your work on this.

@ntnhaatj
Author

Hi @mamonu, my work is currently designed for my specific use case, primarily data quality reporting. It doesn't guarantee the quality or coverage of all Elementary features.
It might be better to wait until they have official support for DuckDB.

@haritamar
Collaborator

Hi @ntnhaatj !
Not sure whether you've done any additional work on this since then, but if you'd like to contribute DuckDB support, we'd be happy to review it. I think it's also fine if it covers only a subset of features.
Please let me know if you're still interested in contributing this to the main Elementary repo.

Cheers,
Itamar
