Motivation: Why do you think this is important?
A DuckDB integration would make it possible to run queries efficiently on files, dataframes, and pyarrow tables. A native integration with Flyte, which would be the first of its kind, opens up a new set of possibilities for using DuckDB from within an orchestration platform.
Goal: What should the final outcome look like, ideally?
A task plugin that lets users run queries seamlessly, along the lines of the following prototype:
duckdb_task = DuckDBQuery(name="duckdb_task", query="SELECT SUM(a) FROM mydf", inputs=kwtypes(mydf=pd.DataFrame))
Describe alternatives you've considered
An alternative is to handle DuckDB code from within a Flyte task: https://gist.github.com/samhita-alla/003c3f409e8caa88470f6f7206b54ae3.
Propose: Link/Inline OR Additional context
A task plugin that accepts a query, an input (a dataframe, pyarrow table, Parquet file, or CSV file), and parameters.