[Core feature] DuckDB Integration #3246

Closed
2 tasks done
samhita-alla opened this issue Jan 19, 2023 · 0 comments · Fixed by flyteorg/flytekit#1419

Motivation: Why do you think this is important?

A DuckDB integration would let users run queries efficiently on files, dataframes, and pyarrow tables. A native integration with Flyte would also make it possible to use DuckDB from within an orchestration platform, which would be the first integration of its kind.

Goal: What should the final outcome look like, ideally?

A task plugin that lets users run queries seamlessly, along the lines of the following prototype:

duckdb_task = DuckDBQuery(name="duckdb_task", query="SELECT SUM(a) FROM mydf", inputs=kwtypes(mydf=pd.DataFrame))

Describe alternatives you've considered

An alternative is to handle DuckDB code from within a Flyte task: https://gist.github.com/samhita-alla/003c3f409e8caa88470f6f7206b54ae3.

Propose: Link/Inline OR Additional context

A task plugin that accepts a query, an input source (a dataframe, pyarrow table, Parquet file, or CSV file), and query parameters.

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@samhita-alla added the enhancement (New feature or request) and untriaged (This issue has not yet been looked at by the Maintainers) labels on Jan 19, 2023
@cosmicBboy added this to the 1.4.0 milestone on Jan 29, 2023
@cosmicBboy removed the untriaged label on Mar 3, 2023