Exploring Snowflake Support via PyArrow & Pandas #104

Open
kjam opened this issue Sep 15, 2020 · 0 comments


kjam commented Sep 15, 2020

Is your feature request related to a problem? Please describe.
We would like to eventually support workflows that use Snowflake. One idea that has come up is to use the Snowflake Python connector's integration with PyArrow and Pandas, which would let us support Snowflake data and learn more about Arrow along the way: https://docs.snowflake.com/en/user-guide/python-connector-pandas.html

Describe the solution you'd like
An initial test of whether this workflow is feasible would be useful: it would show what benchmarks we can create for pulling data into Pandas and then applying a Cape policy to it. It might also be worth diving into the library internals to see how the Query -> Arrow -> DataFrame workflow works!
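
As a starting point for such a test, here is a minimal sketch of a benchmark harness. It assumes the cursor comes from `snowflake-connector-python` installed with its pandas extra, so that `fetch_pandas_all()` is available. The names `benchmark_fetch_and_policy` and `run_against_snowflake` are hypothetical, and `apply_policy` is a placeholder for whatever callable ends up applying a Cape policy to a DataFrame; none of this is existing Cape code.

```python
import time
from typing import Any, Callable, Dict, Tuple


def benchmark_fetch_and_policy(
    cursor: Any,
    query: str,
    apply_policy: Callable[[Any], Any],
) -> Tuple[Any, Dict[str, float]]:
    """Time the fetch step and the policy step separately.

    `cursor` is assumed to behave like a Snowflake connector cursor.
    `apply_policy` is a placeholder: any callable that takes the fetched
    DataFrame and returns the protected one.
    """
    t0 = time.perf_counter()
    cursor.execute(query)
    # Query -> Arrow -> DataFrame happens inside the connector here.
    df = cursor.fetch_pandas_all()
    t1 = time.perf_counter()

    protected = apply_policy(df)
    t2 = time.perf_counter()

    timings = {
        "fetch_seconds": t1 - t0,   # query execution plus conversion to pandas
        "policy_seconds": t2 - t1,  # time spent applying the policy
    }
    return protected, timings


def run_against_snowflake(
    user: str,
    password: str,
    account: str,
    query: str,
    apply_policy: Callable[[Any], Any],
) -> Tuple[Any, Dict[str, float]]:
    """Hypothetical entry point; credentials are supplied by the caller."""
    # Imported lazily so the harness above can be exercised without the connector.
    import snowflake.connector

    conn = snowflake.connector.connect(user=user, password=password, account=account)
    try:
        return benchmark_fetch_and_policy(conn.cursor(), query, apply_policy)
    finally:
        conn.close()
```

Keeping the two timings separate should make it easier to tell whether the fetch or the policy application dominates for a given table size. For large results, the connector's `fetch_pandas_batches()` might be a better fit than loading everything at once.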

Describe alternatives you've considered
We have also explored the idea of an ODBC or JDBC layer as another way of solving this issue.

Additional context
We could pick an interesting example use case and try it out! It would also be exciting to hear about the architecture choices they made here, and to explore how we might apply policy in (or to) Arrow.
