Add support for Databricks Dataset #974
Comments
Hi @tiencuongkieu - are you talking about regular Databricks tables or Delta tables? You can do the former via the SparkJDBCDataSet today, and we also have a Delta dataset in progress in #964.
I think @tiencuongkieu might also mean the native
Ah I wasn't aware of that! @tiencuongkieu we're always open to PRs :)
Yes, exactly as @jiriklein mentioned above, it is for the native saveAsTable and the code is quite simple as he mentioned as well. I'll raise a PR.
Thank you!
Awesome, thank you @tiencuongkieu Then inheritance should come from Let us know if and when you need help with reviews.
Thank you @jiriklein for your suggestion. I raised a draft PR. I still need to add tests though.
I just copied it over from my project's implementation. It inherits from AbstractDataSet only. Maybe we need versioning functionality as well?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Description
Currently Kedro doesn't support reading from or writing to Databricks tables directly. Should we support this feature?
Context
Databricks is used in many projects. Using Databricks tables instead of, for example, S3 has several benefits: we can see sample data in the Databricks UI, we have metadata for the tables, and it is also faster.
Possible Implementation
I already implemented the dataset and used it in a client project.
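For illustration only, here is a minimal sketch of what such a dataset could look like. It is not the code from the client project or the draft PR. The class name `TableDataSetSketch` and its parameters (`table`, `database`, `write_mode`, `file_format`) are hypothetical, and the Spark session is passed in explicitly so the sketch stays self-contained. A real Kedro dataset would presumably subclass `AbstractDataSet` and obtain the session itself. The only Spark calls assumed are `SparkSession.table` and `DataFrameWriter.format(...).mode(...).saveAsTable(...)`.

```python
from typing import Any, Dict


class TableDataSetSketch:
    """Hypothetical sketch of a dataset backed by a Databricks/Spark table.

    It mirrors the `_load` / `_save` / `_describe` methods that a Kedro
    dataset implements, but does not import Kedro or PySpark itself.
    """

    def __init__(
        self,
        spark: Any,  # assumed to be a pyspark.sql.SparkSession
        table: str,
        database: str = "default",
        write_mode: str = "overwrite",
        file_format: str = "parquet",
    ) -> None:
        self._spark = spark
        self._full_name = f"{database}.{table}"
        self._write_mode = write_mode
        self._file_format = file_format

    def _load(self) -> Any:
        # SparkSession.table returns the named table as a DataFrame.
        return self._spark.table(self._full_name)

    def _save(self, data: Any) -> None:
        # DataFrameWriter.saveAsTable persists the DataFrame as a table
        # registered in the metastore.
        (
            data.write.format(self._file_format)
            .mode(self._write_mode)
            .saveAsTable(self._full_name)
        )

    def _describe(self) -> Dict[str, Any]:
        return {
            "table": self._full_name,
            "write_mode": self._write_mode,
            "file_format": self._file_format,
        }
```

With a live session this would be used roughly as `TableDataSetSketch(spark, table="sales", database="raw")._save(df)`. Versioning, as raised in the comments, is left out of the sketch.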