Add deltalake backend #3865
Comments
As I understand it, you want to query a feature table in Delta format,
No, I would like to do this without a JVM application. The delta-rs Python bindings (deltalake) can be used to achieve this: https://github.com/delta-io/delta-rs
Cool, we need some changes to extend FileSource to read Delta tables. Do you mind contributing?
Sure, if you can give me some pointers : )
@ion-elgreco Let me try to give you a quick rundown of options for how the integration might look. First of all, The concept closest to Feast has another concept called
@tokoko Gotcha, that helps! Since I mainly use Polars, I will look into adding it as an offline store and then add Delta as an additional FileSource, using deltalake as a dependency. Yup, Polars uses deltalake to read and write.
Glad to be able to help. One more pointer that may help you out, but note that this is my preferred direction, which I'm trying to push (without much luck as of yet :) ). Despite your preference for Polars, you should probably still check out the DuckDB PR I linked above. The actual offline store implementation is written using ibis rather than DuckDB directly. As ibis has a fairly good Polars backend, you could easily reuse the same ibis implementation. In that case, the Polars implementation might be just a single-line code change (probably not, but something close to that).
Great explanation @tokoko!
Is your feature request related to a problem? Please describe.
I am considering using Feast, but the main back-end we use, deltalake, is not supported. Deltalake is the only data lake implementation that has read and write support without a JVM. This makes it fairly easy to build large data lakes with only Python, but without a feature store wrapper there is no easy way to make features accessible.
Describe the solution you'd like
Add deltalake as an officially supported back-end.
Describe alternatives you've considered
There aren't really any.