feat: Feast extractor #414
Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>
3e390a3 to c164b3e
Thanks, I will take a look.
@szczeles great contribution! I wonder how easy it is to set up databuilder as a Kubeflow job. Since most people use Airflow for daily orchestration, I am interested in learning about and trying Kubeflow as well.
lgtm
columns,
)

if self._describe_feature_tables:
Interesting to see that you yield the programmatic description in the same extractor.
Actually it was quite easy, since both tables and programmatic descriptions use the same class. I once considered creating multiple extractors, one for tables and a second for programmatic descriptions, but decided not to overcomplicate things.
cc @dikshathakur3119 FYI, as this will make the Lyft internal programmatic description use case easier.
it is super useful
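A minimal sketch of the pattern discussed in this thread, in which one extractor yields both the table definition and an optional programmatic description using the same record class. `TableRecord`, `records_for`, the `labels` argument, and the `"labels"` description source are hypothetical stand-ins for illustration, not the databuilder or Feast API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class TableRecord:
    """Hypothetical stand-in for the metadata class the extractor yields."""
    name: str
    columns: List[str] = field(default_factory=list)
    description: Optional[str] = None
    # Assumption: a non-empty source marks the record as a programmatic description.
    description_source: Optional[str] = None


def records_for(name: str,
                columns: List[str],
                labels: Dict[str, str],
                describe_feature_tables: bool = True) -> Iterator[TableRecord]:
    # First record: the table definition itself.
    yield TableRecord(name=name, columns=columns)

    # Optional second record: same class, carrying only a programmatic description.
    if describe_feature_tables and labels:
        text = "\n".join(f"{key}: {value}" for key, value in sorted(labels.items()))
        yield TableRecord(name=name, description=text, description_source="labels")


records = list(records_for("driver_stats", ["driver_id", "trips_today"], {"team": "ml"}))
```

Because both records share one class, a single loader can consume the stream without knowing which kind of record it received, which is the simplification the author describes.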
)
)

for index, feature in enumerate(feature_table.features):
Is there any ordering in Feast for entities vs. features? Or do you just put the entity columns before the feature columns?
Feast doesn't define ordering, these are just different properties of the feature table. I put entities first, because they are like "primary keys" of the table.
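The ordering choice described above can be sketched as follows: entity columns are numbered first, and feature columns continue from that offset. `Column` and `build_columns` are hypothetical names used only for illustration; they are not the classes used in this PR.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Column:
    """Hypothetical stand-in for a column record with an explicit sort order."""
    name: str
    col_type: str
    sort_order: int


def build_columns(entities: List[Tuple[str, str]],
                  features: List[Tuple[str, str]]) -> List[Column]:
    # Entities come first because they act like the table's primary keys.
    columns = [
        Column(name, col_type, index)
        for index, (name, col_type) in enumerate(entities)
    ]
    # Features follow, with sort orders offset by the number of entities.
    for index, (name, col_type) in enumerate(features):
        columns.append(Column(name, col_type, len(entities) + index))
    return columns


columns = build_columns(
    entities=[("driver_id", "INT64")],
    features=[("trips_today", "INT32"), ("rating", "FLOAT")],
)
```

Since Feast itself defines no ordering, the offset is purely a display convention: it keeps the key-like columns at the top of the column list.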
It would be good to evangelize this in the Feast community as well, given that both projects are in LFAI incubation!
Thanks! It's super easy to set up in Kubeflow. Kubeflow Pipelines is a system for orchestrating and scheduling runs of different containers, so I built one Docker image with databuilder, added the scripts inside, and I call them one by one in the pipeline like this:
Absolutely! As soon as the feature is merged I'm going to evangelize there :-) The lack of a user interface for feature exploration was noticed by the Feast devs as well, see: https://docs.feast.dev/#problems-feast-does-not-yet-solve, so it can be a valuable add-on.
Thanks @szczeles for the info!
Summary of Changes
This PR provides an extractor for the Feast feature store, announced in amundsen-io/amundsen#815. Apart from FeatureTable definitions, the extractor also pushes the metadata collected by Feast as programmatic descriptions, so they look like this on the Frontend:
A new "extra" dependency is added: feast (the Python SDK for Feast), which is distributed under the ASF licence.
Fixes amundsen-io/amundsen#815
Tests
Unit tests for the extractor class.
Documentation
A sample job with a loader definition, plus docstrings for the FeastExtractor class and its extract method.
CheckList
Make sure you have checked all steps below to ensure a timely review.
make test