Extend filter vector ability (focus on off-line featurestore) #1604
Comments
@george0st I assume this filter should be applied to the result set (after the join)?
What do you think? BTW: it is a nice issue for a deep dive :-)
@yaronha, to be honest:
@george0st if the filter is done before the join, a user would need to specify different filters per source feature set, so logically the filtering should be post-join. Depending on the engine, lazy evaluation and query compilation (e.g. in Spark and Dask) may in practice result in some filtering being done before the join. I agree that SQL semantics are the best, but as you know SQL has different sub-dialects, and since we pass the
@yaronha, I understand the logic: if you unify the filter logic and apply it directly to the output at the level of results (independent of targets, because each can have a different filter language), the situation is less complicated. You then only need to support filtering at the level of pandas and Spark data frames. It makes sense to run performance tests for bigger FeatureSets.
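To make the post-join idea concrete, here is a minimal pandas sketch. It is not the MLRun implementation: the feature-set contents, the column names, and the helper `join_then_filter` are all hypothetical and only illustrate applying a single filter expression to the joined result instead of one filter per source feature set.

```python
import pandas as pd

# Two hypothetical feature sets sharing the entity key "customer_id".
transactions = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "amount": [120.0, 35.5, 990.0, 15.0],
})
profiles = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "country": ["CZ", "US", "CZ", "DE"],
})


def join_then_filter(left, right, key, expr):
    """Join two frames on `key`, then apply one filter expression to the result.

    Hypothetical helper: the expression can reference columns from either
    source because it is evaluated after the join.
    """
    joined = left.merge(right, on=key, how="inner")
    return joined.query(expr).reset_index(drop=True)


# One filter that spans columns coming from both feature sets.
result = join_then_filter(
    transactions, profiles, "customer_id",
    "amount > 100 and country == 'CZ'",
)
print(result)
```

On Spark or Dask the same pattern can be written with their DataFrame filter methods, and the engine may push parts of the predicate below the join during query planning.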
See the relation to #1956
It would be very useful to support rich filtering in the vector, e.g.:
High priority
Medium priority
Low priority
BTW: get_offline_features currently supports only exact match, see part of the code