-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature Store] Read optimisation on partitioned target #5465
[Feature Store] Read optimisation on partitioned target #5465
Conversation
… CSVTarget and DFTarget
…CSVSource, dataframeSource and BigQuerySource
only in parquetsource or parquettarget.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in principle but I do have some comments. See below.
Co-authored-by: Gal Topper <gal.topper@gmail.com>
mlrun/feature_store/api.py
Outdated
@@ -175,6 +176,14 @@ def get_offline_features( | |||
By default, the filter executes on the timestamp_key of each feature set. | |||
Note: the time filtering is performed on each feature set before the | |||
merge process using start_time and end_time params. | |||
:param additional_filters: (list of tuples, optional): List of additional_filters conditions as tuples. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to follow the convention of how we specify type and optionality in the rest of the code base. I guess there isn't exactly one way that we do this, but better not introduce one more way. Docs should be consistent.
Co-authored-by: Gal Topper <gal.topper@gmail.com>
Co-authored-by: Gal Topper <gal.topper@gmail.com>
…_target' into read_optimisation_on_partitioned_target
mlrun/datastore/base.py
Outdated
@@ -550,6 +575,12 @@ def as_df( | |||
:param end_time: filters out data after this time | |||
:param time_column: Store timestamp_key will be used if None. | |||
The results will be filtered by this column and start_time & end_time. | |||
:param additional_filters: List of additional_filters conditions as tuples. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:param additional_filters: List of additional_filters conditions as tuples. | |
:param additional_filters: List of additional filter conditions as tuples. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM otherwise.
a) In local_merger + dask_merger.
b) In ParquetSource.
c) In ParquetTarget.
d) get_offline_features and more.
ML-6161