@zhilingc
Collaborator

@zhilingc zhilingc commented May 26, 2020

What this PR does / why we need it:
Resubmission of #612, rebased on master.

This PR adds support for retrieving batch statistics over data ingested into a Feast warehouse store, as proposed in M2 of the Feature Validation RFC.

Note that it deviates from the RFC in the following ways:

  • Statistics are computed using SQL. This is because TFDV is, unfortunately, only available in Python, and multi-SDK connectors for Beam are still a work in progress. Computing the statistics using SQL is the compromise until either TFDV is available in Java or cross-language execution is supported.
  • Statistics can only be computed over a single feature set at a time. This is mostly to reduce implementation complexity. Since datasets cannot span multiple feature sets, it makes sense to have this restriction in place.
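
To give a rough idea of what "statistics computed using SQL" over a single feature set can look like, here is a minimal sketch. It is not the code in this PR: it uses Python's built-in `sqlite3` instead of the warehouse store, and the `compute_numeric_stats` helper, the `driver_features` table, and the `trips_today` column are all hypothetical names chosen for illustration.

```python
import sqlite3


def compute_numeric_stats(conn, table, column):
    """Compute simple batch statistics for one numeric column with plain SQL.

    Hypothetical helper for illustration only; not part of Feast.
    """
    # Identifiers cannot be bound as query parameters, so this sketch
    # only accepts simple names to avoid building unsafe SQL.
    if not (table.isidentifier() and column.isidentifier()):
        raise ValueError("table and column must be simple identifiers")

    query = f"""
        SELECT
            COUNT(*)        AS total_count,
            COUNT({column}) AS non_null_count,
            MIN({column})   AS min,
            MAX({column})   AS max,
            AVG({column})   AS mean
        FROM {table}
    """
    row = conn.execute(query).fetchone()
    keys = ("total_count", "non_null_count", "min", "max", "mean")
    stats = dict(zip(keys, row))
    # COUNT(column) skips NULLs, so the difference is the missing-value count.
    stats["num_missing"] = stats["total_count"] - stats["non_null_count"]
    return stats


# Assumed toy data standing in for one ingested feature set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE driver_features (driver_id INTEGER, trips_today REAL)")
conn.executemany(
    "INSERT INTO driver_features VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, None), (4, 30.0)],
)
stats = compute_numeric_stats(conn, "driver_features", "trips_today")
print(stats)
```

Because each query runs against one table, a sketch like this naturally covers one feature set at a time, which lines up with the restriction described above.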

This is a bit of a chonky PR, but refer to the attached notebook to see what this implementation looks like for a user of Feast.

Does this PR introduce a user-facing change?:

- Adds GetFeatureStatistics to the CoreService
- Adds get_statistics method to the client in the Python SDK

@zhilingc zhilingc added kind/feature New feature or request area/core labels May 26, 2020
@zhilingc zhilingc force-pushed the batch-query-get-stats-api-2 branch 2 times, most recently from 5567ce1 to 7115066 Compare May 29, 2020 05:56
@zhilingc
Collaborator Author

/test test-end-to-end

@zhilingc zhilingc force-pushed the batch-query-get-stats-api-2 branch from 7115066 to 2782cff Compare June 2, 2020 03:51
@zhilingc zhilingc force-pushed the batch-query-get-stats-api-2 branch from 2782cff to 1b52bc2 Compare June 9, 2020 02:41
@woop
Member

woop commented Jun 9, 2020

/lgtm

@zhilingc
Collaborator Author

zhilingc commented Jun 9, 2020

/test test-end-to-end

@zhilingc
Collaborator Author

zhilingc commented Jun 9, 2020

/approve

@khorshuheng
Collaborator

/approve

@woop woop added this to the v0.6.0 milestone Jun 9, 2020
@feast-ci-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: khorshuheng, woop, zhilingc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [khorshuheng,woop,zhilingc]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zhilingc zhilingc merged commit 1e12d3f into feast-dev:master Jun 9, 2020