Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add entity column validations when getting historical features from bigquery #1614

Merged
merged 4 commits into from Jun 7, 2021

Conversation

achals
Copy link
Member

@achals achals commented Jun 2, 2021

Signed-off-by: Achal Shah achals@gmail.com

What this PR does / why we need it:

Right now, we don't validate that the entity_df contains the expected columns. This may result in strange errors downstream. But since we already have all the necessary information to validate that the needed columns exist, the validations can catch errors before we try to retrieve data from bigquery.

Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?:

Add entity column validations when getting historical features from bigquery

@codecov-commenter
Copy link

codecov-commenter commented Jun 2, 2021

Codecov Report

Merging #1614 (2c434f3) into master (25daab3) will increase coverage by 0.11%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1614      +/-   ##
==========================================
+ Coverage   83.76%   83.88%   +0.11%     
==========================================
  Files          67       67              
  Lines        5846     5883      +37     
==========================================
+ Hits         4897     4935      +38     
+ Misses        949      948       -1     
Flag Coverage Δ
integrationtests 83.80% <100.00%> (+0.11%) ⬆️
unittests 77.64% <21.05%> (-0.38%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
sdk/python/feast/errors.py 65.78% <100.00%> (+2.93%) ⬆️
sdk/python/feast/infra/offline_stores/bigquery.py 92.48% <100.00%> (+1.45%) ⬆️
sdk/python/tests/test_historical_retrieval.py 98.88% <100.00%> (+0.03%) ⬆️
sdk/python/feast/feature_view.py 88.88% <0.00%> (+0.11%) ⬆️
sdk/python/feast/feature_store.py 92.95% <0.00%> (+0.44%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 25daab3...2c434f3. Read the comment docs.

@@ -0,0 +1 @@
<?xml version="1.0" encoding="utf-8"?><testsuites><testsuite errors="0" failures="0" hostname="Achals-MBP" name="pytest" skipped="0" tests="6" time="154.414" timestamp="2021-06-01T17:24:58.989137"><testcase classname="telemetry_tests.test_telemetry" file="telemetry_tests/test_telemetry.py" line="32" name="test_telemetry_on_v09" time="18.383"></testcase><testcase classname="telemetry_tests.test_telemetry" file="telemetry_tests/test_telemetry.py" line="59" name="test_telemetry_off_v09" time="32.697"></testcase><testcase classname="telemetry_tests.test_telemetry" file="telemetry_tests/test_telemetry.py" line="88" name="test_telemetry_on" time="18.214"></testcase><testcase classname="telemetry_tests.test_telemetry" file="telemetry_tests/test_telemetry.py" line="120" name="test_telemetry_off" time="32.748"></testcase><testcase classname="telemetry_tests.test_telemetry" file="telemetry_tests/test_telemetry.py" line="153" name="test_exception_telemetry_on" time="18.851"></testcase><testcase classname="telemetry_tests.test_telemetry" file="telemetry_tests/test_telemetry.py" line="170" name="test_exception_telemetry_off" time="32.693"></testcase></testsuite></testsuites>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming this should be removed.

missing_keys = expected_columns - entity_df_columns

if len(missing_keys) != 0:
raise ValueError(
Copy link
Member

@woop woop Jun 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achals would love to get your take on whether we should be using a custom error type here (or ValueError)? I've accepted #1615, but now that I think about it we probably need tests that cover this failure case. We can always leave those tests as a TODO in the code until we have completed our test suite refactor, but at that point we probably want different offline stores to raise identical exceptions, which may be easier to do with a custom error type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a specific validation error is better to throw since they can be caught and handled better potentially.

And yeah, re: tests I'll fill in the gaps in the stuff I just merged.

@achals
Copy link
Member Author

achals commented Jun 3, 2021

@woop I've updated the test_historical_features in this PR (and will update the error and tests for stuff landed in #1615 in a separate PR).

@feast-ci-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: achals, woop

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@woop
Copy link
Member

woop commented Jun 7, 2021

/lgtm

…igquery

Signed-off-by: Achal Shah <achals@gmail.com>
Signed-off-by: Achal Shah <achals@gmail.com>
Signed-off-by: Achal Shah <achals@gmail.com>
Signed-off-by: Achal Shah <achals@gmail.com>
@achals achals force-pushed the achal/bigquery-validation branch from 9ee3ae6 to 2c434f3 Compare June 7, 2021 18:35
@feast-ci-bot
Copy link
Collaborator

New changes are detected. LGTM label has been removed.

@feast-ci-bot feast-ci-bot removed the lgtm label Jun 7, 2021
@achals achals added the lgtm label Jun 7, 2021
@feast-ci-bot feast-ci-bot merged commit 17231d0 into feast-dev:master Jun 7, 2021
@achals achals deleted the achal/bigquery-validation branch June 7, 2021 18:48
tsotnet pushed a commit that referenced this pull request Jun 17, 2021
…igquery (#1614)

* Add entity column validations when getting historical features from bigquery

Signed-off-by: Achal Shah <achals@gmail.com>

* make format

Signed-off-by: Achal Shah <achals@gmail.com>

* Remove wrong file

Signed-off-by: Achal Shah <achals@gmail.com>

* Add tests

Signed-off-by: Achal Shah <achals@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants