Add better logging in the indicator runner #1892
Merged
Commits (10):
264067b Add extra logging to delphi utils (rzats)
975c075 Fix linting (rzats)
d9d352e Upgrade & fix more logging (rzats)
fbc8fb8 small tweaks (rzats)
06c0375 Review tweaks (rzats)
23c261a spooky brackets (rzats)
3e18651 more spooky brackets (rzats)
fee17cc less spooky brackets (rzats)
1ba7d36 not here (rzats)
e76baa7 Minor review tweaks (rzats)
@@ -230,7 +230,7 @@ def find_all_unexpected_geo_ids(df_to_test, geo_regex, geo_type):
                 ValidationFailure(
                     "check_geo_id_type",
                     filename=nameformat,
-                    message="geo_ids saved as floats; strings preferred"))
+                    message=f"{len(leftover)} geo_ids saved as floats; strings preferred"))

         if geo_type in fill_len.keys():
             # Left-pad with zeroes up to expected length. Fixes missing leading zeroes
@@ -281,29 +281,35 @@ def check_bad_val(self, df_to_test, nameformat, signal_type, report):

         if percent_option:
             if not df_to_test[(df_to_test['val'] > 100)].empty:
+                bad_values = df_to_test[(df_to_test['val'] > 100)]['val'].unique()
Review comment: nice touch with the
                 report.add_raised_error(
                     ValidationFailure(
                         "check_val_pct_gt_100",
                         filename=nameformat,
-                        message="val column can't have any cell greater than 100 for percents"))
+                        message="val column can't have any cell greater than 100 for percents; "
+                                f"invalid values: {bad_values}"))

             report.increment_total_checks()

         if proportion_option:
             if not df_to_test[(df_to_test['val'] > 100000)].empty:
+                bad_values = df_to_test[(df_to_test['val'] > 100000)]['val'].unique()
                 report.add_raised_error(
                     ValidationFailure("check_val_prop_gt_100k",
                                       filename=nameformat,
                                       message="val column can't have any cell greater than 100000 "
-                                              "for proportions"))
+                                              "for proportions; "
+                                              f"invalid values: {bad_values}"))

             report.increment_total_checks()

         if not df_to_test[(df_to_test['val'] < 0)].empty:
+            bad_values = df_to_test[(df_to_test['val'] < 0)]['val'].unique()
             report.add_raised_error(
                 ValidationFailure("check_val_lt_0",
                                   filename=nameformat,
-                                  message="val column can't have any cell smaller than 0"))
+                                  message="val column can't have any cell smaller than 0; "
+                                          f"invalid values: {bad_values}"))

         report.increment_total_checks()

@@ -346,10 +352,12 @@ def check_bad_se(self, df_to_test, nameformat, report):
             report.increment_total_checks()

         if df_to_test["se"].isnull().mean() > 0.5:
+            bad_mean = round(df_to_test["se"].isnull().mean() * 100, 2)
             report.add_raised_error(
                 ValidationFailure("check_se_many_missing",
                                   filename=nameformat,
-                                  message='Recent se values are >50% NA'))
+                                  message='Many recent se values are missing: '
+                                          f'{bad_mean} > 50%'))

         report.increment_total_checks()

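To make the new check_se_many_missing message concrete, here is a minimal standard-library sketch of the same arithmetic. It is not the validator's code: the PR works on a pandas column via isnull().mean(), whereas this sketch swaps in a plain list where None or NaN counts as missing, and the helper name se_missing_message is hypothetical.

```python
import math


def se_missing_message(se_values):
    """Return the failure message if more than half of se_values are missing.

    Hypothetical helper for illustration only; missing means None or NaN.
    """
    if not se_values:
        return None
    missing = sum(
        1 for v in se_values
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    fraction_missing = missing / len(se_values)
    if fraction_missing > 0.5:
        # Same rounding as the diff: percentage with two decimal places.
        bad_mean = round(fraction_missing * 100, 2)
        # Adjacent literals are concatenated; only the second is an f-string.
        return ('Many recent se values are missing: '
                f'{bad_mean} > 50%')
    return None


# Three of four values are missing, so the check fires.
print(se_missing_message([0.1, None, float("nan"), None]))
# -> Many recent se values are missing: 75.0 > 50%
```

Note that the rendered value carries no percent sign of its own, so the message reads "75.0 > 50%".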
The archiver already logs run timing, and each indicator has its own bit of logging which includes that; this adds logging for the flash & validation steps.
the individual indicators' run time logging looks mostly sufficient, except in quidel_covidtest where it happens at program exit, which will include anything that runs after the core indicator function and thus lead to inaccuracy.
you can refactor this so the runner does the timing and logging, with indicator_fn (aka each indicator's run_module()) returning a dict of metrics it wants logged (like csv_export_count and max_lag_in_days)
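A minimal sketch of the refactor suggested above, assuming indicator_fn takes a params dict and may return a dict of metrics. The names run_indicator, example_run_module and the elapsed_time_in_seconds key are hypothetical and are not taken from the repository's runner.

```python
import logging
import time
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger("indicator_runner_sketch")  # hypothetical logger name


def run_indicator(
    indicator_fn: Callable[[dict], Optional[Dict[str, Any]]],
    params: dict,
) -> Dict[str, Any]:
    """Time only the core indicator function and log one summary line."""
    start = time.perf_counter()
    # The indicator returns whatever metrics it wants logged, or None.
    metrics = indicator_fn(params) or {}
    elapsed = round(time.perf_counter() - start, 2)
    summary = {"elapsed_time_in_seconds": elapsed, **metrics}
    logger.info("Completed indicator run: %s", summary)
    # Validation and archiving would run after this point, outside the timing.
    return summary


def example_run_module(params: dict) -> Dict[str, Any]:
    """Stand-in for an indicator's run_module(); the values are made up."""
    return {"csv_export_count": 3, "max_lag_in_days": 2}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    run_indicator(example_run_module, {})
```

Because the clock stops as soon as indicator_fn returns, later steps in the runner no longer inflate the reported run time.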
Hm, I can actually think of one disadvantage of that - it means that the summary line will be gone from the logs if the indicator is run individually (e.g. env/bin/python -m delphi_quidel_covidtest as the README suggests).
Are we OK with this?
a lot of the documentation in this repo is quite old and should be brought up to date.
is it ever desirable to run an indicator without validation and archiving? perhaps we can answer that in #1895. you should at least fix the timing for the quidel indicator in the meantime.