New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-43381: bps report doesn't show error codes with submit directory as the run id #31
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #31 +/- ##
===========================================
+ Coverage 17.01% 35.34% +18.33%
===========================================
Files 5 7 +2
Lines 1387 1737 +350
Branches 292 318 +26
===========================================
+ Hits 236 614 +378
+ Misses 1148 1121 -27
+ Partials 3 2 -1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor suggestions. Merge approved.
24c6cfe
to
26736dd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merge approved. Minor comments mostly suggestions about comments.
Overall I have concerns that someday calling multiple functions per job trying to find the right handler vs just the nested if conditions may end up being too much overhead when there are 100k+ jobs. Right now these overheads are hidden by the time for reading multiple files. Also, this approach was to make future changes easier by just writing another handler. Note, however, the code will need changes if the handlers are needed for values independent of exit values (i.e., can't just quit trying handlers if exit values are handled).
``bps report`` wasn't showing error codes/counts when the submit directory was used as the run id due to missing attributes in jobs' ClassAds. Made changes to ensure that these attributes will be added to the ClassAds.
Existing unit tests for ``_get_exit_codes_summary()`` were covering only the main use case scenario. Added tests that also covered some of the possible alternative scenarios (e.g. missing attribute, unknown job status).
With multiple "corner" cases the function responsible for including jobs' exit statuses became too complex. I divided it to smaller, more manageable pieces.
Both 'Handler' and 'Chain' classes are intended to be fairly generic. The docstrings and some error messages implied that they can be only used for dealing with job ClassAds. Edited the docstrings and slightly refactored Chain.append() to make it clear.
Some unit tests were using assertIs() instead of just assertFalse()/assertTrue() and were implicitly taking advantage of the fact that the ad returned by the handler is the same ad that is passed to it which in general doesn't need to be true. Fixed all of those.
Added extra checks to handlers which should allow handlers to be more specific about what job ClassAds they can handle to avoid situation in which, for example, a handler for completed jobs will attempt to process a ClassAd for a held job.
Based on some ad hoc tests I made on my workstation with a workflow consisting of approx. 100k jobs (99,215), this version (i.e. with the bug fix) is roughly 1 s (or ~3%) slower than the original version (without the bug fix). However, the original version effectively does next to nothing in terms of determining job exist status when the submit directory is used as the run id (hence this bug fix). I'd expect the difference between the new version and the a version which works correctly, but uses multi-branch, likely nested if statements would be smaller (if any).
Of course, it's a matter of personal preferences, but I do think that implementing future changes (and writing unit tests for them) should be easier when the "business logic" of dealing with a job of the given type is encapsulated in a fairly limited, isolated piece of code instead being a part of a (large) set of if statements. |
Checklist
doc/changes