-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Removed ParentageResolved check, logged ParentageResolved false #11805
Removed ParentageResolved check, logged ParentageResolved false #11805
Conversation
Jenkins results:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@d-ylee Dennis, these changes look good to me. But I left a comment along the code for your consideration.
In addition, I am inclined to say that we should not archive announced
workflows unless their parentage has been resolved. Otherwise we have no way to spot such issues in the future. That means, we should likely add to the archival pipeline an if statement checking for the status and wflow['ParentageResolved']
information (or any other alternative implementation).
@todor-ivanov what do you think?
msg += " Will retry again in the next cycle." | ||
self.logger.info(msg) | ||
self._checkStatusAdvanceExpired(wflow, additionalInfo=msg) | ||
# elif wflow['RequestStatus'] == 'announced' and not wflow['ParentageResolved']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to completely remove these lines.
if wflow['RequestStatus'] == 'announced' and not wflow['ParentageResolved']: | ||
msg = f"Adding workflow: {wflow['RequestName']} - 'ParentageResolved' flag set to false." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see the point of those two lines here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, maybe moving it down to the archival method would be good enough. I think it's important to log that A workflow can't go through the component (completely) because its parentage has not been resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amaltaro, @d-ylee , this msg
variable here is not logged in practice. If left here as it is, and not sent to the logger it will be overwritten once the cycle bellow is entered. And here follow two more comments of mine:
- We are not "Adding" this workflow here or anywhere else. We are just "Not skipping" it in the
_dispatch
step - ergo in the cleanup process. So the message is misleading. - And yes if we are about to log the fact that a workflow cleanup is not skipped, but the workflow itself is not going to be archived, the proper place for such message is in the
archive
method in the code snippet i was pointing to in my general comment. More on this - this way we will avoid duplicated checks and also will benefit from the fact that once the workflow is kept in the system for time longer than the archival threshold this log message will be transformed into a proper alarm. These few lines bellow should do the job:
if not (wflow['ParentageResolved'] or wflow['ForceArchive']):
msg = "Not properly cleaned workflow: %s - 'ParentageResolved' flag set to false." % wflow['RequestName']
self.logger.info(msg)
if self._checkStatusAdvanceExpired(wflow, additionalInfo=msg):
self.alertStatusAdvanceExpired(wflow)
raise MSRuleCleanerArchivalSkip(msg)
hi @amaltaro
This is quite simple. The check needs to follow the exact same logic as this one: WMCore/src/python/WMCore/MicroService/MSRuleCleaner/MSRuleCleaner.py Lines 572 to 576 in dac2076
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@d-ylee I left one comment inline, which you may want to take a look at.
@d-ylee Dennis, I just remembered that we have some unit tests relying on workflow dump, example in the following lines: One option would be for you to adjust one of those dumps (maybe the StepChain one) and see how it behaves in the component with your changes. |
71d64ec
to
e0a3657
Compare
I made the changes suggested above. @amaltaro It looks like in the StepChain dump, ParentageResolved was already set to false. I also do not see self.stepChainReq being used in any tests of MSRuleCleaner_t.py, unless I am missing something. Would it be better if I use the existing file and write a test that uses it? |
Jenkins results:
|
@d-ylee we could either adapt the unit tests to use |
@amaltaro I ran the test with the existing
I did modify the Is this what we are expecting with the changes? |
The json dump has this request status: and if unit tests are resulting in a target status |
@amaltaro What status should we be expecting? |
The one from your diff, this:
given that that workflow dump has we need to update it from: |
@amaltaro The test was expecting |
Jenkins results:
|
Cool, thanks Dennis. Is it ready for another review? If so, please request it through the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Dennis, it looks good to me!
Fixes #11725
Status
In development
Description
Removed ParentageResolved checks. Log when ParentageResolved is false when processing.
Is it backward compatible (if not, which system it affects?)
MAYBE
Related PRs
N/A
External dependencies / deployment changes
N/A