-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new Merge Repack special cases to AccountantWorker #11952
Conversation
Jenkins results:
|
elif jobType == "Merge" and set(outputMap) == {'MergedL1SCOUT', 'MergedErrorL1SCOUT', 'logArchive'} \ | ||
and outputModules == {'MergedL1SCOUT', 'logArchive'}: | ||
pass | ||
elif jobType == "Merge" and set(outputMap) == {'MergedL1SCOUT', 'MergedErrorL1SCOUT', 'logArchive'} \ | ||
and outputModules == {'MergedErrorL1SCOUT', 'logArchive'}: | ||
pass | ||
elif jobType == "Merge" and set(outputMap) == {'MergedHLTSCOUT', 'MergedErrorHLTSCOUT', 'logArchive'} \ | ||
and outputModules == {'MergedHLTSCOUT', 'logArchive'}: | ||
pass | ||
elif jobType == "Merge" and set(outputMap) == {'MergedHLTSCOUT', 'MergedErrorHLTSCOUT', 'logArchive'} \ | ||
and outputModules == {'MergedErrorHLTSCOUT', 'logArchive'}: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a quick-and-dirty solution that works. A nicer fix would be to use fwjrFile['dataset'].get('dataTier', '')
to avoid repeating the code, defining:
mergedModule="Merged"+datatier
mergedErrorModule="MergedError"+datatier
elif jobType == "Merge" and set(outputMap) == {'MergedL1SCOUT', 'MergedErrorL1SCOUT', 'logArchive'} \ | |
and outputModules == {'MergedL1SCOUT', 'logArchive'}: | |
pass | |
elif jobType == "Merge" and set(outputMap) == {'MergedL1SCOUT', 'MergedErrorL1SCOUT', 'logArchive'} \ | |
and outputModules == {'MergedErrorL1SCOUT', 'logArchive'}: | |
pass | |
elif jobType == "Merge" and set(outputMap) == {'MergedHLTSCOUT', 'MergedErrorHLTSCOUT', 'logArchive'} \ | |
and outputModules == {'MergedHLTSCOUT', 'logArchive'}: | |
pass | |
elif jobType == "Merge" and set(outputMap) == {'MergedHLTSCOUT', 'MergedErrorHLTSCOUT', 'logArchive'} \ | |
and outputModules == {'MergedErrorHLTSCOUT', 'logArchive'}: | |
pass | |
elif jobType == "Merge" and set(outputMap) == {mergedModule, mergedErrorModule, 'logArchive'} and 'logArchive' in outputModules and (mergedModule in outputModules or mergedErrorModule in outputModules) | |
pass |
Nonetheless, I'm not sure if that would brake other workflows in which this verification is necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think @amaltaro?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the following suggestion to replace all those checks for merge jobs:
elif jobType == "Merge" and (set(outputMap) & {'MergedErrorRAW', 'MergedErrorL1SCOUT', 'MergedErrorHLTSCOUT'}):
pass
In other words, if it's a merge that does not match set(outputMap) == outputModules
but it has one of the expected Error output modules, just let it through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like your suggestion as is it a lot cleaner. The part I don't like is that it still means that, in the future, if we want to add a new data tier, we will still need to find this rather obscure code section.
I guess we can see this as a feature, not a bug, as it prevents typos in data tier names, but I would prefer to have explicit data tier validation in the repack Workload factory, rather than here (#11954).
Do you see a problem in using mergedErrorModule="MergedError"+datatier
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi German, I think that would work as well. It would add a few extra lines of code to do the parsing, add new output module and run the final comparison, but still it would work.
93c4312
to
4b70302
Compare
Jenkins results:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@germanfgv German, apologies for the long delay on reviewing this.
Current implementation looks good to me - and it will only be triggered for "special" Merge jobs that create the *Error output module on the fly, so it should be pretty safe.
If you are happy with this as well, please let me know. Can you please just add some extra information to the initial PR description to make it easier to understand in the future? Thanks
Thank you, @amaltaro. all good from my side. I added information to the description. |
Thanks German. I will let you know in which release to find it. |
Fixes #11950
Status
ready
Tested with T0 Replay
Description
AccountantWorker verifies that each job produces all the pre-stablished output modules. Tier-0 Repack and Express workflows do now follow this rule, as each job may or may not produce an Error module, depending on the size of the output, which cannot be predicted during workflow creation. This PR refactors these exceptions to output module verification in AccountantWorker, and adds exceptions for the new Repack data tiers
HLTSCOUT
andL1SCOUT
.Is it backward compatible (if not, which system it affects?)
YES
Related PRs
PR allowing for use of alphanumerical data tiers:#11951 #11951
External dependencies / deployment changes