Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new Merge Repack special cases to AccountantWorker #11952

Merged
merged 1 commit into from
Apr 15, 2024

Conversation

germanfgv
Copy link
Contributor

@germanfgv germanfgv commented Mar 27, 2024

Fixes #11950

Status

ready

Tested with T0 Replay

Description

AccountantWorker verifies that each job produces all the pre-stablished output modules. Tier-0 Repack and Express workflows do now follow this rule, as each job may or may not produce an Error module, depending on the size of the output, which cannot be predicted during workflow creation. This PR refactors these exceptions to output module verification in AccountantWorker, and adds exceptions for the new Repack data tiers HLTSCOUT and L1SCOUT.

Is it backward compatible (if not, which system it affects?)

YES

Related PRs

PR allowing for use of alphanumerical data tiers:#11951 #11951

External dependencies / deployment changes

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 3 new failures
    • 1 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 3 warnings
    • 32 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 2 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14997/artifact/artifacts/PullRequestReport.html

Comment on lines 475 to 472
elif jobType == "Merge" and set(outputMap) == {'MergedL1SCOUT', 'MergedErrorL1SCOUT', 'logArchive'} \
and outputModules == {'MergedL1SCOUT', 'logArchive'}:
pass
elif jobType == "Merge" and set(outputMap) == {'MergedL1SCOUT', 'MergedErrorL1SCOUT', 'logArchive'} \
and outputModules == {'MergedErrorL1SCOUT', 'logArchive'}:
pass
elif jobType == "Merge" and set(outputMap) == {'MergedHLTSCOUT', 'MergedErrorHLTSCOUT', 'logArchive'} \
and outputModules == {'MergedHLTSCOUT', 'logArchive'}:
pass
elif jobType == "Merge" and set(outputMap) == {'MergedHLTSCOUT', 'MergedErrorHLTSCOUT', 'logArchive'} \
and outputModules == {'MergedErrorHLTSCOUT', 'logArchive'}:
pass
Copy link
Contributor Author

@germanfgv germanfgv Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a quick-and-dirty solution that works. A nicer fix would be to use fwjrFile['dataset'].get('dataTier', '') to avoid repeating the code, defining:

mergedModule="Merged"+datatier
mergedErrorModule="MergedError"+datatier
Suggested change
elif jobType == "Merge" and set(outputMap) == {'MergedL1SCOUT', 'MergedErrorL1SCOUT', 'logArchive'} \
and outputModules == {'MergedL1SCOUT', 'logArchive'}:
pass
elif jobType == "Merge" and set(outputMap) == {'MergedL1SCOUT', 'MergedErrorL1SCOUT', 'logArchive'} \
and outputModules == {'MergedErrorL1SCOUT', 'logArchive'}:
pass
elif jobType == "Merge" and set(outputMap) == {'MergedHLTSCOUT', 'MergedErrorHLTSCOUT', 'logArchive'} \
and outputModules == {'MergedHLTSCOUT', 'logArchive'}:
pass
elif jobType == "Merge" and set(outputMap) == {'MergedHLTSCOUT', 'MergedErrorHLTSCOUT', 'logArchive'} \
and outputModules == {'MergedErrorHLTSCOUT', 'logArchive'}:
pass
elif jobType == "Merge" and set(outputMap) == {mergedModule, mergedErrorModule, 'logArchive'} and 'logArchive' in outputModules and (mergedModule in outputModules or mergedErrorModule in outputModules)
pass

Nonetheless, I'm not sure if that would brake other workflows in which this verification is necessary

Copy link
Contributor Author

@germanfgv germanfgv Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think @amaltaro?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the following suggestion to replace all those checks for merge jobs:

elif jobType == "Merge" and (set(outputMap) & {'MergedErrorRAW', 'MergedErrorL1SCOUT', 'MergedErrorHLTSCOUT'}):
  pass

In other words, if it's a merge that does not match set(outputMap) == outputModules but it has one of the expected Error output modules, just let it through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your suggestion as is it a lot cleaner. The part I don't like is that it still means that, in the future, if we want to add a new data tier, we will still need to find this rather obscure code section.

I guess we can see this as a feature, not a bug, as it prevents typos in data tier names, but I would prefer to have explicit data tier validation in the repack Workload factory, rather than here (#11954).

Do you see a problem in using mergedErrorModule="MergedError"+datatier?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi German, I think that would work as well. It would add a few extra lines of code to do the parsing, add new output module and run the final comparison, but still it would work.

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 3 new failures
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 3 warnings
    • 31 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 2 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15004/artifact/artifacts/PullRequestReport.html

Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@germanfgv German, apologies for the long delay on reviewing this.
Current implementation looks good to me - and it will only be triggered for "special" Merge jobs that create the *Error output module on the fly, so it should be pretty safe.

If you are happy with this as well, please let me know. Can you please just add some extra information to the initial PR description to make it easier to understand in the future? Thanks

@germanfgv
Copy link
Contributor Author

Thank you, @amaltaro. all good from my side. I added information to the description.

@amaltaro
Copy link
Contributor

Thanks German. I will let you know in which release to find it.

@amaltaro amaltaro merged commit 790c9fb into dmwm:master Apr 15, 2024
2 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Adapt AccountantWorker checks of job output modules for T0 Repacking of Scouting data tiers
3 participants