MSPileup monitoringTask fails to track containers without any wmcore_transferor rules #11578

Closed
amaltaro opened this issue Apr 27, 2023 · 0 comments · Fixed by #11579

Impact of the bug
MSPileup

Describe the bug
Hasan pointed out this RelVal pileup:

/RelValMinBias_14TeV/CMSSW_13_1_0_pre2-131X_mcRun4_realistic_v2_2026D95noPU-v2/GEN-SIM

which is supposed to have one wmcore_transferor rule locking it at T2_CH_CERN. The pileup document is accessible through the URL below (and is also copied in [1]):
https://cmsweb.cern.ch/ms-pileup/data/pileup?pileupName=/RelValMinBias_14TeV/CMSSW_13_1_0_pre2-131X_mcRun4_realistic_v2_2026D95noPU-v2/GEN-SIM

It turns out that rule 4650a485fe3f44258f5aedb60139a5cb does not even exist in Rucio, and MSPileup is not taking any action on the pileup either.

The problem seems to be in the monitoringTask, which iterates through all the wmcore_transferor rules; however, it does not account for the case where no rules are found in Rucio:
https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/MicroService/MSPileup/MSPileupTasks.py#L259

How to reproduce it
Have a pileup configuration with existing rules, then delete all of its wmcore_transferor rules.

Expected behavior
If a pileup is active (active=True) and has expected RSEs in its configuration, but no rules are found in Rucio, MSPileup is expected to:

  • clean up the currentRSEs property
  • and clean up the ruleIds property.

With that, the activeTask will pick it up and make the relevant data placement.
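
The expected behavior above could be sketched as follows. This is only an illustration, not the MSPileup implementation: the function name reset_if_no_rules and its rucio_rules argument are hypothetical, and the document fields are the ones shown in [1].

```python
import copy


def reset_if_no_rules(pileup_doc, rucio_rules):
    """Return a copy of pileup_doc with stale placement state cleared.

    pileup_doc:  dict shaped like the pileup document in [1]
    rucio_rules: rules actually found in Rucio for this pileup
                 (hypothetical input; how they are fetched is out of scope)
    """
    doc = copy.deepcopy(pileup_doc)
    # Only act on active pileups that expect data somewhere but have no rules.
    if doc.get("active") and doc.get("expectedRSEs") and not rucio_rules:
        doc["currentRSEs"] = []
        doc["ruleIds"] = []
    return doc


# Trimmed copy of the document from [1], keeping only the relevant fields.
doc = {
    "pileupName": "/RelValMinBias_14TeV/CMSSW_13_1_0_pre2-131X_mcRun4_realistic_v2_2026D95noPU-v2/GEN-SIM",
    "expectedRSEs": ["T2_CH_CERN"],
    "currentRSEs": ["T2_CH_CERN"],
    "active": True,
    "ruleIds": ["4650a485fe3f44258f5aedb60139a5cb"],
}

# No rules found in Rucio: both properties are emptied on the returned copy.
updated = reset_if_no_rules(doc, [])
```

With both lists emptied, currentRSEs no longer matches expectedRSEs, which per this issue should be enough for the activeTask to pick the pileup up again.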

Additional context and error message
[1]

{"result": [
 {
  "pileupName": "/RelValMinBias_14TeV/CMSSW_13_1_0_pre2-131X_mcRun4_realistic_v2_2026D95noPU-v2/GEN-SIM",
  "pileupType": "premix",
  "insertTime": 1681914426,
  "lastUpdateTime": 1682596591,
  "expectedRSEs": [
    "T2_CH_CERN"
  ],
  "currentRSEs": [
    "T2_CH_CERN"
  ],
  "fullReplicas": 1,
  "campaigns": [],
  "containerFraction": 1.0,
  "replicationGrouping": "ALL",
  "activatedOn": 1682596591,
  "deactivatedOn": 1681914426,
  "active": true,
  "pileupSize": 33060714720,
  "ruleIds": [
    "4650a485fe3f44258f5aedb60139a5cb"
  ]
}]}