Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add code to insert transition record; code re-factoring #11915

Closed
wants to merge 5 commits into from

Conversation

vkuznet
Copy link
Contributor

@vkuznet vkuznet commented Feb 28, 2024

Fixes #11914

Status

ready

Description

Update ad-hoc scripts to create initial transition record for MSPileup records. There are two scripts:

  • adjustMongoDocs.py to communicate with MongoDB directly
  • updatePileupObjects.py to communicate with MongoDB backend via MSPileup REST API

Here is how each script can be called:

# adjust port, dbname and dbcoll appropriately
adjustMongoDocs.py --add-tran-record --dburi=localhost:8230 --dbname=test --dbcoll=test --verbose 1

# use update script to rely on MSPileup REST API
updatePileupObjects.py --add-tran-record -url=https://cmsweb.cern.ch --userDN=<user-dn>

Is it backward compatible (if not, which system it affects?)

YES

Related PRs

part of meta #11537

External dependencies / deployment changes

pymongo client

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 2 new failures
  • Python3 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 8 comments to review
  • Pylint py3k check: failed
    • 3 errors and warnings that should be fixed
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14927/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 changes in unstable tests
  • Python3 Pylint check: succeeded
    • 5 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14928/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet requested a review from amaltaro February 28, 2024 14:55
Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of updating this script which requires direct access to the mongodb cluster, I would suggest to provide such changes under: https://github.com/dmwm/WMCore/blob/master/bin/adhoc-scripts/updatePileupObjects.py

bin/adhoc-scripts/adjustMongoDocs.py Outdated Show resolved Hide resolved
@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 2 warnings and errors that must be fixed
    • 1 warnings
    • 7 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14930/artifact/artifacts/PullRequestReport.html

@vkuznet
Copy link
Contributor Author

vkuznet commented Feb 29, 2024

Alan, I added code to updatePileupObjects.py and kept the one I initially added in this PR. Now, the updatePileupObjects.py depends on WMCore since we must use gmtimeSeconds in update. I don't know if you like it or not, but it is worth to mention it.

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 3 changes in unstable tests
  • Python3 Pylint check: succeeded
    • 1 warnings
    • 7 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14931/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet requested a review from amaltaro February 29, 2024 13:17
bin/adhoc-scripts/updatePileupObjects.py Outdated Show resolved Hide resolved
bin/adhoc-scripts/updatePileupObjects.py Outdated Show resolved Hide resolved
bin/adhoc-scripts/updatePileupObjects.py Outdated Show resolved Hide resolved
bin/adhoc-scripts/updatePileupObjects.py Outdated Show resolved Hide resolved
@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 changes in unstable tests
  • Python3 Pylint check: succeeded
    • 1 warnings
    • 7 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14936/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet requested a review from amaltaro March 4, 2024 12:58
Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valentin, in addition to the comments along the code, please also update the PR description (as now you are providing changes to 2 scripts).

bin/adhoc-scripts/updatePileupObjects.py Outdated Show resolved Hide resolved
:param dryrun: option to run dry-run mode (boolean)
:return: nothing
"""
with open(fin, 'r', encoding='utf-8') as istream:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like that with the current changes, we should stop making fin required in this script, no? In other words, I see 2 scenarios:
a) we provide a json file as input and write (update) them in the database
b) we do not provide a json file, fetch documents via MSPileup, update them in memory and persist them back in the database.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both options are valid use-case and I adjusted the code to support them both.

bin/adhoc-scripts/adjustMongoDocs.py Outdated Show resolved Hide resolved
@vkuznet vkuznet force-pushed the fix-issue-11914 branch 2 times, most recently from dcc10ce to dd61fb3 Compare March 4, 2024 14:04
@vkuznet
Copy link
Contributor Author

vkuznet commented Mar 4, 2024

Alan, I updated the code and description, please have a final look.

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 2 new failures
    • 3 changes in unstable tests
  • Python3 Pylint check: succeeded
    • 1 warnings
    • 7 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14937/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 changes in unstable tests
  • Python3 Pylint check: succeeded
    • 1 warnings
    • 7 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14938/artifact/artifacts/PullRequestReport.html

bin/adhoc-scripts/updatePileupObjects.py Outdated Show resolved Hide resolved
bin/adhoc-scripts/updatePileupObjects.py Outdated Show resolved Hide resolved
@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
  • Python3 Pylint check: failed
    • 2 warnings and errors that must be fixed
    • 1 warnings
    • 7 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14939/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet requested a review from amaltaro March 4, 2024 15:28
bin/adhoc-scripts/updatePileupObjects.py Outdated Show resolved Hide resolved
@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 2 warnings and errors that must be fixed
    • 1 warnings
    • 7 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14940/artifact/artifacts/PullRequestReport.html

@vkuznet
Copy link
Contributor Author

vkuznet commented Mar 5, 2024

Alan, upon testing the script against testbed I found new flaws in its logic. The main issue was the following:

  • we get all docs from MSPileup service
  • we update records without transition record
  • but we used all docs again to write back to MSPileup

The last step did a wrong thing since when we provided a document with already existing transition record the MSPileup code itself run the logic to use last transition record to create custom pileup name, etc. This is why I saw documents with such kind of structures:

"transition": [
    {
      "containerFraction": 1.0,
      "customDID": "/hasan-test10-Neutrino_E-10_gun/Run3Summer21PrePremix-120X_mcRun3_2021_realistic_v6-v2/PREMIX",
      "updateTime": 1709576366,
      "DN": "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=msunmer/CN=852819/CN=Robot: WmCore Service Account"
    },
    {
      "containerFraction": 1.0,
      "customDID": "/hasan-test10-Neutrino_E-10_gun/Run3Summer21PrePremix-120X_mcRun3_2021_realistic_v6-v2/PREMIX-V1",
      "updateTime": 1709579969,
      "DN": "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y Kuznetsov"
    },
    {
      "containerFraction": 1.0,
      "customDID": "/hasan-test10-Neutrino_E-10_gun/Run3Summer21PrePremix-120X_mcRun3_2021_realistic_v6-v2/PREMIX-V2",
      "updateTime": 1709581549,
      "DN": ""
    },
    {
      "containerFraction": 1.0,
      "customDID": "/hasan-test10-Neutrino_E-10_gun/Run3Summer21PrePremix-120X_mcRun3_2021_realistic_v6-v2/PREMIX-V3",
      "updateTime": 1709585151,
      "DN": ""
    }
  ]

In other words, we should never pass all record for the update and only pass those which have not had any transition record. I fixed this and other issue.

The other issue I found is when I tried to use --fin parameter which contain a change we want to apply to the document. I tried to reset transition records to empty list but since on testbed we already have code which checks transition my attempt fail.

I left my last commits without squashing that you can easily review them. Please do that:

  • eb15f24 fixes condition in adjust script (to be aligned with update script code)
  • 4cf9f1e fixes logic of of update (both issues I listed above)
  • f34e44a clarifies meaning of fin option to avoid confusion.

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
  • Python3 Pylint check: succeeded
    • 1 warnings
    • 7 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14943/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 3 new failures
    • 1 changes in unstable tests
  • Python3 Pylint check: succeeded
    • 1 warnings
    • 7 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14944/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet requested a review from amaltaro March 5, 2024 13:55
@amaltaro
Copy link
Contributor

amaltaro commented Mar 5, 2024

@vkuznet Valentin, if I understood this statement right:
""In other words, we should never pass all record for the update and only pass those which have not had any transition record. I fixed this and other issue.""

and the example of transition records keeping the same container fraction of 1.0, I would say this is a misbehavior on the server side (MSPileup) which we need to fix. We don't want the service to create custom container UNLESS there was a change of container fraction.

EDIT: Can you please create a new GH issue to track this?

@vkuznet
Copy link
Contributor Author

vkuznet commented Mar 5, 2024

Yes, it seems it is bug in server side, but I can't be sure since I don't know which code is deployed on testbed. We had so many issues/PRs that it is frankly possible we run not the latest version which may not have this issue. Said that, I agree to file new issue to check this.

@amaltaro
Copy link
Contributor

amaltaro commented Mar 5, 2024

Valentin, you can check the service version through this REST API:
https://cmsweb-testbed.cern.ch/ms-pileup/data/info

and see what changes went in through the CHANGES file:
https://github.com/dmwm/WMCore/blob/master/CHANGES

Said that, testbed is running 2.3.1rc4, which is exactly the same as 2.3.1 deployed in production.

In summary, the services version we have in testbed (and production) are supposed to have 100% of this feature.

@vkuznet
Copy link
Contributor Author

vkuznet commented Mar 5, 2024

Upon further investigation, now I see that in fact, the reported here transition records #11915 (comment) are correctly created since the entire MSPileup record contains container fraction 0.5 while I send a document with transition record 1.0. That's why we start see growth of transition records. But other existing records where container fraction was 1.0 have not had this issue. The remaining mystery for me right now how empty DN appears.

@vkuznet
Copy link
Contributor Author

vkuznet commented Mar 5, 2024

Ok, now I understand where empty DN comes from too. It comes from MSPileupTask.py which calls updatePileup method of MSPileupData.py module without providing userDN which is optional parameter of latter methon with empty string default value. Therefore, the bug we have is in MSPileupTask.py module where we should properly provide userDN which will be recorded in transition record. How to do it is yet unclear to me.

bin/adhoc-scripts/updatePileupObjects.py Outdated Show resolved Hide resolved
bin/adhoc-scripts/adjustMongoDocs.py Outdated Show resolved Hide resolved
@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 2 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: succeeded
    • 1 warnings
    • 7 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14961/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet requested a review from amaltaro March 11, 2024 22:11
@vkuznet
Copy link
Contributor Author

vkuznet commented Mar 11, 2024

Alan, I don't know what else needs to be done here, but I removed default DN since it can be provided as external parameter to the script. Therefore, I request final review and if you're satisfied I can squash changes for the merge.

Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good to me, but I would like to first update the transition records with this script, before we actually merge it.

It looks like testbed has the expected transition record now, right? Do I understand it right that first we need to upgrade MSPileup, before we can actually update the production records?

@vkuznet
Copy link
Contributor Author

vkuznet commented Mar 13, 2024

Cmsweb testbed have already transition records. But the MSPileupData code checks if transition record is in place, see https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/MicroService/MSPileup/MSPileupData.py#L259, before allowing the update. Therefore, I think we first need transition record to be in place in production MongoDB then we can setup new code.

@amaltaro
Copy link
Contributor

Therefore, I think we first need transition record to be in place in production MongoDB then we can setup new code.

Okay. Can you then update the production records with the script provided in this PR? Once we verify that, we can merge it.
But we need to keep in mind that new pileup documents might be created until we actually upgrade services in production, so we will have to run the same procedure once again (please keep/share notes on how/where exactly to update mongodb records).

@vkuznet
Copy link
Contributor Author

vkuznet commented Mar 13, 2024

per our discussion on MM, I'll wait until further notice to update production MSPileup records.

@amaltaro
Copy link
Contributor

Before merging it, please squash the commits. @vkuznet

@amaltaro
Copy link
Contributor

@vkuznet just a reminder about squashing commits.

@vkuznet
Copy link
Contributor Author

vkuznet commented Mar 25, 2024

Alan, I made a mistake locally by squashing other parts within this PR, so you have two options:

  • if you want squashed commits I can recreate PR with another branch and PR
  • or, take it as is w/o squashing commits.

Sorry about mistake.

@amaltaro
Copy link
Contributor

No problem. Please create a new PR and write that it superseeds this one.

@vkuznet
Copy link
Contributor Author

vkuznet commented Mar 26, 2024

Due to mistake during squash local procedure (I squashed an additional changes not relevant to this PR) I created a new PR#11947 which supersede this PR.

@amaltaro
Copy link
Contributor

Superseeded by #11947
I am closing this one out, but please reopen it if I missed anything, Valentin.

@amaltaro amaltaro closed this Mar 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Adjust records in MSPileup DB for support partial pileup data placement
3 participants