Add support for customName (custom container DID) #11765

Merged
merged 2 commits into from
Nov 20, 2023

Conversation

vkuznet
Contributor

@vkuznet vkuznet commented Oct 11, 2023

Fixes #11734

Status

In development

Description

Add support for new customName attribute (custom container DID)

Is it backward compatible (if not, which system it affects?)

YES

Related PRs

External dependencies / deployment changes

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 tests no longer failing
  • Python3 Pylint check: succeeded
    • 4 warnings
    • 1 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14548/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet changed the title from "Add support for customName (custon container DID)" to "Add support for customName (custom container DID)" Oct 11, 2023
@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 tests no longer failing
  • Python3 Pylint check: succeeded
    • 4 warnings
    • 3 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14549/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 3 new failures
    • 1 tests no longer failing
  • Python3 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 12 warnings
    • 10 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14556/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 3 new failures
    • 12 tests deleted
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 4 warnings
    • 4 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 337 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14558/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 12 warnings
    • 10 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14559/artifact/artifacts/PullRequestReport.html

@vkuznet
Contributor Author

vkuznet commented Oct 16, 2023

Alan, Todor, I made initial changes which are ready for review. I see one Rucio test failing, which I think is not related to this PR. Please review and let me know what else needs to be done.

Contributor

@amaltaro amaltaro left a comment

Valentin, this is looking good in general. I did leave a couple of comments along the code though.

Comparing these changes against what was requested in the ticket, I think you covered everything except the new group scope, which we need someone from Rucio to create for us.

I also wanted to point out that the adoption of the custom container was still somewhat uncertain, both in the tickets and in our discussions of these developments. I do think it will be the best option though.

@vkuznet
Contributor Author

vkuznet commented Oct 17, 2023

Alan, also please clarify if something needs to be done in WMCore codebase for new group scope you mentioned. I also see on original ticket mention of Rucio DID and lexicon rules. But for both items I interpreted that it is outside of WMCore codebase and will be done externally. Please advise if I'm wrong.

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 2 new failures
    • 1 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 12 warnings
    • 10 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14560/artifact/artifacts/PullRequestReport.html

Contributor

@amaltaro amaltaro left a comment

Please see messages along the code.

@amaltaro
Contributor

> Alan, also please clarify if something needs to be done in WMCore codebase for new group scope you mentioned. I also see on original ticket mention of Rucio DID and lexicon rules. But for both items I interpreted that it is outside of WMCore codebase and will be done externally.

For the new group scope, it will be just a matter of having one of the CMS Rucio developers create it in Rucio. After that, we can simply start creating DIDs against the new scope.

As far as I can see, scope has already been parameterized in all of the methods where we might need it:
https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/Services/Rucio/Rucio.py

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 tests no longer failing
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 12 warnings
    • 9 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14571/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet requested a review from amaltaro October 23, 2023 13:22
@@ -18,6 +18,7 @@
"deactivatedOn": int, seconds since epoch in GMT timezone (service-based)
"active": boolean, (mandatory)
"pileupSize": integer, current size of the pileup in bytes (service-based)
"customName": string, custom container DID (opitonal)
Contributor

Please fix this typo as well (opitonal)
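
A minimal sketch of how an optional string attribute such as customName might be validated next to the other attributes shown in this hunk. The `SCHEMA` table and the `validate_pileup_doc` helper are hypothetical illustrations, not the MSPileup code, and only the attributes visible in the hunk are included.

```python
# Hypothetical, reduced schema: attribute name -> (expected type, mandatory?)
# Only the attributes visible in the reviewed hunk are listed here.
SCHEMA = {
    "deactivatedOn": (int, False),
    "active": (bool, True),
    "pileupSize": (int, False),
    "customName": (str, False),  # optional custom container DID
}


def validate_pileup_doc(doc):
    """Return a list of error strings; an empty list means the doc passed."""
    errors = []
    for key, (expected_type, mandatory) in SCHEMA.items():
        if key not in doc:
            if mandatory:
                errors.append(f"missing mandatory attribute: {key}")
            continue
        if not isinstance(doc[key], expected_type):
            errors.append(f"wrong type for {key}: {type(doc[key]).__name__}")
    # Reject attributes that the schema does not know about
    for key in doc:
        if key not in SCHEMA:
            errors.append(f"unknown attribute: {key}")
    return errors
```

With such a check, a document without customName stays valid, which is what makes the new attribute backward compatible.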

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 2 warnings and errors that must be fixed
    • 12 warnings
    • 9 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14577/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 2 warnings and errors that must be fixed
    • 12 warnings
    • 9 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14578/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 12 warnings
    • 9 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14579/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet requested a review from amaltaro October 24, 2023 12:23
Contributor

@amaltaro amaltaro left a comment

@vkuznet Valentin, these changes look good to me. However, we still have to contact Eric/Rucio developers to decide which scope we want to create for this new data. Perhaps we could use the DMWM mattermost channel for that discussion (or DM Dev Forum)?

And with that, it just occurred to me that, whenever we are dealing with a custom pileup dataset name, we also need to provide the new scope name to the Rucio queries. In other words, instead of using the standard cms scope, we should pass the scope used for customName pileups. For that, I would suggest a separate commit in this PR, so as not to change what has already been reviewed.
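
A rough sketch of the scope selection described above. The helper name `rucio_did_for_pileup` and the `pileupName` key are assumptions for illustration; the custom scope value `group.wmcore` is taken from an error message later in this thread and was not yet decided at this point.

```python
CMS_SCOPE = "cms"
# Assumption: the scope name that appears in a later error message in this thread
CUSTOM_SCOPE = "group.wmcore"


def rucio_did_for_pileup(doc):
    """Return the (scope, container name) pair to use in Rucio queries.

    Documents with a non-empty customName use the custom scope and the
    custom container name; all others use the standard cms scope.
    """
    custom_name = doc.get("customName")
    if custom_name:
        return CUSTOM_SCOPE, custom_name
    return CMS_SCOPE, doc["pileupName"]  # "pileupName" is an assumed key
```

Keeping the decision in one helper means the Rucio calls themselves only need the already parameterized scope argument.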

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 1 tests added
  • Python3 Pylint check: failed
    • 10 warnings and errors that must be fixed
    • 20 warnings
    • 78 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 14 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14640/artifact/artifacts/PullRequestReport.html

@amaltaro
Contributor

Thank you for providing these changes, Valentin.
Honestly speaking, I always avoid adding code that is not needed in a production environment (emulation or test-related logic) to the source code that provides the actual functionality.

The only options I see though are:

  1. keep the mock/emulation code in the actual production function as is
  2. create an integration unit test running on real data. Here, "integration" means that it does not run in the Jenkins CI.

Any thoughts?

@vkuznet
Contributor Author

vkuznet commented Nov 17, 2023

I understand your concern, and I was trying to find the best solution, which I implemented as option 1. I doubt that option 2 is feasible, since it would require a priori knowledge of dataset presence, setting up a MongoDB backend for testing, etc., which I see as much more expensive. The changes I made to the code are minimal and, if you want, they can be further separated into production and mock functions, i.e. getPileupContainerSizesRucio can be split into two with an appropriate wrapper. I am still in favor of option 1, as it provides a way of testing through Jenkins/unit tests without setting up MongoDB, Rucio, and DBS services.
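
One way such a split could look, as a sketch only: the size lookup is passed in as a callable, so production code and tests share the same loop without emulation logic in the core function. The names `container_sizes` and `fake_fetch` are hypothetical, and using `None` for a failed lookup is an assumption rather than the behavior of getPileupContainerSizesRucio.

```python
def container_sizes(containers, fetch_size):
    """Collect container sizes using an injected fetch_size(name) callable.

    In production fetch_size would query Rucio; in a unit test it can be
    a plain function backed by a dictionary.
    """
    sizes = {}
    for name in containers:
        try:
            sizes[name] = fetch_size(name)
        except LookupError:
            # Assumption for this sketch: unknown containers are recorded as None
            sizes[name] = None
    return sizes


def fake_fetch(name):
    """Test double: look sizes up in a small in-memory table."""
    table = {"/A/B/C": 100}
    return table[name]  # raises KeyError (a LookupError) for unknown names
```

A test would then call `container_sizes(["/A/B/C", "/X/Y/Z"], fake_fetch)` and inspect the returned dictionary.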

@amaltaro
Contributor

I thought the mocking was isolated to the Pycurl module, but I see it also changes the other MSPileup module.

Why do you think we need to set up MongoDB? I see no difference between the mock Rucio and the real Rucio, given that neither depends on MongoDB. As for finding a pileup dataset, that is easily achievable via:
https://cmsweb-testbed.cern.ch/ms-pileup/data/pileup
For the custom dataset, it will of course fail unless we create a custom container ourselves.

That said, I am inclined to remove the emulation code from the actual production baseline code and run a unit test as integration. Once you are happy with the result, just flag the unit test as integration so that it is not executed by Jenkins.
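
As an illustration of keeping a test out of the CI run, here is a sketch using only the standard library. The `RUN_INTEGRATION` environment variable and the test names are hypothetical, and WMCore may flag integration tests with a different mechanism.

```python
import os
import unittest

# Hypothetical switch: integration tests only run when explicitly requested
RUN_INTEGRATION = os.environ.get("RUN_INTEGRATION") == "1"


class PileupSizeIntegrationTest(unittest.TestCase):
    @unittest.skipUnless(RUN_INTEGRATION, "integration test, needs real services")
    def test_real_services(self):
        # Placeholder body: a real test would query the live services here
        self.fail("placeholder, replace with calls against real services")


def run_suite():
    """Run the test case and return the unittest.TestResult."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(PileupSizeIntegrationTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Without the switch set, the test is reported as skipped rather than failed, so a CI job stays green while the test remains available for manual runs.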

@vkuznet
Contributor Author

vkuznet commented Nov 20, 2023

The mocking is required since we need to test the pileupSizeTask function, which itself calls the updatePileup function to update the pileup document in MongoDB after we acquire the new dataset size. See for yourself at https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/MicroService/MSPileup/MSPileupTasks.py#L69. Also, using any record from cmsweb-testbed is not possible, as none of them has the customName attribute. As I wrote before, I can tweak the testbed MongoDB records, but then the MSPileup service will fail if we introduce such a record into Mongo, as the current code checks record attributes. Because of this I see no way to use testbed for an integration check: either I break the current MSPileup, or we must deploy the new code (this PR) to testbed and modify records to perform such tests. What do you prefer?

@amaltaro
Contributor

amaltaro commented Nov 20, 2023

Oh, I didn't see that there was a mongodb record update within that function. At the moment we do not have MongoDB in our test setup. A few alternatives for that would be:

  • have a localhost mongodb instance for unit tests
  • use mongodb cmsweb-testbed in a controlled and isolated way
  • mock the few mongodb calls in MSPileup

I would suggest doing that in a future ticket/pull request though.
For now, we can move forward with this code and, once it gets deployed in testbed, I would ask you to cross-check whether these developments are okay. Does it look good to you?

UPDATE: I still would like to have the mocking-related code - within the core code - removed from this PR.
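
For reference, mocking a MongoDB call with the standard library could look roughly like this. `update_size` is a hypothetical stand-in, not the MSPileup updatePileup function, and the `pileupName` filter key is assumed; `update_one` is the regular pymongo collection method, replaced here by a `MagicMock`.

```python
from unittest import mock


def update_size(collection, name, size):
    """Hypothetical stand-in for code that writes a new size to MongoDB."""
    collection.update_one({"pileupName": name}, {"$set": {"pileupSize": size}})
    return size


# The mock records calls, so no MongoDB server is needed
fake_collection = mock.MagicMock()
update_size(fake_collection, "/A/B/C", 123)

# Verify that exactly one update was issued with the expected arguments
fake_collection.update_one.assert_called_once_with(
    {"pileupName": "/A/B/C"}, {"$set": {"pileupSize": 123}}
)
```

The test code owns the mock, so nothing test-related has to live in the production function.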

@vkuznet
Contributor Author

vkuznet commented Nov 20, 2023

ok, so the plan would be:

  • remove this commit ffbfc74
  • merge this PR
  • deploy to testbed
  • change records on testbed mongodb to include customName
  • perform integration test on testbed using newly deployed code and new records
  • if test is successful we may move to production, i.e. include this PR in WMA release, or if test is not successful, I need to debug it and revert records in MongoDB on testbed cluster.

Is this plan correct?

BTW, we already have mocking for MongoDB, and my unit test which I provided does that.

@amaltaro
Contributor

Yes, that's the plan with the following corrections/complement:

> remove this commit ffbfc74

we should also remove this emulation-related code: https://github.com/dmwm/WMCore/pull/11765/files#diff-f6418aca4acfed87f5215da3ab07dcbf5c33fbcdc5b2617135eb20c676a1bf25R69 , which I failed to find in a specific commit.

> if test is successful we may move to production, i.e. include this PR in WMA release, or if test is not successful, I need to debug it and revert records in MongoDB on testbed cluster.

it will follow a release candidate model with the usual pre-production validation of the WM ecosystem.

@vkuznet
Contributor Author

vkuznet commented Nov 20, 2023

OK, the mock code which emulates the Rucio token has been removed. I also adjusted the unit test so that it no longer checks the cmsDict/cusDict records, which allows the test to pass, and uses the logger instead to print how many of each we have. This lets the unit test succeed and also makes it possible to run integration tests once we add customName to the MongoDB records.

Please have a look and proceed with this PR review (or merge). Then I can move towards testbed testing.

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 tests no longer failing
    • 1 tests added
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 8 warnings and errors that must be fixed
    • 19 warnings
    • 31 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14643/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 1 tests added
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 8 warnings and errors that must be fixed
    • 19 warnings
    • 31 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14644/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 13 new failures
    • 1 tests no longer failing
    • 1 tests added
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 6 warnings and errors that must be fixed
    • 19 warnings
    • 31 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14645/artifact/artifacts/PullRequestReport.html

@amaltaro
Contributor

test this please

Contributor

@amaltaro amaltaro left a comment

Valentin, code looks good in general, other than 1 bug in the code that needs to be fixed. Once you fix that, feel free to squash commits accordingly.

The WMQuality related changes can either be in its own commit, or together with one of the other changes.

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 1 tests added
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 6 warnings and errors that must be fixed
    • 19 warnings
    • 31 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14646/artifact/artifacts/PullRequestReport.html

author Valentin Kuznetsov <vkuznet@gmail.com> 1697051710 -0400
committer Valentin Kuznetsov <vkuznet@gmail.com> 1700501711 -0500
gpgsig -----BEGIN PGP SIGNATURE-----

 iHUEABEIAB0WIQROzG8FXytELxEiq36vnSwZCcnsGQUCZVuYzwAKCRCvnSwZCcns
 GUU4AQCIVcEWuhGRgiCBz0B2/3iMqZzU7wJ9wHLfUZLT55Ml9AD/T1sgPzYDwwML
 HKayNiZqBYGCoAclikaA1zmNwmDgRNY=
 =CZww
 -----END PGP SIGNATURE-----

Add support for customName (custom container DID)

Add new mockup data to use custom rucio scope

modify pileupSizeTask to return containers (used in unit test)

Allow usage of mock tokens

Added rucioParams to mock API as it is used in tests

Remove mock codebase which emulates rucio token

initialize datasetSizes dict
@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 tests no longer failing
    • 1 tests added
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 6 warnings and errors that must be fixed
    • 19 warnings
    • 31 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14647/artifact/artifacts/PullRequestReport.html

@vkuznet
Contributor Author

vkuznet commented Nov 20, 2023

Alan, commits are squashed and I fixed datasetSizes initialization.

Contributor

@amaltaro amaltaro left a comment

Thanks, Valentin. In the future, let's make sure the commit messages are meaningful. I will overwrite that one in the CHANGES file though.

@amaltaro amaltaro merged commit f183db4 into dmwm:master Nov 20, 2023
2 of 4 checks passed
@vkuznet
Contributor Author

vkuznet commented Nov 20, 2023

Alan, for the record, the commit message was auto-generated after I squashed the changes and reordered the commits. It turns out the order was important: I had to resolve several conflicts, and the final commit message was auto-generated by GitHub. I did not want to tamper with it further, as it references the previous commits in the order they were made.

Thanks for the final approval and merge though. I'm glad we made it before the holidays.

@vkuznet
Contributor Author

vkuznet commented Nov 22, 2023

@amaltaro, I tried to run the integration test and encountered the following error. I took one dataset from https://cmsweb-testbed.cern.ch/ms-pileup/data/pileup and put it into the integration test; when I run it, I see the following exception:

2023-11-22 15:24:02,270:ERROR:PycurlRucio: getPileupContainerSizesRucio function did not return a valid response for container: /MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM. Error: {'ExceptionClass': 'DataIdentifierNotFound', 'ExceptionMessage': "Data identifier 'group.wmcore:/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM' not found"}

So, to proceed, I think we need some dataset associated with the group.wmcore scope.
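
A small sketch of how results shaped like the error above could be screened for missing DIDs. The `find_missing_dids` helper and the layout of its input mapping are hypothetical; only the `ExceptionClass` and `ExceptionMessage` keys come from the log line.

```python
def find_missing_dids(results):
    """Return the container names whose result is a DataIdentifierNotFound error.

    `results` is assumed to map a container name to either an integer size
    or an error dictionary shaped like the one in the log message.
    """
    missing = []
    for name, result in results.items():
        if isinstance(result, dict) and result.get("ExceptionClass") == "DataIdentifierNotFound":
            missing.append(name)
    return sorted(missing)
```

Such a list would make it easy to see which containers still need to be created under the custom scope before an integration test can pass.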

@amaltaro
Contributor

@vkuznet Valentin, now that we have upgraded central services, could you please add customName attribute to all pileup documents in CMSWEB production?

I was about to create a GH issue to fix a KeyError exception in WorkflowUpdater code, but it's likely better if we just make everything consistent and update the pileup documents in production as well (all the testbed documents already contain that attribute).
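
A sketch of the two approaches mentioned here: backfilling the attribute in every document, or reading it defensively. The `backfill_custom_name` helper and the empty-string default are assumptions for illustration, and the documents are plain dictionaries rather than live MongoDB records.

```python
def backfill_custom_name(docs, default=""):
    """Add customName to every document that lacks it.

    Returns the number of documents that were changed. The empty-string
    default is an assumption made for this sketch.
    """
    changed = 0
    for doc in docs:
        if "customName" not in doc:
            doc["customName"] = default
            changed += 1
    return changed


def read_custom_name(doc):
    """Defensive read that avoids a KeyError on older documents."""
    return doc.get("customName", "")
```

Backfilling keeps all documents consistent with the schema, while the defensive read only hides the gap in the one place where it is applied.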

Development

Successfully merging this pull request may close these issues.

MSPileup: support custom container DID in the doc schema
4 participants