Adopt CMS_PATH and SITECONFIG_PATH to locate the site catalog #11481

Merged
merged 2 commits into from
Feb 10, 2023

amaltaro
Contributor

@amaltaro amaltaro commented Feb 8, 2023

Fixes #11449

Status

not-tested

Description

In light of the migration from storage.xml to storage.json, CMSSW went through some modifications to support those catalogs, in short:

  • any releases made before 12_6_0_pre2 will support storage.xml (through CMS_PATH environment variable)
  • any releases starting in 12_6_0_pre3 (with the exception of 12_6_3) will support only storage.json (through the SITECONFIG_PATH environment variable).

That means, our Scram module needs to keep explicitly defining those variables in the CMSSW environment. As confirmed by Matti below, it should not be a problem to have both variables defined.
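
A minimal sketch of what "keep defining both variables" could look like when a job environment is assembled. The helper name `site_config_env` and the `/cvmfs/cms.cern.ch` fallback are assumptions for illustration only; this is not the actual Scram module code.

```python
import os


def site_config_env(base_env=None, cms_path_default="/cvmfs/cms.cern.ch"):
    """Return a copy of base_env with CMS_PATH and SITECONFIG_PATH both set.

    Hypothetical helper: values that are already present are kept, so a
    site or pilot override always wins over the defaults used here.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Older CMSSW releases locate the site configuration through CMS_PATH.
    env.setdefault("CMS_PATH", cms_path_default)
    # Newer releases use SITECONFIG_PATH; the fallback follows the
    # $CMS_PATH/SITECONF/local convention discussed in this PR.
    env.setdefault("SITECONFIG_PATH", env["CMS_PATH"] + "/SITECONF/local")
    return env
```

Using `setdefault` keeps whatever the pilot already exported, which matches the expectation that having both variables defined is harmless.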

Is it backward compatible (if not, which system it affects?)

YES (to the best of my knowledge)

Related PRs

Somewhat related to #11472

External dependencies / deployment changes

None

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 3 tests no longer failing
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 23 warnings and errors that must be fixed
    • 12 warnings
    • 176 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 38 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/13992/artifact/artifacts/PullRequestReport.html

@makortel

makortel commented Feb 8, 2023

@amaltaro I think WM should keep the propagation of CMS_PATH. CMSSW versions before 12_6_0_pre2 still rely on CMS_PATH for the location of site-local-config.xml. It's ok to have both defined (cmsset_default.sh anyway defines both if they are not already defined)

@amaltaro amaltaro changed the title from "Move site config setup from CMS_PATH to SITECONFIG_PATH" to "Adopt CMS_PATH and SITECONFIG_PATH to locate the site catalog" Feb 8, 2023

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 4 tests no longer failing
  • Python3 Pylint check: failed
    • 23 warnings and errors that must be fixed
    • 12 warnings
    • 177 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 38 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/13994/artifact/artifacts/PullRequestReport.html

@amaltaro
Contributor Author

amaltaro commented Feb 8, 2023

Noted and code updated in the Scram module. Thanks Matti.

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 4 tests no longer failing
  • Python3 Pylint check: failed
    • 23 warnings and errors that must be fixed
    • 12 warnings
    • 177 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 38 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/13999/artifact/artifacts/PullRequestReport.html

Contributor

@todor-ivanov todor-ivanov left a comment

Thanks @amaltaro
I am only worried about backwards compatibility here, though. If we are sure that $SITECONFIG_PATH is defined regardless of the CMSSW version, we seem to be safe.

@@ -39,11 +39,11 @@ def loadSiteLocalConfig():
 msg = "%s env. var. provided but not pointing to an existing file, ignoring." % overVarName
 logging.log(logging.ERROR, msg)

-defaultPath = "$CMS_PATH/SITECONF/local/JobConfig/site-local-config.xml"
+defaultPath = "$SITECONFIG_PATH/JobConfig/site-local-config.xml"
Contributor

IIRC the .../local/... part of this path is usually a symlink related to the site name/setup. Are we sure it has been properly expanded in the $SITECONFIG_PATH env. variable? Or should we even care to check? Maybe we just take it for granted and use whatever is provided in the environment. (Just a thought, no request for a change.)

Contributor

thanks @khurtado, indeed we can.

@amaltaro
Contributor Author

amaltaro commented Feb 9, 2023

The way I understand it is that SITECONFIG_PATH is actually defined in the CMS pilot (the same as it is for CMS_PATH). So I'd simply take it for granted as we used to do for CMS_PATH.

Regarding the local symlink, it should be just a special file name that is part of the absolute path. I fail to see any problems with that.
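
To illustrate why the symlink should not matter, here is a small sketch that builds the default path from SITECONFIG_PATH and checks it the way any file access would, following symlinks transparently. The function name `resolve_site_local_config` is hypothetical and this is not the WMCore implementation; only the `JobConfig/site-local-config.xml` suffix comes from the diff in this PR.

```python
import os


def resolve_site_local_config(env=None):
    """Return (path, real_path) for the default site-local-config.xml.

    Sketch only: raises if SITECONFIG_PATH is unset or the file is missing.
    """
    env = os.environ if env is None else env
    site_config_path = env.get("SITECONFIG_PATH")
    if not site_config_path:
        raise KeyError("SITECONFIG_PATH is not defined")
    path = os.path.join(site_config_path, "JobConfig", "site-local-config.xml")
    # os.path.isfile follows symlinks, so a 'local' link needs no special care.
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    # realpath gives the symlink-free location, which is handy for logging only.
    return path, os.path.realpath(path)
```

The job can keep using the first value as-is; the second one merely shows which site directory the link pointed to.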

@makortel

makortel commented Feb 9, 2023

> The way I understand it is that SITECONFIG_PATH is actually defined in the CMS pilot (the same as it is for CMS_PATH). So I'd simply take it for granted as we used to do for CMS_PATH.

This is correct.

There is one detail in the conceptual difference between SITECONFIG_PATH and CMS_PATH that may or may not be relevant here. By default SITECONFIG_PATH is defined as $CMS_PATH/SITECONF/local, but sites are allowed to override the SITECONFIG_PATH to be something else for sub-sites. This works fine for CMSSW releases >= 12_6_0_pre2 that use the SITECONFIG_PATH.

But I wonder if there is anything to be concerned about for CMSSW releases <= 12_6_0_pre1 that still use CMS_PATH? (Mainly because I don't know exactly what information WM uses from the site-local-config.xml and for what.) My understanding is that sites ought to keep storage.xml and storage.json consistent, and the subsites should function in some sensible way with CMSSW releases using CMS_PATH, but I'm not sure if there could be room for some inconsistency when SITECONFIG_PATH points to a different place than CMS_PATH.

Maybe @stlammel would have a better view?
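
For reference, the situation described above (SITECONFIG_PATH pointing somewhere other than $CMS_PATH/SITECONF/local) could be detected with a check along these lines. This is only a sketch; `siteconfig_is_overridden` is a hypothetical name and nothing like it is claimed to exist in WMCore or CMSSW.

```python
import os


def siteconfig_is_overridden(env=None):
    """True when SITECONFIG_PATH does not resolve to $CMS_PATH/SITECONF/local."""
    env = os.environ if env is None else env
    default = os.path.join(env["CMS_PATH"], "SITECONF", "local")
    # An unset SITECONFIG_PATH is treated as "uses the default location".
    actual = env.get("SITECONFIG_PATH", default)
    # Compare resolved paths so that symlinks and trailing slashes do not matter.
    return os.path.realpath(actual) != os.path.realpath(default)
```

A sub-site override would make this return True, which is exactly the case where releases reading CMS_PATH and releases reading SITECONFIG_PATH may look at different directories.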

@khurtado
Contributor

khurtado commented Feb 9, 2023

> The way I understand it is that SITECONFIG_PATH is actually defined in the CMS pilot (the same as it is for CMS_PATH). So I'd simply take it for granted as we used to do for CMS_PATH.
>
> Regarding the local symlink, it should be just a special file name that is part of the absolute path. I fail to see any problems with that.

Just some extra content here:

From what I see, both variables get defined when sourcing /cvmfs/cms.cern.ch/cmsset_default.sh

(base) [khurtado@earth /etc/profile.d]$ unset CMS_PATH SITECONFIG_PATH
(base) [khurtado@earth /etc/profile.d]$ echo $CMS_PATH $SITECONFIG_PATH

(base) [khurtado@earth /etc/profile.d]$ source /cvmfs/cms.cern.ch/cmsset_default.sh
(base) [khurtado@earth /etc/profile.d]$ echo $CMS_PATH $SITECONFIG_PATH
/cvmfs/cms.cern.ch /cvmfs/cms.cern.ch/SITECONF/local

The environment variable SITECONFIG_PATH defaults to "$CMS_PATH/SITECONF/local", but that default is not applied when the variable is already defined, as is the case for subsites:

https://github.com/cms-sw/cms-common/blob/9c8203bf637b153697b855dfa522d6741379bcc9/cmsset_default.sh#L40-L44

An example for a file structure of a subsite would be Wisconsin and its CHTC resource (subsite resource of Wisconsin)

(base) [khurtado@earth /cvmfs/cms.cern.ch/SITECONF]$ ls T2_US_Wisconsin/
GlideinConfig/ JobConfig/     PhEDEx/        storage.json   WISC-CHTC/
(base) [khurtado@earth /cvmfs/cms.cern.ch/SITECONF]$ ls T2_US_Wisconsin/WISC-CHTC/
JobConfig  storage.json

@amaltaro
Contributor Author

amaltaro commented Feb 9, 2023

> An example for a file structure of a subsite would be Wisconsin and its CHTC resource (subsite resource of Wisconsin)

@khurtado Kenyi, but in this case wouldn't the worker node/pilot use a different local symlink?

In other words, it's up to the site admin to define the correct symlink that each site and subsite has to use, while our jobs simply set the environment and consume the actual catalog files provided by SITECONFIG_PATH (meant to have different content between subsites, but still accessed through the same path). Or am I misunderstanding something here?

@khurtado
Contributor

khurtado commented Feb 9, 2023

> > An example for a file structure of a subsite would be Wisconsin and its CHTC resource (subsite resource of Wisconsin)
>
> @khurtado Kenyi, but in this case wouldn't the worker node/pilot use a different local symlink?
>
> In other words, it's up to the site admin to define the correct symlink that each site and subsite has to use, while our jobs simply set the environment and consume the actual catalog files provided by SITECONFIG_PATH (meant to have different content between subsites, but still accessed through the same path). Or am I misunderstanding something here?

I believe the local symlink doesn't change, but rather SITECONFIG_PATH would point to the subsite directory in the local symlinks (done by the site admins), e.g.:

SITECONFIG_PATH = /cvmfs/cms.cern.ch/SITECONF/local/WISC-CHTC

where local still points to the site name T2_US_Wisconsin in the example above.

But yes, I was just putting some extra information there for the record, regarding what happens with this new variable. For us, we ultimately don't care how this is set; we just take it for granted. The validation of these paths is done elsewhere (during the setup of a site or sub-site, during the pilot validation tests, etc).

Contributor

@khurtado khurtado left a comment

@amaltaro Changes to the actual python scripts look good to me, I only have some comments/changes requested to the unit tests.

@@ -155,7 +155,7 @@ def createTestWorkload(self, workloadName='Test', emulator=True):
 os.makedirs(siteConfigPath)
 shutil.copy('site-local-config.xml', siteConfigPath)
 environment = rereco.data.section_('environment')
-environment.CMS_PATH = workloadDir
+environment.SITECONFIG_PATH = workloadDir
Contributor

Shouldn't SITECONFIG_PATH be:

environment.SITECONFIG_PATH = "%s/SITECONF/local" % workloadDir

To make it consistent with the default definition here:
https://github.com/cms-sw/cms-common/blob/9c8203bf637b153697b855dfa522d6741379bcc9/cmsset_default.sh#L40-L44
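
A sketch of a test fixture that follows that default layout, so that both variables point at the same catalog files. The helper `make_fake_siteconf` and the placeholder XML content are assumptions made for illustration and are not part of the WMCore unit tests.

```python
import os
import shutil


def make_fake_siteconf(workload_dir, xml_source=None):
    """Create <workload_dir>/SITECONF/local/JobConfig and return both env vars.

    Hypothetical fixture: copies xml_source when given, otherwise writes a
    placeholder site-local-config.xml.
    """
    site_config_path = os.path.join(workload_dir, "SITECONF", "local")
    job_config_dir = os.path.join(site_config_path, "JobConfig")
    os.makedirs(job_config_dir, exist_ok=True)
    target = os.path.join(job_config_dir, "site-local-config.xml")
    if xml_source:
        shutil.copy(xml_source, target)
    else:
        with open(target, "w") as handle:
            handle.write("<site-local-config/>\n")  # placeholder content
    # CMS_PATH is the root, SITECONFIG_PATH is <root>/SITECONF/local.
    return {"CMS_PATH": workload_dir, "SITECONFIG_PATH": site_config_path}
```

With this layout, `$CMS_PATH/SITECONF/local/JobConfig/site-local-config.xml` and `$SITECONFIG_PATH/JobConfig/site-local-config.xml` name the same file.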

Contributor Author

I am not sure, judging from L153 above. Maybe it does not even matter, but let me make this change.

@@ -155,7 +155,7 @@ def testSlcPhedexNodesEqualPhedexApiNodes(self):
 For each site, verify that the stageout node specified in
 site-local-config.xml is the same as the one returned by the PhEDEx api.
 """
-os.environ["CMS_PATH"] = "/cvmfs/cms.cern.ch"
+os.environ["SITECONFIG_PATH"] = "/cvmfs/cms.cern.ch/SITECONFIG/local"
Contributor

Replace SITECONFIG with SITECONF in the cvmfs path

@stlammel

stlammel commented Feb 9, 2023

So sites are keeping access via both CMS_PATH (for older CMSSW versions) and SITECONFIG_PATH (for newer CMSSW versions) working right now. How the local link in CVMFS is set is up to each site. Most sites set it to the site name, and to site/subsite for subsites, but in the case of a local SITECONF copy this doesn't need to be the case. ${CMS_PATH}/SITECONF/local is guaranteed to point to a good SITECONF area, and so is ${SITECONFIG_PATH}.
Hope this helps.

- Stephan

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 tests no longer failing
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 25 warnings and errors that must be fixed
    • 12 warnings
    • 177 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 38 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14003/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 2 tests added
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 25 warnings and errors that must be fixed
    • 12 warnings
    • 177 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 38 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14004/artifact/artifacts/PullRequestReport.html

@amaltaro
Contributor Author

Thanks for the review, @khurtado. Some unit tests were actually broken and I managed to get them fixed with this PR. For now, I kept the unit tests defining both environment variables, as you suggested. Please have another look.

@khurtado
Contributor

@amaltaro Looking good! There is just one unit test that I think is still missing the CMS_PATH. Besides that, it all looks good to me.

Keep CMS_PATH around in the Scram module
more fixes to unit tests

trying to run integration unit test

fix stageout unit test
@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 tests no longer failing
    • 2 tests added
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 25 warnings and errors that must be fixed
    • 12 warnings
    • 177 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 38 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14005/artifact/artifacts/PullRequestReport.html

@germanfgv
Contributor

@amaltaro I tested #11473 and this on top of T0 agent 3.0.8 (WMCore 2.1.6.1) and the replays were successful, both with CMSSW_12_6_2 (RucioCatalog) and CMSSW_12_6_3 (TrivialFileCatalog)

@amaltaro
Contributor Author

Thank you for the review and further tests!

@stlammel

Thanks everybody for discovering and handling the dependency! - Stephan

@makortel

> @amaltaro I tested #11473 and this on top of T0 agent 3.0.8 (WMCore 2.1.6.1) and the replays were successful, both with CMSSW_12_6_2 (RucioCatalog) and CMSSW_12_6_3 (TrivialFileCatalog)

Thanks @germanfgv! Does this mean we could safely switch 12_6_X back to RucioCatalog?

@germanfgv
Contributor

@makortel I would still like to test how it will work with T2_CH_CERN_P5 worker nodes. I'll launch this test tonight with a replay.

@makortel

Thanks @germanfgv!

@makortel

@germanfgv In the meantime I took the liberty of opening a PR to change the file catalog to Rucio catalog in 12_6_X in cms-sw/cmssw#40755. If your tests succeed and the PR could proceed, could you mention it in that PR as well?

Successfully merging this pull request may close these issues.

Replace CMS_PATH by SITECONF_PATH to locate the site configuration file
7 participants