Discover the singularity container OS at runtime. #11896

@todor-ivanov todor-ivanov commented Feb 13, 2024

Fixes #11893

Status

ready

Description

This PR tries to discover, at runtime, the OS of the Singularity container in which the job has landed.
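As an illustration only, the idea of discovering the container OS at runtime can be sketched as below. This is not the code from this PR (the change itself is in the shell script `etc/submit_py3.sh`); the function name `discover_os`, the `rhelN` label format, and the use of the `ID`, `ID_LIKE` and `VERSION_ID` fields of an os-release file are assumptions made for the example.

```python
import re


def discover_os(os_release_text, default="unknown"):
    """Map the text of an os-release file to a label such as 'rhel8'.

    Hypothetical helper: the label format and the fallback value are
    assumptions for this sketch, not taken from the PR.
    """
    fields = {}
    for line in os_release_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip().strip('"').strip("'")

    # Accept only RHEL itself or distributions that declare RHEL ancestry.
    families = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    if "rhel" not in families:
        return default

    # Keep only the major version, e.g. "8.9" -> "8".
    match = re.match(r"(\d+)", fields.get("VERSION_ID", ""))
    if not match:
        return default
    return "rhel" + match.group(1)
```

Inside a container, the input would typically be the contents of `/etc/os-release`; anything that cannot be classified falls back to the `default` label instead of a guessed value.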

Is it backward compatible (if not, which system it affects?)

YES

Related PRs

None

External dependencies / deployment changes

None

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: succeeded
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14850/artifact/artifacts/PullRequestReport.html

@todor-ivanov todor-ivanov changed the title Estimate the singularity container OS at tuntime. Estimate the singularity container OS at runtime. Feb 13, 2024
@todor-ivanov todor-ivanov force-pushed the bugfix_Runtime_CleanupJobsXrdfsFail_fix-11893 branch 2 times, most recently from 299dc93 to 8337c22 Compare February 13, 2024 15:55
@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 2 changes in unstable tests
  • Python3 Pylint check: succeeded
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14851/artifact/artifacts/PullRequestReport.html

@todor-ivanov todor-ivanov force-pushed the bugfix_Runtime_CleanupJobsXrdfsFail_fix-11893 branch from 8337c22 to 2d700fe Compare February 13, 2024 16:01
@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: succeeded
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14852/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: succeeded
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14854/artifact/artifacts/PullRequestReport.html

Contributor

@amaltaro amaltaro left a comment

@todor-ivanov please find a couple of comments inline.
In addition, can you please amend the commit message and fix that typo? Thanks

etc/submit_py3.sh (2 inline review comments, outdated and resolved)
@todor-ivanov todor-ivanov force-pushed the bugfix_Runtime_CleanupJobsXrdfsFail_fix-11893 branch from 2d700fe to cf2a22e Compare February 13, 2024 19:02
@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 2 changes in unstable tests
  • Python3 Pylint check: succeeded
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14855/artifact/artifacts/PullRequestReport.html

Add a check for the value obtained for the WMA_CURRENT_OS

Fix regular expression

Fix regular expression

Add support for rhel6
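The commit messages above mention a check on the value obtained for `WMA_CURRENT_OS`, a regular expression, and rhel6 support. A minimal sketch of that kind of validation follows, written in Python for illustration rather than in the shell of `etc/submit_py3.sh`. The pattern, the helper names, and the fallback to a `redhat-release` style line (assuming older images may not ship an os-release file) are all assumptions, not the contents of these commits.

```python
import re

# Assumed label format: "rhel" followed by a major version number.
VALID_OS = re.compile(r"rhel[0-9]+")


def is_valid_os(value):
    """Return True only if the whole value looks like 'rhelN'."""
    return bool(value) and VALID_OS.fullmatch(value) is not None


def os_from_redhat_release(text):
    """Extract 'rhelN' from a line such as 'CentOS release 6.10 (Final)'."""
    match = re.search(r"release\s+(\d+)", text)
    return "rhel" + match.group(1) if match else None


def resolve_os(candidate, redhat_release_text=""):
    """Use the candidate if it passes the check, else try the fallback.

    Returns None when neither source yields a usable label, so the
    caller can decide how to handle an undetected OS.
    """
    if is_valid_os(candidate):
        return candidate
    return os_from_redhat_release(redhat_release_text)
```

Using `fullmatch` rather than `match` rejects values with trailing characters (including a stray newline), which is the kind of detail that a fix to a regular expression commonly addresses.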
@todor-ivanov todor-ivanov force-pushed the bugfix_Runtime_CleanupJobsXrdfsFail_fix-11893 branch from cf2a22e to f7dda5a Compare February 14, 2024 07:12
@todor-ivanov todor-ivanov changed the title Estimate the singularity container OS at runtime. Discover the singularity container OS at runtime. Feb 14, 2024
@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 2 changes in unstable tests
  • Python3 Pylint check: succeeded
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14863/artifact/artifacts/PullRequestReport.html

Contributor

@amaltaro amaltaro left a comment

These changes are looking good to me. Thanks for updating the commit message as well, Todor.

@amaltaro amaltaro merged commit 582399b into dmwm:master Feb 14, 2024
3 of 4 checks passed
@amaltaro
Contributor

Can you please take care of the backporting and patching as well? We need to backport this PR to the branch 2.3.0_wmagent and make a new release against that branch (2.3.0.2).

As previously discussed, we also need to patch all the agents connected to the production system. Note that this only affects the job wrapper, so there is no need to restart any service/components in the agent.

@germanfgv @LinaresToine could you please take care of the T0 agents? If you need any guidance, please let us know.

@todor-ivanov
Contributor Author

@amaltaro Everything has been backported and a new release has been created: https://github.com/dmwm/WMCore/commits/2.3.0_wmagent/

@amaltaro
Contributor

Thank you, Todor. We have usually made the backport through another PR targeting the specific branch, instead of pushing the commit directly upstream.

I don't think there is any strong reason for doing it one way or the other, but if you feel like adding a short section with a procedure for backporting hot-fixes, I would suggest this wiki: https://github.com/dmwm/WMCore/wiki/TaggingAndReleasing#new-tagging-convention (well, the relevant one in GitLab).

@todor-ivanov
Contributor Author

Hi @amaltaro

I know the general procedure we usually follow. This time, though, I decided to cherry-pick just the commit from the master branch directly onto the branch receiving the backport. I think this keeps the repository history cleaner and avoids any possible differences between the master branch and the branch to which the commit is backported. Also, if conflicts appear (which may happen if other changes to the same source files have accumulated in the meantime), they are resolved during the cherry-pick, locally in the working tree of the person doing the operation, rather than in the upstream repository during the merge of an additional PR. This, again, helps keep the history clean. Of course, we lack the additional PR, but that's fine with me, and we can still create it if we want. I have no strong opinion here either.

@amaltaro
Contributor

I like having the reference to the PR and the details that come with it; I guess that's why I always went with an extra PR against the particular branch. In addition, note that we can also cherry-pick the master commit (actually, this is the recommended way to keep the metadata intact) and provide it in a development branch against the particular wmagent/cmsweb branch. So conflict resolution is the same whether we push upstream directly or push through a development branch.
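For reference, the two backporting routes discussed in this thread can be sketched as the git command sequences below. This is a hedged illustration, not a project procedure: the helper `backport_commands`, the remote names `upstream` and `origin`, the topic branch naming, and the optional `-x` flag (which appends the original commit hash to the message) are assumptions. The function only builds the commands and does not run them.

```python
def backport_commands(commit, release_branch, via_pr=True,
                      upstream="upstream", fork="origin"):
    """Build the git commands for backporting one commit.

    Both routes cherry-pick locally, so conflicts are resolved in the
    local working tree either way; they differ only in the final push.
    """
    topic = "backport-%s-to-%s" % (commit, release_branch)
    commands = [
        ["git", "fetch", upstream],
        # Start a local topic branch from the release branch.
        ["git", "checkout", "-b", topic, "%s/%s" % (upstream, release_branch)],
        # Cherry-pick keeps the author and message of the original commit.
        ["git", "cherry-pick", "-x", commit],
    ]
    if via_pr:
        # Push to a personal fork, then open a PR against the release branch.
        commands.append(["git", "push", fork, topic])
    else:
        # Push the cherry-picked commit straight to the release branch.
        commands.append(
            ["git", "push", upstream, "%s:%s" % (topic, release_branch)])
    return commands
```

For example, `backport_commands("582399b", "2.3.0_wmagent", via_pr=False)` lists the steps for the direct push used here, while the default `via_pr=True` ends with a push to a fork, from which a PR against `2.3.0_wmagent` can be opened.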

germanfgv added a commit to dmwm/T0 that referenced this pull request Feb 21, 2024
Successfully merging this pull request may close these issues.

CleanupJobs fail due to xrdfs shared library version mismatch.
3 participants