Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds WMCMSSWSubprocess metrics to FJR document #11716

Merged
merged 2 commits into from
Sep 19, 2023

Conversation

vkuznet
Copy link
Contributor

@vkuznet vkuznet commented Sep 13, 2023

Fixes #11660
Complement to #11665

Status

ready

Description

Add WMCMSSWSubproces metrics to WMCore FJR stored in WMAgent local CouchDB. Full procedure is described in the following wiki.

Is it backward compatible (if not, which system it affects?)

YES

Related PRs

#11692

External dependencies / deployment changes

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 3 new failures
    • 1 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 3 warnings
    • 123 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 6 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14481/artifact/artifacts/PullRequestReport.html

Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vkuznet in addition to the comments along the code, can you please provide unit test(s) for these changes? Thanks

@@ -113,7 +112,6 @@
# 'StageOutCommand': ''
}],
'WMCMSSWSubprocess': {},
'WMTiming': {},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you removing this because this isn't yet fully functional? If so, then don't you prefer to leave it in while we work on getting this WMTiming metric in the proper fwjr document?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am removing this because it does not belong to this step part of the report and it should be at top level. But so far I do not see any defaults for top level parts in this codebase, and only defaults for step parts.

@@ -1410,6 +1424,17 @@ def getPrepID(self):
"""
return getattr(self.data, 'prep_id', "")

def getWMTiming(self):
"""
_getWMTiming_
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please refresh this contribution guidelines: https://github.com/dmwm/WMCore/blob/master/CONTRIBUTING.rst#project-docstrings-best-practices

It seems you also have trailing spaces here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do, I copied docstring as it is used in other parts in this code. And, new docstring style will differ from other methods just saying.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine. At least we will be slowly converging towards one style.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

@@ -301,6 +303,18 @@ def getExitCodes(self):
returnCodes.update(self.getStepExitCodes(stepName=stepName))
return returnCodes

def getWMCMSSWSubprocess(self, stepName):
"""
_getWMCMSSWSubprocess_
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 3 warnings
    • 123 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 6 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14482/artifact/artifacts/PullRequestReport.html

Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vkuznet Valentin, this code looks good to me, despite a few minor comments that have not been addressed from the previous review.

In addition, I would suggest to:

  • refactor the initial PR description (IMO, most of that content belongs to the wiki that you created, for the data flow of CMSSW metrics).
  • if you are done working on the PR, please remove that label as well (PR: work in progress)
  • please provide unit tests for the 2 new getter methods.

@@ -301,6 +303,18 @@ def getExitCodes(self):
returnCodes.update(self.getStepExitCodes(stepName=stepName))
return returnCodes

def getWMCMSSWSubprocess(self, stepName):
"""
_getWMCMSSWSubprocess_
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

@@ -1410,6 +1424,17 @@ def getPrepID(self):
"""
return getattr(self.data, 'prep_id', "")

def getWMTiming(self):
"""
_getWMTiming_
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 4 tests no longer failing
    • 2 tests added
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 3 warnings
    • 178 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 25 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14488/artifact/artifacts/PullRequestReport.html

@vkuznet
Copy link
Contributor Author

vkuznet commented Sep 19, 2023

@amaltaro , I updated description of this PR, removed label and added unit tests.
Regarding your requests:

  • I never created wiki about it but rather had google doc. Therefore, to fulfill your request I created new wiki. It may still require adjustments though when I'll resolve WMTiming metrics and generalize it
  • the old style _getWMTiming_ docstring are removed, it was not clear to me if you wanted them to keep or not, but I added proper documentation to method
  • unit test are created and according to test they passed, but it looks like there is one unstable unit test (Throttling) which is not related to this PR.

Please review this PR again and let me know if it is ok.

Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vkuznet changes look good to me, but first please squash these commits accordingly and if needed, amend the commit message as well.

@vkuznet
Copy link
Contributor Author

vkuznet commented Sep 19, 2023

Squash is done

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 2 tests added
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 3 warnings
    • 178 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 25 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14490/artifact/artifacts/PullRequestReport.html

@amaltaro
Copy link
Contributor

Thanks, Valentin.

@amaltaro amaltaro merged commit 27a5dec into dmwm:master Sep 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WM measurements of wallclock and cpu time for each cmsRun step
3 participants