Update workqueue stats as new blocks are added to the workflow #11135
Jenkins results:
simply update a few comments
StepChain json template with OpenRunningTimeout
Valentin, Todor, even though I am still trying to run a real test with a growing dataset, I'd appreciate any feedback that you might have on this PR. The goal is to get it deployed in testbed tomorrow morning CERN time (meaning, we need to merge it before I go to bed :))
Changes to the code are fine, but I have a question about the choice of OpenRunningTimeout. Why 7200 (2h)? Also, I don't know where a good place to document it would be, since you make this the default in the JSON. Maybe you can add default values to src/python/WMCore/ReqMgr/Tools/cms.py, but it does not describe other defaults either.
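For reference, a minimal sketch of what the 7200 value amounts to. Only the OpenRunningTimeout name, the 7200 value and the StepChain spec type come from this PR; the template fragment, the helper names and the assumed meaning of the timeout (a request stays open for new input until that many seconds pass without new data) are illustrative assumptions, not the WMCore implementation.

```python
# Hypothetical fragment of a StepChain request template; the other
# fields of a real template are omitted here.
template = {
    "RequestType": "StepChain",
    "OpenRunningTimeout": 7200,  # seconds, the value questioned above
}


def timeout_in_hours(request):
    """Convert the configured timeout from seconds to hours."""
    return request.get("OpenRunningTimeout", 0) / 3600


def still_open(last_new_data_ts, open_running_timeout, now):
    """Assumed semantics: the request keeps accepting new blocks until
    open_running_timeout seconds have passed since the last new data."""
    return (now - last_new_data_ts) < open_running_timeout


print(timeout_in_hours(template))
```

With these assumptions, a request whose last new block arrived one hour ago would still be open, and one whose last block arrived two hours ago would not.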
That's just a random number that came to mind. It has to be small enough such that:
Regarding this module, I think it's used for the ReqMgr2 Web UI. There are other things that we should modify in there, including the addition/removal of workflow parameters from the web form. Given that the Web UI is barely used, I think we can address that in the future, when we also work on other workflow fields that are no longer relevant. The important thing is that we stop mentioning that this parameter has been deprecated :-D
I am still unable to see progress with my real test. The problem is likely coming from the stuck JobAccountant components spotted today. I will keep testing it during the day tomorrow (Thursday), and hopefully new blocks will be appended to the input dataset I am using in my tests. I do see, though, the global workqueue trying to add new work for such open-running requests, now regardless of the spec type.
Now that I managed to get a test with a growing dataset running, I can confirm that:
An enhancement here would be not to trigger a CouchDB document update when there is nothing to be updated. From the ReqMgr2 logs, I see PUT requests like this:
every few minutes (5 min, but it really depends on the load in the system). I'm going to make another GH issue later today to keep track of this.
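A minimal sketch of that enhancement, assuming the stored request document and the freshly computed stats are both available as plain dictionaries. The function names, the field names and the `put` callback are hypothetical; this is not the ReqMgr2 or CouchDB client API.

```python
def fields_to_update(current_doc, new_stats):
    """Return only the stats whose values differ from the stored document."""
    return {key: value for key, value in new_stats.items()
            if current_doc.get(key) != value}


def maybe_update(current_doc, new_stats, put):
    """Call put(changes) only when at least one value actually changed.

    `put` stands in for whatever issues the PUT request (hypothetical).
    Returns True if an update was sent, False if it was skipped.
    """
    changes = fields_to_update(current_doc, new_stats)
    if not changes:
        return False  # nothing new, so no document update is triggered
    put(changes)
    return True


# Usage with a list standing in for the HTTP client.
sent = []
stored = {"total_jobs": 10, "input_files": 4}  # illustrative field names
maybe_update(stored, {"total_jobs": 10, "input_files": 4}, sent.append)
maybe_update(stored, {"total_jobs": 12, "input_files": 4}, sent.append)
print(sent)
```

In this sketch the first call is skipped because the stats are unchanged, and the second sends only the field that differs.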
Fixes #11129
Status
testing
Description
Short description of the changes provided within this PR:
OpenRunningTimeout for StepChain and TaskChain workflow specs
StartPolicy dictionary
Is it backward compatible (if not, which system it affects?)
YES
Related PRs
None
External dependencies / deployment changes
None