Remove StateCheck for SubProcess/DependTask #14324
ORuteMa wants to merge 4 commits into apache:3.1.8-prepare
…mission madness in the default state-wheel-interval configuration
SbloodyS left a comment
Please change your target branch to dev.
In dev, PR #14242 has already fixed this. That PR actually solved a serious problem rather than making a mere improvement, but it was not realized at the time that it needed to be cherry-picked into version 3.1.x.

cc @ruanwenjun
Codecov Report
@@ Coverage Diff @@
## 3.1.8-prepare #14324 +/- ##
================================================
Coverage ? 38.24%
Complexity ? 3968
================================================
Files ? 999
Lines ? 36827
Branches ? 4255
================================================
Hits ? 14086
Misses ? 21128
Partials ? 1613

SonarCloud Quality Gate failed.

@SbloodyS E2E failed, pls retry.

@ORuteMa The dev branch removes this code because it introduces a delay-queue scheme for state detection. The 3.1.8 branch does not have that scheme, so the state detection above needs to be retained.
From my experience, SubProcess works fine after I remove this code. There may be another mechanism that takes care of async tasks.

I will figure out exactly what is going on here and provide a more appropriate solution for 3.1.x.

@ORuteMa I also encountered this bug.
Is the stuck task a SubProcess or a DependentTask?

@ORuteMa Pls help me, I have the same issue, but when I cherry-pick the
@ORuteMa
…task submission madness in the default state-wheel-interval configuration" This reverts commit 6a90719.
Sorry for my late reply. You should revert the PR from dev and adjust the config.
I offer another solution that may be more appropriate, pls have a look.

Hi @zhuangchong @ORuteMa, I also encountered this situation, but I think this polling interval needs to be configured according to each system's resources. The master polls all eligible tasks assigned by it. We could set the interval to the maximum number of startups in a worker, multiplied by the number of workers, divided by the number of masters, multiplied by the time each event takes to execute, and then multiplied by 2/3.
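To make the proposed rule concrete, here is a minimal Python sketch of the arithmetic. The function name, parameter names, and the sample numbers are all hypothetical and are not taken from DolphinScheduler; the comment above only gives the formula in words, so the units (milliseconds) are an assumption as well.

```python
def suggested_poll_interval_ms(
    max_startups_per_worker: int,
    worker_count: int,
    master_count: int,
    event_time_ms: float,
) -> float:
    """Polling interval suggested in the comment above (hypothetical helper).

    interval = (max startups per worker * workers / masters)
               * time per event * 2/3
    """
    if master_count <= 0:
        raise ValueError("master_count must be positive")
    # Upper bound on the tasks a single master may have to poll.
    tasks_per_master = max_startups_per_worker * worker_count / master_count
    return tasks_per_master * event_time_ms * 2 / 3


# Illustrative numbers only: 3 workers with at most 100 startups each,
# 2 masters, and an assumed 2 ms per event.
example_interval = suggested_poll_interval_ms(100, 3, 2, 2.0)
print(example_interval)
```

With these made-up inputs the rule gives an interval of 200 ms. The result depends heavily on the per-event time, which is hard to measure in practice.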
Hi @fuchanghai, I think this strategy may be a bit complicated. For example, the time an event takes to finish executing is hard to define, and the maximum number of workers is also a problem. In any case, the default of 5 ms is very unreasonable. Also, as far as I can tell, only DependTask and SubProcess need the StateCheck event, and their checks should not need concurrency. So here I use a separate queue and make sure events are not duplicated, which sidesteps the problem: however users adjust the execution interval, it will no longer cause events to be submitted in large numbers.
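The idea of a separate queue that refuses duplicate events can be sketched as follows. This is not the code from the PR: the class name, method names, and the tuple used as an event key are hypothetical, and the sketch is single-threaded (a real master would need locking or a concurrent structure).

```python
from collections import deque


class DedupEventQueue:
    """FIFO queue that holds at most one pending event per key (sketch)."""

    def __init__(self):
        self._queue = deque()
        self._pending = set()  # keys currently waiting in the queue

    def offer(self, key) -> bool:
        """Enqueue key unless an identical event is already pending."""
        if key in self._pending:
            return False
        self._pending.add(key)
        self._queue.append(key)
        return True

    def poll(self):
        """Remove and return the oldest key, or None if the queue is empty."""
        if not self._queue:
            return None
        key = self._queue.popleft()
        self._pending.discard(key)
        return key

    def __len__(self) -> int:
        return len(self._queue)


# Simulate 200 timer ticks firing a state check for the same task
# before the handler gets a chance to run.
queue = DedupEventQueue()
accepted = sum(queue.offer(("state_check", 42)) for _ in range(200))
print(accepted, len(queue))
```

Only the first of the 200 offers is accepted, so the backlog stays at one event per task regardless of how short the interval is. Once the event has been polled, the same key can be queued again.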

@zhuangchong May I ask why this PR was closed? As it stands, the 3.1.x versions still have this problem.

@zhuangchong Should I cherry-pick this PR to 3.1.9-prepare?










Remove StateCheck for SubProcess/DependTask which will cause task submission madness in the default state-wheel-interval configuration
Close #14262
In dev, I found that PR #14242 has already fixed this.
However, the description of PR #14242 misses an important part: in scenarios with a large number of SubProcess/DependTask tasks, the StateCheck can prevent the workflow from running successfully because of excessive task submission. This issue should be fixed in version 3.1.8; otherwise 3.1.x is not usable with default parameters in such a scenario.
state-wheel-interval defaults to 5, which causes 200 StateCheck tasks to be submitted per second. What's worse, once the task succeeds, the previously submitted state events all end in exceptions, which slows down event handling.

From my practice, it can be stuck for 10+ hours.
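As a rough back-of-the-envelope check of the numbers above, the sketch below assumes the interval is in milliseconds and that each tick submits exactly one StateCheck event per task. Both function names are hypothetical, and the 10-hour figure is used only as an illustration of the "10+ hours" case.

```python
def state_checks_per_second(interval_ms: float) -> float:
    """Events submitted per second, assuming one event per tick."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return 1000.0 / interval_ms


def state_checks_while_stuck(interval_ms: float, stuck_hours: float) -> float:
    """Events submitted for one task over the given number of hours."""
    return state_checks_per_second(interval_ms) * stuck_hours * 3600


print(state_checks_per_second(5))       # 200.0
print(state_checks_while_stuck(5, 10))  # 7200000.0
```

Under these assumptions a single stuck task accounts for about 7.2 million submitted events in 10 hours, which is consistent with event handling slowing down as described.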
cc @ruanwenjun @zhuangchong
Purpose of the pull request
Fix task submission madness in 3.1.x
Brief change log
Remove StateCheck for SubProcess/DependTask
Verify this pull request
This pull request is code cleanup without any test coverage.