[Improvement][Worker] Do not verify the status of yarn in ShellCommandExecutor #5665
skymsg wants to merge 4 commits into apache:dev from skymsg:dev
Thank you very much for your contribution. The code style check fails; please refer to this doc (https://dolphinscheduler.apache.org/en-us/community/development/pull-request.html).
Codecov Report
@@            Coverage Diff             @@
##               dev    #5665     +/-   ##
============================================
- Coverage    45.34%   45.32%   -0.03%
+ Complexity    3689     3682       -7
============================================
  Files          607      607
  Lines        24860    24850      -10
  Branches      2825     2824       -1
============================================
- Hits         11274    11263      -11
- Misses       12504    12505       +1
  Partials      1082     1082
Hi, how is this going?
Sorry, I have had a busy schedule recently. I will try to fix these errors over the weekend.
SonarCloud Quality Gate failed.
        }
    } else {
        logger.error("process has failure , exitStatusCode:{}, processExitValue:{}, ready to kill ...",
                result.getExitStatusCode(), process.exitValue());
We may also need to modify the kill method on line 214. The current implementation finds the applicationId in the log and kills the application on YARN; for normal tasks, we don't need to kill anything on YARN.
Just removing the killYarnJob call from the kill method may be enough.
public static void kill(TaskExecutionContext taskExecutionContext) {
    try {
        int processId = taskExecutionContext.getProcessId();
        if (processId == 0) {
            logger.error("process kill failed, process id :{}, task id:{}",
                    processId, taskExecutionContext.getTaskInstanceId());
            return;
        }
        String pidsStr = getPidsStr(processId);
        if (StringUtils.isNotEmpty(pidsStr)) {
            String cmd = String.format("kill -9 %s", pidsStr);
            cmd = OSUtils.getSudoCmd(taskExecutionContext.getTenantCode(), cmd);
            logger.info("process id:{}, cmd:{}", processId, cmd);
            OSUtils.exeCmd(cmd);
        }
    } catch (Exception e) {
        logger.error("kill task failed", e);
    }
    // find log and kill yarn job
    killYarnJob(taskExecutionContext);
}
I found that TaskExecuteThread calls cancelApplication to kill the yarn job when the handle method of AbstractYarnTask throws an Exception.
The cancelApplication method of AbstractYarnTask calls ProcessUtils.killYarnJob(taskExecutionContext) itself:
public void cancelApplication(boolean status) throws Exception {
    cancel = true;
    // cancel process
    shellCommandExecutor.cancelApplication();
    TaskInstance taskInstance = processService.findTaskInstanceById(taskExecutionContext.getTaskInstanceId());
    if (status && taskInstance != null) {
        ProcessUtils.killYarnJob(taskExecutionContext);
    }
}
@skymsg In some cases, the task may just exit with a failure status and not throw an exception.
Do not verify the status of yarn in ShellCommandExecutor (#5564)
Purpose of the pull request
fix #5564
Move the code that verifies the status of yarn from ShellCommandExecutor to AbstractYarnTask.
My shell command's output log contains text that matches the pattern APPLICATION_REGEX = "application_\d+_\d+", but the task is not a yarn task. This yarn-related code causes my shell command task to fail.
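The false positive described above can be reproduced in isolation. The following is only a minimal sketch: the pattern text is the one quoted in this description, while the class name, the findAppIds helper, and the sample log line are hypothetical and are not taken from the DolphinScheduler code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ApplicationIdFalsePositive {

    // Pattern text as quoted in the PR description.
    private static final Pattern APPLICATION_REGEX = Pattern.compile("application_\\d+_\\d+");

    // Hypothetical helper: collects every substring of the log that looks like a yarn application id.
    static List<String> findAppIds(String log) {
        List<String> ids = new ArrayList<>();
        Matcher matcher = APPLICATION_REGEX.matcher(log);
        while (matcher.find()) {
            ids.add(matcher.group());
        }
        return ids;
    }

    public static void main(String[] args) {
        // Hypothetical output of a plain shell task that never talks to yarn.
        String shellLog = "backup written to /data/application_20210618_001.tar.gz";
        // The file name is picked up as if it were a yarn application id.
        System.out.println(findAppIds(shellLog)); // prints [application_20210618_001]
    }
}
```

Because a pattern-only check cannot tell a file name from a real application id, the lookup is only meaningful for tasks that are known to submit yarn applications.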
Brief change log
Verify this pull request
This pull request is code cleanup without any test coverage.