Skip to content

[Bug][Master] remove check with executePath when kill yarn job#5552

Merged
CalvinKirs merged 3 commits intoapache:devfrom
geosmart:5550
May 29, 2021
Merged

[Bug][Master] remove check with executePath when kill yarn job#5552
CalvinKirs merged 3 commits intoapache:devfrom
geosmart:5550

Conversation

@geosmart
Copy link
Contributor

@geosmart geosmart commented May 27, 2021

Purpose of the pull request

when worker is down, a master trigger worker tolerance,
when handle a yarn task tolerance, it first kill the yarn job by yarn rest api and make the task state to NEED_FAULT_TOLERANCE,
so the task can scheduler by master.

bug detail

the check make the yarn job not killed, but re submit by another worker。 
it will make the yarn running duplicated biz application.

change

This change added tests and can be verified as follows:

so remove the check code.
just get the applocationId and kill it ,no need to check the executePath.

issue link to #5550

Copy link

@vimacro vimacro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modify this:
cancelApplication(appIds, logger, taskExecutionContext.getTenantCode(), taskExecutionContext.getExecutePath());
remove taskExecutionContext.getExecutePath()

throw new RuntimeException("task instance work dir is empty");
}
if (CollectionUtils.isNotEmpty(appIds)) {
cancelApplication(appIds, logger, taskExecutionContext.getTenantCode(), taskExecutionContext.getExecutePath());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taskExecutionContext.getExecutePath() should remove

Copy link
Contributor

@brave-lee brave-lee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@geosmart
Copy link
Contributor Author

maybe the sudo -u tenantCode will throw NPE error?
and tenantCode linux user is not exist in master node, does we need to create it before run the command?

@CalvinKirs CalvinKirs added the bug Something isn't working label May 28, 2021
@CalvinKirs CalvinKirs added this to the 1.3.7-release milestone May 28, 2021
taskExecutionContext.getProcessDefineVersion(),
taskExecutionContext.getProcessInstanceId(),
taskExecutionContext.getTaskInstanceId()));
FileUtils.createWorkDirIfAbsent(taskExecutionContext.getExecutePath());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mv FileUtils.createWorkDirIfAbsent(taskExecutionContext.getExecutePath());
to if outside

@codecov-commenter
Copy link

Codecov Report

Merging #5552 (155d8ae) into dev (842c540) will decrease coverage by 0.01%.
The diff coverage is 0.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##                dev    #5552      +/-   ##
============================================
- Coverage     46.74%   46.72%   -0.02%     
+ Complexity     3786     3784       -2     
============================================
  Files           605      605              
  Lines         24791    24794       +3     
  Branches       2811     2811              
============================================
- Hits          11588    11586       -2     
- Misses        12096    12099       +3     
- Partials       1107     1109       +2     
Impacted Files Coverage Δ
...he/dolphinscheduler/server/utils/ProcessUtils.java 48.50% <0.00%> (-0.89%) ⬇️
...er/master/dispatch/host/assign/RandomSelector.java 77.77% <0.00%> (-5.56%) ⬇️
...dolphinscheduler/remote/future/ResponseFuture.java 81.35% <0.00%> (-1.70%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 842c540...155d8ae. Read the comment docs.

@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

0.0% 0.0% Coverage
0.0% 0.0% Duplication

@geosmart
Copy link
Contributor Author

maybe the sudo -u tenantCode will throw NPE error?
and tenantCode linux user is not exist in master node, does we need to create it before run the command?

@zhanguohao ptal

Copy link
Member

@CalvinKirs CalvinKirs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@CalvinKirs CalvinKirs merged commit 5627331 into apache:dev May 29, 2021
@CalvinKirs CalvinKirs removed this from the 1.3.7-release milestone Jul 19, 2021
CalvinKirs added a commit that referenced this pull request Jul 29, 2021
#5846)

* [1.3.7-prepare#5552][Bug][Master] remove check with executePath when kill yarn job
pr #5552
issue #5550

* update method name
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants