[Improvement][Task] Support end-to-end transfer file between tasks #13343

Closed
Radeity wants to merge 9 commits into apache:dev from Radeity:end2end_transfer_file

@Radeity (Member) commented Jan 5, 2023

Purpose of the pull request

Brief change log

  • Add the configuration item tmp.transfer.file.size (default: 100MB).

  • Add TmpDirClearProcessor to clean the temporary directory after the process has finished.

  • If temporary storage usage has not reached the limit, transferred files are stored in the local temporary path: DATA_BASEDIR/tmp/{tenandCode}/{projectCode}/{processDefineCode}/{processDefineVersion}/{processInstanceId}

  • If the downloaded file is an scp command template file (with the .template suffix), complete and execute the scp command to fetch the files from the upstream worker.
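
To make the path layout and the size check above concrete, here is a minimal sketch. It is not the code from this PR: the class name, method names, parameter types, and the exact meaning of "doesn't reach the limit" (taken here as "current usage plus the new file stays within the limit") are all assumptions for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the PR: illustrates the layout described above.
final class TmpTransferPathSketch {

    // Assumption: the 100MB default of tmp.transfer.file.size expressed in bytes.
    static final long DEFAULT_LIMIT_BYTES = 100L * 1024 * 1024;

    // Builds DATA_BASEDIR/tmp/{tenant}/{project}/{definition}/{version}/{instance}.
    static Path tmpDir(String dataBaseDir, String tenantCode, long projectCode,
                       long processDefineCode, int processDefineVersion,
                       int processInstanceId) {
        return Paths.get(dataBaseDir, "tmp", tenantCode,
                String.valueOf(projectCode),
                String.valueOf(processDefineCode),
                String.valueOf(processDefineVersion),
                String.valueOf(processInstanceId));
    }

    // Assumed semantics: the file may go to local temporary storage only if
    // the usage after adding it does not exceed the configured limit.
    static boolean fitsInTmp(long usedBytes, long fileBytes, long limitBytes) {
        return usedBytes + fileBytes <= limitBytes;
    }
}
```

With these assumptions, a 10MB file fits when 90MB is already used, and a file that does not fit would fall back to whatever non-local path the PR uses.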

Verify this pull request

  • Add UT for FileUtils; existing UTs cover the changes in TaskFilesTransferUtils.

  • Manually tested that the scp approach works in standalone mode. However, it still has to be verified for DS deployed in cluster mode and k8s mode: can workers communicate with each other without a password? If not, the scp command will not execute successfully.
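
One way to probe the open question above is to attempt a non-interactive SSH login from one worker to another. The sketch below is an assumption, not part of this PR: it relies on the OpenSSH client options BatchMode=yes (never prompt for a password) and ConnectTimeout, and the user and host names passed in are placeholders.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical check, not part of the PR.
final class PasswordlessSshCheck {

    // Runs the remote no-op `true`; BatchMode=yes makes ssh fail instead of
    // prompting when key-based login is not set up.
    static List<String> buildCommand(String user, String host) {
        return Arrays.asList("ssh", "-o", "BatchMode=yes",
                "-o", "ConnectTimeout=5", user + "@" + host, "true");
    }

    // Returns true only if ssh exits with status 0 within 15 seconds.
    static boolean canConnect(String user, String host)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(user, host))
                .redirectErrorStream(true)
                .start();
        if (!process.waitFor(15, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            return false;
        }
        return process.exitValue() == 0;
    }
}
```

If canConnect returns false between two workers, an scp command issued by the downstream task would be expected to fail in the same way.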

Log in upstream task:

image

Log in downstream task:

image

@SbloodyS added the "3.2.0" (for 3.2.0 version) label Jan 5, 2023
@SbloodyS added this to the 3.2.0 milestone Jan 5, 2023
@SbloodyS added the "feature" (new feature) label Jan 5, 2023

boolean isZip = downloadPath.endsWith(PACK_SUFFIX + TEMPLATE_SUFFIX);

String execCommand = String.format(commandString, downloadPath);

Check warning (Code scanning / CodeQL): Unused format argument

This format call refers to 0 argument(s) but supplies 1 argument(s).
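
The warning reflects how String.format behaves: when the format string contains no format specifier, extra arguments are silently ignored. The sketch below illustrates this with two made-up templates. The actual value of commandString in the PR is not shown in this thread, so both strings and the class name are assumptions.

```java
// Hypothetical illustration of the CodeQL finding, not code from the PR.
final class FormatArgumentSketch {

    // No %s specifier: the supplied argument is dropped.
    static final String NO_PLACEHOLDER = "scp -r worker:/tmp/out .";

    // One %s specifier: the argument is substituted as intended.
    static final String WITH_PLACEHOLDER = "scp -r worker:/tmp/out %s";

    static String render(String template, String downloadPath) {
        return String.format(template, downloadPath);
    }
}
```

Calling render with NO_PLACEHOLDER returns the template unchanged, which is usually a sign that either the argument is unnecessary or the template is missing a specifier.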

@codecov-commenter commented Jan 5, 2023

Codecov Report

Merging #13343 (5a8dee6) into dev (313ba44) will decrease coverage by 0.01%.
The diff coverage is 32.82%.

@@             Coverage Diff              @@
##                dev   #13343      +/-   ##
============================================
- Coverage     39.44%   39.42%   -0.02%     
- Complexity     4307     4313       +6     
============================================
  Files          1083     1085       +2     
  Lines         40738    40866     +128     
  Branches       4669     4681      +12     
============================================
+ Hits          16069    16112      +43     
- Misses        22884    22962      +78     
- Partials       1785     1792       +7     

| Impacted Files | Coverage Δ |
|---|---|
| ...e/dolphinscheduler/common/constants/Constants.java | 75.00% <ø> (ø) |
| .../server/master/runner/WorkflowExecuteRunnable.java | 9.99% <0.00%> (-0.17%) ⬇️ |
| ...inscheduler/remote/command/TmpDirClearCommand.java | 0.00% <0.00%> (ø) |
| .../server/worker/processor/TmpDirClearProcessor.java | 0.00% <0.00%> (ø) |
| ...inscheduler/server/worker/rpc/WorkerRpcServer.java | 0.00% <0.00%> (ø) |
| ...er/server/worker/utils/TaskFilesTransferUtils.java | 70.73% <54.38%> (-8.36%) ⬇️ |
| ...pache/dolphinscheduler/common/utils/FileUtils.java | 62.38% <55.00%> (+0.58%) ⬆️ |
| ...e/dolphinscheduler/remote/command/CommandType.java | 100.00% <100.00%> (ø) |

@sonarqubecloud bot commented Jan 5, 2023

SonarCloud Quality Gate failed.

Bugs: 1 (rating C)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

Coverage: 38.4%
Duplication: 0.0%

Comment on lines +74 to +82
    try {
        File delDir = new File(tmpDir);
        String parentPath = delDir.getParent();
        org.apache.commons.io.FileUtils.deleteDirectory(delDir);
        FileUtils.deleteEmptyParentDir(parentPath);
        logger.info("Success clear the tmp dir: {}.", tmpDir);
    } catch (IOException e) {
        logger.error("Tmp dir clear failed!");
    }

@Radeity (Member, Author) commented Jan 24, 2023

This has to account for the scenario where the task cache mechanism is used: in that case the files should be uploaded to the resource center rather than simply deleted.

Comment on lines +214 to +216
    if (isScpCommandTemplate) {
        isPack = scpFetchFile(downloadPath, targetPath);
    }

@Radeity (Member, Author) commented

If the upstream task uses the cache mechanism, the scp fetch will fail; however, the transferred files will already have been uploaded to the resource center. In that case, try removing the template suffix from the resource path and downloading them from there.
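
The fallback proposed in this comment could look roughly like the sketch below. It is an illustration only: the Fetcher interface, the method names, and the returned labels are hypothetical, and the only detail taken from this PR is the .template suffix.

```java
// Hypothetical sketch of the proposed fallback, not code from the PR.
final class ScpFallbackSketch {

    // Suffix of scp command template files, as described in the change log.
    static final String TEMPLATE_SUFFIX = ".template";

    // Stand-in for "fetch via scp" and "download from resource center".
    interface Fetcher {
        boolean fetch(String path);
    }

    // Removes a trailing .template; other paths are returned unchanged.
    static String stripTemplateSuffix(String resourcePath) {
        if (resourcePath.endsWith(TEMPLATE_SUFFIX)) {
            return resourcePath.substring(
                    0, resourcePath.length() - TEMPLATE_SUFFIX.length());
        }
        return resourcePath;
    }

    // Tries scp first; if that fails, retries against the resource center
    // using the path without the template suffix.
    static String fetchWithFallback(String templatePath, Fetcher scp,
                                    Fetcher resourceCenter) {
        if (scp.fetch(templatePath)) {
            return "scp";
        }
        if (resourceCenter.fetch(stripTemplateSuffix(templatePath))) {
            return "resource-center";
        }
        return "failed";
    }
}
```

Returning a label rather than a boolean is a choice made for this sketch so that a caller can log which route succeeded.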

@Radeity marked this pull request as draft January 30, 2023 10:29

@github-actions bot commented Jun 6, 2023

This pull request has been automatically marked as stale because it has not had recent activity for 120 days. It will be closed in 7 days if no further activity occurs.

@github-actions bot added the Stale label Jun 6, 2023

@github-actions bot commented

This pull request has been closed because it has not had recent activity. You can reopen it if you want to continue your work, and anyone who is interested is encouraged to continue work on this pull request.

@github-actions bot closed this Jun 14, 2023