NIFI-11472 Make PutFTP processor more multithread friendly #7184
Hello @MormonJesus69420, thanks for your contribution!
Hi @arpadboda, I may not have described it well: the issue we face occurs when we configure the processor to run several concurrent tasks.
We have noticed a significant performance boost when using a PutFTP processor with more than one concurrent task. I don't have the numbers on hand at the moment, but switching to two or three concurrent tasks significantly reduced the transfer time.
I am sorry, I managed to make the most basic mistake in such a small change, I forgot to add a |
Strange, I don't understand why it failed on the Windows action. I can't test it on Windows either, since we use Linux for development at work. Also, my branch is based on the latest nifi/main branch.
Thanks for the contribution @MormonJesus69420, and thanks for the review @arpadboda!
The Windows build often takes longer than the others, so it timed out; I restarted it to try again.
The use case of multiple concurrent tasks for PutFTP makes sense, and introducing the additional call to setWorkingDirectory seems reasonable under the circumstances. Within the context of a single NiFi server, a more robust solution might be to add synchronized to this method, but that would not help when running PutFTP across multiple NiFi nodes in a cluster.
With that background, attempting to change to the directory if the makeDirectory command fails seems like the best approach. I will monitor the Windows build status for verification.
Thanks again @MormonJesus69420! +1 merging
Add an extra check during directory creation to see if directory wasn't already created in another thread.
From Issue:
The problem occurs when PutFTP is set to run several concurrent tasks and two (or more) FlowFiles arrive that both need to create the same directory. One task creates the directory and succeeds immediately, while the other tries to create it, fails because it already exists, and throws an error. The FlowFile is then penalized and succeeds on the second run.
This is not a serious error, since the files are transferred in the end, but the bulletins and errors are annoying, especially in a production environment where you don't want unnecessary errors.
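The interleaving described above can be replayed deterministically. This is only a minimal sketch: FakeFtpServer, replayRace and the path are hypothetical stand-ins invented for illustration, not NiFi or FTP client code, and the only assumption about the server is that creating an existing directory reports failure.

```java
import java.util.HashSet;
import java.util.Set;

public class DirectoryRaceSketch {
    /** Hypothetical in-memory stand-in for an FTP server (illustration only). */
    static class FakeFtpServer {
        private final Set<String> directories = new HashSet<>();

        synchronized boolean exists(String path) {
            return directories.contains(path);
        }

        /** Assumed behaviour: creation reports failure if the directory already exists. */
        synchronized boolean makeDirectory(String path) {
            return directories.add(path);
        }
    }

    /**
     * Replays the problematic ordering: both tasks check for the directory
     * before either one creates it, so both decide to create it.
     * Returns {taskACreated, taskBCreated}.
     */
    static boolean[] replayRace(FakeFtpServer server, String path) {
        boolean taskASawDirectory = server.exists(path);
        boolean taskBSawDirectory = server.exists(path);
        boolean taskACreated = !taskASawDirectory && server.makeDirectory(path);
        boolean taskBCreated = !taskBSawDirectory && server.makeDirectory(path);
        return new boolean[] {taskACreated, taskBCreated};
    }

    public static void main(String[] args) {
        boolean[] result = replayRace(new FakeFtpServer(), "/upload/2023");
        System.out.println("task A created directory: " + result[0]);
        System.out.println("task B created directory: " + result[1]);
    }
}
```

In this replay the second task's create call fails even though the directory it needs is present, which corresponds to the spurious error and penalized FlowFile.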
We found that the solution involves a simple change to the FTPTransfer.java class in:
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/util/FTPTransfer.java
At line 398, in the ensureDirectoryExists method, you can add another if check that verifies whether the directory exists when creating it fails.
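A rough sketch of that extra check follows. It is not the actual FTPTransfer code: the RemoteClient interface and the InMemoryClient are hypothetical, made up so the example is self-contained, and the sketch assumes a failed create can be told apart from a missing directory by trying to change into the directory.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class EnsureDirectorySketch {
    /** Hypothetical minimal client interface (an assumption, not the NiFi API). */
    interface RemoteClient {
        boolean makeDirectory(String path) throws IOException;
        boolean changeWorkingDirectory(String path) throws IOException;
    }

    /** Creates the directory, tolerating the case where another task created it first. */
    static void ensureDirectoryExists(RemoteClient client, String path) throws IOException {
        if (client.makeDirectory(path)) {
            return; // this task created the directory
        }
        // Creation failed: double-check whether the directory exists anyway,
        // for example because a concurrent task or another node created it.
        if (!client.changeWorkingDirectory(path)) {
            throw new IOException("Failed to create remote directory " + path);
        }
    }

    /** Hypothetical in-memory client used only to exercise the sketch. */
    static class InMemoryClient implements RemoteClient {
        private final Set<String> directories = new HashSet<>();

        InMemoryClient() {
            directories.add("/");
        }

        @Override
        public synchronized boolean makeDirectory(String path) {
            String parent = path.substring(0, path.lastIndexOf('/'));
            if (parent.isEmpty()) {
                parent = "/";
            }
            // Fails when the parent is missing or the directory already exists.
            return directories.contains(parent) && directories.add(path);
        }

        @Override
        public synchronized boolean changeWorkingDirectory(String path) {
            return directories.contains(path);
        }
    }

    public static void main(String[] args) throws IOException {
        InMemoryClient client = new InMemoryClient();
        ensureDirectoryExists(client, "/upload"); // creates the directory
        ensureDirectoryExists(client, "/upload"); // create fails, fallback check succeeds
        System.out.println("both calls completed without an error");
    }
}
```

Because the fallback asks the server rather than relying on a local lock, the same check also covers several NiFi nodes writing to one FTP server, which a synchronized method would not.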
Summary
NIFI-11472
Tracking
Please complete the following tracking steps prior to pull request creation.
Issue Tracking
Pull Request Tracking
NIFI-00000
NIFI-00000
Pull Request Formatting
main branch
Verification
Please indicate the verification steps performed prior to pull request creation.
Build
mvn clean install -P contrib-check
Licensing
LICENSE and NOTICE files
Documentation