NIFI-13159 PutFTP/PutSFTP change when delete happens on REPLACE #8914
mosermw wants to merge 3 commits into apache:main
…elay deletion of remote file until after temp file transfer completes
exceptionfactory
left a comment
@mosermw can you provide some additional context for the reasons behind this proposed change in behavior?
Reviewing the code, this change introduces an additional mlist() command for FTP and stat() command for SFTP, for each file transferred through the corresponding Processors. Those commands require both file access and network communication, which could have an impact on high volume flows. Is there a particular reason for adding those calls prior to calling delete()? It seems like that should not be necessary.
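To make the cost concern concrete, here is a minimal sketch that counts remote commands per file. `CountingRemote` and `DeleteStrategies` are hypothetical in-memory stand-ins, not NiFi's FTP or SFTP classes; the only assumption modeled is that each status check or delete costs one round trip.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for a remote server; not a NiFi class.
class CountingRemote {
    final Set<String> files = new HashSet<>();
    int commands = 0; // each call below models one network round trip

    // Stands in for a status call such as FTP mlist() or SFTP stat().
    boolean exists(String name) {
        commands++;
        return files.contains(name);
    }

    // Returns false when the file was not there to delete.
    boolean delete(String name) {
        commands++;
        return files.remove(name);
    }
}

class DeleteStrategies {
    // Check for the file first, then delete only if it is present.
    static void checkThenDelete(CountingRemote remote, String name) {
        if (remote.exists(name)) {
            remote.delete(name);
        }
    }

    // Delete unconditionally and ignore a "not found" result.
    static void deleteOnly(CountingRemote remote, String name) {
        remote.delete(name);
    }

    public static void main(String[] args) {
        CountingRemote a = new CountingRemote();
        a.files.add("report.csv");
        checkThenDelete(a, "report.csv");

        CountingRemote b = new CountingRemote();
        b.files.add("report.csv");
        deleteOnly(b, "report.csv");

        System.out.println("check-then-delete commands: " + a.commands); // 2
        System.out.println("delete-only commands: " + b.commands);       // 1
    }
}
```

In this model the status check doubles the command count whenever the destination already exists, which is the common case for a REPLACE flow.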
Thanks for looking at this PR, @exceptionfactory
As a use case, let's say I have a server running software that has an anxiety attack if a certain file doesn't exist, and it checks for that file every 5 seconds. I have a requirement to update that file periodically. I use the power of NiFi to generate the file contents and then PutSFTP the file into place. Currently, my server software sometimes panics because it checks for the file while I am updating it. This is because PutSFTP deletes the file first, transfers the file as ".filename", then renames it to "filename". As my file gets larger, the transfer can take more than 5 seconds. After the change in this PR, the file will be missing only for the short time between the SFTP delete and the rename. I asked the question on Slack a while back and got positive feedback that this change would be useful. I should have put the link into the Jira ticket. https://apachenifi.slack.com/archives/C0L9S92JY/p1713799005100369
You're absolutely right and I can do better. It doesn't hurt to call delete whether the destination file exists or not. I will modify the PR to remove those additional commands and test again.
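The two orderings described in this comment can be sketched as follows. This is an illustration only: `FakeRemoteDir` and `ReplaceOrdering` are hypothetical in-memory stand-ins, not the PutSFTP implementation. The trace records whether the destination file is visible after each step.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory remote directory; not NiFi's transfer API.
class FakeRemoteDir {
    final Map<String, String> files = new HashMap<>();
    // After each step, records whether the watched file is visible.
    final List<String> trace = new ArrayList<>();
    private final String watched;

    FakeRemoteDir(String watched) {
        this.watched = watched;
    }

    private void record(String step) {
        trace.add(step + ":" + (files.containsKey(watched) ? "present" : "missing"));
    }

    void upload(String name, String content) {
        files.put(name, content);
        record("upload");
    }

    void delete(String name) {
        files.remove(name);
        record("delete");
    }

    void rename(String from, String to) {
        files.put(to, files.remove(from));
        record("rename");
    }
}

class ReplaceOrdering {
    // Order described as the current behavior: delete, upload temp file, rename.
    static void deleteFirst(FakeRemoteDir dir, String name, String content) {
        dir.delete(name);
        dir.upload("." + name, content);
        dir.rename("." + name, name);
    }

    // Order proposed in this PR: upload temp file, delete, rename.
    static void deleteLast(FakeRemoteDir dir, String name, String content) {
        dir.upload("." + name, content);
        dir.delete(name);
        dir.rename("." + name, name);
    }

    public static void main(String[] args) {
        FakeRemoteDir before = new FakeRemoteDir("status.txt");
        before.files.put("status.txt", "old");
        deleteFirst(before, "status.txt", "new");
        System.out.println(before.trace); // [delete:missing, upload:missing, rename:present]

        FakeRemoteDir after = new FakeRemoteDir("status.txt");
        after.files.put("status.txt", "old");
        deleteLast(after, "status.txt", "new");
        System.out.println(after.trace); // [upload:present, delete:missing, rename:present]
    }
}
```

The upload is the slow step, so keeping the old file in place until the upload finishes shrinks the window in which the file is missing to the gap between delete and rename. The window does not disappear entirely.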
Thanks for the reply and additional background @mosermw, that is helpful. At the core, I agree the amount of time between uploading and renaming should be minimal. If you are able to rework the approach and avoid introducing the additional status calls, it seems like a viable improvement. It is worth noting, however, that having some other independent process checking for the existence of a file is still bound to result in a race condition. This change may reduce the chances, but it does not sound like a complete solution. So with that said, I will take another look when you have posted some updates.
…on if file does not exist
Oh I totally agree, but sometimes we must do the best with the hand we are dealt. I made the changes and the PR did get much simpler.
Thanks for the update @mosermw, this looks like a good approach as the put method already considers the temporary file when deciding whether to run the rename command. If the rename failed, it would throw an exception, which could occur if the delete failed for some reason. For troubleshooting, I believe it would be helpful to log the remote exception at the debug level, and include the remote filename that could not be deleted. That way, if the rename operation fails, there is the opportunity to enable additional logging that could explain more about what failed. Otherwise, I think this approach looks good.
Added debug logging if the delete command throws an exception. This is a good suggestion also because some static analysis tools do not like empty catch blocks.
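A minimal sketch of the pattern discussed here: attempt the delete, log any failure at debug level with the remote filename, then let the rename surface real problems. The `RemoteOps` interface and `ReplaceWithDebugLog` class are hypothetical and do not mirror the code in this PR. `java.util.logging` with `Level.FINE` is an assumption made to keep the sketch dependency-free, standing in for whatever logger the processors use.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical minimal remote interface; NiFi's actual transfer classes differ.
interface RemoteOps {
    void deleteFile(String path) throws IOException;

    void rename(String from, String to) throws IOException;
}

class ReplaceWithDebugLog {
    // FINE plays the role of a debug level in java.util.logging.
    private static final Logger LOG = Logger.getLogger(ReplaceWithDebugLog.class.getName());

    // Intended to run after the temporary file has been fully transferred.
    static void finishReplace(RemoteOps remote, String tempPath, String finalPath) throws IOException {
        try {
            remote.deleteFile(finalPath);
        } catch (IOException e) {
            // A failed delete is not fatal on its own (the file may simply not exist),
            // so record it at debug level with the remote filename and continue.
            LOG.log(Level.FINE, "Failed to delete remote file " + finalPath, e);
        }
        // If the delete failed for a real reason, the rename is expected to throw.
        remote.rename(tempPath, finalPath);
    }

    public static void main(String[] args) throws IOException {
        RemoteOps missingTarget = new RemoteOps() {
            @Override
            public void deleteFile(String path) throws IOException {
                throw new IOException("No such file: " + path);
            }

            @Override
            public void rename(String from, String to) {
                System.out.println("renamed " + from + " -> " + to);
            }
        };
        // The delete fails quietly and the rename still runs.
        finishReplace(missingTarget, ".data.json", "data.json");
    }
}
```

Logging inside the catch block also avoids the empty-catch warning mentioned above, and the message is only visible when debug logging is enabled.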
exceptionfactory
left a comment
Thanks for making the adjustments @mosermw, the latest version looks good. +1 merging
Summary
NIFI-13159
For PutFTP/PutSFTP conflict resolution strategy REPLACE, when DOT_RENAME or temp filename is used, delay the deletion of remote file until after temp file transfer completes.
Tracking
Please complete the following tracking steps prior to pull request creation.
Issue Tracking
Pull Request Tracking
NIFI-00000
NIFI-00000
Pull Request Formatting
main branch
Verification
Please indicate the verification steps performed prior to pull request creation.
Build
mvn clean install -P contrib-check
UI Contributions
Licensing
LICENSE and NOTICE files
Documentation