[VFS-862] Fix ON_RESOLVE triggering refresh on internal navigation #761
garydgregory merged 4 commits into apache:master
CacheStrategy.ON_RESOLVE is intended to refresh files when the user resolves them via the public API. However, internal navigation methods (getParent, getRoot, getChildren child resolution, symlink resolution) also called fileSystem.resolveFile(), triggering ON_RESOLVE refreshes on files the user never asked to refresh.

This became a severe regression after refresh() was changed to unconditionally clear FtpFileObject.childMap: each child's getParent() refreshed the parent, clearing its childMap, forcing a new FTP LIST command per child. A directory with N files produced ~N LIST commands instead of 1.

Fix:
- Add resolveFileInternal() that skips the ON_RESOLVE refresh. All internal navigation call sites use it instead of resolveFile().
- After a fresh directory listing, FTP and SFTP providers propagate metadata to cached child objects in-place, preserving object identity.

This establishes a clear contract: cached state is used until the user explicitly calls refresh() or resolves via the public API. Internal navigation never triggers server operations.

Tests:
- FtpGetChildrenListCommandTest: verifies findFiles() on a directory with 50 files issues exactly 1 LIST command with ON_RESOLVE.
- SftpGetChildrenListCommandTest: verifies refresh + findFiles() returns fresh children reflecting filesystem changes.
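The LIST amplification described above can be reproduced with a toy model. The sketch below is only an illustration: ToyDirectory, countListCommands and the static resolveFile/resolveFileInternal helpers are hypothetical stand-ins written for this example, not the real Commons VFS classes or signatures. It assumes each child asks for its parent once before reading its own metadata.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: none of these types are the real Commons VFS classes. */
class OnResolveSketch {

    /** Hypothetical stand-in for a cached FTP directory object. */
    static final class ToyDirectory {
        private final List<String> serverEntries; // what the simulated server holds
        private Map<String, String> childMap;     // cached listing, null when stale
        int listCommands;                         // simulated LIST round trips

        ToyDirectory(List<String> serverEntries) {
            this.serverEntries = serverEntries;
        }

        /** Unconditionally drops the cached listing, like the refresh described above. */
        void refresh() {
            childMap = null;
        }

        /** Issues a simulated LIST only when no cached listing is available. */
        private void ensureListed() {
            if (childMap == null) {
                listCommands++;
                childMap = new LinkedHashMap<>();
                for (String entry : serverEntries) {
                    childMap.put(entry, "info:" + entry);
                }
            }
        }

        /** Returns a copy of the child names so callers can iterate safely. */
        List<String> childNames() {
            ensureListed();
            return new ArrayList<>(childMap.keySet());
        }

        String childInfo(String name) {
            ensureListed();
            return childMap.get(name);
        }
    }

    /** Public-API style resolution: applies the ON_RESOLVE refresh. */
    static ToyDirectory resolveFile(ToyDirectory dir) {
        dir.refresh();
        return dir;
    }

    /** Internal navigation: returns the cached object without refreshing it. */
    static ToyDirectory resolveFileInternal(ToyDirectory dir) {
        return dir;
    }

    /** Walks all children once and reports how many simulated LISTs were needed. */
    static int countListCommands(int fileCount, boolean internalNavigationRefreshes) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < fileCount; i++) {
            names.add("file" + i + ".txt");
        }
        ToyDirectory dir = new ToyDirectory(names);
        for (String name : dir.childNames()) {
            // Each child navigates to its parent before reading its own metadata.
            ToyDirectory parent = internalNavigationRefreshes
                    ? resolveFile(dir)
                    : resolveFileInternal(dir);
            parent.childInfo(name);
        }
        return dir.listCommands;
    }

    public static void main(String[] args) {
        System.out.println("refresh on internal navigation: " + countListCommands(50, true));
        System.out.println("internal resolve without refresh: " + countListCommands(50, false));
    }
}
```

In this toy model a 50-file directory costs 51 simulated LISTs when internal navigation refreshes the parent (one initial listing plus one per child) and 1 when it does not. The real provider's counts depend on its actual call paths, so only the shape of the growth carries over.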
Hi @garydgregory If you don't accept this fix, you should probably revert PR #758 or fix it in a better way.

Hello @ilang Please see the test failures. TY!

Jackrabbit 1.x bundles Jetty 6.x which does not drain unconsumed request bodies before sending error responses (e.g. 404) on persistent connections, violating HTTP/1.1 requirements. This leaves stale request bytes that corrupt the next request on a reused connection.

The resolveFileInternal change in getParent() removed an extra PROPFIND request to the parent directory that used to happen between the child's PROPFIND (404) and the child's PUT. That extra request happened to flush the stale bytes from the connection. Without it, the server reads the 112-byte PROPFIND XML request body as the PUT file content.

Fix: add Connection: close to the test HttpClient configuration so each request uses a fresh connection. Jetty 9.4+ handles this correctly (see jetty#651, jetty#4117, jetty#6168). Jackrabbit 2 tests pass without changes. No production code modified.
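As an illustration of the Connection: close workaround, the sketch below sends one request with that header and reports what the server received. It is an assumption-laden stand-in: it uses the JDK's HttpURLConnection and the JDK's built-in com.sun.net.httpserver.HttpServer, not the HttpClient configuration or the Jackrabbit test server that the PR's tests actually use, and the class and method names are invented for this example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.concurrent.atomic.AtomicReference;

/** Stand-in example: not the test setup used in the PR. */
class ConnectionCloseSketch {

    /** Sends one GET with "Connection: close" and returns the header value the server saw. */
    static String connectionHeaderSeenByServer() throws IOException {
        AtomicReference<String> seen = new AtomicReference<>();

        // Local throwaway server on an ephemeral port that answers 404 to everything.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            seen.set(exchange.getRequestHeaders().getFirst("Connection"));
            exchange.sendResponseHeaders(404, -1); // -1 means no response body
            exchange.close();
        });
        server.start();
        try {
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/missing");
            HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
            // Ask for the connection to be closed after this exchange,
            // so the next request cannot reuse it.
            conn.setRequestProperty("Connection", "close");
            conn.getResponseCode(); // performs the request
            conn.disconnect();
        } finally {
            server.stop(0);
        }
        return seen.get();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Connection header seen by server: " + connectionHeaderSeenByServer());
    }
}
```

Closing the connection after every request trades some test speed for isolation: no request can inherit unread bytes from an earlier exchange on the same socket.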
This PR now includes a fix for the Jackrabbit 1.x WebDAV test failure (

What happened: The embedded Jackrabbit 1.x test server bundles Jetty 6.x, which does not drain unconsumed request bodies on 404 responses. When VFS sends a PROPFIND (112-byte XML body) to check if a file exists and gets a 404, then reuses the same persistent connection for a PUT to create the file, Jetty reads the stale PROPFIND bytes as the PUT body, storing the XML as file content.

Before this PR,

This is a known class of Jetty issue (jetty#651, jetty#4117, jetty#6168), fixed in Jetty 9.4+.

Fix: Add

Thanks @garydgregory, see my comment above. It is a bug in the test (or, more precisely, in the old Jackrabbit that the test uses), not in the code. I created a fix to bypass it.

Hello @ilang
Thank you for your updates.
If I apply the test side of the patch:
- ✅ FtpGetChildrenListCommandTest fails, which is good
- 🔴 SftpGetChildrenListCommandTest passes.

This means the tests likely don't test what you think they test. Tests for a fix should fail when the main side of the patch is not applied. Otherwise, we might have a regression in the future.
Thank you!
The SFTP test cannot distinguish the fix from the original ON_RESOLVE behavior because SFTP's doListChildrenResolved() pushes metadata to each child via setStat() immediately after resolution. The ON_RESOLVE refresh clears attrs, but setStat() repopulates it right after — same server traffic either way. The optimization is purely client-side. The FTP test remains and definitively proves the fix (82 LISTs without the fix, 1 LIST with).
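The reasoning above, that a refresh followed immediately by setStat() leaves the server traffic unchanged, can be sketched with a toy model. ToyChild, serverTraffic and the request counters are hypothetical names made up for this illustration. They do not reproduce the real SftpFileObject, and the model assumes every child receives its attributes from the directory listing.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: not the real SFTP provider classes. */
class SftpRefreshSketch {

    /** Hypothetical cached child whose attributes may be loaded or cleared. */
    static final class ToyChild {
        private Long size; // null means "attrs not loaded"

        void refresh() {
            size = null;
        }

        void setStat(long newSize) {
            size = newSize;
        }
    }

    /** Returns {listing requests, per-file stat requests} for one listing pass. */
    static int[] serverTraffic(boolean refreshOnResolve) {
        Map<String, Long> serverListing = new LinkedHashMap<>();
        serverListing.put("a.txt", 10L);
        serverListing.put("b.txt", 20L);

        Map<String, ToyChild> cache = new LinkedHashMap<>();
        int listingRequests = 0;
        int statRequests = 0;

        listingRequests++; // one directory listing returns names and attributes
        for (Map.Entry<String, Long> entry : serverListing.entrySet()) {
            ToyChild child = cache.computeIfAbsent(entry.getKey(), k -> new ToyChild());
            if (refreshOnResolve) {
                child.refresh(); // clears attrs ...
            }
            child.setStat(entry.getValue()); // ... which are repopulated right away
        }

        // Reading metadata costs a stat request only if attrs are missing.
        for (Map.Entry<String, ToyChild> entry : cache.entrySet()) {
            ToyChild child = entry.getValue();
            if (child.size == null) {
                statRequests++;
                child.setStat(serverListing.get(entry.getKey()));
            }
        }
        return new int[] {listingRequests, statRequests};
    }

    public static void main(String[] args) {
        int[] with = serverTraffic(true);
        int[] without = serverTraffic(false);
        System.out.println("with refresh:    " + with[0] + " listing, " + with[1] + " stat");
        System.out.println("without refresh: " + without[0] + " listing, " + without[1] + " stat");
    }
}
```

Both modes produce one listing request and no stat requests in this model, which is why a test that only observes server traffic or the resulting children cannot tell the two code paths apart.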
You're right about the SFTP test; I've removed it. The SFTP code change ( If you prefer, I can drop the SFTP changes from this PR; it is just a small optimization that is hard to test.

Hello @ilang Thank you for the update. I don't see the need for busy work and creating another PR for what's already here. I'll review again tomorrow.

Summary
- Internal navigation (getParent(), getRoot(), getChildren() child resolution, symlink resolution) triggers CacheStrategy.ON_RESOLVE refresh, causing redundant FTP/SFTP operations
- childMap = null to FtpFileObject.refresh()
Fix
- Add resolveFileInternal() that skips the ON_RESOLVE refresh for internal navigation
Test plan
- FtpGetChildrenListCommandTest: verifies findFiles() issues exactly 1 LIST with ON_RESOLVE (50 files)
- SftpGetChildrenListCommandTest: verifies refresh + findFiles() returns fresh children
JIRA: https://issues.apache.org/jira/browse/VFS-862