Skip to content

[VFS-862] Fix ON_RESOLVE triggering refresh on internal navigation#761

Merged
garydgregory merged 4 commits intoapache:masterfrom
new-proimage:VFS-862
Apr 11, 2026
Merged

[VFS-862] Fix ON_RESOLVE triggering refresh on internal navigation#761
garydgregory merged 4 commits intoapache:masterfrom
new-proimage:VFS-862

Conversation

@ilang
Copy link
Copy Markdown
Contributor

@ilang ilang commented Apr 8, 2026

Summary

  • Internal VFS navigation (getParent(), getRoot(), getChildren() child resolution, symlink resolution) triggers CacheStrategy.ON_RESOLVE refresh, causing redundant FTP/SFTP operations
  • This was always wasteful (2 LIST commands instead of 1 for sibling file lookups), but became O(N) after PR Fix refresh() not clearing cached state when file was never attached #758 added unconditional childMap = null to FtpFileObject.refresh()
  • A directory with 50 files produces ~50 LIST commands instead of 1

Fix

  1. Add resolveFileInternal() that skips the ON_RESOLVE refresh for internal navigation
  2. After a fresh directory listing, FTP and SFTP providers propagate metadata to cached children in-place (preserving object identity)

Test plan

  • FtpGetChildrenListCommandTest: verifies findFiles() issues exactly 1 LIST with ON_RESOLVE (50 files)
  • SftpGetChildrenListCommandTest: verifies refresh + findFiles() returns fresh children
  • All existing tests pass (3208 tests, 0 failures)
  • Checkstyle clean

JIRA: https://issues.apache.org/jira/browse/VFS-862

CacheStrategy.ON_RESOLVE is intended to refresh files when the user
resolves them via the public API. However, internal navigation methods
(getParent, getRoot, getChildren child resolution, symlink resolution)
also called fileSystem.resolveFile(), triggering ON_RESOLVE refreshes
on files the user never asked to refresh.

This became a severe regression after refresh() was changed to
unconditionally clear FtpFileObject.childMap: each child's getParent()
refreshed the parent, clearing its childMap, forcing a new FTP LIST
command per child. A directory with N files produced ~N LIST commands
instead of 1.

Fix:
- Add resolveFileInternal() that skips the ON_RESOLVE refresh. All
  internal navigation call sites use it instead of resolveFile().
- After a fresh directory listing, FTP and SFTP providers propagate
  metadata to cached child objects in-place, preserving object identity.

This establishes a clear contract: cached state is used until the user
explicitly calls refresh() or resolves via the public API. Internal
navigation never triggers server operations.

Tests:
- FtpGetChildrenListCommandTest: verifies findFiles() on a directory
  with 50 files issues exactly 1 LIST command with ON_RESOLVE.
- SftpGetChildrenListCommandTest: verifies refresh + findFiles()
  returns fresh children reflecting filesystem changes.
@ilang
Copy link
Copy Markdown
Contributor Author

ilang commented Apr 8, 2026

Hi @garydgregory
I tried to fix PR #758 with some easy fix, but I think the root cause is the refresh behavior, which I try to fix in this PR.
So this is my stab at fixing it.
I am aware that it is a bit of a larger fix, and I am new here, but this is the simplest and most idiomatic root I found.
I summarized in the Jira that it might have affect on existing behavior, but this is if someone relates on existing bugs.

If you don't accept this fix, you should probably revert PR #758 or fix it in a better way.

@garydgregory
Copy link
Copy Markdown
Member

Hello @ilang

Please see the test failures.

TY!

Jackrabbit 1.x bundles Jetty 6.x which does not drain unconsumed
request bodies before sending error responses (e.g. 404) on persistent
connections, violating HTTP/1.1 requirements. This leaves stale request
bytes that corrupt the next request on a reused connection.

The resolveFileInternal change in getParent() removed an extra PROPFIND
request to the parent directory that used to happen between the child's
PROPFIND (404) and the child's PUT. That extra request happened to flush
the stale bytes from the connection. Without it, the server reads the
112-byte PROPFIND XML request body as the PUT file content.

Fix: add Connection: close to the test HttpClient configuration so
each request uses a fresh connection.

Jetty 9.4+ handles this correctly (see jetty#651, jetty#4117, jetty#6168).
Jackrabbit 2 tests pass without changes. No production code modified.
@ilang
Copy link
Copy Markdown
Contributor Author

ilang commented Apr 9, 2026

This PR now includes a fix for the Jackrabbit 1.x WebDAV test failure (testFileCreate: expected 0 but was 112) that was caused by the getParent() change.

What happened: The embedded Jackrabbit 1.x test server bundles Jetty 6.x, which does not drain unconsumed request bodies on 404 responses. When VFS sends a PROPFIND (112-byte XML body) to check if a file exists and gets a 404, then reuses the same persistent connection for a PUT to create the file, Jetty reads the stale PROPFIND bytes as the PUT body — storing the XML as file content.

Before this PR, getParent() used resolveFile() which triggered an extra PROPFIND to the parent directory. That extra request happened to flush the stale bytes from the connection, masking the server bug. With resolveFileInternal, the PUT follows the 404 immediately, exposing it.

This is a known class of Jetty issue — jetty#651, jetty#4117, jetty#6168. Fixed in Jetty 9.4+.

Fix: Add Connection: close to the Jackrabbit 1.x test HttpClient configuration (second commit). No production code changes. Jackrabbit 2 tests already pass without any workaround.

@ilang
Copy link
Copy Markdown
Contributor Author

ilang commented Apr 9, 2026

Thanks @garydgregory , see my comment above, it is a bug in test (or actually the old jackrabbit that the test uses) not in the code, I create a fix to bypass it.

Copy link
Copy Markdown
Member

@garydgregory garydgregory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @ilang

Thank you for your updates.

If I apply the test side of the patch:

  • FtpGetChildrenListCommandTest fails, which is good
  • 🔴 SftpGetChildrenListCommandTest passes:. This means the tests likely don't test what you think they test. Tests for a fix should fail when the main side of the patch is not applied. Otherwise, we might have a regression in the future.

Thank you!

The SFTP test cannot distinguish the fix from the original ON_RESOLVE
behavior because SFTP's doListChildrenResolved() pushes metadata to
each child via setStat() immediately after resolution. The ON_RESOLVE
refresh clears attrs, but setStat() repopulates it right after — same
server traffic either way. The optimization is purely client-side.

The FTP test remains and definitively proves the fix (82 LISTs without
the fix, 1 LIST with).
@ilang
Copy link
Copy Markdown
Contributor Author

ilang commented Apr 10, 2026

You're right about the SFTP test, I've removed it.

The SFTP code change (resolveFileInternal in doListChildrenResolved() + injectType(null) after setStat()) applies the same resolveFileInternal principle consistently, but has no server-side observable difference. SFTP's doListChildrenResolved() pushes metadata to each child via setStat() immediately after resolution, so the ON_RESOLVE refresh is redundant client-side work, clearing attrs that is immediately repopulated. Unlike FTP, where the cascade causes real network LIST commands, the SFTP optimization is purely client-side and cannot be distinguished by a server-side test.

If you prefer, I can drop the SFTP changes from this PR, it is just a small optimization which is hard to test.

@garydgregory
Copy link
Copy Markdown
Member

Hello @ilang

Thank you for the update. I don't see the need for busy work and creating another PR for what's already here.

I'll review again tomorrow.

@garydgregory garydgregory merged commit ed12c66 into apache:master Apr 11, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants