Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Robustify the code for NFS weirdness #9

Merged
merged 3 commits into from Oct 9, 2023
Merged

Robustify the code for NFS weirdness #9

merged 3 commits into from Oct 9, 2023

Conversation

pkhuong
Copy link
Owner

@pkhuong pkhuong commented Sep 9, 2023

The write side actually worked fine (not deliberately). We also expect to see stale file handles when reaping unused files from overfull caches. Treat that like ENOENT.

Like the comment says, we want to support non-fsynced NFS-backed
caches.  They make a lot of sense when the NFS server is a reliable
distributed service (e.g., FileStore on GCP or EFS on AWS), especially
when the data also lives somewhere else (e.g., object storage).

In such configurations, the chmod to mark files read-only before
we make them visible in the cache directory also happens to flush
any write buffered in the client: since filehandles are stateless
inodes, the write requests could be rejected after the chmod (an
interesting deviation from POSIX file semantics...)
@coveralls
Copy link

coveralls commented Sep 10, 2023

Coverage Status

coverage: 98.889% (+0.02%) from 98.869% when pulling 79624ab on pkhuong/nfs into 0e7648b on main.

Most cases where we check for ENOENT, we want to bubble that up like
a non-erroneous cache miss. ESTALE means we thought there was something
there, but that handle (inode) has since disappeared, either because
the file was unlinked or because it was replaced with another.

Either way, it seems simplest to treat such stale filehandles like
missing files.

We don't expect this to happen a lot, except for cleanups: files in the
kismet cache should mostly be write-once (content addressed), until the
cache grows too large and old unused files are reaped to make room. In
that situation, a stale filehandle means the file was deleted, and it's
fine to treat that like ENOENT.

Alternatively, one could also flush the client-side filehandle cache
and retry, but that's more work for marginal benefits.
Copy link
Collaborator

@ceberly ceberly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. The additional comments are very interesting, I don't have much to say about them other than that. Refactoring as benign_error makes a lot of sense..

@pkhuong pkhuong merged commit b3091a1 into main Oct 9, 2023
2 checks passed
@pkhuong pkhuong deleted the pkhuong/nfs branch October 9, 2023 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants