Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nfs4_op_readdir seems inefficient and possibly buggy in some cases #1034

Closed
drieber opened this issue Dec 6, 2023 · 11 comments
Closed

nfs4_op_readdir seems inefficient and possibly buggy in some cases #1034

drieber opened this issue Dec 6, 2023 · 11 comments

Comments

@drieber
Copy link
Contributor

drieber commented Dec 6, 2023

I am looking at how nfs-ganesha implements NFSv4 READDIR (and v4.1, I haven't looked at other protocols), specifically how ATTR_CHANGE of the dirents is requested, collected and returned.

I am seeing in wireshark and my own debug logging that the client is doing READDIR, requesting a bunch of attributes including ATTR_CHANGE, but my fsal's readdir function gets an attrmask that does NOT have ATTR_CHANGE. However, I see in wireshark the READDIR response has directory entries with all attributes, including ATTR_CHANGE. After debugging some more I found that nfs-ganesha is doing an extra getattrs call for each dirent inside a test_access call within nfs4_readdir_callback.

I am surprised things work this way. It seems wasteful (my fsal's readdir implementation already produces all required attributes). In my use-case I don't think there is any need to do anything special in test_access. In fact, I'm considering turning my FSAL's test_access function into a noop.

So I tried that (turning test_access into a noop), and with that I see in wireshark that now the READDIR response entries do NOT have ATTR_CHANGE.

At this point it is worth noting that nfs4_op_readdir does NOT pass the attrmask from the user's READDIR request to fsal_readdir. Instead it passes ATTRS_NFS3 plus possibly ATTR_ACL and/or ATTR4_SEC_LABEL. See:

attrmask = ATTRS_NFS3;
... This seems wrong.

Some additional notes I collected, including more details about my usecase:

I know the call path goes more or less like this: nfs4_op_readdir -> fsal_readdir -> readdir where that last readdir is really mdcache_readdir. There is a fork in the road at this point: if avl_chunk == 0 then we go through mdcache_readdir_uncached which essentially just calls our FSAL's readdir hook. if avl_chunk != 0 then we would go through mdcache_readdir_chunked which may call mdcache_populate_dir_chunk if there are cache misses. mdcache_populate_dir_chunk calls the sub-fsal's readdir with an attrmark that includes every supported attribute.

In my use-case we configure are export with Dir_Chunks = 0, so in my use-case we go through mdcache_readdir_uncached.

The mdcache_readdir_uncached path passes attrmask unchanged. Conclusion: my FSAL's readdir function gets the same attrmask value as what fsal_readdir received.

In summary:

Should nfs4_op_readdir pass the full attrmark from the request to fsal_readdir?
Is it ok to turn test_access into a noop? What are the things to keep in mind when making that decision?

Thanks!

@ffilz
Copy link
Member

ffilz commented Dec 7, 2023

OK, so first, mdcache_readdir_uncached should ask for all the attributes. Only asking for the requested ones is liable to result in attributes marked as valid in an mdcache entry that are missing some attributes.

Second, nfs4_op_readdir should either ask for all the supported attributes or ask for the ones actually requested, probably the ones actually requested as that will avoid causing extra attribute fetch for things like ACL if not requested and will adapt to any future change in mdcache of attribute validity granularity (currently only ACL and security label are separately managed).

@ffilz ffilz added the bug label Dec 7, 2023
@drieber
Copy link
Contributor Author

drieber commented Dec 7, 2023

What about the test_access calls? If readdir returns all attrs, why make a test_access call? Is it ok to override test_access and turn it into a noop?

@ffilz
Copy link
Member

ffilz commented Dec 7, 2023

FSALs can't override test_access as it only ever makes it to mdcache_test_access. It doesn't get passed down to the FSAL.

Also, the test_access call is making a permission check, it's not trying to cause attrs to be fetched.

If we do the right attr requests, where the critical bit is that uncached readdir isn't asking for all the attributes, then the test_access should not result in additional attribute fetch.

Note that we strongly discourage uncached readdir so fixing the bug in it is lower priority for us. If you want to submit a patch that would be great.

Also, it would be good for nfs4_op_readdir to ask for the right attributes as a second patch.

@drieber
Copy link
Contributor Author

drieber commented Dec 8, 2023

We configure nfs-ganesha like this:

MDCACHE {
  Dir_Chunk = 0;
}
...
EXPORT {
  Attr_Expiration_Time = 0;
  ...
  FSAL {
    name = DEVTOOLS_VFS;
  }
}

DEVTOOLS_VFS is our own fsal. With this configuration, is it still true that test_access will not result in additional attribute fetch?

I would love to experiment with MDCACHE caching enabled, but for that we need to be able to do invalidation synchronously.

@ffilz
Copy link
Member

ffilz commented Dec 11, 2023

Why would you need to do invalidation synchronously?

A 2nd client fetching attributes will always race with a 1st client that is doing something that changes the attributes.

@drieber
Copy link
Contributor Author

drieber commented Dec 11, 2023

At Google we have a virtual filesystem for accessing our very large source code repository to developer machines, build machines and so on. We are using NFS Ganesha and our custom FSAL for providing access to this virtual file system locally, over the loopback address, on macOS (because Apple is moving towards deprecating kernel extensions, and hence our fuse-based solution will stop working at some point).

Some things to point out:

  • We mount the filesystem as NFSv4 (macOS does not support NFSv4.1 or NFSv4.2)
  • The only client is the macOS kernel
  • We are not using delegations or file locking (we have them disabled)

When developers or tools do a version control checkout, they expect the state of the filesystem to be synchronously updated by the time the checkout returns.

Google’s internal version control tools are tightly integrated with the virtual filesystem to optimize performance.

@ffilz
Copy link
Member

ffilz commented Dec 12, 2023

Could you jump on IRC? What time-zone are you in? I think we might benefit from a live conversation. We could also do a Google Meet. Or you could join the community call Tuesday 7:00 AM Pacific Standard Time.

@drieber
Copy link
Contributor Author

drieber commented Dec 12, 2023

I sent you an invite to discuss this at 1pm.

I also sent https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/1173507 which should fix the core issue.

@ffilz
Copy link
Member

ffilz commented Dec 13, 2023

I was reviewing the mdcache upcalls after some conversation with @dang. Invalidate and update are synchronous. I think invalidate would be safe to call in the context of a thread that had made a FSAL call as long as you don't use FSAL_UP_INVALIDATE_CLOSE that triggers a file close. Update would be troublesome to call from a thread that made a FSAL call because it takes the attr_lock which may be held.

ffilz pushed a commit that referenced this issue Dec 22, 2023
See #1034

Change-Id: Idaa1a3c819a76996fc71e71e7c82e84f84372d2f
Signed-off-by: David Rieber <drieber@google.com>
@drieber
Copy link
Contributor Author

drieber commented Jan 16, 2024

I believe this can be closed now.

@ffilz
Copy link
Member

ffilz commented Jan 16, 2024

Patch submitted dc63fef

@ffilz ffilz closed this as completed Jan 16, 2024
ffilz pushed a commit to ffilz/nfs-ganesha that referenced this issue Mar 13, 2024
See nfs-ganesha#1034

Change-Id: Idaa1a3c819a76996fc71e71e7c82e84f84372d2f
Signed-off-by: David Rieber <drieber@google.com>
ffilz pushed a commit to ffilz/nfs-ganesha that referenced this issue May 14, 2024
See nfs-ganesha#1034

Change-Id: Idaa1a3c819a76996fc71e71e7c82e84f84372d2f
Signed-off-by: David Rieber <drieber@google.com>
ffilz pushed a commit to ffilz/nfs-ganesha that referenced this issue May 17, 2024
See nfs-ganesha#1034

Change-Id: Idaa1a3c819a76996fc71e71e7c82e84f84372d2f
Signed-off-by: David Rieber <drieber@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants