Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mounting functionality for latest versions of macOS using NFS #7254

Merged
merged 6 commits into from Oct 6, 2023

Conversation

salehqt
Copy link

@salehqt salehqt commented Aug 28, 2023

What is the purpose of this change?

To provide the mount command on macOS in cases where FUSE kernel extensions are not available (e.g. enterprise machines)

Was the change discussed in an issue or in the forum before?

Checklist

  • I have read the contribution guidelines.
  • I have added tests for all changes in this PR if appropriate.
  • I have added documentation for the changes if appropriate.
  • All commit messages are in house style.
  • I'm done, this Pull Request is ready for review :-)

Please provide guidance on testing this feature. I am not very familiar with testing in rclone. I manually tested the changes and documented the testing in each commit.

Copy link
Member

@ncw ncw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great submission - thank you!

I put some initial comments inline.

It needs some more testing though. If you copy (adjust as needed) cmd/mount/mount_test.go into cmd/nfsmount can you get go test -v to work?

Ideally we'd run some tests on rclone serve nfs directly too. I think they would end up effectively being the same as the cmd/nfsmount tests unless we implement an nfs backend for rclone.

Can you rebase this on master please as I just fixed the linter!

I'll run the CI for this so we can see what happens.

cmd/serve/nfs/filesystem.go Show resolved Hide resolved
cmd/serve/nfs/filesystem.go Show resolved Hide resolved
"github.com/rclone/rclone/vfs"
)

type ROFS struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to read and write so ROFS probably isn't a good name any more! I'd suggest FS

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed it to FS

func (f *ROFS) TempFile(dir, prefix string) (billy.File, error) {
return nil, os.ErrInvalid
}
func (f *ROFS) MkdirAll(filename string, perm os.FileMode) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a vfs.MkdirAll BTW

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redirected to MkdirAll instead.

opt.ListenAddr = "localhost:"
}
// NFS doesn't work without CacheModeFull
vfs.SetCacheMode(vfscommon.CacheModeFull)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally the servers expose the full set of VFS options - check out cmd/serve/sftp to see how that is done.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove this. It turned out the problem was with the backend I was testing againt.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it manually. somehow if the vfs cache is not set to full it doesn't work at all.

}

func NewServer(ctx context.Context, vfs *vfs.VFS, opt *Options) (s *server, err error) {
// Our NFS server doesn't have any authentication, we run it on localhost by default
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it have authentication?

Copy link
Author

@salehqt salehqt Aug 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know, NFS before v4 didn't have any reasonable authentication. NFSv4 introduces Kerberos but it is way too complicated and requires enterprise infrastructure. 1

go-nfs only implements NFSv3

)

func init() {
mountlib.NewMountCommand("mount", false, mount)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make this available as nfsmount when it isn't being aliased to mount. Check out cmd/cmount for how it does it.

cmd/nfsmount/nfsmount.go Outdated Show resolved Hide resolved
cmd/nfsmount/nfsmount.go Show resolved Hide resolved
@salehqt
Copy link
Author

salehqt commented Aug 30, 2023

I tried adding a test, but it failed because there is no way to stop the NFS server.

Here is the log:
https://gist.github.com/salehqt/d1dc7fb04094d514548c433062cddcfe

Here is the forever loop that has no escape:
https://github.com/willscott/go-nfs/blob/master/server.go#L61C1-L81C3

@ncw
Copy link
Member

ncw commented Aug 30, 2023

I tried adding a test, but it failed because there is no way to stop the NFS server.

Here is the log: https://gist.github.com/salehqt/d1dc7fb04094d514548c433062cddcfe

This needs to be handled by your unmount function

	unmount = func() error {
		s.Shutdown()
		VFS.Shutdown()
		if runtime.GOOS == "darwin" {
			return exec.Command("diskutil", "umount", "force", mountpoint).Run()
		} else {
			return exec.Command("umount", "-f", mountpoint).Run()
		}
	}

I think you need to do the exec.Command before the Shutdown otherwise you are in danger of deadlock. (I see cmd/mount doesn't do this which probably needs fixing).

So something like

	unmount = func() error {
                var err error
		if runtime.GOOS == "darwin" {
			err =  exec.Command("diskutil", "umount", "force", mountpoint).Run()
		} else {
			err = exec.Command("umount", "-f", mountpoint).Run()
		}
		s.Shutdown()
		VFS.Shutdown()
               return err
	}

@ncw
Copy link
Member

ncw commented Aug 30, 2023

It looks like go-nfs doesn't compile on these architectures so these need excluding with build tags

  • js/wasm
  • plan9/386
  • plan9/amd64

There are some lint-y things to fix mostly to do with Exported methods needing comments. You can add a comment or unexport the method by making it small initial letter.

exported: exported method RWFileHandle.Unlock should have comment or be unexported (revive)

These ones should be easy to fix too

  Warning: receiver-naming: receiver name fh should be consistent with previous receiver name f for RWFileHandle (revive)

@salehqt
Copy link
Author

salehqt commented Aug 30, 2023

It looks like there is still a lot of work left on this PR. Aside from fixes for the CI, I managed to run the tests and none of them passed. I tried the actions in the test manually and I can see that they all error out. This requires debugging every single test case and figure out why NFS doesn't behave properly.

In my usage, I only used NFS with CacheModeFull and mostly read operations. I knew at the time that it couldn't handle the entire matrix of operations but it was good enough for my use. I could navigate the remote in Finder and quickly look at all the files.

I have made my code public. So if someone wants to take this and put it over the finish line, they are more than welcome. Otherwise I will slowly work on this PR in the next couple of weeks.

@ncw
Copy link
Member

ncw commented Aug 31, 2023

It looks like there is still a lot of work left on this PR. Aside from fixes for the CI, I managed to run the tests and none of them passed.

:-(

It might be that we need to force CacheModeFull for this and I don't think the tests deal with that at the moment.

If you comment this can you get more of the tests to pass?

--- a/vfs/vfstest/fs.go
+++ b/vfs/vfstest/fs.go
@@ -52,11 +52,11 @@ func RunTests(t *testing.T, useVFS bool, mountFn mountlib.MountFn) {
 		cacheMode vfscommon.CacheMode
 		writeBack time.Duration
 	}{
-		{cacheMode: vfscommon.CacheModeOff},
-		{cacheMode: vfscommon.CacheModeMinimal},
-		{cacheMode: vfscommon.CacheModeWrites},
+		// {cacheMode: vfscommon.CacheModeOff},
+		// {cacheMode: vfscommon.CacheModeMinimal},
+		// {cacheMode: vfscommon.CacheModeWrites},
 		{cacheMode: vfscommon.CacheModeFull},
-		{cacheMode: vfscommon.CacheModeFull, writeBack: 100 * time.Millisecond},
+		// {cacheMode: vfscommon.CacheModeFull, writeBack: 100 * time.Millisecond},
 	}
 	for _, test := range tests {
 		vfsOpt := vfsflags.Opt

I have made my code public. So if someone wants to take this and put it over the finish line, they are more than welcome. Otherwise I will slowly work on this PR in the next couple of weeks.

No hurry! I'll put this on the v1.65 milestone which is in about 2 months probably which will give lots of time for fixing.

@ncw ncw added this to the v1.65 milestone Aug 31, 2023
@ncw
Copy link
Member

ncw commented Sep 11, 2023

I don't know why GitHub isn't trying to run the CI on this... Maybe rebase on master, fix the conflicts in go.mod/go.sum and force push to try again?

Copy link
Member

@ncw ncw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking very good now :-)

How do the tests run? Do they run clean now?

At some point I'd like you to clean up the history...

maybe into the following commits

  • commit for adding new parameter to vfs.RunTests
  • Lock additions to vfs
  • commit for adding nfs serve
  • commit for adding nfsmount

cmd/nfsmount/nfsmount.go Show resolved Hide resolved
vfs/vfstest/fs.go Outdated Show resolved Hide resolved
vfs/vfstest/fs.go Outdated Show resolved Hide resolved
vfs/vfstest/fs.go Show resolved Hide resolved
vfs/vfstest/fs.go Show resolved Hide resolved
@salehqt
Copy link
Author

salehqt commented Sep 20, 2023

I have been debugging at this and pushing fixes. But I didn't get the chance to comment here. I have made all the tests pass except for TestDirCacheFlush.

There is a lot to unpack here:

  1. I had to fix at least one proper bug in VFS module, with that fixed, most of the tests pass 7654473
  2. VFS is quite inconsistent when it comes to Unix modes on files. It accepts any mode on Create/Mkdir, but completely ignores it and always returns a fixed mode on stat. While create/mkdir takes any mode, chmod always fails. This is a problem since NFS requires a certain level of consistency. In order to achieve this consintency in the tests, I had to modify the modes in the tests to match what VFS returns for created files and directories (e.g. 0644 and 0755)
  3. Many write operations in VFS require a cache. see write.go. Hence I added the minimal required cache option and made NFS server read-only without a cache.
  4. Directory caching has multiple problems and I have only managed to make the test pass with a hack:
    1. There is caching at the NFS client side, which can be disabled for the test
    2. NFS relies on timestamps for directories to figure out if it needs to re-read the directory. Not all backends support consistent timestamp on directories. I couldn't even make it pass on local backend, which should have consistent timestamps.

I couldn't figure out how attach a debugger to the test because it spawns a separate go process for the server. I even tried printing the PID and using attach, but got other issues. I had to resort to printf debugging which is really slow.

For debugging TestDirCacheFlush, I have attached the log here: TestDirCacheFlush test log. This has all my debug prints to trace what is happening. Basically, NFS doesn't re-read the directory because its timestamp doesn't change when it is invalidated. I might spend more time on this part later. Any help would be appreciated.

Once I have made all the tests pass, I am planning to clean-up history. With a commit specifically for fixing VFS bugs.

@salehqt
Copy link
Author

salehqt commented Sep 20, 2023

This hack makes the test pass:
Screenshot 2023-09-20 at 2 01 59 PM

@ncw
Copy link
Member

ncw commented Sep 20, 2023

It sounds like you've been doing great work :-)

I have been debugging at this and pushing fixes. But I didn't get the chance to comment here. I have made all the tests pass except for TestDirCacheFlush.

Well done :-)

  1. I had to fix at least one proper bug in VFS module, with that fixed, most of the tests pass 7654473

Ah. You should probably make a test for that if you can.

  1. VFS is quite inconsistent when it comes to Unix modes on files. It accepts any mode on Create/Mkdir, but completely ignores it and always returns a fixed mode on stat. While create/mkdir takes any mode, chmod always fails. This is a problem since NFS requires a certain level of consistency. In order to achieve this consintency in the tests, I had to modify the modes in the tests to match what VFS returns for created files and directories (e.g. 0644 and 0755)

Yes VFS is very sloppy with permissions!

  1. Many write operations in VFS require a cache. see write.go. Hence I added the minimal required cache option and made NFS server read-only without a cache.

OK

  1. Directory caching has multiple problems and I have only managed to make the test pass with a hack:

    1. There is caching at the NFS client side, which can be disabled for the test
    2. NFS relies on timestamps for directories to figure out if it needs to re-read the directory. Not all backends support consistent timestamp on directories. I couldn't even make it pass on local backend, which should have consistent timestamps.

This might be a VFS bug I guess. Note the recent change 071c3f2 (which I don't know if you have in your tree) that might help too.

I couldn't figure out how attach a debugger to the test because it spawns a separate go process for the server. I even tried printing the PID and using attach, but got other issues. I had to resort to printf debugging which is really slow.

Yes we have to run the mount in a separate process otherwise the Go runtime deadlocks. It took me a very long time to work that out!

For debugging TestDirCacheFlush, I have attached the log here: TestDirCacheFlush test log. This has all my debug prints to trace what is happening. Basically, NFS doesn't re-read the directory because its timestamp doesn't change when it is invalidated. I might spend more time on this part later. Any help would be appreciated.

Is this fixed with your hack? It looks reasonable.

Once I have made all the tests pass, I am planning to clean-up history. With a commit specifically for fixing VFS bugs.

Great!

Give me a ping when you are ready and I'll give it another review.

@salehqt
Copy link
Author

salehqt commented Sep 24, 2023

Update on TestDirCacheFlush: it is not fixable.

NFS has multiple levels of caching and this structure is not compatible with the expectations of the test. NFS uses ctime of directories to figure out if it needs to list them again. But the way that VFS handles ctime is very loose. and in general most backends don't report correct ctime anyways.

I fixed a ctime bug in VFS/dir.go in the latest commit, with the assumption that the backend has consistent ctimes (in this case it does). But it is still not compatible with expectation of the test. ctime of directory get updated when the parent is invalidated. This means that the dir is re-read when otherdir is invalidated. breaking the test.

My proposal is to remove this test. Because in general, filesystem implementations have their own cache and this test doesn't send cache invalidation commands to the filesystem. Otherwise we can exclude this test for NFS.

@ncw
Copy link
Member

ncw commented Sep 25, 2023

Update on TestDirCacheFlush: it is not fixable.

Great effort in trying to get it to work :-)

I suggest we detect the NFS file system in the test and skip it. You'll probably have to do that by name or something untidy like that!

	if run.fremoteName == "nfswhatever" {
		t.Skip("Skipping test on NFS")
	}

cmd/serve/nfs/handler.go Show resolved Hide resolved
cmd/serve/nfs/handler.go Show resolved Hide resolved
cmd/serve/nfs/filesystem.go Show resolved Hide resolved
cmd/serve/nfs/handler.go Show resolved Hide resolved
cmd/serve/nfs/server.go Outdated Show resolved Hide resolved
@lucasew
Copy link

lucasew commented Sep 29, 2023

I'm really hyped for rclone to be able to serve NFS as it may be a little faster than FUSE because less kernel back and forth on Linux. The last time I was really hyped about a rclone feature was when I dropped UMS and minidlnad because rclone got a native, simple, straightforward and overpowered DLNA server.

I am not an expert in the codebase at all so far but I hope I am helping in some way.

@ncw
Copy link
Member

ncw commented Oct 3, 2023

@salehqt how are you doing with this? Let me know when you want another review. I'd like to merge this for v1.65. (This needs a rebase too).

Can you tidy the history also? Squash into a small number of commits?

Thank you

@salehqt
Copy link
Author

salehqt commented Oct 3, 2023

@ncw, I am now working on the documentation for both serve nfs and nfsmount. Once I have done that. I will rebase and clean-up history.

@salehqt
Copy link
Author

salehqt commented Oct 3, 2023

@lucasew thanks for taking the time to review. There are a couple of features that are beyond this PR and you could help with:

  1. Build and test serve nfs on Windows
  2. Add nfs as backend using NFS client module in willscot/go-nfs package. This would be very useful for testing serve nfs.
  3. Create unit tests for serve nfs by using an NFS client library.

@salehqt
Copy link
Author

salehqt commented Oct 4, 2023

@ncw , this PR is ready for review now. I did my best in the documentation, please let me know if you want the wording changed.

I also cleaned up the history into:

  • three diffs for modifications to VFS
  • two diffs for adding serve nfs and nfsmount
  • one diff for modifying the documentation of mount feature

Saleh Dindar added 3 commits October 4, 2023 11:57
…Handle

Name() method was originally left out and defaulted to the base
class which always returns empty. This trigerred incorrect behavior
in serve nfs where it relied on the Name() of the interafce to figure
out what file it was modifying.

This method is copied from RWFileHandle struct.

Added extra assert in the tests.
A subtle bug where dir modification time is not updated when the dir already exists
in the cache. It is only noticeable when some clients use dir modification time to
invalidate cache.
…y.File

billy defines a common file system interface that is used in multiple go packages.
vfs.Handle implements billy.File mostly, only two methods needed to be added to
make it compliant.

An interface check is added as well.

This is a preliminary work for adding serve nfs command.
@salehqt
Copy link
Author

salehqt commented Oct 4, 2023

I ran the workflow on my own repo and it is passing: https://github.com/salehqt/rclone/actions/runs/6410303444

Summary:
Adding a new command to serve any remote over NFS. This is only useful for new macOS versions where FUSE mounts are not available.
 * Added willscot/go-nfs dependency and updated go.mod and go.sum

Test Plan:
```
go run rclone.go serve nfs --http-url https://beta.rclone.org :http:
```

Test that it is serving correctly by mounting the NFS directory.

```
mkdir nfs-test
mount -oport=58654,mountport=58654 localhost: nfs-test
```

Then we can list the mounted directory to see it is working.
```
ls nfs-test
```
Saleh Dindar added 2 commits October 4, 2023 20:19
…ut FUSE

Summary:
In cases where cmount is not available in macOS, we alias nfsmount to mount command and transparently start the NFS server and mount it to the target dir.

The NFS server is started on localhost on a random port so it is reasonably secure.

Test Plan:
```
go run rclone.go mount --http-url https://beta.rclone.org :http: nfs-test
```

Added mount tests:
```
go test ./cmd/nfsmount
```
Copy link
Member

@ncw ncw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very good.

Thank you for making a nice set of organised commits - lovely :-)

I will merge this now - thank you!

If you think of anything else then please send a follow up PR.

Do you want to announce this in on the rclone forum? It might be good to have a bit more testing before we release it.

@ncw ncw merged commit bcb3289 into rclone:master Oct 6, 2023
10 checks passed
@salehqt
Copy link
Author

salehqt commented Oct 6, 2023

@ncw , thanks for the reviews and merging the contribution. I am going to start testing on for my daily workload and try to see if I can find any bugs.

I will also make an announcement. I guess this is to let people know that this is available on master so they can start testing it before the full release.

@salehqt salehqt deleted the nfs branch October 6, 2023 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants