Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: hard to use DirFS for cross-platform root filesystem #44279

Open
josharian opened this issue Feb 16, 2021 · 20 comments
Open

os: hard to use DirFS for cross-platform root filesystem #44279

josharian opened this issue Feb 16, 2021 · 20 comments
Labels
NeedsInvestigation
Milestone

Comments

@josharian
Copy link
Contributor

@josharian josharian commented Feb 16, 2021

I wrote a little package called cdup that walks a directory and each of its ancestors looking for a file/dir. This is useful e.g. to find the root of a git dir (look for .git) or a Go module (look for go.mod).

I aimed to write it using fs.FS and use os.DirFS as a bridge to the OS filesystem as needed, perhaps inside a wrapper function.

I may just be holding it wrong, but I encountered a few surprising bumps.

A typical invocation is cdup.Find(pwd, ".git"). The obvious but wrong translation of this is cdup.FindIn(os.DirFS("/"), pwd, ".git").

What is the correct root path to pass to os.DirFS on Windows? filepath.VolumeName seems like the right API to use. (I think?)

Then we must convert pwd to a relative path using filepath.Rel. On Windows, we also need to call filepath.ToSlash.

So I believe that the correct code is something like:

root := "/"
if vol := filepath.VolumeName(dir); vol != "" {
  root = vol
}
rel := filepath.Rel(root, dir)
rel = filepath.ToSlash(rel)
cdup.FindIn(os.DirFS(root), rel, ".git")

This is a surprising amount of non-obvious work for what seems like it should be a trivial bridge between fs.FS and the OS filesystem.

I'm not sure exactly what the fix should be here. Docs/examples? New package filepath API to get the OS fs root for a path? New package os API to construct a root filesystem (and expose the root for use with filepath.Rel)?

@dmitshur
Copy link
Contributor

@dmitshur dmitshur commented Feb 16, 2021

CC @rsc.

@dmitshur dmitshur added the NeedsInvestigation label Feb 16, 2021
@dmitshur dmitshur added this to the Backlog milestone Feb 16, 2021
@bcmills
Copy link
Member

@bcmills bcmills commented Feb 16, 2021

What is the correct root path to pass to os.DirFS on Windows? filepath.VolumeName seems like the right API to use. (I think?)

IIUC there is no single root path on Windows: each drive letter is its own root.

Perhaps you could implement an fs.FS that identifies all mapped (or all local) drive letters and lists those as virtual subdirectories of the root?

@josharian
Copy link
Contributor Author

@josharian josharian commented Feb 16, 2021

At least for my purposes, I don't need a single root. I only need the root corresponding to a particular path, since I'm only walking up the tree. The broader question is interesting too, though.

(Ideally, for my purposes, I'd like to stop at filesystem boundaries, instead of traversing through mount points. But that's out of scope—the point of this issue is that it's hard to do something that feels like it should be easy. It is reasonable for detecting and stopping at mount points to be hard. 😁)

@jwalton
Copy link

@jwalton jwalton commented Oct 31, 2021

I think the real problem here is, if you want to write an application that can open any file on the filesystem, then on Linux/Mac you can do:

files := os.DirFS("/")

but on Windows, this will make your application unusable:

files := os.DirFS("/")
dir, err := fs.ReadDir(files, "c:/Windows/win.ini") // Error open c:/Windows/win.ini: invalid argument
dir, err := fs.ReadDir(files, "Windows/win.ini") // Error open //Windows/win.ini: The network path was not found

"The network path was not found" is especially weird, as it suggests that fs.ReadDir(files, "sambahost/foo/bar") will work, which is probably not what anyone expects.

On Windows, you need to work out which volume the file is on and modify the path before you can even call DirFS. If I have a function doTheThing(files fs.FS, path string), and I want to call it correctly in a cross-platform way for an arbitrary file, I need all of that boilerplate from @josharian's example every time I call this function.

I suspect what this means is that many Go app authors will forget about the Windows case and do files := os.DirFS("/"), and the app will just fail to run correctly on Windows machines. It seems like os.DirFS("") on Windows is roughly equivalent to os.DirFS(filepath.VolumeName(cwd)), so if you do os.DirFS("") it'll have a chance of working on Windows machines where CWD and the file you want to open are on the same volume, although you'ld still have to strip the volume from the absolute filename.

As a library author, the obivous solution here is to write your function as DoTheThing(path string), and do these gymnastics inside your library... but at that point there seems little benefit to using fs.FS in the first place. There's no way to pass a different fs.FS implementation to DoTheThing(), so you can't pass in a fake testfs instance anyways. At that point you may as well just use os.Open().

It would be really nice if this:

files := os.DirFS("")
file, err := files.Open('d:/foo/bar.txt')

did what most people probably expect it to do. Then you'd still need to call filepath.ToSlash(), but at least you could open an arbitrary file without writing a short novel every time, but we probably don't want to change the existing behavior. Some app somewhere probably relies on that...

@jwalton
Copy link

@jwalton jwalton commented Oct 31, 2021

Maybe we could add a os.SystemFS() that behaves like this:

files := os.SystemFS()

// On Windows
file, err := files.Open('d:/foo/bar.txt')
file, err := files.Open('//networkShare/foo/bar')

On Linux
file, err := files.Open('/foo/bar')

@jwalton
Copy link

@jwalton jwalton commented Oct 31, 2021

Actually - one more possible solution. On Windows, we could make it so os.DirFS("/") returns a FS instance that accepts "d:/foo/bar.txt" or "//networkShare/foo/bar":

files := os.DirFS("/")

// On Windows
file, err := files.Open('d:/foo/bar.txt')
file, err := files.Open('//networkShare/foo/bar')

On Linux
file, err := files.Open('/foo/bar')

This changes the behavior of os.DirFS(), but since os.DirFS("/") currently returns a more-or-less unusable FS instance on Windows it's unlikely anyone relies on it right now. Also people seem to think this should work anyways, so this fixes apps that are likely already broken.

@qmuntal
Copy link
Contributor

@qmuntal qmuntal commented Oct 31, 2021

I've also been hit by this issue in production code, and after several attempts to fix it I just revert the code to use good old filepath+os.

I would be happy with a safe cross-platform example, but a new os API will be also welcomed.

@bcmills
Copy link
Member

@bcmills bcmills commented Nov 1, 2021

file, err := files.Open('/foo/bar')

Note that io/fs.FS.Open supports only unrooted paths (see https://pkg.go.dev/io/fs#ValidPath), so /foo/bar is not valid on Linux today.

@jwalton
Copy link

@jwalton jwalton commented Nov 1, 2021

Ah, my bad:

files := os.DirFS("/")

// On Windows
file, err := files.Open('d:/foo/bar.txt')
file, err := files.Open('networkShare/foo/bar')

On Linux
file, err := files.Open('foo/bar')

This should be a reatively easy and safe change to make, since two out of three of these examples already work, and the third today returns an error, and it also seems pretty clear that this was probably what the caller intended.

@bcmills
Copy link
Member

@bcmills bcmills commented Nov 1, 2021

I don't think os.DirFS("/") itself should support UNC network shares — the equivalent behavior on Unix filesystems does not enumerate network shares unless the root filesystem itself includes mounts of all of those shares (which seems unlikely).

I think the less surprising behavior would be a filesystem that resembles the WSL2 /mnt directory, with a relative directory (without a colon) for each mapped drive letter:

files := os.DirFS("/")
file, err := files.Open("d/foo/bar.txt")

@bcmills
Copy link
Member

@bcmills bcmills commented Nov 1, 2021

(Or, perhaps it should support network shares but not list them in ReadDir? What does the WSL /mnt do in that case?)

@jwalton
Copy link

@jwalton jwalton commented Nov 1, 2021

Unfortunately os.DirFS("/") supports network shares right now on Windows, in exactly the way I described above. I agree this was a surprising thing. Removing it now would be a breaking change, though.

files.Open("d/foo/bar.txt") sounds reasonable enough, but if we want to preserve the existing behavior with network shares, with "d/foo/bar.txt" how do you differentiate between "D:" the volume and "\\d" the bizzarely-named samba server?

Second, imagine you're writing an application that prints the contents of a file (essentially cat). You want someone to be able to pass any arbitrary file: cat /etc/passwd, cat c:\windows\win.ini, cat \\share\stuff.txt. If you handle network shares as we do today, and handle d:/foo/bar.txt, then when you want to want to write a program that can handle all of these cases, you get something like:

func catFile(path string) error {
	files := os.DirFS("/")
	buf := make([]byte, 4096)

	slashPath := filepath.ToSlash(path)
        // Remove the leading slashes.
	for len(slashPath > 0) && slashPath[0] == '/' {
		slashPath = slashPath[1:]
	}

	file, err := files.Open(slashPath)
	if err != nil {
		return err
	}

	_, err = file.Read(buf)
	if err != nil {
		return err
	}

	fmt.Println(string(buf))
	return nil
}

If the file was d/foo/bar.txt instead of d:/foo/bar/txt, then in the above example it adds a lot of complexity because we have to detect and strip out that ":" in the Windows case. Already in the above we have to strip one or two leading /s, which is already more work than you really want to do in order to just open a file. The more required boilerplate there is around this, the more likely it won't be handled correctly. (Maybe we could solve the boilerplate issue with some kind of fs.ResolvePath(files FS, absolutePath string) (relativeSlashPath string, err error), or something along those lines?)

@bcmills
Copy link
Member

@bcmills bcmills commented Nov 1, 2021

Unfortunately os.DirFS("/") supports network shares right now on Windows, in exactly the way I described above. I agree this was a surprising thing. Removing it now would be a breaking change, though.

The Go compatibility policy explicitly allows for fixing “specification errors” and “bugs”. Given the bizarre current behavior and the newness of os.DirFS, I would argue that the current behavior is pretty clearly a bug. 😅

@bcmills
Copy link
Member

@bcmills bcmills commented Nov 1, 2021

Second, imagine you're writing an application that prints the contents of a file (essentially cat). You want someone to be able to pass any arbitrary file: cat /etc/passwd, cat c:\windows\win.ini, cat \\share\stuff.txt.

If you're doing that, then you already need a layer to convert absolute paths to rootless ones in order to use DirFS. At that point, I think you may as well use DirFS with filepath.VolumeName instead of trying to munge the whole thing into one root filesystem:

func catFile(absPath string) error {
	root := "/"
	if volume := filepath.VolumeName(path); volume != "" {
		root = volume
		path = path[len(volume):]
	}
	fsys, err := os.DirFS("/")
	…
}

Or, if you (for example) want to use a passed-in fs.FS, you need a conversion from absolute to rootless paths anyway:

func toRootless(absFilePath string) (string, error) {
	volume := filepath.VolumeName(absFilePath)

	slashPath := filepath.ToSlash(absFilePath[len(volume):])
	if !strings.HasPrefix(slashPath, "/") {
		// Path is not absolute.
		return "", fmt.Errorf(…)
	}
	relPath := slashPath[1:]

	switch {
	case volume == "":
		return relPath
	case len(volume) == 2 && volume[1] == ':':
		return path.Join(string.ToLower(volume[:1]), relPath)
	default:  // UNC path
		// Encode the UNC path as a DirFS path.
		…
	}
}

In that case, it would suffice to transform the UNC path to anything that doesn't collide with a drive letter — perhaps network or share?

	f, err := fsys.Open("c/windows/win.ini")
	f, err := fsys.Open("share/networkShare/foo/bar.txt")

@aldas
Copy link

@aldas aldas commented Jan 10, 2022

file, err := files.Open('/foo/bar')

Note that io/fs.FS.Open supports only unrooted paths (see https://pkg.go.dev/io/fs#ValidPath), so /foo/bar is not valid on Linux today.

@bcmills why is os.dirFS.Open and fs.Sub() treating \ differently on Windows?

go/src/os/file.go

Lines 643 to 647 in 1f411e9

func (dir dirFS) Open(name string) (fs.File, error) {
if !fs.ValidPath(name) || runtime.GOOS == "windows" && containsAny(name, `\:`) {
return nil, &PathError{Op: "open", Path: name, Err: ErrInvalid}
}
f, err := Open(string(dir) + "/" + name)

go/src/io/fs/sub.go

Lines 34 to 38 in 1f411e9

func Sub(fsys FS, dir string) (FS, error) {
if !ValidPath(dir) {
return nil, &PathError{Op: "sub", Path: dir, Err: errors.New("invalid name")}
}
if dir == "." {

This means that this fs.Sub(currentFs, filepath.Clean("../assets"))

  • returns an error on Linux
  • passes on Windows (path is converted to ..\assets and slips trough ValidPath check)

but for os.Open (nonWindows(Linux for me)/Windows) this path is invalid.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jan 10, 2022

@aldas See #44166 for background.

@aldas
Copy link

@aldas aldas commented Jan 10, 2022

It is quite confusing or not that obvious as fs.Sub method doc-comments does not say it only operates with /. You would think filepath.Clean would be enough but actually you should chain ToSlash with Clean to get consistent behavior. This is feeling coming from os.Open world to fs.FS world. And even with that you are probably doing it wrong for some cases.

This was example I played around to understand:

func TestFsSubWithOpen(t *testing.T) {
	osFS := os.DirFS("_fixture")
	subPath := "../_fixture" // this depends on user. It could be given with slashes or backslashes
	//subFS, err := fs.Sub(osFS, filepath.ToSlash(filepath.Clean(subPath))) // <-- consistent behavior, at least for my use-case
	subFS, err := fs.Sub(osFS, filepath.Clean(subPath)) // result depends on OS even if you would think Clean helps you with separators and ../.
	if err != nil {
		t.Fatal(err) // nonWindows ends here
	}
	f, err := subFS.Open("index.html") // ./_fixture/index.html is actually existing file
	if err != nil {
		t.Fatal(err) // Windows ends here. Because eventually dirFs.Open tries to open `..\_fixture/index.html` and rejects `\`
	}
	f.Close()
}

is it confusing then you are doing fs.Sub at one corner of your application and some cases you are stopped there (which is ok) but on other OS same stuff fails at another corner (the place you are doing Open on that subfs)

thanks @ianlancetaylor

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jan 11, 2022

I suppose I would say that it doesn't really make sense to pass the result of filepath.Clean to fs.Sub. filepath.Clean operates on file-system paths. fs.Sub does not. I agree that it's an easy mistake to make.

@sewera
Copy link

@sewera sewera commented Jun 19, 2022

Here is my take (workaround maybe?), which is stable now, and I think will be stable in the future.

With this approach, we get a standard os implementation of Open, with a correct root directory.

type OsFilesystem struct{}

func (f *OsFilesystem) Open(name string) (fs.File, error) {
	return os.Open(name)
}

// *OsFilesystem implements fs.FS
var _ fs.FS = new(OsFilesystem)


func TestOsFilesystem(t *testing.T) {
	// given
	openFile := func(fsys fs.FS, path string) error {
		_, err := fsys.Open(path)
		return err
	}

	fsys := new(OsFilesystem)

	t.Run("exists", func(t *testing.T) {
		// given
		path := "/tmp/existent.file"

		// when
		err := openFile(fsys, path)

		// then
		assert.NoError(t, err)
	})

	t.Run("does not exist", func(t *testing.T) {
		// given
		path := "/tmp/nonexistent.file"

		// when
		err := openFile(fsys, path)

		// then
		assert.Error(t, err)
		assert.True(t, errors.Is(err, fs.ErrNotExist))
	})
}

@bcmills
Copy link
Member

@bcmills bcmills commented Jun 22, 2022

@sewera, the use of os.Open makes that implementation no longer cross-platform: the argument to io/fs.FS.Open is a slash-separated unrooted path, not a platform-native filesystem path. (That implementation does not work for volume names or drive letters on Windows, such as C:\ or \\server\sharename\.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation
Projects
None yet
Development

No branches or pull requests

8 participants