Skip to content

Conversation

@jakebailey
Copy link
Member

@jakebailey jakebailey commented Nov 15, 2024

This adds a filesystem implementation, wrapping io/fs.FS.

The filesystem works on "normalized paths", which in our parlance means "absolute paths, normalized to / slashes regardless of OS". These paths look like:

  • /home/jake/mycode/wow.ts
  • c:/Users/Jake/project/hello.ts

io/fs.FS is not a "traditional" abstraction. All paths must be / separated. Good news, that's how we already did it. But, each fs.FS itself is already rooted and expects to be given paths relative to that root.. For example, I can do os.DirFS("/") or os.DirFS("c:/"), then pass in "home/jake/mycode/wow.ts". This poses a problem on Windows machines because there's no common root between all drives. But, good news again, we already have GetEncodedRootLength which is able to perform the splitting we need.

The FS in this PR just splits paths with GetEncodedRootLength (rejecting relative paths), and then opens the relevant FS by root, passing in the rest of the path to perform file operations.

The FS also supports wrapping a single fs.FS, which is useful for using testing/fstest.Map. Since this is not multi-root, paths like c:/blah are treated as /c:/blah.

This FS differs from the original TS "System" interface in a few ways:

  • ReadFile returns (contents string, ok bool) instead of string | undefined.
  • Instead of providing readDirectory, which in TS is a recursive function that also performs exclude/include globs, I'm just giving WalkDir, which is the same as fs.WalkDir, filepath.WalkDir, etc. Globbing can be done externally.
  • Realpath is implemented by hand and is a noop in the virtual side of the FS. I think my implementation is okay, but, the real thing to use would be fs.Readlink which is not yet available in Go.

This PR also eliminates all uses of path and path/filepath outside of the VFS, tspath, tests, and code gen.

)

// FS is a file system abstraction.
type FS interface {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given there's only one concrete implementation of this interface (which then itself wraps a more generic abstraction), I may just eliminate this entirely.

@jakebailey jakebailey marked this pull request as ready for review November 20, 2024 21:46
Copy link
Member

@andrewbranch andrewbranch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are all the Path related !!!?

type ResolutionHost interface {
FileExists(fileName string) bool
ReadFile(fileName string) string
FS() vfs.FS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, why add a method instead of embed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would then imply that we have to create a struct which embeds a vfs.FS value and then assign the object there in order to make something which implements ResolutionHost. Then, if you need to construct another host object that references the FS of another host, you can't just pass the previous FS by value, you'd have to pass in the entire previous interface and do an "I to I conversion". If it's an accessor like this, then all you do is grab one host's FS and then pass it as part of the next one.

But, maybe that's not the right approach, and we actually do want to construct giant hosts which implement loads of methods. We have a lot of hosts, and that may help us determine "optional" functionality by doing speculative type conversions...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nice to implement these big hosts when you only need a subset of functionality, but I'm not against this change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, declaring that you only need an FS with ReadFile is definitely something not handled here. It's pretty easy to change this, though.

Comment on lines +181 to +182
v.readSema <- struct{}{}
defer func() { <-v.readSema }()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn’t get this far in the Go book yet 😆

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is semaphore implemented in terms of a channel; it allows a fixed number of concurrent work. Normally this is not a problem in Go, but performance benchmarking shows that not storming the filesystem with reads improves performance and that's also what gopls does.

@jakebailey
Copy link
Member Author

What are all the Path related !!!?

They're all places where I find it mega awkward that I have to convert to string and back, or where I am unsure if I even did it right.

Are our path helpers designed to operate on Paths? Like, could I instead make:

func ConvertToRelativePath[S ~string](p S, opt ComparePathsOptions) S { ... }

So that it worked with both strings and Path?

Comment on lines 20 to 23
UseCaseSensitiveFileNames() bool

// GetCurrentDirectory returns the current directory.
GetCurrentDirectory() string
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think I should drop these methods entirely; they only carry state which I think is actually better passed through other hosts? The values are not used internally at all. Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW TS's system has GetCurrentDirectory, I am wondering if we should restore it just so paths are relative? But, I kinda want us to actually name a NormalizedPath type or something... It's probably ok right now.

@andrewbranch
Copy link
Member

Are our path helpers designed to operate on Paths? Like, could I instead make:

func ConvertToRelativePath[S ~string](p S, opt ComparePathsOptions) S { ... }

So that it worked with both strings and Path?

It depends. Unfortunately, you just have to reason about whether the outputs maintain the Pathyness of the inputs. I think this example may be safe, but you have to check whether there’s any way that the CurrentDirectory string from ComparePathsOptions could contribute any path components to the output. If both the current directory and absolute path inputs were already Paths, then it would be safe, but also UseCaseSensitiveFileNames option would be unnecessary. So it’s not always totally straightforward.

func rootLength(p string) int {
l := tspath.GetEncodedRootLength(p)
if l <= 0 {
panic("expected absolute path, got: " + p)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this assert; the FS doesn't have a CWD so cannot itself resolve paths. That's a downside of it not having a CWD...

@jakebailey
Copy link
Member Author

I believe the PR is now fully ready for a review; I know @andrewbranch already approved it but I pushed one last change that fixed error reporting paths to now use the non-lowercased paths which I think is correct but want eyes on.

@jakebailey
Copy link
Member Author

Realized I did not add unit tests, will do that quick.

@jakebailey jakebailey merged commit 3bbf3cf into main Nov 22, 2024
11 checks passed
@jakebailey jakebailey deleted the jabaile/vfs branch November 22, 2024 22:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants