Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: io/fs presents us an opportunity to rework the whole api into a v2 #407

Open
puellanivis opened this issue Feb 22, 2021 · 12 comments

Comments

@puellanivis
Copy link
Collaborator

Most of the proposal is right there in the title. What with io/fs being rolled out in go1.16, we have a big opportunity to remodel, simplify, and rework our API. Current issues, as I see them:

  • The current sftp.Client implements github.com/kr/fs, but we could switch to model more compatible with io/fs.FS
  • The in-memory test filesystem cannot be removed from the main package without a breaking change, ideally if kept, this should be a subpackage.
  • The ErrSshFxXyz errors are still around, because they cannot be removed without a breaking change.
  • The RequestServer and Server implementations are significantly different and almost parallel implementations.
  • The RequestServer could be better implemented via a base interface with additional extension interfaces. (à la io/fs)
  • From an earlier discussion, many of the raw implementation concepts of the protocol itself could be isolated into its own subpackage, providing a better separation of responsibilities.
  • Some features can be reassessed for utility, and either properly folded into functionality without needing feature flags (like the allocator?) or dropped entirely (we could switch WriteTo so that it does not depend upon filesizes, thus saving us the need of an UseFstat() option).
@drakkan
Copy link
Collaborator

drakkan commented Feb 25, 2021

@puellanivis I'm interested to improve the RequestServer. I don't like the Server implementation: even if you need to work with the local filesystem I think it is better to handle the logic within your own application. For example using the request server you can easily chroot your users, using the Server implementation we need to add this feature to the library and in a way that is good for all use cases.

This is my plan for the RequestServer:

  1. we currently use multiple workers for write packets. For appends we should process the received packets sequentially, we should not rely on packet offset in this case according to the specs
  2. I'll try to add V6 support and some filexfer extensions like file hashing and server side copying

I'm not sure how we can improve the request interfaces, they already seem well designed. If we want to change them we should do it in a way that does not require to checking if an interface is implemented for each packet (for example for writes). Do you already have some sort of draft?

The V6 support is a long term plan, I have other priorities for now, I think I can work on this after releasing SFTPGo 2.1

@puellanivis
Copy link
Collaborator Author

Well, as mentioned above, the current Client targets a different FS object interface than does the new standard library FS defines. Many of the distinctions are probably small, like ReadDir instead of Readdir, but these are the sort of thing that we would need to make a breaking change in order to implement.

I don’t think it would be a good idea to jump straight into reimplementing the server or client, as it would probably be better to get a solid baseline done first, like a wire protocol package, and maybe a memfs. The former would then allow us to build on top of to get a good client+server, while the later helps define really what we would expect the implementations themselves to look like.

It might be most nice to have both the client and server both look quite parallel themselves, and both of them try to model after the fs.FS design as much as possible. Then people familiar with the standard library would see a lot of parallel and familiarity with this package.

@greatroar
Copy link
Contributor

Re: reworking sftp.Client, it can be made more compatible maybe, but full compatibility severely restricts the client's ability to talk to arbitrary servers. io/fs.FS wants the Open method to validate paths using fs.ValidPath. That requires the path to be "UTF-8-encoded, unrooted, slash-separated sequences of path elements", which may not contain . or ... This is very restrictive compared to SFTP, which allows paths to be arbitrary strings, interpreted by the server. Not being able to use .. or rooted paths is especially problematic.

Skipping the validation is possible, but leads to surprising behavior when trying to use a Client as an fs.FS:

type unvalidatingFS struct{}

func (unvalidatingFS) Open(path string) (fs.File, error) {
	return nil, nil
}

func main() {
	var fsys fs.FS = unvalidatingFS{}
	_, err := fsys.Open("../shouldnt/work/but/does")
	fmt.Println(err)
	// <nil>

	fsys, err = fs.Sub(fsys, "foo")
	if err != nil {
		panic(err)
	}
	_, err = fsys.Open("../doesnt/work")
	fmt.Println(err)
	// Here, fs.Sub's Open has done the validation that we skipped, and we get:
	// open ../doesnt/work: invalid name
}

@puellanivis
Copy link
Collaborator Author

puellanivis commented Mar 3, 2021

Indeed, I am expecting that this wouldn’t be 100% fs.FS compatible but rather be modeled based on, and then I would expect something like a receiver method that would return a more strictly compatible implementation for cases where compatibilities with FS must be guaranteed.

Note, if you add this receiver method, the second one won’t error anymore:

func (fsys unvalidatingFS) Sub(dir string) (fs.FS, error) {
  return fsys, nil
}

Though fs.Sub(fsys, "../doesnt/work") would continue to not work, as it uses fs.ValidatePath() before calling, though fsys.Sub("../doesnt/work") would work.

TL;DR: I agree, we don’t want to restrict ourselves too heavily onto fs.FS, as this would be like shooting ourselves in the foot, but providing a path towards compatibility provides a significant bonus, and building with a greater eye towards compatibility gives us a shot at allowing users to exploit knowledge of one API to quickly grasp onto our API.

P.S.: I mean, first and foremost, right now the fs.FS does not have any way to create an io.Writer, which would be an enormous gaping lack of feature if we didn’t provide.

@parro-it
Copy link

parro-it commented Mar 3, 2021

I started some experiments here: https://github.com/parro-it/vs/tree/master/sshfs, in case someone would take a look at it.
I don't agree with the fact the semantic changes in how an fs.FS is required to handle path are limiting:

In my (probably simplistic) implementation you can just call sshfs.ConnectClient("/", sshClientInstance) and access the whole file system. From there, there is no need to use .. or . , and if you happen to use them in the middle of a path, you can easily remove them with path.Clean().

@greatroar
Copy link
Contributor

Doing path.Clean is not a sufficient replacement. If /a/b/c is a symlink to /d/c, then /a/b/c/.. is /d, not /a/b.

@parro-it
Copy link

parro-it commented Mar 5, 2021

Doing path.Clean is not a sufficient replacement. If /a/b/c is a symlink to /d/c, then /a/b/c/.. is /d, not /a/b.

Good point, I always found that thing really annoying... anyway, I'm not saying that Replace it's an automatic replacement, just that it's easy to switch between the two semantics.
In your case, it is a bit difficult to do...

@puellanivis
Copy link
Collaborator Author

puellanivis commented Mar 5, 2021

I mean, it should be relatively simple for us to provide a built-in wrapper to provide an fs.FS compatible wrapper. The implementation can even be internally hidden, and we can design it to support only the functions that have been already defined and release feature patches to add in features as they are rolled out into the standard package.

P.S.: I don’t think we should just tack this onto the current implementation though, while this feature would be a super nice feature to add in, there are a lot of other points calling for making a breaking-change API update. I mean, just alone being able to clean up the public surface would be great.

@parro-it
Copy link

parro-it commented Mar 5, 2021

@puellanivis I agree with you, that totally makes sense...

@greatroar
Copy link
Contributor

greatroar commented Mar 10, 2021

I still don't see the benefit of moving halfway to io/fs, but if a v2 is on the table, I do have a proposal to make. Please tell me if this should be a separate issue, but I'd like suggest that in v2, normaliseError be changed so that StatusErrors are returned as-is, and that type gets the following method:

// Is implements errors.Is's interface, matching s against os.ErrExist,
// os.ErrNotExist and os.ErrPermission.
func (s *StatusError) Is(err error) bool {
        switch s.Code {
        case sshFxFileAlreadyExists:
                return err == os.ErrExist
        case sshFxNoSuchFile:
                return err == os.ErrNotExist
        case sshFxPermissionDenied:
                return err == os.ErrPermission
        }
        return false
}

Right now, errors are changed in backward-incompatible ways even in minor releases (#401). With a publicly documented Is method, that need not happen again and all errors retain full details of what caused them.

@puellanivis
Copy link
Collaborator Author

The benefits of moving “halfway to io/fs” are in improved API parallelism. We don’t need to rely upon Random Peron’s FS design, there’s a solid design to target in the standard library now. But it’s not really moving to a whole new API just for this.

The greater reasons for a new API are that there are tons of improvements that could be made throughout the code base considering the newer features of the language, like errors.Is / errors.As etc, not just the fs.FS stuff.

@greatroar
Copy link
Contributor

I didn't mean to be polemical :)

Re: random person's design, I thought kr/fs was pretty popular, but looking at https://pkg.go.dev/github.com/kr/fs?tab=importedby it looks like many of the importers are in fact clones of pkg/sftp. So it looks like little is lost in dropping it and, say, returning []io/fs.DirEntry from ReadDir.

I also see the broader benefit of a v2. Having a perm argument to OpenFile and Mkdir would also be great.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants