Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client side implementation #4

Open
chipsenkbeil opened this issue May 10, 2023 · 20 comments
Open

Client side implementation #4

chipsenkbeil opened this issue May 10, 2023 · 20 comments

Comments

@chipsenkbeil
Copy link

chipsenkbeil commented May 10, 2023

I'm eagerly awaiting a Rust-native implementation of an ssh client that includes sftp. This is because I offer a remote editing experience through distant and distant.nvim respectively, which can be done both through a custom server written in Rust and ssh.

The two existing ssh client library wrappers, ssh2 and libssh, are both error-prone and add extra complexity to support.

Just wanted to drop a line that I'm watching the work being done here and am hopeful that I'll be able to adopt russh and this 3rd party addon to for sftp support in the near future!

@AspectUnk
Copy link
Owner

I will be glad if you share your experience with other libraries, it will help me to think over a better solution during development.

@chipsenkbeil
Copy link
Author

chipsenkbeil commented May 11, 2023

Sure. The library I use today is wezterm-ssh written by an old colleague of mine. It acts as a wrapper around both ssh2 and libssh client libraries, themselves wrappers around the associated sys bindings to the C ssh client libraries.

For using ssh with sftp, the entrypoint starts from crating an ssh session and then creating a new SFTP channel via Session::sftp. I actually wrote the original code for the SFTP wrapper for wezterm-ssh, and the API itself is okay, but the client libraries themselves are a bit buggy and poor-performing, so I'm hoping that a Rust-native implementation will eventually reach better stability.

You can see how I use the SFTP abstraction from wezterm-ssh in distant_ssh2::api, where I handle a variety of operations that abstract to SFTP.

The SFTP API

The high-level SFTP channel API can be found here: https://docs.rs/wezterm-ssh/0.4.0/wezterm_ssh/struct.Sftp.html. Because this library was abstracting across two different client implementations, you find wrapper implementations like wezterm_ssh::SftpWrap to invoke the underlying library method, but you shouldn't need to do this yourself since you'd be using russh and your own implementation on top.

In a nutshell, it provides a flat collection of async functions to perform common filesystem actions. Many of these mirror those you'd find in the std::fs module:

  • open_with_mode - returns an ssh File (not exported out of the crate) representing a remote file. Uses OpenOptions to specify how you're opening file (reading/writing, binary, etc). This is the core method used for working with files (everything else tends to use this underneath).
  • open - same as open_with_mode, but just for reading a file. Equivalent to std::fs::File::open.
  • create - same as open_with_mode, but just for writing a file. Equivalent to std::fs::File::create.
  • open_dir - like opening a pointer to a file, but for a directory to read the entries one by one. In a better implementation, this would actually be an iterator similar to std::fs::ReadDir.
  • create_dir - makes a remote directory and supports being provided the mode that you'd use on a Unix system (e.g. the octal of permissions). The mode flags were abstracted using bitflags into wezterm_ssh::FilePermissions with a method to_unix_mode that you could call to convert it into an u32. Redoing this, I'd just make the high-level abstraction take the bitflag instance and convert behind the scenes, especially since this abstraction already supported being created from a u32 representing the bits. Mirror of std::fs::create_dir, and there was no equivalent to std::fs::create_dir_all, but you could do that just by breaking apart the path and trying each piece one at a time. That's what I do in my code that uses this library.
  • remove_dir - deletes a directory. I can't remember if ssh's protocol for SFTP supports anything more complex. Mirrors std::fs::remove_dir.
  • metadata - retrieves metadata for a file/directory. SFTP supports this version and another one that I named symlink_metadata. The difference is that this version returns information about the file pointed to by the symlink, so it has to resolve it and report the kind (directory/file), etc. Mirrors std::fs::metadata.
  • symlink_metadata - same as metadata, but not resolving symlinks. Mirrors std::fs::symlink_metadata.
  • set_metadata - sets the metadata for a file/directory. Alongside the metadata method, this uses an abstraction struct wezterm_ssh::Metadata similar to std::fs::Metadata (but a bit more limited).
  • symlink - creates a symlink. Mirror of std::os::unix::fs::symlink.
  • read_link - evaluates a symlink's path to the real directory/file. Mirror of std::fs::read_link.
  • canonicalize - evaluates the real path. Mirror of std::fs::canonicalize.
  • rename - renames a file/directory. The SFTP libraries we used provided some additional options like whether or not to fail or overwrite an existing path, which were abstracted into the struct wezterm_ssh::RenameOptions with a default that overwrites the existing path. Kind of mirrors std::fs::rename, but with more options.
  • remove_file - removes the file. Mirror of std::fs::remove_file.

Abstractions

When I first wrote the sftp abstraction, I had exported types that represented high-level wrappers across API usage. Those specifically were:

  • wezterm_ssh::sftp::Dir - provides a singular method, read_dir, that reads the next entry from in the directory. You call this multiple times until you get an error, IIRC. In a better implementation, this would actually be an iterator similar to std::fs::ReadDir.
  • wezterm_ssh::sftp::File - provides methods to read and write the metadata of a file (specifically permissions) as well as implements the async traits for reading and writing to the file itself so you can use it as you would any other IO implementation. It's equivalent to tokio::fs::File or std::fs::File. You can find examples of it being used in the end-to-end tests.

@AspectUnk
Copy link
Owner

@chipsenkbeil I researched this issue and the current implementation of serialization and deserialization is not very good for extending the code with the client side, so I will do that first and then add a small client example. I'll provide the basis for the client side soon, hopefully you can help expand the client side in the future

@chipsenkbeil
Copy link
Author

@AspectUnk great! Keep me posted on the updates.

@chipsenkbeil
Copy link
Author

@AspectUnk any update here? Anything else I can contribute to help?

@AspectUnk
Copy link
Owner

@chipsenkbeil sorry, i didn't have much time before this. i'll try to provide a new beta version for feedback one of these days

@oh-yes-0-fps
Copy link

oh-yes-0-fps commented Sep 3, 2023

I'm interested in adding client support but I have a few questions first.

  • Do you have any un-pushed work I would be overwriting
  • Is the idea for the handles that they implement nothing by default or is that just a side-effect of the project being in early development
  • Would you consider using tracing over log
  • If you are interested in having functional default implementations do you want the server to be multi-threaded / multi-tasked

@AspectUnk
Copy link
Owner

@oh-yes-0-fps Welcome

  • Yes, at the moment I am actively working on the client side and most of the work is not published at the moment, only reworked packets are published now.
  • Initially the goal of the project was to provide raw work with the protocol, but now I see that the standard implementation will also be useful. So yes, now it is a consequence of early development
  • No problem
  • Definitely yes, it will become part of the standard implementation because for many projects this may be a bottleneck, but it will be after the client part

If you have any suggestions that require a longer discussion, please create a new issue. Thank you

@AspectUnk
Copy link
Owner

AspectUnk commented Sep 11, 2023

I opened a PR #5 with an almost completely ready raw api and partly a ready high level api.

@chipsenkbeil I apologize again for the long inactivity, I'll try not to repeat this. For the most part, I just have to implement the high level api for folders and files, while I'm doing this, could you please evaluate the current implementation. I tried to leave the high level api exactly the same as in your crate to make it easier for you to integrate. After looking through your crate, I had a question. Is its high functionality superficial and maybe it should be expanded?

@chipsenkbeil
Copy link
Author

@chipsenkbeil I apologize again for the long inactivity, I'll try not to repeat this. For the most part, I just have to implement the high level api for folders and files, while I'm doing this, could you please evaluate the current implementation. I tried to leave the high level api exactly the same as in your crate to make it easier for you to integrate.

Thanks for this and no worries about inactivity! You're doing this in your spare time and for free, so I'm appreciative when you take my needs into consideration. I skimmed through it earlier, and will try to take a deeper look and leave comments in the next week or so.

After looking through your crate, I had a question. Is its high functionality superficial and maybe it should be expanded?

Sorry, I don't understand the question. Can you rephrase it? Are you asking if the high level API should be expanded to include other features?

@AspectUnk
Copy link
Owner

AspectUnk commented Sep 15, 2023

Sorry, I don't understand the question. Can you rephrase it? Are you asking if the high level API should be expanded to include other features?

Yes, I think it can be significantly expanded. Btw I did not see an AsyncSeek implementation for a file in your crate. Why is this so?

Now I have already implemented read_dir by analogy with std::fs::read_dir, file and their async i/o implementation (btw AsyncSeek too), and all other methods from your crate. An example of using the client can be found here. I hope this is what you wanted 😅

A couple of points:

  • You said about wezterm_ssh::RenameOptions, is it that important? It is not in the current version of the protocol, but if it is important, it could be expanded to a new version
  • Currently, the flush or fsync method is not implemented. Their implementation implies support for the fsync@openssh.com extension. Probably only this and limits@openssh.com is stopping us from beta. I would be glad if you could contribute to this, if not then let me know and I will try to find the time but a little later. If you have such an opportunity, let me know and after that I can write docs for the client part and release it as beta.

@chipsenkbeil
Copy link
Author

Btw I did not see an AsyncSeek implementation for a file in your crate. Why is this so?

No reason. Wasn't needed for the minimum implementation at the time. So if you've already implemented it, that's great!

I hope this is what you wanted 😅

You've done so much! I can take a stab and trying to use it as a replacement for wezterm-ssh to see how it works.

You said about wezterm_ssh::RenameOptions, is it that important? It is not in the current version of the protocol, but if it is important, it could be expanded to a new version

I don't use the options in my downstream library powering distant. It was just created to expose some options you could provide during a rename. The libssh backend doesn't even support the rename options, so I'd say they could just be ignored.

    pub fn rename(
        &self,
        src: &Utf8Path,
        dest: &Utf8Path,
        #[cfg_attr(not(feature = "ssh2"), allow(unused_variables))] opts: RenameOptions,
    ) -> SftpChannelResult<()> {
        match self {
            #[cfg(feature = "ssh2")]
            Self::Ssh2(sftp) => {
                Ok(sftp.rename(src.as_std_path(), dest.as_std_path(), Some(opts.into()))?)
            }


            #[cfg(feature = "libssh-rs")]
            Self::LibSsh(sftp) => Ok(sftp.rename(src.as_str(), dest.as_str())?),
        }
    }

Currently, the flush or fsync method is not implemented. Their implementation implies support for the fsync@openssh.com extension. Probably only this and limits@openssh.com is stopping us from beta. I would be glad if you could contribute to this, if not then let me know and I will try to find the time but a little later. If you have such an opportunity, let me know and after that I can write docs for the client part and release it as beta.

It was just part of me exposing a friendly wrapper on top of the ssh libraries. I don't see it being used in my downstream library within distant, so if it simplifies things I'd say leave it out.

I'm not familiar with extensions, so what does limits@openssh.com provide?

@AspectUnk
Copy link
Owner

I'm not familiar with extensions, so what does limits@openssh.com provide?

It provides the client with information about server limits such as the max open of handles, the max number of bytes to read/write per packet, and the max length of the packet itself. If the client violates these limits, the server may terminate the connection. So this is a pretty important part

@chipsenkbeil
Copy link
Author

I'm not familiar with extensions, so what does limits@openssh.com provide?

It provides the client with information about server limits such as the max open of handles, the max number of bytes to read/write per packet, and the max length of the packet itself. If the client violates these limits, the server may terminate the connection. So this is a pretty important part

How common is this extension? If a server doesn't implement the extension, can you still use all of the other features of sftp if the server supports that subsystem?

@AspectUnk
Copy link
Owner

How common is this extension? If a server doesn't implement the extension, can you still use all of the other features of sftp if the server supports that subsystem?

Almost every modern server has it. And no, due to its absence the protocol is not limited. In any case, you don't have to worry about this, I have already implemented support for limits@openssh.com and fsync@openssh.com, now the flush and sync_all (old fsync, I renamed it according to std::fs::File::sync_all) methods are available. On your abstraction (SftpSession) you don't have to worry about limits because the async i/o implementation splits packets depending on the method used.

Now I can be sure that this is a completely ready sftp client with advanced abstraction and methods, but there may still be errors or flaw so I hope you can test it. Btw, I saw your issue, if you need help with integration let me know 😼

@chipsenkbeil
Copy link
Author

Almost every modern server has it. And no, due to its absence the protocol is not limited. In any case, you don't have to worry about this, I have already implemented support for limits@openssh.com and fsync@openssh.com, now the flush and sync_all (old fsync, I renamed it according to std::fs::File::sync_all) methods are available. On your abstraction (SftpSession) you don't have to worry about limits because the async i/o implementation splits packets depending on the method used.

Fantastic!

Now I can be sure that this is a completely ready sftp client with advanced abstraction and methods, but there may still be errors or flaw so I hope you can test it. Btw, I saw your chipsenkbeil/distant#193, if you need help with integration let me know 😼

Yep! Was planning to use russh and was figuring out if I needed to implement the sftp protocol myself. Since you've done all of the work now for me to try, I'll be giving your implementation a go instead 😄

So once you've published a beta, I can give it a go to try to swap my usage of wezterm-ssh for russh and russh-sftp!

@AspectUnk
Copy link
Owner

It's now published as 2.0.0-beta.1

@chipsenkbeil
Copy link
Author

It's now published as 2.0.0-beta.1

Woo hoo! I'm going to give it a try to implement in a branch I'd started to migrate over to russh. Will share my findings here within the next couple of days.

@AspectUnk
Copy link
Owner

@chipsenkbeil I did not notice one flaw that made it into the published version. New 2.0.0-beta.2 was published. Looking forward to your feedback! 😄

@chipsenkbeil
Copy link
Author

@chipsenkbeil I did not notice one flaw that made it into the published version. New 2.0.0-beta.2 was published. Looking forward to your feedback! 😄

Got it. I've updated my branch with that beta version. Didn't have time this weekend to work on it, but hoping to tackle this coming weekend. At least within the next couple of weeks! The harder part for me is the general switch to russh and authentication. I'm hoping your work for sftp will integrate more easily from all of the hard work you put in 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants