Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenSSH ssh_config support #279

Closed
rojer opened this issue May 6, 2020 · 52 comments
Closed

OpenSSH ssh_config support #279

rojer opened this issue May 6, 2020 · 52 comments

Comments

@rojer
Copy link

rojer commented May 6, 2020

I know this is a duplicate of #108 but hear me out.
A helper to set options from ssh_config that clearly documents limitations and the subset of the
options that are supported, would be immensely useful.
I understand it may not be easy but it doesn't mean it would not be useful.
Because what ends up happening is users writing their own solutions of unknown quality as the author of #108 ended up doing and as I am about to do myself.

@sanga
Copy link

sanga commented May 7, 2020

+1 for this. I ended up writing my own support for this too as I needed it. The spec for ssh_config is pretty big, but most of the most commonly used options are supported in asyncssh e.g. hostname, identityfile etc. And as @rojer says: so long as the limitations are clearly communicated I don't think that partial support is necessarily a problem. Is there any room for negotiation on this topic @ronf ? :)

@ronf
Copy link
Owner

ronf commented May 7, 2020

I'll need to look at the OpenSSH config spec again before promising anything here. In the meantime, it would be helpful to flesh out some examples of what you'd like AsyncSSH to support in terms of config file parsing. Here are some questions along those lines to get the discussion started:

  • Which directives are most important to you?
  • Are you looking for this for just the SSHClient, or would you want to have SSHServer be able to support referencing an sshd_config file?
  • Do you need support for Host/Match blocks? I'm guessing yes, and that this is probably the main reason you want this.
  • Do you need support for multiple evaluation passes due to CanonicalizeHostname or the "final" keyword?
  • Do you need support for '%' token expansion?

@sanga
Copy link

sanga commented May 7, 2020

So for us everything we needed was supported already in SSHClientConfigurationOptions it was just a question of parsing the ssh config file and mapping the values therein to what asyncssh expects. To be specific what we needed was: username, host, kexalgos, macalgos, ciphers, port and stricthostkeychecking/userknownhostsfile. Actually ConnectTimeout support would've been nice too (but is ConnectTimeout a one-to-one mapping to login_timeout actually?)

For us, clientside was enough.

re: Host/Match blocks. Yes. This is where I cut corners and did a rather naive parser, personally.

No, we at least don't need support for CanonicalizeHostname.

Neither have we needed support for % expansion so far.

@rojer
Copy link
Author

rojer commented May 7, 2020

same here - all the options i need are already supported, it's just a matter of parsing them.
only client, not interested in server.
IdentityFile, keepalive settings, host and match blocks (although only Match host is used in our config), MACs and ciphers, SendEnv.
no canonicalization.
% expansion: yes, but only %u,

@ronf
Copy link
Owner

ronf commented May 9, 2020

Thanks, that's helpful.

I've been thinking about what this might look like. Right now, I'm leaning toward seeing if I can make this be an optional argument on SSHClientConnection (and maybe SSHServerConnection in the future). The idea would be that it would use values it found in the config file as settings in the constructed SSHClientConnections object, but only when the corresponding option was not already specified explicitly in a passed-in options object or as an explicit keyword argument. Directly specifying any argument would always take precedence over anything found in the config file.

Since there can be both a system-wide and user-specific config file, I may allow the option of multiple config files to be specified, but by default I'd probably only look for the user-specific one, since the system-wide config file location can vary from one SSH installation to another. If you didn't want it to parse the default user-specific config file, you could specify something like "config=None" as a keyword argument to disable this feature.

In cases where the same option appeared in the config file(s) multiple times, the first one would take precedence (assuming it wasn't already set as an explicit keyword argument), matching OpenSSH's behavior for that.

I still need to do a prototype to see just how messy this gets before I can commit to adding this, but I'll let you know how that goes.

Sanga, are you using any "Match" keywords other than "host"? I'm not planning to support "exec" there yet, but I could look at supporting all, host, originalhost, user, and localuser initially. In the first cut, I probably won't support the various Canonical options or the corresponding "canonical" and "final" Match types, but that can potentially be added later if things go well.

Rojer, where are you using the %u? Is it in CertificateFile or one of the Identity options? I'm not expecting to support most of the other options that allow %u right now.

I noticed while reading up on the config that it allows an '=' to separate the key & value, mainly for use with "-o" on the command line. I'm thinking I may leave that out for now, though. Have either of you ever run across the use of "=" in an actual config file?

I haven't made a pass through the full set of options to see which ones I'll support, but I'll do that if my initial prototyping goes well. I'll probably start with something simple like Hostname or Username in the prototype and build up from there to add other items mentioned here.

@sanga
Copy link

sanga commented May 10, 2020

We're just matching on the host as a static string e.g.

host x86-64
  hostname 192.168.1.1
  stricthostkeychecking no
....

Nothing fancy at all. I would suppose that supporting only that would cover most people's usecase i.e. I feel like exec is not widely used?

Actually github search supports this: 39 hits for Match in that path .ssh/config over thousands of files.

@ronf
Copy link
Owner

ronf commented May 10, 2020

I spent a bit of time on this tonight, and managed to get a pretty complete config file parser, including support for "Host" and "Match" directives (supporting all, originalhost, host, user, and localuser), including positive and negative wildcard matching (which I was able to reuse from previous work). Matching OpenSSH's behavior on the way the host & user matching works required a bit of ugly code, but it's not TOO bad.

As I dug a little further into this, though, I see there's going to be a fair amount of additional special casing for each of the directives, as their handling isn't consistent. For instance, some directives support being present multiple times (e.g. SendEnv and IdentityFile), while others only keep the first instance that is matched.

In the specific case of SendEnv, there's also the complexity that it specifies a set of wildcard patterns and makes the final choice based on what's in the local environment, but in AsyncSSH today it never pulls anything from the local environment. If that's what an application wants, it has to do so itself, passing in an dict itself which contains whatever values it wants from os.environ.

Also, the various algorithm directives support a leading '^', '+', or '-' to modify the default list rather than replacing it. However, supporting that is going to require much deeper integration than just filling in an SSHClientConnectionOptions object.

There are other messy bits as well. For instance, certificates are specified independently from private keys in OpenSSH, whereas AsyncSSH expects you to provide them as a tuple. Matching up the private keys to certs is possible, but it will be a fair amount of extra work. Also, it's not clear to me how this should interact with the usual behavior to automatically look for a certificate by appending '-cert.pub' to a private key when those are specified by filename.

@ronf
Copy link
Owner

ronf commented May 10, 2020

I'm not really comfortable doing such a minimal implementation here and making that part of the main code base. That's fine for people to do themselves, as they know what config file directives they're using. However, if I include this support in the main AsyncSSH code, it needs to be relatively complete, at least for the subset of configuration that AsyncSSH supports. More importantly, for that subset of directives, the resulting behavior really should match OpenSSH exactly.

To be clear, I haven't given up here. It'll just require more extensive code than I was hoping, and changes in more places. If I add support for incrementally specifying algorithms, for instance, I'll probably allow that syntax in both the existing algorithm arguments as well as the config file. So, instead of only passing in a complete ordered list of args to support, you'll also have the option to pass in a comma-separated string prefixed with '^', '+', or '-'.

For the environment, I could offer both the existing "env" param and a "send_env" parameter, the latter of which takes wildcard strings and looks in os.environ() for the variables & values to send.

For identities, I'm not really sure what to do that would be a clean extension of what I have now. That'll require more thought.

@ronf
Copy link
Owner

ronf commented May 12, 2020

Ok - I've added support for OpenSSH syntax when specifying algorithms in commit 57fad3c in the "develop" branch. Part of adding that support included creating the notion of "default" algorithms for each algorithm type, so that AsyncSSH can enable only a subset of supported algorithms by default, leaving weaker algorithms available when they are requested but not enabled by default.

Whenever the application passes in a string rather than a list of strings, OpenSSH syntax is supported, including a prefix of '^', '+', or '-' to add and remove things from the default algorithms rather than completely replacing the list.

In this initial cut, I also added support for wildcard patterns when matching algorithm names, thinking OpenSSH supported that. After further testing, I see that this is not the case, though. It looks like OpenSSH only supports wildcards in the case of the '-' prefix. I may go back and revisit that in AsyncSSH before I release this code.

@ronf
Copy link
Owner

ronf commented May 13, 2020

I ran into another wrinkle tonight in attempting to implement this support. A number of config settings supported by OpenSSH are at the channel/session level, rather than the connection level, which means they are not actually a part of my SSHClientConnectionOptions class right now. Examples of this include ForwardX11, RemoteCommand, RequestTTY, SendEnv, SetEnv, and XAuthLocation.

A number of other settings are not really config but requests for specific actions to be take after connecting, like setting up port/path forwarding. I'm beginning to think I'm going to need a whole different object to hold all of these settings, and then I'll have to selectively use them to either populate arguments when creating new connections, new sessions, or making calls to other methods like create_server(), forward_local_port(), or forward_socks().

@ronf
Copy link
Owner

ronf commented May 14, 2020

I've added support for "send_env" in SSHClientConnection methods that open new SSH sessions in commit c2ba4af in the "develop" branch. This is another building block which can be used to support directives in an OpenSSH config file.

@sanga
Copy link

sanga commented May 14, 2020

👍 thanks for working on this @ronf !

@ronf
Copy link
Owner

ronf commented May 19, 2020

Quick update on this: Over the weekend, I made some good progress. I have a fairly complete first cut config file parser (including support for Host and Match conditionals) integrated into AsyncSSH as a new "config" argument when starting new connections, defaulting to read from ~/.ssh/config if you don't specify an alternative.

In the client case, I have hooked up a dozen or so options so far including hostname, user, all the algorithms, keepalive, some socket options, and some other miscellaneous stuff that was easy. I haven't finished hooking up the identity/certificate stuff yet, but I have some ideas for how to do that.

I was able to share code to allow many of these same settings to also work on the server side where applicable (like setting algorithms). I won't default to any specific path for the server-side config file, but you'll be able to use the same "config" argument to specify that on something like a create_server() or listen() call.

There's more to do before this is ready for check-in, including writing unit tests for it and shaking out any bugs, but I'm pretty happy with how it looks so far. The config parsing is only a few hundred lines, and the work to integrate a new config option is generally just a few lines per item.

Here's a snippet from SSHConnectionOptions that shows the internals of how it is hooked up:

        self.host = config.get('Hostname', host)
        self.port = port if port else config.get('Port', _DEFAULT_PORT)

        self.family = family if family is not None else \
            config.get('AddressFamily', socket.AF_UNSPEC)
        self.local_addr = local_addr if local_addr is not None else \
            (config.get('BindAddress'), 0)

Generally speaking, specifying an argument in an "options" argument or directly in the call will always take precedence over what's in the config, with the exception of "host", where what you specify is used by default but can be overridden by a "Hostname" option in the config file. If neither "options" nor "config" specifies a value, a default is provided, as shown above.

@ronf
Copy link
Owner

ronf commented May 23, 2020

Another quick update: I've figured out how to handle IdentityFile/CertificateFile. In commit 64f8144, I've checked in new arguments to specify a list of certificates which is separate from the list of keys and AsyncSSH will automatically match them up in cases where you don't give it an explicit (key, cert) tuple and it doesn't find a '-cert.pub' file. It was quite a bit less complicated than I expected to support this.

I've also expanded support for '~' expansion in more places in commit ba9dd48.

@ronf
Copy link
Owner

ronf commented May 30, 2020

My first cut at config file support is now available in the "develop" branch as commit 8c88a8b! See the commit summary for details of what's supported so far. Testing and feedback is highly welcome.

The main thing missing from our discussion above is support for token substitutions, but I'll look into that shortly. Also, while you can change the "known hosts" file to use by setting UserKnownHostsFile, the StrictHostKeyChecking value is not supported right now, since AsyncSSH never writes new host keys to the known hosts file. I need to determine how you disable known hosts checking in OpenSSH to see if there's something there equivalent to AsyncSSH's "known_hosts=None" option.

@ronf
Copy link
Owner

ronf commented May 31, 2020

I spent a little more time tonight looking at the token substitutions. The code will need a bit of cleanup before I can check it in (adding unit tests and the like), but I now have the following client-side substitutions working:

%%      literal %
%C      SHA-1 hex digest of %l%h%p%r
%d      Local user's home directory
%h      Remote hostname
%i      Local UID (UNIX only)
%L      Local hostname (without the domain)
%l      Local hostname (including the domain)
%n      Original remote hostname
%p      Remote port
%r      Remote username
%u      Local username

That's pretty much the complete set of tokens that OpenSSH supports except for %T, which is related to the tun/tap feature that AsyncSSH doesn't support.

These substitutions work in the following config options:

Hostname                %%, %h
CertificateFile         all
IdentityAgent           all
IdentityFile            all
RemoteCommand           all

@ronf
Copy link
Owner

ronf commented May 31, 2020

My first cut as token expansion is now available in commit 073bff9 in the "develop" branch. Please let me know if you run into any problems with any of these changes.

@sanga
Copy link

sanga commented Jun 3, 2020

Finally found some time to look at this. After a cursory tire-kicking it looks great - works perfectly.

@ronf
Copy link
Owner

ronf commented Jun 3, 2020

Great, thanks for testing it!

Looking through the rest of the config options, there are a few more I can probably figure out a way to support, but the majority of what's remaining either don't apply (like EscapeChar or NumberOfPasswordPrompts) or may be used infrequently enough that it may not make sense to add the option until someone requests it.

There are a few I might explore as new options in both native form and through the config, like RequestTTY. The behavior won't match exactly, as OpenSSH bases its choice on whether a TTY is allocated on the client side, but I can get something close that extends what AsyncSSH does today where by default it requests a remote TTY when term_type is specified.

One other thing I've considered is whether to support LocalForward and RemoteForward options (and the related ClearAllForwardings). That can be done in AsyncSSH today with explicit calls to methods like forward_local_port() and forward_remote_port(). However, AsyncSSH returns an SSHListener object that can be used to turn off a previously requested port forwarding when you use those calls. If this is requested via the config file, it's not clear how the forwarding would ever later get turned off without closing the entire SSH connection. It may be better to stick with the current support for that.

Similarly, authentication in AsyncSSH is handled by the application either defining callbacks to provide credentials when requested or by providing credentials explicitly as options on the connect call. So, I'm not sure it makes sense to honor the config options that OpenSSH defines, since those settings should probably be overridden when credentials are provided using the existing AsyncSSH mechanisms.

@sanga
Copy link

sanga commented Jun 3, 2020

Actually, there is one thing I just hit that might be nice to include. man ssh_config has the following

  ssh(1) obtains configuration data from the following sources in the following order:

           1.   command-line options
           2.   user's configuration file (~/.ssh/config)
           3.   system-wide configuration file (/etc/ssh/ssh_config)

Although apparently asyncssh doesn't notice /etc/ssh/ssh_config. I suppose that might be pretty easy to support?

@ronf
Copy link
Owner

ronf commented Jun 3, 2020

It doesn't look for that file automatically, as the path can vary from one OpenSSH deployment to another. For instance, I have three different SSH instances on my system. The one in /usr/bin looks in /etc/ssh, but I also have an /opt/local/bin/ssh that looks in /opt/local/etc/ssh and a version I built myself that looks in /usr/local/etc (note: no "ssh" directory there).

If you need to read multiple files, you can pass a list of paths in "config". In other words:

    asyncssh.connect(host, ..., config=['~/.ssh/config', '/etc/ssh/ssh_config'])

@sanga
Copy link

sanga commented Jun 4, 2020

👍 thanks

@ronf
Copy link
Owner

ronf commented Jun 6, 2020

Another quick update: I added support for PreferredAuthentications and ProxyJump this week. Preferred authentications can also be set via a new "preferred_auth" client connection option, and a jump server can be set by passing a string of the form "[user@]host[:port]" as the "tunnel" argument to the top-level connect/listen functions.

Tunneling like this has been available for a while, but it required you to pass in an already open SSHClientConnection as the tunnel argument, or use helper functions like connect_ssh() on SSHClientConnection. This is still needed if you want to pass in non-default arguments when connecting the tunnel, but if you can get by with defaults or options in the user's config file when opening the tunnel, passing in a string (or using the new ProxyJump config option) is now an alternative.

@ronf
Copy link
Owner

ronf commented Jun 8, 2020

Here are some additional config options I added support for in the last couple of days:

GSSAPIAuthentication
GSSAPIKeyExchange

ChallengeResponseAuthentication
HostbasedAuthentication
KbdInteractiveAuthentication
PasswordAuthentication
PubkeyAuthentication

RequestTTY
PermitTTY (on server)

Also, Both GlobalKnownHostsFile and UserKnownHostsFIle are supported now, and both of them can take a list of multiple files. I'll probably look into doing something similar soon for the authorized_keys file when I hook that up.

@jeremyschulman
Copy link

jeremyschulman commented Jun 8, 2020

@ronf - I am looking at the develop branch for using the config file. I have a tool that will make ~1000 SSH connections, and I would like to load & parse the contents of the ssh_config once. After reviewing the code it appears that each call to connect will re-parse the config file. Is that correct?

@ronf
Copy link
Owner

ronf commented Jun 8, 2020

Yes, I think creating an "options" object once with "config" as an argument should allow you to parse the config file only once, as long as the config doesn't depend on something like user or host which might be changing when you make the actual connections. In this regard, it should work similar to other option values which point at filenames, like loading client keys. I must admit this is a combination I haven't explicitly tried, though. Let me know if you run into problems trying this out.

@jeremyschulman
Copy link

@ronf - For my use-case I am connecting to > 1000 different devices. I believe AsyncSSH will load and parse the ssh_config files(s) 1000 times; once per device; and potentially more since my program will attempt multiple credentials per device if necessary.

If that is true, I would like to request a mechanism where by I could load & parse the ssh_config file(s) once and pass that instance to connect as an alternative to using the currently defined config=[<list of file name>].

That said, I examined the code for asyncssh.config.load_client_config and I realize this requires unique parameters on a per-device basis.

@ronf
Copy link
Owner

ronf commented Jun 10, 2020

Yeah - if you want to use Host/Match blocks or token expansion in your config file, something like that wouldn't work. However, there are probably use cases where that's not an issue.

Before I added config support, I already allowed something like this for saving a collection of options. Basically, you could create your own SSHClientConnectionOptions() object with the same keywords you would have passed to connect(), and it would do the error checking and normalization once, and they you could pass this object via the "options" argument to connect() instead of passing the individual keyword arguments. You could also incrementally modify one options object to create another, by passing in "options" along with arguments you want to change when creating a second options object out of the first.

To make something like this work for the "config" argument is going to require some small changes to how I've implemented that so far, but I think it's a good idea. Then, you could create an SSHClientConnectionOptions object by passing in "config=", along with other native AsyncSSH options if you wanted to and it would parse the config file once and save the results. You could then pass that options object to connect() (or other similar functions). As long as you don't have any conditional blocks which would match differently, you could reuse this options object for multiple connect() calls.

I've got a first cut of this written -- I just need to add some unit tests and do another pass over the code. I should be able to have something you can try out in a day or two.

@jeremyschulman
Copy link

@ronf - Thank you for your consideration and proposed approach:

Then, you could create an SSHClientConnectionOptions object by passing in "config=", along with other native AsyncSSH options if you wanted to and it would parse the config file once and save the results. You could then pass that options object to connect() (or other similar functions).

This would be perfect for my use-case.

@ronf
Copy link
Owner

ronf commented Jun 11, 2020

Ok - commit d832d8a in the "develop" branch adds support for incrementally creating SSHClientConnectionOptions and SSHServerConnectionOptions objects that can start from a base configuration but then update specific config options or load additional config files. Once created, the parsed configuration can then be used for multiple connect()/listen() calls, without needing to parse the config files multiple times.

The one catch here is that you won't get the right results if you try and reuse a config with Host/Match blocks in it and you are changing the user/host/port information between requests in a way that would change which conditional blocks matched. Since you are only parsing the configuration once, these blocks will be evaluated against the user/host/port information set at the time the options objects are created.

The syntax is similar to the existing options support:

    options = asyncssh.SSHClientConnectionOptions(config='my_config')

    async with asyncssh.connect(options=options) as conn:
        ...

A mix of AsyncSSH option arguments and config is supported here, as is using one options object to create another:

    options = asyncssh.ClientConnectionOptions(user='some_user', config='my_config')

    new_options = asyncssh.ClientConnectionOptions(options=options, user='new_user')

You can even pass both "options" and "config" to incrementally load additional config files onto a previously parsed set of options, either when creating a new options object or when calling something like connect() or listen().

@ronf
Copy link
Owner

ronf commented Jun 11, 2020

Actually, please hold off for a bit on trying this. It looks like my unit tests on Linux and Windows are picking up something that wasn't caught on MacOS that I need to look into.

@ronf
Copy link
Owner

ronf commented Jun 11, 2020

Ok - the latest commit should fix the issue. The problem did actually show up on the Mac when I reran the unit tests here, so somehow I must have missed doing that on the final edit of the code before the last checkin. The error was in handling the "config=None" case to disable config file loading.

@jeremyschulman
Copy link

@ronf - Thank you very much for adding this support!

@ronf
Copy link
Owner

ronf commented Jun 14, 2020

My pleasure!

Today I added support for Match blocks in the server config (User, Address, Host, LocalAddress, and LocalPort), and the server option "UseDNS" to control whether to do reverse DNS lookups of the client address to allow "Match Host" to match on client hostnames. The "UseDNS" config option is also available as an "rdns_lookup" connection option, and applies to the authorized keys "from=" matching. Previously, the reverse lookup was always done when "from=" options were present, but now you can avoid the cost of doing the reverse DNS lookup, as long as you are ok matching only on IP addresses, and this is actually the new default behavior (matching OpenSSH).

I also added token substitution support in server config files. For now, though, the only option is to use "%u" to substitute the username in the AuthorizedKeysFile option.

@ronf
Copy link
Owner

ronf commented Jun 15, 2020

I just checked in some initial documentation for this new config file support at:

https://asyncssh.readthedocs.io/en/develop/api.html#config-file-support

It summarizes all of the supported config options, match criteria, and token substitutions currently supported. Please let me know if you have any problems with any the new functionality, or if there are specific options you'd like to see support for which aren't there yet.

@ronf
Copy link
Owner

ronf commented Jul 12, 2020

Initial support for this is now available in release 2.3.0!

@sanga
Copy link

sanga commented Jul 13, 2020

Awesome. Thanks again!

@rojer
Copy link
Author

rojer commented Jul 13, 2020

thanks, Ron! i drifted away from the project i needed this for initially, but i'm glad this happened, and will definitely use it in the future

@sanga
Copy link

sanga commented Jul 16, 2020

I guess this issue can be closed now? Or is there still some remaining work?

@ronf
Copy link
Owner

ronf commented Jul 17, 2020

Yeah, I'll go ahead and close this. There are some additional config options that could probably be added at some point, but nothing urgent, so I'll probably let demand drive this to some degree.

@ronf ronf closed this as completed Jul 17, 2020
@ANaumann85
Copy link

Thanks a lot for all the effort in implementing the configuration :)

Would it be possible to add support for the Include-statement? That allows me to keep ssh configurations consistent over several systems, without need to update the configuration in ~/.ssh/ssh_config.

@ronf
Copy link
Owner

ronf commented Nov 8, 2021

The "Include" directive should already be supported for both client and server config files, along with "Match" to make some portions of the configuration conditional. For client config, the "Host" directive is also supported as an alternative to "Match Host", and all of these are intended to match the behavior of OpenSSH.

Have you run into a case where "Include" isn't working properly?

@ANaumann85
Copy link

But I do not see the "include"-directive in the list of "client config options" at the documentation.

My minimal non-working example is:

  1. a file ~/someFolder/ssh_config containing only
Match Host server.example.com
  User MyUserName
  1. In ~/.ssh/config i have the line Include ~/someFolder/ssh_config at the beginning

Please note the ~ at the beginning. With the tilde, I get a "permission denied", due to wrong the user name.

If I replace the tilde by /fullPath/to/my/home, then the login works as expected.

@ANaumann85
Copy link

I have version 2.7.0.. May that be the reason?

@ANaumann85
Copy link

Indeed, it works with 2.7.1 and newer. I did not see #384 . Sorry for the noise

@ronf
Copy link
Owner

ronf commented Nov 9, 2021

No problem - glad you got it to work!

Regarding the documentation, there is a mention of "Include" in the text at the top of https://asyncssh.readthedocs.io/en/latest/api.html#config-file-support:

AsyncSSH also supports the “Include” directive, to allow one config file trigger the loading of others.

Special directives like Include and Match are covered at the top of that section, since they aren't specific to either the client or server side.

@tshu-w
Copy link

tshu-w commented Mar 28, 2022

@ronf Hi, is there any plan to support exec, because when using dvc if there is match exec in the configuration it will report an error. If not, allowing the user to ignore it instead of raising an error would be an option.

@ronf
Copy link
Owner

ronf commented Mar 29, 2022

I've generally been hesitant to rely on things that fork external processes as it can have a serious performance impact, especially when opening many SSH sessions in parallel from a single Python instance. However, I can probably do this in a non-blocking manner using the asyncio version of the subprocess support, and then it'd just be up to the caller to take responsibility for how much CPU it costs to evaluate each "Match exec".

I'll give this a closer look this week and let you know how I make out.

@ronf
Copy link
Owner

ronf commented Apr 3, 2022

As I dug into this, I ran into a bit of a snag -- changing config file parsing to be asynchronous would break backward compatibility with the existing API. So, I'm not able to do something like make the parse() function run the "Match Exec" in an executor. This means that you need to be careful to make sure that any commands run by "Match Exec" don't block for any significant amount of time, as the entire event loop will be blocked until config parsing completes.

With that limitation in mind, I think I have a first cut of this which you can try. It is checked into the "develop" branch as commit 331d9fd.

That said, while I can't make the parse() function itself run in an executor, I may be able to make the async functions that currently perform config parsing do so. This would eliminate the majority of the risk of blocking the event loop without breaking backward compatibility. Callers which directly trigger config parsing through the SSHConnectionOptions classes would still need to manage this themselves, but passing the options or config arguments to calls like asyncssh.connect() and asyncssh.listen() could be made to do this automatically. I'll look into that further.

In the meantime, I've added a warning under https://asyncssh.readthedocs.io/en/develop/api.html#config-file-support about this.

@ronf
Copy link
Owner

ronf commented Apr 3, 2022

Ok - commit 6c174d9 in the "develop" branch has been updated to use run_in_executor() for all AsyncSSH APIs which construct client/server connection option or config objects. So, applications only need to worry about this if they are instantiating the options or config objects manually.

@tshu-w
Copy link

tshu-w commented Apr 4, 2022

@rojer Thank you for your efforts! Since I'm not a direct user of asyncssh, I may not be able to test it very well, but the current update seems to be enough for me, I'll try using the develop branch of asyncssh in dvc and let you know if I run into problems.

@ronf
Copy link
Owner

ronf commented Apr 6, 2022

Sounds great - thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants