An unmaintained fsspec implementation for the ssh/sftp protocol via using parallel-ssh.
I didn't spend a lot of time on this project,
it was just test to see how good parallel ssh
performs in our use case though I wasn't able
to run the benchmarks due to various errors. I
also have to note that this implementation depends
on various internal methods of parallel-ssh (like
_eagain, _make_sftp) and undocumented public
methods (like eagain_write). I couldn't find
a public API to cover all the operations I needed
so just be aware.
You might encounter with different sorts of errors from different places, some might arise on race conditions and others just happen randomly. Here are a few that I stumbled against;
- When writing to a stream it might poll indefinitely and never continue
- When writing to a stream sometimes it just hits an assertion in the underlying libssh2 library (the C one, not the python-binding) and gets a core dump.
- There is a double free happening that I didn't try to deduce at which point, though I suspect might be a race condition (? even then it shouldn't happen).
Other stuff to note;
- Since all sftp errors throw a generic error,
we have to waste a couple exists() calls for each
operation to ensure
FileNotFoundErrors are properly delegated. This is costful and also doesn't guarantee much since you might still hit other sorts of errors which are nearly indistungishable. - No kerberos/gss auth for the ssh2, though just to note parallel-ssh claims they are available on the other client.