Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSH / SFTP Implementations #208

Open
juftin opened this issue Mar 14, 2024 · 3 comments
Open

SSH / SFTP Implementations #208

juftin opened this issue Mar 14, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@juftin
Copy link
Contributor

juftin commented Mar 14, 2024

The following script works differently between 0.2.2 and 0.0.24 and I'm trying to figure the cleanest way to account for it:

import upath

ssh_path = upath.UPath("ssh://juftin@192.168.1.123:22/").resolve()

(yeah, I know SFTP/SSH isn't a current implementation - I'd be happy to help to implement it, it looks to work pretty seamlessly without any custom functions)

0.0.24

>>> ssh_path
UPath('ssh://juftin@192.168.1.123:22/)
>>> ssh_path.path
'/'
>>> str(ssh_path)
'ssh://juftin@192.168.1.123:22/'
>>> ssh_path.fs.ftp.normalize(ssh_path.path)
'/'
>>> upath.__version__
'0.0.24'

0.2.2

>>> ssh_path
UPath('ssh://.')
>>> ssh_path.path
'.'
>>> str(ssh_path)
'ssh://.'
>>> ssh_path.fs.ftp.normalize(ssh_path.path)
'/home/juftin'
>>> upath.__version__
'0.2.2'

The difference here is that 0.2.2 assumes you are starting from the user's home directory and every path is relative from there, this means paths like ssh://juftin@192.168.1.123:22/var/local are no longer working.

In my case the previous way the SFTP/SSH filesystem went to the root of the directory tree was more intuitive. I've been working on getting this to work for browsr / textual-universal-directorytree , the below code works for me:

class SFTPTextualPath(UPath):
    """
    SFTPTextualPath
    """

    @property
    def path(self) -> str:
        """
        Always return the path relative to the root
        """
        pth = super().path
        if pth.startswith("."):
            return f"/{pth[1:]}"
        elif pth.startswith("/"):
            return pth
        else:
            return "/" + pth

    def __str__(self) -> str:
        """
        Add the protocol prefix + extras to the string representation
        """
        string_representation = f"{self.protocol}://"
        if "username" in self.storage_options:
            string_representation += f"{self.storage_options['username']}@"
        string_representation += f"{self.storage_options['host']}"
        if "port" in self.storage_options:
            string_representation += f":{self.storage_options['port']}"
        string_representation += self.path
        return string_representation
@ap--
Copy link
Collaborator

ap-- commented Mar 14, 2024

Hi @juftin,

thank you for reporting! This is a "bug" in the sense, that even though ssh filesystems are untested, it should be possible to avoid this for all untested filesystems where the netloc would be stripped or end up in the path anchor.

Correct behavior can be restored by setting a special internal setting in UPath's flavour implementation. Note that the following code fixes behavior for ssh:

import upath

# ------- v0.2.2 specific
# Please don't use these two lines in your code:

from upath._flavour import WrappedFileSystemFlavour
WrappedFileSystemFlavour.protocol_config["netloc_is_anchor"].add("ssh")

# There's going to be a correct fix implemented in `universal_pathlib`
# -------

ssh_path = upath.UPath("ssh://juftin@192.168.1.123:22/var/local")

print(f"{ssh_path.parts=}")  # ssh_path.parts=('/', 'var', 'local')
print(f"{ssh_path.root=}")   # ssh_path.root='/'

For a general fix, I need to check if it's possible to infer the netloc_is_anchor setting from the return value of a filesystem's _strip_protocol method, if provided with a generic uri with non-empty netloc.

But anyways, it would of course be great to add an SSHPath implementation and thoroughly test all path functionality.

Cheers,
Andreas

@ap-- ap-- added the bug Something isn't working label Mar 14, 2024
@juftin
Copy link
Contributor Author

juftin commented Mar 14, 2024

Oh very neat. I saw the SFTPFileSystemFlavour - but wasn't sure what might be needed there. I'd be happy to get a SSHPath implementation going - that would need this upstream netloc_is_anchor inference feature first though, right?

@ap--
Copy link
Collaborator

ap-- commented Mar 15, 2024

that would need this upstream netloc_is_anchor inference feature first though, right?

No, not really. You could add "ssh" and "sftp" here:

"netloc_is_anchor": {
"http",
"https",
"s3",
"s3a",
"gs",
"gcs",
"az",
"adl",
"abfs",
"webdav+http",
"webdav+https",
},

and make a SFTPPath / SSHPath implementation like the others. Most of the work will be to make a test fixture that sets up a local ssh server to test against. https://github.com/fsspec/filesystem_spec/blob/master/fsspec/implementations/tests/test_sftp.py this should be a good source for how to set this up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants