File not found error when using SFTPFileSystem implementation in duckDB #1938

@denisrohr

Hello,

When using fsspec.implementations.sftp.SFTPFileSystem, calling fs.size() with a fully qualified path (including sftp://servername) does not strip the protocol. The literal "sftp://servername/folder/file.csv" is queried instead, which results in a file-not-found error.

The issue stems from SFTPFileSystem overriding info() here: https://github.com/fsspec/filesystem_spec/blob/master/fsspec/implementations/sftp.py#L95. The override hides fsspec.spec.AbstractFileSystem's info() from here: https://github.com/fsspec/filesystem_spec/blob/master/fsspec/spec.py#L671, so the protocol-stripping logic here https://github.com/fsspec/filesystem_spec/blob/master/fsspec/spec.py#L688 is no longer called.
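To make the mechanism concrete, here is a minimal sketch that reproduces the pattern described above without needing an SFTP server. It is not fsspec's code: BaseFS, OverridingFS, FixedFS, and strip_protocol are hypothetical stand-ins, and the dictionary of files is an assumed substitute for the remote server.

```python
from urllib.parse import urlsplit


def strip_protocol(path: str) -> str:
    """Hypothetical helper: drop 'scheme://host' and keep only the remote path."""
    if "://" not in path:
        return path
    return urlsplit(path).path


class BaseFS:
    """Stand-in for a base class whose info() normalizes the path first."""

    def __init__(self, files):
        self.files = files  # maps remote path -> size in bytes

    def _lookup(self, path):
        # Simulates the server-side lookup; only bare paths exist there.
        if path not in self.files:
            raise FileNotFoundError(path)
        return {"name": path, "size": self.files[path]}

    def info(self, path):
        return self._lookup(strip_protocol(path))

    def size(self, path):
        # size() delegates to info(), so it inherits whatever info() does.
        return self.info(path)["size"]


class OverridingFS(BaseFS):
    """Stand-in for a subclass that overrides info() and skips the stripping."""

    def info(self, path):
        return self._lookup(path)


class FixedFS(BaseFS):
    """Same override, but it strips the protocol before the lookup."""

    def info(self, path):
        return self._lookup(strip_protocol(path))


files = {"/folder/file.csv": 42}
url = "sftp://servername/folder/file.csv"

try:
    OverridingFS(files).size(url)
    override_failed = False
except FileNotFoundError:
    override_failed = True

fixed_size = FixedFS(files).size(url)
```

In this sketch the overriding class fails on the fully qualified URL, while the variant that strips the protocol inside its own info() returns the size. Any method that delegates to info() would behave the same way.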

We experienced this behaviour when using fsspec from within DuckDB, which always uses the fully qualified path in all calls. We believe the same will happen when exists(), checksum(), sizes(), isdir(), isfile(), ukey(), or stat() is called.

Is this behaviour expected? Are these filesystem calls meant to leave the protocol in place, with users responsible for not including it in the path? If so, I will raise the issue with DuckDB instead.
