dvc.fs.Path.parts wrong results #7228

@jonburdo

EDIT: This issue now covers only the first problem below: handling a sep at the end of a path. I moved the Windows-style path problem to a separate issue, #7233.

When a path ends with the path sep, the parts function doesn't split. It returns a tuple with a single item:

from dvc.fs.path import Path
Path('/').parts('/a/b/c/')
('/a/b/c',)

A second problem occurs when using Windows-style paths. We get the sep as a separate element between the drive and the rest of the path:

Path('\\').parts('c:\\a')
('c:', '\\', 'a')

The first problem could be solved by simply stripping the final sep:

        drive, path = self.flavour.splitdrive(path.rstrip(self.flavour.sep))

but the second problem would still exist.
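To illustrate why the trailing sep matters, here is a minimal standalone sketch using only the standard library. It assumes that `parts` peels components off with `flavour.split` and that `flavour` is `posixpath` for `Path('/')`; neither is shown in this issue, so treat both as assumptions rather than a description of the DVC code.

```python
import posixpath

path = "/a/b/c/"

# With a trailing sep, split() returns an empty tail, so a loop that
# stops on the first empty tail would end after one step.
head, tail = posixpath.split(path)
print((head, tail))  # ('/a/b/c', '')

# Stripping the trailing sep first gives split() a real last component.
stripped = path.rstrip(posixpath.sep)
print(posixpath.split(stripped))  # ('/a/b', 'c')

# Caveat: rstrip() also turns a bare root path into an empty string,
# so the root case may need separate handling.
root_stripped = "/".rstrip(posixpath.sep)
print(repr(root_stripped))  # ''
```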

We should really get these results:

Path('/').parts('/a/b/c/')
('/', 'a', 'b', 'c')

and

Path('\\').parts('c:\\a')
('c:', 'a')
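One way to get both expected results could look like the sketch below. The function `expected_parts` is a hypothetical standalone helper written for this illustration, not the DVC implementation; it only relies on `splitdrive` and `sep` from the standard `posixpath` and `ntpath` modules.

```python
import ntpath
import posixpath


def expected_parts(path, flavour=posixpath):
    """Hypothetical helper: split a path into the parts proposed above."""
    drive, rest = flavour.splitdrive(path)
    sep = flavour.sep

    # Empty strings come from leading, trailing, or doubled seps; drop them.
    names = [name for name in rest.split(sep) if name]

    result = []
    if drive:
        # Treat the drive as the first element and drop the root sep.
        result.append(drive)
    elif rest.startswith(sep):
        # No drive: keep the root sep as the first element.
        result.append(sep)
    result.extend(names)
    return tuple(result)


print(expected_parts("/a/b/c/"))         # ('/', 'a', 'b', 'c')
print(expected_parts("c:\\a", ntpath))   # ('c:', 'a')
```

This sketch splits only on `flavour.sep`, so forward slashes in a Windows-style path are not handled.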

Note the second case is still a little different from pathlib, which would include the sep with the drive:

from pathlib import PureWindowsPath
PureWindowsPath('c:\\a').parts
('c:\\', 'a')

but this is probably more in line with fsspec, which basically treats the drive letter as the first element of a relative path:

import fsspec
fsspec.AbstractFileSystem._parent('c:/a')
'c:'
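For comparison, the snippet below shows how the standard library's pure path classes split the same two inputs. It uses only `pathlib`, so it says nothing about DVC or fsspec internals; it is here just to make the difference from the proposed results concrete.

```python
from pathlib import PurePosixPath, PureWindowsPath

# pathlib ignores the trailing sep and keeps the root as its own part.
posix_parts = PurePosixPath("/a/b/c/").parts
print(posix_parts)  # ('/', 'a', 'b', 'c')

# On Windows-style paths, drive and root are merged into the first part.
win = PureWindowsPath("c:\\a")
print(win.parts)  # ('c:\\', 'a')
print(win.drive)  # 'c:'
print(win.root)   # '\\'
```

So the proposed posix result matches pathlib, while the proposed Windows result keeps only `drive` as the first element and drops `root`.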

version info:

DVC version: 2.9.4.dev28+gd90fe54d.d20220106 
---------------------------------
Platform: Python 3.10.1 on Linux-5.15.11-arch2-1-x86_64-with-glibc2.33
Supports:
	azure (adlfs = 2021.10.0, knack = 0.9.0, azure-identity = 1.7.1),
	gdrive (pydrive2 = 1.10.0),
	gs (gcsfs = 2021.11.1),
	hdfs (fsspec = 2021.11.1, pyarrow = 6.0.1),
	webhdfs (fsspec = 2021.11.1),
	http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
	https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
	s3 (s3fs = 2021.11.1, boto3 = 1.19.8),
	ssh (sshfs = 2021.11.2),
	oss (ossfs = 2021.8.0),
	webdav (webdav4 = 0.9.3),
	webdavs (webdav4 = 0.9.3)
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: https
Workspace directory: btrfs on /dev/mapper/nvme0n1p3_crypt
Repo: dvc, git

Labels: bug (Did we break something?)