Cache file names should include file name extension #5486

@aschuh-hf

At the moment, files in the DVC cache have no file name extension, unless the file is a special .dir text file. This causes problems when the workspace uses symbolic links into the DVC cache and the consuming program calls pathlib.Path.resolve() and then determines the file format (e.g., PNG or JPEG image) from the file name extension, because the resolved path no longer has one. This would be solved by keeping the file name extension when moving files to the DVC cache (e.g., {cache}/{hash[:2]}/{hash[2:]}{ext}, where ext=".png"). Because such a change would invalidate existing caches, maybe consider adding this as an option in .dvc/config that a new project can opt in to.
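To make the problem reproducible without DVC itself, here is a minimal sketch using only the Python standard library. It assumes a cache layout of {cache}/{hash[:2]}/{hash[2:]} as described above. The `cache_path` helper, the `demo` function, and the hash value are hypothetical names and placeholders for illustration, not DVC's API. It needs a platform where creating symbolic links is permitted.

```python
import os
import tempfile
from pathlib import Path


def cache_path(cache_dir: Path, file_hash: str, ext: str = "") -> Path:
    """Hypothetical helper: build {cache}/{hash[:2]}/{hash[2:]}{ext}."""
    return cache_dir / file_hash[:2] / f"{file_hash[2:]}{ext}"


def demo() -> dict:
    """Compare the suffix of a workspace symlink with that of its resolved target."""
    file_hash = "d41d8cd98f00b204e9800998ecf8427e"  # placeholder hash value
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        # "current": no extension in the cache; "proposed": extension kept.
        for label, ext in (("current", ""), ("proposed", ".png")):
            target = cache_path(root / "cache" / label, file_hash, ext)
            target.parent.mkdir(parents=True)
            target.write_bytes(b"")  # empty stand-in for the cached file
            link = root / f"{label}.png"  # workspace file linked into the cache
            os.symlink(target, link)
            results[label] = (link.suffix, link.resolve().suffix)
    return results


if __name__ == "__main__":
    for label, (link_suffix, resolved_suffix) in demo().items():
        print(f"{label}: link={link_suffix!r} resolved={resolved_suffix!r}")
```

With the extension-less layout the resolved path has an empty suffix, while the proposed layout keeps ".png" after resolution, so a reader that dispatches on the extension would still work.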

As a concrete example of an image file reader that requires a proper file name extension, see SimpleITK.ReadImage(), which is more common in medical imaging applications than in computer vision.
