Base.Filesystem.samefile inconsistent, possibly too restrictive #46830

@umlet


As outlined on Discourse, there are two unrelated issues with samefile:

  1. samefile results can be inconsistent in an edge case
    Assume files "A" and "B" do not exist:

    julia> using Base.Filesystem
    julia> samefile(stat("A"), stat("B"))  # -> samefile(::StatStruct, ::StatStruct)
    true
    julia> samefile("A", "B")              # -> samefile(::AbstractString, ::AbstractString)
    false
    

    The latter uses ispath to check for file existence, and arguably has the expected/"correct" behavior.

  2. samefile might be too restrictive
    The Filesystem API provides many specific functions on StatStruct, like isfile(::StatStruct), along with unrestricted, generic variants like isfile(x) = isfile(stat(x)). Types that are stat()-able can thus fit in nicely (e.g., types caching the StatStruct).
    By comparison, samefile(::AbstractString, ::AbstractString) seems to be unnecessarily restrictive.
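To make item 1 concrete, here is a minimal sketch of why two missing paths can compare as "the same" at the StatStruct level. It assumes, as hinted further down, that stat returns a zero-filled StatStruct for a path that does not exist; the helper name `missing_stats_look_identical` is made up for illustration and is not part of Base.

```julia
using Base.Filesystem

# Hypothetical helper: stat two paths that are guaranteed not to exist and
# report whether their identifying fields happen to match.
function missing_stats_look_identical()
    mktempdir() do dir
        # "A" and "B" are never created inside the fresh temp directory.
        sta = stat(joinpath(dir, "A"))
        stb = stat(joinpath(dir, "B"))
        return (exists   = ispath(sta) || ispath(stb),
                same_ids = sta.device == stb.device && sta.inode == stb.inode)
    end
end

result = missing_stats_look_identical()
@show result.exists result.same_ids
```

If the assumption holds, neither path exists, yet the device/inode comparison alone cannot tell them apart. That is why the existence check matters.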

Not sure if it is a good idea to mix up two unrelated changes, but it's all within a handful of LOC; so a proposed change:

OLD/current:

# samefile can be used for files and directories: #11145#issuecomment-99511194
samefile(a::StatStruct, b::StatStruct) = a.device==b.device && a.inode==b.inode

"""
    samefile(path_a::AbstractString, path_b::AbstractString)
Check if the paths `path_a` and `path_b` refer to the same existing file or directory.
"""
function samefile(a::AbstractString, b::AbstractString)
    infoa = stat(a)
    infob = stat(b)
    if ispath(infoa) && ispath(infob)
        samefile(infoa, infob)
    else
        return false
    end
end

NEW:

"""
    samefile(patha, pathb) -> Bool
Check if `patha` and `pathb` refer to the same existing file or directory.
"""
function samefile(sta::StatStruct, stb::StatStruct)
    ispath(sta) || return false
    ispath(stb) || return false
    return sta.device == stb.device && sta.inode == stb.inode
end

samefile(patha, pathb) = samefile(stat(patha), stat(pathb))

(Reasons for the doc string location, the arg names in the doc string, etc. are given on Discourse.)
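As a rough illustration of what the generic method would allow, the sketch below mirrors the proposed definitions under a different name so that nothing in Base is overwritten. `samefile_sketch` and `CachedPath` are hypothetical names invented for this example; only stat, ispath, and the StatStruct fields device and inode come from Base.

```julia
using Base.Filesystem: StatStruct

# Stand-in for the proposed Base methods (hypothetical name).
function samefile_sketch(sta::StatStruct, stb::StatStruct)
    ispath(sta) || return false
    ispath(stb) || return false
    return sta.device == stb.device && sta.inode == stb.inode
end
samefile_sketch(patha, pathb) = samefile_sketch(stat(patha), stat(pathb))

# Hypothetical stat()-able type that caches the StatStruct of a path.
struct CachedPath
    path::String
    st::StatStruct
end
CachedPath(path::AbstractString) = CachedPath(String(path), stat(path))
Base.stat(p::CachedPath) = p.st

# Run a few comparisons inside a throwaway directory.
function demo_samefile_sketch()
    mktempdir() do dir
        f = joinpath(dir, "data.txt")
        touch(f)
        a = joinpath(dir, "A")   # never created
        b = joinpath(dir, "B")   # never created
        return (string_vs_cached = samefile_sketch(f, CachedPath(f)),
                missing_strings  = samefile_sketch(a, b),
                missing_stats    = samefile_sketch(stat(a), stat(b)))
    end
end

@show demo_samefile_sketch()
```

With these definitions, the String and StatStruct entry points should agree for missing paths, and any type that implements stat can be passed without a dedicated samefile method.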

There are other issues that might be interesting in this context (should stat return nothing instead of a 0-struct; is the name samefile ideal if it can also be used for dirs; etc.), but maybe for now it's best to keep this minimal/surgical?

Labels: bug (indicates an unexpected problem or unintended behavior)