@stevengj
Member

@stevengj stevengj commented Sep 26, 2025

This PR changes cp, and other uses of the internal function Base.sendfile(srcpath, destpath), to use the uv_fs_copyfile function.

This should be faster and more reliable: on many filesystems, uv_fs_copyfile can just create a copy-on-write link. Hopefully, this fixes longstanding problems with cp of large files: fixes #56537 (macOS), fixes #39868 (Linux), fixes #30723.

In particular, I noticed two problems with our previous sendfile implementation, which called a lower-level sendfile method on the file descriptors that in turn called uv_fs_sendfile. First, uv_fs_sendfile takes a Csize_t argument for the number of bytes, which is clearly too small for large files on 32-bit systems. Second, our code assumed that the return value of uv_fs_sendfile is the number of bytes written, but that return value is a Cint, which is clearly too small to hold the byte count of a large file (> 2GiB).

This PR therefore changes the uv_fs_sendfile call in two ways. First, it passes at most typemax(Cssize_t) bytes in a single call, writing larger files in chunks (this is the maximum on a 32-bit Unix system, since the underlying libc sendfile returns an ssize_t). Second, it uses the return value only as an error code: if uv_fs_sendfile succeeds, it looks like we can assume that it wrote the requested number of bytes.
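To make the chunking concrete, here is a minimal sketch of the arithmetic only. The helper name copy_chunks is hypothetical and is not the code in this PR; it just splits a byte count into (offset, length) requests that each stay within a given limit, which defaults to typemax(Cssize_t).

```julia
# Hypothetical helper for illustration; not part of Base and not the PR's code.
# Split a copy of `nbytes` bytes into (offset, length) requests of at most
# `maxchunk` bytes each.
function copy_chunks(nbytes::Integer, maxchunk::Integer = typemax(Cssize_t))
    nbytes >= 0 || throw(ArgumentError("nbytes must be nonnegative"))
    maxchunk > 0 || throw(ArgumentError("maxchunk must be positive"))
    chunks = Tuple{Int64,Int64}[]
    offset = Int64(0)
    remaining = Int64(nbytes)
    while remaining > 0
        len = min(remaining, Int64(maxchunk))  # never request more than the limit
        push!(chunks, (offset, len))
        offset += len
        remaining -= len
    end
    return chunks
end

# Simulate the 32-bit limit on any machine: a 5 GiB file needs three requests.
chunks = copy_chunks(Int64(5) * 2^30, typemax(Int32))
println(length(chunks), " requests")
```

Each request would then be checked only for an error code, as described above, rather than for a returned byte count.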

In principle, we could delete this lower-level Base.sendfile method completely, since it is undocumented and we don't use it. I couldn't find any external packages that use it either. But I left it in for now, to be conservative.

I'm still seeing a test failure with this PR related to file mode: on my macOS system, it seems to be ignoring the umask for the permissions of the copied file. I'm not sure why: libuv's copyfile implementation on Unix should call open with the mode of the source file, which the OS should then mask with the umask, no? See also #27295.

cc @vtjnash, who suggested using uv_fs_copyfile in #27295 (review).

@stevengj stevengj added the filesystem Underlying file system and functions that use it label Sep 26, 2025
@stevengj
Member Author

The failing tests are due to the mode not respecting umask, as noted above. @vtjnash, any advice on why libuv might be ignoring the umask, here?

@vtjnash
Member

vtjnash commented Sep 26, 2025

It is implemented to copy the source permissions rather than apply the umask: libuv/libuv#4396

@stevengj
Member Author

stevengj commented Sep 27, 2025

So should our cp do the same? Or should I do an explicit chown and chmod with the umask after copying?

Or should I submit a PR to libuv to add a flag that makes copyfile not copy ownership and permissions? (But it seems like this could introduce an inconsistency with Windows, according to the issue linked below.)

@vtjnash filed libuv/libuv#3125 requesting that libuv copy over the permissions and ownership to match cp, so I'm guessing that you would advocate that we change our behavior as well? But the umask treatment in #27295 was clearly intentional by @staticfloat at the suggestion of @KristofferC; would this be considered a breaking change?

@vtjnash
Member

vtjnash commented Sep 28, 2025

IIRC, we don't care, but want to ensure it is consistent across platforms whichever way it goes so it isn't a weird gotcha when testing different platforms

@staticfloat
Member

IMO we should generally follow whatever coreutils does, so that our cp() is similar to their cp.

@stevengj
Member Author

stevengj commented Sep 28, 2025

> IMO we should generally follow whatever coreutils does, so that our cp() is similar to their cp.

uv_fs_copyfile seems similar to coreutils cp --preserve.

Our existing behavior is similar to the default cp behavior in that it applies the umask, but it differs in other ways; in particular, if the destination file already exists, then cp preserves its existing permissions:

> In the absence of [the --preserve] option, the permissions of existing destination files are unchanged. Each new file is created with the mode of the corresponding source file minus the set-user-ID, set-group-ID, and sticky bits as the create mode; the operating system then applies either the umask or a default ACL, possibly resulting in a more restrictive file mode.

whereas I believe we do not do this (in force=true mode we rm the file before copying).
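As a rough illustration of the two policies for a newly created destination file, the mode arithmetic can be sketched as follows. Both helper names are hypothetical (neither exists in Base or libuv), the default-cp rule follows the coreutils text quoted above, and default ACLs are ignored.

```julia
# Hypothetical helpers for illustration only; neither is part of Base or libuv.

# Default `cp`: create mode is the source mode minus the set-user-ID,
# set-group-ID, and sticky bits; the OS then applies the umask.
default_cp_mode(src_mode::Integer, umask::Integer) = (src_mode & 0o777) & ~umask

# Preserving copy, as discussed in this thread: the destination gets the
# source permission bits regardless of the umask (assumption: only the nine
# rwx bits are modeled here).
preserve_mode(src_mode::Integer) = src_mode & 0o777

src_mode = 0o666
umask = 0o027
println("default cp: ", string(default_cp_mode(src_mode, umask), base = 8))
println("preserving: ", string(preserve_mode(src_mode), base = 8))
```

With a source mode of 0o666 and a umask of 0o027, the first rule yields 0o640 while the preserving rule keeps 0o666, which is the difference the failing test picked up.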

@staticfloat
Member

I think it's fine to be similar to cp --preserve, as long as we document that we're going to try to preserve permissions; that seems like a totally reasonable default.

@stevengj
Member Author

Okay, I've changed the docstring and test to document that it acts like cp -p, at least for files (ignoring the question of directories for now).

This doesn't seem like a breaking change since the previous behavior was undocumented.
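For anyone who wants to check the behavior locally, a small sketch along these lines should show whether a given build carries the source permission bits over. The function name is hypothetical, the result depends on the Julia version and the current umask, and chmod has only limited effect on Windows.

```julia
# Hypothetical check for illustration; not part of the PR's test suite.
# Copies a file with mode 0o666 and returns the permission bits of the
# source and of the copy.
function copy_and_compare_modes()
    mktempdir() do dir
        src = joinpath(dir, "src.txt")
        dst = joinpath(dir, "dst.txt")
        write(src, "hello")
        chmod(src, 0o666)   # group/other write bits are what a typical umask strips
        cp(src, dst)
        return (filemode(src) & 0o777, filemode(dst) & 0o777)
    end
end

src_perm, dst_perm = copy_and_compare_modes()
println("src: ", string(src_perm, base = 8), "  dst: ", string(dst_perm, base = 8))
println(src_perm == dst_perm ? "permissions preserved" : "permissions differ (umask applied?)")
```

Under a typical umask of 0o022, a copy that applies the umask would report 644 for the destination, while a cp -p style copy would report 666.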

@stevengj
Member Author

stevengj commented Oct 1, 2025

CI failure seems to be an unrelated Error in testset precompile. Update: now CI is green.

@floswald

floswald commented Oct 28, 2025

hi all,
this worked for me.

floswald@PTL11077 ~/g/julia (sgj/uv_fs_copyfile)> ./julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.13.0-DEV.1204 (2025-09-30)
 _/ |\__'_|_|_|\__'_|  |  sgj/uv_fs_copyfile/fc4d77672f (fork: 4 commits, 33 days)
|__/                   |

julia> versioninfo()
Julia Version 1.13.0-DEV.1204
Commit fc4d77672f (2025-09-30 05:27 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin24.6.0)
  CPU: 10 × Apple M1 Pro
  WORD_SIZE: 64
  LLVM: libLLVM-20.1.8 (ORCJIT, apple-m1)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 8 virtual cores)



julia> cp("/Users/floswald/Dropbox (Personal)/Apps/JPE-packages/JPE/Green-20220767/2/replication-package/Ben Sand - Union_Rep_Package.zip", "/Users/floswald/Downloads/test.zip")
"/Users/floswald/Downloads/test.zip"

julia> exit()
floswald@PTL11077 ~/g/julia (sgj/uv_fs_copyfile)> julia
The latest version of Julia in the `release` channel is 1.12.1+0.aarch64.apple.darwin14. You currently have `1.11.4+0.aarch64.apple.darwin14` installed. Run:

  juliaup update

to install Julia 1.12.1+0.aarch64.apple.darwin14 and update the `release` channel to that version.
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.4 (2025-03-10)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |


julia> cp("/Users/floswald/Dropbox (Personal)/Apps/JPE-packages/JPE/Green-20220767/2/replication-package/Ben Sand - Union_Rep_Package.zip", "/Users/floswald/Downloads/test.zip", force = true)
ERROR: IOError: sendfile: Unknown system error -184335295 (Unknown system error -184335295)
Stacktrace:
 [1] uv_error
   @ ./libuv.jl:106 [inlined]
 [2] sendfile(dst::Base.Filesystem.File, src::Base.Filesystem.File, src_offset::Int64, bytes::Int64)
   @ Base.Filesystem ./filesystem.jl:224
 [3] sendfile(src::String, dst::String)
   @ Base.Filesystem ./file.jl:1131
 [4] cp(src::String, dst::String; force::Bool, follow_symlinks::Bool)
   @ Base.Filesystem ./file.jl:386
 [5] top-level scope
   @ REPL[2]:1

julia> 

floswald@PTL11077 ~/g/julia (sgj/uv_fs_copyfile)> du -h /Users/floswald/Downloads/test.zip 
3.8G	/Users/floswald/Downloads/test.zip

@stevengj stevengj added the triage This should be discussed on a triage call label Nov 22, 2025
@stevengj
Member Author

CI failure looks unrelated:

Error in testset REPL:
Error During Test at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-master/julia-4aa2ef83b2/share/julia/stdlib/v1.14/REPL/test/repl.jl:1862
  Got exception outside of a @test
  failed process: Process(`/Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-master/julia-4aa2ef83b2/bin/julia --startup-file=no -e 'using REPL; print(REPL.Pkg_promptf())'`, ProcessExited(1)) [1]

Development

Successfully merging this pull request may close these issues.

Base.cp seems to have a bug when copying large files
cp fails on linux for large files
IOError from cp with large file on Windows

5 participants