Description
Feature or enhancement
Proposal:
Brief summary: Python ought to use clonefile (not fclonefile) with the COPYFILE_CLONE flag on macOS to copy files with copy-on-write semantics similar to Linux.
In response to an update on the Pip issue about using copy-on-write to improve install performance, someone mentioned “In Python 3.14 shutil.copyfile uses CoW optimizations”.
The documentation actually accurately reflects the current state of things, as fcopyfile is a “fast copy” which will do the work “more efficiently”, and it calls out “Copy-on-write” as specific to “supported Linux filesystems”. But this platform specificity is very subtle, and macOS could use CoW copies too.
As such, I was a bit confused and went on a little journey. The man page for fcopyfile on macOS documents both COPYFILE_CLONE_FORCE and COPYFILE_CLONE as being equivalent to the flags COPYFILE_EXCL | COPYFILE_STAT | COPYFILE_XATTR | COPYFILE_DATA | COPYFILE_NOFOLLOW_SRC. So I thought maybe it was implemented already. But that can't be right, because their behavior is documented as differing from each other, even though the list of flags that they are “equivalent to” is the same. Also, COPYFILE_CLONE(_FORCE) mentions progress callbacks not being invoked, etc., whereas COPYFILE_DATA etc. do not document this feature, making it sound like "cloning" (i.e. CoW-semantics copying) is actually only invoked by the COPYFILE_CLONE variants and not their “equivalent” flags. Searching around led me to this page, which marks copyfile(…, …, …, COPYFILE_DATA) as explicitly not using CoW. So, in order to make sure I was actually reporting a real issue, I grabbed and compiled apfs-clone-checker, in order to verify that cp -c produced a cloned file and 3.14's Path.copy and shutil.copyfile both did not.
Indeed: they did not.
I thought that the obvious patch to change the behavior would be to simply replace posix._COPYFILE_DATA with 0x1000000, i.e. the value of COPYFILE_CLONE. And it did change the behavior! Unfortunately the behavior change in question was to create destination files with a length of zero, and not report any errors.
When I switched it to posix._COPYFILE_DATA |= COPYFILE_CLONE, I did at least get the data of the file back out again, but apfs-clone-checker reports it as “not a clone”. I was wondering if perhaps it was overly crudely measuring, bailing out because one or two metadata blocks didn't match, so I wrote my own:

    import struct
    from fcntl import fcntl
    from os import fstat
    from sys import argv

    F_LOG2PHYS_EXT = 65  # macOS fcntl command: map a file offset to a device offset
    # struct log2phys: flags (uint32), contiguous bytes (off_t), device offset (off_t)
    log2phys = struct.Struct("=Iqq")

    def blocks(fn: str) -> set[int]:
        """Return the set of device offsets backing each block of fn."""
        offsets = set()
        with open(fn, "rb") as f:
            fno = f.fileno()
            sr = fstat(fno)
            blksize = sr.st_blksize
            for blkoff in range(0, sr.st_size, blksize):
                result = fcntl(fno, F_LOG2PHYS_EXT, log2phys.pack(0, blksize, blkoff))
                flags, contig, offt = log2phys.unpack(result)
                offsets.add(offt)
        return offsets

    reference = argv[1]
    others = argv[2:]
    a = blocks(reference)
    for fn in others:
        b = blocks(fn)
        overlap = len(a & b)   # blocks shared by both files
        noverlap = len(a | b)  # all distinct blocks across both files
        print(f"{reference} ⤳ {fn}: {overlap / noverlap}")

and then tried various strategies to see if I could get anything above 0.0.
I could get close-to-but-not-exactly-1.0 numbers by using cp -c and then doing something like this:

    from sys import argv

    # Overwrite a few bytes in place so that one block diverges from the clone.
    with open(argv[1], "r+b") as f:
        f.seek(4096 * 20)
        f.write(b"scribble")

but any call to fcopyfile (as compared to copyfile or clonefile) resulted in files with an overlap score of 0, including with a minimal C reproducer like this:
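
    #include <copyfile.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int ret;
        int ret2;
        int src = open("somedata", O_RDONLY);
        if (src < 0) {
            printf("couldn't open src\n");
            return 1;
        }
        int dst = open("somedata.c.clone", O_WRONLY | O_CREAT, 0700);
        if (dst < 0) {
            printf("couldn't open dst\n");
            return 2;
        }
        /* File-descriptor variant: the result was not a clone in my testing. */
        ret = fcopyfile(src, dst, NULL, COPYFILE_CLONE_FORCE | COPYFILE_DATA | COPYFILE_ALL);
        /* Path-based variant: the result was a clone in my testing. */
        ret2 = copyfile("somedata", "somedata.c.byfile", NULL, COPYFILE_CLONE_FORCE | COPYFILE_DATA | COPYFILE_ALL);
        printf("fcopyfile: %d, copyfile: %d\n", ret, ret2);
        close(src);
        close(dst);
        return 0;
    }

In this case, somedata.c.byfile is (by F_LOG2PHYS_EXT's standards, at least) a clone, but somedata.c.clone is not.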
The fcopyfile operation is really fast, but then, so is my computer, and I can't find a better heuristic for “did this work”. But to the best of my investigative abilities, this will need to be modified to use path names, and thus copyfile, rather than file descriptors in order to get a CoW clone on macOS.
(I have not yet had the endurance to test with clonefileat, maybe that could be made to work, but I am not entirely sure what's going on with those directory file descriptors and I didn't want to do yet another round of debugging.)
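For illustration, here is roughly what a path-based approach could look like from Python using ctypes. This is a minimal sketch and not a proposed patch. `clone_or_copy` is a hypothetical helper name. It calls clonefile(2) directly rather than copyfile with COPYFILE_CLONE, and assumes clonefile is exported by the system C library. On any failure, or on a non-macOS platform, it falls back to shutil.copyfile.

```python
import ctypes
import ctypes.util
import os
import shutil
import sys


def clone_or_copy(src: str, dst: str) -> str:
    """Try a CoW clone via clonefile(2); otherwise fall back to shutil.copyfile.

    Hypothetical helper for illustration only. Returns "clone" or "copy"
    so the caller can tell which path was taken.
    """
    if sys.platform == "darwin":
        libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
        # Assumption: the C library exports clonefile; getattr yields None if not.
        clonefile = getattr(libc, "clonefile", None)
        if clonefile is not None:
            clonefile.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_uint32]
            clonefile.restype = ctypes.c_int
            # clonefile takes path names, not descriptors, and fails if dst exists.
            if clonefile(os.fsencode(src), os.fsencode(dst), 0) == 0:
                return "clone"
            # A nonzero return (e.g. dst exists, or the volume cannot clone)
            # falls through to the ordinary copy below.
    shutil.copyfile(src, dst)
    return "copy"
```

Calling `clone_or_copy("somedata", "somedata.py.clone")` and then running the block-overlap script above on both files would be one way to check whether the result really shares storage with the original.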
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere