Description
Bug report
Bug description:
If you use shutil.copyfileobj() to copy one file-like object to another, the destination isn't flushed at the end of the copy. This can leave the destination file in a written but unflushed state, which could be counterintuitive.
This was found as part of the Emscripten build script, which works reliably on Python 3.13.5, but breaks on 3.14.0b3:
import shutil
import tempfile
from urllib.request import urlopen

workingdir = "./tmp"
shutil.rmtree(workingdir, ignore_errors=True)
with tempfile.NamedTemporaryFile(suffix=".tar.gz") as tmp_file:
    with urlopen(
        "https://github.com/libffi/libffi/releases/download/v3.4.6/libffi-3.4.6.tar.gz"
    ) as response:
        shutil.copyfileobj(response, tmp_file)
        # Uncomment this flush to make the example work on 3.14
        # tmp_file.flush()
    shutil.unpack_archive(tmp_file.name, workingdir)
I'm guessing this discrepancy was caused by #119783 (/cc @morotti), which increased the buffer size for copyfileobj(). I'm guessing that the bug existed prior to that change, but increasing the buffer size also increased the potential exposure to buffer flush errors/inconsistencies.
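A network-free sketch of the same symptom, in case it helps with triage. It assumes the payload (1000 bytes, an arbitrary choice) is smaller than the destination's internal write buffer, so the copied data stays in the buffer until something flushes it. The payload and variable names are made up for illustration; they aren't from the Emscripten script.

```python
import io
import os
import shutil
import tempfile

# Assumption: 1000 bytes is smaller than the destination's write buffer,
# so nothing reaches the file on disk until a flush happens.
payload = b"x" * 1000

with tempfile.NamedTemporaryFile(suffix=".bin") as tmp_file:
    shutil.copyfileobj(io.BytesIO(payload), tmp_file)
    # Anything that opens tmp_file.name at this point sees the on-disk size,
    # not the bytes still sitting in the buffer.
    size_before_flush = os.path.getsize(tmp_file.name)

    tmp_file.flush()
    size_after_flush = os.path.getsize(tmp_file.name)

print(size_before_flush, size_after_flush)
```

Under that assumption I'd expect the first size to be 0 and the second to be 1000, which matches what a by-name reader such as shutil.unpack_archive() would see before and after the flush.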
I'd argue that the documented API for copyfileobj() reads as though the function is a "complete" operation, including an implied flush on completion.
However, I can also see the argument that the decision to flush should be left to the user - in which case, the resolution for this issue would be updating the documentation to make it clear that users should flush after a copy.
There could also be an argument for adding a flush argument (defaulting to True) so that flushing could be optionally omitted for cases where the overhead of a flush might matter.
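To make that last option concrete, here is a rough sketch of the behaviour as a user-level wrapper. The name copyfileobj_flushed and its flush keyword are hypothetical. Neither exists in shutil today, and this is not a proposed implementation, just an illustration of the semantics.

```python
import io
import os
import shutil
import tempfile


def copyfileobj_flushed(fsrc, fdst, length=0, *, flush=True):
    """Hypothetical wrapper: copy, then optionally flush the destination."""
    shutil.copyfileobj(fsrc, fdst, length)
    if flush:
        # Not every file-like object has flush(), so look it up defensively.
        flush_method = getattr(fdst, "flush", None)
        if flush_method is not None:
            flush_method()


# Assumption: 1000 bytes fits inside the destination's write buffer.
data = b"y" * 1000

with tempfile.NamedTemporaryFile() as dst:
    copyfileobj_flushed(io.BytesIO(data), dst)
    size_with_flush = os.path.getsize(dst.name)

with tempfile.NamedTemporaryFile() as dst:
    copyfileobj_flushed(io.BytesIO(data), dst, flush=False)
    size_without_flush = os.path.getsize(dst.name)

print(size_with_flush, size_without_flush)
```

With the default, the file is complete on disk as soon as the call returns. Passing flush=False keeps the current behaviour for callers who want to batch several copies before flushing.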
CPython versions tested on:
3.14
Operating systems tested on:
macOS