Disk/image corruption with large numbers of files #2625
Comments
This issue has been most problematic in the real world scenario of an unfortunately large |
I'm seeing the same behavior of random corrupt files on |
Thanks. I tried switching to the qcow image format and was unable to reproduce the issue with this Dockerfile. |
Hi, Thanks for the reproduction scenario. We are actively looking for a fix on this issue. |
@tristanpemble I tried to repro this (one of my colleagues was able to), but so far I've had no luck, and I'm trying on several machines in parallel. One question: how full is your disk? Is it close to the limit, or is there plenty of free space? Thanks |
@rn:
|
Some additional information: several of my coworkers are seeing the same issues in other scenarios, and saw them go away with qcow. When trying to use this Dockerfile just now, we were unable to reliably reproduce, unfortunately. I've been able to reproduce every time:
One coworker could not, but saw the issue in other scenarios before switching to qcow:
Another coworker, on Sierra, was unable to reproduce (since Sierra will default to the qcow2 format). |
This was reproducing for me reliably on 2 laptops yesterday:
I updated the beta laptop to 10.13.4 Beta (17E160e) and now I can't reproduce it on that machine, but can still on the other one. Furthermore in the beta build another tangentially-related APFS bug was fixed (a bug in |
I'm wondering if there's something about switching back and forth between qcow/raw that makes it disappear? |
UPDATE:
Never mind on that one. Sorry for all of the comments; I'm trying to give as much info as I can since this is a pretty unpredictable bug. |
Note that the release notes for Docker 17.12.0-ce-mac55 reference this ticket:
|
Could this be related to an issue that I am seeing with mounting volumes? I am using Docker to run the toolchain for building a project in Ubuntu. I have all of the source in the Mac filesystem and the tools installed in a Docker container. I need a case-sensitive filesystem, so I initially created one on the Mac using a sparse bundle. When I ran my Docker container with the Mac disk image mounted as a volume, the make process failed because it created many files of apparently the right size, but filled with 0x00. Changing to a sparse image didn't help, but it seems to work when I created a .dmg file with the total amount pre-allocated by Disk Utility. I am using 17.12.0-ce-mac49 (21995) on High Sierra. |
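For anyone who wants to check a build tree for the symptom described above (files of plausible size that contain only 0x00 bytes), here is a minimal sketch in Python. It is not from this thread: the function names `is_zero_filled` and `find_zero_filled` are hypothetical, and it only detects fully zero-filled files, not partial corruption.

```python
import os


def is_zero_filled(path, chunk_size=1 << 20):
    """Return True if the file is non-empty and every byte is 0x00."""
    seen_data = False
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            seen_data = True
            # Any byte other than 0x00 means the file holds real data.
            if chunk.count(b"\x00") != len(chunk):
                return False
    return seen_data


def find_zero_filled(root):
    """Walk `root` and return a sorted list of zero-filled regular files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and is_zero_filled(path):
                hits.append(path)
    return sorted(hits)
```

Running `find_zero_filled` on the mounted build directory after a failed `make` should list the suspect outputs. Empty files are deliberately not reported, since many builds create them legitimately.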
Sorry I forgot to comment here when we released the update switching back to @agodoroja I suspect that is the same (or very similar) issue. It seems that when APFS allocates blocks for sparse files, they can occasionally end up with corrupt contents. I suspect the bug has been fixed on the macOS developer beta (10.13.4) recently and I hope the fix will be released as a regular update soon. In the meantime we've switched back to Thanks all (especially @tristanpemble for the test case) |
@djs55 will this be added directly to stable or incubated in edge first? |
macOS 10.13.4 has been released. Have we confirmed that it is safe to reset back to raw format? I don't know if Apple fixed this issue with sparsebundle allocation. |
I confirmed the bug still reproduces on my 10.13.3 machine and then updated to 10.13.4. I re-ran the test 5 times in a row and the bug didn't reproduce. It's hard to prove the bug has been fixed (there's no mention in the release notes). Perhaps someone else who has a machine where it reproduces could try updating, switch to raw mode (by editing the file extension in I suspect we'll want to incubate the change in edge before moving to stable. We'll also need to add a version check somewhere to keep qcow2 for 10.13.0-10.13.3. |
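The version check mentioned above could look roughly like the following Python sketch. This is an illustration only, not Docker's actual implementation: `parse_version` and `choose_disk_format` are hypothetical names, and keeping qcow2 for releases before 10.13 is an assumption based on the earlier comment that Sierra defaults to qcow2.

```python
def parse_version(version):
    """Turn a string such as '10.13.4' or '10.13' into a 3-part tuple."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)  # treat a missing patch level as 0
    return tuple(parts[:3])


def choose_disk_format(macos_version):
    """Pick 'raw' only on releases where the corruption did not reproduce.

    Assumption: 10.13.0 through 10.13.3 must stay on qcow2, and older
    releases also stay on qcow2.
    """
    if parse_version(macos_version) >= (10, 13, 4):
        return "raw"
    return "qcow2"
```

On a Mac, the input string could come from `platform.mac_ver()[0]` in the standard library. Comparing tuples of integers rather than raw strings avoids ordering mistakes such as "10.13.10" sorting before "10.13.4".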
My 10.13.4 took a while to install with a few more reboots than normal which leads me to believe an APFS update was included. If it was, and the bug was patched (I never experienced it on my machine) then I think the lack of transparency from Apple regarding APFS is a little bit concerning. Say what you want about keeping features secret, but filesystem bugs should surely be reported. That being said, do we know when k8s support will enter the stable release track? |
FWIW, I just updated to 10.13.4 and switched back to raw. On 10.13.3 I had some file corruption when untar'ing Linux kernel source trees, maybe 1 in 3 or 4 runs. I've now done about 12 iterations without any noticeable corruption. Definitely an improvement, and another data point that this might be fixed in APFS. |
I personally can't seem to reproduce it with the testcase provided in this issue on my end (D4M stable Not sure if this is related to the APFS bug described here: https://bombich.com/blog/2018/02/15/macos-may-lose-data-on-apfs-formatted-disk-images but it notes the issue persists in the 10.13.4 release 🤔
|
I just saw the edge channel update suggesting that this is the default again, linking me to this issue, but I don't see any indication that the underlying issue is fixed. Does anyone have a reference on the fix? Do I need to update macOS? |
Hi @glyph, you need to make sure you're on macOS 10.13.4 and reset Docker to factory defaults. 10.13.3 introduced the bug, I believe.
About the reference to the fix: Docker is closed source, so you won't see a commit link here. I assume the Docker team will resolve this one once the fix makes its way to stable. I personally can attest to raw working in macOS 10.13.4; I have been using raw in stable using the settings.json workaround for weeks with no corruption issues. The performance improvement is significant :) It's been talked about above, but in case it helps others, here are the exact steps to use raw with Docker stable (18.03.0-ce-mac60):
|
Yes, it looks like Apple has fixed the issue in APFS which caused disk corruption with sparse files. We ran a number of tests, including the one kindly provided by @tristanpemble, and none of them triggered any corruption on 10.13.4. So the latest edge enables raw disks again. We did not see any mention of this in Apple's changelog, and we noticed some other APFS-related changes too. |
I've run @tristanpemble's test on both 10.13.4 and 10.13.5 with a raw backing store. In neither case were corrupt files created. |
Closed issues are locked after 30 days of inactivity. If you have found a problem that seems similar to this, please open a new issue. Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows. |
Expected behavior
Generating large numbers of files in the container will not corrupt the files
Actual behavior
Files are corrupted
Information
Diagnose & Feedback:
Dockerfile:
Steps to reproduce the behavior
docker build .
docker run --rm 9968c5d7fc72 cat 6482.txt
(truncated output):
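The Dockerfile from the original report is not reproduced here. As a rough stand-in, the Python sketch below shows one way to generate a large number of numbered files with deterministic contents and then check them for corruption. Everything in it is an assumption for illustration: the helper names, the `<n>.txt` naming (chosen to resemble `6482.txt` above), and the 4096-byte default size. It is not the reporter's test case.

```python
import hashlib
import os


def expected_content(index, size=4096):
    """Deterministic, non-zero bytes derived from the file index."""
    seed = hashlib.sha256(str(index).encode()).digest()
    repeats = size // len(seed) + 1
    return (seed * repeats)[:size]


def write_files(root, count, size=4096):
    """Create `count` files named 0.txt, 1.txt, ... under `root`."""
    os.makedirs(root, exist_ok=True)
    for i in range(count):
        with open(os.path.join(root, f"{i}.txt"), "wb") as f:
            f.write(expected_content(i, size))


def find_corrupt(root, count, size=4096):
    """Return the indices whose on-disk bytes differ from what was written."""
    bad = []
    for i in range(count):
        with open(os.path.join(root, f"{i}.txt"), "rb") as f:
            if f.read() != expected_content(i, size):
                bad.append(i)
    return bad
```

A run inside the container might call `write_files("/data", 100000)` in one step and `find_corrupt("/data", 100000)` in a later one. Reading the files back immediately may be served from the page cache, so checking in a separate container run is more likely to reflect what actually reached the disk image.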