Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io.Copy error? #132

Open
hallyn opened this issue Oct 14, 2021 · 11 comments
Open

io.Copy error? #132

hallyn opened this issue Oct 14, 2021 · 11 comments

Comments

@hallyn
Copy link

hallyn commented Oct 14, 2021

I'm probably doing something wrong here, so apologies in advance. However, when I copy a large file into a newly created fat32, the copied size is correct, but the tail of the file is zeroed out.

Using the following as a fat32.go test file: https://gist.github.com/hallyn/24acd9c7cf407c0561629b5c032f2bb8

With the following go.mod:

module github.com/hallyn/disktest

require (
    #github.com/diskfs/go-diskfs v1.2.0
    github.com/diskfs/go-diskfs v0.0.0-20211001121417-bcaf6e5e95e0
)

go 1.17

I fetched
https://github.com/shadow-maint/shadow/releases/download/v4.9/shadow-4.9.tar.gz
and copied it to /tmp/in, and ran 'go run fat32.go'. The test places the output
in /tmp/fat, so

serge@sl ~/test/go-disk$ diff /mnt/EFI/BOOT/kernel.efi  /tmp/in
Binary files /mnt/EFI/BOOT/kernel.efi and /tmp/in differ
serge@sl ~/test/go-disk$ od -x /tmp/in > /tmp/1
oserge@sl ~/test/go-disk$ od -x /mnt/EFI/BOOT/kernel.efi > /tmp/2
serge@sl ~/test/go-disk$ diff -up /tmp/1 /tmp/2
--- /tmp/1	2021-10-13 22:54:31.010732470 -0500
+++ /tmp/2	2021-10-13 22:54:37.782968885 -0500
@@ -247614,15 +247614,7 @@
 17071720 ce36 091d e94a f314 f129 ed8b 3bac 9276
 17071740 bed6 fb3e 3266 fb35 9e22 a7d6 7f44 f10b
 17071760 4c2f 0a45 abfc 4b98 8e50 199e 4cae e0a9
-17072000 cade de09 4c98 b359 466e f0fb eb3b 41e3
-17072020 4726 4ff9 0597 7fc9 01ac f7dc 5ef4 39b6
-17072040 b8f0 a181 27e3 4430 837d f9ae d9b7 4219
-17072060 f295 ddd0 4d92 2460 7938 b618 5720 fef6
-17072100 121d 9a6e 598f 94c8 76a3 ed09 fbdc 8fa0
-17072120 a4a2 9b6f 3cab 2de1 23cf 50ef 3e75 3fa1
-17072140 639c 003b 6318 62b7 42b9 614a 06db f011
-17072160 149e 8ee3 88be a95c 2523 aa58 7a75 f5ad
-17072200 facf fd67 feb3 ff59 7fac 3fd6 9feb cff5
-17072220 67fa b3fd 59fe acff d67f eb3f 5f9f e7fd
-17072240 01ff 4005 874f 0800 011f
+17072000 0000 0000 0000 0000 0000 0000 0000 0000
+*
+17072240 0000 0000 0000 0000 0000
 17072252
@deitch
Copy link
Collaborator

deitch commented Oct 14, 2021

Interesting. I was able to reproduce it.

  • Cluster size is by default 512
  • copy size when using io.Copy() is 32768
  • input file size is 3962026, which is 7738 whole clusters and one partial (or incompletely used cluster)
  • input file size is 3962026, which is 120 complete writes and one partial write of 29866 bytes

Based on the diff, it looks like the first 0x3c0000 bytes are identical, which is 3932160, or the exact 120 complete writes.

That would mean that the last partial write is not going across correctly.

Can you confirm those numbers? If they are, can you construct a simple file of just one whole and one partial, perhaps a 64000 byte file, and see what happens with that?

@hallyn
Copy link
Author

hallyn commented Oct 14, 2021

I can confirm the input size is the same with mine. (FWIW my actual file was 48M, the file I'm reporting here was for demonstration purposes)

I had wondered whether I should just do io.CopyBytes() over a loop.

@hallyn
Copy link
Author

hallyn commented Oct 14, 2021

Indeed when I make /tmp/in only the first 64000 bytes of the original file, then it works.

@deitch
Copy link
Collaborator

deitch commented Oct 14, 2021

Hold on. 64000 bytes exactly works? But it isn't an exact multiple of 32768.

@hallyn
Copy link
Author

hallyn commented Oct 15, 2021

Yeah, but it works!

It gets worse. When I use mcopy to copy the file out of the fat32 file, the file is copied out correctly!

At least I think. I might be doing something wrong. Will try again in the morning.

@hallyn
Copy link
Author

hallyn commented Oct 15, 2021

Oh, well:

serge@sl ~/test/go-disk$ diff /mnt/EFI/BOOT/kernel.efi /tmp/out
Binary files /mnt/EFI/BOOT/kernel.efi and /tmp/out differ
serge@sl ~/test/go-disk$ stat /mnt/EFI/BOOT/kernel.efi
  File: /mnt/EFI/BOOT/kernel.efi
  Size: 64102           Blocks: 126        IO Block: 512    regular file
Device: 723h/1827d      Inode: 5142        Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2021-10-14 19:00:00.000000000 -0500
Modify: 2021-10-14 16:32:22.000000000 -0500
Change: 2021-10-14 16:32:22.000000000 -0500
 Birth: -
serge@sl ~/test/go-disk$ stat /tmp/out
  File: /tmp/out
  Size: 64102           Blocks: 128        IO Block: 4096   regular file
Device: fd01h/64769d    Inode: 40894883    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/   serge)   Gid: ( 1000/   serge)
Access: 2021-10-14 21:32:44.560948106 -0500
Modify: 2021-10-14 21:32:23.899576207 -0500
Change: 2021-10-14 21:32:23.899576207 -0500
 Birth: -

/mnt/EFI/BOOT/kernel.efi is the file on the mounted fat.
/tmp/out is mcopy'd out of the same fat file.

Heck maybe it's a Linux kernel bug. Well, where does the 'Blocks: 126' come from?

@deitch
Copy link
Collaborator

deitch commented Oct 15, 2021

Yeah, but it works!

So this 3962026 byte file works except for the last <32768 (i.e. single cluster) chunk. But a 64000 byte file with exactly one cluster and another <32768 does not. That is odd.

It gets worse. When I use mcopy to copy the file out of the fat32 file, the file is copied out correctly!

You mean when you use mcopy with the 3962026 file that we used before?

I need to see that.

@deitch
Copy link
Collaborator

deitch commented Oct 15, 2021

Luck of the draw. Try 63000 and 65000 and it still gives the error. Exactly 64000 works. This is significant, but I do not know why or how.

@deitch
Copy link
Collaborator

deitch commented Oct 15, 2021

And it isn't multiples of 32000 either, but exact multiples of 64000. 10000 also fails. How odd

@deitch
Copy link
Collaborator

deitch commented Oct 15, 2021

Could it be the underlying OS filesystem not flushing or something? That 64000 is not a multiple of 32768 or 512, so it is quite strange. I stepped through the code with a 10000 byte file, it clearly is the last cluster (272 bytes) that is not getting sent out.

No, that isn't it. I tried your mcopy. The copied file has the same hash, but then I mount the disk and still have the issue:

/ # mcopy -i /pwd/fat ::EFI/BOOT/kernel.efi /tmp
/ # sha256sum /tmp/kernel.efi /pwd/in
9d7daad3f67ad5f9c016c37ddc6588f516eaa0e62e6bef37899e6dce9eded095  /tmp/kernel.efi
9d7daad3f67ad5f9c016c37ddc6588f516eaa0e62e6bef37899e6dce9eded095  /pwd/in
/ # mount /pwd/fat /mnt
/ # sha256sum /mnt/EFI/BOOT/kernel.efi
fbe55c219372e06bf00e635736a9adaa9de685f9bb75181583fa4ded1dd220b5  /mnt/EFI/BOOT/kernel.efi

This leads me to suspect that there is something going on with how Linux is mounting it. Given the age, I would suspect that there is something mcopy is comfortable with that diskfs is taking advantage of, but that Linux kernel (more correctly, the fat32 driver) is stricter about.

I have no evidence at this time to prove it, though.

@hallyn
Copy link
Author

hallyn commented Oct 15, 2021

Yeah I need to find some time to instrument a linux kernel's fat driver and run it in a vm, but I'm afraid it probably won't be today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants