Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

silent data corruption when write to temporary file fails (eg TMPDIR is full) #24

Open
h01ger opened this issue Feb 11, 2016 · 0 comments

Comments

@h01ger
Copy link
Contributor

h01ger commented Feb 11, 2016

this was originally reported as https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=813959 by jozsef imrek mazsi-debian-bugs@imrek.org

Package: pxz
Version: 4.999.99~beta5+gitfcfea93-1
Severity: important

Dear Maintainer,

pxz seems to ignore errors when writing to the temporary output
files used by the parallel compression threads, and silently
creates a corrupted output file.

to reproduce the bug (by using an extremely small /tmp dir, and
the bash builtin random number generator as a data source):

    # mkdir -p /tmp
    # mount -t tmpfs tmpfs tmp -o size=512k
    # RANDOM=0; while [[ true ]]; do echo $RANDOM; done | dd bs=1M iflag=fullblock count=4 | ltrace -s0 -f -o trace pxz -T16 -f > out.bin.xz

observe the corrupted output file:

    xz -l out.bin.xz
    xz: out.bin.xz: File format not recognized

also observe the failed writes (where the number of bytes to
write out is not equal to the number of bytes actually written):

    # awk '/fwrite/{ if(int($4) != int($7)){ print $0; } }' < trace 
    32159 fwrite(""..., 1, 61444, 0x1c380e0)                                                                                                 = 32705
    32159 fwrite(""..., 1, 61444, 0x1c380e0)                                                                                                 = 4096
    32159 fwrite(""..., 1, 61444, 0x1c380e0)                                                                                                 = 4096
    32159 fwrite(""..., 1, 61444, 0x1c380e0)                                                                                                 = 4096
    ...

if you use strace instead of ltrace, you can see the actual ENOSPC error:

    # RANDOM=0; while [[ true ]]; do echo $RANDOM; done | dd bs=1M iflag=fullblock count=4 | strace -s0 -f -o trace pxz -T16 -f > out.bin.xz
    # grep 'write.* = -1' trace
    32241 write(3, ""..., 28672)            = -1 ENOSPC (No space left on device)
    32241 write(3, ""..., 4096)             = -1 ENOSPC (No space left on device)
    32241 write(3, ""..., 4096)             = -1 ENOSPC (No space left on device)
    32241 write(3, ""..., 4096)             = -1 ENOSPC (No space left on device)
    ...

looking at pxz's source in the git repository, the error
checking after some fwrite() calls is erroneous:

    https://github.com/jnovy/pxz/blob/fcfea93/pxz.c#L391
    https://github.com/jnovy/pxz/blob/fcfea93/pxz.c#L408

    if ( !fwrite(mo, 1, BUFFSIZE - strm.avail_out, ftemp[p]) ) {
            error(EXIT_FAILURE, errno, "writing to temp file failed");
    }

it should be at least

    if ( fwrite(mo, 1, BUFFSIZE - strm.avail_out, ftemp[p]) != BUFFSIZE - strm.avail_out ) {

just like at other parts of the program.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant