Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LZ4 accesses memory outside the provided buffer #77

Closed
nemequ opened this issue Apr 7, 2015 · 7 comments
Closed

LZ4 accesses memory outside the provided buffer #77

nemequ opened this issue Apr 7, 2015 · 7 comments

Comments

@nemequ
Copy link
Contributor

@nemequ nemequ commented Apr 7, 2015

Apparently you're aware of this (based on the comment preceding the LZ4_wildCopy function), but during decompression (at least) LZ4 will attempt to access memory outside of the provided buffer. This causes crashes when doing things like decompressing to a memory-mapped file:

==20225== Process terminating with default action of signal 11 (SIGSEGV)
==20225==  General Protection Fault
==20225==    at 0x5E1D670: LZ4_copy8 (lz4.c:275)
==20225==    by 0x5E1D670: LZ4_wildCopy (lz4.c:291)
==20225==    by 0x5E1D670: LZ4_decompress_generic (lz4.c:1105)
==20225==    by 0x5E1D670: LZ4_decompress_safe (lz4.c:1123)
==20225==    by 0x5E18610: squash_lz4_decompress_buffer (squash-lz4.c:122)
==20225==    by 0x4E38BE6: squash_codec_decompress_with_options (codec.c:728)
==20225==    by 0x4E393CD: squash_codec_process_file_with_options (codec.c:1040)
==20225==    by 0x4E3974A: squash_codec_decompress_file_with_options (codec.c:1168)
==20225==    by 0x401719: benchmark_codec_with_options (benchmark.c:138)
==20225==    by 0x401BB0: benchmark_codec (benchmark.c:240)
==20225==    by 0x401E93: main (benchmark.c:324)

If you want I could probably put together a test case, but I doubt it's necessary. That backtrace is from a copy of enwik8 which was compressed with LZ4_compress_limitedOutput.

@Cyan4973
Copy link
Member

@Cyan4973 Cyan4973 commented Apr 8, 2015

Thanks for notification Evan.
This could be a high priority bug.

I'm nonetheless a bit surprised, because that kind of situation is currently heavily tested.

Unfortunately, I'm unable to reproduce this issue when using enwik8 as input, nor any of the test tools available (valgrind, lz4io, lz4 -b, fullbench, different blocks sizes, ...).

It could be that the conditions you generate in your tests are specific enough to never be generated within lz4, nor within the fuzzer tool (which generates myriads of random situations, hence statistically find some weird corner cases that were not foresee in the normal, directed, test suite).

The line of code mentioned LZ4_decompress_generic (lz4.c:1105), is just a call to LZ4_wildCopy(op, match, cpy), but this call is preceded by some important tests.

To reach this line of code, the following condition is verified cpy<=oend-12. oend in this context is the end of output buffer. Since LZ4_wildCopy() may write up to 7 bytes beyond cpy, this condition ensures that it does not translate into a write beyond the limit.

Now, LZ4_wildCopy() may also read beyond the limit. This sound strange though, since read is taken from match pointer, with match < op < cpy <= oend-12, so it's difficult to imagine that match may reach a position beyond end of buffer. But maybe it could be before beginning of buffer.
However, this condition has been checked too : if ((checkOffset) && (unlikely(match < lowLimit))) goto _output_error; /* Error : offset outside destination buffer */.

Therefore, it seems it's difficult to guess what's going wrong without a proper test case to properly witness it. Would you mind providing some elements to reproduce it ?

Regards

@Cyan4973
Copy link
Member

@Cyan4973 Cyan4973 commented Apr 8, 2015

Also, just as a quick basic check,
you could verify if the buffer sizes provided to LZ4_decompress_safe() are correct, just in case.
Last time there was a similar report, it ended up being the reason.

@nemequ
Copy link
Contributor Author

@nemequ nemequ commented Apr 8, 2015

The good news is I was able to produce a test case which doesn't depend on Squash.

The bad news is that the bug is only triggered with -O3 (on GCC5 from Fedora 22, I haven't tested with 4.9 but I have a feeling it is going to work since this only started happening after I upgraded).

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <assert.h>
#include <sys/mman.h>

#include "lib/lz4.h"

int main(int argc, char *argv[]) {
  int res;

  int uncompressed_fd = open (argv[1], O_RDONLY);
  assert (uncompressed_fd != -1);

  int compressed_fd = open (argv[2], O_RDWR | O_CREAT | O_TRUNC, S_IWUSR | S_IRUSR);
  assert (compressed_fd != -1);

  int decompressed_fd = open (argv[3], O_RDWR | O_CREAT | O_TRUNC, S_IWUSR | S_IRUSR);
  assert (decompressed_fd != -1);

  struct stat uncompressed_stat;
  res = fstat (uncompressed_fd, &uncompressed_stat);
  assert (res == 0);
  fprintf (stderr, "%s: %zu bytes\n", argv[1], uncompressed_stat.st_size);

  void* uncompressed_data = mmap (NULL, uncompressed_stat.st_size, PROT_READ, MAP_SHARED, uncompressed_fd, 0);
  assert (uncompressed_data != MAP_FAILED);

  size_t compressed_size = LZ4_COMPRESSBOUND(uncompressed_stat.st_size);
  ftruncate (compressed_fd, compressed_size);
  fprintf (stderr, "%s: %zu bytes max\n", argv[2], compressed_size);

  void* compressed_data = mmap (NULL, compressed_size, PROT_READ | PROT_WRITE, MAP_SHARED, compressed_fd, 0);
  assert (compressed_data != MAP_FAILED);

  compressed_size = LZ4_compress_limitedOutput (uncompressed_data, compressed_data, (int) uncompressed_stat.st_size, (int) compressed_size);
  assert (compressed_size != 0);
  fprintf (stderr, "%s: now %zu bytes\n", argv[2], compressed_size);

  ftruncate (compressed_fd, compressed_size);
  munmap (compressed_data, LZ4_COMPRESSBOUND(uncompressed_stat.st_size));
  compressed_data = mmap (NULL, compressed_size, PROT_READ | PROT_WRITE, MAP_SHARED, compressed_fd, 0);
  assert (compressed_data != MAP_FAILED);

  ftruncate (decompressed_fd, uncompressed_stat.st_size);

  void* decompressed_data = mmap (NULL, uncompressed_stat.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, decompressed_fd, 0);
  assert (decompressed_data != MAP_FAILED);

  res = LZ4_decompress_safe (compressed_data,
                 decompressed_data,
                 (int) compressed_size,
                 (int) uncompressed_stat.st_size);
  assert (res == uncompressed_stat.st_size);

  munmap (decompressed_data, uncompressed_stat.st_size);
  munmap (compressed_data, LZ4_COMPRESSBOUND(uncompressed_stat.st_size));
  munmap (uncompressed_data, uncompressed_stat.st_size);

  return 0;
}

One interesting thing about this is if I feed a lot of extra room in the buffers (e.g., don't re-map compressed_data after compressing and make the decompressed data twice as big as it needs to be) the problem still occurs.

After a bit more testing with -O2 and specific flags from -O3, it seems like the combination of -ftree-loop-vectorize and -fvect-cost-model triggers it.

@nemequ nemequ closed this Apr 8, 2015
@Cyan4973
Copy link
Member

@Cyan4973 Cyan4973 commented Apr 8, 2015

OK,
now it remembers me of a bug which happened in an older version of LZ4
with only GCC 4.9.x at -O3 settings.

The bug was pretty nasty :
GCC generated a movdqa opcode, which is 128 bits SSE,
in place of "copy8()".

First, this is 16 bytes, instead of 8 (hello buffer overflow !)
Second, movdqa requires 16-bytes aligned access, and there was no way for the compiler to ensure that condition. A completely buggy generation.

Solution was to special-case GCC 4.9 to use a different code paths which prevented this broken optimization from happening.

static void LZ4_copy8(void* dstPtr, const void* srcPtr)
{
#if GCC_VERSION!=409  /* disabled on GCC 4.9, as it generates invalid opcode (crash) */
    if (LZ4_UNALIGNED_ACCESS)
    {
        if (LZ4_64bits())
            *(U64*)dstPtr = *(U64*)srcPtr;
        else
            ((U32*)dstPtr)[0] = ((U32*)srcPtr)[0],
            ((U32*)dstPtr)[1] = ((U32*)srcPtr)[1];
        return;
    }
#endif
    memcpy(dstPtr, srcPtr, 8);
}

The protection could be extended, with GCC_VERSION >= 409 for example.

Now, if removing a more specialized feature flag achieves the same trick, it might preferable, and a more generic protection.

@nemequ
Copy link
Contributor Author

@nemequ nemequ commented Apr 8, 2015

GCC 5 isn't final yet (nor is Fedora 22). I'm in the process of doing a fresh install of Fedora 22 (previously I upgraded from 21, which has historically been a bit iffy until the release candidates, 22 is still in the alpha stage), if the issue is still there I'll start talking to the GCC devs.

IIRC you can disable specific optimizations with the optimize function attribute or a pragma for a. If this is present in the GCC 5 release that would probably be a good way to go.

I'll let you know what happens.

@nemequ
Copy link
Contributor Author

@nemequ nemequ commented Apr 9, 2015

Looks like gcc-5.0.0-0.22.fc22 works.

Spoke too soon.

@nemequ
Copy link
Contributor Author

@nemequ nemequ commented Apr 9, 2015

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65709

Note that the test case I attached earlier is unnecessary—the lz4 CLI will trigger the issue, too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants