Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

heap-buffer-overflow in bz3_decompress() #97

Closed
asarubbo opened this issue Mar 27, 2023 · 9 comments
Closed

heap-buffer-overflow in bz3_decompress() #97

asarubbo opened this issue Mar 27, 2023 · 9 comments
Labels
invalid This doesn't seem right

Comments

@asarubbo
Copy link

On v1.2.3, reproducibile via examples/decompress-file.c

==9769==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7fefd53fed58 at pc 0x0000004a22f7 bp 0x7ffe364dd760 sp 0x7ffe364dcf30
READ of size 16580610 at 0x7fefd53fed58 thread T0
    #0 0x4a22f6 in __asan_memcpy /var/tmp/portage/sys-libs/compiler-rt-sanitizers-15.0.7/work/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:22:3
    #1 0x7fefd60372e5 in bz3_decompress /var/tmp/portage/app-arch/bzip3-1.2.3/work/bzip3-1.2.3/src/libbz3.c:899:9
    #2 0x4dd360 in main /root/bzip3/decompress-file.c:28:17
    #3 0x7fefd5d6b1f6 in __libc_start_call_main /var/tmp/portage/sys-libs/glibc-2.36-r7/work/glibc-2.36/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #4 0x7fefd5d6b2ab in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.36-r7/work/glibc-2.36/csu/../csu/libc-start.c:381:3
    #5 0x41d5c0 in _start (/usr/bin/decompress_file+0x41d5c0)

0x7fefd53fed58 is located 0 bytes to the right of 603480-byte region [0x7fefd536b800,0x7fefd53fed58)
allocated by thread T0 here:
    #0 0x4a2fee in __interceptor_malloc /var/tmp/portage/sys-libs/compiler-rt-sanitizers-15.0.7/work/compiler-rt/lib/asan/asan_malloc_linux.cpp:69:3
    #1 0x7fefd6037075 in bz3_decompress /var/tmp/portage/app-arch/bzip3-1.2.3/work/bzip3-1.2.3/src/libbz3.c:861:28
    #2 0x4dd360 in main /root/bzip3/decompress-file.c:28:17
    #3 0x7fefd5d6b1f6 in __libc_start_call_main /var/tmp/portage/sys-libs/glibc-2.36-r7/work/glibc-2.36/csu/../sysdeps/nptl/libc_start_call_main.h:58:16

SUMMARY: AddressSanitizer: heap-buffer-overflow /var/tmp/portage/sys-libs/compiler-rt-sanitizers-15.0.7/work/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:22:3 in __asan_memcpy
Shadow bytes around the buggy address:
  0x0ffe7aa77d50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffe7aa77d60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffe7aa77d70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffe7aa77d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffe7aa77d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0ffe7aa77da0: 00 00 00 00 00 00 00 00 00 00 00[fa]fa fa fa fa
  0x0ffe7aa77db0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ffe7aa77dc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ffe7aa77dd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ffe7aa77de0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ffe7aa77df0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==9769==ABORTING

Testcase:
6.crashes.bz3.zip

If you want to make a new tag, I will test if the bfa5bf8 works as expected

@ppisar
Copy link
Contributor

ppisar commented Mar 27, 2023

That input file is malformed:

$ hexdump -C  /tmp/6.crashes.bz3 | head
00000000  02 11 00 00 08 17 15 00  42 5a 33 76 31 00 07 09  |........BZ3v1...|
00000010  00 10 00 00 00 0a 00 00  00 02 00 fd 00 db 46 0b  |..............F.|
00000020  47 ff ff ff ff 61 0a                              |G....a.|
00000027

The malformation is properly caught by bzip3 tool:

$ ./bzip3 -d -k -c /tmp/6.crashes.bz3  >/dev/null
Invalid signature.

src/main.c of bzip3 tool first expects "BZ3v1" magic string and then 4 bytes of a little-endian unsigned integer.

Contrary, examples/decompress-file.c has a very naive parser which blindly interprets first sizeof(size_t) bytes as a size_t value. It does not consider that a size and an endianess of size_t differ among platforms.

@asarubbo
Copy link
Author

@ppisar your analisys is perfect and in other words that mean that the library in not strong enough to survive to malformed inputs

@ppisar
Copy link
Contributor

ppisar commented Mar 27, 2023

examples/decompress-file.c is not the library. I cannot comment on the library because I did not study it.

@kspalaiologos
Copy link
Owner

kspalaiologos commented Mar 27, 2023

Program received signal SIGSEGV, Segmentation fault.
__memcpy_avx_unaligned_erms ()
    at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:710
710	../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0  __memcpy_avx_unaligned_erms ()
    at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:710
#1  0x000055555556245f in bz3_decompress (in=<optimized out>,
    in@entry=0x555555568498 "BZ3v1", out=out@entry=0x0,
    in_size=<optimized out>, in_size@entry=31,
    out_size=out_size@entry=0x7fffffffe448) at ../src/libbz3.c:908
#2  0x000055555555525a in main (argc=<optimized out>,
    argv=0x7fffffffe588) at decompress-file.c:28

out is simply a null pointer because there is not enough RAM to contain a 0x7fffffffe448-byte file in my RAM. Not an issue with the bzip3 library. It's on the user's end to make sure that the buffers passed to bzip3 contain enough space to be filled with decompressed data.

Because the example code does not check for memory allocation failures, for reasons mentioned before, I am marking this issue as invalid. Please consider to further test your findings to make sure that they are an actual issue in the library, not a problem caused by contract violations - ideally problems that can be reproduced with regular bzip3 command invocation.

/* Decompress a file produced by compress-file SEQUENTIALLY (i.e. *not* in parallel) using bzip3 high level API. */
/* This is just a demonstration of bzip3 library usage, it does not contain all the necessary error checks and will not
 * support cross-endian encoding/decoding. It *WILL* segfault if the input file does not exist,
 * malloc returns a NULL pointer, etc... - you are expected to handle these corner cases in your code. */

Further SAIS-related fuzzing and issues are welcome.

@kspalaiologos kspalaiologos added the invalid This doesn't seem right label Mar 27, 2023
@asarubbo
Copy link
Author

Because the example code does not check for memory allocation failures, for reasons mentioned before, I am marking this issue as invalid. Please consider to further test your findings to make sure that they are an actual issue in the library, not a problem caused by contract violations - ideally problems that can be reproduced with regular bzip3 command invocation.

/* Decompress a file produced by compress-file SEQUENTIALLY (i.e. *not* in parallel) using bzip3 high level API. */
/* This is just a demonstration of bzip3 library usage, it does not contain all the necessary error checks and will not
 * support cross-endian encoding/decoding. It *WILL* segfault if the input file does not exist,
 * malloc returns a NULL pointer, etc... - you are expected to handle these corner cases in your code. */

I respect your opinion but let me say that I saw a lot of cve assigned for similar cases. When you can crash a library in this way by using its api, usually is a library fault.

@kspalaiologos
Copy link
Owner

kspalaiologos commented Mar 27, 2023

Contract violations are not bugs in the library. By this logic, the C standard library crashing when you try to seek on a FILE * obtained using fopen on a file that doesn't exist is a CVE in glibc.

The example code is not a part of the library. It is not distributed in the release tarballs (see https://github.com/kspalaiologos/bzip3/blob/master/Makefile.am#L1). You have not found an issue in bzip3 v1.2.3 because the file decompress-file.c is not a part of the release.

@asarubbo
Copy link
Author

Contract violations are not bugs in the library. By this logic, the C standard library crashing when you try to seek on a FILE * obtained using fopen on a file that doesn't exist is a CVE in glibc.

I'm pretty sure we are talking about different things.

The full reproduction command is:

decompress_file 6.crashes.bz3 output

I don't understand why are you talking about a missing file.

I see that the issue happens on decompress-file.c:28:17 where bz3_decompress is called

@kspalaiologos
Copy link
Owner

kspalaiologos commented Mar 27, 2023

bz3_compress crashes because out is a null pointer. it was improperly called. once again I have to show you the stack trace:

Program received signal SIGSEGV, Segmentation fault.
__memcpy_avx_unaligned_erms ()
    at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:710
710	../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0  __memcpy_avx_unaligned_erms ()
    at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:710
#1  0x000055555556245f in bz3_decompress (in=<optimized out>,
    in@entry=0x555555568498 "BZ3v1", out=out@entry=0x0,
    in_size=<optimized out>, in_size@entry=31,
    out_size=out_size@entry=0x7fffffffe448) at ../src/libbz3.c:908
#2  0x000055555555525a in main (argc=<optimized out>,
    argv=0x7fffffffe588) at decompress-file.c:28

notice: out=out@entry=0x0, out_size=out_size@entry=0x7fffffffe448). The size of the output is large, hence malloc failed. when malloc inside of decompress-file fails, it returns null. the memory under null is not big enough for us to be able to write 0x7fffffffe448 into, hence it is not the library's fault. it is a problem with decompress-file.c not being robust enough.

it is impossible to determine whether the buffer passed to bz3_decompress will even contain enough bytes for the output, which is why we pass out_size to it. but the decompress-file.c program clearly lied to the API, and there is no way in portable C to check it. you would have to use malloc_get_usable_size:

 0 [14:55] Desktop/workspace/bzip3@master % cd examples                            22s
 0 [14:55] workspace/bzip3@master/examples % gcc decompress-file.c -o decompress-file -O3 -lbzip3
 0 [14:55] workspace/bzip3@master/examples % ./decompress-file ~/Desktop/6.crashes.bz3 6.crashes
zsh: segmentation fault  ./decompress-file ~/Desktop/6.crashes.bz3 6.crashes
 139 [14:55] workspace/bzip3@master/examples % ./decompress-file ~/Desktop/6.crashes.bz3 6.crashes
zsh: segmentation fault  ./decompress-file ~/Desktop/6.crashes.bz3 6.crashes
 139 [14:55] workspace/bzip3@master/examples % gdb -q decompress-file
Reading symbols from decompress-file...
(No debugging symbols found in decompress-file)
(gdb) set args ~/Desktop/6.crashes.bz3 6.crashes
(gdb) run
Starting program: /home/palaiologos/Desktop/workspace/bzip3/examples/decompress-file ~/Desktop/6.crashes.bz3 6.crashes
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
__memcpy_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:710
710	../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0  __memcpy_avx_unaligned_erms ()
    at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:710
#1  0x00007ffff7fb8cbc in bz3_decompress (in=0x55555555a4a5 "\n", out=0x0,
    in_size=<optimized out>, out_size=0x7fffffffe4c8) at src/libbz3.c:908
#2  0x000055555555518a in main ()
(gdb)

notice: out_size=x7fffffffe4c8, out=0x0

@ppisar
Copy link
Contributor

ppisar commented Mar 27, 2023

I think bz3_decompress() and other high-level functions could check all the passed pointers for NULL and return BZ3_ERR_INIT if there is any. Not superproof, but helps to discover programming mistakes early.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
invalid This doesn't seem right
Projects
None yet
Development

No branches or pull requests

3 participants