Fix various crash, memory corruption and infinite loop conditions #1105

Merged
mmatuska merged 4 commits into libarchive:master on Dec 13, 2018

daxtens commented Dec 11, 2018

I have found some hangs, crashes and memory corruption issues in libarchive.

Two are in the RAR decoder. The first (patch 1) is a double-free via a realloc(area, 0). This leads to a crash.

The second (patch 2) is memory corruption which seems to arise in ppmd7 decoding. The code can be made to read and write to a previously freed ppmd buffer by tricking the read-ahead code around multi-part archives. (This can be done even with a single archive file.) My gut feeling is that someone more skilled than I could cause arbitrary code execution with this, but I cannot say for certain.

There is a crash in ACL parsing for tar archives (patch 3). This is a simple NULL dereference leading to a crash.

The last of this batch is a quasi-infinite loop in the warc code (patch 4), where data isn't consumed after being written out, so a large Content-Length can be used to consume almost limitless time and space, leading to a DoS condition.

These were found with a combination of AFL, afl-rb and qsym.

daxtens added some commits Nov 20, 2018

Avoid a double-free when a window size of 0 is specified
new_size can be 0 with a malicious or corrupted RAR archive.

realloc(area, 0) is equivalent to free(area), so the region would
be free()d here and then free()d again in the cleanup function.

Found with a setup running AFL, afl-rb, and qsym.
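
A minimal C sketch of the pattern described above (the names window,
window_size and resize_window are illustrative, not libarchive's actual
identifiers). On glibc, realloc(p, 0) frees p and returns NULL, so
without the zero-size guard the same region is freed here and then
again by the cleanup path:

    #include <stdlib.h>

    static void  *window;       /* decode window, also freed by cleanup */
    static size_t window_size;

    static int resize_window(size_t new_size)
    {
        void *p;

        /* Reject a zero size from a corrupted or malicious archive;
         * realloc(window, 0) would free the window and the cleanup
         * function would then free it a second time. */
        if (new_size == 0)
            return -1;

        p = realloc(window, new_size);
        if (p == NULL)
            return -1;   /* old window still valid, freed once in cleanup */
        window = p;
        window_size = new_size;
        return 0;
    }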
rar: file split across multi-part archives must match
Fuzzing uncovered some UAF and memory overrun bugs where a file in a
single file archive reported that it was split across multiple
volumes. This was caused by ppmd7 operations calling
rar_br_fillup. This would invoke rar_read_ahead, which would in some
situations invoke archive_read_format_rar_read_header.  That would
check the new file name against the old file name, and if they didn't
match up it would free the ppmd7 buffer and allocate a new
one. However, because the ppmd7 decoder wasn't actually done with the
buffer, it would continue to use the freed buffer. Both reads and
writes to the freed region can be observed.

This is quite tricky to solve: once the buffer has been freed it is
too late, as the ppmd7 decoder functions almost universally assume
success - there's no way for ppmd_read to signal an error, nor are
there good ways for functions like Range_Normalise to propagate one.
So we can't detect after the fact that we're in an invalid state
(e.g. by checking rar->cursor); we have to prevent ourselves from
ever ending up there. So, when we are in the dangerous part of
rar_read_ahead that assumes a valid split, we set a flag to force
read_header to either go down the path for split files or bail. This
means that the ppmd7 decoder keeps a valid buffer and just runs out
of data.

Found with a combination of AFL, afl-rb and qsym.
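
The guard can be sketched as follows; the struct and field names here
are simplified stand-ins for the ones in
archive_read_support_format_rar.c, and error handling is elided:

    struct rar_ctx {
        char *ppmd_buffer;          /* buffer the ppmd7 decoder reads from */
        int   filename_must_match;  /* set while a split read is in progress */
    };

    /* Reached from ppmd7 via rar_br_fillup -> rar_read_ahead. */
    static int read_header(struct rar_ctx *rar, int names_match)
    {
        if (!names_match) {
            if (rar->filename_must_match)
                return -1;          /* bail instead of freeing a live buffer */
            /* otherwise: free ppmd_buffer, allocate a fresh one, ... */
        }
        return 0;
    }

    static int rar_read_ahead_split(struct rar_ctx *rar, int names_match)
    {
        int r;

        rar->filename_must_match = 1;   /* we assume a valid split here */
        r = read_header(rar, names_match);
        rar->filename_must_match = 0;
        return r;   /* on failure the decoder simply runs out of data */
    }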
Skip 0-length ACL fields
Currently, it is possible to create an archive that crashes bsdtar
with a malformed ACL:

Program received signal SIGSEGV, Segmentation fault.
archive_acl_from_text_l (acl=<optimised out>, text=0x7e2e92 "", want_type=<optimised out>, sc=<optimised out>) at libarchive/archive_acl.c:1726
1726				switch (*s) {
(gdb) p n
$1 = 1
(gdb) p field[n]
$2 = {start = 0x0, end = 0x0}

Stop this by checking that the length is not zero before beginning
the switch statement.

I am pretty sure this is the bug mentioned in the qsym paper [1],
and I was able to replicate it with a qsym + AFL + afl-rb setup.

[1] https://www.usenix.org/conference/usenixsecurity18/presentation/yun
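
A sketch of the check; the field struct mirrors the one used by
archive_acl_from_text_l, but names and the surrounding parsing logic
are simplified here:

    #include <stddef.h>

    struct acl_field { const char *start; const char *end; };

    static int parse_field(const struct acl_field *field)
    {
        size_t len = (field->start != NULL)
            ? (size_t)(field->end - field->start) : 0;

        /* Skip zero-length fields (in the crashing input, start and
         * end are both NULL) rather than dereferencing start in the
         * switch below. */
        if (len == 0)
            return 0;

        switch (*field->start) {
        case 'u':   /* "user" entry ... */
            break;
        case 'g':   /* "group" entry ... */
            break;
        default:
            break;
        }
        return 1;
    }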
warc: consume data once read
The warc decoder only used read ahead; it wouldn't actually consume
data that had previously been printed. This means that if you specify
an invalid content length, it will just reprint the same data over
and over and over again until it hits the desired length.

This means that a WARC resource with e.g.
Content-Length: 666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666665
but only a few hundred bytes of data, causes a quasi-infinite loop.

Consume data in subsequent calls to _warc_read.

Found with an AFL + afl-rb + qsym setup.
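
The fix amounts to consuming, at the start of each call, whatever the
previous call handed back. A simplified sketch, assuming libarchive's
internal __archive_read_ahead/__archive_read_consume helpers and a
hypothetical unconsumed field on the warc state (it would only build
inside the libarchive source tree):

    #include <stdint.h>
    #include "archive_read_private.h"   /* read_ahead/consume helpers */

    struct warc_state { int64_t unconsumed; };   /* stand-in state */

    static int warc_read_sketch(struct archive_read *a,
        struct warc_state *w, const void **buf, size_t *size)
    {
        ssize_t avail;

        /* Consume what the previous call handed to the caller; without
         * this, read_ahead keeps returning the same bytes forever. */
        __archive_read_consume(a, w->unconsumed);
        w->unconsumed = 0;

        *buf = __archive_read_ahead(a, 1, &avail);
        if (*buf == NULL || avail <= 0)
            return (ARCHIVE_EOF);

        *size = (size_t)avail;
        w->unconsumed = avail;   /* consume these bytes on the next call */
        return (ARCHIVE_OK);
    }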
daxtens commented Dec 13, 2018

I have requested CVEs for these issues through the Distributed Weakness Filing project.

@mmatuska mmatuska merged commit cef9730 into libarchive:master Dec 13, 2018

rgacogne commented Jan 6, 2019

In case someone else is looking for the CVEs, it looks like the assignments are:
