stack overflow due to an infinite recursion (on 32-bit) #247

Closed
gockelhahn opened this Issue Oct 20, 2016 · 4 comments

Comments

Projects
None yet
3 participants
@gockelhahn

while fuzzing with afl, a crash has been found after compiling lz4 as 32-bit binary:

==18808==ERROR: AddressSanitizer: stack-overflow on address 0xff1b0fe8 (pc 0xf70336cc bp 0x00000004 sp 0xff1b0fec T0)
#0 0xf70336cb in _IO_file_xsgetn (/usr/lib32/libc.so.6+0x6a6cb)
#1 0xf70281b6 in fread (/usr/lib32/libc.so.6+0x5f1b6)
#2 0x80887db in selectDecoder /home/build/lz4/programs/lz4io.c:927
#3 0x8088d1c in selectDecoder /home/build/lz4/programs/lz4io.c:948
[...]
#250 0x8088d1c in selectDecoder /home/build/lz4/programs/lz4io.c:948

SUMMARY: AddressSanitizer: stack-overflow (/usr/lib32/libc.so.6+0x6a6cb) in _IO_file_xsgetn
==18808==ABORTING

it happens when we try to decompress a crafted lz4 file:

  1. in the function selectDecoder(), we are reading 4 bytes to check the magic number
  2. if the "skippable" magic number was found, we are reading the next 4 bytes to find out how many bytes we should skip
  3. we are saving the result as an unsigned int and with it we call fseek on our file ... however fseek expects a signed int
  4. if our number is "4294967288", fseek receives a "-8" due to casting effects
  5. fseek will move the file pointer 8 bytes prior, thus to the beginning of the file where we started
  6. selectDecoder() is calling itself ... back to point 1. ... and so on

hexdump of the first crash file:

0000000 2a52 184d fff8 ffff 0020 5200 4d2a 0318
0000010 0000 2000 0000 2102 184c 0003 0000 0020
0000020 6f00 2102 184c 0003
0000028

hexdump of the second crash file:

0000000 2a52 184d 0003 0000 0020 5200 4d2a 0318
0000010 0000 2000 0000 2102 184c 2a52 184d fff8
0000020 ffff 0020 5200 4d2a 0318 0000 2000 0000
0000030 2102 184c 0003 0000 0020 6f00 6f72 086f
0000040

hexdump of another file where the process is stuck in 100% CPU load,
probably this has also something to do with the magic number:

0000000 2a52 184d 0003 0000 0020 5200 4d2a 0318
0000010 0000 2000 0000 2102 184c 2a52 184d fff4
0000020 ffff 0020 5200 4d2a 0318 0000 2000 0000
0000030 2102 184c 0003 0000 0020 6f00 6f72 086f
0000040

@t-mat

This comment has been minimized.

Show comment
Hide comment
@t-mat

t-mat Oct 21, 2016

Contributor

Good catch, @gockelhahn !
And we must fix it.

Action plan is:

(1) Write "safer" fseek.
(2) Replace all fseek(x, y, SEEK_CUR) with it.
(3) Add specific test to CI.
(4) Find the reason why compiler didn't raise warning for narrow casting such as unsigned (32bit, unsigned) to long (32bit, signed).

And here's an example (not tested):

static int LZ4IO_internal_fseek(FILE *fp, unsigned offset, int where) {
    /* Here, we want to use LONG_MAX.  But since offset is `unsigned`, we should use INT_MAX.
       Because INT_MAX <= UINT_MAX <= LONG_MAX.
       Also note that LZ4 doesn't require `signed long long`.  It only uses `unsigned long long`.
    */
    const unsigned stepMax = INT_MAX;
    int errorNb = 0;

    /* Only allows SEEK_CUR */
    if(where != SEEK_CUR) {
        errorNb = -1;
    } else {
        while (offset > 0) {
            unsigned s = offset;
            if (s > stepMax) {
                s = stepMax;
            }
            errorNb = fseek(fp, (long) s, SEEK_CUR);
            if (errorNb != 0) {
                break;
            }
            offset -= s;
        }
    }
    return errorNb;
}

static unsigned long long selectDecoder(dRess_t ress, FILE* finput, FILE* foutput)
{
    ...
    switch(magicNumber)
    {
        ...
    case LZ4IO_SKIPPABLE0:
        ...
        size = LZ4IO_readLE32(MNstore);     /* Little Endian format */
-       errorNb = fseek(finput, size, SEEK_CUR);
+       errorNb = LZ4IO_internal_fseek(finput, size, SEEK_CUR);
        ...
    }
}

@Cyan4973 , do you find any other TODOs and/or better idea for this issue?

Contributor

t-mat commented Oct 21, 2016

Good catch, @gockelhahn !
And we must fix it.

Action plan is:

(1) Write "safer" fseek.
(2) Replace all fseek(x, y, SEEK_CUR) with it.
(3) Add specific test to CI.
(4) Find the reason why compiler didn't raise warning for narrow casting such as unsigned (32bit, unsigned) to long (32bit, signed).

And here's an example (not tested):

static int LZ4IO_internal_fseek(FILE *fp, unsigned offset, int where) {
    /* Here, we want to use LONG_MAX.  But since offset is `unsigned`, we should use INT_MAX.
       Because INT_MAX <= UINT_MAX <= LONG_MAX.
       Also note that LZ4 doesn't require `signed long long`.  It only uses `unsigned long long`.
    */
    const unsigned stepMax = INT_MAX;
    int errorNb = 0;

    /* Only allows SEEK_CUR */
    if(where != SEEK_CUR) {
        errorNb = -1;
    } else {
        while (offset > 0) {
            unsigned s = offset;
            if (s > stepMax) {
                s = stepMax;
            }
            errorNb = fseek(fp, (long) s, SEEK_CUR);
            if (errorNb != 0) {
                break;
            }
            offset -= s;
        }
    }
    return errorNb;
}

static unsigned long long selectDecoder(dRess_t ress, FILE* finput, FILE* foutput)
{
    ...
    switch(magicNumber)
    {
        ...
    case LZ4IO_SKIPPABLE0:
        ...
        size = LZ4IO_readLE32(MNstore);     /* Little Endian format */
-       errorNb = fseek(finput, size, SEEK_CUR);
+       errorNb = LZ4IO_internal_fseek(finput, size, SEEK_CUR);
        ...
    }
}

@Cyan4973 , do you find any other TODOs and/or better idea for this issue?

@Cyan4973

This comment has been minimized.

Show comment
Hide comment
@Cyan4973

Cyan4973 Oct 21, 2016

Member

Excellent catch @gockelhahn , Thanks !

and yes, I fully agree with your detailed and complete action plan @t-mat .

Member

Cyan4973 commented Oct 21, 2016

Excellent catch @gockelhahn , Thanks !

and yes, I fully agree with your detailed and complete action plan @t-mat .

@Cyan4973 Cyan4973 closed this in d05ca2f Nov 1, 2016

@Cyan4973

This comment has been minimized.

Show comment
Hide comment
@Cyan4973

Cyan4973 Nov 2, 2016

Member

Latest update of dev should fix the issue (seems it was automatically closed by the commit message), though to be complete I should also find a way to add some relevant test in the CI environment.

It uses both a safer version of fseek(), largely inspired by @t-mat proposition,
and removes the stack corruption opportunity due to use of recursive calls when dealing with skippable frames.
Now the same problem, if it was not fixed, would lead to an infinite loop instead of a stack corruption. It feels better, though it's debatable...

Btw, I'm really impressed you found the issue through afl.
The chances of finding both the skippable frame magic number and the exact offset value which produces this sequence is pretty damn small.
Are you using some additional tool to help afl find such bug @gockelhahn ?

Member

Cyan4973 commented Nov 2, 2016

Latest update of dev should fix the issue (seems it was automatically closed by the commit message), though to be complete I should also find a way to add some relevant test in the CI environment.

It uses both a safer version of fseek(), largely inspired by @t-mat proposition,
and removes the stack corruption opportunity due to use of recursive calls when dealing with skippable frames.
Now the same problem, if it was not fixed, would lead to an infinite loop instead of a stack corruption. It feels better, though it's debatable...

Btw, I'm really impressed you found the issue through afl.
The chances of finding both the skippable frame magic number and the exact offset value which produces this sequence is pretty damn small.
Are you using some additional tool to help afl find such bug @gockelhahn ?

@gockelhahn

This comment has been minimized.

Show comment
Hide comment
@gockelhahn

gockelhahn Nov 2, 2016

@Cyan4973 - i only used the recent stock afl ...
if i recall correctly, the issue was found after about a day with some finished cycles done

... thanks for fixing it!

@Cyan4973 - i only used the recent stock afl ...
if i recall correctly, the issue was found after about a day with some finished cycles done

... thanks for fixing it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment