Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

heap-buffer-overflow Error In tinyxml2.cpp:2286::strlen() #675

Closed
EnchantedJohn opened this issue May 16, 2018 · 27 comments
Closed

heap-buffer-overflow Error In tinyxml2.cpp:2286::strlen() #675

EnchantedJohn opened this issue May 16, 2018 · 27 comments
Assignees

Comments

@EnchantedJohn
Copy link

hello!I use libfuzzer to test XMLDocument::parse().then I meet heap-buffer-overflow Error.I think it is due to tinyxml2.cpp:2286 strlen().because function strlen() lead to heap-buffer-overflow.

@EnchantedJohn
Copy link
Author

then I want to show my example of libfuzzer.

#include "tinyxml2.h"

#include <stdint.h>

extern "C" int LLVMFuzzerTestOneInput(const char *Data, size_t Size)
{
    tinyxml2::XMLDocument doc;
    doc.Parse(Data);
    return doc.ErrorID();
}

@EnchantedJohn
Copy link
Author

EnchantedJohn commented May 16, 2018

then this is the error information.

`INFO: Seed: 4187962468
INFO: Loaded 1 modules   (3 inline 8-bit counters): 3 [0x79ef60, 0x79ef63), 
INFO: Loaded 1 PC tables (3 PCs): 3 [0x57a7c0,0x57a7f0), 
INFO:        0 files found in corpus1/
=================================================================
==53936==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000051 at pc 0x00000049c574 bp 0x7fff76bae7c0 sp 0x7fff76badf70
READ of size 2 at 0x602000000051 thread T0
    #0 0x49c573 in strlen /home/lx/5_4/llvm/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:343
    #1 0x7fa080dbecdf in tinyxml2::XMLDocument::Parse(char const*, unsigned long) (/usr/local/lib/libtinyxml2.so.6+0xccdf)
    #2 0x55d3e0 in LLVMFuzzerTestOneInput /home/lx/5_15/tinyxml2/tinyxml2-master/fuzz_test/fuzz_parse.cpp:8:9
    #3 0x42d3b7 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /home/lx/5_4/llvm/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:515
    #4 0x4322b8 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<std::string, fuzzer::fuzzer_allocator<std::string> > const&) /home/lx/5_4/llvm/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:701
    #5 0x432c62 in fuzzer::Fuzzer::Loop(std::vector<std::string, fuzzer::fuzzer_allocator<std::string> > const&) /home/lx/5_4/llvm/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:739
    #6 0x427534 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /home/lx/5_4/llvm/projects/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:754
    #7 0x41d992 in main /home/lx/5_4/llvm/projects/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20
    #8 0x7fa07fec3f44 in __libc_start_main /build/eglibc-ripdx6/eglibc-2.19/csu/libc-start.c:287
    #9 0x41d9fb in _start (/home/lx/5_15/tinyxml2/tinyxml2-master/fuzz_test/fuzz_parse+0x41d9fb)

0x602000000051 is located 0 bytes to the right of 1-byte region [0x602000000050,0x602000000051)
allocated by thread T0 here:
    #0 0x559148 in operator new[](unsigned long) /home/lx/5_4/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:95
    #1 0x42d2f9 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /home/lx/5_4/llvm/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:506
    #2 0x4322b8 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<std::string, fuzzer::fuzzer_allocator<std::string> > const&) /home/lx/5_4/llvm/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:701
    #3 0x432c62 in fuzzer::Fuzzer::Loop(std::vector<std::string, fuzzer::fuzzer_allocator<std::string> > const&) /home/lx/5_4/llvm/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:739
    #4 0x427534 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /home/lx/5_4/llvm/projects/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:754
    #5 0x41d992 in main /home/lx/5_4/llvm/projects/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20
    #6 0x7fa07fec3f44 in __libc_start_main /build/eglibc-ripdx6/eglibc-2.19/csu/libc-start.c:287

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/lx/5_4/llvm/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:343 in strlen
Shadow bytes around the buggy address:
  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000: fa fa 00 fa fa fa 00 fa fa fa[01]fa fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==53936==ABORTING
MS: 0 ; base unit: 0000000000000000000000000000000000000000


artifact_prefix='./'; Test unit written to ./crash-da39a3ee5e6b4b0d3255bfef95601890afd80709
Base64:`

@EnchantedJohn
Copy link
Author

the strlen() in tinyxml2.cpp:

XMLError XMLDocument::Parse( const char* p, size_t len )
{
    Clear();

    if ( len == 0 || !p || !*p ) {
        SetError( XML_ERROR_EMPTY_DOCUMENT, 0, 0 ); 
        return _errorID;
    }    
    if ( len == (size_t)(-1) ) {
        len = strlen( p ); 
    }    
    TIXMLASSERT( _charBuffer == 0 ); 
    _charBuffer = new char[ len+1 ];
    memcpy( _charBuffer, p, len );
    _charBuffer[len] = 0; 

    Parse();
    if ( Error() ) {
        // clean up now essentially dangling memory.
        // and the parse fail can put objects in the
        // pools that are dead and inaccessible.
        DeleteChildren();
        _elementPool.Clear();
        _attributePool.Clear();
        _textPool.Clear();
        _commentPool.Clear();
    }    
    return _errorID;
}

@carnil
Copy link

carnil commented May 16, 2018

This issue has been assigned CVE-2018-11210

@EnchantedJohn
Copy link
Author

OK,I will close this Issues.

@kbabioch
Copy link

Why was the issue closed? What is the fixing commit?

@leethomason leethomason reopened this May 17, 2018
@leethomason
Copy link
Owner

Re-opening. I don't track all the external bug sites, so it's better to have here.

@leethomason
Copy link
Owner

Although an actual test case would be great.

@leethomason
Copy link
Owner

But that is a bug; strlen() search should be bound by the buffer length.

@leethomason leethomason self-assigned this May 17, 2018
@leethomason
Copy link
Owner

This one worries me; initial patch here: https://github.com/leethomason/tinyxml2/compare/strlen
Note that my current editor is trimming whitespace, so some extra lines get picked up.

@leethomason leethomason removed the bug label May 18, 2018
@leethomason
Copy link
Owner

Despite some misguided hacking on my part, I'm pretty sure there is no fixing this. I suspect it comes from passing a bad len to the parse - which the underlying code can't handle - or a -1 for length and a string that isn't null terminated - which the underlying code can't handle.

If there is a crash here that isn't an incorrect use of the API, I'd like a test case, else it isn't technically a bug.

@VictorRodriguez
Copy link

@leethomason due to the severity of CVE-2018-11210 can we take your patches from https://github.com/leethomason/tinyxml2/compare/strlen ?

@leethomason
Copy link
Owner

The patches don't fix the issue. If there are correct inputs, I never found a bug. (Or a practical way to defend against bad inputs.) I don't think there's a bug if the 'len' is correctly specified.

@dodjiseketeli
Copy link

Hello,

The patches don't fix the issue. If there are correct inputs, I never found a bug. (Or a practical way to defend against bad inputs.) I don't think there's a bug if the 'len' is correctly specified.

Would you accept a patch that uses strnlen to read a maximum of 4K bytes when no size is specified for XMLDocument::parse()? The 4K bytes size comes from the fact that it's the default virtual memory page size on GNU/Linux. Of course, when the user wants to read more that a maximum of 4K, she can specify the size she wants.

The patch would be this one: dodjiseketeli@665535a

@Dmitry-Me
Copy link
Contributor

Dmitry-Me commented Nov 18, 2018

This 4K limitation makes no sense either. What if the user wants to parse a string tail only? Then the substring to parse will not start at page length.

@leethomason
Copy link
Owner

I agree with @Dmitry-Me that the limit doesn't make sense.
@dodjiseketeli this is C++, if you pass bad inputs, it will crash, like any library in C++. What are you trying to solve?

@dodjiseketeli
Copy link

dodjiseketeli commented Nov 19, 2018 via email

@Dmitry-Me
Copy link
Contributor

Dmitry-Me commented Nov 19, 2018

@dodjiseketeli This is something unreasonable. A user passing a const char* into a C++ function and expecting the function to figure out the length is expecting too much. The contract is clear - either specify the length yourself or ensure that the data is properly null-terminated. Failing to null-terminate the data is the same as specifying too much of length - that's just malformed data being passed.

@dodjiseketeli
Copy link

dodjiseketeli commented Nov 19, 2018 via email

@dodjiseketeli
Copy link

Anyway, I don't feel strong about this. I just thought I'd ask you guys if you'd agree to tighten the API a little more.

I agree that in principle, the author of the fuzzing input test did mis-use the API. She or he should have forced the fuzzing program to pass in the size of the input buffer. My gut feeling is that the default argument -1 (for the size) makes it a little too easy to mis-use the API, as opposed to forcing the caller to always specify the size, but hey, that is just a design decision and I agree that stricto sensu, the test is failing because of a pre-condition violation.

So please do not feel to strong about this either. If you feel like you don't want to change the API in the future, then I am fine with your decision. I just wanted to make sure.

Thanks for your time!

@Dmitry-Me
Copy link
Contributor

It's the same as passing a buffer and a positive length which is "too large". 30 bytes data buffer and 41 as length for example. Do this and you face undefined behavior.

@ghost
Copy link

ghost commented Feb 11, 2019

I agree with @leethomason and @Dmitry-Me. The fuzzing test was the buggy "program" that allowed "end user" (libfuzzer in this instance) to cause the buffer to be over read. If instead of the Parse function there would be direct invocation of the strlen function with the otherwise same implementation of the test, would it lead to the CVE being filled for the C standard library?

There is this thread about writing fuzzer test for the APIs that require C string in the buffer: libFuzzer: add an option to always null-terminate? The test program should have either use the suggested in this llvm thread aproach to update the buffer to be NULL terminated or call the Parse with 2 parameters.

If any change on the tinyxml2 side I might tend to suggest removing optional parameter and instead providing 2 functions Parse(const char* c_string) and Parse(const char* data, size_t size). It would provide the same flexibility but would indicate (more strongly - because the API description is already clear) what is expected when calling the Parse with either one or two parameters.

@ghost
Copy link

ghost commented Feb 11, 2019

Just for reference fixed fuzzer tests would look like this:

#include "tinyxml2.h"

#include <stdint.h>

extern "C" int LLVMFuzzerTestOneInput(const char *Data, size_t Size)
{
    tinyxml2::XMLDocument doc;
    doc.Parse(Data, Size);
    return doc.ErrorID();
}

OR

#include "tinyxml2.h"

#include <stdint.h>
#include <string>

extern "C" int LLVMFuzzerTestOneInput(const char *Data, size_t Size)
{
    tinyxml2::XMLDocument doc;
    std::string data_string(Data, Size);
    doc.Parse(data_string.c_str());
    return doc.ErrorID();
}

On a side note I have submitted request to reject the CVE-2018-11210.

@chennyu
Copy link

chennyu commented Mar 5, 2019

Just for reference fixed fuzzer tests would look like this:

#include "tinyxml2.h"

#include <stdint.h>

extern "C" int LLVMFuzzerTestOneInput(const char *Data, size_t Size)
{
    tinyxml2::XMLDocument doc;
    doc.Parse(Data, Size);
    return doc.ErrorID();
}

OR

#include "tinyxml2.h"

#include <stdint.h>
#include <string>

extern "C" int LLVMFuzzerTestOneInput(const char *Data, size_t Size)
{
    tinyxml2::XMLDocument doc;
    std::string data_string(Data, Size);
    doc.Parse(data_string.c_str());
    return doc.ErrorID();
}

On a side note I have submitted request to reject the CVE-2018-11210.

Hi, has CVE-2018-11210 beed cancelled? I found it still exist in NVD.

@mssalvatore
Copy link

On a side note I have submitted request to reject the CVE-2018-11210.

@kurylo, have you received any response from MITRE?

@ghost
Copy link

ghost commented Aug 8, 2019

No response from MITRE.

@mssalvatore
Copy link

@kurylo, I was able to get MITRE to mark this CVE as "disputed". The CVE in entry in MITRE should be updated to reflect this within the next few hours.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants