Fernando found a heap-buffer-overflow and reported it directly to me; the text so far is attached below. I have now been able to REPEAT the event, and am now searching for a fix.
I do not know anything about Valgrind, but luckily the Windows MSVC Debug configuration does the same. It inserts a 'deb_malloc' which allocates much more than the user request, and fills the leading and trailing memory with a marker, and returns the appropriate pointer to the user.
Leading is 0xfd, and trailing is 0xab. Then on 'deb_free', it checks the leading marker for buffer under-run, and the trailing marker for buffer over-run.
This is a view of the memory as allocated in this case
fd fd fd fd fd fd fd fd ab ab ab ab ab ab ab ab ab ab ab ab ab ab ...
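To make the marker idea concrete, here is a minimal sketch of a guarded allocator. It is not the MSVC debug heap: the names `guard_malloc`, `guard_check` and `guard_free` are hypothetical, the marker width of 4 bytes is an assumption, and the marker values simply follow the description above. Unlike a real debug heap, which records the size in a hidden header, this sketch asks the caller to pass the size back in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative values only, following the description in the text. */
#define GUARD_LEN  4
#define LEAD_MARK  0xFD
#define TRAIL_MARK 0xAB

/* Hypothetical helper: allocate n user bytes with a marker on each side. */
static void *guard_malloc(size_t n)
{
    unsigned char *raw = (unsigned char *)malloc(GUARD_LEN + n + GUARD_LEN);
    if (raw == NULL)
        return NULL;
    memset(raw, LEAD_MARK, GUARD_LEN);                  /* before user data */
    memset(raw + GUARD_LEN + n, TRAIL_MARK, GUARD_LEN); /* after user data  */
    return raw + GUARD_LEN;
}

/* Hypothetical helper: 0 = intact, -1 = under-run, 1 = over-run. */
static int guard_check(const void *user, size_t n)
{
    const unsigned char *raw;
    size_t i;

    assert(user != NULL);
    raw = (const unsigned char *)user - GUARD_LEN;
    for (i = 0; i < GUARD_LEN; i++)
        if (raw[i] != LEAD_MARK)
            return -1;
    for (i = 0; i < GUARD_LEN; i++)
        if (raw[GUARD_LEN + n + i] != TRAIL_MARK)
            return 1;
    return 0;
}

/* Hypothetical helper: release a block obtained from guard_malloc. */
static void guard_free(void *user)
{
    if (user != NULL)
        free((unsigned char *)user - GUARD_LEN);
}
```

With a zero-byte request the user pointer sits directly on the trailing marker, so the very first byte written through it is reported by `guard_check` as an over-run.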
The event happens in ParseValue, where the length is held in an int len, which gets to contain a -1! MINUS ONE!! It is parsing the anchor href in this strange string
ParseValue then calls -
Inexplicably tmbstrndup is declared as -
Notice the int has now become a uint, so it cannot test for less than zero... it does
Notice the plus 1, so it arrives at TidyAlloc with a ZERO!!!
Now it seems malloc does not mind a zero value, malloc(0), and dutifully returns a pointer!!!! But that is a pointer to what exactly?
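The arithmetic behind that ZERO can be sketched in a few lines. The function names below are hypothetical stand-ins, not the tidy source; the only thing assumed is what the text describes, a signed length passed into an unsigned parameter and then incremented by one.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the "1 + len" size computed before the
   allocation, where len arrives as an unsigned int. */
static unsigned int alloc_request(unsigned int len)
{
    return 1u + len;   /* unsigned arithmetic wraps modulo UINT_MAX + 1 */
}

/* Hypothetical caller that holds the length in a signed int. */
static unsigned int request_for_signed_len(int len)
{
    /* Converting -1 to unsigned int yields UINT_MAX. */
    return alloc_request((unsigned int)len);
}

/* Usage: a normal length, then the -1 case from this report. */
static void check_examples(void)
{
    assert(request_for_signed_len(4) == 5u);
    assert(request_for_signed_len(-1) == 0u);   /* UINT_MAX + 1 wraps to 0 */
}
```

A `len > 0` test on the unsigned value cannot catch this, because -1 has already become the largest possible unsigned value by the time the test runs.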
Then tmbstrndup does the corruption, with -
But thankfully the corruption stops when a 0 is reached in the lexer with
Then a final corruption, the reason for the plus 1 -
This is a view of the memory after -
fd fd fd fd 68 72 65 66 00 00 ab ab ab ab ab ab ab ab
Note in this case, since the loop did not end when len became zero, we end up with a double 0! But I get a big popup dialog when this memory is passed to 'deb_free'... advising me of possible heap corruption...
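The byte count in that memory view can be reproduced with a small model of a strndup-style copy loop. This is a sketch under assumptions, not the tidy code: `copy_like_strndup` is a hypothetical name, the source string "href" is taken from the bytes 68 72 65 66 in the dump, and the destination is a deliberately oversized buffer so the demo itself stays within bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a strndup-style copy with an unsigned length.
   Writes into dst (the caller makes it large enough for the demo) and
   returns how many bytes were written, terminators included. */
static size_t copy_like_strndup(char *dst, const char *src, unsigned int len)
{
    char *cp = dst;

    assert(dst != NULL && src != NULL);
    /* Ends when len runs out, or after the terminating 0 has been copied. */
    while (len-- > 0 && (*cp++ = *src++) != '\0')
        ;
    *cp++ = '\0';   /* the final write, the reason for the "plus 1" */
    return (size_t)(cp - dst);
}
```

Called with a length of 4 it writes 5 bytes, which is what a `1 + len` allocation expects. Called with `(unsigned int)-1` the length never runs out, so the loop copies the source's own terminator and the final write adds a second one: 6 bytes written into a block that was requested with a size of zero.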
So that is a detailed description of the problem. Now to decide what would be the best fix! Ideas...
Since tmbstrndup does check if len > 0, could change the uint to an int, and then it would return a NULL memory. Probably not a good thing in all cases...
I am still trying to understand exactly why the int len back in ParseValue went negative!
Will work on this... As indicated any ideas welcome...
And again thanks to Fernando for finding this case... What follows is our direct conversation to date...
I believe I tried both versions, the old (which is the one that Debian
Did you try to run it under valgrind? Valgrind reports the errors but
I can give it a try later again and tell you the exact version I used.
On 6/1/15 5:28 AM, Geoff McLane wrote:
Ok, found the problem and what seems like a suitable solution...
This is a case where we effectively have
So since there is no attribute value, len will be zero, which is ok... it is the truth - there is no attribute value length...
Now the code has
So added to the code in this block -
And even more, if len does have a value, then protect against len ever going negative with -
It was the first of these two that did the damage.
start=4, len=0, and just coincidentally lexer->lexbuf contained a 0xa, thus len was decremented to -1 ;=((.
That in itself is quite unique since there are only a few special cases where control chars are added to the lexer. But you guessed it, one is when parsing 'code', like
Now when the code does
In some cases this bug could exhibit a different problem like parsing the snippet
Now the lexer buffer will contain 2 or more IsWhite() chars and len would be reduced to -2, or less. As a 32-bit uint, -2 becomes 4,294,967,294, and with the plus 1 the malloc request would be a giant 4,294,967,295 byte allocation, a value lots of OSes will reject...
And I can confirm this BUG exists in the 2008/9 libtidy.0.99.so last release, the sourceforge cvs tidy, which is still present in some distributions. Just the quite unique nature of using 'code' ending in spaces or a newline just before an attribute with a 'blank' value prevents it from being seen more often...
Interestingly, it is NOT present in TidyAug2000, the earliest tidy source I have, since (a) it did not have that additional len decrement, and (b) used an int in wstrndup, the predecessor of tmbstrndup, so the code
Anyway, now no more buffer corruption, or massive allocs, for this latest tidy ;=)). I love crushing such ancient bugs...
Also bumped the version to 4.9.31 for this important fix.
A note that this is CVE-2015-5522 and CVE-2015-5523 and that Ubuntu has fixed this particular bug in their version https://launchpad.net/ubuntu/+source/tidy/20091223cvs-1.5 in 'wily'.
(Adding the CVE numbers here to make it clear when searching for that CVE that the bug has been fixed.)