Crash with malformed <meta> tag #739

Closed
jengelh opened this issue May 30, 2018 · 3 comments

jengelh commented May 30, 2018

Input:

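#include <tidy.h>

int main(void)
{
        TidyDoc tdoc = tidyCreate();
        /* The doubled quotes (\"\") are kept as reported: they are the malformed
         * input, and they match the attribute warnings in the output below. */
        tidyParseString(tdoc, "<!DOCTYPE HTML PUBLIC \"\"-//W3C//DTD HTML 4.0 Transitional//EN\"\"><html><head><meta content=\"\"text/html; charset=utf-8\"\" http-equiv=Content-Type>");
        tidyCleanAndRepair(tdoc);
        /* Release the document once the crash no longer occurs. */
        tidyRelease(tdoc);
        return 0;
}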

Output:

GNU gdb (GDB; openSUSE Tumbleweed) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.opensuse.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.out...done.
(gdb) r
Starting program: /home/jengelh/work/kc/librosie/a.out 
line 1 column 77 - Warning: <meta> attribute "text/html;" lacks value
line 1 column 77 - Info: value for attribute "charset" missing quote marks
line 1 column 77 - Info: value for attribute "http-equiv" missing quote marks
line 1 column 71 - Warning: inserting missing 'title' element
line 1 column 77 - Warning: discarding unexpected <meta>

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b31418 in prvTidyTidyMetaCharset (doc=0x602260) at /usr/src/debug/tidy-5.6.0-0.x86_64/src/clean.c:2215
2215        for (currentNode = head->content; currentNode; currentNode = currentNode->next)
(gdb) b prvTidyTidyMetaCharset
Breakpoint 1 at 0x7ffff7b30d2a: file /usr/src/debug/tidy-5.6.0-0.x86_64/src/clean.c, line 2178.
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/jengelh/work/kc/librosie/a.out 
line 1 column 77 - Warning: <meta> attribute "text/html;" lacks value
line 1 column 77 - Info: value for attribute "charset" missing quote marks
line 1 column 77 - Info: value for attribute "http-equiv" missing quote marks
line 1 column 71 - Warning: inserting missing 'title' element

Breakpoint 1, prvTidyTidyMetaCharset (doc=0x602260) at /usr/src/debug/tidy-5.6.0-0.x86_64/src/clean.c:2178
2178    {
(gdb) n
2182        Bool charsetFound = no;
(gdb) 
2183        uint outenc = cfg(doc, TidyOutCharEncoding);
(gdb) 
2184        ctmbstr enc = TY_(GetEncodingNameFromTidyId)(outenc);
(gdb) 
2186        Node *head = TY_(FindHEAD)(doc);
(gdb) 
2194        Bool add_meta = cfgBool(doc, TidyMetaCharset);
(gdb) 
2197        if (!head || !enc || !TY_(tmbstrlen)(enc))
(gdb) 
2199        if (outenc == RAW)
(gdb) 
2202        if (outenc == ISO2022)
(gdb) 
2205        if (cfgAutoBool(doc, TidyBodyOnly) == TidyYesState)
(gdb) 
2208        tidyBufInit(&charsetString);
(gdb) 
2210        tidyBufClear(&charsetString);
(gdb) 
2211        tidyBufAppend(&charsetString, "charset=", 8);
(gdb) 
2212        tidyBufAppend(&charsetString, (char*)enc, TY_(tmbstrlen)(enc));
(gdb) 
2213        tidyBufAppend(&charsetString, "\0", 1); /* zero terminate the buffer */
(gdb) 
2215        for (currentNode = head->content; currentNode; currentNode = currentNode->next)
(gdb) 
2217            if (!nodeIsMETA(currentNode))
(gdb) p currentNode
$1 = (Node *) 0x60b5f0
(gdb) p currentNode->prev
$2 = (Node *) 0x0
(gdb) n
2219            charsetAttr = attrGetCHARSET(currentNode);
(gdb) 
2220            httpEquivAttr = attrGetHTTP_EQUIV(currentNode);
(gdb) 
2221            if (!charsetAttr && !httpEquivAttr)
(gdb) 
2227            if (charsetAttr && !httpEquivAttr)
(gdb) 
2262            if (httpEquivAttr && !charsetAttr)
(gdb) 
2329            if (httpEquivAttr && charsetAttr)
(gdb) 
2332                prevNode = currentNode->prev;
(gdb) 
2333                TY_(Report)(doc, head, currentNode, DISCARDING_UNEXPECTED);
(gdb) 
line 1 column 77 - Warning: discarding unexpected <meta>
2334                TY_(DiscardElement)(doc, currentNode);
(gdb) 
2335                currentNode = prevNode;
(gdb) 
2215        for (currentNode = head->content; currentNode; currentNode = currentNode->next)
(gdb) 

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b31418 in prvTidyTidyMetaCharset (doc=0x602260) at /usr/src/debug/tidy-5.6.0-0.x86_64/src/clean.c:2215
2215        for (currentNode = head->content; currentNode; currentNode = currentNode->next)

More info:
libtidy 5.6.0.
Regression from 5.4.0 where it appeared to work fine (no crash).
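
The trace above suggests the failure mode: the discarded <meta> is the first child of <head>, so `currentNode->prev` is NULL, and after `currentNode = prevNode` the loop increment `currentNode->next` dereferences NULL. Below is a minimal sketch of that pattern with one safe alternative, which saves the successor before removing the current node. The `Item` and `List` types and all function names are hypothetical stand-ins, not libtidy's `Node` API, and this is not necessarily the change made in PR #661.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tree node; NOT libtidy's Node. */
typedef struct Item {
    struct Item *prev;
    struct Item *next;
    int discard;            /* non-zero if this item should be removed */
} Item;

typedef struct List {
    Item *first;
} List;

/* Append a new item at the end. Returns NULL on allocation failure. */
static Item *list_append(List *list, int discard)
{
    Item *item = calloc(1, sizeof *item);
    if (!item)
        return NULL;
    item->discard = discard;
    if (!list->first) {
        list->first = item;
    } else {
        Item *tail = list->first;
        while (tail->next)
            tail = tail->next;
        tail->next = item;
        item->prev = tail;
    }
    return item;
}

/* Unlink and free one item, fixing up its neighbours and the list head. */
static void list_discard(List *list, Item *item)
{
    assert(list && item);
    if (item->prev)
        item->prev->next = item->next;
    else
        list->first = item->next;
    if (item->next)
        item->next->prev = item->prev;
    free(item);
}

/*
 * Remove every flagged item and return how many were removed.
 * The successor is saved BEFORE the current item is freed, so the loop
 * never steps through a freed or NULL pointer, even when the very first
 * item (prev == NULL) is the one being removed.
 */
static int list_discard_flagged(List *list)
{
    int removed = 0;
    Item *next;
    for (Item *cur = list->first; cur; cur = next) {
        next = cur->next;
        if (cur->discard) {
            list_discard(list, cur);
            removed++;
        }
    }
    return removed;
}

/* Count the items currently in the list. */
static int list_length(const List *list)
{
    int n = 0;
    for (const Item *cur = list->first; cur; cur = cur->next)
        n++;
    return n;
}

/* Free all remaining items. */
static void list_free(List *list)
{
    Item *next;
    for (Item *cur = list->first; cur; cur = next) {
        next = cur->next;
        free(cur);
    }
    list->first = NULL;
}
```

Building a list whose first item is flagged and calling `list_discard_flagged()` exercises the case from the trace: the head is removed and the walk continues from the saved successor instead of from the NULL `prev` pointer.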

@geoffmcl
Contributor

@jengelh thank you for the issue, the code to reproduce it, and the debugging.

This issue was addressed in #656 and closed by PR #661. Please upgrade to version 5.7.1 or later.

Can you confirm it has been fixed? Thanks.
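
For callers who want to detect an affected library at runtime, here is a small sketch of a version comparison. It assumes the version string has the form major.minor.patch, as in "5.6.0" or "5.7.28". The helper names are hypothetical. In a real program the first argument would typically come from libtidy's `tidyLibraryVersion()`, which is left out here so the sketch compiles without libtidy.

```c
#include <stdio.h>

/* Parse "major.minor.patch" into v[0..2]. Returns 1 on success, 0 otherwise. */
static int parse_version(const char *s, int v[3])
{
    return s != NULL && sscanf(s, "%d.%d.%d", &v[0], &v[1], &v[2]) == 3;
}

/*
 * Returns 1 if `have` is at least `want`, 0 if it is older,
 * and -1 if either string cannot be parsed.
 */
static int version_at_least(const char *have, const char *want)
{
    int h[3], w[3];
    if (!parse_version(have, h) || !parse_version(want, w))
        return -1;
    for (int i = 0; i < 3; i++) {
        if (h[i] != w[i])
            return h[i] > w[i];
    }
    return 1; /* identical versions */
}
```

With this, `version_at_least("5.6.0", "5.7.1")` reports the reporter's version as too old, while `version_at_least("5.7.28", "5.7.1")` passes.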

geoffmcl added this to the 5.7 milestone on May 31, 2018

@jengelh
Author

jengelh commented Jun 1, 2018

I can confirm it has already been fixed with PR 661. It only needs a proper release/tag. :-)

@geoffmcl
Contributor

geoffmcl commented Oct 8, 2020

@jengelh thank you for confirming the fix, years ago. I am closing this now, just to be tidy.

Tagging and releasing is a separate topic. I did tag 5.7.28 on Jul 8, 2019, but ran out of steam before completing the release, and there has been nothing since. Sorry about that; the project needs some helpers.

But as stated, thanks for the issue.

geoffmcl closed this as completed on Oct 8, 2020