assorted errors in corner cases #245

rsc · 2021-09-02T16:10:01Z

I have been playing around with the new Go fuzzing support and found some bugs in goldmark v1.4.0. I am running it as:

gm := goldmark.New(
	goldmark.WithRendererOptions(
		html.WithUnsafe(),
	),
)
var buf bytes.Buffer
if err := gm.Convert([]byte(input), &buf); err != nil {
	t.Fatal(err)
}
output := buf.String()

The first four bugs seem important. The rest are more minor. Here they are:

In a* b c d *e*, the unusable first star appears to stop the *e* from being recognized as emphasis.
- input: "a* b c d *e*\n"
- output: "<p>a* b c d *e*</p>\n"
- golden: "<p>a* b c d <em>e</em></p>\n"
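
For reference, here is a rough sketch of the CommonMark flanking test that decides whether a `*` run can open or close emphasis. The names `isPunct` and `flanking` are made up for illustration and this is not goldmark's code. It only shows why the first star in the input above can close but never open emphasis, so it should fall back to literal text without affecting the later `*e*`. It uses `unicode.IsSpace` as an approximation of the spec's Unicode whitespace.

```go
package main

import (
	"fmt"
	"unicode"
)

// isPunct approximates CommonMark 0.30 punctuation: any ASCII punctuation
// or symbol character, plus Unicode characters in the P categories.
func isPunct(r rune) bool {
	if r <= unicode.MaxASCII {
		return unicode.IsPunct(r) || unicode.IsSymbol(r)
	}
	return unicode.IsPunct(r)
}

// flanking reports whether a delimiter run sitting between the runes before
// and after it is left-flanking and/or right-flanking. Callers pass ' ' for
// the start or end of the text, which the spec treats as whitespace.
func flanking(before, after rune) (left, right bool) {
	left = !unicode.IsSpace(after) &&
		(!isPunct(after) || unicode.IsSpace(before) || isPunct(before))
	right = !unicode.IsSpace(before) &&
		(!isPunct(before) || unicode.IsSpace(after) || isPunct(after))
	return left, right
}

func main() {
	// For '*', a left-flanking run can open emphasis and a
	// right-flanking run can close it.
	cases := []struct{ before, after rune }{
		{'a', ' '},  // first star in "a* b c d *e*"
		{' ', 'e'},  // star that opens "*e*"
		{'e', '\n'}, // star that closes "*e*"
	}
	for _, c := range cases {
		left, right := flanking(c.before, c.after)
		fmt.Printf("%q %q canOpen=%v canClose=%v\n", c.before, c.after, left, right)
	}
}
```

The first star is right-flanking only, and there is no opener before it, so it is plain text. The last two stars form an ordinary opener and closer pair.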
A trailing space in an HTML tag stops it from starting an HTML block, but it is still recognized as an inline HTML tag. I think condition 7 applies here.
- input: "<aaa >\n"
- output: "<p><aaa ></p>\n"
- golden: "<aaa >\n"
Internal tabs are expanded to four spaces incorrectly in pre blocks:
- input: "\t\tx\n"
- output: "<pre><code> x\n</code></pre>\n" (four spaces there, not just one)
- golden: "<pre><code>\tx</code></pre>\n"
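
A minimal sketch of the indentation rule at play here, assuming tab stops of 4 columns as CommonMark describes: only the columns needed for the code block's indentation are consumed, and any tab after that passes through untouched. `stripIndent` and `tabStop` are hypothetical names, not part of goldmark.

```go
package main

import (
	"fmt"
	"strings"
)

const tabStop = 4

// stripIndent removes the first width columns of leading whitespace from
// line, treating a tab as advancing to the next multiple of tabStop. It
// reports false if the line has fewer than width columns of indentation.
func stripIndent(line string, width int) (string, bool) {
	col, i := 0, 0
	for col < width && i < len(line) {
		switch line[i] {
		case ' ':
			col++
		case '\t':
			next := col + tabStop - col%tabStop
			if next > width {
				// The tab straddles the boundary: keep the leftover columns as spaces.
				return strings.Repeat(" ", next-width) + line[i+1:], true
			}
			col = next
		default:
			return line, false
		}
		i++
	}
	if col < width {
		return line, false
	}
	return line[i:], true
}

func main() {
	rest, ok := stripIndent("\t\tx\n", 4)
	fmt.Printf("%q %v\n", rest, ok) // the second tab is preserved
}
```

Only a tab that straddles the indentation boundary is converted to spaces, which cannot happen with a width of 4 but can with narrower indents such as list item content.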
NUL bytes are not replaced with U+FFFD as required:
- input: "hello\x00world\n"
- output: "<p>hello\x00world</p>\n"
- golden: "<p>hello\ufffdworld</p>\n"
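
The replacement itself is a one-liner. A sketch of the preprocessing step the spec asks for, with `replaceNUL` as a hypothetical helper name (where goldmark should apply it internally is a separate question):

```go
package main

import (
	"fmt"
	"strings"
)

// replaceNUL swaps every NUL byte for U+FFFD REPLACEMENT CHARACTER,
// as CommonMark requires for security reasons.
func replaceNUL(src string) string {
	return strings.ReplaceAll(src, "\x00", "\uFFFD")
}

func main() {
	fmt.Printf("%q\n", replaceNUL("hello\x00world\n"))
}
```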
Newlines are not preserved the same way as spaces in code spans, even though they are supposed to be converted to spaces before any other processing.
- input: "`\n`"
- output: "<p><code></code></p>\n"
- golden: "<p><code> </code></p>\n"
- input: "`x\n`"
- output: "<p><code>x</code></p>\n"
- golden: "<p><code>x </code></p>\n"
- input: "`\nx`"
- output: "<p><code>x</code></p>\n"
- golden: "<p><code> x</code></p>\n"
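
A sketch of the code span normalization as I read the spec: line endings become spaces first, and only then is the "one leading and one trailing space" stripping rule considered. `normalizeCodeSpan` is a hypothetical name, and the input is the text between the backticks.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeCodeSpan applies the CommonMark code span rules to the raw
// content between the backtick delimiters.
func normalizeCodeSpan(s string) string {
	// Step 1: line endings are converted to spaces.
	s = strings.ReplaceAll(s, "\r\n", " ")
	s = strings.ReplaceAll(s, "\n", " ")
	s = strings.ReplaceAll(s, "\r", " ")
	// Step 2: if the content both begins and ends with a space, and is not
	// made up entirely of spaces, strip one space from each end.
	if len(s) >= 2 && s[0] == ' ' && s[len(s)-1] == ' ' && strings.Trim(s, " ") != "" {
		s = s[1 : len(s)-1]
	}
	return s
}

func main() {
	for _, in := range []string{"\n", "x\n", "\nx", " x "} {
		fmt.Printf("%q -> %q\n", in, normalizeCodeSpan(in))
	}
}
```

In all three inputs above the content has a space on only one end, or is all spaces, so nothing is stripped and the converted newline must survive.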
A lone # on a line without a trailing newline is not turned into an h1. (If a \n is added, this case starts working.)
- input: "#"
- output: "<p>#</p>\n"
- golden: "<h1></h1>\n"
A lone * on a paragraph continuation line without a trailing newline is turned into a bullet list item, incorrectly. (If a \n is added, this case starts working.)
- input: "x\n*"
- output: "<p>x</p>\n<ul>\n<li></li>\n</ul>\n"
- golden: "<p>x\n*</p>\n"
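
Since both of the last two cases go away when the input ends in a newline, a caller stuck on v1.4.0 could probably sidestep them by normalizing the input before calling Convert. This is only a workaround sketch, not a fix, and `ensureTrailingNewline` is a hypothetical helper:

```go
package main

import "fmt"

// ensureTrailingNewline returns src unchanged if it is empty or already
// ends in '\n'; otherwise it returns a copy with '\n' appended.
func ensureTrailingNewline(src []byte) []byte {
	if len(src) == 0 || src[len(src)-1] == '\n' {
		return src
	}
	out := make([]byte, 0, len(src)+1)
	out = append(out, src...)
	return append(out, '\n')
}

func main() {
	fmt.Printf("%q\n", ensureTrailingNewline([]byte("#")))
	fmt.Printf("%q\n", ensureTrailingNewline([]byte("x\n*")))
}
```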
If a link reference definition is followed by a single or double quote, it is not recognized as a link reference definition (but it is):
- input: "[x]: <>\n'\n"
- output: "<p>[x]: &lt;&gt;\n'</p>\n"
- golden: "<p>'</p>\n"
- input: "[x]: <>\n\"\n"
- output: "<p>[x]: &lt;&gt;\n&quot;</p>\n"
- golden: "<p>&quot;</p>\n"
Hex character entities are not limited to 6 digits:
- input: "&#x0000041;\n"
- output: "<p>A</p>\n"
- golden: "<p>&amp;#x0000041;</p>\n"
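
The digit limits from the spec (1 to 7 digits for decimal references, 1 to 6 for hexadecimal) fit in one regular expression. A sketch, with `isNumericCharRef` as a hypothetical name; it checks only the syntax, not whether the code point is valid:

```go
package main

import (
	"fmt"
	"regexp"
)

// numericRef matches a complete decimal or hexadecimal numeric character
// reference within the CommonMark digit limits.
var numericRef = regexp.MustCompile(`^&#(?:[0-9]{1,7}|[xX][0-9a-fA-F]{1,6});$`)

// isNumericCharRef reports whether s is syntactically a numeric character reference.
func isNumericCharRef(s string) bool {
	return numericRef.MatchString(s)
}

func main() {
	fmt.Println(isNumericCharRef("&#x41;"))      // two hex digits
	fmt.Println(isNumericCharRef("&#x0000041;")) // seven hex digits, over the limit
}
```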
Not sure if this is technically a bug, but I can't figure out what rule goldmark applies for escaping a 0x01 byte. If I make it the whole URL it does not get escaped, as shown below, but if instead I use "[x](\x01a)" or "[x](a\x01)" or even "[x](\x01\x01)" then it does get escaped. Only the 0x01 byte by itself doesn't get escaped. It seems like it should get escaped all the time.
- input: "[x](\x01)"
- output: "<p><a href=\"\x01\">x</a></p>\n"
- golden: "<p><a href=\"%01\">x</a></p>\n"
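
To make "escaped all the time" concrete, here is the kind of position-independent rule I would expect. CommonMark does not mandate a particular URL escaping scheme, so the byte ranges below are my assumption and `escapeURLBytes` is a hypothetical helper, not goldmark's function. It does no HTML escaping.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeURLBytes percent-encodes control bytes, space, DEL, and non-ASCII
// bytes, and copies every other byte through unchanged. The decision is
// made per byte, so it cannot depend on what else is in the destination.
func escapeURLBytes(dest string) string {
	var b strings.Builder
	for i := 0; i < len(dest); i++ {
		c := dest[i]
		if c <= 0x20 || c >= 0x7f {
			fmt.Fprintf(&b, "%%%02X", c)
		} else {
			b.WriteByte(c)
		}
	}
	return b.String()
}

func main() {
	for _, in := range []string{"\x01", "\x01a", "a\x01", "\x01\x01"} {
		fmt.Printf("%q -> %q\n", in, escapeURLBytes(in))
	}
}
```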
A form feed is treated as a space (this appears to be a change in CommonMark 0.30).
- input: "x \f\n"
- output: "<p>x</p>\n"
- golden: "<p>x \f</p>\n"
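
My reading of CommonMark 0.30 is that paragraph content has only initial and final spaces or tabs removed, and form feed is neither. Under that assumption the trimming would look like this sketch (`trimLineEnd` is a hypothetical name):

```go
package main

import (
	"fmt"
	"strings"
)

// trimLineEnd strips trailing spaces and tabs only. A form feed is left
// alone, and so is any space that comes before it.
func trimLineEnd(line string) string {
	return strings.TrimRight(line, " \t")
}

func main() {
	fmt.Printf("%q\n", trimLineEnd("x \f"))
	fmt.Printf("%q\n", trimLineEnd("x \t"))
}
```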

Full self-contained test at https://play.golang.org/p/wHdu67atFXW.

yuin · 2021-09-12T09:30:18Z

@rsc Thanks for your detailed report. I've fixed issues and released a new version.

yuin closed this as completed in 8174177 Sep 11, 2021

yuin added 10 commits that referenced this issue on Sep 11, 2021:

- Fix #245 - 2 (fad80b4)
- Fix #245 - 3 (466482b)
- Fix #245 - 4 (351308f)
- Fix #245 - 5 (7efc483)
- Fix #245 - 6 (1306649)
- Fix #245 - 7 (4317d98)
- Fix #245 - 8 (457c157)
- Fix #245 - 9 (a8ed3c4)
- Fix #245 - 10 (97df31c)
- Fix #245 - 11 (d44652d)

rsc mentioned this issue Sep 17, 2021

assorted errors in corner cases (v1.4.1) #248

Closed
