assorted errors in corner cases #245

Closed
rsc opened this issue Sep 2, 2021 · 1 comment

rsc commented Sep 2, 2021

I have been playing around with the new Go fuzzing support and found some bugs in goldmark v1.4.0. I am running it as:

gm := goldmark.New(
	goldmark.WithRendererOptions(
		html.WithUnsafe(),
	),
)
var buf bytes.Buffer
if err := gm.Convert([]byte(input), &buf); err != nil {
	t.Fatal(err)
}
output := buf.String()

The first four bugs seem important. The rest are more minor. Here they are:

  1. In a* b c d *e*, the unusable first star appears to stop the *e* from being recognized as emphasis.

    • input: "a* b c d *e*\n"
    • output: "<p>a* b c d *e*</p>\n"
    • golden: "<p>a* b c d <em>e</em></p>\n"
  2. A trailing space in an HTML tag stops it from starting an HTML block, but it is still recognized as an inline HTML tag. I think condition 7 applies here.

    • input: "<aaa >\n"
    • output: "<p><aaa ></p>\n"
    • golden: "<aaa >\n"
  3. Internal tabs are expanded to four spaces incorrectly in pre blocks:

    • input: "\t\tx\n"
    • output: "<pre><code> x\n</code></pre>\n" (four spaces there, not just one)
    • golden: "<pre><code>\tx\n</code></pre>\n"
  4. NUL bytes are not replaced with U+FFFD as required:

    • input: "hello\x00world\n"
    • output: "<p>hello\x00world</p>\n"
    • golden: "<p>hello\ufffdworld</p>\n"
  5. Newlines are not preserved the same way as spaces in code spans, even though they are supposed to be converted to spaces before any other processing.

    • input: "`\n`"
    • output: "<p><code></code></p>\n"
    • golden: "<p><code> </code></p>\n"
    • input: "`x\n`"
    • output: "<p><code>x</code></p>\n"
    • golden: "<p><code>x </code></p>\n"
    • input: "`\nx`"
    • output: "<p><code>x</code></p>\n"
    • golden: "<p><code> x</code></p>\n"
  6. A lone # on a line without a trailing newline is not turned into an h1. (If a \n is added, this case starts working.)

    • input: "#"
    • output: "<p>#</p>\n"
    • golden: "<h1></h1>\n"
  7. A lone * on a paragraph continuation line without a trailing newline is turned into a bullet list item, incorrectly. (If a \n is added, this case starts working.)

    • input: "x\n*"
    • output: "<p>x</p>\n<ul>\n<li></li>\n</ul>\n"
    • golden: "<p>x\n*</p>\n"
  8. If a link reference definition is followed by a line containing a single or double quote, it is not recognized as a link reference definition, even though it is one:

    • input: "[x]: <>\n'\n"
    • output: "<p>[x]: &lt;&gt;\n'</p>\n"
    • golden: "<p>'</p>\n"
    • input: "[x]: <>\n\"\n"
    • output: "<p>[x]: &lt;&gt;\n&quot;</p>\n"
    • golden: "<p>&quot;</p>\n"
  9. Hex character entities are not limited to 6 digits:

    • input: "&#x0000041;\n"
    • output: "<p>A</p>\n"
    • golden: "<p>&amp;#x0000041;</p>\n"
  10. Not sure if this is technically a bug, but I can't figure out what rule goldmark applies for escaping a 0x01 byte. If I make it the whole URL it does not get escaped, as shown below, but if instead I use "[x](\x01a)" or "[x](a\x01)" or even "[x](\x01\x01)" then it does get escaped. Only the 0x01 byte by itself doesn't get escaped. It seems like it should get escaped all the time.

    • input: "[x](\x01)"
    • output: "<p><a href=\"\x01\">x</a></p>\n"
    • golden: "<p><a href=\"%01\">x</a></p>\n"
  11. A form feed is treated as a space (this appears to be a change in CommonMark 0.30).

    • input: "x \f\n"
    • output: "<p>x</p>\n"
    • golden: "<p>x \f</p>\n"
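
Items 4, 5, and 9 describe rules that are short enough to state directly in code. The following is a minimal sketch of those three rules as I read them from the report, not goldmark's implementation; the function names replaceNUL, normalizeCodeSpan, and isHexCharRef are hypothetical and exist only for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
	"strings"
)

// replaceNUL swaps every NUL byte for U+FFFD (item 4).
// Hypothetical helper, not part of goldmark.
func replaceNUL(src []byte) []byte {
	return bytes.ReplaceAll(src, []byte{0}, []byte("\uFFFD"))
}

// normalizeCodeSpan sketches the code span rule behind item 5:
// line endings become spaces before anything else, and one space is
// stripped from each side only when the content both begins and ends
// with a space and is not made up entirely of spaces.
func normalizeCodeSpan(s string) string {
	s = strings.ReplaceAll(s, "\r\n", " ")
	s = strings.ReplaceAll(s, "\n", " ")
	s = strings.ReplaceAll(s, "\r", " ")
	if len(s) >= 2 && s[0] == ' ' && s[len(s)-1] == ' ' && strings.Trim(s, " ") != "" {
		s = s[1 : len(s)-1]
	}
	return s
}

// hexCharRef accepts a hexadecimal character reference with 1 to 6
// hex digits, the limit that item 9 refers to.
var hexCharRef = regexp.MustCompile(`^&#[xX][0-9a-fA-F]{1,6};$`)

// isHexCharRef reports whether s is a complete hex character reference.
func isHexCharRef(s string) bool {
	return hexCharRef.MatchString(s)
}

func main() {
	fmt.Printf("%q\n", replaceNUL([]byte("hello\x00world\n")))
	for _, in := range []string{"\n", "x\n", "\nx"} {
		fmt.Printf("%q -> %q\n", in, normalizeCodeSpan(in))
	}
	fmt.Println(isHexCharRef("&#x41;"), isHexCharRef("&#x0000041;"))
}
```

With these helpers, the three code span inputs from item 5 keep the space that the golden output expects, and the seven-digit reference from item 9 is rejected.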

Full self-contained test at https://play.golang.org/p/wHdu67atFXW.

yuin closed this as completed in 8174177 Sep 11, 2021
yuin added a commit that referenced this issue Sep 11, 2021

yuin commented Sep 12, 2021

@rsc Thanks for your detailed report. I've fixed the issues and released a new version.
