Incorrect parsing of fenced code blocks inside quoted blocks when followed by another fenced code outside quoted block #126

Open
artyom opened this issue Aug 6, 2019 · 7 comments

@artyom

artyom commented Aug 6, 2019

When a fenced code block inside a blockquote is followed later by a non-quoted fenced code block, it is parsed incorrectly: the end of the fenced code block is matched greedily, so the block runs past the end of the blockquote.

Try parsing and then rendering this text:

Bug demo:

> quoted block 1 start
>
> ```
> fenced pre block 1
> ```
>
> quoted block 2 end

Paragraph

```
fenced pre block 2
```

> quoted block 2 start
>
> ```
> fenced pre block 3
> ```
>
> quoted block 2 end

It's incorrectly parsed/rendered to this HTML:

<p>Bug demo:</p>

<blockquote>
<p>quoted block 1 start</p>

<pre><code>&gt; fenced pre block 1
&gt; ```
&gt;
&gt; quoted block 2 end

Paragraph

</code></pre>

<p>fenced pre block 2</p>

<pre><code>
quoted block 2 start

</code></pre>

<p>fenced pre block 3
&ldquo;`</p>

<p>quoted block 2 end</p>
</blockquote>

Please see this gist with complete code reproducing this output (also notice how GitHub's parser/renderer handles this case as expected).

@artyom
Author

artyom commented Aug 7, 2019

This patch to testdata/FencedCodeInsideBlockquotes.tests illustrates the issue:

bug126-test.zip

@miekg
Collaborator

miekg commented Aug 7, 2019 via email

@miekg
Collaborator

miekg commented Aug 17, 2019

A shorter example:

Bug demo:

> quoted block 1 start
>
> ```
> fenced pre block 1
> ```
> quoted block 2 end

Paragraph

```
fenced pre block 2
```

@chmike

chmike commented May 2, 2021

The bug is still not fixed. Is this gomarkdown module still supported?

@kjk
Contributor

kjk commented Sep 18, 2021

A minimized test case:

> ```
> fenced 1
> ```


```
fenced 2
```

We generate:

<blockquote>
<pre><code>&gt; fenced 1
&gt; ```

</code></pre>

<p>fenced 2
&ldquo;`</p>
</blockquote>

It should be more like:

<blockquote>
  <p>
    <code>
      fenced 1
    </code>
  </p>
</blockquote>
<p>
  <code>
    fenced 2
  </code>
</p>

Babelmark: https://babelmark.github.io/?text=%3E+%60%60%60%0A%3E+fenced+1%0A%3E+%60%60%60%0A%0A%60%60%60%0Afenced+2%0A%60%60%60%0A

The issue seems to be that we don't recognize the closing fence of a code block when it sits behind a > prefix, which messes up the rest of the parsing.

kjk added a commit that referenced this issue Sep 18, 2021
@kjk
Contributor

kjk commented Sep 18, 2021

Seems like Parser.fencedCodeBlock would need to recognize the end-of-fence line inside a blockquote.

@kjk
Contributor

kjk commented Sep 18, 2021

I tried modifying fencedCodeBlock to take an insideBlockquote flag and added:

if insideBlockquote {
	// skip '>' prefix
	pre := p.quotePrefix(data[beg:])
	if pre > 0 {
		beg += pre
	}
}

but it's more complicated than that. We would have to strip the > prefix from the code lines inside the blockquote and properly detect where the blockquote terminates (p.terminateBlockquote() didn't hit).

More complicated than I thought.
