Bug in lazy blockquote continuation (or corresponding spec is wrong, whichever) #204

Open
shyouhei opened this issue Jun 15, 2017 · 7 comments

@shyouhei

commented Jun 15, 2017

Current spec says:

If a string of lines Ls constitute a block quote with contents Bs, then the result of deleting the initial [block quote marker] from one or more lines in which the next [non-whitespace character] after the [block quote marker] is [paragraph continuation text] is a block quote with Bs as its content.

http://spec.commonmark.org/0.27/#block-quotes

This spec is clear and explicit; I see no ambiguity. Now consider this blockquote:

> This is _a_ paragraph continuation text
> 2. because the line starts with `2`, not `1`.

This blockquote is valid, and satisfies the "If a string of lines Ls constitute a block quote with contents Bs" condition. So according to the spec, we can delete the leading > from the second line. Doing so gives this:

> This is _a_ paragraph continuation text
2. because the line starts with `2`, not `1`.

This blockquote, according to the spec, must be identical to the former one. However, cmark does not agree.

% cmark <<'EOS'
> This is _a_ paragraph continuation text
> 2. because the line starts with `2`, not `1`.
EOS
<blockquote>
<p>This is <em>a</em> paragraph continuation text
2. because the line starts with <code>2</code>, not <code>1</code>.</p>
</blockquote>
% cmark <<'EOS'
> This is _a_ paragraph continuation text
2. because the line starts with `2`, not `1`.
EOS
<blockquote>
<p>This is <em>a</em> paragraph continuation text</p>
</blockquote>
<ol start="2">
<li>because the line starts with <code>2</code>, not <code>1</code>.</li>
</ol>
%

Is it intentional?

@jgm

Member

commented Jun 18, 2017

This is not so easy to fix. For parse_list_marker, if we define interrupts_paragraph as "last matched block is a PARAGRAPH block", then we fail on this case. If we define it as "current block is a PARAGRAPH block", then we fail on cases like

1. foo
2. bar

since the line beginning 2. can be interpreted as a lazy continuation of the paragraph in item 1..

This may be a deep problem that needs to be fixed by rethinking the spec, at least the decision in jgm/CommonMark@0ff8022.
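To make the dilemma above concrete, here is a toy model of it. It is only a sketch and not cmark's actual code: `LineState`, `marker_starts_list`, `by_last_matched`, `by_tip` and `failing_cases` are hypothetical names, and each state is filled in by hand rather than computed by a parser. The "expected" result for the block quote case follows the reading of the laziness rule given in the original report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineState:
    """What a toy parser knows when it sees a line starting with `2.`."""
    last_matched: str  # deepest open block whose prefix this line still matched
    tip: str           # deepest open block overall (the "current" block)


def marker_starts_list(start: int, interrupts_paragraph: bool) -> bool:
    # Spec rule: a list that interrupts a paragraph must start with 1.
    return start == 1 or not interrupts_paragraph


def by_last_matched(state: LineState) -> bool:
    # Candidate definition 1: "last matched block is a PARAGRAPH block".
    return state.last_matched == "paragraph"


def by_tip(state: LineState) -> bool:
    # Candidate definition 2: "current block is a PARAGRAPH block".
    return state.tip == "paragraph"


# name -> (assumed state for the `2. ...` line, should it start a new list?)
CASES = {
    # "> foo" then "2. bar": the `>` is missing, so only the document matched.
    "quote": (LineState(last_matched="document", tip="paragraph"), False),
    # "1. foo" then "2. bar": the list matched, the item's indentation did not.
    "list": (LineState(last_matched="list", tip="paragraph"), True),
}


def failing_cases(definition) -> list[str]:
    """Names of the cases where `definition` gives the unexpected result."""
    return [
        name
        for name, (state, expected) in CASES.items()
        if marker_starts_list(2, definition(state)) != expected
    ]


print(failing_cases(by_last_matched))  # ['quote']
print(failing_cases(by_tip))           # ['list']
```

In this model both lines see a paragraph as the current block, so neither definition can tell the two cases apart in the way the spec would need.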

@shyouhei

Author

commented Jun 18, 2017

OK, it seems I happened to poke something overlooked. It's OK for me to wait for a better spec. Thank you for the quick reply.

@jgm

Member

commented Mar 26, 2018

@jgm

Member

commented Mar 20, 2019

I'm wondering whether this could be handled by modifying the spec as follows.

    [Paragraph continuation text](@) is text that
+   is
-   will be
    parsed as part of the content of a paragraph,
+   and would be parsed as part of the content of a paragraph
+   if the leading `>` were removed,
    but does not occur at the beginning of the paragraph.
@mity

Contributor

commented Mar 26, 2019

I'm wondering whether this could be handled by modifying the spec as follows.

    [Paragraph continuation text](@) is text that
+   is
-   will be
    parsed as part of the content of a paragraph,
+   and would be parsed as part of the content of a paragraph
+   if the leading `>` were removed,
    but does not occur at the beginning of the paragraph.

The term Paragraph continuation text is also referenced in the section about list items. So the wording should cover any container block marker, not just >.

But that would open up another problem: for lists, we naturally need the second list item to be able to interrupt the first:

1. first item
2. second item

So, we would probably have to redefine paragraph continuation lines differently for lists and for block quotes.

Do we want that?

@mity

Contributor

commented Mar 26, 2019

Maybe that is also the key to (re)defining the continuation line if we decide to keep the current behavior: it is more or less a merging of two paragraphs, where the second one (the continuation) is higher in the current block nesting hierarchy.

More formally, perhaps something like this:

Paragraph continuation text is a line of text that fulfills all of these conditions:

  1. it would otherwise end the current container block (block quote or list item) because it lacks the proper prefix (> marker or list item indentation), possibly even several container blocks if they are nested in each other;
  2. it is a line which would not start a new container block; and
  3. it is a line which, if it followed a blank line, would start a new paragraph block.

(Yeah, someone with better English could phrase it better. But I hope you get the idea.)
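As a rough illustration, the three conditions could be sketched as a predicate like the one below. This is only a sketch under stated assumptions: the regular expressions are simplified stand-ins for the real CommonMark rules (no tabs, HTML blocks or link reference definitions), the function names are hypothetical, and the open containers are passed in by hand as `">"` for a block quote or an integer for the indentation a list item requires. Condition 3 is implemented literally as worded.

```python
import re

# Simplified, assumed patterns; the real CommonMark rules cover more cases.
BLOCK_QUOTE = re.compile(r" {0,3}>")
LIST_MARKER = re.compile(r" {0,3}(?:[-+*]|\d{1,9}[.)])(?: |$)")
NOT_A_PARAGRAPH = re.compile(
    r" {4,}\S"                    # indented code block
    r"| {0,3}#{1,6}(?: |$)"       # ATX heading
    r"| {0,3}(?:`{3,}|~{3,})"     # fenced code block
    r"| {0,3}(?:(?:\* *){3,}|(?:- *){3,}|(?:_ *){3,})$"  # thematic break
)


def strip_prefixes(line, open_containers):
    """Strip the container prefixes that match; report whether all did."""
    rest = line
    for container in open_containers:
        if container == ">":
            m = BLOCK_QUOTE.match(rest)
            if m is None:
                return rest, False
            rest = rest[m.end():]
            rest = rest[1:] if rest.startswith(" ") else rest
        else:  # an int: columns of indentation required by a list item
            if not rest.startswith(" " * container):
                return rest, False
            rest = rest[container:]
    return rest, True


def is_continuation_text(line, open_containers):
    rest, all_matched = strip_prefixes(line, open_containers)
    if all_matched or not rest.strip():
        return False  # condition 1 fails, or the line is blank
    if BLOCK_QUOTE.match(rest) or LIST_MARKER.match(rest):
        return False  # condition 2 fails: a new container would start
    return NOT_A_PARAGRAPH.match(rest) is None  # condition 3


print(is_continuation_text("still the same paragraph", [">"]))  # True
print(is_continuation_text("2. because the line starts with `2`", [">"]))  # False
print(is_continuation_text("2. bar", [3]))  # False
```

Under this wording the `2. ...` line from the original report is not continuation text, which matches what cmark currently outputs, and the second item of a list still interrupts the first.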
