Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indented footnote definitions confuse the parser #623

Closed
allenap opened this issue Nov 26, 2022 · 5 comments · Fixed by #654
Closed

Indented footnote definitions confuse the parser #623

allenap opened this issue Nov 26, 2022 · 5 comments · Fixed by #654
Labels

Comments

@allenap
Copy link

allenap commented Nov 26, 2022

The following:

Foo[^foo]
Bar[^bar]

[^foo]:
    FooDef1
    FooDef2

[^bar]:
    BarDef

renders like so on GitHub:

image

but pulldown_cmark::Parser gets confused:

  1. It sees the definition text as indented code blocks.
  2. It does not see the end of the first definition before the start of the second.

Running the following code:

static MARKDOWN: &str = r#"\
Foo[^foo]
Bar[^bar]

[^foo]:
    FooDef1
    FooDef2

[^bar]:
    BarDef
"#;

fn main() {
    use pulldown_cmark::{Options, Parser};
    let options = Options::empty().union(Options::ENABLE_FOOTNOTES);
    let parser = Parser::new_ext(MARKDOWN, options);
    for event in parser {
        eprintln!("{event:?}");
    }
}

yields the following events:

  • Start(Paragraph)
  • HardBreak
  • Text(Borrowed("Foo"))
  • FootnoteReference(Borrowed("foo"))
  • SoftBreak
  • Text(Borrowed("Bar"))
  • FootnoteReference(Borrowed("bar"))
  • End(Paragraph)
  • Start(FootnoteDefinition(Borrowed("foo")))
  • Start(CodeBlock(Indented))
  • Text(Borrowed("FooDef1\n"))
  • Text(Borrowed("FooDef2\n"))
  • End(CodeBlock(Indented))
  • Start(FootnoteDefinition(Borrowed("bar")))
  • Start(CodeBlock(Indented))
  • Text(Borrowed("BarDef\n"))
  • End(CodeBlock(Indented))
  • End(FootnoteDefinition(Borrowed("bar")))
  • End(FootnoteDefinition(Borrowed("foo")))

I would expect it to:

  • treat the indented definitions as part of the definition, and
  • end the first definition before starting the second.

There is a workaround:

Foo[^foo]
Bar[^bar]

[^foo]: FooDef1
    FooDef2

[^bar]: BarDef

This yields:

  • ...
  • Start(FootnoteDefinition(Borrowed("foo")))
  • Start(Paragraph)
  • Text(Borrowed("FooDef1"))
  • SoftBreak
  • Text(Borrowed("FooDef2"))
  • End(Paragraph)
  • End(FootnoteDefinition(Borrowed("foo")))
  • Start(FootnoteDefinition(Borrowed("bar")))
  • Start(Paragraph)
  • Text(Borrowed("BarDef"))
  • End(Paragraph)
  • End(FootnoteDefinition(Borrowed("bar")))

i.e. by putting content on the same line as the definition begins, the parser is happy. Unfortunately, at least one code formatter in VSCode for Markdown (maybe the built-in one?) likes to format long definitions indented below the [^foo]: opener.

@allenap
Copy link
Author

allenap commented Nov 27, 2022

Fwiw, I think this is the de facto documentation on GitHub footnotes, and it specifically mentions how indented blocks should be interpreted.

@Martin1887 Martin1887 added the bug label May 7, 2023
notriddle added a commit to notriddle/pulldown-cmark that referenced this issue Jun 2, 2023
Resolves pulldown-cmark#20

Resolves pulldown-cmark#530

Resolves pulldown-cmark#623

This change is similar to, but a more limited change than,
 <pulldown-cmark#544>. It changes
the syntax, but does not touch the generated HTML or event API.

Motivation
----------

This commit is written with usage in mdBook, rustdoc, and docs.rs
in mind.

* Having a standard to follow, or at least a public test suite in
  [cmark-gfm] [^c], makes it easier to distinguish bugs from features.
* It makes sense to commit to following GitHub's behavior specifically,
  because mdBook chapters and docs.rs README files are often viewed in
  GitHub preview windows, so any divergence will be very annoying.
* If mdBook and docs.rs are going to use this syntax, then rustdoc
  should, too.
* Having both footnote syntaxes use the same API and rendering makes it
  more feasible for rustdoc to change the syntax over an [edition].
  To introduce a syntax change in a new edition of Rust, we must make
  rustdoc warn anyone who writes code that will have its meaning change.
  To do it, run the parser twice in lockstep (with `ENABLE_FOOTNOTES`
  on one parser, and `ENABLE_GFM_FOOTNOTES` on the other), and warn if
  they diverge.
  * Alternatively, run a Crater build with this same code to check if
    this actually causes widespread breakage.
* In particular, using tree rewriting to push the footnotes to the end
  is not as useful as it sounds, since that's not enough to exactly
  copy the way GitHub renders footnotes. To do that, you also need to
  sort the footnotes by the order in which they are *referenced*, not
  the order in which they are defined. This type of tree rewriting is
  also a waste of time if you want "margin note" rendering instead of
  putting them all at the end.

[cmark-gfm]: https://github.com/github/cmark-gfm/blob/1e230827a584ebc9938c3eadc5059c55ef3c9abf/test/extensions.txt#L702
[edition]: https://doc.rust-lang.org/edition-guide/editions/index.html

[^c]: cmark-gfm is under the MIT license, so incorporating parts of its
    test suite into pulldown-cmark should be fine.
notriddle added a commit to notriddle/pulldown-cmark that referenced this issue Jun 2, 2023
Resolves pulldown-cmark#20

Resolves pulldown-cmark#530

Resolves pulldown-cmark#623

This change is similar to, but a more limited change than,
 <pulldown-cmark#544>. It changes
the syntax, but does not touch the generated HTML or event API.

Motivation
----------

This commit is written with usage in mdBook, rustdoc, and docs.rs
in mind.

* Having a standard to follow, or at least a public test suite in
  [cmark-gfm] [^c], makes it easier to distinguish bugs from features.
* It makes sense to commit to following GitHub's behavior specifically,
  because mdBook chapters and docs.rs README files are often viewed in
  GitHub preview windows, so any divergence will be very annoying.
* If mdBook and docs.rs are going to use this syntax, then rustdoc
  should, too.
* Having both footnote syntaxes use the same API and rendering makes it
  more feasible for rustdoc to change the syntax over an [edition].
  To introduce a syntax change in a new edition of Rust, we must make
  rustdoc warn anyone who writes code that will have its meaning change.
  To do it, run the parser twice in lockstep (with `ENABLE_FOOTNOTES`
  on one parser, and `ENABLE_GFM_FOOTNOTES` on the other), and warn if
  they diverge.
  * Alternatively, run a Crater build with this same code to check if
    this actually causes widespread breakage.
* In particular, using tree rewriting to push the footnotes to the end
  is not as useful as it sounds, since that's not enough to exactly
  copy the way GitHub renders footnotes. To do that, you also need to
  sort the footnotes by the order in which they are *referenced*, not
  the order in which they are defined. This type of tree rewriting is
  also a waste of time if you want "margin note" rendering instead of
  putting them all at the end.

[cmark-gfm]: https://github.com/github/cmark-gfm/blob/1e230827a584ebc9938c3eadc5059c55ef3c9abf/test/extensions.txt#L702
[edition]: https://doc.rust-lang.org/edition-guide/editions/index.html

[^c]: cmark-gfm is under the MIT license, so incorporating parts of its
    test suite into pulldown-cmark should be fine.
@Keats
Copy link
Contributor

Keats commented Jun 3, 2023

Indented code blocks break lots of things, could there be an option to disable them?

@Martin1887
Copy link
Collaborator

Indented code blocks break lots of things, could there be an option to disable them?

What do you mean? Indented code blocks are in the CommonMark spec, so an option to disable them will not exist. This issue with indented footnote definitions will be fixed in the next release in the coming weeks.

@Keats
Copy link
Contributor

Keats commented Jun 4, 2023

I've just been running into this issue with indented code messing up things (things built on top of pulldown-cmark, no issues with this crate) that I wish I could disable them. I might just eventually end up forking this crate just to do so.

@Martin1887
Copy link
Collaborator

#654 fixes this issue and it is already merged in the master branch, so you can use the master branch until the 0.10 version is published instead of creating a fork.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants