Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support math extension #622

Closed
wants to merge 24 commits into from
Closed

Support math extension #622

wants to merge 24 commits into from

Conversation

rhysd
Copy link
Contributor

@rhysd rhysd commented Nov 19, 2022

This PR adds support for mathematical expressions behind Options::ENABLE_MATH flag, which was introduced to GitHub recently.

This extension supports both inline-level and block-level expressions:

Inline-level:  $\sqrt{3x-1}+(1+x)^2$

Block-level:

$$\left( \sum_{k=1}^n a_k b_k \right)^2 \leq \left( \sum_{k=1}^n a_k^2 \right) \left( \sum_{k=1}^n b_k^2 \right)$$

They are rendered as follows on GitHub:

Inline-level: $\sqrt{3x-1}+(1+x)^2$

Block-level:

$$\left( \sum_{k=1}^n a_k b_k \right)^2 \leq \left( \sum_{k=1}^n a_k^2 \right) \left( \sum_{k=1}^n b_k^2 \right)$$

src/parse.rs Outdated Show resolved Hide resolved
src/parse.rs Outdated Show resolved Hide resolved
specs/math.txt Outdated Show resolved Hide resolved
@rhysd
Copy link
Contributor Author

rhysd commented Nov 30, 2022

Thanks! I'll take a look at review comments tomorrow.

@rhysd
Copy link
Contributor Author

rhysd commented Dec 1, 2022

@mattwidmann Yes, you're totally correct. I fixed the off-by-one error and test cases at 7123b76.

@Schabolon
Copy link

Schabolon commented Jan 15, 2023

Hi,
I hope this is the right place to add this information.
The math-option is very useful to me.
After using it for some time, I think I have discovered a bug.
The following Markdown: $\{1, 2, 3\}$
Gets converted to this HTML: <div class="math inline">{1, 2, 3\}</div>
The backslash before the first curly-bracket went missing.
This problem doesn't seem to occur on math blocks (they work fine).
I tried to find the problem, but couldn't figure it out (my only guess is, that it gets removed because all \{ gets converted to { in HTML and other items starting with \ like \sqrt{1} work fine)
Would be amazing if this bug could be fixed.

@rhysd
Copy link
Contributor Author

rhysd commented Jan 17, 2023

@Schabolon Thanks. I fixed it at cbcdf26.

$ echo '$\{1, \{2, \{3\}\}\}$' | cargo run -- -M -e
    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/pulldown-cmark -M -e`
0..22: Start(Paragraph)
1..20: Math(Inline, Borrowed("\\{1, \\{2, \\{3\\}\\}\\}"))
0..22: End(Paragraph)
EOF

$ echo '$\{1, \{2, \{3\}\}\}$' | cargo run -- -M
    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/pulldown-cmark -M`
<p><span class="math inline">\{1, \{2, \{3\}\}\}</span></p>

$ echo '$$\{1, \{2, \{3\}\}\}$$' | cargo run -- -M -e
    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/pulldown-cmark -M -e`
0..23: Math(Block, Borrowed("\\{1, \\{2, \\{3\\}\\}\\}"))
EOF

@Schabolon
Copy link

Thanks a lot for the quick fix 👍

@xelatihy
Copy link

xelatihy commented Feb 5, 2023

Any chance this will be accepted soon? Math support is a needed feature for a project I am working on and I will soon have to choose a parser. Thanks.

@rhysd
Copy link
Contributor Author

rhysd commented Feb 6, 2023

I don't know why this PR cannot get a review from the maintainers. I guess they're super busy. If it is because no issue is filed for this PR, I'll make a new issue fixed by this PR soon. Let me know. @marcusklaas @raphlinus

Meanwhile, I'm using git dependency to use my forked branch. Note that this is not allowed when you upload your crate to crates.io.

pulldown-cmark = { git = "https://github.com/rhysd/pulldown-cmark.git", branch="math", default-features = false, features = [] }

@carloskiki
Copy link

I would love for this PR to go through, as my project needs math expressions, Thanks!

@ehuss
Copy link
Contributor

ehuss commented Feb 24, 2023

Can you say more about how the rendering is expected to work? Does this require MathJax? Is there some guidance on how to integrate it properly? Does this work with katex?

@rhysd
Copy link
Contributor Author

rhysd commented Feb 24, 2023

@ehuss This PR does not render the math expressions. It just parses the mathematical expressions notation at Markdown level. So it parses $...$ and $$...$$ math blocks, but does not touch to their contents (... parts).

https://github.com/rhysd/pulldown-cmark/blob/61e4922f6bc9023e49fa220ff2a673e969ff3f9e/src/html.rs#L135-L144

The text arguments in the match arms are raw math block contents. How they are rendered is left to the user of this library.

Does this require MathJax?

No. But some math expression renderer is at least necessary (MathJax, Katex, MathML, ...). MathML is a standard way to render math expressions, but they are implemented very recently on Chromium and not mature enough yet.

Does this work with katex?

Yes. However I recommend to use MathJax because GitHub uses MathJax for rendering math expressions in Markdown.

Is there some guidance on how to integrate it properly?

No. But I can explain it for you. It is almost the same as raw HTML in Markdown document.
Regarding to the Markdown to HTML conversion provided by pulldown-cmark, currently this PR converts

$\sqrt{3x-1}+(1+x)^2$

to

<span class="math inline">\sqrt{3x-1}+(1+x)^2</span>

To render this HTML element, you would load the element and MathJax library in the same page on your browser and call some MathJax's API to render the element's content as math expression. For the API, please see the official document.

I think trying it on your side would be a good way to understand how it works.

git clone https://github.com/rhysd/pulldown-cmark.git -b math
cd pulldown-cmark
echo '$\sqrt{3x-1}+(1+x)^2$' > test.md
cargo run -- -M < test.md # See the converted HTML from the Markdown file
cargo run -- -M -e < test.md # See the parsed events from the Markdown file

I don't know your use case. If you explain it, I may explain the usage better.

@ehuss
Copy link
Contributor

ehuss commented Feb 24, 2023

I don't know your use case. If you explain it, I may explain the usage better.

I'm the maintainer of mdBook. It currently has MathJax support, which if I understand correctly, scans the entire page for delimiters and renders them.

mdBook also supports plugins, such as mdbook-katex.

I assume the largest benefit here is to avoid problems with escaping delimiters?

What would the mathjax init code look like for basic usage?

specs/math.txt Outdated
Comment on lines 45 to 54
Sole `$` character is handled as normal text. And Inline-level mathematical expressions cannot continue across newlines:

```````````````````````````````` example
Hello $ world.

Dollar at end of line: $
.
<p>Hello $ world.</p>
<p>Dollar at end of line: $</p>
````````````````````````````````
Copy link
Collaborator

@ollpu ollpu Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see more activity around this long-standing issue! I'm willing to work on this if any help is needed.

On this part, I'm not sure I agree. In LaTeX and markdown files alike, a common practice is to justify paragraphs to a width of e.g. 80 characters. Single newlines do not affect the output, even inside math expressions. Multiple newlines of course break paragraphs and should also break inline math.

There seems to be a divergence of behavior here between markdown files and issues/comments on GitHub. Comments preserve newlines from the raw text, but in markdown files this is not the case. This markup:

text and $math broken
into multiple lines$

renders in this comment as

text and $math broken
into multiple lines$

but is shown as math in a Gist: https://gist.github.com/ollpu/2b0e18b8fe20d76b97d9e89912f4df3e

It also works with Pandoc.

Since with pulldown-cmark single newlines are generally ignored, it would make sense to ignore them inside inline math as well.

There is subtlety with how newlines can start/end a math expression though. GitHub and Pandoc have different behavior: On GitHub, an inline math expression cannot start with a newline right after the $, but it can end with one before the closing $. (See the same Gist for examples of this) Pandoc is more consistent, and inline math can neither start or end with a newline – they are essentially treated the same as spaces for this purpose.

Escaped newlines (newline preceded by \) are another can of worms.

For what it's worth, I prefer following Pandoc over GitHub in this and other similar cases since the implementation is more mature, consistent and doesn't suffer from the robustness issues that GitHub did at least originally.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, I support following $\LaTeX$'s standards. They are around for a good while and solid.

Copy link
Collaborator

@ollpu ollpu Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another common thing supported by both GitHub and Pandoc are display-mode math blocks inside a paragraph.

display $$\LaTeX$$ math

renders as

display $$\LaTeX$$ math

Likewise, both agree on not allowing empty lines inside display math blocks.

$$
a

b
$$

becomes

$$
a

b
$$

Copy link
Collaborator

@ollpu ollpu Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm eyeing a refactor to this PR branch to make math parsing kind of combine the logic of ItemBody::MaybeEmphasis and ItemBody::MaybeCode. #477 does something similar for inline math, but not display blocks. It seems like display math should behave a little closer to inline math and inline code, but just not do the whitespace checks.

What do you think @rhysd?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and implemented that. More details in the PR: rhysd#1

I'd rather not have a third competing PR to upstream, so hopefully we can converge on a good design here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found another bug: When you want to parse this markdown:

math
$$x$$

This is not interpreted as math. I tested it with the latest version:

$ echo -e 'math \n$$x$$' | cargo run -- -m -e
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/pulldown-cmark -m -e`
0..13: Start(Paragraph)
0..5: Text(Borrowed("math "))
5..6: SoftBreak
7..9: Text(Borrowed("$$"))
9..10: Text(Borrowed("x"))
10..12: Text(Borrowed("$$"))
0..13: End(Paragraph)
EOF

I don't know if anybody saw this, but it still doesn't work.
I'm pretty sure it is not the default behaviour for most math-in-markdown implementations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rambip

Yeah, to be clear, I made a separate PR onto @rhysd's branch, but that is not incorporated here yet.

This case works with my version there though:

$ echo -e 'math \n$$x$$' | cargo run -- -m -e
   Compiling pulldown-cmark v0.9.2 (/home/ollpu/git/pulldown-cmark)
    Finished dev [unoptimized + debuginfo] target(s) in 3.77s
     Running `target/debug/pulldown-cmark -m -e`
0..12: Start(Paragraph)
0..5: Text(Borrowed("math "))
5..6: SoftBreak
6..11: Math(Display, Borrowed("x"))
0..12: End(Paragraph)
EOF

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok nice !

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry for the delay. I didn't have enough time to see each PRs I've created this week.

There seems to be a divergence of behavior here between markdown files and issues/comments on GitHub.

I didn't know this. Thank you for your PR. I'll take a look.

Syntax is based on pandoc's commonmark+math mode, i.e.
https://github.com/jgm/commonmark-hs/blob/master/commonmark-extensions/test/math.md
excluding the handling for balanced curly braces and how inline math can
end in a newline (may be a bug? issue filed:
jgm/commonmark-hs#110).

Removes support for $`..`$ and simplifies whitespace/punctuation rules
for inline math.
@rambip
Copy link

rambip commented Oct 8, 2023

Is there additional work that need to be done before merging ?
I tested the @ollpu branch in some of my projects and didn't notice any bug; the only undocumented situation may be multi-line math with empty breaks, like

$$
x=

1
$$

I would be really happy to help, 0.10 needs math IMO

@ollpu
Copy link
Collaborator

ollpu commented Oct 8, 2023

@rambip That's good to note. I added such a case to the math spec in my branch.

@Martin1887
Copy link
Collaborator

Is there additional work that need to be done before merging ? I tested the @ollpu branch in some of my projects and didn't notice any bug; the only undocumented situation may be multi-line math with empty breaks, like

$$
x=

1
$$

I would be really happy to help, 0.10 needs math IMO

Math is an important feature but including it requires a big analysis of the options in syntax, features and implementations. Coordination with other projects using pulldown-cmark and supporting math like mdBook would also be desirable.

Therefore, math support will not be included in the 0.10 version but it is a priority for the 0.11 version.

@rambip
Copy link

rambip commented Oct 9, 2023

Is there additional work that need to be done before merging ? I tested the @ollpu branch in some of my projects and didn't notice any bug; the only undocumented situation may be multi-line math with empty breaks, like

$$
x=

1
$$

I would be really happy to help, 0.10 needs math IMO

Math is an important feature but including it requires a big analysis of the options in syntax, features and implementations. Coordination with other projects using pulldown-cmark and supporting math like mdBook would also be desirable.

Therefore, math support will not be included in the 0.10 version but it is a priority for the 0.11 version.

I can understand that but "a big analysis" seems pretty vague to me.
If there is any spec that have to be written or even a ROADMAP, I can try.

@Martin1887
Copy link
Collaborator

There is no spec or roadmap yet, they are the first things to do and a survey of the possible options would be the first step for the spec.

The survey is a 'help-wanted' thing and a different pull request or discussion could be opened to review the possible options and discuss them.

Thanks!

@rambip
Copy link

rambip commented Oct 16, 2023

I was starting to write a very early spec to have a starting point in a github pull request, but I just realized @ollpu had a decent one in his own pull request.
Where should the conversation continue then ?

I have listed some ambiguous cases where a survey could be usefull, but I don't know what to do with it.
Maybe @ollpu could other another pull request called "math spec" ?

ollpu added a commit to ollpu/pulldown-cmark that referenced this pull request Oct 16, 2023
Based on pulldown-cmark#622 and
copied from https://github.com/ollpu/pulldown-cmark/tree/alt-math.

Co-authored-by: rhysd <lin90162@yahoo.co.jp>
@ollpu
Copy link
Collaborator

ollpu commented Oct 16, 2023

Yeah, I've gotten myself familiar with the various approaches as well and the spec on my branch might be a good starting point. I've been busy the last week so I didn't have time to respond yet.

I opened a PR to discuss the spec: #733

notriddle pushed a commit to notriddle/pulldown-cmark that referenced this pull request Oct 19, 2023
Based on pulldown-cmark#622 and
copied from https://github.com/ollpu/pulldown-cmark/tree/alt-math.

Co-authored-by: rhysd <lin90162@yahoo.co.jp>
@rhysd
Copy link
Contributor Author

rhysd commented Oct 23, 2023

Should we continue the discussion at #733? Or should I merge @ollpu's PR in this branch and continue the discussion here? Either is fine for me.

Since maintainers of this crate don't seem to get interested in this feature, I honestly don't have so much motivation to keep/improve this branch.

rhysd and others added 2 commits October 24, 2023 00:11
@ollpu ollpu mentioned this pull request Oct 23, 2023
2 tasks
notriddle pushed a commit to notriddle/pulldown-cmark that referenced this pull request Oct 27, 2023
Based on pulldown-cmark#622 and
copied from https://github.com/ollpu/pulldown-cmark/tree/alt-math.

Co-authored-by: rhysd <lin90162@yahoo.co.jp>
notriddle pushed a commit to notriddle/pulldown-cmark that referenced this pull request Mar 4, 2024
Based on pulldown-cmark#622 and
copied from https://github.com/ollpu/pulldown-cmark/tree/alt-math.

Co-authored-by: rhysd <lin90162@yahoo.co.jp>
notriddle pushed a commit to notriddle/pulldown-cmark that referenced this pull request Mar 7, 2024
Based on pulldown-cmark#622 and
copied from https://github.com/ollpu/pulldown-cmark/tree/alt-math.

Co-authored-by: rhysd <lin90162@yahoo.co.jp>
notriddle pushed a commit to notriddle/pulldown-cmark that referenced this pull request Mar 11, 2024
Based on pulldown-cmark#622 and
copied from https://github.com/ollpu/pulldown-cmark/tree/alt-math.

Co-authored-by: rhysd <lin90162@yahoo.co.jp>
@rhysd
Copy link
Contributor Author

rhysd commented Mar 12, 2024

Closing this in favor of #734. Thank you @notriddle for pushing forward the feature.

@rhysd rhysd closed this Mar 12, 2024
notriddle pushed a commit to notriddle/pulldown-cmark that referenced this pull request Apr 18, 2024
Based on pulldown-cmark#622 and
copied from https://github.com/ollpu/pulldown-cmark/tree/alt-math.

Co-authored-by: rhysd <lin90162@yahoo.co.jp>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet