Skip to content

👌 IMPROVE: Math Parsing#217

Merged
chrisjsewell merged 6 commits into
masterfrom
dollarmath
Aug 19, 2020
Merged

👌 IMPROVE: Math Parsing#217
chrisjsewell merged 6 commits into
masterfrom
dollarmath

Conversation

@chrisjsewell
Copy link
Copy Markdown
Member

This centralises parsing configuration storage and validation, meaning it only has to be updated in a single place.
Also moved from using texmath_plugin to dollarmath_plugin

This centralises parsing configuration storage and validation, meaning it only has to be updated in a single place.
Also moved from using `texmath_plugin` to `dollarmath_plugin`
@codecov
Copy link
Copy Markdown

codecov Bot commented Aug 18, 2020

Codecov Report

Merging #217 into master will decrease coverage by 0.20%.
The diff coverage is 85.24%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #217      +/-   ##
==========================================
- Coverage   91.17%   90.97%   -0.21%     
==========================================
  Files          12       12              
  Lines        1281     1296      +15     
==========================================
+ Hits         1168     1179      +11     
- Misses        113      117       +4     
Flag Coverage Δ
#pytests 90.97% <85.24%> (-0.21%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
myst_parser/cli/benchmark.py 82.53% <ø> (ø)
myst_parser/docutils_renderer.py 93.16% <0.00%> (-0.86%) ⬇️
myst_parser/sphinx_parser.py 97.05% <80.00%> (-0.38%) ⬇️
myst_parser/__init__.py 90.32% <81.25%> (+5.02%) ⬆️
myst_parser/main.py 91.54% <89.74%> (-0.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9c4ddac...2054b9f. Read the comment docs.

@chrisjsewell chrisjsewell changed the title ♻️ REFACTOR: Add MdParserConfig class 👌 IMPROVE: Math Parsing Aug 19, 2020
@chrisjsewell chrisjsewell marked this pull request as ready for review August 19, 2020 01:21
@chrisjsewell
Copy link
Copy Markdown
Member Author

chrisjsewell commented Aug 19, 2020

Ok @executablebooks/ebpteam and @phaustin I have now hopefully fully solved the Math parsing/rendering.

Basically, I wrote a new markdown plugin in executablebooks/markdown-it-py@91bf225, which gives greater control over the parsing process, and fixes the internal escaping issue (as raised in jupyter-book/jupyter-book#750 (comment)).

Then here, I use that plugin and also re-configure mathjax such that it does not try to replicate this parsing, and so "respects" if the math paring is disabled and makes it consistent with e.g. TeX/PDF outputs.

See https://myst-parser--217.org.readthedocs.build/en/217/using/syntax.html#math-shortcuts, and in particular https://myst-parser--217.org.readthedocs.build/en/217/using/syntax.html#mathjax-and-math-parsing
Also all the option are specified in https://myst-parser--217.org.readthedocs.build/en/217/using/intro.html#myst-configuration-options

NOTE because myst_amsmath_enable=False by default, this will mean that anyone who currently has "bare" LaTeX in their HTML documentation, which is now being rendered by MathJax, will no longer have it rendered, until they set myst_amsmath_enable=True.
We can decide if we want this to be the case, or to set it to True by default in myst-nb and/or jupyter-book.

This will then feed into executablebooks/MyST-NB#226 and finally into jupyter-book.

@chrisjsewell chrisjsewell requested a review from choldgraf August 19, 2020 01:43
Comment thread docs/using/syntax.md Outdated

Since these delimiters are how `sphinx.ext.mathjax` wraps the math content in the built HTML documents.

To inhibit this override, set `override_mathjax=False`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there some way for people to provide their own extra mathjax overrides? several folks have asked about this in JB

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use mathjax_config? Then now, they will also have to ensure myst_override_mathjax=False

@choldgraf
Copy link
Copy Markdown
Member

the docs addition looks good to me, I don't have time right now to take a look at the implementation so perhaps @mmcky can give a review on that. Two quick thoughts:

  • Is there a reason not to make "don't parse math if there are dollars signs just before/after" the default?
  • We mention mathjax overrides and that mathjax is the default, but I'm not sure where there's an option to choose anything other than Mathjax. Am I missing something?

Other than those nits I think that this looks really nice!

@chrisjsewell
Copy link
Copy Markdown
Member Author

Is there a reason not to make "don't parse math if there are dollars signs just before/after" the default?

Not sure what you mean by this? Are you talking about myst_dmath_allow_space or can you give an example?

@chrisjsewell
Copy link
Copy Markdown
Member Author

We mention mathjax overrides and that mathjax is the default, but I'm not sure where there's an option to choose anything other than Mathjax. Am I missing something?

Yes sphinx.ext.imgmath, sphinx.ext.jsmath and https://github.com/hagenw/sphinxcontrib-katex

@chrisjsewell
Copy link
Copy Markdown
Member Author

Right I'm going to merge this, so I can get on to other things, but feel free to have a look through @mmcky and we can always fix anything in post

@chrisjsewell chrisjsewell merged commit eab8d5b into master Aug 19, 2020
@chrisjsewell chrisjsewell deleted the dollarmath branch August 19, 2020 03:04
Comment thread docs/using/syntax.md

There are a few other options available to control dollar math parsing:

`myst_dmath_allow_space=False`, will cause inline math to only be parsed if there are no initial / final spaces, e.g. `$a$` but not `$ a$` or `$a $`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisjsewell this is on by default?

Comment thread docs/using/syntax.md

`myst_dmath_allow_space=False`, will cause inline math to only be parsed if there are no initial / final spaces, e.g. `$a$` but not `$ a$` or `$a $`.

`myst_dmath_allow_digits=False`, will cause inline math to only be parsed if there are no initial / final digits, e.g. `$a$` but not `1$a$` or `$a$2`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisjsewell is this on by default?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the default configuration here: https://myst-parser.readthedocs.io/en/latest/using/intro.html#myst-configuration-options and in the code:

class MdParserConfig:

I'm open to discussion about these though 😄

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as in at present the default is that $ a $ will be parsed as math

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah nice page. I just noticed you added (off by default) in a couple other places in the text and was wondering if you wanted that here as well?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh good point

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If having a default value of myst_dmath_allow_space=True provides a more robust (but less technically correct) parse of the text -- I think it is probably good to have on by default. What do you think?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provides a more robust (but less technically correct)

Well I don't know if there is a "technichally correct" syntax rule for this, since obviously its not covered by the CommonMark spec. Just to clarify, you think that $ a $ should or should not be recognised a math?

Comment thread myst_parser/main.py

md.options.update(
{
"enable_html_img": config.html_img_enable,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisjsewell is there a reason to have different option names here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, I just did want to make a breaking change to the internal code. But yeh can be changed at some point

Copy link
Copy Markdown
Member

@mmcky mmcky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisjsewell love your work on this! Many people will really love the features added here.

In my testing -- the only other edge case I have found is:

Edge Cases

The following myst

What happens if I use $$
L_x = y
$$

produces

Screen Shot 2020-08-19 at 2 57 51 pm

@chrisjsewell
Copy link
Copy Markdown
Member Author

The following rst

Do you mean that this is what is produced using rst (with a sphinx extension), or that this example was taken from an rst document?

Currently the dollarmath plugin does not support inline double-dollar. It is on the TODO list though

@mmcky
Copy link
Copy Markdown
Member

mmcky commented Aug 19, 2020

Do you mean that this is what is produced using rst (with a sphinx extension), or that this example was taken from an rst document?

Sorry old habits -- I mean myst

Currently the dollarmath plugin does not support inline double-dollar

OK Roger that. So the display math needs leading white space

These also produce the same output

What happens if I use $$L_x = y$$

What happens if I use
$$
L_x = y
$$

but this works:

$$
L_x = y
$$
What happens if I use

so leading white space is a hard requirement for displaymath -- makes sense

@mmcky
Copy link
Copy Markdown
Member

mmcky commented Aug 19, 2020

We may want to update this doc page as the syntax used shows $$..$$ on the one line which may confuse people. Want me to add a note?

@chrisjsewell
Copy link
Copy Markdown
Member Author

Yeh pop it in a PR cheers

Basically this is the current specification: https://raw.githubusercontent.com/executablebooks/markdown-it-py/master/tests/test_plugins/fixtures/dollar_math.md

@chrisjsewell
Copy link
Copy Markdown
Member Author

If you (or anyone) feels there are any issues then altered/additional tests should be added to that document (which gets run a parametrized pytests)

@mmcky
Copy link
Copy Markdown
Member

mmcky commented Aug 19, 2020

Basically this is the current specification: https://raw.githubusercontent.com/executablebooks/markdown-it-py/master/tests/test_plugins/fixtures/dollar_math.md

So I notice the one line syntax for display math works when next to a code-block but not text

numbered equation following code block. (valid=True)
.
```
code
```
$$a+b$$(1)
.
<pre><code>code
</code></pre>
<section>
<eqn>a+b</eqn>
<span class="eqno">(1)</span>
</section>

Do you want me to add a test case

numbered equation following text. (valid=False)
.
Immediately after text
$$a+b$$(1)

and document False for now?

@chrisjsewell
Copy link
Copy Markdown
Member Author

yeh go for it

@chrisjsewell
Copy link
Copy Markdown
Member Author

It wouldn't be too hard to add inline displaymath parsing. However, I would also need to test how sphinx would handle this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants