Make #+pandoc-emphasis-pre work as expected (#8059) #8360

adql · 2022-10-08T09:15:39Z

The main issue with the original implementation is that it still assumed emphasis markers cannot occur directly after most non-whitespace characters, e.g. in the middle of a word (in Pandoc terms: directly after a native Str). Now that users can configure which characters are allowed around emphasized text, Pandoc should no longer enforce that assumption.

These commits allow emphasis in that position and adjust some tests accordingly. This is both in accordance with Org-mode's behavior when org-emphasis-regexp-components is customized, and reasonable in many languages¹ (though rare in English).

Resolves #8059.

¹ This is the case in Hebrew and German, both of which I speak.

Since #+pandoc-emphasis-pre and #+pandoc-emphasis-post have somewhat different implementations, it is wise to test them separately (e.g. issue jgm#8059 affects only the pre char setting). The addition of tests for an alphanumeric pre/post char demonstrates the aforementioned issue.

Until now, orgStateLastPreCharPos was not updated properly each time the parser str produced a native Str. In addition to fixing this, the guard notAfterString in emphasisStart is removed, so that emphasis is allowed directly after a Str in the first place.
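To illustrate the rule this change aims for, here is a toy model in Haskell. It is not Pandoc's implementation: the function name mayOpenAt and the idea of passing the allowed pre chars as a plain list are assumptions made for this sketch only. It decides whether an emphasis marker may open at a given index by looking only at the preceding character, with no special treatment of "inside a word".

```haskell
import qualified Data.Text as T

-- | Hypothetical check (not from the PR): an emphasis marker at index i
-- may open emphasis when it sits at the start of the input or when the
-- character just before it is one of the allowed pre chars.
mayOpenAt :: [Char] -> T.Text -> Int -> Bool
mayOpenAt preChars input i =
  case T.unsnoc (T.take i input) of
    Nothing     -> True                -- nothing precedes the marker
    Just (_, c) -> c `elem` preChars   -- only the configured set matters
```

Under this model, adding an alphanumeric character to the pre chars is enough to permit emphasis in the middle of a word, which is the behavior the new tests exercise.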

With the introduction of the settings for allowed pre/post chars around emphasized text (and the resolution of jgm#8059), such markup characters may appear in the middle of a word, according to the user's choice.

adql · 2022-10-08T09:41:38Z

By the way, I originally used *> in updatePositions', but hlint suggested replacing it with $>. This is very strange, because that is wrong. Replacing it with >> made the suggestion go away. Is there any explanation for that?

tarleb · 2022-10-09T07:21:45Z

Thank you, this is excellent.

hlint sometimes suggests functions that'd need a separate import; this would work though: updatePositions str' = str' <$ update....
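A small sketch of how the three operators relate, in case it helps. The actual body of updatePositions' is not shown in this thread, so this assumes it has the shape "run an update, then return the value"; bump, demo, and the via... names are hypothetical stand-ins, and an IORef replaces the parser state.

```haskell
import Data.Functor (($>))
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for the state update; not Pandoc's code.
bump :: IORef Int -> IO ()
bump ref = modifyIORef' ref (+ 1)

-- Three equivalent spellings of "run the effect, then return x".
viaApplicative, viaFunctorRight, viaFunctorLeft :: IORef Int -> String -> IO String
viaApplicative  ref x = bump ref *> pure x  -- (*>) needs an action on the right
viaFunctorRight ref x = bump ref $> x       -- ($>) takes a plain value; import from Data.Functor
viaFunctorLeft  ref x = x <$ bump ref       -- (<$) is exported by the Prelude

-- Runs all three and reports the results plus how often the effect fired.
demo :: IO ([String], Int)
demo = do
  ref <- newIORef 0
  rs  <- sequence [viaApplicative ref "s", viaFunctorRight ref "s", viaFunctorLeft ref "s"]
  n   <- readIORef ref
  pure (rs, n)
```

Note that ($>) is not a drop-in replacement for (*>): its right operand is a value rather than an action, and it is not in scope without the Data.Functor import.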

The function T.last is partial and throws an error when passed "". While we know that this can never happen, we have the general policy of avoiding partial functions if possible: could you change the code to use T.unsnoc to replace T.last?
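For reference, a minimal sketch of the total variant. T.unsnoc returns Nothing on empty input instead of throwing; the helper name lastChar is hypothetical and not taken from the PR.

```haskell
import qualified Data.Text as T

-- | Total replacement for 'T.last' (hypothetical helper, not from the PR):
-- yields Nothing for empty input rather than raising an error.
lastChar :: T.Text -> Maybe Char
lastChar t = snd <$> T.unsnoc t
```

The caller then has to handle the Nothing case explicitly, even when it is known to be unreachable.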

adql · 2022-10-09T14:45:39Z

this would work though: updatePositions str' = str' <$ update....

Yes, I see how this expression is useful. Makes it clear that we're here for the side effect.

tarleb · 2022-10-10T10:21:21Z

Thank you!

adql · 2022-10-10T18:02:34Z

My pleasure :)

adql added 3 commits October 8, 2022 09:06

Narrow down test case to default pre/post chars

adb36d3

jgm requested a review from tarleb October 8, 2022 14:30

Replace partial T.last with T.unsnoc

2455620

tarleb merged commit 31aa660 into jgm:master Oct 10, 2022

adql deleted the issue8059 branch October 10, 2022 18:02
