commonmark+fancy_lists awkward parsing of some alphabetic list labels as roman #89

Closed
dubiousjim opened this issue Feb 13, 2022 · 4 comments
Labels
bug Something isn't working

Comments

@dubiousjim

Explain the problem.
I'm not sure this is a bug; it may be more of a design difference between how pandoc's Markdown reader and its CommonMark reader parse ordered lists whose labels could be read as either roman or alphabetic. I very commonly write something like this:

> printf 'a. one\nb. two\nc. three' | pandoc -fmarkdown -tnative

I get the very reasonable AST:

[ OrderedList
    ( 1 , LowerAlpha , Period )
    [ [ Plain [ Str "one" ] ]
    , [ Plain [ Str "two" ] ]
    , [ Plain [ Str "three" ] ]
    ]
]

But the CommonMark reader does it differently:

> printf 'a. one\nb. two\nc. three' | pandoc -fcommonmark+fancy_lists -tnative

[ OrderedList
    ( 1 , LowerAlpha , Period )
    [ [ Plain [ Str "one" ] ] , [ Plain [ Str "two" ] ] ]
, OrderedList
    ( 100 , LowerRoman , Period ) [ [ Plain [ Str "three" ] ] ]
]

Depending on one's site styling, the HTML version of the latter might render badly (e.g., with a gap between the "two" and the "three" lines).

Is it compatible with the design of CommonMark to permit this markup to be parsed the way pandoc's Markdown reader does? I'd guess so: fancy_lists is an extension, so presumably the standard doesn't dictate its behavior.
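To make the ambiguity concrete, here is a minimal Python sketch that enumerates the readings a lowercase label can have. It is an illustration only: the names `readings` and `roman_value` are hypothetical, the regex assumes canonical roman numerals up to 3999, and none of this is taken from pandoc or commonmark-hs.

```python
import re
import string

# Assumption: only canonical lowercase roman numerals (1..3999) count as roman.
ROMAN_RE = re.compile(r"m{0,3}(cm|cd|d?c{0,3})(xc|xl|l?x{0,3})(ix|iv|v?i{0,3})")
ROMAN_DIGITS = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}

def roman_value(label):
    """Decode a lowercase roman numeral, e.g. 'ci' -> 101."""
    total = 0
    for pos, ch in enumerate(label):
        value = ROMAN_DIGITS[ch]
        nxt = ROMAN_DIGITS[label[pos + 1]] if pos + 1 < len(label) else 0
        # A smaller digit before a larger one is subtracted (as in 'iv').
        total += -value if nxt > value else value
    return total

def readings(label):
    """Return every (style, start number) a lowercase label could stand for."""
    found = []
    if len(label) == 1 and label in string.ascii_lowercase:
        found.append(("LowerAlpha", ord(label) - ord("a") + 1))
    if label and ROMAN_RE.fullmatch(label):
        found.append(("LowerRoman", roman_value(label)))
    return found

# Single letters that are valid under both readings.
ambiguous = [c for c in string.ascii_lowercase if len(readings(c)) == 2]
print(ambiguous)
print(readings("c"))
```

In this sketch, `c` comes back with both readings, (LowerAlpha, 3) and (LowerRoman, 100), which are the two start numbers that appear for `c.` in this issue. The same holds for `d`, `i`, `l`, `m`, `v`, and `x`, so any alphabetic list that reaches one of those letters can hit the same split.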

On the other hand, pandoc's Markdown reader doesn't cope well with lists that you do want to start in the lowercase roman 100s:

> printf 'c. one\nc. two\nc. three' | pandoc -fmarkdown -tnative
[ OrderedList
    ( 3 , LowerAlpha , Period )
    [ [ Plain [ Str "one" ] ]
    , [ Plain [ Str "two" ] ]
    , [ Plain [ Str "three" ] ]
    ]
]

> printf 'c. one\nci. two\ncii. three' | pandoc -fmarkdown -tnative
[ OrderedList
    ( 3 , LowerAlpha , Period ) [ [ Plain [ Str "one" ] ] ]
, OrderedList
    ( 101 , LowerRoman , Period )
    [ [ Plain [ Str "two" ] ] , [ Plain [ Str "three" ] ] ]
]
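One rule that reproduces the three Markdown-reader outputs quoted in this issue is: keep an item in the open list whenever one of its readings matches that list's style, and otherwise start a new list, reading a lone `i` as roman and any other single letter as alphabetic. The sketch below is a guess that fits those outputs, not pandoc's actual implementation; `group_labels` and its helpers are hypothetical names.

```python
import re

# Assumption: only canonical lowercase roman numerals (1..3999) count as roman.
ROMAN_RE = re.compile(r"m{0,3}(cm|cd|d?c{0,3})(xc|xl|l?x{0,3})(ix|iv|v?i{0,3})")
ROMAN_DIGITS = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}

def roman_value(label):
    """Decode a lowercase roman numeral, e.g. 'ci' -> 101."""
    total = 0
    for pos, ch in enumerate(label):
        value = ROMAN_DIGITS[ch]
        nxt = ROMAN_DIGITS[label[pos + 1]] if pos + 1 < len(label) else 0
        total += -value if nxt > value else value
    return total

def readings(label):
    """Map each style the label could have to the number it would denote."""
    found = {}
    if len(label) == 1 and "a" <= label <= "z":
        found["LowerAlpha"] = ord(label) - ord("a") + 1
    if label and ROMAN_RE.fullmatch(label):
        found["LowerRoman"] = roman_value(label)
    return found

def group_labels(labels):
    """Split a run of labels into lists of (start, style, item_count)."""
    lists = []
    for label in labels:
        found = readings(label)
        if not found:
            raise ValueError(f"not a list label: {label!r}")
        if lists and lists[-1][1] in found:
            # A reading matches the open list's style: stay in that list.
            start, style, count = lists[-1]
            lists[-1] = (start, style, count + 1)
            continue
        # Otherwise open a new list. Assumed tie-break: a lone "i" is roman,
        # any other single letter is alphabetic.
        if label == "i" or "LowerAlpha" not in found:
            style = "LowerRoman"
        else:
            style = "LowerAlpha"
        lists.append((found[style], style, 1))
    return lists

print(group_labels(["a", "b", "c"]))
print(group_labels(["c", "ci", "cii"]))
```

Under this rule `a. b. c.` stays a single LowerAlpha list, while `c. ci. cii.` splits into a LowerAlpha list starting at 3 and a LowerRoman list starting at 101, as in the output above. Getting a roman list that starts at 100 would need some extra signal, such as looking ahead at the next label.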

Pandoc version?
pandoc 2.17.1.1, binary from the GitHub releases page, on macOS 10.15.7.

@dubiousjim added the bug label Feb 13, 2022
@dubiousjim
Author

To be more explicit: I don't think there is a compelling reason to change how the pandoc Markdown reader handles lists starting with c. I'm just noting that in some contexts (for me very unusual ones) the current CommonMark parsing might be the desired one, rather than the current pandoc Markdown one.

@dubiousjim
Author

I also realize there's an easy workaround for this issue: use a. as the label for every item, rather than a., b., c. I still thought the design differences here were worth discussing, and perhaps tweaking. Perhaps I should have raised this on pandoc-discuss, but I'm not subscribed to that list.

@dubiousjim
Author

The only related issue I found was jgm/pandoc#590.

@jgm
Owner

jgm commented Feb 13, 2022

I'll move this to commonmark-hs: it will need to be addressed there.

@jgm transferred this issue from jgm/pandoc Feb 13, 2022
@jgm added a commit that referenced this issue Feb 13, 2022
@jgm closed this as completed in 3f7d9ab Feb 13, 2022