Markdown checkboxes #3051
Comments
I agree, this would be an excellent enhancement. I'm pretty sure it would parse GFM through pdflatex right? There are plenty of simple and elegant solutions to creating a checkbox in LaTeX. |
I'm also interested in seeing this added to the Pandoc Markdown syntax. It's a common and rather obvious bit of markup, and with it, I think Pandoc Markdown becomes a strict superset of every common Markdown variant. |
I would vote on this as well, great feature to have |
Not opposing the idea of supporting this feature, but this statement is far from true. An important question to ask is how it can be done. It seems to be at an "AST change" level of difficulty. If it indeed is, I guess it won't be implemented for a long time. I personally also like to use Markdown to manage to-do lists. For example, the taskpaper style syntax |
It seems reasonable that github flavoured markdown should implement GitHub markdown. (Checkboxes aren't part of daringfireball's original markdown, iirc - it's a GitHub extension) |
Yes, it should, but again, how? As far as I understand, pandoc's unique feature is that it implements the different Markdown extensions separately, so that turning extensions on or off in different combinations yields another Markdown variant. But the premise is that pandoc already has the extension. When pandoc doesn't have such an extension, you need to ask what it takes to add it. Is it a document model pandoc already supports? E.g. in the case of MultiMarkdown inline footnotes, pandoc does not support the syntax, but pandoc's internals can certainly handle footnotes (inline or not), so adding such an extension only requires a change in the Markdown parser.
But in this case, if I am not mistaken, it requires an AST change, because pandoc's internals just can't represent it. If I'm correct, this puts the required change into the most difficult category: "AST change". It requires a change in all existing writers and readers, and if you look at the graph on <pandoc.org>, there certainly are many.
Now, among all these "AST change"-level feature requests (see the list by clicking GitHub's label), how many are older, or more important? E.g. one important feature request that involves an AST change is column/row spans in tables. To me this is something much more important and general to have.
So what I'm suggesting is not that it should not happen; I want it to happen too. But I'm saying that given the level of difficulty, the workload of the core developers, the number of open issues, and the priorities they might have, this is unlikely to happen in the foreseeable future.
One thing people often get wrong about the Markdown variants in pandoc is that they are supposed to be fully compatible (say,
Again, some of the points above are based on the assumption that it requires an AST change. Feel free to correct me if I'm wrong. |
Ah, sorry I misunderstood your previous comment. I don't know how the AST in Pandoc works, I'm here ignorantly opening bug reports with a "would-be-nice-if" flavour. Naively pecking through the code base it looks like the only way to propagate the checkboxes will be a new AST node type. Is there a node type available for "fallback"? So you can specify 2 different ways to represent a given part of the AST, in the case of Checkbox some Markdown.Checkbox type as well as Teletype with value This sort of thing might give developers more freedom to add features in a way which doesn't trample on every single backend. It also encourages a bad behaviour of adding features which aren't going to be visible in every backend, but perhaps that's okay. |
As far as I understand, a fallback would not work. I've suggested a similar approach for column/row spans in tables, but was told it won't work. So unfortunately any AST change will be a very daunting task: at the very least, all writers, all readers, and pandoc-types need to be changed (and sometimes more, say pandoc-citeproc, templates, etc.). I think the core developers have been thinking about AST changes. I don't know much about it, but if I were to make such a big change, I would want to get it right the second time and include as many useful features that require AST changes as possible (so that there's no third time), which only makes the task more daunting. However, another unique feature of pandoc is its filter system. So if this is something you sorely need right now, I suggest you write a filter to do it. How it should be done depends on your needs, e.g. are you writing to or reading from GFM, and is pandoc Markdown only an intermediate format (e.g. you want GFM -> PDF)? If you are interested in writing a filter or need help with that, you can open a thread on pandoc-discuss; lots of experts there can give you advice and some might even write one for you (don't count on that, however). |
Yes, you can write a filter that finds list items
and replace these with e.g.
This should work fine for HTML output. |
I've written some pandoc filters in the past to do something similar, and I can't say I'm desperate. Like I say, it's a "would-be-nice-if". If it's going to involve such heavy rework I'm happy to leave this as |
I think that implementing GFM task lists is important. Here is a real-case scenario of the problems that this missing feature can cause. If I use task lists in a Markdown document, like this:
- [ ] Mercury
- [x] Venus
- [x] Earth (Orbit/Moon)
and then I use pandoc to clean up the Markdown source:
the file gets cleaned up, except for the task list, which gets corrupted by escaping the brackets:
- \[ \] Mercury
- \[x\] Venus
- \[x\] Earth (Orbit/Moon)
That's a pity. Pandoc is a great tool for cleaning up Markdown source files (especially with But right now, this can't be used on GFM docs which make use of task lists, or else they break. In many GitHub projects I use batch scripts to clean up all Markdown files via pandoc (from GFM to GFM) before committing. This REALLY helps: I work with "lazy" Markdown syntax, but after cleanup all files are up to pandoc standard (e.g. I work with ATX-style headers, but commit with Setext-style headers, etc.); most of all, it makes for much cleaner diffs when merging contributions and resolving conflicts. So I have to choose: either I don't use task lists in Markdown docs, or I don't use script automation to clean up source files. Task lists being part of the GFM standard, they ought to be implemented in pandoc's |
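For the clean-up workflow described in this comment, one possible stopgap is to post-process pandoc's Markdown output and undo the escaping on task markers only. The following is a minimal sketch, not something pandoc provides: the function name `unescape_task_markers` is hypothetical, and it assumes the escaped markers look exactly like the `\[ \]` and `\[x\]` shown above and sit directly after a bullet.

```python
import re

# Matches an escaped task marker directly after a bullet ("-", "*" or "+")
# at the start of a line, e.g. "- \[ \] Mercury" or "  * \[x\] Venus".
ESCAPED_MARKER = re.compile(r"^(\s*[-*+]\s+)\\\[( |x|X)\\\]", re.MULTILINE)


def unescape_task_markers(markdown: str) -> str:
    """Turn '\\[ \\]' / '\\[x\\]' back into '[ ]' / '[x]' at the start of list items."""
    return ESCAPED_MARKER.sub(r"\1[\2]", markdown)


if __name__ == "__main__":
    cleaned = "- \\[ \\] Mercury\n- \\[x\\] Venus\n- \\[x\\] Earth (Orbit/Moon)"
    print(unescape_task_markers(cleaned))
```

Escaped brackets elsewhere in a line are left alone. Note that if the document also defines a link reference named `x`, an unescaped `[x]` would be read as a link, so this is only safe for files without such a definition.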
Also see this thread in pandoc-discuss. @jgm has specifically said pandoc is not designed as a linter. So we are on our own when we push pandoc beyond what it is designed for. And please read my comments in this thread. It is likely you don't understand what |
I agree, it would be good to support this somehow. One
option that wouldn't require an AST change would be to parse
- [x] foo
as
[BulletList
 [[Plain [Span ("",["checkbox","checked"],[]) [Str "[x]"],
  Space,Str "foo"]]]]
That would give decent output in all writers, and when
writing markdown_github we could special-case it and not
escape these brackets.
This wouldn't give you nice-looking checkboxes in PDF or
HTML output, though. For that, we'd need either a bunch of
fairly ugly special-case code in the writers, or an AST
change allowing us to represent a list with arbitrary
markers.
|
Thanks. I am aware of the AST problems and complexities. Nonetheless, I wanted to put forth this particular use case. So, it seems that the only solution for now would be to create a filter that preserves [ ] and [x] when they are the first three chars at the beginning of a list element. But couldn't this be implemented outside the AST, by having pandoc simply leave them verbatim on the text leaf when working with markdown_github format? Unfortunately I have no knowledge of Haskell, so I can't contribute much on this issue. But I could look into creating a filter. But I did look into pandoc sources, to inspect the AST structure. From what I gather, a checklist is just a list subtype, just as Roman lettering is a subtype of an ordered list. Couldn't the AST accommodate some extra attribute to specify that checkboxes are unordered/bullet list items with an extra checkbox qualifier (with an on/off boolean status)?
That's a pity though. Pandoc does a good job at cleaning up documents (because of the AST). Maybe in future editions it could have a special After all, people look for pandoc because they like the idea of having a single standalone binary (OK, plus citeproc) to handle format conversion. But if we need to install Node.js, Python, or Ruby just to get a linter, then its benefits get diluted (and possibly you end up installing a different linter for each format, with dozens of dependencies). |
+++ Tristano Ajmone [Dec 14 16 02:47 ]:
> Thanks. I am aware of the AST problems and complexities. Nonetheless, I
> wanted to put forth this particular use case.
> So, it seems that the only solution for now would be to create a filter
> that preserves [ ] and [x] when they are the first three chars at the
> beginning of a list element. But couldn't this be implemented outside the
> AST, by having pandoc simply leave them verbatim on the text leaf when
> working with markdown_github format?
Yes, we could special-case this in the markdown writer.
There's always some chance it would lead to a conflict, e.g.
if you had a link reference definition
[x]: foo
then an unescaped
- [x] bar
would become a link.
> Unfortunately I have no knowledge of Haskell, so I can't contribute
> much on this issue. But I could look into creating a filter.
All that would be needed for your purposes would be to
identify
Str "[",Space,Str "]"
at the beginning of a list item and replace it with
RawInline (Format "markdown") "[ ]"
which won't be escaped.
> But I did look into pandoc sources, to inspect the AST structure. From
> what I gather, a checklist is just a list subtype, just as Roman
> lettering is a subtype of an ordered list. Couldn't the AST
> accommodate some extra attribute to specify that checkboxes are
> unordered/bullet list items with an extra checkbox qualifier (with an
> on/off boolean status).
Yes, it's definitely possible to add a new type of list to
the AST. But it's a huge change since you then have to
change virtually every module in pandoc to deal with the new
element.
|
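The replacement described in the reply above can be sketched as a small filter step. This is only a sketch under stated assumptions: it is written in Python against pandoc's JSON form of the AST (plain dicts such as `{"t": "Str", "c": "["}`), the helper names `protect_marker` and `protect_bullet_list` are hypothetical, and the `[x]` case (assumed to arrive as a single `Str "[x]"`) is an extension of the `[ ]` case the reply spells out.

```python
def str_(text):
    """Build a pandoc JSON `Str` inline."""
    return {"t": "Str", "c": text}


SPACE = {"t": "Space"}


def raw_markdown(text):
    """Build a `RawInline (Format "markdown")` element, which the writer does not escape."""
    return {"t": "RawInline", "c": ["markdown", text]}


# How the two markers are assumed to look after parsing.
UNCHECKED = [str_("["), SPACE, str_("]")]  # "[ ]"
CHECKED = [str_("[x]")]                    # "[x]" (assumption, see lead-in)


def protect_marker(inlines):
    """Return a copy of `inlines` with a leading task marker turned into raw Markdown."""
    for pattern, literal in ((UNCHECKED, "[ ]"), (CHECKED, "[x]")):
        n = len(pattern)
        # The marker must be the first thing in the item and be followed by a space.
        if inlines[:n] == pattern and inlines[n:n + 1] == [SPACE]:
            return [raw_markdown(literal)] + inlines[n:]
    return list(inlines)


def protect_bullet_list(block):
    """Apply `protect_marker` to the first Plain/Para of every item of a BulletList."""
    if block.get("t") != "BulletList":
        return block
    items = []
    for item in block["c"]:
        new_item = list(item)
        if new_item and new_item[0]["t"] in ("Plain", "Para"):
            first = new_item[0]
            new_item[0] = {"t": first["t"], "c": protect_marker(first["c"])}
        items.append(new_item)
    return {"t": "BulletList", "c": items}
```

A complete filter would read the document as JSON from stdin, apply `protect_bullet_list` to every block (including nested lists), and write the result back. The link reference conflict mentioned in the reply still applies, since the raw `[x]` is emitted unescaped.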
It seems unnatural to special-case this, as such a case (GFM checklist) would only happen when it is both from and to
If the only thing you need is to change
I've made a similar suggestion before. But the problem of using pandoc as some sort of "linter" is 2-fold:
The first one is optional but nice to have as a linter. It can already be done partially by +/- markdown extensions. And this will probably never be the goal of pandoc. The second one is more critical, but is currently not true. It is very hard to achieve this, and @jgm has mentioned this is the area he wants to improve (but cannot guarantee). The reason it is important is it guarantees the output captures what the AST represents. i.e. its importance is not only for being a linter but any reader/writer pairs in general. See more on this topic in How to programmatically enforcing a pandoc markdown style - Google Groups. (I clicked the link I referred to this in the last post, but the link is wrong. This is not the first time I have problem posting a link to a certain post to pandoc-discuss, probably related to the mobile version of Google Groups. If the link doesn't work, search the topic there and you'll find it.)
This approach is interesting, since it circumvents the need for an "AST change". On one hand, it feels unnatural. But on the other hand, if it is functionally equivalent to an "AST change" without an "AST change", maybe we shouldn't care too much about being "syntactically correct". Just to bring this out explicitly, the example at the beginning of this thread is rendered by GitHub as <ul class="contains-task-list">
<li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled=""> example unchecked</li>
<li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" checked="" disabled=""> example checked</li>
</ul> |
Not sure what you mean here. Org-mode has a similar feature, and checkboxes can be output in HTML and probably in docx and odf. There's a textile issue about implementing a feature like this, too. |
Lots of cool suggestions here! Hopefully this feature might be first implemented via some filters or custom readers and writers, to test the grounds with different approaches ... What could be a way to represent checkboxes in non-html formats? I remember coming across various solutions in doc/pdf, like using some common dingbats (of the sort you should find on all OSs’). I've found an MS Office Help article suggesting use of Wingdings font. Unicode symbols could be a more universal approach, provided the font being used contains the glyphs (I think there is a fallback mechanism for missing glyphs, resorting to use default fonts). On Wikipedia I've found some unicode symbols that might do the job:
The choice is between using a pair of checkbox symbols (ticked box / empty box) or check marks (tick / cross). The latter is often confusing. Here in Italy we use both systems, and the checkboxes can be interpreted differently, depending on whether or not there is a distinction between check- and X-marks:
or:
I like the GFM checkbox because it is clearly a yes/no binary choice. But maybe for formats other than html there might be some other standard ways in place, which I am not aware of. |
If you put the quote in context, I guess @jgm means special-casing that particular combination in the markdown writer (while leaving the AST and every other writer untouched). That discussion is independent of implementing the whole checklist feature, which in that case is no longer a "special case". |
I would also like this feature and am having to work around their absence. GFM task lists have become really quite prevalent. I understand the difficulties of adding them but it is surely a matter of time before they get added? |
+1 vote for GFM checkboxes support when exported as As for presentation, I would vote for the simple Bullets with hyphens/dashes:
Bullets with asterisks/stars:
And the case of the |
I'm voting for this too. |
Sorry for being the bad guy, but if you want to vote, use the emoji reaction on the right of each message. The difference is that a reaction won't notify people and cause spam. E.g. you can read more about this in Reactions to Pull Requests, Issues, and Comments · Issue #141 · dear-github/dear-github. In some repos, a thread like this would be locked very soon (but not in pandoc, because the developers here are nice). It is not that the developers don't see value in this issue (see @jgm's comment above for example), but it is difficult to handle it properly (and if you want a hackish approach, suggestions have already been made above). |
Sorry. I actually wanted to start by just showing some formatting (including checkboxes) in a terminal-based viewer. One that I can use as a So I started throwing one together in go for now: https://github.com/ec1oud/mdcat (and using my fork of blackfriday https://github.com/ec1oud/blackfriday ) mainly because the blackfriday parser seemed like a good starting point, and because I've been curious about go. (Probably Haskell is better, but I haven't gotten around to climbing that learning curve yet.) At some point hopefully the world will stop calling this feature something from "github markdown" and expect it to be part of markdown itself. A de-facto extension, or even part of the standard. IMO it's one of the most useful extensions of all, and it's also easy to implement. I think pandoc should also have an output mode for ANSI terminal codes (to style some text spans, like headings and emphasized phrases) plus unicode (for checkboxes, bullets, fractions, "smartypants" quotes and ellipses etc., block quote indentation bars, and box drawing around tables). Then it could be used directly as a filter for |
If I understand you correctly that you mean it is easy to implement GitHub checklist in pandoc, then my point all along is that it is actually not. Try to follow the discussion above. P.S. I'd consider discussion like this helpful though, unlike the voting message above. And I've been there too, so don't worry. |
Because you have an AST, it needs to be extended for this. I get it. |
@lollipopman I'd expect the html writer to output with extension <ul>
<li><input type="checkbox"> example unchecked</li>
<li><input type="checkbox" checked> example checked</li>
</ul> and with <ul>
<li>☐ example unchecked</li>
<li>☒ example checked</li>
</ul> what more do you want? |
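The two outputs in the comment above can be sketched as a single rendering function with a switch. This is an illustration only, not pandoc's HTML writer: `render_task_list` and its `native_checkboxes` parameter are hypothetical names, and the input is assumed to be a list of `(checked, text)` pairs already extracted from the list items.

```python
from html import escape


def render_task_list(items, native_checkboxes=True):
    """Render (checked, text) pairs as an HTML <ul>.

    With native_checkboxes=True each item starts with an <input> element;
    otherwise it starts with a Unicode ballot box character.
    """
    lines = ["<ul>"]
    for checked, text in items:
        if native_checkboxes:
            marker = '<input type="checkbox" checked>' if checked else '<input type="checkbox">'
        else:
            marker = "☒" if checked else "☐"
        # Escape the item text so it cannot inject markup.
        lines.append(f"<li>{marker} {escape(text)}</li>")
    lines.append("</ul>")
    return "\n".join(lines)


if __name__ == "__main__":
    tasks = [(False, "example unchecked"), (True, "example checked")]
    print(render_task_list(tasks))
    print(render_task_list(tasks, native_checkboxes=False))
```

Keeping the choice in one flag mirrors the idea that the same list can degrade to plain characters wherever a real form control is not wanted.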
@lollipopman the task-list filter posted above does exactly that. It serves as a stopgap measure until the best way to handle checkboxes has been decided. |
That sounds like a great idea! The only output format that has a special way to represent checkboxes is HTML anyway, and for the others the unicode |
If the HTML writer does some magic for that character, how would you literally have that character in HTML output? |
@quasicomputational Yes, the markdown writer should also be sensitive to the |
As far as I know the Github checkboxes don't have bullets and are disabled. Should be something along these lines, then:
Some padding might be necessary for indentation. |
@OleMussmann I agree that it looks better without the bullet point |
One possibility would be to parse
into the pandoc structure
Div ("",["checklist"],[])
  [ BulletList
    [ [Plain [Span ("",["checkbox","checked"],[]) [Str "☑",Space], Str "Foo"]]
    , [Plain [Span ("",["checkbox","unchecked"],[]) [Str "☐",Space], Str "Bar"]]
    ]
  ]
In most formats, this would come out as a bullet list with the unicode checkbox characters. |
I suppose the extra Span wrapping the unicode string lowers the probability of a clash even further, though I guess it's not strictly necessary. But yeah, maybe it's somewhat cleaner... |
Mauro Bieg <notifications@github.com> writes:
> I suppose the extra Span wrapping the unicode string lowers the probability of a clash even further, though I guess it's not strictly necessary. But yeah, maybe it's somewhat cleaner...
The point of the extra span is to make it easy for
writers to replace it, e.g. with an `input` element.
|
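How a writer might use the extra Span can be sketched as follows. This is a rough illustration in Python over pandoc's JSON form of the AST, not code from any pandoc writer: the function names are hypothetical, and emitting a `disabled` input is an assumption borrowed from the GitHub rendering quoted earlier in the thread.

```python
def checkbox_span(checked):
    """Build the proposed Span: classes ["checkbox", state] around a glyph and a space."""
    state = "checked" if checked else "unchecked"
    glyph = "☑" if checked else "☐"
    return {
        "t": "Span",
        "c": [["", ["checkbox", state], []],
              [{"t": "Str", "c": glyph}, {"t": "Space"}]],
    }


def span_to_input(inline):
    """Replace a checkbox Span with a raw HTML <input>; leave everything else untouched."""
    if inline.get("t") != "Span":
        return inline
    (_ident, classes, _attrs), _content = inline["c"]
    if "checkbox" not in classes:
        return inline
    # Membership test on the class list, so "unchecked" is not mistaken for "checked".
    checked = " checked" if "checked" in classes else ""
    html = f'<input type="checkbox" disabled{checked}> '
    return {"t": "RawInline", "c": ["html", html]}


if __name__ == "__main__":
    plain = [checkbox_span(True), {"t": "Str", "c": "Foo"}]
    print([span_to_input(i) for i in plain])
```

Because the marker is isolated in its own Span, a writer that does nothing special still prints the Unicode glyph, while one that recognises the class can swap in whatever its format offers.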
closes jgm#3051 changes CommonMark Writer to output raw "markdown"
I have a hack that I'm using to render a box next to a text list. I'm writing Markdown in Obsidian, and it allows for LaTeX math symbols, natively.
Pandoc conversions of the Markdown to pdf using |
It would be nice if Pandoc's GFM supported checkboxes, either through an extension or native to the GFM.
I hope I'm not just being thick here, I couldn't find anything about it in the manual, and no one appears to have mentioned it in the issue tracker.