Proposal: a new general purpose Block extension #1175

Closed
waylan opened this issue Aug 13, 2021 · 84 comments
Labels
3rd-party Should be implemented as a third party extension. feature Feature request. someday-maybe Approved low priority request.

Comments

@waylan
Member

waylan commented Aug 13, 2021

I haven't solidified the proposal yet, but I am envisioning something like a fenced code block (except it's not for code and would use different delimiters) which might support two different block types:

  1. Element Block: The user would define the HTML element (tag) and any attributes. This would be simpler to deal with (read, write and parse) than md_in_html as it avoids lots of HTML tags in your Markdown.
  2. Templated Block: The user would define the block 'type' and any attributes. So long as a template exists for the 'type', then the content is inserted into the template after being parsed as Markdown.

Therefore, if the user provided div (for an element block), the content is parsed as Markdown and then wrapped in a div element with any additional attributes being set on the div. However, if the user provides a 'type' of admonition, then the admonition template is used. Presumably the template would expect a set of attributes (including the admonition type), a title and a body and insert those values into the HTML template. Other templates might accept other options.
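The two block types described above could be dispatched roughly as follows. This is only a sketch under stated assumptions: `TEMPLATES`, `render_block`, and the placeholder names are hypothetical, and the body is assumed to have already been converted from Markdown to HTML.

```python
from html import escape

# Hypothetical template registry; the names and placeholders are illustrative only.
TEMPLATES = {
    "admonition": (
        '<div class="admonition {type}">\n'
        '<p class="admonition-title">{title}</p>\n'
        '{body}\n'
        '</div>'
    ),
}


def render_block(name, attrs, body_html):
    """Render already-converted Markdown (body_html) as a templated or element block."""
    if name in TEMPLATES:
        # Templated block: escape the attribute values, insert the body untouched.
        values = {key: escape(str(value), quote=True) for key, value in attrs.items()}
        return TEMPLATES[name].format(body=body_html, **values)
    # Element block: treat the name as an HTML tag and set the attributes on it.
    attr_text = "".join(
        f' {key}="{escape(str(value), quote=True)}"' for key, value in sorted(attrs.items())
    )
    return f"<{name}{attr_text}>\n{body_html}\n</{name}>"
```

A name with a registered template takes the template path; anything else is treated as a tag name, so `render_block("div", {"id": "x"}, body)` simply wraps the body in a div.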

This would also allow users to use the CSS provided by whichever CSS framework they are using. For example, the MkDocs default theme is using Bootstrap, which provides its own set of alerts. However, MkDocs doesn't use them, but instead also provides CSS for Rst style admonitions because that is what the admonition extension expects. With a new block extension, a Bootstrap alert template could allow Bootstrap's CSS to be used along with all of Bootstrap's alert features (icons, dismissal, etc) removing the need for the MkDocs theme to also include the Rst based CSS.

A few additional things to note:

I would prefer to not add any new extensions to the core library. So I would expect this to be developed in a separate repo. However, I mention it here because it could affect how we proceed with #1174. Also, I would appreciate any feedback on the idea and/or input on a possible syntax proposal.

@waylan waylan added 3rd-party Should be implemented as a third party extension. feature Feature request. labels Aug 13, 2021
@waylan
Member Author

waylan commented Aug 13, 2021

My initial proposal suggests that these blocks are different from fenced code blocks, but another option would be to use the fenced code block delimiters and also provide for a 'code' type. In our implementation, a user could provide their own 'code' template which might allow them to provide their own custom syntax highlighting solution. If we went this way, the user would simply use this extension instead of the fenced_code extension, avoiding any conflicts.

I am undecided which way to go on this. Do we overload fenced code blocks, or create a completely new and different fenced block? Similar approaches (overloading fenced code blocks) have been taken by other implementations. For example, R Markdown uses fenced code blocks to define the R code which is executed with its output being included in the rendered document. I have also seen at least one Markdown-to-reStructuredText bridge which used overloaded fenced code blocks to replicate directives (sorry, I don't recall which one). To my mind, the primary benefit of overloading fenced code blocks is that when a document using the blocks is parsed by other Markdown implementations, the block is simply parsed as a plain code block. In contrast, introducing a completely new syntax could result in some weird output from other implementations. The only reason the existing admonitions extension mostly avoids this is because the content is indented and becomes a code block. But I would prefer to not indent the content of the new block.

@facelessuser
Collaborator

In general, when I went through the pain of getting fenced blocks to work under lists and such via SuperFences, I just made it so that users could create custom fences to generate whatever kind of block they wanted. With that said, it still doesn't run inline with normal block processors, so it has some limitations as the content is stored under placeholders like code.

I'd love an approach that uses the normal block processors, and if it is possible, I'd love to see how that would work.

  1. The way Python Markdown currently normalizes line breaks and such between blocks is frustrating for code blocks as any code blocks with multiple empty lines always get normalized to one.
  2. Even in Admonitions there is still a degree of complexity involved with appending additional blocks as children and accounting for lists and such. I think this is partly due to how Python Markdown consumes blocks.

Anyways, I'm at least interested to see how this may potentially work.

@waylan
Member Author

waylan commented Aug 13, 2021

@facelessuser you make some valid points. I have not looked too closely at how you implemented superfences, and was curious how you worked around the problems with the existing block parser. Sounds like you didn't. It may not make sense to even attempt this until after we refactor how blocks are parsed. However, I'm not really interested in tackling that problem right now. Mostly I am simply trying to get my ideas recorded at this point. I haven't given much thought to how it might be implemented.

@mitya57
Collaborator

mitya57 commented Aug 13, 2021

See also Pandoc Markdown's Divs and Spans https://pandoc.org/MANUAL.html#divs-and-spans which have a bit different but similar idea.

@facelessuser
Collaborator

@facelessuser you make some valid points. I have not looked too closely at how you implemented superfences, and was curious how you worked around the problems with the existing block parser. Sounds like you didn't. It may not make sense to even attempt this until after we refactor how blocks are parsed. However, I'm not really interested in tackling that problem right now. Mostly I am simply trying to get my ideas recorded at this point. I haven't given much thought to how it might be implemented.

Yeah, I accomplish everything through the preprocessor still. It's a bit hacky, but it was the only way to avoid the above issues. Anyways, I think this is a good idea, and generally, if/when block parsers are able to handle blocks better, this will be a good addition on top of it.

@waylan waylan added the someday-maybe Approved low priority request. label Nov 3, 2021
@alkisg

alkisg commented Nov 28, 2021

As a user, I think this is a wonderful idea!
I'm currently migrating a mediawiki site to git/markdown. For admonitions, the pandoc mediawiki→markdown converter produced ::: warning, which is the same syntax that markdown-it, docusaurus and some others use. It'll take time to update hundreds of wiki pages to the !!! warning and indented content that mkdocs uses. Also, if I ever need to switch away from mkdocs in the future, I'll have to convert the content once more.
The ::: class (and the ::: tag) ideas are so powerful that they will probably make !!! admonitions obsolete if they're implemented! I hope we can see this in production soon! :)

@waylan
Member Author

waylan commented Nov 29, 2021

Just stumbled upon MyST, a Commonmark parser which is intended as a reStructuredText replacement. I've seen many before, but this is the first one where I like the directives syntax. It is very similar to what I was trying to accomplish here. See also, their optional colon_fence extension, which looks similar to (but not the same as) Pandoc's syntax.

Also of interest is their discussion of various existing proposals and implementations out in the wild and why they chose what they did.

@facelessuser
Collaborator

Just stumbled upon MyST, a Commonmark parser which is intended as a reStructuredText replacement. I've seen many before, but this is the first one where I like the directives syntax. It is very similar to what I was trying to accomplish here. See also, their optional colon_fence extension, which looks similar to (but not the same as) Pandoc's syntax.

Unfortunately, us doing code this way, without a huge overhaul of the parser to get block handling where we want, may not be possible. And I definitely wouldn't try to use the ```{directive} format unless the block handling was refactored as the only real way to properly preserve code is using the PreProcessor, and that syntax will conflict with fenced code (at least as things are currently).

But, if you don't need to preserve empty lines and such for code blocks and you just want to handle normal Markdown content, it may be possible to cook something up now. I've been considering possibly tinkering with a way to pull this off now that I've had some time to think about it, at least the logic in regards to the handling of the "fences". Assuming some sort of prototype could be pulled off, we'd at least have something to work with now.

@facelessuser
Collaborator

facelessuser commented Aug 4, 2022

So, I actually started to play around with this, and it is in a very early prototype stage. I wanted to make sure I could get it to work in lists and such. I kind of played with the directive syntax some. I'm not sure if anything below, syntax or behavior-wise, will be the final implementation, but this is currently just exploratory.

The thing I like is that you don't have to do indentations for admonitions and such, and it doesn't break my editor's syntax highlighting :). I guess some common admonitions could be created like warning and if people want something different, they could use the generic admonition type. Anyways, this was a simple test:

- 
    ::::{admonition} This is really important!
    ---
    type: warning
    ---

    Don't do that, for these reasons:

    - 
        :::{details} Here is a summary
        This is nested!
        :::

    ::::

    :::{html} div
    ---
    attributes: {id: some-id, class: these are classes}
    ---

    Some other content
    :::

Results

<ul>
<li>
<div class="admonition warning">
<div class="admonition-title">This is really important!</div>
<p>Don't do that, for these reasons:</p>
<ul>
<li>
<details>
<summary>Here is a summary</summary>
<p>This is nested!</p>
</details>
</li>
</ul>
</div>
<div class="these are classes" id="some-id">
<p>Some other content</p>
</div>
</li>
</ul>

So far the directives are pretty simple objects. They have on_create events and on_end events. You can store the content as you accumulate it and then process it once you hit on_end, or not store it and let the Markdown parser just parse it as Markdown content. That's really it.

class DirectiveTemplate:
    """Directive template."""

    # Set to something if argument should be split.
    # Arguments will be split and white space stripped.
    ARG_DELIM = ''
    NAME = ''
    STORE = False

    def __init__(self, length):
        """Initialize."""

        self.store = []
        self.length = length

    def config(self, args, **options):
        """Parse configuration."""

        self.args = [a.strip() for a in args.split(self.ARG_DELIM)] if args and self.ARG_DELIM else [args]
        self.options = options

    def on_create(self, el):
        """On create event."""

    def on_end(self, el):
        """Perform action on end."""
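To make the lifecycle concrete, here is a minimal sketch of a directive derived from the base class and driven by hand. The `Details` subclass and the manual driver at the bottom are hypothetical. It also assumes that `on_create` receives the parent element and returns the new element, and that the block processor (not shown) would normally make these calls.

```python
import xml.etree.ElementTree as etree


class DirectiveTemplate:
    """Trimmed copy of the prototype base class shown above."""

    ARG_DELIM = ''
    NAME = ''
    STORE = False

    def __init__(self, length):
        self.store = []
        self.length = length

    def config(self, args, **options):
        self.args = [a.strip() for a in args.split(self.ARG_DELIM)] if args and self.ARG_DELIM else [args]
        self.options = options

    def on_create(self, el):
        """On create event."""

    def on_end(self, el):
        """Perform action on end."""


class Details(DirectiveTemplate):
    """Hypothetical directive: wrap content in <details> with a <summary>."""

    NAME = 'details'

    def on_create(self, parent):
        el = etree.SubElement(parent, 'details')
        summary = etree.SubElement(el, 'summary')
        # Fall back to a generic summary when no argument was given.
        summary.text = self.args[0] or 'Details'
        return el


# Drive the lifecycle by hand; in the prototype the block processor would do this.
root = etree.Element('div')
directive = Details(length=3)
directive.config('Here is a summary')
details_el = directive.on_create(root)
etree.SubElement(details_el, 'p').text = 'This is nested!'  # stand-in for parsed Markdown
directive.on_end(details_el)
html = etree.tostring(root, encoding='unicode')
```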

Anyways, I figured I'd share it and see if people had any thoughts.

@facelessuser
Collaborator

Yeah, it is pretty easy to just derive to create shortcuts for Note admonitions and such:

import xml.etree.ElementTree as etree


class Admonition(DirectiveTemplate):
    """Admonition."""

    NAME = 'admonition'

    def on_create(self, parent):
        """Create the element."""

        el = etree.SubElement(parent, 'div')
        t = self.options.get('type', '').lower()
        title = self.args[0] if self.args and self.args[0] else t.title()
        classes = [c for c in self.options.get('classes', '').split(' ') if c]
        if t != 'admonition':
            classes.insert(0, t)
        classes.insert(0, 'admonition')
        ad_title = etree.SubElement(el, 'div', {'class': 'admonition-title'})
        ad_title.text = title
        el.set('class', ' '.join(classes))
        return el


class Note(Admonition):
    """Note."""

    NAME = 'note'

    def config(self, args, **options):
        """Parse configuration."""

        super().config(args, **options)
        self.options['type'] = 'note'

And then this:

:::{admonition} This is really important!
---
type: warning
---

Don't do that!
:::

:::{note}
Just a note
:::

:::{note} With a title
And some words.
:::

Becomes this:

<div class="admonition warning">
<div class="admonition-title">This is really important!</div>
<p>Don't do that!</p>
</div>
<div class="admonition note">
<div class="admonition-title">Note</div>
<p>Just a note</p>
</div>
<div class="admonition note">
<div class="admonition-title">With a title</div>
<p>And some words.</p>
</div>

@waylan
Member Author

waylan commented Aug 4, 2022

Very cool. Although, when I initially proposed a template block, I meant to actually use a template. Something like the following, which would allow the end user to define any layout they want.

<div class="admonition {{ type }}">
<div class="admonition-title">{{ title }}</div>
{{ body }}
</div>

Of course, that would ideally end up as part of the etree, which adds additional complications, so I understand why you haven't taken that approach. It's just that requiring users to use the etree API to define their own custom blocks narrows the target audience. Although, I suppose for specific predefined block types like admonitions, using the etree API is probably more performant. However, a wrapper around what you have so far could provide a more general purpose system which uses actual templates. I just wouldn't name what you have "template."

@facelessuser
Collaborator

However, a wrapper around what you have so far could provide a more general purpose system that uses actual templates. I just wouldn't name what you have "template."

Yeah, the idea of templates wasn't really laid out, and I do realize what I have isn't a template. The naming is quite wrong in that regard. I haven't even thought about true templates yet. I'm still working through getting the blocks to not get messed up when passing through lists. There are always list corner cases...always 😢 .

I feel there is an advantage to more advanced, non-template type variants, but I can also see the attraction for actual templates.

@facelessuser
Collaborator

I did get "templating" working. But it seems to offer a host of troublesome situations. We have to create a temporary div to let Markdown figure out the context and properly wrap things in <p> and such, but it doesn't have any intelligence to know which variables need escaping and which will be handled by Markdown. Also, what if you insert the content into something that requires preserving the text (like in a code block)?

It's all kind of a pain. I haven't even bothered to try and figure out all the templating cases, just wanted to see if we could utilize the existing system to convert templates into directives:

TEST = """
<div class="{directive}">
<div class="test-title">{arg0}</div>
{body}
</div>
"""

:::{test} A title
Some **content**

More content

    This is code

:::

<div class="test">
<div class="test-title">A title</div>
<p>Some <strong>content</strong></p>
<p>More content</p>
<pre><code>This is code
</code></pre>
</div>

So, could it kind of sort of work? Yeah, is there a lot more intelligence that would have to be added? Probably. Is it worth the effort? 🤷🏻 How much motivation do I have to plow through and get a fully working template approach? 🤷

Knowing that a template approach is probably viable is probably enough for me right now. I think my main concern is making sure the general flow is fairly sound and maybe putting up an experimental branch over at pymdown-extensions.

I think I have most major list flow issues solved.

@facelessuser
Collaborator

I guess you could maybe make available an escaped-body and a body that could dictate what kind of temporary element gets created to store the content. For the rest, you just have to kind of assume the user won't feed in bad content as there is no way to be sure about context.

If the user is creating a series of elements that require their own different body each with different requirements...too bad 🙃 . I think the template case shouldn't be allowed to be that complicated. If they have a greater need, they shouldn't use the template approach.
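The body versus escaped-body idea might look something like the sketch below. It is not the prototype's code: `fill_template`, the `{escaped_body}` placeholder, and the `render_markdown` callable are all hypothetical stand-ins, and the template uses the same `str.format` style placeholders as the TEST template above.

```python
from html import escape


def fill_template(template, name, args, raw_body, render_markdown):
    """Fill a str.format style template with escaped arguments and both body variants."""
    values = {f"arg{i}": escape(arg) for i, arg in enumerate(args)}
    values["directive"] = escape(name, quote=True)
    # `body` is handed to Markdown; `escaped_body` preserves the raw text.
    values["body"] = render_markdown(raw_body)
    values["escaped_body"] = escape(raw_body)
    # Placeholders the template does not use are simply ignored by str.format.
    return template.format(**values)


def fake_markdown(text):
    """Stand-in for a real Markdown conversion, used only for this example."""
    return f"<p>{text}</p>"


CODE_TEMPLATE = '<pre><code>{escaped_body}</code></pre>'
BOX_TEMPLATE = '<div class="{directive}"><h4>{arg0}</h4>{body}</div>'

code_html = fill_template(CODE_TEMPLATE, "raw", [], "a < b", fake_markdown)
box_html = fill_template(BOX_TEMPLATE, "test", ["A & B"], "hello", fake_markdown)
```

The template author picks the placeholder that fits the context, which keeps the renderer itself free of any guessing about where the content ends up.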

@facelessuser
Collaborator

I guess I should state that currently, templates with nested directives fail...not sure why though. I'll have to dig a bit deeper.

@waylan
Member Author

waylan commented Aug 4, 2022

I think leaving templates as something to tackle later is a reasonable approach. My initial proposal was simply trying to present an ideal situation from the user's perspective with no thought to how it would be implemented. I'm sure it's possible, but it may not be worth the effort.

@facelessuser
Collaborator

facelessuser commented Aug 4, 2022

Yeah, that sounds good. That's probably going to be my approach for now. I think my current issue is the fact I'm swapping out the placeholder with the real template element during the block processor phase...but maybe that kind of operation shouldn't happen until after the block processor phase is over...

I'm certain it's doable, but I think getting a sound base before I burn up my motivation is key 🙂.

Anyways, I'll keep experimenting as I have time and then pare down to what I think is most useful once it seems the obvious issues are resolved.

@facelessuser
Collaborator

facelessuser commented Aug 5, 2022

I've found something particular about list handling that won't allow:

- ::::{admonition} This is really important!
  ---
  type: warning
  class: some-class
  ---

It gets all messed up and turns them to hr blocks and such. This doesn't happen outside of lists.

I'm going to prototype with ~~~ for now:

- ::::{admonition} This is really important!
  ~~~
  type: warning
  class: some-class
  ~~~

@facelessuser
Collaborator

I forgot that ~~~ is an alternate code format 🤦🏻 . But it seems I can get away with two (--) which is not quite what I'm looking for, but will allow me to finish making sure the logic is okay. Somehow hr is getting a hold of it before we can get it under a list.

This is somewhat surprising as I would expect that hr wouldn't touch the content of a list until we've started processing the content as list blocks, but I guess maybe it preprocesses the blocks before lists...

@facelessuser
Collaborator

Yep, hr is processed before lists, this is likely to catch - - - before lists as I think --- shouldn't trigger lists. Sigh, well that limits using --- unless someone has a clever workaround.

@facelessuser
Collaborator

If we really wanted to use --- Python Markdown would need to split up HR handling I think. Handle loose HR - - - prior to lists and tight --- after lists. I think that is the only way. I'll just use -- for now as it sits safely between list and HR.
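The loose versus tight distinction could be sketched like this. The two patterns are assumptions made for illustration and are not Python Markdown's actual HR regular expressions; mixed forms such as `-- -` are deliberately left unclassified.

```python
import re

# Assumed patterns for illustration; these are not Python Markdown's actual HR regexes.
TIGHT_HR = re.compile(r'^ {0,3}-{3,}[ ]*$')
LOOSE_HR = re.compile(r'^ {0,3}-(?:[ ]{1,2}-){2,}[ ]*$')


def classify_hr(line):
    """Return 'tight', 'loose', or None for a single line."""
    if TIGHT_HR.match(line):
        return 'tight'
    if LOOSE_HR.match(line):
        return 'loose'
    return None
```

With a split like this, a processor for the loose form could run before lists while the tight form is deferred until after list handling, and `--` matches neither.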

@facelessuser
Collaborator

I have an experimental branch up here: facelessuser/pymdown-extensions#1777

I ended up monkey patching HR so we could use the --- format. Nothing we are doing is set in stone. Ideally, I'd like Markdown to make the HR change as I don't currently see a reason not to, but worst case, we could always use --.

Feel free to try it out.

@facelessuser
Collaborator

facelessuser commented Aug 6, 2022

Just an overall description of how the current prototype is laid out. I think I'm ready to discuss the syntax.

:::{directive-name} arguments
---
option1: value
option2: value
---

content
:::
  1. Fencing of blocks requires 3 or more :. If nesting blocks, the outer block should have a greater length of : than the child blocks. Currently, if a closing fence is encountered that is greater than or equal to the starting fence, it is sufficient to close the current fence.

  2. directive-name is currently the name of whatever directive you are using. It is case insensitive.

  3. arguments can be one or many arguments, and the given directive would specify required delimiters if required.

  4. The optional frontmatter containing options is a YAML-ish block that comes immediately after the header, or better put, is part of the header when specified. We currently went with YAML-ish to avoid pulling in PyYAML as a requirement. I'm kind of going with simpler is better right now.

    Due to limitations of the Python Markdown parser, it must be a "tight" config (no empty new lines). This also means there should be no newline after the initial header line.

    The key value pairs are similar to the meta extension (though I haven't yet made them multi-line, was considering this).

    The directive would impose any specific rules that these options require: delimiter split list, etc.

  5. content: any valid content can be used, even nested directives. There does not have to be a separation between the header and the content, but can be if desired.
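Rules 1 and 2 above can be sketched with two regular expressions. The patterns and the helper names `parse_header` and `closes` are assumptions for illustration, not the prototype's actual code.

```python
import re

# Assumed header pattern for the prototype syntax: `:::{name} arguments`.
HEADER = re.compile(r'^(?P<fence>:{3,})\{(?P<name>[\w-]+)\}[ ]*(?P<args>.*)$')
FENCE_END = re.compile(r'^(?P<fence>:{3,})[ ]*$')


def parse_header(line):
    """Return (fence length, lowercased directive name, arguments) or None."""
    m = HEADER.match(line)
    if m is None:
        return None
    # Directive names are case insensitive, so normalize to lower case.
    return len(m.group('fence')), m.group('name').lower(), m.group('args').strip()


def closes(open_length, line):
    """A closing fence must be at least as long as the fence that opened the block."""
    m = FENCE_END.match(line)
    return m is not None and len(m.group('fence')) >= open_length
```

Because a shorter fence never closes a longer one, an inner `:::` block can end without terminating the `::::` block that contains it.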

Examples

Note with no content

:::{warning} All we need is a title!
:::

:::{warning} This is important!
Read this notice.
:::

:::{warning}
Arguments are optional with some directives!
:::

:::{figure} some/image.png
---
width: 300
---

I'm a figure caption, and you can specify optional settings for your directives as well!
:::

::::{tip} Here's a secret

:::{note}
You can nest directives as well
:::

::::

@facelessuser
Collaborator

I did end up allowing the other format for options as well.

:::{admonition} Title
:class: note

This is a note
:::

I had to add some intelligence so it knew how to handle insertion into different kinds of parents: spans, blocks, etc. I also switched to just using PyYAML for option parsing for now.

Anyways, this is all of course assuming we decide to keep the MyST approach, but I think we are modeling the format as well as Python Markdown can. I may hold off and wait and see if I get some feedback on the format and such before plowing forward much more.

It ended up being a surprisingly more complex endeavor than I thought at first, but I think the hard part is over. Even if we completely rework everything, I think generally the flow is working.

@pawamoy
Contributor

pawamoy commented Aug 27, 2022

Hello everyone, thank you for the work you've done on this subject, lots of very interesting thoughts and ideas. That was pleasant to read.

I am absolutely in favor of a general purpose block extension. I actually thought about it before myself (not as much as you did though), in response to reStructuredText-lovers saying it is superior because (notably) of its powerful and extensible directives, whereas markdown is just markup. And I thought to myself: "well, isn't .. directive just markup as well? so nothing prevents us from implementing something similar for markdown?"

Anyway, about your suggestions: I think I like the /// one better. One could think it's because I don't want ::: to conflict with mkdocstrings' own extension, but really it's just visually more appealing to me. I wouldn't mind at all porting mkdocstrings' extension to this general purpose block extension, whether ::: or /// is chosen. Obviously it will be easier to migrate if /// is selected since we'll be able to support both extensions during the transition.

The pipe separator | also looks nice to me.

I think YAML for options is a good choice as well, as it allows for complex configuration. I'm quite happy with YAML in mkdocstrings, though it definitely needs better validation and error reporting (wrong option name, invalid type, missing item, invalid value, etc.). I wonder if you had envisioned something about these topics, if YAML was to be approved.

@squidfunk

squidfunk commented Aug 27, 2022

Thanks for pushing this forward @facelessuser and @waylan! I really like the idea of a general purpose block extension. How easy or complex would it be to define a new 'directive' and implement a handler for it? I'm asking because this would open up the possibility for Material for MkDocs to provide more complex components.

I'm okay with the syntax, yet I don't know whether typing /// and then | feels natural to authors, given that keyboard layouts might be different (at least they are on OSX and Windows). Since those are two different characters, it might feel sluggish. Furthermore, users might confuse | for / and vice versa, since both characters are quite similar. I'm not saying we shouldn't use it, I'm just thinking out loud.

Having a fully featured YAML parser (maybe with support for !ENV and the other stuff that's possible in mkdocs.yml) would be amazing, since we could now use proper data types like boolean, number etc.

Question: what would generic blocks for details and content tabs look like? I've seen admonitions in this issue, but details and content tabs have slightly different semantics.

@facelessuser
Collaborator

I think YAML for options is a good choice as well, as it allows for complex configuration. I'm quite happy with YAML in mkdocstrings, though it definitely needs better validation and error reporting (wrong option name, invalid type, missing item, invalid value, etc.). I wonder if you had envisioned something about these topics, if YAML was to be approved.

Currently, each option requires you to spec them. Each type can be validated and/or normalized as well. For example:

class Details(Block):
    """Details."""

    NAME = 'details'

    ARGUMENTS = {'optional': 1}
    OPTIONS = {
        'open': [False, type_boolean],
        'type': ['', type_class]
    }

Here we require open to be a boolean: if open is provided and is not a bool, type_boolean fails the processing of the block. If open is not provided, the default of False is used.

type, on the other hand, is parsed as a string, specifically a class. If type is not a string, block parsing fails. If it is a string, we ensure it is a single class and escape quotes automatically so it is ready to be used as an HTML attribute value.

"Failing" just means the block will not be processed as a generic block and is unrecognized.
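As a rough sketch of how such a spec could be applied: the names `type_boolean` and `type_class` come from the example above, but their bodies here are guesses, and `validate_options` (including its rejection of unknown keys) is a hypothetical helper rather than the branch's actual code.

```python
import re
from html import escape


def type_boolean(value):
    """Accept only real booleans; anything else is a validation error."""
    if not isinstance(value, bool):
        raise ValueError(f'Expected a boolean, got {value!r}')
    return value


def type_class(value):
    """Accept a string holding a single class name and escape it for HTML."""
    if not isinstance(value, str):
        raise ValueError(f'Expected a string, got {value!r}')
    if re.search(r'\s', value.strip()):
        raise ValueError(f'Expected a single class, got {value!r}')
    return escape(value.strip(), quote=True)


def validate_options(spec, given):
    """Return normalized options, or None if the block should be left unrecognized."""
    # Start from the defaults declared in the spec.
    result = {key: default for key, (default, _) in spec.items()}
    for key, value in given.items():
        if key not in spec:
            return None  # assumption: unknown options fail the block
        try:
            result[key] = spec[key][1](value)
        except ValueError:
            return None
    return result


OPTIONS = {'open': [False, type_boolean], 'type': ['', type_class]}
```

Returning None here stands in for "failing": the caller would leave the text alone instead of treating it as a generic block.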

I'm okay with the syntax, yet I don't know whether typing /// and then | feels natural to authors, given that keyboard layouts might be different (at least they are on OSX and Windows). Since those are two different characters, it might feel sluggish. Furthermore, users might confuse | for / and vice versa, since both characters are quite similar. I'm not saying we shouldn't use it, I'm just thinking out loud.

I'm open to discussing this further if some alternatives are suggested. Right now, the experimental branch is using /// name | arguments.

Having a fully featured YAML parser (maybe with support for !ENV and the other stuff that's possible in mkdocs.yml) would be amazing, since we could now use proper data types like boolean, number etc.

I'm not quite sure what you mean by supporting !ENV, but it will support basic types. Currently, we will restrict YAML parameters to be parsed in PyYAML using safe_load. As YAML in this sense is being used directly in syntax, I'm not sure we'd like to allow arbitrary functions, classes, and other potentially dangerous types to be loaded, but if this is a strong request, we might make them optional with an "unsafe" mode. My plan is not to ship such functionality with the initial release though.

Question: what would generic blocks for details and content tabs look like? I've seen admonitions in this issue, but details and content tabs have slightly different semantics.

Currently, details will look like:

/// details | My summary
type: warning

content
///

Tabs will look like:

/// tab | My title

content
///

You'll notice that content must be separated from the header by a blank line even if no options are specified. This is because YAML fences are optional. We have to distinguish the YAML content from the block's content. If we required YAML fences (---), then such a requirement would not be needed. It has been argued though that YAML fences are awkward, so they are currently recognized, but are optional.

/// details | My summary
---
type: warning
---

content
///
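The blank-line rule can be illustrated with a small sketch. `split_block` is a hypothetical helper; it assumes the lines have already had trailing whitespace stripped, and it uses a naive `key: value` split where the experimental branch uses PyYAML.

```python
def split_block(lines):
    """Split the lines after a `/// name | args` header into (options, content)."""
    # Everything up to the first blank line is treated as the option section.
    try:
        blank = lines.index('')
    except ValueError:
        blank = len(lines)
    head, content = lines[:blank], lines[blank + 1:]
    # Optional YAML fences around the options are dropped.
    if len(head) >= 2 and head[0] == '---' and head[-1] == '---':
        head = head[1:-1]
    options = {}
    for line in head:
        # Naive `key: value` split; real YAML parsing is out of scope here.
        key, _, value = line.partition(':')
        options[key.strip()] = value.strip()
    return options, content
```

Without the blank line, the first line of content would land in the option section, which is why the separator is needed once the fences become optional.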

@facelessuser
Collaborator

I have noticed some odd cases if blocks that implement Admonitions, Details, and Tabs are run in conjunction with legacy Admonitions, Details, and Tabbed. As they both share the same classes and such, the legacy extensions can sometimes try and hijack child blocks under the new generic block versions as the HTML output is identical, and that is how they identify child content that should be handled. So, in general, I would not recommend mixing the legacy implementation and the new generic implementation.

The only mitigation to allow them to work together would be to make the output somewhat different between the two implementations so they would not confuse each other. They may also have to be made aware of the difference so they can purposely avoid grabbing each other's child blocks.

For now, we will document, for instance, that legacy Admonitions should not be used with Generic Block Admonitions and simply state that issues may arise if they are both used together. This would go for the PyMdown Extensions Details and Tabbed extensions as well.

I don't think it is worth complicating things to make them work together. I really don't want to break the ability to use these new blocks as drop-in replacements, as that would affect many documentation themes, which would then have to target new HTML and classes. I think simply documenting that using both legacy and new approaches can cause issues is enough. If people use them both together, I guess they'll get what they get 🙂.

@facelessuser
Collaborator

As a side note, this exercise has helped me realize the two things that hold us back from implementing fenced code blocks as BlockProcessors instead of PreProcessors:

  1. Handling of things like --- is a problem. There would need to be some intelligence as to when to handle such things to ensure they made it through the code blocks untouched. I don't know if this could just be handled by some context state that would cause these to be avoided or not. Indented code blocks avoid this issue as they are, well, indented. Indented lines won't trigger these getting handled before the code block can process them. Fenced code blocks are not indented, and would have trouble in this case. All other blocks run into the possibility of conflicts with lists or hr tags.

  2. Lists seem to strip out extra newlines. This is currently a problem for even indented code blocks in lists and would affect fenced code blocks done as BlockProcessors as well. Currently, if in a list, even indented code blocks are affected by this. An indented code block does not have the ability to retain multiple blank lines in Python Markdown. This makes PreProcessor fenced code blocks (that work in lists) superior in this sense. If there was a way to not strip these out and pass these through, this would no longer be a problem.

As far as I am aware, these are the only issues that prevent code blocks from being moved to BlockProcessors vs PreProcessors.

@waylan
Member Author

waylan commented Aug 29, 2022

How easy or complex would it be to define a new 'directive' and implement a handler for it? I'm asking because this would open up the possibility for Material for MkDocs to provide more complex components.

@squidfunk this has been a concern of mine as well, which was discussed in some detail above. At this time we are concentrating on getting the underlying system working so all new block types must be defined using Python code. However, it would be possible to define such types as third party extensions. There is no need to have them built into the base extension itself. In fact, one could develop a third party extension which is built upon the base we are building and provides some simpler process for defining new types (perhaps with HTML templates or similar). Maybe, at some future point, such a system could even be merged with the base extension. However, that is not our focus at this time.

@facelessuser
Collaborator

facelessuser commented Aug 29, 2022

Yeah, I'm finishing up testing and bug fixing. I think the syntax is settled (for now unless more discussion arises). Once it is all working, I'll probably write up the API and let people pick it apart.

Template blocks were discussed and could be done in the future, but we would first need a discussion of flexibility vs simplicity. You can make them really powerful, but they would potentially become a complex component or would pull in a few more dependencies to leverage the needed power from 3rd party libraries.

You'd have to define your templating syntax on top of the existing syntax, maybe leverage Jinja2, who knows. Or you can make it fairly dumb - easy to use, no additional dependencies, and not as powerful.

@squidfunk

@waylan sounds great! I was asking for a 'Hello World' example, as I'm happy to get into the extension writer perspective. My Python-fu is okayish, but not great. A good example helps me understand the semantics.

As an example, I've tried to understand how Markdown parsing in this library is done and find it to be very complex. I really hope that the extension mechanism is easy to use. If you have a 'Hello World' example I can test-drive, happy to do so.

@facelessuser
Collaborator

facelessuser commented Aug 29, 2022

@squidfunk You can take a look at the branch. Two fairly simple cases:

The main class is here, and I'm happy to discuss why I've exposed what pieces. I think all the expected override pieces have hooks in the form on_<whatever>.

I can create a super simple example later if needed.

@facelessuser
Collaborator

Keep in mind that I'm still trying to identify overlooked corner cases, but generally, everything seems to be working.

@squidfunk

Looks good!

@facelessuser
Collaborator

I am still moving forward with this, but I'm finding that I like the requirement that block content be preceded by a blank line less and less.

/// test

content
///

I really just want to make compact blocks at times:

/// test
content
///

I think blocks are a bit more readable when compact (when possible), especially when you have a number of blocks or nested blocks.
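To make the two styles concrete, here is a minimal sketch of a parser that accepts both the spaced and the compact form and treats them as equivalent. It is not the extension's actual implementation: `HEADER` and `parse_blocks` are hypothetical names, nesting is ignored, and only top-level blocks are handled.

```python
import re

# Hypothetical header pattern: "/// name" opens a block.
HEADER = re.compile(r"^///\s*([\w-]+)")


def parse_blocks(text):
    """Return (name, content) pairs for top-level /// blocks (no nesting)."""
    blocks = []
    name = None
    body = []
    for line in text.splitlines():
        if name is None:
            match = HEADER.match(line)
            if match:
                # Opening fence: remember the block name and start collecting.
                name = match.group(1)
                body = []
        elif line.strip() == "///":
            # Closing fence: drop blank padding so both styles compare equal.
            blocks.append((name, "\n".join(body).strip("\n")))
            name = None
        else:
            body.append(line)
    return blocks


spaced = "/// test\n\ncontent\n///"
compact = "/// test\ncontent\n///"
```

With this toy parser, `parse_blocks(spaced)` and `parse_blocks(compact)` return the same result, which is the behavior the compact style would need.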

If people strongly dislike YAML fences (---), maybe I can add a general option to make them optional (or not optional, depending on what the default is).

I didn't think it would bother me as much as it does, but as time has gone on, I think it really does bother me 😅.

@facelessuser
Collaborator

I've implemented mandatory YAML fences on the branch. It is currently under a feature switch. I thought at first that I'd like the requirement to not have the fences, but actually using the blocks, I think I find the requirement for a new line before content also cumbersome. I'd be interested to see what people actually prefer when using these in the real world, not just looking at the syntax.

I'm not sure yet if I would like it enabled or disabled by default. I guess we could always course correct if people in the real world greatly prefer one option over the other. It may be wise to release the blocks as an experimental extension in the hopes to get feedback.

Anyways, I now have testing in place. I'm probably going to go over the extension API a bit closer and see if we can remove some unnecessary events or improve existing events.

Lastly, I think we just need to clean up and decide what generic blocks we'll release by default.

@facelessuser
Collaborator

I may consider releasing this feature in a beta release. That way those who wish to try it can give feedback before we release it to the general public.

@waylan
Member Author

waylan commented Sep 27, 2022

I've implemented mandatory YAML fences on the branch.

My initial reaction is to dislike this, but I also understand your reason. Maybe with use I would feel the same way. Therefore, yes, I think a beta release may make sense.

@facelessuser
Collaborator

Keep in mind that mandatory YAML fences currently must be enabled; they aren't the default.

@vokimon

vokimon commented Nov 4, 2022

Just in case you don't know it, there is an extension, Custom Blocks, quite similar to what you are proposing here, but in my opinion (skewed, as I am the author) it uses a simpler syntax (no need for a closing tag, no need for a separator between the block name and the parameters...). It also has a more flexible mapping of parameters to Python function parameters, which makes defining custom blocks quite easy.

::: blocktype value1 "long value2" param3=value3 param4="long value4"
    Indented content which may contain any other blocks as long as they are **indented**.

By default it generates:

<div class="blocktype value1 long-value2" param3="value3" param4="long value4">
<p>Indented content which may contain any other blocks as long as they are <b>indented</b>.</p>
</div>

That is:

  • the type of the block as class
  • keyless values as classes
  • keyworded values as attributes
  • indented content reparsed as markdown

But you can customize it with a simple python function like this one:

def blocktype(ctx, param1, param2, param3, param4='default'):
     return E('.blocktype', dict(param1=param1), E('span.title', param2), ctx.parse(ctx.content))

To generate:

<div class="blocktype" param1="value1">
<div class="title">long value2</div>
<p>Indented content which may contain any other blocks as long as they are <b>indented</b>.</p>
</div>

  • keyworded attributes are assigned first to parameters by name
  • keyless values are then assigned by position to the rest of the parameters
  • the indented content is passed as ctx.content, and ctx.parser generates html with markdown. Depending on your block type you might not do that (source code, dot diagrams...)
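The assignment rules in the list above can be sketched with the standard library alone. This is an illustration of the idea (keyworded values by name first, then keyless values by position), not CustomBlocks' actual code: `bind_headline` is a hypothetical helper, and it assumes keyless values never contain `=`.

```python
import inspect
import shlex


def bind_headline(func, headline):
    """Map headline tokens onto func's parameters, skipping the first (ctx)."""
    tokens = shlex.split(headline)  # honours double-quoted values
    keyed = dict(t.split("=", 1) for t in tokens if "=" in t)
    keyless = [t for t in tokens if "=" not in t]
    names = list(inspect.signature(func).parameters)[1:]  # drop ctx

    # 1. Keyworded values are assigned to parameters by name.
    bound = {name: keyed[name] for name in names if name in keyed}
    # 2. Keyless values fill the remaining parameters by position.
    remaining = [name for name in names if name not in bound]
    bound.update(zip(remaining, keyless))
    return bound


def blocktype(ctx, param1, param2, param3, param4="default"):
    # Stand-in generator: return the values instead of building HTML.
    return (param1, param2, param3, param4)


args = bind_headline(
    blocktype, 'value1 "long value2" param3=value3 param4="long value4"'
)
result = blocktype(None, **args)
```

Here `param3` and `param4` are bound by name, so `value1` and `long value2` fall through to `param1` and `param2`.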

@facelessuser
Collaborator

One of the things we are trying to eliminate is the indentation. This requires an end marker. I've had complaints from others (and myself) about how many syntax highlighters style indentations in Markdown as code blocks (if they aren't lists). Avoiding indentation solves this and was mentioned earlier in the thread.

@facelessuser
Collaborator

I'll go a little further to say, the intent is not to invalidate work others have done, and if an existing generic block satisfies your needs, I'd say that maybe what we are doing here won't make a difference to you.

With that said, I've tried to put a lot of thought into the generic blocks we are currently implementing.

  • They won't require indentation which many have complained about before.
  • They should be flexible allowing users to specify if they should treat the content as a block, span, or even raw text which they should preserve through the markdown process.
  • They should allow you to tap into various events to make things a bit easier. Do you need to process the content after the entire content under a block has been parsed? There's an on_end event for that. Does your generic block need to parse the content but then run a treeprocessor on it later, maybe after some other extension? You should be able to use the on_register event to add a tree processor in conjunction.

While there are other solutions that may be sufficient for others, I am hoping that some will find what we are trying to do here a welcome addition. I'm trying to involve users of Python Markdown to give their input to try and make this a useful solution moving forward. I am hoping to have a pre-release soon to start allowing people to try it out. It has definitely been more involved to write this extension than I initially thought, so work near the end has slowed as other obligations in life have crept in, but I think we are very close to an alpha release. Hopefully, with feedback, we can polish any rough edges and get something useful out.

@vokimon

vokimon commented Nov 6, 2022

It was just a heads up. Long live open source... and duplicated efforts :-) At least I hope you could take a look at it, to learn from its hits and failures. How easy is to add a new kind of block (or span or blockquote or whatever). How indentation or the lack of it may affect readability on nesting. Also the unsolved problem of integrating fenced code because of its pre and post processing hacks. Let me also suggest taking a look at how it uses introspection to map block parameters to the user's generator functions; if you take a similar approach, you may even reuse the builtin generators CustomBlocks already has with your unindented syntax.

Keep on.

@facelessuser
Collaborator

How easy is to add a new kind of block (or span or blockquote or whatever).

I guess I should give some background. I am not new to implementing plugins in the Python Markdown ecosystem. I support an extremely popular set of extensions. You are even using them in your own documentation 🙂 . I am also a member of the Python Markdown team and have helped restructure large parts of the code and implement many features. I have a pretty intimate knowledge of the inner workings of Python Markdown.

How indentation or the lack of it may affect readability on nesting

I am familiar with the tradeoff that would be happening here. I support a number of plugins that employ this nesting technique already: Details and Tabbed, for instance.

Indented blocks can be more readable, but also pose a frustration for standard Markdown highlighters in editors, etc. Additionally, I do believe there are some issues with md_in_html and indented extensions, or at least this existed in the past. Additionally, some people also don't like using indented code blocks except in lists. I don't think there is a right or wrong answer here, but this is a well-understood tradeoff.

Could we allow some flexibility and let users choose whether they'd like to still use indentation? That is a possibility, but it is not currently in the plans.

Also the unsolved problem of integrating fenced code because of its pre and post processing hacks.

I'm not quite sure what you are referring to here. I've been down the road and have implemented complex plugins using fenced code and am all too familiar with the process. I am the author of SuperFences. Yes, SuperFences used pre and post-processing to implement fenced code and is quite complicated because of it, but there were specific reasons as to why we had to use pre and post-processors. I've also spent a lot of time thinking about how to implement fenced code without using pre and post-processing and have successfully pulled it off with this new generic block extension.

I should be clear and say I already have a working prototype for generic blocks. This isn't theoretical. It successfully allows for fenced code blocks using the standard block processor and no pre and post-processors. It can handle implementing the content as blocks, spans, or even preserved raw data. The only weakness or drawback is that Python Markdown will not retain more than one new line between blocks, which makes this approach unsuitable for code blocks where retaining blank lines is important, but is fine for standard Markdown or preserving raw content where the number of new lines is not really an issue.


I am open to dialogue about any concerns or ideas for the new plugin. I will be looking for feedback when I release this as an alpha. I am interested in whether there are people who truly do prefer a more indented style. It would be interesting to see if those people are a majority or a minority.

I do understand that you may not be interested as you have your own plugin, and that is fine. Our goals and perspectives when approaching this may be different, and there is nothing wrong with that. With that said, if you have specific concerns about the current approach we are heading in and would like to have a broader discussion or suggestions to build upon what we are trying, I'd be happy to delve further into such a conversation.

@vokimon

vokimon commented Nov 6, 2022

I know I am not talking to a kiddie. ;-) Apologies if I said something that could be interpreted in the opposite sense.

Forget code blocks. It was just an example of a fail with CustomBlocks. Maybe you could help but OT, nevermind. Sorry.

The final comment referred to the following: CustomBlocks decouples syntax parsing from html generation. The parser and all the markdown_py extension framework details are dealt by the extension. User extends custom blocks dealing just with html generation, defining a function which receives the parameters (in a headline, in a yaml block, whatever syntax), the content (marked, indented, whatever) and spits html. My point is that if you are getting to a similar abstraction, even if we are using different markdown syntax, maybe the generators could be shared somehow. That was the idea i wanted to share if you want to get it.

@facelessuser
Collaborator

I know I am not talking to a kiddie. ;-) Apologies if I said something that could be interpreted in the opposite sense.

No worries. I have no idea how much you are aware of what I'm familiar with.

Forget code blocks. It was just an example of a fail (vokimon/markdown-customblocks#1) with CustomBlocks. Maybe you could help but OT, nevermind. Sorry.

If you help me understand what the issue is, I may potentially be able to help 🤷🏻. Though, you'd probably have to ping me over there to keep things on topic here.

My point is that if you are getting to a similar abstraction, even if we are using different markdown syntax, maybe the generators could be shared somehow. That was the idea i wanted to share if you want to get it.

I don't know much of the approach that CustomBlocks uses as I have not looked yet. I can explain what I am doing in the current Block extension.

The idea is to have a single generic format for the blocks themselves. The base extension handles finding the generic block syntax, processing it, and feeding the content of the block through the Markdown parser in the appropriate way. If the block declares that the content should be parsed as block content, then all content under it is processed as a block. If the content is declared as a span, then it will be parsed as if the content is inline. If declared as raw content, it will be preserved under the wrapper element(s).

Blocks are specified by registering a "Block" object with the main Blocks extension. These objects declare any allowed global options, block-specific options, etc. They provide the parent wrapper element via the on_create event and specify how the content should be handled via the on_markdown event. If desired, after the block has been parsed, an on_end event can be used to do any further processing, rearranging of elements, etc. There are a few more events for handling some specific scenarios, but you get the idea.
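As a rough illustration of that lifecycle, the following toy model calls the three events in order. It only borrows the event names on_create, on_markdown, and on_end from the description above; `ToyBlock`, `ToyNote`, and `render` are hypothetical, the signatures are assumptions, and real Markdown parsing is replaced by trivial stand-ins.

```python
import xml.etree.ElementTree as etree


class ToyBlock:
    """Hypothetical base class; not the real Blocks API."""

    NAME = ""

    def on_create(self, parent):
        """Create and return the wrapper element under parent."""
        raise NotImplementedError

    def on_markdown(self):
        """Declare how content is handled: 'block', 'span', or 'raw'."""
        return "block"

    def on_end(self, element):
        """Optional post-processing once the content has been attached."""


class ToyNote(ToyBlock):
    NAME = "note"

    def on_create(self, parent):
        return etree.SubElement(parent, "div", {"class": "note"})

    def on_markdown(self):
        return "span"

    def on_end(self, element):
        element.set("data-done", "true")


def render(block, content, parent):
    """Drive one block through create -> content handling -> end."""
    element = block.on_create(parent)
    if block.on_markdown() == "block":
        # Stand-in for block parsing: one <p> per blank-line-separated chunk.
        for chunk in content.split("\n\n"):
            etree.SubElement(element, "p").text = chunk
    else:
        # Stand-in for both span and raw handling; a real implementation
        # would parse inline syntax for spans and preserve raw text untouched.
        element.text = content
    block.on_end(element)
    return element


root = etree.Element("div")
note = render(ToyNote(), "hello", root)
```

The point of the split is that the driver owns the generic syntax and ordering, while each block only supplies its wrapper element, its content mode, and any final cleanup.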

Syntax parsing is separate in this sense: the block syntax is generic and handled by the base extension, but you can handle the content however you wish. You could use an on_register event to register traditional Markdown extensions that target your special blocks alongside the generic block. For instance, I have a tabbed container created with generic blocks, but we generate ID slugs for each tab. To make sure we do not conflict with Header slugs, we register a Treeprocessor alongside the Tab block so that we can slugify tab IDs after Header slugs are handled via the TOC extension. This way we can avoid conflicting IDs. Additionally, you could preserve the content as raw text and have a special parser to handle the content during the on_end event. It's flexible to try and allow for some less straightforward block ideas. I'm sure there will be some limitations, but I'm trying to be as flexible as possible.

Once I get the alpha out, I'm more than happy to have people give criticism and/or suggestions.

@facelessuser
Collaborator

I have released an alpha containing the current changes: https://pypi.org/project/pymdown-extensions/#history.

Please provide feedback in this discussion: facelessuser/pymdown-extensions#1868

The main feedback I am looking for now is on syntax.

@squidfunk

Maybe I have overlooked it, but is there a simple "hello world" example that we can try which allows me to define a custom block? I'd be primarily interested to test this from an extension author perspective.

@facelessuser
Collaborator

@squidfunk I am purposely not focusing on the API right now as I want to nail down syntax first. If you really want to jump into the API, I think the Admonition block is a good template to get started. I would just ignore the on_register and on_parse events as those are used to handle dynamic type generation for the block and are more advanced.

At the very least you need an on_create event.

  • NAME must be a unique name to avoid conflicts with other registered blocks.
  • ARGUMENTS is used to specify what some may think of as the "title" input: /// block | argument.
  • OPTIONS specifies block-specific inputs that can be given via a YAML block in the header. Regardless of the block, the attributes option is always special and allows adding extra classes, ID, or whatever else is desired, so that option name is kind of reserved.
  • CONFIG is used to specify any global config settings for the block. These are done when registering your block and only then.
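To show how a unique NAME and the `/// block | argument` header could fit together, here is a small self-contained sketch. It is a toy model, not the alpha's API: `ToyBlock`, `REGISTRY`, `register`, and `parse_header` are hypothetical, ARGUMENTS is simplified to a boolean, and the YAML-based OPTIONS and the CONFIG settings are left out entirely.

```python
import re

# "/// name" optionally followed by "| argument" (the "title"-like input).
HEADER = re.compile(
    r"^///\s*(?P<name>[\w-]+)\s*(?:\|\s*(?P<argument>.*\S))?\s*$"
)


class ToyBlock:
    """Hypothetical block description; the real attributes may differ."""

    NAME = ""
    ARGUMENTS = False  # assumption: whether "| argument" is accepted


class ToyAdmonition(ToyBlock):
    NAME = "admonition"
    ARGUMENTS = True


REGISTRY = {}


def register(block_cls):
    """Register a block type, refusing duplicate names."""
    if block_cls.NAME in REGISTRY:
        raise ValueError(f"block name already registered: {block_cls.NAME}")
    REGISTRY[block_cls.NAME] = block_cls


def parse_header(line):
    """Return (name, argument) for a known block header, else None."""
    match = HEADER.match(line)
    if not match or match.group("name") not in REGISTRY:
        return None
    block_cls = REGISTRY[match.group("name")]
    argument = match.group("argument")
    if argument and not block_cls.ARGUMENTS:
        return None  # this block type does not accept an argument
    return block_cls.NAME, argument


register(ToyAdmonition)
parsed = parse_header("/// admonition | Some title")
```

Headers for unregistered names are ignored rather than raising, so unknown `///` lines would simply fall through to normal Markdown handling in this model.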

If you have general feedback, please provide it at the alpha's discussion topic here: facelessuser/pymdown-extensions#1868

If you have additional questions related to the API, please create a separate discussion thread on the repo. I'm happy to answer them, but the API is not the primary focus of the first alpha and could change before I document it for a future alpha. I'd like to keep the main focus, at least in the alpha release topic, on nailing down the syntax.

@matchavez

Well, it may be time to update all of this.

GitHub has implemented their Admonition style, and it's a Blockquote fallback.

Maybe implement the new style for specific admonitions (presumed), and then leave the !!! for an unstructured block as suggested, that continues to render as it does.

>[!NOTE]
!!! - follow the Material for MkDocs style for custom blocks
??? - use as a "Spoiler" indicator, with the same unstructured block hidden by default.

Here's the current supported array:

Note

Highlights information that users should take into account, even when skimming.

Tip

Optional information to help a user be more successful.

Important

Crucial information necessary for users to succeed.

Warning

Critical content demanding immediate user attention due to potential risks.

Caution

Negative potential consequences of an action.

> [!NOTE]  
> Highlights information that users should take into account, even when skimming.

> [!TIP]
> Optional information to help a user be more successful.

> [!IMPORTANT]  
> Crucial information necessary for users to succeed.

> [!WARNING]  
> Critical content demanding immediate user attention due to potential risks.

> [!CAUTION]
> Negative potential consequences of an action.
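For reference, recognizing this syntax takes very little code. The sketch below only matches the five upper-case markers listed above; `ALERT` and `parse_alert` are hypothetical names, and it makes no claim about how GitHub itself parses these blocks.

```python
import re

# Marker line: ">" then "[!TYPE]" for one of the five listed types.
ALERT = re.compile(r"^>\s*\[!(NOTE|TIP|IMPORTANT|WARNING|CAUTION)\]\s*$")


def parse_alert(lines):
    """Return (kind, body) if lines form an alert blockquote, else None."""
    if not lines:
        return None
    match = ALERT.match(lines[0])
    if not match:
        return None
    # Strip the leading ">" and one optional space from each body line.
    body = [re.sub(r"^>\s?", "", line) for line in lines[1:] if line.startswith(">")]
    return match.group(1).lower(), "\n".join(body)


example = parse_alert(["> [!NOTE]  ", "> Highlights information."])
```

Any other marker, such as `[!CUSTOM]`, is not matched, so the text would remain an ordinary blockquote.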

@oprypin
Contributor

oprypin commented Jan 22, 2024

@matchavez I think that syntax has zero ideological overlap with this particular thread. The GitHub syntax lets you achieve only these 5 exact blocks with absolutely zero leeway for extensibility (any deviation makes the syntax unrecognized by GitHub).

That said, I will advertise a thread where there is interest in making this work despite these limitations, though not as a general syntax.

@waylan
Member Author

waylan commented Jan 23, 2024

I'm closing this issue as the proposed extension was implemented some time ago as the Blocks extension of pymdown-extensions starting with version 9.10 (see also facelessuser/pymdown-extensions/discussions/1973).

I realize others might want something different, but the Blocks extension meets the needs I was trying to address here in my proposal.

@waylan waylan closed this as completed Jan 23, 2024