HTML block handling nested in indented blocks doesn't work properly #1096

Open
facelessuser opened this issue Jan 12, 2021 · 3 comments
Labels: bug (bug report), confirmed (confirmed bug report or approved feature request), someday-maybe (approved low-priority request)

Comments

@facelessuser
Collaborator

facelessuser commented Jan 12, 2021

The following example shows that raw HTML nested in an indented block is not handled properly when its content contains blank lines. Instead of being passed through as a single raw HTML block, it is treated as incomplete HTML fragments.

import markdown

print(f"Markdown: {markdown.__version__}")

print("\n------ Results ------\n")

content = r'''
!!! note "Admonition"
    <div>
    Some text

    Some more text
    </div>
'''

print(markdown.markdown(content, extensions=['markdown.extensions.admonition']))

Output

Markdown: 3.3.3

------ Results ------

<div class="admonition note">
<p class="admonition-title">Admonition</p>
<p><div>
Some text</p>
<p>Some more text
</div></p>
</div>
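
The two `<p>` elements in the output line up with the two chunks you get when the nested content is split on its blank line. The following is only a minimal sketch of that effect using the standard library, not Python-Markdown's actual code; `split_blocks` and `is_complete_element` are hypothetical helpers, and the content is assumed to have already had its four-space nesting indent removed.

```python
# Toy illustration only: these helpers are hypothetical and are not
# part of Python-Markdown.

# The admonition body from the example above, with the nesting indent removed.
content = "<div>\nSome text\n\nSome more text\n</div>"


def split_blocks(text):
    """Split text into chunks wherever a blank line occurs."""
    return text.split("\n\n")


def is_complete_element(block, tag="div"):
    """Crude check: does the chunk open and close the given tag?"""
    stripped = block.strip()
    return stripped.startswith(f"<{tag}") and stripped.endswith(f"</{tag}>")


blocks = split_blocks(content)
for block in blocks:
    print(repr(block), "complete:", is_complete_element(block))
```

Neither chunk contains both the opening and the closing tag, so a parser that handles each chunk in isolation sees two HTML fragments rather than one `<div>` element.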
@facelessuser added the "bug" label on Jan 12, 2021
@facelessuser
Collaborator Author

Updated results using the latest released Markdown.

@waylan
Member

waylan commented Jan 12, 2021

So, I based the current behavior on the Markdown syntax rules, which state:

The only restrictions are that ... the start and end tags of the block should not be indented with tabs or spaces.

Of course, the reference implementation does not support admonitions, but nested lists are indented. And according to Babelmark, the reference implementation parses indented raw HTML blocks as raw HTML blocks. ☹️ Not what I was expecting. I really thought the reference implementation matched our behavior here. In fact I even added tests for our behavior. 😠

In any event, so long as we are parsing raw HTML in a preprocessor, this is what we will get. The parser is very strict about requiring no indentation (even a single space is not allowed). We would need to switch to a block processor, which would strip the indentation in the appropriate cases (when nested) before parsing as raw HTML. In the early commits to the original PR I was using a block processor, but I reverted to a preprocessor because I was encountering too many obstacles with the way the block parser splits the source on blank lines.
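
As a rough illustration of the "strip the indentation before parsing as raw HTML" idea, here is a minimal standard-library sketch. It is not how Python-Markdown is implemented: `strip_nesting_indent` and `tags_at_column_zero` are hypothetical helpers, the four-space nesting width is an assumption, and the column-zero check is only a crude stand-in for the strict "no indentation" rule described above.

```python
# Hypothetical helpers for illustration; not part of Python-Markdown's API.

def strip_nesting_indent(text: str, width: int = 4) -> str:
    """Remove one nesting level (`width` spaces) from each indented line."""
    prefix = " " * width
    lines = []
    for line in text.split("\n"):
        if line.startswith(prefix):
            lines.append(line[width:])   # drop exactly one level of indent
        elif not line.strip():
            lines.append("")             # normalize blank lines
        else:
            lines.append(line)           # not nested; leave untouched
    return "\n".join(lines)


def tags_at_column_zero(text: str) -> bool:
    """Crude stand-in for the rule that start and end tags are not indented."""
    non_blank = [line for line in text.split("\n") if line.strip()]
    return (
        bool(non_blank)
        and non_blank[0].startswith("<")
        and non_blank[-1].startswith("</")
    )


nested = "    <div>\n    Some text\n\n    Some more text\n    </div>"
flat = strip_nesting_indent(nested)

print(tags_at_column_zero(nested))  # the indented block fails the strict check
print(tags_at_column_zero(flat))    # the dedented block passes it
print(flat)
```

Dedenting only makes the tags eligible for raw HTML handling. The blank line inside the `<div>` is still there, so a processor working on blank-line-separated chunks would still need to gather everything up to the closing tag.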

@facelessuser
Collaborator Author

Ugh, I see this is broken in lists as well. For some reason, I was assuming this to be an Admonition-specific issue 😦. And I guess this was the behavior of Python Markdown even before the rewrite. I guess I just never stumbled on this until now.

@waylan added the "confirmed" label on Nov 3, 2021
@waylan changed the title from "HTML block handling in Admonitions doesn't work properly" to "HTML block handling nested in indented blocks doesn't work properly" on Nov 3, 2021
@waylan added the "someday-maybe" label on Nov 3, 2021