Document side-effects of renderers' initialisation #56

Rogdham · 2018-08-31T09:45:02Z

Hello, this is possibly an issue concerning the doc and not the code.

Parsing outside of the renderer's context manager:

from mistletoe import Document, HTMLRenderer

d = Document('a <b> c')
with HTMLRenderer() as r:
    print(r.render(d))  # <p>a &lt;b&gt; c</p>

Parsing inside of the renderer's context manager:

with HTMLRenderer() as r:
    d = Document('a <b> c')
    print(r.render(d))  # <p>a <b> c</p>

I'm not sure where the difference in output comes from. CommonMark requires the second output, though, which also seems to be what mistletoe.markdown and the mistletoe command line produce.

$ python -V
Python 3.7.0
$ pip freeze
mistletoe==0.7.1

Rogdham · 2018-08-31T09:57:02Z

It seems that in fact it is not the use of a context manager that changes things, but the initialisation of the HTMLRenderer before the call to Document:

HTMLRenderer()
d = Document('a <b> c')
with HTMLRenderer() as r:
    print(r.render(d))  # <p>a <b> c</p>

Rogdham · 2018-08-31T10:20:01Z

Ok so as to why this is happening, if I'm not mistaken:

HTMLRenderer.__init__ calls BaseRenderer.__init__ with (HTMLBlock, HTMLSpan) as extra
BaseRenderer.__init__ will then register HTMLBlock and HTMLSpan with e.g. block_token.add_token
so HTMLBlock and HTMLSpan will now be tokenized

And if the renderers are always called inside a context manager, it's all good since the __exit__ takes care of the cleaning up.
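The mechanism described above can be illustrated with a toy model. This is only a sketch, not mistletoe's actual code: `TOKEN_TYPES`, `parse`, `ToyRenderer` and `ToyHTMLRenderer` are hypothetical names, and the "parser" merely checks whether an `HTMLSpan` marker is currently registered.

```python
import html

# Hypothetical stand-in for the module-level token list the parser consults.
TOKEN_TYPES = ["RawText"]


def parse(text):
    """Toy parser: keep inline HTML raw only if "HTMLSpan" is registered now."""
    return text if "HTMLSpan" in TOKEN_TYPES else html.escape(text)


class ToyRenderer:
    def __init__(self, *extras):
        # Side effect: constructing a renderer mutates global parser state.
        self._extras = extras
        for token in extras:
            if token not in TOKEN_TYPES:
                TOKEN_TYPES.insert(0, token)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Cleanup only happens when the renderer is used as a context manager.
        for token in self._extras:
            if token in TOKEN_TYPES:
                TOKEN_TYPES.remove(token)
        return False

    def render(self, parsed):
        return "<p>{}</p>".format(parsed)


class ToyHTMLRenderer(ToyRenderer):
    def __init__(self):
        super().__init__("HTMLSpan")


# Parsed before any renderer exists: the tag is escaped.
before = parse("a <b> c")
with ToyHTMLRenderer() as r:
    out_before = r.render(before)
    # Parsed after the renderer was constructed: the tag is kept.
    out_inside = r.render(parse("a <b> c"))

# After __exit__, the registry is back to its initial state.
after = parse("a <b> c")
```

In this model, the output depends on when parsing happens relative to the renderer's construction, which mirrors the two results reported at the top of the thread.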

I suggest updating the README to always use the mistletoe.markdown function, and never call the renderers directly. This may avoid future confusion.

Moreover, this means that the ASTRenderer and LaTeXRenderer do not comply with CommonMark, since they do not register HTMLBlock and HTMLSpan. I guess it makes sense for LaTeXRenderer, but I'm not sure about ASTRenderer. Maybe add a note somewhere about that?

miyuchina · 2018-09-03T02:09:01Z

Hey there, thanks for this!

I agree with your points, and yes it is a bit confusing to have side-effects in renderer initializations. I'm actually considering getting rid of this behavior, and thus removing the necessity of using context managers entirely. Would that be a better solution? I'll also allow HTMLBlock and HTMLSpan to be tokenized by default in ASTRenderer.

I'll see if I can do so fast enough, but if not I'll add a note to the README.

Thanks again!

Rogdham · 2018-09-03T09:00:40Z

> I'm actually considering getting rid of this behavior, and thus removing the necessity of using context managers entirely. Would that be a better solution?

I guess so.

For what it's worth, here is what I was expecting when I first encountered mistletoe last week:

Specify which tokens to generate during the parsing phase: given that parsing seems to be done by Document, I would expect to write something like Document(md, extra_tokens=[GithubWiki])
Then, the returned value could be rendered by different renderers without needing to parse the markdown text again
At that point, there would be no need to tell the renderer about extra_tokens: I would expect the renderer to just throw an exception if it doesn't know about some token type

So something like this:

md = 'Hello, *world*!'
tokens = Document(md, extra_tokens=[GithubWiki])
print(ASTRenderer().render(tokens))
print(HTMLRenderer().render(tokens))

In other words, I would expect the parsing and the rendering phases to be independent. Since the types of tokens have to be specified in the parsing phase, it seems odd to me to specify them in the rendering phase instead.
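To make the proposal concrete, here is a minimal self-contained sketch of what independent phases could look like. It is not mistletoe's API: `Token`, `parse`, the `extra_tokens` handling, the `"WikiLink"` rule and the `Toy*Renderer` classes are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Token:
    kind: str
    text: str


def parse(md, extra_tokens=()):
    """Toy parser: token types are chosen here, at parse time only.

    Hypothetical rule: "[[name]]" becomes a WikiLink token only when
    "WikiLink" is listed in extra_tokens; otherwise it stays plain text.
    """
    if "WikiLink" in extra_tokens and md.startswith("[[") and md.endswith("]]"):
        return [Token("WikiLink", md[2:-2])]
    return [Token("Text", md)]


class ToyBaseRenderer:
    def render(self, tokens):
        return "".join(self._render_one(t) for t in tokens)

    def _render_one(self, token):
        handler = getattr(self, "render_" + token.kind.lower(), None)
        if handler is None:
            # A renderer that does not know a token type fails loudly.
            raise NotImplementedError(
                "{} cannot render {}".format(type(self).__name__, token.kind))
        return handler(token)


class ToyHTMLRenderer(ToyBaseRenderer):
    def render_text(self, token):
        return "<p>{}</p>".format(token.text)

    def render_wikilink(self, token):
        return '<a href="/{0}">{0}</a>'.format(token.text)


class ToyPlainRenderer(ToyBaseRenderer):
    def render_text(self, token):
        return token.text


# Parse once, then hand the same tokens to several renderers.
tokens = parse("[[Home]]", extra_tokens=["WikiLink"])
html_out = ToyHTMLRenderer().render(tokens)
try:
    ToyPlainRenderer().render(tokens)
    plain_error = None
except NotImplementedError as exc:
    plain_error = str(exc)
```

Constructing a renderer here touches no shared state, so the order in which renderers are created and documents are parsed no longer matters.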

Just my two cents. Obviously, you do what you want with the project, and what I mentioned above is likely to change the API and break backward compatibility.

> I'll also allow HTMLBlock and HTMLSpan to be tokenized by default in ASTRenderer.

I don't really have an opinion on the matter, except that it should probably be documented if that's not the case.

That makes me think of the following: it may be a good thing to be able to specify some tokens not to be generated. For example, if I want to do CommonMark and nothing more, how would I disable tables and footnotes? I'm not sure if that's really an important feature though.

miyuchina · 2018-09-06T13:28:41Z

The interesting thing about this is that Document is not meant to be the only entry point to parsing. My original intention is that users are free to swap out Document entirely for their own class, thus having a greater control over parsing (see, e.g., the amount of Markdown-specific logic in Document.__init__).

But separating parsing and rendering phases is a very good suggestion. I'll see what I can do with that.

> ... it may be a good thing to be able to specify some tokens not to be generated.

There's a nuclear option:

class MyRenderer(HTMLRenderer):
    def __init__(self):
        # ...
        block_token._token_types = []
        span_token._token_types = []
        # ...

... although manipulating an underscored variable is undocumented, and will probably change based on our discussion in this issue.
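A gentler variant of the same idea is to restrict a token list only temporarily and restore it afterwards. The sketch below shows the general pattern on a plain Python list; `only_tokens` and `token_types` are hypothetical names, and whether mistletoe's internal list can safely be swapped this way is an assumption that has not been checked here.

```python
from contextlib import contextmanager


@contextmanager
def only_tokens(registry, allowed):
    """Temporarily restrict a token list in place, restoring it afterwards."""
    saved = list(registry)
    registry[:] = [t for t in registry if t in allowed]
    try:
        yield registry
    finally:
        # Restore even if the body raised, so no state leaks out.
        registry[:] = saved


# Hypothetical registry standing in for a module-level token list.
token_types = ["Heading", "Table", "Footnote", "Paragraph"]

with only_tokens(token_types, {"Heading", "Paragraph"}) as active:
    during = list(active)

restored = list(token_types)
```

Unlike assigning an empty list in `__init__`, this keeps the change scoped to the `with` block, so later parsing sees the original set of tokens again.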

pbodnar · 2021-10-30T19:37:49Z

Thanks for your suggestions. :)

So I've tried to document the side effects as originally requested. See f5ea6d6. I hope this will suffice for now.

Regarding separating parsing and rendering phases, I hope we will get to this one day, but let's do this in a separate ticket.

Rogdham changed the title ~~Parsing outside of renderer's context manager~~ HTMLRenderer initialisation side-effect on HTML parsing Aug 31, 2018

Rogdham changed the title ~~HTMLRenderer initialisation side-effect on HTML parsing~~ Document side-effects of renderers' initialisation Aug 31, 2018

miyuchina self-assigned this Sep 3, 2018

miyuchina added the documentation label Sep 3, 2018

chrisjsewell mentioned this issue Mar 5, 2020

How best to work with Mistletoe? executablebooks/MyST-Parser#33

Closed

pbodnar mentioned this issue Sep 18, 2021

HTMLBlock isn't parsed. #106

Closed

pbodnar closed this as completed Oct 30, 2021

pbodnar mentioned this issue Jan 18, 2024

MarkdownRenderer should emit extra newline after list #211

Closed
