Preserving empty line nodes #264

ijioio · 2022-07-29T05:09:53Z

I am using commonmark library to generate emails. It is a really nice and easy to use library. It have both HtmlRenderer for rich email body, and TextContentRenderer for plain text email body. Looks like perfect solution. The person who sends an emails, prepares it using markdown syntax, then content is processed by both parsers and embedded to the email as plain and html body. But the problem is that parser discards line breaks (or empty lines).

Example:

Parser parser = Parser.builder().build();

Node document = parser.parse("foo\n\nbar\n\nbaz");

HtmlRenderer htmlRenderer = HtmlRenderer.builder().build();
TextContentRenderer textRenderer = TextContentRenderer.builder().build();

htmlRenderer.render(document);
textRenderer.render(document);

will result in for html

<p>foo</p>
<p>bar</p>
<p>baz</p>

for plain text:

foo
bar
baz

Parser just discards the empty lines completely. While for html this is acceptable cause each line will be wrapped with paragraph, for plain text renderer it is critical as the text looses its formatting.

What I expect is that parser will handle empty lines as a separate nodes, but renderers can process them differently, html renderer can just skip it, while plain text renderer will append them to the ouptut:

foo

bar

baz

I was not dig too deep in the sources, so not sure if there is some possible solution for that at the current state...

The text was updated successfully, but these errors were encountered:

robinst · 2022-10-18T06:27:50Z

Hi @ijioio. As far as I understand, you're just unhappy with what the TextContentRenderer does with paragraphs, right? If it rendered it as follows:

foo

bar

baz

You would be happy?

I'm honestly not sure why it renders it this way at the moment:

foo
bar
baz

@JinneeJ do you remember what the reason was for this? We might have to make this configurable for backwards compat reasons, but it would still be good to know why.

robinst · 2022-10-18T06:32:57Z

@ijioio Also note, you can already change this behavior with existing API via TextContentRenderer.builder().nodeRendererFactory(myFactory) and overriding the rendering of Paragraph nodes.

JinneeJ · 2022-10-24T14:57:17Z

@JinneeJ do you remember what the reason was for this? We might have to make this configurable for backwards compat reasons, but it would still be good to know why.

@robinst I don't remember exactly 😅 I guess it didn't occur to me that we might need adding extra empty lines for paragraphs. But anyway having backward compatibility would be great.

ijioio · 2023-01-15T05:10:24Z

Hi @ijioio. As far as I understand, you're just unhappy with what the TextContentRenderer does with paragraphs, right? If it rendered it as follows:
foo

bar

baz
You would be happy?

Yes, exactly! I just want plain emails still keep some logical structure of the original text.

ijioio · 2023-01-15T05:17:20Z

@ijioio Also note, you can already change this behavior with existing API via TextContentRenderer.builder().nodeRendererFactory(myFactory) and overriding the rendering of Paragraph nodes.

Thank you for the suggestion. Not sure is it a universal solution though. I mean, that I will never know in advance in which commonmark syntax block these new line breaks will be used. Is that implies I need to override all possible cases?

mattrob33 · 2023-02-07T14:30:21Z

+1 for the original feature request of actually preserving empty lines. As a specific use case, I'm using commonmark to build a sort of rich text editor using markdown as the data, which requires maintaining a mapping between the literal and rendered text. So, e.g., in

An _italic_ text.

when the user places the cursor at index 5 in the rendered text, this actually maps to index 6 in the markup text, to account for the first underscore.

Unfortunately this breaks with newlines since they are simply discarded, so I have no way of knowing how many newlines the user entered.

robinst · 2024-09-18T08:53:47Z

@mattrob33 I think that's a different use case than the original reporter, which is for the TextContentRenderer. Could you create a new issue with your use case please? It sounds like you want the empty lines to be represented in the AST (Node) somehow? You might be able to do something with source spans, see includeSourceSpans on Parser.Builder?

ijioio added the enhancement label Jul 29, 2022

robinst mentioned this issue Sep 18, 2024

Internal space line is trimmed in converted plaintext #334

Open

robinst added the help wanted label Sep 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Preserving empty line nodes #264

Preserving empty line nodes #264

ijioio commented Jul 29, 2022 •

edited

Loading

robinst commented Oct 18, 2022

robinst commented Oct 18, 2022

JinneeJ commented Oct 24, 2022

ijioio commented Jan 15, 2023

ijioio commented Jan 15, 2023 •

edited

Loading

mattrob33 commented Feb 7, 2023

robinst commented Sep 18, 2024

Preserving empty line nodes #264

Preserving empty line nodes #264

Comments

ijioio commented Jul 29, 2022 • edited Loading

robinst commented Oct 18, 2022

robinst commented Oct 18, 2022

JinneeJ commented Oct 24, 2022

ijioio commented Jan 15, 2023

ijioio commented Jan 15, 2023 • edited Loading

mattrob33 commented Feb 7, 2023

robinst commented Sep 18, 2024

ijioio commented Jul 29, 2022 •

edited

Loading

ijioio commented Jan 15, 2023 •

edited

Loading