Block-level HTML formatting missing #228

Closed
cw789 opened this issue Apr 2, 2019 · 36 comments

@cw789

cw789 commented Apr 2, 2019

Hi. In the following block-level HTML contexts I don't get any Markdown formatting.
Is there something I'm doing wrong?

<details>
  <summary><strong>Table of Contents</strong></summary>

  * [one](#one)
  * [two](#two)
</details>

<details>
  <summary><strong>Table of Contents</strong></summary>

```Bash
echo 'hello'
```
</details>
@RobertDober RobertDober self-assigned this Apr 2, 2019
@RobertDober RobertDober added the bug label Apr 2, 2019
@RobertDober
Collaborator

Short answer: nothing
Long answer: Thank you for reporting this ;)

@RobertDober RobertDober added this to the 1.3.3 milestone Apr 2, 2019
@RobertDober
Collaborator

And as a side note

<summary><strong>Table of Contents</strong></summary>

should be written as

<summary>
<strong>
Table of Contents
</strong>
</summary>

as indicated in the docs, unless my fix eliminates that restriction, which is unlikely I believe.

@cw789
Author

cw789 commented Apr 3, 2019

Have to say thank you to you!
In case any help is needed - just give me a hint where to start.

@RobertDober
Collaborator

RobertDober commented Apr 3, 2019

You certainly do not need to say thank you, but it surely is appreciated \o/

It seems that I removed the recursive rendering here. (I am 100% sure that this is a regression I inflicted on Dave's code, so there are some tests missing too.)

Might be an easy fix, but as we say in our mother tongue: <span lang="de:at">Da Teifl liegt im Detail.</span> ("The devil is in the details.")

In case you want to take it on, please base your PR on this branch

@cw789
Author

cw789 commented Apr 3, 2019

OK, this took me a while to understand (in the code).
What was the intention for removing the recursive rendering?

You may be right - this could be too easy.
PS: Nice to hear some home dialect in here.

@RobertDober
Collaborator

I do not think it was on purpose, I will look at the history if I find some time and keep you updated

@RobertDober
Collaborator

apparently this was never implemented...
@cw789 are you going for it?

@cw789
Author

cw789 commented Apr 3, 2019

I can try, if I'm able to. But it will possibly take me some time to get started.

@RobertDober
Collaborator

Hmm I do not want to step on your toes, but I might find some time to do it this week, I'll ping you before starting.

@pragdave
Owner

pragdave commented Apr 3, 2019 via email

@RobertDober
Collaborator

Hmm I did file it as a bug because of the overwhelming majority of implementations doing this.

Maybe complexity of the implementation should break the tie. If @cw789 or YHS come up with something simple ok, if the code becomes overly complex --> WONTFIX

@cw789
Author

cw789 commented Apr 4, 2019

> Hmm I do not want to step on your toes, but I might find some time to do it this week, I'll ping you before starting.

This is in no way stepping onto my toes.

Back to topic
I also like the thinking of "if it's HTML, it's HTML".
But sadly lots of Markdown-formatted files don't treat it like that.
I think this is why it's now so common in other implementations.

I'm not seeing a really simple solution right now for complete formatting inside HTML.

But we could try somehow to wrap text nodes in a <pre> tag.
Or give them the CSS style white-space: pre-line;.
But then we need to trim these text nodes first, otherwise the formatting is ugly.

This simple improvement will at least make the contents of HTML readable.
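
A minimal sketch of the trimming idea, assuming the HTML element is available as a `{tag, attrs, children, meta}` tuple like the ones `Earmark.as_ast` returns later in this thread. `PreLineSketch` and `pre_line/1` are hypothetical names for illustration, not Earmark APIs.

```elixir
defmodule PreLineSketch do
  # Hypothetical helper, not part of Earmark.
  # Trims every text child, drops children that end up empty, and tags the
  # element with white-space: pre-line. Nested element nodes are kept as-is.
  def pre_line({tag, attrs, children, meta}) do
    trimmed =
      children
      |> Enum.map(fn
        text when is_binary(text) -> String.trim(text)
        node -> node
      end)
      |> Enum.reject(&(&1 == ""))

    {tag, [{"style", "white-space: pre-line;"} | attrs], trimmed, meta}
  end
end
```

This only makes the raw text readable; it does not render any Markdown inside the element.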

@RobertDober
Collaborator

Let me indulge myself by talking a little bit about my vision.

Surprisingly I hate Markdown, (it is however a little bit like democracy according to Winston Churchill's definition, "bad but the best we have").

To be precise, I dislike its structural shortcomings especially the complex semantics. However I like its inline semantics for emphasis, strong, links, etc.

That said, why would one want to use HTML inside Markdown? Well, up to now that has been a minor concern, as we are used (I guess) about 99.9% of the time by ex_doc, whose nature does not require that at all.

However, your example pointed the finger at (yet another) pain point of Markdown.

It is not semantic.

Therefore the only way to make it semantic is to use HTML itself. But instead of writing everything in HTML I still want to use Markdown for the reasons mentioned above.

Your issue made that clear to me and that is why I would like to implement this ticket.

Once we have a PR, from @cw789 or myself, @pragdave can decide whether to merge it or not.

@RobertDober
Collaborator

Looking at some examples right now, e.g.

https://babelmark.github.io/?text=%3Csection%3E%0A%23+headline%0A++__bold__%0A++%3Cnav%3E%0A++++%5Bhello%5D(%23world)%0A++++++++++%3C%2Fnav%3E%0A+++++++%3C%2Fsection%3E

WOOOW

Let us try the same in this comment

# headline __bold__ [hello](#world)

@RobertDober
Collaborator

RobertDober commented Apr 5, 2019

Maybe we should stick with GFM which does pretty much the same we do...

Same on commonmark https://spec.commonmark.org/dingus/

@RobertDober RobertDober modified the milestones: 1.3.3, 1.4 Apr 5, 2019
@RobertDober
Collaborator

@cw789 the rendering is not the only thing to change, we need to parse differently.
Although it might be a structurally consistent update (why not parse recursively up to the end of the tag, like, say, we parse lists?), I am not sure that the added value justifies the effort.
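
To illustrate what "parse up to the end of the tag" could involve, here is a rough sketch that collects the lines inside an HTML block until the matching closing tag, counting nested openers of the same tag. It is not how Earmark's parser is written; `HtmlBlockSketch` and `take_inner/2` are hypothetical names, and the line-based matching is a simplifying assumption.

```elixir
defmodule HtmlBlockSketch do
  # Hypothetical helper, not part of Earmark.
  # `lines` are the lines following the opening tag. Returns
  # {inner_lines, remaining_lines}. Assumes the closing tag sits on its own
  # line; anything else on that line is dropped.
  def take_inner(lines, tag) do
    open = Regex.compile!("<" <> Regex.escape(tag) <> "[\\s>]")
    close = Regex.compile!("</" <> Regex.escape(tag) <> ">")
    take(lines, open, close, 1, [])
  end

  # Input ran out before the tag was closed: everything seen is inner content.
  defp take([], _open, _close, _depth, acc), do: {Enum.reverse(acc), []}

  defp take([line | rest], open, close, depth, acc) do
    # Nested openers raise the depth, closers lower it.
    depth = depth + count(open, line) - count(close, line)

    if depth <= 0 do
      {Enum.reverse(acc), rest}
    else
      take(rest, open, close, depth, [line | acc])
    end
  end

  defp count(regex, line), do: regex |> Regex.scan(line) |> length()
end
```

The inner lines could then be handed back to the Markdown parser, much as list items are parsed recursively.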

That said, I changed assignments and labels but will keep two things open, my mind and this issue 😉.

Therefore I will be glad to discuss approaches or review PRs against the assigned branch.

@RichMorin
Contributor

Issue #228 seems to be closely related, but is complicated enough that I'll keep the details there.

@RobertDober
Collaborator

Indeed, quite thoughtful of you.

Yeah I was afraid that the IAL would not cut it.

I believe that #228 is a really worthy goal, but it is a little bit complicated.
I will have some time during my vacations end of April hopefully but I'd prefer to work on exposing the AST #145, so I'll check with @cw789 if he is making progress on #228 or if I shall take it over?

@cw789
Author

cw789 commented Apr 17, 2019

Perfect.

@cw789
Author

cw789 commented Jun 18, 2019

I just started to make my first changes regarding this issue Commit.
But I'm afraid I still lack the deeper understanding of Earmark needed to make any progress soon.

@RobertDober RobertDober modified the milestones: 1.4, 1.5 Sep 3, 2019
@myrrlyn

myrrlyn commented Dec 7, 2019

Hi; I am working on moving my website from the Ruby static site generator Middleman, which uses kramdown as its parser, to Elixir/Phoenix with Earmark. I have some pages which use

<div markdown="block">
# Markdown Content
</div>

so that I can add some layout structure to a largely-text page.

I am interested in restoring? implementing? this behavior in Earmark, as this thread indicates that it may have previously existed and, as you note, is present in other languages' major parsers. Has there been motion on this feature? Is this something with which I can help?

Thank you all for your work on this library. I hope I don't come across as demanding; this is just a feature in which I have a personal interest.

@RobertDober
Collaborator

https://babelmark.github.io/?text=%3Cdiv+markdown%3D%22block%22%3E%0A%23+Markdown+Content%0A%3C%2Fdiv

I am not sure this is a mainstream feature and furthermore we have

iex(9)> x                             
"<div markdown=\"block\">\n# Markdown Content\n</div>\n"
iex(10)> Earmark.as_ast(x) 
{:ok,
 [
   {"div", [{"markdown", "block"}], ["# Markdown Content"],
    %{meta: %{verbatim: true}}}
 ], []}

So custom transformers can definitely be created (as discussed in #312) and potentially moved outside Earmark.
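
A minimal sketch of what such a transformer might look like, assuming the AST shape shown above (`{tag, attrs, children, meta}` tuples with plain strings as text nodes). `MarkdownBlockSketch` and `transform/2` are hypothetical names, not Earmark APIs. The Markdown parser is passed in as a function so the sketch runs without Earmark; with Earmark one might pass a wrapper such as `fn md -> {:ok, nodes, _} = Earmark.as_ast(md); nodes end`.

```elixir
defmodule MarkdownBlockSketch do
  # Hypothetical transformer, not part of Earmark.
  # `parse` is any function from a Markdown string to a list of AST nodes.
  def transform(ast, parse) when is_list(ast) and is_function(parse, 1) do
    Enum.map(ast, &transform_node(&1, parse))
  end

  # Text nodes are plain binaries and pass through untouched.
  defp transform_node(text, _parse) when is_binary(text), do: text

  defp transform_node({tag, attrs, children, meta}, parse) do
    if {"markdown", "block"} in attrs do
      # Assumption: a verbatim node only has text children; anything else is
      # ignored by this sketch.
      markdown = children |> Enum.filter(&is_binary/1) |> Enum.join("\n")
      attrs = List.delete(attrs, {"markdown", "block"})
      # The children are now parsed nodes, so the verbatim marker is dropped.
      {tag, attrs, parse.(markdown), %{}}
    else
      {tag, attrs, transform(children, parse), meta}
    end
  end
end

# Usage with a stub parser that wraps its input in a paragraph node.
ast = [{"div", [{"markdown", "block"}], ["# Markdown Content"], %{meta: %{verbatim: true}}}]
stub = fn markdown -> [{"p", [], [markdown], %{}}] end
MarkdownBlockSketch.transform(ast, stub)
```

Nodes without the `markdown="block"` attribute keep their tag, attributes and meta; only their children are visited.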

Of course you are not too demanding, and a PR to create such a transformer would be welcome, actually I'd like to do that myself, much more fun than what I am doing right now 😉, but well...

@RichMorin
Contributor

RichMorin commented Mar 19, 2020 via email

@RobertDober
Collaborator

RobertDober commented Mar 19, 2020

I am not sure it is really what you want. As mentioned above, there is no Markdown rendering of the word "inside" in

<div>
inside
</div>

@RobertDober
Collaborator

as a reminder, it is just sooo mainstream not to do that

https://babelmark.github.io/?text=%3Cdiv%3E%0A++**inside**%0A%3C%2Fdiv%3E

@cw789 cw789 removed their assignment Mar 20, 2020
@myrrlyn

myrrlyn commented Mar 23, 2020

It is very mainstream to not render content inside a bare HTML block; however, it is roughly 5:4 in favor of rendering content inside an HTML block marked with the markdown="block" attribute. Notably, all the parsers described as "CommonMark compliant" choose to render it.

@RobertDober
Collaborator

@myrrlyn interesting indeed, maybe for 1.6?

@RobertDober RobertDober reopened this Mar 24, 2020
@RobertDober RobertDober modified the milestones: 1.5, 1.6 Mar 24, 2020
@solomonhawk

solomonhawk commented May 6, 2020

This is an interesting discussion. I was previously only mildly aware of the varying landscape of behaviors provided by different markdown parsers 😱. Treating all content inside HTML tags as raw content makes a ton of sense to me, however it is incredibly convenient to be able to leverage the formatting shortcuts that markdown provides.

I am in favor of supporting formatting nested content when markdown="block" is specified on the enclosing html tag.

For extra context, my use case is liberal usage of a pattern like:

<details>
  <summary>
    Short description of a thing
  </summary>
  
  ... much longer / detailed description ...
</details>

A sufficiently detailed readme can get very long. This represents a usability concern. However, there are cases where breaking this content up into separate pages or guides also makes it less accessible. Striking a balance is not always an easy task.

Perhaps my example is a misguided attempt to lean on collapsible sections in order to provide a more focused overview while surfacing detailed information inline. I could make a compromise and just include all the content or link off to other sections.

GFM (github flavored markdown) supports this - and trying to maintain a README.md that works for both GFM and ExDoc can be challenging (albeit only in some cases).

One answer could be: "don't do that." Instead don't try to embed the README.md inside docs generated with ExDoc, and just maintain a separate file.

I see that #331 was merged - that PR is not expected to resolve this issue, is it? (I tried pulling in Earmark at that commit, at 1.4.4, and master, but didn't see any change in output).

Thanks for all of your work 🙏

solomonhawk added a commit to vigetlabs/colonel_kurtz_ex that referenced this issue May 6, 2020
- todo: change back to earmark (the default) once pragdave/earmark#228 is resolved
- for now our docs won't have syntax highlighting :(
@RobertDober
Collaborator

I understand why people want this, however it is very low on my priority list; sorry.

@solomonhawk

solomonhawk commented May 6, 2020

@RobertDober Is there somewhere you can point me to learn more about writing custom transformers? I poked around including the linked PR that implemented as_plaintext but am missing some context about how I'd use a custom transformer fn with ExDoc.

Thanks for chiming in

@RobertDober
Collaborator

RobertDober commented May 7, 2020

That is because the missing link is, well, missing, in Earmark.
ExDoc is now using as_ast and then transforms, meaning you should look there.

Out of curiosity, is Earmark's Transformer difficult to understand? (which might not shock me too much), maybe I should also look at how ex_doc does it and steal I mean learn from the best ;)

@solomonhawk

I admit I haven't spent a ton of time looking into this. When I initially looked at transforms I found them approachable but I didn't see a clear integration point for adding custom transforms when using ExDoc. I'll take a look there!

@RobertDober
Collaborator

I cannot speak for ex_doc but my transformer is not pluggable, just the simplest thing to do the job.

@RobertDober
Collaborator

RobertDober commented Jul 1, 2020

Given the latest issues, the almost universal acceptance of GFM as the de facto standard, and the expectations that come with that, I have decided to implement (or use) an HTML parser to recursively include the HTML in the AST.

I am closing this issue in favour of #358.
