Allow structure modifications of child nodes during transform #443
Merged
RobertDober merged 1 commit into pragdave:master from tessi:plugins_for_structural_modifications on Jan 18, 2022
Currently, it is not possible to change the structure of nodes or to add new ones during a transform. As stated in the README, there was no clearly defined way to do that.

I wonder whether solving a subset of that problem might be easy enough for a start: modifying only the children of nodes during the AST transform step.

This PR adds support for postprocessors/transforms to modify the children of the nodes they receive. This makes it possible, for example, to write a postprocessor that maps headings (`h1`, `h2`, ...) to include anchor links, [as GitHub-flavored Markdown does](https://babelmark.github.io/?text=%23%23+heading+with+%5Ba+link%5D(to%3A%2F%2Fsome.url)).

<details>
<summary>Example of a post processor I intend to write</summary>

```elixir
defmodule MarkdownRenderer do
  def render_markdown(markdown) do
    {:ok, html_doc, _messages} = Earmark.as_html(markdown, postprocessor: &postprocessor/1)
    html_doc
  end

  # Prepend an anchor node to the children of every heading.
  defp postprocessor({tag, attrs, child_nodes, meta}) when tag in ["h1", "h2", "h3", "h4", "h5"] do
    {
      tag,
      attrs,
      [anchor(child_nodes)] ++ child_nodes,
      meta
    }
  end

  defp postprocessor(node), do: node

  @non_char ~r/[^a-z]+/

  # Build the anchor element; its id is derived from the heading text.
  defp anchor(child_nodes) do
    id = text_content(child_nodes) |> String.trim() |> String.replace(@non_char, "-", global: true)
    anchor_id = "user-content-#{id}"

    {"a", [{"id", anchor_id}, {"href", "##{anchor_id}"}, {"aria-hidden", "true"}, {"class", "anchor"}], [svg_node()], %{}}
  end

  # Collect the lowercased text of a node or a list of nodes.
  defp text_content([]), do: ""
  defp text_content([node | rest]), do: "#{text_content(node)} #{text_content(rest)}"
  defp text_content(node) when is_binary(node), do: String.downcase(node)

  defp text_content({_tag, _attrs, children, _meta}) do
    children
    |> Enum.map(fn node -> text_content(node) end)
    |> Enum.join("-")
  end

  @link_path "M7.775 3.275a.75.75 0 001.06 1.06l1.25-1.25a2 2 0 112.83 2.83l-2.5 2.5a2 2 0 01-2.83 0 .75.75 0 00-1.06 1.06 3.5 3.5 0 004.95 0l2.5-2.5a3.5 3.5 0 00-4.95-4.95l-1.25 1.25zm-4.69 9.64a2 2 0 010-2.83l2.5-2.5a2 2 0 012.83 0 .75.75 0 001.06-1.06 3.5 3.5 0 00-4.95 0l-2.5 2.5a3.5 3.5 0 004.95 4.95l1.25-1.25a.75.75 0 00-1.06-1.06l-1.25 1.25a2 2 0 01-2.83 0z"

  # The link icon rendered inside the anchor.
  defp svg_node do
    {
      "svg",
      [{"class", "octicon octicon-link"}, {"viewBox", "0 0 16 16"}, {"version", "1.1"}, {"width", "16"}, {"height", "16"}, {"aria-hidden", "true"}],
      [
        {
          "path",
          [{"fill-rule", "evenodd"}, {"d", @link_path}],
          [],
          %{}
        }
      ],
      %{}
    }
  end
end
```

</details>

I assume this is debatable, as it does not allow every imaginable transformation. But this solution seems to work well within the existing code base and covers a large share of use cases.

I'm happy to discuss my approach, as I might not have understood all the details of the recursive `_walk_ast` function :)
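To illustrate the idea of letting a transform change a node's children, here is a minimal, self-contained sketch. It is not Earmark's `_walk_ast`: the module `AstWalkSketch`, its `walk/2` function, and the choice to call the transform before descending into the returned children are all assumptions made for illustration. Only the `{tag, attrs, children, meta}` node shape is taken from the example above.

```elixir
defmodule AstWalkSketch do
  # Hypothetical helper, not part of Earmark.

  # A list of nodes: walk each element in order.
  def walk(nodes, fun) when is_list(nodes), do: Enum.map(nodes, &walk(&1, fun))

  # Text leaves are kept as they are (an assumption of this sketch).
  def walk(text, _fun) when is_binary(text), do: text

  # Tag nodes: apply the transform first, then descend into whatever
  # children the transform returned, so added or removed children are
  # respected. A transform that adds a child matching its own pattern
  # would recurse forever in this sketch.
  def walk({_tag, _attrs, _children, _meta} = node, fun) do
    {tag, attrs, children, meta} = fun.(node)
    {tag, attrs, walk(children, fun), meta}
  end
end

ast = [{"h1", [], ["Hello"], %{}}, {"p", [], ["text"], %{}}]

# Prepend an (empty) anchor to every h1; leave all other nodes alone.
add_anchor = fn
  {"h1", attrs, children, meta} ->
    {"h1", attrs, [{"a", [{"href", "#hello"}], [], %{}} | children], meta}

  node ->
    node
end

result = AstWalkSketch.walk(ast, add_anchor)
```

With this input, the `h1` node in `result` gains the anchor as its first child, while the `p` node is returned unchanged.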
Collaborator
@tessi thank you so much, this is a very nice and simple extension of the current code. This will be a great improvement that will go into Earmark 1.5, unless you need a release quickly, in which case I can release it as 1.4.2.
Author
Thanks @RobertDober for the swift response and merge 💛 (and general maintenance of this project).
RobertDober added a commit that referenced this pull request on Jan 18, 2022
RobertDober added a commit that referenced this pull request on Jan 18, 2022
Collaborator
Great, that was a very nice PR and a joy to merge; contributions like these are all the thanks I need :)
RobertDober pushed a commit that referenced this pull request on Mar 14, 2022. The commit message repeats the PR description, followed by `Co-authored-by: Philipp Tessenow <philipp@remote.com>`.
RobertDober added a commit that referenced this pull request on Mar 14, 2022
RobertDober pushed a commit that referenced this pull request on Mar 24, 2022. The commit message repeats the PR description, followed by `Co-authored-by: Philipp Tessenow <philipp@remote.com>`.
RobertDober added a commit that referenced this pull request on Mar 24, 2022
RobertDober pushed a commit that referenced this pull request on May 1, 2022. The commit message repeats the PR description, followed by `Co-authored-by: Philipp Tessenow <philipp@remote.com>`.
RobertDober added a commit that referenced this pull request on May 1, 2022