Allow structure modifications of child nodes during transform #443

Merged
RobertDober merged 1 commit into pragdave:master from tessi:plugins_for_structural_modifications
Jan 18, 2022

tessi commented Jan 18, 2022

Currently, it is not possible to change the structure of nodes or to add new ones.
As the README states, there was no clearly defined way to do that.

I wonder whether solving a subset of that problem might be easy enough for a start: modifying only the children of nodes during the AST transform step.

This PR adds support for postprocessors/transforms to modify the children of the nodes they receive. This makes it possible, for example, to write a postprocessor
that adds anchor links to headings (`h1`, `h2`, ...) [as GitHub-flavored Markdown does](https://babelmark.github.io/?text=%23%23+heading+with+%5Ba+link%5D(to%3A%2F%2Fsome.url)).

<details>
<summary>Example of a post processor I intend to write</summary>

```elixir
defmodule MarkdownRenderer do
  def render_markdown(markdown) do
    {:ok, html_doc, _messages} = Earmark.as_html(markdown, postprocessor: &postprocessor/1)
    html_doc
  end

  defp postprocessor({tag, attrs, child_nodes, meta}) when tag in ["h1", "h2", "h3", "h4", "h5"] do
    {
      tag,
      attrs,
      [anchor(child_nodes)] ++ child_nodes,
      meta
    }
  end
  defp postprocessor(node), do: node

  @non_char ~r/[^a-z]+/
  defp anchor(child_nodes) do
    id = text_content(child_nodes)
         |> String.trim()
         |> String.replace(@non_char, "-", global: true)
    anchor_id = "user-content-#{id}"

    {"a", [{"id", anchor_id}, {"href", "##{anchor_id}"}, {"aria-hidden", "true"}, {"class", "anchor"}], [svg_node()], %{}}
  end

  defp text_content([]), do: ""
  defp text_content([node|rest]), do: "#{text_content(node)} #{text_content(rest)}"
  defp text_content(node) when is_binary(node), do: String.downcase(node)
  defp text_content({_tag, _attrs, children, _meta}) do
    children
    |> Enum.map(fn node -> text_content(node) end)
    |> Enum.join("-")
  end

  @link_path "M7.775 3.275a.75.75 0 001.06 1.06l1.25-1.25a2 2 0 112.83 2.83l-2.5 2.5a2 2 0 01-2.83 0 .75.75 0 00-1.06 1.06 3.5 3.5 0 004.95 0l2.5-2.5a3.5 3.5 0 00-4.95-4.95l-1.25 1.25zm-4.69 9.64a2 2 0 010-2.83l2.5-2.5a2 2 0 012.83 0 .75.75 0 001.06-1.06 3.5 3.5 0 00-4.95 0l-2.5 2.5a3.5 3.5 0 004.95 4.95l1.25-1.25a.75.75 0 00-1.06-1.06l-1.25 1.25a2 2 0 01-2.83 0z"
  defp svg_node do
    {
      "svg",
      [{"class", "octicon octicon-link"}, {"viewBox", "0 0 16 16"}, {"version", "1.1"}, {"width", "16"}, {"height", "16"}, {"aria-hidden", "true"}],
      [
        {
          "path",
          [{"fill-rule", "evenodd"}, {"d", @link_path}],
          [],
          %{}
        }
      ],
      %{}
    }
  end
end
```
</details>
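
As a rough illustration of the idea (not Earmark's actual `_walk_ast`), a walker that lets a transform change a node's children could look like the sketch below. The module `AstSketch`, its function names, and the order of operations (transform the node first, then descend into whatever children it returned) are assumptions made for this example only.

```elixir
defmodule AstSketch do
  # Hypothetical walker, NOT Earmark's `_walk_ast`: applies `fun` to a node
  # first, then descends into whatever children `fun` returned.
  def walk(nodes, fun) when is_list(nodes), do: Enum.map(nodes, &walk(&1, fun))
  def walk(text, _fun) when is_binary(text), do: text

  def walk({_tag, _attrs, _children, _meta} = node, fun) do
    {tag, attrs, children, meta} = fun.(node)
    {tag, attrs, walk(children, fun), meta}
  end

  # Example transform: prepend an empty anchor node to every `h2`.
  def add_anchor({"h2", attrs, children, meta}) do
    {"h2", attrs, [{"a", [{"class", "anchor"}], [], %{}} | children], meta}
  end

  # Every other node passes through unchanged.
  def add_anchor(node), do: node
end

ast = [{"h2", [], ["Title"], %{}}, {"p", [], ["Body"], %{}}]
IO.inspect(AstSketch.walk(ast, &AstSketch.add_anchor/1))
```

Because the walker descends into the children the transform returned, newly added nodes are visited too. A transform that adds a child matching its own pattern would therefore recurse forever in this sketch.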

I assume this is debatable, as it does not allow every imaginable transformation.
But this solution seems to work well within the existing code base and solves a large share of the problems.

I'm happy to discuss my approach, as I might not have understood all details of the recursive `_walk_ast` function :)

RobertDober (Collaborator) commented Jan 18, 2022

@tessi thank you so much, this is a very nice and simple extension of the current code which gives transform so much more power.

This will be a great improvement which will go into Earmark 1.5, unless you need a release quickly, in which case I can release it as 1.4.201

@RobertDober RobertDober merged commit 2560b80 into pragdave:master Jan 18, 2022
@tessi tessi deleted the plugins_for_structural_modifications branch January 18, 2022 17:00

tessi (Author) commented Jan 18, 2022

Thanks @RobertDober for the swift response and merge 💛 (and for the general maintenance of this project).
I'm happy to wait until 1.5 is released.

RobertDober added a commit that referenced this pull request Jan 18, 2022
RobertDober added a commit that referenced this pull request Jan 18, 2022

RobertDober (Collaborator) commented

Great, that was a very nice PR and a joy to merge; contributions like these are all the thanks I need :)

RobertDober pushed a commit that referenced this pull request Mar 14, 2022
Co-authored-by: Philipp Tessenow <philipp@remote.com>
RobertDober added a commit that referenced this pull request Mar 14, 2022
RobertDober pushed a commit that referenced this pull request Mar 24, 2022
RobertDober added a commit that referenced this pull request Mar 24, 2022
RobertDober pushed a commit that referenced this pull request May 1, 2022
RobertDober added a commit that referenced this pull request May 1, 2022