Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a bug that kept us from adding the markdown attribute to HTML tags that contain valid markdown when parsing HTML #614

Conversation

boblail
Copy link
Contributor

@boblail boblail commented Jun 24, 2019

You can reproduce the bug like this:

html = <<~HTML
<div>
  <img src="src.png" alt="alt" />
</div>
HTML

markdown = Kramdown::Document.new(html, input: :html).to_kramdown
# => "<div>\n![alt](src.png)\n</div>"

html = Kramdown::Document.new(markdown).to_html
# => "<div>\n![alt](src.png)\n</div>\n\n"

The bug is that the call to to_kramdown converted the IMG tag but did not add markdown="1" to the DIV tag.

As of Kramdown 2.1.0, the markdown attribute is added if the IMG tag is wrapped in a paragraph; and you can see that when the markdown attribute is present, the HTML roundtrips successfully:

html = <<~HTML
<div>
  <p><img src="src.png" alt="alt" /></p>
</div>
HTML

markdown = Kramdown::Document.new(html, input: :html).to_kramdown
# => "<div markdown=\"1\">\n![alt](src.png)\n</div>"

html = Kramdown::Document.new(markdown).to_html
# => "<div>\n  <p><img src=\"src.png\" alt=\"alt\" /></p>\n\n</div>"

@boblail boblail force-pushed the set-markdown-attr-appropriately-when-converting-standalone-span-elements branch from a9e16e7 to a7b296d Compare Jun 24, 2019
c.type != :html_element && (c.type != :p || !c.options[:transparent]) &&
Element.category(c) == :block
c.type != :html_element &&
!(c.type == :p && c.options[:transparent] && c.children.none? { |t| ![:entity, :text, :html_element].member?(t.type) }) &&
Copy link
Contributor

@ashmaroli ashmaroli Jun 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[:entity, :text, :html_element] is a static array. In order to avoid repeated allocations of this array on multiple iterations, I recommend hoisting the array as a private constant:

TYPES_ARRAY = [:entity, :text, :html_element]
private_constant :TYPES_ARRAY

def convert_html_element(el, opts)
  markdown_attr = el.options[:category] == :block && el.children.any? do |c|
    ...
    c.children.none? { |t| !TYPES_ARRAY.include?(t.type) }
    ...
  end
end

Copy link
Owner

@gettalong gettalong Jun 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you determine these three element types?

Copy link
Contributor Author

@boblail boblail Jun 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gettalong, I added element types until the rest of the Kramdown spec suite passed without changes. 😄


I considered

c.children.none? { |t| MARKDOWN_TYPES.member?(t.type) }

instead of

c.children.none? { |t| !NON_MARKDOWN_TYPES.member?(t.type) }

but I think the list would be longer.

boblail added a commit to boblail/kramdown that referenced this issue Jun 24, 2019
…s that contain valid markdown when parsing HTML

You can reproduce the bug like this:

```ruby
html = <<~HTML
<div>
  <img src="src.png" alt="alt" />
</div>
HTML

markdown = Kramdown::Document.new(html, input: :html).to_kramdown

html = Kramdown::Document.new(markdown).to_html
```

The bug is that the call to `to_kramdown` converted the IMG tag but **_did not_** add `markdown="1"` to the DIV tag.

As of Kramdown 2.1.0, the `markdown` attribute is added if the IMG tag is wrapped in a paragraph; and you can see that when the `markdown` attribute is present, the HTML roundtrips successfully:

```ruby
html = <<~HTML
<div>
  <p><img src="src.png" alt="alt" /></p>
</div>
HTML

markdown = Kramdown::Document.new(html, input: :html).to_kramdown

html = Kramdown::Document.new(markdown).to_html
```
@boblail boblail force-pushed the set-markdown-attr-appropriately-when-converting-standalone-span-elements branch from a7b296d to ed0791d Compare Jun 24, 2019
@ghepting
Copy link

@ghepting ghepting commented Jun 24, 2019

Nice bug fix @boblail! :)

@boblail
Copy link
Contributor Author

@boblail boblail commented Aug 12, 2019

@gettalong, is this OK to merge? Or did you have any followup questions?

@boblail boblail requested a review from gettalong Mar 18, 2020
@gettalong gettalong self-assigned this Apr 17, 2020
@gettalong
Copy link
Owner

@gettalong gettalong commented Apr 17, 2020

Thank you for your contribution - I have cherry-picked your commit and adapted it a bit.

@gettalong gettalong closed this Apr 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants