HTML in Markdown #19

Closed
marian2js opened this issue Sep 10, 2020 · 6 comments

@marian2js

Subject of the issue

Strings starting with <p>...</p> cause everything else to be dropped.

var remark = require('remark')
var strip = require('strip-markdown')

remark()
  .use(strip)
  .process('<p>foo</p> bar', function(err, file) {
    if (err) throw err
    console.log(String(file)) // logs the stripped text
  })

Actual: ""
Expected: "foo bar"

With some text before the tag, it works as expected:

remark()
  .use(strip)
  .process('foo <p>bar</p>', function(err, file) {
    if (err) throw err
    console.log(String(file))
  })

Actual: "foo bar"

Your environment

  • OS: macOS 10.15.5
  • Packages: strip-markdown "3.1.2"
  • Env: node 14

Steps to reproduce

The issue can also be reproduced on the demo, just enter:

<p>foo</p>
bar

Expected behavior

<p> tag should be stripped and the inner text should be returned. This behavior works with other tags like <span>.

Actual behavior

<p>foo</p> bar returns an empty string, but foo <p>bar</p> returns "foo bar".

@marian2js added the 🐛 type/bug and 🙉 open/needs-info labels Sep 10, 2020
@wooorm
Member

wooorm commented Oct 3, 2020

hey, sorry for the wait, was busy with other stuff!

This is a duplicate of #14, the answer is still the same!

@wooorm closed this as completed Oct 3, 2020
@wooorm added the 👯 no/duplicate label and removed the 🐛 type/bug and 🙉 open/needs-info labels Oct 3, 2020
@marian2js
Author

Hi @wooorm thanks for the reply. I wasn't expecting this to be a hard fix, since it works for tags like span but not for p.

As a workaround, I prefixed the text with a character and stripped it afterwards. It's not very elegant, but it seems to work in all cases. Maybe a similar workaround could be introduced in this library to prevent unexpected behaviour.
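
A minimal sketch of that workaround, for illustration only. `stripWithPrefix`, the `strip` callback, and the default `x` prefix are hypothetical names and choices, not part of strip-markdown; `strip` stands for any function that turns a markdown string into plain text. A prefix also changes how leading block syntax such as `# heading` is parsed, so this is not a general fix.

```javascript
// Hypothetical helper, not part of strip-markdown.
// `strip` is assumed to be a function: (markdown string) -> plain-text string.
function stripWithPrefix(strip, input, prefix = 'x') {
  // With a character in front, the line no longer begins with a tag,
  // so the tags should be treated as inline HTML inside a paragraph.
  const output = strip(prefix + input)
  // Remove the prefix again if it survived stripping.
  return output.startsWith(prefix) ? output.slice(prefix.length) : output
}
```

With remark, `strip` could be a small wrapper that runs the input through `remark().use(strip)` and returns the resulting string.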

@wooorm
Member

wooorm commented Oct 3, 2020

The proper way to deal with HTML is through remark-rehype, rehype-raw, and then getting the text.
I understand this workaround works well for you, but I’d rather not introduce something fragile into the code here!

@marian2js
Author

Hi @wooorm, it's fine if the workaround is not added, but I think this should be documented. The readme says this library removes HTML, but now you say the proper way is to use extra plugins for it. I was replacing a different implementation with this library and I was lucky that a test was covering this case, otherwise I might not have detected it until production.

Anyway, thanks for maintaining this project, it was really useful for me.

@wooorm
Member

wooorm commented Oct 4, 2020

I think what’s confusing here is:

<p>foo</p> *bar*

^-- this whole thing is HTML as far as markdown is concerned, so the emphasis doesn’t work. The docs refer to that aspect of HTML in markdown. But from what I gather, you’re reading “HTML” as just the XML-like parts (the tags), not the text inside them.
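
One way to see why `<p>` and `<span>` behave differently: CommonMark treats a line that begins with certain block-level tag names (including `p`, but not `span`) as the start of an HTML block, which then runs to the next blank line. The sketch below approximates that start condition. The function name and the shortened tag list are illustrative assumptions, and the other HTML block conditions in the spec are ignored.

```javascript
// Abbreviated subset of the tag names in CommonMark's HTML block
// start condition 6; the list in the spec is longer.
const BLOCK_TAG_NAMES = new Set([
  'address', 'article', 'aside', 'blockquote', 'div', 'dl', 'fieldset',
  'footer', 'form', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'header', 'hr',
  'li', 'main', 'nav', 'ol', 'p', 'section', 'table', 'ul'
])

// Hypothetical helper: does this line open an HTML block under condition 6?
// The line must begin (after up to 3 spaces) with `<` or `</`, a listed tag
// name, and then whitespace, `>`, `/>`, or the end of the line.
function startsHtmlBlock(line) {
  const match = /^ {0,3}<\/?([A-Za-z][A-Za-z0-9-]*)(?=[ \t>]|\/>|$)/.exec(line)
  return match !== null && BLOCK_TAG_NAMES.has(match[1].toLowerCase())
}

console.log(startsHtmlBlock('<p>foo</p> bar'))       // true
console.log(startsHtmlBlock('<span>foo</span> bar')) // false
console.log(startsHtmlBlock('foo <p>bar</p>'))       // false
```

Only the first input starts an HTML block, so its whole line, text included, becomes a single HTML node, which is what gets removed.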

@wooorm changed the title from “Strings starting with <p> tag cause everything else to be dropped” to “HTML in Markdown” Oct 4, 2020
@wooorm
Member

wooorm commented Oct 4, 2020

Added a link to this issue next to where the readme mentions HTML!
