Non-standard HTML tags on input get encoded to HTML Entities #32

Closed
robogeek opened this issue Jan 7, 2015 · 7 comments

@robogeek

robogeek commented Jan 7, 2015

This is occurring with 3.0.0. To my understanding of Markdown, an HTML tag entered in the source should pass through to the output. That only seems to be true for standard HTML tags, whereas non-standard tags get encoded.

The behavior is visible here: https://markdown-it.github.io/

Entering <b>Hi there</b> displays a bolded "Hi there", so the standard tag is passed through unencoded, as desired.

Entering <hello-world></hello-world> instead produces a string encoded with HTML entities, so it appears in the output as &lt;hello-world&gt;&lt;/hello-world&gt;. On the other hand, <helloworld></helloworld> (no dashes in the tag name) is passed through unencoded, as desired.

  • standard tag: passes through unencoded (as desired)
  • non-standard tag with no dashes in tag name: passes through unencoded (as desired)
  • non-standard tag with dashes in tag name: encoded to HTML entities (BUG)
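To make "encoded to HTML entities" concrete, here is a minimal sketch of that kind of escaping. `escapeHtml` is a hypothetical helper written for illustration and is not markdown-it's actual code.

```javascript
// Hypothetical helper (an assumption, not markdown-it's implementation):
// replaces the characters that a browser would otherwise treat as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')   // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

console.log(escapeHtml('<hello-world></hello-world>'));
// → &lt;hello-world&gt;&lt;/hello-world&gt;
```

A tag that has been escaped this way shows up as literal text in the browser, so jQuery selectors can no longer find it as an element.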

With AkashaCMS I'm using a large quantity of non-standard tags and processing them with jQuery code to produce the output.

I'm initializing markdown-it with:

var md        = require('markdown-it')({
  html:         true,        // Enable html tags in source
  xhtmlOut:     false,        // Use '/' to close single tags (<br />)
  breaks:       false,        // Convert '\n' in paragraphs into <br>
  // langPrefix:   'language-',  // CSS language prefix for fenced blocks
  linkify:      true,        // Autoconvert url-like texts to links
  typographer:  true,         // Enable smartypants and other sweet transforms

  highlight: function (/* str, lang */) { return ''; }
});
@puzrin
Member

puzrin commented Jan 7, 2015

That's a CommonMark specification problem. It has already been reported and is now under discussion:

http://talk.commonmark.org/t/raw-html-blocks-proposals-comments-wanted/983

I think the spec will resolve it in less than a month, and I will update the parser immediately after that. Is that OK?

@robogeek
Author

robogeek commented Jan 7, 2015

I tried to read that but it got pretty technical, so I'm not quite understanding the proposal. It does explain why I'd seen a difference between a tag at the beginning of the line, and tags in the middle of some text.

For my purposes, supporting non-standard tags only at the beginning of a line isn't useful. They need to be supported anywhere.

At the moment I'm trying earlier versions of markdown-it as a workaround, or I might have to switch back to Remarkable ...

@puzrin
Member

puzrin commented Jan 7, 2015

Hm... they have the same codebase, and I don't remember changing the HTML tag logic. Are you sure it worked better before?

@puzrin
Member

puzrin commented Jan 7, 2015

The version on master should accept dashes in tag names now. But if you need those tags to be block-like (not wrapped in paragraphs), you have to extend this list: https://github.com/markdown-it/markdown-it/blob/master/lib/common/html_blocks.js

var blockTags = require('markdown-it/lib/common/html_blocks');

blockTags.push('my-block-tag');

I also recommend reading the upcoming spec changes and posting your suggestions there if you see any problems. This parser strictly follows the CommonMark spec.

In the worst case, you can replace the appropriate parser rules with your own.
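As a rough illustration of why extending the block-tag list matters, the sketch below checks whether a line opens with a tag whose name is in such a list. The short list and the `startsHtmlBlock` function are assumptions made for this example. The real list lives in lib/common/html_blocks.js, and the parser's actual rule may differ.

```javascript
// Hypothetical, shortened list for illustration; the real one is in
// markdown-it's lib/common/html_blocks.js.
const blockTags = ['div', 'p', 'table', 'pre'];

// Hypothetical check: true when the line opens with a (possibly closing) tag
// whose name appears in the given list. Dashes are allowed in the tag name.
function startsHtmlBlock(line, tags) {
  const match = /^<\/?([A-Za-z][A-Za-z0-9-]*)(?=[\s/>]|$)/.exec(line);
  return match !== null && tags.includes(match[1].toLowerCase());
}

console.log(startsHtmlBlock('<my-block-tag>', blockTags)); // → false
blockTags.push('my-block-tag');
console.log(startsHtmlBlock('<my-block-tag>', blockTags)); // → true
```

Under this reading, a custom tag that is not in the list would still be handled as inline HTML and end up wrapped in a paragraph.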

@puzrin
Member

puzrin commented Jan 7, 2015

> I tried to read that but it got pretty technical, so I'm not quite understanding the proposal. It does explain why I'd seen a difference between a tag at the beginning of the line, and tags in the middle of some text.

http://spec.commonmark.org/0.15/#html-blocks

Your tag could work if it were part of an HTML block. For example:


<!-- -->
<hello-world></hello-world>

When the parser finds the start of an HTML block (<!-- in this case), it treats everything up to the next empty line as HTML.
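The "everything until the next empty line" rule can be sketched as follows. `htmlBlockEnd` is a hypothetical function written for this example, assuming the start line has already been recognised as opening an HTML block. It is not taken from markdown-it.

```javascript
// Hypothetical sketch: given the document's lines and the index of a line that
// starts an HTML block, return the index one past the block's last line.
// The block runs until the next blank line or the end of the document.
function htmlBlockEnd(lines, start) {
  let end = start + 1;
  while (end < lines.length && lines[end].trim() !== '') {
    end++;
  }
  return end;
}

const lines = ['<!-- -->', '<hello-world></hello-world>', '', 'Next paragraph'];
const end = htmlBlockEnd(lines, 0);

// Prints the comment line and the <hello-world> line; the paragraph after the
// blank line is not part of the block.
console.log(lines.slice(0, end).join('\n'));
```

This is why putting the custom tag directly under the comment line lets it through untouched: it is swallowed as part of the same HTML block.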

@puzrin
Member

puzrin commented Jan 8, 2015

Released 3.0.2, which allows dashes in tag names.

Closing?
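For anyone reading later, the difference between rejecting and accepting dashes can be pictured with two tag-name patterns. Both regular expressions below are assumptions made for illustration and are not copied from markdown-it's source.

```javascript
// Assumed patterns for illustration only (not markdown-it's actual regexes).
// Without dashes: a letter followed by letters or digits.
const tagNameNoDash = /^<([A-Za-z][A-Za-z0-9]*)\s*>/;
// With dashes: the same, but '-' is also allowed after the first letter.
const tagNameWithDash = /^<([A-Za-z][A-Za-z0-9-]*)\s*>/;

// Hypothetical helper: does the text begin with an opening tag under the pattern?
function isOpenTag(pattern, text) {
  return pattern.test(text);
}

console.log(isOpenTag(tagNameNoDash, '<helloworld>'));    // → true
console.log(isOpenTag(tagNameNoDash, '<hello-world>'));   // → false
console.log(isOpenTag(tagNameWithDash, '<hello-world>')); // → true
```

A parser using the first pattern would not recognise `<hello-world>` as a tag at all and would fall back to escaping it as text, which matches the behaviour reported at the top of this issue.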

@robogeek
Author

robogeek commented Jan 9, 2015

Yes it looks good. Thank you.

@puzrin puzrin closed this as completed Jan 9, 2015