Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custome Parser using document.createElement does not allow html5 (div inside p) #314

Closed
geyang opened this issue Sep 13, 2014 · 11 comments
Closed
Labels

Comments

@geyang
Copy link
Collaborator

geyang commented Sep 13, 2014

Hi @SimeonC and @tarjei,

I am implementing a custom parser to clean up the textvalue when I ran into this rather interesting bug.

TLDR;

basically, when I use createDocumentFragment to create a shadow element to parse the textvalue, it incorrectly parses textNodes in a paragraph that are after child spans into a new paragraph.

example

content:

<p>lalala<div>haha</div>haha</p>

gets converted into

<p>lalala</p><div>haha</div><p>haha</p>

code

            mathJaxParser: function (textValue) {
                /** regex is NOT a good way to parse the text!!
                 * one can not make sure that string in the text field are properly
                 * escapped.
                 */
                try {
                    var i;
                    var fragment = document.createDocumentFragment();
                    var base = document.createElement("span");
                    fragment.appendChild(base);
                    base.innerHTML = textValue;
                    console.log(base);
                    console.log(textValue);

My thoughts

<div> inside <p> is not supported in html4 standard, however it is support in html5. Any thoughts?

ps: the code is copied from @tarjei's awesome math-jax textAngular project here:

@SimeonC
Copy link
Collaborator

SimeonC commented Sep 15, 2014

Use spans not divs maybe. I think that you'd have to set display: inline-block; or display: inline; on the div in that case to not break the paragraph display. Whether it's supported in the standard doesn't necessarily mean that the browser supports it properly yet which is likely the issue you're getting.

@geyang
Copy link
Collaborator Author

geyang commented Sep 16, 2014

I did replace all divs with spans. I just felt that it was rather hacky. So you are saying that the createElement command is actually style gnostic?

@SimeonC
Copy link
Collaborator

SimeonC commented Sep 16, 2014

Yea, hacky, I'm sure it's the catchphrase for web development!

Styling I think is only used when rendering it to the user in the browser. I don't think style rules are evaluated until you actually attach it to the document. To be honest I've never tried to use a div inside a p tag as it just didn't really work. I suppose technically I may have when I do things like <person> as an inline directive but then I force that to display correctly with the inline-block style.

@geyang
Copy link
Collaborator Author

geyang commented Sep 16, 2014

The div actually comes from the mathJax rendered element for display math (versus inline math which are just spans).

I highly suspect the problem we are having with list elements could also be attributed to how browser handles these. I will try to read through all of our issues on that when I am done with what I am working on at hand.

The problem is that this kind of issues will likely come up everytime we use a custom parser. In my case, before I pass the textValue into the createElement parser, I had manually replace all of the non-compatible elements with compatible elements. I got away by simple regexing it out.

Any thoughts on this @tarjei? Since the code originally came from you :)

@SimeonC
Copy link
Collaborator

SimeonC commented Sep 16, 2014

Yea, that possibly was the problem originally with lists. Then I re-wrote lists to use our own logic so it's mostly me to blame for lists now.

@geyang
Copy link
Collaborator Author

geyang commented Sep 16, 2014

I will look into your code then! I have meant to do that for a while now.

@tarjei
Copy link

tarjei commented Sep 16, 2014

@episodeyang my first thought is that you should also remember that there are quite large differences in the markup browsers generate :)

I.e. if you are aiming for perfect control of the markup generated, you might be in for a world of pain :)

@SimeonC
Copy link
Collaborator

SimeonC commented Sep 17, 2014

I'd say that you definitely are in for a world of pain if you're trying for perfect control of the markup. As is shown by these XHTML issues that have started popping up.

@geyang
Copy link
Collaborator Author

geyang commented Sep 19, 2014

@SimeonC and @tarjei

hahaha I really enjoyed your comment. It is really nice to be able to learn from someone who has a lot more experience.

The goal here is not to get perfect control of the markup, but to be able to parse and do some basic custom modification to the markup after it is returned by ngModel. Right now the problem is that the textValue returend by the ngModel is okay. However when I do createElement(textValue) it parses the original markup to some html4 standard compliant markup.

So here is the fundamental question:

So @tarjei basically used the createElement() function as a html 2 dom parser. If such native parser is incompatible with html5 standard, what other alternatives do I have? I like how @tarjei implemented his math parser, and am imagining myself using the same pattern to do a lot of cool custom stuff. But a bad (standard incompatible) parser in the first step is prohibitive.

Other user cases I have right now are:

  • I have <p> tags inside <li> tags. might be able to use regex to replace temporarily but non-trivial to make it robust.
  • I have <div> tags inside <span>

@SimeonC
Copy link
Collaborator

SimeonC commented Sep 19, 2014

I'm not sure why you're using createElement, personally I just use var text = angular.element("<div></div>"); text.html(textValue); then when you're done use result = text.html();. That's what we've used in textAngular since the start.

Have a look here: https://github.com/fraywing/textAngular/blob/master/src/textAngular.js#L1409-L1433, that might get you along the correct path, jqlite in angular is close to as functional as jquery for our general needs so it works well enough as a DOM parser and traversal tool.

@SimeonC
Copy link
Collaborator

SimeonC commented Feb 5, 2015

Closed due to inactivity.

@SimeonC SimeonC closed this as completed Feb 5, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants