Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Render unescaped HTML #2

Closed
AndreasHeintze opened this issue Sep 29, 2015 · 25 comments
Closed

Render unescaped HTML #2

AndreasHeintze opened this issue Sep 29, 2015 · 25 comments
Assignees

Comments

@AndreasHeintze
Copy link

In the docs it says we can define a raw tag to render unescaped HTML.
Well that works, but it's not very clean.

<raw content="{ json(aListData.data).toString() }"></raw>

I would like Riot to support an alternative way of displaying unescaped HTML, why not like this:

{{ html }}

{{ json(aListData.data).toString() }}

If it's not that hard to solve and if it doesn't increase the code size too much...

@GianlucaGuarini
Copy link
Member

I would just use something like:

{! myHtml !}

Where the ! flag enables the unescaped html rendering using any kind of template delimiters

@tipiirai
Copy link

How about double {{ brackets }} ? Would be analogous to Moustache and Handlebars with triple brackets.

@GianlucaGuarini
Copy link
Member

@tipiirai it will be a problem for the users using other template delimiters. What should use someone using riot.settings.brackets = '${ }'?

@sylvainpolletvillard
Copy link

Couldn't we just improve the <raw> tag to do so : <raw>{html}</raw> ?

@tipiirai
Copy link

tipiirai commented Oct 5, 2015

Raw tag is actually harder to implement in above way since there is no (easy) way to access the HTML. I prefer a custom delimiters for the job. Not sure why checking ! character is easier than { character when using custom template delimiters.

I think @aMarCruz is best to know.

@aMarCruz
Copy link
Contributor

Hi @tipiirai , you are right, and checking for ! is more convenient. The popular bracket { is "in use" by function bodies, literal objects, regexes, es6 classes, methods, and template strings, je... and of course by riot.

In the new version of tmpl (I'm testing with riot right now) there a little hack that allows compile with --type es6 --expr and shorthands: a caret. In expressions such as {^ foo: 1 }, the caret is removed and the expression is not passed to babel.
I think something like that can be used for raw html with quoted attribute values, or .tag files. But again, custom tags embedded in the markup (* .html files) containing characters not allowed by the html specs for text elements can be altered by browsers, and <> are in these set. The only "secure" from, to my knowledge, is to use a CDATA block.

@aMarCruz
Copy link
Contributor

Taking up this issue and thinking a bit more...
The change to insert HTML occurs at mount time (we can not use innerHTML, right?) and therefore the compiler needs to know that an expression is raw text, and it then tell to riot.tag2 about this.

For simplicity the compiler needs only one character starting the expression, as in the {^ } hack in my previous post this needs to be an invalid starting JS character, to avoid confusing the parser... e.g. It can not be ! because {! foo } is a valid expression. Goods chars are =, #, $, @, etc. no problem here.

The real question is how to tell to riot.tag2 (and to the mount function) an expression is raw text? preserving this first non-valid char?

This is a problem that I found with the precompiled expressions, in order to support mixed modes. How to know when the expression was already compiled? expressions has no meta-data. Because this, we are using the riot- and __ prefixes and other tricks.

...maybe I'm wrong and this is not a problem. Thoughts?

@aMarCruz
Copy link
Contributor

Thinking more...
the compiler needs to take care of the first character to temporarily remove it if a parser runs in the expression, nothing more. In the tag construction, riot removes the character and inserts the text as html.
Seems easy...

@tipiirai
Copy link

My preference would still be {{ str }}. The next best is {< str }, because it feels like outputting HTML (starts with "<") on UNIX shell (redirection operator).

I like {= str } also, which is pretty standard in templating, but usually it means escaped output.

@sylvainpolletvillard
Copy link

Double curly braces are used by default by very popular libraries like Angular, Handlebars (Ember), Mustache... Personnally I often put double curly braces by mistake inside my riot tags, because of the mustache background I come from.

I am concerned that double curly braces could now work, but without the security provided by HTML escaping. Many users could use it by mistake and have tags that seem to work perfectly, but are vulnerable to XSS attacks.

Inserting raw HTML content is an uncommon need, we do not need the syntax to be beautiful or handy. It must be explicit and warns the user about the risks. I vote for an explicit tag or for @GianlucaGuarini proposal with exclamation marks {! myHtml !}

@aMarCruz
Copy link
Contributor

Why we need new brackets? the brackets is not the problem.
With a single character at the beginning of the expression both the compiler and tmpl know that this is html.
Inject HTML in the DOM can be done easily, too.

The real problem is riot reads expressions from the DOM. This is the cause of many many issues.
Sorry, I have made several posts about this but my english is poor. So again, think about

  <p>{ "hello<br>world" }</p>

the compiler generates code for call riot.tag2 with the html param (a JS string) as '<p>{&quot;Hello<br>world&quot;}</p>'
riot.compile runs the generated code through the browser and call its callback (riot.mount).
riot.mount instantiate a Tag (in mountTo). The Tag ctor calls mkdom.
mkdom create a root html element (mostly a div) and sets the innerHTML to the JS string.
Next, tag.mount is called. Inside this, parseExpressions is called.
Remember div.innerHTML contains '<p>{"Hello<br>world"}</p>' and all is ok, yes? ...err no.

parseExpressions start walking the div through the firstChild/nextSibling html properties ...now I'm sure you see now the problems to come...
fistChild of <p> is a text node, but there are three:
1. {"Hello (#text)
2. <br> (HTMLBRElement)
3. world"} (#text)
browser's html parser (correctly) interpreted its content, which breaks the tmpl function.

Another things like <p>{ "<span>foo</span>" }</p> does not breaks tmpl, but do not generate the expected result. Raw html can not be inserted by expressions with the current implementation.

We need not depend on the browser for reading the expressions source. I think this is inneficient anyway. In my opinion, we have 2 choises:
a) Go one-way: Evaluate the expression from the vdom, and test vs the real DOM with the results.
b) Use of precompiled expressions. This requires changes in all the code.

With both solutions, we lose the ability to set/modify expressions directly into the DOM.

EDIT: This also explains why I am denying the use of < > in custom brackets.

@aMarCruz
Copy link
Contributor

...mmm thinking more...
perhaps another solution is that the compiler replaces the expression with a call to a precompiled special function that bypasses the text throgh the html parser. I will after some testing with this.

@GianlucaGuarini
Copy link
Member

@aMarCruz I understand the issue but I think we can just use a simpler solution without changing to much the riot source code: we can simply parse the expressions tagging the ones using a special template delimiter <p>{! Hello<br/>there !}</p> ( so they should no be escaped from the compiler ). Then in the update method we simply print the unescaped html string in the expression as el.innerHTML. I would like to not overcomplicate the things and the source code.

@aMarCruz
Copy link
Contributor

The compiler does not escape anything. Unless I'm missing something, conversion from "<br>" into "\n" the HTMLBRElement happens in the browser after mkdom injects the correct html string. This happens with other html elements, too. I find no way to avoid this (in the browser).

@aMarCruz
Copy link
Contributor

...or you mean the compiler must create a special, precompiled function for hidding the desired string from the browser's html parser?

@aMarCruz
Copy link
Contributor

e.g. assume <p>{@ Hi<br> }</p>.
the compiler see '@' as first char inside the expression (any brackets it have), and replaces this with a precompiled expression id, to register the function with its id in the tmpl cache:

riot.tag2(`tag`, `<p>{#0001}</p>`, '', '', function (opts) {
}, { "#0001": function(){ return "Hi<br>" } });

...now, riot.tag2 call to an inner tmpl function (e.g. registerFn), to insert these fn into the cache:

function registerFn(obj) {
    extend(cache, obj)
}

mkdom set div.innerHTML to <p>{#0001}</p>. The html parser sees this as text. There is no conversion.

Later, update see the format and knows this is raw html, anyway calls tmpl with "{#0001}". tmpl detect the format and call the #0001 function in the cache. Finally, update injects the html.

I'm ok?

@GianlucaGuarini
Copy link
Member

@aMarCruz I guess this are the steps to render unescaped strings in a tag element.
Assuming we use the string <p>{! '<span style="color: red;">Hi</span><br/><b>there</b>' !}</p>:

  • the riot-compiler should leave untouched the string (maybe it needs to escape the quotes)
  • the riot-tmpl should contain a function isRaw to detect expressions starting and ending with !
    • we can make the ! an option like riot.settings.rawDelimiter
  • the riot-tmpl should return only <span style="color: red;">Hi</span><br/><b>there</b>
  • in the update method in riot we just check whether isRaw on an expression is true and in that case we will print the string as innerHTML

At moment I would not invest too much time working on this feature (we have too many things to do), we can plan it for riot 2.4.0

@GianlucaGuarini
Copy link
Member

@aMarCruz can you have a look at this feature please? I would like to include it in the next riot release. Let me know if I should work on it as well

@aMarCruz
Copy link
Contributor

aMarCruz commented Dec 1, 2015

@GianlucaGuarini , sure.

@aMarCruz
Copy link
Contributor

aMarCruz commented Dec 2, 2015

Ok, This jsbin have 2 samples, one from riot#744, the other from the @GianlucaGuarini sample above, but using expressions properties for <span> and color.
The trick to avoid the parseExpressions() issues was simple and obvious: encode the <> chars.

I'm using = as flag (e.g. {= '<html>' }) because the '!' is a valid JS starting char (e.g. {!foo}), so I need one additional test for the closing sequence !}.
< is good too, but I don't like {< '<br>' } or {< x > 0 ? '<hr>' : '' }.
For simplicity and to not rely on riot.settings, I would prefer a non-customizable flag.

@GianlucaGuarini ,
The compiler temporarily removes the = for any external parsers -- babel, etc. Double quotes do not need to be encoded, in fact, this cause issues to fix in this version.
You can use tmpl.isRaw. The only change in tmpl() is to remove =, from there, the source is treated as a normal expression.

The code in this block (only for test) inject the elements in the parent node 'cause we can't use innerHTML here (all the expressions live in #text nodes) and I don't know if it breaks parent-child relations or generates sync issues. So please write this part and test, I will push the PR for the other modules when you are ready.

@GianlucaGuarini
Copy link
Member

thanks @aMarCruz I cant wait to test it :)

@GianlucaGuarini
Copy link
Member

I thought about this feature and I would like that riot-tmpl could be smart enough to handle this feature.
For expressions like:

{! raw } Hello { user }

I would like that riot-tmpl will produce:

// tmpl(string, data, mustEscape)
var str = '{! raw } Hello { user }',
  data = { raw: '<b>Cool</b>', user: '<i>Foo<i>' }
tmpl(str, data, true) // escape only the output of expressions not prefixed with `!`
// => <b>Cool</b> Hello &lt;i&gt;Foo&lt;i&gt;
tmpl(str, data, false || undefined) // behave as the current riot-tmpl
// => <b>Cool</b> Hello <i>Foo<i>

@aMarCruz let me know if you need help on this please. This will be part of a riot major release 3.0.0, so this means we can start tagging it riot-tmpl@3.0.0-alpha

@aMarCruz
Copy link
Contributor

aMarCruz commented Jun 3, 2017

This issue will be solved in the near future.

@aMarCruz aMarCruz closed this as completed Jun 3, 2017
@fabien
Copy link

fabien commented Jun 24, 2017

@GianlucaGuarini @aMarCruz I'm using riot.util.tmpl throughout my application whenever I need to evaluate an expression, this works nicely. However, by default the results will be unescaped. This is different from its use within Riot itself, which uses the DOM methods to stay safe.

For now, I solved this with a custom method:

riot.evalTmpl = function(expression, data, mustEscape) {
    expression = String(expression || '').replace(/(\{\s*\!?)\s*([^}]*)\s*\}/g, function(m, prefix, expr) {
        if (prefix.indexOf('!') > -1 || !mustEscape) {
            return '{ ' + expr.trim() + ' }';
        } else {
            return '{ _.escape(' + expr.trim() + ') }';
        }
    });
    return riot.util.tmpl(expression, data);
};

What do you think if my approach for the time being? It's similar to your proposal, but of course it's more of a syntactic-sugar preprocessor trick right now.

@GianlucaGuarini
Copy link
Member

@fabien we will completely rewrite riot-tmpl in riot@4 so at moment I wouldn't like to add new features to the current implementation, thanks for sharing your code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants