Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanitization is causing some shortcodes to not render correctly in the MarkdownBody #16226

Open
DrewBrasher opened this issue Jun 3, 2024 · 13 comments

Comments

@DrewBrasher
Copy link
Contributor

I have a module that adds a shortcode. The short code creates a span with a data- attribute but the attribute does not get rendered in the MarkdownBody if the Sanitize HTML checkbox is checked in the part settings.

You can reproducer the issue by using the Blog Template and my Content Warning Module (https://github.com/DrewBrasher/OrchardCoreModules).

The HtmlBody renders the shortcode correctly even with Sanitization turned on so I think the MarkdownBody should do the same.

This was discussed some in discussion #13643 (reply in thread)

@MikeAlhayek
Copy link
Member

Alternative approach is to add exclusion rule so that your custom data is not removed with something like this

services.Configure<HtmlSanitizerOptions>(o =>
{
    o.Configure.Add(new System.Action<Ganss.Xss.HtmlSanitizer>(o =>
    {
        o.AllowedAttributes.Add("data-custom...");
    }));
});

@DrewBrasher
Copy link
Contributor Author

Thanks @MikeAlhayek, I've added that and it works.

@Piedone
Copy link
Member

Piedone commented Jun 4, 2024

The basic question is, should shortcode outputs run through sanitization? I don't think so, but since they inject something into the HTML output of the part/field, it might be hard to try to keep them separate.

@DrewBrasher
Copy link
Contributor Author

I see several places in the source where it looks like the sanitization could just be moved to before the shortcode processing. Here is one example:

model.Html = await _shortcodeService.ProcessAsync(model.Html,
new Context
{
["ContentItem"] = markdownBodyPart.ContentItem,
["TypePartDefinition"] = context.TypePartDefinition
});
if (settings.SanitizeHtml)
{
model.Html = _htmlSanitizerService.Sanitize(model.Html ?? "");
}

But there may be other things I don't know about that would need to be considered.

@Piedone
Copy link
Member

Piedone commented Jun 6, 2024

Also something to keep in mind is that shortcode inputs can be used for HTML/script injection. So, they themselves should be sanitized, but actually the shortcode output as well, since you can build an offending code by combining the shortcode with something that in itself isn't. Hmm, maybe we should keep this as it is.

@DrewBrasher
Copy link
Contributor Author

That makes sense to me. The solution @MikeAlhayek provided solved the issue I was having.

@MikeAlhayek
Copy link
Member

I think it's safe to evaluate short-codes after sanitizing. The idea behind sanitizing it to ensure that the use does not add malicious code. But with shortcodes, we provide HTML that is kind of trusted.

@Piedone
Copy link
Member

Piedone commented Jun 6, 2024

The problem is that sanitizing shortcode inputs doesn't guarantee that the output won't be malicious (well, sanitization is not bulletproof anyway, but still). There are all kinds of creative tricks for XSS, including tricking shortcode like this.

@sebastienros
Copy link
Member

The HtmlBody renders the shortcode correctly even with Sanitization turned on so I think the MarkdownBody should do the same.

It would be interesting to understand why.

@sebastienros
Copy link
Member

@deanmarcussen can you refresh our memory about when we can or cannot use shortcodes and what to expect wrt sanitization? Then we'll write it in the docs so we don't forget.

@sebastienros sebastienros added this to the 2.x milestone Jun 6, 2024
Copy link

github-actions bot commented Jun 6, 2024

We triaged this issue and set the milestone according to the priority we think is appropriate (see the docs on how we triage and prioritize issues).

This indicates when the core team may start working on it. However, if you'd like to contribute, we'd warmly welcome you to do that anytime. See our guide on contributions here.

@DrewBrasher
Copy link
Contributor Author

It would be interesting to understand why.

It looks like to me that for the HtmlBody, the sanitization happens on update while the shortcode is still in shortcode notation.

model.Html = settings.SanitizeHtml ? _htmlSanitizerService.Sanitize(viewModel.Html) : viewModel.Html;

Instead of after the shortcode is rendered to hrml.

@deanmarcussen
Copy link
Member

so probably the point here, is that by default Markdown does not support adding extra html, through shortcodes, whereas an Html editor, specifically does.

the idea being that Markdown is/was considered safe by not allowing any extra html in it, whereas Html is considered questionable, as you can write html in it, so we just santize essentially, that you didn't add a script to it.

You can't sanitize shortcodes themselves when they are rendered, you need to santize the total output of the shortcodes, in the medium it is being used, because otherwise you could write split shortcodes, which when the output is combined would generate bad html.

@MikeAlhayek 's suggested approach here is the appropriate one I believe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants