Encoding svg dataurl #3418

Closed
FreddyA opened this issue Oct 2, 2023 · 9 comments

Comments

@FreddyA

FreddyA commented Oct 2, 2023

The encoding of SVGs produced by the dataurl loader does not validate with validator.w3.org.
Error: "Bad value for attribute src on element img: Illegal character in scheme data: < is not allowed."

That is, the characters < and > are not encoded in the data URL.
One solution would be to encode < and >.
Another would be an option to use base64 encoding for SVGs.

@hyrious

hyrious commented Oct 2, 2023

Well, that's because < and > actually work in browsers. You may read the initial issue #1843 to see how this landed.

@FreddyA
Author

FreddyA commented Oct 2, 2023

OK, it works, but < is not a legal character according to the W3C. I understand that leaving it unencoded saves some data, but since it does not validate, is there a way around this?

@evanw
Owner

evanw commented Oct 3, 2023

Since you haven't provided any information about how to reproduce this, I'm going to make something up:

<!doctype html>
<img src="data:image/svg+xml,&lt;svg width='100' height='100' xmlns='http://www.w3.org/2000/svg'&gt;&lt;rect width='100' height='100' fill='red'/&gt;&lt;/svg&gt;">

That should show up as a red square. That appears to work fine in Chrome, Safari, and Firefox. Which specific browsers does this HTML not work in for you (including the version information)? Alternatively, please provide enough information to reproduce an incorrect rendering given a test case of yours (along with a specific browser).

If you need base64 encoding because some additional tooling you're using needs data URLs to be in base64 form, then you can explicitly use the base64 loader yourself: https://esbuild.github.io/content-types/#base64.
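
For reference, a base64 data URL can also be assembled by hand from the SVG text. The following is a minimal Python sketch and not esbuild's code; the helper name `svg_to_base64_data_url` is made up for illustration.

```python
import base64


def svg_to_base64_data_url(svg_text: str) -> str:
    """Return a data URL with the SVG encoded as base64 (hypothetical helper)."""
    # Base64 output uses only A-Z, a-z, 0-9, '+', '/', and '=',
    # so no '<', '>', or quote characters remain in the URL.
    payload = base64.b64encode(svg_text.encode("utf-8")).decode("ascii")
    return "data:image/svg+xml;base64," + payload


svg = '<svg width="100" height="100" xmlns="http://www.w3.org/2000/svg"><rect width="100" height="100" fill="red"/></svg>'
url = svg_to_base64_data_url(svg)
```

The trade-off is size: base64 output is roughly a third larger than the text it encodes.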

@FreddyA
Author

FreddyA commented Oct 3, 2023

The problem is that the characters < and > are not encoded, and that is not valid in the src attribute.
If I take the image from your example
<svg width="100" height="100" xmlns="http://www.w3.org/2000/svg"><rect width="100" height="100" fill="red"/></svg>
that would be encoded by esbuild when using the loader dataurl to
<svg width=&quot;100&quot; height=&quot;100&quot; xmlns=&quot;http://www.w3.org/2000/svg&quot;><rect width=&quot;100&quot; height=&quot;100&quot; fill=&quot;red&quot;/></svg>

To check this, paste the following into the validator at https://validator.w3.org/nu/#textarea:

<!doctype html>
<html lang="en">
<head><title>svg</title></head>
<body>
<img src="data:image/svg+xml,<svg width=&quot;100&quot; height=&quot;100&quot; xmlns=&quot;http://www.w3.org/2000/svg&quot;><rect width=&quot;100&quot; height=&quot;100&quot; fill=&quot;red&quot;/></svg>" alt="">
</body>
</html>

As I see it, < and > are supported unencoded by all modern browsers, but it is still not legal HTML.
< and > are not encoded at all, and the rest is HTML-encoded, when the whole SVG should be URL-encoded.
https://html.spec.whatwg.org/#refsRFC2397

Attribute values in [RFC2045] are allowed to be either represented as
tokens or as quoted strings. However, within a "data" URL, the
"quoted-string" representation would be awkward, since the quote mark
is itself not a valid urlchar. For this reason, parameter values
should use the URL Escaped encoding instead of quoted string if the
parameter values contain any "tspecial".
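
To make "URL-encoded" concrete, here is a minimal Python sketch of percent-encoding the whole SVG before putting it into a data URL. It is only an illustration of the request, not what esbuild does; the helper name `svg_to_percent_encoded_data_url` is hypothetical, and `safe=""` is a deliberately strict choice.

```python
from urllib.parse import quote, unquote


def svg_to_percent_encoded_data_url(svg_text: str) -> str:
    """Percent-encode every character outside the URL 'unreserved' set (hypothetical helper)."""
    # safe="" makes quote() escape '/', ':' and '=' as well; only letters,
    # digits, and '_', '.', '-', '~' are left as-is.
    return "data:image/svg+xml," + quote(svg_text, safe="")


svg = '<svg width="100" height="100" xmlns="http://www.w3.org/2000/svg"><rect width="100" height="100" fill="red"/></svg>'
url = svg_to_percent_encoded_data_url(svg)

# Decoding the payload gives back the original SVG text.
assert unquote(url.split(",", 1)[1]) == svg
```

Encoding this strictly makes the URL noticeably longer than the unescaped form, which is the size cost mentioned earlier in the thread.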

@evanw
Owner

evanw commented Oct 3, 2023

<!doctype html>
<html lang="en">
<head><title>svg</title></head>
<body>
<img src="data:image/svg+xml,<svg width=&quot;100&quot; height=&quot;100&quot; xmlns=&quot;http://www.w3.org/2000/svg&quot;><rect width=&quot;100&quot; height=&quot;100&quot; fill=&quot;red&quot;/></svg>" alt="">
</body>
</html>

I'm confused. How did you even get this HTML in the first place? All the HTML templating languages I've used automatically escape < and > in HTML attributes for you so that this can't happen.

Regardless, you didn't respond to my previous post:

Which specific browsers does this HTML not work in for you (including the version information)? Alternatively, please provide enough information to reproduce an incorrect rendering given a test case of yours (along with a specific browser).

@evanw
Owner

evanw commented Oct 17, 2023

I'm closing this issue due to lack of a follow-up response. AFAIK the data URLs that esbuild generates work fine in all known browsers. It is not esbuild's job to pre-escape the characters in its data URLs for all possible contexts that they might be used in. Instead, you should be properly escaping data for the relevant context when building HTML using string concatenation.
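
As a rough illustration of escaping for the HTML attribute context, here is a minimal Python sketch. The helper name `img_tag` is hypothetical, and the data URL is written out by hand rather than taken from real esbuild output. It only covers HTML escaping; it does not percent-encode the URL itself.

```python
import html


def img_tag(data_url: str, alt: str = "") -> str:
    """Build an <img> tag, escaping values for an HTML attribute (hypothetical helper)."""
    # html.escape with quote=True replaces &, <, >, " and ' with character references.
    return '<img src="{}" alt="{}">'.format(
        html.escape(data_url, quote=True), html.escape(alt, quote=True)
    )


data_url = 'data:image/svg+xml,<svg width="100" height="100" xmlns="http://www.w3.org/2000/svg"><rect width="100" height="100" fill="red"/></svg>'
tag = img_tag(data_url)
```

An HTML parser decodes the character references again, so the browser still receives the original data URL.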

@raplemie

raplemie commented Oct 19, 2023

<!doctype html>
<img src="data:image/svg+xml,&lt;svg width='100' height='100' xmlns='http://www.w3.org/2000/svg'&gt;&lt;rect width='100' height='100' fill='red'/&gt;&lt;/svg&gt;">

This is not what esbuild outputs, though. esbuild does not encode < and > as is done in this example, nor does it convert double quotes to single quotes if they are present in the .svg file it reads. So while this may work, it is not what esbuild would produce.

Is it expected that the content of a targeted '.svg' file should already be encoded correctly?

As an example, esbuild will encode # and %, but not the other characters that are invalid in a URI (for SVG imports, mainly <, >, and ").
See:

if c == '\t' || c == '\n' || c == '\r' || c == '#' || i >= trailingStart ||
(c == '%' && i+2 < n && isHex(text[i+1]) && isHex(text[i+2])) {
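
To show what that condition selects, here is a Python sketch that mirrors only the two quoted lines. It is not a port of esbuild's function. It assumes `trailingStart` is an index from which every remaining character gets escaped, and that `isHex` tests for a hexadecimal digit; both are readings of the snippet, not confirmed here. Python indexes code points where Go indexes bytes, which only matters for non-ASCII input.

```python
from typing import Optional

HEX_DIGITS = "0123456789abcdefABCDEF"


def escape_like_quoted_condition(text: str, trailing_start: Optional[int] = None) -> str:
    """Percent-escape only the characters the quoted condition selects (illustrative)."""
    n = len(text)
    if trailing_start is None:
        # Assumption: with no trailing region, nothing is escaped on that account.
        trailing_start = n
    out = []
    for i, c in enumerate(text):
        # A '%' is escaped only when two hex digits follow it, so that it
        # cannot be mistaken for an existing percent escape.
        escapes_percent = (
            c == "%" and i + 2 < n
            and text[i + 1] in HEX_DIGITS and text[i + 2] in HEX_DIGITS
        )
        if c in "\t\n\r#" or i >= trailing_start or escapes_percent:
            # Emit each UTF-8 byte of the character as %XX.
            out.append("".join("%{:02X}".format(b) for b in c.encode("utf-8")))
        else:
            out.append(c)
    return "".join(out)


escaped = escape_like_quoted_condition('<svg fill="#f00">50%25\n</svg>')
# '<', '>' and '"' pass through unchanged; '#', the '%' and the newline are escaped.
assert escaped == '<svg fill="%23f00">50%2525%0A</svg>'
```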

@evanw
Owner

evanw commented Oct 19, 2023

Is it expected that the content of a targeted '.svg' file should already be encoded correctly?

The only escaping that the dataurl loader does is the escaping necessary for a browser to be able to interpret it correctly. It is not esbuild's job to pre-escape the characters in its data URLs for all possible contexts that they might be used in. In particular, esbuild's dataurl loader doesn't do any special additional escaping for embedding data URLs into HTML element attributes (or for embedding into CSS, or for embedding into JSON, or for embedding into YAML, etc.).

If you have a specific example of a browser that esbuild's data URLs don't work with, then please post that information here (the browser name, version, operating system, and any other information necessary to reproduce the problem). Without a demonstration that esbuild's dataurl loader behaves incorrectly in a browser, I consider esbuild's dataurl loader to be behaving correctly.

@raplemie

For a browser to be able to interpret it correctly.

OK, if that's the goal, rather than following the standard defined for data URLs, then I guess there is no issue.
