Base64 / Binary loader output is different between esbuild.build and esbuild.transform #2424
Comments
This doesn't seem like a bug to me. The transform API in JavaScript currently only takes a string: Line 502 in 40ef633
All strings in JavaScript are UTF-16 (or UCS-2 depending on how you look at it). Strings aren't intended for storing binary data. I'd expect something like this to happen if you put invalid Unicode characters into a string. So this issue really seems like a feature request for the
I understand what you're saying about strings in JavaScript being UTF-16. However, "strings" that are loaded from outside the system under your control (e.g. from a file, as in my example) may not be valid when interpreted as a JavaScript string.
I'm not suggesting that it's a bug, just confusing behaviour as I'm calling
Is there a way to do something like: if loader == base64 || binary, then skip trying to interpret the string as a valid JavaScript string in the transform bit of code?
Both the
console.log((await esbuild.transform('\xC4', { loader: 'base64' })).code)
console.log((await esbuild.build({ stdin: { contents: '\xC4', loader: 'base64' }, write: false })).outputFiles[0].text)
This prints the following:
module.exports = "w4Q=";
module.exports = "w4Q=";
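To see where "w4Q=" comes from, it helps to look at the bytes involved. The following is a minimal sketch using only Node.js built-ins; it does not call esbuild, and the helper name `utf8Base64` is hypothetical. It assumes the string is encoded as UTF-8 before being base64-encoded: the JavaScript string '\xC4' is the single code point U+00C4, which becomes the two bytes C3 84 in UTF-8.

```javascript
// Hypothetical helper, for illustration only: encode a JS string as UTF-8, then base64.
function utf8Base64(text) {
  const bytes = new TextEncoder().encode(text);
  return Buffer.from(bytes).toString('base64');
}

const textInput = '\xC4'; // one UTF-16 code unit, U+00C4
const utf8Hex = Buffer.from(textInput, 'utf8').toString('hex');
const textResult = utf8Base64(textInput);

console.log(utf8Hex);    // c384
console.log(textResult); // w4Q=
```

So a string input of '\xC4' is not the same thing as the raw byte C4: by the time it is base64-encoded it is already two bytes long.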
Thanks for continuing the look into it - I'm sure it's me trying to use esbuild wrong but I just want to be sure. I do see a difference between whether I use:
I'm not sure the above comparisons are totally valid because it returns "w4Q=" which when decoded is
In my particular case I was running into this in unit tests using esbuild-jest
Just to show the difference I mentioned - you can try reading a raw byte from a file on disk, so try this:
Gives the following output
Yes, this is the difference between textual input and binary input. I have added the ability to pass binary data as input which will let you do what you want (by omitting
Sample Reproduction:
I wrote a very simple (1 byte) showcase of the problem here with reproduction steps:
https://github.com/jasoncabot/esbuild-issue
Problem
When using esbuild to transform code, binary strings are treated as UTF-8, which can cause the output to be corrupted and transformed to include the UTF-8 replacement character EF BF BD when an invalid byte sequence is detected.
In my reproduction I use the base64 loader to transform the byte C4 and would expect the result to be a base64-encoded string that decodes to C4. However, it is transformed into a string that decodes to EF BF BD. This is also an issue with the binary loader (which is where I discovered it, but the base64 case was simpler to showcase the issue).
This is only an issue with transform; esbuild.build behaves as I would expect and preserves the binary data.
Expected
I would expect that base64 or binary data is treated as a raw stream of bytes and esbuild.transform to behave consistently with how esbuild.build works.
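The corruption described above can be reproduced without esbuild. This is a minimal sketch using only Node.js built-ins, assuming the lone byte C4 is decoded as UTF-8 text somewhere along the way; it illustrates the mechanism and is not esbuild's actual code path. The variable names are hypothetical.

```javascript
const raw = Uint8Array.of(0xc4); // the single byte from the reproduction

// Treating the byte as UTF-8 text: 0xC4 is a lead byte with no continuation
// byte, so the (non-fatal) decoder substitutes U+FFFD, the replacement character.
const asText = new TextDecoder('utf-8').decode(raw);
const roundTripped = Buffer.from(asText, 'utf8');

const lossyHex = roundTripped.toString('hex');
const lossyBase64 = roundTripped.toString('base64');

// Treating the byte as raw binary data preserves it.
const rawBase64 = Buffer.from(raw).toString('base64');

console.log(lossyHex);    // efbfbd
console.log(lossyBase64); // 77+9
console.log(rawBase64);   // xA==
```

Once the byte has passed through a string, the original value C4 cannot be recovered, which is why the input has to stay binary end to end for the base64 and binary loaders to preserve it.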