This repository has been archived by the owner on Jul 21, 2022. It is now read-only.

Allow BOM before hashbang #14

Closed
Eccenux opened this issue May 18, 2019 · 6 comments · Fixed by #15

Comments


Eccenux commented May 18, 2019

I think the spec should state explicitly whether a BOM is allowed before the hashbang.

The readme says that a hashbang is valid "only at the start of a Source Text". Since a BOM (byte order mark, here in UTF-8) is allowed in JavaScript, the actual script might start with something else, even if it appears to start with #.

Note that at the moment both Firefox 67 and Chrome 74 allow a BOM before the hashbang. I'm not sure whether the browser actually receives the BOM from the server, though. I tested with the micro scripts attached below, loaded on Windows over the file: protocol. Both browsers show the alert message, so a BOM before the hashbang is ignored.

shebang.zip

Contributor

bakkot commented May 19, 2019

As a data point, node allows a BOM to precede a #! - stripBOM is called before stripShebang (which do what you'd expect) - though it isn't totally clear that this is intentional, since node -c applies them in the reverse order and hence rejects a BOM preceding a #! (I've just filed a bug about this difference).

On the other hand, shebangs in executables on unix have to be the first two bytes; a BOM is not allowed to precede a #!. So I'm not sure it makes sense to allow it here either.

As to browsers accepting it, I expect that's down to them stripping the BOM before parsing. (The HTML spec probably says to do this, but I can't find where; maybe @domenic knows?) You can tell because if you inline the script, it throws an error if and only if the BOM is present.
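The ordering difference in Node described above can be sketched as follows. This is only an illustration: the two helpers reuse the names mentioned in the comment, but their bodies are assumptions written for this example, not Node's actual implementations.

```javascript
// Hypothetical re-implementations for illustration only; Node's real
// helpers may differ in detail.
const BOM = "\uFEFF";

function stripBOM(source) {
  // Drop a single leading U+FEFF, if present.
  return source.startsWith(BOM) ? source.slice(1) : source;
}

function stripShebang(source) {
  // Only strip when "#!" is the very first thing in the text.
  if (!source.startsWith("#!")) return source;
  const end = source.search(/[\r\n\u2028\u2029]/);
  return end === -1 ? "" : source.slice(end);
}

const input = BOM + "#!/usr/bin/env node\nconsole.log('hi');\n";

// BOM first, then hashbang: both are removed.
const bomFirst = stripShebang(stripBOM(input));

// Hashbang first, then BOM: the hashbang check sees U+FEFF at offset 0,
// so the "#!" line survives and a later parse would reject it.
const shebangFirst = stripBOM(stripShebang(input));

console.log(JSON.stringify(bomFirst));
console.log(JSON.stringify(shebangFirst));
```

With these assumed helpers, only the first ordering yields text that no longer contains the #! line.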

Contributor

bakkot commented May 19, 2019

As to what the current spec says, it does not allow a BOM to precede #!. The only special treatment given to U+FEFF in the spec is that it is considered to be WhiteSpace, which this proposal does not allow to precede a HashbangComment.
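As a rough sketch of that rule, a check for a leading hashbang only has to look at offset 0 of the source text. The function name below is hypothetical and this is a simplification of the grammar, not the spec's algorithm.

```javascript
// Hypothetical helper: true only when "#!" sits at offset 0 of the
// source text. Nothing, including U+FEFF or other WhiteSpace, may
// come before it.
function startsWithHashbang(sourceText) {
  return sourceText.startsWith("#!");
}

const plain = "#!/usr/bin/env node\n";
const withBOM = "\uFEFF#!/usr/bin/env node\n";
const withSpace = " #!/usr/bin/env node\n";

console.log(startsWithHashbang(plain));     // true
console.log(startsWithHashbang(withBOM));   // false
console.log(startsWithHashbang(withSpace)); // false
```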

Member

domenic commented May 19, 2019

BOM stripping is done as part of the decoding layer in browsers, before the ES spec touches the resulting source text. As far as I know, the ES spec operates on code points, not bytes, as its input, which in this paradigm would put the BOM out of scope.
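The effect of a decoding layer that strips the BOM can be observed with the standard TextDecoder API, which removes a leading BOM by default and keeps it when `ignoreBOM` is set. This is only an analogy for the behaviour described above; it does not claim that browsers load scripts through this exact API.

```javascript
// UTF-8 bytes for: BOM (EF BB BF) followed by "#!/x\n".
const bytes = new Uint8Array([0xef, 0xbb, 0xbf, 0x23, 0x21, 0x2f, 0x78, 0x0a]);

// Default behaviour: the BOM is consumed during decoding, so the
// resulting code points begin with "#!".
const stripped = new TextDecoder("utf-8").decode(bytes);

// With ignoreBOM: true the BOM is kept as U+FEFF in the output, so
// "#!" is no longer at the start of the text.
const kept = new TextDecoder("utf-8", { ignoreBOM: true }).decode(bytes);

console.log(stripped.startsWith("#!")); // true
console.log(kept.startsWith("#!"));     // false
console.log(kept.charCodeAt(0).toString(16)); // "feff"
```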

Contributor

bakkot commented May 19, 2019

Thanks @domenic. (And yeah: "ECMAScript source text is a sequence of code points".) That suggests that forbidding a BOM from preceding a #! (by giving it no special treatment, i.e., the current behavior) is the correct behavior.

I made a PR to add a bit about this to the readme.

Author

Eccenux commented May 19, 2019

Thanks. I added a note on MDN too.


Alhadis commented Aug 18, 2020

Another scenario where the presence of a BOM is relevant is with spec-conscious tooling. For example, an executable written in TypeScript might contain a BOM that gets stripped during transpilation:

#!/usr/bin/env node
type Foo = "bar" | 41.99999999;
// ... etc

How TypeScript treats a BOM preceding the hashbang should ideally match a spec. In the scenario above, the file isn't executed without preprocessing, so the question of parity with the shell becomes moot.
