Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline vs out of line module attributes #13

Closed
littledan opened this issue Nov 7, 2019 · 62 comments
Closed

Inline vs out of line module attributes #13

littledan opened this issue Nov 7, 2019 · 62 comments
Milestone

Comments

@littledan
Copy link
Member

This proposal is all about putting module attributes inline in the JavaScript source text. I believe this is better because it would be annoying to have to switch to a separate resource file to write the module type, and keep these in correspondence. Probably it would only be practical to generate from tools, not write by hand. I'd prefer if we can reduce the number of administrative steps we need tools for.

What do you think? Does anyone think that we put the attributes out of line, in some sort of resource file? If so, where should we put them? (unclear whether import maps would be suitable)

@devsnek
Copy link
Member

devsnek commented Nov 7, 2019

I prefer out of band, since that would make it really clear when you're using a system that doesn't care about the attributes (like type:json on node.js). I do realize the pain of tooling though.

@justinfagnani
Copy link

I vastly prefer in band due to overall DX. I think host environments can offer guidance on what they do with these attributes, and I'm not convinced that non-web environments would have no use for them.

@bmeck
Copy link
Member

bmeck commented Nov 8, 2019

Is there a problem having both? I know we have an out of band solution for some stuff already in Node with policies. It avoids some pretty gross updating that in-band starts to show once you add things like SRI.

@littledan
Copy link
Member Author

littledan commented Nov 8, 2019

I can definitely see the argument for both/either when it comes to SRI. Is there any reason why we'd want both for type, rather than mandate its inclusion inline?

(Edit: and, I would be OK with mandating SRI be out of line--maybe this is the best option)

@bmeck
Copy link
Member

bmeck commented Nov 8, 2019

@littledan

  • verbosity/synchronization issues if the specifier ever changes the destination format (via import map rewriting or via updated resource to new format [JS module => CSS/HTML module etc.]), can be compounded by re-export focused modules export * from 'X'; export * from 'Y';
  • auditing is centrally located (this would require manifesting things for every import though)
  • saves parsing time/verification per location (node policies already have 100ms+ startup just parsing all the pieces of deduplicated SRIs on a small/medium CLI app on a modern laptop)
  • moves away from programmer vigilance of doing it all possible locations and prevents mismatches like import 'x' type: 'json' in one source text and then import 'x'; in a different source text.

@littledan
Copy link
Member Author

Well, this "why not both" option is interesting. Does anyone want to take up designing this external resource file, and we can see if we could keep them in correspondence as parallel efforts? We will probably need such a file for SRI, and I don't see any particular downside to allowing the module type either there or inline. However, I do think it would be harmful for ergonomics to require that module type be declared out of line.

@devsnek
Copy link
Member

devsnek commented Nov 8, 2019

I think it also somewhat solves 17, 9, and 7, since different environments can just load different out of band information, and for the web at least, that out of band information can also secure css @import and such.

@bmeck
Copy link
Member

bmeck commented Nov 8, 2019

@littledan can you clarify that ergonomics concern a bit? Is specifying the type at each import location not unergonomic as well?

@littledan
Copy link
Member Author

Specifying the type at the import site is not perfect, but it's better than having to manually create a separate file to put the type in. I only see those files as viable to be created by tools. (For cases where people use tooling to generate the type declarations, there isn't really an ergonomics issue either way.)

@bmeck
Copy link
Member

bmeck commented Nov 8, 2019

@littledan the import location would need to determine the type expected for completion already which ~= using the response MIME if it doesn't want some human intervention to decide and that could be bad as it encourages blind acceptance. Also, for any runtime APIs like React.lazy or Worker, it would require extra care and completions might be non-trivial to implement. I don't think the ergonomics is actually fixed by completions, it is certainly aided though in that it allows easier filling of boilerplate.

@littledan
Copy link
Member Author

To clarify, I'd propose that the semantics on the Web would be that the module graph will fail to load if any MIME type doesn't match what is declared. I personally hope other environments take analogous approaches (#10).

I agree that completions are not 100% mitigations vs the previous plan to not have to declare anything at all, but it looks to me like the best option on the table.

@devsnek
Copy link
Member

devsnek commented Nov 8, 2019

I think it's important to note that so far, all the examples of fields we want have been of things inherent to the file being imported, not the import itself.

also given that it's impossible to statically figure out what worker constructors and import() are loading, it may not be possible to verify all the import locations.

@bmeck
Copy link
Member

bmeck commented Nov 8, 2019 via email

@devsnek
Copy link
Member

devsnek commented Nov 8, 2019

@bmeck that still seems inherent to the polyfill, you'd want it hoisted everywhere. unless you wanted to polyfill the polyfill, in which case you should probably change your entry point rather than trying to carefully hoist things in the magic order.

@devsnek
Copy link
Member

devsnek commented Nov 8, 2019

some webpack bundlers might prefer data inline, like specifying the dimensions of an imported image.

@bmeck
Copy link
Member

bmeck commented Nov 8, 2019

@devsnek for other things doing async operations in general hoisting would be beneficial when possible. I'd argue that hoist isn't necessarily inherent to the dependency. As it is stating that the dependency can eagerly evaluate prior to the current module, it does not state that it is always safe to do so, eg:

import 'hoist-add-globalThis' with hoist: true;
import 'module-using-globalThis';
await null; // causes sibling deps to go in parallel

In this case, TLA chose to allow siblings to evaluate in parallel but we need to hoist the globalThis due to how these siblings are defined. Without hoist you would potentially have a race here.

@littledan
Copy link
Member Author

Note that TLA will preserve the order of when siblings launch. So hoisting should be unnecessary in that particular case as long as the polyfill module doesn't use TLA itself.

@bmeck
Copy link
Member

bmeck commented Nov 8, 2019

@littledan then assume it does use TLA ;p

@littledan
Copy link
Member Author

@devsnek

also given that it's impossible to statically figure out what worker constructors and import() are loading, it may not be possible to verify all the import locations.

The idea would be that this verification happens reliably during runtime, so I think it should work just fine for dynamic import.

@xtuc
Copy link
Member

xtuc commented Nov 11, 2019

I think that tooling doing transpiling could take advantage of the inline syntax, discussed here #22, for many code generation tuning purposes.

The "why not both" is interesting. I can easily image a tool that could generate the inline syntax from the out-of-band file or the opposite for the developer.
I'm wondering how we would include this file on the web? As a HTML script tag with a new attribute? or could it be part of CSP headers?

@bmeck
Copy link
Member

bmeck commented Nov 11, 2019

I think that tooling doing transpiling could take advantage of the inline syntax, discussed here #22, for many code generation tuning purposes.

Yes and no, it somewhat depends on the undefined behavior of what happens if attributes collide. If a tool generates something like a pure attribute to avoid side effects on load in one file, but not in another inline may actually prove difficult since it requires keeping many locations in sync potentially.

I'm wondering how we would include this file on the web? As a HTML script tag with a new attribute? or could it be part of CSP headers?

That seems up to web specs, not TC39. I can imagine some hosts may not even wish to allow out of band attributes (or vice versa).

@xtuc
Copy link
Member

xtuc commented Nov 11, 2019

If some host only implement the inline syntax or the out-of-band format it might endup being more confusing to the developer, especially if they are getting out-of-sync.

That seems up to web specs, not TC39. I can imagine some hosts may not even wish to allow out of band attributes (or vice versa).

Yes, sure. It was just for my information and if we had some ideas already.

@bmeck
Copy link
Member

bmeck commented Nov 11, 2019

If some host only implement the inline syntax or the out-of-band format it might endup being more confusing to the developer, especially if they are getting out-of-sync.

I'm not sure I understand this, things are already showing they intend to get out of sync with things like Node, Deno, and Web loading different types of modules and not matching. What is the thing we are wishing to preserve in light of already diverging scenarios?

@littledan
Copy link
Member Author

littledan commented Nov 11, 2019

@bmeck

That seems up to web specs, not TC39. I can imagine some hosts may not even wish to allow out of band attributes (or vice versa).

In this repository, I am specifically encouraging us to all discuss how this feature will relate to particular host environments, even though we probably won't make all the final decisions here. If we just talk about environments in the abstract, I think it will be too confusing and vague.

@littledan
Copy link
Member Author

littledan commented Nov 11, 2019

@bmeck

What is the thing we are wishing to preserve in light of already diverging scenarios?

Personally, I hope we can build as much compatibility across environments as possible, as we're discussing in #10 . I'm not sure how we could already conclude that things are diverging.

@bmeck
Copy link
Member

bmeck commented Nov 11, 2019

@littledan Deno is loading typescript, Node is loading things with user land loaders, web is loading HTML/CSS, etc. The module pipelines and formats being loaded differ significantly and I don't see how they are intended to be framed as on a common path. Even if we have some level of compatibility it shouldn't be framed as being something that cannot be broken nor as affecting location of attributes being able to be kept consistent across all the environments. I just don't understand how/what is being able to be kept in sync. The origins of the proposal were a security concern by some of the web and the model the web encourages for loading, that does not universally apply to all environments either.

@littledan
Copy link
Member Author

Can I both agree with #13 (comment) and hope that we can find a useful, interoperable subset among host environments that are interested?

@bmeck
Copy link
Member

bmeck commented Nov 11, 2019

@littledan i'm wary of a recommended subset as that will be leverage that environments will use to impose things like their security model affecting other environments which may have differences that see no need for, gained ergonomics by enforcing them, and even might see problems implementing any attributes.

@littledan
Copy link
Member Author

Following more offline discussion that @bmeck and I had, I'm convinced that this proposal should not recommend any sort of common, interoperable subset for hosts. I'm happy to leave the determination of semantics to the hosts, who can each find meanings that meet their goals. Some hosts may want to seem similar to each other, but that's well outside the scope of this proposal. Thanks for taking the time to explain the context to me, @bmeck .

@justinfagnani
Copy link

To make sure I understand: the reason that the CDN doesn't also serve the manifest because that's the whole security model we're trying to address, that an untrusted server tells us how to interpret the payload?

Well, the CDN doesn't serve the manifest initially because it hasn't implemented that. CDNs, playgrounds, etc., would have to be updated to generate manifests. Right now the in-band solution would just work with unpkg, stackblitz, etc.

Even once they could generate a manifest, they'd presumably have to do it from in-band metadata (or some package-local OOB file) in the first place. Then who specifies that format?

@bmeck
Copy link
Member

bmeck commented Nov 19, 2019

If transitive dependencies are brought into this line of thinking, aren't we also needing all the transitive deps to add these attributes? If the CDNs are possibly some of these untrusted origins wouldn't the trusted origin still need to list all the allowed importing attributes? e.g.

If my trusted origin uses CDN A's /a which loads CDN B's /b. Wouldn't the trusted origin need to trust CDN A to load CDN B's /b correctly? This seems like we still need the entire trust chain at the origin of these import graphs if the feature is based around preventing untrusted sources from loading things improperly.

I'm getting a bit confused here as to why this idea of trust for the imported resource format/side-effects only applies at the top level and not within the CDN itself if you are importing JS from the CDN.

@guybedford
Copy link

This is exactly why integrity, preload metadata, and import maps resolution all need to be generated at the same time by the same tool.

@justinfagnani
Copy link

If transitive dependencies are brought into this line of thinking, aren't we also needing all the transitive deps to add these attributes?

Well, yes, if they need them. Anyone that imports CSS would need to add the import attributes to allow that. In-line attributes just work in this case. OOB because much more difficult because you need a new tool to generate them and that tool has to have either global pre-knowledge of the module graph, or we have to allow for the OOB information to be updated in the presence of dynamic import(), new script tags, etc.

@bmeck
Copy link
Member

bmeck commented Nov 19, 2019

@justinfagnani I'm stating that the security feature being claimed here doesn't make sense in transitive dependencies and is making the entire idea questionable if in-band is used in untrusted origins.

@syg
Copy link

syg commented Nov 19, 2019

Well I thought I maybe understood the threat model but wouldn't you know it, I actually don't.

@littledan
Copy link
Member Author

@bmeck I'm very confused by this line of reasoning. I don't see why the type declaration would need to be transitive. We're not talking about the privilege to execute something of a certain type, but rather the importer giving its expectations of how its direct dependency is to be processed. (I don't see how this all relates to in-band vs out of band, though.)

@devsnek
Copy link
Member

devsnek commented Nov 19, 2019

@littledan for example, importing type:html that then imports js which infects my system or steals my cat pics or something. if the data is out of band, i can specify the rules for the imports of the imported type:html file and prevent that.

@littledan
Copy link
Member Author

@devsnek This sounds like a really different threat model from the one presented in WICG/webcomponents#839 . My understanding is that, if you do import something that's executable, you are granting it trust to appropriately import other things. Presumably the manifest won't describe all the edges in the entire module graph, right?

@devsnek
Copy link
Member

devsnek commented Nov 19, 2019

@littledan that issue seems to have put html/css/json in the same category, which is why i brought it up. anne mentioned that html might be different, but even if it isn't, i think we should plan for more complex graphs.

@littledan
Copy link
Member Author

@devsnek I think the difference had to do with the debate about whether HTML will or will not be able to execute JS code. The other issue is the different parsers. These are both shallow properties.

If you see an additional security issue to the one raised in WICG/webcomponents#839, that could be interesting to discuss, but I don't think it'd been presented in that thread, and I don't understand it from what people have been saying in this one.

@justinfagnani
Copy link

@bmeck I think we're misunderstanding each other at this point. I'm not claiming that any in-band type attribute should apply transitively, indeed those should only apply locally to the annotated import. But such annotated imports should work anywhere in the module graph without knowledge or opt-in by the top-level module. ie, a transitively imported module with a properly annotated CSS import should just work.

@ljharb
Copy link
Member

ljharb commented Nov 20, 2019

That reasoning would suggest that import maps should be specificable inline too. Why was an out of band mechanism used there?

@justinfagnani
Copy link

Import maps answer an entirely different question from module attributes. A type: css attribute says that the importer expects the module to be CSS. This is a local assertion, and can't actually be a global one: no one else knows what the local importer was expecting.

import 'x' only says to import some abstract name x. The whole purpose of the bare specifier is to abstract over the real location and to leave the resolution up to something else. If the local importer already knows exactly which module to expect, then they can just use a URL. IOW, there is no such thing as an inline import map, because at that point you wouldn't need to map imports at all.

@Jamesernator
Copy link

Jamesernator commented Nov 20, 2019

import 'x' only says to import some abstract name x. The whole purpose of the bare specifier is to abstract over the real location and to leave the resolution up to something else. If the local importer already knows exactly which module to expect, then they can just use a URL. IOW, there is no such thing as an inline import map, because at that point you wouldn't need to map imports at all.

But isn't this the point of of import './any-specifier.ext';, it abstracts away the content of the module leaving resolution up to the host environment.

For example most people would not want to be writing import minimist from 'minimist' with type: commonjs;, the fact it's commonjs is a detail and resolved by the host.

The same is true for some (albeit not all) uses of importing JSON. For example you might be wanting to import some config but not caring how it's constructed, the fact some configs are simple enough to be used with JSON is a detail.

This is especially true of dynamic import (less so of static). If you need everything that wants to be imported to also carry around some arbitrary metadata telling it's kind, even though the consumer doesn't need to know or care.

@justinfagnani
Copy link

For example most people would not want to be writing import minimist from 'minimist' with type: commonjs;, the fact it's commonjs is a detail and resolved by the host.

I'm really confused by this. Being CommonJS is very much not an implementation detail because the semantics are quite different from the default import type of standard JS modules.

If you wanted to support importing CommonJS this seems like an incredibly reasonable way to do it. Normally a CommonJS file would at least fail produce any exports, and likely fail to eval due to missing require, module, exports, etc., definitions. Telling the host to provide them and parse the module differently seems like it'd let the host set up the environment for that import correctly. A browser, which wouldn't support that module type, would then fail with a reasonable error message.

@Jamesernator
Copy link

Jamesernator commented Nov 20, 2019

Being CommonJS is very much not an implementation detail because the semantics are quite different from the default import type of standard JS modules.

This might be a fundamental disagreement but I think being CommonJS is strongly an implementation detail because I only want the function that I want. Whether it's authored with CommonJS, ESM or even WASM makes zero difference to me as the consumer of the module, from minimist I just want a function that parses some arguments, I don't care how it's implemented.

It also makes package upgrades require needless changes when otherwise no observable changes would be made. For example consider a simple math library:

import { add, sub } from 'math-package';

If math-package changes to use WASM for performance but exports an identical interface under this proposal I would need to add with type: webassembly on my end. But this is beyond pointless for me as the consumer given I don't care how add/sub are created, the with type: webassembly is just useless cruft that doesn't affect how I'm consuming the module in any way.

Honestly I think the most likely outcome of this proposal for package authors would be they'll wind up adding a redundant module in between that just exports the other type, this doesn't tangibly benefit the package author or consumer so it's not clear why we'd want this.

e.g.:

// main entry into package
export * from './math.wasm' with type: webassembly;

@thw0rted
Copy link

Did I miss the part where somebody suggested that the (proposed) type: metadata should be used to differentiate different JS module formats? I thought the discussion centered around type: css, type: html, etc.

@bmeck
Copy link
Member

bmeck commented Nov 20, 2019

@thw0rted none of this current proposition disambiguates/differentiates formats, it just is an assertion of what is an allowable response type.

@littledan
Copy link
Member Author

I understand that this remains a bit controversial, but I think we've articulated some strong reasons for putting at least some kinds of metadata in-band. Some in-band metadata here does not at all preclude out-of-band metadata from being added by some other proposal, whether in JS itself or host-specific.

This proposal remains based on adding in-band metadata. It's one of the core questions that I'd like to draw a conclusion on in proposing module attributes for Stage 2: consensus on Stage 2 would be settling on the tradeoff that we should add some in-band metadata.

@littledan littledan added this to the stage 2 milestone May 20, 2020
@ljharb
Copy link
Member

ljharb commented May 20, 2020

I still believe that out of band, or the specifier, is a better choice than inline syntax. The module author, not the module consumer, should be determining the format of a module.

iow, in-band metadata on the consumer side is fine for anything the consumer is the arbiter of - but the module’s “type” isn’t one of those things.

@devsnek
Copy link
Member

devsnek commented May 20, 2020

I really can't see inline as anything except as a confusing burden. Take #58 for example, a seemingly reasonable request is probably impossible because the caching semantics of inline attributes are a mess and anyone trying to use the data will probably end up with odd bugs.

@dandclark
Copy link
Collaborator

dandclark commented May 21, 2020

Regarding this:

I still believe that out of band, or the specifier, is a better choice than inline syntax. The module author, not the module consumer, should be determining the format of a module.

iow, in-band metadata on the consumer side is fine for anything the consumer is the arbiter of - but the module’s “type” isn’t one of those things.

I don't quite see it as making the consumer the arbiter of the type.

Even with the type attribute, the provider of the module can still make the final determination of the module's format. E.g. in the web the module is sent with a MIME type that the type attribute can't override. At most it will just make the module not load if it turns out to be a format that the importer didn't expect, but it will never change the format.

There are a couple issue threads (#43, #56) discussing ideas that would give the importer the ability to actually change the format of the module, but both of those are outside the scope of the core proposal.

@ljharb
Copy link
Member

ljharb commented May 21, 2020

@dandclark the other thing is that it forces the consumer to know the module's type, and i don't think that's knowledge they should have to have (nor something that should constrain the author's ability to transparently refactor a module between types).

@littledan
Copy link
Member Author

The requirements to avoid the privilege escalation attack is that the consumer of the module declares with what privileges (executable vs non-executable) the module is loaded with. That requirement exists whether it's stated in-band or out-of-band.

@ljharb
Copy link
Member

ljharb commented Jun 3, 2020

Fair, but "type" doesn't mean "privileges". If the requirements are to designate "executable", why is more than a boolean, or an explicit privilege, needed?

@littledan
Copy link
Member Author

We're going in circles a bit; @syg proposed this in WICG/webcomponents#839 (comment) . I got the sense from that thread that many people preferred including the type. I'd be fine with switching to something like this within module attributes syntax; I think we could consider this change within Stage 2.

@xtuc
Copy link
Member

xtuc commented Jun 13, 2020

We reached consensus for stage 2 with the inline form. Are we happy closing this issue?

@xtuc xtuc closed this as completed Jun 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests