Which properties should HTML add to Realms' global objects? #284

Open
littledan opened this issue Nov 11, 2020 · 75 comments

Comments

@littledan
Member

HostInitializeUserRealm says,

It is not expected that this hook would add properties to the Realm's global object.

We should consider making this a requirement, not a suggestion, to ensure that the hook is not used in different ways in different environments in a way that could hurt interoperability.

This hook must not add properties to the Realm's global object.

Thanks to @annevk for pointing out that cross-environment guarantees would be useful here.

(Since we're just talking about tweaking wording on something the spec already says, I think the exact choice of wording can be iterated on post-Stage 3.)

@leobalter
Member

Limiting the hook now named HostInitializeSyntheticRealm seems fine to me.

Please be aware that the abstract setDefaultGlobalBindings allows additional globals into the Realm, so we should extend this discussion to it if necessary.

@littledan changed the title from "Consider requiring hosts to not add properties to the global object" to "Which properties should HTML add to Realms' global objects?" on Nov 17, 2020
@littledan
Member Author

This was the biggest concrete actionable piece of feedback that I heard from @syg and @codehag in our discussion at the November 2020 TC39 meeting: that we instead should have Web APIs defined on Realms. We can make some Web APIs available and others not, using WebIDL's [Exposed] extended attribute (if we add a bit of plumbing for Realms). I'd like to discuss this among the people working on Realms, so we can later form a proposal to HTML.

@kriskowal
Member

kriskowal commented Nov 17, 2020

I believe this concern requires two solutions.

There are web APIs like TextEncoder that should be available in a JavaScript language Realm, where the web has gone ahead of the language standard as an expedient. The solution here is likely the same as with TypedArrays: these things should graduate to 262 and be incorporated in any Realm. Of course, nothing except self-restraint precludes vendors from including them in Realm ahead of a blessing from TC39.

There are web APIs like document and fetch that should be available only in web realms, which could even be called WebRealm to make their capabilities-at-birth clear. While that API should rhyme with Realm, it clearly need not and should not come from 262. And, such a thing could be shimmed using Realm modules off-web.

@Jack-Works
Member

Agree with the idea of WebRealm. Web can extend a subclass of Realm and in the initialization stage (not the host hook stage) add those Web APIs in.

@Jack-Works
Member

I have a question. If we don't allow adding properties to the Realm's global object, developers will copy them over manually. Does new Realm implement the Window interface in WebIDL? If not, developers will need to "bind" the this of those Web APIs, or they will encounter Uncaught TypeError: Illegal invocation, as happens with alert.call({}) today.
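The receiver problem described above can be sketched without a browser. This is only an illustration: FakeWindow and realmGlobal are hypothetical stand-ins (a class that brand-checks its receiver, and a plain object playing the role of new Realm().globalThis), not real platform objects.

```javascript
// Hypothetical stand-in for a WebIDL interface whose methods brand-check `this`.
const branded = new WeakSet();

class FakeWindow {
  constructor() {
    branded.add(this);
  }
  alert(message) {
    // Platform objects perform a comparable receiver check internally.
    if (!branded.has(this)) throw new TypeError("Illegal invocation");
    return `alert: ${message}`;
  }
}

const hostGlobal = new FakeWindow();
const realmGlobal = {}; // stand-in for new Realm().globalThis

// Naive copy: the method is later called with realmGlobal as its receiver.
realmGlobal.alert = hostGlobal.alert;
// Bound copy: the receiver stays the original host global.
realmGlobal.boundAlert = hostGlobal.alert.bind(hostGlobal);

// Run a function and return either its result or the error it threw.
function attempt(fn) {
  try {
    return fn();
  } catch (error) {
    return error;
  }
}

const naiveResult = attempt(() => realmGlobal.alert("hi"));
const boundResult = attempt(() => realmGlobal.boundAlert("hi"));
```

Here naiveResult is a TypeError while boundResult is the string returned by the method, which is the extra step a developer would have to remember for every copied API.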

@littledan
Member Author

littledan commented Nov 17, 2020

new Realm().globalThis would implement a new WebIDL interface--not Window, not WorkerGlobalScope, but something analogous. Then, different interfaces will be [Exposed] on it. We could separate Realm and WebRealm and make two of these interfaces, but I'm not really sure where the dividing line would be exactly... I don't think document should be in either, but I wonder if we might want fetch in both.
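As a rough sketch of how [Exposed]-style metadata could decide which interfaces land on which kind of global: the table below is invented for illustration (the global names and the assignments are assumptions, not taken from WebIDL or HTML), and the thread itself has not settled where fetch belongs.

```javascript
// Hypothetical exposure table; the global names "Window", "Worker" and "Realm"
// and every assignment below are illustrative, not taken from any spec.
const exposure = {
  TextEncoder: ["Window", "Worker", "Realm"],
  atob: ["Window", "Worker", "Realm"],
  fetch: ["Window", "Worker"],
  document: ["Window"],
};

// Collect the API names that would be installed on a given kind of global.
function exposedOn(globalName) {
  return Object.keys(exposure)
    .filter((api) => exposure[api].includes(globalName))
    .sort();
}

const realmApis = exposedOn("Realm");
const windowApis = exposedOn("Window");
```

Moving an API in or out of Realms would then be a one-line change to its exposure set rather than a change to the Realm constructor.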

@Jack-Works
Member

I think we should not allow the host to add new properties in the host hook. Hosts should instead make a subclass of Realm and add platform APIs in its constructor.

Realm itself should only have things that are defined in the ECMAScript specification.

@codehag
Contributor

codehag commented Nov 17, 2020

@Jack-Works this would mean that developers need to know that streams, setTimeout, and atob are all not JS APIs, even though many developers consider them to be part of JavaScript. This is why many of us advocate for not splitting JS into "web" and "js" -- it's worse for developers. Why should users have to be aware of details of specification bodies?

I agree that not all web apis should be exposed by default. For example, setTimeout, but limiting this to only what is in ECMA262 would be an arbitrary decision from the perspective of webdevs.

I agree that WebRealm might be interesting. However when it comes to which apis go where, I don't think we should have this hard limit. I think this will be a discussion in the coming months, and I appreciate dan getting that started.

@Jack-Works
Member

@codehag if we add APIs to Realms, one of the proposal's original motivations is broken.

For example, if I want to emulate a Node environment on the web with Realm, I need to clean up all those Web APIs; I don't have a clean environment to emulate in. Things will be just a little easier than today (using an iframe) because there are no unforgeables on it.
But if we split them, I can have a clean environment and don't need to worry about removing unwanted APIs.

With this approach, normal developers will use WebRealm (or HTMLRealm, or whatever it is named), and if there is a special need (for example, emulating other hosts), I still have a clean Realm to work with.

@codehag
Contributor

codehag commented Nov 17, 2020

Which environment do you have in mind? Is it in node? Or do you have another host in mind?

@annevk
Member

annevk commented Nov 17, 2020

I think the main problem is that JavaScript itself doesn't separate language from library too much either. Why exclude atob(), but not encodeURI()? Would it also not be a problem that anything new added to JavaScript would end up in there? E.g., if setTimeout() were somehow moved. How is that "clean"?

@Jack-Works
Member

Maybe I didn't represent my idea well. Let me try again.

For example if I want to make a "Node.JS" emulator web app. I need to create a new Realm. If browsers are adding Web APIs into the created realm, I need to do extra work: make a list of ES APIs and remove anything else.
If the browser is not adding APIs on it, I can start to shim Node environments without making the list.

And for other usages, like running plugins, they can use WebRealm instead.

This way, Realm is a lower-level API for, say, tooling authors who want to emulate another host environment, and the host version of Realm (WebRealm, NodeRealm, ...) is a useful tool for creating a new global object.
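The "extra work" mentioned in this comment (keep a list of ES APIs and remove everything else) might look roughly like the following. It is a sketch under assumptions: fakeGlobal is a plain object standing in for a freshly created realm global, and the allowlist is deliberately tiny.

```javascript
// Deliberately short, illustrative allowlist; a real one would have to
// enumerate every global defined by ECMAScript.
const languageGlobals = new Set(["Array", "Object", "Promise", "JSON"]);

// Delete every own property of `globalObject` that is not on the allowlist.
// Returns the keys that could not be removed (non-configurable properties).
function stripToAllowlist(globalObject, allowlist) {
  const stuck = [];
  for (const key of Reflect.ownKeys(globalObject)) {
    if (typeof key === "string" && allowlist.has(key)) continue;
    if (!Reflect.deleteProperty(globalObject, key)) stuck.push(key);
  }
  return stuck;
}

// Plain object standing in for a realm global that a host has populated.
const fakeGlobal = { Array, JSON, atob: () => "", TextEncoder: class {} };
Object.defineProperty(fakeGlobal, "location", {
  value: "about:blank",
  configurable: false, // mimics an unforgeable property
});

const undeletable = stripToAllowlist(fakeGlobal, languageGlobals);
const remaining = Reflect.ownKeys(fakeGlobal);
```

The allowlist is the maintenance burden being described: it has to track the language standard, and anything non-configurable survives the cleanup regardless.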

@Jack-Works
Member

@annevk What I mean by a "clean" state is not about API design (why encodeURI but not atob), but about usage of the API (I want a minimal API set that I can be sure every engine will have; I don't care what is or isn't in that set).

@littledan
Member Author

littledan commented Nov 17, 2020

I think there is something real here: As much as it's arbitrary, the JavaScript standard does, in practice, form a common base among JS environments. Adoption of Web APIs is ongoing but it's much patchier, and there are greater interop issues in practice for Web APIs than for JS APIs.

You could think of this interop difference as a historical artifact, but it generally follows how JS engines try to implement the JS standard, web browsers and environments that try to be compatible to them implement web standards, and Web JS engines are often retargetable to outside of the Web (meaning there is more code sharing in practice). So it's far from a coincidence, and explains why there is demand for an API at this level.


At the same time, I'm very much sympathetic towards the idea that we should make Realms on the Web more full-featured, reducing sharp edges for developers, and including certain Web APIs there. I wanted to note a few concrete options, besides the JS/Web line, that could make sense for Realms.

One possible line which @annevk mentioned in #whatwg is: APIs which have to do with parsing/string processing are included. So Intl, atob and TextEncoder are in, but then maybe setTimeout, EventTarget and ReadableStream are out. (I'm not sure where WebCrypto falls here...) A lot of people expect setTimeout in JS universally, but that is actually an example of an API with various interop issues in practice, and scheduling is complicated (and might be the kind of thing you want to manipulate somehow in a Realm anyway). I wonder if this line would be intelligible and usable by JS developers in general.

Another possible line that @annevk mentioned would be to keep the set of globals to an absolute minimum. So we would even exclude many JS globals, and just include things which are needed to run JS syntax (like Array.prototype). While this idea makes sense to me in theory, it sounds a bit difficult to work with; it might require the evaluation of large polyfills for methods which create JS objects in the right Realm, slowing startup time.

A third possible line is to include everything which doesn't do "I/O". So HTMLElement, console, localStorage, fetch and postMessage are omitted, but setTimeout, EventTarget, WebCrypto, etc are all present, in addition to atob and TextEncoder. I see this "all but I/O" set as the sort of "maximalist" option.
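To compare the three candidate lines side by side, here is a small sketch. The tags are one illustrative reading of the comment above rather than an agreed classification, and the catalogue only covers host-provided names (the "minimal" option in the comment would also drop many JS globals, which this toy does not model).

```javascript
// Tags are an illustrative reading of the discussion, not a specification.
const catalogue = {
  atob: { textProcessing: true, io: false },
  TextEncoder: { textProcessing: true, io: false },
  setTimeout: { textProcessing: false, io: false },
  EventTarget: { textProcessing: false, io: false },
  fetch: { textProcessing: false, io: true },
  localStorage: { textProcessing: false, io: true },
};

// Each policy decides, from an API's tags, whether the host would expose it.
const policies = {
  textProcessingOnly: (tags) => tags.textProcessing,
  minimal: () => false, // no host-provided globals at all
  allButIO: (tags) => !tags.io,
};

function select(policyName) {
  const keep = policies[policyName];
  return Object.keys(catalogue).filter((name) => keep(catalogue[name]));
}

const byTextProcessing = select("textProcessingOnly");
const byMinimal = select("minimal");
const byAllButIO = select("allButIO");
```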

@Jack-Works
Member

Normal developers will learn to use WebRealm, it's full-featured, and if they need it, they can use Realm. There is no conflict.

This hook must not add properties to the Realm's global object.

We should add this.

@leobalter
Member

leobalter commented Nov 18, 2020

At the same time, I'm very much sympathetic towards the idea that we should make Realms on the Web more full-featured, reducing sharp edges for developers, and including certain Web APIs there.

I second @littledan here. My pain point is not shipping Web APIs in general but mostly dealing with the unforgeables. Those are the main blockers for some of the virtualization we have within our goals.

My preference is to unblock this proposal and get to the strategy that works better for the HTML integration. That said, I find it more interesting to pick something at the edges regarding globals: either minimalist (ES primordials) or working through a good number of values we can add to reduce confusion for users.

In case we go with anything non-minimalist, I can sync with my team - including @caridy - to list anything else to be avoided beyond unforgeables.

@Jack-Works
Member

mostly dealing with the unforgeables. Those are the main blockers for some of the virtualization we have within our goals.

Yes, unforgeables are the main blocker for virtualization. But unknown extra properties added by the host are also a problem for virtualization (even if they can be deleted with no difficulty).
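The difference between the two kinds of host property can be shown with ordinary property descriptors. In this sketch fakeGlobal is a plain object standing in for a realm global, and the property names are made up.

```javascript
const fakeGlobal = {};

// An ordinary host addition: configurable, so virtualization code can remove it.
Object.defineProperty(fakeGlobal, "extraApi", {
  value: () => "host",
  writable: true,
  configurable: true,
});

// An unforgeable-style property: non-configurable and non-writable, so it can
// be neither deleted nor replaced.
Object.defineProperty(fakeGlobal, "unforgeable", {
  value: "host-owned",
  writable: false,
  configurable: false,
});

// The Reflect methods report failure as `false` instead of throwing.
const removedExtra = Reflect.deleteProperty(fakeGlobal, "extraApi");
const removedUnforgeable = Reflect.deleteProperty(fakeGlobal, "unforgeable");
const redefined = Reflect.defineProperty(fakeGlobal, "unforgeable", {
  value: "virtualized",
});
```

The extra property goes away on request, but only if the virtualization code knows it is there, which is the concern about unknown additions.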

@littledan
Member Author

One possible line which @annevk mentioned in #whatwg is: APIs which have to do with parsing/string processing are included. So Intl, atob and TextEncoder are in, but then maybe setTimeout, EventTarget and ReadableStream are out. (I'm not sure where WebCrypto falls here...) A lot of people expect setTimeout in JS universally, but that is actually an example of an API with various interop issues in practice, and scheduling is complicated (and might be the kind of thing you want to manipulate somehow in a Realm anyway). I wonder if this line would be intelligible and usable by JS developers in general.

Can I do a temperature check/call for emoji reacts on this option? What do people think of "string parsing" as the possible line? (We might include queueMicrotask here as well, as it's analogous to Promise operations.)

@Jack-Works
Member

I still want "This hook must not add properties to the Realm's global object." for the reasons I have presented. Sorry for repeating myself, but it's important to me.

@caridy
Collaborator

caridy commented Nov 24, 2020

I still want "This hook must not add properties to the Realm's global object." for the reasons I have presented. Sorry for repeating myself, but it's important to me.

@Jack-Works in the past we used the init hook, which was a callback used by the super to provide a hook into the list of descriptors to be installed into the global object. That was removed a while ago for various reasons; look at the closed issues. Now, what you're asking for, if I understand correctly, is very similar: a way to create a new realm with a global object without any global property defined, giving the developer the option to populate it at will. Is that it? If yes, the next question is: where to provide the descriptors, if any, for you to populate the global properties? And what descriptors would you need? And how is that different from deleting what you don't need, assuming everything is configurable?

@Jack-Works
Member

@caridy yes, I want to forbid to add anything in the host hook.

Host can provide a subclass then define new properties in the constructor steps.

Developers can do this too, or they can directly manipulate the returned value of the Realm.globalThis.

@littledan
Member Author

Yeah, I don't see why we need any more hooks here. You can just add properties after the Realm constructor returns. And if we want to permit hosts to add properties, they can do it in HostInitializeSyntheticRealm, not a separate host hook.

@Jack-Works
Member

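// A bare Realm carries no host additions.
assert(new Realm().globalThis.TextEncoder === undefined)

class WebRealm extends Realm {
    constructor() {
        super()
        // Web APIs are installed in the subclass constructor, not in the host hook.
        this.globalThis.TextEncoder = require('text-encoder')
        // ...
    }
}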

@littledan
Member Author

littledan commented Dec 10, 2020

We discussed this issue in an SES call recently. Some major points of agreement among participants there were:

  • It is not totally unacceptable for hosts to define some non-ES APIs on Realms, especially if there are things that make sense across environments (like atob, TextEncoder)
  • We'd like to drive alignment among hosts, but it is OK if this work starts informally, outside of official standards bodies. (May be something like js-shared-interfaces)
  • There are some requirements we'd like to make on hosts' properties, to ensure that the goals of Realms are still met (@caridy and @erights to follow up with details), including:
    • There shouldn't be any non-configurable properties, so that Realms can be customized
    • Authority should be avoided. Ideally, the only capabilities exposed would be what JS itself already exposes (loading modules, getting the time, queueing promise resolutions, etc)
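Of the two requirements listed, only the first can be checked mechanically. A minimal sketch of such a check follows; nonConfigurableKeys and candidateGlobal are hypothetical names, and the "no authority" requirement is not something this code can verify.

```javascript
// Return the keys of own properties that are not configurable.
function nonConfigurableKeys(globalObject) {
  const descriptors = Object.getOwnPropertyDescriptors(globalObject);
  return Reflect.ownKeys(descriptors).filter(
    (key) => !descriptors[key].configurable
  );
}

// Plain object standing in for a realm global populated by a host.
const candidateGlobal = {};
Object.defineProperty(candidateGlobal, "atob", {
  value: (text) => text,
  writable: true,
  configurable: true,
});
Object.defineProperty(candidateGlobal, "pinned", {
  value: 1,
  configurable: false,
});

const violations = nonConfigurableKeys(candidateGlobal);
```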

Nobody present in the meeting raised concerns with @annevk 's suggestion of just including text-processing-like things, so I think that would be good to move ahead with as a starting point.

It doesn't sound like anyone in the Realm champion group will have time to articulate the HTML/WebIDL-side changes to specify all of the details here by January, but if someone is interested in working on this, please let me know; I would be happy to mentor you.

@Jack-Works
Member

Nobody raised concerns with @annevk 's suggestion of just including text-processing-like things, so I think that would be good to move ahead with as a starting point.

I did raise concerns about emulation and tooling. Did I not make my arguments clear? You can see them in the thread above. 🤔

@littledan
Member Author

Sorry, I mean, no meeting attendees... You made yourself clear above. Edited the above comment. My opinion is, if we can find a set of things which don't have authority and make sense generally, then it's reasonable to go this way. I should speak for myself rather than everybody.

@Jack-Works
Member

I'm not against the idea of adding some useful tools to it. But why not add them to the host-defined subclass? There might be some tooling authors who really want a truly clean environment without anything that is not in the language itself.

Maybe I can provide a use case:

Sometimes I write libraries that depend on zero host APIs (even ones shared across multiple platforms), to make sure they can run on any ES engine. I need to run the tests in a clean environment to make sure that I (or any contributor) didn't use any of those APIs.

If the host does not add anything else to Realms, I can use Realms to do that. But if TextEncoder is added and I accidentally use it, the tests cannot detect that I used a host-specific API.
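The use case can be approximated today in Node.js, whose vm module creates a fresh global that carries language built-ins but not Node's own additions such as TextEncoder. This is only a rough stand-in for the proposed Realm (vm is not a security boundary and is not part of this proposal), and usesOnlyLanguageGlobals is a hypothetical helper.

```javascript
const vm = require("node:vm");

// Evaluate `source` against a fresh global object. A ReferenceError is taken
// to mean the code touched a global that the language itself does not define.
function usesOnlyLanguageGlobals(source) {
  try {
    vm.runInNewContext(source);
    return true;
  } catch (error) {
    // Errors come from the other context, so compare by name, not instanceof.
    if (error.name === "ReferenceError") return false;
    throw error;
  }
}

const portable = usesOnlyLanguageGlobals("[1, 2, 3].map((x) => x * 2).join(',')");
const leaky = usesOnlyLanguageGlobals("new TextEncoder().encode('hi')");
```

If the fresh global already contained TextEncoder, the second check would pass silently, which is the failure mode described above.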

@leobalter
Member

@Jamesernator this is out of scope for this proposal and this issue. I believe the Compartments proposal has a better opportunity for more customization.

I opened #301 to resolve this issue with the actual technical blocker I identified. The PR just limits the hook to prevent non-configurable properties and authority-based API. The previous hook had no normative restriction.

@Jack-Works
Member

Let me provide some other arguments for objecting to the host being able to add anything on the realm.

Cross-platform interoperability:

If major hosts provide EventTarget in the realm, developers might think every host provides it and forget to shim it. The code then fails to run on a host that doesn't add EventTarget.

This might put pressure on hosts to implement additional host objects on the Realm (because they want to stay consistent with other hosts). In this way, what should be added is not formally specified in the ES spec; it depends on observing what other hosts add. I think this is definitely a bad idea.
Though this is already happening in the global realm (Deno follows what the Web added to the global scope, and Node is adding EventTarget too), I don't think it should happen on the Realm, which aims to provide a clean environment.

@Jack-Works
Member

Forbidding the host from adding things to the Realm does not close off the possibility of allowing it in the future.
But if you allow it in the first version that ships, there is no way back.

@leobalter
Member

If major hosts provide EventTarget in the realm, developers might think every host provides it and forget to shim it. The code then fails to run on a host that doesn't add EventTarget.

I'm not sure what you're talking about. Today we have iframes and frameworks historically deal with unexpected values. The new text literally says the host can add other properties to the Realms global. Any decent framework would need to deal with this and it's really absurd to think one would forget to shim something.

I'd like to avoid the noise and repetition of things we extensively discussed already.

@Jack-Works
Member

I'm not sure what you're talking about. Today we have iframes and frameworks historically deal with unexpected values. The new text literally says the host can add other properties to the Realms global.

I mean that if we don't add them, developers know those APIs will definitely not be there. So if they need one, they will shim it, and it will work on any platform.

@Jack-Works
Member

I'd like to avoid the noise and repetition of things we extensively discussed already.

But I don't see my concerns get resolved so I have to repeat my arguments before the final decision is made.

@Jamesernator

Jamesernator commented May 4, 2021

I opened #301 to resolve this issue with the actual technical blocker I identified. The PR just limits the hook to prevent non-configurable properties and authority-based API. The previous hook had no normative restriction.

This excludes the possibility of authority-based API being useful within a Realm. For using Realms for testing, in some situations I would absolutely like the host to be able to add (some) authority-based APIs, but not others. On the flip side for running untrusted code (e.g. for plugins using SES) one must remove all APIs that have any authority.

Some examples of potentially authority-based APIs that would absolutely be useful in plenty of use cases for realms would be things like: Worker, fetch, crypto, OffscreenCanvas, indexedDB, and so on.

As one particular example, Worker is heavily useful for many types of code; however, it includes certain authority: in particular, it can enable Spectre attacks (breaking cases like SES), and it triggers a fetch (in browsers; in Node it does IO). In many situations such as testing, we still absolutely want code to be able to spawn its own workers and don't really care about these kinds of authority.

We don't really want a situation where some common APIs work, but other common APIs need to be manually wrapped in their entirety when the host could have provided them. E.g., taking Worker as an example again, within a testing framework it would be really annoying to have to shim + proxy everything through the Realm, and then this needs to be repeated for every global that offers authority. In most cases, for testing you probably don't care about (and in fact may actively encourage) the use of Worker for doing work off thread, just that the thing you're testing still returns the right thing (or does the right side effect, or whatever).

@Jamesernator this is out of scope for this proposal and this issue. I believe the Compartments proposal has a better opportunity for more customization.

Yes, this is probably true. It should still be considered whether there are any significant advantages (e.g. optimizations) engines could gain for code running inside a realm if they can omit creating these globals altogether (as in a lot of cases, they may wind up deleted/unused).

Compartments is probably better though still, especially if we have mixed-authority APIs (e.g. Date) in that we could hook (or at least tell hosts to turn off) behaviour within a specific compartment. This is a pretty big can of worms however.

@codehag
Contributor

codehag commented May 20, 2021

One approach we can take here is to reference IDL [Exposed] as potential properties the host can provide to a realm, and select from those ones which have been adopted by multiple host types. In addition, not allowing access to this will put pressure on TC39 to re-standardize things that are already standard, and used. We already reference external specifications.

Speaking with @annevk, there was a proposal for something like [Exposed=*] to identify truly universal apis, though that hasn't materialized yet.
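The selection rule suggested here (take [Exposed] candidates and keep those adopted by multiple host types) could be sketched as below. The adoption data is made up and uses placeholder host names, so it says nothing about which real hosts ship which API.

```javascript
// Made-up adoption data with placeholder host names.
const adoption = {
  TextEncoder: ["hostA", "hostB", "hostC"],
  atob: ["hostA", "hostB"],
  document: ["hostA"],
};

// Keep the APIs that at least `minHosts` distinct host types have adopted.
function adoptedByAtLeast(minHosts) {
  return Object.entries(adoption)
    .filter(([, hosts]) => new Set(hosts).size >= minHosts)
    .map(([name]) => name);
}

const sharedAcrossHosts = adoptedByAtLeast(2);
```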

@caridy
Collaborator

caridy commented May 20, 2021

@codehag I'm very supportive of the approach described above. How can we get some traction on that?

@Jack-Works
Member

I'll repost my comment in #304 because it is marked as out-of-topic.

I'm happy to see Realms go for stage 3 but I'll block it because I'm not good with #301.

The blocking reason TLDR:

The proposal must at least have a way to opt-out host additions. e.g. new Realms({ host: false }).
It will be better for me to totally ban the host additions.

Details (summary of #284):

And I need to point out the current spec is a bit vague (#301 (comment)). It doesn't have a clear rule about what should not be added thus implementors must understand what SES wants otherwise SES will be broken. I believe major implementors will handle this correctly but that leaves the possibility for a minor implementor to make things wrong.

My reply to some objections:

this would mean that developers need to know that streams, setTimeout, atob are all not js apis, however many developers consider them to be part of JavaScript. This is why many of us advocate for not splitting js into "web" and "js" -- its worse for developers. Why should users have to be aware of details of specification bodies? (from #284 (comment))

When programmers use the Realm API, they will find that setTimeout does not exist but atob does. That will be more confusing. Programmers already know JS can run on Node.js, so the "web"/"js" separation is not a serious problem because it already exists today. (Try searching Google for "Node.js btoa is not defined".)

A developer found setTimeout does not exist in the Realm. But why?

  • In the Web/JS separation: He searches on Google and gets the conclusion: because it is not in the JS standard.
  • In the current mental model: He searches on Google and gets the conclusion: the host can choose to add or not randomly and the only requirement is not "providing authority" or "mutable state across/with-in Realms" and he needs to figure out what that SES requirement means.

IMO the Web/JS separation is a much clearer mental model than the current one and gives much more deterministic results across different platforms.

Virtualization is not an important enough use case for the web platform to tradeoff ergonomics and possible confusion for web devs, who by and large (borne out by actual MDN survey data here!) do not understand the separation between the specs. (from #284 (comment))

I don't think the current isolated Realm is very ergonomic, though. But if that really is the case, we should at least allow developers to opt out of it.

Solutions

There're some good solutions in #284 plz consider them.

EcmaScript depends on the unicode standard by citing it as of a given version, but without duplicating it or claiming jurisdiction over the definition of unicode itself.

In like manner, for standards like URL or TextDecoder that are defined by other standards bodies, we could decide to have EcmaScript standardize the global variable of that name, including it in EcmaScript, citing specific versions of those other specs as providing the definition of the object found at that variable name. Then these global variables are part of EcmaScript. Their values are also part of EcmaScript, by citation of given versions of external specs.

@Jack-Works
Member

And I found that @ljharb's argument in #22 (comment) also applies here.

The arguments I've heard for not separating these boil down to "most web devs don't know or care about the difference between browsers and the language, and it will be confusing for them".

This is a compelling argument (whether I think "most" is correct or not) - but only an argument for "what the default should be". It is not an argument for "actively obstructing and blocking the use cases of those that do know and care about the difference".

It's OK to have host additions by default for ergonomics, but that is no reason to actively obstruct and block the use cases of those who do know and care about the difference.

@annevk
Member

annevk commented May 21, 2021

That point of view seems in conflict with not exposing organizational structure through APIs. Which would mean we're at an impasse?

@ljharb
Member

ljharb commented May 21, 2021

@annevk it's not about organizational structure as far as I can tell, it's about environments. "node" and "browsers" are different things, that's important to JS devs, including on the web. That it's not important to all of them is irrelevant - Typed Arrays and Atomics and wasm are important to FAR fewer devs than "the difference between node and browsers" is, and yet that wasn't an obstacle for those features.

@leobalter
Member

@Jack-Works

Same answer:

We extensively discussed these objections. Some of the suggestions are also hard objections from other delegates. I wonder if this can be considered in a path this proposal could in fact advance, ever.

Also: #304 (comment)

There're some good solutions in #284 plz consider them.

  • Subclassing Realm extends ESRealm
  • Realm.prototype.attachHostAPI
  • new Realm({ addHostAPI: ['setTimeout'] })
  • Pick utils from Web API into the ES API so they can be used everywhere

To make it clear: the goal of this proposal is to resolve the use cases we mentioned in the explainer and presentations at TC39 plenaries.

The champion group understands the current API solves these use cases, that's why we are trying to move ahead and request stage advancement with the current drafted spec.

Trying to accommodate solid concerns for the HTML integration, this issue served to discuss the properties we want to add to each Realm, and yes, we've heard many of the side-effect problems, trade-offs, etc.

With the work proposed at #301, we believe we still have a solution that allows the HTML integration to add properties, and we are trying to set the minimal boundaries for adding them so that they still fit the use cases we are trying to resolve in this proposal.

This led to the recent changes for HostInitializeSyntheticRealm setting constraints such as "Those properties must each be configurable to provide platform capabilities with no authority to cause side effects such as I/O or mutation of values that are shared across different realms within the same host environment."

I believe this opens the possibility of setting properties from host integrations while still allowing virtualization. It might not be a single "one ES set wins all" for every kind of host, but it still allows configuration of the global properties.

Something very interesting to explore (and I believe this should be a Stage 3 concern) is tackling the suggestion from @codehag at #284 (comment). I'm interested in following up and dedicating more time to that, so I'm even opening a new issue here to capture better details.

Subclassing Realm extends ESRealm

This is not in the scope of what we are requesting. As noted, if configurability is enforced, that solves our use cases. That said, this wouldn't prevent a follow-up proposal to have more customizable versions of Realms, which seems similar to what Compartments does.

Realm.prototype.attachHostAPI
new Realm({ addHostAPI: ['setTimeout'] })

I don't believe this should be a concern for user land, and it's pretty hard for me to champion it. I wouldn't be able to tell TC39 why we need to add this level of complexity. The options arg makes it even worse.

Pick utils from Web API into the ES API so they can be used everywhere

This is something answered in this thread, and I believe we can't and shouldn't do it before a clear definition of what the HTML integration includes. This part of the work should be done there.

@Jack-Works
Member

Pick utils from Web API into the ES API so they can be used everywhere

This is something answered in this thread, and I believe we can't and shouldn't do it before a clear definition of what the HTML integration includes. This part of the work should be done there.

They're two separate things. Ban host additions and ship the Realm, then if anyone wants some HTML utils like atob or TextEncoder they can pick them into the ES spec in the future so those utils will appear on the Realm as well.

@leobalter
Member

I don't see a possible way to advance this proposal banning the host to add more properties.

@Jack-Works
Member

I don't see a possible way to advance this proposal banning the host to add more properties.

Just write them into the spec, like what we have now.

Those properties must each be configurable to provide platform capabilities with no authority to cause side effects such as I/O or mutation of values that are shared across different realms within the same host environment.

@leobalter
Member

Just write them into the spec, like what we have now.

wow

@ljharb
Member

ljharb commented May 22, 2021

@Jack-Works to clarify, i think the issue isn't "how to write it in the spec", it's "delegates will block it if it's written in the spec"

@Jamesernator

I don't believe this should be a concern for user land, and yet it's pretty hard for me to champion it. I wouldn't be able to tell TC39 why we need to add this level of complexity. The options arg makes it even worse.

These were just arbitrary suggestions on my part for a way that certain APIs could be exposed. There could be a simpler way e.g. but I personally think the complexity of being able to configure it is absolutely worth it, as one of the fairly significant use cases of Realm is virtualization.

Without a way to configure realms, correct virtualization is painful and difficult, in particular because you need to replace the Realm global itself in child realms, otherwise they can just break their own virtualization:

const realm = new Realm();

realm.evaluate(`
  delete globalThis.TextEncoder;
`);

realm.evaluate(`
  // This script shouldn't have access to TextEncoder, yet it can get one
  // from a fresh child realm:
  const subRealm = new Realm();
  // Plain quotes here: a nested backtick would end the outer template literal.
  const encodeText = subRealm.evaluate(
    "(string) => new TextEncoder().encode(string);"
  );
`);

This was one of the primary reasons for specifying APIs individually in my suggestions like Realm.prototype.addHostAPI("TextEncoder"). By calling this it would also be available to child realms, however APIs not present couldn't be recreated by a child realm.
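
To make the inheritance rule concrete, here is a toy model of it. Everything in it is an assumption made for illustration: SketchRealm is a hypothetical class (not the proposed Realm API), addHostAPI is only the suggestion from this comment, and plain objects stand in for each realm's global object and for the host's APIs.

```javascript
// Toy model only: SketchRealm is hypothetical, not the proposed Realm API.
// Stand-ins for APIs the host would be able to provide.
const hostAPIs = {
  TextEncoder: function TextEncoder() {},
  URL: function URL() {},
};

class SketchRealm {
  constructor(parent = null) {
    // A top-level realm may attach any host API; a child may only
    // attach what its parent was explicitly given.
    this.available = parent
      ? new Set(parent.attached)
      : new Set(Object.keys(hostAPIs));
    this.attached = new Set();
    // A plain null-prototype object stands in for the realm's global.
    this.global = Object.create(null);
    if (parent) {
      // Attached APIs propagate to child realms automatically.
      for (const name of parent.attached) this.addHostAPI(name);
    }
  }

  addHostAPI(name) {
    if (!this.available.has(name)) {
      throw new TypeError(`${name} is not available to this realm`);
    }
    this.attached.add(name);
    this.global[name] = hostAPIs[name];
  }

  createChild() {
    return new SketchRealm(this);
  }
}

const realm = new SketchRealm();
realm.addHostAPI("TextEncoder");
const child = realm.createChild();

console.log("TextEncoder" in child.global); // true
console.log("URL" in child.global); // false
```

In this model, calling child.addHostAPI("URL") throws a TypeError, which is the property the suggestion is after: a child realm cannot recreate an API its parent never received.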

Something I've seen said a lot over the Realms proposal is that this API is "low level". But that doesn't mean it needs to be awkward to use and borderline dysfunctional for a lot of its own (even motivating) use cases.

I've raised an issue on the compartments proposal but as it stands this mechanism of allowing hosts to add some APIs but not others just makes the whole Realms API kinda awful to use for most use cases.

Just as some examples of problems for some use cases:

  • For use cases like testing frameworks, only "deterministic safe" APIs may be exposed by hosts, so those frameworks need to shim every single host API and, worse, keep those shims up to date. When hosts are capable of making these objects, it'd be a lot nicer to just allow the host to attach them
  • For virtualization use cases one needs to create whitelists of globals, and even then virtualization can be messed up by new syntax features, e.g. Record/Tuple not on your whitelist? Well, now #{ x: 10 }.__proto__ allows you to detect certain things about the virtualization, so that's another thing that needs to be kept perpetually up to date
  • For robustness use cases (like putting code in Realms to ensure non-tampered globals) this suffers from the same problem as testing frameworks, in that they probably want additional host APIs (not just safe ones) available, just not the tampered global
  • One of the motivations in the explainer cites portability, but if hosts can add whatever they like, this is completely out the window

And yeah, some, if not all, of these problems could be solved at the compartment level, but that's contingent on hosts agreeing to enable such restrictions in compartments. What could happen is that Realm as specified here gets shipped, but then any attempts to enable certain virtualizations via compartments are rejected as undesirable by hosts. Or worse, some hosts might even decide that actually many users want things like Worker or fetch exposed into Realms and willfully disregard the spec to ship them, thereby breaking any sandbox use cases like SES.

The current design philosophy of Realms seems to be "well it's just enough for SES to be safe, but it doesn't matter if the API is difficult and painful to use for most use cases". Personally I think the design should try to be useful for as many use cases as possible. It might be the case that all these problems should indeed be solved via Compartments, but the current model seems to just defer everything difficult to compartments, despite the fact that it's still stage 1 and doesn't necessarily have consensus for any of these capabilities to actually ship, let alone to the degree needed by most use cases. I think it would be a lot better if Realm and Compartment were developed in lockstep, to ensure that none of the primary use cases (virtualization, plugins, testing, robust code, sandboxing (when combined with SES), etc.) become impossible (or needlessly difficult) due to things that come up later when implementing compartments (assuming they gain consensus to ship at all).

@leobalter
Member

... in particular given that you need to replace the Realm global itself in child realms otherwise ...

This is the whole idea of virtualization and how you control execution of code inside a realm.

Realm.prototype.addHostAPI("TextEncoder")

This would be just sugar for what can be achieved today by replacing global properties after instantiating a new realm.

@Jamesernator

This is the whole idea of virtualization and how you control execution of code inside a realm.

Yes and no. Sure, you need to virtualize globals that don't exist. However, for globals that do exist one can just hook their behaviour; for example, in the compartments proposal there's interest in adding things like randomHook, etc. I.e. instead of doing a bunch of work to replace APIs, you simply hook into a specific stage.

e.g. :

// Instead of this:
const realm = new Realm();
realm.evaluate(`
  const newRandom = () => 0.5;
  Math.random = newRandom;
`);

// We will be able to do this (randomHook is only a proposed option):
const hookedRealm = new Realm();
const compartmentEval = hookedRealm.evaluate(`
  const compartment = new Compartment({
    randomHook: () => 0.5,
  });

  // Give back eval
  (code) => compartment.evaluate(code);
`);

compartmentEval(`Math.random()`); // 0.5

This virtualization only gets worse for objects where you need to implement the interface exactly, otherwise your virtualization might be exposed. E.g. does every method have the right .length, does stringifying it return function evaluate() { [native code] }, etc.

This is horrendous to work with, and it's just bad when the language could provide a better alternative, i.e. hooking behaviour.
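
As a small illustration of how a replaced global can give itself away, the sketch below compares the built-in Math.random with a user-land replacement. It needs no Realm at all; describe and sourceOf are helper names invented for this example.

```javascript
// Compare a native built-in with a user-land replacement.
const nativeRandom = Math.random;
const newRandom = () => 0.5;

// Use the original Function.prototype.toString so that an overridden
// toString on the function itself cannot hide its source.
const sourceOf = (fn) => Function.prototype.toString.call(fn);

function describe(fn) {
  return {
    name: fn.name,
    length: fn.length,
    looksNative: sourceOf(fn).includes("[native code]"),
  };
}

console.log(describe(nativeRandom)); // name "random", looksNative true
console.log(describe(newRandom)); // name "newRandom", looksNative false
```

Here .length happens to match (both are 0), but the name and the stringified source differ, so code running inside the realm can tell that Math.random was swapped out unless the replacement also disguises those.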

And again, this is even worse if you want to be able to let hosts expose popular but unsafe APIs into Realms in future (e.g. Worker, fetch, etc). These can be hidden via a compartment, but to expose any particular things you now have to maintain entire wrappings of every single API.

And while generating some of this with whitelisting can work, it still allows future breaks, e.g. as before, if Record/Tuple is not on your whitelist, now #{ x: 10 }.__proto__ allows detection of the virtualization.

This would be just sugar for what can be achieved today by replacing global properties after instantiating a new realm.

Only if you do a lot of work. And even then what happens if some new API comes out that breaks your assumptions about how to expose things? Also these objects won't be real host objects, it's possible with real host objects the engine might be able to do additional optimizations, amongst other things.

And I really don't understand the philosophy of "it's technically doable so we don't mind if the API is horrendously difficult to use even for common use cases". This doesn't seem like a good philosophy for core primitives, and despite some statements that this is an "advanced API", it ultimately still affects any consumers. E.g. even if all the realms stuff is dealt with for you in your test framework, your plugin framework, or whatever, if things like #{ x: 10 } start working someday but Record wasn't added to your framework's whitelist, it's still your problem, and it ultimately stems from bad API design.

@Jamesernator

Jamesernator commented May 27, 2021

Also, just to be clear, I'm not advocating any particular solution, just that Realms should be an actually useful primitive beyond the very specific needs of SES, and that these capabilities need to be reasonable for library authors to work with so that they don't break their consumers later.

And sure it adds "complexity", but it also saves complexity in other places, particularly for authors of things that use Realms. And yes, perhaps .attachHostAPI is too high level; perhaps just a post-create realm hook would be better:

const realm = new Realm({
  // Runs after HostInitializeRealm in the child Realm, *and importantly* in every
  // subrealm created
  postInitializeScript: `{
    const whitelist = [...getEcmaGlobalListSomehow, "TextEncoder", "TextDecoder"];
    // Most globals are non-enumerable, so for...in would skip them
    for (const key of Object.getOwnPropertyNames(globalThis)) {
      if (!whitelist.includes(key)) {
        delete globalThis[key];
      }
    }
  }`,
});

const subRealmEvaluate = realm.evaluate(`
  // This still triggers the parent's realm hook script
  const subRealm = new Realm();
  (code) => subRealm.evaluate(code);
`);

// Error, the parent's hook ran, this wasn't on the whitelist, so it got deleted
subRealmEvaluate(`URL`);

But ultimately the API shape doesn't really matter, it's more about the ability to create things with Realms robustly, rather than just having a large series of hacks upon complexity upon hacks to accomplish any realistic use case.

@xerxes12354

There should exist an opt-out for the ability to sandbox, instead of requiring authors to maintain the removal list themselves. It is roughly the same effort either way, but if it must be done at all, then it should be done where it is applicable.

@kriskowal
Member

As a champion for SES and maintainer of the ses shim, “lockdown” is the intended mechanism for turning a realm into a locked-down realm. With the SES shim, you opt in by calling lockdown(), which transforms the environment to a SES environment. This includes the creation of “locked down” versions of eval, Function, and Compartment unique to each “locked down compartment”. By extension, we expect every realm to have an initial Realm that lockdown() makes the basis of a “locked down realm” for each child compartment.

In short, SES does not require nor desire every realm to be locked down. The Realm constructor may produce realms with non-standard features as long as a lockdown shim can reliably discover and remove them.

SES does not require realms to be constructed with a configurable allowlist. The SES shim comes with its own and erases anything it doesn’t recognize upon lockdown.
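
A minimal sketch of the "erase anything not recognized" idea follows. It is not the SES shim's implementation: pruneGlobals is a hypothetical helper, and a plain object stands in for a realm's global. It also shows why host-added properties need to be configurable, since a non-configurable one cannot be removed.

```javascript
// Sketch only: pruneGlobals is a hypothetical helper, not the SES shim's code.
function pruneGlobals(globalObject, allowlist) {
  const allowed = new Set(allowlist);
  const removed = [];
  const stuck = [];
  // Reflect.ownKeys also sees non-enumerable and symbol-keyed properties.
  for (const key of Reflect.ownKeys(globalObject)) {
    if (allowed.has(key)) continue;
    // Reflect.deleteProperty returns false for non-configurable properties.
    if (Reflect.deleteProperty(globalObject, key)) {
      removed.push(key);
    } else {
      stuck.push(key);
    }
  }
  return { removed, stuck };
}

// A plain object stands in for a freshly created realm's global.
const fakeGlobal = {};
Object.defineProperty(fakeGlobal, "Array", {
  value: Array, configurable: true, writable: true,
});
Object.defineProperty(fakeGlobal, "fetch", {
  value: () => {}, configurable: true, writable: true,
});
Object.defineProperty(fakeGlobal, "hostOnly", {
  value: 1, configurable: false,
});

const result = pruneGlobals(fakeGlobal, ["Array"]);
console.log(result.removed); // ["fetch"]
console.log(result.stuck); // ["hostOnly"]
```

Anything that ends up in stuck is a property a lockdown-style shim could discover but not remove.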

littledan pushed a commit to littledan/html that referenced this issue Jul 8, 2021