
What makes CropTarget special to require an asynchronous creation? #17

Open
youennf opened this issue Feb 2, 2022 · 88 comments


@youennf
Contributor

youennf commented Feb 2, 2022

We started discussing this topic (whether CropTarget creation should use promise or not) as part of #11 and it seems best to use a dedicated thread for this issue.

@eladalon1983 mentioned implementation issues that make using promises desirable, in particular security and potential race conditions if creating synchronously a CropTarget since CropTarget has to keep a link to the process it was created in. The following case was provided:

  • create CropTarget in process A
  • postMessage it synchronously to another process B
  • (optionally) postMessage it yet again to another process C
  • Use CropTarget to crop the track.
    @eladalon1983, please correct this description if it is not aligned with your thoughts.

Existing objects like MessagePort, WHATWG Streams, RTCDataChannel or MediaStreamTrack can be created-then-postMessaged synchronously and UAs are implementing this today, hopefully without the security/race condition issues.
AIUI, it seems consistent to use the same approach to CropTarget (synchronous creation), except if we find anything specific to CropTarget that actually prevents this existing pattern.
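The existing pattern youennf describes can be sketched as follows. This is an illustrative sketch, not code from any engine; `targetWindow` is a stand-in for another document's window, and the point is that no await separates creation from transfer:

```javascript
// Sketch of the create-then-postMessage pattern used by existing objects:
// a MessagePort is created synchronously and transferred in the very same task.
function createAndTransferPort(targetWindow) {
  const { port1, port2 } = new MessageChannel(); // synchronous creation
  // Transfer port2 right away; the UA resolves where each end lives
  // without the page ever waiting on a Promise.
  targetWindow.postMessage({ port: port2 }, '*', [port2]);
  return port1; // keep the local end for communication
}
```

The receiving document gets the port in its `message` event and can start posting on it immediately, even though the port originated in another process.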

@eladalon1983
Member

eladalon1983 commented Feb 2, 2022

I think it's easiest to answer with Chrome as a concrete example, to keep the discussion simple; the reasoning generalizes to other browsers.

Chrome has a central "browser process," and documents are hosted in "render processes." (For simplicity, let's pretend every document has a dedicated render process.) Let's examine multiple documents embedded together in another document, and all living together in the same tab.

Again for simplicity, we'll call the document where the crop-target lives SLIDE, and the document holding the track VC. I find this easier than talking about D1, D2, etc., as it gives us a practical example in our mind's eye. If necessary, map (SLIDE, VC) to (D1, D2).

CropTarget is essentially a token. That token is produced in SLIDE and passed elsewhere, either directly to VC or indirectly. A design that allows it to be safely passed through other documents is preferable, as it demands less care from developers. To be safely passed through other documents (and therefore processes), it should encode the minimum amount of information. This is mostly true for JS-exposed information, but non-JS-exposed information that lives in the render process holding the token is also theoretically accessible to malicious documents under certain conditions.

So, to keep the minimum amount of information, the token should not actually encode the information that it originates in SLIDE. Instead, this knowledge will be recorded in the trusted browser process as a mapping of T<->SLIDE.

When the token is minted, this mapping has to be recorded in the browser process, which requires IPC, which means that minting the token should be asynchronous. (Minting can fail if the browser process refuses to record more mappings.)
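As a toy model of the flow described above (not Chrome's actual code), one could imagine minting like this, where a stand-in "browser process" registry records the mapping asynchronously and can refuse; the cap and names here are invented for illustration:

```javascript
// Toy model: the renderer asks a trusted "browser process" to record a
// token -> document mapping. Minting is therefore asynchronous and fallible.
const MAX_MAPPINGS = 100; // arbitrary cap, purely for illustration
const browserProcessRegistry = new Map(); // token id -> originating document

function mintCropTarget(documentId) {
  return new Promise((resolve, reject) => {
    // setTimeout stands in for the IPC round-trip to the browser process.
    setTimeout(() => {
      if (browserProcessRegistry.size >= MAX_MAPPINGS) {
        reject(new Error('browser process refused to record more mappings'));
        return;
      }
      const token = { id: `crop-target-${browserProcessRegistry.size}` };
      browserProcessRegistry.set(token.id, documentId);
      resolve(token); // the token itself carries no document information
    }, 0);
  });
}
```

Note that the resolved token exposes only an opaque id; the T<->SLIDE knowledge lives solely in the trusted registry.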

To generalize away from Chrome, other UA-implementers will either run into similar implementation constraints, or else they can just return a pre-resolved Promise and not worry about it.

@youennf
Contributor Author

youennf commented Feb 2, 2022

When transferring a MessagePort or a RTCDataChannel from SLIDE to VC, there is a need to identify that the transferred object is originating from SLIDE, just like CropTarget. How is it possible to be synchronous for those objects but not for CropTarget?

@eladalon1983
Member

There could be multiple documents in between SLIDE and VC. Each hop only exposes its own origin. So if the token is sent SLIDE->TOP_LEVEL->VC, then VC would not know the origin of SLIDE, only of TOP_LEVEL.

@youennf
Contributor Author

youennf commented Feb 2, 2022

Again, the same could be said about transferring an RTCDataChannel from SLIDE->TOP_LEVEL->VC.
Either this is a new security-hardening rule that we cannot enforce in old APIs but should in new ones, or there is something specific to CropTarget. From your explanations, I am unclear which it is.

@eladalon1983
Member

eladalon1983 commented Feb 2, 2022

I don't know how common it is to transfer RTCDataChannel objects. (I suspect somewhat uncommon...?) CropTarget is designed for transfer, and would have been wholly unnecessary otherwise. Given the cheap cost of ruggedizing CropTarget against being utilized by an attacker, I see it as desirable. I don't know enough about RTCDataChannel to say whether such hardening was necessary there too but prohibitively complicated, or whether there was some other reason.

@youennf
Contributor Author

youennf commented Feb 2, 2022

What about MessagePort then? MessagePorts are very common and designed specifically to be transferred.
From what I can see so far, the problems you are describing have been solved for those objects in a secure and efficient way without using promises.

@eladalon1983
Member

Does MessagePort remain bound in any way, or encode to some degree, the document in which it was originally instantiated? If I create a MessagePort in D1, then transfer it to D2, then to D3, will D3 know that this MessagePort is originally from D1?

@youennf
Contributor Author

youennf commented Feb 2, 2022

Does MessagePort remain bound in any way, or encode to some degree, the document in which it was originally instantiated?

It remains bound to the other MessagePort it was created jointly with through MessageChannel.

If I create a MessagePort in D1, then transfer it to D2, then to D3, will D3 know that this MessagePort is originally from D1?

For RTCDataChannel, WHATWG Streams, and MediaStreamTrack: yes, always; the 'source' remains in D1.

MessagePorts are a bit special in that they are created as a pair through MessageChannel and can keep communicating with each other. MessageChannel.port1 can be synchronously transferred to another process, as can MessageChannel.port2; port1 and port2 remain bound together in any case.

@youennf
Contributor Author

youennf commented Feb 2, 2022

Another example that might be closer to CropTarget is OffscreenCanvas.
OffscreenCanvas can be created from a canvas element, in which case it remains tied to the canvas element like CropTarget would be to its element. Such OffscreenCanvas can be created-then-transferred synchronously.
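The OffscreenCanvas pattern youennf mentions looks roughly like this. A sketch only: `worker` is a hypothetical Worker the canvas is handed to, and the key point is again that creation and transfer happen in the same synchronous task:

```javascript
// Sketch: an OffscreenCanvas is obtained synchronously from a canvas element
// and transferred in the same task, while remaining tied to that element.
function createAndTransferOffscreen(canvasEl, worker) {
  const offscreen = canvasEl.transferControlToOffscreen(); // synchronous
  worker.postMessage({ canvas: offscreen }, [offscreen]);  // transfer immediately
  return offscreen;
}
```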

@jan-ivar
Member

jan-ivar commented Feb 3, 2022

@eladalon1983 said in #11 (comment):

  • When D2 calls cropTo(X), the UA has to validate that X is a valid CropTarget.
  • It is undesirable to query all documents and check if any of them have produced X

cropTo already returns a promise, so querying "all" documents in the single viewport captured to identify the crop target, seems reasonable to me.

Cost to implementers is low on the priority of constituencies when it comes to shaping APIs.

@eladalon1983
Member

cropTo already returns a promise, so querying "all" documents in the single viewport captured to identify the crop target, seems reasonable to me.

I think you mean "browsing context" where you said "viewport".
Do you have a reasonable limit on how many different documents could be embedded in that browsing context?

Cost to implementers is low on the priority of constituencies when it comes to shaping APIs.

Cost to implementers is low priority, but non-zero. It's only a problem if accommodating implementers comes at a non-trivial cost to a higher-priority constituency. Does it?

@jan-ivar
Member

jan-ivar commented Feb 3, 2022

I think you mean "browsing context" where you said "viewport".

No, each iframe has its own browsing context, which is nested (immediately, or multiple levels down) under the top-level browsing context.

I loosely mean all documents in this capture, which maybe translates to the top-level browsing context's document and all documents in its nested browsing contexts of iframes that intersect the viewport.

Do you have a reasonable limit on how many different documents could be embedded in that browsing context?

Looking up the CropTarget shouldn't be the bottleneck in extreme cases, so this should scale fine.

@eladalon1983
Member

I think you mean "browsing context" where you said "viewport".

No, each iframe has its own browsing context, which is nested (immediately, or multiple levels down) under the top-level browsing context.

I loosely mean all documents in this capture, which maybe translates to the top-level browsing context's document and all documents in its nested browsing contexts of iframes that intersect the viewport.

I've been carrying that mistake around for a while. Thanks for enlightening me.

Do you have a reasonable limit on how many different documents could be embedded in that browsing context?

Looking up the CropTarget shouldn't be the bottleneck in extreme cases, so this should scale fine.

IPC with multiple processes is neither simple, nor performant, nor robust. The cost to implementers is greatly reduced when avoiding this. What's the downside to any other constituency?

@youennf
Contributor Author

youennf commented Feb 4, 2022

IPC with multiple processes is neither simple, nor performant, nor robust. The cost to implementers is greatly reduced when avoiding this.

This is a known problem that is solved in modern browsers. A transferred WritableStream should not have to do multiple IPCs to locate the process of its sink when writing new values.
As I said before, I'd like to understand why it would be more difficult with CropTarget than with all these other existing APIs.

What's the downside to any other constituency?

It is more costly to both web developers and web engines. It is not consistent with existing Web APIs AFAIK.

@youennf
Contributor Author

youennf commented Mar 8, 2022

Given this is a solved problem for other APIs and given these solutions are applicable to CropTarget as well, can we converge on moving away from Promises?

@eladalon1983
Member

IPC with multiple processes is neither simple, nor performant, nor robust. The cost to implementers is greatly reduced when avoiding this.

This is a known problem that is solved in modern browsers. A transferred WritableStream should not have to do multiple IPCs to locate the process of its sink when writing new values. As I said before, I'd like to understand why it would be more difficult with CropTarget than with all these other existing APIs.

I believe I have explained why we have implemented things this way in Chrome. This is a real issue.

What's the downside to any other constituency?

It is more costly to both web developers and web engines. It is not consistent with existing Web APIs AFAIK.

The cost to Web developers is negligible. Crop-target production is a rare occurrence; it does not matter to the Web developer if it completes asynchronously. I can pull in Web developers currently using Region Capture (in origin trial) for major, highly polished products, and they could comment as much. Would you find that convincing? (If not - please pull in a similarly qualified Web developer who could comment to the contrary.)

Given this is a solved problem for other APIs and given these solutions are applicable to CropTarget as well, can we converge on moving away from Promises?

Let's converge towards Promises, given that it's an important implementation issue for Chrome. (And I believe that when the time comes for Safari and Firefox to implement this, they'll find it equally problematic.)

@youennf
Contributor Author

youennf commented Mar 11, 2022

I believe I have explained why we have implemented things this way in Chrome. This is a real issue.

You explained a real issue, which I would classify as an optimization problem (though at some point you alluded to security concerns as well). Is that correct?

The argument is that the same optimization issue exists for already deployed APIs, and was solved without making use of promises. If we do not want to follow this preexisting pattern, we need a clear justification.

To move forward, let's try with more narrowly focused questions:

  • Do you think the other existing web APIs enumerated in this thread have the same issue? Or do you think this is a new problem specific to Region Capture?
  • If it is a new problem, can you detail what exactly is new?
  • If it is not a problem specific to Region Capture, what makes you think the solutions used for implementing those APIs are not good enough here?

As a recap, here are some APIs that I think face the same issue:

  • Create a MessageChannel and transfer one of its MessagePorts synchronously after creation. The transferred port needs to locate, in an optimal manner, where its related channel port lives.
  • Create a MediaStreamTrack from a canvas and transfer it synchronously after creation. The transferred track needs to locate, in an optimal manner, where the canvas element lives.
  • Create an RTCDataChannel from an RTCPeerConnection and transfer it synchronously after creation. The transferred data channel needs to locate, in an optimal manner, where the peer connection lives.

I believe that when the time comes for Safari and Firefox to implement this, they'll find it equally problematic.

I implemented in WebKit some of the APIs that I think face the issue you are describing.
For that purpose, existing techniques were reused so that we introduce no delay in creating the objects, no delay in transferring them, no delay in their algorithms, and no race conditions.

Answering the above questions might help figuring out which problems I might be overlooking.

@alvestrand

Note: When we started trying to transfer MediaStreamTracks in Chrome, the synchronous nature of the transfer gave us major problems in implementation. So the idea that synchronous = solved problem is not universally true.

@youennf
Contributor Author

youennf commented Mar 21, 2022

It is great to hear Chrome's implementation of track transfer is making progress. It is also great to hear Chrome implemented the transfer of synchronously-created MediaStreamTracks in a secure and efficient manner.
Can you validate whether the same approach can be used for CropTarget?

The idea that synchronous = solved problem is not universally true.

The point is more that create-then-transfer-synchronously is a solvable problem (do we all agree?); it has been solved multiple times already.
And that the web platform has never let this particular implementation difficulty dictate the design of new APIs.

To break this existing pattern, compelling motivations seem necessary.

Another reason not to use promises: what happens if the element is moved to another document, which then gets destroyed (with the element reattached to yet another document), all of this during creation of the CropTarget? Should we reject the promise? A synchronous API is simpler for edge cases as well as for web developers.

@eladalon1983
Member

eladalon1983 commented Mar 21, 2022

what happens in case the element goes transferred to another document

I had the same concern. I lost that concern when I learned that Elements are not transferable. (But do correct me if I am wrong.)

which then gets destroyed

It is generally possible for a CropTarget to outlive its Element, and that's OK. The document discusses what happens then. The summary is:

  • If cropped to a target that is in the DOM, has >0 pixels, etc., frames are produced.
  • Otherwise, no further frames are produced. The track is effectively "muted" (but without the muting - as per our earlier agreement).
  • Elements can even be garbage collected! That's not special. It falls into the category set by the previous bullet-point.
  • It's always possible to get a track "unstuck" by calling cropTo again, either to uncrop it or to re-crop it to a new target.
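The recovery described in the last bullet could be sketched like this, assuming the `track.cropTo(target)` shape under discussion, with `null` meaning "uncrop" (a hypothetical helper, not spec text):

```javascript
// Sketch: recover a track whose crop target became useless (e.g. its Element
// left the DOM) by re-cropping it to a new target, or uncropping it entirely.
async function recropOrUncrop(track, newTarget = null) {
  await track.cropTo(newTarget); // null uncrops; a CropTarget re-crops
  return track;
}
```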

@youennf
Contributor Author

youennf commented Mar 21, 2022

I had the same concern. I lost that concern when I learned that Elements are not transferable.

The point I am making is about what happens during the creation of the CropTarget, i.e. while the promise is not yet settled.

@eladalon1983
Member

I had the same concern. I lost that concern when I learned that Elements are not transferable.

The point I am making is about what happens during the creation of the CropTarget, i.e. while the promise is not yet settled.

The thing I am still not getting is - what's going to happen to the Element during that time? The worst that could happen is that it gets garbage collected. I don't think that's a problem. It doesn't seem to matter if the Element is GCed before/after its CropTarget is produced. (And getting GCed after CropTarget-production should be a normal occurrence.)

@eladalon1983
Member

I want to settle this discussion about Promises. And I don't want to leave your message unanswered. Let's briefly examine the three other APIs you've brought up:

  • MessageChannel:
    • If implementing with direct communication between the processes, the risk involved is a necessary evil. This cannot be said for CropTarget. Discoverability in either direction is not a requirement here, and confers little to no benefit.
    • If implementing with mediation via another process, the story gets more complicated. A valid implementation can hide that it's asynchronously minting identifiers behind the moment of posting the MessageChannel to the other document. (Some compromises are required, though.) But I don't want to discuss this because it would lose track of the topic - see below.
  • MediaStreamTrack:
    • These are not currently transferable in Chrome - not synchronously, not at all. To claim it's possible, one needs to present an implementation as proof, not a specification. (Does it work on Safari or Firefox yet? This demo suggests it does not, as of 2022-03-31.)
    • My colleagues working on mst-transferability tell me that they are running into a lot of issues precisely because of the requirement that tracks be "synchronously transferable".
  • RTCDataChannel:

I hope we can proceed without trifurcating the discussion. I did not want to leave your points unanswered, but deep-diving into these three examples would be unwise. We have an independent engineering question here, and it can be resolved on its own merits. These precedents do not seem applicable. Nor should we assume that mistakes and compromises were not made in the design of those other APIs. Let's discuss our own case on its own merits.

I believe I've made a compelling case for why produceCropTarget() should be asynchronous.

  • It allows CropTarget to avoid potentially risky identifiers.
  • It's the conservative design choice to make.
  • It imposes negligible costs on all constituencies.
  • There's nothing stopping implementations from returning a pre-resolved Promise if they so wish.

Let's go with an asynchronous produceCropTarget().

@jan-ivar
Member

I should have posted #11 (comment) here. To summarize it: At least two highly skilled technical people were confused by the current API into thinking it does more than it does.

That's a cost to web developers that we should and do avoid on the regular, as @youennf shows.

@jan-ivar
Member

(Minting can fail if the browser process refuses to record more mappings.)

This is an incorrect implementation since produceCropTarget is infallible.

@eladalon1983
Member

eladalon1983 commented May 23, 2022

What I tried to say was that I do not see a big difference of UA implementation complexity/achievable performances between what Chrome implements and the proposed alternatives.

I humbly disagree. I believe I have argued the point.

I do not really understand in which conditions the application would pause the track, which I interpret as stopping displaying/sharing it between the time cropTo is called and the time its promise resolves.

You gave two examples. I present a third: multiple targets. Assume the application has two sub-sections that the user can switch between sharing by clicking an in-content button. If I were to implement this app, I'd see a click from the user as telling me: (i) stop capturing (and/or transmitting-remotely) the old target's content and (ii) start capturing the new target's content. I'd pause the track until (ii) is completed. This depends on the app; it would only matter to some.

@youennf
Contributor Author

youennf commented May 23, 2022

I humbly disagree. I believe I have argued the point.

It seems we are stuck on this complexity assessment.

I hope we agree, though, that Chrome's current implementation puts additional complexity on the web page's shoulders for the sake of this optimisation.

You gave two examples.

I gave examples where pausing is not required. Your third example is a generalisation of the second, where pausing is not required either, at least for the apps I can think of.

If I were to implement this app, I'd see a click from the user as telling me: (i) stop capturing (and/or transmitting-remotely) the old target's content and (ii) start capturing the new target's content. I'd pause the track until (ii) is completed.

I do not see what would drive users to expect pausing here.
I would think VC apps tend not to follow that pattern; they prefer a seamless transition.
replaceTrack is suited for that seamless transition from the old track to the new track.
Also, replaceTrack is a chained operation: it was thought fine to delay switching sources a bit.
This probably means that some limited switch latency is fine for most WebRTC applications.

If the app is so keen to do that kind of optimisations, they can do it on their side in a faster way than any UA-side optimizations:

  • When getting the getDisplayMedia track, clone it and call cropTo on the clone.
  • When the user clicks, switch to either the original or the clone, synchronously. No need to do any pausing.
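The two bullets above could look roughly like this. It is a sketch under the assumed API shapes in this thread (`cropTo` on the track, an RTCRtpSender's `replaceTrack` for switching); names are illustrative:

```javascript
// Sketch of the clone-ahead pattern: pre-crop a clone up front so that
// switching between cropped and uncropped views is a synchronous swap
// from the app's perspective, with no pausing.
async function prepareTracks(displayTrack, cropTarget) {
  const cropped = displayTrack.clone();
  await cropped.cropTo(cropTarget); // the only async step, done ahead of time
  return { uncropped: displayTrack, cropped };
}

function onUserClick(sender, tracks, wantCropped) {
  // replaceTrack gives a seamless transition between the two prepared tracks.
  return sender.replaceTrack(wantCropped ? tracks.cropped : tracks.uncropped);
}
```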

This depends on the app; it would only matter to some.

I still do not understand which apps would actually want to do what you described, and for which reasons.

@eladalon1983
Member

Holding multiple tracks but only using one at any given time may incur performance costs, or be subject to implementation-specific practical limitations. However, I don't think this is core to our discussion, so I suggest we drop this particular sub-topic. It should be pretty clear that applications calling cropTo() prefer if the call has an effect sooner rather than later.

@youennf
Contributor Author

youennf commented May 23, 2022

It should be pretty clear that applications calling cropTo() prefer if the call has an effect sooner rather than later.

Not at all costs; see the replaceTrack precedent, for instance.
All transparent optimisations are fine. Non-transparent optimisations, like the one we are discussing here, need to be heavily justified.

@jan-ivar
Member

As a side-comment: I don't know how you can credibly argue about performance while adding 4 milliseconds of delay for no reason whatsoever. Consider that postMessage is ~100 times more frequent than setTimeout(0).

The token can be posted while the user is interacting with getDisplayMedia's dialog. Or even before getDisplayMedia is called. Any latency introduced by minting OR posting a token therefore becomes zero.

@eladalon1983 I'm sorry, are you still arguing for setTimeout? Do you understand its purpose is specifically to induce delay? If you meant to simply not wait for completion, just remove the await instead (but let's move this idea to #19):

  function totallySyncFunction() {
    getCropTargetAndSendItToCapturer(element).then(() => {});
  }

Back to performance. I've shown a sync minting API in JS is actually faster because otherwise you're essentially serializing the two steps of generating the key and postMessaging it. This seems to undermine your hypothesis that your design is faster. Do you disagree?

@yoavweiss's response suggests your design only eliminates the “token hasn’t yet arrived” case, eliminating the need for timeouts and lengthy failures in case an invalid token is used by VC. So your design sacrifices optimal success path of cropTo in order to optimize for its failure path? Am I missing something?

And in your response here you're finally claiming performance doesn't matter? Color me confused. Or are you criticizing my measurement fiddle for not demonstrating when the performance I measure would matter? If so, it's true I struggled to imagine an example when this would matter, but that would seem to be on you to produce. You've mentioned multiple targets... please show how the shape of produceCropTarget would have any bearing in that case. It would presumably have to start with IPC from a decision point, just like my measurement fiddle (which only starts measuring post-gDM success btw), and presumably open to the same criticisms that decisions could have started even earlier. Let's see it.

The misguided make-everything-async-there's-no-downside crowd don't seem to understand, or care, that the logical end of what they're advocating (await everything!¹) quickly degrades JS into a preemptive (co-routine) language again. I'm sure they would try to invent locks in JS to solve it. Thankfully, these views are rarer these days than they were in 2011, which is why I had to go back that far for sourcing.

Nobody here has ever advocated for making everything async, or re-inventing locks. Let's stick to the arguments presented.

You seem to be in the "no-downside crowd", claiming there are no downsides in the face of me clearly explaining them (having to reevaluate state assumptions, fueling the proliferation of unnecessary async, non-idiomatic JS confusing web devs into thinking the API does more than it does, and finally, performance).

Let me blow up my footnote that spelled out the relevant point it seems you flat-out missed:

  1. I'm contrasting extremes to aid explanation here. Clearly one more async method isn't the problem, nor does the problem only arise at the extreme. But my aim here was to clarify why the design rules we have matter, and mustn't be violated except for reasons we can all agree are sufficient.

@eladalon1983
Member

eladalon1983 commented May 23, 2022

@eladalon1983 I'm sorry are you still arguing for setTimeout? Do you understand its purpose is specifically to induce delay? If you meant to simply not wait for completion, just remove the await instead (but let's move this idea to #19):

Messages only make sense in their proper context.

  • What I did: I described a technique by which a synchronous function could participate in cropping, even if the developer did not wish to make that function asynchronous.
  • What I did NOT do: Advocate for using setTimeout() in the general case of cropping.

Back to performance. I've shown a sync minting API in JS is actually faster because otherwise you're essentially serializing the two steps of generating the key and postMessaging it. This seems to undermine your hypothesis that your design is faster. Do you disagree?

Did you read my response?

Let me blow up my footnote that spelled out the relevant point it seems you flat-out missed:

I don't think this exaggeration promotes mutual understanding. Let's please keep to the arguments raised and not devolve to misrepresenting each other's arguments even as temporary polemic tools.

@eladalon1983
Member

in the face of me clearly explaining them (having to reevaluate state assumptions, fueling the proliferation of unnecessary async, non-idiomatic JS confusing web devs into thinking the API does more than it does, and finally, performance)

I've asked you for the downsides. To your mention of these specific downsides, I answered:

  1. Let's use cropTarget(Element) in the same document - no new asynchronicity.
  2. Let's use the other API when posting to another document - posting a message is asynchronous anyway. And if your sender is synchronous - which is unlikely - let it just painlessly use setTimeout(async...). It's barely noticeable to the sender, and completely invisible to the receiver.

@azizj1

azizj1 commented May 23, 2022

Hello there. Aziz Javed from Google Meet here, working with Lindsay from Google Editors. We've already built a very significant feature based on this API's origin trial. I'm happy to inform y'all that we were not at all inconvenienced by the async nature of produceCropId. Exposure on MediaDevices vs. on CropTarget is likewise not at all a concern for us; either approach works. Generally speaking, all alternatives debated here are perfectly reasonable for us, so long as they're performant. We'd favor whichever API shape would allow this feature to be shipped in as many browsers as possible, as early as possible.

@jan-ivar
Member

We'd favor whichever API shape would allow this feature to be shipped in as many browsers as possible, as early as possible.

@azizj1 glad to hear this! I believe we've already resolved exposure on MediaDevices, so for me that API shape is what @youennf proposed in #11 (comment):

new CropTarget(element)

Sorry this is taking so long to get to the bottom of.

@eladalon1983
Member

@jan-ivar, have you read this part of Aziz's message?

I'm happy to inform y'all that we were not at all inconvenienced by the async nature of produceCropId.

@youennf
Contributor Author

youennf commented Jun 2, 2022

We quickly discussed this at today's editors' call; hopefully we can discuss and resolve this issue at the next WG interim (next Tuesday).

@eladalon1983
Member

Repeating from elsewhere for better visibility:
I am not sure I'll be able to attend. I regret that the intention to discuss #17 was announced only today, or I would have cleared my schedule ahead of time. (As of earlier today, no agenda items were announced.) I'll try to shuffle around my other commitments and make it. But if I cannot, I suggest that it might be more productive to discuss Region Capture during an interim meeting which I can attend. For your consideration.

CC @aboba and @dontcallmedom

@eladalon1983
Member

For those interested, the rescheduled meeting will take place 2022-06-23. See more details here.

@aboba
Contributor

aboba commented Jun 24, 2022

After looking over the slides and listening to the arguments, my take is the following:

  1. The W3C design principles cite interprocess communications (IPC) as a potential reason to choose async over sync, so they don't provide much in the way of definitive guidance here.
  2. There is a good argument that cropTo() needs to be async.
  3. The concerns about resource exhaustion attacks against CropTarget() relate more to a potential implementation issue than to the async vs. sync question. That is, resource exhaustion attacks appear addressable either way. For example, resource allocation could be moved from CropTarget() to cropTo(), since the latter is async.
  4. As Tim Panton pointed out, the only existing implementation uses async and the developers do not believe that a sync approach would be feasible for them. On the other hand, those who favor sync could adopt the async approach and if desired, move resource allocation and other potentially blocking operations to cropTo(), returning much sooner. So it would appear that all implementers should be able to "live with" async, whereas some implementers claim they cannot live with the sync approach.
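The deferred-allocation idea in point 3 could be sketched like this. Purely illustrative: `SketchCropTarget` and `registerWithBrowserProcess` are hypothetical names, and the bookkeeping is invented to show where the fallible work would move:

```javascript
// Sketch: construction stays synchronous and infallible; the fallible,
// IPC-backed resource allocation is deferred to cropTo(), which is
// already async and can surface failures on its Promise.
class SketchCropTarget {
  constructor(element) {
    this.element = element; // no IPC, cannot fail
  }
}

async function cropTo(track, target, registerWithBrowserProcess) {
  // Registration/allocation happens here instead of at minting time.
  const handle = await registerWithBrowserProcess(target.element);
  track.cropHandle = handle; // hypothetical bookkeeping on the track
  return track;
}
```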

@jan-ivar
Member

jan-ivar commented Jun 27, 2022

> The W3C design principles cite interprocess communications (IPC) as a potential reason to choose async over sync, so they don't provide much in the way of definitive guidance here.

It says to use sync if the "API ... will not be blocked by ... inter-process communication." (emphasis on blocked)

I hope I showed that any IPC in fromElement is a (premature) optimization, and that there's no reason for JS to block on it since it cannot fail; doing so just slows things down. Needs for failure are discussed and disputed in #48.

> whereas some implementers claim they cannot live with the sync approach.

We talked about this on the call, and my recollection was this argument reduced to convenience of implementation, i.e. not a need.

> For example, resource allocation could be moved from CropTarget() to cropTo(), since the latter is async.

Exactly. [in Chrome: mint a dud CropTarget that always fails in cropTo instead]

@yoavweiss

> It says to use sync if the "API ... will not be blocked by ... inter-process communication." (emphasis on blocked)

You missed "locks" inside the ellipsis.

> there's no reason for JS to block on it since it cannot fail

It can, though.

> Needs for failure are discussed and disputed in #48.

"disputed" != "settled". There's still an ongoing discussion.

> Exactly. [in Chrome: mint a dud CropTarget that always fails in cropTo instead]

That would create a significantly worse developer experience, as failures would be disconnected from the code/origin/party that causes them.

@jan-ivar
Member

Are there "locks" involved other than IPC? What I meant is that the FPWD says it cannot fail. I agree about settling #48 first.

Who "caused" the (global?) resource error seems important only if they can do anything about it. We have no API for relinquishing CropTargets, only minting more, so any recovery would violate § 5.3. Don’t expose garbage collection.

@yoavweiss

> Are there "locks" involved other than IPC?

Locks were suggested as an alternative for renderer->browser IPCs, for off-main-thread renderer<=>renderer communication. I was suggesting that such a solution also requires an async API, to avoid the lock blocking the main thread.

@jan-ivar
Member

jan-ivar commented Jun 28, 2022

> I was suggesting that such a solution also requires an async API, to avoid the lock blocking the main thread.

I don't think it does. Any API can trigger in parallel work. Returning a promise is only valuable if the caller needs the result (success being a result).

If we "avoid blocking" then it says to use sync (if the "API ... will not be blocked by ... a lock.")
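
The distinction being drawn here, a sync call that still triggers in-parallel work, can be sketched with a mock. Everything in it is invented for illustration: `registerCropTarget()` and `pendingWork` are not spec names, and a microtask queue stands in for cross-process propagation.

```javascript
// Sketch only: a synchronous, infallible API whose real work runs
// "in parallel". registerCropTarget() and pendingWork are invented;
// the microtask queue stands in for cross-process propagation.

const pendingWork = [];

function registerCropTarget(element) {
  const token = { element }; // minted synchronously; cannot fail

  // Kick off background propagation. The caller gets no promise
  // because there is no result it needs to observe.
  pendingWork.push(() => {
    // e.g. propagate the token to other processes
  });
  queueMicrotask(() => pendingWork.splice(0).forEach((job) => job()));

  return token;
}
```

The caller never blocks on the queued work, and the main thread is never held by a lock; the work simply happens after the call returns.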

@aboba
Contributor

aboba commented Jun 29, 2022

If CropTarget() is vulnerable to resource exhaustion attacks, wouldn't that imply that `CropTarget()` could fail (e.g. due to lack of resources)?

Moving some or all resource allocation to cropTo() could result in the resource allocation arising later (in cropTo() instead of CropTarget()). So depending on where the resource allocation is done, couldn't either or both of CropTarget() and cropTo() fail, for similar reasons?

Recently I was asked to investigate a bug in a sync API that is supposed to always return immediately. In the most common use cases, it did return immediately (< .1 ms), no matter how much load was placed on it. However, in a particular (reproducible) set of circumstances, it blocks for 30-50 ms. The cause appears to be a mixture of resource (de)-allocation and blocking IPC. It's probably a bug not a feature, but sometimes sync APIs might not be able to return immediately for unanticipated reasons.

I've also seen sync APIs where the "in parallel work" turned out to be more complicated than expected. For example, in ORTC the RTCRtpSender.send() method was originally sync. However, we discovered that the parameters to the method required fairly extensive validation, which could result in issues being discovered after the method had returned, so we had to switch to async. Similarly with RTCRtpSender.getCapabilities(), we thought it would always return immediately because the result would be pre-computed, but that turned out not to be the case for codecs requiring hardware acceleration (e.g. discovering the hardware capabilities could take a while).

@jan-ivar
Member

jan-ivar commented Jul 5, 2022

> If CropTarget() is vulnerable to resource exhaustion attacks, wouldn't that imply that `CropTarget()` could fail (e.g. due to lack of resources)?

Causality runs the other way: letting CropTargets fail allows for implementations vulnerable to exhaustion attacks. Not doing so doesn't.

A sensible implementation should be invulnerable to resource exhaustion attacks, by simply not tying resources to a token so easily created by anyone.

> Moving some or all resource allocation to cropTo() could result in the resource allocation arising later (in cropTo() instead of CropTarget()). So depending on where the resource allocation is done, couldn't either or both of CropTarget() and cropTo() fail, for similar reasons?

What resource allocation is needed? A sensible cropTo implementation can use IPC to find the element it's supposed to crop to without consuming any finite resources. Also, cropTo is behind getDisplayMedia permission.

Chrome has implemented a neat but premature optimization, and refuses to implement the fallback needed to hide the resource exhaustion they’ve exposed themselves to.

I don't find the idea that creating a {} crop target will ever take appreciable time convincing.

@eladalon1983
Member

> Chrome has implemented a neat but premature optimization, and refuses to implement the fallback needed to hide the resource exhaustion they’ve exposed themselves to.

The spec does not compel anyone to replicate our alleged neat mistakes.

@youennf
Contributor Author

youennf commented Jul 6, 2022

The issue is not really about other UA implementations but about web developer impact.
For instance, given that Chrome is globally capping the number of CropTargets due to resource exhaustion risks, web developers might want to create CropTargets cautiously, typically only once they know the web page is being captured.
This goes against the idea of a web site proactively creating some CropTargets.
This might also have an impact on future additions to this API.
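
That cautious pattern can be sketched as an app-side workaround. This is a sketch under assumptions: `fromElement` below is a local stub standing in for `CropTarget.fromElement()`, and `onCaptureStarted` is an invented hook, since the page would need some out-of-band signal that capture has begun.

```javascript
// Sketch of the "create CropTargets only once captured" workaround.
// fromElement is a local stub standing in for CropTarget.fromElement();
// onCaptureStarted is an invented notification hook.

const fromElement = async (element) => ({ element }); // stub

const cropTargets = new Map(); // cache, to avoid minting twice
let captureKnown = false;
const deferred = [];

function requestCropTarget(element) {
  // Defer minting until capture is known, to stay under a global cap.
  return new Promise((resolve) => {
    deferred.push(async () => resolve(await mint(element)));
    if (captureKnown) flush();
  });
}

async function mint(element) {
  if (!cropTargets.has(element)) {
    cropTargets.set(element, await fromElement(element));
  }
  return cropTargets.get(element);
}

function flush() {
  deferred.splice(0).forEach((job) => job());
}

function onCaptureStarted() {
  captureKnown = true;
  flush();
}
```

The awkwardness of this dance, compared with simply minting targets up front, is the developer-impact cost being described.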

It would be good to understand whether Chrome has a plan to fix this global resource exhaustion issue.
As I understand it, the main problem seems to be implementation cost.

I already said this, but working on #48 may allow us to make progress on this particular issue.

@eladalon1983
Member

> The issue is not really about other UA implementations but about web developer impact. For instance, given that Chrome is globally capping the number of CropTargets due to resource exhaustion risks, web developers might want to create CropTargets cautiously, typically only once they know the web page is being captured. This goes against the idea of a web site proactively creating some CropTargets. This might also have an impact on future additions to this API.

As previously explained, the global limit is an artifact of the current implementation, and can be changed in a number of ways. The minimal step forward is to make the limitation per-iframe rather than per-tab; that much is likely quite simple, and I doubt more would be needed.
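
The per-iframe alternative could look roughly like the following bookkeeping. This is illustrative only; `PER_FRAME_LIMIT`, the Map, and the token shape are invented, not Chrome's actual code.

```javascript
// Sketch only: per-frame bookkeeping for a CropTarget cap, replacing
// a single per-tab counter. PER_FRAME_LIMIT and countsByFrame are
// invented names for illustration.

const PER_FRAME_LIMIT = 16;
const countsByFrame = new Map();

function tryMintCropTarget(frameId) {
  const used = countsByFrame.get(frameId) ?? 0;
  if (used >= PER_FRAME_LIMIT) {
    return null; // one frame exhausting its quota no longer starves others
  }
  countsByFrame.set(frameId, used + 1);
  return { frameId, index: used }; // opaque token
}
```

The point of the per-frame key is isolation: a misbehaving embedded iframe can no longer consume the whole tab's quota.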

> It would be good to understand whether Chrome has a plan to fix this global resource exhaustion issue. As I understand it, the main problem seems to be implementation cost.

I'll be happy to share more about my plans for prioritizing work in Chrome if you would be interested in likewise sharing Apple's timelines for implementing this feature. Given the high engagement, may I read that both Mozilla and Apple are prioritizing this work highly and intend to accomplish it in the near future?
