Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API shape for creating a MediaStreamTrack from a WHATWG Stream #70

Closed
alvestrand opened this issue Nov 9, 2021 · 12 comments
Closed

API shape for creating a MediaStreamTrack from a WHATWG Stream #70

alvestrand opened this issue Nov 9, 2021 · 12 comments

Comments

@alvestrand
Copy link
Contributor

Two API shapes have been proposed:

Arguments on which to base a decision on which API shape to choose are needed.

@jan-ivar
Copy link
Member

jan-ivar commented Nov 9, 2021

Note this issue is solely over shape: A subclass of track with a writable attribute, vs. a new interface object with a writable attribute and a track attribute:

const track = new MediaStreamTrackGenerator({kind: 'video'});
await readable.pipeThrough(transformer).pipeTo(track.writable);

vs.

const {writable, track} = new VideoTrackSource();
await readable.pipeThrough(transformer).pipeTo(writable);

Naming and exposure ([Exposed=DedicatedWorker] only, to reflect consensus) are orthogonal to this discussion.

@jan-ivar
Copy link
Member

jan-ivar commented Nov 9, 2021

Audio exposure is also orthogonal (video-only for now to reflect consensus).

@alvestrand
Copy link
Contributor Author

The pattern in the rest of the platform is "we generate a stream from a source":

  • getUserMedia() and getDisplayMedia() generates a stream from a hidden source object
  • HTMLMediaElmeent.captureStream() and HTMLCanvasElement() generates a stream from an explicit source object
  • AudioContext.createMediaStreamDestination() generates an object with a stream attribute from an explicit source object

I would like @guidou to comment, but I think the separation of the stream from the source object is a good pattern to be consistent about.

(Note: the name is just a name.)

(Note: [Exposed=DedidatedWorker] does not have consensus; it does have a trend - it's still not closed.)

@guidou
Copy link
Contributor

guidou commented Nov 10, 2021

CanvasCaptureMediaStreamTrack is precedent for using inheritance from MediaStreamTrack and having something representing the source as a field (i.e., the canvas field).

In principle, I'm OK with both approaches.
One advantage of using inheritance is that it is pretty clear that the generator, which is a track, must be stopped. It is less clear that a track field represents an already live track (or is not live until it's accessed?) which must be stopped to stop the source.
One advantage of having a field is that having clone() return an MST is clearer than in the inherited case.
Another proposal we considered was to have a static factory function that returned two objects: the source and the track.

There isn't really a big difference between the three approaches and, in principle, I would go for the one more likely to lead to consensus. However, given that the inheritance-based MSTG has shipped on Chromium and a lot of production code has already been written and even more is currently being written against it, in my view, the bar for using something different is a lot higher now and I would personally oppose anything different unless it is shown to be categorically better.

@jan-ivar
Copy link
Member

One advantage of using inheritance is that it is pretty clear that the generator, which is a track, must be stopped.

Why must it be stopped? This doesn't seem like an advantage to me.

But thanks for raising the precedent with canvas. However, following it, I think, would require fixing clone and renaming MSTG to GMST. Consider:

const [track1] = (await canvas.captureStream()).getVideoTracks();
const track2 = track1.clone();
console.log(track1.canvas === track2.canvas); // true

Semantically, track1 here isn't the source. canvas is. So following this precedent would lead me here:

const [track1] = new GeneratedMediaStreamTrack({kind: "video"});
const track2 = track1.clone();
console.log(track1.writable === track2.writable); // true

In other words, the writable is the source, not track1, and clone() returns a clone, not something else.

The cloning in the MSTG proposal seems problematic in this regard. Consider also https://github.com/w3c/mediacapture-main/issues/823.

@alvestrand
Copy link
Contributor Author

The CanvasCaptureMediaStreamTrack is produced by HTMLCanvasElement.captureStream(), and contains a backreference to its canvas. There is no constructor for the CanvasCaptureMediaStreamTrack. So I don't think this is a precedent for creating subclasses of MediaStreamTrack with independent constructors.

@guidou
Copy link
Contributor

guidou commented Nov 16, 2021

One advantage of using inheritance is that it is pretty clear that the generator, which is a track, must be stopped.

Why must it be stopped? This doesn't seem like an advantage to me.

The track is live and must be stopped if you want it to stop using resources. Why would you not want to stop the track?
You don't need to stop the source (it stops automatically when all tracks are disconnected).
When the track is a field it is less clear that it is a live track that must be stopped to stop consuming resources.

But thanks for raising the precedent with canvas. However, following it, I think, would require fixing clone and renaming MSTG to GMST. Consider:

const [track1] = (await canvas.captureStream()).getVideoTracks();
const track2 = track1.clone();
console.log(track1.canvas === track2.canvas); // true

Semantically, track1 here isn't the source. canvas is. So following this precedent would lead me here:

const [track1] = new GeneratedMediaStreamTrack({kind: "video"});
const track2 = track1.clone();
console.log(track1.writable === track2.writable); // true

In other words, the writable is the source, not track1, and clone() returns a clone, not something else.

Correct, the writable is the source. I'm fine with making the clone be a MSTG where clone.writable == original.writable. I preferred the clone to be a MST instead of MSTG to avoid giving the impression that writable is a new writable, but given the CanvasCaptureMST precedent, I agree that it would be more consistent to make the clone an MSTG.
We can make that change and be backwards compatible with practically all current code using MSTG.

The cloning in the MSTG proposal seems problematic in this regard. Consider also w3c/mediacapture-main#823.

As indicated above, we can easily fix it.

@guidou
Copy link
Contributor

guidou commented Nov 16, 2021

The CanvasCaptureMediaStreamTrack is produced by HTMLCanvasElement.captureStream(), and contains a backreference to its canvas. There is no constructor for the CanvasCaptureMediaStreamTrack. So I don't think this is a precedent for creating subclasses of MediaStreamTrack with independent constructors.

There is no precedent for creating tracks with a constructor, but CCMST is indeed a precedent for creating.a track using subclasses with a reference to its source. We could create it with a method instead of a constructor, but I don't see the advantage of doing that.

I reiterate that all these different surfaces are largely the same thing with very minor advantages and disadvantages and I would normally be fine with any of them. My point is that, since production code is written against MSTG, we should go with something different only if we see a significant advantage, and I don't see that advantage anywhere.
I'm fine with making MSTG.clone() returning another MSTG to make it more consistent with CCMST since that wouldn't break any code.

Sorry, but we had 12 months to discuss these things and instead we wasted it arguing mostly about the incorrect claim that MSTP was buffering frames out of control. Now compatibility with existing production code is another variable that I think we should taken into account.

@jan-ivar
Copy link
Member

The track is live and must be stopped if you want it to stop using resources.

What resources? Isn't the sole (re)source here the writable? Calling stop appears to indirectly close the writable, so if users close the writable they should be good.

Why would you not want to stop the track?

That's not the issue. I'm countering your claim that users MUST remember to call stop — and by extension countering your claim that not burying stop in a sub-object is a design advantage — or else resources somehow aren't relinquished, which doesn't appear accurate. Users bring their own source here, so as long as they abort or close that pipe, they should be good whether they've called stop or not. If they haven't yet started piping to the writable, there should be no need to do anything at all.

When the track is a field it is less clear that it is a live track that must be stopped to stop consuming resources.

Which means this doesn't matter.

@guidou
Copy link
Contributor

guidou commented Nov 17, 2021

The track is live and must be stopped if you want it to stop using resources.

What resources? Isn't the sole (re)source here the writable? Calling stop appears to indirectly close the writable, so if users close the writable they should be good.

Why would you not want to stop the track?

That's not the issue. I'm countering your claim that users MUST remember to call stop — and by extension countering your claim that not burying stop in a sub-object is a design advantage — or else resources somehow aren't relinquished, which doesn't appear accurate. Users bring their own source here, so as long as they abort or close that pipe, they should be good whether they've called stop or not. If they haven't yet started piping to the writable, there should be no need to do anything at all.

My point is that it's not clear that users need remember to stop something (either the track or the source).
With the field approach it's no so clear if you have a live source and/or a live track once you create the object.
If I create a VideoTrackSource, do I need to close its writable if I never access it? (e.g., if there is some other error in the application)
If the writable is live without accessing it, is the track live as well? (I agree that this question is less important, but as mentioned above, it's not so clear that the source is live already).

With the inheritance approach, it is very clear that you get a live track and therefore a live source, and if you stop that track you also automatically stop the source (provided you didn't clone it). Very similar to what happens with a camera and the video track returned by getUserMedia().

When the track is a field it is less clear that it is a live track that must be stopped to stop consuming resources.

Which means this doesn't matter.

It matters in the sense that if you don't know if you have a live track you also don't know if you have a live source that needs to be closed.

This doesn't mean I think the field approach is bad, or even inferior. IMO, there is no actual functional tradeoff here, since the lower clarity I refer to is solved via documentation and behavior is the same provided the VideoTrackSource is live upon creation. This is a case where the decision comes down to preference as there is no real tradeoff.

In principle, I'm OK with both approaches, but since we have a lot of production code being written against the inheritance-based option, that makes me lean towards it.

@guidou
Copy link
Contributor

guidou commented Nov 17, 2021

The CanvasCaptureMediaStreamTrack is produced by HTMLCanvasElement.captureStream(), and contains a backreference to its canvas. There is no constructor for the CanvasCaptureMediaStreamTrack. So I don't think this is a precedent for creating subclasses of MediaStreamTrack with independent constructors.

Both the field and inheritance approaches result in a returned track upon calling a constructor.
Is there a precedent for a new track returned as a field from an object created by calling a constructor?
If not, which I think is the case, there is no precedent for either approach. Still I think a constructor is OK for both approaches.

@dontcallmedom dontcallmedom transferred this issue from w3c/mediacapture-extensions Jan 4, 2022
@alvestrand
Copy link
Contributor Author

The consensus documented is that we have a VideoTrackGenerator (and an AudioTrackGenerator if we decide to add audio), which has a MediaStreamTrack attribute.
Fixed in #66.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants