Add replaceTrack method to MediaStream #167

Open
guest271314 opened this issue May 9, 2019 · 7 comments

Comments

@guest271314

commented May 9, 2019

Related:

Given a MediaStream instance either created using new MediaStream() and/or by getUserMedia(), HTMLMediaElement.captureStream(), add a replaceTrack() method directly to the existing MediaStream which behaves similarly to RTCRtpSender.replaceTrack() https://w3c.github.io/webrtc-pc/#dom-rtcrtpsender-replacetrack

NOTE

There is not an exact 1:1 correspondence between tracks sent by one RTCPeerConnection and received by the other. For one, IDs of tracks sent have no mapping to the IDs of tracks received. Also, replaceTrack changes the track sent by an RTCRtpSender without creating a new track on the receiver side; the corresponding RTCRtpReceiver will only have a single track, potentially representing multiple sources of media stitched together. Both addTransceiver and replaceTrack can be used to cause the same track to be sent multiple times, which will be observed on the receiver side as multiple receivers each with its own separate track. Thus it's more accurate to think of a 1:1 relationship between an RTCRtpSender on one side and an RTCRtpReceiver's track on the other side, matching senders and receivers using the RTCRtpTransceiver's mid if necessary. (emphasis added)

The primary use case for adding a replaceTrack() method to MediaStream is recording media with MediaRecorder (for example, following a call to HTMLMediaElement.captureStream()) while the src attribute of the HTMLMediaElement changes. Currently MediaRecorder stops recording when a track is added to the MediaStream being recorded, which is what happens when the src of a <video> or <audio> element changes.

This is currently possible using RTCPeerConnection (proof of concept: https://github.com/guest271314/MediaFragmentRecorder/blob/webrtc-replacetrack/MediaFragmentRecorder.html), though it should be possible without the end user having to manually create two RTCPeerConnection instances. Instead, the end user should simply be able to execute <MediaStreamInstance>.replaceTrack(withTrack), where internally replaceTrack(withTrack) provides the same functionality as RTCRtpSender.replaceTrack().

multiple sources of media stitched together

which will be observed on the receiver side as multiple receivers each with its own separate track
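
A rough sketch of the two-connection workaround referred to above, for illustration only and not taken from the linked proof of concept: a track is looped through a pair of local RTCPeerConnection objects, so the receiving track stays the same object while RTCRtpSender.replaceTrack() swaps the source. The helper name createReplaceableTrack and the injectable PeerConnection parameter are assumptions made for this sketch.

```javascript
// Sketch only. createReplaceableTrack is a hypothetical helper name.
// PeerConnection is injectable so the flow can be exercised outside a browser.
async function createReplaceableTrack(
  initialTrack,
  PeerConnection = globalThis.RTCPeerConnection
) {
  const sendPc = new PeerConnection();
  const recvPc = new PeerConnection();

  // Both ends live in the same page, so ICE candidates are handed over directly.
  sendPc.onicecandidate = ({ candidate }) =>
    candidate && recvPc.addIceCandidate(candidate);
  recvPc.onicecandidate = ({ candidate }) =>
    candidate && sendPc.addIceCandidate(candidate);

  // The receiving side exposes one track; this is the one that would be recorded.
  const received = new Promise(resolve => {
    recvPc.ontrack = ({ track }) => resolve(track);
  });

  const sender = sendPc.addTrack(initialTrack);

  // Standard offer/answer exchange between the two local connections.
  const offer = await sendPc.createOffer();
  await sendPc.setLocalDescription(offer);
  await recvPc.setRemoteDescription(offer);
  const answer = await recvPc.createAnswer();
  await recvPc.setLocalDescription(answer);
  await sendPc.setRemoteDescription(answer);

  return {
    track: await received,
    // Swaps the source without creating a new track on the receiving side.
    replaceTrack: next => sender.replaceTrack(next),
    close() {
      sendPc.close();
      recvPc.close();
    }
  };
}
```

In a browser, the returned track would be the one placed in the MediaStream handed to MediaRecorder, with replaceTrack(next) called whenever the source changes. The proposal in this issue is to make that possible without the two connections.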

Filing this issue before filing a PR for changes to the respective specification(s).

@youennf

Contributor

commented May 10, 2019

It seems your use case is mostly about media recorder and the ability to switch tracks when recording.
I would file an issue on the media recorder spec.
Also, I am not clear whether this is a spec issue or a browser implementation limitation.
Media recorder could allow, if it is not the case, this mutation in a MediaStream while the recording is paused.

@guest271314

Author

commented May 10, 2019

@youennf The MediaRecorder would necessarily have to replace the track of the underlying media resource (MediaStreamTrack) being written (stored) which would lead back to the functionality of MediaStream.

Relevant specification language

https://w3c.github.io/mediacapture-main/#dom-mediastream

4.1 Introduction

A MediaStream is used to group several MediaStreamTrack objects into one unit that can be recorded or rendered in a media element.

Each MediaStream can contain zero or more MediaStreamTrack objects. All tracks in a MediaStream are intended to be synchronized when rendered. This is not a hard requirement, since it might not be possible to synchronize tracks from sources that have different clocks. Different MediaStream objects do not need to be synchronized. (emphasis added)

https://w3c.github.io/mediacapture-fromelement/

Methods

captureStream

The captureStream() method produces a real-time capture of the media that is rendered to the media element.

The captured MediaStream comprises of MediaStreamTracks that render the content from the set of selected (for VideoTracks, or other exclusively selected track types) or enabled (for AudioTracks, or other track types that support multiple selections) tracks from the media element. If the media element does not have a selected or enabled tracks of a given type, then no MediaStreamTrack of that type is present in the captured stream.

The HTML specification mentions using the videoTracks property of HTMLMediaElement, though as far as I am aware (and to the extent I have tried) that feature is not possible to implement using Media Fragments URI https://www.w3.org/TR/media-frags/ with a MediaStream object.

https://www.w3.org/TR/html5/semantics-embedded-content.html#selecting-specific-audio-and-video-tracks-declaratively

4.7.13.10.2. Selecting specific audio and video tracks declaratively
The audioTracks and videoTracks attributes allow scripts to select which track should play, but it is also possible to select specific tracks declaratively, by specifying particular tracks in the fragment of the URL of the media resource. The format of the fragment depends on the MIME type of the media resource. [RFC2046] [URL]

In this example, a video that uses a format that supports the media fragments syntax is embedded in such a way that the alternative angles labeled "Alternative" are enabled instead of the default video track. [MEDIA-FRAGS]
<video src="myvideo#track=Alternative"></video>

MediaRecorder specification https://w3c.github.io/mediacapture-record/ does not specifically mention limiting video tracks to 1.

There is ambiguity among the various specifications which use the terms MediaStream and MediaStreamTrack.

Interestingly, Chromium 73 does not dispatch an ended event at the MediaStreamTrack when src is changed at the HTMLMediaElement, which technically, per the specification, should not stop MediaRecorder from recording. However, since the audio and video tracks are set at the MediaStream in the order A,A,..V,V or V,V,..A,A, it is still not possible to use enabled, addTrack() and removeTrack(), or the addtrack event to continue recording.

Browsing the source code of MediaRecorder implementations reveals that multiple tracks have been considered for some time now; e.g., https://github.com/mozilla/gecko-dev/blob/master/dom/media/MediaRecorder.cpp#L778

      // When MediaRecorder supports multiple tracks, we should set up a single
      // MediaInputPort from the input stream, and let main thread check
      // track principals async later.

https://github.com/mozilla/gecko-dev/blob/master/dom/media/MediaRecorder.cpp#L807

    // We only allow one audio track. See bug 1276928.
    return;
  }
  if (track->AsVideoStreamTrack() && aTrack.AsVideoStreamTrack()) {
    // We only allow one video track. See bug 1276928.
    return;
  }

Similar issues and branches for code which sets multiple video tracks that could be selected exist for Chromium; e.g., https://bugs.chromium.org/p/chromium/issues/detail?id=528523.

The relevant issues have already been filed at MediaRecorder specification at GitHub (#147; et al., briefly referenced at the first post).

Would not characterize the lack of support for recording multiple MediaStreamTracks of kind video as a browser implementation "limitation"; the expected result is possible using various workarounds. From the perspective here, the functionality should be possible directly at the MediaStream itself, to avoid the need to address the potential for continued ambiguity in specifications derived from or using the MediaStream language. Allowing pause and resume between src changes to an HTMLMediaElement, where the timecode EBML elements generated would be in sequential order, would be another possible solution. replaceTrack() nonetheless appears to achieve the expected result, save for the issue of the last one second or less of recorded audio being muted.

Do not have a particular preference for how the functionality of using MediaRecorder to record multiple tracks is implemented; the more ways the functionality is implemented, the more ways there will be to actually create code which outputs the same result at FOSS browsers. Since replaceTrack() is already implemented, and the WebRTC developers are active, incorporating replaceTrack() into the MediaStream and capture APIs should not be particularly difficult; less difficult in actual code than in writing exacting technical documents which are actually consistent in definitions and implementations.

@guest271314

Author

commented May 10, 2019

@youennf Re https://bugs.chromium.org/p/chromium/issues/detail?id=528523#c14

Status: WontFix (was: Assigned)

I don't think we are going to implement this feature
any time soon, partially due to container lack of
support.

From own minimal experimentation, the primary issue appears to be setting timecode elements in sequential order given a frame rate: https://gist.github.com/guest271314/f942fb9febd9c197bd3824973794ba3e, https://gist.github.com/guest271314/17d62bf74a97d3f111aa25605a9cd1ca

<mkv2xml>
<EBML>
  <EBMLVersion>1</EBMLVersion>
  <EBMLReadVersion>1</EBMLReadVersion>
  <EBMLMaxIDLength>4</EBMLMaxIDLength>
  <EBMLMaxSizeLength>8</EBMLMaxSizeLength>
  <DocType>webm</DocType>
  <DocTypeVersion>2</DocTypeVersion>
  <DocTypeReadVersion>2</DocTypeReadVersion>
</EBML>
<Segment>
<Info>
  <TimecodeScale>1000000</TimecodeScale>
  <MuxingApp>webm-writer-js</MuxingApp>
  <WritingApp>webm-writer-js</WritingApp>
  <Duration>1966.66666667</Duration>
</Info>
<Tracks>
  <TrackEntry>
    <TrackNumber>1</TrackNumber>
    <TrackUID>1</TrackUID>
    <FlagLacing>0</FlagLacing>
    <Language>und</Language>
    <CodecID>V_VP8</CodecID>
    <CodecName>VP8</CodecName>
    <TrackType>1</TrackType>
    <Video>
      <PixelWidth>320</PixelWidth>
      <PixelHeight>240</PixelHeight>
    </Video>
  </TrackEntry>
</Tracks>
<Cluster>
  <Timecode>0</Timecode>
  <SimpleBlock>
    <track>1</track>
    <timecode>0.0</timecode>
    <keyframe/>
    <data>
    ..

(https://github.com/guest271314/mkvparse; https://github.com/guest271314/webm-writer-js. The resulting uncompressed webm file is considerably larger than the webm files produced by Chromium or Firefox, which also differ from each other, both in general and depending on the codec used, video/x-matroska;codecs=avc1 producing the fewest total bytes.) This is without delving into setting cues.
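
The sequential-timecode requirement mentioned in this comment can be illustrated with a small arithmetic sketch. It assumes the TimecodeScale of 1000000 shown in the dump (so timecodes are in milliseconds) and a constant frame rate; the function names and the idea of describing each recorded segment by its frame count are assumptions made for illustration, not part of webm-writer-js or any browser implementation.

```javascript
// Sketch only: timecodes in milliseconds, assuming TimecodeScale = 1000000 ns.
// frameTimecodes and sequentialTimecodes are hypothetical helper names.

// Timecodes for frameCount frames at a constant fps, starting at startMs.
function frameTimecodes(frameCount, fps, startMs = 0) {
  const timecodes = [];
  for (let i = 0; i < frameCount; i++) {
    timecodes.push(startMs + (i * 1000) / fps);
  }
  return timecodes;
}

// Concatenates several segments (given as frame counts) so that each segment
// starts where the previous one ended, instead of restarting at 0.
function sequentialTimecodes(segmentFrameCounts, fps) {
  const all = [];
  let startMs = 0;
  for (const frameCount of segmentFrameCounts) {
    all.push(...frameTimecodes(frameCount, fps, startMs));
    startMs += (frameCount * 1000) / fps;
  }
  return all;
}
```

For example, two segments of 2 and 3 frames at 25 fps give timecodes 0, 40, 80, 120 and 160 ms: the second segment continues at 80 ms rather than restarting at 0.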

@alvestrand

Contributor

commented May 13, 2019

What is the desired difference between the proposed "replaceTrack" and doing "removeTrack/addTrack" on the MediaStream?

@guest271314

Author

commented May 14, 2019

@alvestrand In brief, the MediaRecorder specification stops recording when a MediaStreamTrack is added to or removed from a MediaStream. This holds whether the tracks are added because the src attribute of the HTMLMediaElement is changed (the MediaStream returned by captureStream() has tracks added and an addtrack event is dispatched), or because addTrack() and/or removeTrack() is called directly on the MediaStream (which does not dispatch an addtrack event).

The difference is that replaceTrack(), defined as a method of MediaStream, should behave similarly to or the same as RTCRtpSender.replaceTrack() without explicitly using RTCPeerConnection(): the recorded MediaStreamTrack, "potentially representing multiple sources of media stitched together", remains the same from the perspective of MediaRecorder, so MediaRecorder does not stop recording. E.g.,

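// replaceTrack() is the proposed method; it does not exist on MediaStream today
let mediaStream = new MediaStream([videoTrack, audioTrack]);
// do stuff
// nextVideoTrack: a MediaStreamTrack of kind "video" obtained elsewhere
mediaStream.replaceTrack(nextVideoTrack);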

or

videoCapturedStream.onaddtrack = e => {
  videoCapturedStream.replaceTrack(e.track);
}

or

anyMediaStreamInstanceFromAnySpecificationIncludingMediaStreamUsage.replaceTrack(withTrack)

Adding replaceTrack() to MediaStream would provide the ability to switch the media resource stream without the MediaRecorder specification having to be adjusted (adding the ability to switch between multiple live video and audio tracks without stopping the recorder instance might otherwise require such adjustments), while adding functionality to all specifications and APIs which use MediaStream.
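
For contrast with the proposal, the removeTrack()/addTrack() combination asked about earlier in the thread can be written as a small helper. This is a sketch under the assumption that the stream exposes getTracks(), addTrack() and removeTrack() as MediaStream does; the name replaceTrackViaAddRemove is hypothetical. It changes the stream's track set, which is exactly the kind of change that, per the comment above, stops MediaRecorder, so it shows the call shape but not the requested behavior.

```javascript
// Sketch only: swaps the first track of the same kind using removeTrack()
// and addTrack(). replaceTrackViaAddRemove is a hypothetical helper name.
// Unlike the proposed MediaStream.replaceTrack(), this alters the track set.
function replaceTrackViaAddRemove(stream, withTrack) {
  const current = stream.getTracks().find(t => t.kind === withTrack.kind);
  if (current === withTrack) {
    return stream; // nothing to do, the track is already in the stream
  }
  if (current) {
    stream.removeTrack(current);
  }
  stream.addTrack(withTrack);
  return stream;
}
```

If the stream holds several tracks of the same kind, only the first one found is replaced; a real method would need to say which track is meant.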

@alvestrand alvestrand transferred this issue from w3c/mediacapture-main May 16, 2019

@alvestrand

Contributor

commented May 16, 2019

This seems like the right repo for it, since the justification for the function is strictly based on the MediaRecorder definition that "adding a track stops recording".
An alternative would be to let MediaRecorder do the same thing as the <video> playback, and let only one video track be recorded, but the video track can be replaced by add/remove.

@guest271314

Author

commented Jul 8, 2019

@jan-ivar Any progress on composing the specification portions for a replaceTrack() method for MediaRecorder?

@aboba aboba added the TPAC 2019 label Aug 22, 2019

@henbos henbos assigned henbos and unassigned jan-ivar Aug 22, 2019
