Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make generateKeyFrame take a list of rids and return a promise<undefined> #165

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 13 additions & 38 deletions index.bs
Original file line number Diff line number Diff line change
Expand Up @@ -279,7 +279,7 @@ enum RTCEncodedVideoFrameType {
</pre>
<table data-link-for="RTCEncodedVideoFrameType" data-dfn-for=
"RTCEncodedVideoFrameType" class="simple">
<caption>Enumeration description</caption>
<caption>Enumeration description</caption>
<thead>
<tr>
<th>Enum value</th><th>Description</th>
Expand Down Expand Up @@ -534,7 +534,7 @@ interface RTCRtpScriptTransformer {
readonly attribute ReadableStream readable;
readonly attribute WritableStream writable;
readonly attribute any options;
Promise&lt;unsigned long long&gt; generateKeyFrame(optional DOMString rid);
Promise&lt;undefined&gt; generateKeyFrame(optional sequence&lt;DOMString&gt; rids);
Promise&lt;undefined&gt; sendKeyFrameRequest();
};

Expand Down Expand Up @@ -575,14 +575,14 @@ Each RTCRtpScriptTransform has the following set of [=association steps=], given
1. Set |transformer|.`[[encoder]]` to |encoder|.
1. Set |transformer|.`[[depacketizer]]` to |depacketizer|.

The <dfn method for="RTCRtpScriptTransformer">generateKeyFrame(|rid|)</dfn> method steps are:
The <dfn method for="RTCRtpScriptTransformer">generateKeyFrame(|rids|)</dfn> method steps are:
1. Let |promise| be a new promise.
1. Run the [=generate key frame algorithm=] with |promise|, |this|.`[[encoder]]` and |rid|.
1. [=In parallel=], run the [=generate key frame algorithm=] with |promise|, the {{RTCRtpSender}} associated with |this|.`[[encoder]]` and |rids|.
1. Return |promise|.

The <dfn method for="RTCRtpScriptTransformer">sendKeyFrameRequest()</dfn> method steps are:
1. Let |promise| be a new promise.
1. Run the [=send request key frame algorithm=] with |promise| and |this|.`[[depacketizer]]`.
1. [=In parallel=], run the [=send request key frame algorithm=] with |promise| and |this|.`[[depacketizer]]`.
1. Return |promise|.

## Attributes ## {#RTCRtpScriptTransformer-attributes}
Expand All @@ -602,37 +602,12 @@ The <dfn attribute for="RTCRtpScriptTransformer">writable</dfn> getter steps are

## KeyFrame Algorithms ## {#KeyFrame-algorithms}

The <dfn>generate key frame algorithm</dfn>, given |promise|, |encoder| and |rid|, is defined by running these steps:
1. If |encoder| is undefined, reject |promise| with {{InvalidStateError}}, abort these steps.
1. If |encoder| is not processing video frames, reject |promise| with {{InvalidStateError}}, abort these steps.
1. If |rid| is defined, validate its value. If invalid, reject |promise| with {{NotAllowedError}} and abort these steps.
1. [=In parallel=], run the following steps:
1. Gather a list of video encoders, named |videoEncoders| from |encoder|, ordered according negotiated RIDs if any.
1. If |rid| is defined, remove from |videoEncoders| any video encoder that does not match |rid|.
1. If |rid| is undefined, remove from |videoEncoders| all video encoders except the first one.
1. If |videoEncoders| is empty, reject |promise| with {{NotFoundError}} and abort these steps.
|videoEncoders| is expected to be empty if the corresponding {{RTCRtpSender}} is not active, or the corresponding {{RTCRtpSender}} track is ended.
1. Let |videoEncoder| be the first encoder in |videoEncoders|.
1. If |rid| is undefined, set |rid| to the RID value corresponding to |videoEncoder|.
1. Create a pending key frame task called |task| with |task|.`[[rid]]` set to rid and |task|.`[[promise]]`| set to |promise|.
1. If |encoder|.`[[pendingKeyFrameTasks]]` is undefined, initialize |encoder|.`[[pendingKeyFrameTasks]]` to an empty set.
1. Let |shouldTriggerKeyFrame| be <code>true</code> if |encoder|.`[[pendingKeyFrameTasks]]` contains a task whose `[[rid]]`
value is equal to |rid|, and <code>false</code> otherwise.
1. Add |task| to |encoder|.`[[pendingKeyFrameTasks]]`.
1. If |shouldTriggerKeyFrame| is <code>true</code>, instruct |videoEncoder| to generate a key frame for the next provided video frame.

For any {{RTCRtpScriptTransformer}} named |transformer|, the following steps are run just before any |frame| is enqueued in |transformer|.`[[readable]]`:
1. Let |encoder| be |transformer|.`[[encoder]]`.
1. If |encoder| or |encoder|.`[[pendingKeyFrameTasks]]` is undefined, abort these steps.
1. If |frame| is not a video {{RTCEncodedVideoFrameType/"key"}} frame, abort these steps.
1. For each |task| in |encoder|.`[[pendingKeyFrameTasks]]`, run the following steps:
1. If |frame| was generated by a video encoder identified by |task|.`[[rid]]`, run the following steps:
1. Remove |task| from |encoder|.`[[pendingKeyFrameTasks]]`.
1. Resolve |task|.`[[promise]]` with |frame|'s timestamp.

By resolving the promises just before enqueuing the corresponding key frame in a {{RTCRtpScriptTransformer}}'s readable,
the resolution callbacks of the promises are always executed just before the corresponding key frame is exposed.
If the promise is associated to several rid values, it will be resolved when the first key frame corresponding to one the rid value is enqueued.
The <dfn>generate key frame algorithm</dfn>, given |promise|, |sender| and |rids|, is defined by running these steps:
1. If the sender's transceiver kind is not `video`, reject |promise| with an {{OperationError}} and abort these steps.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NoSupportedError would be more adequate maybe?

1. If |rids| is defined, for each |rid| in rids,
1. if |rid| is not associated with |sender|, reject Promise with an {{InvalidAccessError}} and abort these steps.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

associated with is not really clear to me. Maybe we could reuse webrtc-pc terminology. something like if sender's media stack is not configured with |rid| as active,

I also think we should reject if the sender is not actively sending (its track is null or current direction does not include send). I am not exactly sure of which wording to use, maybe something like if sender's media stack is not actively sending media.

It might also more accurate to queue a task to reject the Promise in the Promise realm event task loop, since all of those checks are done close to the encoder.

1. Instruct the encoder associated with |sender| to generate a key frame for |rids| or all layers when |rids| is empty.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make it clear that the promise gets resolved in the transformer context before any key frame triggered by this step is enqueued in the transformer readable stream. This PR is removing line 633-635 which was a similar statement with stronger-but-no-longer-valid guarantees.

1. Resolve |promise| with `undefined`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to resolve the promise when the key frame is being queued, this is what the current spec is asking and is a nice addition in the case of script transform.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will introduce unnecessary complexity to what was supposed to be as a simple key frame generation mechanism. And will make scenarios involving multiple streams (active, inactive, restricted) not only hard to define in specification, but even more harder to implement in user agents.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This complexity is fairly limited in the case of script transform, please have a look at how this is already implemented in WebKit.
For the sender API, I agree the usefulness is reduced, we could consider relaxing the resolution timing if it proves difficult to implement.

not only hard to define in specification

This is already defined in the spec for the script transform code path.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think anyone considered awaiting useful at TPAC, to quote the minutes:

this can be done by waiting to a keyframe in the stream before doing the key change"

With multiple rids when do you resolve?
On the first keyframe (regardless of layer) or do you wait for all of them? Return an array of promises?
What about cases like turning off the lowest simulcast layer and asking for a keyframe? Surprise (thankfully not returning an RTP timestamp you don't expect anymore).
What if the highest layer is disabled due to bandwidth restrictions?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the first keyframe (regardless of layer) or do you wait for all of them? Return an array of promises?

On the first keyframe of any layer would make sense.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What usecase does resolving on a "random" keyframe solve? Also how do you avoid this keyframe being triggered by another operation such as changing the size?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. this should happen in a "queue a task" or "in parallel", so that it's obviously async
  2. we should guarantee that the promise resolves before the first of the keyframes arrives at the transform.


The <dfn>send request key frame algorithm</dfn>, given |promise| and |depacketizer|, is defined by running these steps:
1. If |depacketizer| is undefined, reject |promise| with {{InvalidStateError}}, abort these steps.
Expand All @@ -649,7 +624,7 @@ An additional API on {{RTCRtpSender}} is added to complement the generation of k

<pre class="idl">
partial interface RTCRtpSender {
Promise&lt;undefined&gt; generateKeyFrame(optional sequence &lt;DOMString&gt; rids);
Promise&lt;undefined&gt; generateKeyFrame(optional sequence&lt;DOMString&gt; rids);
};
</pre>

Expand All @@ -658,7 +633,7 @@ partial interface RTCRtpSender {
The <dfn method for="RTCRtpSender">generateKeyFrame(|rids|)</dfn> method steps are:

1. Let |promise| be a new promise.
1. [=In parallel=], run the [=generate key frame algorithm=] with |promise|, |this|'s encoder and |rids|.
1. [=In parallel=], run the [=generate key frame algorithm=] with |promise|, |this| and |rids|.
1. Return |promise|.

# Privacy and security considerations # {#privacy}
Expand Down