Filter duplicate init data from the manifest by key ID #580

Closed
joeyparrish opened this issue Nov 9, 2016 · 4 comments
Labels
status: archived Archived and locked; will not be updated type: enhancement New feature or request

Comments

@joeyparrish
Member

We currently filter duplicate init data to reduce unnecessary license requests. This is a heuristic that still results in some unnecessary requests and MediaKeySessions. If key ID information and init data are both available in the manifest, we should filter init data by its associated key ID as well.

Note that even this would not be a perfect filter, since PSSHs can contain key-system-specific content IDs which allow multiple keys to be retrieved at once. We have no way to know about these content IDs without key-system-specific knowledge/parsers.

@ghost

ghost commented Nov 22, 2016

Hi Joey,

I see you mention key-system specific data - and the 1 to many key delivery approach based on contentId (Widevine example).

Is it possible that you could also allow the MediaKeySession filtering to be done based on the initdata payload (only)?

Example:
MPD containing HD, SD and Audio AdaptationSets - all with different default key IDs
initdata made up of a V1 PSSH header and key system payload for each track. Again, this data will be distinct for each AdaptationSet due to the unique key ID being present in the header

Filter based on the initdata payload (strip the header; this could be a configurable option). If the key-system-specific payload is the same (by hash), then we know that the MediaKeySession can be shared between these AdaptationSets; otherwise we need a new MediaKeySession.
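The header-stripping idea above can be sketched as follows. This is only an illustration of the proposal, not something the player is known to do: `parsePssh`, `buildPsshV1`, and `toHex` are hypothetical helpers, and the system ID, key IDs, and payload bytes are made up. It assumes the standard `pssh` box layout (size, type, version and flags, system ID, optional key ID list in version 1, then the data size and data).

```javascript
// Hex-encode bytes so that two byte arrays can be compared as strings.
const toHex = (bytes) =>
    Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');

// Assumed 'pssh' box layout:
//   size(4) 'pssh'(4) version(1) flags(3) systemId(16)
//   [version > 0: kidCount(4) + kidCount * kid(16)] dataSize(4) data
function parsePssh(box) {
  const view = new DataView(box.buffer, box.byteOffset, box.byteLength);
  const type = String.fromCharCode(...box.subarray(4, 8));
  if (type !== 'pssh') throw new Error('not a pssh box');
  const version = view.getUint8(8);
  let offset = 12 + 16;  // skip size, type, version/flags, and system ID
  const keyIds = [];
  if (version > 0) {
    const kidCount = view.getUint32(offset);
    offset += 4;
    for (let i = 0; i < kidCount; i++) {
      keyIds.push(toHex(box.subarray(offset, offset + 16)));
      offset += 16;
    }
  }
  const dataSize = view.getUint32(offset);
  offset += 4;
  return {version, keyIds, payload: box.subarray(offset, offset + dataSize)};
}

// Hypothetical builder, used here only to produce sample boxes.
function buildPsshV1(systemId, keyIds, payload) {
  const dataOffset = 32 + 16 * keyIds.length;
  const box = new Uint8Array(dataOffset + 4 + payload.length);
  const view = new DataView(box.buffer);
  view.setUint32(0, box.length);
  box.set([0x70, 0x73, 0x73, 0x68], 4);  // 'pssh'
  view.setUint8(8, 1);                   // version 1, flags stay 0
  box.set(systemId, 12);
  view.setUint32(28, keyIds.length);
  keyIds.forEach((kid, i) => box.set(kid, 32 + 16 * i));
  view.setUint32(dataOffset, payload.length);
  box.set(payload, dataOffset + 4);
  return box;
}

// Two tracks with different key IDs in the header but the same payload.
const systemId = new Uint8Array(16).fill(0xab);
const sharedPayload = Uint8Array.from([0xde, 0xad, 0xbe, 0xef]);
const hdBox = buildPsshV1(systemId, [new Uint8Array(16).fill(1)], sharedPayload);
const sdBox = buildPsshV1(systemId, [new Uint8Array(16).fill(2)], sharedPayload);
const hd = parsePssh(hdBox);
const sd = parsePssh(sdBox);

const boxesMatch = toHex(hdBox) === toHex(sdBox);
const payloadsMatch = toHex(hd.payload) === toHex(sd.payload);
console.log({boxesMatch, payloadsMatch});
```

In this made-up case the whole boxes differ while the stripped payloads match. Whether a matching payload really means the session can be shared depends on what the key system puts in it.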

I realise that you might say this is a little specific, but I think it could be considered a generic solution for DRMs that implement (or allow implementation) of key management at a higher level of abstraction than standard CENC keyid->key 1 to 1 mapping.

Finally a question...
In which cases do you expect the extended filter criteria defined in the OP to be applied (that are not already caught today)? To the best of my knowledge, if the keyId is distinct then the initdata will be distinct for most DRMs.
In fact, if a V1 PSSH is used (as per the spec), then this distinctiveness is guaranteed, is it not?
Thanks,
Karl.

@joeyparrish
Member Author

We already compare init data and filter based on that, and it's not completely effective. Stripping the PSSH box down to the payload won't necessarily make this any better, since the payload can still contain key IDs or other unique parameters.

What we want, ideally, is to predict which things can share a session before we create those sessions. This is not directly achievable when the PSSH payload can't be interpreted, so we have to apply heuristics to dedup sessions in a safe way.

Having duplicate sessions is tolerable, since playback can continue. (It's not ideal, since on embedded devices those sessions consume a limited resource.) Removing a session as a duplicate when it shouldn't have been removed is not tolerable, since it leads to a playback failure (or a hung media pipeline).

Today, we dedup on init data only. What's proposed here is to also dedup on key ID. So two unique init datas that we know refer to the same key ID could be filtered down to one session.
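A minimal sketch of that two-step dedup, assuming each manifest entry has been reduced to a hypothetical `{initData, keyId}` object where `keyId` is `null` if the manifest does not provide one. The names `dedupInitData` and `bytesToHex` are illustrative and are not the player's actual API.

```javascript
// Hex-encode bytes so that init data can be stored in a Set.
const bytesToHex = (bytes) =>
    Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');

// Keep the first entry for each distinct init data and, when the key ID is
// known, for each distinct key ID. Entries without a key ID are never
// dropped by the key ID rule, since nothing is known about them.
function dedupInitData(entries) {
  const seenInitData = new Set();
  const seenKeyIds = new Set();
  const kept = [];
  for (const entry of entries) {
    const initDataHex = bytesToHex(entry.initData);
    if (seenInitData.has(initDataHex)) continue;  // today's filter
    if (entry.keyId !== null && seenKeyIds.has(entry.keyId)) continue;  // proposed filter
    seenInitData.add(initDataHex);
    if (entry.keyId !== null) seenKeyIds.add(entry.keyId);
    kept.push(entry);
  }
  return kept;
}

// Made-up entries for illustration.
const entries = [
  {initData: Uint8Array.from([1, 2]), keyId: 'k1'},
  {initData: Uint8Array.from([1, 2]), keyId: 'k1'},  // same init data
  {initData: Uint8Array.from([3, 4]), keyId: 'k1'},  // new init data, same key ID
  {initData: Uint8Array.from([5, 6]), keyId: 'k2'},
  {initData: Uint8Array.from([7, 8]), keyId: null},  // key ID unknown
];
const kept = dedupInitData(entries);
console.log(kept.map((e) => e.keyId));
```

The sketch errs on the side of keeping sessions: an entry is dropped only when its init data or its known key ID has already been seen, which matches the point that an extra session is tolerable but a wrongly removed one is not.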

This still doesn't detect if there is a content ID. Content ID filtering would be best, but we don't have access to that in JavaScript and the very concept may not exist for all DRM providers. I know it does for Widevine, as an optimization, but it may not in the general case.

We won't be parsing the provider-specific payloads in init data, partly because the formats are not necessarily public, and partly because it doesn't scale in a provider-agnostic way. If we aren't parsing those payloads, we can't use the init data to learn about content IDs.

Does that help explain?

@ghost

ghost commented Dec 5, 2016

Hi Joey,

Yes this somewhat explains the reasoning.

What I still don't understand is how we could have unique initdata that encapsulates or references the same keyID. Based on my knowledge of a number of DRM systems, I can't see why/how anyone would want (or be able to) produce such a PSSH configuration.

On my other point - by stripping the PSSH down to the key-system-specific part - what you get is the ability for the DRM solution vendor to intentionally make the (stripped) initdata unique / non-unique, since the standard means that (implicitly) there will be a unique PSSH for each keyId (regardless of DRM system).

I get the fact that you may not want to 'delve' too deep into vendor specifics in a generic player app, however.

@joeyparrish
Member Author

joeyparrish commented Dec 14, 2016

Nobody would want unique init data containing a non-unique key ID, but it happens when PSSHs are generated by automated tools which don't consider what web-based players do and don't know. Imagine a fictional JSON-based PSSH format like this:

{
  "key_id": "foo1",
  "track": "audio",
  "bitrate": 128000
}

You really shouldn't need the track type and bitrate, but maybe somebody thought early on that it would be useful to a particular CDM or platform or whatever. If you have tools that put extra information into the content itself or into the manifest, you can end up with distinct PSSHs that contain the same key ID.
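To make that concrete, here is a small sketch using the same fictional JSON format. The video values are made up, and parsing the payload is only possible here because the format is invented; a real player would see opaque bytes.

```javascript
// Two fictional payloads for the same key, differing only in extra fields.
const audioPssh = JSON.stringify(
    {key_id: 'foo1', track: 'audio', bitrate: 128000});
const videoPssh = JSON.stringify(
    {key_id: 'foo1', track: 'video', bitrate: 2000000});

// A comparison of the raw init data sees two different values...
const sameInitData = audioPssh === videoPssh;

// ...even though both refer to the same key ID.
const sameKeyId =
    JSON.parse(audioPssh).key_id === JSON.parse(videoPssh).key_id;

console.log({sameInitData, sameKeyId});
```

Filtering on init data alone would create two sessions here, while key ID information from the manifest would allow them to be collapsed into one.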

These are just examples and entirely fictional, but PSSHs are opaque and therefore we don't know what's in them in JavaScript. Encoders, packagers, PSSH formats, DRM clients, players, and web APIs are developed by different groups of people. Things don't always align and that makes things more complicated than they have to be.

It will take time to get things better aligned, and a lot of content has already been generated. We will have to have heuristics to deal with it for a while.

@theodab theodab self-assigned this Jan 20, 2017
@joeyparrish joeyparrish modified the milestones: Backlog, v2.1.0 Mar 31, 2017
@shaka-project shaka-project locked and limited conversation to collaborators Mar 22, 2018
@shaka-bot shaka-bot added the status: archived Archived and locked; will not be updated label Apr 15, 2021

3 participants