Controlling 3rd party iframe audio output on a page? #63

Open
randallb opened this issue Nov 5, 2016 · 36 comments

@randallb

randallb commented Nov 5, 2016

(forgive me, this is my first time attempting to contribute)

We've built an application to allow for video streaming, and we essentially render the video using a browser, then output the framebuffer to another encoding process and send it along to an RTMP destination. (In our case, it's usually FB Live.)

We'd like to enable our users to easily embed other media in their productions.

Today, we capture the audio by changing a machine's default audio device to a custom one we wrote, then reroute the audio to our encoder process. This generally works great, but has the major drawback of disabling audio for the rest of the computer.

As the web audio spec has started shipping in Chrome specifically, we've started experimenting with using the web audio output API to redirect the audio properly. Basically, we use the enumerateDevices API to find our driver, and if the right conditions are met, we direct our audio there explicitly using setSinkId on audio and video elements.
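
The routing described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the poster's actual code: `OutputDeviceLike`, `SinkTargetLike`, `findOutputDevice`, and `routeToDriver` are hypothetical names that mirror only the parts of `MediaDeviceInfo` and `HTMLMediaElement.setSinkId()` used here, and matching the driver by a fragment of its label is an assumption.

```typescript
// Minimal shapes mirroring the fields of MediaDeviceInfo used below.
interface OutputDeviceLike {
  kind: string; // "audiooutput" for speakers and virtual output drivers
  deviceId: string;
  label: string;
}

// Anything exposing setSinkId, e.g. an <audio> or <video> element.
interface SinkTargetLike {
  setSinkId(sinkId: string): Promise<void>;
}

// Find the first audio output whose label contains `labelFragment`.
function findOutputDevice(
  devices: readonly OutputDeviceLike[],
  labelFragment: string,
): OutputDeviceLike | undefined {
  return devices.find(
    (d) => d.kind === "audiooutput" && d.label.includes(labelFragment),
  );
}

// Route every target to the matching device.
// Resolves to false, and touches nothing, if the device is not present.
async function routeToDriver(
  targets: readonly SinkTargetLike[],
  devices: readonly OutputDeviceLike[],
  labelFragment: string,
): Promise<boolean> {
  const device = findOutputDevice(devices, labelFragment);
  if (device === undefined) return false;
  await Promise.all(targets.map((t) => t.setSinkId(device.deviceId)));
  return true;
}
```

In a page, the device list would come from `navigator.mediaDevices.enumerateDevices()` and the targets would be the page's own audio and video elements. Nothing here can reach media inside a cross-origin iframe, which is the gap this issue is about.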

The issue is that if we'd like to embed other external media, like an iframe from YouTube as a simple example, we'd need YouTube to explicitly support switching audio destinations in their postMessage API. We view this as unlikely, given that our use case is an edge case for their business. We think the top-most context for a page should be in charge of where audio ends up if inner iframes haven't changed their sound settings from 'default'. Ideally, the top-most context would be in charge of all audio routing.

I'd propose a setSinkId API on an iframe, just like we have on audio / video elements. If this has been done before, I apologize; I wasn't able to find any data on this pretty much anywhere on the web.

I think there are likely some technical challenges here, but I think for advanced audio / video (what I'm obsessed with) it'll help a lot with what the web is great at: linking and embedding resources.

@alvestrand
Contributor

Sorry, lost track of this one. Will attempt to look at the issue soon.

@alvestrand
Contributor

Having thought about this some more...
the only not-completely-bogus thing I've been able to think of is that we could add an attribute to the Iframe object (https://html.spec.whatwg.org/#the-iframe-element) called "defaultAudioOutput", which would be the place where default audio output is sent.

The advantage of an attribute over a setter is that you can read it, and that it should be easy to say "on creation, it is copied from its parent context".

We could also place it on the objects referenced by the iframe's "contentWindow" or "contentDocument" elements - these are even more generic classes, so would have even more things to sort out (and perhaps other use cases).

I don't want to monkey-patch HTML (more than we already do), but that seems to be required both for a setSinkId() function and a "defaultAudioOutput" attribute on these objects.
@foolip @domenic do you have thoughts about how to best extend HTML objects for this type of control (or why we shouldn't do it)?

@domenic

domenic commented Mar 28, 2017

By attribute, do you mean IDL attribute, or content attribute?

Note that contentWindow and contentDocument are just the window/document of the iframe, so if you placed things there it would be placing them on Window/Document classes universally, not just ones for iframes.

As for monkey-patching HTML, it's mostly reasonable to do this using partial interfaces without it being problematic. The only issue I can see is if you want to copy the value when creating the iframe (instead of, e.g., looking it up lazily up the chain). That is not very extensible in HTML right now. We could add a hook though, e.g. "run any iframe creation steps in applicable specifications", and then you could define "iframe creation steps". Although, I am not sure if you want iframe creation, or src="" attribute setting, or something else.
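
The difference between copying the value at iframe creation and looking it up lazily up the chain can be illustrated with a toy model. This is only a sketch: `ContextNode` and its members are hypothetical and do not correspond to any DOM or HTML spec API, and `"default"` stands in for the system default output.

```typescript
// Toy model of nested browsing contexts; not a real DOM API.
class ContextNode {
  // Set only when this context has chosen its own output device.
  explicitSink: string | undefined = undefined;
  // Snapshot of the container's effective sink, taken at construction.
  readonly copiedSink: string;

  constructor(readonly container: ContextNode | undefined) {
    this.copiedSink =
      container === undefined ? "default" : container.effectiveSinkCopied();
  }

  // Lazy model: walk up the container chain on every read.
  effectiveSinkLazy(): string {
    for (
      let node: ContextNode | undefined = this;
      node !== undefined;
      node = node.container
    ) {
      if (node.explicitSink !== undefined) return node.explicitSink;
    }
    return "default";
  }

  // Copy model: own setting, else whatever the container had at creation.
  effectiveSinkCopied(): string {
    return this.explicitSink ?? this.copiedSink;
  }
}
```

In this model, a change made on the container after a child exists is visible to the child under the lazy lookup but not under the copy, which is why the copy variant needs a well-defined creation hook.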

@alvestrand
Contributor

I've not yet figured out what the difference between IDL attributes and content attributes is; I think I want to do something like "navigator.mediaDevices.enumerateDevices().then(list => <iframe>.defaultAudioOutput = <device>.id)".

Not sure it makes sense to change the default after the iframe is initiated, which argues for making it part of what you hand to the iframe when creating it.

Still in brainstorming mode on this; @randallb may want to comment given the constraints of his use case.

@domenic

domenic commented Mar 28, 2017

Content attributes are <iframe something="..."> in the HTML. IDL attributes are iframeEl.something = "..." in JavaScript. You can have both by saying that the IDL attribute reflects the content attribute. (Almost all content attributes are reflected as IDL attributes, in fact. But not necessarily vice-versa.)
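
The reflection relationship described above can be sketched with a toy element. `ToyFrameElement` is a hypothetical class, not the real `HTMLIFrameElement`, and `something` is the placeholder name used in the comment above; the sketch assumes a plain string attribute.

```typescript
// Toy element: `attrs` stands in for the markup, and the accessor pair
// stands in for an IDL attribute that reflects the content attribute.
class ToyFrameElement {
  private readonly attrs = new Map<string, string>();

  // Content attribute access, as in <iframe something="...">.
  getAttribute(attrName: string): string | null {
    return this.attrs.get(attrName) ?? null;
  }

  setAttribute(attrName: string, value: string): void {
    this.attrs.set(attrName, value);
  }

  // IDL attribute, as in iframeEl.something = "...".
  // Reading returns the content attribute's value, or "" when it is absent.
  get something(): string {
    return this.getAttribute("something") ?? "";
  }

  // Writing updates the content attribute, so both views stay in sync.
  set something(value: string) {
    this.setAttribute("something", value);
  }
}
```

Because the accessor stores nothing of its own, a value set from markup and a value set from script are indistinguishable afterwards.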

If the idea is to not change after the iframe is created, then I'm not sure which is better. Again it kind of depends on what you mean by "created". The sandbox="" attribute on iframes has one model, where it re-reads the value on navigation of the iframe (e.g. setting .src = "...", or clicking a link inside the page). I think that is also the model used by feature policy and the allowX attributes.

@alvestrand
Contributor

The thing that made me worry about changing the attribute after creation is that when an iframe is instantiated, and starts producing sound, changing the attribute (if allowed) would switch the sound while it was playing, with the change being invisible to code running inside the iframe. This may be tricky to implement, so I'd like to avoid doing it. (if @guidou claims it's easy, and no other browser person claims the opposite, we can just allow it.)
Re-reading the attribute on navigation sounds like a reasonable model if we agree that we shouldn't allow the container page to change the default output device while the page is playing sound.
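
A toy model of the "re-read on navigation" behavior may make the trade-off concrete. This is a sketch only: `ToyNavigatingFrame` is hypothetical, and `defaultAudioOutput` is the attribute name floated earlier in this thread, not a shipped API.

```typescript
// Toy iframe: the attribute can be written at any time, but the value in
// force only changes when the frame navigates (the sandbox="" model).
class ToyNavigatingFrame {
  // Hypothetical attribute from the discussion; "default" means system default.
  defaultAudioOutput = "default";
  private activeSink = "default";
  private currentUrl = "about:blank";

  navigate(url: string): void {
    this.currentUrl = url;
    // Snapshot the attribute at navigation time.
    this.activeSink = this.defaultAudioOutput;
  }

  // The device the currently loaded document plays through.
  get sinkInUse(): string {
    return this.activeSink;
  }

  get url(): string {
    return this.currentUrl;
  }
}
```

Under this model, audio that is already playing never switches devices underneath the framed document; the embedder's change is picked up only by the next document loaded into the frame.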

@alvestrand removed their assignment Apr 6, 2017
@alvestrand
Contributor

@guidou I think this is clarified enough that you can propose a spec update.

@randallb
Author

I think changing it on create is fine, fwiw. In a worst case where we'd need to change the audio output of an iframe, we'd just destroy and recreate, updating the internal state of the YouTube player or what have you.

@stefhak

stefhak commented May 10, 2017

@guidou you're assigned to this Issue, will you have time to look into it (soon)?

@guidou
Contributor

guidou commented May 10, 2017

@stefhak I don't think I will have time to look into this issue soon.

@stefhak

stefhak commented May 31, 2017

Getting desperate, is this something you could take a look at @YellowDoge?

@yell0wd0g
Member

@stefhak sadly (or happily, I'd say) I'm swamped with shipping Image Capture and Shape Detection.

@hoch, would you have some time to look at this by any chance?

@randallb
Author

Is there anything I can do to help progress this spec?

@dontcallmedom
Member

@randallb fwiw, given that we can't seem to find enough resources to spec that additional behavior just now, we will likely proceed with the next steps in the standardization process with the spec as is. That said, this is not to say that this feature won't be considered for inclusion later - just a recognition that the spec as it exists today matches what implementations are shipping or considering shipping.

@TheBrenny

Bump - Would love to see this implemented! 😁

@nefelin

nefelin commented Oct 6, 2019

Bumping; I would also find this very useful!

@foolip
Member

foolip commented Oct 6, 2019

@clelland has this been proposed as a feature policy?

@clelland

clelland commented Oct 7, 2019

There was a 'speaker' policy proposed, to complement 'microphone' and 'camera' from the media input side. The actual behaviour was never really decided on, and the presence of the policy was confusing (people thought that it should control all audio output, which it definitely didn't do) so it was removed recently from Chrome. It's still present in Feature Policy's feature list.

@foolip
Member

foolip commented Oct 7, 2019

I see. I think something that applies to all audio output is actually the policy that would be useful. But that hasn't been pursued, then?

@bigicoin

Is this still being worked on? Would be cool to use for a hobby project I have.

@brianfields

This would be particularly useful for collaborative A/V applications where you'd like the user to be able to select their preferred audio output device. My use case is a classroom instruction application (using WebRTC for streaming video) where the teacher would like to play a YouTube video that is seen and heard by everyone in class. We have application level control over the WebRTC audio output but not the iframed YouTube, so--to avoid confusing people--we don't allow the user to select their preferred output device (instead, it uses the system default). This has caused plenty of frustration.

@MatanYemini

I'm running into the same problem. Any ideas? This could be a cool project :)

@toschlog

It would be great to be able to do setSinkId on an iframe. I can control where on my page iframe visuals appear; I should be able to control where the audio goes.

@Johnny-John-John

Any updates on this? I would love to see this, since I am working on something that needs audio from an iframe element to do FFT magic with it. Or does this feature cause privacy issues? What if some company, say YouTube, wants their iframes' audio to not be analyzed?

@freddy-daniel

freddy-daniel commented Mar 3, 2022

It would be nice if there was a final solution to this problem, or someone will ask the same in 2023 😂

@bigicoin

bigicoin commented Mar 3, 2022

I think everyone who had hoped to do this in a project probably gave up on their projects already because of this. 😛

@randallb
Author

I'm closing as not planned for now.

@randallb closed this as not planned May 22, 2022
@dontcallmedom reopened this May 23, 2022
@dontcallmedom
Member

re-opening to make sure this gets formally addressed - sorry this is taking so long though

@MatanYemini

@dontcallmedom Hi - I don't mind contributing. Do you have suggestions/ideas (if you want me to jump on it)?

@ErikDombi

2023 now, would still love to see this one!

@MatanYemini

@dontcallmedom Dominique - what do you think? maybe we can do this one?

@dontcallmedom
Member

@hoch @padenot @mdjp with the Web Audio Working Group having adopted setSinkId for the Web Audio API, I wonder if that group would have more momentum for pushing this issue forward? For better or for worse, the WebRTC WG has never managed to give it enough priority, despite continuous demand for the feature.
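
For reference, routing a Web Audio graph with the `setSinkId` mentioned above might look like the sketch below. It assumes a browser whose `AudioContext` exposes `setSinkId`, and feature-detects it because not every browser does; `ContextLike` and `routeContext` are hypothetical names introduced for illustration.

```typescript
// Minimal shape of an AudioContext for this sketch; setSinkId is optional
// because it is not available in every browser.
interface ContextLike {
  setSinkId?: (sinkId: string) => Promise<void>;
}

// Route the context to `sinkId` if supported.
// Resolves to false, without throwing, when setSinkId is unavailable.
async function routeContext(
  ctx: ContextLike,
  sinkId: string,
): Promise<boolean> {
  if (typeof ctx.setSinkId !== "function") return false;
  await ctx.setSinkId(sinkId);
  return true;
}
```

This only affects audio produced by the page's own context. It does not change where a cross-origin iframe's audio goes, so it leaves the request in this issue open.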

@hoch

hoch commented Apr 28, 2023

@dontcallmedom

Where would be the right spec/venue for the discussion though? One thing for sure is that Audio WG can't expand Web Audio API to make this happen.

I see the demand here in this thread, but I'm still a bit unsure about its priority compared to the projects that the Audio WG is currently working on. As you are already aware, the Audio WG is a relatively small group compared to others.

@kenzkiran

kenzkiran commented Sep 29, 2023

This is a good feature to have. We have an app that allows embedding YouTube, Vimeo, and other 3rd party providers. Our users have the flexibility in our app to route the audio from the top-level frame to any speaker or Bluetooth device connected to the device. But we are unable to provide a consistent experience when the app has to embed video from 3rd party providers. @guidou @juberti how do we get traction on this? There are many ideas floating around, including specifying an attribute on the child <iframe> to follow the top-level frame's audio routing, or explicitly allowing the embedder to call "setSinkId" on the <iframe> element.
