Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussing potential enhancements to work on together #3

Closed
lminiero opened this issue Nov 2, 2016 · 10 comments
Closed

Discussing potential enhancements to work on together #3

lminiero opened this issue Nov 2, 2016 · 10 comments

Comments

@lminiero
Copy link

lminiero commented Nov 2, 2016

Hi @traud,

first of all, nice work! I'm the author of the original asterisk-opus patch, and I'm really impressed by the improvements you guys have made to the code, not only to support more recent versions of Asterisk, but also on advanced features like integration with SDP negotiation and, more importantly, FEC. I haven't been able to test the FEC support yet, but I plan to do that soon, so I hope I'll have feedback on any fix that might be needed.

I'm opening this issue for something partially related to that, actually, that is some further improvements that could be added to the management of an Opus transcoding setting in Asterisk. Since this patch is currently in a more advanced state than our original patch, it makes sense for me to contribute and integrate changes here and, if needed, backtrack it to 11 in there too.

One fix that you may integrate right away, and that we've been testing with success for quite some time already, is the use of the OPUS_SET_COMPLEXITY control. This know allows you to tune the complexity of the encoder: higher level means high quality but, at the same time, a lot of CPU; lower levels get you worse quality, but much less CPU consumption. Since it's a 0-10 value, playing with that value dynamically allows you to try and find the right trade-off between audio quality and CPU usage, which is important since by default, IIRC, the library sets the highest value, and so a very high CPU usage especially when you handle several calls at the same time. In an experimental branch we've been using 4 as a value, which seemed as the best trade-off in terms of quality/CPU, at least for our conferencing usages. Using a #define or a configurable value might provide users with a way to decide for themselves which one is the best for them. The right way to set the value is like this:

opus_encoder_ctl(opvt->opus, OPUS_SET_COMPLEXITY(complexity));

Another more important aspect I'd like to address, though, is a more dynamic reaction to what is happening in the network, something that can be seen as a complement to FEC itself. More specifically, the way I conceived the patch at the time (and the way it still works, I believe) was to check what we were going to transcode to/from in terms of sampling rate, and set a cap in the encoder settings with OPUS_SET_MAX_BANDWIDTH accordingly. So, if we're talking to the PSTN, we set the maximum value to OPUS_BANDWIDTH_NARROWBAND, if we're talking to, e.g, speex16 we set it to OPUS_BANDWIDTH_WIDEBAND, and so on. Anyway, in our experience it looks like that, more than a maximum sampling rate, the encoder seems to stick to that for the whole call (unless, say, silence is sent), which can be troublesome if you're using OPUS_BANDWIDTH_FULLBAND because you're talking to a 48kHz endpoint but there are network problems and a lot of losses. This can be probably ascribed to the lack of feedback from the Asterisk RTCP engine, which leaves the module/encoder clueless on what's really happening on the wire, and so with no way to try and adapt to the current conditions.

By looking at what the library provides, I believe there are a couple of things that could be done were there any way to get RTCP stats, e.g.:

  1. quite simply dynamically change the OPUS_SET_MAX_BANDWIDTH dynamically, e.g., lowering it whenever there's excessive loss and raising it again when things get back to normal;
  2. configure another control dynamically, OPUS_SET_PACKET_LOSS_PERC, which informs the encoder on the expected loss on the bitstream we're generating, which in turn, according to the docs, should make audio a bit worse when there's no loss, but in general better when there is loss (probably favouring redundant data in FEC usage over regular data).

Both approaches should help get a better, and more dynamic, behaviour of Opus in live calls, and I don't think one should exclude the other. Anyway, this all starts from the assumption that the module needs to have some way of receiving feedback from the RTCP engine on the call (e.g., RTCP RR/SR), and possibly also from the other leg that Asterisk is bridging in this case, if any (the transmission might be fine between us and Asterisk, and packets be lost between Asterisk and our peer), although that may not be really needed (Asterisk is still going to be transcoding for both anyway). Some indication on loss could also be partially evinced by the module itself, e.g., by looking at sequence numbers pretty much the way you already do to trigger a FEC-based recovery, but that would be an incomplete perspective, as it would only tell us what's happening on incoming packets we're decoding, but nothing on what happens to what we send.

What is your take on this? If you believe this is an interesting beast to tackle, would you be willing to work together on finding a way to make this all work?

Cheers!

@traud
Copy link
Owner

traud commented Nov 2, 2016

When it comes to Adaptive FEC, I wrote a bit in seanbright#25 – how that would work with chan_sip. However yet, this is out of the scope of this transcoding module because that part must be added into Asterisk first.

I have not understand that bandwidth proposal you made, yet. I re-read that tomorrow and ask some questions then.

When it comes to the complexity, do you know how to create a Pull Request? Otherwise I add that for you in the next few days. Of course, it would be nicer to have full codecs.conf interaction. Perhaps you could ask Digium to share at least that part of their closed source module.

@lminiero
Copy link
Author

lminiero commented Nov 3, 2016

When it comes to Adaptive FEC, I wrote a bit in seanbright#25 – how that would work with chan_sip. However yet, this is out of the scope of this transcoding module because that part must be added into Asterisk first.

Thanks for the pointer. Although I think there's a typo there, as you talk of doing a OPUS_SET_INBAND_FEC(10) which does not set the expected loss percentage. OPUS_SET_INBAND_FEC expects 0 or 1 (disable/enable), you set the loss percentage via the control I mentioned in my comment above.

As to the scope, I agree that it is orthogonal to the transcoding module, although it is also related if you think about how such a feedback would need to trickle down to the module to update the encoding details. That is why I was mentioning it here as a possible enhancements.

I have not understand that bandwidth proposal you made, yet. I re-read that tomorrow and ask some questions then.

Ok!

When it comes to the complexity, do you know how to create a Pull Request? Otherwise I add that for you in the next few days. Of course, it would be nicer to have full codecs.conf interaction. Perhaps you could ask Digium to share at least that part of their closed source module.

I'm familiar with the concept of PR, yes ;-) I do have my own projects as well, after all... I'll try and prepare one shortly. I don't know about codecs.conf, I may have been away from the latest Asterisk code for too much: I'll look into that as well.

@traud
Copy link
Owner

traud commented Nov 3, 2016

there's a typo there

Yes, that was a copy and paste error. Thanks for the notice. Fixed.

it is also related if you think about how such a feedback would need to trickle down to the module to update the encoding details. That is why I was mentioning it here as a possible enhancements.

Yes. However, first step first. This has to be added in Asterisk first. Then, it can be leveraged in codec modules. Or stated differently: Nobody should wait for me on this. I love to include it. It is on my ToDo list. But, please, do not wait on me for that.

I don't know about codecs.conf.

configs/samples/codecs.conf.sample
an example code is in codec_speech.c. However, if you got a copy from that code part in codec_opus of Digium, that would give 1:1 behavior.

change the OPUS_SET_MAX_BANDWIDTH dynamically, e.g., lowering it whenever there's excessive loss

In call, you want to change not just the FEC percentage but the bandwidth as well. Correct?

Interesting idea. Actually, a concept which is used in 3GPP AMR(-WB) and 3GPP EVS, when the mobile phone is dropping from LTE to a GSM Full-Rate channel or even just a GSM Half-Rate channel. Those audio-codecs offer different modes, which differ in bandwidth and bitrate. When such a change in transport network happens while in a call, the remote party sends in a lower mode and requests Asterisk to change its sending mode, as well (RTP in-band). However within 3GPP, the constraint device requests this change, not the perfectly connected Asterisk. Is there documentation, guideline, or another implementation doing this with the Opus-Codec?

This is a question for the Opus-Codec mailing list, after Adaptive FEC was added. The problem with this: Asterisk does not have any loss, normally. Asterisk should have a perfect Internet connection. It is the remote party with the congestion. The remote party should trigger a change, especially because downstream to the remote party is often as fast or even faster than its upstream. I have not looked at the internals of Opus-Codec in that regard. Therefore, a question for the Opus-Codec mailing list, how this should be done (via RTCP, RTP in-band or via SIP re-INVITE with a different fmtp).

I would like to prioritize these additions:

  1. Static FEC via a DEFINE or codecs.conf,
  2. maximum bandwidth via a DEFINE or codecs.conf,
  3. Adaptive FEC via RTCP,
  4. Adaptive bandwidth.

@lminiero
Copy link
Author

lminiero commented Nov 3, 2016

Yes. However, first step first. This has to be added in Asterisk first. Then, it can be leveraged in codec modules. Or stated differently: Nobody should wait for me on this. I love to include it. It is on my ToDo list. But, please, do not wait on me for that.

Of course, I didn't mean you'd have to do that... my point was to try and foster discussion and, if there was interest, collaboration on this. I've been away from the Asterisk internals for a while and so even nudges in the right direction can help me better understand where to put my hands. I chatted with Matt in a conference recently, and he mentioned some events that some modules can trigger and be intercepted by others, which would be quite helpful here (the RTCP engine could notify about losses) but I haven't looked into that yet.

an example code is in codec_speech.c

Cool, that helps. There's even a complexity setting already available, so that should be easy enough to integrate.

However within 3GPP, the constraint device requests this change, not the perfectly connected Asterisk. Is there documentation, guideline, or another implementation doing this with the Opus-Codec?

I don't think we'd need any explicit mechanism to trigger such a change. Asterisk would indeed request the change from a functional perspective, but on feedback coming from the connected endpoint via RTCP RR (e.g., "this user lost 20% of the packets, let's try sending less data"), so it's not really a decision it makes by itself. No signalling update would be needed, either via SIP/SDP or even custom inband RTP, as everything would be detected automatically inband on the Opus bitstream. In fact, lowering the sampling rate and/or bitrate you're encoding at doesn't require any SDP update or change in the decoder on the other end, as an Opus stream is always formally a 48kHz stream, that the decoder can render at whatever raw sampling rate the application is interested in no matter the source.

Reading around, tweaking the controls according to the feedback you get seems to be what developers suggest on the group. Anyway, they also say that OPUS_SET_MAX_BANDWIDTH can only be done in the setup phase, while OPUS_SET_BITRATE can be called at any time, so that's probably the way to go (together with the expected loss control to make it more robust).

The remote party should trigger a change, especially because downstream to the remote party is often as fast or even faster than its upstream. I have not looked at the internals of Opus-Codec in that regard. Therefore, a question for the Opus-Codec mailing list, how this should be done (via RTCP, RTP in-band or via SIP re-INVITE with a different fmtp).

Yeah, as I said, this is done via RTCP. Asterisk has no loss, but the user might, as you said, which would result in losses being notified in RTCP RR packets. No further communication needed. In case it's the user that is failing to send all packets to Asterisk (maybe because of a poor uplink), then Asterisk would notify the losses to the user the same way, and it would be up to the application to follow the same behaviour. Not sure if browsers do that in a WebRTC environment, but I guess it's likely they do.

I would like to prioritize these additions:

  1. Static FEC via a DEFINE or codecs.conf,
  2. maximum bandwidth via a DEFINE or codecs.conf,
  3. Adaptive FEC via RTCP,
  4. Adaptive bandwidth.

1 and 2 are already there but in include/asterisk/opus.h, right? Which means it would just be a matter of integrating the same stuff in a dedicated context of codecs.conf, e.g., [opus-open-source] (as I guess [opus] would be used by the closed source one). 3 and 4 are probably the same thing here, as both would require the same kind of feedback (RTCP info for the call), and just different actions. I'll investigate if that mechanism for inter-module events is actually available/usable or not.

I'll work on a PR to add the complexity setting to the module, and codecs.conf parsing.

@traud
Copy link
Owner

traud commented Nov 3, 2016

events that some modules can trigger and be intercepted by others

No idea about that, I would need more information. I mentioned an approach in that other issue. That is the only approach I am aware of, right now. What about that?

I guess, it's likely they do.

Until now, I never read about something like that. Therefore, I encourage to ask at the mailing list. From my experience, RTCP is not helpful to decide whether to lower the bitrate. It is helpful for the FEC percentage – according to several documents. However, I am not aware of any real-world and/or formal scientific paper about that area at all. On the first glance, I do not like the idea to have a call with asymmetric call quality (on one end bandwidth is reduced, on the other end FEC is increased). Therefore, I would like to start with FEC/RTCP interaction. When that solves the issue already, there is no need to go for additional steps like bitrate reduction. There are a lot of statements on the Internet. Surprisingly, I do not find/read much from authors of Opus. At some bitrate thresholds and modes, FEC even gets disabled and the bitrate must be raised to activate it again. This area looks like a business secret – or a statement like: ‘Read the source of Opus, our knowledge is written there!’ Perhaps it is even easier to start with SiLK and then transfer the gained knowledge to Opus.

Therefore, the very first step is to give codec modules access to the RTCP of that channel. The next step is to understand how that information is leveraged. Or stated differently: If you researched that topic more in-depth (find recommendations, scientific papers, do some internal tests), I would be very grateful. Then, I can re-adjust my priority list and implement that puzzle pieces because my AMR(-WB) module would benefit from that as well.

always formally a 48kHz stream

I was not about that. I was about informing/trigger the other party (here: Asterisk) to reduce the bandwidth/bitrate to create the same call quality in both directions. I did not find much in Opus-Codec mailing lists about that topic. Therefore, I recommend to ask there again. Additionally, you might have to look into existing implementations like Chrome and Firefox whether this is done. By the way, what are your remote parties, WebRTC or SIP phones?

Actually, I am more about AMR(-WB) and EVS. Because those audio codecs require the negotiated SDP parameters, I added the same functionality to the Opus-Codec module. Just to make sure my changes to the Asterisk Core work not only for AMR but for other audio codecs as well. Or stated differently: If you want me to help coding—motivate me—you have to find something which benefits my AMR(-WB) module, too.

@lminiero
Copy link
Author

lminiero commented Nov 3, 2016

No idea about that, I would need more information. I mentioned an approach in that other issue. That is the only approach I am aware of, right now. What about that?

I think he may have been referring to the ast_rtp_publish_rtcp_message that is done for some incoming RTCP messages, that results in a stasis_publish on the Stasis message bus, which means that doing a stasis_subscribe in the module would have the module receive the info. Anyway, the information is stringified to a JSON, using that approach, which I guess means that there's no way for the module to then figure out which encoding/decoding context an event it just received is actually referring to.

Adding a ast_rtp_instance reference as you propose in that issue comment feels a bit overkill. I'd rather follow a lighter approach, e.g., one that uses/extends an ast_frame to convey RTCP statistics that can be handled by a module, as they would follow the same path RTP packets take. I'll have a look to see if that's viable and if that makes any sense.

This area looks like a business secret – or a statement like: ‘Read the source of Opus, our knowledge is written there!’ Perhaps it is even easier to start with SiLK and then transfer the gained knowledge to Opus.

Opus is kind of a mix between silk and celt, and as such it can behave weirdly depending on how you configure and use it. I wouldn't say there's any real secret, though. You have knobs to control most if not all of the functionality it provides, as a library is supposed to, so it's up to the application to use them sometimes. One of them are the dynamic bitrate adaptation, as the library can't of course be aware of what happens on the wire as it's not responsible for that, RTCP is.

From what I've found looking at some code around, both Google's WebRTC stack and Freeswitch do make use of RTCP feedback to adapt the bitrate. Freeswitch seems to be using RTCP feedback with some sort of a Kalman filter to decide when to apply changes, so may be more sophisticated, but the base mechanism is the same.

I was about informing/trigger the other party (here: Asterisk) to reduce the bandwidth/bitrate to create the same call quality in both directions.

Not sure what you mean by "same call quality"? In an ideal scenario, both endpoints would react to the feedback the other provides, and adapt the bitrate accordingly. It's not at all unusual to have one direction use more data than the other one, as downlink/uplink characteristics may vary wildly. That happens all the time with video, for instance. The benefit of Opus is exactly the fact that it can adapt dynamically (if prompted) to such variations, and the fact you can do so asymmetrically is actually a plus, especially if it results in a good conversation when compared to broken audio.

By the way, what are your remote parties, WebRTC or SIP phones?

I work almost entirely with WebRTC endpoints, as I'm the author of a WebRTC server myself (that I incidentally sometimes use as a frontend to Asterisk), so that's what I work most with.

Or stated differently: If you want me to help coding—motivate me—you have to find something which benefits my AMR(-WB) module, too.

Sorry but I tend to stay very far away from proprietary and royalty encumbered codecs, so my knowledge or expertise on AMR is minimal at best.

Anyway, I understand the different priorities and the lack of interest on this, so no problem at all. I'll keep on looking for some ways to implement the feature myself, and in case I get somewhere I'll make sure to prepare a PR here so that interested people can test and take advantage of this.

@traud
Copy link
Owner

traud commented Nov 3, 2016

the library can't of course be aware of what happens on the wire as it's not responsible for that, RTCP is

That is a question actually. Does Opus give any feedback on the data it receives? When PLC and FEC are used correctly, the decoder knows the amount of damage/loss in the upstream. From the point of Asterisk, the upstream is the bootle-neck, because Asterisk is well connected and the Internet connection of clients are asymmetrically (higher download than upload). Or stated differently: If you could find (or create) a documentation how bitrate adoption (and FEC percentage) should be done with Opus in case of VoIP, that would be very welcome. Not code, but a high-level description what to do when and why. Even a summary of the existing solutions/implementations would help.

What hinders me most to start implementing all this is lack of personal knowledge. I have no idea how the Opus library should be used and how this can be tested then. And I am looking for something like this for years, because it was the same with SiLK.

@traud
Copy link
Owner

traud commented Nov 9, 2016

Adding a ast_rtp_instance reference as you propose in that issue comment feels a bit overkill.

Can you elaborate a bit more on that? By the way, codecs (which are not ‘smoothable’ – a term of Asterisk; affects all codecs with variable bit rates) need a way to get the negotiated Packetization Time (ptime) as well. Otherwise, always 20ms are used when frames originate Asterisk. Please, consider this additional requirement, when search for a solution to link RTCP.

@lminiero
Copy link
Author

lminiero commented Nov 9, 2016

Currently traveling so just a quick reply. I've started a discussion on
asterisk-dev as well:

http://lists.digium.com/pipermail/asterisk-dev/2016-November/075923.html

I managed to get the SSRC+Stasis stuff to work, but ASAP I'll look into the
new ast_frame type Joshua suggested.

Il 09/nov/2016 14:23, "Alexander Traud" notifications@github.com ha
scritto:

Adding a ast_rtp_instance reference as you propose in that issue comment
feels a bit overkill.

Can you elaborate a bit more on that? By the way, codecs (which are not
‘smoothable’ – a term of Asterisk; affects all codecs with variable bit
rates) need a way to get the negotiated Packetization Time (ptime) as well.
Otherwise, always 20ms are used when frames originate Asterisk. Please,
consider this additional requirement, when search for a solution to link
RTCP.


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#3 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ADg5vITn68sd0fMeRKxG-CZvbkQ3OTiMks5q8clAgaJpZM4KnkE7
.

@traud
Copy link
Owner

traud commented Mar 10, 2017

The feature request about the complexity is continued in Pull Request #4…

When it comes to the feature request about Adaptive FEC, Lorenzo added the required API and an example implementation for Speex, see ASTERISK-26584… When you or someone else has the same ready for the Opus Codec, please, do not hesitate to create a Pull Request.

Until then, I am closing this. If I missed a feature request, please re-open this issue.

[…] ask Digium to share at least [the codecs.conf part] of their closed source module.

see http://lists.digium.com/pipermail/asterisk-dev/2016-December/076049.html

@traud traud closed this as completed Mar 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants