Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add insert of concealed packets #3308

Closed
wants to merge 2 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
46 changes: 38 additions & 8 deletions src/plugins/janus_audiobridge.c
Original file line number Diff line number Diff line change
Expand Up @@ -1699,6 +1699,7 @@ typedef struct janus_audiobridge_participant {
uint16_t expected_seq; /* Expected sequence number */
uint16_t probation; /* Used to determine new ssrc validity */
uint32_t last_timestamp; /* Last in seq timestamp */
uint16_t last_seq; /* Last sequence number */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code style: indentation of comment looks broken here.

gboolean reset; /* Whether or not the Opus context must be reset, without re-joining the room */
GThread *thread; /* Encoding thread for this participant */
gboolean mjr_active; /* Whether this participant has to be recorded to an mjr file or not */
Expand Down Expand Up @@ -2135,7 +2136,7 @@ static int janus_audiobridge_resample(int16_t *input, int input_num, int input_r
#define JITTER_BUFFER_MIN_PACKETS 2
#define JITTER_BUFFER_MAX_PACKETS 40
#define JITTER_BUFFER_CHECK_USECS 1*G_USEC_PER_SEC
#define QUEUE_IN_MAX_PACKETS 4
#define QUEUE_IN_MAX_PACKETS 20
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't like this: it will cause a permanent delay increase for audio of this participant after a burst of lost packets caused this to grow because of PLC, ven way after the problem was solved. I'd argue that a burst of lost packets should indeed cause artifacts (like a speed up), and that if not 4, then a way smaller number should be more than enough to absorb shorter bursts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've expiremented with it and low value caused pops/cracks for small bursts.



/* Error codes */
Expand Down Expand Up @@ -6504,6 +6505,7 @@ static void *janus_audiobridge_handler(void *data) {
participant->expected_seq = 0;
participant->probation = 0;
participant->last_timestamp = 0;
participant->last_seq = 0;
janus_mutex_init(&participant->qmutex);
participant->arc = NULL;
janus_audiobridge_plainrtp_media_cleanup(&participant->plainrtp_media);
Expand Down Expand Up @@ -8613,6 +8615,7 @@ static void *janus_audiobridge_participant_thread(void *data) {
gint64 now = janus_get_monotonic_time(), before = now;
gboolean first = TRUE, use_fec = FALSE;
int ret = 0;
int lost_packets = 0;

/* Start working: check both the incoming queue (to decode and queue) and the outgoing one (to encode and send) */
while(!g_atomic_int_get(&stopping) && g_atomic_int_get(&session->destroyed) == 0) {
Expand Down Expand Up @@ -8655,29 +8658,55 @@ static void *janus_audiobridge_participant_thread(void *data) {
janus_audiobridge_buffer_packet_destroy(bpkt);
break;
}

rtp = (janus_rtp_header *)bpkt->buffer;
/* If this is Opus, check if there's a packet gap we should fix with FEC */
use_fec = FALSE;
if(!first && participant->codec == JANUS_AUDIOCODEC_OPUS && participant->fec) {
if(ntohs(rtp->seq_number) == (participant->expected_seq + 1)) {
/* Lost a packet here? Use FEC to recover */
if(ntohs(rtp->seq_number) != (participant->expected_seq)) {
lost_packets = ntohs(rtp->seq_number) - participant->expected_seq;
JANUS_LOG(LOG_ERR, "[%d] got rtp seq, expected_seq: [%d], diff: [%d]\n", ntohs(rtp->seq_number), participant->expected_seq, lost_packets);

int32_t output_samples;
opus_decoder_ctl(participant->decoder, OPUS_GET_LAST_PACKET_DURATION(&output_samples));

for(int i=1; i < lost_packets; i++) {
Copy link
Contributor Author

@spscream spscream Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think to get rid of cycle and produce plc data which frame_size will be lost_packets * OPUS_SAMPLES and shouldn't exceed BUFFER_SAMPLES size.
In our production I've seen losses of 20+ packets in a row, so plc for 20 packet won't fit BUFFER_SAMPLES and I'm not sure how to behave in this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually the whole thing that perplexes me. The way it's done right now, you're basically breaking FEC, as it would never be done. You're preserving the use_fec = true bit, but it will be useless, since later on we'll try to get redundant information from a packet that doesn't have any (the PLC data you've added). I think PLC and FEC should be handled separately: if the gap is 1, then do what was done before; if it's higher than 1, do PLC (new code, without use_fec=true). Or do you have a reason for implementing it this way?

As a side not, the LOG_ERR there is probably overly verbose, and would clog the logs any time there's bursts of lost packets, which may happen often. A LOG_VERB or even LOG_HUGE would probably be a better choice (but that's something that can be discussed, it may still be useful to have info on the logs when this happens).

Copy link
Contributor Author

@spscream spscream Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code generates plc for gap sequence, which can't be recovered using fec. Last gap member is generated using fec of arrived rtp packet, if it's possible(opus_decode will return plc in case fec data isn't available), for example:

  • gap of 1 will be covered using fec, since condition i < lost_packets for cycle wont match
  • gap of 2 will generate 1 plc and 1 will be recovered from fec information using rtp payload

I'm affraid of the following problem now - frame size can be differ from packet to packet and decoder should adapt to changes of frame size accordingly since encoder may change frame_size. Looks like chrome encoder changes frame size in case of losses - opus_packet_get_nb_frames returns 10ms frame size periodically on bursts of losses for rtp packet after gap. Chromium realization of fec decode uses opus_packet_get_nb_frames to determine fec frame_size https://chromium.googlesource.com/external/webrtc/+/refs/heads/main/modules/audio_coding/codecs/opus/opus_interface.cc#681).

I left LOG_ERR for debug purpose, i'm going to change it to LOG_VERB or LOG_HUGE for production.

pkt = g_malloc(sizeof(janus_audiobridge_rtp_relay_packet));
pkt->data = g_malloc0(OPUS_SAMPLES * (participant->stereo ? 2 : 1) * sizeof(opus_int16));
pkt->ssrc = 0;
pkt->timestamp = participant->last_timestamp + 960 * i;
pkt->seq_number = participant->last_seq + 1 * i;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure the RTP info here is correct, since last_timestamp and last_seq may come from out of order packets. But it's probably not that relevant, since it has a limited use anyway in the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought jitter buffer is reordering out of order packets and duplicates? At least I've expiremented with out of order and duplicates on output and everything still worked fine

/* This is a redundant packet, so we can't parse any extension info */
pkt->silence = FALSE;
pkt->length = opus_decode(participant->decoder, NULL, 0, (opus_int16 *)pkt->data, output_samples, 0);
JANUS_LOG(LOG_ERR, "[%d] packet concealed, length: [%d], timestamp: [%d] stereo: [%d]\n", pkt->seq_number, pkt->length, pkt->timestamp, participant->stereo ? 2 : 1);
janus_mutex_lock(&participant->qmutex);
participant->inbuf = g_list_append(participant->inbuf, pkt);
janus_mutex_unlock(&participant->qmutex);
}

use_fec = TRUE;
}
}
first = FALSE;
if(use_fec) {
/* There was a gap, try to get decode from redundant info first */
pkt = g_malloc(sizeof(janus_audiobridge_rtp_relay_packet));
pkt->data = g_malloc0(BUFFER_SAMPLES*sizeof(opus_int16));
pkt->data = g_malloc0(OPUS_SAMPLES * (participant->stereo ? 2 : 1) * sizeof(opus_int16));
pkt->ssrc = 0;
pkt->timestamp = participant->last_timestamp + 960; /* FIXME */
pkt->seq_number = participant->expected_seq; /* FIXME */
pkt->timestamp = ntohl(rtp->timestamp) - 960; /* FIXME */
pkt->seq_number = ntohs(rtp->seq_number) - 1; /* FIXME */
/* This is a redundant packet, so we can't parse any extension info */
pkt->silence = FALSE;
/* Decode the lost packet using fec=1 */

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why you completely changed the way we handle FEC? The existing FEC management should remain the same, if not broken. See comments above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FEC management which were implemented earlier didn't work in case of gap > 1, I didn't change too much, just adapted it to work using latest arrived rtp packet info instead of expected value based on last decoded packet.

int32_t output_samples;
opus_decoder_ctl(participant->decoder, OPUS_GET_LAST_PACKET_DURATION(&output_samples));

pkt->length = opus_decode(participant->decoder, payload, plen, (opus_int16 *)pkt->data, output_samples, 1);

JANUS_LOG(LOG_ERR, "[%d] packet fec decoded [%d] pkt->length, timestamp: [%d]\n",
pkt->seq_number, pkt->length, pkt->timestamp);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above on LOG_ERR usage.


/* Queue the decoded redundant packet for the mixer */
janus_mutex_lock(&participant->qmutex);
participant->inbuf = g_list_append(participant->inbuf, pkt);
Expand Down Expand Up @@ -8720,6 +8749,7 @@ static void *janus_audiobridge_participant_thread(void *data) {
/* Get rid of the buffered packet */
janus_audiobridge_buffer_packet_destroy(bpkt);
/* Update the details */
participant->last_seq = pkt->seq_number;
participant->last_timestamp = pkt->timestamp;
participant->expected_seq = pkt->seq_number + 1;
g_atomic_int_set(&participant->decoding, 0);
Expand All @@ -8738,7 +8768,7 @@ static void *janus_audiobridge_participant_thread(void *data) {
locked = TRUE;
/* Do not let queue-in grow too much */
guint count = g_list_length(participant->inbuf);
if(count > QUEUE_IN_MAX_PACKETS) {
if((int) count > (QUEUE_IN_MAX_PACKETS + lost_packets)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you increase QUEUE_IN_MAX_PACKETS, then there's no reason to make it even less strict like you're doing now IMHO. It will make the internal mixer queue for the participant grow beyond control and cause huge and uncoverable latencies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to set it to default 4 and got pops and cracks. On screen is 20-30% of drops set on output channel.
image

With 20 QUEUE_IN_MAX_PACKETS I don't have such pops/cracks and I don't get over the limit.

JANUS_LOG(LOG_WARN, "Participant queue-in contains too many packets, clearing now (count=%u)\n", count);
janus_audiobridge_participant_clear_inbuf(participant);
}
Expand Down