Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic latency adjustment #688

Open
8 of 21 tasks
gavv opened this issue Feb 3, 2024 · 21 comments
Open
8 of 21 tasks

Dynamic latency adjustment #688

gavv opened this issue Feb 3, 2024 · 21 comments
Assignees

Comments

@gavv
Copy link
Member

gavv commented Feb 3, 2024

Implementation

  • in LinkMeter, compute jitter, total_packets, lost_packets
  • teach fec::Reader and fec::Writer to compute FEC block duration
  • teach LatencyMonitor to compute target latency continuously based on min latency, max latency, FEC block duration, and jitter, and pass updated target to FreqEstimator
  • update pipeline to support dynamic latency mode, enabled when target latency is zero (currently some components may be not ready for this)
  • make this mode default in CLI and API (in CLI, enable it via --target-latency=auto, in API via target_latency=0)
  • add new metrics to API: jitter, total_packets, lost_packets (see Metrics API #681)

Fixes

  • RTCP crash
  • seqnum crash
  • incorrect calculated FEC block duration
  • rework defaults deduction (min/max/target)
  • add margin for auto-tuning min/max relative to hard min/max
  • use CsvDumper
  • use LogNote

Tests

  • LinkMeter tests

  • pipeline tests: test_receiver_source

    // Check how receiver computes packet metrics:
    // total_packets, lost_packets, ext_first_seqnum, ext_last_seqnum
    IGNORE_TEST(receiver_source, metrics_packet_counters) {
    // TODO(gh-688): implement test
    }
    // Check how receiver computes jitter metric.
    IGNORE_TEST(receiver_source, metrics_jitter) {
    // TODO(gh-688): implement test
    }

    • metrics_packet_counters (simulate losses and reorders and check total_packets, lost_packets, ext_first_seqnum, ext_last_seqnum)
    • metrics_jitter (simulate timestamp jitter and check jitter metric)
  • pipeline tests: test_loopback_sink_2_source: cover new metrics (jitter, total_packets, lost_packets - smoke test that all 3 metrics are available on both sender and receiver)

    // TODO(gh-688): check that metrics are non-zero:
    // - total_packets
    // - lost_packets
    // - jitter

    // TODO(gh-688): check that metrics are equal on sender and receiver:
    // - total_packets
    // - lost_packets
    // - jitter

  • public_api tests: cover new metrics (jitter, total_packets, lost_packets - smoke test that all 3 metrics are available on both sender and receiver)

Docs

  • update API comments
  • update CLI manual pages
@gavv gavv added this to Frontlog in kanban board Feb 3, 2024
@gavv gavv moved this from Frontlog to In work in kanban board Feb 3, 2024
@baranovmv baranovmv self-assigned this Feb 3, 2024
@gavv gavv mentioned this issue Feb 4, 2024
16 tasks
@baranovmv
Copy link
Member

@gavv What is the best way you see to convert FEC block size into nanoseconds?

@baranovmv
Copy link
Member

add component that continuously computes target latency based on min latency, max latency, FEC block size, and jitter

What interface of this component do you presume would fit our needs here?

I see it as a single function class like this:

class LatencyRuller {
public:
    LatencyRuller() {}

    ns get_target_latency (ns min_latency, ns max_latency, ns fec_block_duration, ns avg_jitter) const {
        ns res = avg_jitter * 3;
        if (res < fec_block_duration) {
                res = fec_block_duration;
        }
        if (res < min_latency) {
                res = min_latency;
        }
        if (res > max_latency) {
                res = max_latency;
        }
        return res;
    }
};

Am I missing something?

  • min/max latencies and FEC block size -- API parameters, so they are constant values;
  • jitter -- is not supposed to change fast/often, its variability or speed of change is parameterized by win_len ctor argument of LinkMeter

@gavv
Copy link
Member Author

gavv commented Feb 5, 2024

@gavv What is the best way you see to convert FEC block size into nanoseconds?

Since both fec block and packet length are variable, I think we should compute fec block duration on fly.

We can teach fec::Reader to do it. In the end of each block, we can compute its duration.

However, this duration may be varying even for fixed block size, depending on encoding and packetization. We could compute maximum ever seen duration after last block size change. Or we could compute a moving maximum of last N blocks.

Anyway, we'll need to add a pointer to fec::Reader to LatencyMonitor, and if it's non-null, LatencyMonitor should query block duration and pass it to LatencyTuner via LatencyMetrics.

Also we'll need to add a reference to IFrameDecoder to fec::Reader and use it to fill duration of repaired packets (no full decoding, just query duration). I think we'll need packet duration during block duration computation.

Since we also need to be able to tune latency on sender side, we also need to teach fec::Writer to compute FEC block duration; and teach FeedbackMonitor to write fec block duration into LatencyMetrics passed to LatencyTuner.

@gavv
Copy link
Member Author

gavv commented Feb 5, 2024

I see it as a single function class like this:

Then I think we can inline it into LatencyTuner. I thought it would have more complicated logic.

@gavv
Copy link
Member Author

gavv commented Feb 5, 2024

I've updated task description.

@gavv
Copy link
Member Author

gavv commented Feb 5, 2024

add a reference to IFrameDecoder to fec::Reader and use it to fill duration of repaired packets (no full decoding, just query duration)

Am I right you imply getting difference between values of sum position() + available() on every new fec block?

Here's how we do it currently:

void Filter::populate_(const packet::PacketPtr& packet) {
    if (packet->rtp()->duration == 0) {
        packet->rtp()->duration =
            (packet::stream_timestamp_t)decoder_.decoded_sample_count(
                packet->rtp()->payload.data(), packet->rtp()->payload.size());
    }
}

@gavv
Copy link
Member Author

gavv commented Apr 1, 2024

Wi-Fi + speedtest (no FEC, start from 100ms)

@baranovmv Testing your branch on my home wifi, so far works great!

Plotting using csvplotter.

Streaming from RPi4 to laptop over 5ghz. Running a speed test in background (in web browser) to cause increased jitter.

Full logs:
20240401_dynamic_latency_speedtest.zip

grep 'watchdog: status' roc-recv.log

15:12:48.141 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.160 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.181 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.203 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.213 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.235 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwb........
15:12:53.226 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ..............ibbbbb
15:12:53.248 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.259 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.280 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbB
15:12:53.291 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.312 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.333 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.355 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.376 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.387 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.408 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bD..................
15:13:04.298 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: .............ibbbbbb
15:13:04.299 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:13:04.320 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bD..................
15:14:43.360 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ...................i
15:14:43.381 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:14:43.402 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:14:43.424 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:14:43.445 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbD.............
15:14:55.743 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ..............ibbbbb
15:14:55.765 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbi..............

(w = warmup, b = blank, i = incomplete, D = drops of late packets)

  • There were drops at: 15:12:53, 15:13:04, 15:14:43, 15:14:55.

  • 15:12:53 is the very beginning. After seeing higher jitter than the initial latency (100ms here), latency tuner increased latency and drops quickly disappeared.

  • From ~ 15:13:20 to 15:14:50 I see in logs that jitter was gradually increasing from 155ms to 305ms, because I started the speed test. At 15:13:04 there were drops caused by it. At 15:14:43 latency tuner reacted to increased jitter and set target latency to 472ms.

  • There were a bit more drops at 15:14:43 and 15:14:55, and then they disappeared.

Start streaming

Latency tuner: increasing target latency 4800(100.000ms) → 11561(240.852ms)

Start speed test in background

Latency tuner: increasing target latency 11561(240.854ms) → 22692(472.760ms)

Stop speed test

Latency monitor: decreasing target latency 22692(472.750ms) → 18020(375.419ms)
Latency monitor: decreasing target latency 18020(375.417ms) → 14310(298.125ms)

@gavv
Copy link
Member Author

gavv commented Apr 1, 2024

I can confirm that time of spikes in gping roughtly correspond to time of spikes in jitter in csvplotter.

@gavv
Copy link
Member Author

gavv commented Apr 1, 2024

Ethernet (no FEC, start from 40ms)

Here's another test. RPi is streaming to laptop connected to laptop via dock station (no hubs/switches/routers).

Full logs:
20240401_ethernet.zip

Initial latency was 40ms, then it was slowly (during 10 minutes) adjusted as: 40 -> 31 -> 25 -> 53 -> 42 -> 33 ->26.

Jitter is pretty stable but has rare spikes that caused those target latency adjustments.

There were no drops at all.

I'd say that something like 25ms is indeed very suitable target latency for this link. If we'd be able to filter out jitter spikes from calculations, I guess tuner would just stick to it.

rg -N → 
roc-recv.log
18:01:04.921 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 1764(40.000ms) → 1401(31.765ms)
18:03:14.147 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 1401(31.769ms) → 1113(25.228ms)
18:03:29.155 [1976622] [err] roc_audio: [latency_tuner.cpp:608] Latency tuner: increasing target latency 1113(25.238ms) → 2370(53.741ms)
18:05:57.613 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 2370(53.741ms) → 1882(42.677ms)
18:06:26.648 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 1882(42.676ms) → 1495(33.890ms)
18:06:54.711 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 1495(33.900ms) → 1187(26.921ms)

@gavv
Copy link
Member Author

gavv commented Apr 1, 2024

fe_stable

Here is fe_stable from previous measurement. @baranovmv is it expected?

rg -IN fe_stable  | ch 0 -1   
18:00:14.810 fe_stable=false
18:00:19.802 fe_stable=false
18:00:24.804 fe_stable=false
18:00:29.806 fe_stable=false
18:00:34.809 fe_stable=false
18:00:39.802 fe_stable=false
18:00:44.804 fe_stable=false
18:00:49.807 fe_stable=false
18:00:54.809 fe_stable=false
18:00:59.802 fe_stable=false
18:01:04.804 fe_stable=false
18:01:09.807 fe_stable=false
18:01:14.809 fe_stable=false
18:01:19.802 fe_stable=false
18:01:24.804 fe_stable=false
18:01:29.807 fe_stable=false
18:01:34.809 fe_stable=true
18:01:39.801 fe_stable=true
18:01:44.804 fe_stable=true
18:01:49.806 fe_stable=true
18:01:54.809 fe_stable=true
18:01:59.801 fe_stable=true
18:02:04.803 fe_stable=true
18:02:09.806 fe_stable=true
18:02:14.809 fe_stable=true
18:02:19.801 fe_stable=true
18:02:24.804 fe_stable=true
18:02:29.807 fe_stable=false
18:02:34.809 fe_stable=false
18:02:39.801 fe_stable=false
18:02:44.803 fe_stable=true
18:02:49.806 fe_stable=false
18:02:54.809 fe_stable=false
18:02:59.801 fe_stable=false
18:03:04.803 fe_stable=true
18:03:09.806 fe_stable=true
18:03:14.809 fe_stable=false
18:03:19.801 fe_stable=false
18:03:24.803 fe_stable=false
18:03:29.806 fe_stable=false
18:03:34.809 fe_stable=false
18:03:39.802 fe_stable=false
18:03:44.804 fe_stable=false
18:03:49.806 fe_stable=false
18:03:54.809 fe_stable=false
18:03:59.801 fe_stable=false
18:04:04.803 fe_stable=false
18:04:09.806 fe_stable=false
18:04:14.808 fe_stable=true
18:04:19.800 fe_stable=true
18:04:24.803 fe_stable=true
18:04:29.805 fe_stable=true
18:04:34.808 fe_stable=true
18:04:39.801 fe_stable=true
18:04:44.803 fe_stable=true
18:04:49.805 fe_stable=true
18:04:54.808 fe_stable=true
18:04:59.801 fe_stable=true
18:05:04.803 fe_stable=true
18:05:09.805 fe_stable=true
18:05:14.808 fe_stable=true
18:05:19.800 fe_stable=true
18:05:24.803 fe_stable=true
18:05:29.805 fe_stable=true
18:05:34.808 fe_stable=true
18:05:39.800 fe_stable=true
18:05:44.803 fe_stable=true
18:05:49.805 fe_stable=true
18:05:54.808 fe_stable=true
18:05:59.800 fe_stable=false
18:06:04.803 fe_stable=false
18:06:09.805 fe_stable=false
18:06:14.808 fe_stable=false
18:06:19.801 fe_stable=false
18:06:24.803 fe_stable=false
18:06:29.805 fe_stable=false
18:06:34.808 fe_stable=false
18:06:39.800 fe_stable=false
18:06:44.803 fe_stable=false
18:06:49.805 fe_stable=false
18:06:54.807 fe_stable=true
18:06:59.800 fe_stable=false
18:07:04.802 fe_stable=false
18:07:09.805 fe_stable=false
18:07:14.808 fe_stable=false
18:07:19.800 fe_stable=false
18:07:24.802 fe_stable=true
18:07:29.805 fe_stable=true
18:07:34.807 fe_stable=true
18:07:39.799 fe_stable=true
18:07:44.802 fe_stable=true
18:07:49.804 fe_stable=true
18:07:54.807 fe_stable=true
18:07:59.799 fe_stable=true
18:08:04.802 fe_stable=true
18:08:09.804 fe_stable=true
18:08:14.807 fe_stable=false
18:08:19.800 fe_stable=false
18:08:24.802 fe_stable=false
18:08:29.804 fe_stable=true
18:08:34.807 fe_stable=true
18:08:39.799 fe_stable=false
18:08:44.802 fe_stable=false
18:08:49.805 fe_stable=false
18:08:54.807 fe_stable=true
18:08:59.799 fe_stable=true
18:09:04.801 fe_stable=true
18:09:09.804 fe_stable=false
18:09:14.807 fe_stable=false
18:09:19.799 fe_stable=false
18:09:24.801 fe_stable=true
18:09:29.804 fe_stable=true
18:09:34.807 fe_stable=false
18:09:39.799 fe_stable=false
18:09:44.802 fe_stable=false
18:09:49.804 fe_stable=true
18:09:54.807 fe_stable=true
18:09:59.798 fe_stable=true
18:10:04.801 fe_stable=false
18:10:09.804 fe_stable=false
18:10:14.807 fe_stable=false
18:10:19.798 fe_stable=true
18:10:24.801 fe_stable=false
18:10:29.804 fe_stable=false
18:10:34.807 fe_stable=false
18:10:39.798 fe_stable=true

@gavv
Copy link
Member Author

gavv commented Apr 1, 2024

RTCP crash

When I enable RTCP, roc-send crashes when it gots first feedback report from receiver:

./roc/roc-send -vv -s rtp+rs8m://192.168.0.101:20001 -r rs8m://192.168.0.101:20002 \
    -c rtcp://192.168.0.101:20003 -i file:./long.wav
roc-recv -vv -s rtp+rs8m://0.0.0.0:20001 -r rs8m://0.0.0.0:20002 \
    -c rtcp://0.0.0.0:20003 \
    --target-latency=200ms --min-latency=5ms --max-latency=1000ms --beep

Crash:

18:52:23.180 [2408] [dbg] roc_sndio: [backend_map.cpp:20] backend map: initializing: n_backends=2 n_drivers=58
18:52:23.181 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=packet_pool object_size=496 min_slab=8B(1S) max_slab=0B(0S)
18:52:23.181 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=buffer_pool object_size=992 min_slab=8B(1S) max_slab=0B(0S)
18:52:23.181 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=buffer_pool object_size=4136 min_slab=8B(1S) max_slab=0B(0S)
18:52:23.181 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=encoding_pool object_size=240 min_slab=3840B(16S) max_slab=0B(0S)
18:52:23.181 [2409] [dbg] roc_netio: [network_loop.cpp:278] network loop: starting event loop
18:52:23.182 [2410] [dbg] roc_ctl: [control_task_queue.cpp:95] control task queue: starting event loop
18:52:23.182 [2408] [dbg] roc_node: [context.cpp:24] context: initializing
18:52:23.182 [2408] [inf] roc_sndio: [wav_source.cpp:177] wav source: opened input file: path=./long.wav in_bits=16 in_rate=44100 in_ch=2
18:52:23.182 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=slot_pool object_size=1072 min_slab=8B(1S) max_slab=0B(0S)
18:52:23.182 [2408] [dbg] roc_node: [sender.cpp:32] sender node: initializing
18:52:23.182 [2408] [dbg] roc_node: [sender.cpp:73] sender node: configuring audiosrc interface of slot 0
18:52:23.182 [2408] [inf] roc_pipeline: [sender_sink.cpp:72] sender sink: adding slot
18:52:23.183 [2408] [dbg] roc_rtp: [identity.cpp:50] rtp identity: ssrc=1540280261 cname=853129e1-8270-4cf8-aa47-bc0d202891db
18:52:23.183 [2408] [inf] roc_node: [sender.cpp:121] sender node: connecting audiosrc interface of slot 0 to rtp+rs8m://192.168.0.101:20001
18:52:23.183 [2409] [dbg] roc_netio: [udp_port.cpp:129] udp port: : opened port
18:52:23.184 [2408] [inf] roc_node: [sender.cpp:201] sender node: bound audiosrc interface to 0.0.0.0:46121
18:52:23.184 [2409] [dbg] roc_netio: [network_loop.cpp:427] network loop: starting sending packets on port 
18:52:23.184 [2408] [dbg] roc_pipeline: [sender_slot.cpp:61] sender slot: adding audiosrc endpoint rtp+rs8m
18:52:23.184 [2408] [dbg] roc_node: [sender.cpp:73] sender node: configuring audiorpr interface of slot 0
18:52:23.184 [2408] [inf] roc_node: [sender.cpp:121] sender node: connecting audiorpr interface of slot 0 to rs8m://192.168.0.101:20002
18:52:23.185 [2408] [dbg] roc_node: [sender.cpp:493] sender node: sharing audiosrc interface port with audiorpr interface
18:52:23.185 [2408] [dbg] roc_pipeline: [sender_slot.cpp:61] sender slot: adding audiorpr endpoint rs8m
18:52:23.185 [2408] [dbg] roc_fec: [openfec_encoder.cpp:28] openfec encoder: initializing: codec=rs m=8
18:52:23.185 [2408] [dbg] roc_fec: [writer.cpp:91] fec writer: update block size: cur_sbl=0 cur_rbl=0 new_sbl=18 new_rbl=10
18:52:23.185 [2408] [dbg] roc_audio: [packetizer.cpp:60] packetizer: initializing: packet_length=5.000ms samples_per_packet=221 payload_size=884 sample_spec= chset=>
18:52:23.185 [2408] [dbg] roc_audio: [latency_tuner.cpp:199] latency tuner: initializing: target_latency=0(0.000ms) min_latency=0(0.000ms) max_latency=0(0.000ms) latency_upper_limit_coef=1.700000 stale_tolerance=0(0.000ms) scaling_interval=0(0.000ms) scaling_tolerance=0.000000 backend=niq profile=intact
18:52:23.185 [2408] [dbg] roc_node: [sender.cpp:73] sender node: configuring audioctl interface of slot 0
18:52:23.185 [2408] [inf] roc_node: [sender.cpp:121] sender node: connecting audioctl interface of slot 0 to rtcp://192.168.0.101:20003
18:52:23.186 [2408] [dbg] roc_node: [sender.cpp:493] sender node: sharing audiosrc interface port with audioctl interface
18:52:23.186 [2408] [dbg] roc_pipeline: [sender_slot.cpp:61] sender slot: adding audioctl endpoint rtcp
18:52:23.186 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=stream_pool object_size=856 min_slab=6848B(8S) max_slab=0B(0S)
18:52:23.186 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=address_pool object_size=200 min_slab=800B(4S) max_slab=0B(0S)
18:52:23.186 [2408] [dbg] roc_rtcp: [reporter.cpp:69] rtcp reporter: initializing: local_ssrc=1540280261 local_cname="853129e1-8270-4cf8-aa47-bc0d202891db" report_mode=address report_addr=192.168.0.101:20003 timeout=5000.000ms
18:52:23.186 [2408] [dbg] roc_audio: [feedback_monitor.cpp:66] feedback monitor: start gathering feedback
18:52:23.186 [2409] [dbg] roc_netio: [network_loop.cpp:447] network loop: starting receiving packets on port 
18:52:23.187 [2408] [dbg] roc_sndio: [pump.cpp:55] pump: starting main loop
18:52:23.187 [2408] [dbg] roc_packet: [router.cpp:103] router: detected new stream: source_id=1540280261 route_flags=[audio] packet_flags=[rtp,fec,audio,prepared,composed]
18:52:23.187 [2408] [dbg] roc_netio: [udp_port.cpp:579] udp port: : recv=0 send=1 send_nb=1
18:52:23.187 [2408] [dbg] roc_rtp: [timestamp_extractor.cpp:82] timestamp extractor: returning mapping: cts:1711983143187832242/sts:1946609469
18:52:23.187 [2408] [dbg] roc_rtcp: [reporter.cpp:1452] rtcp reporter: creating address: remote_addr=192.168.0.101:20003
18:52:23.188 [2408] [dbg] roc_rtcp: [reporter.cpp:1172] rtcp reporter: completed index rebuild: n_streams=0 n_addrs=1
18:52:23.188 [2408] [dbg] roc_rtcp: [communicator.cpp:855] rtcp communicator: generated_pkts=1 processed_pkts=0 proc_errs=0
18:52:23.188 [2408] [dbg] roc_pipeline: [pipeline_loop.cpp:483] pipeline loop: tasks=5 in_place=1.00 in_frame=0.00 preempts=0 sched=0/0
18:52:23.277 [2408] [dbg] roc_packet: [router.cpp:121] router: detected new stream: source_id=none route_flags=[repair] packet_flags=[fec,repair,prepared,composed]
18:52:25.197 [2408] [dbg] roc_rtcp: [reporter.cpp:1399] rtcp reporter: creating stream: ssrc=1469140577
18:52:25.197 [2408] [dbg] roc_rtcp: [reporter.cpp:310] rtcp reporter: updating stream address: ssrc=1469140577 old_addr= new_addr=192.168.0.101:20003
18:52:25.197 [2408] [inf] roc_audio: [feedback_monitor.cpp:81] feedback monitor: got first report from receiver: source=1469140577

src/internal_modules/roc_core/optional.h:58: error: roc_panic()

ERROR: roc_audio: optional: attempt to dereference unitialized object

Backtrace:
#1: 0x134D25 roc::core::print_backtrace_full()+0x5C
#2: 0x133905 roc::core::die_gracefully(char const*, bool)+0x50
#3: 0x12DE85 roc::core::panic(char const*, char const*, int, char const*, ...)+0xCC
#4: 0xC2C13 roc::core::Optional<roc::audio::FreqEstimator, 4184u>::operator->() const+0x22
#5: 0xC1FDB roc::audio::LatencyTuner::report_()+0x19A
#6: 0xC1B03 roc::audio::LatencyTuner::advance_stream(unsigned int)+0x7A
#7: 0xC02A5 roc::audio::FeedbackMonitor::update_tuner_(unsigned int)+0x134
#8: 0xC0013 roc::audio::FeedbackMonitor::write(roc::audio::Frame&)+0x52
#9: 0xBF99F roc::audio::Fanout::write(roc::audio::Frame&)+0x2A
#10: 0x7F003 roc::pipeline::SenderSink::write(roc::audio::Frame&)+0x4C
#11: 0x7DFA9 roc::pipeline::SenderLoop::process_subframe_imp(roc::audio::Frame&)+0x18
#12: 0x7CB3F roc::pipeline::PipelineLoop::process_next_subframe_(roc::audio::Frame&, unsigned int*, unsigned int)+0xFC
#13: 0x7C6B1 roc::pipeline::PipelineLoop::process_subframes_and_tasks_precise_(roc::audio::Frame&)+0x98
#14: 0x7C583 roc::pipeline::PipelineLoop::process_subframes_and_tasks(roc::audio::Frame&)+0x1A
#15: 0x7DF07 roc::pipeline::SenderLoop::write(roc::audio::Frame&)+0x146
#16: 0x88FBD roc::sndio::Pump::transfer_frame_(roc::sndio::ISource&)+0x178
#17: 0x88D11 roc::sndio::Pump::run()+0x2AE
#18: 0x6FDEB main+0x1FF6
#19: 0xB6AE8740 __libc_start_main+0x114
#20: 0x6DD55 _start+0x20
Aborted

@gavv
Copy link
Member Author

gavv commented Apr 1, 2024

Wi-Fi + noisy link (FEC, standard conservative settings)

Full logs:
20240401_wifi_sane_defaults.zip

Settings:

  • Reed-Solomon FEC with default block size
  • --target-latency=200ms
  • --min-latency=50ms
  • --max-latency=1s

(I think these values can be default in roc-recv).

There were no speedtest, however the link itself was noisy since it's evening and everybody around use Wi-Fi / Bluetooth.

rg -IN → | ch 5:
Latency tuner: increasing target latency 8820(200.000ms) → 13535(306.910ms)
Latency tuner: increasing target latency 13535(306.916ms) → 27415(621.666ms)
Latency monitor: decreasing target latency 27415(621.655ms) → 21771(493.667ms)
Latency monitor: decreasing target latency 21771(493.673ms) → 17289(392.035ms)
Latency monitor: decreasing target latency 17289(392.041ms) → 13730(311.327ms)

So we started from 200ms, jumped to 306ms, then after a big jitter spike to 621ms, then gradually rolled back to 311ms (during 12 minutes).

Drops happened only once during that jitter spike. FEC partially recovered those drops.

rg -IN 'watchdog|repaired|→' | perl -ne 'print if /18:57:00/../19:03:43/'
18:57:00.531 [1980925] [err] roc_audio: [latency_tuner.cpp:608] Latency tuner: increasing target latency 8820(200.000ms) → 13535(306.910ms)
18:57:00.532 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ..............ibb...
18:57:44.659 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ...ibbbbbbbbbbbbbbbb
18:57:44.659 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.681 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.702 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.723 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.745 [1980925] [err] roc_audio: [latency_tuner.cpp:608] Latency tuner: increasing target latency 13535(306.916ms) → 27415(621.666ms)
18:57:44.745 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbD..
18:57:44.862 [1980925] [dbg] roc_fec: [openfec_decoder.cpp:370] openfec decoder: repaired 0/14/255 ....XXXXXXXXXXXXXX .........
18:57:44.862 [1980925] [dbg] roc_fec: [openfec_decoder.cpp:370] openfec decoder: repaired 0/18/255 XXXXXXXXXXXXXXXXXX .........
18:57:44.862 [1980925] [dbg] roc_fec: [openfec_decoder.cpp:370] openfec decoder: repaired 2/2/255 rr................ .........
18:57:44.862 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: .................ibb
18:57:44.883 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.905 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.926 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.947 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.968 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.980 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.990 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:45.011 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:45.032 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbi.....
19:00:14.856 [1980925] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 27415(621.655ms) → 21771(493.667ms)
19:01:09.693 [1980925] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 21771(493.673ms) → 17289(392.035ms)
19:03:43.729 [1980925] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 17289(392.041ms) → 13730(311.327ms)

@gavv
Copy link
Member Author

gavv commented Apr 2, 2024

Wi-Fi, unstable channel

Streaming from laptop to orange pi connected to 2 different 5ghz access points, in turn connected together via cable.

I'm seeing regular repairs and drops (and crackling), but don't see any reaction from latency tuner.

Sample 1

pulse.log

Sample 2

pulse2.log

Sample 3

pulse3.log

@gavv
Copy link
Member Author

gavv commented Apr 2, 2024

After a closer look, it seems the last one is really caused by packet loss, as discussed in chat.

So it's not a problem of latency adjustment algorithm, it behaves correctly here.

20240402_poor_link_2.zip

From wireshark:

5.61% of 314sec = 17.6 secs of lost audio

From our log:

rg -NI 'watchdog: status' | ch -1 | tr -d '\n' | grep -o 'b' | wc -l
67308

67308 blank frames * 8 samples per frame / 44100Hz = 12.21 secs of lost audio

(8 samples used by speex resampler)

Also from our log:

rg -NI cum_loss | ch 13 | tail -1
cum_loss=2301

Wireshark report and our report are close in number and duration of lost packets.

BTW reported max jitter is quite different: 100ms vs 600ms.

@gavv
Copy link
Member Author

gavv commented Apr 2, 2024

The problem can be solved by using larger FEC blocks on sender, e.g. --nbsrc=30 --nbrpr=20.

@gavv
Copy link
Member Author

gavv commented Apr 3, 2024

If I use --packet-len 7ms --nbsrc 40 --nbrpr 20 on sender, which gives FEC blocks of 280ms, receiver still uses target latency of 200ms for some reason.

(Jitter is below 200ms in that case).

@gavv
Copy link
Member Author

gavv commented Apr 3, 2024

Wi-Fi, unstable channel

I did some more experiments in this setup (streaming from laptop to orange pi connected to 2 different 5ghz access points, in turn connected together via cable).

Some observations:

  • All problems are caused by burst losses.

  • Here is the best configuration that I've found for this particular setup:

    • on sender: increase packet_len to 7ms
    • on sender: increase FEC block size to 60/20 (420ms)
    • on sender: switch from reed-solomon to LDPC-staircase
    • on receiver: increase min_latency to 500ms
  • Why:

    • when packets are smaller, there are more losses
    • when FEC block is smaller, burst losses cause loss of entire block(s)
    • when FEC block is larger, and there are losses, reed-solomon is too slow and causes glitches on receiver (device spec)
    • FEC works fine only when target latency is larger than FEC block, in this case higher by ~100ms works well

All this is expected, just want to document an actual experiment.

BTW even after doing all this I'm still not satisfied with that setup - sometimes burst losses are too long to be covered by a sane FEC block size. I wonder if RTX would help here (I don't know if in this periods ALL packets are dropped or just MANY of them. Also I don't know yet exact reason of losses - physical link or router queues).

Also I think compression would help here.

@gavv
Copy link
Member Author

gavv commented Apr 3, 2024

During some short period, I was constantly getting this panic:

link meter: seqnum was not processed in writer part

I never restarted sender. I restarted receiver multiple times and got panic each time.

After a short time, panic disappears. And it starts reproducing again for a short time after a minute or two. And so on.

I suppose it's related to seqnum overflows.

First two files in zip are just logs. Second to files also has RTP headers, I added packet->print(0) to LinkMeter::read().

20240403_seqnum_crash.zip

@gavv
Copy link
Member Author

gavv commented Apr 3, 2024

Here is an example when tuner was too optimistic, reduced latency, and it caused drops of late packets and crackling.

It happened after a long measurement (about 30 minutes). Attached logs have only the final part. See "dropping late packet" in the end of pulse.log.

20240403_too_optimistic.zip

baranovmv added a commit to baranovmv/roc-toolkit that referenced this issue Apr 20, 2024
The whole combination of factor that enabled
the issue are still unclear. The fix is just in
simplifying the logic by disabling estimator for
the blocks with absent first packet, which
should not be a great deal.
Apparently the issue could take place
during start-up transition, and it
basically lead to wrong block length estimation
with two times greater value than correct one.
@baranovmv
Copy link
Member

baranovmv commented May 6, 2024

@gavv I'm about to enable roc-recv to determine if latency is fixed or it is dynamic.

We've got --target-latency command line parameter.

  • As you proposed, one possible option is to allow it to comprehend auto variant, but it wouldn't be possible to specify a starting latency, which would be default in this case. I'd like to keep it though
  • Another option is to add a bool parameter --dynamic-latency which should be set to true by default

@gavv
Copy link
Member Author

gavv commented May 6, 2024

If we want to support specifying starting latency in CLI, then we also need it in C API?

There are two approaches for C API:

  • target_latency = 0 means adaptive latency, and we add a new field start_latency
  • we add a new flag / mode that enables adaptive latency; in this mode target_latency is used as starting latency

I think the second approach is a bit confusing, because target_latency suddenly transforms to starting latency in some cases.

If you agree, then in C API we can have target_latency and start_latency. In this case, I think in CLI we should have the same: --target-latency and --start-latency (because CLI mirrors C API).

If in future version we'll deduce starting latency automatically, this layout will work fine too. In this case start_latency = 0 would mean "deduce starting latency", and start_latency != 0 would mean "use fixed starting latency".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

No branches or pull requests

2 participants