When running a Liquidsoap-based radio with an %opus encoder configured as follows:
opus = %ogg(%opus(samplerate=48000, frame_size=60.))
output.harbor(opus, port=2048, mount="opus", radio)
I’m seeing a rare SIGSEGV, roughly once every 24 hours on average. As far as I can tell, no specific song or part of a song triggers it.
The following is the stack trace I’ve been able to obtain from a coredump:
#0 0x0000000000c1f6a1 in ocaml_opus_encode_float ()
#1 0x0000000000a507a8 in camlOpus__fun_897 ()
#2 0x00000000006dc956 in camlOpus_encoder__data_encoder_455 ()
#3 0x00000000006e07d1 in camlOgg_muxer__encode_949 ()
#4 0x0000000000b78000 in camlStdlib__List__iter_507 ()
#5 0x00000000006df16b in camlOgg_encoder__encode_964 ()
#6 0x0000000000785a3d in camlOutput__output_chunks_1581 ()
#7 0x0000000000783a16 in camlOutput__fun_2241 ()
#8 0x00000000008a0ab2 in camlClock__fun_2268 ()
#9 0x0000000000b780d4 in camlStdlib__List__fold_left_521 ()
#10 0x000000000089e71a in camlClock__fun_2219 ()
#11 0x00000000008a07b2 in camlClock__loop_1348 ()
#12 0x000000000089ead8 in camlClock__fun_2196 ()
#13 0x00000000008d84f3 in camlTutils__process_832 ()
#14 0x0000000000b66eb5 in camlThread__fun_850 ()
#15 0x0000000000c722d5 in caml_start_program ()
#16 0x0000000000c683d0 in caml_callback_exn ()
#17 0x0000000000c494d8 in caml_thread_start ()
#18 0x00007f9155270e86 in start_thread () from libc.so.6
#19 0x00007f91552f7c60 in clone3 () from libc.so.6
The faulting instruction is the cvtsd2ss in what I believe is some sort of loop:
0x0000000000c1f658 <+344>: test %r12d,%r12d
0x0000000000c1f65b <+347>: jle 0xc1f6c0 <ocaml_opus_encode_float+448>
0x0000000000c1f65d <+349>: mov 0x48(%rsp),%r11
0x0000000000c1f662 <+354>: mov 0x10(%rsp),%r8d
0x0000000000c1f667 <+359>: mov %r15d,%r9d
0x0000000000c1f66a <+362>: xor %r10d,%r10d
0x0000000000c1f66d <+365>: sub 0x2c(%rsp),%r9d
0x0000000000c1f672 <+370>: nopw 0x0(%rax,%rax,1)
0x0000000000c1f678 <+376>: test %ebx,%ebx
0x0000000000c1f67a <+378>: jle 0xc1f6b1 <ocaml_opus_encode_float+433>
0x0000000000c1f67c <+380>: movslq %r9d,%rsi
0x0000000000c1f67f <+383>: mov %r10d,%eax
0x0000000000c1f682 <+386>: mov %r11,%rdx
0x0000000000c1f685 <+389>: shl $0x3,%rsi
0x0000000000c1f689 <+393>: nopl 0x0(%rax)
0x0000000000c1f690 <+400>: mov (%rdx),%rdi
0x0000000000c1f693 <+403>: movslq %eax,%rcx
0x0000000000c1f696 <+406>: pxor %xmm0,%xmm0
0x0000000000c1f69a <+410>: add $0x1,%eax
0x0000000000c1f69d <+413>: add $0x8,%rdx
=> 0x0000000000c1f6a1 <+417>: cvtsd2ss (%rdi,%rsi,1),%xmm0
0x0000000000c1f6a6 <+422>: movss %xmm0,0x0(%rbp,%rcx,4)
0x0000000000c1f6ac <+428>: cmp %eax,%r8d
0x0000000000c1f6af <+431>: jne 0xc1f690 <ocaml_opus_encode_float+400>
0x0000000000c1f6b1 <+433>: add $0x1,%r9d
0x0000000000c1f6b5 <+437>: add %r13d,%r10d
0x0000000000c1f6b8 <+440>: add %r13d,%r8d
0x0000000000c1f6bb <+443>: cmp %r15d,%r9d
0x0000000000c1f6be <+446>: jne 0xc1f678 <ocaml_opus_encode_float+376>
This is the only occurrence of cvtsd2ss in the function, so I suspect it corresponds to this code:
    for (i = 0; i < loops; i++) {
      for (j = 0; j < frame_size; j++)
        for (c = 0; c < chans; c++)
          pcm[chans * j + c] =
              Double_field(Field(buf, c), off + j + i * frame_size);
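To make the data flow of that loop easier to follow, here is a minimal plain-C analogue of one pass of the inner two loops. It is only a sketch: `interleave_frame` and its parameters are hypothetical names, and the OCaml accessors `Field` and `Double_field` are replaced by ordinary array indexing, which is an assumption about their effect rather than the stub's actual code.

```c
#include <stddef.h>

/* Hypothetical stand-in for the quoted loop body (one value of i).
   chan[c] plays the role of Field(buf, c): one array of doubles per channel.
   Samples are narrowed from double to float and interleaved into pcm,
   which is the conversion that compiles to cvtsd2ss. */
static void interleave_frame(const double *const *chan, int chans,
                             size_t off, int frame_size, float *pcm) {
  for (int j = 0; j < frame_size; j++)
    for (int c = 0; c < chans; c++)
      /* Reads element off + j of every channel: each channel array must
         hold at least off + frame_size elements for this to be safe. */
      pcm[chans * j + c] = (float)chan[c][off + j];
}
```

The point of the sketch is the read index: nothing in the loop itself checks that `off + j` stays inside each channel's array, so the caller's length computation has to guarantee it.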
The following are the GPRs of interest:
rax 0x1302 4866
rbx 0x2 2
rcx 0x1301 4865
rdx 0x7f915112fe00 140262107184640
rsi 0x4c00 19456
rdi 0x7f913a7f8400 140261728420864
rbp 0x7f912402ce80 0x7f912402ce80
rsp 0x7f914951b520 0x7f914951b520
r8 0x1302 4866
r9 0x980 2432
r10 0x1300 4864
r11 0x7f915112fdf0 140262107184624
r12 0xb40 2880
r13 0x2 2
r14 0x7f91240b8740 140261351720768
r15 0xb40 2880
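As a sanity check on the register dump, the small C helpers below recompute what the faulting instruction was doing. The helper names are made up for illustration, the 4 KiB page size is an assumption about the host, and the mapping of registers to C variables is my reading of the disassembly rather than something confirmed against debug info.

```c
#include <stdint.h>

/* Effective address of `cvtsd2ss (%rdi,%rsi,1),%xmm0`: base + index. */
static uint64_t fault_address(uint64_t rdi, uint64_t rsi) {
  return rdi + rsi;
}

/* `shl $0x3,%rsi` scales r9 by sizeof(double), so rsi / 8 recovers the
   sample index (off + j + i * frame_size in the C source). */
static uint64_t sample_index(uint64_t rsi) {
  return rsi / 8;
}

/* Offset of an address within its page, assuming 4 KiB pages. */
static uint64_t page_offset(uint64_t addr) {
  return addr & 0xfff;
}

/* pcm is interleaved, so pcm_index % chans is the channel being written. */
static uint64_t pcm_channel(uint64_t pcm_index, uint64_t chans) {
  return pcm_index % chans;
}
```

Plugging in rdi = 0x7f913a7f8400, rsi = 0x4c00, rcx = 0x1301 and a channel count of 2 (rbx and r13) gives a fault address of 0x7f913a7fd000, a page offset of 0, a sample index of 2432 (matching r9) and channel 1. So the load faults exactly on a page boundary while reading sample 2432 of the second channel, before reaching the loop bound of 2880 held in r15. That is consistent with, though not proof of, reading past the end of that channel's array.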
Reasoning from the assembly, the problem appears to be in how the size of the input buffer is computed. Perhaps the input vector is _len - _off samples long rather than _len samples long? In that case, int loops = len / frame_size; looks suspect.
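To illustrate the suspicion, here is a hedged sketch of a frame count that is clamped to what is actually readable from `off`. `safe_loops` and its `total` parameter (the number of elements in each channel array) are hypothetical and not part of ocaml-opus; I have not verified how the stub obtains or validates these values.

```c
#include <stddef.h>

/* Hypothetical helper: number of whole frames that can be read starting at
   `off` without going past `total` elements per channel. The unclamped
   equivalent would simply be len / frame_size. */
static int safe_loops(size_t total, size_t off, size_t len,
                      size_t frame_size) {
  if (frame_size == 0 || off >= total)
    return 0;
  size_t avail = total - off; /* elements readable from off onwards */
  if (len > avail)
    len = avail;              /* never trust len beyond the array end */
  return (int)(len / frame_size);
}
```

For example, assuming purely for illustration a channel array of 2432 elements with off = 0, len = 2880 and frame_size = 2880, the unclamped computation yields one loop and reads 448 elements too far, while `safe_loops` yields zero.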
The code quoted above is from ocaml-opus/src/opus_stubs.c, lines 763 to 767 at commit 3e1a566.