Exception SIGSEGV while reconnect RTSP #53

Open
heleon19 opened this issue Oct 13, 2020 · 19 comments

Comments

@heleon19

Hello

I would like to use this lib to get JPEG images from an RTSP MPEG stream. I can already establish a connection and receive the individual images.
When the connection to the camera is lost, or the camera is restarting, I want to close the demuxer so that I can establish a new connection. How can I do that?

If I simply create a new demuxer, I get a SIGSEGV error.

Find attached my test script and the logs. For my test I used beamcoder v0.6.3

Thank you for your support.

test_and_log.zip

@scriptorian
Contributor

Hi, thanks for raising the issue.
I have just added a forceClose method to the demuxer that should allow you to do what you are trying to do. I believe the garbage collection would have closed the stream eventually but this will allow you to get ahead of that!
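
A minimal sketch of how a reconnect routine might use this. It assumes the demuxer object exposes `forceClose()` as described above; `reconnect` and `openDemuxer` are hypothetical names introduced only for illustration (in a real script `openDemuxer` would probably be something like `url => beamcoder.demuxer(url)`).

```javascript
// Sketch only: close the old demuxer explicitly, then open a new one.
// `openDemuxer` is a hypothetical factory supplied by the caller.
async function reconnect(oldDemuxer, url, openDemuxer) {
  if (oldDemuxer && typeof oldDemuxer.forceClose === 'function') {
    try {
      // Release the old connection now instead of waiting for garbage collection.
      await oldDemuxer.forceClose();
    } catch (err) {
      console.error('forceClose failed:', err.message);
    }
  }
  // Resolves to whatever the factory returns (a fresh demuxer).
  return openDemuxer(url);
}
```

Passing the factory in keeps the helper independent of the native module, so it can be exercised with a stub. It does not by itself protect against a read that is still running when the close happens.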

@heleon19
Author

Hi, thank you for your support.
I've tested with the new version and call forceClose, but I still get a SIGSEGV exception.

Once I got the following error message:
2020-10-15 21_02_26-index js — C__DATEN_nodejs — Atom
The instruction in 0x0000... refers to memory at 0x0... The read operation could not be performed in memory.

Find my adapted testscript and logs attached.
test_and_log_v0.6.5.zip

@scriptorian
Contributor

I have tried your code on a public rtsp stream (I used rtsp://wowzaec2demo.streamlock.net/vod/mp4:BigBuckBunny_115k.mov). I'm not getting a timeout but by taking out the keepalive() call from the packet loop I can simulate what is happening. I don't get a crash on restart but I have confirmed that the new forceClose is not doing anything useful in this case, unfortunately.

As a test, I have now added a bit of code that tries to make sure the old demuxer has been garbage collected before reconnecting, to see whether that makes any difference. You will need to run with node --expose-gc index.js and, in place of await connection.forceClose() but after connection = null, add the following:

      return new Promise(resolve => {
        setTimeout(() => {
          global.gc();
          resolve();
        }, 200);
      });

@heleon19
Author

heleon19 commented Oct 19, 2020

Hi Scriptorian
Calling the garbage collector before reconnecting does work. Is there a way to handle this inside the beamcoder lib?

How can I disable this FFmpeg log output?
[rtsp @ 0000026dc3e1c800] max delay reached. need to consume packet
[rtsp @ 0000026dc3e1c800] RTP: missed 2 packets
[rtsp @ 0000026dc3e1c800] Missing packets; dropping frame.
[rtsp @ 0000026dc3e1c800] Missing packets; dropping frame.
[rtsp @ 0000026dc3e1c800] max delay reached. need to consume packet
[rtsp @ 0000026dc3e1c800] RTP: missed 2 packets
[rtsp @ 0000026dc3e1c800] Missing packets; dropping frame.

@scriptorian
Contributor

Hi,

I've had another go to make sure the forceClose method does what you need. You should be able to remove the code to force garbage collection now.

@scriptorian
Contributor

I've also now added a call, beamcoder.logging(), which allows you to read or set the FFmpeg logging level. The README file lists the valid log level strings.
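
A small sketch of how this might be used to quieten the RTSP messages. It assumes that calling `logging()` with no argument returns the current level and that the level strings follow FFmpeg's own names (for example `'quiet'`, `'fatal'`, `'error'`, `'warning'`, `'info'`); the README remains the authority on both points. `setFfmpegLogLevel` is a hypothetical helper name.

```javascript
// `bc` is the beamcoder module, passed in so the helper can be stubbed.
// Assumption: bc.logging() reads the level, bc.logging(level) sets it.
function setFfmpegLogLevel(bc, level) {
  const previous = bc.logging(); // read the current level
  bc.logging(level);             // set the new level
  return previous;               // lets the caller restore it later
}

// Usage (requires beamcoder to be installed):
// const beamcoder = require('beamcoder');
// const before = setFfmpegLogLevel(beamcoder, 'fatal');
```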

@heleon19
Author

Hi, thank you again!

The logging function works as expected.
When using forceClose I still get an exception sometimes.

Find logs and updated test script attached.
test_and_log_v0.6.7.zip

@scriptorian
Contributor

I've had another try at shutting down cleanly, hopefully this will be enough!

@heleon19
Author

no data timeout triggered
disconnect from rtsp://admin:admin@10.1.45.107/1
connect to rtsp://admin:admin@10.1.45.107/1
PID 7916 received SIGSEGV for address: 0x41092b6a
SymInit: Symbol-SearchPath: '.;C:\DATEN\nodejs\beamcode;C:\Program Files\nodejs;C:\WINDOWS;C:\WINDOWS\system32;SRVC:\websymbolshttp://msdl.microsoft.com/download/symbols;', symOptions: 530, UserName: 'rje'
OS-Version: 10.0.18362 () 0x100-0x1
c:\daten\nodejs\beamcode\node_modules\segfault-handler\src\stackwalker.cpp (924): StackWalker::ShowCallstack
c:\daten\nodejs\beamcode\node_modules\segfault-handler\src\segfault-handler.cpp (242): segfault_handler
00007FF9B4C585B6 (ntdll): (filename not available): RtlIsGenericTableEmpty
00007FF9B4C4A056 (ntdll): (filename not available): RtlRaiseException
00007FF9B4C7FE3E (ntdll): (filename not available): KiUserExceptionDispatcher
00007FF941092B6A (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF9410A7204 (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF9410A9DDB (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF9410D4A7B (avformat-58): (filename not available): av_find_default_stream_index
00007FF9410D5D1B (avformat-58): (filename not available): av_find_default_stream_index
00007FF9410D6D58 (avformat-58): (filename not available): av_read_frame
c:\daten\nodejs\beamcode\node_modules\beamcoder\src\demux.cc (238): readFrameExecute
00007FF7A099D26E (node): (filename not available): uv_queue_work
00007FF7A098A7CD (node): (filename not available): uv_poll_stop
00007FF7A1658100 (node): (filename not available): v8::internal::SetupIsolateDelegate::SetupHeap
00007FF9B3117BD4 (KERNEL32): (filename not available): BaseThreadInitThunk
00007FF9B4C4CE51 (ntdll): (filename not available): RtlUserThreadStart

@heleon19
Author

PID 62040 received SIGSEGV for address: 0x410dd712
SymInit: Symbol-SearchPath: '.;C:\DATEN\nodejs\beamcode;C:\Program Files\nodejs;C:\WINDOWS;C:\WINDOWS\system32;SRVC:\websymbolshttp://msdl.microsoft.com/download/symbols;', symOptions: 530, UserName: 'rje'
OS-Version: 10.0.18362 () 0x100-0x1
c:\daten\nodejs\beamcode\node_modules\segfault-handler\src\stackwalker.cpp (924): StackWalker::ShowCallstack
c:\daten\nodejs\beamcode\node_modules\segfault-handler\src\segfault-handler.cpp (242): segfault_handler
00007FF9B4C585B6 (ntdll): (filename not available): RtlIsGenericTableEmpty
00007FF9B4C4A056 (ntdll): (filename not available): RtlRaiseException
00007FF9B4C7FE3E (ntdll): (filename not available): KiUserExceptionDispatcher
00007FF9410DD712 (avformat-58): (filename not available): avformat_close_input
c:\daten\nodejs\beamcode\node_modules\beamcoder\src\format.cc (3835): formatContextFinalizer
00007FF7A092A158 (node): (filename not available): node::Stop
00007FF7A0FFABD1 (node): (filename not available): v8::internal::GlobalHandles::InvokeSecondPassPhantomCallbacks
00007FF7A0FFAD7E (node): (filename not available): v8::internal::GlobalHandles::InvokeSecondPassPhantomCallbacksFromTask
00007FF7A089CBCC (node): (filename not available): v8::internal::wasm::JSToWasmWrapperCompilationUnit::~JSToWasmWrapperCompilationUnit
00007FF7A089BB51 (node): (filename not available): v8::internal::wasm::JSToWasmWrapperCompilationUnit::~JSToWasmWrapperCompilationUnit
00007FF7A099982B (node): (filename not available): uv_async_send
00007FF7A0998FCC (node): (filename not available): uv_loop_init
00007FF7A0999194 (node): (filename not available): uv_run
00007FF7A08B9B73 (node): (filename not available): EVP_CIPHER_CTX_buf_noconst
00007FF7A0917860 (node): (filename not available): node::Start
00007FF7A07C6ABC (node): (filename not available): RC4_options
00007FF7A161F068 (node): (filename not available): v8::internal::SetupIsolateDelegate::SetupHeap
00007FF9B3117BD4 (KERNEL32): (filename not available): BaseThreadInitThunk
00007FF9B4C4CE51 (ntdll): (filename not available): RtlUserThreadStart

@scriptorian
Contributor

It would really help if I could reproduce this here!
Is it crashing every time now or intermittently as previously?

@heleon19
Author

heleon19 commented Oct 20, 2020

I understand. Would it help if you had access to my camera?
I tried a couple of times; it always happened while the camera was rebooting.

@scriptorian
Contributor

Perhaps simpler for now if I can send you one or two patch cpp files with some extra debug - can you use them and rebuild beamcoder locally?

@heleon19
Author

I couldn't get it to build on Windows. I do have a Dockerfile to build it on a Raspberry Pi, but there I don't get a detailed error log.
With this Dockerfile I could clone from a test branch.

Which version of FFmpeg do I have to use? Currently I use 4.3.1.

Example with the current master branch:
{
time: 1603194169741,
data: <Buffer ff d8 ff e0 00 10 4a 46 49 46 00 02 01 00 00 01 00 01 00 00 ff db 00 84 00 06 04 05 06 05 04 06 06 05 06 07 07 06 08 0a 10 0a 0a 09 09 0a 14 0e 0f 0c ... 85877 more bytes>,
counter: 66
}
{
time: 1603194170358,
data: <Buffer ff d8 ff e0 00 10 4a 46 49 46 00 02 01 00 00 01 00 01 00 00 ff db 00 84 00 06 04 05 06 05 04 06 06 05 06 07 07 06 08 0a 10 0a 0a 09 09 0a 14 0e 0f 0c ... 85965 more bytes>,
counter: 67
}
no data timeout triggered
disconnect from rtsp://admin:admin@10.1.45.107/1
connect to rtsp://admin:admin@10.1.45.107/1
Segmentation fault (core dumped)

@scriptorian
Contributor

The JavaScript binding was keeping a stale pointer around after the force close, and this was causing trouble during garbage collection. I have added an indirection to handle this, and hopefully the latest version will behave better.

@heleon19
Author

heleon19 commented Oct 21, 2020

I'm very sorry, I still get an exception, but not every time.

PID 13752 received SIGSEGV for address: 0x402f1ce5
SymInit: Symbol-SearchPath: '.;C:\DATEN\nodejs\beamcode;C:\Program Files\nodejs;C:\WINDOWS;C:\WINDOWS\system32;SRVC:\websymbolshttp://msdl.microsoft.com/download/symbols;', symOptions: 530, UserName: 'rje'
OS-Version: 10.0.18362 () 0x100-0x1
c:\daten\nodejs\beamcode\node_modules\segfault-handler\src\stackwalker.cpp (924): StackWalker::ShowCallstack
c:\daten\nodejs\beamcode\node_modules\segfault-handler\src\segfault-handler.cpp (242): segfault_handler
00007FF9B4C585B6 (ntdll): (filename not available): RtlIsGenericTableEmpty
00007FF9B4C4A056 (ntdll): (filename not available): RtlRaiseException
00007FF9B4C7FE3E (ntdll): (filename not available): KiUserExceptionDispatcher
00007FF9402F1CE5 (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF9402F20E9 (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF9402F2D4A (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF940307204 (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF940309DDB (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF940334A7B (avformat-58): (filename not available): av_find_default_stream_index
00007FF940335D1B (avformat-58): (filename not available): av_find_default_stream_index
00007FF940336D58 (avformat-58): (filename not available): av_read_frame
c:\daten\nodejs\beamcode\node_modules\beamcoder\src\demux.cc (245): readFrameExecute
00007FF7A099D26E (node): (filename not available): uv_queue_work
00007FF7A098A7CD (node): (filename not available): uv_poll_stop
00007FF7A1658100 (node): (filename not available): v8::internal::SetupIsolateDelegate::SetupHeap
00007FF9B3117BD4 (KERNEL32): (filename not available): BaseThreadInitThunk
00007FF9B4C4CE51 (ntdll): (filename not available): RtlUserThreadStart

@scriptorian
Contributor

I think this one is a race condition. If it's now always crashing at demux.cc readFrameExecute, as here, then I'm fairly convinced.
You are calling the async function demuxer.read, and sometimes this might run (in a separate thread) after the context has been deleted. I was hesitant to add something like a mutex around this, but I'm not sure what you can do from JS without me doing that. Let me know what you think.

@heleon19
Author

I don't know how I could implement such a mutex around demuxer.read. I think my code handles reads one at a time, and there should only ever be one connection open. Do you think something in my test script is wrong?

Find attached my script:
index.txt

@scriptorian
Contributor

I wasn't clear enough: if a mutex is needed, I have to add it in beamcoder land.
The problem is that you are calling read (or, in fact, the call is being executed) after you have forceClose'd the demuxer. Your loop tests for connection being valid, but that doesn't protect against this problem. It should be possible for you to add some code so that, just before you call the stop or forceClose function, you ensure that no read is in flight or about to be called.
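
One way this could look on the JS side, as a rough sketch rather than a tested fix: wrap the demuxer so that closing first blocks new reads and then waits for any read already in flight. It assumes a single read loop, that `read()` returns a promise, and that `forceClose()` exists as described earlier in this thread; `GuardedDemuxer` is a hypothetical name.

```javascript
// Sketch: serialise forceClose() behind any read() that is still running.
class GuardedDemuxer {
  constructor(demuxer) {
    this.demuxer = demuxer;
    this.pending = null;  // promise of the read currently in flight, if any
    this.closed = false;
  }

  async read() {
    if (this.closed) return null;          // never start a read after close
    this.pending = this.demuxer.read();
    try {
      return await this.pending;
    } finally {
      this.pending = null;
    }
  }

  async close() {
    if (this.closed) return;
    this.closed = true;                    // block new reads immediately
    if (this.pending) {
      // Wait for the in-flight read to settle, ignoring its error here;
      // the caller of read() still sees the rejection.
      await this.pending.catch(() => {});
    }
    await this.demuxer.forceClose();
  }
}
```

The read loop would call `guarded.read()` and stop when it gets `null`, and the timeout handler would call `await guarded.close()` before opening a new demuxer. Note that if the underlying read never returns, `close()` waits as well, so this only helps when the read eventually completes or fails.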
