OLA is crashing on adding Artnet Universe #1726

Closed
freaknils opened this issue Jun 9, 2021 · 7 comments

Comments

@freaknils

Hi,

I have just built the current master of OLA, but when I add a new Art-Net universe via the web page, olad crashes:

common/io/IOUtils.cpp:39: open(/dev/dmx0): No such file or directory
plugins/opendmx/OpenDmxPlugin.cpp:81: Could not open /dev/dmx0 No such file or directory
common/io/IOUtils.cpp:39: open(/dev/kldmx0): No such file or directory
plugins/karate/KaratePlugin.cpp:80: Could not open /dev/kldmx0 No such file or directory
/usr/include/c++/11.1.0/bits/stl_queue.h:627: std::priority_queue<_Tp, _Sequence, _Compare>::const_reference std::priority_queue<_Tp, _Sequence, _Compare>::top() const [with _Tp = ola::io::TimeoutManager::Event*; _Sequence = std::vector<ola::io::TimeoutManager::Event*>; _Compare = ola::io::TimeoutManager::ltevent; std::priority_queue<_Tp, _Sequence, _Compare>::const_reference = ola::io::TimeoutManager::Event* const&]: Assertion '!this->empty()' failed.
Received Segmentation fault
/usr/lib/libolacommon.so.0(+0x990dc)[0x7f12105290dc]
/usr/lib/libc.so.6(+0x3cda0)[0x7f12100cfda0]
/usr/lib/libc.so.6(abort+0x1d5)[0x7f12100b9921]
/usr/lib/libolacommon.so.0(+0xaad3a)[0x7f121053ad3a]
/usr/lib/libolacommon.so.0(_ZN3ola2io14TimeoutManager15ExecuteTimeoutsEPNS_9TimeStampE+0x252)[0x7f1210542082]
/usr/lib/libolacommon.so.0(_ZN3ola2io12SelectPoller4PollEPNS0_14TimeoutManagerERKNS_12TimeIntervalE+0x229)[0x7f12105442b9]
/usr/lib/libolacommon.so.0(_ZN3ola2io12SelectServer14CheckForEventsERKNS_12TimeIntervalE+0xc6)[0x7f121053eeb6]
/usr/lib/libolacommon.so.0(_ZN3ola2io12SelectServer3RunEv+0x7c)[0x7f121053efcc]
/usr/lib/libolaserver.so.0(_ZN3ola4http10HTTPServer3RunEv+0x88)[0x7f1210716188]
/usr/lib/libolacommon.so.0(_ZN3ola6thread6Thread12_InternalRunEv+0xed)[0x7f12105c4c1d]
/usr/lib/libpthread.so.0(+0x9259)[0x7f120f936259]
/usr/lib/libc.so.6(clone+0x43)[0x7f12101915e3]

Distribution is:

[ola@a5ac37931198 ~]$ cat /etc/os-release 
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://bugs.archlinux.org/"
LOGO=archlinux

Kernel and architecture:

[ola@a5ac37931198 ~]$ uname -a
Linux a5ac37931198 5.12.9-arch1-1 #1 SMP PREEMPT Thu, 03 Jun 2021 11:36:13 +0000 x86_64 GNU/Linux

Are there any known issues or am I doing something wrong?
Thanks!

@peternewman
Member

Hmm, I can't reproduce, although on an older OS.

Some questions/requests:
Does make check pass?
Does it happen on the 0.10.8 release code?
Can you share olad -l 4 logs:
https://www.openlighting.org/ola/get-help/ola-faq/#How_do_I_get_olad_-l_4_logs
Which universe were you trying to patch it to, input or output? Was anything else already patched to the universe?
New or old web UI?
Does it happen using the ola_patch CLI tool?
What was the universe name?
Can you reproduce if you try and patch the dummy plugin instead?
Which compiler and version are you using?

@freaknils
Author

Hi,

I am sorry; after installing python-numpy and rebuilding, OLA is working nicely now.

Thanks!

@peternewman
Member

I'm glad you fixed it @freaknils, but that's still not good. Does it reliably break without python-numpy? That shouldn't affect it if you're using our main web UI, but perhaps we have a bug.

Or does it just randomly fail sometimes if you do a make clean and rebuild a few times?

@cbix

cbix commented Jun 13, 2022

I could reproduce this reliably with both the old and the new UI on Arch Linux after someone reported it on the AUR. I created that package in the first place just to have libola available for building another tool, so I always ignored the failing tests and have never used OLA in production (yet).

gcc 12.1.0

Test suite summary
PASS: common/base/CredentialsTester
PASS: common/base/FlagsTester
PASS: common/base/LoggingTester
PASS: common/dmx/RunLengthEncoderTester
PASS: common/export_map/ExportMapTester
PASS: common/file/UtilTester
PASS: common/io/DescriptorTester
PASS: common/io/IOQueueTester
PASS: common/io/IOStackTester
PASS: common/io/MemoryBlockTester
./config/test-driver: line 112: 344003 Aborted                 (core dumped) "$@" >> "$log_file" 2>&1
FAIL: common/io/SelectServerTester
PASS: common/io/StreamTester
./config/test-driver: line 112: 344070 Aborted                 (core dumped) "$@" >> "$log_file" 2>&1
FAIL: common/io/TimeoutManagerTester
PASS: common/messaging/DescriptorTester
PASS: common/network/HealthCheckedConnectionTester
PASS: common/network/NetworkTester
PASS: common/network/TCPConnectorTester
PASS: common/rdm/DiscoveryAgentTester
PASS: common/rdm/PidStoreTester
PASS: common/rdm/QueueingRDMControllerTester
PASS: common/rdm/RDMAPITester
PASS: common/rdm/RDMCommandSerializerTester
PASS: common/rdm/RDMCommandTester
PASS: common/rdm/RDMFrameTester
PASS: common/rdm/RDMHelperTester
PASS: common/rdm/RDMMessageTester
PASS: common/rdm/RDMReplyTester
PASS: common/rdm/UIDAllocatorTester
PASS: common/rdm/UIDTester
PASS: common/rpc/RpcTester
PASS: common/rpc/RpcServerTester
PASS: common/strings/UtilsTester
PASS: common/thread/ExecutorThreadTester
PASS: common/thread/ThreadTester
PASS: common/thread/FutureTester
PASS: common/timecode/TimeCodeTester
PASS: common/utils/UtilsTester
PASS: common/web/JsonTester
PASS: common/web/ParserTester
PASS: common/web/PtchParserTester
PASS: common/web/PtchTester
PASS: common/web/PointerTester
PASS: common/web/PointerTrackerTester
PASS: common/web/SchemaParserTester
PASS: common/web/SchemaTester
PASS: common/web/SectionsTester
PASS: data/rdm/PidDataTester
PASS: libs/acn/E131Tester
PASS: libs/acn/E133Tester
PASS: libs/acn/TransportTester
PASS: libs/usb/LibUsbThreadTester
PASS: ola/OlaClientTester
PASS: olad/plugin_api/ClientTester
PASS: olad/plugin_api/DeviceTester
PASS: olad/plugin_api/DmxSourceTester
PASS: olad/plugin_api/PortTester
PASS: olad/plugin_api/PreferencesTester
PASS: olad/plugin_api/UniverseTester
./config/test-driver: line 112: 345501 Aborted                 (core dumped) "$@" >> "$log_file" 2>&1
FAIL: plugins/artnet/ArtNetTester
PASS: plugins/dummy/DummyPluginTester
PASS: plugins/espnet/EspNetTester
PASS: plugins/kinet/KiNetTester
PASS: plugins/openpixelcontrol/OPCClientTester
PASS: plugins/openpixelcontrol/OPCServerTester
PASS: plugins/osc/OSCTester
PASS: plugins/shownet/ShowNetTester
PASS: plugins/spi/SPITester
PASS: plugins/usbpro/ArduinoWidgetTester
PASS: plugins/usbpro/BaseRobeWidgetTester
PASS: plugins/usbpro/BaseUsbProWidgetTester
PASS: plugins/usbpro/DmxTriWidgetTester
./config/test-driver: line 112: 345880 Aborted                 (core dumped) "$@" >> "$log_file" 2>&1
FAIL: plugins/usbpro/DmxterWidgetTester
PASS: plugins/usbpro/EnttecUsbProWidgetTester
PASS: plugins/usbpro/RobeWidgetDetectorTester
PASS: plugins/usbpro/RobeWidgetTester
PASS: plugins/usbpro/UltraDMXProWidgetTester
PASS: plugins/usbpro/UsbProWidgetDetectorTester
./config/test-driver: line 112: 346087 Aborted                 (core dumped) "$@" >> "$log_file" 2>&1
FAIL: plugins/usbpro/WidgetDetectorThreadTester
PASS: olad/OlaTester
PASS: tools/ola_trigger/ActionTester
echo "PYTHONPATH=./python PIDDATA=./data/rdm /usr/bin/python ./data/rdm/PidDataTest.py; exit \$?" > data/rdm/PidDataTest.sh
chmod +x data/rdm/PidDataTest.sh
PASS: data/rdm/PidDataTest.sh
echo "for FILE in ./examples/testdata/dos_line_endings ./examples/testdata/multiple_unis ./examples/testdata/partial_frames ./examples/testdata/single_uni ./examples/testdata/trailing_timeout; do echo \"Checking \$FILE\"; ./examples/ola_recorder --verify \$FILE; STATUS=\$?; if [ \$STATUS -ne 0 ]; then echo \"FAIL: \$FILE caused ola_recorder to exit with status \$STATUS\"; exit \$STATUS; fi; done; exit 0" > examples/RecorderVerifyTest.sh
chmod +x examples/RecorderVerifyTest.sh
PASS: examples/RecorderVerifyTest.sh
mkdir -p ./python/ola/rpc
echo "PYTHONPATH=./python /usr/bin/python ./python/ola/rpc/SimpleRpcControllerTest.py; exit \$?" > ./python/ola/rpc/SimpleRpcControllerTest.sh
chmod +x ./python/ola/rpc/SimpleRpcControllerTest.sh
PASS: python/ola/rpc/SimpleRpcControllerTest.sh
PASS: python/ola/DUBDecoderTest.py
mkdir -p ./python/ola
echo "PYTHONPATH=./python /usr/bin/python ./python/ola/ClientWrapperTest.py; exit \$?" > ./python/ola/ClientWrapperTest.sh
chmod +x ./python/ola/ClientWrapperTest.sh
FAIL: python/ola/ClientWrapperTest.sh
PASS: python/ola/MACAddressTest.py
mkdir -p ./python/ola
echo "PYTHONPATH=./python /usr/bin/python ./python/ola/OlaClientTest.py; exit \$?" > ./python/ola/OlaClientTest.sh
chmod +x ./python/ola/OlaClientTest.sh
PASS: python/ola/OlaClientTest.sh
mkdir -p ./python/ola
echo "PYTHONPATH=./python TESTDATADIR=./common/rdm/testdata /usr/bin/python ./python/ola/PidStoreTest.py; exit \$?" > ./python/ola/PidStoreTest.sh
chmod +x ./python/ola/PidStoreTest.sh
FAIL: python/ola/PidStoreTest.sh
mkdir -p ./python/ola
echo "PYTHONPATH=./python PIDSTOREDIR=./data/rdm /usr/bin/python ./python/ola/RDMTest.py; exit \$?" > ./python/ola/RDMTest.sh
chmod +x ./python/ola/RDMTest.sh
FAIL: python/ola/RDMTest.sh
PASS: python/ola/UIDTest.py
mkdir -p ./python
echo "/usr/bin/python -m compileall data include python scripts tools; exit \$?" > ./python/PyCompileTest.sh
chmod +x ./python/PyCompileTest.sh
PASS: python/PyCompileTest.sh
echo "for FILE in ./tools/ola_trigger/example.conf ./tools/ola_trigger/test_file.conf ./tools/ola_trigger/test_file_falling.conf ./tools/ola_trigger/test_file_rising.conf ./tools/ola_trigger/contrib/mac_volume.conf ./tools/ola_trigger/contrib/mac_itunes.conf ./tools/ola_trigger/contrib/philips_hue_osram_lightify.conf; do echo \"Checking \$FILE\"; ./tools/ola_trigger/ola_trigger --validate \$FILE; STATUS=\$?; if [ \$STATUS -ne 0 ]; then echo \"FAIL: \$FILE caused ola_trigger to exit with status \$STATUS\"; exit \$STATUS; fi; done; exit 0" > ./tools/ola_trigger/FileValidateTest.sh
chmod +x ./tools/ola_trigger/FileValidateTest.sh
PASS: tools/ola_trigger/FileValidateTest.sh
mkdir -p ./python/ola
echo "PYTHONPATH=./python /usr/bin/python ./tools/rdm/ResponderTestTest.py; exit \$?" > ./tools/rdm/ResponderTestTest.sh
chmod +x ./tools/rdm/ResponderTestTest.sh
PASS: tools/rdm/ResponderTestTest.sh
PASS: tools/rdm/TestStateTest.py
============================================================================
Testsuite summary for OLA 0.10.8
============================================================================
# TOTAL: 94
# PASS:  86
# SKIP:  0
# XFAIL: 0
# FAIL:  8
# XPASS: 0
# ERROR: 0
============================================================================
See ./test-suite.log
Please report to open-lighting@googlegroups.com
============================================================================

test-suite.log

olad -l 4 in gdb with debug symbols, adding a new universe with the dummy device:
olad/plugin_api/PortManager.cpp:119: Patched 1-1-O-0 to universe 0
common/io/EPoller.cpp:306: ss process time was 0.000001
common/io/SelectPoller.cpp:233: ss process time was 0.000009
/usr/include/c++/12.1.0/bits/stl_queue.h:725: std::priority_queue<_Tp, _Sequence, _Compare>::const_reference std::priority_queue<_Tp, _Sequence, _Compare>::top() const [with _Tp = ola::io::TimeoutManager::Event*; _Sequence = std::vector<ola::io::TimeoutManager::Event*>; _Compare = ola::io::TimeoutManager::ltevent; const_reference = ola::io::TimeoutManager::Event* const&]: Assertion '!this->empty()' failed.
common/io/EPoller.cpp:306: ss process time was 0.000001

Thread 4 "http" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff54bd640 (LWP 406538)]
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
(gdb) bt full
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
tid = <optimized out>
ret = 0
pd = <optimized out>
old_mask = {__val = {93824993149072, 13, 140737152811056, 18446744073709551512, 140737152893120, 140737152973280, 140737308770208, 17466721297838135808,
140737308770208, 18446744073709551456, 0, 140737308770208, 140737308770288, 140737308770272, 140737308770208, 140737344293107}}
ret = <optimized out>
#1 0x00007ffff768e3d3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
No locals.
#2 0x00007ffff763e838 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
ret = <optimized out>
#3 0x00007ffff7628535 in __GI_abort () at abort.c:79
save_stage = 1
act = {__sigaction_handler = {sa_handler = 0x7fff00000000, sa_sigaction = 0x7fff00000000}, sa_mask = {__val = {140737352278072, 725, 140737352281440, 3,
140737152897616, 140737308770288, 17466721297838135808, 7020109268283581812, 18446744073709551456, 140737353894288, 93824992460552, 140737308770896,
140737352064062, 93824992460552, 140737352064494, 93824992460552}}, sa_flags = -136290758, sa_restorer = 0x107c0}
sigs = {__val = {32, 17466721297838135808, 140737152882320, 17466721297838135808, 93824993016032, 93824993016088, 140737308770896, 0, 93824993016088,
140737344830131, 206158430256, 140737308770408, 140737308770208, 17466721297838135808, 140737353603680, 0}}
#4 0x00007ffff7ad2002 in std::__glibcxx_assert_fail (file=file@entry=0x7ffff7e3a038 "/usr/include/c++/12.1.0/bits/stl_queue.h", line=line@entry=725,
function=function@entry=0x7ffff7e3ad60 "std::priority_queue<_Tp, _Sequence, _Compare>::const_reference std::priority_queue<_Tp, _Sequence, _Compare>::top() const [with _Tp = ola::io::TimeoutManager::Event*; _Sequence = std::vector<ola::io::"..., condition=condition@entry=0x7ffff7e39f38 "!this->empty()")
at /usr/src/debug/gcc/libstdc++-v3/src/c++11/debug.cc:60
No locals.
#5 0x00007ffff7d86f06 in std::priority_queue<ola::io::TimeoutManager::Event*, std::vector<ola::io::TimeoutManager::Event*, std::allocator<ola::io::TimeoutManager::Event*> >, ola::io::TimeoutManager::ltevent>::top (this=<optimized out>) at /usr/include/c++/12.1.0/bits/stl_queue.h:723
PRETTY_FUNCTION = <optimized out>
#6 std::priority_queue<ola::io::TimeoutManager::Event*, std::vector<ola::io::TimeoutManager::Event*, std::allocator<ola::io::TimeoutManager::Event*> >, ola::io::TimeoutManager::ltevent>::top (this=<optimized out>) at /usr/include/c++/12.1.0/bits/stl_queue.h:723
PRETTY_FUNCTION = <optimized out>
#7 ola::io::TimeoutManager::ExecuteTimeouts (this=this@entry=0x5555556138e0, now=now@entry=0x55555558bf08) at common/io/TimeoutManager.cpp:101
e = 0x7fffec014e40
#8 0x00007ffff7d88e6a in ola::io::SelectPoller::Poll (this=0x55555558bee0, timeout_manager=0x5555556138e0, poll_interval=...) at common/io/SelectPoller.cpp:264
maxsd = 38
r_fds = {fds_bits = {32768, 0 <repeats 15 times>}}
w_fds = {fds_bits = {0 <repeats 16 times>}}
now = {m_tv = {m_tv = {tv_sec = 67520, tv_usec = 226308}}}
sleep_interval = {m_interval = {m_tv = {tv_sec = 60, tv_usec = 0}}}
tv = {tv_sec = 59, tv_usec = 999997}
next_event_in = {m_interval = {m_tv = {tv_sec = 0, tv_usec = 0}}}
closed_descriptors = false
#9 0x00007ffff7d83d16 in ola::io::SelectServer::CheckForEvents (this=this@entry=0x55555558f8a0, poll_interval=...) at /usr/include/c++/12.1.0/backward/auto_ptr.h:213
loop_iter = <optimized out>
default_poll_interval = {m_interval = {m_tv = {tv_sec = 60, tv_usec = 0}}}
#10 0x00007ffff7d83e1c in ola::io::SelectServer::Run (this=0x55555558f8a0) at common/io/SelectServer.cpp:136
No locals.
#11 0x00007ffff7f4bbf8 in ola::http::HTTPServer::Run (this=0x555555590938) at /usr/include/c++/12.1.0/backward/auto_ptr.h:198
iter = <optimized out>
#12 0x00007ffff7e03b72 in ola::thread::Thread::_InternalRun (this=0x555555590938) at common/thread/Thread.cpp:207
truncated_name = "http"
policy = 0
param = {sched_priority = 0}
#13 0x00007ffff768c54d in start_thread (arg=<optimized out>) at pthread_create.c:442
ret = <optimized out>
pd = <optimized out>
out = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140737308776000, -6662648430950109208, 140737488342830, 0, 140737488342831, 140737300385792, 6662633760785674216,
6662629645993488360}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#14 0x00007ffff7711874 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
No locals.

You can find the build script here. Please let me know if you have any hints on what might be missing from the package script to get the tests to pass and this crash to disappear, thanks :)

(edit: bt full for better context; terminal formatting)

@cbix

cbix commented Jun 17, 2022

I did some further debugging. At first I thought the two issues were different, since @freaknils got a segfault while I got an abort, but the same assertion fails in both cases, so I'm sure they're related.

What solved the crash (and thus 5 of the 8 failing tests) was disabling the libstdc++ assertions (Arch Linux's makepkg.conf enables them by default) by setting CXXFLAGS=${CXXFLAGS/-Wp,-D_GLIBCXX_ASSERTIONS} in the package build script.

However, I believe enabling assertions should not break software like this, and the root cause should still be fixed. @peternewman, does this information help you?
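
For context, the failing assertion ('!this->empty()' in std::priority_queue::top()) fires whenever top() is called on an empty queue. The following is only a minimal sketch of a loop that avoids that, assuming a hypothetical Event struct, LaterFirst comparator and RunDueEvents function. It is not OLA's TimeoutManager code and not necessarily the change made in #1753.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-in for a scheduled timeout; not OLA's real Event class.
struct Event {
  long due;  // time at which the event should fire
};

// Comparator that keeps the event with the smallest due time on top.
struct LaterFirst {
  bool operator()(const Event *a, const Event *b) const {
    return a->due > b->due;
  }
};

using EventQueue =
    std::priority_queue<Event*, std::vector<Event*>, LaterFirst>;

// Pops every event due at or before `now` and returns how many were popped.
// empty() is evaluated before each top(), so top() never sees an empty queue,
// even right after the last element has been popped.
std::size_t RunDueEvents(EventQueue *events, long now) {
  assert(events != nullptr);
  std::size_t executed = 0;
  while (!events->empty() && events->top()->due <= now) {
    events->pop();
    ++executed;
  }
  return executed;
}
```

With this shape, calling RunDueEvents on a queue that is already empty, or that becomes empty during the loop, returns normally instead of reaching top(), regardless of whether _GLIBCXX_ASSERTIONS is defined.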

The remaining 3 failing tests are Python-related (test-suite.log), so I would open a separate issue for those.

Update: #1753 fixed the crash and #1757 the failing tests for me.

@peternewman
Member

Glad you fixed your issues too @cbix! I assume that after that you were able to re-enable the assertions?

@cbix

cbix commented Nov 27, 2022

@peternewman Yes, exactly: no more crash with assertions enabled.
