Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UdevWrapperDeathTest.DereferencingEndReturnsInvalidObject getting stuck in CI #2788

Open
Saviq opened this issue Jan 16, 2023 · 8 comments
Open
Labels

Comments

@Saviq
Copy link
Collaborator

Saviq commented Jan 16, 2023

I've seen this a handful of times in CI:

234/246 Test #234: mir_umock_unit_tests.UdevWrapperDeathTest.* ..............................................***Timeout 600.05 sec
LD_LIBRARY_PATH=/spread/mir/spread/build/build/bin/../lib/
Running main() from main.cpp
Note: Google Test filter = UdevWrapperDeathTest.*:-:--
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from UdevWrapperDeathTest
[ RUN ] UdevWrapperDeathTest.DereferencingEndReturnsInvalidObject

E.g. https://github.com/MirServer/mir/actions/runs/3929280014/jobs/6718106597#step:8:3667

@Saviq Saviq added the bug label Jan 16, 2023
@Saviq
Copy link
Collaborator Author

Saviq commented Jan 16, 2023

Looks like alpine is particularly prone to this.

@AlanGriffiths
Copy link
Contributor

@Saviq
Copy link
Collaborator Author

Saviq commented Jan 20, 2023

I managed to catch it in the act:

Id   Target Id                    Frame 
* 1    LWP 667359 "mir_umock_unit_" __cp_end () at src/thread/x86_64/syscall_cp.s:29
  2    LWP 667378 "mir_umock_unit_" 0x00007f9c776eb9e9 in __syscall6 (a6=<optimized out>, a5=<optimized out>, a4=<optimized out>, a3=<optimized out>, a2=<optimized out>, a1=<optimized out>, 
    n=<optimized out>) at ./arch/x86_64/syscall_arch.h:59
  3    LWP 667379 "mir_umock_unit_" 0x00007f9c776eb9e9 in __syscall6 (a6=<optimized out>, a5=<optimized out>, a4=<optimized out>, a3=<optimized out>, a2=<optimized out>, a1=<optimized out>, 
    n=<optimized out>) at ./arch/x86_64/syscall_arch.h:59
  4    LWP 667383 "mir_umock_unit_" __cp_end () at src/thread/x86_64/syscall_cp.s:29
  5    LWP 667384 "gmain"           __cp_end () at src/thread/x86_64/syscall_cp.s:29

Thread 5 (LWP 667384 "gmain"):
#0  __cp_end () at src/thread/x86_64/syscall_cp.s:29
#1  0x00007f9c7770872d in __syscall_cp_c (nr=7, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, y=<optimized out>, z=0) at src/thread/pthread_cancel.c:33
#2  0x00007f9c776fc23e in poll (fds=<optimized out>, n=<optimized out>, timeout=<optimized out>) at src/select/poll.c:9
#3  0x00007f9c77557fb9 in  () at /usr/lib/libglib-2.0.so.0
#4  0x00007f9c775580c5 in g_main_context_iteration () at /usr/lib/libglib-2.0.so.0
#5  0x00007f9c775580f6 in  () at /usr/lib/libglib-2.0.so.0
#6  0x00007f9c77575fd6 in  () at /usr/lib/libglib-2.0.so.0
#7  0x00007f9c77709221 in start (p=0x7f9c767c59d0) at src/thread/pthread_create.c:203
#8  0x00007f9c7770b3e0 in __clone () at src/thread/x86_64/clone.s:22

Thread 4 (LWP 667383 "mir_umock_unit_"):
#0  __cp_end () at src/thread/x86_64/syscall_cp.s:29
#1  0x00007f9c7770872d in __syscall_cp_c (nr=7, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, y=<optimized out>, z=0) at src/thread/pthread_cancel.c:33
#2  0x00007f9c776fc23e in poll (fds=<optimized out>, n=<optimized out>, timeout=<optimized out>) at src/select/poll.c:9
#3  0x00007f9c77557fb9 in  () at /usr/lib/libglib-2.0.so.0
#4  0x00007f9c77558353 in g_main_loop_run () at /usr/lib/libglib-2.0.so.0
#5  0x00007f9c77674a63 in  () at /usr/lib/libumockdev.so.0
#6  0x00007f9c77575fd6 in  () at /usr/lib/libglib-2.0.so.0
#7  0x00007f9c77709221 in start (p=0x7f9c767e99d0) at src/thread/pthread_create.c:203
#8  0x00007f9c7770b3e0 in __clone () at src/thread/x86_64/clone.s:22

Thread 3 (LWP 667379 "mir_umock_unit_"):
#0  0x00007f9c776eb9e9 in __syscall6 (a6=<optimized out>, a5=<optimized out>, a4=<optimized out>, a3=<optimized out>, a2=<optimized out>, a1=<optimized out>, n=<optimized out>) at ./arch/x86_64/syscall_arch.h:59
#1  syscall (n=<optimized out>) at src/misc/syscall.c:20
#2  0x00007f9c76fc09fb in  () at /usr/lib/liblttng-ust.so.0
#3  0x00007f9c77709221 in start (p=0x7f9c768329d0) at src/thread/pthread_create.c:203
#4  0x00007f9c7770b3e0 in __clone () at src/thread/x86_64/clone.s:22

Thread 2 (LWP 667378 "mir_umock_unit_"):
#0  0x00007f9c776eb9e9 in __syscall6 (a6=<optimized out>, a5=<optimized out>, a4=<optimized out>, a3=<optimized out>, a2=<optimized out>, a1=<optimized out>, n=<optimized out>) at ./arch/x86_64/syscall_arch.h:59
#1  syscall (n=<optimized out>) at src/misc/syscall.c:20
#2  0x00007f9c76fc09fb in  () at /usr/lib/liblttng-ust.so.0
#3  0x00007f9c77709221 in start (p=0x7f9c768559d0) at src/thread/pthread_create.c:203
#4  0x00007f9c7770b3e0 in __clone () at src/thread/x86_64/clone.s:22

Thread 1 (LWP 667359 "mir_umock_unit_"):
#0  __cp_end () at src/thread/x86_64/syscall_cp.s:29
#1  0x00007f9c7770872d in __syscall_cp_c (nr=0, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, y=<optimized out>, z=0) at src/thread/pthread_cancel.c:33
#2  0x00007f9c7770e726 in read (fd=<optimized out>, buf=<optimized out>, count=<optimized out>) at src/unistd/read.c:6
#3  0x00007f9c776a24e1 in read () at /usr/lib/libumockdev-preload.so.0
#4  0x00007f9c76eabacc in testing::internal::DeathTestImpl::ReadAndInterpretStatusByte() () at /usr/lib/libgtest.so.1.11.0
#5  0x00007f9c76eac13f in testing::internal::ForkingDeathTest::Wait() () at /usr/lib/libgtest.so.1.11.0
#6  0x0000562057e9e65e in UdevWrapperDeathTest_DereferencingEndReturnsInvalidObject_Test::TestBody() (this=0x7f9c76cdbdf0) at /spread/mir/tests/unit-tests/test_udev_wrapper.cpp:329
#7  0x00007f9c76ebe04d in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) () at /usr/lib/libgtest.so.1.11.0
#8  0x00007f9c76eb1d57 in testing::Test::Run() () at /usr/lib/libgtest.so.1.11.0
#9  0x00007f9c76eb1e42 in testing::TestInfo::Run() () at /usr/lib/libgtest.so.1.11.0
#10 0x00007f9c76eb217f in testing::TestSuite::Run() () at /usr/lib/libgtest.so.1.11.0
#11 0x00007f9c76eb6f12 in testing::internal::UnitTestImpl::RunAllTests() () at /usr/lib/libgtest.so.1.11.0
#12 0x00007f9c76ebe51c in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) () at /usr/lib/libgtest.so.1.11.0
#13 0x00007f9c76eb1f42 in testing::UnitTest::Run() () at /usr/lib/libgtest.so.1.11.0
#14 0x00005620583ab16b in RUN_ALL_TESTS() () at /usr/include/gtest/gtest.h:2490
#15 0x00005620583ab0d1 in mir_test_framework::main(int, char**) (argc=1, argv=0x7fff5bab3028) at /spread/mir/tests/mir_test_framework/command_line_server_configuration.cpp:59
#16 0x00005620583a6c51 in main(int, char**) (argc=2, argv=0x7fff5bab3028) at /spread/mir/tests/mir_test_framework/main.cpp:30
#0  __cp_end () at src/thread/x86_64/syscall_cp.s:29
#1  0x00007f9c7770872d in __syscall_cp_c (nr=0, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, y=<optimized out>, z=0) at src/thread/pthread_cancel.c:33
        self = 0x7f9c77512928
        r = <optimized out>
        st = <optimized out>
#2  0x00007f9c7770e726 in read (fd=<optimized out>, buf=<optimized out>, count=<optimized out>) at src/unistd/read.c:6
#3  0x00007f9c776a24e1 in read () at /usr/lib/libumockdev-preload.so.0
#4  0x00007f9c76eabacc in testing::internal::DeathTestImpl::ReadAndInterpretStatusByte() () at /usr/lib/libgtest.so.1.11.0
#5  0x00007f9c76eac13f in testing::internal::ForkingDeathTest::Wait() () at /usr/lib/libgtest.so.1.11.0
#6  0x0000562057e9e65e in UdevWrapperDeathTest_DereferencingEndReturnsInvalidObject_Test::TestBody() (this=0x7f9c76cdbdf0) at /spread/mir/tests/unit-tests/test_udev_wrapper.cpp:329
        gtest_dt_ptr = 
                    {_M_t = {<std::__uniq_ptr_impl<testing::internal::DeathTest, std::default_delete<testing::internal::DeathTest> >> = {_M_t = {<std::_Tuple_impl<0, testing::internal::DeathTest*, std::default_delete<testing::internal::DeathTest> >> = {<std::_Tuple_impl<1, std::default_delete<testing::internal::DeathTest> >> = {<std::_Head_base<1, std::default_delete<testing::internal::DeathTest>, true>> = {<std::default_delete<testing::internal::DeathTest>> = {<No data fields>}, <No data fields>}, <No data fields>}, <std::_Head_base<0, testing::internal::DeathTest*, false>> = {_M_head_impl = 0x7f9c76c069d0}, <No data fields>}, <No data fields>}}, <No data fields>}}
        gtest_dt = 0x7f9c76c069d0
        devices = 
          {ctx = {<std::__shared_ptr<mir::udev::Context, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<mir::udev::Context, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x7f9c76c76c60, _M_refcount = {_M_pi = 0x7f9c76c76c50}}, <No data fields>}, enumerator = 0x7f9c767a29f0, scanned = true}
        iter = 
                {ctx = {<std::__shared_ptr<mir::udev::Context, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<mir::udev::Context, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x0, _M_refcount = {_M_pi = 0x562057fa2d00 <mir_test_framework::UdevEnvironment::UdevEnvironment()+126>}}, <No data fields>}, entry = 0x7f9c767f2030, current = {<std::__shared_ptr<mir::udev::Device, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<mir::udev::Device, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x7f9c76cdbe00, _M_refcount = {_M_pi = 0x7fff5bab2c70}}, <No data fields>}}
#7  0x00007f9c76ebe04d in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) () at /usr/lib/libgtest.so.1.11.0
#8  0x00007f9c76eb1d57 in testing::Test::Run() () at /usr/lib/libgtest.so.1.11.0
#9  0x00007f9c76eb1e42 in testing::TestInfo::Run() () at /usr/lib/libgtest.so.1.11.0
#10 0x00007f9c76eb217f in testing::TestSuite::Run() () at /usr/lib/libgtest.so.1.11.0
#11 0x00007f9c76eb6f12 in testing::internal::UnitTestImpl::RunAllTests() () at /usr/lib/libgtest.so.1.11.0
#12 0x00007f9c76ebe51c in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) () at /usr/lib/libgtest.so.1.11.0
#13 0x00007f9c76eb1f42 in testing::UnitTest::Run() () at /usr/lib/libgtest.so.1.11.0
#14 0x00005620583ab16b in RUN_ALL_TESTS() () at /usr/include/gtest/gtest.h:2490
#15 0x00005620583ab0d1 in mir_test_framework::main(int, char**) (argc=1, argv=0x7fff5bab3028) at /spread/mir/tests/mir_test_framework/command_line_server_configuration.cpp:59
#16 0x00005620583a6c51 in main(int, char**) (argc=2, argv=0x7fff5bab3028) at /spread/mir/tests/mir_test_framework/main.cpp:30
        path = "/spread/mir/tests/mir_test_framework/main.cpp"

I expect #2797 to avoid us this, but leaving for posterity.

@Saviq
Copy link
Collaborator Author

Saviq commented Jan 20, 2023

I expect #2797 to avoid us this, but leaving for posterity.

Spoke too soon…

@Saviq Saviq self-assigned this Jan 20, 2023
@Saviq
Copy link
Collaborator Author

Saviq commented Jan 23, 2023

And I reproduced on Alpine/edge, too, so there's something in the water there.

@Saviq
Copy link
Collaborator Author

Saviq commented Jan 23, 2023

A bisect took me to c071dcd, but since it's not an exact science, I'm running it again to see if I get the same…

@Saviq
Copy link
Collaborator Author

Saviq commented Jan 24, 2023

OK I'm climbing out of this rabbit hole, maybe someone else has better luck. Since things are forking, here's the trace of the other process when it's stuck:

Thread 1 (process 1197481 "mir_umock_unit_"):
#0  __syscall_cp_c (nr=202, u=139926175202144, v=128, w=-2147483647, x=0, y=0, z=0) at ./arch/x86_64/syscall_arch.h:61
#1  0x00007f4319f37ba9 in __futex4_cp (to=0x0, val=-2147483647, op=128, addr=0x7f4319f7bb60 <lock>) at src/thread/__timedwait.c:24
#2  __timedwait_cp (addr=addr@entry=0x7f4319f7bb60 <lock>, val=val@entry=-2147483647, clk=clk@entry=0, at=at@entry=0x0, priv=priv@entry=128) at src/thread/__timedwait.c:52
#3  0x00007f4319f37c4e in __timedwait (addr=addr@entry=0x7f4319f7bb60 <lock>, val=-2147483647, clk=clk@entry=0, at=at@entry=0x0, priv=128) at src/thread/__timedwait.c:68
#4  0x00007f4319f3a482 in __pthread_rwlock_timedwrlock (at=<optimized out>, rw=<optimized out>) at src/thread/pthread_rwlock_timedwrlock.c:18
#5  __pthread_rwlock_timedwrlock (rw=rw@entry=0x7f4319f7bb60 <lock>, at=at@entry=0x0) at src/thread/pthread_rwlock_timedwrlock.c:3
#6  0x00007f4319f3a551 in __pthread_rwlock_wrlock (rw=rw@entry=0x7f4319f7bb60 <lock>) at src/thread/pthread_rwlock_wrlock.c:5
#7  0x00007f4319f37d4d in __inhibit_ptc () at src/thread/lock_ptc.c:7
#8  0x00007f4319f41ac9 in dlopen (file=0x7f4319822785 "liblttng-ust.so.1", mode=4097) at ldso/dynlink.c:2060
#9  0x00007f43197e4d18 in  () at /usr/lib/liblttng-ust.so.1
#10 0x00007f4319ec5262 in  () at /usr/lib/liblttng-ust-fork.so
#11 0x00007f4319f3b38e in __clone () at src/thread/x86_64/clone.s:22
(gdb) bt full
#0  __syscall_cp_c (nr=202, u=139926175202144, v=128, w=-2147483647, x=0, y=0, z=0) at ./arch/x86_64/syscall_arch.h:61
        self = 0x7f4318fe99c0
        r = <optimized out>
        st = <optimized out>
#1  0x00007f4319f37ba9 in __futex4_cp (to=0x0, val=-2147483647, op=128, addr=0x7f4319f7bb60 <lock>) at src/thread/__timedwait.c:24
        r = <optimized out>
        r = <optimized out>
        to = {tv_sec = 139926157842936, tv_nsec = 139926158875072}
        top = 0x0
#2  __timedwait_cp (addr=addr@entry=0x7f4319f7bb60 <lock>, val=val@entry=-2147483647, clk=clk@entry=0, at=at@entry=0x0, priv=priv@entry=128) at src/thread/__timedwait.c:52
        r = <optimized out>
        to = {tv_sec = 139926157842936, tv_nsec = 139926158875072}
        top = 0x0
#3  0x00007f4319f37c4e in __timedwait (addr=addr@entry=0x7f4319f7bb60 <lock>, val=-2147483647, clk=clk@entry=0, at=at@entry=0x0, priv=128) at src/thread/__timedwait.c:68
        cs = 1
        r = <optimized out>
#4  0x00007f4319f3a482 in __pthread_rwlock_timedwrlock (at=<optimized out>, rw=<optimized out>) at src/thread/pthread_rwlock_timedwrlock.c:18
        r = <optimized out>
        t = <optimized out>
        spins = <optimized out>
        r = <optimized out>
#5  __pthread_rwlock_timedwrlock (rw=rw@entry=0x7f4319f7bb60 <lock>, at=at@entry=0x0) at src/thread/pthread_rwlock_timedwrlock.c:3
        r = <optimized out>
#6  0x00007f4319f3a551 in __pthread_rwlock_wrlock (rw=rw@entry=0x7f4319f7bb60 <lock>) at src/thread/pthread_rwlock_wrlock.c:5
#7  0x00007f4319f37d4d in __inhibit_ptc () at src/thread/lock_ptc.c:7
#8  0x00007f4319f41ac9 in dlopen (file=0x7f4319822785 "liblttng-ust.so.1", mode=4097) at ldso/dynlink.c:2060
        p = 0x7f4319f0228e <__dl_vseterr+192>
        orig_tail = <optimized out>
        orig_syms_tail = <optimized out>
        orig_lazy_head = <optimized out>
        next = <optimized out>
        orig_tls_tail = <optimized out>
        orig_tls_cnt = <optimized out>
        orig_tls_offset = <optimized out>
        orig_tls_align = <optimized out>
        cs = 0

                      jb = {{__jb = {9, 0, 4503599627370496, 140735060001768, 139926167254559, 0, 0, 0}, __fl = 0, __ss = {0, 18446744058676903679, 0, 18446744058676903679, 0, 0, 0, 139926174933923, 139926167496251, 1678884596, 0, 139926174967397, 112668034534998016, 4503599627370496, 0, 0}}}
        ctor_queue = 0x0
        constructed = <optimized out>
#9  0x00007f43197e4d18 in  () at /usr/lib/liblttng-ust.so.1
#10 0x00007f4319ec5262 in  () at /usr/lib/liblttng-ust-fork.so
#11 0x00007f4319f3b38e in __clone () at src/thread/x86_64/clone.s:22

@Saviq Saviq removed their assignment Jan 24, 2023
@AlanGriffiths
Copy link
Contributor

Probably the same underlying cause (but UdevWrapperDeathTest.MemberDereferenceOfEndDies getting stuck)

235/247 Test #235: mir_umock_unit_tests.UdevWrapperDeathTest.* ..............................................***Timeout 600.09 sec
LD_LIBRARY_PATH=/spread/mir/spread/build/build/bin/../lib/
Running main() from main.cpp
Note: Google Test filter = UdevWrapperDeathTest.*:-:--
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from UdevWrapperDeathTest
[ RUN ] UdevWrapperDeathTest.DereferencingEndReturnsInvalidObject
Running main() from main.cpp
Running main() from main.cpp
[ OK ] UdevWrapperDeathTest.DereferencingEndReturnsInvalidObject (300 ms)
[ RUN ] UdevWrapperDeathTest.MemberDereferenceOfEndDies

https://github.com/MirServer/mir/actions/runs/4123166159/jobs/7120852378

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants