Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unreliably failing unittests on mac #3786

Closed
est31 opened this Issue Feb 25, 2016 · 7 comments

Comments

Projects
None yet
5 participants
@est31
Copy link
Contributor

est31 commented Feb 25, 2016

Thanks to travis build service, we discover failing unittests on some runs for the mac operating system:

Its always the same error:


Test assertion failed: thread_retval == thread

    at test_threading.cpp:107

[FAIL] testStartStopWait - 79ms

@est31 est31 added Bug BSD labels Feb 25, 2016

@ShadowNinja ShadowNinja added Mac OS and removed BSD labels Mar 3, 2016

@ShadowNinja

This comment has been minimized.

Copy link
Member

ShadowNinja commented Mar 3, 2016

@paramat

This comment has been minimized.

Copy link
Member

paramat commented Mar 14, 2016

I had this happen recently on a very simple PR that is obviously correct #3851
and have also seen this exact fail on other good PRs:

======== Testing module TestThreading
Test assertion failed: thread_retval == thread
    at test_threading.cpp:107
[FAIL] testStartStopWait - 79ms
[PASS] testThreadKill - 403ms
[PASS] testAtomicSemaphoreThread - 9ms
======== Module TestThreading failed (1 failures / 3 tests) - 491ms

Seems to often be: Travis, fourth build (#****.4), clang, UNIX

@Ekdohibs

This comment has been minimized.

Copy link
Member

Ekdohibs commented May 4, 2016

@ShadowNinja

This comment has been minimized.

Copy link
Member

ShadowNinja commented May 9, 2016

@Ektohibs: If you've got my fixer commit and it still fails I guess the thread will have to set its own handle from pthread_self(). And the main thread will be setting the same variable, so we'll need some sort of locking for that variable...

Another solution would be for the spawned thread to wait for Thread::start to finish -- after seting m_running. Simply doing MutexAutoLock(m_mutex); (not assigning to anything so it's dropped immediately) is tempting, but it could cause a deadlock if Thread::wait is called immediately after starting the thread.

@Zeno-

This comment has been minimized.

Copy link
Contributor

Zeno- commented May 9, 2016

@ShadowNinja set the mutex where?

Anyway threadProc() causes race conditions and I have no idea how to solve them because it's a static function... perhaps it's these race conditions that are causing the problem?

$ cat drdgrind.log | grep Conflicting -A2 -B1 | grep thread.cpp -B1 -A2
==4730==    at 0x7CBFEC: Atomic<bool>::operator bool() (atomic.h:96)
==4730==    by 0x88A63F: Thread::start() (thread.cpp:138)
--
==4730== Thread 6:
--
==4730==    at 0x88AB8E: Atomic<bool>::operator=(bool) (atomic.h:101)
==4730==    by 0x88A6B2: Thread::stop() (thread.cpp:149)
--
==4730== 
--
==4730==    at 0x7CBFEC: Atomic<bool>::operator bool() (atomic.h:96)
==4730==    by 0x88A63F: Thread::start() (thread.cpp:138)
--
==4730== 
--
==4730==    at 0x7CBFEC: Atomic<bool>::operator bool() (atomic.h:96)
==4730==    by 0x88A63F: Thread::start() (thread.cpp:138)
--
==4730== 
--
==4730==    at 0x7CBFEC: Atomic<bool>::operator bool() (atomic.h:96)
==4730==    by 0x88A63F: Thread::start() (thread.cpp:138)
--
==4730== 
--
==4730==    at 0x7CBFEC: Atomic<bool>::operator bool() (atomic.h:96)
==4730==    by 0x88A63F: Thread::start() (thread.cpp:138)
--
==4730== 
--
==4730==    at 0x7CBFEC: Atomic<bool>::operator bool() (atomic.h:96)
==4730==    by 0x88A63F: Thread::start() (thread.cpp:138)
--
==4730== Thread 16:
--
==4730==    at 0x88AB8E: Atomic<bool>::operator=(bool) (atomic.h:101)
==4730==    by 0x88A8C1: Thread::threadProc(void*) (thread.cpp:250)
--
==4730== Thread 1:
--
==4730==    at 0x7CBFEC: Atomic<bool>::operator bool() (atomic.h:96)
==4730==    by 0x88A63F: Thread::start() (thread.cpp:138)
--
==4730== Thread 13:
--
==4730==    at 0x7CBFEC: Atomic<bool>::operator bool() (atomic.h:96)
==4730==    by 0x88A63F: Thread::start() (thread.cpp:138)
--
==4730== Thread 11:
--
==4730==    at 0x88AB8E: Atomic<bool>::operator=(bool) (atomic.h:101)
==4730==    by 0x88A8F8: Thread::threadProc(void*) (thread.cpp:254)
--
@ShadowNinja

This comment has been minimized.

Copy link
Member

ShadowNinja commented May 9, 2016

Here's one way to solve this (pseudocode):

start:
  Mutex start_mutex
  AutoLock ls(start_mutex)
  start_thread(threadProc, new ThreadStartData(this, start_mutex))

threadProc sd:
  Thread *th = sd->thread
  th->m_running = true
  // other stuff that doesn't require th->getThreadHandle
  sd->start_mutex.lock() // waits for start to exit
  delete sd
  th->run()
 // ...

est31 added a commit to est31/minetest that referenced this issue Jul 4, 2016

Temporarily disable "testStartStopWait" Threading unit test on mac
The "testStartStopWait" unit test is unreliably failing on mac,
for some time already. See bug minetest#3786.

Having the unittest fail unreliably doesn't help anybody but mostly
inhibits the main feature of travis builds: to test PRs for regressions.

Therefore, disable the specific unit test for until bug minetest#3786
is fixed.

est31 added a commit that referenced this issue Jul 4, 2016

Temporarily disable "testStartStopWait" Threading unit test on mac
The "testStartStopWait" unit test is unreliably failing on mac,
for some time already. See bug #3786.

Having the unittest fail unreliably doesn't help anybody but mostly
inhibits the main feature of travis builds: to test PRs for regressions.

Therefore, disable the specific unit test for until bug #3786
is fixed.
@ShadowNinja

This comment has been minimized.

Copy link
Member

ShadowNinja commented Jan 29, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.