Reported in version: HG 2.0 Reported for operating system, platform: Linux, x86_64
Comments on the original bug report:
On 2013-02-01 23:03:59 +0000, Scott Percival wrote:
No idea how widespread this is, but under Ubuntu 12.04 x86_64 a default SDL build with CMake would suffer deadlocks in SDL_DestroySemaphore, while a build with autotools ran without a problem. About 95% of the time it’d happen in SDL_Init, but failing that it’d occur during SDL_Quit. A typical backtrace from a deadlocked start from the latest hg:
# 0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
# 1 0x00007ffff690c065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
# 2 0x00007ffff690beba in __pthread_mutex_lock (mutex=0x6f6950) at pthread_mutex_lock.c:61
# 3 0x00007ffff66cc3a4 in SDL_mutexP (mutex=0x6f6950) at /home/scott/Development/fourier/SDL/src/thread/pthread/SDL_sysmutex.c:103
# 4 0x00007ffff66cc84f in SDL_DestroySemaphore (sem=0x6f8860) at /home/scott/Development/fourier/SDL/src/thread/generic/SDL_syssem.c:126
# 5 0x00007ffff6651c8f in SDL_CreateThread (fn=0x7ffff6651ded <SDL_TimerThread>, name=0x7ffff66d1ba4 "SDLTimer", data=0x7ffff68fe4a0) at /home/scott/Development/fourier/SDL/src/thread/SDL_thread.c:286
# 6 0x00007ffff66520da in SDL_TimerInit () at /home/scott/Development/fourier/SDL/src/timer/SDL_timer.c:230
# 7 0x00007ffff65d35de in SDL_InitSubSystem (flags=33) at /home/scott/Development/fourier/SDL/src/SDL.c:76
# 8 0x00007ffff65d39cf in SDL_Init (flags=33) at /home/scott/Development/fourier/SDL/src/SDL.c:193
A bit of digging revealed that the CMake litmus test for PTHREADS_SEM didn’t have a definition for main(), which made gcc explosively give up when linking.
Also the tests for HAVE_RECURSIVE_MUTEXES, HAVE_SEM_TIMEDWAIT and HAVE_PTHREAD_SPINLOCK failed because Linux was set to link in -lpthread; using the GCCism -pthread works, and seems to be necessary for the linker to find methods like pthread_mutexattr_settype. I have no idea what it does differently, given that a scan with ldd doesn’t show any extra libraries linked in other than libpthread, but I’m willing to leave that mystery to the GCC high-priests.
So yes, detecting PTHREADS_SEM fixes the problem, but since the point of these tests is to disable unavailable features or cutover to a failsafe, I’m concerned that the failsafe for running with PTHREADS and without PTHREADS_SEM doesn’t work that well. Unfortunately I have zero insight as to what it might mean; everything I know about libpthread fits comfortably onto a piece of fortune cookie paper. But this behaviour was very reproducible on two x86_64 Ubuntu machines, and also quite annoying.
On 2013-02-01 23:05:28 +0000, Scott Percival wrote:
Created attachment 1032
Fix Linux pthread detection in CMake
On 2013-04-16 04:01:59 +0000, Frederick Lee wrote:
Created attachment 1110
Fix cmake test for PTHREADS_SEM
Alternate patch. I ran into the same problem; system is (also) x86-64, Ubuntu 12.04.2, using cmake 2.8.7. Similar symptoms to Scott's, reached similar conclusions. Attached is the change I implemented to compile with cmake.
Of particular note, LDFLAGS went under CMAKE_REQUIRED_LIBRARIES, so that the "-lpthread" switch shows up towards the end of the command line, where it's available for resolving(?) pthread symbols. In any case, putting "-lpthread" at the end made compiles work better.
Documentation for cmake 2.8.7 indicates CHECK_C_SOURCE_COMPILES requires main() to be in there:
<code> - source code to try to compile, must define 'main'
As an aside, docs for gcc-4.6 lists "-pthread" for only RS/6000, PowerPC, and Solaris 2. Docs imply the switch is semi-magical to get pthreads to work on those platforms. I think the switch is unnecessary/excessive under "LINUX".
On 2013-04-19 12:31:14 +0000, Gabriel Jacobo wrote:
For reference, some of the stuff I found out while researching the issue...
Solving the deadlocks was achieved by adding -pthread in the LDFLAGS. That alone seemed to be enough to get "torturethread" working.
Why? When HAVE_PTHREAD_SEM is not defined (in this case because the test was faulty and we were missing the -pthread flag), the script includes generic/SDL_syssem.c, which in turn uses SDL_LockMutex which uses pthreads. Something (AFAICT it's an issue with reentrant threads, but I'm not 100% certain) makes pthread_mutex_lock fail if -pthread was not provided when compiling, but I'm fairly certain it's not our responsibility.
I did go further down the rabbit hole, and configured with --disable-pthreads. On Linux, this has the effect of defining SDL_THREADS_DISABLED so you get no thread support (generic doesn't kick in).
I tried to hack around this (undefining SDL_THREADS_DISABLED), to discover a recursion issue on the generic code (which I don't know if it's intended or not)...the generic version of SDL_CreateSemaphore calls the generic version of SDL_CreateMutex and then the generic version of SDL_CreateMutex calls SDL_CreateSemaphore, etc...infinite recursion (which doesn't happen if you use only one of those functions, as is the case when HAVE_PTHREAD_SEM is not defined).
In the end Sam told me that by design the platform has to provide either semaphores or mutexes, which makes sense (if you have neither, threading will be disabled).
The text was updated successfully, but these errors were encountered: