Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segfault in AGS on CPUs with TSX (Haswell and above) when ulocking not locked pthread mutex #195

Closed
shmerl opened this issue Oct 14, 2014 · 14 comments
Labels
type: bug unexpected/erroneous behavior in the existing functionality
Milestone

Comments

@shmerl
Copy link

shmerl commented Oct 14, 2014

Backtrace of the segfault:

Program received signal SIGSEGV, Segmentation fault.
__lll_unlock_elision (lock=0x979970, private=0) at ../nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c:29
29      ../nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c: No such file or directory.
(gdb) bt
#0  __lll_unlock_elision (lock=0x979970, private=0) at ../nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c:29
#1  0x0000000000562968 in AGS::Engine::PThreadMutex::Unlock (this=0x979968) at ../Engine/util/mutex_pthread.h:47
#2  0x00000000005628d5 in AGS::Engine::PThreadMutex::~PThreadMutex (this=0x979968, __in_chrg=) at ../Engine/util/mutex_pthread.h:36
#3  0x00000000005675b3 in SOUNDCLIP::~SOUNDCLIP (this=0x979910, __in_chrg=) at media/audio/soundclip.cpp:73
#4  0x000000000056495e in stop_and_destroy_channel_ex (chid=3, resetLegacyMusicSettings=true) at media/audio/audio.cpp:499
#5  0x0000000000564a22 in stop_and_destroy_channel (chid=3) at media/audio/audio.cpp:521
#6  0x0000000000563e01 in find_free_audio_channel (clip=0x153c070, priority=50, interruptEqualPriority=true) at media/audio/audio.cpp:202
#7  0x00000000005646de in play_audio_clip (clip=0x153c070, priority=50, repeat=0, fromOffset=0, queueIfNoChannel=false) at media/audio/audio.cpp:435
#8  0x0000000000521d25 in AudioClip_Play (clip=0x153c070, priority=31998, repeat=31998) at ac/audioclip.cpp:57
#9  0x0000000000564797 in play_audio_clip_by_index (audioClipIndex=321) at media/audio/audio.cpp:452
#10 0x00000000005366b5 in CheckViewFrame (view=646, loop=0, frame=10) at ac/viewframe.cpp:154
#11 0x00000000004cf8ff in RoomObject::UpdateCyclingView (this=0x8e3048 ) at ac/roomobject.cpp:76
#12 0x000000000055a56c in update_cycling_views () at main/udpate.cpp:199
#13 0x000000000055b301 in update_stuff () at main/udpate.cpp:463
#14 0x000000000055dd0b in game_loop_do_update () at main/game_run.cpp:543
#15 0x000000000055e13f in mainloop (checkControls=true, extraBitmap=0x0, extraX=0, extraY=0) at main/game_run.cpp:711
#16 0x000000000055e4ac in main_game_loop () at main/game_run.cpp:840
#17 0x000000000055e5a4 in do_main_cycle (untilwhat=2, daaa=9077320) at main/game_run.cpp:885
#18 0x00000000004f793e in scrWait (nloops=120) at ac/global_game.cpp:924
#19 0x000000000050ce02 in Sc_scrWait (params=0x7fffffffd2d0, param_count=1) at ac/global_api.cpp:2175
#20 0x000000000057425e in ccInstance::Run (this=0x1ef4b10, curpc=31) at script/cc_instance.cpp:1174
#21 0x0000000000571c68 in ccInstance::CallScriptFunction (this=0x1ef4b10, funcname=0x968500  "room_AfterFadeIn", numargs=0, params=0x0) at script/cc_instance.cpp:368
#22 0x0000000000574cc4 in ccInstance::RunScriptFunctionIfExists (this=0x1ef4b10, tsname=0x968500  "room_AfterFadeIn", numParam=0, params=0x0) at script/cc_instance.cpp:1390
#23 0x0000000000574e2f in ccInstance::RunTextScript (this=0x1ef4b10, tsname=0x1fd4db0 "room_AfterFadeIn") at script/cc_instance.cpp:1435
#24 0x000000000056da43 in run_interaction_script (nint=0x1f1d900, evnt=7, chkAny=-1, isInv=0) at script/script.cpp:227
#25 0x00000000004d4860 in process_event (evp=0x7fffffffdd84) at ac/event.cpp:208
#26 0x00000000004d523c in processallevents (numev=6, evlist=0x8a7940 ) at ac/event.cpp:404
#27 0x00000000004d527f in update_events () at ac/event.cpp:414
#28 0x000000000055de8e in game_loop_update_events () at main/game_run.cpp:600
#29 0x000000000055e16c in mainloop (checkControls=true, extraBitmap=0x0, extraX=0, extraY=0) at main/game_run.cpp:721
#30 0x000000000055e4ac in main_game_loop () at main/game_run.cpp:840
#31 0x000000000055b5e2 in do_play_game () at main/game_start.cpp:141
#32 0x000000000055b6c0 in initialize_start_and_play_game (override_start_room=0, loadSaveGameOnStartup=0x0) at main/game_start.cpp:182
#33 0x0000000000554cd2 in initialize_engine (argc=2, argv=0x7fffffffe0d8) at main/engine.cpp:1463
#34 0x0000000000560384 in main (argc=2, argv=0x7fffffffe0d8) at main/main.cpp:492

I encountered this bug when trying to run the Quest for Infamy on current Debian testing (see the initial report).

The actual problem is triggered by invalidly unlocking the mutex which isn't locked. It's present in new glibc and on CPUs with TSX instructions (such as Haswell for example).

See this problem described here: https://bugzilla.novell.com/show_bug.cgi?id=853311

To fix it, unlocking should not happen if the mutex isn't locked. The workaround in the forum thread is just a dirty fix for the Quest for Infamy, but the engine should handle this more correctly.

@ghost
Copy link

ghost commented Oct 14, 2014

Thank you for report.
The threaded audio in AGS has a number of issues in general, and it is planned to be fixed/redone.
I will put this reference here just to bind all audio/thread related issues together: #136.

So far we just recommended to turn it off on Linux by setting

[sound]
threaded=0

in acsetup,cfg. (or just removing "threaded" parameter).

@ivan-mogilko ivan-mogilko added this to the 3.4.0 milestone Oct 14, 2014
@ivan-mogilko ivan-mogilko added type: bug unexpected/erroneous behavior in the existing functionality context: audio and removed context: audio labels Oct 14, 2014
@ghost
Copy link

ghost commented Oct 14, 2014

Actually this is not directly related to audio. I think it's a design mistake that the mutex wrapper calls unlock in destructor. Its only function is to provide platform-independent interface. If someone forgot to unlock it properly in time, the consequences may be dire anyway.

For automatic unlocking we have MutexLock.
I guess I am going to remove the call to Unlock from Mutex's destructor. Just need to double-check that it is unlocked locally where needed.

@shmerl
Copy link
Author

shmerl commented Oct 14, 2014

On a side note, did you consider using standard C++11 threads instead, or you need to support some old compilers which don't have it? It can eliminate maintaining all those different threading libraries and also allow much cleaner locking / mutex logic where things get cleaned out properly when they go out of scope without the need to manually clean it (RAII). Example: http://en.cppreference.com/w/cpp/thread/lock_guard

There are also boost threads which are close to C++11 threads and are available for older compilers.

@tobihan
Copy link
Contributor

tobihan commented Oct 14, 2014

On Windows Visual Studio 2008 is used...

Am 14.10.2014 um 22:26 schrieb shmerl:

On a side note, did you consider using standard C++11 threads instead?
Or you need to support some old compilers which don't support it?

@shmerl
Copy link
Author

shmerl commented Oct 14, 2014

Yeah, VisualStudio still lags behind with C++11 (let alone 2008, even 2013 doesn't fully support it). However it should work with boost threads fine.

@ghost
Copy link

ghost commented Oct 14, 2014

We can't use higher Studio at the moment for some reasons, but we are moving towards that.

Anyway, the auto locking is not a problem, because we have our own "lock guard", called MutexLock. The problem is the way audio objects are used, generally. But that is beyond scope of current issue.

@shmerl
Copy link
Author

shmerl commented Oct 14, 2014

For the reference, my acsetup.cfg for the Quest for Infamy doesn't even have the [sound] section and no threaded parameter either, but this bug still appeared.

@ghost
Copy link

ghost commented Oct 14, 2014

@shmerl

For the reference, my acsetup.cfg for the Quest for Infamy doesn't even have the [sound] section and no threaded parameter either, but this bug still appeared.

True, I did not realize the true problem when making my very first comment.
The mutex objects are created even if audio is playing on main thread, and destroyed too, calling destructors and, obviously, unpaired Unlock, which produces error on your system.

The solution is to remove Unlock call from destructor.

@ghost
Copy link

ghost commented Oct 15, 2014

I made a commit: a1f351b

@shmerl
Copy link
Author

shmerl commented Oct 15, 2014

I tested it with the Quest for Infamy and it worked fine. Thanks for the fix!

@rofl0r

This comment was marked as abuse.

@shmerl
Copy link
Author

shmerl commented Oct 15, 2014

and what would be the advantage of using C++11 threads

That's rather straightforward - using standard library which should work on any system with proper compilers. Otherwise you need to resort to lower level system dependent threads like now. C++11 isn't bleeding edge anymore. C++14 probably is. Most mature compilers including gcc and clang support it fully. What's lagging behind is VisualStudio. By the way, are there other compilers for Windows which can build ags but support C++11 at the same time? May be MinGW?

both definitely won't fix the issue at hand...

It offers cleaner logic of using locks and mutexes with RAII instead of explicit locking and unlocking like pthread API for example. So rewriting the code to use that approach will make it cleaner and less prone to errors.

@ghost
Copy link

ghost commented Oct 15, 2014

@shmerl

By the way, are there other compilers for Windows which can build ags but support C++11 at the same time?

MinGW, perhaps.

@shmerl

It offers cleaner logic of using locks and mutexes with RAII instead of explicit locking and unlocking like pthread API for example. So rewriting the code to use that approach will make it cleaner and less prone to errors.

We already have a RAII locking in AGS... as I told two time above :)

We were considering moving to C++11 in some future, hypothetically, but we will definitely need to think this over more before doing so.
Regarding boost, personally I would not like to import a library like that just for one class...

@shmerl
Copy link
Author

shmerl commented Oct 15, 2014

We already have a RAII locking in AGS... as I told two time above :)

Yes, but that unlock in this bug wasn't really in line with it :) I mean that it offers RAII in a standard fashion, so one doesn't need to reinvent the wheel.

About MinGW - yes, that sounds like a good candidate. It will also untie developers who use AGS from proprietary tools. C++11 is pretty good at this time and used in different products in production already so if you worry about it being experimental - it's already not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug unexpected/erroneous behavior in the existing functionality
Projects
None yet
Development

No branches or pull requests

4 participants