Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Critical section already initialized" on startup #874

Open
fatcerberus opened this issue Dec 29, 2017 · 9 comments
Open

"Critical section already initialized" on startup #874

fatcerberus opened this issue Dec 29, 2017 · 9 comments

Comments

@fatcerberus
Copy link
Contributor

Running miniSphere under Application Verifier in Windows with all options under "Basics" checked, AppVerifier triggers a breakpoint on the line below and prints the following text to the console:

InitializeCriticalSection(mutex->cs);

=======================================
VERIFIER STOP 0000000000000211: pid 0x22CC: Critical section is already initialized. 

	00000262611C4FD0 : Critical section address. Run !cs -s <address> to get more information.
	00000262611C6FD0 : Critical section debug info address.
	00000262579578D0 : First initialization stack trace. Use dps to dump it if non-NULL.
	0000000000000000 : Not used.


=======================================
This verifier stop is continuable.
After debugging it use `go' to continue.

=======================================

If I ignore the error and continue, I get another AppVerifier breakpoint here:

_al_mutex_lock(mtxExternal);

=======================================
VERIFIER STOP 0000000000000206: pid 0x1EBC: Invalid critical section owner thread. 

	000002175E56DFA0 : Critical section address. Run !cs -s <address> to get more information.
	0000000000000F34 : Owning thread.
	000000000000114C : Expected owning thread.
	000002175E57FFD0 : Critical section debug info address.


=======================================
This verifier stop is not continuable. Process will be terminated 
when you use the `go' debugger command.

=======================================

AppVerifier doesn't allow me to continue past that point so I can't see if anything else is wrong.

@fatcerberus
Copy link
Contributor Author

Initializing a critical section that's already been initialized is explicitly called out as undefined behavior in the Windows documentation, so this should be fixed ASAP. I'll look deeper into it.

@fatcerberus
Copy link
Contributor Author

From what I can tell the issue is that all threads and mutices share the same critical section, but call InitializeCriticalSection() once per mutex, which (as mentioned above) is undefined behavior. Guess we've just been lucky that it somehow manages to work...

Fixing it may require refactoring.

@SiegeLord SiegeLord added this to the 5.2.4 milestone Jan 3, 2018
SiegeLord pushed a commit that referenced this issue Jan 8, 2018
Causes undefined behavior in issue #874.
@SiegeLord
Copy link
Member

I don't think it's actually that pervasive, but there was indeed a double initialization that happened at startup in the logging system, fixed via da4b716. Could you try your analysis again with that commit applied?

@fatcerberus
Copy link
Contributor Author

Okay, I see now, mutices don't actually share a critical section, the critical section object is malloc'd which I missed due to weirdness filter (it's unusual in Win32 to have a concrete object, almost everything is a system-provided handle). So my initial diagnosis was indeed wrong.

I'll check it out with da4b716 applied and let you know if the issue is fixed.

@fatcerberus
Copy link
Contributor Author

@SiegeLord Your patch fixes the double initialization, but doesn't fix the second error.

=======================================
VERIFIER STOP 0000000000000206: pid 0xE70: Invalid critical section owner thread. 

	0000010BE9752FA0 : Critical section address. Run !cs -s <address> to get more information.
	0000000000002B94 : Owning thread.
	00000000000025B8 : Expected owning thread.
	0000010BE00E0FD0 : Critical section debug info address.


=======================================
This verifier stop is not continuable. Process will be terminated 
when you use the `go' debugger command.

=======================================

Which occurs here:

_al_mutex_lock(mtxExternal);

@SiegeLord
Copy link
Member

Do you have more of a stack trace for that?

@fatcerberus
Copy link
Contributor Author

Looks like it happens in the audio stream thread:

 	vrfcore.dll!00007fffa3702235()	Unknown	Non-user code. Cannot find or open the PDB file.
 	vfbasics.dll!00007fff9dda6160()	Unknown	Non-user code. Cannot find or open the PDB file.
 	vfbasics.dll!00007fff9dda6d4b()	Unknown	Non-user code. Cannot find or open the PDB file.
>	allegro-debug-5.2.dll!cond_wait(_AL_COND * cond, _AL_MUTEX * mtxExternal, unsigned long timeout) Line 310	C	Symbols loaded.
 	allegro-debug-5.2.dll!_al_cond_wait(_AL_COND * cond, _AL_MUTEX * mtxExternal) Line 323	C	Symbols loaded.
 	allegro-debug-5.2.dll!al_wait_for_event(ALLEGRO_EVENT_QUEUE * queue, ALLEGRO_EVENT * ret_event) Line 407	C	Symbols loaded.
 	allegro_audio-debug-5.2.dll!_al_kcm_feed_stream(ALLEGRO_THREAD * self, void * vstream) Line 696	C	Symbols loaded.
 	allegro-debug-5.2.dll!thread_func_trampoline(_AL_THREAD * inner, void * _outer) Line 79	C	Symbols loaded.
 	allegro-debug-5.2.dll!thread_proc_trampoline(void * data) Line 38	C	Symbols loaded.

@SiegeLord
Copy link
Member

I looked into this a bit, and managed to reproduce this with a very basic usage of the condition variable:

#include <allegro5/allegro.h>

typedef struct STUFF
{
	bool var;
	ALLEGRO_MUTEX* mutex;
	ALLEGRO_COND* cond;
} STUFF;

static void *proc(ALLEGRO_THREAD *thread, void *arg)
{
	STUFF* stuff = (STUFF*)arg;

	al_lock_mutex(stuff->mutex);
	stuff->var = true;
	al_signal_cond(stuff->cond);
	al_unlock_mutex(stuff->mutex);
	while (!al_get_thread_should_stop(thread))
	{
		al_rest(0.1);
	}

	return thread;
}

int main(int argc, const char *argv[])
{
	al_init();

	STUFF stuff;
	stuff.cond = al_create_cond();
	stuff.mutex = al_create_mutex();
	stuff.var = false;
	ALLEGRO_THREAD* thread = al_create_thread(proc, &stuff);
	al_start_thread(thread);

	al_lock_mutex(stuff.mutex);
	while (!stuff.var) {
		al_wait_cond(stuff.cond, stuff.mutex);
	}
	al_unlock_mutex(stuff.mutex);

	al_join_thread(thread, NULL);
	al_destroy_cond(stuff.cond);
	al_destroy_mutex(stuff.mutex);
	return 0;
}

It sounds like Allegro's windows condition variables are not implemented in a way that's verifier likes.

@fatcerberus
Copy link
Contributor Author

The verifier seems to expect a different thread to own the critical section than the one that actually does, which is baffling.

@SiegeLord SiegeLord removed this from the 5.2.4 milestone Jan 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants