Win XP only: TempoClock scheduling crashes sclang #547

Closed
jamshark70 opened this Issue Sep 27, 2012 · 16 comments

4 participants

@jamshark70

In win XP (3.6 dev), this example causes sclang to stop responding. Sometime after that, the language will crash.

TempoClock.sched(0.5, { "hello".postln });

Use SystemClock or AppClock instead, and all is well.

In win7, all clocks behave properly.

My XP build, using Jakob's copies of the dlls, doesn't resolve this problem.

3.5 did not have this problem - hence, it's a regression.

@jamshark70

Tried again, running sclang.exe from the command line. The TempoClock example crashes immediately. So I guess the Windows pop-up warning that the language crashed is suppressed or delayed when using sc-ide, but the crash itself happens right away.

Also retested 3.5.4 from the command line -- no problem.

So I guess this is something I could "git bisect," when I have a couple of free hours (ha!).

@jleben
SuperCollider member

James, could you try getting a backtrace using gdb?
The following command in msys shell should install gdb:

mingw-get install mingw32-gdb
@jamshark70

Just gdb'ed a full debug build and (I don't suppose anyone will be surprised to hear) the problem does not occur when the build type is Debug. (That seems to be the case with a really quite startling percentage of sc crashing issues, doesn't it?)

So I'm rebuilding as RelWithDebInfo+ and will try again...

  • BTW RelWithDebInfo always confuses me... my first instinct is always RelWithDebugInfo. IMO the latter would be clearer and easier to remember... anyone know why "ug" was omitted?
@jleben
SuperCollider member

BTW RelWithDebInfo always confuses me... my first instinct is always RelWithDebugInfo. IMO the latter would be clearer and easier to remember... anyone know why "ug" was omitted?

Maybe because "ease" was also omitted?

@jamshark70

The stack trace is not useful. The sum total of the output from "where" is:

#0  0x77c3554a in msvcrt!_abnormal_termination ()
   from C:\WINDOWS\system32\msvcrt.dll

I also did thread apply all bt. There isn't much of that, so I'll just paste it all here.

Thread 6 (Thread 3176.0x12c4):
#0 0x7c90e514 in ntdll!LdrAccessResource ()
from C:\WINDOWS\system32\ntdll.dll
#1 0x7c90df5a in ntdll!ZwWaitForSingleObject ()
from C:\WINDOWS\system32\ntdll.dll
#2 0x7c8025db in WaitForSingleObjectEx ()
from C:\WINDOWS\system32\kernel32.dll
warning: (Internal error: pc 0x7df in read in psymtab, but not in symtab.)

warning: (Internal error: pc 0x7df in read in psymtab, but not in symtab.)

#3 0x000007e0 in ?? ()
warning: (Internal error: pc 0x7df in read in psymtab, but not in symtab.)

#4 0x00000000 in ?? ()

Thread 5 (Thread 3176.0xde0):
#0 0x7c90e514 in ntdll!LdrAccessResource ()
from C:\WINDOWS\system32\ntdll.dll
#1 0x7c90df4a in ntdll!ZwWaitForMultipleObjects ()
from C:\WINDOWS\system32\ntdll.dll
#2 0x7c809590 in KERNEL32!CreateFileMappingA ()
from C:\WINDOWS\system32\kernel32.dll
#3 0x7c80a115 in WaitForMultipleObjects ()
from C:\WINDOWS\system32\kernel32.dll
#4 0x62486588 in pthreadCancelableTimedWait ()
from C:\WINDOWS\system32\pthreadGC2.dll
#5 0x62486e34 in sem_timedwait () from C:\WINDOWS\system32\pthreadGC2.dll
#6 0x62487319 in pthread_cond_wait () from C:\WINDOWS\system32\pthreadGC2.dll
#7 0x00403e63 in schedRunFunc (arg=0x0)
at d:/sc-ide-clean.git/lang/LangPrimSource/PyrSched.cpp:443
#8 0x62483e29 in ptw32_threadStart@4 ()
from C:\WINDOWS\system32\pthreadGC2.dll
#9 0x77c3a3b0 in msvcrt!_endthreadex () from C:\WINDOWS\system32\msvcrt.dll
#10 0x7c80b729 in KERNEL32!GetModuleFileNameA ()
from C:\WINDOWS\system32\kernel32.dll
#11 0x00000000 in ?? ()

Thread 4 (Thread 3176.0xc64):
#0 0x77c3554a in msvcrt!_abnormal_termination ()
from C:\WINDOWS\system32\msvcrt.dll

Thread 2 (Thread 3176.0xc78):
#0 0x7c90e514 in ntdll!LdrAccessResource ()
from C:\WINDOWS\system32\ntdll.dll
#1 0x7c90df5a in ntdll!ZwWaitForSingleObject ()
from C:\WINDOWS\system32\ntdll.dll
#2 0x71a5402b in ?? () from C:\WINDOWS\system32\mswsock.dll
#3 0x71a6107f in WSPStartup () from C:\WINDOWS\system32\mswsock.dll
#4 0x71abf6e7 in WSApSetPostRoutine () from C:\WINDOWS\system32\ws2_32.dll
#5 0x71ad303a in WSOCK32!TransmitFile () from C:\WINDOWS\system32\wsock32.dll
#6 0x0046b874 in SC_UdpInPort::Run (this=0x3f3ac48)
at d:/sc-ide-clean.git/lang/LangPrimSource/SC_ComPort.cpp:367
#7 0x62483e29 in ptw32_threadStart@4 ()
from C:\WINDOWS\system32\pthreadGC2.dll
#8 0x77c3a3b0 in msvcrt!_endthreadex () from C:\WINDOWS\system32\msvcrt.dll
#9 0x7c80b729 in KERNEL32!GetModuleFileNameA ()
from C:\WINDOWS\system32\kernel32.dll
#10 0x00000000 in ?? ()

Thread 1 (Thread 3176.0x13dc):
#0 0x7c90e514 in ntdll!LdrAccessResource ()
from C:\WINDOWS\system32\ntdll.dll
#1 0x7c90df4a in ntdll!ZwWaitForMultipleObjects ()
from C:\WINDOWS\system32\ntdll.dll
#2 0x7c809590 in KERNEL32!CreateFileMappingA ()
from C:\WINDOWS\system32\kernel32.dll
#3 0x7e4195f9 in USER32!GetLastInputInfo ()
from C:\WINDOWS\system32\user32.dll
#4 0x6a2f3271 in ZN21QEventDispatcherWin3213processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE () from d:\Qt\4.8.0\bin\QtCore4.dll
#5 0x6517445c in ZN19QApplicationPrivate14enterModal_sysEP7QWidget ()
from d:\Qt\4.8.0\bin\QtGui4.dll
#6 0x6a2c898e in ZN10QEventLoop13processEventsE6QFlagsINS_17ProcessEventsFlagEE () from d:\Qt\4.8.0\bin\QtCore4.dll
#7 0x6a2c8d93 in ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE ()
from d:\Qt\4.8.0\bin\QtCore4.dll
#8 0x6a2cd50f in ZN16QCoreApplication4execEv ()
from d:\Qt\4.8.0\bin\QtCore4.dll
#9 0x00406416 in SC_TerminalClient::run (this=0x22fedc, argc=0,
argv=0x3eff47c)
at d:/sc-ide-clean.git/lang/LangSource/SC_TerminalClient.cpp:276
#10 0x004019fb in QtCollider::run (argc=1, argv=0x3eff478)
at d:/sc-ide-clean.git/QtCollider/LanguageClient.cpp:40
#11 0x004013d0 in main (argc=1, argv=0x3eff478)
at d:/sc-ide-clean.git/lang/LangSource/cmdLineFuncs.cpp:40

@jamshark70

Just FYI, I've started bisecting, going back to the Version-3.5.4 tag (since that's known to work). I probably won't finish tonight.
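For anyone following along, the bisect described above can be sketched roughly as follows. The `Version-3.5.4` tag is the one named in this thread. The rebuild-and-test step is manual, and the assumption that the current checkout is the "bad" end is only illustrative.

```shell
# Run from the top of the SuperCollider source tree.
git bisect start
git bisect bad                  # assumption: the current checkout shows the crash
git bisect good Version-3.5.4   # release tag reported above as working

# git now checks out a commit roughly halfway between the two.
# Rebuild, run the TempoClock example in sclang.exe, then mark the result:
git bisect good                 # or: git bisect bad, if sclang crashed

# Repeat the rebuild/test/mark step until git prints the first bad commit,
# then return to the original checkout:
git bisect reset
```

This only converges if at least one commit in the range really builds into a non-crashing binary on the test machine.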

@jamshark70

It turns out, something is wrong with my build system, causing the crash to occur in every sc version I tested, going all the way back to 3.5.0.

So... we have a crash where gdb revealed nothing useful, and which I can't "git bisect" because I can't find any "good" commit, and which Jakob can't troubleshoot because the issue doesn't happen in Windows 7.

My next best guess is that one of the dependent libraries (pthread, maybe?) might be a version that is good for win7 but not XP. But if it's the same library version that Jakob used when building the 3.5.4 release (which doesn't have the TempoClock problem in XP), then it doesn't make sense that his 3.5.4 build would be okay while mine is broken.

I have no idea what to do next. But I don't want to let the issue go -- we really should try to get XP working. I'm willing to do a fair amount of legwork but I'm out of my depth here.

@jamshark70

Failing all else, I browsed a bit of the source code and found that a significant difference between SystemClock (no XP crash) and TempoClock (crashes in XP) is that SystemClock's queue is a global one (accessible in the language by thisProcess.prSchedulerQueue) while a TempoClock's queue is a member variable of the C++ class.

So maybe there is some bizarro issue with accessing the member variable that only occurs in XP?

Other than that, ediff-buffers didn't turn up anything significantly different between schedRunFunc() and TempoClock::Run().

Also, I started a thread on mingw-user: http://comments.gmane.org/gmane.comp.gnu.mingw.user/40438

@jamshark70

Couple of responses from the mingw list:

That's a good pointer. Try reading gcc documentation, fish out the
optimization flags that -O, -O2 and -O3 expand to, then apply them one
by one on top of -O0, until you hit the bug.

I can try that. But: how to customize the gcc flags in cmake? I've no idea.
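A minimal sketch of one way to do this, offered as an assumption rather than anything confirmed in this thread: CMake reads per-build-type flag variables such as `CMAKE_CXX_FLAGS_RELWITHDEBINFO`, and they can be overridden on the command line when re-running cmake in an already configured build directory. The flag `-fno-strict-aliasing` below is only a placeholder for whichever flag is under test. One caveat from the GCC documentation: most optimizations stay disabled at `-O0` even when individual `-f` flags are given, so it is usually more practical to start from `-O2` and switch passes off with `-fno-...` one at a time.

```shell
# List which optimizer flags each level enables, then compare the two lists.
gcc -Q -O0 --help=optimizers > O0.txt
gcc -Q -O2 --help=optimizers > O2.txt
diff O0.txt O2.txt

# From the build directory: keep the RelWithDebInfo build type but replace
# its default flags. -fno-strict-aliasing is a placeholder for the flag
# currently being tested.
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo \
      -DCMAKE_CXX_FLAGS_RELWITHDEBINFO="-O2 -g -fno-strict-aliasing" \
      -DCMAKE_C_FLAGS_RELWITHDEBINFO="-O2 -g -fno-strict-aliasing" \
      ..
```

After each change, rebuild and re-run the TempoClock example. If disabling a flag makes the crash disappear, that pass is the first suspect.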

Second:

This all sounds like a memory out of bounds issue in the program. The
fact that you execute fine with no optimization but abort with
optimization is what gives me a clue. HMM... But this interesting
post http://markmail.org/message/hlg5atswqkijgt6e#query:+page:1+mid:v45o2vwi36iemyh7+state:results
points to a bad implementation of setjmp/longjmp so we need to follow
up further if this is a GCC issue or not.

The link is about a ruby issue which may be the result of an improperly compiled instruction.

@jleben
SuperCollider member
@jamshark70

Beg pardon for being inexperienced with cmake, but where would these go in the cmake scripts?

@jleben
SuperCollider member
@bagong

I just finished a Win-build of SC3.7alpha with MinGW64/gcc4.8 where this is not the case anymore. I did nothing relevant to the source and used no special compiler flags. So this must either be due to low level changes in my toolchain or to various recent changes/fixes to the tempo-clock implementation.

@bagong

Is Win XP support still an issue? I vote for no.

@scztt

Windows XP is older than SuperCollider 3 itself. So, I vote no too. :)

@bagong

XP-specific so closed.

@bagong bagong closed this Mar 25, 2015