Skip to content
This repository was archived by the owner on Oct 12, 2022. It is now read-only.

Conversation

rainers
Copy link
Member

@rainers rainers commented Oct 15, 2011

This patch creates a standard stack frame for the _d_thowc function under Windowsn, even if druntime is built in release mode. This allows the stack dump and a debugger to display a correct call stack when an exception is thrown.

In addition, a workaround is added to allow the stack walker to synchronize without symbols for kernel32/kernelbase being available.

http://d.puremagic.com/issues/show_bug.cgi?id=4809

@rainers
Copy link
Member Author

rainers commented Nov 12, 2011

Just thought about this again. Maybe adding the additional stack entries could be limited to debug builds of the executable by checking the image header for a debug directory entry. (I think this should also be done for generating stack traces in exceptions and loading dbghelp.dll).

@rainers
Copy link
Member Author

rainers commented Nov 12, 2011

I also noticed that the stack trace generation allocates an even larger buffer on the stack (char[2048] demangleBuf;) in addition to some rather large structs (e.g. CONTEXT), so it should not hurt to use the same stack space a little later.

dec EAX;
jnz clear_stack;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning for this?

Without full debug information (including symbols from the Microsoft
symbol sever) the stack walker used by the stack tracing and the
debugger often misses the function on the stack that actually invokes
_d_throw.
My guess is: as there is usually no debug info for both the system DLLs
involved and the D runtime library, the stack walker finds the calling
function by searching addresses on the stack that point to possible code
segments, but somehow skips the missing return address.
The patch fills the stack with zeroes to resynchronize the stack walker,
assuming it starts simply scanning the stack when it it finds a block of
zeroes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why scanning? StackWalk64 could do normal return address/frame pointer unwinding.
You can try to instrument the ReadMemoryRoutine parameter of StackWalk64 to trace
why this fails.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are often no frame pointers, so the stack walker only works
correctly with debug info that has frame pointer omission info.

I wasn't aware of the read callback, it might give a hint to improve the
patch...

@CyberShadow
Copy link
Member

BTW, personally I think we should just have debug and release Phobos+Druntime libs, and make use of the -debuglib DMD option. The release libs can get -inline, and the debug ones -g -gs.

{
naked;
push EBP;
mov EBP,ESP;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space after comma :o)

andralex added a commit that referenced this pull request Jul 8, 2012
Issue 4809: Help the stack walker to find the location of a throw statement
@andralex andralex merged commit 80625a2 into dlang:master Jul 8, 2012
@MartinNowak
Copy link
Member

I think we should revert this.

  • there is no reproducible test case

  • we need a rational for fixes

  • this might cause a stack overflow

  • the push loop could be much faster

    enum SIZE = 4096;
    
    ubyte* p;
    asm { mov p[RBP], RSP; }
    p[-SIZE .. 0] = 0;
    asm
    {
        sub RSP, SIZE;
        // ...
        add RSP, SIZE;
    }
    
  • I recently changed a lot of stack tracing code

MartinNowak added a commit to MartinNowak/druntime that referenced this pull request Jul 9, 2012
This reverts commit 80625a2, reversing
changes made to ce783ff.
@MartinNowak MartinNowak mentioned this pull request Jul 9, 2012
@rainers
Copy link
Member Author

rainers commented Jul 10, 2012

The forced stack frame helps, I guess you do not object to this, but the strange push-loop. I agree that pushing a lot of zeroes on the stack to hope for some better behaviour by the OS feels like a dirty hack, but if it helps... I'll keep an eye on it if it does actually have the expected benefit.

there is no reproducible test case

What about the issue in the bug report? The stack trace is already improved by the forced stack frame. Looking at the call stack inside the debugger is a bit harder to automate for a test case.

this might cause a stack overflow

I just rechecked: the traceNoSync call uses about 6kB of stack. As mentioned in the discussion, 400 words instead of 1000 seem fine aswell.

the push loop could be much faster

Taking a snapshot of the stack with resolving debug symbols indicates that performance is not an issue at all ;-)

I recently changed a lot of stack tracing code

Yeah, it looks much better now. Still I want a simple way to disable it (included its shared static this()) because it does not help when running in a debugger.

@MartinNowak
Copy link
Member

What about the issue in the bug report?

Right, I missed the bug report when I wrote this.
I used the posted code and it works as of #268.

Still I want a simple way to disable it (included its shared static this()) because it does not help when running in a debugger.

So Runtime.traceHandler won't help you?
I also switch to lazy symbol loading which significantly reduced the startup time.
Without a static initializer we'd need to load dbghelp.dll and calling SymInitialize in a possibly corrupted state.
It would be really cool if we could create traces lazily when an exception is catched and printed not when it's thrown.

@rainers
Copy link
Member Author

rainers commented Jul 10, 2012

So Runtime.traceHandler won't help you?

To some extend it does, but it does not prevent the loading of
dbghelp.dll (which is annoying, but a minor issue). I have a workaround
that patches the pointer to the static constructor in the moduleinfo,
but it is ugly and relies on order of execution of another static
constructor, probably only defined by link order.

It would be really cool if we could create traces lazily when an
exception is catched and printed not when it's thrown.

Maybe it works to delay the stack trace generation into the outermost
exception filter "_d_exception_filter". What's also bad is that the
demangling causes more exceptions for non-D symbols or symbols shortened
by SHA value. How about just recording addresses and resolving them when
printing?

@MartinNowak
Copy link
Member

How about just recording addresses and resolving them when
printing?

Sounds reasonable.

@rainers
Copy link
Member Author

rainers commented Jul 11, 2012

It would be really cool if we could create traces lazily when an exception is catched and printed not when it's thrown.

"_d_exception_filter" does not seem to be used at all, but I have tried something similar that seems to work: create a new class StackTraceThrowable deriving from Throwable, use it in the catch handler that wants the stack trace, and check for it in _d_framehandler. Then, generate the stack trace only If it is the searched exception, but continue as if a Throwable has been passed.
I can only test this on windows but the posix version might work similar inside _d_throwc.
If there is interest I can put together a pull request.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants