Skip to content

Stack trace

Mahdi Safsafi edited this page Mar 6, 2017 · 3 revisions

Introduction:

DebugEngine provides a very good StackTrace (CallTrace). It allows to collect sequence of nested functions up to the point where the stack trace was generated.

How StackTrace works ?

Well, as you probably know, the stack trace was always a challenge for developers to do. Why ? because when doing the trace you need to collect all items from the stack and filter them … Eliminate fake calls and preserve valid ones. Ah it seems easy … isn't ? No, this’s not an easy task to do ! The stack contains a lot of garbage data (function’s params and it’s locals, garbage data of try blocks, …). And it happens that those garbage data resemble to valid calls making it very hard to eliminate. There is two way to do a stack trace. Stack trace based on ebp frames and raw stack trace.

The stack frame tracing way:

This is very easy to implement, fast and very accurate ! All calls are often proved to be valid.

How frames stack trace works ?

If the called function has a stack frame, it will put (store) the previous frame into the stack and uses the ebp register instead of esp register to access the stack (read/write). Let’s take a quick example:

function Foo(Params):ReturnType;
asm
  push ebp // Store previous frame into the stack.
  mov ebp, esp // Use ebp register to access the stack.
  sub esp,$xx // Allocate stack memory for locals if exist.
  {...} // Foo's code.
  pop ebp // Restore caller frame.
  ret // Remove the current caller's return address from the stack and return back to the caller function.
end;

procedure DoSomething;
asm
 {...}
 {Push params into stack if calling convention is stdcall. Otherwise, use registers.}
 push return_address
 call Foo
@return_address:// Next instruction to be executed after returning from Foo function.
 {SomeInstruction}
End;

As you can see, on assembler level, frames are represented as push return_address and push ebp instructions. When DoSomething function calls Foo function, the CPU will move Foo’s params either to the stack (pushing params into stack) if the Foo’s calling convention is stdcall or moving them to registers: eax,edx,ecx, and into the stack if the params number is superior than 3. Then the CPU performs the call instruction … But how the call instruction works ? A call instruction is just a single instruction of two packed instruction push and jmp instruction. To call Foo function, the CPU will first push the return address (the address of next instruction below the call instruction) then it will call a jmp instruction that leads to Foo’s first instruction. That’s how call instruction works ! It’s easy to understand … right ? In Foo function, the first instruction is push ebp, the CPU will store the current ebp register into the stack, basically it will store the previous caller frame (DoSomething’ frame) ! Then there is mov ebp,esp that tells to the CPU to use ebp register instead of esp register to access local variables (accessing stack.). That’s it ! Now everything was initialized and the function can start doing it’s job. After the function completes its job, it’s time for clean up ! CPU will restore ebp register to its default value before entering Foo function so when returning to the caller function (DoSomething), this one can access the stack correctly. Finally, the CPU will execute ret instruction … But how the ret instruction works ? Well, like a call instruction a ret instruction is a single instruction of two packed instruction pop and jmp instruction. Before returning to caller function, the CPU will first pop up (remove the return address from the stack) and then it will jump to that address. Note that if the calling convention is stdcall, the CPU will also remove function's parameters from the stack after returning to the caller function.

Now, let’s talk about accessing frames ! Luckily, ebp frames are sequenced, thus all what we need to do is reading ebp register, finding the first frame and then walking through all those frames and finally collecting return address. A frame remains on the stack and has the following architecture:

TFrame = record
  CallerFrame : Pointer ;// The caller frame.
  ReturnAddress: Pointer; // After the ret instruction is executed , the CPU will execute this address.
End;

As you see, stack frame tracing is very structured and accurately. Now let’s talk about the dark side of frame tracing. The first big drawback is that stack frames are optional, not all functions use frames ! If you are using a compiler that can force to use stack frames … That’s good, you may be able to force all YOUR functions to compiled with stack frames. With Delphi, it’s very easy, all what we need to do is inserting {$STACKFRAMES ON} or {$W+} compiler directive. But what about other 3rd party libraries that you use ? An example, Some RTL functions are not compiled with stack frames and you can’t change that because they are already compiled. So if a function does not have a stack frame and we did a stack frame tracing, the trace will ignore the call to that function ! The second drawback is that frames chain could be broken ! If an error happens or when working with a big project, it may happen that you have a wrong or broken ebp frames chain. In that scenario, you will lose even functions that were compiled with stack frames ! The last drawback is that what happens if we get a wrong ebp register value ? That points us either to the second scenario or to the third scenario (we will end up with an empty stack trace).

The stack raw tracing way:

Unlike the previous method (frame tracing). This is a little slower and requires hard filter to get a clean stack trace. There is no frames ! We collect all items from the stack and we filter them to accept only those who seem to be valid calls. The advantage of this technique is that it works on a hard environment (there is no broken chain to worry about ). Also, we are sure that we will never skip a valid call as long as no filter was applied. Now, what are the inconvenience ? Well, it’s about filtering the valid calls ! The stack contains a lot of data, functions params and locals data, data related to try … finally/except blocks, and return addresses of sibling calls. The problem is that a lot of those data resemble to valid calls but in fact they are fake calls ! Consider this example:

Procedure Foo;
Begin

end;

Procedure DoSomething;
Var 
  Ptr:Pointer;
begin
  Ptr:= @Foo;
end;

Ptr is stored into the stack because it’s local variable and its value is a pointer to Foo function. When tracing the stack, the address of Foo function will appear and we may get confused whether it’s a valid call or not.

A second example:

Procedure DoSomething;
begin
  Foo;
  DoStackTrace;
end;

As you see, before doing stack trace, there was a call to Foo function. When doing stack trace, the return address of Foo will appear and again we will get confused whether to accept this item or not … It’s true that our function called Foo function … but this call is not a part of nested functions that called the stack trace function. Such calls are called subroutine calls.

Filtering the stack:

When tracing the stack using raw mode, a filter is necessary in order to have a readable and clean stack trace calls. The filter will eliminate fake calls and preserve valid ones. But that is not something easy to do ! Because there is no magical way to know if call is valid or not and there is no rules for filtering ! It’s totally up to the developer to define its own rules !

The first rule that any developer must implement is to check whether a return address is valid or not. A valid return address is preceded by a call instruction. It’s simple … All what you need to do is to disassembler back from the return address and check if the previous instruction was a call instruction or not. From now it’s totally up to you to choose what to define as rules.

DebugEngine stack trace:

It’s totally up to you to choose the tracing mode whether raw mode or frame mode. By default, the stack trace of DebugEngine uses a combined method (it does raw and frame trace) enhanced by a hard filter.

Rebuilding broken ebp chain:

As I said before, ebp frames are well structured and they proved to be valid calls. But the downside of using them is that the chain could be broken ! So what made the chain broken ? The first and important cause is exception ! When an exception occurs … the runtime library (let say RTL) get notified and a handler takes control from the point where the exception occurred. Then, exception handler decides whether it should continue executing the function where the exception occurred if it was marked by try … except block or just to execute the try … finally block and leaves from that function. It happens that the handler may call other internal functions which are build using ebp frames and it may also corrupt the stack and that explains why there is a broken chain. Luckily in most case, the valid ebp chain remains on the stack. DebugEngine’s stack trace provides a feature to restore the valid chain by analyzing all calls found in the stack and trying to figure what chain is the valid one. Please, for more details about this point refer to the code source.

DebugEngine filter:

Deleting calls that are located between two call that have ebp frames:

If there is a direct link between two call and both use ebp frame. We are sure that all calls located in between are not valid. So we could delete them all. This is a very good way to eliminate fake calls but it requires to have an ebp chain and knowing call’s target address. What if we couldn’t find call’s target address ? We can’t delete in between two call … but we can delete all items located between the CallerFrame and the stack’s return address :

 Offset = Frame.CallerFrame - Item.StackPtr - SizeOf(Pointer);
 Offset  = Offset / SizeOf(Pointer);
 Offset  = Item.StackIndex + Offset;
 for I = PItem^.iIndex + 1 to Offset - 1 do
     Delete Stack[I];

Note : Sometimes, you can have many direct call to the same function. And in that situation … you must pick up the valid one. If you don’t, you may delete valid calls ! How to know the valid one ? That’s not something easy and you need to do a hard test … If there is an ebp chain you can apply this rule : As long as that link does not cross a call that is in ebp chain … we consider it as a valid link.

Deleting garbage area:

Before doing your trace, you may think to remove some garbage areas such as try blocks (try … finally, try … except). In x86, try blocks are implemented into the stack. So all data there will make the filter confused because they point to a valid code location and they may looked as valid calls. We do a try blocks trace and we eliminate all related data. This doesn’t help much … but it’s simple and quick … You have nothing to lose.

Deleting invalid calls:

If you know the call’s target address, you can follow this one and check if it contains sub call or not … If no, we eliminate it as long as we didn’t found a jmp instruction that points to a valid call (Some calls use jmp instruction to jump to somewhere else and it happens that the destination address could be valid). In Delphi, dynamic calls use that way.

You can also check whether return address of sub calls appears somewhere else on the stack or not … A valid call must have a sub call that it’s address appears on the stack.

Also, you can delete calls that have fake ebp frame … A valid ebp frame is stored into the stack next to the return address. If we know that the call’s target address (function) uses ebp frame … we check if this frame is valid or not. But how to know ? It’s simple, we check if the frame and its caller frame fit into the stack range.

All those methods are useful but they all require to have call’s target address … And it happens that in many time you don’t own this address !

Emulating the stack:

If you have a good disasm library, you may be able to implement a function to emulate how the CPU will work ! In other words, guessing where the CPU will execute after exiting the call. That’s something a little complex to talk about. You can refer to DebugEngine trace to figure how it works.

Unwinding the stack:

What do we mean by unwinding ? Popping garbage data out of the stack. Data could be function’s locals or params data … See this example:

procedure Foo;
var
  P: array [0 .. 4] of Pointer;
  P1: Pointer;
Begin
  P1:=@DoSomething;
  P[0] := P1;
end;

P is a local variable and it will be accessible only through the stack because it’s a local variable and all local variables live in the stack opposite to global variables which are located in the process heap. When tracing, it happens that some of those data are similar to valid return address because they contain valid code location and that’s why we will have a no clean stack trace if we don’t eliminate them. Now how we could remove them from the stack ? As I said before: By unwinding the stack ! If you disasm Foo function, you will have something like this:

  {x86-32}
  push ebp
  mov ebp,esp
  add esp,-$18 ; <=== Reverse $18 (24) byte in the stack.
 
  ; $18 = 24 (decimal) = Locals size.
  ; P: array [0 .. 4] of Pointer; <= 5 * SizeOf(Pointer) = 20
  ; +
  ; P1: Pointer; 4 byte.
  ; ------------------------------------------------------------
  ; 24 = $18.

  mov [ebp-$04],$005c9ac8
  mov eax,[ebp-$04]
  mov [ebp-$18],eax
  mov esp,ebp
  pop ebp
  ret 

As you see, our program will allocate 24 byte into the stack for our Foo’s locals. And all it’s done by subtracting $18 from the stack pointer register. Delphi use add instruction to subtract (add instruction works with signed and unsigned immediate). So add esp,-$18 is the same as sub esp,$18 and is the same as pushing 6 time in the stack “push something”. Now, back to our technique, we try to disasm all calls and find add esp,$xxxx instruction … then we get $xxxx and we delete all items in the range [item’s index to item’s index + ($xxxx / SizeOf(Pointer))]. This method (“Unwinding the stack”) is very very powerful and eliminate a lot of fake calls. However it’s very dangerous ! if you unwind on a fake call … you may in fact going to delete invalid calls as well as valid calls ! You should use it very very wisely !

Other tricks:

There is other small tricks I used and I’m not going to talk about. You can refer to the source if you are interested.

Stack trace for x64:

On x64, things are little different, when compiling app for x64 platform, compilers insert a lot of information about compiled functions in the PE file (like exceptions and unwinding information) and all those information are only available for x64 (they do not exist if the target platform is x86). Windows will use those information to unwind the stack and it does a good job. It produces a good call trace. So for that reason I didn’t write a NEW trace for x64. DebugEngine’s trace on x64 is based on windows stack trace which is done by calling CaptureStackBackTrace function.

Strong points:

  • There is two way to do stack trace, frames and raw trace.
  • Frames trace is well structured and highly accurate (calls listed there are valid) … but not all functions use frame and function that does not have a frame will not appear as a valid call on the ebp chain.
  • Raw trace is not accurate but it will never skip a valid call as long as the filter is good.
  • When error occurs, ebp chain may get broken and the only way to get the chain is to analyse all calls found on the stack.
  • The most thing responsable for garbage data is function locals data.
  • Filtering the stack is necessary if raw trace was used and it requires a very good disasm library.
  • Finding broken ebp chain can help the filter to eliminate fake calls.

Configuring DebugEngine stack trace:

The easy way to use DebugEngine’s stack trace is calling StackTrace function which will use the default configuration for the stack trace. However, if you want to customize the stack trace, you need to use TCallTrace class instead.

Getting stack info:

You can use GetStackInfo function to get current thread stack info or you can fill StackInfo record with your customized info.

var
  StackInfo: TStackInfo;
begin
  GetStackInfo(StackInfo);
  {StackInfo.StackPtr   = ESP register.
   StackInfo.StackFrame = EBP register.
   StackInfo.StackTop   = Top of stack.
   StackInfo.StackLimit = Stack limit.}
end;

Setting trace mode:

To set trace method, use TraceMethod property. DebugEngine provides three method to do stack trace:

  • stRaw = Raw stack trace.
  • stFrames = EBP frames stack trace. All calls are guaranteed to be a part of ebp chain.
  • *stCombined = Do a raw trace based on ebp frames. If the ebp chain is not broken, DebugEngine will do a stack trace and it will base on calls inside that ebp chain to eliminate fake ones. If there is no ebp chain, Filter takes the control to eliminate fake calls. Note that stCombined is the default method for trace. And I strongly recommend to use that option.

Enabling or disabling stack filter:

Use Filter property to enable or disable the filter.

  StackTrace.Filter := True; // Enable filter.

Setting filter options:

Use FilterOptions property to set up stack trace filter options. You can combine all of those options below:

  • *sfExcludeTryBlocks = Remove the try blocks garbage data from the stack.
  • *sfUnwind = Unwind the stack.
  • sfNoTolerance= The filter will run several times to eliminate as much as it can.
  • *sfHideOutOfRangeCalls = Hide calls that their address does not have a proper address.
  • *sfHideUnknownCallsInfo = Hide calls that do not have address,symbol name information.

Setting stack trace options:

Use Options property to set up stack trace options. You can combine all of those options:

  • *soUseFirstCallOnEbp = Make esp equal to ebp before tracing. The first call is guaranteed to be a part of ebp chain.
  • soRebuildBrokenEbpChain = Try to find the broken ebp chain and use this chain to help the filter.
  • soDropCurrentEbpChain = if soRebuildBrokenEbpChain is used and the trace detected a broken chain, it will drop the current ebp chain and use the broken chain instead.

Note: ”*” means default option.

Example:

var
  I: Integer;
  StackInfo: TStackInfo;
  StackTrace: TCallTrace;
  PItem: PStackItem;
  SL: TStringList;
begin
  GetStackInfo(StackInfo);
  SL := TStringList.Create;
  StackTrace := TCallTrace.Create;
  StackTrace.Filter := True;
  StackTrace.Options := [soUseFirstCallOnEbp];
  StackTrace.FilterOptions := [sfExcludeTryBlocks, sfUnwind, sfHideOutOfRangeCalls, sfHideUnknownCallsInfo];
  try
    StackTrace.StackInfo := StackInfo;
    StackTrace.Trace;
    for I := 0 to StackTrace.Count - 1 do
    begin
      PItem := StackTrace.Items[I];
      Sl.Add(LogCall(PItem));
    end;
  finally
    SL.Free;
    StackTrace.Free;
  end;