Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Browse files

posts: dtrace ustack helper

  • Loading branch information...
commit 41ddaf6561ba87960cd3b5e219c36fa7c574f837 1 parent 952728a
@indutny authored
View
222 posts/3.dtrace-ustack-helper.md
@@ -0,0 +1,222 @@
+# DTrace and the little ustack helper that could
+
+<a target=__blank href="/f/tlsnappy-x64.svg">
+ ![Flamegraph](/images/flamegraph.png)
+</a>
+
+[Flamegraphs][0] are awesome if you need to profile your node.js application.
+They provide a nice looking visual view of where your application is spending
+it's time. Although they're [well][0] [documented][1], no one has ever said a
+word on how this thing is really working internally, but everyone is mentioning
+"ustack helper" which is working only on SmartOS right now.
+
+Call stack
+==========
+
+To understand profiling, one must understand what is call stack. During it's
+lifetime every application is using [stack][2], which is a chunk of memory that
+could be changed by using `push`, `pop`, `call` and other CPU instructions, or
+by accessing it directly.
+
+`push` and `pop` instructions will simply expand/shrink stack storing/loading
+data to the top of it. `call` instruction is a little bit more interesting:

Start each sentence with 'The'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+
+_(Quote from
+ [Intel® 64 and IA-32 Architectures Software Developer’s Manual][3])_
+
+ ...the processor pushes the value of the EIP register (which contains the
+ offset of the instruction following the CALL instruction) on the stack (for
+ use later as a return-instruction pointer). The processor then branches to
+ the address in the current code segment specified by the target operand.
+
+So in pair with `ret` instruction `call` allows you to jump into some function

s/in pair with/coupled with the/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+and return back to the place where it was called. (Despite it's simplicity, I
+still find it amazing.)
+
+That's how calling functions really work internally, but stack can also be used
+to store local (on-stack) function's data. This is achieved using stack frames,
+lets look how functions do usually look:

Odd sentence. (And it's "let's", as in "let us".)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+
+_(I'll use [AT&T assembly syntax][4])_
+
+ push ebp ; Save previous frame pointer
+ mov esp, ebp ; Set new frame pointer
+ sub $0x60, esp ; Allocate space on stack
+
+ ; Function's body.
+ mov $0x10, -0x8(%ebp) ; set on-stack variable
+
+ mov ebp, esp ; Shrink stack to it's initial value
+ pop ebp ; Restore previous frame pointer
+ ret 0 ; Return to caller
+
+If represented graphically stack is generally looks like this:
+
+![Callstack](/images/callstack.png)
+
+The main pros of using following structure are:

following?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+
+ * Easiness of contracting stack back to it's initial position

s/Easiness of contracting stack/Restoring the stack/ ?

s/it's/its/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+ * Fast and simple access of on-stack variables
+
+Stack trace
+===========
+
+Suppose application has thrown exception or crashed with segmentation fault,

Insert 'your', 'an' and 'a' judiciously here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+to find the cause of the problem one may start by looking at stack trace where
+the crash has happened:
+
+ #0 0x00007fff84356d16 in kevent ()
+ #1 0x00000001000557b7 in kqueue_poll ()
+ #2 0x000000010004c77a in uv__run ()
+ #3 0x000000010004c92a in uv_run ()
+ #4 0x0000000100015319 in node::Start ()
+ #5 0x000000010000dd24 in start ()
+
+Here, on the left, you can see addresses of functions' code. Debugger gets them
+by taking current `eip` and `ebp` registers (which stands for current
+instruction address and current stack frame address), walking stack frames and
+collecting return addresses from it. On the right side, you can see functions'
+real names. [gdb][5] automatically loads this information for you by searching
+for debugging symbols corresponding to addresses it has collected.
+
+Flamegraph
+==========
+
+In order to create flamegraph, one will need to periodically collect
+application's stack traces and join them (this's called

s/this's/this is/ (never used in contracted form)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+[statistical profiling][6]), making boxes with functions that were called more
+often - wider, and putting box on the top of another box only if their functions
+appear above each other in the stack trace.
+
+V8's stack frames
+=================
+
+When collecting stack traces of C/C++ application, dtrace will use static
+debugging information provided by compiler. But when it comes to dynamic
+languages, providing such information turns to be more complicated: functions

turns out

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+are compiled lazily, often recompiled with applied optimizations, old code may
+be evicted by GC... in other words, application is evolving during it's

s/it's/its/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+execution.
+
+Thankfully, v8 is providing this information, but instead of debugging symbols
+it stores it in stack frames. Here is an example of v8's stack frame structure:
+
+![V8 Callstack](/images/v8-callstack.png)
+
+So knowing this structure we can identify frames by checking marker/function
+value and getting function names from v8's heap (it's too big topic to cover
+here, believe me).
+
+That's exactly a job of ustack helper, it takes frame address and should figure
+out and return function's name, or just fail. So everytime you call `jstack()`
+function in DTrace probe, ustack helper will be called for every not-identified

unidentified

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+frame.
+
+ustack helper example
+=====================
+
+_NOTE: some knowledge of D language is required to fully understand code below_
+
+ dtrace:helper:ustack:
+ {
+ /* frame pointer */
+ this->fp = arg1;
+
+ /* Last instruction - result */

s/instruction/statement/ ? 'instruction' suggests machine opcode.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+ "whoa! you've identified me";
+ }
+
+If you'll replace contents of `src/v8ustack.d` in node.js sources, recompile it

s/you'll/you/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+(on SmartOS), run `bash benchmark/http-flamegraph.sh`, and open `stacks.src`,
+which should contain following stack traces:
+
+ node`_ZN2v88internal7Context14native_contextEv
+ node`_ZN4node10StreamWrap15WriteStringImplILNS_13...
+ node`_ZN4node10StreamWrap15WriteUtf8StringERKN2v89ArgumentsE+0x9
+ whoa! you've identified me
+ whoa! you've identified me
+ whoa! you've identified me
+
+As you can see, DTrace has identified some C++ functions and for all other
+addresses has called our ustack helper.
+
+Lets read some data from v8's stack frame:

s/Lets/Let's/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+
+ #define FP_MARKER (-2 * 8)
+ #define FT_ENTRY (1 << 32)
+
+ /* Init */
+ dtrace:helper:ustack:
+ {
+ this->fp = arg1;
+ this->done = 0;
+ this->marker = (uint64_t) 0;
+ }
+
+ /* Get marker */
+ dtrace:helper:ustack:
+ {
+ this->marker = *(uint64_t*) copyin(this->fp + FP_MARKER,
+ sizeof(uint64_t));
+ }
+
+ /* Match entry marker */
+ dtrace:helper:ustack:
+ /this->marker == FT_ENTRY/
+ {
+ this->done = 1;
+ "entry";
+ }
+
+ /* Match everything else */
+ dtrace:helper:ustack:
+ /!this->done/
+ {
+ "everything else";
+ }
+
+Run it again, and if you're lucky enough you'll find this in `stacks.src`:
+
+ everything else
+ everything else
+ entry
+ node`_ZN2v88internalL6InvokeEbNS0...
+
+Important things about ustack helper:
+
+* It's running within kernel (though, in it's own context, so it can't crash

s/it's/its/

"context" is a little vague, isn't it? The script gets compiled to bytecode that's executed in a virtual machine that runs in kernel mode.

@indutny Owner
indutny added a note

Yeah, it is... but I'm not going to dive in such details here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+ it). The most important consequences of it is that user-land addresses can't
+ be accessed directly, but only by using `copyin()` function.
+* Usage of control flow statements (if/foreach/while) in DTrace scripts is
+ prohibited, since all probes should terminate in a reasonable time. Otherwise
+ infinite loop in kernel space will cause your system to halt.
+
+
+Debugging ustack helper
+=======================
+
+During development of 64bit platform support for node.js ustack helper, I found
+that it's pretty hard to debug ustack helper. The only method to do this is
+insertion of probes which are returning some debugging information, and
+observing this information later in stack traces.
+
+Additionally, it's worth noting that failed `copyin()` or any bad memory access
+won't produce any informative output, but you'll see raw address in stack trace
+(i.e. 0x0000000012345678) rather than your pretty real function's name.
+
+Epilogue
+========
+
+You can look at/play with node's [ustack helper][7], big kudos to
+[Dave Pacheco][8] for developing it!
+
+[0]: http://blog.nodejs.org/2012/04/25/profiling-node-js/
+[1]: http://dtrace.org/blogs/dap/2012/01/05/where-does-your-node-program-spend-its-time/
+[2]: http://en.wikipedia.org/wiki/Stack_(abstract_data_type)
+[3]: http://download.intel.com/products/processor/manual/325462.pdf
+[4]: http://en.wikipedia.org/wiki/X86_assembly_language#Syntax
+[5]: http://www.gnu.org/software/gdb/
+[6]: http://en.wikipedia.org/wiki/Profiling_(computer_programming)#Statistical_profilers
+[7]: https://github.com/joyent/node/blob/master/src/v8ustack.d
+[8]: https://github.com/davepacheco
View
2  public/css/main.css
@@ -54,7 +54,7 @@ header {
}
header h1 {
- padding: 18px 0;
+ padding: 18px 33px 18px 0;
text-align: center;
}
View
BIN  public/images/callstack.png
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
View
BIN  public/images/flamegraph.png
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
View
BIN  public/images/v8-callstack.png
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Please sign in to comment.
Something went wrong with that request. Please try again.