A tracing profiler for SBCL, with output suitable for display in Chrome’s/Chromium’s Tracing Viewer (chrome://tracing).
NOTE: in its current form, it’s a pile of hacks altering the behavior of SBCL’s implementation of Common Lisp’s
Therefore it works only on SBCL - and I only tested it on SBCL 1.5.0 (AMD64, Linux). Help wanted in making it as portable as possible.
UPDATE: There’s also an associated blog post with some thoughts about tracing profilers in general, and this one in particular.
Tracer is designed to be used in “drop-in” mode, i.e. no code in the profiled project needs to be modified to achieve its primary functionality.
In the future, extended functionality may require adding some constructs to the profiled code.
Tracer is a tracing profiler - i.e. it collects information not just what was called and how often, but records actual call sequences, optionally
with input and output arguments.
The interface is currently highly unstable and subject to change at a whim. Future versions will not be backwards compatible.
The main entry point is the macro
with-tracing, which takes two arguments - a list of things to trace, and the body of code to profile.
The list of things to trace follows the same convention as SBCL’s implementation of the built-in TRACE macro:
each element “may be a symbol, denoting an individual function, or a string, denoting all functions fbound to symbols whose home package is the package with the given name”.
To collect traces of a REPL-bound calculation:
(tracer:with-tracing ("MY-PROJECT" other-package/some-function) (my-project:some-long-operation))
To collect traces from all running threads, e.g. to trace how a web application responds to HTTP requests:
(tracer:with-tracing ("MY-PROJECT" "MY-WEBSERVER") (sleep 20))
In complex projects, it may be useful to define a helper function:
(defun interesting-traceables () `("MY-PROJECT" "MY-PROJECT/CORE" ,@(tracer:package-symbols-except "MY-PROJECT/OBJECTS" "SOME-ACCESSOR" "OTHER-ACCESSOR") jsown:to-json jsown:to-json* "SQLITE" "UIOP/RUN-PROGRAM" "CLACK.HANDLER"))
PACKAGE-SYMBOLS-EXCEPT is provided by
Tracer to make it easier to trace everything in a package except a specified blacklist of functions.)
And use it like this (note the use of
#. for read-time evaluation):
(tracer:with-tracing #.(interesting-traceables) (my-project:some-long-operation))
After that’s done, in order to save the report to a file invoke:
You can now view the results in Google Chrome or Chromium, by visiting chrome://tracing and loading the saved report with “Load” button.
Sadly, there isn’t much good documentation on how to use the tracing viewer UI. Best I can recommend is
this. Fortunately, traces generated by
Tracer use only the most basic features, so things should be self-evident.
(Also note that Chrome’s trace viewer is not without its own issues; in particular, you may need to load the report multiple times before it gets processed and displayed.)
On storing input arguments and return values of function calls
In order to minimize the size of the resulting trace (as well as maximize timing accuracy), by default saving arguments and return values
of traced functions is disabled. The API for selecting what values to store and when is in flux.
Currently, you can set a global default by installing appropriate function as the value of
*DEFAULT-ARG-CONVERTER*, or specify a
per-symbol policy using the
*TRACING-ARG-CONVERTERS* hash table. See trace-event.lisp for details. As an example, you can do:
(setf tracer::*default-arg-converter* tracer::+arg-converter-store-only-simple-objects-and-strings+) (tracer:with-tracing #.(interesting-traceables) (my-project:some-long-operation)) (tracer:save-report "report.json" :skip-args nil)
in order to remember numbers, characters, booleans, symbols and strings. The fastest and most dangerous of current arg converters
+ARG-CONVERTER-PASSTHROUGH+, which records all inputs and return values “by reference”, without re-encoding them. The problem is, if any of such
recorded values gets changed later, the trace will show these changes instead of the values at the time of the call. In extreme cases - such as use of
destructive operations - this may damage or invalidate the recorded object, causing a condition to be signalled when attempting to print it.
The current interface is pretty crude and user-unfriendly, but it will be improved in the future.
SAVE-REPORT used with
:SKIP-ARGS NIL may cause an error if
*DEFAULT-ARG-CONVERTER* is left at its default value of
+ARG-CONVERTER-IGNORE-ALL+. That’s because the API for deciding what parameters to collect is in flux, and
:SKIP-ARGS is a part of the old API.
Tracer works by abusing
TRACE feature to time and record calls instead of printing them to
To do so, it dynamically replaces a bunch of SBCL’s internals implementing the
TRACE functionality, and restores
original implementations when done recording. Recording is protected by
UNWIND-PROTECT, but if anything goes wrong,
you can call
TRACER:STOP-TRACING yourself just to be sure.
Tracer also currently binds directly to SBCL’s interface to gettimeofday to provide a microsecond-resolution clock.
This is a cludge that may or may not work on systems other than Linux (I think it should). This is an improvement over
the initial approach, which was to use GET-INTERNAL-REAL-TIME, hoping for millisecond resolution, and then to introduce
jitter so that all traced samples are at least 1 microsecond long. The jittered maybe-millisecond-accurate clock is still
available; to use it, do:
(setf tracer::*clock-reset-fun* (lambda () (setf tracer::*hack-clock-jitter* 0)) tracer::*clock-get-time-fun* #'tracer::get-current-time-usec)
It’s unknown how much damage is caused when
Tracer is run when multiple threads are already running and have their functions
wrapped by tracing infrastructure mid-flight. It might be a good idea to encourage/support running
Tracer before most of the
traced application is initialized. One way is to make tracing conditioned on a global variable, letting it run continuously
during execution of the application, and only start saving traces when the flag gets flipped.
(Really might want to consider doing it. Passive tracing guarded by flag would work well with in-code counters and block markers and whatever similar things may come up. But then again, that would mean having to define which functions are being traced at startup.)
It might me just being a bit oversensitive, but I’m starting to have feelings that sometimes (rarely, but sometimes), some
calls get missed by
Tracer. I’ve started to grow these suspicions after looking at database calls in one of the applications
I worked on; sometimes it seems that the actual CFFI calls are missing (but perhaps I misssed them in the trace;
some of them are extremely fast).
This is all a bunch of ugly hacks and I’m not proud of them. Except I kind of am. Despite its kludgy nature, it really does work well.
- Support for counters - adding information about memory use, open database connections, etc. Possibly in “auto-polling” mode, to keep the interface non-invasive. These would have their own specific display in Chrome’s tracing viewer.
- Support for connecting call constructs across threads (e.g. marking that a given
HTTP-REQUESTis related to a handler function invocation in a different thread, visualized by arrows connecting them on the trace graph).
- Tracking thread life time (to mark when a given thread was actually created or stopped).
- More actual documentation.
- Binding to some high-resolution clock that’s both portable and doesn’t require a third-party library. Currently,
Tracerdoesn’t need a library, but the solution isn’t portable.
- API for reporting events to be put on the trace, e.g. to show when the user pressed a key or clicked a button, etc.
- Actual API for safely and efficiently saving input arguments and return values of traced functions.
- Handling of signals/conditions - currently exceptional exits break rendering of the trace report.
- GZIPping saved reports to cut down on their size.
- This should really be handled by external dependency; best we can probably offer is a way to output to stream, in such a way one can pass it to compression.
Dev notes for future changes
(Not relevant to use of
- RE clock jitter hack - it should be possible to remove its influence by going over traces in order they were recorded, counting the jitter that was applied and removing anything except the “+1 us for events shorter than clock resolution” fix. But alternatively, I could just not add jitter in the first place, and post-process saved samples by going over them and applying a per-thread offset to them; offset that increases for tasks with recorded length of 0!
- Speaking of post-processing, here’s an idea: how much of the analysis that Chrome does can I do myself? Can I tell, for any call, how much wall-time vs. self-time it has?
- Anyways; what I’d like to have is a way to say, “what if calls to X took 10% of their time?”, or “what if calls to X that are longer than Y took Y?”, or in general, “what if calls to X with properties Y were Z(X, Y) long?”. I’d like to generate an alternative, filtered trace from that, for side-by-side comparison.
- Storage: I’m wondering about pros and cons of replacing a list of lists with a fixed-memory array of structs. That would definitely help reduce the memory load, perhaps even improving recording performance (but I’m not sure about this; consing is fast).
- Recording arguments:
- There’s no good way to print an arbitrary object to string if there’s a chance that the printing function will be traced, and its arguments will be printed too. This sounds like a recipe for an infinite loop.
- I could perhaps work around this with some dynamic flags.
- As a default, I’d like to move towards not even capturing arguments. But I want to capture some arguments - e.g. queries for database calls are interesting.
- Could I instead provide a selector that could be used to optionally capture arguments for a given function(s)? Could it be provided to with-tracing macro directly? E.g. instead of =’foo=, I’d say =’(foo :trace-args t)=.