with-continuation-mark based tracer #169

sorawee · 2020-09-14T08:55:22Z

This PR implements a new strategy to instrument code via the with-continuation-mark feature. Note that using w-c-m naively, like how the errortrace library does, gives a suboptimal stack trace, so we use a different algorithm that gives a better result (see details below).

The new strategy is more efficient than the current one. Here's a benchmark result, obtained from running test/trace/perf-runner.rkt.

Current strategy
----------------
test/trace/stress/non-tail.rkt: 1628.1 ms
test/trace/stress/tail.rkt: 1409.2 ms

New strategy
------------
test/trace/stress/non-tail.rkt: 1018 ms
test/trace/stress/tail.rkt: 401.7 ms

The new strategy preserves proper tail recursion (as does errortrace): space complexity does not change after instrumentation. This seems to have a side effect of speeding up programs with tail recursion significantly, as shown in the above benchmark (tail.rkt). However, this property is a double-edged sword because it means some logical stack frames must be elided. In practice, the elision doesn't seem to affect debuggability much, as most other frames can still provide useful information, and it is very rare for a symbolic program to have tail recursion due to path merging, so everything seems to work out just fine.

Instrumentation

See the proposal to improve Racket's errortrace at https://www.mail-archive.com/racket-users@googlegroups.com/msg44786.html for the background and see Matthew's suggestion for an efficient implementation strategy that I ended up basing my algorithm on.

It turns out that the proposal doesn't quite work as I expected. Consider the program:

(define (fact n)
  (if (zero? n) (1) (* n (fact (sub1 n)))))
(add1 (fact 5))

By using the mentioned algorithm, the add1 frame will be kept, even though we have not really called add1 yet when the error occurs.

To filter out the add1 frame, we refine the algorithm by assigning a flag saying that a mark is "uncertified" every time we attach a new one. Therefore, both add1 and fact frames will be uncertified initially. Then, in every function body, we certify its frame. This causes fact to be certified while leaving add1 uncertified. By throwing away uncertified frames when an error occurs, we obtain the stack trace as desired.

The actual implementation is slightly more complicated because we need to work with the expanded code. Not all function calls are user-written, so we should not add new information for these calls (e.g., branch-and-merge should not appear in the stack trace).

Limitation

The above algorithm relies on a function to certify its frame. This means that if the function is not instrumented (especially for higher-order functions that are defined in stdlib), the uncertified frame will be missing. This could be seen as either a bug or a feature. Arguably, these missing frames are not useful on their own, but their absence might be surprising.

Tests

Interestingly, from all 31 existing tests, only one outputs differently after switching to the new strategy. This suggests that the new strategy is highly compatible with the current one. For the (only one) test that failed, it is due to the limitation described above (list.rkt misses a frame due to the higher-order, built-in function map).

If users use eval, it could be an arbitrary syntax which could potentially break our invariant.

emina · 2020-09-15T17:11:08Z

Nice; thanks!

sorawee added 10 commits September 13, 2020 17:22

checkpoint: w-c-m based instrumentation

9ad4c00

checkpoint: make macro generated #%app push marks

5881620

remove symbilic-trace-tail?

48b720f

Add stress tests

3d33a23

Fix a bug when macro originates error

4bff8c8

Use eq? based comparison

9a8c7c6

Track syntax on original core form + make cert from fun body

06d63b7

Add a corresponding test

7466713

Code clean up

755ef19

Check that we are instrumenting module form

fdc30d7

If users use eval, it could be an arbitrary syntax which could potentially break our invariant.

emina merged commit d82ce66 into emina:master Sep 15, 2020

sorawee deleted the w-c-m-tracer branch September 15, 2020 17:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

with-continuation-mark based tracer #169

with-continuation-mark based tracer #169

sorawee commented Sep 14, 2020 •

edited

Loading

emina commented Sep 15, 2020

with-continuation-mark based tracer #169

with-continuation-mark based tracer #169

Conversation

sorawee commented Sep 14, 2020 • edited Loading

Instrumentation

Limitation

Tests

emina commented Sep 15, 2020

sorawee commented Sep 14, 2020 •

edited

Loading