Apply partial loop-unrolling to avoid trace aborts #238

alexandergall · 2014-08-13T10:04:55Z

The JIT compiler has to unroll inner loops in a trace. It does this up to a (configurable) limit and aborts the trace if it hits that limit.

This commit tries to avoid excessive unroll operations by manually unrolling a few iterations in critical code sections like multi-buffer receive and transmit processing. This significantly reduces trace aborts in the case of, for example, encapsulation operations that typically use two buffers per packet. It should not have any negative impact on single-buffer packets.
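As a rough illustration of the manual unrolling described above, here is a minimal sketch. The packet layout (`niovecs`, `iovecs`, `length`) and both function names are assumptions made for this example and are not taken from the commit itself.

```lua
-- Hypothetical packet layout: p.niovecs buffers stored in
-- p.iovecs[0 .. niovecs-1], each with a .length field.

-- Plain version: every buffer is handled inside the inner loop, which the
-- JIT has to unroll when this code is part of a larger trace.
local function packet_length_loop (p)
   local total = 0
   for i = 0, p.niovecs - 1 do
      total = total + p.iovecs[i].length
   end
   return total
end

-- Partially unrolled version: the first two buffers are handled by
-- straight-line code, so the loop body only runs for packets that use
-- more than two buffers.
local function packet_length_unrolled (p)
   local n = p.niovecs
   local total = 0
   if n >= 1 then total = total + p.iovecs[0].length end
   if n >= 2 then total = total + p.iovecs[1].length end
   for i = 2, n - 1 do
      total = total + p.iovecs[i].length
   end
   return total
end
```

Both functions return the same result. The second one only trades a little code size for a loop-free path in the common one- and two-buffer cases.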

This could be a middle ground between full-fledged multi-buffer packets and the somewhat radical single-buffer approach proposed by Luke some time ago.

The commit also deliberately avoids tail calls in a few places, to sidestep an effect discussed in the thread at http://www.freelists.org/post/luajit/FFI-methods-for-userdata-objects,3
This is more guesswork than well-founded analysis, though.
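For readers unfamiliar with how a tail call is avoided in Lua, the sketch below shows two common ways. The function names are hypothetical and the commit may use a different form.

```lua
-- Hypothetical callee standing in for a function on the packet path.
local function transmit (buffer)
   return #buffer
end

-- Tail call: "return f(args)" reuses the caller's stack frame.
local function send_tail (buffer)
   return transmit(buffer)
end

-- Not a tail call: the result is stored in a local before returning.
local function send_local (buffer)
   local n = transmit(buffer)
   return n
end

-- Not a tail call either: the parentheses adjust the result list to
-- exactly one value, so Lua performs an ordinary call.
local function send_paren (buffer)
   return (transmit(buffer))
end
```

All three variants return the same value for a single-result callee. They differ only in whether the call is made as a tail call.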

lukego · 2014-08-15T07:28:25Z

It would be really wonderful to have a way to test and measure these optimizations. Do you have any ideas?

Now we have a built-in benchmarking setup, so if we could write a benchmark that captured this optimization we could avoid regressions in the future.

Generally with fancy optimizations it would also be great to find a way for people in the future to know:

- Is this still needed? (e.g. after LuaJIT upgrade)
- Is this needed in other places too? (How do you check if it helps?)

alexandergall · 2014-08-18T09:24:05Z

I don't know how to quantify this effect apart from the feedback from the JIT compiler, ranging from interpreter fallback and trace aborts to function blacklisting.

To me, it's basically the same category as NYIs, where the reasoning simply is: avoid interpreted code in the packet processing path.

To assess whether it's still needed at a later point in time, I suppose you'd have to revert to the original loops and check what the compiler says.
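One way to "check what the compiler says" is LuaJIT's verbose mode, either via `luajit -jv` on the command line or via the `jit.v` module. The sketch below assumes a stock LuaJIT with `jit.v` on the module path. The `workload` function is only a placeholder for the original loops and may not trigger an abort itself.

```lua
-- Enable verbose JIT output if the jit.v module is available; under
-- plain Lua the pcall fails and the script still runs.
local ok, jit_v = pcall(require, "jit.v")
if ok then jit_v.on() end   -- with no argument, output goes to stderr

-- Placeholder workload: an outer per-packet loop with an inner
-- per-buffer loop. Substitute the code under investigation here.
local function workload (npackets, nbuffers)
   local total = 0
   for _ = 1, npackets do
      for _ = 1, nbuffers do
         total = total + 1
      end
   end
   return total
end

local result = workload(100000, 2)

if ok then jit_v.off() end
```

In the verbose output, aborted traces appear as `[TRACE --- ...]` lines with the reason for the abort, for example "loop unroll limit reached". Comparing the output before and after reverting the manual unrolling shows whether the aborts come back.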

As usual, this is all guesswork and trial-and-error. At the heart of all this hackery is my worry about occasional "bad runs" that my app experiences.

lukego · 2014-08-18T15:33:57Z

The automatic performance testing that Max implemented does actually run a benchmark multiple times and look at the distribution of results. So if we had a benchmark that behaves more consistently after the optimization, then the CI might be able to make that visible.

@eugeneia what do you think?

eugeneia · 2014-08-18T17:36:07Z

While it's not currently included in SnabbBot reports, our benchmark utility (src/scripts/cperf/cperf.sh) does indeed compute the standard deviation of the results of n runs, where n is configurable.

I was going to ramble about standard deviation being more likely to yield false positives but then again there is only one way to find out! :)
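To make the idea of a standard deviation over n runs concrete, here is a minimal sketch in Lua. It is not how cperf.sh is implemented. The helper names and the use of `os.clock` are assumptions chosen for illustration.

```lua
-- Hypothetical helper: mean and sample standard deviation (n - 1 in the
-- denominator) of a list of numbers.
local function mean_and_stddev (samples)
   local n = #samples
   assert(n > 0, "need at least one sample")
   local sum = 0
   for i = 1, n do sum = sum + samples[i] end
   local mean = sum / n
   if n == 1 then return mean, 0 end   -- no spread with a single sample
   local sq = 0
   for i = 1, n do
      local d = samples[i] - mean
      sq = sq + d * d
   end
   return mean, math.sqrt(sq / (n - 1))
end

-- Run `benchmark` (any function) n times and summarise the CPU time
-- reported by os.clock for each run.
local function measure (benchmark, n)
   local samples = {}
   for i = 1, n do
      local start = os.clock()
      benchmark()
      samples[i] = os.clock() - start
   end
   return mean_and_stddev(samples)
end
```

A benchmark with occasional "bad runs" would show up as a larger standard deviation relative to the mean, which is the signal the CI could watch for.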

lukego added a commit that referenced this pull request Aug 25, 2014

Merge pull request #238 from alexandergall/loop-unroll

f1eb423

lukego merged commit f1eb423 into snabbco:master Aug 25, 2014

alexandergall deleted the loop-unroll branch October 2, 2015 11:13
