Apply partial loop-unrolling to avoid trace aborts #238

Merged (1 commit) on Aug 25, 2014

alexandergall
Contributor

The JIT compiler has to unroll inner loops in a trace. It does this up to a (configurable) limit and aborts the trace if it hits that limit.

This commit tries to avoid excessive unroll operations by manually unrolling a few iterations in critical code sections like multi-buffer receive and transmit processing. This significantly reduces trace aborts in the case of, for example, encapsulation operations that typically use two buffers per packet. It should not have any negative impact on single-buffer packets.
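As a rough illustration of the idea (not the code from this commit, which is Lua running under LuaJIT), partial unrolling handles the first couple of iterations explicitly and leaves a residual loop for the rare longer case. The sketch below uses Python purely to show the shape of the transformation; `process_generic`, `process_partially_unrolled`, and `handle` are hypothetical names, and Python itself gains nothing from this rewrite.

```python
from typing import Callable, List


def process_generic(buffers: List[bytes], handle: Callable[[bytes], None]) -> None:
    # Plain inner loop: under a tracing JIT, this loop has to be unrolled
    # whenever it is reached from inside an outer trace.
    for buf in buffers:
        handle(buf)


def process_partially_unrolled(buffers: List[bytes], handle: Callable[[bytes], None]) -> None:
    # Hand-unrolled handling of the common one- and two-buffer cases.
    n = len(buffers)
    if n >= 1:
        handle(buffers[0])
    if n >= 2:
        handle(buffers[1])
    # Residual loop: only iterates for packets with more than two buffers.
    for i in range(2, n):
        handle(buffers[i])
```

Both functions visit the buffers in the same order; the second simply keeps the residual loop from iterating at all for packets of up to two buffers, which is the case the description above is concerned with.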

This could be a middle ground between full-fledged multi-buffer packets and the somewhat radical single-buffer approach proposed by Luke some time ago.

The commit also deliberately prevents tail calls in a few places to avoid an effect discussed in the thread
http://www.freelists.org/post/luajit/FFI-methods-for-userdata-objects,3. This is more guesswork than well-founded analysis, though.

@lukego
Member

lukego commented Aug 15, 2014

It would be really wonderful to have a way to test and measure these optimizations. Do you have any ideas?

Now we have a built-in benchmarking setup so if we could write a benchmark that captured this optimization we could avoid regressions in the future.

Generally with fancy optimizations it would also be great to find a way for people in the future to know:

  • Is this still needed? (e.g. after LuaJIT upgrade)
  • Is this needed in other places too? (How do you check if it helps?)

@alexandergall
Contributor Author

I don't know how to quantify this effect apart from the feedback from the JIT compiler, ranging from interpreter fallback and trace aborts to function blacklisting.

To me, it's basically the same category as NYIs, where the reasoning simply is: avoid interpreted code in the packet processing path.

To assess whether it's still needed at a later point in time, I suppose you'd have to revert to the original loops and check what the compiler says.
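One way to make "check what the compiler says" repeatable is to capture the verbose JIT output (for example from a run with LuaJIT's `-jv` option) and tally the abort reasons. The following is only a sketch: it assumes abort lines look like `[TRACE --- <start> -- <reason> at <location>]`, which should be verified against the LuaJIT version in use, and `count_abort_reasons` is a hypothetical helper, not part of Snabb or LuaJIT.

```python
from collections import Counter
from typing import Iterable

# Assumed prefix of an abort line in the verbose JIT output.
ABORT_PREFIX = "[TRACE --- "


def count_abort_reasons(lines: Iterable[str]) -> Counter:
    """Tally trace-abort reasons found in captured verbose JIT output."""
    reasons: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith(ABORT_PREFIX) or " -- " not in line:
            continue  # not an abort line under the assumed format
        # Everything after " -- " is the reason, optionally followed by
        # " at <location>"; drop the closing bracket and the location.
        reason = line.split(" -- ", 1)[1].rstrip("]")
        reason = reason.split(" at ", 1)[0]
        reasons[reason] += 1
    return reasons
```

Comparing the tallies from a run with the original loops against a run with the unrolled ones would show whether the unroll-limit aborts actually went away.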

As usual, this is all guesswork and trial-and-error. At the heart of all this hackery is my worry about occasional "bad runs" that my app experiences.

@lukego
Member

lukego commented Aug 18, 2014

The automatic performance testing that Max implemented does actually run a benchmark multiple times and look at the distribution of results. So if we had a benchmark that behaves more consistently after the optimization, then the CI might be able to give us visibility into that.

@eugeneia what do you think?

@eugeneia
Member

While it's not currently included in SnabbBot reports, our benchmark utility (src/scripts/cperf/cperf.sh) does indeed compute the standard deviation of the results of n runs, where n is configurable.

I was going to ramble about standard deviation being more likely to yield false positives but then again there is only one way to find out! :)
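For reference, the kind of summary described here can be sketched in a few lines. This is not how cperf.sh is implemented; `summarize_runs` is a hypothetical helper, and the choice of the sample (n - 1) standard deviation is an assumption.

```python
import statistics
from typing import Sequence, Tuple


def summarize_runs(results: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of n benchmark results."""
    if len(results) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return statistics.mean(results), statistics.stdev(results)
```

A lower standard deviation across runs after a change would be a hint, though not proof, that the occasional "bad runs" have become rarer.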

lukego added a commit that referenced this pull request Aug 25, 2014
Apply partial loop-unrolling to avoid trace aborts
@lukego lukego merged commit f1eb423 into snabbco:master Aug 25, 2014
@alexandergall alexandergall deleted the loop-unroll branch October 2, 2015 11:13