Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to RaptorJIT #1264

Closed
wants to merge 504 commits into from
Closed

Migrate to RaptorJIT #1264

wants to merge 504 commits into from

Conversation

lukego
Copy link
Member

@lukego lukego commented Jan 15, 2018

This pull request is intended to migrate Snabb from LuaJIT to RaptorJIT. This supersedes #1172 which is based on the same branch but was opened early as a preview into the development.

Consequences:

  • JIT tracing and dumping will be always enabled. The shm file audit.log contains an efficient binary representation of the information from jit.v and jit.dump. This must be decoded with separate tools (Studio.)
  • Profiling will be always enabled. The shm directory vmprofile/ contains profiler datasets for the engine and for each individual app. The profiler data records which JIT traces were hot and separately tracks machine code, FFI code, VM code, GC, etc. This data must also be decoded with separate tools (Studio and also tools by @eugeneia.)
  • The jit.dump, jit.v, and jit.p tools are no longer available. These are superseded by the features above which are superior because they are "always on" and available in production. Sorry! We will need to adjust to the new tools and identify the use cases that they currently miss.
  • RaptorJIT has some different optimizations and maybe perform better/worse on certain things. We will need to identify and fix any regressions.
  • We will be our own masters who can hack on the JIT compiler for fun and profit! :-)

TODO:

  • I need to provide instructions for examining trace and profiler data with Studio.

I am hoping to land this branch in the next release. Thoughts?

Mike Pall and others added 30 commits July 26, 2017 09:52
Contributed by Djordje Kovacevic and Stefan Pejic from RT-RK.com.
Sponsored by Cisco Systems, Inc.
Extend the "Lua C API" for vmprofile to support allocating multiple
profiles and switching between them. This makes it easier to use.

Previously the expectation was that this functionality would be
implemented in Lua code using ljsyscall to allocate file-backed shared
memory for profiles and then the FFI to switch between them. (This is
still possible, too, and it works the same as it did before.)

Added test coverage to the test suite.
vmprofile: Extend Lua API and add test cases
raptorjit.nix: Fix executable name / broken build
For debugging purposes it is very useful to be able to refer to the
origin of a trace (parent/exit) and so this change stores that
information persistently in GCtrace instead of only ephemerally in
jit_State.

These fields are now duplicated in jit_State (valid while recording)
and GCtrace (valid after recording.) This duplication could be avoided
by putting them only in GCtrace and accessing them via J->cur but this
would be a more noisy change since the existing fields are accessed
from many places including DynASM macros.
Just to be clear about what is actually being checked.
Oops. The magic number and version fields of the VMProfile structure
were only initialized via the FFI interface and not the C API one.
Make JIT trace numbers unique for completed traces. This way a trace
number is an unambiguous way to refer to a GCtrace object even across
flushes over the JIT.

The immediate motivation is to make the vmprofile data valid across
JIT flushes. It is counting samples by trace number and so when trace
numbers are reused then multiple traces can "collide" on the same
profiler counter.
Support JIT for taking the difference between two pointers whose
element sizes are not a power of 2. This was previously an NYI but not
generates code using a division to convert a pointer to an element
number.

The case where the element size is a power of 2 is handled the same
way as before i.e. using a bit-shift instead of a division.
Suggested by François Perrad.
Contributed by Peter Cawley.
Suggested by François Perrad.
Contributed by François Perrad.
wingo and others added 2 commits March 20, 2018 14:22
@wingo
Copy link
Contributor

wingo commented Mar 20, 2018

I am working on raptorjit+snabb currently. The branch is https://github.com/Igalia/snabb/tree/raptorjit. Current additions relative to this branch are Igalia#1032, a merge from master (Igalia#1033), and a fix for DynASM 64-bit immediates (#1302).

wingo and others added 7 commits March 20, 2018 15:35
Load potentially 64-bit values using mov64
RaptorJIT prefers to always write telemetry to an audit log, and allow
rich analysis of this log with external tools.
Remove references to -jv, -jdump, and the like
wingo and others added 15 commits March 21, 2018 17:20
This is another instance of the bug from commit
93ef6bd.  We didn't see any issue on
upstream Snabb's test suites, but with RaptorJIT's new LJ_GC64 usage did
manifest itself as intermittent heap corruption.

Fixes snabbco#1307.
Fix out-of-bounds write in ctable test suite
If a 64-bit value is provided as a 32-bit immediate operand for a DynASM
instruction then error. The previous behavior was to automatically
truncate to 32-bit.

This is particularly significant for GC64 mode where pointers to
objects created by `ffi.new()` cannot be safely truncated to 32 bits.

The solution for referencing 64-bit immediates is to use 'mov64' and
this is suggested in the error message.
dasm: Error when a 64-bit value is used as a 32-bit immediate
…rce"

This reverts commit afbb06e, reversing
changes made to 2ff378a.
Git had misplaced this file in src/ instead of lib/luajit/src/
This should include a fix for
snabbco#1303.
Limit the auditlog to 100MB. This should be ample.

Future changes should make this configurable.
Timestamps are based on CLOCK_MONOTONIC i.e. suitable for calculating
the time delta from one event to the next.
Extends the auditlog feature with event timestamps and a size limit.
@wingo
Copy link
Contributor

wingo commented Apr 3, 2018

From what I can see of the log there are two blockers here:

(1) Not so good perf on basic1-100e6; that could be statistical though. Better to use hydra here
(2) A failure in the tap app's selftest. Seems to be legit. I didn't see it locally because I don't run with SNABB_TAPTEST.

@eugeneia
Copy link
Member

eugeneia commented Jan 7, 2019

Closing this one in favour of #1316.

@eugeneia eugeneia closed this Jan 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants