Merge MJIT infrastructure with conservative JIT compiler #1782
In Feature#12589, Vladimir Makarov proposed to improve VM performance by replacing VM instructions
While his approach for JIT (write C code to local file system, let C compiler executable
Then I developed a JIT compiler called YARV-MJIT, which does not require any VM instruction changes.
What's the "conservative JIT compiler"?
Benchmarked with: Intel i7-4790K (4.0GHz, 8 Cores) with 16GB memory, running x86-64 Ubuntu
Note: the results changed when this was committed. See ed935aa.
Disclaimer: This JIT compiler performs better with gcc compared to clang for now, so it may be slow on macOS (clang).
I used Vladimir's benchmark set, which I modified for my convenience: https://github.com/benchmark-driver/mjit-benchmarks.
Note: the results changed when this was committed. See ed935aa.
If the proposal is accepted, I'm going to add the following CLI options:
Also, for testing, I would like to have the following module and method.
MJIT.enabled? #=> true / false
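For illustration, a test helper could consult `MJIT.enabled?` roughly as follows. This is only a sketch: `mjit_enabled?` and `skip_under_jit` are hypothetical names, and the `defined?` guard is an assumption added so the snippet also runs on a Ruby build that has no `MJIT` module.

```ruby
# Hypothetical helper: true only when the proposed MJIT module exists
# and reports that JIT is enabled.
def mjit_enabled?
  return false unless defined?(MJIT)
  return false unless MJIT.respond_to?(:enabled?)
  MJIT.enabled? ? true : false
end

# Hypothetical helper: run the block unless JIT is enabled.
# Returns :skipped when the block was not run.
def skip_under_jit(name)
  if mjit_enabled?
    warn "skipping #{name}: JIT is enabled"
    return :skipped
  end
  yield
end

skip_under_jit("zombie check") do
  # A check that only makes sense without a JIT worker would go here.
end
```

On a build without the `MJIT` module the helper simply reports `false`, so the guarded block always runs.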
See the commit log for details.
Thank you for your comments. Final results should be written to https://bugs.ruby-lang.org/issues/14235 for easy tracking, but for now it is better to comment here or at https://github.com/k0kubun/yarv-mjit/issues.
That's because the results may change before the merge. To remove all bugs, I'm going to add some guards, and the benchmark score will be a little worse after that.
I added some guards to let all tests pass with AOT compilation, and removed some optimizations to avoid SEGV caused by a race condition (these should be fixed and added back later).
Unfortunately, performance got worse as a result:
Micro benchmarks (revised)
With JIT (
I have some ideas for bringing the performance above back to the level of the original proposal.
Until Jan 24th (the next Ruby developers meeting in Tokyo), I'll work on it, try to figure out why it becomes worse with Rails, and create a script that generates
If things go well, hopefully this (or part of this) will be merged on that day.
(edit) I'm building a JIT compiler generator to decrease maintenance cost before merging.
which is created by transforming a preprocessed vm.c.

Makefile.in: generate MJIT header for UNIX environments.
win32/Makefile.sub: generate MJIT header for Windows environments.
tool/transform_mjit_header.rb: New. This script was originally written by Vladimir N. Makarov <firstname.lastname@example.org>. I then refactored it a little, fixed some bugs, and ported it to work on Windows. Also, because the original minimize_mjit_header.rb takes too long to run, this version skips the minimization step: keeping unused *static* definitions does not waste compilation time at -O2, since the compiler can skip compiling unused static functions.

This header installation does NOT include a header to automatically export symbols used by MJIT. That's because the original MJIT code was failing to export symbols in the import header. But I would like to have that functionality in the future for maintainability.
mjit.c is authored by Vladimir Makarov <email@example.com>. After he invented a great method JIT infrastructure for MRI as MJIT, Lars Kanis <firstname.lastname@example.org> sent a patch to support MinGW in MJIT. In addition to merging it, I ported pthread to Windows native threads. Now this MJIT infrastructure can be compiled with Visual Studio.

This commit simplifies mjit.c to reduce the amount of code in the initial merge. For example, this commit does not provide the AOT compiler or support for multiple JIT threads. We can resurrect them later if we really want them, but I wanted to minimize the diff to make this patch easier to review.

mjit.h: New. It has the `mjit_exec` interface, similar to `vm_exec`, which is for triggering MJIT. Compared to the original MJIT, this drops the interface for AOT.
Makefile.in: define macros to let MJIT know the path of the MJIT header. Probably we can refactor this to reduce the number of macros (TODO).
win32/Makefile.sub: ditto.
common.mk: compile mjit.o and mjit_compile.o. Unlike the original MJIT, this commit separates the MJIT infrastructure and the JIT compiler code into independent object files. As the initial patch is going to have a not-ultra-fast JIT compiler, the JIT compiler is likely to be replaced, e.g. by the original MJIT's compiler or by some future JIT implementations which are not public now.
mjit_compile.c: added an empty compiler so that you can reuse this commit to build your own JIT compiler.
inits.c: define the MJIT module. This is added because `MJIT.enabled?` was necessary for testing.
test/lib/zombie_hunter.rb: skip if `MJIT.enabled?`. Obviously this wouldn't work with the current code when JIT is enabled.
test/ruby/test_io.rb: skip this too. This would make no sense with MJIT.
ruby.c: define MJIT CLI options. As a major difference from the original MJIT, "-j:l"/"--jit:llvm" are renamed to "-j:c"/"--jit:cc" because I want to support not only gcc/clang but also cl.exe (Visual Studio) in the future. It takes only "-j:c=gcc" and "-j:c=clang" for now. The original "-j:c" is renamed to "-j:n" because it conflicts. Also this initializes the MJIT thread and variables.
eval.c: finalize the MJIT worker thread and variables.
test/ruby/test_rubyoptions.rb: fix the number of CLI options for the -j addition.
thread_pthread.c: change for pthread abstraction in MJIT. Prefix rb_ for functions which are used by other files.
thread_win32.c: ditto, for Windows. This pthread porting is one of the major pieces of work done in YARV-MJIT, which is my fork of MJIT. We should share those improvements.
thread.c: follow rb_ prefix changes.
vm.c: trigger MJIT on VM invocation. Also trigger `mjit_mark` to avoid SEGV caused by a race between JIT and GC of ISeq. The improvement was provided by wanabe <email@example.com>.
vm_insnhelper.h: trigger MJIT on method calls during VM execution.
vm_core.h: add fields required for mjit.c. `bp` must be `cfp` because rb_control_frame_struct is likely to be cast to another struct. The last position is the safest place to add the new field.
vm_insnhelper.c: save the initial value of cfp->ep as cfp->bp. This is an optimization which is done in both MJIT and YARV-MJIT, so this change is added in this commit. Calculating bp from ep is a little heavy, so bp is a kind of cache for it.
iseq.c: notify MJIT of ISeq GC. We should know which iseq in the MJIT queue is GCed to avoid SEGV.
gc.c: add hooks so that MJIT can wait for GC, and vice versa. Simultaneous JIT and GC executions may cause SEGV, so we should synchronize them.
cont.c: save continuation information in the MJIT worker. As MJIT shouldn't unload JIT-ed code which is being used, MJIT wants to know the full list of saved execution contexts for continuations and detect ISeqs in use.
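The mutual waiting described for gc.c can be modeled in a few lines. The sketch below is written in Ruby rather than C purely for illustration; `JitGcGate`, `with_gc`, and `with_jit` are hypothetical names, not functions from mjit.c, and the model assumes exactly one GC caller and one JIT worker.

```ruby
# Hypothetical model of the GC/JIT handshake; not the mjit.c API.
# Assumes a single GC caller and a single JIT worker thread.
class JitGcGate
  def initialize
    @lock = Mutex.new
    @cond = ConditionVariable.new
    @in_gc = false   # true while GC is running
    @in_jit = false  # true while the JIT worker is compiling
  end

  # Run the block as "GC": wait until the JIT worker is idle first.
  def with_gc
    @lock.synchronize do
      @cond.wait(@lock) while @in_jit
      @in_gc = true
    end
    begin
      yield
    ensure
      @lock.synchronize do
        @in_gc = false
        @cond.broadcast # wake a waiting JIT worker
      end
    end
  end

  # Run the block as "JIT": wait until GC has finished first.
  def with_jit
    @lock.synchronize do
      @cond.wait(@lock) while @in_gc
      @in_jit = true
    end
    begin
      yield
    ensure
      @lock.synchronize do
        @in_jit = false
        @cond.broadcast # wake a waiting GC
      end
    end
  end
end
```

Both flags are only changed while holding the same lock, so the two blocks never run at the same time; each side simply sleeps on the condition variable until the other side clears its flag.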
which has been developed by Takashi Kokubun <takashikkbn@gmail> as YARV-MJIT. Many of its bugs were fixed by wanabe <firstname.lastname@example.org>.

This JIT compiler is designed to be a safe migration path for introducing a JIT compiler to MRI. So this commit does not include any bytecode changes or dynamic instruction modifications, which are done in the original MJIT.

This commit strips off some aggressive optimizations even from YARV-MJIT, and thus it's slower than YARV-MJIT too. But it's still faster than Ruby 2.5.

Note that this JIT compiler passes `make test`, `make test-all`, and `make test-spec` without JIT, AND even with JIT. Not only is it perfectly safe with JIT disabled, because unlike MJIT it does not replace VM instructions, but with JIT enabled it also stably runs Ruby applications, including Rails applications.

I expect this version to be just the "initial" JIT compiler. I have many optimization ideas which are skipped for the initial merge, and you may easily replace this JIT compiler with a faster one just by replacing mjit_compile.c. The `mjit_compile` interface is designed for that purpose.

common.mk: update dependencies for mjit_compile.c.
internal.h: declare `rb_vm_insn_addr2insn` for MJIT.
vm.c: exclude some definitions if `-DMJIT_HEADER` is provided to the compiler. This avoids including some functions which take a long time to compile, e.g. vm_exec_core. Part of this is achieved in transform_mjit_header.rb (see `IGNORED_FUNCTIONS`), but the rest is manually resolved for now. Load mjit_helper.h for the MJIT header.
mjit_helper.h: New. This is a file used only by JIT-ed code. I'll refactor `mjit_call_cfunc` later.
vm_eval.c: add some #ifdef switches to skip compiling some functions like Init_vm_eval.
win32/mkexports.rb: export thread/ec functions, which are used by MJIT.
include/ruby/defines.h: add the MJIT_FUNC_EXPORTED macro alias to clarify that a function is exported only for MJIT.
array.c: export a function used by MJIT.
bignum.c: ditto.
class.c: ditto.
compile.c: ditto.
error.c: ditto.
gc.c: ditto.
hash.c: ditto.
iseq.c: ditto.
numeric.c: ditto.
object.c: ditto.
proc.c: ditto.
re.c: ditto.
st.c: ditto.
string.c: ditto.
thread.c: ditto.
variable.c: ditto.
vm_backtrace.c: ditto.
vm_insnhelper.c: ditto.
vm_method.c: ditto.

I would like to improve the maintainability of function exports, but I believe this way is acceptable for the initial merge if we make clear that the new exports are for MJIT (so that we can use them as a TODO list to fix) and add unit tests to detect unresolved symbols. I'll add unit tests of JIT compilation once the `MJIT.compile` feature (synchronous JIT compilation of an ISeq) is accepted.
PUSH() was unexpected for the current generator.