send-pop optimisation #2100

Open
wants to merge 20 commits into
base: master
from
Commits on Mar 20, 2019
  1. new VM timestamp variable

    shyouhei committed May 24, 2016
    This variable is expected to be an integer type which can be incremented
    atomically.  Expected to be used where certain object's "freshness" is
    vital, e.g. when invalidating a cache.
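As a rough illustration of the idea, here is a toy model in plain Ruby. It is only a sketch: the real variable lives in the VM's C code, and the names `Timestamp` and `StampedCache` are hypothetical. A cache entry remembers the counter value at fill time and counts as stale once the counter has moved on.

```ruby
# Toy model only; Timestamp and StampedCache are hypothetical names.
class Timestamp
  def initialize
    @value = 0
    @lock = Mutex.new
  end

  # Bump the counter atomically (with respect to other users of this object).
  def tick
    @lock.synchronize { @value += 1 }
  end

  def now
    @lock.synchronize { @value }
  end
end

class StampedCache
  def initialize(clock)
    @clock = clock
    @entries = {}
  end

  def fetch(key)
    stamp, value = @entries[key]
    return value if stamp == @clock.now   # entry is still fresh
    value = yield                         # miss or stale entry: recompute
    @entries[key] = [@clock.now, value]
    value
  end
end

clock = Timestamp.new
cache = StampedCache.new(clock)
computed = 0
cache.fetch(:answer) { computed += 1; 42 }  # computed
cache.fetch(:answer) { computed += 1; 42 }  # served from the cache
clock.tick                                  # invalidates every entry at once
cache.fetch(:answer) { computed += 1; 42 }  # computed again
```

A single increment invalidates all entries without visiting any of them, which is the appeal of a freshness counter over explicit cache flushing.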
  2. allow accessing unified operands from attributes

    shyouhei committed Aug 31, 2018
    Attributes of normal instructions can look at their operands.
    This changeset enables the same thing for operand-unified
    instructions.
  3. new attribute trace_equivalent

    shyouhei committed Mar 6, 2019
    Some instructions are special cases of other ones.  These
    instructions need not preserve their trace counterparts.  By reducing
    those unnecessary trace instructions we can strip binary size of
    vm_exec_core from 25,759 bytes to 24,924 bytes on my machine.
    
    Yes, this changeset slows traces down a bit.  But is that a problem?
  4. define purity of each instruction

    shyouhei committed Mar 8, 2019
    This changeset introduces a new instruction attribute called "purity".
    By doing so we can eliminate calls to methods that consist entirely
    of pure instructions.
    
    The definition of purity is chosen to achieve that optimisation; that
    is, only instructions that do nothing except stack manipulations are
    marked so.  For instance, the `once` instruction is not pure, because
    it can block other threads (so there can potentially be global side
    effects).
    
    The same method call can be pure at one time and not pure at another.
    What is called at a specific call site cannot be determined until the
    very moment when we actually call it.  Of course a method cannot be
    pure unless each and every method it calls is (possibly recursively)
    pure.  So in short, the purity of a method is updated on the fly.
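The recursive definition can be sketched with a toy model. Everything here is hypothetical: the instruction names in `PURE_INSNS` are just examples, methods are modelled as arrays of `[instruction, operand]` pairs, and treating a recursive call as impure is a conservative choice made for this sketch, not something the commit states.

```ruby
# Hypothetical toy model; the real attribute is defined per VM instruction.
PURE_INSNS = %i[putnil putobject dup pop leave].freeze

# table maps a method name to an array of [instruction, operand] pairs.
def pure_method?(table, name, visiting = [])
  return false if visiting.include?(name)  # sketch assumption: recursion is impure
  body = table[name]
  return false unless body                 # unknown callee: assume impure
  body.all? do |insn, operand|
    if insn == :send
      # A call is pure only if the method it currently resolves to is pure.
      pure_method?(table, operand, visiting + [name])
    else
      PURE_INSNS.include?(insn)
    end
  end
end

table = {
  noop:    [[:putnil, nil], [:leave, nil]],
  wrapper: [[:send, :noop], [:leave, nil]],
}
pure_method?(table, :wrapper)                 # => true
table[:noop] = [[:once, nil], [:leave, nil]]  # the callee is redefined
pure_method?(table, :wrapper)                 # => false
```

The second query shows why purity has to be recomputed as methods change: `wrapper` itself was never touched, yet its answer flipped.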
  5. skip pure methods

    shyouhei committed Nov 27, 2018
    This changeset modifies several instructions so that they simply do
    nothing when the methods (or blocks) about to be invoked are pure.
    Fix [GH-1943]
  6. send-pop optimisation part one: calling convention

    shyouhei committed Nov 27, 2018
    Sending a method, then immediately throwing away its return value,
    is one of the most frequent wastes of time in ruby.  Why not tell
    methods whether the caller uses that return value, and let them use
    that information to do less work?
    
    In order to do so our method calling convention is extended with a
    VM_FRAME_FLAG_POPPED bit, which indicates that the caller does not
    use the return value.  There is also a new instruction called
    opt_bailout, which omits creation of return values to bail out early.
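The convention can be pictured with a small Ruby model. This is a sketch under stated assumptions: `FLAG_POPPED` is a made-up stand-in for the VM_FRAME_FLAG_POPPED bit, and passing flags as an ordinary argument only imitates what the VM keeps in the frame.

```ruby
# Hypothetical stand-in for the VM_FRAME_FLAG_POPPED bit; the value is made up.
FLAG_POPPED = 0x01

# A callee that consults the flag and skips building its return value,
# loosely mirroring what opt_bailout allows a method to do.
def render_list(items, flags, log)
  log << :visited                           # side effects still happen
  return nil if (flags & FLAG_POPPED) != 0  # caller discards the value: bail out
  log << :built
  items.map(&:to_s).join(", ")
end

log = []
render_list([1, 2], 0, log)            # => "1, 2"
render_list([1, 2], FLAG_POPPED, log)  # => nil
log                                    # => [:visited, :built, :visited]
```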
  7. add rb_whether_the_return_value_is_used_p()

    shyouhei committed Dec 3, 2018
    Looking at how `make rdoc` is working, I noticed that strings
    allocated inside of StringScanner#scan (which is called from
    lib/rdoc/markup/parser.rb:508, "else @s.scan ...") are becoming
    garbage immediately.  Why not make it possible for extension
    libraries to know whether the return values are needed or not?  That
    way StringScanner can avoid generating such useless strings, to
    reduce the GC pressure.
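For comparison, stock StringScanner already offers a manual way to advance without producing a substring: `skip` returns the match length instead of the matched text. The snippet below only contrasts the two existing methods; it does not use the new C function, which is available only with this patch.

```ruby
require "strscan"

s = StringScanner.new("foo bar")
word = s.scan(/\w+/)   # returns the matched text as a new String
gap  = s.skip(/\s+/)   # returns only the match length, no substring
puts word              # foo
puts gap               # 1
puts s.pos             # 4
```

With the function added by this commit, `scan` itself could take the cheaper path whenever the caller ignores the result, so call sites would not need to be rewritten to use `skip`.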
  8. omit branch inside of vm_sendish

    shyouhei committed Mar 13, 2019
    Now that opt_bailout is introduced, a method that is entirely pure can
    have that instruction at the very beginning of its sequence.  Why not
    just invoke such methods as usual and let the instruction do the job?
    This adds some overhead (frame manipulations that were previously
    skipped entirely can now occur) and removes another (purity
    calculations for non-skippable method calls are now eliminated).  So
    let's see the trade-offs.
  9. optimise String#slice!

    shyouhei committed Dec 3, 2018
    Looking at how `make rdoc` is working, I noticed that strings
    allocated inside of String#slice! (which is called from
    lib/rdoc/markup/parser.rb:313 and several other places) are becoming
    garbage immediately.  These usages of String#slice! are to delete
    portions of the receiver and are not interested in the return values.
    Why not avoid creation of the return value in such cases?
    
    Note however that by doing so, String#slice inevitably gets optimised
    as well.  These two methods are tightly connected.  Decoupling
    them needs lots of copy & paste, which I think is not a good idea.
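The call pattern in question looks like the following. The input string is made up for illustration; the point is only that `slice!` is used for its effect on the receiver while the removed substring it returns is dropped on the floor.

```ruby
line = "   * list item"
line.slice!(/\A\s+/)   # the returned "   " is never looked at
puts line              # * list item

# When the value is wanted, the very same method hands back the removed part.
rest = "key=value"
key = rest.slice!(/\A\w+/)
puts key               # key
puts rest              # =value
```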
  10. optimise Enumerable#grep

    shyouhei committed Dec 4, 2018
    Enumerable#grep is interesting in two ways.  First, although
    almost everybody thinks it has nothing to do with return value
    optimisations, it does.  A usage without return value can be
    seen at ext/extmk.rb:368, "grep(/\A#{var}=(.*)/) {return $1}".
    Second, even when there is no block passed and no return value
    used at the same time, it cannot be a no-op.  We have to reroute
    [Bug #5801].
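Both points can be seen in plain Ruby. The first method imitates the extmk.rb usage, where the block returns from the enclosing method and the array built by `grep` is never used. The second part shows one observable effect of a block-less `grep` (the `===` call on the pattern), offered here as one reason it cannot be skipped outright; the names `find_setting` and `spy` are made up for the example.

```ruby
# grep is called for the early return only; its own result is discarded.
def find_setting(lines, var)
  lines.grep(/\A#{var}=(.*)/) { return $1 }
  nil
end

puts find_setting(["CC=gcc", "LD=ld"], "LD")   # ld

# Even without a block, grep calls === on the pattern for every element,
# and that call may be user-defined code with side effects.
seen = []
spy = Object.new
spy.define_singleton_method(:===) { |element| seen << element; false }
[1, 2].grep(spy)
p seen                                          # [1, 2]
```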
  11. add RubyVM.return_value_is_used?

    shyouhei committed Dec 10, 2018
    RDoc::Parser::RubyTools#skip_tkspace_without_nl is one of the methods
    that are frequently called with the return value discarded.
    Eliminating the allocated array can benefit both time and memory
    consumption.
    
    The problem is, it is hard to auto-eliminate such wasted return values
    even when we can tell the method we don't need them.  This is because
    variables _could_ escape from the scope.  For instance, uget_tk()
    might be an alias of eval().  That is not the case for this particular
    method, but auto-detecting such evil activities is very hard -- if not
    impossible.
    
    So to ease the situation we implement the RubyVM.return_value_is_used?
    method.  By manually checking that property we can define a
    hand-crafted, faster variant of skip_tkspace_without_nl that does not
    allocate the return value.
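A hand-crafted variant along those lines might look like the sketch below. It assumes that RubyVM.return_value_is_used? takes no arguments and answers for the method it is called from, which is inferred from this commit message only. The method exists only with this patch applied, so the sketch falls back to "the value is wanted" on a stock Ruby. `skip_spaces` and the token symbols are hypothetical stand-ins for the RDoc code.

```ruby
# Hypothetical stand-in for skip_tkspace_without_nl, working on an array of symbols.
def skip_spaces(tokens)
  # On a Ruby without the patch the query does not exist; assume the value is used.
  wanted = !(defined?(RubyVM) && RubyVM.respond_to?(:return_value_is_used?)) ||
           RubyVM.return_value_is_used?
  skipped = wanted ? [] : nil     # allocate the array only when someone wants it
  while tokens.first == :space
    tk = tokens.shift
    skipped << tk if wanted
  end
  skipped
end

tokens = [:space, :space, :word]
skipped = skip_spaces(tokens)     # value used: the array is built
p skipped                         # [:space, :space]
p tokens                          # [:word]
```

The query has to be made directly inside the method whose caller is of interest; wrapping it in a helper would ask about the helper's own call instead.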
  12. send-pop optimisation part two: eliminate pop

    shyouhei committed Jan 21, 2019
    Sending a method, then immediately throwing away its return value, is
    one of the most frequent wastes of time in ruby.  Now that callee
    methods can skip pushing objects onto the stack, why not let call
    sites also avoid popping them?
    
    In order to do so our compiler now does not emit pop instructions but
    adds the VM_FRAME_FLAG_POPIT flag to the call info of adjacent send-ish
    instructions.  It is now the caller's duty to properly ignore the
    return value.
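The send-then-pop pair that this commit targets can be observed on an unpatched CRuby by disassembling a call whose value is discarded. This only inspects the stock compiler's output; the exact listing differs between Ruby versions, and `discarded_call_listing` is a helper name made up for this sketch.

```ruby
# RubyVM::InstructionSequence is CRuby-specific, hence the guard.
def discarded_call_listing
  return nil unless defined?(RubyVM::InstructionSequence)
  # `foo` is only compiled, never executed, so it does not need to exist.
  RubyVM::InstructionSequence.compile("foo\nnil").disasm
end

listing = discarded_call_listing
puts listing if listing
```

On a stock interpreter the call to `foo` is followed by a separate `pop` instruction, which is the instruction the compiler stops emitting with this change.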
    
    Signed-off-by: Urabe, Shyouhei <shyouhei@ruby-lang.org>
  13. optimise rb_obj_dummy

    shyouhei committed Feb 5, 2019
    Looking at discourse script/bench.rb, I found that
    BasicObject#initialize is called a considerable number of times.  It
    seems worth optimising.  By making sure we are calling rb_obj_dummy,
    we can safely skip the frame manipulations.
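Why this method is hit so often can be checked from Ruby: a class that defines no `initialize` of its own ends up in the one inherited from BasicObject, which is implemented in C (by rb_obj_dummy, going by this commit message). `Plain` is a made-up class for the example.

```ruby
class Plain; end   # defines no initialize of its own

init = Plain.instance_method(:initialize)
p init.owner             # BasicObject
p init.source_location   # nil, because the method is implemented in C
```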
  14. modify tests to properly trigger JIT

    shyouhei committed Feb 25, 2019
    These methods were lightweight enough for the interpreter to avoid
    JIT compilations.  Make them a little more complicated so that they
    can be properly considered for optimisations by the engine.
  15. precalc INSN_CALLER_RETVAL_POPPED_P()

    shyouhei committed Feb 26, 2019
    This macro is expanded inside of send-ish instructions, which are
    super-duper hot paths.  By statically analysing this into the call
    info, we can optimise situations where return values _do_ get used.
  16. reduce instruction counts

    shyouhei committed Mar 2, 2019
    Experiments show that some recent compilers give up inlining
    functions called from inside of vm_exec_core, seemingly because
    it was too big.  This changeset deletes the sendpop instructions,
    merging them into the bare ones.
  17. no inline vm_method_cfunc_is()

    shyouhei committed Mar 6, 2019
    It seems vm_method_cfunc_is() is inlined into vm_exec_core(), which
    is not what we want here.  Make a wrapper function to absorb that.
  18. optcarrot tweak

    shyouhei committed Mar 11, 2019
    I admit this is a dirty hack just to boost optcarrot FPS.
Commits on Mar 21, 2019
  1. cancel optimisation when captured

    shyouhei committed Mar 20, 2019
    If there are chances for local variables to live longer than the
    original scope, we cannot eliminate local variable assignments.
    Optimisations are not possible then.
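A small Ruby example of such a captured variable: the lambda outlives the method call, so an assignment that looks dead inside the method is still observable afterwards. `make_counter` is a made-up name for the illustration.

```ruby
def make_counter
  count = 0
  increment = -> { count += 1 }  # the lambda captures `count`
  count = 10                     # looks unused here, but the lambda sees it later
  increment
end

counter = make_counter
p counter.call   # 11
p counter.call   # 12
```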