Commits on May 12, 2012
  1. meta_send_stack_{1,2}{,_pop}

    Ryo Onodera committed May 12, 2012
  2. index and arg meta instructions

    Ryo Onodera committed May 12, 2012
  3. index and arg meta instructions

    Ryo Onodera committed May 12, 2012
  4. Improve GlobalCache's cache locality

    committed with Ryo Onodera Mar 20, 2012
  5. send method

    Ryo Onodera committed May 10, 2012
  6. Revert other optimizations

    Ryo Onodera committed May 7, 2012
  7. Style fix

    Ryo Onodera committed May 7, 2012
  8. JIT Enabled

    Ryo Onodera committed May 7, 2012
  9. Remove comment

    Ryo Onodera committed May 6, 2012
  10. remove allow_private

    Ryo Onodera committed Apr 15, 2012
  11. remove CALL_FLAG_CONCAT

    Ryo Onodera committed Mar 28, 2012
  12. enable optimize

    Ryo Onodera committed Mar 27, 2012
  13. CALL_FLAG_CONCAT

    Ryo Onodera committed Mar 27, 2012
  14. another commit

    Ryo Onodera committed Mar 27, 2012
  15. further

    Ryo Onodera committed Mar 26, 2012
  16. more optimizations

    Ryo Onodera committed Mar 26, 2012
  17. further optimization

    Ryo Onodera committed Mar 26, 2012
  18. meta_set_local_depth_pop

    Ryo Onodera committed Mar 25, 2012
  19. save everything

    Ryo Onodera committed Mar 25, 2012
  20. Before further optimization

    Ryo Onodera committed Mar 25, 2012
  21. another cleanups

    Ryo Onodera committed Mar 24, 2012
  22. finally optimization working!!

    Ryo Onodera committed Mar 24, 2012
  23. minor cleanups...

    Ryo Onodera committed Mar 24, 2012
  24. finally it worked...

    Ryo Onodera committed Mar 24, 2012
  25. Merge pull request #1719 from carlosgaldino/float-comparison

    Ryo Onodera committed May 12, 2012
    Float comparison
  26. Improve concurrent performance by cache locality

    committed May 12, 2012
    There is a bottleneck in SharedState::check_gc_p. It is called from all Ruby
    threads and always writes to memory. That invalidates the CPU cache line on
    other cores, and thus everything is slowed down.
    
    Actually, writing isn't always necessary in this case. It is needed only when
    the GC flag is set, and that situation is unlikely.
    
    This commit removes the bottleneck by avoiding the write. Each CPU core then
    runs at its full speed, not hampered by slow memory fetches due to cache
    invalidation.
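
    The read-before-write idea described above can be sketched as follows. This
    is a minimal illustration, not the actual Rubinius code: SharedFlag,
    check_always_write, check_read_first and request_gc are hypothetical names,
    and it assumes that the unconditional write was a reset of the flag.

```cpp
#include <atomic>

// Hypothetical stand-in for the shared GC flag (assumption: the real
// SharedState member and method bodies differ from this sketch).
struct SharedFlag {
  std::atomic<bool> should_gc{false};

  // Pattern described as the bottleneck: every call stores to the shared
  // location, even when the flag is already false, so each caller takes
  // exclusive ownership of the cache line.
  bool check_always_write() {
    return should_gc.exchange(false, std::memory_order_acq_rel);
  }

  // Pattern described as the fix: read first, write only when the flag is set.
  bool check_read_first() {
    if (!should_gc.load(std::memory_order_acquire)) {
      return false;  // common case: no store, the line can stay shared
    }
    // Rare case: the flag is set, so clear it and report it.
    return should_gc.exchange(false, std::memory_order_acq_rel);
  }

  // Another thread asks for a collection.
  void request_gc() { should_gc.store(true, std::memory_order_release); }
};
```

    Both methods return the same values for the same sequence of calls; the
    difference is only that the second one performs no store in the common case
    where the flag is clear.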
    
    check_gc_p is called heavily in concurrently executed tight loops (these
    happen sometimes in the real world, and are exercised often by people
    interested in Rubinius's promised concurrency).
    
    After this commit, each concurrent thread executes a specially synthesized
    test roughly as fast as a single thread. This doesn't indicate that Rubinius
    executes Ruby code at a speed purely proportional to the number of CPU cores.
    However, it does indicate that the core VM scales this way and that Rubinius
    has the potential to run Ruby code at that speed.
    
    Hello true concurrency :)
    
    I measured performance as follows.
    
    BEFORE THIS COMMIT:
    1 thread:
     #<Thread:0x1c id=3 run>: 1.677862 seconds for looping 100000000 times
     #<Thread:0x1c id=3 run>: 1.681874 seconds for looping 100000000 times
     #<Thread:0x1c id=3 run>: 1.682204 seconds for looping 100000000 times
     #<Thread:0x1c id=3 run>: 1.672264 seconds for looping 100000000 times
    
    2 threads:
     #<Thread:0x1c id=4 run>: 4.340247 seconds for looping 100000000 times
     #<Thread:0x24 id=3 run>: 4.443239 seconds for looping 100000000 times
     #<Thread:0x1c id=4 run>: 4.338425 seconds for looping 100000000 times
     #<Thread:0x24 id=3 run>: 4.465941 seconds for looping 100000000 times
    
    AFTER THIS COMMIT:
    1 thread:
     #<Thread:0x10 id=3 run>: 1.645313 seconds for looping 100000000 times
     #<Thread:0x10 id=3 run>: 1.640116 seconds for looping 100000000 times
     #<Thread:0x10 id=3 run>: 1.641852 seconds for looping 100000000 times
     #<Thread:0x10 id=3 run>: 1.647724 seconds for looping 100000000 times
    
    2 threads:
     #<Thread:0x10 id=4 run>: 1.764503 seconds for looping 100000000 times
     #<Thread:0x18 id=3 run>: 1.697629 seconds for looping 100000000 times
     #<Thread:0x10 id=4 run>: 1.715689 seconds for looping 100000000 times
     #<Thread:0x10 id=4 run>: 1.702426 seconds for looping 100000000 times
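
    To summarize the timings above, the per-thread slowdown from one thread to
    two can be computed from the means of the four samples in each group. The
    sketch below only restates the numbers listed above; the names before_1,
    before_2, after_1, after_2, mean and slowdown are illustrative.

```cpp
#include <array>
#include <numeric>

// Timings in seconds, copied from the measurements listed in the commit message.
constexpr std::array<double, 4> before_1 = {1.677862, 1.681874, 1.682204, 1.672264};
constexpr std::array<double, 4> before_2 = {4.340247, 4.443239, 4.338425, 4.465941};
constexpr std::array<double, 4> after_1  = {1.645313, 1.640116, 1.641852, 1.647724};
constexpr std::array<double, 4> after_2  = {1.764503, 1.697629, 1.715689, 1.702426};

// Arithmetic mean of four samples.
inline double mean(const std::array<double, 4>& xs) {
  return std::accumulate(xs.begin(), xs.end(), 0.0) / xs.size();
}

// Ratio of the mean per-loop time with two threads to that with one thread.
// A value of 1.0 would mean the second thread costs the first nothing.
inline double slowdown(const std::array<double, 4>& one_thread,
                       const std::array<double, 4>& two_threads) {
  return mean(two_threads) / mean(one_thread);
}
```

    With these samples, slowdown(before_1, before_2) comes out at about 2.6,
    while slowdown(after_1, after_2) is about 1.05.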
    
    TEST CODE:
    loop_count = 100_000_000
    threads = 2
    threads.times.collect do
      Thread.new do
        loop do
          time = Time.now
          i = loop_count
          i -= 1 until i.zero?
          puts "#{Thread.current}: #{Time.now - time} seconds for looping " +
               "#{loop_count} times"
        end
      end
    end.each(&:join)
  27. @carlosgaldino
  28. @carlosgaldino
  29. @brixen
  30. @brixen
  31. @brixen
  32. @brixen
Commits on May 11, 2012
  1. Remove trailing white spaces

    committed May 9, 2012