New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

red black tree 50% slower in 9k compared to 1.7 #2455

Closed
subbuss opened this Issue Jan 13, 2015 · 7 comments

Comments

Projects
None yet
5 participants
@subbuss
Contributor

subbuss commented Jan 13, 2015

This is on my laptop with -X+C -Xcompile.invokedynamic=true with java7.

@headius

This comment has been minimized.

Member

headius commented Jan 13, 2015

Confirmed, though for me it's more like 30%.

@chrisseaton

This comment has been minimized.

Contributor

chrisseaton commented Jan 16, 2015

Red black is memory bound isn't it? When I run it in Truffle it seems like the compilation has very little impact as it's all GC and dereferencing. Your objects aren't larger in IR are they?

@subbuss

This comment has been minimized.

Contributor

subbuss commented Jan 18, 2015

Yes, memory usage is larger in IR mode, but not sure if it fully explains the slowdown. But, you may be right that rb tree could be memory bound .. I never thought of investigating more there. Thanks for looking into it.

@enebo

This comment has been minimized.

Member

enebo commented Jul 15, 2015

Ran this with Java 8 and we are about 15% slower with indy now and about and about 5% slower with non-indy. We are about 25% slower with interp which is a much lesser concern.

@kares kares added this to the JRuby 9.2.0.0 milestone Mar 1, 2018

@kares

This comment has been minimized.

Member

kares commented Mar 1, 2018

marking for 9.2 if anyone is into re-running, or simply close as @enebo's ^^ results were already promising

@enebo enebo modified the milestones: JRuby 9.2.0.0, JRuby 9.1.17.0 Mar 1, 2018

@enebo

This comment has been minimized.

Member

enebo commented Mar 1, 2018

Original script this was recorded against:

bm1.rb

require_relative 'red_black_tree.rb'
require "benchmark"

def rbt_bm
  n = 100000
  a1 = []; n.times { a1 << rand(99999) }
  a2 = []; n.times { a2 << rand(99999) }

  start = Time.now

  tree = RedBlackTree.new

  n.times {|i| tree.add(i) }
  n.times { tree.delete(tree.root) }

  tree = RedBlackTree.new
  a1.each {|e| tree.add(e) }
  a2.each {|e| tree.search(e) }
  tree.inorder_walk {|key| key + 1 }
  tree.reverse_inorder_walk {|key| key + 1 }
  n.times { tree.minimum }
  n.times { tree.maximum }

  return Time.now - start
end

N = (ARGV[0] || 20).to_i

N.times do
  puts rbt_bm.to_f
  puts "GC.count = #{GC.count}" if GC.respond_to?(:count)
end

red_black_tree.rb

# Algorithm based on "Introduction to Algorithms" by Cormen and others
class RedBlackTree
  class Node
#    def self.attr_accessor(name)
#      eval("def #{name.to_s}; @#{name}; end");
#      eval("def #{name.to_s}=(v); @#{name} = v; end");
#    end

    attr_accessor :color
    attr_accessor :key
    attr_accessor :left
    attr_accessor :right
    attr_accessor :parent

    RED = :red
    BLACK = :black
    COLORS = [RED, BLACK].freeze

    def initialize(key, color = RED)
      raise ArgumentError, "Bad value for color parameter" unless COLORS.include?(color)
      @color = color
      @key = key
      @left = @right = @parent = NilNode.instance
    end

    def black?
      return color == BLACK
    end

    def red?
      return color == RED
    end
  end

  class NilNode < Node
    class << self
      private :new
      @instance = nil

      # it's not thread safe
      def instance
        if @instance.nil?
          @instance = new

          def instance
            return @instance
          end
        end

        return @instance
      end
    end

    def initialize
      self.color = BLACK
      self.key = 0
      self.left = nil
      self.right = nil
      self.parent = nil
    end

    def nil?
      return true
    end
  end

  include Enumerable

#  def self.attr_accessor(name)
#    eval("def #{name.to_s}; @#{name}; end");
#    eval("def #{name.to_s}=(v); @#{name} = v; end");
#  end

  attr_accessor :root
  attr_accessor :size

  def initialize
    self.root = NilNode.instance
    self.size = 0
  end

  def add(key)
    insert(Node.new(key))
  end

  def insert(x)
    insert_helper(x)

    x.color = Node::RED
    while x != root && x.parent.color == Node::RED
      if x.parent == x.parent.parent.left
        y = x.parent.parent.right
        if !y.nil? && y.color == Node::RED
          x.parent.color = Node::BLACK
          y.color = Node::BLACK
          x.parent.parent.color = Node::RED
          x = x.parent.parent
        else
          if x == x.parent.right
            x = x.parent
            left_rotate(x)
          end
          x.parent.color = Node::BLACK
          x.parent.parent.color = Node::RED
          right_rotate(x.parent.parent)
        end
      else
        y = x.parent.parent.left
        if !y.nil? && y.color == Node::RED
          x.parent.color = Node::BLACK
          y.color = Node::BLACK
          x.parent.parent.color = Node::RED
          x = x.parent.parent
        else
          if x == x.parent.left
            x = x.parent
            right_rotate(x)
          end
          x.parent.color = Node::BLACK
          x.parent.parent.color = Node::RED
          left_rotate(x.parent.parent)
        end
      end
    end
    root.color = Node::BLACK
  end

  alias << insert

  def delete(z)
    y = (z.left.nil? || z.right.nil?) ? z : successor(z)
    x = y.left.nil? ? y.right : y.left
    x.parent = y.parent

    if y.parent.nil?
      self.root = x
    else
      if y == y.parent.left
        y.parent.left = x
      else
        y.parent.right = x
      end
    end

    z.key = y.key if y != z

    if y.color == Node::BLACK
      delete_fixup(x)
    end

    self.size -= 1
    return y
  end

  def minimum(x = root)
    while !x.left.nil?
      x = x.left
    end
    return x
  end

  def maximum(x = root)
    while !x.right.nil?
      x = x.right
    end
    return x
  end

  def successor(x)
    if !x.right.nil?
      return minimum(x.right)
    end
    y = x.parent
    while !y.nil? && x == y.right
      x = y
      y = y.parent
    end
    return y
  end

  def predecessor(x)
    if !x.left.nil?
      return maximum(x.left)
    end
    y = x.parent
    while !y.nil? && x == y.left
      x = y
      y = y.parent
    end
    return y
  end

  def inorder_walk(x = root)
    x = self.minimum
    while !x.nil?
      yield x.key
      x = successor(x)
    end
  end

  alias each inorder_walk

  def reverse_inorder_walk(x = root)
    x = self.maximum
    while !x.nil?
      yield x.key
      x = predecessor(x)
    end
  end

  alias reverse_each reverse_inorder_walk

  def search(key, x = root)
    while !x.nil? && x.key != key
      key < x.key ? x = x.left : x = x.right
    end
    return x
  end

  def empty?
    return self.root.nil?
  end

  def black_height(x = root)
    height = 0
    while !x.nil?
      x = x.left
      height +=1 if x.nil? || x.black?
    end
    return height
  end

private

  def left_rotate(x)
    raise "x.right is nil!" if x.right.nil?
    y = x.right
    x.right = y.left
    y.left.parent = x if !y.left.nil?
    y.parent = x.parent
    if x.parent.nil?
      self.root = y
    else
      if x == x.parent.left
        x.parent.left = y
      else
        x.parent.right = y
      end
    end
    y.left = x
    x.parent = y
  end

  def right_rotate(x)
    raise "x.left is nil!" if x.left.nil?
    y = x.left
    x.left = y.right
    y.right.parent = x if !y.right.nil?
    y.parent = x.parent
    if x.parent.nil?
      self.root = y
    else
      if x == x.parent.left
        x.parent.left = y
      else
        x.parent.right = y
      end
    end
    y.right = x
    x.parent = y
  end

  def insert_helper(z)
    y = NilNode.instance
    x = root
    while !x.nil?
      y = x
      z.key < x.key ? x = x.left : x = x.right
    end
    z.parent = y
    if y.nil?
      self.root = z
    else
      z.key < y.key ? y.left = z : y.right = z
    end
    self.size += 1
  end

  def delete_fixup(x)
    while x != root && x.color == Node::BLACK
      if x == x.parent.left
        w = x.parent.right
        if w.color == Node::RED
          w.color = Node::BLACK
          x.parent.color = Node::RED
          left_rotate(x.parent)
          w = x.parent.right
        end
        if w.left.color == Node::BLACK && w.right.color == Node::BLACK
          w.color = Node::RED
          x = x.parent
        else
          if w.right.color == Node::BLACK
            w.left.color = Node::BLACK
            w.color = Node::RED
            right_rotate(w)
            w = x.parent.right
          end
          w.color = x.parent.color
          x.parent.color = Node::BLACK
          w.right.color = Node::BLACK
          left_rotate(x.parent)
          x = root
        end
      else
        w = x.parent.left
        if w.color == Node::RED
          w.color = Node::BLACK
          x.parent.color = Node::RED
          right_rotate(x.parent)
          w = x.parent.left
        end
        if w.right.color == Node::BLACK && w.left.color == Node::BLACK
          w.color = Node::RED
          x = x.parent
        else
          if w.left.color == Node::BLACK
            w.right.color = Node::BLACK
            w.color = Node::RED
            left_rotate(w)
            w = x.parent.left
          end
          w.color = x.parent.color
          x.parent.color = Node::BLACK
          w.left.color = Node::BLACK
          right_rotate(x.parent)
          x = root
        end
      end
    end
    x.color = Node::BLACK
  end
end

Output for 1.7:

system ~/Downloads/local_benches 1301% ../../work/jruby-1_7/bin/jruby -Xcompile.invokedynamic=true ./bm1.rb 
2.641
GC.count = 3
2.806
GC.count = 4
2.153
GC.count = 5
2.022
GC.count = 6
0.916
GC.count = 6
0.442
GC.count = 7
0.381
GC.count = 8
0.416
GC.count = 8
0.439
GC.count = 9
0.385
GC.count = 10
0.435
GC.count = 10
0.447
GC.count = 11
0.385
GC.count = 12
0.432
GC.count = 12
0.449
GC.count = 13
0.395
GC.count = 14
0.428
GC.count = 14
0.461
GC.count = 15
0.417
GC.count = 16
0.436
GC.count = 16
system ~/Downloads/local_benches 1302% ../../work/jruby-1_7/bin/jruby -Xcompile.invokedynamic=false ./bm1.rb 
1.622
GC.count = 3
1.256
GC.count = 4
1.064
GC.count = 5
1.065
GC.count = 6
1.057
GC.count = 7
1.081
GC.count = 8
1.062
GC.count = 9
1.054
GC.count = 9
1.057
GC.count = 10
1.073
GC.count = 11
1.027
GC.count = 12
1.086
GC.count = 12
1.093
GC.count = 13
1.094
GC.count = 14
0.999
GC.count = 15
1.111
GC.count = 16
1.094
GC.count = 16
1.088
GC.count = 17
1.058
GC.count = 18
1.119
GC.count = 19

Output for 9.1.x:

system ~/Downloads/local_benches 1301% ../../work/jruby/bin/jruby -Xcompile.invokedynamic ./bm1.rb 
1.780753
GC.count = 4
1.809321
GC.count = 5
0.550822
GC.count = 6
0.41645299999999996
GC.count = 7
0.391085
GC.count = 8
0.38137
GC.count = 9
0.348388
GC.count = 10
0.401225
GC.count = 10
0.38546600000000003
GC.count = 11
0.40856099999999995
GC.count = 11
0.390643
GC.count = 12
0.406027
GC.count = 13
0.39937
GC.count = 13
0.45749300000000004
GC.count = 13
0.387039
GC.count = 14
0.401494
GC.count = 14
0.391455
GC.count = 15
0.393563
GC.count = 15
0.406067
GC.count = 15
0.373488
GC.count = 16
system ~/Downloads/local_benches 1303% ../../work/jruby/bin/jruby -Xcompile.invokedynamic=false ./bm1.rb 
1.7946529999999998
GC.count = 4
1.065686
GC.count = 5
1.0503010000000002
GC.count = 6
1.0386179999999998
GC.count = 6
1.020875
GC.count = 7
0.96332
GC.count = 8
0.996759
GC.count = 8
1.019954
GC.count = 9
1.013452
GC.count = 10
1.008613
GC.count = 10
0.9754079999999999
GC.count = 11
1.0067439999999999
GC.count = 11
0.9722109999999999
GC.count = 12
0.996647
GC.count = 12
1.079552
GC.count = 13
1.0288899999999999
GC.count = 13
1.043602
GC.count = 14
1.008629
GC.count = 14
1.027662
GC.count = 14
1.0190839999999999
GC.count = 15

Eyeballing looks like 9k likely is beating 1.7 now but there is some noise. We defintely can resolve this issue though. Fascinating thing in that output is how much faster 9k warms up with indy than on 1.7. Good performance in half the iterations?

@enebo enebo closed this Mar 1, 2018

@enebo

This comment has been minimized.

Member

enebo commented Mar 1, 2018

Oh I should add that I just tried interp and 1.7 is ~ 3.4s and 9.1 is ~4.7s. I think 9.2 will help this we plan on making compound constants a single instr vs n instrs with jumps. All the extra instr overhead for interp has been a long known issue (and in the case of colon2 chains the idea we would be able to cull top-level constants has been defeated by so many call boundaries in ruby).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment