Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Could you have a look at RedBlackTrees.jl? #5

Closed
pygy opened this issue Jan 25, 2014 · 17 comments
Closed

Could you have a look at RedBlackTrees.jl? #5

pygy opened this issue Jan 25, 2014 · 17 comments

Comments

@pygy
Copy link

pygy commented Jan 25, 2014

I ported this C red black tree library, (see here for a complete description), and I'd like to know if you're interested in adding it to this package.

https://github.com/pygy/RedBlackTrees.jl

It implements push!(), delete!(), in(), size(), length(), show() and the iterator protocol.

A tree is created using RedBlackTree(some_type). You can iterate backwards using for i in RedBlackTrees.Backward(tree) ...

show(tree) only displays the type and size of the tree. You can see the content of the tree using show(tree.root).

On my machine, insertion beats push!(array, item); sort!(array) from ~3000 items on.

I'll add the possibility to pass custom cmp() and copy() functions (copy for node insertion). Currently, it uses Base.cmp and doesn't make any copy.

There are probably some more things left to tweak. For example, according to @time iteration allocates ~ 400 bytes per item in the tree, even though it shouldn't.

It's also the first time I use parametric types, and I may have overused parametric function definitions.

At last, I use an alias RBLNode Union(RBNode{T}, Nothing) as the type for nodes, with Nothing representing leafs. It may or may not help to use an abstract type and have both RBNode and Leaf depend on it, but it would require some tweaks since the nodes are parametrized, I'd have to keep a typed leaf around, or create one each time I need one, which may slow things down.

@lindahua
Copy link
Contributor

Red black tree would be a great addition to this package.

I just took a brief look at the codes. I think the following line would be a problem in terms of performance:

https://github.com/pygy/RedBlackTrees.jl/blob/master/src/RedBlackTrees.jl#L75

Since Red-black tree is a binary tree, so each node should have at most two links. I think you don't need an array here, instead you can have a left and right field.

@pygy
Copy link
Author

pygy commented Jan 25, 2014

The use of an array is intentional, it makes the code simpler and probably faster. See here for the justification.

Now that I think of it, I could emulate that behavior using getindex(node::RBNode, side::Int or maybe Bool) = side == 1? node.left : node.right. It would add a branch, though, but it would remove a pointer lookup.

I'll have to benchmark it.

@lindahua
Copy link
Contributor

Julia's array is much heavier than C's array. Tuples are much more efficient, and you can still use the getindex syntax.

@kmsquire
Copy link
Member

A few general comments.

  1. Just for reference, red-black trees and AVL trees have been developed/discussed for julia previously, though not in a while. See, e.g.,

    a. Red-Black trees: https://groups.google.com/forum/#!msg/julia-dev/dDaUEqHKVCk/R1Kb2U2Gym8J
    b. AVL trees: https://github.com/torgausen/JuliaAVL, https://groups.google.com/forum/?fromgroups=#!searchin/julia-dev/AVL$20tree/julia-dev/KPpLattsEpc/PhGu0a8l-RQJ

  2. The main reason to use RedBlack trees and similar structues is as Sets or Associative (Dict) data structures. As much as possible, it would be great if it conformed to the Associative collection interface (http://docs.julialang.org/en/latest/stdlib/base/#associative-collections). In particular, it would be nice to have a nice SortedDict interface backed by a fast tree implementation.

I'll try to look at the implementation itself at some point. I'll add that Dahua as written some of the fastest Julia code around, so it would be well worth your time to listen to his advice. ;-)

@pygy
Copy link
Author

pygy commented Jan 25, 2014

> Julia's array is much heavier than C's array. Tuples are much more efficient, and you can still use the getindex syntax.

Definitely, and, in the original, the array is actaully part of the struct, which improves memory locality and cacheability.

The thing with tuples is that they are immutable. I'll have to allocate new ones for rotations and insertions, and a branch will be needed every time.

I'll have to benchmark it to know what's worse in this case: pipeline trashing or cache misses?

To give you an example, here's the code for a rotations. It is branch free.

const left = 1
const right = 2

const opposite = (right, left)

function single_rotation(root::RBNode, dir::Int)
  other = opposite[dir]
  save = root.link[other]

  root.link[other] = save.link[dir]
  save.link[dir] = root
  root.red = true
  save.red = false

  save
end

function double_rotation(root::RBNode, dir::Int)
  other = opposite[dir]
  root.link[other] = single_rotation(root.link[other], other)
  single_rotation(root, dir)
end

The same goes for node creation and deletion, even though it is less problematic.

How heavy are arrays compared to tuples?

@kmsquire: I've seen the other red black tree implementation, but it looks like it suffers from acute haskelliform typitis. Red and black nodes are a different type and insertions/deletions are sliced into several functions that call each other in order, with a different version for each color (up to _delete_case7()). It relies heavily on the method dispatcher for branches, and, from the look of it there are a lot of them.

I'll have a look at the associative collection interface. Currently, it only implements sets.

Another tweak would be to keep a pointer either to the parent node or to the next and previous values in the tree, to make iteration lighter. Currently, the iterator keeps its state in a O log(n) stack which is allocated on start(tree).

@lindahua
Copy link
Contributor

I think at this point, we should focus on getting the API consistent with Julia's convention.

It would be useful to call it SortedSet and SortedDict. This can be merged when the interface stabilize. We can work on improving the implementation efficiency afterwards.

@pygy
Copy link
Author

pygy commented Jan 27, 2014

Agreed.

Before I go fruther, I'd like you opinion on this:

I think that mutable values should be deep copied by default when used as dict keys or set elements. Otherwise, mutating an object could break the order. That being said, I'd like to offer the possibility to either do a shallow copy or no copy at all for performance conscious users who know what they are doing. Do you agree with the approach?

If so, I suppose I should provide a constructor with a named argument that defaults to deepcopy. Correct?

@kmsquire
Copy link
Member

I would suggest following the behavior of the Dict implementation in base, which does not protect the user from herself in this way:

julia> import Base: isequal, hash

julia> type MyMutableType
          x::Int
       end

julia> isequal(x::MyMutableType, y::MyMutableType) = (x.x == y.x)
isequal (generic function with 34 methods)

julia> hash(x::MyMutableType) = hash(x.x)
hash (generic function with 25 methods)

julia> x = MyMutableType(1)
MyMutableType(1)

julia> x == MyMutableType(1)
true

julia> y = MyMutableType(2)
MyMutableType(2)

julia> d = [x=>'x', y=>'y']
[MyMutableType(2)=>'y',MyMutableType(1)=>'x']

julia> x.x = 3
3

julia> x in keys(d)
false

julia> y in keys(d)
true

This is also the typical behavior in, e.g., Java and IOS programming.

I'm hopeful that Julia will eventually support deeply immutable objects (as much as possible).

@kmsquire
Copy link
Member

On my machine, insertion beats push!(array, item); sort!(array) from ~3000 items on.

Can you rerun this test with

using SortingAlgorithms

push!(array, item); sort!(array, alg=TimSort)

TimSort is generally slightly slower than the standard sorts in Julia, but one place where it shines is in handling partially (or mostly) sorted arrays.

Note that this should not take anything away from this work--I truly thing that we need a SortedDict, and red-black trees are a good base. This should just raise the bar a little. ;-)

@pygy
Copy link
Author

pygy commented Jan 27, 2014

Mutating a key can break the whole tree, not just that entry...

Regarding the naive array-based version, note that it doesn't check for dupes. A smarter way to work would be to do a binary search and, if needed, grow the array, and shift the values bigger than the one you want to insert.

You can also decide to tolerate dupes in the array, and only sort when needed to read or iterate over the collection. (for sets only). The iteration code and delete! would have to take special care of the dupes.

With more precise benchmarks, the threshold with the standard sort is actually 1500 items. TimSort beats rbtree up to 2200 items.

For the array vs. tuple vs. type field debate: we can get the best of the three worlds by using llvmcall. It allows to access object fields by their numeric index rather than their names. We get fast field access and efficient code (fewer branches, no extraneous allocations, symmetric).

@kmsquire
Copy link
Member

On Monday, January 27, 2014, Pierre-Yves Gérardy
<notifications@github.com<javascript:_e({}, 'cvml',
'notifications@github.com');>>
wrote:

Mutating a key can break the whole tree, not just that entry...

Sure. I'm just suggesting that, for performance, developers usually suggest
in the documentation not to use mutable keys or not to mutate those keys,
and that this is in line with typical Julia development. (Well, other than
the fact that the documentation itself could use some work.)

Regarding the naive array-based version, note that it doesn't check for
dupes. A smarter way to work would be to do a binary search and, if needed,
grow the array, and shift the values bigger than the one you want to
insert.

Agree. In Julia, arrays can actually grow in either direction, and the
splice! (and insert!, I think) code takes advantage of this by
shifting the shorter of the two.

There was talk before of creating an insertsorted! function. Hasn't
happened yet, but that might be an easy contribution to mainline Julia if
you're interested.

You can also decide to tolerate dupes in the array, and only sort when
needed to read or iterate over the collection. (for sets only). The
iteration code and delete! would have to take special care of the dupes.

With more precise benchmarks, the threshold with the standard sort is
actually 1500 items. TimSort beats rbtree up to 2200 items.

Ok, thanks for doing this!

For the array vs. tuple vs. type field debate: we can get the best of the
three worlds by using llvmcall. It allows to access object fields by
their numeric index rather than their names. We get fast field access and
efficient code (fewer branches, no allocation, symmetric).

I'm not as familiar with this. If you can code it up and show an obvious
benefit, great. In general, though, at the least, it sounds like premature
optimization, and that it would make the code a bit less readable. I think
it would be better to get a working version in first, and then test
this idea.

Cheers!

@stevengj
Copy link

You can already access fields using an integer argument, directly in Julia, via getfield(x, i) (see JuliaLang/julia#4806), as long as you're willing to depend on Julia 0.3. If all of the field types are the same, this is compiled directly to a pointer offset, just like a C array dereference (see codegen.cpp).

@pygy
Copy link
Author

pygy commented Jan 28, 2014

That's good news! No problem with using v0.3 (I'll have to install it, though).

Do you know if it also fast if the initial types are homogenous even though it varies afterwards? Specifically, for left and right here:

type RBNode{K,V}
    left::RBNode{K,V}
    right::RBNode{K,V}
    key::K
    value::V
    red::Bool
end

Edit: No, it isn't, since the fast case is reserved to homogenous or "all pointer" types.

I can use %2 = getelementptr inbounds %jl_value_t* %0, i64 %1, ... with llvmcall, though. It will be necessary to define it conditionally, according to sizeof(Ptr{Void}).

It would be nice if it were possible to make this kind of function completely private, though.

@lindahua
Copy link
Contributor

In terms of mutating key, let's follow Julia base's behavior: Dict does not make a deep copy of a mutating key:

julia> type X
       x::Int
       end

julia> k = X(1)
X(1)

julia> d = (X=>Int)[k=>10]
[X(1)=>10]

julia> k.x = 2
2

julia> d
[X(2)=>10]

This may cause problems under certain situations. But it doesn't seem to be a real problem in practice. Of course, we may allow other behaviors through options.

@hayd
Copy link
Contributor

hayd commented Dec 29, 2015

Did this work ever get pushed anywhere?

@StephenVavasis
Copy link
Contributor

The sorted containers are based on 2-3 trees, which are mostly functionally equivalent to red-black trees (although the latter may have a performance edge). If someone wants to implement the sorted containers using red-black trees as the substrate, there may be a performance boost. However, my own timing tests seem to indicate that the main performance drag on the currently implemented 2-3 trees is main-memory fetching (i.e., cache misses). Therefore, it may be more profitable to implement something like Bender et al.'s cache-oblivious B-trees, SIAM J. Comput., 2005.

@oxinabox
Copy link
Member

Since this hasn't been touched since 2014, shall it just be closed?
A new PR could be opened later.

@oxinabox oxinabox closed this as completed Nov 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants