Fix of a null pointer dereference in flip_color #106

Closed

strohsnow
Collaborator

In accordance with this Stack Overflow post, I have made some changes to the deletion. Without them, rule 4 and rule 5 violations can occur, which leads to a segfault when a null pointer is dereferenced in the flip_color function. Along with the deletion changes, I removed some unnecessary null pointer checks, except for the ones in the remove_obj function, which are required to handle deletion of nonexistent keys. After testing on huge sets of random data, I no longer get any rule violations, segfaults, or clang analyzer warnings.
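For readers following along, here is a minimal sketch of what checking "rule 4" and "rule 5" could look like. It assumes the common red-black numbering (rule 4: a red node has no red child; rule 5: every path from a node down to a NULL leaf passes the same number of black nodes). The `node_t` type and `check_rules` name are hypothetical stand-ins, not the qlibc API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal node; the real qtreetbl_obj_t carries more fields. */
typedef struct node {
    int key;
    bool red;
    struct node *left, *right;
} node_t;

/* NULL links count as black. */
static bool is_red(const node_t *n) { return n != NULL && n->red; }

/*
 * Returns the black height of the subtree (NULL leaves count as 1),
 * or -1 if rule 4 or rule 5 is violated anywhere below n.
 */
static int check_rules(const node_t *n) {
    if (n == NULL) return 1;
    /* Rule 4 (assumed numbering): a red node must not have a red child. */
    if (n->red && (is_red(n->left) || is_red(n->right))) return -1;
    int lh = check_rules(n->left);
    int rh = check_rules(n->right);
    /* Rule 5 (assumed numbering): both sides must have equal black height. */
    if (lh < 0 || rh < 0 || lh != rh) return -1;
    return lh + (n->red ? 0 : 1);
}
```

Running a check like this after every insert and delete in a random test is one way to catch a violation long before it turns into a null dereference.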

@strohsnow strohsnow changed the title Fix of null pointer dereference in flip_color Fix of a null pointer dereference in flip_color Jul 20, 2023
Owner

@wolkykim wolkykim left a comment

Thanks for the PR. Did you run the unit tests with this change, and did they pass?
https://github.com/wolkykim/qlibc/blob/master/tests/README.md

@wolkykim
Owner

Looks like the tests all pass. What do you think about making a separate PR that first adds the segfault cases to the unit test code, and then coming back to this PR to address them? The unit test code is here: https://github.com/wolkykim/qlibc/blob/master/tests/test_qtreetbl.c

@strohsnow
Collaborator Author

Sounds good to me. I will look more closely into the random examples that cause the segfault and try to come up with a unit test.

@wolkykim
Owner

wolkykim commented Jul 21, 2023

Hi, I've created a working branch for this work named tree. Can you check it out and work on that?
I have made some improvements to the unit tests for visual inspection. If you run the test, it'll draw the internal tree structure step by step for insertion and deletion, like below. I think this would help to implement the failing scenario.

* TEST : Test tree with deletion / 0 1 2 3 4 5 6 7 8 9 0 
                                    0 
                   (#nodes=1, #red=0, #black=1, root=0)

                                    1 
                 [0]`````````````````````````````````` _ 
                   (#nodes=2, #red=1, #black=1, root=1)

                                    1 
                 [0]``````````````````````````````````[2]
                   (#nodes=3, #red=2, #black=1, root=1)

                                    1 
                  0 `````````````````````````````````` 3 
         _ ```````````````` _                [2]```````````````` _ 
                   (#nodes=4, #red=1, #black=3, root=1)

                                    1 
                  0 `````````````````````````````````` 3 
         _ ```````````````` _                [2]````````````````[4]
                   (#nodes=5, #red=2, #black=3, root=1)

@strohsnow
Collaborator Author

I like the step by step representation, that would simplify testing.

@strohsnow
Collaborator Author

What do you think about drawing the tree like this?
image

@strohsnow
Collaborator Author

Also, as you can see in the screenshot above, I have found an example where a rule 4 violation occurs, and it's only 20 elements. However, I found it randomly, so I'll look into it and try to find a violation with the minimum number of elements. Eventually this violation will cause a segfault, but that happens quite far in (several thousand nodes in my random tests). So imo we might first address the rule violation fix, which on its own resolves any future null dereference errors.

@wolkykim
Owner

wolkykim commented Jul 21, 2023 via email

@wolkykim
Owner

wolkykim commented Jul 22, 2023

I have rewritten the tree printing entirely. It's closer to the drawing you attached above. I think this is better for tracing larger trees. Hope this helps.

06cb183#diff-932784f13a92d5db3cc5e6128f9110553be5a0f7a731e06820d74437ad44f5deR848

          .--9
         |    `==8
      .--7
     |   |    .--6
     |    `==5
     |        `--4
  ---3
     |    .--2
      `--1
          `--0

5 and 8 are Red nodes.

@strohsnow
Copy link
Collaborator Author

Yeah, that's not bad, though I can make the branches shorter as well, and I kinda like right angles more. After some trial and error I found what might be the easiest way to cause a rule violation; now I just need to trace it and see exactly which lines of code we are missing. I mean, the initial PR commit probably fixes them all, but we can't just blindly push it without proper testing.
Here's my first test:
image
image
It also breaks if I delete F and R.
Also, I was thinking about recreating the tree that causes the segfault according to the clang analyzer, but this looks scary lol
image
I'll probably do it a little bit later (I can't really put much time into work since I'm on summer vacation with my family).

@wolkykim
Owner

wolkykim commented Jul 22, 2023 via email

@strohsnow
Collaborator Author

Yeah, I just copied it from one of my labs I did during the semester :D
The logic in it is similar to yours, though I don't have a branch struct and instead use a char array of fixed length. I like your approach more, so I will redo my code a little, style it better, and PR it later.
And a small note on the example I found above: if I apply the fixes, it just doesn't do any rotations. The tree stays the same except for the deleted nodes, which doesn't violate any rules.

@wolkykim
Owner

I confirm that rule 4 violation case. That's a great finding. I've added the unit test case for that 37e0c32#diff-f348265e2683d7e4a3cd33f11806d692b0c2e3769a58c0258ce8832a7c56af81R170

I guess we're almost there. I'd like to let you handle the fix from now on unless you don't want to.
I have given you direct access permission, you'll be able to create a work branch if you'd like or work directly with the tree branch. If working on a separate branch, send PRs to tree branch. We'll merge the tree branch to the master as they are without squashing and cut a new release.

Thanks! This is great!

@strohsnow
Collaborator Author

@wolkykim I have finished the fix, check it out.

@wolkykim
Owner

wolkykim commented Jul 23, 2023

@strohsnow I like it a lot. I like every commit you made there, including the pretty tree. I just added some minor style adjustments and some more unit testing. This looks awesome!!! BTW, did you happen to find the segfault case?

@strohsnow
Collaborator Author

Thank you, it's an honour for me to be a part of this project and a pleasure to hear that you like it.
As for the segfault, I have found that it was caused by the unnecessary null check in the fix function. Removing it leaves a red property violation that never resolves on its own, but there is no more segfault.

@wolkykim
Owner

@strohsnow I'll tell ya the truth. You are a rising star with a very fast-spinning brain. I appreciate your contributions to this project. BTW, about the segfault case, are you talking about this line? https://github.com/wolkykim/qlibc/blob/master/src/containers/qtreetbl.c#L1064
I see that you have removed the null check in our new branch. But the segfault is still happening? Is it possible to reproduce it in a test case?

@strohsnow
Collaborator Author

@wolkykim Appreciate it, man!
Yes, I was talking about that line, though you got it the wrong way around. This null check causes the segfault, and removing it (without changing anything else) prevents it. However, that doesn't fix any rule violations; that's what the other changes are for.
So in our new branch everything is fine: no rule violations and no segfaults.

@strohsnow
Collaborator Author

strohsnow commented Jul 23, 2023

Oh, nvm. Just removing this check isn't enough to fix the segfault. It only removes the clang analyzer warning. The segfault still happens in flip_color, but it's so far in that the analyzer just doesn't have enough depth to reach it. So I don't think I will be able to make a test for it. However, I couldn't get any segfaults with my fixes in our new branch.
My test was "insert, delete, delete, insert" with random keys, and I ran it for a good hour without any errors.

@wolkykim
Owner

ok, thanks for the clarification. Oh yeah! Sounds like the segfault case has been likely fixed in the new branch.

@strohsnow
Collaborator Author

strohsnow commented Jul 23, 2023

I was wondering though, why did you decide to go with 2-3-4 LLRB instead of 2-3 version? We could have actually fixed all the issues by moving the 4-node split part in the insert function :D
I can actually make it a parameter that user can choose when creating a qtreetbl container (2-3 or 2-3-4 version). This would require just a few lines of code.

@wolkykim
Owner

It's been a while but as I recall 2-3-4 provides a more balanced performance. Depending on the type of workload 2-3 tree has a higher chance of frequent rotations than 2-3-4? I'm open to hearing what your thoughts are. It's been a while since I was digging into this matter. Would it be better to eliminate 4-node splits? What would be the pros/cons?

@strohsnow
Collaborator Author

strohsnow commented Jul 23, 2023

@wolkykim
I haven't found proof of that, but I think you might be right. However, I found that LLRB trees are less efficient than classic RB trees.
So if we are chasing performance, it would be wiser to implement classic RB trees or their 2-3 variant, which is easier to code while providing similar results to 2-3-4 RB trees.
It's a lot of work, but probably worth it, though this lib will lose its uniqueness in terms of tree choice.

@wolkykim
Owner

wolkykim commented Jul 23, 2023 via email

@strohsnow
Collaborator Author

@wolkykim
There is a comparison in the first link:
`
This matters for performance. I measured the number of rotations per operation in the insert–find–delete benchmark above.

Insert phase: Normal RB 0.582 rotations/insert, LLRB 1.725 rotations/insert (2.96x—probably a constant factor)
Delete phase: Normal RB 0.380 rotations/delete, LLRB 19.757 rotations/delete (51.99x!!—both a constant factor and a log-N factor)

Here are some overall performance numbers.

Insert phase: conventional RB 0.476s, LLRB 0.560s (1.18x)
Find phase: conventional RB 1.648s, LLRB 1.680s (1.02x)
Delete phase: conventional RB 0.612s, LLRB 1.032s (1.69x)
`

As you can see it doesn't affect search time, but it does affect both insert and delete operations.

If we are talking 2-3 LLRB vs 2-3-4 LLRB, then there is no difference performance-wise, just a few small differences in code. You can see code written in Java that implements both the 2-3 and 2-3-4 variants here. BTW, the 2-3 variant doesn't need the moveRedLeft fix that is present in the code above.

If we are talking 2-3 RB vs 2-3-4 RB, then there is also no difference in performance (2-3 is negligibly slower), however it's easier to implement than 2-3-4 RB.

Now that I think about it, we should probably leave it as it is now in our new branch. There are probably a lot of other libs that have RB trees, and there is GoLLRB, which is a 2-3 variant, for anyone curious. qlibc will remain unique as the only lib that has the 2-3-4 variant of LLRB implemented (according to Wikipedia).

@wolkykim
Owner

wolkykim commented Jul 23, 2023 via email

@strohsnow
Collaborator Author

Yeah, it's not a well-known fact, so we gotta test it ourselves. I will rewrite qtreetbl using RB trees on a different branch when I have free time and run performance tests. However, I've already run some tests on both variants of LLRB, and the results are the same.
2-3-4 LLRB:

Insert 1000000 keys: 376.7 ms
Lookup 1000000 keys: 262.1 ms
Delete 1000000 keys: 470.9 ms

2-3 LLRB:

Insert 1000000 keys: 375.3 ms
Lookup 1000000 keys: 261.9 ms
Delete 1000000 keys: 472.2 ms

I ran every test 10 times and calculated the average.
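A minimal sketch of the "run it several times and average" approach, for anyone who wants to reproduce numbers like these. It uses the standard `clock()` call, which measures CPU time rather than wall time; `mean`, `time_ms`, `average_ms`, `busy_loop`, and `MAX_RUNS` are hypothetical names for illustration and are not part of the qlibc test suite.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define MAX_RUNS 64 /* upper bound on repetitions kept in the sample buffer */

/* Arithmetic mean of n samples; returns 0 for an empty set. */
static double mean(const double *samples, size_t n) {
    if (n == 0) return 0.0;
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) sum += samples[i];
    return sum / (double)n;
}

/* Time a single call of work(arg), in milliseconds of CPU time. */
static double time_ms(void (*work)(void *), void *arg) {
    clock_t start = clock();
    work(arg);
    return 1000.0 * (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Run work(arg) `runs` times (capped at MAX_RUNS) and return the mean time. */
static double average_ms(void (*work)(void *), void *arg, size_t runs) {
    double samples[MAX_RUNS];
    assert(work != NULL);
    if (runs > MAX_RUNS) runs = MAX_RUNS;
    for (size_t i = 0; i < runs; i++) samples[i] = time_ms(work, arg);
    return mean(samples, runs);
}

/* Example workload: sums 0..n-1 into a volatile so the loop is not removed. */
static void busy_loop(void *arg) {
    volatile long sink = 0;
    int n = *(int *)arg;
    for (int i = 0; i < n; i++) sink += i;
}
```

In a real comparison the workload would be the insert, lookup, or delete phase over the same key set, e.g. `average_ms(insert_phase, &ctx, 10)` with a hypothetical `insert_phase` callback.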

@wolkykim
Owner

wolkykim commented Jul 24, 2023

Sounds good to me.

I've added a perf tester for a more detailed comparison.
Do you want to run a test again with your 2-3 updates? I'm quite curious
how the number will come out.

Here's a test output on my laptop. It's quite interesting.

* TEST : Test tree performance / random 
  Sample 593689054, 4226891818, 1085422463, 847579505, 1889779975, 236698494, 3319477102, 1343918321, 2472494321, 101886131, ... (Total 1000000)
  Insert 1000000 keys: 723ms - flip 0.57, rotate 1.55 (L 0.88, R 0.67)
  Lookup 1000000 keys: 534ms - flip 0.00, rotate 0.00 (L 0.00, R 0.00)
  Delete 1000000 keys: 916ms - flip 3.31, rotate 16.46 (L 8.21, R 8.25)
 OK (2000003 assertions, 2273ms)
* TEST : Test tree performance / ascending 
  Sample 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ... (Total 1000000)
  Insert 1000000 keys: 152ms - flip 1.00, rotate 3.00 (L 2.00, R 1.00)
  Lookup 1000000 keys: 121ms - flip 0.00, rotate 0.00 (L 0.00, R 0.00)
  Delete 1000000 keys: 205ms - flip 1.00, rotate 15.62 (L 8.31, R 7.31)
 OK (2000003 assertions, 507ms)
* TEST : Test tree performance / descending 
  Sample 1000000, 999999, 999998, 999997, 999996, 999995, 999994, 999993, 999992, 999991, ... (Total 1000000)
  Insert 1000000 keys: 175ms - flip 1.00, rotate 1.00 (L 0.00, R 1.00)
  Lookup 1000000 keys: 124ms - flip 0.00, rotate 0.00 (L 0.00, R 0.00)
  Delete 1000000 keys: 451ms - flip 1.00, rotate 45.28 (L 22.14, R 23.14)
 OK (2000003 assertions, 784ms)
* TEST : Test tree performance / 1 random + 3 ascending mix 
  Sample 3689054, 10, 20, 30, 9779975, 50, 60, 70, 2494321, 90, ... (Total 1000000)
  Insert 1000000 keys: 305ms - flip 0.74, rotate 1.99 (L 1.20, R 0.79)
  Lookup 1000000 keys: 298ms - flip 0.00, rotate 0.00 (L 0.00, R 0.00)
  Delete 1000000 keys: 369ms - flip 2.08, rotate 14.14 (L 7.13, R 7.01)
 OK (2000003 assertions, 1028ms)
* TEST : Test tree performance / low high low high 
  Sample 0, 999999, 2, 999997, 4, 999995, 6, 999993, 8, 999991, ... (Total 1000000)
  Insert 1000000 keys: 146ms - flip 0.50, rotate 1.50 (L 0.75, R 0.75)
  Lookup 1000000 keys: 120ms - flip 0.00, rotate 0.00 (L 0.00, R 0.00)
  Delete 1000000 keys: 236ms - flip 1.03, rotate 26.09 (L 13.08, R 13.01)
 OK (2000003 assertions, 563ms)

@wolkykim
Owner

Some tests were not generating the sample keys as intended. Updated the output above.

@strohsnow
Collaborator Author

| Workload | Op | 2-3-4 Time (ms) | 2-3-4 Flips | 2-3-4 Rotations | 2-3-4 Left | 2-3-4 Right | 2-3 Time (ms) | 2-3 Flips | 2-3 Rotations | 2-3 Left | 2-3 Right |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Random | Insert | 969.2 | 0.57 | 1.08 | 0.65 | 0.43 | 1108.6 | 0.75 | 1.19 | 0.71 | 0.48 |
| Random | Lookup | 714.8 | 0 | 0 | 0 | 0 | 724.6 | 0 | 0 | 0 | 0 |
| Random | Delete | 1283.2 | 3.31 | 16.46 | 8.21 | 8.25 | 1209.2 | 15.33 | 19.45 | 9.72 | 9.73 |
| Ascending | Insert | 317.4 | 1 | 1 | 1 | 0 | 322.4 | 1 | 1 | 1 | 0 |
| Ascending | Lookup | 232.4 | 0 | 0 | 0 | 0 | 228.8 | 0 | 0 | 0 | 0 |
| Ascending | Delete | 494.4 | 1 | 15.62 | 8.31 | 7.31 | 431.6 | 15.62 | 15.62 | 8.31 | 7.31 |
| Descending | Insert | 339 | 1 | 1 | 0 | 1 | 342 | 1 | 1 | 0 | 1 |
| Descending | Lookup | 225.8 | 0 | 0 | 0 | 0 | 253 | 0 | 0 | 0 | 0 |
| Descending | Delete | 585.8 | 1 | 45.28 | 22.14 | 23.14 | 462 | 15.62 | 31.19 | 15.09 | 16.09 |
| Random + Asc | Insert | 523.2 | 0.74 | 1.19 | 0.8 | 0.39 | 552.2 | 0.83 | 1.24 | 0.84 | 0.4 |
| Random + Asc | Lookup | 610.2 | 0 | 0 | 0 | 0 | 561 | 0 | 0 | 0 | 0 |
| Random + Asc | Delete | 584.6 | 2.06 | 13.86 | 6.99 | 6.87 | 596 | 13.42 | 17.35 | 8.75 | 8.6 |
| Low High | Insert | 478.2 | 0.5 | 1.25 | 0.62 | 0.62 | 507.8 | 1 | 1.5 | 0.75 | 0.75 |
| Low High | Lookup | 308.2 | 0 | 0 | 0 | 0 | 313.2 | 0 | 0 | 0 | 0 |
| Low High | Delete | 437.2 | 1.03 | 26.09 | 13.08 | 13.01 | 439.4 | 16.27 | 24.05 | 12.06 | 11.98 |

Here are the results, 2-3-4 on the left and 2-3 on the right

@strohsnow
Collaborator Author

I don't like the amount of color flips in the 2-3 variant, we should stick with 2-3-4.

@wolkykim
Owner

wolkykim commented Jul 25, 2023

This is valuable data. Fantastic. It looks like 2-3-4 makes it a better choice for insert&lookup heavy applications. Is the modified version for the 2-3 model without move-red-left?

@strohsnow
Collaborator Author

The test for the 2-3 model above uses move-red-left from the master branch. You gave me an idea to try and test it with the fix provided in the Stack Overflow post; maybe it will change some numbers. I will brb with the test and put the code on the branch as comments.

@strohsnow
Collaborator Author

| Workload | Op | Time (ms) | Flips | Rotations | Left | Right |
|---|---|---|---|---|---|---|
| Random | Insert | 1084.6 | 0.75 | 1.19 | 0.71 | 0.48 |
| Random | Lookup | 781.8 | 0 | 0 | 0 | 0 |
| Random | Delete | 1218 | 15.33 | 19.45 | 9.72 | 9.73 |
| Ascending | Insert | 315.6 | 1 | 1 | 1 | 0 |
| Ascending | Lookup | 227.8 | 0 | 0 | 0 | 0 |
| Ascending | Delete | 431.8 | 15.62 | 15.62 | 8.31 | 7.31 |
| Descending | Insert | 345.6 | 1 | 1 | 0 | 1 |
| Descending | Lookup | 250.2 | 0 | 0 | 0 | 0 |
| Descending | Delete | 465 | 15.62 | 31.19 | 15.09 | 16.09 |
| Random + Asc | Insert | 599.8 | 0.83 | 1.24 | 0.84 | 0.4 |
| Random + Asc | Lookup | 629.2 | 0 | 0 | 0 | 0 |
| Random + Asc | Delete | 655.2 | 13.42 | 17.35 | 8.75 | 8.6 |
| Low High | Insert | 529.4 | 1 | 1.5 | 0.75 | 0.75 |
| Low High | Lookup | 322.2 | 0 | 0 | 0 | 0 |
| Low High | Delete | 451 | 16.27 | 24.05 | 12.06 | 11.98 |

2-3 with move-red-left fix

@strohsnow
Collaborator Author

No changes, so I guess it is indeed a 2-3-4 exclusive fix.

@wolkykim
Owner

So I guess the 2-3 variant conversion basically moves the 4-node splitting from the way down to the way up in the insert path, but how is move-red-left eliminated? Can you post a code diff here?

@strohsnow
Collaborator Author

strohsnow commented Jul 25, 2023

@wolkykim
The move_red_left function is not eliminated, it is altered. You can see below the if-condition commented as 2-3-4 exclusive. Without this condition, in case obj->right is a balanced 4-node, it will leave us with a right-leaning red edge. And since we proceed going down to the left after calling move_red_left in the remove_obj function, we will never fix this edge on the way up. However, this condition doesn't have any effect on the 2-3 model, as this case (4-node on the search path) is impossible in it, so we don't need it.

static qtreetbl_obj_t *move_red_left(qtreetbl_obj_t *obj) {
    flip_color(obj);
    if (is_red(obj->right->left)) {
        obj->right = rotate_right(obj->right);
        obj = rotate_left(obj);
        flip_color(obj);
        // 2-3-4 exclusive
        if (is_red(obj->right->right)) {
            obj->right = rotate_left(obj->right);
        }
    }
    return obj;
}
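To make the right-leaning-edge case concrete, here is a self-contained sketch that runs the same move_red_left logic on a tiny hand-built tree. The `node_t` type and the rotation and flip helpers are assumptions written in the usual Sedgewick LLRB style; the real qtreetbl helpers may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for qtreetbl_obj_t: only what the rotations touch. */
typedef struct node {
    int key;
    bool red;
    struct node *left, *right;
} node_t;

static bool is_red(const node_t *n) { return n != NULL && n->red; }

/* Lift h->right above h; the new top inherits h's color, h turns red. */
static node_t *rotate_left(node_t *h) {
    node_t *x = h->right;
    h->right = x->left;
    x->left = h;
    x->red = h->red;
    h->red = true;
    return x;
}

/* Mirror image of rotate_left. */
static node_t *rotate_right(node_t *h) {
    node_t *x = h->left;
    h->left = x->right;
    x->right = h;
    x->red = h->red;
    h->red = true;
    return x;
}

/* Toggles h and both children; both children must exist. */
static void flip_color(node_t *h) {
    assert(h->left != NULL && h->right != NULL);
    h->red = !h->red;
    h->left->red = !h->left->red;
    h->right->red = !h->right->red;
}

/* Same logic as the move_red_left posted above. */
static node_t *move_red_left(node_t *obj) {
    flip_color(obj);
    if (is_red(obj->right->left)) {
        obj->right = rotate_right(obj->right);
        obj = rotate_left(obj);
        flip_color(obj);
        /* 2-3-4 exclusive: re-lean the red edge left on the right side */
        if (is_red(obj->right->right)) {
            obj->right = rotate_left(obj->right);
        }
    }
    return obj;
}
```

Starting from a red node 2 with a black left child 1 and a black right child 4 whose children 3 and 5 are both red (a balanced 4-node), the result has 3 on top, 2 on the left with a red 1 below it, and 5 on the right with a red 4 as its left child. Without the last condition, 4 would keep a red right child 5.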

Another altered function is fix. Unlike the 2-3 model, in the 2-3-4 model we don't split 4-nodes on the way up, so the if-condition at the bottom is commented as 2-3 exclusive. There is also an additional if-condition at the top, commented as 2-3-4 exclusive. Without it, when obj->right->left is also red, a single rotate_left will leave us with a left-leaning red edge followed by a right-leaning red edge, which is a violation of the red property. If we do rotate_right followed by rotate_left, we get a balanced 4-node. This case doesn't occur in the 2-3 model, so we can leave it out.

static qtreetbl_obj_t *fix(qtreetbl_obj_t *obj) {
    // rotate right red to left
    if (is_red(obj->right)) {
        // 2-3-4 exclusive
        if (is_red(obj->right->left)) {
            obj->right = rotate_right(obj->right);
        }
        obj = rotate_left(obj);
    }
    // rotate left red-red to right
    if (is_red(obj->left) && is_red(obj->left->left)) {
        obj = rotate_right(obj);
    }
    // split 4-nodes (2-3 exclusive)
    // if (is_red(obj->left) && is_red(obj->right)) {
    //     flip_color(obj);
    // }
    return obj;
}
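The effect of the 2-3-4 exclusive guard in fix can be seen on a three-node example. The sketch below folds both variants into one function behind a hypothetical `mode234` flag; the `node_t` type and helpers are assumed Sedgewick-style stand-ins rather than the actual qtreetbl code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for qtreetbl_obj_t. */
typedef struct node {
    int key;
    bool red;
    struct node *left, *right;
} node_t;

static bool is_red(const node_t *n) { return n != NULL && n->red; }

static node_t *rotate_left(node_t *h) {
    node_t *x = h->right;
    h->right = x->left;
    x->left = h;
    x->red = h->red;
    h->red = true;
    return x;
}

static node_t *rotate_right(node_t *h) {
    node_t *x = h->left;
    h->left = x->right;
    x->right = h;
    x->red = h->red;
    h->red = true;
    return x;
}

static void flip_color(node_t *h) {
    assert(h->left != NULL && h->right != NULL);
    h->red = !h->red;
    h->left->red = !h->left->red;
    h->right->red = !h->right->red;
}

/* mode234 == true follows the 2-3-4 path above; false follows the 2-3 path. */
static node_t *fix(node_t *obj, bool mode234) {
    if (is_red(obj->right)) {
        /* 2-3-4 exclusive: straighten a red left grandchild first */
        if (mode234 && is_red(obj->right->left)) {
            obj->right = rotate_right(obj->right);
        }
        obj = rotate_left(obj);
    }
    if (is_red(obj->left) && is_red(obj->left->left)) {
        obj = rotate_right(obj);
    }
    /* 2-3 exclusive: split 4-nodes on the way up */
    if (!mode234 && is_red(obj->left) && is_red(obj->right)) {
        flip_color(obj);
    }
    return obj;
}
```

With a black 2 whose red right child 4 has a red left child 3, the guarded path returns 3 on top with red children 2 and 4 (a balanced 4-node). Feeding the same input through the unguarded path returns 4 on top with a red 2 whose right child 3 is also red, which is the red-red situation described above.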

The last change is in the put_obj function. In the 2-3 model we split 4-nodes on the way up, so we place the corresponding if-condition at the bottom. In the 2-3-4 model we split 4-nodes on the way down, so we leave it at the top.

    // split 4-nodes on the way down (2-3-4 exclusive)
    if (is_red(obj->left) && is_red(obj->right)) {
        flip_color(obj);
    }

    ...

    // fix right-leaning reds on the way up
    if (is_red(obj->right) && !is_red(obj->left)) {
        obj = rotate_left(obj);
    }

    // fix two reds in a row on the way up
    if (is_red(obj->left) && is_red(obj->left->left)) {
        obj = rotate_right(obj);
    }

    // split 4-nodes on the way up (2-3 exclusive)
    // if (is_red(obj->left) && is_red(obj->right)) {
    //     flip_color(obj);
    // }
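Putting the pieces together, a complete insert with the split position selectable at run time might look like the sketch below. The `mode234` flag, `node_t`, `put`, `insert`, `count_red`, and `free_tree` are hypothetical names for illustration; this is not the qtreetbl put_obj itself, only the same order of steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical minimal node; keys only, no values. */
typedef struct node {
    int key;
    bool red;
    struct node *left, *right;
} node_t;

static bool is_red(const node_t *n) { return n != NULL && n->red; }

static node_t *rotate_left(node_t *h) {
    node_t *x = h->right;
    h->right = x->left;
    x->left = h;
    x->red = h->red;
    h->red = true;
    return x;
}

static node_t *rotate_right(node_t *h) {
    node_t *x = h->left;
    h->left = x->right;
    x->right = h;
    x->red = h->red;
    h->red = true;
    return x;
}

static void flip_color(node_t *h) {
    assert(h->left != NULL && h->right != NULL);
    h->red = !h->red;
    h->left->red = !h->left->red;
    h->right->red = !h->right->red;
}

static node_t *put(node_t *h, int key, bool mode234) {
    if (h == NULL) {
        node_t *n = calloc(1, sizeof(*n));
        if (n == NULL) abort();
        n->key = key;
        n->red = true; /* new nodes always start red */
        return n;
    }
    /* 2-3-4: split 4-nodes on the way down */
    if (mode234 && is_red(h->left) && is_red(h->right)) flip_color(h);
    if (key < h->key) h->left = put(h->left, key, mode234);
    else if (key > h->key) h->right = put(h->right, key, mode234);
    /* fix right-leaning reds, then two reds in a row, on the way up */
    if (is_red(h->right) && !is_red(h->left)) h = rotate_left(h);
    if (is_red(h->left) && is_red(h->left->left)) h = rotate_right(h);
    /* 2-3: split 4-nodes on the way up */
    if (!mode234 && is_red(h->left) && is_red(h->right)) flip_color(h);
    return h;
}

/* Inserts key and keeps the root black. */
static node_t *insert(node_t *root, int key, bool mode234) {
    root = put(root, key, mode234);
    root->red = false;
    return root;
}

static int count_red(const node_t *n) {
    if (n == NULL) return 0;
    return (n->red ? 1 : 0) + count_red(n->left) + count_red(n->right);
}

static void free_tree(node_t *n) {
    if (n == NULL) return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}
```

Inserting 0 through 4 in 2-3-4 mode gives the same red counts as the step-by-step drawing earlier in this thread (2 red after three keys, 1 after four, 2 after five). In 2-3 mode the third insert splits immediately, leaving no red nodes.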

I hope the difference is clear now. You can also call me on Discord if you have any more questions. My username is stroh.. I have also almost finished implementing CLRB, so I can run some tests and compare it to LLRB.

@strohsnow
Collaborator Author

Well, well, well. Here are the CLRB tests.

| Workload | Op | Time (ms) | Flips | Rotations | Left | Right |
|---|---|---|---|---|---|---|
| Random | Insert | 1011 | 0.51 | 0.58 | 0.29 | 0.29 |
| Random | Lookup | 893.6 | 0 | 0 | 0 | 0 |
| Random | Delete | 942 | 0 | 0.42 | 0.23 | 0.2 |
| Ascending | Insert | 172.2 | 1 | 1 | 1 | 0 |
| Ascending | Lookup | 205.8 | 0 | 0 | 0 | 0 |
| Ascending | Delete | 172.6 | 0 | 0.5 | 0.5 | 0 |
| Descending | Insert | 175.2 | 1 | 1 | 0 | 1 |
| Descending | Lookup | 124.4 | 0 | 0 | 0 | 0 |
| Descending | Delete | 158.4 | 0 | 0.5 | 0 | 0.5 |
| Random + Asc | Insert | 340.2 | 0.72 | 1 | 0.64 | 0.36 |
| Random + Asc | Lookup | 472.2 | 0 | 0 | 0 | 0 |
| Random + Asc | Delete | 443.8 | 0 | 0.52 | 0.34 | 0.18 |
| Low High | Insert | 156.6 | 0.5 | 0.75 | 0.25 | 0.5 |
| Low High | Lookup | 130.4 | 0 | 0 | 0 | 0 |
| Low High | Delete | 121.4 | 0 | 0.25 | 0.12 | 0.12 |

@strohsnow
Collaborator Author

The Ascending and Descending tests look sweet; however, I don't think the loss in lookup speed in the Random test is worth the gain in deletion speed. It's up to you now whether we should migrate to CLRB. It's still a lot of work, but I can handle most of it. I will still need you to write the tests, because you are very good at it imo.
I took the algorithms from Cormen's Introduction to Algorithms book. They are not hard to understand, and the NilSentinel they introduce is very smart, since there is no more need to handle any corner cases (root or leaf nodes).

@wolkykim
Owner

Hey Strow. This is great. Thanks for the detailed clarification above as well.
Yes, the insertion and deletion overhead is clearly lower. I think the lookup speed is mainly affected by the internal tree structure that forms as data comes in, so the difference from the lookup time is what represents the efficiency of insert and delete. It looks like CLRB has a better best-case scenario with sequential input. I wonder how the numbers change with an increased workload size. BTW, you don't have to make a pretty table here, it's a lot of work. Just copy&paste. Nice work!

@wolkykim
Owner

About the migration, I think it's something to think about, but for now let's wrap the new branch up and get it rolled out. If no more work is ongoing, then I'll aim for tomorrow evening for the merge.

@wolkykim
Owner

wolkykim commented Jul 25, 2023

I ran some calculations for fun. For the records,

  • Professor Eddie's finding:
    Insert phase: conventional RB 0.476s, LLRB 0.560s (1.18x)
    Find phase: conventional RB 1.648s, LLRB 1.680s (1.02x) <= This is for 4M lookups
    Find phase: conventional RB 0.412s, LLRB 0.42 (1.02x). <= (calculated 1M lookup, divided by 4)
    Delete phase: conventional RB 0.612s, LLRB 1.032s (1.69x)

  • Converted to your laptop speed (he's got a 2.167x faster machine than yours):
    Insert phase: conventional RB 1.03s, LLRB 1.21s (1.18x)
    Find phase: conventional RB 0.89s, LLRB 0.91 (1.02x). (calculated 1M lookup)
    Delete phase: conventional RB 1.32s, LLRB 2.23s (1.69x)
    (Assumption: internal data structure formed in random feed is similar to ours, lookup speed scales the same based on CPU speed)

  • Versus our benchmark:
    Insert phase: conventional RB 1.01s, LLRB 0.969s (1.04x FASTER)
    Find phase: conventional RB 0.89s, LLRB 0.714s (1.25x FASTER)
    Delete phase: conventional RB 0.94s, LLRB 1.283s (1.36x SLOWER but 1.02x FASTER than his RB)

  • RB insert/find times are both the same as his times (which suggests the scaling may be within a reasonable error range)
  • Your RB delete time is a lot faster than his time
  • It shows LLRB is actually faster in Insert and Find phase.
  • Delete time is slower than our RB algorithm time (but beats his RB time :)

If the calculation is reasonable, we've got a quite different result.

@strohsnow
Collaborator Author

Oh, don't worry, a pretty table is the least difficult work out there haha.
Merge sounds good. I still have a few commits I didn't push, mostly code style changes. I will push them when I wake up. Though I don't know when your evening is, because I feel like we live in totally different timezones. I am in GMT+3, wbu?

@strohsnow
Collaborator Author

Find phase: conventional RB 0.412s, LLRB 0.42 (1.02x). <= (calculated 1M lookup, divided by 4)

I'm not sure this is correct, it's logarithmic, not linear.
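One way to read this objection, as a rough estimate rather than a measurement: a lookup in a balanced tree of N keys costs about log2(N) comparisons, so N lookups cost about N log2(N). Assuming (this is not stated in the thread) that the 4M-lookup figure was taken on a tree holding 4M keys and the estimate is for 1M lookups on a tree holding 1M keys:

```latex
\frac{T(4\mathrm{M})}{T(1\mathrm{M})}
  \approx \frac{4\cdot 10^{6}\,\log_2(4\cdot 10^{6})}{10^{6}\,\log_2(10^{6})}
  \approx \frac{4 \times 21.93}{19.93}
  \approx 4.40
```

Under that assumption the divisor would be about 4.4 rather than 4, which shifts the 1M estimate by roughly 10%. Cache effects are ignored here, and they can easily dominate at these sizes.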

Two times faster machine, wow, that's something. I am running Ryzen 5 5600U on my laptop.

No idea how he got such bad delete results, I literally copy pasted code from the book lol.

I wonder if any optimization can be done to the LLRB deletion algorithm in order to cut down the rotations and potentially decrease time.

@strohsnow
Collaborator Author

I found a non-recursive LLRB implementation written in C preprocessor macros in jemalloc. I could try to rewrite it and see if it's any faster. I suppose it will eliminate many extra rotations.

@strohsnow
Collaborator Author

Can't make the deletion work :(
Insert and Lookup results look sweet, they are roughly the same as our 2-3-4 model in Random test, but 15-30% faster in Ascending and Descending tests. And I believe Delete will be much faster as well, but it just stack smashes.
I pushed new functions to llrb branch, maybe you could check it out and possibly find the reason. I took the code from here.

| Workload | Op | Time (ms) | Flips | Rotations | Left | Right |
|---|---|---|---|---|---|---|
| Random | Insert | 926.6 | 0.27 | 1.19 | 0.71 | 0.48 |
| Random | Lookup | 756.6 | 0 | 0 | 0 | 0 |
| Ascending | Insert | 202.6 | 1 | 1 | 1 | 0 |
| Ascending | Lookup | 180.4 | 0 | 0 | 0 | 0 |
| Descending | Insert | 287 | 0 | 1 | 0 | 1 |
| Descending | Lookup | 188.2 | 0 | 0 | 0 | 0 |
| Random + Asc | Insert | 569.4 | 0.42 | 1.24 | 0.84 | 0.4 |
| Random + Asc | Lookup | 606.2 | 0 | 0 | 0 | 0 |
| Low High | Insert | 417.8 | 0.25 | 1.5 | 0.75 | 0.75 |
| Low High | Lookup | 400.4 | 0 | 0 | 0 | 0 |

@wolkykim
Owner

All the work and improvements are now merged in. Woohoo!!! I appreciate your hard work and efforts on the tree improvements. I look forward to continuing our work and collaboration. A new release cut is coming up soon.

@wolkykim
Owner

wolkykim commented Jul 27, 2023

Here is a quick reference for our test results.

Random 1 million keys

| Metric | 2-3-4 LLRB | 2-3 LLRB | 2-3-4 RB |
|---|---|---|---|
| Insert | 🟢 969ms | 1108ms | 1011ms |
| Lookup | 714ms | 724ms | 893ms |
| Delete | 1283ms | 1209ms | 🟢 942ms |
| Rotations / Insert | 1.08 | 1.19 | 0.58 |
| Rotations / Delete | 16.46 | 19.45 | 0.42 |

@strohsnow
Collaborator Author

Yay! I enjoyed working with you. It feels good to actually put my newly acquired programming skills (I just finished a DSA course this semester) into real-world projects.
A small correction to the benchmark summary: the last column should be 2-3-4 RB. I didn't implement the RB from the paper, but the one from Cormen's book.

@wolkykim
Owner

wolkykim commented Jul 27, 2023

@strohsnow really great work you've done there. A new release is out. Check it out https://github.com/wolkykim/qlibc/releases/tag/v2.5.0

@strohsnow
Collaborator Author

I like it a lot! I will continue working on the llrb branch and try to figure out iterative implementations and test how they perform.

@wolkykim wolkykim closed this Jul 29, 2023