Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve red-black tree implementation in map #29

Closed
jserv opened this issue Aug 2, 2022 · 13 comments
Closed

Improve red-black tree implementation in map #29

jserv opened this issue Aug 2, 2022 · 13 comments
Assignees

Comments

@jserv
Copy link
Contributor

jserv commented Aug 2, 2022

map.[ch] is the C Implementation for C++ std::map using red-black tree. It works well, but there is room for improvements. We can take some tricks used in Red-black Trees in Linux for potential speedup. In addition, current rbtree can benefit from the introduction of indirect pointers.

@jserv
Copy link
Contributor Author

jserv commented Aug 22, 2022

commit 7fb4ff7 integrates color into lowest bit of parent pointer.

@steven1lung
Copy link
Collaborator

steven1lung commented Sep 26, 2022

The expected improvements can be the followings:

  • Combining the color bit with the parent node pointer
    This is done by alignment, since map_node_t is defined to align with unsigned long (8 Bytes).
    The least three bits would be left unused, so we can integrate the color bit to the LSB.

  • Indirect pointers
    Indirect pointers can be used to decrease the number of branches to increase better performance in pipelines.

  • Embed map_node_t into structures for better cache locality
    Since the pointer we use to manipulate are embedded in the structure variable we are using, there would be one less layer of indirection and resulting to better cache locality.

@jserv
Copy link
Contributor Author

jserv commented Sep 27, 2022

  • Indirect pointers
    Indirect pointers can be used to decrease the number of branches to increase better performance in pipelines.

commit ff46dad and commit 09f4e65 lower the number of branches in red-black tree.

@jserv
Copy link
Contributor Author

jserv commented Nov 3, 2022

rbi is non-recursive, and has very little memory overhead (e.g. no parent pointer within the red-black tree node). Its features:

  • 3 tree implementations: unordered; ordered but unbalanced; balanced
  • memory allocation is done outside the library
  • lookup, iteration and insertion is without recursion and with constant memory usage
  • compact memory representation: parent pointers are not used
  • compact memory representation: option to store the red-black bit of the balanced implementation in the least significant bit of the (right) pointer
  • operations other than lookup, insertion and iteration are not implemented

We might learn from rbi.

@jserv jserv assigned EagleTw and unassigned steven1lung Feb 13, 2023
@EagleTw
Copy link
Collaborator

EagleTw commented Feb 21, 2023

Below is my proposal for the first change.

  • store the red-black bit of the balanced implementation in the least significant bit of the (right) pointer
  • Removing parent pointes and rewrite corresponding functions

I will also write testing to ensure the correctness of all functions.

What do you think?

@sysprog21 sysprog21 deleted a comment from EagleTw Mar 12, 2023
@EagleTw
Copy link
Collaborator

EagleTw commented Mar 17, 2023

The following result is recorded by the simple benchmark I created.
By inserting, searching, and deleting 10e6 random and not duplicated integers.

action type map rb
insert (sec) 1.123425 0.446589
search (sec) 1.000282 0.406894
delete (sec) 0.173655 0.44963

@jserv
Copy link
Contributor Author

jserv commented Mar 17, 2023

The following result is recorded by the simple benchmark I created. By inserting, searching, and deleting 10e6 random and not duplicated integers.

You shall provide the corresponding benchmark programs for reference purpose.

In particular, rbi.c lacks of the ability to remove. rb.h is exactly used for comparison.

@EagleTw
Copy link
Collaborator

EagleTw commented Mar 17, 2023

You shall provide the corresponding benchmark programs for reference purpose.

Repository: rbtree_bench

  • 1. mention the machine specifications which you performed the experiments on.
  • 2. provide a top-level CMakeLists.txt that describes the way to build both map and rb;
  • 3. pull Google Tests automatically. See Quickstart: Building with CMake; and
  • 4. rename the repository as rbtree-bench. There is no separation between rb and tree.

@sysprog21 sysprog21 deleted a comment from EagleTw Mar 17, 2023
@jserv
Copy link
Contributor Author

jserv commented Mar 17, 2023

  • 2. provide a top-level CMakeLists.txt that describes the way to build both map and rb;
  • 3. pull Google Tests automatically. See Quickstart: Building with CMake; and

The following changes are required.

--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -1,4 +1,4 @@
-cmake_minimum_required (VERSION 2.8.11)
+cmake_minimum_required (VERSION 3.14)
 
 project (rbtree_bench)
 
@@ -10,6 +10,7 @@ FetchContent_Declare(
   googletest
   URL https://github.com/google/googletest/archive/03597a01ee50ed33e9dfd640b249b4be3799d395.zip
 )
+FetchContent_MakeAvailable(googletest)
 
 add_subdirectory (map)
 add_subdirectory (rb)
diff --git a/map/CMakeLists.txt b/map/CMakeLists.txt
index 2701980..a67cbb5 100644
--- a/map/CMakeLists.txt
+++ b/map/CMakeLists.txt
@@ -1,4 +1,3 @@
-cmake_minimum_required(VERSION 3.5)
 project(map)
 
 enable_testing()
diff --git a/rb/CMakeLists.txt b/rb/CMakeLists.txt
index 06ab4d7..8bdf88d 100644
--- a/rb/CMakeLists.txt
+++ b/rb/CMakeLists.txt
@@ -1,10 +1,7 @@
-cmake_minimum_required(VERSION 3.5)
 project(rbi)
 
 enable_testing()
 
-find_package(GTest CONFIG REQUIRED)
-
 set (CMAKE_CXX_STANDARD 11)
 set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
 set(GCC_FLAGS "-s -Os -W -Wall -Werror")

@jserv
Copy link
Contributor Author

jserv commented Mar 17, 2023

The following result is recorded by the simple benchmark I created. By inserting, searching, and deleting 10e6 random and not duplicated integers.

Tested on eMag 8180, 64-bit 32-core Arm server.

action type map rb
insert (sec) 2.341151 1.11663
search (sec) 1.723217 0.851424
remove (sec) 0.352333 1.13698

@jserv
Copy link
Contributor Author

jserv commented Mar 18, 2023

See also:

@EagleTw
Copy link
Collaborator

EagleTw commented Mar 20, 2023

Action items for rbtree_bench:

  • Unify the interface for accessing different rbtree implementations. You may wrap rb.h to fit the functions exposed by rv32emu's map.h.
  • Shrink rb.h by replacing it with much smaller one I provided.
    --> Done by commit
  • Utilize minibench for benchmarking -- simpler and more informative.
    --> Done by commit1 and commit2
  • Check existing rbtree benchmark programs and see if we can learn from the test scenarios. e.g., rbt_test.c from bind9 project and rb-bench.
  • Not only average time but also worst-case behavior should be considered.
  • Measure memory consumptions -- check Massif from Valgrind.

@jserv
Copy link
Contributor Author

jserv commented Jun 14, 2023

Close via commit 434c466

@jserv jserv closed this as completed Jun 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants