
Are confluent set and map thread-safe? #1

Closed
ipconfigme opened this issue Mar 10, 2019 · 4 comments

Comments

@ipconfigme

I see no atomic operations in the source code. Are the sets and maps thread-safe?

@liljenzin
Owner

Nodes are reference counted using atomic counters and updating operations will access hash tables that are guarded by mutexes. Otherwise the sets and maps are backed by immutable trees that make them inherently thread-safe by design.

You can think of the sets and maps as smart pointers that point to nodes in a forest where all nodes are immutable. Updating a set or a map can add new nodes to the forest and/or delete old nodes that are no longer reachable, but can never modify any existing node that can be reached from other sets or maps. As with smart pointers, one instance of a set or a map should not be updated simultaneously from different threads (as a smart pointer itself is usually not guarded in that way), but it is fine to perform read operations from different threads. It is also fine to use the copy constructor to clone a set or map in O(1) and then update the cloned instances concurrently.
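For example, the following sketch is safe under that contract (the confluent/set.h header path is an assumption here): each thread updates only its own O(1) clone, while the shared original stays untouched.

#include <thread>
#include "confluent/set.h"  // assumed header path

int main() {
  confluent::set<int> base;
  for (int i = 0; i < 100; ++i)
    base.insert(i);

  // O(1) copies: each handle points into the same immutable forest.
  confluent::set<int> a = base;
  confluent::set<int> b = base;

  // Each thread updates only its own clone, so no locking is needed.
  std::thread t1([&a] { a.insert(1000); });
  std::thread t2([&b] { b.erase(0); });
  t1.join();
  t2.join();
  // base is unchanged; a and b have diverged independently.
}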

@ipconfigme
Author

ipconfigme commented Mar 11, 2019

Yeah, a smart pointer itself is usually not guarded against simultaneous updates.
The node_ptr node_ member in the source code will be accessed from different threads; how can the node's reference count be incremented and the node pointer returned atomically?
I am building a tree structure for metadata that works with Raft, in a single-writer/multi-reader scenario, with support for O(1) snapshots. Is node_ptr node_ in map thread-safe? I think it needs atomic load and store to protect against concurrent access.

usage:
confluent::map<std::string, std::string> kv1;

thread1:
kv1.insert("test", "value");
......
kv1.remove("test2", "value2");
......
confluent::map<std::string, std::string> kv2 = kv1;
for (auto it = kv2.begin(); it != kv2.end(); ++it)
  serialize_and_dump(it->first, it->second);

thread2-threadn:
kv1.find("test1");
......
kv1.find("testx");

@liljenzin
Owner

The following is not safe:

thread1:
kv1.insert("test", "value");
thread2:
kv1.find("test1");

An obvious problem is that thread2 would explore the tree without holding a reference count on the root node, so the nodes being searched could disappear while the find operation is in progress.

I have worked with similar implementations in the past that allow this kind of concurrency by making the root node pointer atomic and also increasing the reference count on the root node, to protect against deallocation while operations like find() are exploring the tree. It comes with a severe performance penalty, though. First, it requires load and store fences at all entry points to ensure cache consistency. Then, all use of atomics, mutexes and memory barriers inserts compiler barriers that prevent optimizations the compiler could otherwise make. With the current implementation, read operations perform similarly to ephemeral implementations, but that would not be possible if additional synchronization were always added just because it would be useful in rare cases.

On the other hand, it is entirely possible to wrap the current implementation to add more synchronization when needed.

The following should be fine:

std::mutex m; 
confluent::set<int> s; 
 
confluent::set<int> get() { 
  std::unique_lock<std::mutex> lock(m); 
  return s; 
} 
 
void update(const confluent::set<int>& s1) { 
  std::unique_lock<std::mutex> lock(m); 
  s = s1; 
} 

thread1:
{ 
    auto s1 = get(); 
    s1.insert(1); 
    s1.erase(2); 
    update(s1); 
}

thread2:
{ 
  for(int i : get())
    std::cout << i << std::endl;  
}
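For the single-writer/multi-reader scenario described above, a lock-free read path is also possible by publishing each snapshot through an atomically updated shared_ptr, using the C++11 shared_ptr atomic free functions. This is only a sketch of one possible wrapper, not part of the library, and the confluent/set.h header path is assumed:

#include <atomic>
#include <memory>
#include "confluent/set.h"  // assumed header path

// Hypothetical wrapper: a single writer publishes immutable snapshots,
// and readers take an O(1) snapshot with an atomic shared_ptr load.
class snapshot_set {
 public:
  snapshot_set() : root_(std::make_shared<const confluent::set<int>>()) {}

  // Reader side: no mutex, just an atomic load of the root pointer.
  // The returned shared_ptr keeps the snapshot's nodes alive.
  std::shared_ptr<const confluent::set<int>> snapshot() const {
    return std::atomic_load(&root_);
  }

  // Writer side (one thread only): clone in O(1), modify, publish.
  void insert(int x) {
    auto next = std::make_shared<confluent::set<int>>(*snapshot());
    next->insert(x);
    std::atomic_store(&root_,
        std::shared_ptr<const confluent::set<int>>(std::move(next)));
  }

 private:
  std::shared_ptr<const confluent::set<int>> root_;
};

With this approach readers pay one atomic reference-count increment per snapshot rather than per node, avoiding the per-operation fences discussed above.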

@ipconfigme
Author

Yeah, wrapping with a mutex is simple, and I have a similar mutex-guarded treap implementation ;-)
