-
Speaking of "doing something wrong", I think the above example may be accidentally exponential (b/c write amplification, perhaps?) based on the amount of log spam I'm seeing. :)
-
Also, just in case it wasn't clear: this really is just a, "hey, this works and it's neat!" issue, not "please support this in addition to your tested, stable clustering model." Feel free to close w/o further discussion, and I can let you know if I go further with my experiments. :)
-
Sounds cool! So, how would you describe the workings of the ring? Is it multiple servers with their own local db that distribute changes to all other known peers?
-
So consistent hashing maps a key to one of a set of known servers on a "ring", with the expectation that if a node is down you can simply move "around the ring" to find the next appropriate host (sketched below). Each server ends up with a roughly balanced set of keys to manage, assuming the access patterns are similarly balanced. (There's no notion of "hot" vs. "cold" keys, so if the distribution of reads and writes is highly skewed, one server can end up shouldering a disproportionate amount of load.)

In this implementation, a peer is both a server and a client to the other peers in the ring. Writes are cached locally per the usual AceBase write path, then replicated out to the ring. I've combined the client and server into one process in this case just to simplify the example, but you could just as easily let clients handle the writes, with each "peer" in the ring being a normal AceBase server. Obviously there's no guarantee of atomic transactions across peers, and writes could easily conflict, but many of those are the same properties that disconnected/offline writes bring to the system already. The model I'd like to prove out is something like git or UUCP, where any given node can arbitrarily write to its own local store, eagerly replicate to the ring when connected, and then re-hash and replicate to new servers as they become available.

The server pool is also statically configured in my example, but you can easily add and remove nodes on the ring at runtime, so there's nothing preventing you from doing server discovery via e.g. mDNS, rendezvous announcements, or a distributed hash table. The "rendezvous" model is particularly cool because while you do need a locally-reachable server to discover what peers are available, that server doesn't actually see any of the replicated traffic and so doesn't need to be a trusted host. That's also where WebRTC could slot in: use a simple websocket server for peer discovery, then tunnel the actual sync traffic over a WebRTC data channel once you've established a direct connection. Unlike the IPC server, the discovery service does need to be directly accessible to clients, but it can also be entirely stateless and unprivileged.
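For concreteness, here's a minimal sketch of that lookup behavior (illustrative TypeScript with made-up names, not code from acebase-ring): keys and servers both hash onto the same ring, and a lookup walks clockwise from the key's position, skipping any peers that appear to be down.

```typescript
import { createHash } from "crypto";

// Map a string onto a 32-bit ring position using the first 8 hex chars of an MD5 digest.
function hashOf(value: string): number {
  return parseInt(createHash("md5").update(value).digest("hex").slice(0, 8), 16);
}

class HashRing {
  private ring: { point: number; node: string }[] = [];

  constructor(nodes: string[], private replicas = 64) {
    for (const node of nodes) this.add(node);
  }

  add(node: string): void {
    // Place several "virtual nodes" per server so keys stay roughly balanced.
    for (let i = 0; i < this.replicas; i++) {
      this.ring.push({ point: hashOf(`${node}:${i}`), node });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  remove(node: string): void {
    this.ring = this.ring.filter((entry) => entry.node !== node);
  }

  // Find the first node clockwise from the key's position, skipping nodes we
  // believe are down -- the "move around the ring" behavior described above.
  lookup(key: string, isUp: (node: string) => boolean = () => true): string | undefined {
    if (this.ring.length === 0) return undefined;
    const point = hashOf(key);
    let start = this.ring.findIndex((entry) => entry.point >= point);
    if (start === -1) start = 0; // wrap around past the last position
    for (let i = 0; i < this.ring.length; i++) {
      const { node } = this.ring[(start + i) % this.ring.length];
      if (isUp(node)) return node;
    }
    return undefined;
  }
}

// Usage: a key hashes to one of the known peers; if that peer is down,
// the lookup falls through to the next one clockwise.
const ring = new HashRing(["peer-a:5757", "peer-b:5757", "peer-c:5757"]);
console.log(ring.lookup("users/ade25/posts/1"));
console.log(ring.lookup("users/ade25/posts/1", (n) => n !== "peer-b:5757"));
```

The virtual-node replicas are what keep the key distribution roughly even when peers join or leave the ring.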
-
Awesome! What you can do to prevent roundtrips of mutations is to add info to the context when replicating to other servers, so that when a mutation sent from server A to B to C comes back around to A, it'll know "that was my mutation, I can ignore it", preventing endless loops. Or even better: have all servers add their own id to the context, so they won't forward to peers that are already in the list. Really interesting stuff you're working on, keep me updated!
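For illustration, a minimal sketch of that idea (hypothetical names, not AceBase's or acebase-ring's actual API): each server stamps its own id into the mutation's context before forwarding, and drops anything it has already seen.

```typescript
// Hypothetical shapes for illustration only.
interface Mutation {
  path: string;
  value: unknown;
  context: { seenBy: string[] };
}

function forwardMutation(
  mutation: Mutation,
  selfId: string,
  peers: string[],
  send: (peer: string, m: Mutation) => void,
): void {
  // If we've already handled this mutation, it has come full circle; drop it.
  if (mutation.context.seenBy.includes(selfId)) return;

  // Stamp our own id into the context before passing it along.
  const stamped: Mutation = {
    ...mutation,
    context: { seenBy: [...mutation.context.seenBy, selfId] },
  };

  for (const peer of peers) {
    // Skip peers that have already seen this mutation.
    if (stamped.context.seenBy.includes(peer)) continue;
    send(peer, stamped);
  }
}
```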
-
So this is kind of a wild experiment, but I decided to spend a minute building a proof of concept to replicate data across AceBase servers using a hash ring (à la memcached) instead of a single global IPC server:
https://github.com/rcoder/acebase-ring
The bulk of the actual replication logic is here: https://github.com/rcoder/acebase-ring/blob/main/src/server.ts#L80-L195
I'm almost certainly doing something wrong, and the code is just a big mess, but it's an interesting thought experiment at least. The advantage over a single IPC coordinator is, of course, that no single process has to stay persistently connected to all of the storage servers. (I.e., it's a true p2p cluster model.)
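As a rough sketch of that flow (hypothetical names; the real logic lives in the src/server.ts link above): each peer applies a write to its own local database first, then uses the ring to decide which peer, if any, it should forward the write to.

```typescript
// Hypothetical types for illustration; not acebase-ring's actual interfaces.
type Peer = { id: string; send: (path: string, value: unknown) => Promise<void> };

async function replicateWrite(
  path: string,
  value: unknown,
  selfId: string,
  ring: { lookup: (key: string) => string | undefined },
  peers: Map<string, Peer>,
): Promise<void> {
  // The local store has already accepted the write; now find the ring owner for this key.
  const ownerId = ring.lookup(path);
  if (!ownerId || ownerId === selfId) return; // we own this key, nothing to forward

  const owner = peers.get(ownerId);
  if (!owner) return; // owner not currently reachable; leave it local until a later sync

  await owner.send(path, value);
}
```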
If I really find myself with time to burn I might try connecting replicas via WebRTC, but that's a science project for another day. :)