
Dynamic Nodes in distribution #246

Open
mitcheaton1 opened this issue Mar 25, 2020 · 12 comments

@mitcheaton1

Is this suitable for Kubernetes? It seems you have to define the nodes up front and can't change membership while running. It would be a great feature to have dynamic node membership with libcluster.

@kyliepace

Is it true that Cachex wouldn't work well in something like ECS?

@mattathompson

I also found this to be a pretty big bummer; it seems like it could be easily remedied. We've implemented our own distributed cache around this tool and I was hoping to skip that for a new service and take advantage of this feature.

@gfviegas

gfviegas commented Nov 3, 2023

I don't have the faintest idea whether it would work, but perhaps the :nodes option in start_link could be a function that returns the node list. Then, when fetching/deleting/inserting/etc., the module could call this function and apply the operation on all returned nodes.
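Purely as a hypothetical sketch of that idea (a function-valued :nodes option is not part of the current Cachex API):

```elixir
# Hypothetical only: a zero-arity function as the :nodes option, evaluated
# at call time so that cluster membership can change while running.
Cachex.start_link(:my_cache,
  nodes: fn -> [node() | Node.list()] end
)
```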

@whitfin
Owner

whitfin commented Jan 4, 2024

Hi all!

Happy to work on this, given there's traction in the upvotes. I'm not sure exactly what the best option here is, so I'm open to suggestions. I obviously would like to keep the distribution logic fairly straightforward (after all, this is an in-memory caching lib not a distributed data lib).

@mattathompson given you've done a lot of the work for this already, how exactly did you implement it? Is it something that can be shared or is it proprietary?

@gfviegas cool idea, I have no clue how practical it is but sounds fairly elegant to me (at least at a glance).

I'm not scheduling it for a specific version, but v3.7.0 is coming fairly soon and if we can figure this out I'm happy to include it in there! If it's a case of just updating the nodes in the cache, this is simple - i.e. something else is taking care of nodes coming up/down and just updating the reference in Cachex.

@gfviegas

gfviegas commented Jan 4, 2024

Note: I've since migrated my clustered app using Cachex to Nebulex. It implements several adapters, including a Cachex one.

I'd suggest discussing with the Nebulex owners how clusters could be natively or better supported.

@whitfin
Owner

whitfin commented Jan 4, 2024

I'm fairly confident the way to solve this is to migrate from jump hashing (numeric) to something like libring for key hashing. We can then simply add two new functions to the main Cachex API:

  • add_node/1
  • remove_node/1

Which will make this transparent, and people can plug into it via whatever clustering lib they decide to use (even if it's just :net_kernel.monitor_nodes/2). This doesn't require much change, except that the internal state of the nodes list will have to move to a Ring. I don't believe this has to result in a major version bump, but perhaps it's best to do one anyway.
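As a rough sketch of how someone could wire this up themselves, assuming the proposed add/remove functions existed on Cachex (the names and signatures below are placeholders, not a released API):

```elixir
# Sketch only: forward OTP node up/down events to the proposed
# (not yet released) add/remove functions on a cache.
defmodule MyApp.CacheNodeWatcher do
  use GenServer

  def start_link(cache), do: GenServer.start_link(__MODULE__, cache)

  @impl true
  def init(cache) do
    # subscribe to cluster membership changes from the Erlang kernel
    :ok = :net_kernel.monitor_nodes(true, node_type: :visible)
    {:ok, cache}
  end

  @impl true
  def handle_info({:nodeup, node, _info}, cache) do
    Cachex.add_node(cache, node)      # hypothetical API from the proposal above
    {:noreply, cache}
  end

  def handle_info({:nodedown, node, _info}, cache) do
    Cachex.remove_node(cache, node)   # hypothetical API from the proposal above
    {:noreply, cache}
  end
end
```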

Maybe @bitwalker can chime in here as to whether libring seems like a good fit?

@whitfin whitfin added this to the v3.7.0 milestone Jan 4, 2024
@whitfin whitfin changed the title from "Dynamic Nodes in distriubution" to "Dynamic Nodes in distribution" Jan 4, 2024
@whitfin whitfin modified the milestones: v3.7.0, v4.0.0 Mar 22, 2024
@whitfin
Owner

whitfin commented Apr 1, 2024

Hi everyone!

Although I don't necessarily think it's the best approach, I filed #344 for some of the initial work here to make this available. The plan is to use libring for hashing and add two new functions to Cachex to add/remove nodes. I'm also exploring making the routing logic configurable, although no promises there.

In the initial 4.0 release there will be no automatic addition/removal, but you can hook this up manually by using the APIs (which should be tiny, as OTP has stuff for this). If there's demand and/or a contribution for going further than this, I'll be happy to add it in future though! I'd do it sooner, but time is limited and I'd rather get something out than nothing.

Note that even if a cache can remove nodes, it's not going to be able to reshuffle all of the data across nodes; those values will simply drop from the cache. As such, if your containers are added/removed often, you might not actually benefit that much from a distributed cache over simply falling back to a central cache.

Thanks for all your input, and sorry for the delay - sometimes it's hard to find the time, even if it has been literal years 😅

@burmajam

burmajam commented Apr 8, 2024

Hi all. Maybe usage of https://github.com/derekkraan/delta_crdt_ex could solve the "reshuffle of data" problem when BEAM nodes are removed/added in a cluster?

@whitfin
Owner

whitfin commented Apr 8, 2024

Hi @burmajam!

It's definitely possible, however that would probably require many, many changes and end up being too involved. It looks like everything under the hood would likely have to change, and there's simply too much work there.

The intent for Cachex has always been caching of data; I do not intend to get into the world of supporting distributed databases. The idea of adding "distributed" caches in the first place was so that a newly added Node B would be able to fetch pre-warmed data from Node A. If a user wants a persistent distributed cache, the answer already exists with (e.g.) Redis, with Cachex acting as a local hot layer over a single remote.
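For illustration, a minimal sketch of that "local hot layer over Redis" pattern using Cachex's fallback-based fetch (the Redix connection named :redis, the module name, and the key scheme are placeholder assumptions):

```elixir
defmodule MyApp.Users do
  # Sketch only: Cachex as a local hot layer in front of Redis.
  # Assumes a Redix connection registered under the name :redis.
  def get_user(id) do
    Cachex.fetch(:my_cache, {:user, id}, fn {:user, user_id} ->
      case Redix.command(:redis, ["GET", "user:#{user_id}"]) do
        {:ok, nil} -> {:ignore, nil}        # miss in Redis too; don't cache
        {:ok, value} -> {:commit, value}    # warm the local cache with the value
        {:error, _reason} -> {:ignore, nil} # don't cache transient errors
      end
    end)
  end
end
```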

IMO dropping a node out of your cache should (and will) simply drop the data on that node from that cache. Given that it's a "cache", this should be totally acceptable. Just like adding a node to your cache will have zero data on it initially and populate over time.

All of the consistency guarantees (or lack thereof) are fully acceptable, again because the context is "cache" not "database". The line here is very blurry and I want to ensure that I am very clear on where this library will stand - in part because distributed database guarantees are much more complicated and I simply don't have the time to dedicate to that. The current path forward is to enable usage of libring, which I believe will satisfy the minimum requirements for adding/removing nodes from a cache.

At the same time I'm also working on making routing of keys within a cache configurable so that you could theoretically plug in your own library there. This might be overkill, but it's also pretty helpful in migration to the new routing. Alongside these changes I will be rewriting the documentation on distributed caches to better clarify the intended use case and to steer people to better architectures. Admittedly the initial documentation I wrote was not very clear in this regard!

I hope this makes sense, happy to hear your thoughts on this!

@burmajam

burmajam commented Apr 9, 2024

Hi @whitfin. Thanks for the clarification of the main intent of the lib.

Maybe I didn't understand the other comments well, but they sounded like something I'm looking for. I would agree with you that the line between "cache" and "database" here is blurry. I'm not looking for persistence, but I was looking for something that wouldn't require Redis in my setup. I believe this is not such an uncommon situation:

Imagine you have (at least) 3 k8s pods of the same app. You want to share cache data between them. Then you deploy a new version, and with the correct k8s strategy you can have pods replaced one by one (so you always have quorum). At the end of the rollout you end up with the same entries (with their TTLs) in the shared/clustered cache.

I've been using Cachex in a few libs and apps I've worked on for years now and I'm very satisfied with it, but for this use case I'll have to tinker with delta_crdt or stick with Redis.

Keep up the great work, and I appreciate your responsiveness a lot as well. Cheers :)

@whitfin
Owner

whitfin commented Apr 9, 2024

@burmajam No worries, appreciate your input - it's really hard to decide these things in a vacuum 😅.

> Imagine you have (at least) 3 k8s pods of the same app. You want to share cache data between them.

I agree with this; this is how it currently works (even in 3.x). If Node A writes Key A, then Node B can read Key A even though it lives on Node A. It's not that there is a copy of Key A on every node.
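As a minimal illustration of the existing 3.x behaviour (cache, key and node names are placeholders):

```elixir
# Start the same cache on every node with the full node list (3.x style).
Cachex.start_link(:my_cache, nodes: [node() | Node.list()])

# On node_a:
Cachex.put(:my_cache, "key_a", "value")

# On node_b: the read is routed to whichever node owns "key_a", so it
# succeeds even though the entry is only stored on node_a.
Cachex.get(:my_cache, "key_a")
#=> {:ok, "value"}
```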

> Then you deploy a new version, and with the correct k8s strategy you can have pods replaced one by one (so you always have quorum).

This is where it breaks down, IMO. Docker/Kubernetes are not meant to be used with persistent state in memory. Relying on rolling the pods out and in and having your data maintained means that Cachex would have to perform this sync/copy multiple times (once each time a node is rolled), and there's no guarantee that you end up with a balanced distribution across all nodes.

> I would agree with you that the line between "cache" and "database" here is blurry.

I would like to take a step back here for a moment and look at distributed systems in general. Even if Cachex were bulletproof with this syncing and it was included as a feature, it's totally possible that a node in your cluster simply dies with no warning. In that scenario, the data is gone. Your application must be resilient to this, which generally means the data inside your cache should be non-critical (unless it's a hot layer over something else).

Taking this further, if your app is safe to recover from failures like this, then why would we need to write anything super complicated inside Cachex just to try our best to maintain state across pod restarts? We can simply let the cache re-warm itself exactly as if a node had died. If this sounds like a strange approach, I'll clarify a few things which hopefully make it clear why I use this as justification:

  • The end result is the same, your new nodes have the same data
  • The data is guaranteed to be consistent, because Cachex is not explicitly moving it around
  • Cachex's internals stay simple, with a lower surface area for bugs
  • The only real price is a few slower cache calls initially while re-warming
    • Even this is avoidable using blocking proactive warming on pod startup (see the warmer sketch below)!
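For reference, a minimal sketch of such a proactive warmer using the Cachex.Warmer behaviour as documented for 3.x (the MyApp.Store.hot_entries/0 data source is a placeholder assumption, and whether warming blocks startup depends on the warmer options of your Cachex version):

```elixir
# Sketch only: a warmer that bulk-loads hot keys when the cache starts
# and refreshes them periodically afterwards.
defmodule MyApp.CacheWarmer do
  use Cachex.Warmer

  # how often to re-run the warmer after the initial warm
  def interval, do: :timer.minutes(5)

  # return {:ok, pairs} to bulk-insert {key, value} pairs, or :ignore to skip
  def execute(_state) do
    case MyApp.Store.hot_entries() do
      [] -> :ignore
      pairs -> {:ok, pairs}
    end
  end
end

# Attached when starting the cache:
#   import Cachex.Spec
#   Cachex.start_link(:my_cache, warmers: [warmer(module: MyApp.CacheWarmer)])
```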

I think this approach is very valid, and I fully believe a sync feature would be fairly redundant for the amount of effort the changes would take. It doesn't matter how hard Cachex tries to replicate things around if you still have to build your app with the assumption that data will disappear (and anyone who is not doing this should be).

> I've been using Cachex in a few libs and apps I've worked on for years now and I'm very satisfied with it, but for this use case I'll have to tinker with delta_crdt or stick with Redis.

Awesome, happy to hear you like using it! Makes it all worth it :D

Even outside of the scope of this library I strongly feel that if you need persistent data across restarts like this, then you should be using a primary cache with persistent state/backup and Cachex should simply act as a hot layer within your application cluster.

For most people this will be Redis, like you said. I think too many people get hooked by the "oh hey, maybe I can get rid of Redis" either because it sounds cheaper, or cleaner, or it's easier to host an app on <insert application host of choice>. It's understandable, for sure, but I think it's too much of a trap.

As such this is definitely a weak area of the documentation from my side, and I intend to write a better version with the merging of this PR. I think guidance is very important here. I may even refer people to these comments from that documentation.

> Maybe I didn't understand the other comments well, but they sounded like something I'm looking for.

Just to finally address this for anyone coming to this thread: the intended solution of this PR will simply add new nodes to the same hash ring as before, which means that a) your new node may have keys allocated to it in the future and b) it will be able to look up keys which live on other nodes. There is no full replication of the cache between nodes. I think this will clarify things a little better.
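To make that concrete, this is roughly what the underlying libring routing looks like on its own (node names are placeholders); adding a node only assigns it a slice of future keys, and removing a node simply drops the entries it owned:

```elixir
# Sketch of libring (HashRing) routing, independent of Cachex itself.
ring =
  HashRing.new()
  |> HashRing.add_node(:node_a@host)
  |> HashRing.add_node(:node_b@host)

# Each key is deterministically routed to exactly one owner node.
HashRing.key_to_node(ring, "key_a")
#=> :node_a@host or :node_b@host, depending on the hash

# Adding a third node only reassigns the keys that now hash to it;
# everything else keeps routing to its existing owner.
ring = HashRing.add_node(ring, :node_c@host)

# Removing a node reassigns its key space; the cached values that lived
# on it are not copied anywhere, they just fall out of the cache.
ring = HashRing.remove_node(ring, :node_c@host)
```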

> Keep up the great work, and I appreciate your responsiveness a lot as well. Cheers :)

Thank you 🙌

Let me know if you have any thoughts on the above, at least if it seems reasonable. I don't want to assert my views on everyone, but I truly do believe that this approach is simpler while also avoiding users seeing a "sync" type feature and assuming their data is guaranteed safe. I understand the friction people feel here, but I feel the friction is pushing people in a better direction (almost deliberately, at this point).

@burmajam

@whitfin thanks for your answer. It does make sense, and maybe I can solve one of my problems even with the 3.1.x version of Cachex. By trying to find a full-blown solution I had overlooked the fact that values can be read from other cluster members. Since my values are WS connection pids, it also makes sense for them to disappear when a pod goes down, since clients have to reconnect anyway.

Thanks once again and +1 for #346 :)
