implement clustering for horizontal scalability #1532
Comments
I like the sound of most of that proposal. So, what sort of info usually differs for each server on the network, in traditional IRC nets?
Anything else? Ideally we wouldn't send an IRC-based line protocol on the wire and would do something more efficient instead; if we do need to, that's fine. Ideally we'll be able to use the same fairly simple command handlers we use now to parse incoming client messages.
I'm not super familiar with how scalability is affected by different factors, but is there any reason why a leaf that has e.g. 20 clients on it would be more stable if it gets the full 50k-client traffic piped to it, versus getting a smaller amount of traffic if the hub has some logic to work out which leaves should get which traffic? Or is it more that reducing the burden on the hub (not needing to keep track of which leaves have which clients, not needing to spend time directing messages) provides more benefit to the network as a whole than reducing the burden on specific leaf connections?

Just to make sure that we're on the same page: if Oragono is in hub mode it'll accept connections from oragono servers but not clients, and if it's in leaf mode it'll accept connections from clients only, yeah? The standard IRC paradigm has this feature where users can connect to hubs, which is sometimes useful for opers sitting on those hubs to e.g. fix the network, and sometimes you'll see servers that both accept clients (sitting there with a few thousand normal clients connected) and also act as the hub for 2-3 other leaf servers. It probably makes sense for us to just have a clean split: either hub or leaf, meaning either accept servers or accept clients, done.

Thoughts on the 'server name' stuff: again, this'd probably make sense as part of the connection authentication that leaves do to auth to the hub, e.g. something along the lines of this in the hub config:
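For illustration only (the real hub config would presumably be YAML, and none of these names exist in the codebase): a Go sketch of what such hub-side link configuration might deserialize to, with each permitted leaf identified by a server name and a link password.

```go
// Hypothetical sketch: hub-side link configuration mapping each leaf's
// server name to the credentials it must present when connecting.
// These types and fields are illustrative, not actual Oragono config.
package hubconfig

type LeafLink struct {
	Name     string // server name the leaf announces, e.g. "leaf1.example.net"
	Password string // link password (ideally stored as a bcrypt hash)
}

type HubConfig struct {
	ListenForServers string     // address the hub listens on for leaf connections
	Leaves           []LeafLink // leaves permitted to link to this hub
}
```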
Or are we gonna go the no-passwords-to-connect route and tell everyone to set up internal VPNs and such to expose the hub to the leaves instead?
Something I hadn't considered adequately before: this design requires the leaf nodes to have an up-to-date view of channel memberships. This shouldn't be too difficult, but it introduces a potential class of desync bugs.
In the MVP of this, all password checking will happen on the hub, but later on we'll probably want to have a mechanism to defer it to the leaf (an in-memory cache of accounts and hashed passwords?)
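A minimal sketch of what that deferral could look like on the leaf, assuming the hub pushes bcrypt hashes down into a local cache (the package, types, and function names here are assumptions, not anything that exists in the codebase): the leaf answers password checks locally when it has the account's hash, and falls back to the hub on a miss.

```go
// Hypothetical sketch only: a leaf-side cache of account credentials,
// so that password checks can be deferred from the hub to the leaf.
package leafauth

import (
	"sync"

	"golang.org/x/crypto/bcrypt"
)

type credentialCache struct {
	mu     sync.RWMutex
	hashes map[string][]byte // account name -> bcrypt hash of the password
}

// checkPassword verifies a password locally if the account is cached;
// the second return value reports whether the leaf could answer at all
// (on a miss, the caller would forward the check to the hub).
func (c *credentialCache) checkPassword(account, password string) (ok, cached bool) {
	c.mu.RLock()
	hash, found := c.hashes[account]
	c.mu.RUnlock()
	if !found {
		return false, false
	}
	return bcrypt.CompareHashAndPassword(hash, []byte(password)) == nil, true
}

// store records a hash pushed down from the hub (e.g. after a successful
// hub-side check, or as part of an account-update broadcast).
func (c *credentialCache) store(account string, hash []byte) {
	c.mu.Lock()
	if c.hashes == nil {
		c.hashes = make(map[string][]byte)
	}
	c.hashes[account] = hash
	c.mu.Unlock()
}
```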
Hi! This all sounds very interesting, and we can't wait to test it out. Did I understand correctly that all IRC clients will connect exclusively to leaf nodes, and none will connect to the root directly? If yes, will it be feasible to have one leaf on the same server as the root?
Yes, but what would this accomplish?
Oh --- I didn't mean to suggest that the old single-process mode of operation will no longer be supported.
I assume they'd probably want this for the same reason I would: if users can't directly connect to the hub when it's running in a hub+spoke configuration, it might be useful to have a single consistent leaf where opers and such can connect.
tl;dr if it's possible to let users connect to the hub then we'll do that, and if they can't then you should definitely be able to do this.
I was thinking more that one would ideally have at least two leaves in such a configuration (otherwise one could just run the single-process mode of operation). If the resources of the root server are good enough, could someone just run one of their leaves on it, or are there any drawbacks to doing so?
@DanielOaks' point is a good one too, though, assuming your root server is for some reason more reliable than the others.
My intuition is that in this setup, the leaf will compete with the root for resources and you might be better off running a single node. But I'm not sure.
Answering some other questions received:
If I understand correctly this means that if the root node goes down (e.g. for a reboot) or becomes unreachable (e.g. network issues), then the whole IRC network goes down with it? The original IRC federation design was not (only) to be able to scale horizontally, but also to be resilient to such infrastructure outages (and we've seen over the years that even the biggest and most experienced internet companies aren't immune to such calamities…).
As discussed in the issue description, this is in fact the status quo for conventional IRC networks: the services node is a SPOF for the entire network.
Motivation
The original model of IRC as a distributed system was as an open federation of symmetrical, equally privileged peer servers. This model failed almost immediately with the 1990 EFnet split. Modern IRC networks are under common management: they require all the server operators to agree on administration and policy issues. Similarly, the model of IRC as an AP system (available and partition-tolerant) failed with the introduction of services frameworks. In modern IRC networks, the services framework is a SPOF: if it fails, the network remains available but is dangerously degraded (in particular, its security properties have been silently weakened).
The current Oragono architecture is a single process. A single Oragono instance can scale comfortably to 10,000 clients and 2,000 clients per channel: you can push those limits if you have bigger hardware, but ultimately the single instance is a serious bottleneck. The largest IRC network of all time was 2004-era Quakenet, with 240,000 concurrent clients. Biella Coleman reports that the largest IRC channel of all time had 7,000 participants (#operationpayback on AnonOps in late 2010). This gives us our initial scalability targets: 250,000 concurrent clients with 10,000 clients per channel.
Oragono's single-process architecture offers compelling advantages in terms of flexibility and pace of development; it's significantly easier to prototype new features without having to worry about distributed systems issues. The architecture that balances all of these considerations --- acknowledging the need for centralized management, acknowledging the indispensability of user accounts, providing the horizontal scalability we need, and minimizing implementation complexity --- is a hub-and-spoke design.
Design
The oragono executable will accept two modes of operation: "root" and "leaf". The root mode of operation will be much like the present single-process mode. In leaf mode, the process will not have direct access to a config file or a buntdb database: it will take the IP (typically a virtual IP of some kind, as in Kubernetes) of the root node, then connect to the root node and receive a serialized copy of the root node's authoritative configuration over the network.
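A minimal sketch of what leaf startup could look like under this design, assuming a plain TCP link and gob-encoded configuration purely for illustration (the actual wire format, flags, and config fields are not specified in this issue):

```go
// Purely illustrative: how a leaf process might fetch its configuration
// from the root node at startup. The wire format and flag names are
// assumptions, not part of the actual design.
package main

import (
	"encoding/gob"
	"flag"
	"log"
	"net"
)

// Config stands in for the root node's authoritative configuration;
// the real struct would mirror the root's parsed config file.
type Config struct {
	ServerName string
	Limits     map[string]int
}

func main() {
	rootAddr := flag.String("root", "10.0.0.1:7001", "virtual IP and port of the root node")
	flag.Parse()

	conn, err := net.Dial("tcp", *rootAddr)
	if err != nil {
		log.Fatalf("could not reach root node: %v", err)
	}
	defer conn.Close()

	// The leaf has no config file or buntdb of its own: it receives a
	// serialized copy of the root's configuration over the link.
	var conf Config
	if err := gob.NewDecoder(conn).Decode(&conf); err != nil {
		log.Fatalf("could not decode config from root: %v", err)
	}
	log.Printf("received config for %q; now accepting client connections", conf.ServerName)
}
```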
Clients will then connect to the leaf nodes. Most commands received by the leaf nodes will be serialized and passed through to the root node, which will process them and return an answer. A few commands, like capability negotiation, will be handled locally. (This corresponds loosely to the `Session` vs. `Client` distinction in the current codebase: the leaf node will own the `Session` and the root node will own the `Client`. Anything affecting global server state is processed by the root; anything affecting only the client's connection is processed by the leaf.)

The root node will unconditionally forward copies of all IRC messages (PRIVMSG, NOTICE, KICK, etc.) to each leaf node, which will then determine which sessions are eligible to receive them. This provides the crucial fan-out that generates the horizontal scalability: the traffic (and TLS) burden is divided evenly among the n leaf nodes.
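Something like the following leaf-side logic could implement that filtering (a sketch only; the types and fields are hypothetical, not the actual `Session`/`Client` structures in the codebase): the root broadcasts every message to every leaf, and each leaf delivers it only to its locally attached sessions that are members of the target.

```go
// Illustrative sketch of leaf-side fan-out: the root forwards every message
// to every leaf, and the leaf decides which of its local sessions should
// actually receive it. Types and fields here are hypothetical.
package leaf

type Message struct {
	Command string // e.g. "PRIVMSG"
	Target  string // channel name or nick
	Line    []byte // the fully assembled IRC line to send
}

type Session struct {
	nick     string
	channels map[string]bool // channels this session has joined
	send     chan []byte
}

type Leaf struct {
	sessions []*Session // only the sessions connected to this leaf
}

// Deliver is called once per message forwarded by the root; only sessions
// that are members of the target channel (or are the target nick) get a copy.
func (l *Leaf) Deliver(msg Message) {
	for _, s := range l.sessions {
		if s.channels[msg.Target] || s.nick == msg.Target {
			select {
			case s.send <- msg.Line:
			default:
				// slow consumer: drop or disconnect; policy not specified here
			}
		}
	}
}
```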
Unresolved questions
The intended deployment strategy for this system is Kubernetes. However, I don't currently have a complete picture of which Kubernetes primitives will be used. The key assumption of the design is that the network can be virtualized such that the leaf nodes only need to know a single, consistent IP address for the root node.
I'm not sure how to do history. The simplest architecture is for `HISTORY` and `CHATHISTORY` requests to be forwarded to the root. This seems like it will result in a scalability bottleneck. In a deployment that uses MySQL, the leaf nodes can connect directly to MySQL; this reduces the problem to deploying a highly available and scalable MySQL, which is nontrivial. The logical next step would seemingly be to abstract away the `historyDB` interface, then provide a Cassandra implementation --- the problem with this is that we seem to be going in the direction of expecting read-after-write consistency from the history store (see this comment on #393 in particular).
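For concreteness, such a pluggable history-store abstraction might look roughly like this (a sketch under the assumption of a simple append/query API; it is not the actual interface in the codebase):

```go
// Hypothetical sketch of a pluggable history-store interface that a MySQL
// or Cassandra backend could implement; this is not the actual Oragono API.
package history

import (
	"context"
	"time"
)

type Item struct {
	Target  string    // channel or nick the message was addressed to
	Sender  string    // nickmask of the sender
	Message string    // message text
	Time    time.Time // server-assigned timestamp
}

type Store interface {
	// Add appends an item; with Cassandra this write is eventually
	// consistent unless tuned otherwise, which is where the
	// read-after-write concern comes in.
	Add(ctx context.Context, item Item) error

	// Between returns items for a target within a time window, newest
	// first, up to limit items (roughly what CHATHISTORY queries need).
	Between(ctx context.Context, target string, start, end time.Time, limit int) ([]Item, error)
}
```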
See also
This supersedes #343, #1000, and #1265; it's related to #747.