Peer Reservations #2938

Closed
nbougalis opened this issue May 16, 2019 · 7 comments
Labels
Feature Request: used to indicate requests to add new features

Comments

@nbougalis
Contributor

Configuring rippled to establish a permanent connection to a third party is unnecessarily complex and error-prone.

Consider a hub that is meant to allow connections only from validators and that should itself connect only to specific servers.

In order to allow incoming connections from the validators, the hub can't have peer_private set. But in order to limit who it makes outbound connections to, it must have peer_private set.

If the hub accepts incoming connections, then anyone can connect to it and consume slots, meaning the servers you actually want to allow may find the hub full. The only way to reserve slots for the servers you absolutely need is either to configure them as cluster nodes by their public keys (which has several other consequences that may not be desirable) or to configure them by a fixed IP address (which they may not have, and on which they may not be able to accept connections).
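
For concreteness, the two workarounds look roughly like this in rippled.cfg (the public key and address below are placeholders):

# Workaround 1: admit the peer as a cluster member by public key
# (drags the rest of the cluster semantics along with it).
[cluster_nodes]
n9KPlaceholder...  peer-c

# Workaround 2: pin the peer by a fixed IP address and port
# (requires the peer to have one and to accept connections on it).
[ips_fixed]
192.0.2.10 51235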

None of these are ideal.

The idea behind a token is that when you come to me and say "hey, I need a guaranteed connection to your server but I don't have a fixed IP and I don't want to cluster," I can say "sure, here's a token you can use to always connect to my server."

Of course, you don't actually need the token—something which should have been obvious from the beginning. As you say, you can just come to me with "hey, my node ID is C, please always let me in" and I can do whatever command and that's that; you then just connect to H and it magically works and all is well.

I'm not concerned with the size of the data structure or the cost of the query. We already do exactly that kind of query to determine if a peer is a member of a cluster, and some clusters can grow large. See Cluster::member, which is invoked from PeerImp::doAccept().

I think that this is probably the best path forward. Clone the Cluster class (it already does everything we want) and implement a simple command set:

/opt/ripple/bin/rippled COMMAND_NAME add nodeIDofC <optional name>
/opt/ripple/bin/rippled COMMAND_NAME del nodeIDofC
/opt/ripple/bin/rippled COMMAND_NAME list
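
As a rough sketch, a Cluster-like reservation table might look something like this; the class and method names here are illustrative, not the actual rippled API:

#include <map>
#include <mutex>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Illustrative stand-in for a node's public key.
using NodeID = std::string;

// A Cluster-like table: the same kind of cheap keyed lookup that
// PeerImp::doAccept() already performs via Cluster::member.
class ReservationTable
{
    std::mutex mutable mutex_;
    std::map<NodeID, std::string> table_;  // node ID -> optional name

public:
    // "add nodeIDofC <optional name>"
    void add(NodeID const& id, std::string name = "")
    {
        std::lock_guard<std::mutex> lock(mutex_);
        table_[id] = std::move(name);
    }

    // "del nodeIDofC"
    void del(NodeID const& id)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        table_.erase(id);
    }

    // "list"
    std::vector<std::pair<NodeID, std::string>> list() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return {table_.begin(), table_.end()};
    }

    // The accept-time query: is this inbound node reserved?
    std::optional<std::string> member(NodeID const& id) const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = table_.find(id);
        if (it == table_.end())
            return std::nullopt;
        return it->second;
    }
};

Each subcommand maps onto one of these methods, and the accept path consults member() for every inbound connection, exactly like the cluster check today.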

This still leaves one issue: where to persist this information? I do not like the idea of using a new and separate file, for a number of reasons. Ideally, this would be stuffed inside the config file, but we don't have the ability to seamlessly edit the config programmatically.

I am, reluctantly, coming to the conclusion that wallet.db is probably the best place. Given the counterintuitive and confusing name of this file, we should probably look to rename it to something more appropriate, btw.
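
To make that concrete, here is a minimal sketch of what persisting reservations in wallet.db could look like, going straight at SQLite; the table name and columns are assumptions for illustration, not an actual rippled schema:

#include <sqlite3.h>
#include <cstdio>

int main()
{
    sqlite3* db = nullptr;
    if (sqlite3_open("wallet.db", &db) != SQLITE_OK)
        return 1;

    // Hypothetical schema: one row per reserved peer, keyed by the
    // node's public key, with an optional human-readable description.
    char* err = nullptr;
    int const rc = sqlite3_exec(
        db,
        "CREATE TABLE IF NOT EXISTS PeerReservations ("
        "  PublicKey   TEXT PRIMARY KEY NOT NULL,"
        "  Description TEXT);",
        nullptr, nullptr, &err);
    if (rc != SQLITE_OK)
    {
        std::fprintf(stderr, "sqlite error: %s\n", err);
        sqlite3_free(err);
    }
    sqlite3_close(db);
    return rc == SQLITE_OK ? 0 : 1;
}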

@MarkusTeufelberger
Collaborator

A [fixed/reserved_peers] section with a list of node IDs might also be an option, right?

Either you are already connected to these peers, in which case you try to reconnect on connection loss; or, if a peer with such a node ID tries to connect to you while all slots are already taken, you'd kick the least desirable peer in favor of the new connection.
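
A minimal sketch of what such a section could look like; the section name is this proposal's, not an existing rippled option, and the node IDs are truncated placeholders:

[reserved_peers]
# One node ID per line, optionally followed by a label.
n9KPeerOne...  validator-a
n9KPeerTwo...  hub-b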

@nbougalis
Contributor Author

Yes, that would be an option; however, it suffers from one significant and very real drawback: it requires the server to be restarted to make changes to that list. That is, for me, a deal-breaker.

One could argue that this is a problem that can be solved by simply allowing the server to reload the configuration file in response to a SIGHUP or some programmatic call. Unfortunately, that is not as straightforward or easy as it sounds and is fraught with danger.

So, we arrived at the solution of using wallet.db to persist this information, allowing it to be manipulated via command-line APIs, without requiring a server restart.

@MarkusTeufelberger
Collaborator

Only being able to configure this at runtime is also far from ideal...

I don't get the fixation on having as high an uptime as possible; if anything, it should be cheaper/easier to restart the server instead of having part of the configuration applied at runtime and another part beforehand.

@movitto
Contributor

movitto commented May 16, 2019

Just jumping in with a few random comments. At my last job we used (well, ahem, developed) augeas for configuration management with great success. I'm not 100% sure we'd want to edit the main rippled.cfg file programmatically, but a possibility would be to have rippled.cfg 'include' another file, which would be the autogenerated/managed one.

At some point it would be cool to separate specific solutions into a pluggable overlay management system, so that different installations could configure their peers using different rules, but it's probably not worth the effort at this time.

Curious as to what a server reload would entail; perhaps that's something I could help implement at some point. High uptime isn't the only benefit, but given that most major services support it these days (httpd, postgres, openldap, countless others), it would be a good feature to have!

@MarkusTeufelberger
Collaborator

If anything, I'd rather see rippled move more towards the principles in https://12factor.net/ so it becomes easier to run it in Kubernetes, but that's a bigger discussion.

Augeas looks interesting, thanks for the link!

@mDuo13 added the Feature Request label on Aug 1, 2019
@nbougalis
Contributor Author

> I don't get the fixation on having as high an uptime as possible; if anything, it should be cheaper/easier to restart the server instead of having part of the configuration applied at runtime and another part beforehand.

This really isn't so much about high uptimes as it is about network management. If you are operating a server and you want to limit connectivity to only some servers, you end up having to restart the server every single time you need to make a change, and that restart doesn't just affect you but, potentially, everyone else connected to you. With the current setup, you also end up having to list peers by their IP, which isn't ideal either.

> If anything, I'd rather see rippled move more towards the principles in https://12factor.net/ so it becomes easier to run it in Kubernetes, but that's a bigger discussion.

That's actually interesting; I've never seen that before. I don't know how easy it is to get all of those done, or whether they make sense for rippled, but it's certainly worth considering what, if anything, we can do.

> Curious as to what a server reload would entail; perhaps that's something I could help implement at some point. High uptime isn't the only benefit, but given that most major services support it these days (httpd, postgres, openldap, countless others), it would be a good feature to have!

It would be a significant refactor; several parts of the code (e.g. the port handling and the nodestore) assume that they're initialized only once and are coded with that assumption in mind. I don't mean to discourage you, but I think it's not worth the effort (or the additional complexity) to go through this process.

@MarkusTeufelberger
Collaborator

I'm also talking about the "fork child threads/processes instead of failing on errors" part, and the attitude that the uptime of rippled should always be increasing. If some assertion is hit or the software crashes, just kill the whole process instead of spawning a new child and keeping it running. Keeping it running masks errors a lot and makes debugging issues harder, since the only indication that rippled crashed is then not a restart of the systemd unit but merely a log entry.

Personally, I think the slightly higher churn (how often would these configs change anyway?) is definitely worth it, especially if some care is taken to regain state more quickly (as far as I remember, there's also a peer list persisted somewhere, for example, that could already be used to bootstrap quite a few connections...).
