-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC 021: The Future of the Socket Protocol #8584
Conversation
- Question 1: Designate a single IPC standard for out-of-process applications? | ||
|
||
Claim: We should converge on one (and only one) IPC option for out-of-process | ||
applications. We should choose an option that, after a suitable period of | ||
deprecation, can address most or all the highest-impact uses of Tendermint. | ||
Maintaining multiple options increases the surface area for bugs and | ||
vulnerabilities, and we should not have multiple options for basic interfaces | ||
without a clear and well-documented reason. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally see very little issue with this, as long as, as you say, the highest-impact uses of Tendermint can be covered. Maintaining two clients with very overlapping sets of functionality is quite costly and results in a worse user experience for both as a result.
Besides the code cleanup, we will also need to clearly document a | ||
deprecation schedule, and invest time in making the migration easier for | ||
applications currently using the socket protocol. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it's worth adding a migrations strategies section to this document? I'm not sure I have a clear picture of what a migration may look like, but I think understanding how much we intended to commit to helping migrations may make the cost/benefit trade-off more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worthwhile yes, but I'm not sure I have a clear picture yet either. In theory as long as the ABCI protocol itself doesn't change, this mainly involves updating internal plumbing, but it would presumably need to be a coordinated upgrade across the whole network. (I don't know if there are networks with the same node logic implemented in two different languages—it seems unlikely but I suppose is theoretically possible)
So I believe the main work is wiring up that plumbing and then testing the bejeezus out of it to make sure nothing semantic changed. But I have observed that applications tend to be pathological about ignoring basic abstraction boundaries so I'd really want input from real network operators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some text to the document to highlight this question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat orthogonal but I believe the SDK gives the option for a node to run in-process and out-of-process.
|
||
- ✅ Less immediate impact for existing users (but see below). | ||
- ✅ Simplifies ABCI API surface by removing gRPC. | ||
- ❓ Users of the socket protocol will have a (smaller) migration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a crucial point. If we want to keep both, there is still a migration that users will need to perform to use whatever upgraded client we create. Upgrading the socket client also places the cost of implementing the protocol onto non-Go socket protocol library maintainers (of which we are not). This means that anyone wanting to use Tendermint in an out-of-process style is gated by these updates.
That point highlights an additional consequence not listed here: anyone not using Go or the few supported languages cannot use Tendermint. While the number of languages w/ gRPC support is limited, the number of languages with Tendermint socket client implementations is far more limited. We immediately lose language support for almost 10+ languages if we maintain only the socket protocol impl. I think it would be worth adding the list of currently supported languages for each protocol in this doc (if one can be found for the socket-client).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a very good point. I'll add that for sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added another paragraph to the Decision section to call out this consideration more explicitly. (It will take some time to gather the list, but I'm working on that too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I keep looking for langauges that I think might actually be used that aren't supported by gRPC, and I think the answer to that is "scheme", everything else basically is (e.g. Haskell, Common Lisp, Erlang, Elixir, ...) so I think we're deeply into a theoretical requirement (broad language support,) for a use case that doesn't entirely exist, but I think it's good to outline things around this anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I keep looking for langauges that I think might actually be used that aren't supported by gRPC
I thought the request here was for a list of the languages with socket protocol support.
(That said, I concur with your analysis—gRPC coverage is not a practical worry)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so I think we're deeply into a theoretical requirement (broad language support,)
I agree and disagree with this. I think that Tendermint should be usable in multiple languages but we should focus on 1. popular languages developers interested in blockchains are likely to use and 2. making sure that whatever implementation we chose is well supported. Right now 'You can go implement our socket protocol in your language if you want' is the best we can say right now, which I think is a fairly bad UX. We want to support existing users and drive adoption of Tendermint and I think supporting multiple languages is the clear way to do that especially if we can lower the complexity of getting started as much as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added an appendix containing a bunch of ABCI server implementations I found. I can't speak to quality, but I filtered out some more fragmentary examples by visual inspection. These are the ones that show some evidence of maybe implementing most/all of the protocol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that Tendermint should be usable in multiple languages but we should focus on 1. popular languages developers interested in blockchains are likely to use and 2. making sure that whatever implementation we chose is well supported.
Agreed.
Right now 'You can go implement our socket protocol in your language if you want' is the best we can say right now, which I think is a fairly bad UX.
Also agreed. Perhaps more importantly, anyone who wants to implement an ABCI server is already on the hook for having good protocol buffer support. That's required even if the goal is to implement the socket protocol for transport.
While I'm sure the subset of "languages that have protocol buffer support but not gRPC" is nonempty, I would be very surprised if any language in that subset has any real impact (or even potential impact) on the usage footprint of Tendermint.
8920c57
to
693395d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice write up on the topic as always.
One small outside consideration is the proposal that we may want to implement some bidirectional ABCI protocol in the future. I don't know if this affects the decision making here but thought I'd bring it up anyway
Claim: We should converge on one (and only one) IPC option for out-of-process | ||
applications. We should choose an option that, after a suitable period of | ||
deprecation, can address most or all the highest-impact uses of Tendermint. | ||
Maintaining multiple options increases the surface area for bugs and | ||
vulnerabilities, and we should not have multiple options for basic interfaces | ||
without a clear and well-documented reason. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also agree with this claim
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Besides the code cleanup, we will also need to clearly document a | ||
deprecation schedule, and invest time in making the migration easier for | ||
applications currently using the socket protocol. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat orthogonal but I believe the SDK gives the option for a node to run in-process and out-of-process.
> application's implementation language, and would have to be sure networks | ||
> have plenty of time not only to make the change but to verify that it | ||
> preserves the function of the network. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My operating assumption was that a clear deprecation notice alongside the 0.36 release would give users an entire release cycle until v0.37, which I believe to be sufficient, to trial the migration and to provide feedback if unexpected hiccups were to arise.
In the case of https://github.com/informalsystems/tendermint-rs/tree/master/abci or https://github.com/penumbra-zone/tower-abci which I believe are the predominant known implementations, we may be able to assist in the migration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In terms of communicating the schedule, I think "one release" is probably fine. The larger question I'm concerned with is what's the impact: How many applications does it affect, how big a change is it for those specific apps, how much lead time will developers and operators need to test the change, and so forth.
Mainly I see this as a visibility issue—we can write lots of documentation and give lead time, but developers who don't see it or don't realize they need to take it seriously could still be caught out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the tendermint-rs
case, I would suggest to keep this PR open until @thanethomson is back, or, if you merge it, to leave this "discussion text" in here so that he can contribute to the discussion with his detailed knowledge of tendermint-rs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no rush about merging. I plan to keep this open till we've circulated it more widely anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we need the opinion of the tendermint-rs
guys, but I don't think this should prevent this PR from being merged. I suggest opening an Issue to track this discussion about Rust support.
I thought about it, but it didn't seem to really fit into this document since it's not a decision we're ready to make yet. I originally had a paragraph in the background section talking about this, but it seemed too much like a digression and I removed it. If you think it's worthwhile I could put it back in, though. |
2a1467f
to
82db558
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice document, I am just wondering whether it looks more like a RFC than an ADR. If the Decision section clearly points out the option we are taking, it would look more like an ADR.
|
||
- ✅ Addresses existing performance and operational issues. | ||
- ✅ Replaces custom code with a well-maintained widely-used library. | ||
- ✅ Aligns well with Cosmos SDK, which already uses gRPC extensively. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this purpose, i.e., handling ABCI requests, or for anything else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this purpose, i.e., handling ABCI requests, or for anything else?
Mainly for other things, though I believe the SDK uses the ABCIQuery interface of the node's RPC service to implement some queries.
Claim: We should converge on one (and only one) IPC option for out-of-process | ||
applications. We should choose an option that, after a suitable period of | ||
deprecation, can address most or all the highest-impact uses of Tendermint. | ||
Maintaining multiple options increases the surface area for bugs and | ||
vulnerabilities, and we should not have multiple options for basic interfaces | ||
without a clear and well-documented reason. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
- ✅ Replaces custom code with a well-maintained widely-used library. | ||
- ✅ Aligns well with Cosmos SDK, which already uses gRPC extensively. | ||
- ✅ Aligns well priv validator interface, for which the socket protocol is already deprecated for gRPC. | ||
- ❓ Applications will be hard to implement in a language without gRPC support. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an actual example of such language?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an actual example of such language?
The current list of officially supported languages includes C#, C++, Dart, Go, Java, Kotlin, Node, Objective-C, PHP, Python, and Ruby. There are multiple community-contributed implementations for Rust, and other languages (e.g., OCaml, Scala, etc.) seem to have reasonable support.
In practical terms, I think that covers most of the languages anyone is likely to use for production applications, and it looks like community implementations of gRPC are probably available for many others. That's why I marked this a ❓ rather than a ⛔, since I think any language not covered is unlikely to have much impact in the space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How certain are we on the quality and maintenance guarantees on the best implementations for Rust? (which is currently the obvious language of choice besides Go)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How certain are we on the quality and maintenance guarantees on the best implementations for Rust? (which is currently the obvious language of choice besides Go)
I don't know enough to answer.
My intuition is that if someone is going to have to maintain a wire protocol implementation either way, it would probably still be better for that to be gRPC than the custom socket protocol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How certain are we on the quality and maintenance guarantees on the best implementations for Rust? (which is currently the obvious language of choice besides Go)
It might be reasonable to do a bit more diligence, but the standard validator stack includes a signing service (tmkms) that is moving to use gRPC (for our privval interface) so I think practically speaking it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intuition is that if someone is going to have to maintain a wire protocol implementation either way, it would probably still be better for that to be gRPC than the custom socket protocol.
Is this really true given the custom socket protocol should be much simpler than an implementation of grpc?
It might be valuable to try and collect some feedback on grpc implementation quality in non official languages. Hopefully it will be the case that by the time anyone wants to develop an ABCI app in another language there will be a way to use grpc from that language. I would hate to see someone want to experiment with ABCI apps in a new lang and not be able to because of the grpc blocker, but that will probably be pretty rare (I guess worse case they can prob always use FFI to a lang with a grpc impl on the server side ? )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that gRPC is a more complex protocol than the socket protocol as it is currently designed. On the other hand, most of the likely languages for app development already have good (either official or unofficial) gRPC implementations.
The important question here, I think, is what to do about the long tail of other language options. Broadly I claim we should not worry too much about that long tail, on the basis that it's unlikely to generate any meaningful impact for the ecosystem.
Under the covers, gRPC is around the same level of complexity for an implementor as websockets. The substrate is stock HTTP/2, and the message formats are protocol buffers. We already require anyone building against ABCI (of any version) to have protobuf support, and HTTP is usually an out-of-box library feature.
Despite being more work, gRPC is also widely used for other purposes (beyond ABCI) throughout the Tendermint ecosystem, so an investment in it would have broader potential returns for the application. It does mean that a developer using some unusual language would have to do more short-term work to get something running, but I'm not sure we should invest a lot in the corner case of "weird languages, toy problems".
Perhaps more importantly, if the goal is to "pick one (and only one) mechanism for out-of-process apps" (which is the decision this ADR is proposing), I think developers who are using a well-supported gRPC language will have a better experience (and matter more to our impact).
98581b4
to
8c7d5c9
Compare
> application's implementation language, and would have to be sure networks | ||
> have plenty of time not only to make the change but to verify that it | ||
> preserves the function of the network. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the tendermint-rs
case, I would suggest to keep this PR open until @thanethomson is back, or, if you merge it, to leave this "discussion text" in here so that he can contribute to the discussion with his detailed knowledge of tendermint-rs
.
- ✅ Replaces custom code with a well-maintained widely-used library. | ||
- ✅ Aligns well with Cosmos SDK, which already uses gRPC extensively. | ||
- ✅ Aligns well priv validator interface, for which the socket protocol is already deprecated for gRPC. | ||
- ❓ Applications will be hard to implement in a language without gRPC support. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How certain are we on the quality and maintenance guarantees on the best implementations for Rust? (which is currently the obvious language of choice besides Go)
8c7d5c9
to
3777302
Compare
Agreed. I plan to narrow down the decisions (and make the things we don't choose "alternatives considered") once we get more feedback from folks building applications and running networks. Unlike an RFC, we really do need a technical decision made here. The main concern I'd like to avoid is committing to what I think is the right outcome without robust input. |
bb59300
to
1a4bf8c
Compare
Is anyone against merging this PR? |
1a4bf8c
to
5c90a6d
Compare
No. But now that the core team has had a chance to comment I am going to circulate it more widely so we can get more input before we commit a decision. |
26d2498
to
2dc2f6b
Compare
bfa6ed7
to
1846214
Compare
Based on our discussion at the dev call of 19-May-2022, this is a draft ADR that proposes some design decisions about the API surface for out-of-process ABCI applications in future versions of Tendermint.
1846214
to
4f341ae
Compare
Given that we are not prepared to make a decision at this time, I have reworked this PR as an RFC instead of an ADR, so we can have a record of the discussion without committing to a particular outcome yet. |
{rendered}
This was originally a draft RFC based on our discussion at the Tendermint dev call of 19-May-2022, proposing some design decisions about the API surface for out-of-process ABCI applications in future versions of Tendermint. It has been reworked as an RFC since we are not ready to make a decision about this topic right now.