Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC 021: The Future of the Socket Protocol #8584

Merged
merged 12 commits into from
Jul 19, 2022
Merged

Conversation

creachadair
Copy link
Contributor

@creachadair creachadair commented May 19, 2022

{rendered}

This was originally a draft RFC based on our discussion at the Tendermint dev call of 19-May-2022, proposing some design decisions about the API surface for out-of-process ABCI applications in future versions of Tendermint. It has been reworked as an RFC since we are not ready to make a decision about this topic right now.

docs/architecture/adr-082-socket-protocol.md Outdated Show resolved Hide resolved
Comment on lines 94 to 105
- Question 1: Designate a single IPC standard for out-of-process applications?

Claim: We should converge on one (and only one) IPC option for out-of-process
applications. We should choose an option that, after a suitable period of
deprecation, can address most or all the highest-impact uses of Tendermint.
Maintaining multiple options increases the surface area for bugs and
vulnerabilities, and we should not have multiple options for basic interfaces
without a clear and well-documented reason.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally see very little issue with this, as long as, as you say, the highest-impact uses of Tendermint can be covered. Maintaining two clients with very overlapping sets of functionality is quite costly and results in a worse user experience for both as a result.

Comment on lines 122 to 140
Besides the code cleanup, we will also need to clearly document a
deprecation schedule, and invest time in making the migration easier for
applications currently using the socket protocol.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's worth adding a migrations strategies section to this document? I'm not sure I have a clear picture of what a migration may look like, but I think understanding how much we intended to commit to helping migrations may make the cost/benefit trade-off more clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worthwhile yes, but I'm not sure I have a clear picture yet either. In theory as long as the ABCI protocol itself doesn't change, this mainly involves updating internal plumbing, but it would presumably need to be a coordinated upgrade across the whole network. (I don't know if there are networks with the same node logic implemented in two different languages—it seems unlikely but I suppose is theoretically possible)

So I believe the main work is wiring up that plumbing and then testing the bejeezus out of it to make sure nothing semantic changed. But I have observed that applications tend to be pathological about ignoring basic abstraction boundaries so I'd really want input from real network operators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some text to the document to highlight this question.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is somewhat orthogonal but I believe the SDK gives the option for a node to run in-process and out-of-process.


- ✅ Less immediate impact for existing users (but see below).
- ✅ Simplifies ABCI API surface by removing gRPC.
- ❓ Users of the socket protocol will have a (smaller) migration.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a crucial point. If we want to keep both, there is still a migration that users will need to perform to use whatever upgraded client we create. Upgrading the socket client also places the cost of implementing the protocol onto non-Go socket protocol library maintainers (of which we are not). This means that anyone wanting to use Tendermint in an out-of-process style is gated by these updates.

That point highlights an additional consequence not listed here: anyone not using Go or the few supported languages cannot use Tendermint. While the number of languages w/ gRPC support is limited, the number of languages with Tendermint socket client implementations is far more limited. We immediately lose language support for almost 10+ languages if we maintain only the socket protocol impl. I think it would be worth adding the list of currently supported languages for each protocol in this doc (if one can be found for the socket-client).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very good point. I'll add that for sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added another paragraph to the Decision section to call out this consideration more explicitly. (It will take some time to gather the list, but I'm working on that too)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I keep looking for langauges that I think might actually be used that aren't supported by gRPC, and I think the answer to that is "scheme", everything else basically is (e.g. Haskell, Common Lisp, Erlang, Elixir, ...) so I think we're deeply into a theoretical requirement (broad language support,) for a use case that doesn't entirely exist, but I think it's good to outline things around this anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I keep looking for langauges that I think might actually be used that aren't supported by gRPC

I thought the request here was for a list of the languages with socket protocol support.

(That said, I concur with your analysis—gRPC coverage is not a practical worry)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I think we're deeply into a theoretical requirement (broad language support,)

I agree and disagree with this. I think that Tendermint should be usable in multiple languages but we should focus on 1. popular languages developers interested in blockchains are likely to use and 2. making sure that whatever implementation we chose is well supported. Right now 'You can go implement our socket protocol in your language if you want' is the best we can say right now, which I think is a fairly bad UX. We want to support existing users and drive adoption of Tendermint and I think supporting multiple languages is the clear way to do that especially if we can lower the complexity of getting started as much as possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an appendix containing a bunch of ABCI server implementations I found. I can't speak to quality, but I filtered out some more fragmentary examples by visual inspection. These are the ones that show some evidence of maybe implementing most/all of the protocol.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that Tendermint should be usable in multiple languages but we should focus on 1. popular languages developers interested in blockchains are likely to use and 2. making sure that whatever implementation we chose is well supported.

Agreed.

Right now 'You can go implement our socket protocol in your language if you want' is the best we can say right now, which I think is a fairly bad UX.

Also agreed. Perhaps more importantly, anyone who wants to implement an ABCI server is already on the hook for having good protocol buffer support. That's required even if the goal is to implement the socket protocol for transport.

While I'm sure the subset of "languages that have protocol buffer support but not gRPC" is nonempty, I would be very surprised if any language in that subset has any real impact (or even potential impact) on the usage footprint of Tendermint.

Copy link
Contributor

@cmwaters cmwaters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice write up on the topic as always.

One small outside consideration is the proposal that we may want to implement some bidirectional ABCI protocol in the future. I don't know if this affects the decision making here but thought I'd bring it up anyway

docs/architecture/adr-082-socket-protocol.md Outdated Show resolved Hide resolved
Comment on lines 100 to 105
Claim: We should converge on one (and only one) IPC option for out-of-process
applications. We should choose an option that, after a suitable period of
deprecation, can address most or all the highest-impact uses of Tendermint.
Maintaining multiple options increases the surface area for bugs and
vulnerabilities, and we should not have multiple options for basic interfaces
without a clear and well-documented reason.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree with this claim

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Comment on lines 122 to 140
Besides the code cleanup, we will also need to clearly document a
deprecation schedule, and invest time in making the migration easier for
applications currently using the socket protocol.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is somewhat orthogonal but I believe the SDK gives the option for a node to run in-process and out-of-process.

Comment on lines 142 to 148
> application's implementation language, and would have to be sure networks
> have plenty of time not only to make the change but to verify that it
> preserves the function of the network.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My operating assumption was that a clear deprecation notice alongside the 0.36 release would give users an entire release cycle until v0.37, which I believe to be sufficient, to trial the migration and to provide feedback if unexpected hiccups were to arise.

In the case of https://github.com/informalsystems/tendermint-rs/tree/master/abci or https://github.com/penumbra-zone/tower-abci which I believe are the predominant known implementations, we may be able to assist in the migration

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of communicating the schedule, I think "one release" is probably fine. The larger question I'm concerned with is what's the impact: How many applications does it affect, how big a change is it for those specific apps, how much lead time will developers and operators need to test the change, and so forth.

Mainly I see this as a visibility issue—we can write lots of documentation and give lead time, but developers who don't see it or don't realize they need to take it seriously could still be caught out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the tendermint-rs case, I would suggest to keep this PR open until @thanethomson is back, or, if you merge it, to leave this "discussion text" in here so that he can contribute to the discussion with his detailed knowledge of tendermint-rs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no rush about merging. I plan to keep this open till we've circulated it more widely anyway.

Copy link
Contributor

@cason cason May 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we need the opinion of the tendermint-rs guys, but I don't think this should prevent this PR from being merged. I suggest opening an Issue to track this discussion about Rust support.

docs/architecture/adr-082-socket-protocol.md Outdated Show resolved Hide resolved
docs/architecture/adr-082-socket-protocol.md Outdated Show resolved Hide resolved
@creachadair
Copy link
Contributor Author

One small outside consideration is the proposal that we may want to implement some bidirectional ABCI protocol in the future. I don't know if this affects the decision making here but thought I'd bring it up anyway

I thought about it, but it didn't seem to really fit into this document since it's not a decision we're ready to make yet. I originally had a paragraph in the background section talking about this, but it seemed too much like a digression and I removed it. If you think it's worthwhile I could put it back in, though.

@creachadair creachadair force-pushed the mjf/unsocketing branch 4 times, most recently from 2a1467f to 82db558 Compare May 23, 2022 13:31
Copy link
Contributor

@cason cason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice document, I am just wondering whether it looks more like a RFC than an ADR. If the Decision section clearly points out the option we are taking, it would look more like an ADR.


- ✅ Addresses existing performance and operational issues.
- ✅ Replaces custom code with a well-maintained widely-used library.
- ✅ Aligns well with Cosmos SDK, which already uses gRPC extensively.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this purpose, i.e., handling ABCI requests, or for anything else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this purpose, i.e., handling ABCI requests, or for anything else?

Mainly for other things, though I believe the SDK uses the ABCIQuery interface of the node's RPC service to implement some queries.

Comment on lines 100 to 105
Claim: We should converge on one (and only one) IPC option for out-of-process
applications. We should choose an option that, after a suitable period of
deprecation, can address most or all the highest-impact uses of Tendermint.
Maintaining multiple options increases the surface area for bugs and
vulnerabilities, and we should not have multiple options for basic interfaces
without a clear and well-documented reason.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

- ✅ Replaces custom code with a well-maintained widely-used library.
- ✅ Aligns well with Cosmos SDK, which already uses gRPC extensively.
- ✅ Aligns well priv validator interface, for which the socket protocol is already deprecated for gRPC.
- ❓ Applications will be hard to implement in a language without gRPC support.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an actual example of such language?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an actual example of such language?

The current list of officially supported languages includes C#, C++, Dart, Go, Java, Kotlin, Node, Objective-C, PHP, Python, and Ruby. There are multiple community-contributed implementations for Rust, and other languages (e.g., OCaml, Scala, etc.) seem to have reasonable support.

In practical terms, I think that covers most of the languages anyone is likely to use for production applications, and it looks like community implementations of gRPC are probably available for many others. That's why I marked this a ❓ rather than a ⛔, since I think any language not covered is unlikely to have much impact in the space.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How certain are we on the quality and maintenance guarantees on the best implementations for Rust? (which is currently the obvious language of choice besides Go)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How certain are we on the quality and maintenance guarantees on the best implementations for Rust? (which is currently the obvious language of choice besides Go)

I don't know enough to answer.

My intuition is that if someone is going to have to maintain a wire protocol implementation either way, it would probably still be better for that to be gRPC than the custom socket protocol.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How certain are we on the quality and maintenance guarantees on the best implementations for Rust? (which is currently the obvious language of choice besides Go)

It might be reasonable to do a bit more diligence, but the standard validator stack includes a signing service (tmkms) that is moving to use gRPC (for our privval interface) so I think practically speaking it's fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intuition is that if someone is going to have to maintain a wire protocol implementation either way, it would probably still be better for that to be gRPC than the custom socket protocol.

Is this really true given the custom socket protocol should be much simpler than an implementation of grpc?

It might be valuable to try and collect some feedback on grpc implementation quality in non official languages. Hopefully it will be the case that by the time anyone wants to develop an ABCI app in another language there will be a way to use grpc from that language. I would hate to see someone want to experiment with ABCI apps in a new lang and not be able to because of the grpc blocker, but that will probably be pretty rare (I guess worse case they can prob always use FFI to a lang with a grpc impl on the server side ? )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that gRPC is a more complex protocol than the socket protocol as it is currently designed. On the other hand, most of the likely languages for app development already have good (either official or unofficial) gRPC implementations.

The important question here, I think, is what to do about the long tail of other language options. Broadly I claim we should not worry too much about that long tail, on the basis that it's unlikely to generate any meaningful impact for the ecosystem.

Under the covers, gRPC is around the same level of complexity for an implementor as websockets. The substrate is stock HTTP/2, and the message formats are protocol buffers. We already require anyone building against ABCI (of any version) to have protobuf support, and HTTP is usually an out-of-box library feature.

Despite being more work, gRPC is also widely used for other purposes (beyond ABCI) throughout the Tendermint ecosystem, so an investment in it would have broader potential returns for the application. It does mean that a developer using some unusual language would have to do more short-term work to get something running, but I'm not sure we should invest a lot in the corner case of "weird languages, toy problems".

Perhaps more importantly, if the goal is to "pick one (and only one) mechanism for out-of-process apps" (which is the decision this ADR is proposing), I think developers who are using a well-supported gRPC language will have a better experience (and matter more to our impact).

@creachadair creachadair force-pushed the mjf/unsocketing branch 3 times, most recently from 98581b4 to 8c7d5c9 Compare May 23, 2022 19:43
docs/architecture/adr-082-socket-protocol.md Outdated Show resolved Hide resolved
docs/architecture/adr-082-socket-protocol.md Outdated Show resolved Hide resolved
docs/architecture/adr-082-socket-protocol.md Outdated Show resolved Hide resolved
docs/architecture/adr-082-socket-protocol.md Outdated Show resolved Hide resolved
Comment on lines 142 to 148
> application's implementation language, and would have to be sure networks
> have plenty of time not only to make the change but to verify that it
> preserves the function of the network.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the tendermint-rs case, I would suggest to keep this PR open until @thanethomson is back, or, if you merge it, to leave this "discussion text" in here so that he can contribute to the discussion with his detailed knowledge of tendermint-rs.

docs/architecture/adr-082-socket-protocol.md Outdated Show resolved Hide resolved
- ✅ Replaces custom code with a well-maintained widely-used library.
- ✅ Aligns well with Cosmos SDK, which already uses gRPC extensively.
- ✅ Aligns well priv validator interface, for which the socket protocol is already deprecated for gRPC.
- ❓ Applications will be hard to implement in a language without gRPC support.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How certain are we on the quality and maintenance guarantees on the best implementations for Rust? (which is currently the obvious language of choice besides Go)

@creachadair
Copy link
Contributor Author

creachadair commented May 24, 2022

Nice document, I am just wondering whether it looks more like a RFC than an ADR. If the Decision section clearly points out the option we are taking, it would look more like an ADR.

Agreed. I plan to narrow down the decisions (and make the things we don't choose "alternatives considered") once we get more feedback from folks building applications and running networks.

Unlike an RFC, we really do need a technical decision made here. The main concern I'd like to avoid is committing to what I think is the right outcome without robust input.

@creachadair creachadair force-pushed the mjf/unsocketing branch 2 times, most recently from bb59300 to 1a4bf8c Compare May 24, 2022 16:26
@cason
Copy link
Contributor

cason commented May 25, 2022

Is anyone against merging this PR?

@creachadair
Copy link
Contributor Author

Is anyone against merging this PR?

No. But now that the core team has had a chance to comment I am going to circulate it more widely so we can get more input before we commit a decision.

@creachadair creachadair force-pushed the mjf/unsocketing branch 2 times, most recently from 26d2498 to 2dc2f6b Compare May 25, 2022 18:08
@creachadair creachadair force-pushed the mjf/unsocketing branch 11 times, most recently from bfa6ed7 to 1846214 Compare July 18, 2022 13:38
@creachadair creachadair changed the title ADR 082: The Future of the Socket Protocol RFC 021: The Future of the Socket Protocol Jul 19, 2022
@creachadair
Copy link
Contributor Author

Given that we are not prepared to make a decision at this time, I have reworked this PR as an RFC instead of an ADR, so we can have a record of the discussion without committing to a particular outcome yet.

@creachadair creachadair merged commit 8a81541 into master Jul 19, 2022
@creachadair creachadair deleted the mjf/unsocketing branch July 19, 2022 23:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet