BeaconNode <--> ValidatorClient API - Protocol #1012

Closed
spble opened this issue Apr 30, 2019 · 26 comments

@spble
Contributor

commented Apr 30, 2019

ETH2.0 Beacon Node & Validator Client Protocol Discussion

Further background and the actual protocol are described in issue #1011.

It would be useful to choose a standard protocol for the BeaconNode and ValidatorClient API.

At the Client Architecture session of the Sydney Implementers meeting, the main decision was identified as being between gRPC and JSON-RPC.
This discussion was a follow-on from the Client Architecture Roundtable in Prague.

gRPC

Advantages

  • Fast
  • Highly specified, compiled specification
  • Has explicit types and support for binary data

Disadvantages

  • Non-human-readable transport
  • Requires a specific framework for integration with other software
  • Requires special software to interface with directly

JSON-RPC

Advantages

  • Human-readable transport
  • Simple enough to be implemented by any language
  • Well understood by many developers
  • Can easily be called with curl
  • Consistency with Eth1.0 clients

Disadvantages

  • Slower
  • Some ambiguity around data representations
  • Specification of API not as rigorous, must be specified out-of-band

In conclusion, most people had a preference for JSON-RPC, mainly due to its human readability and ease of implementation.

@prestonvanloon

commented Apr 30, 2019

For Prysm, we will be continuing to use protocol buffers for our beacon chain and validator implementation.
The view within the team is that the API enforcement provided by generated code and the performance gains outweigh the marginal benefit of using curl or other pre-installed tools rather than tools created for the ecosystem.

Client interop may be achieved through a gRPC proxy gateway, but the bidirectional streaming would not work so we may not support JSON-RPC unless there is a very compelling reason to do so.

@spble

Contributor Author

commented May 1, 2019

Thanks for the input @prestonvanloon - I definitely see the performance improvements with using gRPC, but I imagine the interface between the BeaconNode and ValidatorClient will be a fairly low-bandwidth interface. As such, doing a call in 10ms instead of 100ms would not bring any substantial benefit in my opinion.

We have also implemented protocol buffers in Lighthouse currently, but we are considering refactoring this if most other clients are in favour of JSON-RPC. Interoperability is our most compelling reason for this refactor.

While using curl is helpful, I think the human-readable and widely understood nature of the protocol is the biggest benefit. JSON-RPC is very widely used and understood by web developers, whereas gRPC is generally a lot more niche.

@spble

Contributor Author

commented May 1, 2019

Also, a quick google around reveals: https://github.com/plutov/benchmark-grpc-protobuf-vs-http-json
Turns out that speeds and resource usage are fairly comparable... within one order of magnitude anyway.

@prestonvanloon

commented May 1, 2019

@spble Interesting link!

They are almost the same, HTTP+JSON is a bit faster and has fewer allocs/op.

This is quite surprising actually. 😄

Going forward, we would still advocate for protobuf usage even if solely for its generative schema approach. If the general consensus is to support only JSON-RPC, then we would likely provide some wrapper or use jsonpb while we continue to have success with protobuf. We're even using protobuf in the browser with a TypeScript application! And with tools like prototool, we maintain productivity for the rare need for ad hoc queries.

In short, we support interop even if we are the minority.

@pipermerriam

Member

commented May 1, 2019

Lacking a compelling need for gRPC's performance gains, which based on the comments in this thread doesn't seem to be present, JSON-RPC is my preference.

Potentially compelling reason for JSON-RPC: It is already well supported across the existing web3 tooling which makes integration with existing web3 client libraries much simpler.

@karalabe

Member

commented May 6, 2019

Hey all,

Just wanted to do a small braindump. Full disclosure, I'm not familiar with the ETH 2.0 spec at all, nor with the communication requirements between beacon chain nodes and validators. That said, I can talk out of ETH 1.0 experience + general API experience.

Generally, the dumber and more boring a protocol is, the simpler it is to interface. At the end of the day, the goal of Ethereum is to bring developers together, so we should always prefer simplicity over other advantages.

There have been two proposals made here: gRPC and JSON-RPC. I honestly don't see any advantage in gRPC if we're building an open infrastructure. "Nobody" will want to (or be able to) roll their own gRPC implementation, so you are immediately limited by what you can implement on top of Ethereum purely, because you can't talk to it. This alone should be enough to rule out gRPC (this is why you don't see protobuf, cap'n proto and others on APIs). These frameworks are very nice for internal calls in proprietary systems, but not in public APIs that you want to maximize interoperability with.

That said, JSON-RPC is also a horrible choice. It's better than gRPC in that you can at least interface it easily, but the issue is that it is a stateful protocol, which makes it a non-composable protocol. Ethereum 1.0 made the huge mistake of permitting RPC calls that span requests (e.g. req1: create a filter; req2: query logs until block N; req3: query until block M, etc). This is a huge problem in internet infrastructure as it completely breaks horizontal scaling. All the requests must go to the same backend, because they are stateful. The backend cannot be restarted, the backend cannot be scaled, cannot be load balanced, cannot be updated, etc. JSON RPC works ok for communicating 1-1 with a dedicated node, but you cannot build public infrastructure out of it.
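The filter problem described above can be illustrated with a minimal sketch. Everything in it is hypothetical (the `StatefulBackend` class, the `get_logs` function, and the made-up log data are not taken from any client): a filter created on one backend is unknown to another, whereas a query that carries its full range can be answered by any backend.

```python
# Made-up (block number, log) pairs standing in for chain data.
LOGS = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]


class StatefulBackend:
    """Hypothetical backend that remembers filters between requests."""

    def __init__(self):
        self.filters = {}
        self.next_id = 0

    def new_filter(self, from_block):
        # req1: create a filter; its state lives only on this backend.
        self.next_id += 1
        self.filters[self.next_id] = from_block
        return self.next_id

    def get_changes(self, filter_id, to_block):
        # req2, req3, ...: raises KeyError on a backend that never saw req1.
        start = self.filters[filter_id]
        self.filters[filter_id] = to_block + 1
        return [log for n, log in LOGS if start <= n <= to_block]


def get_logs(from_block, to_block):
    """Stateless alternative: the request carries the whole range."""
    return [log for n, log in LOGS if from_block <= n <= to_block]
```

With two `StatefulBackend` instances behind a load balancer, a follow-up request only works if it reaches the instance that created the filter; `get_logs(3, 4)` gives the same answer wherever it lands.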

My proposal is to seriously consider RESTful HTTP for all public APIs that Ethereum nodes need to serve. If you are unfamiliar with it, REST is simply a "schema" that defines how you should query data ("GET /path/to/resource/"), how you should upload data ("POST /path/to/resource") and how different errors should be conveyed ("404 not found"). It is a tiny specialization of the HTTP protocol, but the enormous power is that:

  • You can query it from any tool: curl, browser, literally any programming language. The data you get back is JSON that you need to interpret of course, but the "input" is mostly just URL parameters that you can even type in manually to test something.
  • Everything speaks HTTP. You can access it through a proxy, you can put it behind Tor, you can shove it on top of AppEngine, you can put an nginx load balancer in front, you can put a memcache in front, you can put Cloudflare in front, you can have AWS or GCE auto scale it for you. You can host one backend to serve it, or 100. You can serve it from multiple geographic locations (i.e. many data centers). You can have rotating DNS in front for failovers. You can have encryption and server authentication via TLS, you can add client authentication via client certs, or JWT tokens, or OAuth. You have well defined throttling mechanisms via token buckets. You can even host your service through a Mashape/RapidAPI marketplace and make money for yourself.
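As a rough illustration of the GET / POST / 404 convention described above, here is a minimal sketch of a request handler written as a plain function. The `/blocks` resource, its fields, and the `handle` function are assumptions made up for this example; they are not part of any proposed beacon node API.

```python
import json

# Hypothetical in-memory resource store; paths and fields are illustrative only.
BLOCKS = {"1": {"slot": 1, "parent": "0"}}


def handle(method, path, body=None):
    """Return (status_code, json_text) for a request, REST style."""
    parts = [p for p in path.split("/") if p]
    if method == "GET" and len(parts) == 2 and parts[0] == "blocks":
        # Query a resource by its path.
        block = BLOCKS.get(parts[1])
        if block is None:
            return 404, json.dumps({"error": "not found"})
        return 200, json.dumps(block)
    if method == "POST" and parts == ["blocks"]:
        # Upload a resource to the collection.
        block = json.loads(body)
        BLOCKS[str(block["slot"])] = block
        return 201, json.dumps(block)
    # Anything else is treated as an unknown resource in this sketch.
    return 404, json.dumps({"error": "not found"})
```

Because the handler depends only on the method, the path, and the body, any number of identical backends could serve it behind a proxy or load balancer.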

You see, RESTful HTTP APIs are the building blocks of the modern internet. Everything on the internet is built to produce and consume it. If we go down the JSON RPC path, we remain yet another niche. Sure, some will support it, but the big guys will always be deterred. If we embrace proper architectures, Ethereum will be trivial to integrate into existing systems, giving it a huge boost in developer appeal.

My 2c.

@karalabe

Member

commented May 6, 2019

Oh, just as a memo, the fact that the default reply format is JSON, is just a detail. Since the reply is just an HTTP response, you are free to send JSON, or any other format. Way back XML was also popular (e.g. "GET /path/to/res.json" vs. "GET /path/to/res.xml"), but there's nothing stopping us from also supporting a binary format (e.g. "GET /path/to/res.rlp" or "GET /path/to/res.ssz"). REST still works, it doesn't care, HTTP doesn't care, nothing cares. But we can immediately have both performance and simplicity: validators would use a binary format, and a web interface would use a json format.
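A minimal sketch of the extension-based format selection described above, assuming a hypothetical resource with a single `slot` field. The `.ssz` encoder here is only a stand-in (an 8-byte little-endian integer), not a real SSZ implementation, and the `render` function and `ENCODERS` table are made up for illustration.

```python
import json
import os

# Map a path extension to (content type, encoder). Both encoders are
# illustrative; the ".ssz" one is a placeholder, not a full SSZ library.
ENCODERS = {
    ".json": ("application/json", lambda v: json.dumps(v).encode()),
    ".ssz": ("application/octet-stream", lambda v: v["slot"].to_bytes(8, "little")),
}


def render(path, value):
    """Return (status, content_type, body) based on the path's extension."""
    _, ext = os.path.splitext(path)
    if ext not in ENCODERS:
        return 404, "text/plain", b"unknown format"
    content_type, encode = ENCODERS[ext]
    return 200, content_type, encode(value)
```

The same resource is then served as text to a web interface and as compact bytes to a validator, with nothing but the URL differing.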

@karalabe

Member

commented May 6, 2019

Btw, I'd gladly help spec out a REST version if you have any pointers to the requirements. I'm aware there might be limitations that might make REST unsuitable, but I'd rather redesign the limitations than go with a non-popular protocol.

@pipermerriam

Member

commented May 6, 2019

@karalabe I'm not sure I follow the argument for REST. I acknowledge and recognize the problems with the stateful ETH1.x APIs and am fully onboard with avoiding those mistakes in Eth2.0 APIs but I fail to see how REST solves that.

Note that I'm not arguing against REST, just trying to understand.

I do agree that REST is more expressive than JSON-RPC and that we could benefit from that. I will say that JSON-RPC's simplicity has been nice, exposing the API over a unix socket and bypassing the need for an HTTP server's complexity.

@ligi

Member

commented May 6, 2019

I think the most compelling argument for gRPC/protobuf is that it leads to a well defined API - currently with json-rpc this is a mess. As far as I can see, using gRPC/protobuf would prevent us from repeating this mess. So I would lean in this direction. I'm also having trouble understanding @karalabe's argument against it:

I honestly don't see any advantage in gRPC if we're building an open infrastructure. "Nobody" will want to (or be able to) roll their own gRPC implementation, so you are immediately limited by what you can implement on top of Ethereum purely, because you can't talk to it.

Why will nobody be able to roll their own gRPC implementation?

@karalabe

Member

commented May 6, 2019

REST mostly allows Ethereum to be a component in a modern web stack. For example, I can't run my own "Infura", because it's a PITA to interpret, load balance, and cache all those requests. It takes a team just to maintain an Ethereum gateway. But if the API was simple REST, anyone could compete with Infura. You could have Cloudflare compete with them. You could launch N k8s instances and have k8s auto load balance. The advantage is that you can combine your node with existing infrastructure in a natural and native way, without relying on custom bridges (e.g. How do I write a firewall to block personal_xyz JSON RPC calls, I dunno? How do I write a firewall to block /api/personal/xyz, well, that's easy, any web server/router/proxy can do it, or authenticate it, or throttle it).
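The firewall comparison above can be sketched roughly as follows. Both functions are hypothetical: `allow_rest` is the kind of path-prefix rule any web server or proxy can express, while `allow_jsonrpc` shows that the equivalent JSON-RPC rule has to parse the request body, including batch requests.

```python
import json


def allow_rest(path):
    # A plain path-prefix rule; no knowledge of the payload is needed.
    return not path.startswith("/api/personal/")


def allow_jsonrpc(body):
    # The method name is inside the body, so the filter must parse JSON
    # and also handle batches (a JSON array of calls).
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    calls = payload if isinstance(payload, list) else [payload]
    return all(
        isinstance(call, dict)
        and not str(call.get("method", "")).startswith("personal_")
        for call in calls
    )
```

The REST rule maps directly onto existing proxy configuration, whereas the JSON-RPC rule needs custom code in front of the node.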

I do agree that REST is more expressive than JSON-RPC and that we could benefit from that.

I'd actually say REST is less expressive, hence why it's more composable.

exposing the API over a unix socket and bypassing the need for an HTTP server's complexity.

We can still expose REST through a unix socket. The socket is just a transport, TCP vs. IPC. Whether that transport speaks REST or JSON-RPC is not relevant from the transport's perspective.

@karalabe

Member

commented May 6, 2019

@ligi gRPC is a framework. You need libraries to speak to it, e.g. there's no Erlang lib. You immediately shut people like blockscout, who develop in Elixir, out of the network.

@ligi

Member

commented May 6, 2019

@karalabe

Member

commented May 6, 2019

0.4-alpha, build failed on CI :) Yes, you can hack it, but that doesn't mean there's reliable code.

@ligi

Member

commented May 6, 2019

OK - good point ;-)
Still really compelled by the advantage of having a strong protocol spec though - but you are right - it comes with some collateral damage ..

@karalabe

Member

commented May 6, 2019

Completely agree :) https://swagger.io/specification/

@pipermerriam

Member

commented May 6, 2019

@karalabe 👍 makes sense now. I would be fine with REST or JSON-RPC.

Restating my 👎 on gRPC due to it having real tooling downsides and all of its stated upsides being things we can address with things like swagger for well defined REST specifications, or just good due diligence if it's JSON-RPC.

My comment about expressiveness was intended towards the expressiveness of HTTP method semantics in REST (GET/POST/PUT/DELETE) and response status codes.

@holiman

commented May 6, 2019

My two cents (which mainly is the same as @karalabe brought up).

Cent one

  • JSON-RPC is a protocol for a dialogue between two peers that send messages to one another. It's good for that. It means that each message is its own unique snowflake, and each message deserves its own unique response. That means it is
    • Intrinsically difficult to cache,
    • A message-based processing pipeline, which is quite resource intensive
  • A REST API is a client/server protocol, where resources are served to a multitude of clients. Like Peter pointed out, it can be trivially scaled/cached/balanced.
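A minimal sketch of the caching point above, using hypothetical helper names. A GET request can be cached on its path alone, whereas a JSON-RPC body embeds a per-message `id`, so a cache keyed on the raw body misses unless it parses and normalizes each message first.

```python
import json


def rest_cache_key(method, path):
    # Any HTTP cache can key a GET on the URL without reading a body.
    return path if method == "GET" else None


def naive_jsonrpc_key(body):
    # Keying on the raw body: two identical calls with different ids miss.
    return body


def normalized_jsonrpc_key(body):
    # To get cache hits, the cache has to understand the message format.
    call = json.loads(body)
    return json.dumps([call["method"], call.get("params", [])], sort_keys=True)
```

The normalization step is the per-message processing cost that a path-keyed cache avoids.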

That may be somewhat generalizing, but I think it's a fairly accurate description. So, also without having deep insight into 2.0, I think you should consider whether what we're building up to is going to be a dialogue or a client/server scenario.

Cent two

  • Writing a schema for JSON-RPC is, imho, very difficult. The ecosystem for json-rpc schemas is not mature. I have, as well as @cdetrio, tried to make formal definitions of the expected request and response schemas in use, in order to create validation tests. There are tickets a bit here and there regarding this, I can't find them right now, but suffice to say it's been a PITA to validate/maintain/specify client behaviour without any rigorous schemas on expectations.
  • Writing a schema for a RESTful service is well supported by mature tooling, with things like swagger and similar tools which generate end-user docs, examples and validation.

@FrankSzendzielarz

Member

commented May 6, 2019

My Various Cents

  • Swagger metadata can be used to deploy Swagger UI pages for remote testing and integration
  • Swagger metadata can be used in Swagger codegen to auto-create stub clients and stub servers
  • The metadata defines message contents clearly
  • HTTP Media Formatters allow any encoding (flexibility) so the particular client can negotiate the encoding (RLP, XML, JSON etc...) with HTTP Accept / Content-Type headers as per usual in web dev, and typically web API servers will handle all that
  • Error messages are standardized. This also helps with capacity management. The usual 503 or 429 HTTP messages can be sent - and this could happen either at a networking (Infura) level and/or in the server implementation itself. (Some notes on capacity management https://ethereum-magicians.org/t/a-cross-protocol-cross-implementation-standard-for-server-capacity-management-and-flow-control/3123)
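
To make the capacity-management point concrete, here is a minimal token-bucket sketch that answers 429 once a client exceeds its budget. The `TokenBucket` class, its parameters, and the `status_for` helper are assumptions for illustration, and the clock is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Hypothetical per-client rate limiter."""

    def __init__(self, capacity, refill_per_sec, now=0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def status_for(bucket, now):
    # 429 Too Many Requests when the client has no tokens left.
    return 200 if bucket.allow(now) else 429
```

Because 429 is a standard status code, the same limiter could sit in the node itself or in a gateway in front of it without the client needing to know which.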

@FrankSzendzielarz

Member

commented May 6, 2019

Here is a rough, part-implemented (missing other objects deeper in the object graph under BeaconBlock) example of a Beacon node HTTP REST-like architecture and API

https://beaconapi20190506111547.azurewebsites.net/

Because the Swagger metadata in the URL is downloadable this could help serve as a spec.

I can keep extending and modifying this so that it actually does validation etc., if people want. Maybe it could evolve into a test harness or an implementation. Let me know please.

You can add protobuf media formatters and RLP formatters as well as the default JSON and XML ones you see there. You can also try to auto-generate clients in the language of your choice here with the gen/clients POST method. E.g. this was auto-generated for golang (note the docs folder):
go-client-generated.zip
Rust, just to be fair:
rust-client-generated.zip

@spble

Contributor Author

commented May 7, 2019

Thanks very much for the input @karalabe - I definitely agree with your points regarding REST. I think HTTP-REST is what I had in my mind, I was just following Eth1.0 with JSON-RPC.

My vote is definitely for an HTTP REST interface, which returns JSON by default.

Thank you for the part implementation @FrankSzendzielarz - I will integrate your suggestions into my next API proposal and post it on #1011

@paulhauner

Contributor

commented May 7, 2019

My proposal is to seriously consider RESTful HTTP for all public APIs that Ethereum nodes need to serve.

I support this.

@arnetheduck

Contributor

commented May 10, 2019

My proposal is to seriously consider RESTful HTTP for all public APIs that Ethereum nodes need to serve

likewise, support this, for the advantages of working better with "standard" infrastructure. also good to work on specifying it unambiguously - current status quo is indeed a bit of a mess to figure out, and swagger seems as good as any.

nothing prevents clients from using another, more performant or specialized protocol in their internal communication (for example when a beacon node and validator from the same client suite talk to each other), when the goal is not interop.

@gcolvin

commented May 11, 2019

@karalabe has been right about this for going on two decades now.
Roy Thomas Fielding, Architectural Styles and the Design of Network-based Software Architectures
CHAPTER 5: Representational State Transfer (REST)

@spble

Contributor Author

commented May 13, 2019

So a REST API seems to be the consensus.

I have proposed an OpenAPI spec in PR #1069, which can also be viewed on SwaggerHub

Closing this issue in favour of the PR.

@spble spble closed this May 13, 2019

@BelfordZ

commented May 23, 2019

I propose we use OpenRPC + JSON-RPC
