[WG] proposal #37
Comments
If somebody wants to help out with writing the OpenAPI spec for the proposed endpoints, please contact me so we don't overlap.
Thank you BN working group!!
Yes, in particular please see the open questions; much of the debate may lie in there. Also, if there is a particular use case that is entirely unsatisfied by this API, or made overly complicated by it, please document it here for discussion.
I realize that we are coming very late to the party, since Nimbus has delayed introducing the beacon-node / validator-client split until now, but planning out our internal design revealed some questions regarding the API.

Within our codebase, we've tried to prepare for the split from the start by making all signing and RANDAO reveal operations asynchronous. But one key difference from the validator API is that we've assigned the responsibilities in a different way: for us, the beacon node decides the best time to produce a block or attestation, and the validator client is called on demand only to perform the operations that depend on the validator keys. With the APIs proposed here, the validator client must keep track of time and must determine the best moment to request attestations and blocks for signing, but this seems counter-productive for the goal of profit maximisation. As you may know, determining the optimal set of attestations to aggregate is a hard optimisation problem that the beacon node will be much better equipped to solve, unless the validator client significantly increases in complexity.

So I'm wondering: has this different split of responsibilities been considered? Are there any other requirements that conflict with it? Admittedly, there are some obvious implications for the trust needed between the beacon node and the validator client, but it seems to me that the trust problem was never solved in a satisfying way, and also that we must still provide the optimal solution for validators who will be running their own beacon nodes (without the need to trust any third parties).
As you mention, the largest piece of work here is attestation aggregation. The API is designed to support both low-powered validators that may not have the resources to aggregate, as well as high-powered validators that want to do the work themselves. Low-powered validators would GET /v1/validator/aggregate_attestation to fetch an aggregate attestation built by the beacon node. High-powered validators would subscribe to the event stream for attestations and carry out their own aggregation. Both would submit the aggregated attestation with a POST to /v1/validator/aggregate_and_proof.

Timing is a more generic issue: waiting on proposing a block to potentially put more attestations into it, for example, could result in more profit for the proposer, but does somewhat fly in the face of the spirit of the validator spec (if not the letter; see the section "Block proposal", for example, where it states "A validator is expected to propose a …").

All that said, it sounds like you're combining the beacon node and validator into a single process and separating out only the signing, so the beacon node<->validator API wouldn't apply to this situation. Or am I misunderstanding your architecture?
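The two aggregation paths described above can be sketched as follows (a hedged Python illustration: `http_get` stands in for a real HTTP client, and passing the raw attestations in as a plain list is a stand-in for gathering them from the event stream):

```python
from typing import Callable, List


def collect_aggregate(
    http_get: Callable[[str], dict],
    low_powered: bool,
    local_aggregate: Callable[[List[dict]], dict],
    raw_attestations: List[dict],
) -> dict:
    """Return the aggregate attestation the validator client would then
    sign and POST to /v1/validator/aggregate_and_proof."""
    if low_powered:
        # Let the beacon node do the aggregation work for us.
        return http_get("/v1/validator/aggregate_attestation")
    # High-powered path: aggregate locally from attestations gathered
    # via the event stream (modelled here as a pre-collected list).
    return local_aggregate(raw_attestations)
```

Either way, the final submission endpoint is the same, which keeps the beacon node's view of the result uniform.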
Yes, you are understanding our architecture correctly. The primary motivation for discussing this different split of responsibilities is that, at least for us, it leads to a significant reduction in the code size and complexity of the validator process.
Chatted to @zah offline and I think this is where we ended up:
@zah please correct me if any of the above is incorrect.
We are committed to supporting the
We did consider this in some detail. I can't quite remember the details; however, the one that stands out is allowing VCs to switch between BNs. If you have timing logic inside the VC then it can determine whether the BN is actually doing its job or not. If the VC requires the BN to call it then, in the most naive implementation, if the BN goes down the VC just sits there doing nothing instead of logging errors or perhaps switching to another BN. Of course, you can get around this by implementing some logic in the VC to figure out whether or not it's been prodded in a while. But this is timing logic, and when implemented in its entirety it roughly ends up replicating the timing logic you implemented in the BN.
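The naive watchdog described above might look like the following sketch (all helper names are hypothetical; the point is that even the failover check is itself timing logic living in the VC):

```python
def pick_beacon_node(nodes, is_responsive, seconds_since_last_duty, timeout=30.0):
    """VC-side watchdog: stay on the current BN while it has prodded us
    within `timeout` seconds and answers a health check; otherwise fail
    over to the first responsive alternative."""
    current = nodes[0]
    if seconds_since_last_duty <= timeout and is_responsive(current):
        return current
    for candidate in nodes[1:]:
        if is_responsive(candidate):
            return candidate
    # Nothing better available: keep the current node and log errors.
    return current
```

Note that `timeout` is exactly the kind of duty-timing knowledge the comment argues ends up duplicated between BN and VC.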
I agree. This logic is done in the BN for us.
cc @rauljordan. At @prysmaticlabs, we are generally not in favor of any duck typing or otherwise ambiguous object unique identifiers.
There is only duck typing between slot and root, but you can always identify a root by its "0x" prefix, so I don't think anyone should have a problem coding that in any language.
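For illustration, the 0x-prefix disambiguation could be as simple as this Python sketch (the alias set follows the aliases discussed elsewhere in this thread):

```python
def classify_block_id(block_id: str) -> str:
    """Classify a duck-typed block identifier: a 0x-prefixed root,
    a named alias, or a decimal slot number."""
    if block_id.startswith("0x"):
        return "root"
    if block_id in {"head", "genesis", "finalized", "justified"}:
        return "alias"
    if block_id.isdigit():
        return "slot"
    raise ValueError(f"unrecognised block id: {block_id!r}")
```

The objection below is not that this is hard to write, but whether the API ought to require it at all.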
This isn't a question of whether or not we could implement it this way, but whether we ought to implement it this way. There are 5 different ways to handle a
While there are only 5 different ways to interpret request data, there are likely more that could be added, which would then increase the burden on maintainers. We would advocate for a leaner, clearer, and simpler API which doesn't set a precedent for more ambiguity. BlockId should have a single, clear interpretation. Edit: likewise with all object IDs.
@prestonvanloon If we go for a single representation I think we have three options:
The first option would reduce functionality, so I'd assume that either 2) or 3) would be the available choices. Which of these are you considering, or are you thinking of something else?
I would argue that head, genesis, and finalized aren't params; they are separate endpoints. Sort of like popular
@mcdee I believe the original proposal from @djrtwo and @protolambda had solved this. Here are a few scenarios I can think of:

Retrieve head, finalized, previous justified, current justified block:

```
GET /beacon/forkchoice/head

{
  "head_block_root": "...",
  "finalized_block_root": "...",
  "previous_justified_block_root": "...",
  "current_justified_block_root": "..."
}
```

Now I have the required information to request the particular block by root.

Retrieve block by slot:

Given that there may exist multiple blocks per slot in the event of a fork, we could use something like `GET /v1/beacon/block_root?slot={slot}` or `GET /v1/beacon/block_root/{slot}`. This would return an array of block roots, or block root objects with a "canonical" property, which can then be used with the single-block retrieve method: `GET /v1/beacon/blocks/{block_root}`
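Picking the canonical root out of such a response might look like this (a sketch assuming the `root`/`canonical` object shape described above, which is not finalized in the spec):

```python
def canonical_block_root(entries):
    """Given the array returned by GET /v1/beacon/block_root?slot={slot}
    (objects assumed to carry 'root' and 'canonical' fields), return the
    canonical root to feed into GET /v1/beacon/blocks/{block_root}."""
    for entry in entries:
        if entry.get("canonical"):
            return entry["root"]
    # Empty slot, or every block at this slot sits on a non-canonical fork.
    return None
```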
This is basically my point. This proposal is asking implementation teams to add and maintain multiple endpoints/routes which could be solved by one or two.
How would you get forkchoice for some past state?
You are asking users to read and understand a bunch of documentation to figure out how to get some finalized block or some other piece of data, while making a bunch of requests. The whole point of a REST API is to have conventions for how data is represented and to make it easy to obtain. I bet even eth2 developers are using fetch-by-slot to investigate what's going on.
I hardly think this is an issue here. I doubt it takes more than an hour to create those alias endpoints, and it will make the API a lot more friendly for new users/developers to interact with...
Get that state and look.
While the last point may not capture the node's view at the time that state was created, I don't think anyone is advocating for historical forkchoice data, so it should be adequate information.
Do we expect to support users that don't or won't read documentation?
While it might only take an hour to create, an implementation team then has to support and maintain it.
Regarding the complexity of the aliases: to help myself think through it, I sketched out the flow for a `resolve_alias` function:

```rust
fn resolve_alias(alias: &str) -> Hash256 {
    match alias {
        "head" => get_head_state_root(),
        "genesis" => get_genesis_state_root(),
        "finalized" => get_finalized_state_root(),
        "justified" => get_justified_state_root(),
        other => {
            if other.starts_with("0x") {
                // Parse the 0x-prefixed hex string into a Hash256.
                parse_root(other)
            } else {
                get_state_by_slot(other)
            }
        }
    }
}
```

So we have:

```rust
fn get_finalized_state_root() -> Hash256 {
    // State reads are costly; hopefully clients can optimize around this.
    let head_state = get_state(get_head_state_root());
    // There are potentially extra state reads when finality > state.state_roots.len().
    head_state.get_state_root(start_slot(head_state.finalized_checkpoint.epoch))
}
```

This doesn't seem like a massive burden to me, but perhaps I'm missing something. So, I guess we have pros/cons of aliases:

Pros:
Cons:
I don't feel super strongly about the aliases; I don't consider them a huge burden. However, if I were designing this I would:
In summary, this API looks fine to me (thanks to those who put the effort in to produce it). I don't see aliases as a large maintenance overhead, but I also don't feel super strongly about keeping them either. I'm happy to follow the crowd here. If you want a firm statement from me I can do it, but I suspect another chef in the kitchen isn't necessarily helpful.
@paulhauner I would just mention that justified/finalized state roots are present in forkchoice, which could be an option to avoid expensive state reads ^^
There is a "No duck typing" tab at the end of the tab list in the document. It's an example of beacon chain data endpoints with strict object identifiers. In some cases it will result in an extra round-trip to request the same data compared to the "duck typing" approach. The other concern regarding "duck typing" endpoints is that the HTTP caching policy will have to be set depending on the request parameters. For example:
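To make the caching concern concrete, here is a sketch of how the Cache-Control header would have to vary with the *kind* of identifier passed to a duck-typed endpoint (header values are illustrative choices, not from the spec):

```python
def cache_control_for_state_id(state_id: str) -> str:
    """Pick a Cache-Control policy per identifier kind. A root or the
    genesis state never changes; 'head'/'finalized'/'justified' move as
    the chain advances; a slot is only stable once finalized."""
    if state_id.startswith("0x") or state_id == "genesis":
        return "public, max-age=31536000, immutable"
    if state_id in {"head", "finalized", "justified"}:
        return "no-store"
    # A numeric slot: cache only briefly, since a reorg can still change it.
    return "public, max-age=60"
```

With strict per-kind routes, each route could instead carry one fixed policy.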
Not to derail the duck typing debate, but this brings to mind another question: what are we asking for when we want the state at a given slot? There are arguments for both, of course, but we should pick one, as it has implications. For example, Paul's statement that I quoted above only holds true if we use "start of slot". I think that "end of slot" (or epoch) is the more intuitive, but if there are any strong arguments for "start of slot" it would be worth hearing them.
I would expect state where
Having
Hi @paulhauner, we absolutely need the
Makes sense, I didn't consider that :)
I'm happy for

We have validators and committees identified by
I would define it as

It's not clear to me what a "state at the start of slot

Regarding
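A toy model of the "end of slot" reading being argued for in this sub-thread (purely illustrative; the real per-slot transition is defined in the consensus spec):

```python
def state_at_slot(genesis_state, target_slot, blocks_by_slot):
    """Toy model: the 'state at slot n' is the post-state after all of
    slot n's processing, i.e. the per-slot transition has run and slot
    n's block (if any) has been applied. State is a dict for brevity."""
    state = dict(genesis_state)
    while state["slot"] < target_slot:
        state["slot"] += 1                 # per-slot processing
        block = blocks_by_slot.get(state["slot"])
        if block is not None:
            state["latest_block"] = block  # apply that slot's block
    return state
```

Under the "start of slot" reading, the loop would instead stop before applying `target_slot`'s own block, which is exactly the ambiguity the comment asks the spec to pin down.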
I've commented on the spreadsheet, but I'll reiterate here:
The state-id format is still painful, and I think splitting it into multiple routes is not a bad idea. Here's a proposal:

In paths

I understand that having extra routes is not pretty, but it is essential for some tech stacks (Go with gRPC versions of the API in particular), so I would go for the following compromise: we spec the state ID as a "state ID path segment". Every time we would use a state ID in a path, the path segment can consist of multiple segments, the first always being a string. The same applies for other options that end up in the path as a required argument with different types. If you want to cover all options in one API handler really badly (JS and Python can likely do something like that elegantly), you can, by just matching multiple routes together. Take the state ID as an example:

E.g. when we spec

ExpressJS and more advanced routers support matching multiple patterns for the same endpoint, and then store the path params in a special dictionary anyway, so there's no problem with typed arguments or default values there, regardless of approach. This way, languages that want to type the root, slot, validator index, etc. can do so, by simply routing them to a different endpoint. Meanwhile, the spec stays clean, and the other languages can code-golf the API definition to be as minimal as they want. Also, this solves the HTTP caching problem, since the

In queries

Right now the API proposal has no case of a mixed input type in a single query param. If a param can have two types (e.g. a block root and a slot), then it should be part of the path. If the two types are not exclusive, then there should be two separate query params.
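Under this compromise, a router table might look like the following (route shapes and prefixes are illustrative; a permissive stack can still funnel all three patterns into one handler):

```python
import re

# One concrete route per typed form of the state ID path segment.
ROUTES = [
    (re.compile(r"^/v1/beacon/states/(head|genesis|finalized|justified)$"), "alias"),
    (re.compile(r"^/v1/beacon/states/root/(0x[0-9a-fA-F]{64})$"), "root"),
    (re.compile(r"^/v1/beacon/states/slot/([0-9]+)$"), "slot"),
]


def match_state_route(path):
    """Return (kind, value) for the first matching route, or None."""
    for pattern, kind in ROUTES:
        m = pattern.match(path)
        if m:
            return kind, m.group(1)
    return None
```

Because each kind has its own fixed path shape, a cache (or a typed gRPC gateway) can treat each route independently.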
It seems to me that validators are expected to query the beacon nodes for the current fork version in order to sign blocks, for example, in addition to requesting blocks to sign. Isn't that a potential for race conditions if there is a change of fork? Perhaps

And another Q:
@onqtam the fork schedule is published through the configuration so clients will know in advance when forks are scheduled to occur. And yes, genesis_validators_root will be constant once the chain has launched. |
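Since the schedule is published via the configuration, a client can resolve the fork version locally without racing a query at the fork boundary. A sketch (the schedule shape used here is an assumption, not from the spec):

```python
def fork_version_at(epoch, fork_schedule):
    """Resolve the active fork version for an epoch from a published
    schedule: a list of (activation_epoch, version) pairs sorted by
    ascending activation epoch."""
    version = fork_schedule[0][1]
    for activation_epoch, candidate in fork_schedule:
        if epoch >= activation_epoch:
            version = candidate
        else:
            break
    return version
```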
So are we representing byte data in hexadecimal? I thought we were using base64 encoding?
@mpetrunic Can I assume that there is the same duck typing for
Yes. I should probably update the spec to make it clear here: https://ethereum.github.io/eth2.0-APIs/#/Beacon/getStateValidator
Please take a look at the proposed API endpoints (sheet "[WG] Proposal"):
https://docs.google.com/spreadsheets/d/1kVIx6GvzVLwNYbcd-Fj8YUlPf4qGrWUlS35uaTnIAVg/edit#gid=1802603696
Feel free to comment here or in the sheet if you have questions. I would like you to check out the "Outstanding questions" as well.
API endpoints that aren't being contested will be opened here in the form of PRs, where further things like descriptions, validation, response codes, etc. will be discussed.
cc-ing interested parties @paulhauner @hwwhww @arnetheduck @mratsim @terencechain @prestonvanloon @AgeManning @wemeetagain @mkalinin @ajsutton @rolfyone @moles1 @skmgoldin