
[WG] proposal #37

Closed
mpetrunic opened this issue May 14, 2020 · 32 comments · Fixed by #73

Comments

@mpetrunic
Collaborator

Please take a look at the proposed API endpoints (sheet "[WG] Proposal"):
https://docs.google.com/spreadsheets/d/1kVIx6GvzVLwNYbcd-Fj8YUlPf4qGrWUlS35uaTnIAVg/edit#gid=1802603696

Feel free to comment here or in the sheet if you have questions. I would like you to check out "Outstanding questions" as well.

API endpoints that aren't contested will be opened here in the form of PRs, where further details such as descriptions, validation, response codes, etc. will be discussed.

cc-ing interested parties @paulhauner @hwwhww @arnetheduck @mratsim @terencechain @prestonvanloon @AgeManning @wemeetagain @mkalinin @ajsutton @rolfyone @moles1 @skmgoldin

@mpetrunic
Collaborator Author

If somebody wants to help out with writing the OpenAPI spec for the proposed endpoints, please contact me so we don't overlap.

@djrtwo
Contributor

djrtwo commented May 14, 2020

Thank you BN working group!!

I would like you to check out "Outstanding questions" as well.

Yes, in particular please see the open questions. Much of the debate may lie in there.

Also, if there is a particular use case that is not satisfied by this API, or that the API makes overly complicated, please document it here for discussion.

@zah

zah commented May 15, 2020

I realize that we are coming very late to the party, since Nimbus has delayed introducing the beacon-node / validator-client split until now, but planning out our internal design revealed some questions regarding the API.

Within our codebase, we've tried to prepare for the split from the start by making all signing and RANDAO reveal operations asynchronous. But one key difference from the validator API is that we've assigned the responsibilities in a different way - for us, the beacon node decides what is the best time to produce a block or attestation and the validator client is called on-demand only to perform the operations that depend on the validator keys.

With the proposed APIs here, the validator client must keep track of time and it must determine the best moment to request attestations and blocks for signing, but this seems counter-productive for the goal of profit maximisation. As you may know, determining the optimal set of attestations to aggregate is a hard optimisation problem that the beacon node will be much better equipped to solve unless the validator client significantly increases in complexity.

So, I'm wondering, has this different split of responsibilities been considered? Are there any other requirements that conflict with it?

Admittedly, there are some obvious implications for the needed trust between the beacon node and the validator client, but it seems to me that the trust problem was never solved in a satisfying way and also that we must still provide the most optimal solution for validators who will be running their own beacon nodes (without the need to trust any third-parties).

@mcdee
Contributor

mcdee commented May 15, 2020

As you mention, the largest piece of work here is attestation aggregation. The API is designed to support both low-powered validators that may not have the resources to aggregate as well as high-powered validators that want to do the work themselves.

Low-powered validators would GET /v1/validator/aggregate_attestation to fetch an aggregate attestation built by the beacon node. High-powered validators would subscribe to the event stream for attestations and carry out their own aggregation. Both would submit the aggregated attestation with a POST to /v1/validator/aggregate_and_proof.
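
As a rough illustration of the low-powered flow (a sketch only: the base URL, query parameters, and the local signing helper are assumptions, not part of the proposal):

```typescript
// Hypothetical low-powered validator flow: fetch a ready-made aggregate from
// the beacon node, sign it locally, and publish it back.
const BASE = "http://localhost:5052"; // assumed beacon node address

async function aggregateAndPublish(
  slot: number,
  committeeIndex: number,
  sign: (aggregate: unknown) => Promise<unknown>, // key handling stays in the validator client
): Promise<void> {
  // Ask the beacon node for the aggregate it has already built.
  const res = await fetch(
    `${BASE}/v1/validator/aggregate_attestation?slot=${slot}&committee_index=${committeeIndex}`,
  );
  const aggregate = await res.json();

  // Produce the signed aggregate-and-proof locally.
  const signedAggregateAndProof = await sign(aggregate);

  // Publish it back through the beacon node.
  await fetch(`${BASE}/v1/validator/aggregate_and_proof`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(signedAggregateAndProof),
  });
}
```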

Timing is a more generic issue: waiting before proposing a block to potentially put more attestations into it, for example, could result in more profit for the proposer, but it does somewhat fly in the face of the spirit of the validator spec (if not the letter; see the "Block proposal" section, for example, where it states "A validator is expected to propose a SignedBeaconBlock at the beginning of any slot during which is_proposer(state, validator_index) returns True"; you could argue that "beginning" is a period of time rather than a point in time). But no, the proposed API doesn't provide functionality for the beacon node to signal to the validator client the time at which it should carry out a proposal.

All that said, it sounds like you're combining the beacon node and validator into a single process and separating only the signing, so the beacon node<->validator API wouldn't apply to this situation. Or am I misunderstanding your architecture?

@zah

zah commented May 15, 2020

Yes, you are understanding our architecture correctly. The primary motivation for discussing this different split of responsibilities is that, at least for us, it leads to a significant reduction in the code size and complexity of the validator process.

@mcdee
Contributor

mcdee commented May 15, 2020

Chatted to @zah offline and I think this is where we ended up:

  • it would be nice if support for the /validator endpoints, although desired, wasn't mandatory (Nimbus is looking to support these endpoints as long as they don't have a significant impact on their architecture, but is not yet sure it can do so). The clear grouping of endpoints can be an aid here
  • the signals that Nimbus sends to tell the signer it is time to do some signing could be considered extensions of the proposed event stream mechanism. Although we do not want to overburden the mandatory functionality required for API conformance, having somewhere that such extensions could be recorded would be useful

@zah please correct me if any of the above is incorrect.

@zah

zah commented May 15, 2020

We are committed to supporting the /validator endpoints. What is not completely certain is whether our "simpler" approach will remain, and whether it will be recommended as a more robust solution that could enhance Nimbus's profit maximisation algorithms.

@paulhauner
Contributor

has this different split of responsibilities been considered?

We did consider this in some detail. I can't quite remember all the details, however the one that stands out is allowing VCs to switch between BNs. If you have timing logic inside the VC then it can determine whether the BN is actually doing its job or not.

If the VC requires the BN to call it then, in the most naive implementation, if the BN goes down the VC just sits there doing nothing instead of logging errors or perhaps switching to another BN. Of course, you can get around this by implementing some logic in the VC to figure out whether or not it's been prodded recently... but this is timing logic, and when implemented in its entirety it roughly ends up replicating the timing logic you would have implemented in the BN.
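
A rough sketch of what that VC-side timing/fallback logic might look like (the health endpoint, slot duration, and duty-fetching details are assumptions for illustration, not part of the proposal):

```typescript
// Hypothetical failover: the validator client owns the clock, so it can notice
// a dead beacon node and switch to a backup instead of waiting to be prodded.
const beaconNodes = ["http://bn-primary:5052", "http://bn-backup:5052"];

async function firstHealthyNode(): Promise<string> {
  for (const node of beaconNodes) {
    try {
      const res = await fetch(`${node}/v1/node/health`, { signal: AbortSignal.timeout(2000) });
      if (res.ok) return node;
    } catch {
      // Unreachable or timed out; try the next node.
    }
  }
  throw new Error("no healthy beacon node available");
}

setInterval(async () => {
  const node = await firstHealthyNode();
  // ...fetch duties / attestation data from `node` for the current slot...
}, 12_000); // assuming 12-second slots
```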

determining the optimal set of attestations to aggregate is a hard optimisation problem

I agree. This logic is done in the BN for us.

@prestonvanloon
Contributor

cc @rauljordan.

At @prysmaticlabs, we are generally not in favor of any duck typing or otherwise ambiguous object unique identifiers.

@mpetrunic
Collaborator Author

cc @rauljordan.

At @prysmaticlabs, we are generally not in favor of any duck typing or otherwise ambiguous object unique identifiers.

There is only duck typing between slot and root, but you can always identify a root by its "0x" prefix, so I don't think anyone should have a problem coding that in any language.

@prestonvanloon
Contributor

prestonvanloon commented May 18, 2020

There is only duck typing between slot and root, but you can always identify a root by its "0x" prefix, so I don't think anyone should have a problem coding that in any language.

This isn't a question of whether or not we could implement it this way, but whether we ought to implement it this way.

There are 5 different ways to handle a blockId in the current proposal.

"path:blockId" "head"" | ""genesis"" | ""finalized"" | <slot> | <blockRoot>"

While there are only 5 different ways to interpret request data today, more could likely be added later, which would increase the burden on maintainers.

We would advocate for a leaner, clearer, and simpler API which doesn't set a precedent for more ambiguity. BlockId should have a single, clear interpretation.

Edit: Likewise with all object IDs.

@mcdee
Contributor

mcdee commented May 18, 2020

BlockId should have a single, clear interpretation.

@prestonvanloon If we go for a single representation I think we have three options:

  1. pick a single canonical ID, drop the rest
  2. create more endpoints for the separate IDs
  3. have no canonical ID, use query parameters

The first option would reduce functionality, so I'd assume that either 2) or 3) would be the available choices. Which of these are you considering, or are you thinking of something else?

@mpetrunic
Collaborator Author

There are 5 different ways to handle a blockId in the current proposal.

"path:blockId" "head"" | ""genesis"" | ""finalized"" | <slot> | <blockRoot>"

I would argue that head, genesis, and finalized aren't params; they are separate endpoints. Sort of like the popular /users/me endpoint.

@prestonvanloon
Contributor

prestonvanloon commented May 18, 2020

@mcdee I believe the original proposal from @djrtwo and @protolambda had solved this.

Here are a few scenarios I can think of:

Retrieve head, finalized, previous justified, current justified block

GET /beacon/forkchoice/head

```json
{
  "head_block_root": "...",
  "finalized_block_root": "...",
  "previous_justified_block_root": "...",
  "current_justified_block_root": "..."
}
```

Now I have the required information to request the particular block by root.

Retrieve block by slot

Given that there may exist multiple blocks per slot in the event of a fork, we could use something like

GET /v1/beacon/block_root?slot={slot} or GET /v1/beacon/block_root/{slot}

This would return an array of block roots, or block root objects with a "canonical" property, which could then be used with the single-block retrieval method.

GET /v1/beacon/blocks/{block_root}
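
For illustration, the two-step lookup could look roughly like this (a sketch only; the response shape with root/canonical fields is an assumption based on the description above):

```typescript
// Hypothetical two-step lookup: slot -> block roots -> block by root.
async function getCanonicalBlockAtSlot(base: string, slot: number): Promise<unknown> {
  // Step 1: resolve the slot to the block roots known for it.
  const rootsRes = await fetch(`${base}/v1/beacon/block_root?slot=${slot}`);
  const roots: { root: string; canonical: boolean }[] = await rootsRes.json();

  const canonical = roots.find((r) => r.canonical);
  if (!canonical) return null; // empty slot, or no canonical block known

  // Step 2: fetch the block itself by its root.
  const blockRes = await fetch(`${base}/v1/beacon/blocks/${canonical.root}`);
  return blockRes.json();
}
```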

@mpetrunic

I would argue that head, genesis, and finalized aren't params; they are separate endpoints. Sort of like the popular /users/me endpoint.

This is basically my point. This proposal is asking implementation teams to add and maintain multiple endpoints/routes which could be solved by one or two.

@mpetrunic
Collaborator Author

@prestonvanloon

GET /beacon/forkchoice/head

How would you get forkchoice for some past state? /beacon/forkchoice/<stateRoot>?

Now I have the required information to request the particular block by root.

You are asking users to read and understand a bunch of documentation to figure out how to get the finalized block or some other piece of data, all while making a bunch of requests. The whole point of a REST API is to have conventions for how data is represented and to make it easy to obtain. I bet even eth2 developers use fetch-by-slot to investigate what's going on.

This is basically my point. This proposal is asking implementation teams to add and maintain multiple endpoints/routes which could be solved by one or two.

I hardly think this is an issue here. I doubt it takes more than an hour to create those alias endpoints, and it will make the API a lot friendlier for new users/developers to interact with...

@prestonvanloon
Contributor

How would you get forkchoice for some past state?

Get that state and look.

  • state.previous_justified_checkpoint.root
  • state.current_justified_checkpoint.root
  • state.finalized_checkpoint.root
  • state.block_roots[state.slot%SLOTS_PER_HISTORICAL_ROOT]

While the last point may not capture the node's view at the time that state was created, I don't think anyone is advocating for historical fork choice data, so it should be adequate information.

You are asking users to read and understand a bunch of documentation to figure out how to get the finalized block or some other piece of data, all while making a bunch of requests. The whole point of a REST API is to have conventions for how data is represented and to make it easy to obtain. I bet even eth2 developers use fetch-by-slot to investigate what's going on.

Do we expect to support users that don't or won't read documentation?

I hardly think this is an issue here. I doubt it takes more than an hour to create those alias endpoints, and it will make the API a lot friendlier for new users/developers to interact with...

While it might only take an hour to create, an implementation team then has to support and maintain it.
How many convenience methods do we allow when the use cases are already covered by a minimal API?

@paulhauner
Contributor

Regarding the complexity of "head" | "genesis" | "finalized" | "justified" | <slot> | <stateRoot>: as others have suggested, I think we can view each of these tags as an alias for a root; once we've resolved the alias, all queries are just targeting a state root.

To help myself think through it, I sketched out the flow for a /v1/beacon/states endpoint (hopefully the match statement is easy enough to interpret):

```rust
fn resolve_alias(alias: &str) -> Hash256 {
  match alias {
    "head" => get_head_state_root(),
    "genesis" => get_genesis_state_root(),
    "finalized" => get_finalized_state_root(),
    "justified" => get_justified_state_root(),
    other => {
      if other.starts_with("0x") {
        // Already a state root: parse the hex string.
        other.parse().expect("valid state root")
      } else {
        // Otherwise treat it as a slot and look up the state root for it.
        get_state_by_slot(other.parse().expect("valid slot"))
      }
    }
  }
}
```

So we have:

  • get_head_state_root(): pretty simple, would probably need this in any case.
  • get_genesis_state_root(): we already store the genesis block root on our BeaconChain, could just add the state root too.
  • get_finalized_state_root(): a bit funkier, I've sketched it below.
  • get_justified_state_root(): also a bit funky, probably the same as the finalized case though.
  • get_state_by_slot(): pretty straightforward, we already have this functionality. I suspect other clients would too.

```rust
fn get_finalized_state_root() -> Hash256 {
  // State reads are costly; hopefully clients can optimize around this.
  let head_state = get_state(get_head_state_root());
  // There are potentially extra state reads if the distance to finality
  // exceeds state.state_roots.len().
  head_state.get_state_root(start_slot(head_state.finalized_checkpoint.epoch))
}
```

This doesn't seem like a massive burden to me, but perhaps I'm missing something. So, I guess we have the pros/cons of aliases:

Pros:

  • Easier for users to figure out how to get things that have an alias.
  • Fewer API calls for users who want an aliased thing.

Cons:

  • More work to maintain.

I don't feel super strongly about the aliases; I don't consider them a huge burden. However, if I were designing this I would:

  • Drop the genesis alias since it's exactly equivalent to slot = 0 (assuming no one has a non-zero genesis slot, out-of-scope IMO).
    • That being said, an alias that's exactly equivalent to some other thing is basically no maintenance burden so I could 100% take it or leave it here.
  • Drop the justified alias since I don't think users will frequently care about justification. Those that do can deal with an extra call to head first. If it turns out to be super useful, we add it later.

In summary, this API looks fine to me (thanks to those who put the effort in to produce it). I don't see aliases as a large maintenance overhead, but I also don't feel super strongly about keeping them either. I'm happy to follow the crowd here. If you want a firm statement from me I can do it, but I suspect another chef in the kitchen isn't necessarily helpful.

@mpetrunic
Collaborator Author

@paulhauner I would just mention that the justified/finalized state roots are present in forkchoice, which could be an option to avoid expensive state reads ^^

@mkalinin

There is a No duck typing tab at the end of the tab list in the document. It's an example of beacon chain data endpoints with strict object identifiers. In some cases it will result in an extra round trip to request the same data compared to the "duck typing" approach.

The other concern regarding "duck typing" endpoints is that HTTP caching policy will have to be set depending on the request parameters. For example:

  • /v1/beacon/blocks/head -- MUST NOT be cached
  • /v1/beacon/blocks/{slot} -- MUST NOT be cached
  • /v1/beacon/blocks/genesis -- MAY be cached
  • /v1/beacon/blocks/{blockRoot} -- MAY be cached
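
As a sketch of how that policy could be applied (Express-style routing; the handler and block lookup are placeholders, not part of the proposal):

```typescript
import express from "express";

const app = express();

// Placeholder lookup; a real server would resolve the blockId against its store.
function getBlock(blockId: string): unknown {
  return { blockId };
}

app.get("/v1/beacon/blocks/:blockId", (req, res) => {
  const { blockId } = req.params;
  // head and slot-based lookups can change under reorgs; genesis and root
  // lookups are immutable once known.
  const mutable = blockId === "head" || /^\d+$/.test(blockId);
  res.set("Cache-Control", mutable ? "no-store" : "public, max-age=31536000, immutable");
  res.json(getBlock(blockId));
});
```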

@mcdee
Contributor

mcdee commented May 19, 2020

Drop the genesis alias since it's exactly equivalent to slot = 0

Not to derail the duck typing debate, but this brings to mind another question: what are we asking for when we want the state at slot x? We could be asking for the state at the start of slot x, or at the end of slot x.

There are arguments for both, of course, but we should pick one, as it has implications. For example, Paul's statement that I quoted above only holds true if we use "start of slot". I think that "end of slot" (or epoch) is the more intuitive option, but if there are any strong arguments for "start of slot" it would be worth hearing them.

@mpetrunic
Collaborator Author

Not to derail the duck typing debate, but this brings to mind another question: what are we asking for when we want the state at slot x? We could be asking for the state at the start of slot x, or at the end of slot x.

I would expect the state where state.slot is equal to the requested slot.

@terencechain

Having the genesis alias is nice because when phase 1 rolls in, the shard chain's genesis slot will most likely not start at slot 0. Just my 2c.

@rauljordan

Hi @paulhauner, we absolutely need the genesis alias because otherwise we have no way of differentiating between a request that wants slot = 0 and a request that didn't pass the slot query param at all in Golang.

@paulhauner
Contributor

Having the genesis alias is nice because when phase 1 rolls in, the shard chain's genesis slot will most likely not start at slot 0. Just my 2c.

Makes sense, I didn't consider that :)

Hi @paulhauner, we absolutely need the genesis alias because otherwise we have no way of differentiating between a request that wants slot = 0 and a request that didn't pass the slot query param at all in Golang.

I'm happy for genesis to stay, but I would be hesitant to shape the API around the assumption that servers are unable to determine whether a value is uninitialized or set to whatever their default value is.

We have validators and committees identified by 0, would they also need an alias?

@paulhauner
Contributor

We could be asking for the state at the start of slot x, or at the end of slot x.

I would define it as block_at_slot(slot).state_root, where block_at_slot(slot).slot == slot and block_at_slot(slot) is always in the canonical chain.

It's not clear to me what a "state at the start of slot x" is. Right before you process a block you advance the pre-state to the current slot, but this happens inside state_transition, so I'd say it's an intermediate value that is of no use to the user. We don't even store that state; I would be surprised if other clients do.

Regarding epoch queries, I would advocate for using compute_start_slot_at_epoch, since there's already a precedent that state.finalized_checkpoint.epoch refers to the first slot in the epoch.
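
For reference, that resolution is trivial (a sketch; SLOTS_PER_EPOCH is 32 on mainnet per the spec constant):

```typescript
const SLOTS_PER_EPOCH = 32;

// Mirrors the spec's compute_start_slot_at_epoch: an epoch query resolves to
// the first slot of that epoch.
function computeStartSlotAtEpoch(epoch: number): number {
  return epoch * SLOTS_PER_EPOCH;
}
```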

@rolfyone
Collaborator

I've commented on the spreadsheet, but I'll reiterate here:
/v1/beacon/blocks/{blockId} - it would be nice to have the block root at the top level as well: it's currently offered by Lighthouse and Prysm, and there's a PR open against Teku, because it simplifies debugging.

/v1/validator/attestation_data
/v1/validator/aggregate_attestation
/v1/validator/blocks/{slot}
In the spreadsheet at least, these seem to be the only endpoints not wrapping their response object with 'data'. I actually prefer not having the wrapping object, but either way we should probably be consistent unless there's a really good reason not to be?

@protolambda

The state-id format is still painful, and I think splitting it into multiple routes is not a bad idea. Here's a proposal:

In paths

I understand that having extra routes is not pretty, but it is essential for some tech stacks (Go with gRPC versions of the API in particular), so I would go for the following compromise: we spec the state ID as a "state ID path segment". Every time we would use a state ID in a path, the path segment can consist of multiple segments, the first always being a string. The same applies to other options that end up in the path as a required argument with different types.

If you really want to cover all options in one API handler (JS and Python can likely do something like that elegantly), you can, by just matching multiple routes to the same handler.

Take state ID as an example:

stateId can be any of:

  • state_root/{hex string}
  • slot/{slot decimal number}
  • head
  • genesis
  • finalized
  • justified

E.g. when we spec /v1/beacon/states/{stateId}/committees/{epoch}, it can be:

  • /v1/beacon/states/state_root/0xabc123def...45deadbeef/committees/{epoch}
  • /v1/beacon/states/head/committees/{epoch}

ExpressJS and more advanced routers support matching multiple patterns for the same endpoint, and then store the path params in a special dictionary anyway, so there's no problem with typed arguments or default values there, regardless of approach.

This way, languages that want to type the root, slot, validator index, etc. can do so, by simply routing them to different endpoints. Meanwhile, the spec stays clean, and the other languages can code-golf the API definition to be as minimal as they want.

Also, this solves the HTTP caching problem, since the /head path is now nicely distinct from the others, making cache configuration easier.
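
A minimal sketch of the multi-route approach in Express (route patterns and handler internals are illustrative assumptions only):

```typescript
import express from "express";

const app = express();

// One handler serves all the typed routes; exactly one of the params is set,
// depending on which pattern matched.
function committeesHandler(req: express.Request, res: express.Response): void {
  const { stateRoot, slot, alias, epoch } = req.params;
  res.json({ stateRoot, slot, alias, epoch }); // placeholder response
}

app.get("/v1/beacon/states/state_root/:stateRoot/committees/:epoch", committeesHandler);
app.get("/v1/beacon/states/slot/:slot/committees/:epoch", committeesHandler);
app.get("/v1/beacon/states/:alias(head|genesis|finalized|justified)/committees/:epoch", committeesHandler);
```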

In queries

Right now the API proposal has no case of a mixed input type in a single query param. If a param can have two types (e.g. a block root and a slot), then it should be part of the path. If the two types are not exclusive, then there should be two separate query params.

@onqtam

onqtam commented May 22, 2020

It seems to be expected that validators have to query the beacon node for the current fork version in order to sign blocks, for example - in addition to requesting the blocks to sign - isn't that a potential source of race conditions if there is a change of fork? Perhaps /v1/validator/blocks/{slot}, which returns { block: BeaconBlock }, should return the fork as well. Maybe such concerns apply to other parts of the API too.

And another Q: genesis_validators_root should be unaffected because it should never change, correct? It's basically going to be set in stone... forever - after Eth 2.0 is launched...?

@mcdee
Contributor

mcdee commented May 22, 2020

@onqtam the fork schedule is published through the configuration, so clients will know in advance when forks are scheduled to occur.

And yes, genesis_validators_root will be constant once the chain has launched.

@rolfyone
Collaborator

rolfyone commented Jun 4, 2020

So, are we representing byte data in hexadecimal? I thought we were using base64 encoding?

@onqtam

onqtam commented Jun 19, 2020

There is only duck typing between slot and root, but you can always identify a root by its "0x" prefix, so I don't think anyone should have a problem coding that in any language.

@mpetrunic Can I assume that there is the same duck typing for <publicKey> | <index> in /v1/beacon/states/{stateId}/validators/{validatorId} - meaning that I can use the "0x" prefix to distinguish between the two?

@mpetrunic
Collaborator Author

There is only duck typing between slot and root, but you can always identify a root by its "0x" prefix, so I don't think anyone should have a problem coding that in any language.

@mpetrunic Can I assume that there is the same duck typing for <publicKey> | <index> in /v1/beacon/states/{stateId}/validators/{validatorId} - meaning that I can use the "0x" prefix to distinguish between the two?

Yes. I should probably update the spec to make that clear here, as I did for the state_id param: https://ethereum.github.io/eth2.0-APIs/#/Beacon/getStateValidator
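
A minimal sketch of that convention (hypothetical helper; a "0x"-prefixed validatorId is treated as a BLS public key, anything else as a numeric index):

```typescript
type ValidatorId = { pubkey: string } | { index: number };

function parseValidatorId(id: string): ValidatorId {
  // Public keys are hex-encoded with a "0x" prefix; indices are plain decimals.
  return id.startsWith("0x") ? { pubkey: id } : { index: Number(id) };
}
```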
