Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Engine API subset for the Merge Interop #74

Merged
merged 14 commits into from
Sep 24, 2021

Conversation

mkalinin
Copy link
Collaborator

@mkalinin mkalinin commented Sep 20, 2021

Introduces Engine API subset for the Merge Interop


1. Given the `payloadId` client software **MUST** respond with the most recent version of the payload that is available in the corresponding building process at the time of receiving the call.

2. The call **MUST** be responded with error code `2` (`Action not allowed`) if a building process identified by the `payloadId` doesn't exist.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't an obvious response. Could we add a new error code 4 (Unknown ID) to make this error explicit?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not sure what is the rule of thumb of adding new error codes and just picked the one that already exists with the following description:

Should be used when some action is not allowed, e.g. preventing an action, while another depending action is processing on, like sending again when a confirmation popup is shown to the user

Do you know if each custom error code supposed to be used only in one place or semantically similar places of the client software?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not so much "not allowed" as "not possible" (due to the ID being unknown), so although the best of the current options it isn't a great answer.

I don't know what the process is to add to error codes, but I would certainly try to do so rather than overload an existing error code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added 4: Missing resource and used it here and in a couple of other places.


#### Specification

1. Client software **MUST** validate the payload according to the execution environment rule set defined in the [EIP-3675](https://eips.ethereum.org/EIPS/eip-3675#specification) and respond with the validation result.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this link to the rules within the specification?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These rules are written throughout the specification section of the EIP. Did you mean to list sections of the EIP that are related to the payload execution?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either that or have a consolidated list. Given the validation is a "must", it should be clear what the validation rules are.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to:

  1. Client software MUST validate the payload according to the execution environment rule set with modifications to this rule set definted in the Block Validity section of EIP-3675 and respond with the validation result.


3. Client software **MUST** discard the payload if it's deemed invalid.

4. The call **MUST** be responded with `SYNCING` status while the sync process is in progress and thus the execution cannot yet be validated.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would assume that in this situation the payload is discarded by the execution engine, but this should be made clear.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would defer the decision of whether to keep or discard the payload in this case to the execution client. As if the execution client is being synced then it rather store all the payloads and then prune the storage upon receiving finalized block as current sync proposal depends on finalized checkpoints and in advance it is unknown which one is already finalized or will be finalized soon. Also, execution clients might want to store blocks of the finalized chain and prune the others.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my concern here is that the caller ends up in limbo at current regarding the state of the payload. It may be discarded, it may be kept until the node is synced and then either integrated in to the chain or dropped, it may be kept only if the node is nearly synced and then broadcast if integrated in to the chain, etc. Seems like the safest and clearest option would be for the node to discard the payload if it cannot process it immediately.

(Could storing the blocks be a potential DoS vector, where a malicious attacker would flood a syncing node with future payloads the node needs to hold around in case they are used?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consensus client should verify the beacon block and send the corresponding consensusValidated message disregarding the sync status of the execution client. If consensusValidated deems the payload as invalid then it must be discarded as it is stated in the specification to the method below. During the sync all non-canonical forks may still be pruned upon receiving forkchoiceUpdated. I think execution client should be free to decide how to handle these calls while in a sync mode and rely on consensus client in sorting out invalid blocks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it is valuable to continue to pass these blocks into the execution-engine even during SYNCING because the execution engine (depending on sync method) will often need more current blocks to be able to sufficiently finish sync.

Eventually, the beacon chain will pass in a payload that is VALID or INVALID instead of SYNCING which is a signal that sync is completed and payloads are being fully validated

@MariusVanDerWijden
Copy link
Member

I've started updating our el implementation of it to this spec here: ethereum/go-ethereum#23607

- `finalizedBlockHash`: `DATA`, 32 Bytes - block hash of the most recent finalized block

#### Returns
none
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the el engine does not have the finalizedBlockHash in db?
Previously SetHead returned Success:False for that

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I think then the sync process should be initiated (if it hasn't been previously, this could be the first message after a start either with a fresh state or after a long period of being out) and the corresponding SYNCING response should be sent back. WDYT?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @fjl

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about the design of the forkchoiceUpdated for a little bit. IMO, it should be designed as a notification hence no response is expected. The same should be applicable to the consensusValidated.

I also think that we better use executePayload as the only method related to sync process if we want sync to piggy back on the core methods and responses to them. A slight improvement here could be an introduction of another type of response to the executePayload which provides more detail on the sync process.

src/engine/interop/specification.md Outdated Show resolved Hide resolved
src/engine/interop/specification.md Show resolved Hide resolved

3. Client software **MUST** discard the payload if it's deemed invalid.

4. The call **MUST** be responded with `SYNCING` status while the sync process is in progress and thus the execution cannot yet be validated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it is valuable to continue to pass these blocks into the execution-engine even during SYNCING because the execution engine (depending on sync method) will often need more current blocks to be able to sufficiently finish sync.

Eventually, the beacon chain will pass in a payload that is VALID or INVALID instead of SYNCING which is a signal that sync is completed and payloads are being fully validated

src/engine/interop/specification.md Outdated Show resolved Hide resolved
src/engine/interop/specification.md Outdated Show resolved Hide resolved
Co-authored-by: Danny Ryan <dannyjryan@gmail.com>
Co-authored-by: lightclient <14004106+lightclient@users.noreply.github.com>

2. The call **MUST** be responded with `4: Missing resource` error if the building process identified by the `payloadId` doesn't exist.

3. Client software **MAY** stop the corresponding building process after serving this call.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the engine respond to getPayload twice if called twice with the same payloadID?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should respond with error if the process doesn't exist. So, if engine kills the process after serving the getPayload for the first time then it should respond with error, if engine relies on the expiration by timeout and doesn't kill the process on getPayload then it may respond as usual. One call for this method should be pretty much enough for the consensus client unless there is an edge case we're overlooking.


5. Client software **SHOULD** respond with `2: Action not allowed` error if the sync process is in progress.

6. Client software **MUST** respond with `4: Missing resource` error if the parent block is unknown.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SHOULD verb should be used instead of a MUST for the same reason as described above

Comment on lines +14 to +15
Client software **MUST** expose Engine API at a port independent from JSON-RPC API. The default port for the Engine API is 8550 for HTTP and 8551 for WebSocket.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the 8550/1 port only serve engine endpoints or are we thinking it will offer a dedicated channel for the consensus engine to fetch from the necessary eth_* endpoints as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is your main concern that a client serving normal traffic on 8545 might become overloaded with traffic potentially causing consensus-critical requests to be dropped/delayed?

#### Parameters
1. `Object` - The payload attributes:
- `parentHash`: `DATA`, 32 Bytes - hash of the parent block
- `timestamp`: `QUANTITY`, 64 Bits - value for the `timestamp` field of the new payload
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to the api - is there a concern with the beacon node asking the execution engine to prepare a payload with a timestamp that would be considered invalid under old consensus rules?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timestamp field of the payload is set to the slot.timestamp and this condition is enforced by validation conditions on the beacon chain, so the timestamp value is enforced to be aligned with the timestamp of a slot. Beacon chain has strict time restrictions and validations in the fork choice and gossip. So, I think that execution clients should blindly follow the consensus clients with respect to the timestamp validation, i.e. don't validate timestamps at all to avoid edge case bugs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point, btw! Thanks for raising it 👍


This structure maps on the [`ExecutionPayload`](https://github.com/ethereum/consensus-specs/blob/dev/specs/merge/beacon-chain.md#ExecutionPayload) structure of the beacon chain spec. The fields are encoded as follows:
- `parentHash`: `DATA`, 32 Bytes
- `coinbase`: `DATA`, 20 Bytes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `coinbase`: `DATA`, 20 Bytes
- `feeRecipient`: `DATA`, 20 Bytes

Thoughts?

Copy link
Collaborator Author

@mkalinin mkalinin Sep 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not opposed to this renaming. If we accept this then the respective change must be done to the beacon chain spec. The feeRecipient name deviates from the geth codebase which names this field as Coinbase, it also deviates from JSON-RPC API which names the same field as miner (in the PoS the latter name doesn't make any sense and we might want to update the JSON-RPC API).

cc @MicahZoltu @djrtwo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mixed. Seems easier to not go in and adjust it on all clients but... don't have a dog in this race

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we state "The coinbase field value MAY deviate from what is specified by the feeRecipient parameter." in another place. So we would need to be clearer here maybe "payload.feeRecipient field value.... from what is specified by the feeRecipient call parameter"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have always hated Coinbase as it is totally non-descriptive. The only thing we know for sure about this address is that it is the address that the consensus client is suggesting should receive fees. Because of this, I favor suggestedFeeRecipient to make it very clear to the reader that this is just a suggestion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I thought this was for the API. I agree with feeRecipient here, and I think the API method should have suggestedFeeRecipient. That will address @djrtwo's concern about name collisions, and also provide the most descriptive variable names possible (important for specifications IMO).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest we continue this discussion in a separate issue/PR or discord

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let's take it there. I think the terms are fine for this initial interop release.

@paulhauner paulhauner mentioned this pull request Sep 23, 2021
7 tasks
Copy link
Contributor

@djrtwo djrtwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Still a few qustions to answer but this will serve us well for interop

@lightclient lightclient merged commit 730a942 into ethereum:main Sep 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants