Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checkpoint Sync 5/5 - end-to-end test #10387

Closed
wants to merge 28 commits into from
Closed

Conversation

kasey
Copy link
Contributor

@kasey kasey commented Mar 17, 2022

What type of PR is this?

Other

What does this PR do? Why is it needed?

This PR adds a step to the existing end-to-end test to perform retrieve a checkpoint state+block from one of the running nodes once all nodes are in their final synced state.

Which issues(s) does this PR fix?

Fixes #

Other notes for review

Due to the way the checkpoint sync period is calculated and the small number of epochs in the end-to-end, this test will currently always fetch the genesis state. It would probably be better for it to fetch HEAD-1, but that would require modifying ws calculation. Alternatively we could tweak the parameters of the checkpoint sync algorithm (like the safety coefficient) to try to shorten the interval to just an epoch or so, but I need to look at the math again to check how many epochs e2e would need to run for this to be feasible.

@kasey kasey requested a review from a team as a code owner March 17, 2022 04:42
@kasey kasey requested review from jmozah, rauljordan, uncdr, a team, prestonvanloon, potuz and terencechain and removed request for a team, jmozah, rauljordan, uncdr, prestonvanloon, potuz and terencechain March 17, 2022 04:42
@kasey kasey force-pushed the 4-checkpoint-sync-runtime branch from 5be0fa9 to a4e0afd Compare March 18, 2022 18:20
@kasey kasey force-pushed the 5-checkpoint-sync-endtoend branch from 164b6c7 to 34571cd Compare March 18, 2022 18:21
@michaelsproul
Copy link

I just built Prysm using this branch and tried to checkpoint sync against a Lighthouse node, but it failed due to the absence of the /eth/v1/beacon/weak_subjectivity endpoint.

[2022-03-23 14:39:04]  INFO requesting http://localhost:7052/eth/v1/beacon/weak_subjectivity
[2022-03-23 14:39:05] ERROR main: Error retrieving checkpoint origin state and block: unexpected API response for prysm-only weak subjectivity checkpoint API: code=405, url=http://localhost:7052/eth/v1/beacon/weak_subjectivity, body=response body:
{"code":405,"message":"METHOD_NOT_ALLOWED","stacktraces":[]}: did not receive 2xx response from API

I noticed that you try to detect a 404 from the BN and fallback to a backwards-compatible impl if it doesn't exist, but Lighthouse returns a 405 for non-existent HTTP methods (this is our bad, I've opened an issue here to track that: sigp/lighthouse#3112). It might be cool to handle the 405 in the meantime 🙏

After trying Lighthouse I tried Teku via Infura, and got a 401 Unauthorized, which I think is due to Prysm not passing along the project_id:project_secret (username and password) from the first part of the URL:

./beacon-chain --checkpoint-sync-url "https://project_id:project_secret@eth2-beacon-mainnet.infura.io"
...
[2022-03-23 14:47:42]  INFO requesting http://eth2-beacon-mainnet.infura.io/eth/v1/beacon/weak_subjectivity
[2022-03-23 14:47:45] ERROR main: Error retrieving checkpoint origin state and block: unexpected API response for prysm-only weak subjectivity checkpoint API: code=401, url=https://eth2-beacon-mainnet.infura.io:443/eth/v1/beacon/weak_subjectivity, body=response body:
project id required
: did not receive 2xx response from API

@kasey kasey force-pushed the 4-checkpoint-sync-runtime branch from a4e0afd to d17c1c7 Compare March 25, 2022 14:42
@kasey kasey force-pushed the 4-checkpoint-sync-runtime branch 3 times, most recently from feb5120 to c1a8f56 Compare March 28, 2022 18:26
@delete-merged-branch delete-merged-branch bot deleted the branch develop March 28, 2022 21:02
@rauljordan
Copy link
Contributor

Hi @kasey curious about the status of this one and if there's anything I can do to help

@kasey
Copy link
Contributor Author

kasey commented Apr 5, 2022

I just built Prysm using this branch and tried to checkpoint sync against a Lighthouse node, but it failed due to the absence of the /eth/v1/beacon/weak_subjectivity endpoint.

[2022-03-23 14:39:04]  INFO requesting http://localhost:7052/eth/v1/beacon/weak_subjectivity
[2022-03-23 14:39:05] ERROR main: Error retrieving checkpoint origin state and block: unexpected API response for prysm-only weak subjectivity checkpoint API: code=405, url=http://localhost:7052/eth/v1/beacon/weak_subjectivity, body=response body:
{"code":405,"message":"METHOD_NOT_ALLOWED","stacktraces":[]}: did not receive 2xx response from API

I noticed that you try to detect a 404 from the BN and fallback to a backwards-compatible impl if it doesn't exist, but Lighthouse returns a 405 for non-existent HTTP methods (this is our bad, I've opened an issue here to track that: sigp/lighthouse#3112). It might be cool to handle the 405 in the meantime pray

After trying Lighthouse I tried Teku via Infura, and got a 401 Unauthorized, which I think is due to Prysm not passing along the project_id:project_secret (username and password) from the first part of the URL:

./beacon-chain --checkpoint-sync-url "https://project_id:project_secret@eth2-beacon-mainnet.infura.io"
...
[2022-03-23 14:47:42]  INFO requesting http://eth2-beacon-mainnet.infura.io/eth/v1/beacon/weak_subjectivity
[2022-03-23 14:47:45] ERROR main: Error retrieving checkpoint origin state and block: unexpected API response for prysm-only weak subjectivity checkpoint API: code=401, url=https://eth2-beacon-mainnet.infura.io:443/eth/v1/beacon/weak_subjectivity, body=response body:
project id required
: did not receive 2xx response from API

Hi @michaelsproul, thanks for trying this out! I opened a bug last week on the basic auth issue #10467 and plan to handle 405 equivalent to 404. Sorry this PR got a little stalled, I had to take some personal time off in the last week - I will ping you when these issues are resolved.

@kasey kasey force-pushed the 5-checkpoint-sync-endtoend branch from 34571cd to 23baedc Compare April 5, 2022 17:50
@kasey kasey changed the base branch from 4-checkpoint-sync-runtime to develop April 5, 2022 18:09
@kasey kasey force-pushed the 5-checkpoint-sync-endtoend branch from 39c2df1 to 03fba41 Compare May 3, 2022 19:29
@kasey kasey mentioned this pull request May 4, 2022
@kasey
Copy link
Contributor Author

kasey commented May 26, 2022

@michaelsproul we just merged PR #10723 which significantly changes the way our checkpoint sync works. Instead of trying to fetch the block from the beginning of the latest weak subjectivity period, we simply fetch the latest finalized block, working on the assumption that this is preferable in terms of sync speed. This is also more compatible with Infura, which fails to serve the state-by-slot request we need to accompany the checkpoint block, but does a fine job fetching genesis and finalized states + block by root. I was convinced by Adrian Sutton that this is generally just as safe as the other method given that users can confirm the fetched roots are canonical using a trusted beacon API. Please give this a try and let me know how it works for you.

@kasey
Copy link
Contributor Author

kasey commented May 26, 2022

Closing this PR as checkpoint sync behavior changed heavily in #10723 which includes its own e2e test.

@kasey kasey closed this May 26, 2022
@terencechain terencechain deleted the 5-checkpoint-sync-endtoend branch May 13, 2024 04:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants