Out of Bounds Crash Shortly After Adding a New Validator Key #3463

Closed
jclapis opened this issue Mar 4, 2022 · 1 comment
@jclapis
Contributor

jclapis commented Mar 4, 2022

Describe the bug
I added a new validator key and restarted Nimbus to pick it up. After about an hour, Nimbus crashed unexpectedly with the following "stack trace", if you want to call it that:

...
INF 2022-03-04 00:50:40.261+00:00 State replayed                             topics=\"chaindag\" blocks=1 slots=0 current=7264f2be:3294251 ancestor=b2f97a65:3294239@3294240 target=8659bc8a:3294240 ancestorStateRoot=c40fa442 targetStateRoot=be8f7b50 found=false assignDur=619ms147us338ns replayDur=330ms164us207ns
[[reraised from:
]]
[[reraised from:
]]
[[reraised from:
]]
[[reraised from:
]]
[[reraised from:
]]
Error: unhandled exception: index <XXX> not in 0 .. 303812 [IndexError]

where <XXX> is the index that was assigned to the validator I just added, and is a number greater than 303812 (the bounds of whatever validator index array it was checking at the time). I've abstracted it for privacy here. Even though it's still in pending_initialized status, a call to eth/v1/beacon/states/finalized/validators/XXX successfully returned its info and I confirmed that it's the one I added.

After Nimbus restarted itself, it has been running for 7 hours without encountering any further issues.

To Reproduce
Steps to reproduce the behavior:

  • Make a new validator, restart Nimbus to pick up the new key, wait an hour and see if it crashes I suppose? Not really sure how to repro this...
  1. Platform details (OS, architecture): Ubuntu Server 20.04.4 arm64 (Raspberry Pi image)
  2. Branch/commit used: v1.7.0, using the Docker image from Docker Hub
  3. Commands being executed: /home/user/nimbus-eth2/build/nimbus_beacon_node --non-interactive --enr-auto-update --network=mainnet --data-dir=/ethclient/nimbus --tcp-port=9001 --udp-port=9001 --web3-url=ws://eth1:8546 --web3-url=ws://eth1-fallback:8546 --rest --rest-address=0.0.0.0 --rest-port=5052 --insecure-netkey-password=true --validators-dir=/validators/nimbus/validators --secrets-dir=/validators/nimbus/secrets --num-threads=0 --doppelganger-detection=false --max-peers=100 --metrics --metrics-address=0.0.0.0 --metrics-port=9100 --nat=extip:<my ip> --graffiti=<my graffiti>
  4. Relevant log lines: See above
@arnetheduck
Member

This happens when requesting validator information via the REST API for a specific validator: first using a state where the validator was active, then making another request, this time a historical one, for a state where it was not yet active. A cached validator index is then used without checking its validity in the old state.
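To illustrate the failure mode described above, here is a minimal Python sketch. It is not Nimbus code (Nimbus is written in Nim), and every name in it (`State`, `index_cache`, `lookup_unchecked`, `lookup_checked`) is hypothetical. It assumes a state can be modelled as a plain list of validator public keys, and that the index cache is keyed by public key and shared across states.

```python
from typing import Dict, List, Optional


class State:
    """Hypothetical stand-in for a beacon state: only its validator list."""

    def __init__(self, pubkeys: List[str]) -> None:
        self.validators = pubkeys


# Hypothetical cache: pubkey -> validator index, shared across all states.
index_cache: Dict[str, int] = {}


def lookup_unchecked(state: State, pubkey: str) -> str:
    """Buggy pattern: trusts the cached index regardless of which state is queried."""
    if pubkey not in index_cache:
        index_cache[pubkey] = state.validators.index(pubkey)
    # Raises IndexError when `state` is older and shorter than the state
    # that populated the cache.
    return state.validators[index_cache[pubkey]]


def lookup_checked(state: State, pubkey: str) -> Optional[str]:
    """Guarded pattern: validate the cached index against this particular state."""
    idx = index_cache.get(pubkey)
    if idx is None:
        try:
            idx = state.validators.index(pubkey)
        except ValueError:
            return None  # validator does not exist in this state
        index_cache[pubkey] = idx
    # The cached index is only usable if it is in bounds for this state
    # and still refers to the same validator.
    if idx >= len(state.validators) or state.validators[idx] != pubkey:
        return None
    return state.validators[idx]
```

Querying a recent state first and then an older state with fewer validators makes `lookup_unchecked` raise `IndexError`, which mirrors the `index <XXX> not in 0 .. 303812` crash. `lookup_checked` returns `None` instead, which a REST handler could map to a "not found" response.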

zah added a commit that referenced this issue Mar 7, 2022
zah closed this as completed in 5ef2ce4 Mar 7, 2022