Test suite for attestation service timing #886
Comments
Hey, I'd be interested in picking this up.
Hey @realbigsean, thanks for the interest and the help! I've tried to make this one fairly localised, so it's a good start for learning some localised sections of the code base.

This issue in particular deals with an object I've called the attester_service. In its current (incomplete) form it has a single public function: https://github.com/sigp/lighthouse/blob/naive-attestation-aggregation/beacon_node/network/src/attestation_service/mod.rs#L115 This takes in a list of subscriptions, then builds various timeouts for events that need to happen. The attester_service then needs to be polled regularly in order for the events to be emitted. I imagine there are a few bugs in the current implementation, and it would be nice to find them by building some tests.

To do this, we first need to create the service. This can be done with a dummy BeaconChain and NetworkGlobals. I'd recommend looking at these tests as an example of how to build a dummy beacon chain: https://github.com/sigp/lighthouse/blob/naive-attestation-aggregation/beacon_node/network/src/service/tests.rs#L38

Next, in order to do some tests, I need to explain the logic of what this is supposed to be doing. When a validator connects to the beacon node, it looks two epochs in advance to see when it needs to submit an attestation. Note that a validator client can have many validators, so there could be many subscriptions; ultimately, in this service you will just see a list of subscriptions for all kinds of validators, and they come in at assorted times.

The general premise is that validators on a given slot need to listen for and obtain attestations from a particular gossipsub topic. This means the validator needs to search for peers that may be on this subnet, connect to them, subscribe to the topics and then unsubscribe after the slot. The discovery needs to happen no more than 1 epoch in advance, and we should subscribe 1/3 of a slot prior to the slot (to allow for the subscription to reach our peers). We should then unsubscribe after the slot.

The above is the general principle. However, to complicate things further, for each validator connected, the beacon node needs to subscribe to a long-lived subnet. There are only 64 subnets, so if 64 validators are connected, we should be subscribed to all subnets and shouldn't disconnect from any. The logic here should also account for the case where we are already subscribed to a long-lived subnet: it shouldn't need to discover new peers or send a subscription request.

As there is quite a bit of logic going on here, some thought should also go into what to test. Some initial things to get started with are the logic cases I mentioned above. There are possibly a lot more, but this might do to start it off.

Let me know if you need help starting or building a framework to start the testing. We're probably going to need to adjust the times and the slot definition to make the tests run faster than they would with realistic times. We can deal with this when we get to it, I guess.
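The timing rules described in this comment (discovery no more than 1 epoch ahead, subscribe 1/3 of a slot before the slot, unsubscribe after it) can be sketched as a small pure function, which is the kind of thing that is easy to test without a running service. This is only a sketch: `SubscriptionTimings`, `subscription_timings` and the `SLOTS_PER_EPOCH` value of 32 are hypothetical names and assumptions for illustration, not the actual Lighthouse API.

```rust
use std::time::Duration;

/// Assumed number of slots per epoch (hypothetical constant for this sketch).
const SLOTS_PER_EPOCH: u32 = 32;

/// Delays, measured from "now", for the three events tied to one subscription.
#[derive(Debug, PartialEq, Eq)]
pub struct SubscriptionTimings {
    /// When to start searching for peers on the subnet.
    pub discover_peers_in: Duration,
    /// When to send the gossipsub subscription.
    pub subscribe_in: Duration,
    /// When to unsubscribe again.
    pub unsubscribe_in: Duration,
}

/// Computes the delays for a slot that starts `time_to_target_slot` from now.
pub fn subscription_timings(
    time_to_target_slot: Duration,
    slot_duration: Duration,
) -> SubscriptionTimings {
    let epoch_duration = slot_duration * SLOTS_PER_EPOCH;
    SubscriptionTimings {
        // Discover no more than one epoch in advance; start immediately if
        // the slot is already less than an epoch away.
        discover_peers_in: time_to_target_slot.saturating_sub(epoch_duration),
        // Subscribe a third of a slot early so the subscription can reach peers.
        subscribe_in: time_to_target_slot.saturating_sub(slot_duration / 3),
        // Unsubscribe once the target slot has finished.
        unsubscribe_in: time_to_target_slot + slot_duration,
    }
}
```

Because the slot duration is a parameter, a test can pass something short such as `Duration::from_millis(300)` instead of a realistic slot time, which is one way to keep the tests fast.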
Thanks for the detailed explanation! I've started working on this and will let you know when I have some basic scenarios working, or if I run into any issues.
I think this one can be closed?
Yep, resolved in #1070
Description
The naive attestation aggregation strategy involves quite a bit of timing logic. Depending on connected validators, a beacon node needs to subscribe to a set of long-lived subnets for a period of time. On top of this, the node needs to subscribe, unsubscribe and search for new peers when a validator needs to attest on a specific slot.
There is a significant amount of logic here, and many edge cases can occur. Getting the timing of these events wrong could mean a validator missing attestations and losing money.
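One of those edge cases, discussed in the comments, is the interaction between long-lived subnets and per-slot subscriptions: a slot whose subnet is already long-lived needs no discovery or subscription request, and must not be unsubscribed afterwards. A minimal sketch of that bookkeeping follows. `SubnetTracker` and `SlotAction` are hypothetical names, and how long-lived subnets are chosen is not covered here; the caller simply supplies the subnet IDs.

```rust
use std::collections::HashSet;

/// Number of attestation subnets, as stated in the discussion.
const SUBNET_COUNT: u64 = 64;

/// What the node has to do when a validator needs a subnet for a slot.
#[derive(Debug, PartialEq, Eq)]
pub enum SlotAction {
    /// Covered by a long-lived subscription: nothing to discover or send.
    AlreadySubscribed,
    /// Not covered: search for peers and send a subscription request.
    DiscoverAndSubscribe,
}

/// Tracks which subnets the node holds long-lived subscriptions to.
#[derive(Default)]
pub struct SubnetTracker {
    long_lived: HashSet<u64>,
}

impl SubnetTracker {
    /// Registers a long-lived subnet. Returns false if the ID is out of
    /// range or the subnet is already tracked.
    pub fn add_long_lived(&mut self, subnet_id: u64) -> bool {
        subnet_id < SUBNET_COUNT && self.long_lived.insert(subnet_id)
    }

    /// Decides what a per-slot subscription to `subnet_id` requires.
    pub fn action_for_slot(&self, subnet_id: u64) -> SlotAction {
        if self.long_lived.contains(&subnet_id) {
            SlotAction::AlreadySubscribed
        } else {
            SlotAction::DiscoverAndSubscribe
        }
    }

    /// A subnet may only be dropped after the slot if it is not long-lived.
    pub fn may_unsubscribe(&self, subnet_id: u64) -> bool {
        !self.long_lived.contains(&subnet_id)
    }

    /// True once every subnet is long-lived, e.g. with 64 distinct subnets.
    pub fn covers_all_subnets(&self) -> bool {
        self.long_lived.len() as u64 == SUBNET_COUNT
    }
}
```

Keeping this decision separate from the timers means the "already subscribed" and "all 64 subnets" cases can be asserted directly, without waiting on any timeouts.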
A collection of tests should be built on the attester_service to ensure it is behaving as expected.
If you are looking at picking this issue up, hit up @AgeManning either here or on our Discord for further direction.