
Feat: Miner acts as signer set coordinator during block signing #4481

Merged
merged 17 commits into next on Mar 21, 2024

Conversation

@kantai (Member) commented Mar 4, 2024

This PR alters the stacks-node miner thread to act as the signing set coordinator for getting its own blocks signed. There are a couple of hacks here that I know are hacks. I added a note about one of them below (it has to do with figuring out the current miner's key).

Otherwise, this PR does a handful of things in support of the miner-as-coordinator:

  • Replaces msg_id u32 with an enum for message identification. Straightforward refactor.
  • Adds an additional slot for miner messages. Miners need to send signer messages, not just block proposals. This adds a new slot for that.
  • Adds a sync channel for listening to StackerDB events. The miner needs to receive StackerDB events. It could either poll the database (seems like a bad idea) or listen for events. Registering for RPC callbacks seems bad. Instead, this uses a singleton sync channel.
  • Adds a StackerDBs method for pushing a chunk locally and emitting an event. The node shouldn't make RPC calls to itself. This could trigger a deadlock if we aren't careful, it could lead to some pretty weird network patterns (depending on the deployment), and it's less overhead this way anyway.
  • Adds a new signer message type to store DKG results. This is read by miners to instantiate the FIRE coordinator state. It is checked against the chainstate's aggregate key, and if even just one signer produces this message, the miner should be able to retrieve it. I understand this data could be cached on the miner for the duration of a reward cycle; I do not think that should happen in this PR, though, because I want to first focus on correctness and deal with optimizations separately.
  • Adds a test signing channel for nakamoto integration tests. Because the miner no longer listens for block approvals, but instead actively acts as a coordinator, the blind signer could no longer work as it did. It would need to mimic the entire FIRE protocol, which I think would be brittle, a waste of time, and not really the point of the non-signer nakamoto integration tests. So instead, there's a [cfg(test)] code block which checks if the blind signing channel is initialized and, if so, uses it for the test.
  • Currently builds with a branch of wsts (this is reflected in the Cargo.toml, and should allow builds to work!) Once https://github.com/stacks-network/wsts/pull/3 merges and a release is built, the version in this repo will have to be bumped rather than building off the git branch.
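To illustrate the first item above, the msg_id refactor amounts to replacing a raw u32 with a fieldless enum plus a fallible decoder. This is only a sketch: the enum name and variants here are illustrative stand-ins, not the PR's actual identifiers.

```rust
// Hypothetical sketch of replacing `msg_id: u32` with a typed enum.
// `MessageSlotID` and its variants are illustrative, not the real names.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
enum MessageSlotID {
    BlockProposal = 0,
    DkgResults = 1,
    MinerMessages = 2,
}

impl MessageSlotID {
    /// Decode the wire representation, rejecting unknown ids instead of
    /// silently carrying an unvalidated u32 around.
    fn from_u32(v: u32) -> Option<Self> {
        match v {
            0 => Some(Self::BlockProposal),
            1 => Some(Self::DkgResults),
            2 => Some(Self::MinerMessages),
            _ => None,
        }
    }

    /// Encode back to the wire representation.
    fn to_u32(self) -> u32 {
        self as u32
    }
}

fn main() {
    // Round-trip: enum -> u32 -> enum.
    let id = MessageSlotID::MinerMessages;
    assert_eq!(MessageSlotID::from_u32(id.to_u32()), Some(id));
    // Unknown ids are rejected rather than propagated.
    assert_eq!(MessageSlotID::from_u32(99), None);
    println!("msg_id round-trip ok");
}
```

The payoff is that every match on a message id is now exhaustively checked by the compiler, which a bare u32 cannot give you.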

Hack: Determining the coordinator public key as a signer

If the active miner is acting as the signer, its public key needs to be used as the coordinator public key. Right now, the way this works is that the signer instance checks if its reward cycle is active, and if so, it uses the most recent miner public key it has received.

It fetches this from the StackerDB chunk messages, using the most recently received event to get the public key. This is mostly safe: the .miners StackerDB really will only allow the current miner and the prior miner to send events. However, this doesn't deal with issues related to the current miner being either non-responsive or malicious. To deal with that, the signer binary needs to actually track some additional state, like the current burn chain view of the node. That way, the signer can have a designated "current trusted miner" that it only moves forward when a new miner is selected. This is true regardless of the changes in this PR, though. The right way to handle this is to subscribe to burn block events and expand that event's information. I don't necessarily think that needs to be done at the same time as this miner-as-coordinator work, though.
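The "most recent miner key" heuristic can be sketched as below. The types are simplified stand-ins for the signer's real event structures, not the actual code.

```rust
// Toy stand-in for a received .miners StackerDB chunk event. The real
// signer tracks much richer state; only the fields needed for the
// heuristic are modeled here.
#[derive(Debug, Clone, PartialEq)]
struct MinerChunkEvent {
    received_at: u64,       // monotonically increasing receipt order
    miner_pubkey: [u8; 33], // compressed secp256k1 key bytes in the real code
}

/// Treat the public key from the most recently received .miners event as
/// the coordinator public key. Returns None before any event has arrived.
fn coordinator_pubkey(events: &[MinerChunkEvent]) -> Option<[u8; 33]> {
    events
        .iter()
        .max_by_key(|e| e.received_at)
        .map(|e| e.miner_pubkey)
}

fn main() {
    let events = vec![
        MinerChunkEvent { received_at: 1, miner_pubkey: [1u8; 33] },
        MinerChunkEvent { received_at: 2, miner_pubkey: [2u8; 33] },
    ];
    // The later event wins, even though the prior miner may also have
    // written to .miners -- the caveat discussed above.
    assert_eq!(coordinator_pubkey(&events), Some([2u8; 33]));
    println!("coordinator key taken from most recent event");
}
```

This makes the weakness concrete: "most recent" is decided purely by receipt order, so a non-responsive or malicious current miner isn't handled without the additional burn-chain-view state described above.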

Former Hacks!

These all used to be necessary workarounds or hacks in this branch, but they are no longer necessary or have otherwise been fixed.

Reusing the FIRE coordinator library in the miner

This is the correct thing to do, and the FIRE coordinator is instantiated with sufficient state to be able to detect which signers returned bad data so that it can restart signing with malicious signers removed.

As you can tell by inspecting the new sign_coordinator module, the wsts fire coordinator is reused by the miner. I don't think this is necessarily so hacky (in fact, code reuse is A Good Thing), but it's something to think about. The miner isn't actually involved in, say, DKG, and so the fire coordinator may be over-parameterized from its perspective, as well as theoretically capable of reaching a bunch of bad internal states (if the miner thread's coordinator was ever in a DKG state, e.g.). You can also see that I needed to make some internal state of the coordinator public, as well as implement an Aggregator that takes the aggregate public key as a given.

It's worth thinking about whether or not the miner should actually have its own state machine here.
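The idea of an aggregator that takes the aggregate key as a given (rather than deriving it from DKG commitments, as the default wsts aggregator does) can be illustrated with a toy sketch. Integers mod a prime stand in for curve points here; none of this is the real wsts API, and the arithmetic is deliberately simplified.

```rust
// Toy illustration only: a "given-key" aggregator. In wsts, the default
// aggregator initializes from the signers' DKG polynomial commitments;
// the miner's aggregator instead accepts the aggregate public key as a
// given and checks results against it. Integers mod a prime stand in for
// secp256k1 points; real Schnorr verification is over curve points.
const P: u64 = 2_147_483_647; // toy "group order"

struct GivenKeyAggregator {
    aggregate_pubkey: u64, // a curve Point in the real code
}

impl GivenKeyAggregator {
    /// No DKG commitments needed: the key comes from the chainstate.
    fn new(aggregate_pubkey: u64) -> Self {
        Self { aggregate_pubkey }
    }

    /// Combine signature shares by summation and check the result
    /// against the known aggregate key. (In this toy setting, the
    /// "public key" and the combined signature coincide.)
    fn sign(&self, shares: &[u64]) -> Result<u64, String> {
        let sig = shares.iter().fold(0u64, |acc, s| (acc + s) % P);
        if sig == self.aggregate_pubkey {
            Ok(sig)
        } else {
            Err("aggregation failed: shares don't match aggregate key".into())
        }
    }
}

fn main() {
    // Three signers hold additive shares of a toy secret.
    let agg = GivenKeyAggregator::new(42);
    assert_eq!(agg.sign(&[10, 20, 12]), Ok(42));
    // A corrupted share makes the check fail, but without per-signer DKG
    // commitments the aggregator can't tell WHICH share was bad -- the
    // limitation raised later in this thread.
    assert!(agg.sign(&[10, 20, 13]).is_err());
}
```

The failure case is exactly why the discussion below turns to whether the miner also needs the commitments: detecting a bad aggregate is easy, attributing blame to a specific signer is not.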

Signers continue processing messages on their local coordinator

This was necessary to get the signers integration tests passing (as well as setting the DKG ID in the miner) without changes to them. This had more to do with the tests than anything else, so I updated the tests. I'm still working through fixing all of the signer tests, though.

Removing the signature assembled checks from the signers tests

This isn't really a hack; rather, the change from the signers being the coordinator to the miner being the coordinator breaks some of the assumptions made in the signers tests setup. Namely, they no longer have a channel to a coordinator that assembles the final signature. There are definitely ways to fix these tests so that they can still assert the signatures are in place (for example, we could add a variable to the tracking globals used by the miner in other tests), so I can just do that.

My fix was to just use the block production events from the event observer, and assert that the signature on the confirmed block matches the expected aggregate key.

@jferrant (Collaborator) commented Mar 5, 2024

Is it not possible to just have a fire coordinator within the miner that triggers a sign round exactly as the signers do now? The only thing that would be different is the miner would not include its own key as part of the CoordinatorConfig/if it did, it would be assigned zero key ids. Do we actually need an aggregator?

@kantai (Member, Author) commented Mar 5, 2024

> Is it not possible to just have a fire coordinator within the miner that triggers a sign round exactly as the signers do now? The only thing that would be different is the miner would not include its own key as part of the CoordinatorConfig/if it did, it would be assigned zero key ids. Do we actually need an aggregator?

That's basically what sign_coordinator does. It instantiates a fire coordinator and triggers a sign round basically the same way the signers do now. The reason that a new Aggregator implementation is necessary is because the miner doesn't participate in DKG, which means it doesn't read the DKG results and therefore can't be initialized the same way that the default aggregator is (see https://github.com/Trust-Machines/wsts/blob/main/src/v2.rs#L383-L406). This is okay for the miner, though, because it doesn't need those commitments, it just takes the aggregate public key as given.

@xoloki (Collaborator) commented Mar 5, 2024

> > Is it not possible to just have a fire coordinator within the miner that triggers a sign round exactly as the signers do now? The only thing that would be different is the miner would not include its own key as part of the CoordinatorConfig/if it did, it would be assigned zero key ids. Do we actually need an aggregator?
>
> That's basically what sign_coordinator does. It instantiates a fire coordinator and triggers a sign round basically the same way the signers do now. The reason that a new Aggregator implementation is necessary is because the miner doesn't participate in DKG, which means it doesn't read the DKG results and therefore can't be initialized the same way that the default aggregator is (see https://github.com/Trust-Machines/wsts/blob/main/src/v2.rs#L383-L406). This is okay for the miner, though, because it doesn't need those commitments, it just takes the aggregate public key as given.

The miner needs those commitments if it wants to be able to check signature shares when aggregation fails. If you can't check signature shares then you can't identify malicious actors. Which means you can't keep signing blocks.

@jcnelson (Member) commented

@kantai Is this ready for review now?

@kantai (Member, Author) commented Mar 13, 2024

> @kantai Is this ready for review now?

No, not yet. I'm working on getting the existing nakamoto integration tests to work with my changes (in branch feat/miner-coordinator+dream-fixes). The issue is that the previous auto-signer used in the integration tests depended on the stacks miner not really being involved in the signing communications, so there needs to be a workaround for that.

* Replaces msg_id u32 with an enum for message identification
* Adds an additional slot for miner messages
* Adds a sync channel for listening to StackerDB events
* Adds a StackerDBs method for pushing a chunk locally and emitting event
* Uses a new message type to store DKG results, to be read by miners to instantiate coordinator
* Uses a test signing channel for nakamoto integration tests
* Currently builds with a branch of wsts
@kantai kantai changed the base branch from next to dream-team-fixes March 14, 2024 14:46
@kantai kantai changed the title DRAFT: initial work on miner acting as signer set coordinator during block signing Feat: Miner acts as signer set coordinator during block signing Mar 14, 2024
@kantai kantai requested a review from obycode March 14, 2024 16:05
codecov bot commented Mar 14, 2024

Codecov Report

Attention: Patch coverage is 82.08955%, with 228 lines in your changes missing coverage. Please review.

Project coverage is 83.32%. Comparing base (498b6c0) to head (cb219c9).

Additional details and impacted files
@@            Coverage Diff             @@
##             next    #4481      +/-   ##
==========================================
- Coverage   83.42%   83.32%   -0.10%     
==========================================
  Files         453      455       +2     
  Lines      329589   330179     +590     
  Branches      323      323              
==========================================
+ Hits       274944   275135     +191     
- Misses      54637    55036     +399     
  Partials        8        8              
Files Coverage Δ
clarity/src/vm/types/mod.rs 91.71% <100.00%> (+0.02%) ⬆️
libstackerdb/src/libstackerdb.rs 92.56% <100.00%> (+0.25%) ⬆️
stacks-common/src/util/macros.rs 83.85% <ø> (ø)
stacks-common/src/util/mod.rs 53.44% <ø> (ø)
stacks-common/src/util/secp256k1.rs 79.27% <100.00%> (+0.12%) ⬆️
stacks-signer/src/client/mod.rs 99.14% <100.00%> (ø)
stacks-signer/src/client/stackerdb.rs 89.36% <100.00%> (-0.44%) ⬇️
stacks-signer/src/client/stacks_client.rs 87.42% <100.00%> (ø)
stacks-signer/src/config.rs 87.50% <100.00%> (+0.95%) ⬆️
stacks-signer/src/runloop.rs 90.56% <100.00%> (-0.99%) ⬇️
... and 21 more

... and 25 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@obycode (Contributor) left a comment


This looks great!

@kantai kantai changed the base branch from dream-team-fixes to next March 16, 2024 17:46
obycode previously approved these changes Mar 20, 2024
@obycode (Contributor) left a comment


👍

@jferrant (Collaborator) left a comment


Just delete the defunct deserialize_scalar and deserialize_point functions (unless I am missing something) and I will approve :D

@kantai kantai dismissed stale reviews from obycode and jcnelson via 0f91591 March 20, 2024 15:29
obycode previously approved these changes Mar 20, 2024
@kantai kantai requested a review from jferrant March 20, 2024 18:28
obycode previously approved these changes Mar 20, 2024
@kantai kantai requested a review from jcnelson March 20, 2024 19:32
@kantai kantai enabled auto-merge March 20, 2024 19:32
jferrant previously approved these changes Mar 20, 2024
@kantai kantai dismissed stale reviews from jferrant and obycode via fbf1660 March 20, 2024 21:03
@kantai kantai requested review from jferrant and obycode March 20, 2024 21:03
obycode previously approved these changes Mar 20, 2024
jcnelson previously approved these changes Mar 20, 2024
@kantai kantai dismissed stale reviews from obycode and jcnelson via cb219c9 March 21, 2024 03:15
@jferrant (Collaborator) left a comment


LGTM! :D

@kantai kantai added this pull request to the merge queue Mar 21, 2024
Merged via the queue into next with commit d5414a2 Mar 21, 2024
2 checks passed