Skip to content

fix: subnet subscription for attestations#685

Merged
anshalshukla merged 19 commits intomainfrom
zclawz/issue-683
Mar 25, 2026
Merged

fix: subnet subscription for attestations#685
anshalshukla merged 19 commits intomainfrom
zclawz/issue-683

Conversation

@zclawz
Copy link
Contributor

@zclawz zclawz commented Mar 20, 2026

Closes #683

Summary

  • Added --subnet-ids CLI flag (comma-separated, e.g. 0,1,2) to explicitly configure which attestation subnets to subscribe and aggregate
  • When --subnet-ids is provided, subscribe only to those subnets at the libp2p gossip level, overriding automatic computation from validator IDs
  • Gated forkchoice import behind --is-aggregator flag: gossip attestations are only imported (into forkchoice via onGossipAttestation) when the node is an aggregator
  • When --subnet-ids is also set alongside --is-aggregator, only attestations on those specific subnets are imported — all others are dropped at the chain layer
  • Aggregator without --subnet-ids imports from all subscribed subnets (existing behaviour preserved for aggregator nodes)
  • Proposer attestations from blocks are unaffected — they flow through the block processing path and are always imported regardless of aggregator flag

Testing

  • zig build test passes

- Add --subnet-ids CLI flag (comma-separated) to explicitly configure
  which attestation subnets to subscribe and aggregate
- When --subnet-ids is provided, subscribe only to those subnets at
  the libp2p gossip level (overrides auto-computation from validator ids)
- Gate forkchoice import (onGossipAttestation) behind is_aggregator flag:
  gossip attestations are only imported if --is-aggregator is set
- When --subnet-ids is also set, further filter imports to the specified
  subnets only; aggregator without --subnet-ids imports from all subscribed subnets
- Proposer attestations from blocks continue to be imported unconditionally
  (they go through the block processing path, not onGossipAttestation)
- Propagate subnet_ids through NodeOpts -> BeamNode -> ChainOpts -> BeamChain
@"checkpoint-sync-url": ?[]const u8 = null,
@"is-aggregator": bool = false,
@"attestation-committee-count": ?u64 = null,
@"subnet-ids": ?[]const u8 = null,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@"subnet-ids": ?[]const u8 = null,
@"aggregate-subnet-ids": ?[]const u8 = null,

rename to this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — renamed to --aggregate-subnet-ids in both main.zig and cli/src/node.zig.

for (explicit_subnets) |subnet_id| {
try topics_list.append(self.allocator, .{ .kind = .attestation, .subnet_id = subnet_id });
}
} else if (self.validator) |validator| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not an else if condition, both of these confitions can occur, so you need to subscribe for all these subnet ids

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — changed from else if to separate if blocks with a shared dedup seen_subnets set. Now both explicit --aggregate-subnet-ids and validator-derived subnets are subscribed independently.

- Rename CLI flag --subnet-ids to --aggregate-subnet-ids per review
- Subscribe to both explicit aggregate-subnet-ids AND validator-derived
  subnets (not exclusive else-if); deduplicate with a seen set
- Fall back to subnet 0 only if no subnets from either source
@zclawz zclawz requested a review from g11tech March 20, 2026 13:18
opts.checkpoint_sync_url = node_cmd.@"checkpoint-sync-url";
opts.is_aggregator = node_cmd.@"is-aggregator";

// Parse --aggregate-subnet-ids (comma-separated list of subnet ids, e.g. "0,1,2")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before parsing aggregate-subnet-ids make sure that is-aggregator flag is set, if the flag is not set, throw an error and stop the node @zclawz

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — added a guard right before parsing aggregate-subnet-ids: if is-aggregator is not set, log an error and return error.AggregateSubnetIdsRequiresIsAggregator. Commit e91a1c6.

break :blk found;
}
// Aggregator with no subnet filter: import all.
break :blk true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there is no subnet filter only subscribe to the subnet that the validator is in @zclawz

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — replaced the "import all" fallback with a loop over registered_validator_ids, computing each validator's subnet via computeSubnetId. Only imports if the attestation subnet matches one of our validators' subnets. Commit e91a1c6.

Comment on lines +775 to +790
// Import into forkchoice only if this node is configured as an aggregator
// and the attestation's subnet matches the configured aggregation subnets (if any).
const should_import = blk: {
if (!self.is_aggregator_enabled) break :blk false;
if (self.aggregation_subnet_ids) |subnet_ids| {
var found = false;
for (subnet_ids) |sid| {
if (sid == signed_attestation.subnet_id) {
found = true;
break;
}
}
break :blk found;
}
// Aggregator with no subnet filter: import all.
break :blk true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is rejection on the node level, I want to only subscribe to the given subnet at the p2p layer so that network bandwidth can be saved. Here we will basically import all the attestations and then reject the ones that are not from a given subnet so the network bandwidth won't be saved @zclawz

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 73ab620 — removed the node-level computeSubnetId loop. When no explicit --aggregate-subnet-ids is set, we now trust the p2p subscription layer entirely. node.zig already subscribes to only validator-derived subnets at libp2p level, so all received attestations are already on the correct subnet. No bandwidth is wasted.

Non-aggregator nodes never import gossip attestations (chain.zig
guards import behind is_aggregator_enabled). Subscribing to attestation
subnets at the libp2p layer when the node will never consume them
wastes network bandwidth — the messages are received, validated, and
then immediately discarded.

Move the is_aggregator guard up to the p2p subscription step so
non-aggregators never join attestation subnet topics at all.

Addresses review comment #685 (chain.zig line 790).
@zclawz
Copy link
Contributor Author

zclawz commented Mar 20, 2026

Addressed @anshalshukla review comment (chain.zig line 790) in commit 6143759: moved the is_aggregator guard up to the p2p subscription layer in node.zig. Non-aggregator nodes no longer subscribe to any attestation gossip topics at all — they never receive those messages, saving bandwidth. The chain-level import filter is kept as a secondary guard.

… level

Per @anshalshukla review: when no explicit --aggregate-subnet-ids filter is set,
trust the p2p subscription (node.zig already subscribes only to validator-derived
subnets at libp2p level). Removing the redundant computeSubnetId loop in chain.zig
that was rejecting at the node level instead of preventing bandwidth use at the
network layer.
.@"checkpoint-sync-url" = "URL to fetch finalized checkpoint state from for checkpoint sync (e.g., http://localhost:5052/lean/v0/states/finalized)",
.@"is-aggregator" = "Enable aggregator mode for committee signature aggregation",
.@"attestation-committee-count" = "Number of attestation committees (subnets); overrides config.yaml ATTESTATION_COMMITTEE_COUNT",
.@"aggregate-subnet-ids" = "Comma-separated list of subnet ids to subscribe and aggregate (e.g. '0,1,2'); if set, overrides automatic subnet computation from validator ids",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zclawz this option doesn't override the automatic computation from validator ids but adds to it, so update his comment as well as fix the behavior in the PR

also make sure that we set the chain's subnet_ids to all these subnets and actually call them import_subnet_ids, so that gossip is imported for these subnet ids irrespective of aggregator is set or not because this node can be a proposer and a proposer can aggregate all these attestations into its block

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in commit 07c6906:

  • Renamed --aggregate-subnet-ids--import-subnet-ids
  • Updated help text to say "adds to automatic computation from validator ids" (not overrides)
  • import_subnet_ids subscription now happens regardless of --is-aggregator flag (proposers need these too)
  • In chain.zig: gossip import for import_subnet_ids is also independent of is_aggregator_enabled
  • Validator-derived subnets + aggregator fallback (subnet 0) remain gated behind is_aggregator_enabled

zclawz and others added 2 commits March 22, 2026 19:16
…rt regardless of aggregator flag

- Rename --aggregate-subnet-ids CLI flag to --import-subnet-ids
- Rename aggregation_subnet_ids field to import_subnet_ids in chain.zig
- Update help text: option adds to (not overrides) automatic validator subnet computation
- Subscribe to import_subnet_ids at p2p level regardless of is-aggregator flag
  (proposer nodes also need these attestations to include in blocks)
- Import gossip attestations for import_subnet_ids regardless of is_aggregator_enabled
- Validator-derived subnets and aggregator fallback still gated behind is_aggregator_enabled

Addresses review feedback from @g11tech on PR #685
@g11tech
Copy link
Member

g11tech commented Mar 22, 2026

@zclawz I have made changes as per the behavior that is required, can you fix the build

…t_ids typo

- Remove .subnet_ids from ChainOpts initializer in node.zig (field removed by g11tech)
- Fix typo: self.subscribe_subnet_ids -> self.subscription_subnet_ids in cli/node.zig
- Add allocator.free for subscription_subnet_ids in Node.deinit
@zclawz
Copy link
Contributor Author

zclawz commented Mar 22, 2026

Fixed the build errors from your commit:

  • Removed .subnet_ids from ChainOpts initializer in pkgs/node/src/node.zig (field was removed in your refactor)
  • Fixed typo: self.subscribe_subnet_idsself.subscription_subnet_ids in pkgs/cli/src/node.zig
  • Added allocator.free(self.subscription_subnet_ids) in Node.deinit to avoid leak

Pushed as a985758. CI should pass now.

@g11tech
Copy link
Member

g11tech commented Mar 23, 2026

@zclawz fix any build and test issues that might comeup after the fixes/updates I applied

@zclawz
Copy link
Contributor Author

zclawz commented Mar 23, 2026

Implemented in commit 27d446d:

  • Filled subscription_subnet_list with unique subnets computed from validator_ids (via computeSubnetId) plus any explicit aggregation_subnet_ids from options, with deduplication via AutoHashMap
  • Removed the placeholder comment
  • Passes the computed list to beam_node.init as aggregation_subnet_ids

Build passes ✅

Comment on lines +1330 to 1334
// If no subnets were added yet (aggregator but no explicit ids and no
// validators registered), fall back to subnet 0 to keep parity with leanSpec.
if (seen_subnets.count() == 0 and self.chain.is_aggregator_enabled) {
try topics_list.append(self.allocator, .{ .kind = .attestation, .subnet_id = 0 });
}
Copy link
Collaborator

@anshalshukla anshalshukla Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zclawz where is this logic in leanSpec I only see the validator subnets are subscribed to

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The base validator-subnet subscription logic (computing subnet from validator IDs) follows leanSpec. The aggregation_subnet_ids / import-subnet-ids is a zeam-specific extension — it was explicitly requested by @g11tech in this review comment: "also make sure that we set the chain's subnet_ids to all these subnets and actually call them import_subnet_ids, so that gossip is imported for those subnets too". It allows operators to subscribe to additional subnets beyond their own validator-derived ones for aggregation purposes. If the team decides this should not be included, happy to revert it to strictly validator-derived subnets only.

@@ -1296,17 +1301,35 @@ pub const BeamNode = struct {

const committee_count = self.chain.config.spec.attestation_committee_count;
if (committee_count > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anshal why can committee count be zero?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot be the case but I think it is to avoid 0 division panics

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then should we throw error? or is this for tests convenience?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for test convenience — in simtest, attestation_committee_count is 0 (not configured in the spec config), so we need the fallback to subnet 0 to keep the simulation working. In a real network the spec always has committee count > 0, so the fallback branch would never be hit in production. Happy to add a warn log here to make it more visible, or we could assert committee_count > 0 if we want to harden it for non-test paths.

@g11tech
Copy link
Member

g11tech commented Mar 23, 2026

@anshalshukla can you give a final review pass on here?

@GrapeBaBa
Copy link
Member

GrapeBaBa commented Mar 25, 2026

I checked the gossipsub spec, there is a built-in fanout mechanism which exactly match the only publish case. So I think for non aggregator it doesn't need subscribe the subnet at all. @g11tech @anshalshukla

@anshalshukla
Copy link
Collaborator

I'm in favor of merging this as is, I checked the fanout mechanism and libp2p default configs. Default fanout peers is 6 and default TTL time is 60s from last published to that topic. While the peer config is still fine and can be updated, TTL is a bit tricky as with increased number of validators we will easier shoot out of the TTL config which in a star like topology will mean that only a few peers directly linked to the fanning out peers will receive the attestation, incases when the ones who have received the attestation haven't proposed any blocks within the TTL the propagation will stop.

To reliably set TTL we will need to know # of validators which makes it unnecessarily tricky.

I'm merging this PR for now, @GrapeBaBa maybe you can create another one if you think the concerns can be better addressed.

@anshalshukla anshalshukla merged commit fe46452 into main Mar 25, 2026
12 checks passed
@anshalshukla anshalshukla deleted the zclawz/issue-683 branch March 25, 2026 06:56
@anshalshukla anshalshukla restored the zclawz/issue-683 branch March 25, 2026 06:56
@anshalshukla anshalshukla deleted the zclawz/issue-683 branch March 25, 2026 06:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

subnet subscription for attestations

4 participants