feat: [DSM-96] Isolate XNet on engines #9449

Merged
schneiderstefan merged 21 commits into master from stschnei/engines-no-xnet on Mar 25, 2026

@schneiderstefan schneiderstefan commented Mar 18, 2026

This commit isolates cloud engines for the purpose of XNet. Specifically:

  1. In NetworkTopology, where the deterministic state machine makes routing decisions:
     • If the own subnet type is not CloudEngine, then any subnet with type CloudEngine is filtered out from the list of subnets and the routing table. On the NNS, we also maintain a full copy of the subnet list and routing table in order to map it into the state tree.
     • If the own subnet type is CloudEngine, only calls to the own subnet are permitted.
  2. In the XNetPayloadBuilder, where messages from other subnets are pulled:
     • If the own subnet type is not CloudEngine, then no slices from engines are produced in the payload builder or accepted in the validator.
     • If the own subnet type is CloudEngine, the payload builder produces an empty payload and the validator only accepts an empty payload.
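
For readers skimming the thread, the NetworkTopology rule in point 1 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `Topology`, `isolate_engines`, the integer IDs, and the per-canister routing map are all hypothetical simplifications, and reducing an engine's view to its own subnet is an assumption based on the description above.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the registry's subnet type (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SubnetType {
    Application,
    System,
    CloudEngine,
}

// Assumption: plain integers stand in for the real subnet and canister IDs.
pub type SubnetId = u64;
pub type CanisterId = u64;

/// Simplified topology: a subnet list plus a routing table.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct Topology {
    pub subnets: BTreeMap<SubnetId, SubnetType>,
    /// Assumption: one entry per canister; the real table maps ID ranges.
    pub routing_table: BTreeMap<CanisterId, SubnetId>,
}

/// Returns the view of `full` that `own_subnet` would use for routing.
pub fn isolate_engines(full: &Topology, own_subnet: SubnetId) -> Topology {
    let own_is_engine = full.subnets.get(&own_subnet) == Some(&SubnetType::CloudEngine);
    let mut filtered = Topology::default();

    for (&id, &ty) in &full.subnets {
        let keep = if own_is_engine {
            // An engine only ever routes to itself.
            id == own_subnet
        } else {
            // Everyone else never sees an engine.
            ty != SubnetType::CloudEngine
        };
        if keep {
            filtered.subnets.insert(id, ty);
        }
    }

    // Drop routing entries that point at a subnet that was filtered out.
    for (&canister, &subnet) in &full.routing_table {
        if filtered.subnets.contains_key(&subnet) {
            filtered.routing_table.insert(canister, subnet);
        }
    }
    filtered
}
```

In this sketch the unfiltered `Topology` is what the NNS would keep around for the state tree, while the filtered copy is what routing decisions are made against.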

This commit isolates cloud engines for the purpose of XNet.
Specifically:

1. In NetworkTopology, where the deterministic state machine makes
routing decisions:
a. If the own subnet type is not CloudEngine, then any calls to another
subnet with type CloudEngine are filtered out. The topology, however,
still contains the engines, as they are mapped to the state tree.
b. If the own subnet type is CloudEngine, only calls to the own subnet
are permitted. Furthermore, any other subnets are already filtered out
of the topology when it is constructed.

2. In the XNetPayloadBuilder, where messages from other subnets are
pulled:
a. If the own subnet type is not CloudEngine, then no slices from
engines are produced in the payload builder or accepted in the
validator.
b. If the own subnet type is CloudEngine, the payload builder produces
an empty payload and the validator only accepts an empty payload.
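
The XNetPayloadBuilder rule in point 2 can be sketched in the same spirit. Again, this is only an illustration under stated assumptions: `Slice`, `build_payload`, `validate_payload`, and `ValidationError` are hypothetical names, and a real slice carries certified stream data rather than just a source subnet.

```rust
/// Hypothetical stand-in for the registry's subnet type (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SubnetType {
    Application,
    CloudEngine,
}

/// Assumption: a slice is reduced to where it came from.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Slice {
    pub source_subnet: u64,
    pub source_type: SubnetType,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ValidationError {
    /// An engine received a payload that was not empty.
    NonEmptyPayloadOnEngine,
    /// A non-engine subnet received a slice originating from an engine.
    SliceFromEngine(u64),
}

/// Builder side: engines produce nothing, other subnets skip engine slices.
pub fn build_payload(own_type: SubnetType, candidates: Vec<Slice>) -> Vec<Slice> {
    if own_type == SubnetType::CloudEngine {
        return Vec::new();
    }
    candidates
        .into_iter()
        .filter(|s| s.source_type != SubnetType::CloudEngine)
        .collect()
}

/// Validator side: mirrors the builder so that both agree on what is valid.
pub fn validate_payload(own_type: SubnetType, payload: &[Slice]) -> Result<(), ValidationError> {
    if own_type == SubnetType::CloudEngine {
        return if payload.is_empty() {
            Ok(())
        } else {
            Err(ValidationError::NonEmptyPayloadOnEngine)
        };
    }
    match payload.iter().find(|s| s.source_type == SubnetType::CloudEngine) {
        Some(s) => Err(ValidationError::SliceFromEngine(s.source_subnet)),
        None => Ok(()),
    }
}
```

The point of the sketch is the symmetry: anything `build_payload` can emit passes `validate_payload` for the same subnet type, and anything it would never emit is rejected.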
@schneiderstefan schneiderstefan requested review from a team as code owners March 18, 2026 09:15
@github-actions github-actions bot added the feat label Mar 18, 2026

@derlerd-dfinity derlerd-dfinity left a comment

Nice! I did a first quick skim and left some preliminary comments/questions. I will do another iteration later.

Comment thread rs/messaging/src/message_routing.rs Outdated
Comment thread rs/xnet/payload_builder/src/lib.rs
Comment thread rs/replica/src/setup_ic_stack.rs Outdated
Comment thread rs/xnet/payload_builder/src/lib.rs Outdated
Comment thread rs/messaging/src/message_routing.rs Outdated
Comment thread rs/xnet/payload_builder/src/impl_tests.rs
Comment thread rs/state_machine_tests/src/lib.rs Outdated
Comment thread rs/xnet/payload_builder/src/lib.rs Outdated
Comment thread rs/xnet/payload_builder/tests/xnet_payload_builder.rs Outdated
Comment thread rs/messaging/src/message_routing.rs Outdated
Comment thread rs/replica/src/setup_ic_stack.rs Outdated
Comment thread rs/replicated_state/src/metadata_state.rs
Comment thread rs/messaging/src/message_routing.rs Outdated
Comment thread rs/state_machine_tests/src/lib.rs Outdated
Comment thread rs/tests/message_routing/xnet/xnet_cloud_engine_isolation_test.rs
Comment thread rs/xnet/payload_builder/tests/xnet_payload_builder.rs Outdated

@derlerd-dfinity derlerd-dfinity left a comment

Did another pass and left some comments.

Comment thread rs/messaging/src/message_routing.rs Outdated
Comment thread rs/messaging/src/message_routing.rs Outdated
Comment thread rs/messaging/src/message_routing.rs
Comment thread rs/messaging/src/message_routing.rs Outdated
Comment thread rs/protobuf/def/state/metadata/v1/metadata.proto
Comment thread rs/replicated_state/src/metadata_state/tests.rs Outdated
Comment thread rs/replicated_state/src/metadata_state/tests.rs
Comment thread rs/replicated_state/src/metadata_state.rs
Comment thread rs/xnet/payload_builder/src/lib.rs
Comment thread rs/xnet/payload_builder/src/test_fixtures.rs
Comment thread rs/xnet/payload_builder/src/lib.rs

@derlerd-dfinity derlerd-dfinity left a comment

Thanks a lot for all the changes. The PR looks good to me now modulo the minor points that are still open.

Comment thread rs/messaging/src/message_routing/tests.rs
Comment thread rs/replicated_state/src/metadata_state/tests.rs
Comment thread rs/replicated_state/src/metadata_state.rs
Comment thread rs/tests/message_routing/xnet/xnet_cloud_engine_isolation_test.rs
Comment thread rs/replicated_state/src/metadata_state.rs
Comment thread rs/replicated_state/src/metadata_state.rs Outdated
Comment thread rs/replicated_state/src/metadata_state.rs Outdated
Comment thread rs/xnet/payload_builder/src/lib.rs Outdated
Comment thread rs/xnet/payload_builder/src/lib.rs Outdated
Comment thread rs/messaging/src/message_routing.rs Outdated
Comment thread rs/messaging/src/message_routing.rs
Comment thread rs/protobuf/def/state/metadata/v1/metadata.proto
Comment thread rs/replicated_state/src/metadata_state/tests.rs
Comment thread rs/messaging/src/message_routing/tests.rs

@derlerd-dfinity derlerd-dfinity left a comment

Thanks a lot for addressing all my comments. LGTM.

@schneiderstefan schneiderstefan added this pull request to the merge queue Mar 25, 2026
Merged via the queue into master with commit bc7f8c3 Mar 25, 2026
38 checks passed
@schneiderstefan schneiderstefan deleted the stschnei/engines-no-xnet branch March 25, 2026 19:43
alin-at-dfinity pushed a commit that referenced this pull request Mar 26, 2026
This commit isolates cloud engines for the purpose of XNet.
Specifically:

1. In NetworkTopology, where the deterministic state machine makes
routing decisions:

- If the own subnet type is not CloudEngine, then any subnet with type
CloudEngine is filtered out from the list of subnets and the routing
table. On the NNS, we also maintain a full copy of the subnet list and
routing table in order to map it into the state tree.
- If the own subnet type is CloudEngine, only calls to the own subnet
are permitted.

2. In the XNetPayloadBuilder, where messages from other subnets are
pulled:
- If the own subnet type is not CloudEngine, then no slices from engines
are produced in the payload builder or accepted in the validator.
- If the own subnet type is CloudEngine, the payload builder produces an
empty payload and the validator only accepts an empty payload.

---------

Co-authored-by: IDX GitHub Automation <infra+github-automation@dfinity.org>
github-merge-queue bot pushed a commit that referenced this pull request Apr 16, 2026
…routing table in registry (#9891)

Fix flaky
`icrc_ledger_suite_integration_golden_state_upgrade_downgrade_test` by
registering all subnets referenced in the routing table in the registry.
Extract `add_cup_contents_and_key_record` helper to eliminate code
duplication.

The golden state tests use `create_routing_table` (introduced in PR
#1530) to route canister IDs
outside the local subnet to a non-existent subnet. This ensures that
leftover cross-subnet responses in the golden state backup are routed
into a remote stream (where they sit harmlessly) rather than triggering
a critical error in the stream builder.

This worked until PR #9449
introduced routing table filtering in
`try_to_populate_network_topology`: non-CloudEngine subnets now filter
the routing table to only include entries for subnets that have registry
records. Since `StateMachineBuilder::build()` only registered the local
subnet, the non-existent subnet's routing entries were silently dropped,
leaving cross-subnet responses unroutable and causing
`mr_stream_builder_response_destination_not_found` critical errors in
the golden state test.

The test is flaky (rather than always failing) because, depending on the
timing of the snapshot, it may or may not contain messages destined for
other subnets.

`StateMachineBuilder::build()` now derives the subnet list from the
routing table and registers minimal registry records (DKG transcript,
subnet record, key record) for each non-local subnet via
`register_non_local_subnet`. This ensures their routing table entries
survive the filtering.
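
The failure mode and the fix described above can be illustrated with a small sketch. It is a simplified model, not the test harness code: `filter_routing_table` and `subnets_in_routing_table` are hypothetical helpers, integer IDs stand in for the real types, and the routing table is modelled per canister rather than per ID range.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Assumption: plain integers stand in for the real subnet and canister IDs.
pub type SubnetId = u64;
pub type CanisterId = u64;

/// Keeps only routing entries whose target subnet has a registry record.
/// This models the filtering that dropped the non-existent subnet's entries.
pub fn filter_routing_table(
    routing: &BTreeMap<CanisterId, SubnetId>,
    registered: &BTreeSet<SubnetId>,
) -> BTreeMap<CanisterId, SubnetId> {
    routing
        .iter()
        .filter(|(_, subnet)| registered.contains(*subnet))
        .map(|(canister, subnet)| (*canister, *subnet))
        .collect()
}

/// Derives the set of subnets to register from the routing table itself,
/// so that every referenced subnet survives the filtering.
pub fn subnets_in_routing_table(routing: &BTreeMap<CanisterId, SubnetId>) -> BTreeSet<SubnetId> {
    routing.values().copied().collect()
}
```

With only the local subnet registered, entries that point at the placeholder subnet disappear from the filtered table, and a response addressed to one of those canisters has no destination. Registering every subnet returned by `subnets_in_routing_table` leaves the table unchanged by the filter.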

4 participants