nexus: update instance networking config after live migration #3127

Merged
merged 6 commits into from
May 17, 2023

gjcolombo
Contributor

Whenever Nexus gets a new instance runtime state from a sled agent, compare the state to the existing runtime state to see if applying the new state will update the instance's Propolis generation. If it will, use the sled ID in the new record to create updated OPTE V2P mappings and Dendrite NAT entries for the instance.
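The generation comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `RuntimeState`, its fields, and `sled_needing_network_update` are hypothetical names, and the sled ID is a plain integer for brevity.

```rust
/// Hypothetical, simplified stand-in for an instance runtime state record.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RuntimeState {
    /// Assumed to increase each time the instance's Propolis location changes.
    pub propolis_gen: u64,
    /// Sled currently hosting the instance (an integer here, for brevity).
    pub sled_id: u32,
}

/// Returns the sled whose networking config should be (re)created if applying
/// `new` over `existing` would advance the Propolis generation; otherwise
/// returns `None`, meaning no V2P/NAT update is needed.
pub fn sled_needing_network_update(
    existing: &RuntimeState,
    new: &RuntimeState,
) -> Option<u32> {
    if new.propolis_gen > existing.propolis_gen {
        // The new record wins, so its sled ID is the one to program.
        Some(new.sled_id)
    } else {
        // Stale or unchanged state: leave the existing configuration alone.
        None
    }
}
```

Under these assumptions, a state update that does not move the Propolis generation forward never touches the networking configuration.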

Retry with backoff when sled agent fails to publish a state update to Nexus. This was required for correctness anyway (see #2727) but is especially important now that there are many more ways for Nexus to fail to apply a state update. See the comments in the new code for more details.
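For readers unfamiliar with the pattern, retry with backoff can be sketched as below. This is a synchronous, standard-library-only illustration; `retry_with_backoff` and its parameters are hypothetical and are not the helper sled agent uses, which may well be asynchronous and may retry indefinitely.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `op` until it succeeds or `max_attempts` calls have been made,
/// sleeping between attempts. The delay doubles after each failure and is
/// capped at `max_delay`. `op` is always called at least once.
pub fn retry_with_backoff<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    max_attempts: u32,
    initial_delay: Duration,
    max_delay: Duration,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                // Exponential backoff, capped at `max_delay`.
                delay = (delay * 2).min(max_delay);
                attempt += 1;
            }
        }
    }
}
```

A caller would wrap the "publish state to Nexus" request in the closure passed as `op`, so that a transient failure to apply the update is retried rather than dropped.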

In the future, it might be better to update this configuration using a reliable persistent workflow that's triggered by Propolis location changes. This approach will require at least some additional work in OPTE to assign generation numbers to V2P mappings (Dendrite might have a similar problem but I'm not as familiar with the tables Nexus is trying to maintain in this change).
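To illustrate why generation numbers on V2P mappings would help, here is a rough sketch of a mapping table that rejects stale updates. Everything in it is an assumption made for illustration: `V2pTable`, `V2pEntry`, and keying by an instance name are hypothetical and do not reflect OPTE's actual data structures or API.

```rust
use std::collections::HashMap;

/// Hypothetical mapping entry: where an instance lives, and the generation
/// of the state record that produced this mapping.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct V2pEntry {
    pub sled_id: u32,
    pub generation: u64,
}

/// Hypothetical mapping table keyed by instance name.
#[derive(Debug, Default)]
pub struct V2pTable {
    entries: HashMap<String, V2pEntry>,
}

impl V2pTable {
    /// Applies `entry` unless the table already holds a mapping with an equal
    /// or newer generation. Returns `true` if the mapping was applied.
    pub fn apply(&mut self, instance: &str, entry: V2pEntry) -> bool {
        let stale = self
            .entries
            .get(instance)
            .map_or(false, |current| current.generation >= entry.generation);
        if stale {
            return false;
        }
        self.entries.insert(instance.to_string(), entry);
        true
    }

    /// Returns the current mapping for `instance`, if any.
    pub fn lookup(&self, instance: &str) -> Option<V2pEntry> {
        self.entries.get(instance).copied()
    }
}
```

With a check like this, updates that arrive out of order (for example, a delayed retry from before a migration) cannot overwrite a newer mapping, which is what makes applying them from a retried or replayed workflow safe.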

@gjcolombo gjcolombo requested a review from jmpesp May 15, 2023 18:58
Contributor

@jmpesp jmpesp left a comment

Looks good to me, some questions:

nexus/src/app/instance.rs (outdated, resolved)
nexus/src/app/instance.rs (outdated, resolved)
nexus/src/app/sagas/instance_create.rs (outdated, resolved)
nexus/tests/integration_tests/instances.rs (resolved)
nexus/tests/integration_tests/instances.rs (resolved)
nexus/tests/integration_tests/instances.rs (resolved)
nexus/src/app/instance.rs (outdated, resolved)