Update OPTE dep to bring in some QOL improvements #1418
Merged
This brings in a few recent improvements to OPTE that help prevent us from wedging the whole system when things go sideways. See #1364 for context. This PR should resolve that issue. Note that while it's not possible to confuse OPTE itself, we can still get into a state where the instance cannot be started again.
To test this, I set up the control plane and launched an instance. I then ran snoop(1M) on the guest VNIC (vopte0) to ensure that it can't be deleted. I then stopped the instance. In the sled agent log we see:
The link itself, as well as the OPTE port and VNIC, are still around:
Note that the secondary MAC has been removed, as intended, since this prevents spuriously claiming that net0 provides a path to the guest.
Trying to restart the guest fails:
In the log we have:
I then stopped snooping the guest VNIC, and deleted the VNIC and OPTE port manually:
At this point, we can successfully start the instance again.
The point of these fixes is that a failed teardown now only prevents restarting the affected instance, rather than wedging the whole sled and preventing it from hosting any new instances. It also avoids priming the sled for a panic when the xde driver is unloaded. We'd rather leave a single instance irretrievable without some kind of intervention than the whole sled.
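As a rough illustration of that containment, here is a minimal toy model of the test scenario above. It is a sketch only: `Sled`, `InstanceState`, `StartError`, the `held` set (standing in for something like snoop keeping a link busy), and the port names are all hypothetical and do not correspond to the OPTE or sled agent APIs.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical toy model; none of these names come from OPTE or the sled agent.
#[derive(Debug, Clone, Copy, PartialEq)]
enum InstanceState {
    Running,
    Stopped,
    // Teardown left the port behind; someone must clean it up by hand.
    NeedsIntervention,
}

#[derive(Debug, PartialEq)]
enum StartError {
    // A port with this name survived an earlier, incomplete teardown.
    PortExists(String),
}

#[derive(Default)]
struct Sled {
    states: HashMap<String, InstanceState>,
    // Ports that currently exist on the sled.
    ports: HashSet<String>,
    // Ports that cannot be deleted right now (e.g. something is observing them).
    held: HashSet<String>,
}

impl Sled {
    // Starting fails only if this instance's own port is left over.
    fn start(&mut self, instance: &str, port: &str) -> Result<(), StartError> {
        if !self.ports.insert(port.to_string()) {
            return Err(StartError::PortExists(port.to_string()));
        }
        self.states.insert(instance.to_string(), InstanceState::Running);
        Ok(())
    }

    // A failed port deletion is recorded against the instance, not the sled.
    fn stop(&mut self, instance: &str, port: &str) -> InstanceState {
        let state = if self.held.contains(port) {
            InstanceState::NeedsIntervention
        } else {
            self.ports.remove(port);
            InstanceState::Stopped
        };
        self.states.insert(instance.to_string(), state);
        state
    }

    // Manual cleanup; succeeds only once the port is no longer held.
    fn delete_port(&mut self, port: &str) -> bool {
        if self.held.contains(port) {
            return false;
        }
        self.ports.remove(port)
    }

    fn state(&self, instance: &str) -> Option<InstanceState> {
        self.states.get(instance).copied()
    }
}
```

In this model, stopping an instance while its port is held leaves that instance in `NeedsIntervention`, and restarting it fails until the port is released and deleted by hand. Starting a different instance on a different port still succeeds throughout, which is the property the fixes aim for.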