@bnaecker
Collaborator

This brings in a few recent improvements to OPTE that help prevent us from wedging the whole system when things go sideways. See #1364 for context; this PR should resolve that issue. Note that while it's no longer possible to confuse OPTE itself, we can still get into a state where the instance cannot be started again.

To test this, I set up the control plane and launched an instance. I then ran snoop(1M) on the guest VNIC (vopte0) to ensure that it couldn't be deleted, and stopped the instance. In the sled agent log we see:

{"msg":"Stopped and uninstalled zone","v":0,"name":"SledAgent","level":30,"time":"2022-07-13T18:10:25.797485254Z","hostname":"feldspar","pid":9977,"zone":"oxz_propolis-server_ea6467e6-1f6d-40bf-9742-2d5cc6399ed8","instance_id":"77b7a62c-7504-4408-97b6-d45d93569dd0","component":"InstanceManager"}
WARNING: Failed to delete OPTE port overlay VNIC while dropping port. The VNIC will not be cleaned up properly, and the xde device itself will not be deleted. Both the VNIC and the xde device must be deleted out of band, and it will not be possible to recreate the xde device until then. Error: DeleteVnicError { name: "vopte0", err: CommandFailure { command: "/usr/sbin/dladm delete-vnic vopte0", status: ExitStatus(unix_wait_status(256)), stdout: "", stderr: "dladm: vnic deletion failed: link busy\n" } }
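
The ordering that warning describes can be sketched roughly as follows. This is a minimal illustration, not the sled agent's code: it is written as a plain function rather than a `Drop` impl so it runs without `dladm`, and `LinkOps`, `FakeLinks`, `Teardown`, and `teardown_port` are hypothetical names invented for the example.

```rust
/// Hypothetical stand-in for the data-link operations (not an OPTE or sled agent API).
trait LinkOps {
    fn delete_vnic(&mut self, name: &str) -> Result<(), String>;
    fn delete_xde(&mut self, name: &str) -> Result<(), String>;
}

/// Outcome of tearing down one port.
#[derive(Debug, PartialEq, Eq)]
enum Teardown {
    /// Both the VNIC and the xde device were removed.
    Clean,
    /// The VNIC could not be removed, so the xde device was not touched.
    VnicLeaked { vnic: String, reason: String },
    /// The VNIC was removed, but deleting the xde device failed.
    XdeLeaked { xde: String, reason: String },
}

/// Delete the VNIC first, since it sits over the xde device. On failure, warn
/// and leave the xde device alone instead of panicking.
fn teardown_port<L: LinkOps>(ops: &mut L, xde: &str, vnic: &str) -> Teardown {
    if let Err(reason) = ops.delete_vnic(vnic) {
        eprintln!(
            "WARNING: failed to delete VNIC {}: {}; leaving {} in place",
            vnic, reason, xde
        );
        return Teardown::VnicLeaked { vnic: vnic.to_string(), reason };
    }
    match ops.delete_xde(xde) {
        Ok(()) => Teardown::Clean,
        Err(reason) => Teardown::XdeLeaked { xde: xde.to_string(), reason },
    }
}

/// In-memory fake: `vnic_busy` simulates a consumer such as snoop holding the link.
struct FakeLinks {
    vnic_busy: bool,
    deleted: Vec<String>,
}

impl LinkOps for FakeLinks {
    fn delete_vnic(&mut self, name: &str) -> Result<(), String> {
        if self.vnic_busy {
            return Err("link busy".to_string());
        }
        self.deleted.push(name.to_string());
        Ok(())
    }

    fn delete_xde(&mut self, name: &str) -> Result<(), String> {
        self.deleted.push(name.to_string());
        Ok(())
    }
}

fn main() {
    let mut links = FakeLinks { vnic_busy: true, deleted: Vec::new() };
    let outcome = teardown_port(&mut links, "opte0", "vopte0");
    println!("{:?}; deleted so far: {:?}", outcome, links.deleted);
}
```

Returning an outcome instead of panicking mirrors the intent stated in the warning: the leaked links are reported and left for out-of-band cleanup.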

The link itself, as well as the OPTE port and VNIC, are still around:

bnaecker@feldspar : ~/omicron $ oxide instance stop -o o -p p i0
Type i0 to confirm stop:: i0
✔  Waiting for instance status to be `stopped`
✘ Stopped instance i0 in o/p
bnaecker@feldspar : ~/omicron $ pfexec opteadm list-ports
LINK                             MAC ADDRESS              IPv4 ADDRESS     STATE
opte0                            A8:40:25:FD:2C:0F        172.30.0.5       running
bnaecker@feldspar : ~/omicron $ dladm | grep opte
opte0       xde       1500   up       --         --
vopte0      vnic      1500   up       --         opte0
bnaecker@feldspar : ~/omicron $ dladm show-linkprop -p secondary-macs net0
LINK         PROPERTY        PERM VALUE          DEFAULT        POSSIBLE
net0         secondary-macs  rw   --             --             --
bnaecker@feldspar : ~/omicron $

Note that the secondary MAC has been removed, as intended, since this prevents spuriously claiming that net0 provides a path to the guest.

Trying to restart the guest fails:

bnaecker@feldspar : ~/omicron $ oxide instance start -o o -p p i0
✘ Oxide API internal error: Internal Server Error

In the log we have:

{"msg":"request completed","v":0,"name":"SledAgent","level":30,"time":"2022-07-13T18:15:16.890690708Z","hostname":"feldspar","pid":9977,"uri":"/instances/77b7a62c-7504-4408-97b6-d45d93569dd0","method":"PUT","req_id":"c9d688c2-3ee3-4e2d-86ed-9c6275e7a5bf","remote_addr":"[fd00:1122:3344:101::3]:39943","local_addr":"[fd00:1122:3344:101::1]:12345","component":"dropshot (SledAgent)","error_message_external":"Internal Server Error","error_message_internal":"Error managing instances: Instance error: Failure interacting with the OPTE ioctl(2) interface: command CreateXde failed: MacExists { port: \"opte2\", vni: Vni { inner: 1147299 }, mac: MacAddr { inner: A8:40:25:FD:2C:0F } }","response_code":"500"}
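
The `MacExists` failure is consistent with a port table that refuses to register a MAC address already claimed within a VNI. The sketch below shows that idea only; `PortTable`, `CreateError`, and their fields are hypothetical and are not OPTE's actual types. Which port name the real error carries is not something the log alone settles, so the sketch simply reports the requested one.

```rust
use std::collections::HashMap;

/// Hypothetical error, loosely shaped like the `MacExists` value in the log.
#[derive(Debug, PartialEq, Eq)]
enum CreateError {
    MacExists { port: String, vni: u32, mac: [u8; 6] },
}

/// Hypothetical table of live ports, keyed by (VNI, MAC address).
#[derive(Default)]
struct PortTable {
    by_addr: HashMap<(u32, [u8; 6]), String>,
}

impl PortTable {
    /// Register a new port, refusing if the (VNI, MAC) pair is still claimed.
    fn create(&mut self, port: &str, vni: u32, mac: [u8; 6]) -> Result<(), CreateError> {
        if self.by_addr.contains_key(&(vni, mac)) {
            return Err(CreateError::MacExists { port: port.to_string(), vni, mac });
        }
        self.by_addr.insert((vni, mac), port.to_string());
        Ok(())
    }

    /// Remove a port by name; returns true if an entry was removed.
    fn delete(&mut self, port: &str) -> bool {
        let before = self.by_addr.len();
        self.by_addr.retain(|_, name| name.as_str() != port);
        self.by_addr.len() != before
    }
}

fn main() {
    let mac = [0xA8, 0x40, 0x25, 0xFD, 0x2C, 0x0F];
    let mut table = PortTable::default();

    // The first port claims the MAC; it is never deleted because teardown failed.
    println!("{:?}", table.create("opte0", 1147299, mac));
    // Restarting the instance asks for a new port with the same MAC.
    println!("{:?}", table.create("opte2", 1147299, mac));
    // Once the stale port is removed, the same request goes through.
    println!("{:?}", table.delete("opte0"));
    println!("{:?}", table.create("opte2", 1147299, mac));
}
```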

I then stopped snooping the guest VNIC, and deleted the VNIC and OPTE port manually:

bnaecker@feldspar : ~/omicron $ pfexec snoop -r -d vopte0 'tcp'
Using device vopte0 (promiscuous mode)
^Cbnaecker@feldspar : ~/omicron $ pfexec opteadm list-ports
LINK                             MAC ADDRESS              IPv4 ADDRESS     STATE
opte0                            A8:40:25:FD:2C:0F        172.30.0.5       running
bnaecker@feldspar : ~/omicron $ dladm
LINK        CLASS     MTU    STATE    BRIDGE     OVER
igb0        phys      1500   up       --         --
net0        vnic      1500   up       --         igb0
net1        vnic      1500   up       --         igb0
stub0       etherstub 9000   up       --         --
underlay0   vnic      9000   up       --         stub0
oxControlService0 vnic 9000  up       --         stub0
oxControlStorage0 vnic 9000  up       --         stub0
oxControlStorage1 vnic 9000  up       --         stub0
oxControlStorage2 vnic 9000  up       --         stub0
oxControlStorage3 vnic 9000  up       --         stub0
oxControlStorage4 vnic 9000  up       --         stub0
oxControlService1 vnic 9000  up       --         stub0
oxControlPublic0 vnic 1500   up       --         igb0
oxControlService2 vnic 9000  up       --         stub0
oxControlService3 vnic 9000  up       --         stub0
opte0       xde       1500   up       --         --
vopte0      vnic      1500   up       --         opte0
bnaecker@feldspar : ~/omicron $ dladm delete-vnic vopte0
dladm: vnic deletion failed: permission denied
bnaecker@feldspar : ~/omicron $ pfexec !!
pfexec dladm delete-vnic vopte0
bnaecker@feldspar : ~/omicron $ dladm
LINK        CLASS     MTU    STATE    BRIDGE     OVER
igb0        phys      1500   up       --         --
net0        vnic      1500   up       --         igb0
net1        vnic      1500   up       --         igb0
stub0       etherstub 9000   up       --         --
underlay0   vnic      9000   up       --         stub0
oxControlService0 vnic 9000  up       --         stub0
oxControlStorage0 vnic 9000  up       --         stub0
oxControlStorage1 vnic 9000  up       --         stub0
oxControlStorage2 vnic 9000  up       --         stub0
oxControlStorage3 vnic 9000  up       --         stub0
oxControlStorage4 vnic 9000  up       --         stub0
oxControlService1 vnic 9000  up       --         stub0
oxControlPublic0 vnic 1500   up       --         igb0
oxControlService2 vnic 9000  up       --         stub0
oxControlService3 vnic 9000  up       --         stub0
opte0       xde       1500   up       --         --
bnaecker@feldspar : ~/omicron $ pfexec opteadm delete-xde opte0
bnaecker@feldspar : ~/omicron $ opteamd
-bash: opteamd: command not found
bnaecker@feldspar : ~/omicron $ pfexec opteadm list-ports
LINK                             MAC ADDRESS              IPv4 ADDRESS     STATE
bnaecker@feldspar : ~/omicron $
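
For repeat testing, the manual cleanup above could be scripted. The following is a rough sketch that only wraps the two commands already shown in the transcript (`dladm delete-vnic` and `opteadm delete-xde`, both under `pfexec`). The helper names are hypothetical, and deriving the VNIC name by prefixing `v` to the port name is an assumption based on the `opte0`/`vopte0` pair seen here. It prints the commands by default and only executes them when given `--run`.

```rust
use std::io;
use std::process::Command;

/// Build the cleanup commands for one leaked port, in the order used above:
/// the VNIC first, then the xde device underneath it.
/// Assumption: the guest VNIC is named "v" + the port name (vopte0 over opte0).
fn cleanup_commands(xde: &str) -> Vec<Vec<String>> {
    let vnic = format!("v{}", xde);
    vec![
        vec!["pfexec", "dladm", "delete-vnic", vnic.as_str()],
        vec!["pfexec", "opteadm", "delete-xde", xde],
    ]
    .into_iter()
    .map(|argv| argv.into_iter().map(String::from).collect())
    .collect()
}

/// Run each command in order, stopping at the first failure.
fn run_all(cmds: &[Vec<String>]) -> io::Result<()> {
    for argv in cmds {
        let status = Command::new(&argv[0]).args(&argv[1..]).status()?;
        if !status.success() {
            return Err(io::Error::new(
                io::ErrorKind::Other,
                format!("{:?} exited with {}", argv, status),
            ));
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let cmds = cleanup_commands("opte0");
    // Dry run by default; pass --run to actually execute the commands.
    if std::env::args().any(|arg| arg == "--run") {
        run_all(&cmds)
    } else {
        for argv in &cmds {
            println!("{}", argv.join(" "));
        }
        Ok(())
    }
}
```

The VNIC has to go first: as the earlier `link busy` failure shows, nothing can be removed while a consumer such as snoop still holds the link.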

At this point, we can successfully start the instance again.

bnaecker@feldspar : ~/omicron $ oxide instance start -o o -p p i0
✔  Waiting for instance status to be `running`
✔ Started instance i0 in o/p
bnaecker@feldspar : ~/omicron $

The point of these fixes is that a failure like this only prevents restarting the affected instance, rather than wedging the whole sled and preventing it from hosting any new instances. It also prevents priming the sled for panic when the xde driver is unloaded. We'd rather make a single instance irretrievable without some kind of intervention than the whole sled.

@bnaecker bnaecker requested a review from rzezeski July 13, 2022 18:39
@rzezeski
Contributor

It also prevents priming the sled for panic when the xde driver is unloaded.

Also, as of oxidecomputer/opte#178, the driver will no longer panic, but rather refuse to detach.

@bnaecker bnaecker enabled auto-merge (squash) July 13, 2022 19:21
@bnaecker bnaecker merged commit 05e98a0 into main Jul 13, 2022
@bnaecker bnaecker deleted the update-opte-dep branch July 13, 2022 19:49
leftwo pushed a commit that referenced this pull request Sep 18, 2024
Crucible changes:
    Make crutest use BlockIO trait instead of a Guest (#1452)
    Use new syncfs syscall (#1427)
    Increment write backpressure before deferred encryption (#1444)
    Use RAII handles for backpressure (#1443)
    Fine-tuning backpressure clamping, and API cleanups (#1442)
    Fix outdated comment (#1447)
    Update Rust crate rusqlite to 0.32 (#1418)
    Fix write reordering bug (#1448)
    Remove `history_file` (#1446)
    Remove optionality for `BlockRes` in `DeferredWrite` (#1441)

Propolis changes
    instance spec rework: tighten up component naming (#761)
    instance spec rework: remove most dependencies on `InstanceSpecV0` from propolis-server (#757)
    fix new 1.81.0 warning and clippy error (#760)
    standalone: be more helpful with bad block device configs (#758)
leftwo added a commit that referenced this pull request Sep 18, 2024
Crucible changes:
    Make crutest use BlockIO trait instead of a Guest (#1452)
    Use new syncfs syscall (#1427)
    Increment write backpressure before deferred encryption (#1444)
    Use RAII handles for backpressure (#1443)
    Fine-tuning backpressure clamping, and API cleanups (#1442)
    Fix outdated comment (#1447)
    Update Rust crate rusqlite to 0.32 (#1418)
    Fix write reordering bug (#1448)
    Remove `history_file` (#1446)
    Remove optionality for `BlockRes` in `DeferredWrite` (#1441)

Propolis changes
    instance spec rework: tighten up component naming (#761)
    instance spec rework: remove most dependencies on `InstanceSpecV0` from propolis-server (#757)
    fix new 1.81.0 warning and clippy error (#760)
    standalone: be more helpful with bad block device configs (#758)

Co-authored-by: Alan Hanson <alan@oxide.computer>