Description
While doing a demo dry run, we somehow got into a state where the opte1 xde device had been removed from DLS (what dladm shows), but the OPTE Port opte1 had not been deleted. No one on the call knows how this happened. Something may have gone wrong during instance stop and left state behind, but we don't really know.
In this scenario we cannot manually delete the opte1 OPTE Port via opteadm because the xde handler code expects to be able to delete the DLS devnet ID.
// destroy dls devnet device
let ret = unsafe {
    dls::dls_devnet_destroy(xde.mh, &mut xde.linkid, boolean_t::B_TRUE)
};
However, since the DLS device was already deleted, this call always fails. I then recommended unloading the xde driver to clear the state, but that resulted in the following crash:
WARNING: panicked at 'called `Result::unwrap()` on an `Err` value: xde_underlay_port { name: "net0", mh: 0xfffffeb1d99b4028, mch: MacClient { close_flags: 0, mch: 0xfffffeb1cf1dd8e8 }, mph: 0xfffffeb1e2b05e08 }', src/xde.rs:947:51
The problem code is:
let state = Box::from_raw(rstate as *mut XdeState);
let underlay = state.underlay.into_inner();
match underlay {
    Some(underlay) => {
        let u1 = Arc::try_unwrap(underlay.u1).unwrap();
Specifically, since there is still an outstanding port, there is more than one strong ref to the underlay (u1), so Arc::try_unwrap returns Err and the unwrap() panics.
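To illustrate the failure mode in isolation, here is a minimal sketch of how `Arc::try_unwrap` behaves while another holder is still alive. `UnderlayPort`, `Port`, and `take_underlay` are hypothetical stand-ins invented for this example, not the real xde types; only `Arc::try_unwrap` and `Arc::strong_count` are actual standard library calls.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the underlay port state (not the real xde type).
#[derive(Debug)]
struct UnderlayPort {
    name: String,
}

/// Hypothetical stand-in for an OPTE Port that keeps the underlay alive
/// by holding its own strong reference.
struct Port {
    _underlay: Arc<UnderlayPort>,
}

/// Try to take sole ownership of the underlay. Instead of panicking when
/// other holders remain, report how many strong references still exist.
fn take_underlay(u: Arc<UnderlayPort>) -> Result<UnderlayPort, usize> {
    // try_unwrap hands the Arc back in Err when the strong count is not 1.
    Arc::try_unwrap(u).map_err(|still_shared| Arc::strong_count(&still_shared))
}
```

While a `Port` still holds a clone of the `Arc`, `take_underlay` returns `Err` carrying the strong count; once the port's reference is the only one left, it returns the inner value. Calling `.unwrap()` on the `Err` case is what produces a panic like the one above.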
The solution is probably to have detach first check that there are no existing ports and, if any remain, immediately return failure. However, in this scenario, that would mean there is no way to get OPTE/xde out of this broken state without restarting the machine. So perhaps, along with this change, we also need some "force" option for the delete port command so that it deletes the OPTE Port even if the DLS devnet device doesn't exist.
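The proposal could be sketched roughly as follows. This is a user-space model only, under stated assumptions: `XdeModel`, `XdeError`, and the `force` parameter are hypothetical names invented for illustration, and the `bool` per port merely models whether the DLS devnet device still exists. It is not the driver's actual state or API.

```rust
use std::collections::HashMap;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum XdeError {
    /// Detach was refused because this many ports still exist.
    PortsExist(usize),
    PortNotFound,
    /// Models dls_devnet_destroy failing because the device is already gone.
    DevnetDestroyFailed,
}

/// Hypothetical model of driver state: port name -> whether its DLS devnet
/// device still exists.
struct XdeModel {
    ports: HashMap<String, bool>,
}

impl XdeModel {
    /// Refuse to detach while any port remains, rather than panicking later.
    fn detach(&self) -> Result<(), XdeError> {
        if self.ports.is_empty() {
            Ok(())
        } else {
            Err(XdeError::PortsExist(self.ports.len()))
        }
    }

    /// Delete a port. Without `force`, a missing devnet device is an error;
    /// with `force`, the port is removed anyway.
    fn delete_port(&mut self, name: &str, force: bool) -> Result<(), XdeError> {
        let devnet_present = *self.ports.get(name).ok_or(XdeError::PortNotFound)?;
        if !devnet_present && !force {
            return Err(XdeError::DevnetDestroyFailed);
        }
        self.ports.remove(name);
        Ok(())
    }
}
```

In this model, a port whose devnet device is already gone blocks `detach` and cannot be removed by a normal delete, but a forced delete clears it, after which `detach` succeeds. That is the recovery path the issue asks for without a reboot.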