
nexus: Assertion failure when unpublishing a faulted nexus #584

Closed

Conversation

jonathan-teh
Contributor

The assertion occurs because the nvmf subsystem transitions from a
paused to an inactive state, which is forbidden. Resume the nexus
first, so that its subsystem is in an active state, before unpublishing
it; this also covers destroying a nexus. A nexus in a faulted state
already has no usable replicas, so it is itself already unusable.

Fixes #549, CAS-549.
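
To make the forbidden transition concrete, here is a minimal, self-contained model of the ordering the fix enforces; the enum and function names are illustrative only and are not the mayastor or SPDK API:

```rust
// Illustrative model only -- not the mayastor or SPDK API. It shows why
// stopping (unpublishing) a paused subsystem asserts, and why resuming
// first makes the same unpublish legal.

#[derive(Debug, Clone, Copy, PartialEq)]
enum SubsystemState {
    Active,
    Paused,   // left behind when a faulted nexus pauses I/O to retire a child
    Inactive, // the target state when the nexus is unpublished
}

// Stopping is only legal from Active; Paused -> Inactive is the forbidden
// transition that trips the assertion.
fn stop(state: SubsystemState) -> Result<SubsystemState, String> {
    match state {
        SubsystemState::Active => Ok(SubsystemState::Inactive),
        other => Err(format!("cannot stop subsystem from {:?}", other)),
    }
}

// The ordering this PR enforces, in miniature: resume (Paused -> Active)
// before stopping.
fn resume_then_stop(state: SubsystemState) -> Result<SubsystemState, String> {
    let state = match state {
        SubsystemState::Paused => SubsystemState::Active,
        s => s,
    };
    stop(state)
}

fn main() {
    // A faulted nexus leaves the subsystem paused; a direct stop would assert.
    assert!(stop(SubsystemState::Paused).is_err());
    // Resuming first makes the unpublish legal.
    assert_eq!(
        resume_then_stop(SubsystemState::Paused),
        Ok(SubsystemState::Inactive)
    );
}
```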

@jonathan-teh self-assigned this Dec 16, 2020
@jonathan-teh
Contributor Author

bors try

bors bot pushed a commit that referenced this pull request Dec 16, 2020
@bors
Contributor

bors bot commented Dec 16, 2020

try

Build failed:

@jonathan-teh
Contributor Author

bors retry

bors bot pushed a commit that referenced this pull request Dec 16, 2020
@bors
Contributor

bors bot commented Dec 16, 2020

try

Build succeeded:

@paulyoong left a comment


lgtm

mayastor/tests/replica_timeout.rs (review thread, outdated, resolved)
The assertion occurs because the nvmf subsystem transitions from a
paused to an inactive state, which is forbidden. Resume the nexus
first, so that its subsystem is in an active state, before unpublishing
it; this also covers destroying a nexus. A nexus in a faulted state
already has no usable replicas, so it is itself already unusable.

Repurpose the existing cargo test for a replica that is stopped, then
continued, to test the above case with a single remote replica.
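
A rough sketch of that test flow, assuming the replica runs in a container that can be frozen with docker pause; the container name, the delay, and the unpublish_nexus helper are placeholders, not the actual code in mayastor/tests/replica_timeout.rs:

```rust
// Sketch only: freeze the container serving the single remote replica, wait
// long enough for the nexus child to fault on I/O timeout, unpublish the
// still-faulted nexus, then thaw the container. The container name, delay
// and unpublish_nexus() are placeholders, not the real test code.

use std::process::Command;
use std::thread::sleep;
use std::time::Duration;

fn docker(args: &[&str]) {
    let status = Command::new("docker")
        .args(args)
        .status()
        .expect("failed to run docker");
    assert!(status.success(), "docker {:?} failed", args);
}

// Placeholder for the gRPC/unshare call the real test makes against mayastor.
fn unpublish_nexus(_name: &str) {}

fn main() {
    let replica_container = "ms1"; // hypothetical container running the remote replica

    // Stop the replica so outstanding I/O times out and the only child is
    // retired, faulting the nexus.
    docker(&["pause", replica_container]);
    sleep(Duration::from_secs(20));

    // Unpublishing here used to hit the paused -> inactive assertion;
    // with this change it should succeed.
    unpublish_nexus("nexus");

    docker(&["unpause", replica_container]);
}
```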
@gila (Contributor) left a comment


Although code-wise it seems to be correct, I wonder if we should fix it this way. The paused state is not something that is allowed AFAIK, so what if we simply always resume, with the note that when all replicas are gone, we simply have a controller without a namespace.
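
A toy model of what "always resume" would look like in the retire path, under the assumption above that an empty child list just leaves a controller without a namespace; the types and names are illustrative, not mayastor code:

```rust
// Toy model of the "always resume" alternative -- illustrative types only,
// not mayastor code. Retiring a child pauses the subsystem, drops the child,
// then always resumes, even if no children (namespaces) remain.

#[derive(Debug, PartialEq)]
enum SubsysState {
    Active,
    Paused,
}

struct Nexus {
    children: Vec<String>,
    subsys: SubsysState,
}

impl Nexus {
    fn retire_child(&mut self, child: &str) {
        self.subsys = SubsysState::Paused;
        self.children.retain(|c| c.as_str() != child);
        // Always resume: with zero children this is a controller without a
        // namespace, but the subsystem never remains in the paused state.
        self.subsys = SubsysState::Active;
    }
}

fn main() {
    let mut nexus = Nexus {
        children: vec!["nvmf://10.1.0.2:8420/nqn.2019-05.io.openebs:disk0".to_string()],
        subsys: SubsysState::Active,
    };
    nexus.retire_child("nvmf://10.1.0.2:8420/nqn.2019-05.io.openebs:disk0");
    assert!(nexus.children.is_empty());
    assert_eq!(nexus.subsys, SubsysState::Active);
}
```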

@jonathan-teh
Contributor Author

I tried changing Bio::child_retire to always resume the nexus instead (and dropping the change to unshare_nexus here). That passes this test as long as the delay before thawing the container is sufficiently long (I've stripped the repeated 'Failed to construct the tqpair' errors):

unfreeze delay... 1/20
unfreeze delay... 2/20
unfreeze delay... 3/20
unfreeze delay... 4/20
unfreeze delay... 5/20
[2020-12-17T13:07:52.193276966+00:00  WARN mayastor::spdk:bdev_nvme.c:1232] Warning: Detected a timeout. ctrlr=0x7f29d06f0a60 qpair=(nil) cid=2   
[2020-12-17T13:07:52.193562812+00:00 ERROR mayastor::bdev::nexus::nexus_io:nexus_io.rs:319] name: 10.1.0.2:8420/nqn.2019-05.io.openebs:disk0n1, driver: nvme, product: NVMe disk, num_blocks: 204800, block_len: 512
[2020-12-17T13:07:52.193640369+00:00  WARN mayastor::bdev::nexus::nexus_io:nexus_io.rs:329] core 0 thread Some(Mthread(0x7f29d0008a30)), faulting child nvmf://10.1.0.2:8420/nqn.2019-05.io.openebs:disk0: Faulted(IoError), blk_cnt: 204800, blk_size: 512

[2020-12-17T13:07:52.193740992+00:00 DEBUG mayastor::spdk:nvme_ctrlr.c:1347] resetting controller   
unfreeze delay... 6/20
unfreeze delay... 7/20
[2020-12-17T13:07:54.192870939+00:00 ERROR mayastor::spdk:nvme_tcp.c:1674] Failed to construct the tqpair=0x7f29d06f23b0 via correct icresp   
[2020-12-17T13:07:54.192999741+00:00 ERROR mayastor::spdk:nvme_tcp.c:1764] Unable to connect the tqpair   
[2020-12-17T13:07:54.193159278+00:00 ERROR mayastor::spdk:nvme_ctrlr.c:1363] Controller reinitialization failed.   
[2020-12-17T13:07:54.193250799+00:00 ERROR mayastor::spdk:nvme_ctrlr.c:874] ctrlr 10.1.0.2 in failed state.   
[2020-12-17T13:07:54.193331868+00:00 ERROR mayastor::spdk:bdev_nvme.c:394] Resetting controller failed.   
[2020-12-17T13:07:54.193420305+00:00 ERROR mayastor::spdk:nvme_ctrlr.c:3406] Submitting Keep Alive failed   
[2020-12-17T13:07:54.193604209+00:00 DEBUG mayastor::spdk:nvme_ctrlr.c:1347] resetting controller   
unfreeze delay... 8/20
unfreeze delay... 9/20
...
[2020-12-17T13:07:56.193371976+00:00  INFO mayastor::bdev::nexus::nexus_bdev:nexus_bdev.rs:454] nexus: Dynamic reconfiguration event: ChildFault started
[2020-12-17T13:07:56.193467288+00:00 DEBUG mayastor::spdk:nvme_ctrlr.c:1347] resetting controller   
unfreeze delay... 10/20
unfreeze delay... 11/20
...
[2020-12-17T13:07:58.193331800+00:00  INFO mayastor::bdev::nexus::nexus_channel:nexus_channel.rs:102] nexus(thread:"mayastor_nvmf_tcp_pg_core_0"), refreshing IO channels
[2020-12-17T13:07:58.193436469+00:00  INFO mayastor::bdev::nexus::nexus_channel:nexus_channel.rs:248] nexus: Reconfigure completed
[2020-12-17T13:07:58.193519186+00:00 DEBUG mayastor::spdk:nvme_ctrlr.c:1347] resetting controller   
unfreeze delay... 12/20
unfreeze delay... 13/20
...
[2020-12-17T13:08:00.193181617+00:00  INFO mayastor::bdev::nexus::nexus_bdev:nexus_bdev.rs:468] nexus: Dynamic reconfiguration event: ChildFault completed 0
[2020-12-17T13:08:00.193304498+00:00  INFO mayastor::nexus_uri:bdev_destroy{uri="nvmf://10.1.0.2:8420/nqn.2019-05.io.openebs:disk0"}:nexus_uri.rs:79] new
[2020-12-17T13:08:00.193385791+00:00  INFO mayastor::nexus_uri:bdev_destroy{uri="nvmf://10.1.0.2:8420/nqn.2019-05.io.openebs:disk0"}:nexus_uri.rs:79] enter
[2020-12-17T13:08:00.193587104+00:00  INFO mayastor::nexus_uri:bdev_destroy{uri="nvmf://10.1.0.2:8420/nqn.2019-05.io.openebs:disk0"}:nexus_uri.rs:79] exit
[2020-12-17T13:08:00.193657001+00:00  INFO mayastor::nexus_uri:bdev_destroy{uri="nvmf://10.1.0.2:8420/nqn.2019-05.io.openebs:disk0"}:nexus_uri.rs:79] close time.busy=201µs time.idle=154µs
[2020-12-17T13:08:00.193774702+00:00 DEBUG mayastor::spdk:nvme_ctrlr.c:1347] resetting controller   
unfreeze delay... 14/20
unfreeze delay... 15/20
...
[2020-12-17T13:08:02.193519150+00:00  INFO mayastor::core::bdev:bdev.rs:168] Received remove event for bdev 10.1.0.2:8420/nqn.2019-05.io.openebs:disk0n1
[2020-12-17T13:08:02.193610744+00:00  INFO mayastor::bdev::nexus::nexus_child:nexus_child.rs:370] Removing child nvmf://10.1.0.2:8420/nqn.2019-05.io.openebs:disk0
[2020-12-17T13:08:02.193724046+00:00  INFO mayastor::bdev::nexus::nexus_child:nexus_child.rs:406] Child nvmf://10.1.0.2:8420/nqn.2019-05.io.openebs:disk0 removed
[2020-12-17T13:08:02.193923381+00:00 ERROR mayastor::bdev::nexus::nexus_io:nexus_io.rs:343] :nexus: state: Mutex { data: Open } blk_cnt: 102400, blk_size: 512
        nvmf://10.1.0.2:8420/nqn.2019-05.io.openebs:disk0: state Faulted(IoError)
 has no children left... 
[2020-12-17T13:08:02.194025479+00:00 ERROR mayastor::spdk:nvme_ctrlr.c:902] ctrlr 10.1.0.2 get_cc() failed   
unfreeze delay... 16/20
unfreeze delay... 17/20
unfreeze delay... 18/20
unfreeze delay... 19/20
unfreeze delay... 20/20
container thawed
[2020-12-17T13:08:06.282136209+00:00 DEBUG mayastor::subsys::nvmf::subsystem:subsystem.rs:355] stopping NvmfSubsystem { id: 1, subtype: "NVMe", subnqn: "nqn.2019-05.io.openebs:nexus", sn: "33\' ~\'~._`o##o>", mn: "Mayastor NVMe controller", allow_any_host: 1, ana_reporting: 1, listeners: Some([Transport ID { trtype: 3, trstring: "TCP", traddr: "127.0.0.1", trsvcid: "8420" }]) }
[2020-12-17T13:08:06.282355451+00:00  INFO mayastor::subsys::nvmf::subsystem:subsystem.rs:371] stopped nqn.2019-05.io.openebs:nexus
[2020-12-17T13:08:06.282483071+00:00 DEBUG mayastor::bdev::nexus::nexus_channel:nexus_channel.rs:204] nexus Destroying IO channels
...

If the delay is 10-12s, when faulting the child has started but not yet completed, you get the original assertion back, as the subsystem is in the paused state while child reconfiguration is taking place.
If the delay is 15s, you end up with this panic (and a misleading error message, because SPDK's nvmf_subsystem_state_change returned EBUSY while the subsystem was in the middle of changing state):

thread 'mayastor_master' panicked at 'should unpublish nexus: UnshareNexus { source: UnshareNvmf { source: Subsystem { source: EBUSY, nqn: "nqn.2019-05.io.openebs:nexus", msg: "out of memory" } }, name: "nexus" }', mayastor/tests/replica_timeout.rs:120:18

I'm inclined to close this and go back to the drawing board for a fix with wider applicability.

@jonathan-teh
Contributor Author

jonathan-teh commented Dec 17, 2020

I tried changing Bio::child_retire to always resume the nexus instead (and dropping the change to unshare_nexus here).

This makes things worse: with the kernel initiator connected, once the child and nexus are in a faulted state, even nvme disconnect locks up, whereas it used to work before (which also stops the kernel from retrying any outstanding IO). That leaves the client with no option but to reboot.
With the changes here, the test still passes with the delay at 12s or 15s.

@gila
Contributor

gila commented Dec 21, 2020

Then I guess the best approach here is to simply not retire the last child, but have the initiator timeout/disconnect logic handle it?

@jonathan-teh
Contributor Author

Then I guess the best approach here is to simply not retire the last child, but have the initiator timeout/disconnect logic handle it?

I guess so, I'll rework the PR.

@jonathan-teh
Contributor Author

FWIW, returning early in Nexus::nexus_io::child_retire if there is only 1 child (i.e. leaving the state of the child and nexus unchanged) leads to the nexus retrying the keep-alive every 2s until the child comes back:

[2020-12-29T12:22:39.821734070+00:00 ERROR mayastor::spdk:nvme_tcp.c:1674] Failed to construct the tqpair=0x561822958aa0 via correct icresp   
[2020-12-29T12:22:39.821844436+00:00 ERROR mayastor::spdk:nvme_tcp.c:1764] Unable to connect the tqpair   
[2020-12-29T12:22:39.821929562+00:00 ERROR mayastor::spdk:nvme_ctrlr.c:1363] Controller reinitialization failed.   
[2020-12-29T12:22:39.821985435+00:00 ERROR mayastor::spdk:nvme_ctrlr.c:874] ctrlr 127.0.0.1 in failed state.   
[2020-12-29T12:22:39.822035939+00:00 ERROR mayastor::spdk:bdev_nvme.c:394] Resetting controller failed.   
[2020-12-29T12:22:39.822093379+00:00 ERROR mayastor::spdk:nvme_ctrlr.c:3406] Submitting Keep Alive failed   
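
For illustration, a toy sketch of that early-return variant; the types here are hypothetical and stand in for the real Bio::child_retire:

```rust
// Toy sketch of the early-return variant -- hypothetical types, not the real
// Bio::child_retire. If the faulted child is the last one, leave the child and
// the nexus untouched; the controller then keeps resetting and resubmitting
// keep-alives (the loop in the log above) until the replica is reachable again.

#[derive(Debug, Clone, Copy, PartialEq)]
enum ChildState {
    Open,
    Faulted,
}

struct Child {
    uri: String,
    state: ChildState,
}

struct Nexus {
    children: Vec<Child>,
}

impl Nexus {
    fn child_retire(&mut self, uri: &str) {
        if self.children.len() <= 1 {
            // Last child: do not fault or remove it, so the nexus state is
            // left unchanged and retries continue every keep-alive interval.
            return;
        }
        if let Some(child) = self.children.iter_mut().find(|c| c.uri == uri) {
            child.state = ChildState::Faulted;
        }
    }
}

fn main() {
    let mut nexus = Nexus {
        children: vec![Child {
            uri: "nvmf://127.0.0.1:8420/nqn.2019-05.io.openebs:disk0".to_string(),
            state: ChildState::Open,
        }],
    };
    nexus.child_retire("nvmf://127.0.0.1:8420/nqn.2019-05.io.openebs:disk0");
    assert_eq!(nexus.children[0].state, ChildState::Open);
}
```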

@jonathan-teh deleted the cas-549 branch January 6, 2021 14:16
Successfully merging this pull request may close these issues.

Assertion failure when unpublishing or destroying a nexus in faulted state