-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bluestore/NVMEDevice.cc: fix NVMEManager thread hang #25646
Conversation
src/os/bluestore/NVMEDevice.cc
Outdated
@@ -590,7 +602,7 @@ int NVMEManager::try_get(const spdk_nvme_transport_id& trid, SharedDriverData ** | |||
spdk_nvme_retry_count = SPDK_NVME_DEFAULT_RETRY_COUNT; | |||
|
|||
std::unique_lock l(probe_queue_lock); | |||
while (true) { | |||
while (should_probe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this does not compile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Will update soon.
7baf182
to
5d3ae13
Compare
@tchaikov could you please have a review the patch? Thanks a lot! :) |
src/os/bluestore/NVMEDevice.cc
Outdated
@@ -486,6 +487,17 @@ class NVMEManager { | |||
|
|||
public: | |||
NVMEManager() {} | |||
~NVMEManager() { | |||
if (!init) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we still need to wake up the try_get()
waiters, so i think we should
- remove the member variable of
init
, and instead check if thedpdk_thread
is joinable, before executingdpdk_thread
, and join it in the dtor only if it's joinable. - set all pending
ctx->done
totrue
, before notifying alltry_get()
waiters indpdk_thread
. - might want to rename
should_probe
tostopping
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Kefu.
For point 2, I will promote ctx as NVMEManager member variable then the destroyer can access it.
I will update the change soon. Thanks a lot!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't follow you. ctx
s are elements of probe_queue
. so i think we already have access to them? also, i'd recommend set them in dpdk_thread
before it exits. it's more readable this way, as we notify the waiter in a single place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. I mentioned another ctx. I will update code.
Thanks a lot! :)
@tchaikov Kefu, could you please have a review the update? Thanks a lot! |
src/os/bluestore/NVMEDevice.cc
Outdated
@@ -477,7 +477,7 @@ class NVMEManager { | |||
|
|||
private: | |||
ceph::mutex lock = ceph::make_mutex("NVMEManager::lock"); | |||
bool init = false; | |||
std::atomic<bool> stopping = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think stopping
should be false
when the NVMEManager starts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is more reasonable. Thanks. Will update the following code.
src/os/bluestore/NVMEDevice.cc
Outdated
return; | ||
{ | ||
std::lock_guard guard(probe_queue_lock); | ||
stopping = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stopping = false; | |
stopping = true; |
src/os/bluestore/NVMEDevice.cc
Outdated
@@ -590,7 +599,7 @@ int NVMEManager::try_get(const spdk_nvme_transport_id& trid, SharedDriverData ** | |||
spdk_nvme_retry_count = SPDK_NVME_DEFAULT_RETRY_COUNT; | |||
|
|||
std::unique_lock l(probe_queue_lock); | |||
while (true) { | |||
while (stopping) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while (stopping) { | |
while (!stopping) { |
src/os/bluestore/NVMEDevice.cc
Outdated
@@ -600,14 +609,15 @@ int NVMEManager::try_get(const spdk_nvme_transport_id& trid, SharedDriverData ** | |||
derr << __func__ << " device probe nvme failed" << dendl; | |||
} | |||
ctxt->done = true; | |||
for (auto p : probe_queue) | |||
p->done = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please don't set all waiting probe context done here. you need to do so after the while (!stopping)
block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Will update the code soon.
@@ -605,9 +614,10 @@ int NVMEManager::try_get(const spdk_nvme_transport_id& trid, SharedDriverData ** | |||
probe_queue_cond.wait(l); | |||
} | |||
} | |||
for (auto p : probe_queue) | |||
p->done = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add
probe_queue_cond.notify_all();
to notify all waiters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tchaikov Kefu, thanks for your comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aside from the atomic
nit, lgtm
src/os/bluestore/NVMEDevice.cc
Outdated
@@ -477,7 +477,7 @@ class NVMEManager { | |||
|
|||
private: | |||
ceph::mutex lock = ceph::make_mutex("NVMEManager::lock"); | |||
bool init = false; | |||
std::atomic<bool> stopping = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, we don't need to make this an atomic<>
. as accesses to this variable are always protected by probe_queue_lock
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. Will update soon. Thanks.
src/os/bluestore/NVMEDevice.cc
Outdated
for (auto p : probe_queue) { | ||
p->done = true; | ||
probe_queue_cond.notify_all(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tone-zhang sorry, i missed this. you might need to do the notify_all()
out of the for (auto p : probe_queue)
loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tchaikov thanks!
When enable SPDK in Ceph and start up Ceph development cluster, met NVMEManager thread halt. On aarch64 platform, the log as below: Starting SPDK v18.04.1 / DPDK 18.05.0 initialization... [ DPDK EAL parameters: nvme-device-manager -c 0x1 -m 2048 --file-prefix=spdk_pid16987 ] EAL: Detected 46 lcore(s) EAL: Detected 1 NUMA nodes EAL: Multi-process socket /var/run/dpdk/spdk_pid16987/mp_socket EAL: Probing VFIO support... EAL: VFIO support initialized EAL: PCI device 0000:01:00.0 on NUMA socket 0 EAL: probe driver: 8086:953 spdk_nvme EAL: using IOMMU type 1 (Type 1) ^C The reason is that pthread_cond_destroy() cannot destroy the active condition_variable parameter. Also on x86 debug builds we get the following error messages due to probe_queue_lock still being active during ~NVMEManager: /home/ubuntu/ceph/src/common/mutex_debug.h: 114: FAILED ceph_assert(r == 0) ceph version 14.0.1-1862-g403622b (403622b) nautilus (dev) The change fixes the issue. Fixes: http://tracker.ceph.com/issues/37720 Signed-off-by: tone.zhang <tone.zhang@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com>
retest this please. |
@tchaikov Kefu, thanks a lot! |
[bluestore/NVMEDevice]: [fix NVMEManager thread halt]
When enable SPDK in Ceph and start up Ceph development cluster, met NVMEManager thread halt.
On aarch64 platform, the log as below:
Starting SPDK v18.04.1 / DPDK 18.05.0 initialization...
[ DPDK EAL parameters: nvme-device-manager -c 0x1 -m 2048 --file-prefix=spdk_pid16987 ]
EAL: Detected 46 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/spdk_pid16987/mp_socket
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
EAL: using IOMMU type 1 (Type 1)
^C
The reason is that pthread_cond_destroy() cannot destroy the active condition_variable parameter.
Also on x86 debug builds we get the following error messages due to probe_queue_lock still being active during ~NVMEManager:
/home/ubuntu/ceph/src/common/mutex_debug.h: 114: FAILED ceph_assert(r == 0)
ceph version 14.0.1-1862-g403622b (403622b) nautilus (dev)
The change fixes the issue.
Fixes: http://tracker.ceph.com/issues/37720
Signed-off-by: tone.zhang tone.zhang@arm.com
Signed-off-by: Steve Capper steve.capper@arm.com