tcmur_device: add priv lock support #667

lxbsz · 2021-08-26T15:03:49Z

When tcmu-runner detects that the lock is lost, it queues a work event
to reopen the image and, at the same time, queues a work event to update
the service status. The reopen is not atomic: there is a gap between the
image close and the image open during which the rbd image's state
resource is released. If the update status event fires in that gap, we
hit the crash bug.

This commit adds an rdev->priv_lock to protect the private data in the
rdev struct. The service status update code simply skips the update if
it runs in the reopen gap. All other I/O paths return EBUSY so that the
client retries.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
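
To make the description above concrete, here is a minimal sketch of the idea, not the actual patch. All `demo_*` names are hypothetical stand-ins and do not exist in tcmu-runner. It assumes the private data pointer is NULL during the reopen gap: the status update is skipped in the gap, while an I/O-style accessor returns -EBUSY so the caller can retry.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real tcmu-runner structs. */
struct demo_state {
	int status;                 /* stands in for the rbd image state */
};

struct demo_rdev {
	pthread_mutex_t priv_lock;  /* protects priv */
	struct demo_state *priv;    /* NULL while the image is closed */
};

void demo_rdev_init(struct demo_rdev *rdev, struct demo_state *state)
{
	pthread_mutex_init(&rdev->priv_lock, NULL);
	rdev->priv = state;
}

/* Status update: skip quietly in the reopen gap. Returns 1 if updated. */
int demo_update_status(struct demo_rdev *rdev, int status)
{
	int updated = 0;

	pthread_mutex_lock(&rdev->priv_lock);
	if (rdev->priv) {
		rdev->priv->status = status;
		updated = 1;
	}
	pthread_mutex_unlock(&rdev->priv_lock);
	return updated;
}

/* I/O-style path: return -EBUSY in the gap so the client retries. */
int demo_read_status(struct demo_rdev *rdev, int *out)
{
	int ret = -EBUSY;

	pthread_mutex_lock(&rdev->priv_lock);
	if (rdev->priv) {
		*out = rdev->priv->status;
		ret = 0;
	}
	pthread_mutex_unlock(&rdev->priv_lock);
	return ret;
}

/* The two halves of a reopen; the gap is the time between them. */
void demo_close(struct demo_rdev *rdev)
{
	pthread_mutex_lock(&rdev->priv_lock);
	rdev->priv = NULL;
	pthread_mutex_unlock(&rdev->priv_lock);
}

void demo_open(struct demo_rdev *rdev, struct demo_state *state)
{
	pthread_mutex_lock(&rdev->priv_lock);
	rdev->priv = state;
	pthread_mutex_unlock(&rdev->priv_lock);
}
```

Calling `demo_close()`, then `demo_update_status()` and `demo_read_status()`, then `demo_open()` walks through the gap: the update is skipped and the read reports -EBUSY until the open completes.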

idryomov

I'm confused about the scope of the issue. The description says that the problem is that the status update handler could encounter a NULL state but the PR goes on to change all handlers, including I/O handlers such as read and write. Would tcmu-runner core really initiate I/O on a closed image?

Could we instead look at something like TCMUR_DEV_FLAG_IS_OPEN? Wrapping all handlers with rdev->priv_lock seems too heavyweight to me.

idryomov · 2021-09-06T11:30:23Z

rbd.c

@@ -115,11 +115,12 @@ static darray(char *) blacklist_caches;
 #ifdef LIBRADOS_SUPPORTS_SERVICES

 #ifdef RBD_LOCK_ACQUIRE_SUPPORT
+/* rdev->priv_lock is held_*/


Suggested change

/* rdev->priv_lock is held_*/

/* rdev->priv_lock is held */

here and everywhere else

Will fix it.

idryomov · 2021-09-06T11:31:22Z

rbd.c

 	struct tcmur_device *rdev = tcmu_dev_get_private(dev);
+	struct tcmu_rbd_state *state = tcmur_dev_get_private(dev);


This is a purely cosmetic change, right?

This may have been introduced by my previous changes. There used to be other changes in this function that were later removed again. On its own this change makes no sense and can be removed.

idryomov · 2021-09-06T11:32:21Z

rbd.c

 					   bool has_lock)
 {
+	return 0;


Squash "rbd: remove possible warning" commit into an earlier commit that made the void -> int change.

I have split this into a new PR in #673.

I have also split some other commits into separate PRs to make them easier to review: #672, #671 and #670.

Please take a look, thanks.

idryomov · 2021-09-06T11:44:36Z

I would suggest splitting unrelated bug fixes ("rbd: fix use-after-free of addr", "rbd: fix memory leak when fails to get the address" and "rbd: fix and add more debug logs") into a separate PR.

lxbsz · 2021-09-06T13:11:27Z

I'm confused about the scope of the issue. The description says that the problem is that the status update handler could encounter a NULL state but the PR goes on to change all handlers, including I/O handlers such as read and write. Would tcmu-runner core really initiate I/O on a closed image?

Yeah, it's possible.

Could we instead look at something like TCMUR_DEV_FLAG_IS_OPEN? Wrapping all handlers with rdev->priv_lock seems too heavyweight to me.

Let me look into this some more and give it a try.

I would suggest splitting unrelated bug fixes ("rbd: fix use-after-free of addr", "rbd: fix memory leak when fails to get the address" and "rbd: fix and add more debug logs") into a separate PR.

Done.

lxbsz · 2021-09-10T03:27:03Z

@idryomov Please take a look, thanks.

idryomov · 2021-09-13T17:44:11Z

tcmur_device.c

 	ret = rhandler->report_event(dev);
 	if (ret)
 		tcmu_dev_err(dev, "Could not report events. Error %d.\n", ret);
-	pthread_mutex_unlock(&rdev->state_lock);
+
+	pthread_mutex_unlock(&rdev->rdev_lock);


Was this code path tested?

Suggested change

pthread_mutex_unlock(&rdev->rdev_lock);

pthread_mutex_lock(&rdev->rdev_lock);

Was this code path tested?

Yes, it was tested. I checked, and this had already been fixed locally, but the fix wasn't amended into the commit.

I found that in my setup one node had the fix and the other didn't.

In local testing I didn't hit any issues.

idryomov · 2021-09-13T18:17:24Z

tcmur_device.c

+	}
+
+	rdev->flags |= TCMUR_DEV_FLAG_REPORTING_EVENT;
+	pthread_mutex_unlock(&rdev->rdev_lock);


Why is rdev_lock released here? ->report_event() used to be called with it held.

The rdev_lock should always be released when calling the handler's hooks. I think we need to pass a has_lock boolean parameter.

idryomov · 2021-09-13T18:30:24Z

tcmur_device.h

+
+	pthread_cond_t report_event_cond;
+
+	pthread_spinlock_t cmds_list_lock; /* protects cmds_list */
 	struct list_head cmds_list;
 };



You seem to have checked in a kmod-devel-25-16.el8.x86_64.rpm binary by mistake.

Yeah, I have removed it.

idryomov · 2021-09-13T18:33:23Z

I think I'm still missing something. I'm going to ignore the renames for the moment and speak in terms of what is currently in master.

From my reading of the description, the problem is that ->report_event() and __tcmu_reopen_dev() can race and ->report_event() might crash on a momentarily closed device. Now ->report_event() is currently called under state_lock and __tcmu_reopen_dev() clears TCMUR_DEV_FLAG_IS_OPEN under state_lock before proceeding with closing. Wouldn't it be sufficient to just add a TCMUR_DEV_FLAG_IS_OPEN check to __tcmu_report_event()? Why are TCMUR_DEV_FLAG_REPORTING_EVENT and a new condition variable necessary?

lxbsz · 2021-09-14T02:58:34Z

I think I'm still missing something. I'm going to ignore the renames for the moment and speak in terms of what is currently in master.

From my reading of the description, the problem is that ->report_event() and __tcmu_reopen_dev() can race and ->report_event() might crash on a momentarily closed device. Now ->report_event() is currently called under state_lock and __tcmu_reopen_dev() clears TCMUR_DEV_FLAG_IS_OPEN under state_lock before proceeding with closing. Wouldn't it be sufficient to just add a TCMUR_DEV_FLAG_IS_OPEN check to __tcmu_report_event()? Why are TCMUR_DEV_FLAG_REPORTING_EVENT and a new condition variable necessary?

The reopen and the event report run in two different threads.

The reopen is split into handler->close() and handler->open() without holding the state_lock, which means the reopen is not atomic.

Suppose __tcmu_report_event() has just checked that the TCMUR_DEV_FLAG_IS_OPEN flag is set and then goes on to access the rbd_state private data to report the events to the ceph cluster. At the same time the reopen thread could fire, clear the TCMUR_DEV_FLAG_IS_OPEN flag and release the rbd_state private data in handler->close().

The use-after-free bug still exists...

We need to make the reopen thread wait until the event report thread has finished.

If the device is in recovery, we can defer reporting the event until the recovery reopens the device. If the device is stopped or stopping, we can just skip it.

When recovering the device, wait for the report event to finish, because the recovery closes and then opens the device, during which the private data may be released. Otherwise this may cause a use-after-free crash in the report event routine.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
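
The waiting scheme described above could be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the `demo_*` names and `DEMO_FLAG_*` bits are hypothetical, and it assumes at most one reporter thread at a time. The reporter marks itself as running under the lock, drops the lock for the handler hook, and the close path waits on a condition variable until the reporter is done.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_FLAG_IS_OPEN         (1u << 0)
#define DEMO_FLAG_REPORTING_EVENT (1u << 1)

/* Hypothetical stand-in for the device; not the real tcmur_device. */
struct demo_dev {
	pthread_mutex_t lock;              /* protects flags */
	pthread_cond_t report_event_cond;  /* signalled when a report ends */
	unsigned int flags;
	int reports;                       /* touched only by the reporter */
};

void demo_dev_init(struct demo_dev *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	pthread_cond_init(&dev->report_event_cond, NULL);
	dev->flags = DEMO_FLAG_IS_OPEN;
	dev->reports = 0;
}

/* Returns 1 if the event was reported, 0 if it was skipped. */
int demo_report_event(struct demo_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	if (!(dev->flags & DEMO_FLAG_IS_OPEN)) {
		pthread_mutex_unlock(&dev->lock);
		return 0;
	}
	dev->flags |= DEMO_FLAG_REPORTING_EVENT;
	pthread_mutex_unlock(&dev->lock);

	/* The handler hook would run here, without the lock held. */
	dev->reports++;

	pthread_mutex_lock(&dev->lock);
	dev->flags &= ~DEMO_FLAG_REPORTING_EVENT;
	pthread_cond_broadcast(&dev->report_event_cond);
	pthread_mutex_unlock(&dev->lock);
	return 1;
}

/* First half of a reopen: mark closed, then wait for the reporter. */
void demo_begin_close(struct demo_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->flags &= ~DEMO_FLAG_IS_OPEN;
	while (dev->flags & DEMO_FLAG_REPORTING_EVENT)
		pthread_cond_wait(&dev->report_event_cond, &dev->lock);
	pthread_mutex_unlock(&dev->lock);
	/* Only after this point would the private data be freed. */
}

/* Second half of a reopen: the device is usable again. */
void demo_finish_open(struct demo_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->flags |= DEMO_FLAG_IS_OPEN;
	pthread_mutex_unlock(&dev->lock);
}
```

Because the open flag is cleared before waiting, no new report can start once the close has begun, and a report that is already running is allowed to finish before anything is released.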

lxbsz · 2021-09-14T05:55:34Z

I ran the following test for 2 hours and it worked fine for me.

[root@client ~]# lsblk 
NAME        MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda           8:0    0   25G  0 disk  
├─sda1        8:1    0    1G  0 part  /boot
└─sda2        8:2    0   24G  0 part  
  ├─cl-root 253:0    0 21.9G  0 lvm   /
  └─cl-swap 253:1    0  2.1G  0 lvm   [SWAP]
sdb           8:16   0    1M  0 disk  
└─mpathd    253:2    0    1M  0 mpath 
sdc           8:32   0    1M  0 disk  
└─mpathd    253:2    0    1M  0 mpath 
sr0          11:0    1  8.6G  0 rom   

# while [ 1 ]; do dd if=/dev/zero of=/dev/sdb bs=1K count=1024; sleep 1; dd if=/dev/zero of=/dev/sdc bs=1K count=1024; done

idryomov · 2021-09-14T18:22:36Z

The reopen and event report will be run in two different threads.

The reopen is split into handler->close() and handler->open() without holding the state_lock, which means the reopen is not atomic.

Suppose __tcmu_report_event() has just checked that the TCMUR_DEV_FLAG_IS_OPEN flag is set and then goes on to access the rbd_state private data to report the events to the ceph cluster. At the same time the reopen thread could fire, clear the TCMUR_DEV_FLAG_IS_OPEN flag and release the rbd_state private data in handler->close().

In master, state_lock is held across ->report_event(), so while reopen is indeed not atomic, event report is. __tcmu_reopen_dev() would not be able to clear TCMUR_DEV_FLAG_IS_OPEN and close the image while ->report_event() is executing.

You are removing state_lock protection from ->report_event() though, saying that it (now renamed to rdev_lock) "should always be released when calling the handler's hooks". Can you elaborate on why it is not OK to hold it across ->report_event() as it is currently done?

lxbsz · 2021-09-15T01:20:38Z

The reopen and event report will be run in two different threads.
The reopen is split into handler->close() and handler->open() without holding the state_lock, which means the reopen is not atomic.
Suppose __tcmu_report_event() has just checked that the TCMUR_DEV_FLAG_IS_OPEN flag is set and then goes on to access the rbd_state private data to report the events to the ceph cluster. At the same time the reopen thread could fire, clear the TCMUR_DEV_FLAG_IS_OPEN flag and release the rbd_state private data in handler->close().

In master, state_lock is held across ->report_event(), so while reopen is indeed not atomic, event report is. __tcmu_reopen_dev() would not be able to clear TCMUR_DEV_FLAG_IS_OPEN and close the image while ->report_event() is executing.

You are removing state_lock protection from ->report_event() though, saying that it (now renamed to rdev_lock) "should always be released when calling the handler's hooks". Can you elaborate on why it is not OK to hold it across ->report_event() as it is currently done?

As I remember from a discussion a long time ago, the rule is that the state_lock should only be used inside libtcmu and shouldn't be used in the handlers. We should release the state_lock when calling any handler hook, because third-party handlers may call libtcmu's helpers, which may acquire the state_lock again and potentially introduce a deadlock bug.

The handler's hooks may also sleep, so holding the state_lock across them is not a good idea.

The current code in master has been buggy since it began to support the event report feature.
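
The re-entrancy concern can be demonstrated with a small sketch. The `demo_*` names are hypothetical, and the error-checking mutex type is used here only so that the second lock attempt returns EDEADLK instead of hanging; with a default mutex the same call sequence would typically block forever.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the device; not the real tcmur_device. */
struct demo_dev {
	pthread_mutex_t state_lock;
	int relock_result;   /* what the helper saw when taking the lock */
};

int demo_init(struct demo_dev *dev)
{
	pthread_mutexattr_t attr;
	int ret;

	pthread_mutexattr_init(&attr);
	/* Error-checking type: relocking by the owner returns EDEADLK. */
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	ret = pthread_mutex_init(&dev->state_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	dev->relock_result = 0;
	return ret;
}

/* Hypothetical library helper that takes state_lock on its own. */
int demo_helper(struct demo_dev *dev)
{
	int ret = pthread_mutex_lock(&dev->state_lock);

	if (ret == 0)
		pthread_mutex_unlock(&dev->state_lock);
	return ret;
}

/* Hypothetical handler hook that calls back into the library. */
void demo_hook(struct demo_dev *dev)
{
	dev->relock_result = demo_helper(dev);
}

/* Risky pattern: the hook is invoked with state_lock held. */
void demo_call_hook_locked(struct demo_dev *dev)
{
	pthread_mutex_lock(&dev->state_lock);
	demo_hook(dev);
	pthread_mutex_unlock(&dev->state_lock);
}

/* Safer pattern: the hook is invoked with state_lock released. */
void demo_call_hook_unlocked(struct demo_dev *dev)
{
	demo_hook(dev);
}
```

After `demo_call_hook_locked()` the recorded result is EDEADLK, while after `demo_call_hook_unlocked()` it is 0, which is the argument for releasing the lock around handler hooks.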

lxbsz force-pushed the priv_lock branch from c0f145b to 70d6fae Compare August 30, 2021 06:19

lxbsz mentioned this pull request Sep 2, 2021

need help / tcmu-runner performance very slow #668

Open

idryomov reviewed Sep 6, 2021

lxbsz force-pushed the priv_lock branch 2 times, most recently from 96199cb to 1d948c0 Compare September 9, 2021 09:37

idryomov mentioned this pull request Sep 13, 2021

rbd: misc fixes #673

Merged

idryomov reviewed Sep 13, 2021

lxbsz force-pushed the priv_lock branch from 1d948c0 to 072cb1f Compare September 14, 2021 02:44

lxbsz force-pushed the priv_lock branch from 072cb1f to 7f9ad1c Compare September 14, 2021 05:49

lxbsz changed the base branch from master to main August 10, 2022 00:20
