rbd: retrieve and check lock owner twice before blocklisting
commit 5881590 upstream.

An attempt to acquire exclusive lock can race with the current lock
owner closing the image:

1. lock is held by client123, rbd_lock() returns -EBUSY
2. get_lock_owner_info() returns client123 instance details
3. client123 closes the image, lock is released
4. find_watcher() returns 0 as there is no matching watcher anymore
5. client123 instance gets erroneously blocklisted
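The sequence above can be replayed in a small user-space model. This is only an illustrative sketch, not kernel code: struct model_image, model_close_image() and model_old_try_lock() are hypothetical names, and the shared lock state is reduced to three fields. It shows how a decision based on an owner snapshot taken before the watcher lookup ends up naming a client that has already released the lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the lock state kept on the image header. */
struct model_image {
	bool locked;
	unsigned long long owner;	/* meaningful only while locked */
	bool owner_watching;		/* owner still has a watch established */
};

/* Step 3: the owner closes the image, dropping both its watch and the lock. */
static void model_close_image(struct model_image *img)
{
	img->owner_watching = false;
	img->locked = false;
}

/*
 * Pre-fix logic: trusts the owner snapshot taken before the watcher lookup.
 * Returns the id of the client that would be blocklisted, or 0 for none.
 */
static unsigned long long model_old_try_lock(struct model_image *img,
					     bool owner_closes_in_between)
{
	unsigned long long snapshot;

	if (!img->locked)
		return 0;		/* step 1 would not have seen -EBUSY */

	snapshot = img->owner;		/* step 2: remember the lock owner */

	if (owner_closes_in_between)
		model_close_image(img);	/* step 3: the race window */

	if (img->owner_watching)
		return 0;		/* step 4: watcher found, owner is alive */

	return snapshot;		/* step 5: stale owner gets blocklisted */
}
```

With the owner closing the image inside the window, the model returns 123 even though the lock is already free at that point, which is the erroneous blocklisting described in step 5.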

Particularly impacted is the mirror snapshot scheduler in snapshot-based
mirroring, since it opens and closes images frequently (an image is
opened only for as long as it takes to take the next mirror snapshot,
and the same client instance is used for all images).

To reduce the potential for erroneous blocklisting, retrieve the lock
owner again after find_watcher() returns 0.  If it's still there, make
sure it matches the previously detected lock owner.
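The recheck can be summarised as a small decision function. The following is a user-space sketch under stated assumptions: struct model_locker is a hypothetical, simplified stand-in for struct ceph_locker (the address is a plain string here, whereas the kernel compares address structures), and model_decide() is an illustrative name that does not exist in rbd.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct ceph_locker. */
struct model_locker {
	int type;			/* entity type */
	unsigned long long num;		/* entity number, e.g. 123 for client123 */
	const char *cookie;		/* lock cookie */
	const char *addr;		/* address, modelled as a string */
};

enum model_action {
	MODEL_RETRY,		/* owner vanished or changed: loop again */
	MODEL_BLOCKLIST,	/* same owner still holds the lock: break it */
};

/* Two lockers match only if entity, cookie and address all agree. */
static bool model_locker_equal(const struct model_locker *lhs,
			       const struct model_locker *rhs)
{
	return lhs->type == rhs->type &&
	       lhs->num == rhs->num &&
	       !strcmp(lhs->cookie, rhs->cookie) &&
	       !strcmp(lhs->addr, rhs->addr);
}

/*
 * Meant to be called only after the watcher lookup found no watch for
 * 'first'. 'refreshed' is the result of asking for the lock owner a
 * second time, or NULL if the lock is no longer held.
 */
static enum model_action model_decide(const struct model_locker *first,
				      const struct model_locker *refreshed)
{
	if (!refreshed || !model_locker_equal(first, refreshed))
		return MODEL_RETRY;

	return MODEL_BLOCKLIST;
}
```

Only the case where the second lookup returns the very same owner leads to blocklisting. A released lock or a different owner sends the caller back to the start of its retry loop, which narrows the race window without closing it entirely.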

Cc: stable@vger.kernel.org # f38cb9d: rbd: make get_lock_owner_info() return a single locker or NULL
Cc: stable@vger.kernel.org # 8ff2c64: rbd: harden get_lock_owner_info() a bit
Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
idryomov authored and gregkh committed Aug 3, 2023
1 parent 0c0b641 commit 73679f8
Showing 1 changed file with 23 additions and 2 deletions.
25 changes: 23 additions & 2 deletions drivers/block/rbd.c
@@ -3850,6 +3850,15 @@ static void wake_lock_waiters(struct rbd_device *rbd_dev, int result)
 	list_splice_tail_init(&rbd_dev->acquiring_list, &rbd_dev->running_list);
 }
 
+static bool locker_equal(const struct ceph_locker *lhs,
+			 const struct ceph_locker *rhs)
+{
+	return lhs->id.name.type == rhs->id.name.type &&
+	       lhs->id.name.num == rhs->id.name.num &&
+	       !strcmp(lhs->id.cookie, rhs->id.cookie) &&
+	       ceph_addr_equal_no_type(&lhs->info.addr, &rhs->info.addr);
+}
+
 static void free_locker(struct ceph_locker *locker)
 {
 	if (locker)
@@ -3970,11 +3979,11 @@ static int find_watcher(struct rbd_device *rbd_dev,
 static int rbd_try_lock(struct rbd_device *rbd_dev)
 {
 	struct ceph_client *client = rbd_dev->rbd_client->client;
-	struct ceph_locker *locker;
+	struct ceph_locker *locker, *refreshed_locker;
 	int ret;
 
 	for (;;) {
-		locker = NULL;
+		locker = refreshed_locker = NULL;
 
 		ret = rbd_lock(rbd_dev);
 		if (ret != -EBUSY)
@@ -3994,6 +4003,16 @@ static int rbd_try_lock(struct rbd_device *rbd_dev)
 		if (ret)
 			goto out; /* request lock or error */
 
+		refreshed_locker = get_lock_owner_info(rbd_dev);
+		if (IS_ERR(refreshed_locker)) {
+			ret = PTR_ERR(refreshed_locker);
+			refreshed_locker = NULL;
+			goto out;
+		}
+		if (!refreshed_locker ||
+		    !locker_equal(locker, refreshed_locker))
+			goto again;
+
 		rbd_warn(rbd_dev, "breaking header lock owned by %s%llu",
 			 ENTITY_NAME(locker->id.name));
 
@@ -4015,10 +4034,12 @@ static int rbd_try_lock(struct rbd_device *rbd_dev)
 		}
 
 again:
+		free_locker(refreshed_locker);
 		free_locker(locker);
 	}
 
 out:
+	free_locker(refreshed_locker);
 	free_locker(locker);
 	return ret;
 }