Skip to content

Commit

Permalink
RDMA/addr: Fix race with netevent_callback()/rdma_addr_cancel()
Browse files Browse the repository at this point in the history
commit 2ee9bf3 upstream.

This three thread race can result in the work being run once the callback
becomes NULL:

       CPU1                 CPU2                   CPU3
 netevent_callback()
                     process_one_req()       rdma_addr_cancel()
                      [..]
     spin_lock_bh()
  	set_timeout()
     spin_unlock_bh()

						spin_lock_bh()
						list_del_init(&req->list);
						spin_unlock_bh()

		     req->callback = NULL
		     spin_lock_bh()
		       if (!list_empty(&req->list))
                         // Skipped!
		         // cancel_delayed_work(&req->work);
		     spin_unlock_bh()

		    process_one_req() // again
		     req->callback() // BOOM
						cancel_delayed_work_sync()

The solution is to always cancel the work once it is completed so any
in between set_timeout() does not result in it running again.

Cc: stable@vger.kernel.org
Fixes: 44e7505 ("RDMA/rdma_cm: Make rdma_addr_cancel into a fence")
Link: https://lore.kernel.org/r/20200930072007.1009692-1-leon@kernel.org
Reported-by: Dan Aloni <dan@kernelim.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  • Loading branch information
jgunthorpe authored and gregkh committed Nov 1, 2020
1 parent 0f6dd1b commit 35a1d62
Showing 1 changed file with 5 additions and 6 deletions.
11 changes: 5 additions & 6 deletions drivers/infiniband/core/addr.c
Expand Up @@ -647,13 +647,12 @@ static void process_one_req(struct work_struct *_work)
req->callback = NULL;

spin_lock_bh(&lock);
/*
* Although the work will normally have been canceled by the workqueue,
* it can still be requeued as long as it is on the req_list.
*/
cancel_delayed_work(&req->work);
if (!list_empty(&req->list)) {
/*
* Although the work will normally have been canceled by the
* workqueue, it can still be requeued as long as it is on the
* req_list.
*/
cancel_delayed_work(&req->work);
list_del_init(&req->list);
kfree(req);
}
Expand Down

0 comments on commit 35a1d62

Please sign in to comment.