Skip to content

Commit

Permalink
Avoid taking open handler mutex from request manager failure.
Browse files Browse the repository at this point in the history
Observed deadlock:

Thread 1:
  - FileTimer::Run holds FileTimer::pMutex
  - FileStateHandler::Tick wants to take the FileStateHandler::pMutex

Thread 2:
  - FileStateHandler::OnStateError holds the FileStateHandler::pMutex lock
  - RequestManager::requestFailure calls
  - OpenHandler::current_source, which wants the OpenHandler::m_mutex

Thread 3:
  - OpenHandler::HandleResponseWithHosts holds OpenHandler::m_mutex,
  - ~FileStateHandler calls FileTimer::UnRegisterFileObject which tries
    to get the FileTimer::pMutex.

We remove the call to OpenHandler::current_source to break the deadlock.

If a file-open is in progress, we cannot take the open handler
mutex from within (RequestManager::requestFailure).

It is safe to call XrdAdaptor::RequestManager::OpenHandler::open
from within the requestFailure callback; if the file-open was in progress, it will return the shared
future and not touch Xrootd code.  If the file-open was not in progress, it
is safe to take the open handler mutex in the first place.
  • Loading branch information
bbockelm committed Mar 7, 2015
1 parent 10a4aa9 commit ac01563
Show file tree
Hide file tree
Showing 2 changed files with 14 additions and 2 deletions.
12 changes: 10 additions & 2 deletions Utilities/XrdAdaptor/src/XrdRequestManager.cc
Original file line number Diff line number Diff line change
Expand Up @@ -694,7 +694,6 @@ RequestManager::requestFailure(std::shared_ptr<XrdAdaptor::ClientRequest> c_ptr,
<< "', flags=0x" << std::hex << m_flags
<< ", permissions=0" << std::oct << m_perms << std::dec
<< ", old source=" << source_ptr->PrettyID()
<< ", current server=" << m_open_handler->current_source()
<< ") => timeout when waiting for file open";
ex.addContext("In XrdAdaptor::RequestManager::requestFailure()");
addConnections(ex);
Expand Down Expand Up @@ -988,7 +987,6 @@ XrdAdaptor::RequestManager::OpenHandler::current_source()
std::shared_future<std::shared_ptr<Source> >
XrdAdaptor::RequestManager::OpenHandler::open()
{
std::lock_guard<std::recursive_mutex> sentry(m_mutex);
auto manager_ptr = m_manager.lock();
if (!manager_ptr)
{
Expand All @@ -1011,10 +1009,20 @@ XrdAdaptor::RequestManager::OpenHandler::open()
throw ex;
}

// NOTE NOTE: we look at this variable *without* the lock. This means the method
// is not thread-safe; the caller is responsible to verify it is not called from
// multiple threads simultaneously.
//
// This is done because ::open may be called from a Xrootd callback; if we
// tried to hold m_mutex here, this object's callback may also be active, hold m_mutex,
// and make a call into xrootd (when it invokes m_file.reset()). Hence, our callback
// holds our mutex and attempts to grab an Xrootd mutex; RequestManager::requestFailure holds
// an Xrootd mutex and tries to hold m_mutex. This is a classic deadlock.
if (m_file.get())
{
return m_shared_future;
}
std::lock_guard<std::recursive_mutex> sentry(m_mutex);
std::promise<std::shared_ptr<Source> > new_promise;
m_promise.swap(new_promise);
m_shared_future = m_promise.get_future().share();
Expand Down
4 changes: 4 additions & 0 deletions Utilities/XrdAdaptor/src/XrdRequestManager.h
Original file line number Diff line number Diff line change
Expand Up @@ -231,6 +231,10 @@ class RequestManager : boost::noncopyable {
* Future-based version of the handler
* If called while a file-open is in progress, we will not start a new file-open.
* Instead, the callback will be fired for the ongoing open.
*
* NOTE NOTE: This function is not thread-safe due to a lock-ordering issue.
* The caller must ensure it is not called from multiple threads at once
* for this object.
*/
std::shared_future<std::shared_ptr<Source> > open();

Expand Down

0 comments on commit ac01563

Please sign in to comment.