Self deadlock in XrdFileCache #239
Comments
Sure, we'll get this sorted out asap. What do you do to get it to lock consistently?
All I'm doing is running xrdcp on a small file (8.2M) that's already in the cache. This is part of a trivial test setup, not a production system. The weird thing is that it worked fine earlier today; now it's consistently hanging even after several server restarts.
I see it would be better to use XrdSysCondVarHelper for m_stateCond in XrdFileCachePrefetch::InitiateClose() instead of calling Lock/UnLock directly on m_stateCond. Still, I'm trying to understand how this deadlock happens. The m_started variable is set to true as soon as the Prefetch::Run() thread is started and is never set to false except in the constructor. Can you send me an xrootd log file if you have one? I could not reproduce the lock. Did you run xrdcp simultaneously from different clients? Were you killing xrdcp commands? Is it by any chance possible that disk usage exceeded the limit and the files were purged in the meantime? Is the lock global for all files or specific to one file? Thanks,
I've figured out how I can reproduce it. If I restart the server and copy the file it works. But if I don't remove the existing local file and try to copy over it I get the hang (which is something I keep doing accidentally). After that all transfers are stuck
(Notice there's a minute delay after the
… is started. Observed with xrdcp run without the -f option and an existing local file. Fixes xrootd#239.
I can confirm this fixes the problem for me.
I can consistently provoke a deadlock in the file cache (version 4.2.0):
where the thread is stuck trying to acquire a lock it already holds.
I think the cause is Prefetch::InitiateClose in XrdFileCache/XrdFileCachePrefetch.cc (called previously from XrdOfsFile::close() via IOEntireFile::ioActive()): if m_started is false, m_stateCond is not unlocked, causing a self-deadlock when the destructor tries to lock it again.
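For anyone reading this later, here is a minimal sketch of the pattern being described: an early return that skips the unlock, next to a scope-guard version in the spirit of XrdSysCondVarHelper. This is not the XrdFileCache code. TrackedLock, ScopedLock, PrefetchState, the two InitiateClose variants, their return values and LeavesLockHeld are all hypothetical names made up for illustration, and std::mutex stands in for the real condition variable.

```cpp
#include <mutex>

// Hypothetical stand-in for the lock inside the condition variable
// (NOT XrdSysCondVar). It records whether it is held so a leak is visible.
class TrackedLock {
public:
    void Lock()   { m_mutex.lock(); m_held = true; }
    void UnLock() { m_held = false; m_mutex.unlock(); }
    bool Held() const { return m_held; }
private:
    std::mutex m_mutex;
    bool       m_held = false;
};

// Hypothetical scope guard: locks on construction, unlocks on every exit path.
class ScopedLock {
public:
    explicit ScopedLock(TrackedLock& lock) : m_lock(lock) { m_lock.Lock(); }
    ~ScopedLock() { m_lock.UnLock(); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;
private:
    TrackedLock& m_lock;
};

// Hypothetical, heavily reduced model of the prefetch object's state.
struct PrefetchState {
    TrackedLock stateCond;
    bool started  = false;
    bool stopping = false;
};

// Buggy shape: when 'started' is false the early return skips UnLock(),
// so a later Lock() from the same thread would block forever.
bool InitiateCloseBuggy(PrefetchState& s) {
    s.stateCond.Lock();
    if (!s.started) {
        return false;  // lock is still held here
    }
    s.stopping = true;
    s.stateCond.UnLock();
    return true;
}

// Guarded shape: the destructor of 'guard' releases the lock on both paths.
bool InitiateCloseFixed(PrefetchState& s) {
    ScopedLock guard(s.stateCond);
    if (!s.started) {
        return false;
    }
    s.stopping = true;
    return true;
}

// Runs 'fn' on a fresh state and reports whether the lock was left held.
bool LeavesLockHeld(bool (*fn)(PrefetchState&), bool started) {
    PrefetchState s;
    s.started = started;
    fn(s);
    const bool leaked = s.stateCond.Held();
    if (leaked) {
        s.stateCond.UnLock();  // release before the mutex is destroyed
    }
    return leaked;
}
```

Calling LeavesLockHeld(InitiateCloseBuggy, false) reports a leaked lock, while the guarded variant releases it whether or not 'started' is set. The sketch only inspects the held flag instead of locking a second time, so it shows the leak without actually hanging.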